Grants and Contracts Details
Description
Visualizing and Assisting the Analytic Process
Prof. Brent Harrison, Department of Computer Science, University of Kentucky
Prof. Stephen G. Ware, Department of Computer Science, University of Kentucky
Background: Data-Driven Analyst Workflow Graphs
In our 2020 LAS project, we built a game where analysts explore a collection of documents to
identify the perpetrator of an insider attack at a fictional startup company. We logged every action
and prompted players to explain their motivations and conclusions, then used the data to build
graphs of each analyst’s workflow.
In 2021 we developed a process to represent general workflow. Individual player logs are
segmented into meaningful units. We trained a neural encoder to cluster similar segments
and recognize when players are effectively doing the same tasks. Clusters are then summarized and
compiled into a generalized workflow graph representing the hierarchical process that analysts
follow.
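The compilation step can be illustrated with a minimal sketch. It assumes each log segment has already been assigned a cluster label; the label names and the flat edge-count representation are hypothetical and much simpler than the hierarchical graphs described above.

```python
from collections import Counter


def build_workflow_graph(sessions):
    """Count transitions between consecutive cluster labels across sessions.

    `sessions` is a list of label sequences, one per analyst (assumed format).
    Returns a Counter mapping (source, target) edges to occurrence counts.
    """
    edges = Counter()
    for labels in sessions:
        # Pair each segment label with the label that follows it.
        for src, dst in zip(labels, labels[1:]):
            edges[(src, dst)] += 1
    return edges


# Hypothetical cluster labels for two analysts.
sessions = [
    ["browse", "read_email", "form_hypothesis"],
    ["browse", "read_email", "check_logs", "form_hypothesis"],
]
graph = build_workflow_graph(sessions)
```

Edges that many analysts share accumulate higher counts, which is one simple way to separate common workflow from individual detours.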
In 2022 we will improve the usability of these workflow graphs and demonstrate that our process
can generalize by transferring insights from our Insider Threat game to LAS’s PandaJam exercise.
We will provide an intelligent, adaptive visualization of workflow graphs to (1) provide guidance to
analysts as they work and (2) let them visualize and inspect their workflow through after-action
review, both to gain insight into their process and to build a model of best practices.
Task 1: Identifying Relevant Graph Components from Context
While informative, the workflow graphs that were generated previously can be quite large and offer
little in the way of informative visualization. One insight gained during this project was that at
any given time, only parts of the graph were relevant to the current state of the analysis.
During the next phase of this effort, we will develop techniques to predict relevant parts of a
workflow graph based on the current context (how we got here, and where we’re likely to go next,
given this particular user’s current path). We will explore the following techniques for this problem:
• Graph Mining [2]: When data is available, graph mining techniques can be used to perform
goal recognition, which will enable us to identify subgraphs that are likely to be explored
later during an analysis.
• Reinforcement Learning: Reinforcement learning and inverse reinforcement learning [1]
can be used to learn analysis policies, which themselves can be used to identify likely future
analysis actions and subgraphs in the workflow graph.
• Graph Neural Networks [3]: We can use graph neural networks to learn generalized graph
representations which should allow us to transfer knowledge from a known domain (Insider
Threat) to a novel domain (PandaJam) to find relevant subgraphs in new environments.
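As a point of comparison for the techniques above, the sketch below shows a simple transition-frequency baseline. It is not one of the listed techniques, and the function names, the `depth` and `top_k` parameters, and the label-sequence input format are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def fit_transitions(sessions):
    """Count how often each node is followed by each other node."""
    counts = defaultdict(Counter)
    for labels in sessions:
        for src, dst in zip(labels, labels[1:]):
            counts[src][dst] += 1
    return counts


def relevant_nodes(counts, current, depth=2, top_k=2):
    """Return nodes reachable from `current` within `depth` steps,
    following only the `top_k` most frequent transitions out of each node."""
    frontier, seen = [current], {current}
    for _ in range(depth):
        next_frontier = []
        for node in frontier:
            for succ, _ in counts.get(node, Counter()).most_common(top_k):
                if succ not in seen:
                    seen.add(succ)
                    next_frontier.append(succ)
        frontier = next_frontier
    return seen - {current}


# Hypothetical node labels.
counts = fit_transitions([["a", "b", "c"], ["a", "b", "d"], ["a", "e"]])
nearby = relevant_nodes(counts, "a", depth=2, top_k=2)
```

A learned model would replace the frequency table, but the output has the same shape: a small set of nodes to highlight given the analyst's current position.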
To evaluate this task, we will use the Insider Threat dataset we gathered in 2020 and the PandaJam
dataset as it develops in 2022. We will measure how well these techniques perform in situations
where workflow graph data is available (Insider Threat) and in situations where it is scarce or
emerging (PandaJam).
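The proposal does not name an evaluation metric. One plausible choice, sketched here purely as an assumption, is a hit rate: the fraction of steps at which the analyst's actual next action falls inside the predicted relevant set. The `predict` callback and its signature are hypothetical.

```python
def hit_rate(predict, sessions):
    """Fraction of steps where the true next label is in the predicted set.

    `predict` takes the history so far (a list of labels) and returns a set
    of labels considered relevant next.
    """
    hits = total = 0
    for labels in sessions:
        for i in range(len(labels) - 1):
            total += 1
            if labels[i + 1] in predict(labels[: i + 1]):
                hits += 1
    return hits / total if total else 0.0


# Toy predictor that always proposes "b", scored on two short sessions.
score = hit_rate(lambda history: {"b"}, [["a", "b", "b"], ["a", "c"]])
```

Computing the same score on both datasets would give a direct comparison between the data-rich and data-scarce settings.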
Task 2: Adaptive Visualization of Workflow
Larger workflow graphs contain more information but are overwhelming when rendered as a single,
static image. We currently provide tunable parameters to control the size of the graph, but
intelligent, adaptive visualization would allow users to parse the larger graphs and gain more insight
from them instead of simply removing content. In the second phase of this project, we will develop
interactive browser-based workflow graph visualizations that allow the user to navigate the graph
as an interactive narrative, visualizing graph content in different ways to communicate information
more effectively and inspect the analytic process to gain insight.
In Task 1 we developed ways to recognize which parts of the graph are most relevant given an
analyst’s actions so far and potential future actions. In Task 2, we will investigate how this
information can be presented to an analyst during an analysis task such that relevant information is
conveyed but is not overwhelming. This also requires investigating when each representation should
be used to be most effective based on the current context. Examples of potential representations
include:
• List of Documents: Early on, analysts tend to sift through large lists of documents to
explore what is available and form specific hypotheses.
• Investigation Narrative: Our process summarizations can help to guide analysts and focus
general questions into specific tasks.
• Timeline and Map: When investigating specific questions, documents can be visualized to
reveal their temporal and geographic relationships.
• Linked Node Diagrams: Concepts like people, places, and events show up across many
documents and these conceptual connections prompt analysts for what tasks to perform next.
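Choosing among these representations could start from a hand-written rule set before any learned policy exists. The sketch below is illustrative only: the `Context` fields, the thresholds, and the ordering of the rules are assumptions, not results from the project.

```python
from dataclasses import dataclass


@dataclass
class Context:
    documents_opened: int        # documents the analyst has read so far
    has_specific_question: bool  # whether a concrete hypothesis is being tested
    shared_entities: int         # entities recurring across the open documents


def choose_representation(ctx, early_threshold=5, entity_threshold=3):
    """Map the current context to one of the four candidate representations."""
    if ctx.documents_opened < early_threshold:
        return "document_list"            # early exploration
    if not ctx.has_specific_question:
        return "investigation_narrative"  # focus general questions into tasks
    if ctx.shared_entities >= entity_threshold:
        return "linked_node_diagram"      # many cross-document connections
    return "timeline_and_map"             # specific question, few shared entities


view = choose_representation(Context(2, False, 0))
```

The relevance predictions from Task 1 would eventually replace these fixed thresholds.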
Task 3: Graph Arithmetic for After-Action Review
Graphs can be combined and segmented using various operations that have analogies in arithmetic
(e.g., addition ∼ union, subtraction ∼ intersection with the complement). Our current general
workflow graphs can be considered as the average of many individual graphs.
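These operations can be sketched on graphs stored as edge-count tables. The representation (a `Counter` keyed by edge) and the function names are assumptions chosen for brevity; the project's graphs may carry more structure than plain weighted edges.

```python
from collections import Counter


def graph_union(a, b):
    """Addition: keep every edge from either graph, summing counts."""
    return a + b


def graph_intersection(a, b):
    """Keep only edges present in both graphs, with the smaller count."""
    return a & b


def graph_difference(a, b):
    """Subtraction: edges of `a` absent from `b`,
    i.e. `a` intersected with the complement of `b`."""
    return Counter({edge: w for edge, w in a.items() if edge not in b})


def graph_average(graphs):
    """Average edge weight across graphs; a missing edge counts as zero."""
    total = Counter()
    for g in graphs:
        total.update(g)
    return {edge: w / len(graphs) for edge, w in total.items()}


# Two hypothetical analyst graphs.
a = Counter({("x", "y"): 2, ("y", "z"): 1})
b = Counter({("y", "z"): 3, ("z", "w"): 1})
avg = graph_average([a, b])
```

Under this reading, the generalized workflow graph is the result of `graph_average` over all individual analyst graphs.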
Our visualizations can assist analysts by enabling them to break down, compare, and combine
workflow graphs during the after-action reviews of PandaJam. Analysts can inspect their own
workflow to see how it is unique and how it overlaps with that of other analysts and with the
overall workflow. They can edit and combine graphs to build an ideal workflow that represents
best practices. These graphs can also be used to improve the graph embeddings produced by
the graph neural networks used during Task 1, since analysts will effectively be providing labeled
examples of useful techniques.
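Uniqueness and overlap can be quantified in many ways. As one hedged example, the sketch below uses Jaccard similarity over edge sets; the choice of measure and the function names are assumptions, not something the proposal specifies.

```python
def jaccard_overlap(own_edges, other_edges):
    """Shared edges divided by all edges in either graph (1.0 if both empty)."""
    own, other = set(own_edges), set(other_edges)
    union = own | other
    return len(own & other) / len(union) if union else 1.0


def unique_edges(own_edges, overall_edges):
    """Edges that appear in the analyst's graph but not in the overall graph."""
    return set(own_edges) - set(overall_edges)


# Hypothetical edge sets for one analyst and the overall workflow.
own = {("browse", "read"), ("read", "map")}
overall = {("browse", "read"), ("read", "timeline"), ("timeline", "report")}
overlap = jaccard_overlap(own, overall)
```

Here `unique_edges(own, overall)` isolates the steps that distinguish one analyst from the group, which is the part an after-action review would likely examine first.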
Specific Deliverables
1. An online technique for identifying the relevant parts of a workflow graph based on the
previous analysis actions taken. This includes trained models for this task using graph mining
techniques, reinforcement learning, and/or graph neural networks.
2. Interactive workflow subgraph visualizations as well as a technique to determine when a
visualization should be used based on past analysis events.
3. A tool for performing after-action review which enables analysts to segment their behavior
trace using graph arithmetic actions such as union, subtraction, intersection, and complement.
4. Academic publications summarizing novel research results obtained during development.
Team and Capabilities
These PIs were previously funded by LAS in 2020 and 2021 to build Threat Game, a web-based
serious game to collect data for modeling analyst workflow. This game, the resulting detailed
dataset collected from professional analysts, and the software developed for learning, clustering,
and visualizing analyst workflow, will provide a foundation for this project.
Details on both PIs, their teams, and projects can be found at: http://cs.uky.edu/~sgware/people
Brent Harrison is an Assistant Professor at the University of Kentucky who specializes in
artificial intelligence, machine learning, and autonomous systems. Given this, Harrison’s expertise
will be most useful in developing the graph mining and graph neural network models for identifying
relevant subgraphs in a workflow graph as well as determining intelligent graph visualizations.
Representative publications include:
1. Brent Harrison et al. “Rationalization: A neural machine translation approach to generating
natural language explanations”. In: Proceedings of the 2018 AAAI/ACM Conference on AI,
Ethics, and Society. 2018, pp. 81–87
2. Lara Martin et al. “Event representations for automated story generation with deep neural
nets”. In: Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 32. 1. 2018
3. Md Sultan Al Nahian et al. “A hierarchical approach for visual storytelling using image
description”. In: International Conference on Interactive Digital Storytelling. Springer. 2019,
pp. 304–317
Stephen G. Ware is an Assistant Professor of Computer Science at the University of Kentucky
where he directs the Narrative Intelligence Lab. He specializes in intelligent multi-agent planning
based on the beliefs and intentions of agents, including how to infer beliefs and intentions from
action. His work is frequently applied to interactive narratives for games, training simulations, and
virtual reality. Representative publications include:
1. Rachelyn Farrell and Stephen G. Ware. “Narrative planning for belief and intention
recognition”. In: Proceedings of the 16th AAAI international conference on Artificial Intelligence
and Interactive Digital Entertainment. 2020, pp. 52–58
2. Edward T. Garcia, Stephen G. Ware, and Lewis J. Baker. “Measuring presence and
performance in an intelligent virtual reality police use of force training simulation prototype”. In:
Proceedings of the 32nd AAAI international conference of the Florida Artificial Intelligence
Research Society. 2019, pp. 276–281
3. Stephen G. Ware et al. “Multi-agent narrative experience management as story graph
pruning”. In: Proceedings of the 15th AAAI international conference on Artificial Intelligence and
Interactive Digital Entertainment. 2019, pp. 87–93
References
[1] Pieter Abbeel and Andrew Y Ng. “Apprenticeship learning via inverse reinforcement learning”.
In: Proceedings of the twenty-first international conference on Machine learning. 2004, p. 1.
[2] Jun Hong. “Goal recognition through goal graph analysis”. In: Journal of Artificial Intelligence
Research 15 (2001), pp. 1–30.
[3] Zonghan Wu et al. “A comprehensive survey on graph neural networks”. In: IEEE transactions
on neural networks and learning systems 32.1 (2020), pp. 4–24.
Status | Finished |
---|---|
Effective start/end date | 1/1/21 → 12/31/22 |
Funding
- North Carolina State University: $186,208.00