Visualizing and Assisting the Analytic Process

Grants and Contracts Details

Description

Prof. Brent Harrison, Department of Computer Science, University of Kentucky
Prof. Stephen G. Ware, Department of Computer Science, University of Kentucky

Background: Data-Driven Analyst Workflow Graphs

In our 2020 LAS project, we built a game where analysts explore a collection of documents to identify the perpetrator of an insider attack at a fictional startup company. We logged every action and prompted players to explain their motivations and conclusions, then used the data to build graphs of each analyst’s workflow.

In 2021 we developed a process to represent general workflow. Individual player logs are segmented into meaningful units. We trained a neural encoder to recognize and cluster similar segments and to detect when players are effectively doing the same tasks. Clusters are then summarized and compiled into a generalized workflow graph representing the hierarchical process that analysts follow during their workflow.

In 2022 we will improve the usability of these workflow graphs and demonstrate that our process can generalize by transferring insights from our Insider Threat game to LAS’s PandaJam exercise. We will provide an intelligent, adaptive visualization of workflow graphs to (1) provide guidance to analysts as they work and (2) visualize and inspect their workflow through after-action review, both to gain insight about their process and to build a model of best practices.

Task 1: Identifying Relevant Graph Components from Context

While informative, the workflow graphs that were generated previously can be quite large and offer little in the way of informative visualization. One insight gained during this project was that at any given time, only parts of the graph were relevant to the current state of the analysis.
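As a rough illustration of what "relevant to the current state" could mean, the sketch below scores nodes by how likely an analyst is to reach them within a few steps of the current node. It is a simple count-based heuristic written for this example, not one of the techniques proposed for Task 1, and the function name, node names, transition counts, and thresholds are all hypothetical.

```python
def relevant_subgraph(transitions, current, max_steps=2, min_prob=0.1):
    """Return {node: score} for nodes reachable from `current` within
    `max_steps`, where score is the probability of the most likely path
    found. Illustrative heuristic only; all names are assumptions."""
    # Turn raw transition counts into per-node probabilities.
    probs = {}
    for src, outs in transitions.items():
        total = sum(outs.values())
        probs[src] = {dst: n / total for dst, n in outs.items()} if total else {}

    relevance = {current: 1.0}
    frontier = {current: 1.0}
    for _ in range(max_steps):
        next_frontier = {}
        for node, p in frontier.items():
            for dst, q in probs.get(node, {}).items():
                score = p * q
                # Keep a node only if the path is likely enough and
                # improves on any score already recorded for it.
                if score >= min_prob and score > relevance.get(dst, 0.0):
                    relevance[dst] = score
                    next_frontier[dst] = score
        frontier = next_frontier
    return relevance


# Hypothetical workflow graph: counts of observed transitions between tasks.
transitions = {
    "read_email": {"check_logs": 3, "search_docs": 1},
    "check_logs": {"identify_suspect": 2, "read_email": 2},
    "search_docs": {"read_email": 1},
    "identify_suspect": {},
}
print(relevant_subgraph(transitions, "read_email", max_steps=2, min_prob=0.2))
```

Raising `min_prob` or lowering `max_steps` shrinks the returned subgraph, which is one simple way to keep a displayed graph small without discarding the underlying data.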
During the next phase of this effort, we will develop techniques to predict relevant parts of a workflow graph based on the current context (how we got here, and where we’re likely to go next, given this particular user’s current path). We will explore the following techniques for this problem:

  • Graph Mining [2]: When data is available, graph mining techniques can be used to perform goal recognition, which will enable us to identify subgraphs that are likely to be explored later during an analysis.
  • Reinforcement Learning: Reinforcement learning and inverse reinforcement learning [1] can be used to learn analysis policies, which themselves can be used to identify likely future analysis actions and subgraphs in the workflow graph.
  • Graph Neural Networks [3]: We can use graph neural networks to learn generalized graph representations which should allow us to transfer knowledge from a known domain (Insider Threat) to a novel domain (PandaJam) to find relevant subgraphs in new environments.

To evaluate this task, we will use the Insider Threat dataset we gathered in 2020 and the PandaJam dataset as it develops in 2022. We will measure how well these techniques perform in situations where workflow graph data is available (Insider Threat) and in situations where it is scarce or emerging (PandaJam).

Task 2: Adaptive Visualization of Workflow

Larger workflow graphs contain more information but are overwhelming when rendered as a single, static image. We currently provide tunable parameters to control the size of the graph, but intelligent, adaptive visualization would allow users to parse the larger graphs and gain more insight from them instead of simply removing content.
In the second phase of this project, we will develop interactive browser-based workflow graph visualizations that allow the user to navigate the graph as an interactive narrative, visualizing graph content in different ways to communicate information more effectively and inspect the analytic process to gain insight.

Task 1 develops ways to recognize which parts of the graph are most relevant given an analyst’s actions so far and potential future actions. In Task 2, we will investigate how this information can be presented to an analyst during an analysis task such that relevant information is conveyed but is not overwhelming. This also requires investigating when each representation is most effective given the current context. Examples of potential representations include:

  • List of Documents: Early on, analysts tend to sift through large lists of documents to explore what is available and form specific hypotheses.
  • Investigation Narrative: Our process summarizations can help to guide analysts and focus general questions into specific tasks.
  • Timeline and Map: When investigating specific questions, documents can be visualized to reveal their temporal and geographic relationships.
  • Linked Node Diagrams: Concepts like people, places, and events show up across many documents, and these conceptual connections prompt analysts for what tasks to perform next.

Task 3: Graph Arithmetic for After-Action Review

Graphs can be combined and segmented using various operations that have analogies in arithmetic (e.g. addition ∼ union, subtraction ∼ intersection and complement, etc.). Our current general workflow graphs can be considered as the average of many individual graphs. Our visualizations can assist analysts by enabling them to break down, compare, and combine workflow graphs during the after-action reviews of PandaJam.
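The arithmetic analogy can be sketched with plain edge sets. The example below is a minimal illustration, assuming each workflow graph is represented as a Python set of (source, target) edges; the analyst names, task names, and the majority threshold used for the "average" are hypothetical and are not taken from the project's software.

```python
from collections import Counter


def union(a, b):
    """Addition: every edge that appears in either graph."""
    return a | b


def intersection(a, b):
    """Edges shared by both graphs."""
    return a & b


def complement(a, universe):
    """Edges of the universe graph that are absent from a."""
    return universe - a


def difference(a, b, universe):
    """Subtraction, written as intersection with the complement."""
    return intersection(a, complement(b, universe))


def average(graphs, threshold=0.5):
    """Keep edges used by at least `threshold` of the individual graphs."""
    if not graphs:
        return set()
    counts = Counter(edge for g in graphs for edge in g)
    return {e for e, n in counts.items() if n / len(graphs) >= threshold}


# Hypothetical per-analyst workflow graphs.
alice = {("read_email", "check_logs"), ("check_logs", "identify_suspect")}
bob = {("read_email", "search_docs"), ("search_docs", "check_logs"),
       ("check_logs", "identify_suspect")}
carol = {("read_email", "check_logs"), ("check_logs", "identify_suspect"),
         ("identify_suspect", "write_report")}

everything = union(union(alice, bob), carol)
general = average([alice, bob, carol])
# What Bob did that the general workflow does not include.
print(difference(bob, general, everything))
```

Comparing an individual graph against the averaged one in this way corresponds to the after-action question of how one analyst's workflow is unique and where it overlaps with the overall workflow.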
Analysts can inspect their own workflow to see how it is unique and how it overlaps with that of other analysts and with the overall workflow. They can edit and combine graphs to build an ideal workflow that represents best practices. These graphs can also be used to improve the graph embeddings produced by the graph neural networks used during Task 1, since analysts will effectively be providing labeled examples of useful techniques.

Specific Deliverables

  1. An online technique for identifying the relevant parts of a workflow graph based on the previous analysis actions taken. This includes trained models for this task using graph mining techniques, reinforcement learning, and/or graph neural networks.
  2. Interactive workflow subgraph visualizations as well as a technique to determine when a visualization should be used based on past analysis events.
  3. A tool for performing after-action review which enables analysts to segment their behavior trace using graph arithmetic actions such as union, subtraction, intersection, and complement.
  4. Academic publications summarizing novel research results obtained during development.

Team and Capabilities

These PIs were previously funded by LAS in 2020 and 2021 to build Threat Game, a web-based serious game to collect data for modeling analyst workflow. This game, the resulting detailed dataset collected from professional analysts, and the software developed for learning, clustering, and visualizing analyst workflow will provide a foundation for this project. Details on both PIs, their teams, and projects can be found at: http://cs.uky.edu/~sgware/people

Brent Harrison is an Assistant Professor at the University of Kentucky who specializes in artificial intelligence, machine learning, and autonomous systems. Given this, Harrison’s expertise will be most useful in developing the graph mining and graph neural network models for identifying relevant subgraphs in a workflow graph as well as determining intelligent graph visualizations.
Representative publications include:

  1. Brent Harrison et al. “Rationalization: A neural machine translation approach to generating natural language explanations”. In: Proceedings of the 2018 AAAI/ACM Conference on AI, Ethics, and Society. 2018, pp. 81–87.
  2. Lara Martin et al. “Event representations for automated story generation with deep neural nets”. In: Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 32. 1. 2018.
  3. Md Sultan Al Nahian et al. “A hierarchical approach for visual storytelling using image description”. In: International Conference on Interactive Digital Storytelling. Springer. 2019, pp. 304–317.

Stephen G. Ware is an Assistant Professor of Computer Science at the University of Kentucky where he directs the Narrative Intelligence Lab. He specializes in intelligent multi-agent planning based on the beliefs and intentions of agents, including how to infer beliefs and intentions from action. His work is frequently applied to interactive narratives for games, training simulations, and virtual reality. Representative publications include:

  1. Rachelyn Farrell and Stephen G. Ware. “Narrative planning for belief and intention recognition”. In: Proceedings of the 16th AAAI international conference on Artificial Intelligence and Interactive Digital Entertainment. 2020, pp. 52–58.
  2. Edward T. Garcia, Stephen G. Ware, and Lewis J. Baker. “Measuring presence and performance in an intelligent virtual reality police use of force training simulation prototype”. In: Proceedings of the 32nd AAAI international conference of the Florida Artificial Intelligence Research Society. 2019, pp. 276–281.
  3. Stephen G. Ware et al. “Multi-agent narrative experience management as story graph pruning”. In: Proceedings of the 15th AAAI international conference on Artificial Intelligence and Interactive Digital Entertainment. 2019, pp. 87–93.

References

[1] Pieter Abbeel and Andrew Y Ng. “Apprenticeship learning via inverse reinforcement learning”. In: Proceedings of the twenty-first international conference on Machine learning. 2004, p. 1.
[2] Jun Hong. “Goal recognition through goal graph analysis”. In: Journal of Artificial Intelligence Research 15 (2001), pp. 1–30.
[3] Zonghan Wu et al. “A comprehensive survey on graph neural networks”. In: IEEE transactions on neural networks and learning systems 32.1 (2020), pp. 4–24.

Status: Finished
Effective start/end date: 1/1/21 to 12/31/22

Funding

  • North Carolina State University: $186,208.00
