Welcome to Story Analyzer, an application that blends NLP with data visualization to demonstrate the maxim: "a picture tells a thousand words."

About Story Analyzer

Story Analyzer is an app that helps users visualize and understand a story through the use of natural language processing (NLP) and data visualization. Specifically, Story Analyzer uses Stanford’s CoreNLP Java library for performing information extraction on a story, and uses D3 and Google JavaScript visualization APIs to generate a dashboard of results involving several interrelated data visualizations.

The term “story” refers to a narrative that involves people, groups, organizations, or other entities (subjects) performing actions that can affect other people, organizations, or entities (objects). These events occur in certain places and at certain times, and they may include other contextual features of interest.

The term “story analysis” refers to identifying these key elements of the story (subjects, objects, actions, times, places, and other contexts) and representing the relationships between these elements for each event that takes place in the story. Story Analyzer helps to visually and interactively answer this question: Who did what to whom, where and when did it happen, and what else was going on at the time?
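To make the "who did what to whom, where, and when" idea concrete, here is a minimal sketch of how one event could be held as a record. The class name, field names, and the sample event are all hypothetical, made up for illustration; this is not Story Analyzer's actual data model.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class StoryEvent:
    """One event in a story (hypothetical structure, for illustration only)."""
    subject: str                                        # who performed the action
    action: str                                         # what was done
    objects: List[str] = field(default_factory=list)    # to whom or what
    place: Optional[str] = None                         # where it happened
    time: Optional[str] = None                          # when it happened
    contexts: List[str] = field(default_factory=list)   # what else was going on

    def summary(self) -> str:
        """Join the elements that are present into one readable phrase."""
        parts = [self.subject, self.action]
        if self.objects:
            parts.append(" and ".join(self.objects))
        if self.place:
            parts.append(f"in {self.place}")
        if self.time:
            parts.append(f"on {self.time}")
        return " ".join(parts)


# A made-up event, not taken from any of the documents on this site.
event = StoryEvent(
    subject="the committee",
    action="interviewed",
    objects=["the witness"],
    place="Washington",
    time="Tuesday",
)
print(event.summary())
```

Keeping the place, time, and contexts optional reflects the fact that not every sentence in a narrative states all of them.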

This page includes links to many dashboards, including dashboards of public impeachment-related documents, as shown below. You can also see dashboards of other documents from the "Dashboards" menu at the top.

For more information you can read this presentation, which I gave at the 2020 Southeast Decision Science Institute (SEDSI) conference in Charleston, SC.



Using Story Analyzer Dashboards

Story Analyzer dashboards are divided into several expandable/collapsible sections containing visualizations. Clicking a bar allows you to expand or collapse its corresponding section, so you can choose which sections to make visible.

When you first open a dashboard, you will see the Narrative and Highlighted Information section. The other sections are: People, Groups, Interactions, and Narrative Web; Dates and Times; Locations; Subjects, Actions, and Objects; and Verbs, Nouns, and Contexts.

Scroll through the overview of each of the sections in the carousel below:

These visualizations are highly interactive, both within a section and between sections. As your mouse hovers over an element of a visualization, related elements in the other visualizations are highlighted, and the relevant sentences display in the Narrative and Highlighted Information section.

Clicking on an element freezes everything, so that your mouse hover won't make changes, although hovering still brings up tooltips. Clicking again on an element unfreezes everything, and you can again hover to bring related elements and text into view.

In general, hovering over any element causes all other related elements throughout all sections of the dashboard to be highlighted. You can see this happen if you have multiple sections open and start moving your mouse over the different visualizations. The idea is that you will quickly narrow in on the topics related to the selected item from multiple perspectives.
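The hover and freeze behavior described above can be sketched as a small state machine. This is only an illustration under stated assumptions: the real dashboards are built in JavaScript, and the class name, the method names, and the rule that two elements are "related" when they appear in the same sentence are all hypothetical.

```python
from typing import Dict, Optional, Set


class DashboardState:
    """Hypothetical model of hover highlighting and click-to-freeze."""

    def __init__(self, sentences_by_element: Dict[str, Set[int]]):
        # Maps each element to the sentence numbers it appears in.
        self.sentences_by_element = sentences_by_element
        self.frozen = False
        self.focus: Optional[str] = None

    def related(self, element: str) -> Set[str]:
        """Assumed rule: elements are related if they share a sentence."""
        own = self.sentences_by_element[element]
        return {
            other
            for other, sentences in self.sentences_by_element.items()
            if other != element and sentences & own
        }

    def highlighted(self) -> Set[str]:
        return set() if self.focus is None else self.related(self.focus)

    def hover(self, element: str) -> Set[str]:
        """Hovering changes the focus only while the dashboard is not frozen."""
        if not self.frozen:
            self.focus = element
        return self.highlighted()

    def click(self, element: str) -> Set[str]:
        """First click freezes on the element; the next click unfreezes."""
        if self.frozen:
            self.frozen = False
        else:
            self.focus = element
            self.frozen = True
        return self.highlighted()


# Made-up elements and sentence numbers.
state = DashboardState({
    "Alice": {1, 2},
    "Bob": {2},
    "Paris": {1},
    "Monday": {3},
})
print(sorted(state.hover("Alice")))   # ['Bob', 'Paris']
state.click("Alice")                  # freeze on Alice
print(sorted(state.hover("Monday")))  # still ['Bob', 'Paris'] while frozen
state.click("Alice")                  # unfreeze
print(sorted(state.hover("Monday")))  # [] since nothing shares sentence 3
```

Keeping the frozen flag separate from the focused element is what lets tooltips keep working while the highlights stay put.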

The narrative web uses a force graph visualization, and you can reposition a cluster of nodes by clicking and dragging; others will follow. I think it produces a pleasing effect for the user. Play with all the visualizations for a little while and you'll see what I mean.

Each text element of the dashboard (e.g., person, group, action, subject, object, time/date, location) is a word or phrase from the narrative, followed by a suffix of two numbers, each preceded by a dash. For example, in the Mueller report you may see a text element like FBI Director James Comey-15-9. The rightmost number (9 in this case) is the sequential number of the earliest sentence of the narrative where Comey appears as a character in the story. The number to its left (15) is the token number within that sentence. Tokens are words or punctuation characters in the sentence, so these numbers indicate that the word "Comey" is token 15 of the 9th sentence in the narrative being displayed.
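If you want to split such a label into its parts yourself, the following sketch shows one way to do it. The function and type names are hypothetical, and the only assumption about the format is the one described above: the label ends with a dash, the token number, another dash, and the sentence number.

```python
import re
from typing import NamedTuple


class ElementLabel(NamedTuple):
    text: str       # the word or phrase from the narrative
    token: int      # token position within the sentence
    sentence: int   # sequential sentence number in the narrative


# The suffix is anchored at the end, so dashes inside the phrase are kept.
_LABEL_RE = re.compile(r"^(?P<text>.+)-(?P<token>\d+)-(?P<sentence>\d+)$")


def parse_label(label: str) -> ElementLabel:
    """Split 'phrase-token-sentence' into its three parts."""
    match = _LABEL_RE.match(label)
    if match is None:
        raise ValueError(f"no -token-sentence suffix in {label!r}")
    return ElementLabel(
        match["text"], int(match["token"]), int(match["sentence"])
    )


label = parse_label("FBI Director James Comey-15-9")
print(label.text)      # FBI Director James Comey
print(label.token)     # 15
print(label.sentence)  # 9
```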

Color coding abounds in a Story Analyzer dashboard. In the Narrative and Highlighted Information section, blue represents people, red is for groups, yellow is for locations (cities, countries, etc.), and green is for dates and times.

Within the People, Groups, Interactions and Narrative Web section, the bands on the circumference of the Interactions visualization are either blue (for people) or red (for groups). In the Narrative Web, coloring is similar, with blue nodes for people, red nodes for groups, yellow nodes for countries, etc. By hovering over a node in the narrative web you will see a tool tip showing what kind of entity it represents.
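The color scheme described above amounts to a small lookup table. The sketch below records it in that form; the key names and the gray fallback for other entity types are assumptions made for illustration, not values taken from the dashboard code.

```python
# Entity type -> highlight color, as described in the text above.
# The key names are hypothetical labels for the four categories.
ENTITY_COLORS = {
    "person": "blue",
    "group": "red",
    "location": "yellow",
    "datetime": "green",
}


def color_for(entity_type: str, default: str = "gray") -> str:
    """Look up the color for an entity type (the gray fallback is assumed)."""
    return ENTITY_COLORS.get(entity_type, default)


print(color_for("person"))    # blue
print(color_for("location"))  # yellow
```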

You'll find word clouds in several sections of the dashboard. There are six word clouds, one each for actions, subjects of actions, objects of actions, times, places, and other contexts. These clouds are all related and interactive. Hovering over an element of one cloud causes related elements in all the other clouds to be highlighted. Words and phrases in the word clouds are color-coded to link actions to related subjects, objects, times, places, and contexts. When you select an element (subject, object, time, place, context, or action) in a cloud, that element appears in black. Other clouds will show related elements according to their relationships.

For example, select an element in the Subjects cloud. In all other clouds, elements related to that subject are highlighted in a color. The colors of the actions performed by that subject will match the colors of the objects, times, places, and contexts in which the actions occur in the story. You can click on the subject to freeze the dashboard elements. In this frozen state, hovering over an action that the subject performed will highlight all the objects, times, places, and contexts for that action by making them bigger.
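The color linking in the Subjects example can be sketched as follows. This is an illustration under stated assumptions, not the dashboard's actual logic: the palette, the function name, the event dictionaries, and the choice to let an element shared by two actions keep the first action's color are all hypothetical.

```python
from itertools import cycle
from typing import Dict, List

# Hypothetical palette; the real dashboards may use different colors.
PALETTE = ["orange", "purple", "teal", "brown"]


def color_clouds(subject: str, events: List[dict]) -> Dict[str, str]:
    """Give each of the subject's actions a color and pass that color on
    to the objects, times, places, and contexts of the same action."""
    colors = {subject: "black"}          # the selected element appears in black
    palette = cycle(PALETTE)
    for event in events:
        if event["subject"] != subject:
            continue                     # ignore events by other subjects
        action = event["action"]
        if action not in colors:
            colors[action] = next(palette)
        shade = colors[action]
        for key in ("objects", "times", "places", "contexts"):
            for element in event.get(key, []):
                # Simplification: an element keeps the first color it gets.
                colors.setdefault(element, shade)
    return colors


# Made-up events for illustration.
events = [
    {"subject": "Alice", "action": "met", "objects": ["Bob"], "places": ["Paris"]},
    {"subject": "Alice", "action": "called", "objects": ["Carol"], "times": ["Monday"]},
    {"subject": "Bob", "action": "left", "objects": ["Paris"]},
]
for element, color in color_clouds("Alice", events).items():
    print(element, color)
```

Here "met", "Bob", and "Paris" share one color and "called", "Carol", and "Monday" share another, so the eye can group each action with its surroundings.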


Dashboards of The Mueller Report

Here you can see several dashboards of the Mueller Report, which tells the story of the 2016 presidential campaign and post-campaign period. Volume 1 depicts Russian efforts to influence this campaign. Volume 2 describes findings related to potential obstruction of justice during the investigation of these events.

Dashboards of the White House Memorandum Response to the House Impeachment Report

The Big Picture

I've applied Story Analyzer to many impeachment-related public government documents, including the Mueller report, various opening statements of impeachment witnesses, Republican talking points, the House Impeachment report, and the Horowitz report. Also included are the "transcripts" of President Trump's conversation with Ukrainian President Zelensky and the 2020 State of the Union address. As time goes on I expect to add more dashboards. All are accessible from this page.

Not sure where to start? Check out this overview dashboard. This displays a treemap visualization. You can focus on people and groups, documents, or places and times.

I included a few dashboards of Wikipedia articles and news stories as well. Check them out from the Dashboards menu at the top.


About me

I am a professor at James Madison University, where I teach Computer Information Systems in the College of Business. My research interests focus on AI, natural language processing, and data visualization. I am also very interested in pedagogical/curricular issues and methods, especially related to the information systems discipline. In my spare time, I like to read, play online chess, exercise, spend some time with my wife, family, and friends, and of course, code.

Acknowledgements

Many thanks go to Kenny Nguyen, Connor Manyx, and Ved Sheth, who are all CIS majors at James Madison University. They have contributed invaluable assistance as research assistants under James Madison University's REU Program. Thanks also to my son, Brendan Mitri, who designed the home page you are currently looking at, fixed some glitches in my code for me, and assisted with additional dashboard coding.

Thanks to James Madison University for contributing REU funding and for providing me an educational leave in Fall 2015, during which I learned how to work with NLP and visualization APIs.

Story Analyzer is built using many useful software APIs and code snippets. Stanford's CoreNLP does the heavy AI lifting. Dashboards are built using Data Driven Documents (D3), as well as Google's Visualization and Map APIs. Some specific code examples were particularly useful for the Story Analyzer dashboard. For example, the Timeline visualization is heavily influenced by this D3 code. The Interactions visualization is an example of a chord diagram, and was adapted from this code. The word clouds rely on this code.