Welcome to Story Analyzer, an application that blends NLP with data visualization to demonstrate the maxim: "a picture tells a thousand words."
The term “story” refers to a narrative that involves people, groups, organizations, or other entities (subjects) performing actions that can affect other people, organizations, or entities (objects). These events occur in certain places and at certain times, and they may include other contextual features of interest.
The term “story analysis” pertains to identifying these key elements of the story (subjects, objects, actions, times, places, and other contexts), and to representing the relationships between these elements for each event that takes place in the story. Story Analyzer helps to visually and interactively answer this question: Who did what to whom, where and when did it happen, and what else was going on at the time?
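As a rough illustration of the elements listed above, one event could be modeled as a small record. This is only a sketch: the `StoryEvent` class, its field names, and the sample event are hypothetical and are not taken from Story Analyzer's actual code or from any analyzed document.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class StoryEvent:
    """One event in a narrative (hypothetical structure, for illustration)."""
    subject: str                                        # who performed the action
    action: str                                         # what was done
    objects: List[str] = field(default_factory=list)    # to whom or what
    place: Optional[str] = None                         # where it happened
    time: Optional[str] = None                          # when it happened
    contexts: List[str] = field(default_factory=list)   # what else was going on

    def answers(self) -> Dict[str, object]:
        """Map each part of the story-analysis question to this event's elements."""
        return {
            "who": self.subject,
            "did what": self.action,
            "to whom": self.objects,
            "where": self.place,
            "when": self.time,
            "what else": self.contexts,
        }


# An invented event, not taken from any analyzed document.
event = StoryEvent(
    subject="the committee",
    action="interviewed",
    objects=["the witness"],
    place="Washington",
    time="Tuesday",
)
print(event.answers())
```

Keeping objects and contexts as lists reflects that a single action may affect several entities and occur in several contexts at once.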
For more information you can read this description, based on a presentation I made at the 2018 International Conference on Information Systems in San Francisco.
I've applied Story Analyzer to many impeachment-related documents, including the Mueller Report, various opening statements of impeachment witnesses, and the Horowitz report. As time goes on I intend to add more dashboards. All are accessible from this page.
Not sure where to start? Check out this overview dashboard. This displays a treemap visualization. You can focus on people and groups, documents, or places and times.
Story Analyzer dashboards are divided into several expandable/collapsible sections containing visualizations. Clicking a bar allows you to expand or collapse its corresponding section, so you can choose which sections to make visible.
When you first open a dashboard, you will see the Narrative and Highlighted Information section. The other sections are: People, Groups, Interactions, and Narrative Web; Dates and Times; Locations; Subjects, Actions, and Objects; and Verbs, Nouns, and Contexts.
Scroll through the overview of each of the sections in the carousel below:
These visualizations are highly interactive, both within a section and between sections. As your mouse hovers over an element of a visualization, related elements in the other visualizations are highlighted, and the relevant sentences display in the Narrative and Highlighted Information section.
Clicking on an element freezes everything, so that your mouse hover won't make changes, although hovering still brings up tooltips. Clicking again on an element unfreezes everything, and you can again hover to bring related elements and text into view.
In general, hovering over any element causes all other related elements throughout all sections of the dashboard to be highlighted. You can see this happen if you have multiple sections open and start moving your mouse over the different visualizations. The idea is that you will quickly narrow in on the topics related to the selected item from multiple perspectives.
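One way such cross-section highlighting could be driven is with a lookup from each element to the elements it is related to. The sketch below assumes, purely for illustration, that two elements are related when they take part in the same event. The function name `build_related_index` and the sample labels are hypothetical, and the real dashboards may determine relatedness differently.

```python
from collections import defaultdict
from itertools import permutations
from typing import Dict, Iterable, Set


def build_related_index(events: Iterable[Iterable[str]]) -> Dict[str, Set[str]]:
    """Map every element to the set of elements that share an event with it.

    Assumption: "related" means "appears in the same event".
    """
    related: Dict[str, Set[str]] = defaultdict(set)
    for elements in events:
        # De-duplicate first so an element is never listed as related to itself.
        for a, b in permutations(set(elements), 2):
            related[a].add(b)
    return related


# Invented sample events, each given as the list of elements it involves.
events = [
    ["Alice", "met", "Bob", "Paris"],
    ["Bob", "called", "Carol"],
]
index = build_related_index(events)

# Elements to highlight when the mouse hovers over "Bob".
print(sorted(index["Bob"]))
```

Building the index once up front keeps each hover to a single dictionary lookup, which matters when many visualizations must update together.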
The narrative web uses a force graph visualization, and you can reposition a cluster of nodes by clicking and dragging; others will follow. I think it produces a pleasing effect for the user. Play with all the visualizations for a little while and you'll see what I mean.
Each text element of the dashboard (e.g. person, group, action, subject, object, time/date, location) is a word or phrase from the narrative. Each includes a suffix consisting of two numbers separated by dashes. For example, in the Mueller report you may see a text element like FBI Director James Comey-15-9. The rightmost number (9 in this case) is the sequential number of the earliest sentence of the narrative where Comey appears as a character in the story. The number to its left (15) is the token number within that sentence. Tokens are words or punctuation characters in the sentence, so these numbers indicate that the word "Comey" is at position 15 of the 9th sentence in the narrative being displayed.
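The suffix convention described above can be decoded mechanically. Here is a minimal sketch, assuming every label ends in exactly `-<token>-<sentence>`. The names `ElementLabel` and `parse_label` are made up for this example and are not part of Story Analyzer.

```python
import re
from typing import NamedTuple


class ElementLabel(NamedTuple):
    text: str      # the word or phrase from the narrative
    token: int     # token position within the sentence
    sentence: int  # sequential number of the sentence in the narrative


# Greedy ".+" leaves only the last two dash-separated numbers for the suffix,
# so phrases that themselves contain hyphens are still handled.
_LABEL_PATTERN = re.compile(r"^(?P<text>.+)-(?P<token>\d+)-(?P<sentence>\d+)$")


def parse_label(label: str) -> ElementLabel:
    """Split a dashboard label into its phrase, token number, and sentence number."""
    match = _LABEL_PATTERN.match(label)
    if match is None:
        raise ValueError(f"label has no -token-sentence suffix: {label!r}")
    return ElementLabel(match["text"], int(match["token"]), int(match["sentence"]))


parsed = parse_label("FBI Director James Comey-15-9")
print(parsed.text, parsed.token, parsed.sentence)
```

For the example label, this yields the phrase "FBI Director James Comey", token 15, and sentence 9.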
Color coding abounds in a Story Analyzer dashboard. In the Narrative and Highlighted Information section, blue represents people, red is for groups, yellow is for locations (cities, countries, etc.), and green is for dates and times.
Within the People, Groups, Interactions and Narrative Web section, the bands on the circumference of the Interactions visualization are either blue (for people) or red (for groups). In the Narrative Web, coloring is similar, with blue nodes for people, red nodes for groups, yellow nodes for countries, etc. Hovering over a node in the Narrative Web brings up a tooltip showing what kind of entity it represents.
You'll find word clouds in several sections of the dashboard. There are six word clouds, one each for actions, subjects of actions, objects of actions, times, places, and other contexts. These clouds are all related and interactive. Hovering over an element of one cloud causes related elements in all the other clouds to be highlighted. Words and phrases in the word clouds are color-coded to link actions to related subjects, objects, times, places, and contexts. When you select an element (subject, object, time, place, context, or action) in a cloud, that element appears in black. Other clouds show related elements according to their relationships.
For example, select an element in the Subjects cloud. In all other clouds, elements related to that subject are highlighted in a color. The colors of the actions performed by that subject will match the colors of the objects, times, places, and contexts in which the actions occur in the story. You can click on the subject to freeze the dashboard elements. In this frozen state, hovering over an action that the subject performed will highlight all the objects, times, places, and contexts for that action by making them bigger.
In the interest of civic engagement, and out of curiosity about what my software can do, I applied Story Analyzer to the Mueller Report, which is clearly in the form of a narrative and therefore a good test for Story Analyzer.
I included a few dashboards of Wikipedia articles as well. Check them out from the Dashboards menu at the top.
I am a professor at James Madison University, where I teach Computer Information Systems in the College of Business. My research interests focus on AI, natural language processing, and data visualization. I am also very interested in pedagogical/curricular issues and methods, especially related to the information systems discipline. In my spare time, I like to read, play online chess, exercise, spend some time with my wife, family, and friends, and of course, code.
Many thanks go to Kenny Nguyen, a CIS major at James Madison University, who worked for me during summer and fall of 2019 under James Madison University's REU Program. Kenny gave invaluable assistance by preparing the text data from the Mueller PDF document, testing and validating the Mueller dashboards, and writing code to gather and display statistical data from text narratives.
Thanks also to my son, Brendan Mitri, who designed the home page you are currently looking at, fixed some glitches in my code for me, and assisted with additional dashboard coding.
Thanks to James Madison University for contributing REU funding and for providing me an educational leave in Fall 2015, during which I learned how to work with NLP and visualization APIs.
Story Analyzer is built using many useful software APIs and code snippets. Stanford's CoreNLP does the heavy AI lifting. Dashboards are built using Data-Driven Documents (D3), as well as Google's Visualization and Map APIs. Some specific code examples were particularly useful for the Story Analyzer dashboard. For example, the Timeline visualization is heavily influenced by this D3 code. The Interactions visualization is an example of a chord diagram, and was adapted from this code. The word clouds rely on this code.