Narrative Mining: A Different Way of Looking at Stories and Conversations

A narrative is a story that is shared by and among people and organizations. Narratives are the ideas, opinions, theses, accounts, sentiments, positions, declarations, statements, and emotions that people hold on certain topics and then communicate to others in human language (also called natural language). In a world where both public relations agencies and internet troll farms draw on vast pools of data to influence everything from buyer choices to foreign policy, the power of such stories to affect people’s lives is clearer than ever.

In his bestselling book Narrative Economics, Nobel Prize-winning economist Robert Shiller argues that narratives can have a profound impact on economic outcomes and therefore deserve thorough examination by economists. According to Shiller, studying these narratives can vastly improve our ability to predict, prepare for, and mitigate the effects of major economic events such as recessions and financial crises. Recent years have shown that narratives, spreading as popular stories on social media and other online channels, affect political and social events as well. Narratives can also be harnessed to bring about specific desired outcomes, as in the case of Cambridge Analytica’s use of Facebook data in its efforts to influence elections around the world.

Our proposal is to bring the analytical rigor of economists and data scientists to communications data, to better understand the narratives that people share and the way these stories spread and change over time. By doing so, we hope to surface richer, truly data-driven insights that enhance everything from strategic planning and reporting to messaging and influencer outreach. We call this integrated approach narrative mining.

What is narrative mining?

Narrative mining is the application of network theory, natural language processing (NLP), and other statistical methods to discover, unpack, analyze, and understand narratives. When analyzing narratives, we aim to answer the following questions:

Narrative mining involves complex statistical calculations borrowed from network analysis, data science, natural language processing, and even the life sciences. None of the techniques that we’ve integrated into our narrative mining workflow is new. Network analysis, for example, has been applied to the study of biological systems for decades and to literary texts since the 1990s (and more recently to scientific journal articles). Within natural language processing, our preferred model for topic clustering, latent Dirichlet allocation (LDA), has been in use since the early 2000s. The scientific heritage of these techniques gives us confidence that our approach is objective, analytically sound, and supported by ample peer-reviewed research. This is the foundation that allows us to innovate. Drawing on our experience developing software specifically for communicators, and on the wealth of data such technologies offer, we hope to bring these tested methods to the world of public relations, digital, and marketing communications.
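
To make the idea of topic clustering with LDA more concrete, here is a minimal sketch of one common way to fit the model, collapsed Gibbs sampling, written in plain Python. It is an illustration only, not our production workflow: the function names (lda_gibbs, top_words), the hyperparameter values (alpha, beta), the iteration count, and the toy documents are all assumptions made for this example.

```python
import random
from collections import defaultdict


def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling.

    docs is a list of token lists. Returns (doc_topic, topic_word):
    doc_topic[d][k] counts tokens of document d assigned to topic k,
    topic_word[k][w] counts how often word w is assigned to topic k.
    alpha and beta are illustrative smoothing values, not tuned settings.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [defaultdict(int) for _ in range(num_topics)]
    topic_total = [0] * num_topics
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(num_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assignments[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Resample its topic in proportion to
                # (doc-topic count + alpha) * (topic-word count + beta) / (topic size + beta * V).
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + beta * vocab_size)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    return doc_topic, topic_word


def top_words(topic_word, n=3):
    """Return the n most frequently assigned words for each topic."""
    result = []
    for counts in topic_word:
        ranked = sorted(counts.items(), key=lambda kv: -kv[1])
        result.append([w for w, c in ranked[:n] if c > 0])
    return result


# Toy corpus (invented for illustration): two budget stories, two sports stories.
sample_docs = [
    ["tax", "budget", "tax", "vote"],
    ["goal", "match", "team"],
    ["budget", "vote", "tax"],
    ["team", "goal", "goal"],
]
doc_topic, topic_word = lda_gibbs(sample_docs, num_topics=2)
for k, words in enumerate(top_words(topic_word)):
    print(f"topic {k}: {words}")
```

On a real corpus the tokens would first be cleaned (lowercased, stop words removed), and the number of topics would be chosen by inspection or by a fit statistic rather than fixed in advance.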

The biggest question: how do my narratives compare with those of my customers, constituents, and stakeholders?

One of the most important ideas within the narrative mining framework is narrative congruence. Communications firms, and the people and organizations they represent, have long struggled to listen to and understand the stories told by the publics they seek to engage. Standard listening tools can give users the content of these stories, but they often fail to show how these stories spread and how the stories told by brands and by their audiences overlap or differ. Understanding this difference is crucial to building effective strategies for communicating to third parties.

This is where narrative congruence comes in. Using graph theory and network analysis, we are able to map out the narratives relating to specific themes or topics, identify the people who originate and share these stories, describe the rate and manner in which they spread, and, most importantly, measure and describe the similarities and differences between the stories of organizations and those of particular groups of people. Understanding these similarities and differences can help communicators see where the gaps in their messages and strategies lie, as well as the opportunities. It can help them determine how well their messages are received by their audiences, and whether audiences echo these messages on social media. It can also help them figure out which topics and themes resonate with online communities, and which actors they should listen to and possibly engage in their outreach efforts.
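
As a rough illustration of how such an overlap could be measured, the sketch below builds a simple word co-occurrence network for each side and compares the two sets of edges. The choice of Jaccard similarity, the function names (cooccurrence_edges, congruence), the minimum word length, and the sample texts are all assumptions made for this example; they are not a description of the specialized software or statistics we actually use.

```python
import re
from itertools import combinations


def cooccurrence_edges(texts, min_len=4):
    """Return the set of word pairs that appear together in at least one text.

    Each text contributes an edge between every pair of its distinct words
    that are at least min_len characters long (a crude stop-word filter).
    """
    edges = set()
    for text in texts:
        words = sorted(
            {w for w in re.findall(r"[a-z]+", text.lower()) if len(w) >= min_len}
        )
        edges.update(combinations(words, 2))
    return edges


def congruence(brand_texts, audience_texts):
    """Compare two narrative networks.

    Returns (score, shared, brand_only, audience_only), where score is the
    Jaccard similarity of the two edge sets: 0.0 means no overlap and
    1.0 means identical networks.
    """
    brand = cooccurrence_edges(brand_texts)
    audience = cooccurrence_edges(audience_texts)
    shared = brand & audience
    union = brand | audience
    score = len(shared) / len(union) if union else 0.0
    return score, shared, brand - audience, audience - brand


# Invented sample texts: what a brand says versus what its audience says.
score, shared, brand_only, audience_only = congruence(
    ["battery life improved"],
    ["battery life disappointing"],
)
print(score)          # share of word pairs common to both sides
print(shared)         # themes both sides connect
print(audience_only)  # connections made only by the audience
```

The shared edges point to where the two narratives intersect, while the brand-only and audience-only edges point to where they diverge.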

How do we answer these questions?

Narrative mining can be boiled down into five major processes:

Data gathering, cleaning, and organization. We monitor news articles and social media posts on the topics our clients care about and collect information on the profiles that share these stories online. The unstructured text is then compiled into a machine-readable corpus for analysis.
Topic modeling. We use a combination of LDA and network analysis to quickly discover and synthesize the sub-topics, themes, and narratives from this corpus of text, which can then be measured and summarized by our team of analysts.
Narrative congruence analysis. We use specialized network analysis software to statistically determine the extent of similarity and dissimilarity in the narratives being shared online by organizations and people. With these tools, we can not only measure the extent of this overlap, but also describe the ways that these narratives intersect or diverge.
Social network analysis. We use various software tools to identify clusters of social media users who frequently engage with one another on the given topics, as well as the individual actors who are most influential, active, and engaging within these clusters. Within the limitations set by the specific social platforms being analyzed, we are generally able to mine a wealth of useful data on these actors relating not only to the size of their following but also the way they connect to other users in the network.
Text analysis. We use NLP tools to perform tasks such as keyword analysis, part-of-speech analysis, and concordance analysis to provide a fuller picture of the substance of these narratives.
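
To give a flavor of the text analysis step in the list above, here is a small sketch of keyword counting and a keyword-in-context concordance in plain Python. The stop-word list, the function names (tokenize, top_keywords, concordance), and the sample posts are assumptions made for illustration; a dedicated corpus tool offers far more than this.

```python
import re
from collections import Counter

# A tiny illustrative stop-word list; real lists are much longer.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "are", "on", "for"}


def tokenize(text):
    """Lowercase a text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def top_keywords(texts, n=5):
    """Return the n most frequent non-stop-words as (word, count) pairs."""
    counts = Counter(
        w for text in texts for w in tokenize(text) if w not in STOPWORDS
    )
    return counts.most_common(n)


def concordance(texts, keyword, window=3):
    """List every occurrence of keyword with up to `window` words on each side.

    Returns (left context, keyword, right context) tuples.
    """
    lines = []
    for text in texts:
        tokens = tokenize(text)
        for i, token in enumerate(tokens):
            if token == keyword:
                left = " ".join(tokens[max(0, i - window):i])
                right = " ".join(tokens[i + 1:i + 1 + window])
                lines.append((left, token, right))
    return lines


# Invented sample posts.
posts = [
    "The new policy is unfair to commuters",
    "Commuters welcome the policy",
]
print(top_keywords(posts, 2))
for left, word, right in concordance(posts, "policy", window=2):
    print(f"{left:>15} [{word}] {right}")
```

Keyword counts show what a narrative is about, while the concordance lines show how a given term is actually being used.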

What kind of data do we analyze and what tools do we use?

Narrative mining involves processing large volumes of communications data that fall into two broad categories: text data and social network data. A single monthly narrative mining report can easily cover thousands of news articles and tens of thousands of social media users and their posts. We use both network analysis and NLP to analyze the content of stories on specific topics that appear in newspapers, online news portals, blogs, and most of the popular social media platforms, and then draw insights that might not be readily available with standard listening tools. We also use network analysis to look at the relationships between people and organizations engaging with one another on social media on these same topics, providing a fuller picture of the narratives that drive change in society.
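
The social network side of this can be sketched in a few lines of plain Python. The example below takes a list of who-interacted-with-whom pairs, scores each user by degree centrality, and groups users into clusters. It makes several simplifying assumptions: the function names and sample data are invented, and connected components stand in here for the more sophisticated community detection that dedicated network tools provide.

```python
from collections import defaultdict


def build_graph(interactions):
    """Build an undirected graph from (user_a, user_b) interaction pairs."""
    graph = defaultdict(set)
    for a, b in interactions:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def degree_centrality(graph):
    """Score each user by the share of other users they interact with."""
    n = len(graph)
    return {
        user: (len(neighbors) / (n - 1) if n > 1 else 0.0)
        for user, neighbors in graph.items()
    }


def clusters(graph):
    """Group users into connected components (a crude stand-in for communities)."""
    seen, groups = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            user = stack.pop()
            if user in group:
                continue
            group.add(user)
            stack.extend(graph[user] - group)
        seen |= group
        groups.append(group)
    return groups


# Invented sample data: replies or mentions between five accounts.
interactions = [("ana", "ben"), ("ana", "cy"), ("ben", "cy"), ("dee", "eli")]
graph = build_graph(interactions)
print(degree_centrality(graph))
print(clusters(graph))
```

In practice, follower counts, reply and share volumes, and other centrality measures would be combined with scores like these to decide which actors are most influential within each cluster.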

To do all this, we use a variety of software tools, both proprietary and third-party. These include Media Meter v3, the latest version of our media monitoring and listening software, as well as our own implementations of open-source NLP libraries for the Python and R programming languages. We also use third-party software favored by data scientists around the world, such as Gephi and NodeXL for network calculation and layout, Sketch Engine for corpus management and text analysis, and the MALLET toolkit for tasks such as topic modeling. Niche software packages that are generally used for analyzing biological networks are part of our workflow as well, making for a rather complex yet well-integrated system of tools and techniques. Our goal in the next year is to further integrate these different tools into our own software, with the aim of delivering the rich insights offered by narrative mining relatively quickly and at scale.

Make sure you’re heard.