What Twitter Users Are Saying About Racial Injustice & How it Shapes Today’s Dominant Narratives
The global rise of populism since 2016 provides clear evidence of the transformative power of narratives and the need to understand, measure, and compare them in a thorough, scientific manner. With both the Democratic and Republican parties having concluded their national conventions in late August, the narratives that are transmitted and shared between parties and their constituents will again play a decisive role in who comes to power.
This series of articles is an attempt to better understand popular sentiments on key issues affecting the US presidential elections in November, as identified by recent surveys by Pew Research Center. These include racial inequality, economic inequality, health care, foreign policy, the coronavirus pandemic, Supreme Court appointments, and climate change. Our goal is to analyze the messages being conveyed by both the Democratic and Republican parties in relation to these issues and compare how these overlap with or differ from sentiments being expressed by the American voting public on social media.
By doing so, we hope to shed light on the communications gaps and opportunities that exist between the candidates and their constituents. We will later apply similar techniques to understanding narrative congruence in the context of other topics. This article looks at how issues related to racism were characterized at the Democratic National Convention, at the Republican National Convention, and in the most recent tweets mentioning either Democratic candidate Joe Biden or Republican candidate President Donald Trump.
To help you better understand the concept of narrative mining and the process behind it, we formed a glossary covering most of the technical terms you will be encountering in this series of articles.
We started by creating corpora of the speeches from the Democratic National Convention (DNC) and the Republican National Convention (RNC), datasets of which were downloaded from Kaggle. We also used NodeXL Pro to gather the most recent 60,000 tweets that mentioned either Joe Biden or Donald Trump, for whom their respective conventions served as a national campaign launchpad (for brevity, this dataset will be referred to as “political tweets”).
These corpora were compiled and processed using Sketch Engine for keyword, concordance, and parts of speech analysis. They were then individually run through Cowo to identify conceptual associations at the sentence level, i.e. pairs of concepts that most frequently appeared in the same sentence were identified and mapped out. This gave us separate network files for the DNC and RNC speeches, as well as for the political tweets, which we are using to represent the sentiments of the general public.
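The idea of sentence-level conceptual association can be illustrated with a minimal Python sketch. This is not Cowo's actual implementation, which is not described here; the stopword list, the naive sentence splitter, and the function name `cooccurrence_edges` are all assumptions made for illustration.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical stopword list; a real pipeline would use a much fuller one.
STOPWORDS = {"the", "a", "an", "is", "are", "of", "and", "in", "to", "as", "was", "were"}

def cooccurrence_edges(text):
    """Count how often each unordered pair of terms shares a sentence."""
    edges = Counter()
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        terms = {w for w in re.findall(r"[a-z']+", sentence.lower())
                 if w not in STOPWORDS}
        # Sorting makes each pair canonical, so (a, b) and (b, a) are one edge.
        for pair in combinations(sorted(terms), 2):
            edges[pair] += 1
    return edges

sample = "Racism is systemic. Systemic racism persists. Protests were described as riots."
edges = cooccurrence_edges(sample)
print(edges.most_common(1))
```

Each key in `edges` is one conceptual pair and each count is the number of sentences in which the pair appears, which is the kind of weighted edge list a network tool can then load.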
We then used Gephi to calculate network centralities (to help determine the most important concepts in the network) and modularity class (to cluster the concepts with the strongest associations with one another, with each cluster representing a sub-theme of the overall narrative network), and to visualize the networks using the Circle Pack algorithm.
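As a rough illustration of what a centrality score captures, the sketch below computes weighted degree, one of the simplest centrality measures, in plain Python. Gephi offers several centrality measures and this article does not say which were used, so treat the choice of weighted degree as an assumption; the edge weights are invented toy values, and modularity clustering is not covered here.

```python
from collections import defaultdict

def weighted_degree(edges):
    """Sum the weights of all edges touching each node."""
    degree = defaultdict(int)
    for (a, b), weight in edges.items():
        degree[a] += weight
        degree[b] += weight
    return dict(degree)

# Hypothetical co-occurrence counts, not values from the study.
toy_edges = {
    ("racism", "systemic"): 3,
    ("justice", "racism"): 2,
    ("protest", "riot"): 1,
}
centrality = weighted_degree(toy_edges)
# Rank concepts from most to least connected.
ranked = sorted(centrality, key=centrality.get, reverse=True)
print(ranked[0], centrality[ranked[0]])
```

A concept that co-occurs often and with many other concepts scores highest, which is why such nodes are drawn larger in the network graphs.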
These bar charts show the total number of significant concepts (nodes) used in these three datasets (RNC speeches, DNC speeches, and political tweets), as well as the total number of conceptual pairs for each.
The Venn diagram below shows the intersection of our three narrative networks in terms of edges (conceptual pairs), as well as the exclusive edges of each. This was constructed using the web app NetSets which allows visual comparison of multiple networks. Here we can see that there were 326 concepts that were significant for all three datasets: the DNC, RNC, and political tweets.
We can further quantify the intersection of these networks using CompNet, which calculates the Jaccard coefficient of both nodes and edges across the three networks, a measure of the degree of overlap (and therefore also of exclusiveness) of the DNC and RNC speeches, and the political tweets.
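For readers unfamiliar with the measure, the Jaccard coefficient of two sets is the size of their intersection divided by the size of their union. The sketch below applies it to the nodes and edges of two toy networks; the data and the helper names `jaccard` and `nodes_of` are hypothetical and are not taken from CompNet.

```python
def jaccard(a, b):
    """Jaccard coefficient: size of the intersection over size of the union."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def nodes_of(edges):
    """Collect every concept that appears in at least one edge."""
    return {node for edge in edges for node in edge}

# Hypothetical toy networks; each edge is a pair stored in alphabetical order
# so that the same pair always compares equal across networks.
net_a_edges = {("racism", "systemic"), ("justice", "racial"), ("care", "health")}
net_b_edges = {("racism", "systemic"), ("law", "order"),
               ("care", "health"), ("mob", "rule")}

node_similarity = jaccard(nodes_of(net_a_edges), nodes_of(net_b_edges))
edge_similarity = jaccard(net_a_edges, net_b_edges)
print(node_similarity, edge_similarity)
```

A score of 1.0 means the two networks use exactly the same concepts (or pairs), and 0.0 means they share none.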
After calculating the Jaccard Similarity Coefficient for nodes (concepts) and edges (conceptual pairs) and CNSI (similarity coefficient for the neighborhood architecture of the nodes and edges), we can determine the overall narrative congruence of two networks (DNC and RNC, DNC and political tweets, and RNC and political tweets) using our formula:
DNC ꓴ RNC
Concept Similarity (how much overlap exists in the concepts used?): 0.13 (a score of 1.0 is a perfect match)
Concept Pair Similarity (how much overlap exists in the pairs of concepts used?): 0.02 (a score of 1.0 is a perfect match)
CNSI (how similar is the context in which these concepts were used?): 0.14 (a score of 1.0 is a perfect match)
NCS of DNC ꓴ RNC (what is the overall narrative congruence of these two networks?): 9.6 (out of 100)
DNC ꓴ Political Tweets
Concept Similarity (how much overlap exists in the concepts used?): 0.09
Concept Pair Similarity (how much overlap exists in the pairs of concepts used?): 0.03
CNSI (how similar is the context in which these concepts were used?): 0.12
NCS of DNC ꓴ Political Tweets (what is the overall narrative congruence of these two networks?): 8.0 (out of 100)
RNC ꓴ Political Tweets
Concept Similarity (how much overlap exists in the concepts used?): 0.09
Concept Pair Similarity (how much overlap exists in the pairs of concepts used?): 0.01
CNSI (how similar is the context in which these concepts were used?): 0.06
NCS of RNC ꓴ Political Tweets (what is the overall narrative congruence of these two networks?): 5.3 (out of 100)
Here we can see that there was greater overlap between the speeches at the DNC and the dataset of political tweets, with an overall narrative congruence of 8.0, compared to 5.3 between the RNC speeches and the political tweets. The overlap in concepts used was the same (0.09) when comparing either convention with the political tweets, but the overlap in the surrounding context for these concepts was twice as high between the DNC and the political tweets (CNSI of 0.12) as between the RNC and the political tweets (CNSI of 0.06).
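The NCS formula itself is not reproduced in this text. As a sanity check only, the published scores are close to the simple average of the three coefficients scaled to 100. The sketch below tests that reading; it is an inference from the reported numbers, not the authors' stated formula, and `ncs_estimate` is a hypothetical name.

```python
def ncs_estimate(concept_sim, pair_sim, cnsi):
    """Assumed reconstruction: mean of the three coefficients, scaled to 0-100."""
    return 100 * (concept_sim + pair_sim + cnsi) / 3

# (concept similarity, concept pair similarity, CNSI, published NCS)
reported = {
    "DNC vs RNC": (0.13, 0.02, 0.14, 9.6),
    "DNC vs political tweets": (0.09, 0.03, 0.12, 8.0),
    "RNC vs political tweets": (0.09, 0.01, 0.06, 5.3),
}

gaps = {}
for label, (c, p, n, published) in reported.items():
    estimate = ncs_estimate(c, p, n)
    gaps[label] = abs(estimate - published)
    print(f"{label}: estimate {estimate:.2f}, published {published}")
```

The small remaining differences may come from rounding of the published coefficients, or the actual formula may weight the components differently.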
What does this tell us?
This greater narrative congruence score means that the DNC and the political tweets dataset talked about the same things to a greater degree than did the RNC and the political tweets, i.e. more of the same concepts were used and the other important words surrounding these concepts were also more similar. This could be because certain important topics like race might have been downplayed in the RNC speeches relative to the DNC speeches and the political tweets, but to get to the details, it will be necessary to do topic clustering and text analysis.
Mapping the intersections between the narrative networks
The next step in our analysis is to map out the intersections of these networks as well as their exclusive edges, so we can begin to synthesize the specific overlaps and differences between each. During this process, we can identify the conceptual pairs that are exclusive to the individual narrative networks (most importantly the Twitter network), as this is where the communications gaps will reside.
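Separating shared from exclusive conceptual pairs comes down to set operations on the edge lists. The sketch below shows one way to do it, assuming each network is stored as a set of alphabetically ordered pairs; the three toy networks and the function name `partition_edges` are hypothetical.

```python
def partition_edges(networks):
    """Split edges into those shared by all networks and those unique to one."""
    shared = set.intersection(*networks.values())
    exclusive = {}
    for name, edges in networks.items():
        # Every edge that appears in any of the other networks.
        others = set().union(*(e for n, e in networks.items() if n != name))
        exclusive[name] = edges - others
    return shared, exclusive

# Hypothetical toy data, not edges from the study.
networks = {
    "DNC": {("racism", "systemic"), ("justice", "racial"), ("care", "health")},
    "RNC": {("law", "order"), ("care", "health"), ("mob", "rule")},
    "tweets": {("city", "racist"), ("care", "health"), ("racism", "systemic")},
}
shared, exclusive = partition_edges(networks)

# Pairs common only to the DNC speeches and the tweets, excluding the RNC.
dnc_tweets_only = (networks["DNC"] & networks["tweets"]) - networks["RNC"]
print(shared, exclusive["tweets"], dnc_tweets_only)
```

The `exclusive["tweets"]` set corresponds to the pairs found only on Twitter, which is where the communications gaps described above would show up.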
This is the network of conceptual pairs common only to DNC speeches and the political tweets
How did these different actors talk about race issues?
Like-colored clusters are groups of concepts that co-occur most frequently with one another at the sentence level. These represent sub-themes from which we can mine the various narratives at play. The more important the concept, the larger the node used to represent it in the network graph.
Here we can see that race issues were prominent for all three datasets, but the concepts used to characterize race issues were vastly different. The DNC speeches made specific reference to white supremacy and to racism as a systemic issue. The tweets mentioning Biden and Trump did so as well, while the RNC speeches did not. Furthermore, the RNC speeches described the recent protests in the United States as riots, a sentiment that was echoed in the political tweets.
That said, these graphs by themselves do not provide the whole picture, and this is where a text analysis of the specific context in which these conceptual pairs appear is required. The job of the analyst here is to review the actual content, using the above graphs to more quickly and efficiently examine the most significant conceptual associations within the sub-themes in which they were used.
To make the synthesizing of these sub-themes even more efficient, we next perform keyword, parts of speech, and concordance analysis on the individual narrative networks, specifically focusing on race-related concepts.
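A concordance (keyword-in-context) view lists each occurrence of a keyword together with the words around it. The sketch below is a minimal illustration in Python, not the way Sketch Engine implements it; the tokenizer, the `width` parameter, and the sample text are assumptions.

```python
import re

def concordance(text, keyword, width=3):
    """Return (left context, keyword, right context) for each keyword match."""
    tokens = re.findall(r"[\w']+", text.lower())
    key = keyword.lower().split()
    hits = []
    for i in range(len(tokens) - len(key) + 1):
        if tokens[i:i + len(key)] == key:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + len(key):i + len(key) + width])
            hits.append((left, " ".join(key), right))
    return hits

# Hypothetical sample text, not taken from the corpora.
sample = "They said systemic racism is real. We must confront systemic racism now."
hits = concordance(sample, "systemic racism")
for left, kw, right in hits:
    print(f"{left:>25} | {kw} | {right}")
```

Reading the left and right context columns side by side is what lets an analyst tell whether a keyword is being affirmed, denied, or mocked.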
What did we find?
Text analysis of the RNC and DNC speeches, and of the political tweets, yielded the following key insights:
Race issues were downplayed at the RNC, to the extent that they were virtually treated as a non-issue. In the RNC speeches, race-driven protests were described as the work of a “radical left” represented by the Biden camp.
The DNC speeches made racial injustice a central issue, with “systemic racism” being the top keyword.
This indicates that for the given date range, discussions on race by the Trump camp were largely devolved to social media, where many supporters (over half of the dataset) echoed the party position that racial injustice and systemic racism were not key issues.
This comes despite Pew Research findings that more US voters on Twitter identify as Democrats rather than Republicans.
Race issues were downplayed as a non-issue at the RNC
Our analysis revealed a clear downplaying (perhaps even a flat-out denial) of the existence of systemic racism in the United States in the RNC speeches. A keyword analysis of the text showed that while race concerns were brought up, these were far less prominent than in the DNC speeches and the political tweets corpus. Instead, criminal justice and prison reform were presented as key components of President Donald Trump’s African-American program. At the same time, the recent race-driven protests in the United States were described by the GOP as “riots”, the work of the “radical left”, and “mob rule.”
Racial injustice and systemic racism were at the heart of the DNC speeches
On the other hand, the DNC speeches put the issue of systemic racism at the center of the conversation, with “systemic racism” and “racial injustice” among the top keywords. In this dataset, race was treated as the top issue to be confronted by the next president of the United States.
Pro-Trump messages dominated the political tweets when race issues were mentioned
With studies showing a small Democratic-leaning bias among US Twitter users (with close to 30 percent identifying as Democrats and about 22 percent identifying as Republicans), one could reasonably expect the political tweets dataset to adhere more closely to the DNC speeches than to the RNC speeches. This is partially correct, as we have already seen that there was greater overlap in concept pairs and their surrounding context between the DNC and Twitter than between the RNC and Twitter. A reading of the content, however, shows us important differences, which shed light on the communications approach being adopted by the Trump camp.
We see that in the Twitter dataset, “racist city” was the top keyword by a wide margin, with references to “critical race theory” immediately following. However, the vast majority of tweets that contained these keywords were in fact pro-Trump messages, with 25,690 tweets denying that Rochester, NY was a “racist city” because it had a Black mayor and a Black police chief, and calling on people to “stop the insanity” and “vote Trump.”
Meanwhile, 5,418 tweets that referenced “critical race theory” praised President Trump’s recent defunding of critical race theory programs in American universities receiving federal funding, calling such programs a “Democrat hoax” and the work of “far left radicals.”
This makes it clear that while the race-related concepts used in the political tweets were the same as in the DNC speeches, the narrative was completely different. It is important to note that practically all of these tweets had the exact same message, with pro-Trump tweets drowning out race-related pro-Biden tweets in the analysis period. This points to what might be an extension of the wildly successful social media strategy used by Trump and other populists since 2016, where discussions on polarizing issues are effectively swarmed by an army of supporters with a shared script designed to be as inflammatory as possible.
These two tables show the concordance analysis results for “racist city” and “critical race theory,” as explained above. They show the specific context in which the keywords were used in these posts.
A fork in the road: official party rhetoric vs social media
While the Twitter dataset looked at here is limited to only the most recent 60,000 tweets mentioning either Biden or Trump, our findings are an indication of what could very well be a carefully planned strategy of farming out specific race-related pro-Trump messages to be echoed across social media while the official RNC rhetoric gave very little play to race issues. It will be interesting to see whether this holds true as the November elections approach, and whether the same approach is applied in discussions on other issues such as health care, economic inequality, or climate change. These are things we’ll be taking a look at as we continue this series of articles each week.