Twitter and the Dominant Narratives on Racial Injustice
By: M2 Communications
This series of articles is an attempt to better understand popular sentiments on key issues affecting the US presidential elections in November, as identified by recent surveys by Pew Research Center. These include racial inequality, economic inequality, health care, foreign policy, the coronavirus pandemic, Supreme Court appointments, and climate change. Our goal is to analyze the messages being conveyed by both Democratic and Republican parties in relation to these issues and compare how these overlap with or differ from sentiments being expressed by the American voting public in social media.
By doing so, we hope to shed light on the communications gaps and opportunities that exist between the candidates and their constituents. We will later apply similar techniques to understanding narrative congruence in the context of other topics. This article looks at how issues related to racism were characterized in the Democratic National Convention, the Republican National Convention, and the most recent tweets mentioning either Democratic candidate Joe Biden or Republican candidate President Donald Trump.
Our Methodology
To help you better understand the concept of narrative mining and the process behind it, we compiled a glossary covering most of the technical terms you will encounter in this series of articles.
We started by creating corpora of the speeches from the Democratic National Convention (DNC) and the Republican National Convention (RNC), datasets of which were downloaded from Kaggle. We also used NodeXL Pro to gather the most recent 60,000 tweets that mentioned either Joe Biden or Donald Trump, for whom their respective conventions served as a national campaign launchpad (for brevity, this dataset will be referred to as “political tweets”).
These corpora were compiled and processed using Sketch Engine for keyword, concordance, and parts of speech analysis. They were then individually run through Cowo to identify conceptual associations at the sentence level, i.e. pairs of concepts that most frequently appeared in the same sentence were identified and mapped out. This gave us separate network files for the DNC and RNC speeches, as well as for the political tweets, which we are using to represent the sentiments of the general public.
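To make the idea of sentence-level association concrete, here is a minimal sketch in Python. It is not Cowo's actual algorithm: the stop-word list, the naive sentence splitter, and the function name `cooccurrence_pairs` are all assumptions made for illustration.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical stop-word list; a real analysis would use a much fuller one.
STOPWORDS = {"the", "a", "an", "is", "are", "of", "and", "in", "to", "we", "must"}

def cooccurrence_pairs(text):
    """Count how often two distinct terms appear in the same sentence."""
    counts = Counter()
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        words = re.findall(r"[a-z']+", sentence.lower())
        # Deduplicate and sort so each unordered pair has one canonical form.
        terms = sorted({w for w in words if w not in STOPWORDS})
        # Each pair is counted at most once per sentence.
        counts.update(combinations(terms, 2))
    return counts

sample = "Systemic racism is real. We must end systemic racism. Justice is real."
pairs = cooccurrence_pairs(sample)
for pair, count in pairs.most_common(3):
    print(pair, count)
```

Each key in `pairs` is one candidate edge of the narrative network, and its count can serve as the edge weight.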
We then used Gephi to calculate network centralities (to help determine the most important concepts in the network) and modularity class (to cluster the concepts with the strongest associations with one another, with each cluster representing a sub-theme of the overall narrative network), and to visualize the networks using the Circle Pack algorithm.
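Gephi offers several centrality measures, and the text does not say which one was used. As a simple stand-in, the sketch below computes weighted degree, i.e. the sum of the weights of all conceptual pairs a concept takes part in. The toy network and the helper name `weighted_degree` are hypothetical, and modularity clustering is left out.

```python
from collections import defaultdict

def weighted_degree(edges):
    """Sum the weights of all edges touching each node."""
    degree = defaultdict(int)
    for (a, b), weight in edges.items():
        degree[a] += weight
        degree[b] += weight
    return dict(degree)

# Hypothetical toy network: concept pair -> number of shared sentences.
toy_edges = {
    ("racism", "systemic"): 2,
    ("racism", "real"): 1,
    ("justice", "real"): 1,
}

centrality = weighted_degree(toy_edges)
# Concepts ordered from most to least connected.
ranked = sorted(centrality, key=centrality.get, reverse=True)
print(ranked[0], centrality[ranked[0]])
```

In a network graph, a score like this would typically drive the node size, so that more connected concepts appear larger.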
These bar charts show the total number of significant concepts (nodes) used in these three datasets (RNC speeches, DNC speeches, and political tweets), as well as the total number of conceptual pairs for each.
The Venn diagram below shows the intersection of our three narrative networks in terms of edges (conceptual pairs), as well as the exclusive edges of each. This was constructed using the web app NetSets which allows visual comparison of multiple networks. Here we can see that there were 326 concepts that were significant for all three datasets: the DNC, RNC, and political tweets.
We can further quantify the intersection of these networks using CompNet, which calculates the Jaccard coefficient of both nodes and edges across the three networks, a measure of the degree of overlap (and therefore also of exclusiveness) of the DNC and RNC speeches, and the political tweets.
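For readers unfamiliar with the measure, the Jaccard coefficient of two sets is the size of their intersection divided by the size of their union. The sketch below applies it to nodes and to edges of two toy networks. The data is invented for illustration, and the helpers `jaccard` and `normalise` are hypothetical names, not part of CompNet.

```python
def jaccard(a, b):
    """Jaccard coefficient |A & B| / |A | B|; defined here as 0.0 for two empty sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def normalise(edges):
    """Treat edges as undirected so that (x, y) and (y, x) count as the same pair."""
    return {frozenset(edge) for edge in edges}

# Hypothetical toy networks, not the article's data.
dnc_edges = [("racism", "systemic"), ("justice", "racial"), ("health", "care")]
rnc_edges = [("care", "health"), ("law", "order")]

dnc_nodes = {node for edge in dnc_edges for node in edge}
rnc_nodes = {node for edge in rnc_edges for node in edge}

node_similarity = jaccard(dnc_nodes, rnc_nodes)
edge_similarity = jaccard(normalise(dnc_edges), normalise(rnc_edges))
print(node_similarity, edge_similarity)
```

A score of 1.0 means the two networks use exactly the same concepts (or pairs), and 0.0 means they share none.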
After calculating the Jaccard Similarity Coefficient for nodes (concepts) and edges (conceptual pairs) and CNSI (similarity coefficient for the neighborhood architecture of the nodes and edges), we can determine the overall narrative congruence of two networks (DNC and RNC, DNC and political tweets, and RNC and political tweets) using our formula:
DNC ꓴ RNC
Concept Similarity (how much overlap exists in the concepts used?): 0.13 (a score of 1.0 is a perfect match)
Concept Pair Similarity (how much overlap exists in the pairs of concepts used?): 0.02 (a score of 1.0 is a perfect match)
CNSI (how similar is the context in which these concepts were used?): 0.14 (a score of 1.0 is a perfect match)
NCS of DNC ꓴ RNC (what is the overall narrative congruence of these two networks?): 9.6 (out of 100)
DNC ꓴ Political Tweets
Concept Similarity (how much overlap exists in the concepts used?): 0.09
Concept Pair Similarity (how much overlap exists in the pairs of concepts used?): 0.03
CNSI (how similar is the context in which these concepts were used?): 0.12
NCS of DNC ꓴ Political Tweets (what is the overall narrative congruence of these two networks?): 8.0 (out of 100)
RNC ꓴ Political Tweets
Concept Similarity (how much overlap exists in the concepts used?): 0.09
Concept Pair Similarity (how much overlap exists in the pairs of concepts used?): 0.01
CNSI (how similar is the context in which these concepts were used?): 0.06
NCS of RNC ꓴ Political Tweets (what is the overall narrative congruence of these two networks?): 5.3 (out of 100)
Here we can see that there was greater overlap between the speeches at the DNC and the dataset of political tweets, with an overall narrative congruence of 8.0, compared to 5.3 between the RNC speeches and the political tweets. The overlap in concepts used was the same in both comparisons (0.09), but the overlap in the surrounding context for these concepts was twice as high between the DNC and the political tweets (CNSI of 0.12) as between the RNC and the political tweets (CNSI of 0.06).
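The congruence formula itself is not reproduced in the text. As a rough check, each published score is within 0.1 of the unweighted average of its three component scores multiplied by 100. The sketch below assumes that averaging rule, which is an inference from the numbers rather than a confirmed definition; `congruence_score` is a hypothetical helper name.

```python
def congruence_score(concept_sim, pair_sim, cnsi):
    """ASSUMPTION: unweighted mean of the three scores, scaled to 0-100."""
    return (concept_sim + pair_sim + cnsi) / 3 * 100

# Component scores and published NCS values as listed above.
reported = {
    "DNC vs RNC": ((0.13, 0.02, 0.14), 9.6),
    "DNC vs political tweets": ((0.09, 0.03, 0.12), 8.0),
    "RNC vs political tweets": ((0.09, 0.01, 0.06), 5.3),
}

for label, (scores, published) in reported.items():
    estimate = congruence_score(*scores)
    print(f"{label}: estimated {estimate:.2f}, published {published}")
```

Small differences between the estimate and the published figure could come from rounding of the component scores, or from a formula that differs from the one assumed here.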
What does this tell us?
This greater narrative congruence score means that the DNC and the political tweets dataset talked about the same things to a greater degree than did the RNC and the political tweets, i.e. more of the same pairs of concepts were used and the other important words surrounding these concepts were also more similar. This could be because certain important topics like race might have been downplayed in the RNC speeches relative to the DNC speeches and the political tweets, but to get to the details, it will be necessary to do topic clustering and text analysis.
Mapping the intersections between the narrative networks
The next step in our analysis is to map out the intersections of these networks as well as their exclusive edges, so we can begin to synthesize the specific overlaps and differences between each. During this process, we can identify the conceptual pairs that are exclusive to the individual narrative networks (most importantly the Twitter network), as this is where the communications gaps will reside.
This is the network of conceptual pairs common only to DNC speeches and the political tweets:
This is the network of conceptual pairs common only to RNC speeches and the political tweets:
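The "common only to" and "exclusive" networks described in this section come down to set operations on the edge lists. The following is a minimal sketch with invented toy edges; the function names are hypothetical and this is not how NetSets is implemented.

```python
def undirected(edges):
    """Represent each conceptual pair as an unordered set."""
    return {frozenset(edge) for edge in edges}

def shared_only(first, second, other):
    """Edges present in both `first` and `second` but absent from `other`."""
    return (first & second) - other

def exclusive(target, *others):
    """Edges that appear in `target` and in none of the other networks."""
    return target - set().union(*others)

# Hypothetical toy edge lists, not the article's data.
dnc = undirected([("racism", "systemic"), ("health", "care"), ("justice", "racial")])
rnc = undirected([("law", "order"), ("health", "care"), ("protests", "riots")])
tweets = undirected([
    ("racism", "systemic"), ("protests", "riots"),
    ("health", "care"), ("critical", "theory"),
])

dnc_tweets_only = shared_only(dnc, tweets, rnc)
rnc_tweets_only = shared_only(rnc, tweets, dnc)
tweets_exclusive = exclusive(tweets, dnc, rnc)
print(sorted(sorted(edge) for edge in tweets_exclusive))
```

In this toy example, the pair found only in the tweets is the kind of edge that would point to a possible communications gap.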
How did these different actors talk about race issues?
Like-colored clusters are groups of concepts that co-occur most frequently with one another at the sentence level. These represent sub-themes from which we can mine the various narratives at play. The more important the concept, the larger the node used to represent it in the network graph.
Here we can see that race issues were prominent for all three datasets, but the concepts used to characterize race issues were vastly different. The DNC speeches made specific reference to white supremacy and to racism as a systemic issue, as did the tweets mentioning Biden and Trump; the RNC speeches did not. Furthermore, the RNC speeches described the recent protests in the United States as riots, a sentiment that was echoed in the political tweets.
That said, these graphs by themselves do not provide the whole picture, and this is where a text analysis of the specific context in which these conceptual pairs appear is required. The job of the analyst here is to review the actual content, using the above graphs to more quickly and efficiently examine the most significant conceptual associations within the sub-themes in which they were used.
To make the synthesizing of these sub-themes even more efficient, we next perform keyword, parts of speech, and concordance analysis on the individual narrative networks, specifically focusing on race-related concepts.
What did we find?
- Race issues were downplayed at the RNC, to the extent that they were virtually treated as a non-issue. In the RNC speeches, race-driven protests were described as the work of a “radical left” represented by the Biden camp.
- The DNC speeches made racial injustice a central issue, with “systemic racism” being the top keyword.
- Over half of the 60,000 tweets mentioning either Biden or Trump within the context of race were pro-Trump, with 25,690 denying the existence of systemic racism (Rochester City was used as a talking point) and 5,418 applauding the recent defunding of critical race theory in US universities. Many of these were re-tweets or quote tweets, typical of herd behavior commonly observed online.
- This indicates that for the given date range, discussions on race by the Trump camp were largely devolved to social media, where many supporters (over half of the dataset) echoed the party position that racial injustice and systemic racism were not key issues.
- This comes despite Pew Research findings that more US voters on Twitter identify as Democrats rather than Republicans.
Race issues were downplayed as a non-issue at the RNC
Racial injustice and systemic racism were at the heart of the DNC speeches
Pro-Trump messages dominated the political tweets when race issues were mentioned
A fork in the road: official party rhetoric vs social media