Jan 19 2019: students of Covington Catholic High School in Kentucky are filmed jeering at and blocking the path of participants in a Washington DC Indigenous Peoples March. The incident draws immediate fire on social media. The distinctive red ‘MAGA’ caps some of the boys wear, and their chanting of ‘build a wall!’, tie the confrontation irrevocably to the politics of the day; the ugly images and video typify the time.
Video shows a crowd of teenagers wearing ‘Make America Great Again’ hats taunting a Native American elder after Friday’s Indigenous Peoples March at the Lincoln Memorial https://t.co/2bYADuaUV2 pic.twitter.com/NDtiifPjoo
— CNN (@CNN) January 20, 2019
Not all tweets condemned the students. The course of events was almost immediately, and predictably, disputed, either on the grounds that it was a ‘leftist’ set-up or that the boys were the victims, not the aggressors. The Wall Street Journal’s Byron Tau, meanwhile, while not rebutting the video evidence, was inclined to put the incident down to the folly of youth:
Everyone posturing on this terrible website has done stupid, foolish, ignorant and downright horrible things that they surely regret as teenagers and were just fortunate enough to grow up in a world where 99% of people didn’t carry networked cameras.
— Byron Tau (@ByronTau) January 19, 2019
For contrast, this is the tenor of less indulgent responses:
60 year challenge pic.twitter.com/6ZRAJPfwZQ
— JP (@JPnMiami) January 19, 2019
Over the next six hours, while noise online about the incident grew, Tau’s Bad Take attracted 650 retweets, three thousand likes, and seventeen thousand replies (numbers that have grown further since).
I scraped all of these tweets, together with retweets of replies, assorted quote-tweets and mentions, and mapped them in a graph:
By extracting tweets that included Tau’s username or the ID of his ratioed tweet, and that were posted from that tweet on, I set out explicitly to graph an instance of an epic ratio, having an idea of what this would look like.
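The selection rule just described can be sketched in a few lines of Python. This is only an illustration under assumptions, not the code used for the graph: the record fields (`text`, `created_at`, `in_reply_to_id`, `retweeted_id`, `quoted_id`), the function name and the sample records are all hypothetical.

```python
from datetime import datetime, timedelta, timezone

def select_ratio_tweets(tweets, username, tweet_id, posted_at):
    """Keep tweets that name the account or point at the tweet, from posted_at on."""
    handle = "@" + username.lower()
    selected = []
    for t in tweets:
        if t["created_at"] < posted_at:
            continue  # older than the ratioed tweet, so out of scope
        mentions_user = handle in t["text"].lower()
        references_tweet = tweet_id in (
            t.get("in_reply_to_id"), t.get("retweeted_id"), t.get("quoted_id")
        )
        if mentions_user or references_tweet:
            selected.append(t)
    return selected

# Made-up records and a placeholder timestamp, for illustration only.
posted = datetime(2019, 1, 19, 12, 0, tzinfo=timezone.utc)
sample = [
    {"id": "2", "text": "@ExampleUser no.", "created_at": posted + timedelta(minutes=5),
     "in_reply_to_id": "1"},
    {"id": "3", "text": "unrelated", "created_at": posted + timedelta(minutes=6)},
    {"id": "4", "text": "old @ExampleUser tweet", "created_at": posted - timedelta(days=1)},
    {"id": "5", "text": "RT of the post", "created_at": posted + timedelta(hours=1),
     "retweeted_id": "1"},
]
kept = select_ratio_tweets(sample, "ExampleUser", "1", posted)
print([t["id"] for t in kept])
```

Matching on the username as well as the tweet ID is what would pull in retweets of replies, assuming their text still carries the handle.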
One reason for doing this is as another test of the method: do graphs come out as they ought for a predictable data-set? Another is to see – and record – the visual pattern that a bad ratio produces in a graph. I’ll recognise it when I see it again as part of a larger graph, and have a template from a typical instance.
I expected to see a slew of red and yellow connections (replies and mentions respectively) going to the most prominent node on the map, and a smaller number of blue edges (retweets).
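To make that expectation concrete, here is a rough sketch of how incoming connections to one node could be tallied by type. The edge tuples, node names and function name are hypothetical. The colours are the ones named above, and quote edges are left out of the colour table because the text does not give them one.

```python
from collections import Counter

# Colours as described in the text; no colour is stated for quote edges.
EDGE_COLOURS = {"reply": "red", "mention": "yellow", "retweet": "blue"}

def incoming_edge_counts(edges, node):
    """Count edges pointing at `node`, grouped by kind of connection."""
    return Counter(kind for _source, target, kind in edges if target == node)

# Made-up edge list: (source account, target account, kind).
edges = [
    ("user_a", "hub", "reply"),
    ("user_b", "hub", "reply"),
    ("user_c", "hub", "mention"),
    ("user_d", "hub", "retweet"),
    ("user_e", "user_a", "retweet"),  # a retweet of a reply, not of the hub
]
counts = incoming_edge_counts(edges, "hub")
for kind, n in counts.most_common():
    print(kind, EDGE_COLOURS.get(kind), n)
```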
I learned that the algorithm that spaces the nodes in the graph places retweets closer to the connected node than replies, quotes and mentions. This makes sense, since nodes with these other kinds of connections are more likely to have multiple connections (a reply may be to more than one account*, and a tweet may mention more than one). A node with multiple connections tends to sit further from any node it’s connected to than a node with a single connection.
What I didn’t anticipate is how many of the replies to the tweet with a terrible ratio have an excellent ratio. These retweets of replies – forming the unexpected blue-dominated portion of the graph – are collectively another measure of the poor reception of the tweet. This isn’t easily quantifiable in Twitter’s interface but with the full data-set it is: it includes 13325 retweets, 650 of which were of the original tweet.
It’s a nice sting in the tail for the ratio that replies were retweeted 12675 times: nearly twenty retweets (19.5, to be exact) for every one retweet of Byron Tau’s unpopular post.
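The arithmetic behind those figures is simple enough to check in a few lines. The function name is just for illustration; the two input numbers are the ones reported above.

```python
def reply_retweet_ratio(total_retweets, original_retweets):
    """Return (retweets of replies, retweets of replies per retweet of the original)."""
    reply_retweets = total_retweets - original_retweets
    return reply_retweets, reply_retweets / original_retweets

reply_retweets, ratio = reply_retweet_ratio(13325, 650)
print(reply_retweets, ratio)  # 12675 19.5
```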
When a data-set is centred on one post, it’s a straightforward task to pick out that tweet and the most prominent responses to it for semantic representation, as mapped below.
*A current limitation of the code that prepares the graph files is that at most six mentioned accounts and ten replied-to accounts are counted per tweet. Sometimes, though, tweets are replies to thirty or forty other accounts, or there may be a dozen accounts mentioned in the body of the tweet. There’s a report on this when the code runs, so I know that I typically lose accounts ‘replied to’ in a small percentage of tweets (2-5%).
I’ll have to remedy this: it’s annoying to lose any relevant data, it feels like misrepresentation to leave anything out, and one reason for writing my own code to process the data table of extracted tweets was to capture dialogue, in the belief that enormous threads with hundreds or thousands of replies are some of the most significant parts of Twitter argument as a whole.
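For illustration, the kind of cap and run-time report described in this footnote might look something like the sketch below. It is a guess at the shape of the logic, not the actual code: the field names (`mentions`, `replied_to`), the function names and the sample data are hypothetical, and only the two limits come from the footnote.

```python
MAX_MENTIONS = 6   # limits as stated in the footnote
MAX_REPLY_TO = 10

def cap_accounts(tweets, key, limit):
    """Truncate each tweet's account list at `limit`.

    Returns the capped lists and the number of tweets that lost accounts.
    """
    capped, truncated = [], 0
    for t in tweets:
        accounts = t.get(key, [])
        if len(accounts) > limit:
            truncated += 1
        capped.append(accounts[:limit])
    return capped, truncated

def truncation_report(tweets, key, limit):
    """Summarise how many tweets exceed the cap, as a one-line report."""
    _, truncated = cap_accounts(tweets, key, limit)
    share = 100 * truncated / len(tweets) if tweets else 0.0
    return f"{truncated} of {len(tweets)} tweets ({share:.1f}%) exceed {limit} accounts in '{key}'"

# Made-up data: one tweet replying to 12 accounts, one replying to 2.
sample = [
    {"replied_to": [f"user_{i}" for i in range(12)], "mentions": ["a"]},
    {"replied_to": ["x", "y"], "mentions": []},
]
capped, lost = cap_accounts(sample, "replied_to", MAX_REPLY_TO)
print(truncation_report(sample, "replied_to", MAX_REPLY_TO))
```

Dropping the slice (or raising the limits) would keep every account, at the cost of more edges per node in the graph.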