Battlefield Recovery, an execrable show that turns the looting of war dead into ‘entertainment’, was shown on Saturday on Channel 5 in the UK. I won’t dignify it by linking to it; instead see this article in the Guardian.
I wondered, however, what the tweeting public thought about the show – keeping in mind that Channel 5 viewers may or may not be the same kinds of folks who engage with Twitter. I used Ed Summers’ TWARC to collect approximately 3600 tweets (there are likely many more, but the system timed out). The file containing the IDs of all of these tweets is available here. You can use this file in conjunction with TWARC to recover all of the tweets and their associated metadata for yourself (approximately 19 MB of text). You can explore the language of the tweets for yourself via Voyant-Tools.
- about 1000 of the tweets are unique tweets
- the remaining 2600 tweets are retweets
- number 1 retweet with 96 rts at the time of collection: https://twitter.com/GERArmyResearch/status/685458043220439040
- number 2 retweet with 76 rts at the time of collection: https://twitter.com/sommecourt/status/685901048024809474
- number 3 retweet with 63 rts at the time of collection: https://twitter.com/archaeologyuk/status/685435904400470016
- 4th, with 56: https://twitter.com/typejunky/status/685915116089544704
- 5th, with nearly 50: https://twitter.com/arranjohnson/status/685919779803271169 … the sheer danger the show’s participants put themselves in is astonishing.
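The unique/retweet split above can be read straight off the hydrated data: TWARC writes one tweet per line as JSON, and a native retweet carries a `retweeted_status` payload. A minimal Python sketch of that count – on toy stand-in data, not the actual 3600-tweet collection:

```python
import json

def count_retweets(lines):
    """Split a stream of tweet JSON objects (one per line, as twarc
    emits them) into originals and retweets. A tweet is a native
    retweet if it carries a 'retweeted_status' payload."""
    originals, retweets = 0, 0
    for line in lines:
        tweet = json.loads(line)
        if "retweeted_status" in tweet:
            retweets += 1
        else:
            originals += 1
    return originals, retweets

# Toy stand-ins for the collected tweets (invented for illustration):
sample = [
    json.dumps({"id": 1, "text": "appalled that this show aired"}),
    json.dumps({"id": 2, "text": "RT ...", "retweeted_status": {"id": 1}}),
    json.dumps({"id": 3, "text": "RT ...", "retweeted_status": {"id": 1}}),
]
print(count_retweets(sample))  # → (1, 2)
```

Run over the real hydrated file, this is what yields the roughly 1000 unique tweets versus 2600 retweets reported above.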
So the most retweeted interventions show a pretty strong signal of disapproval. I have not looked into users’ profiles to see whether or not folks identify as archaeologists. Nor have I mapped users’ networks to see how far these messages percolated, and into what kinds of communities. This is entirely possible to do of course, but this post just represents a first pass at the data.
Let’s look at the patterns of language in the corpus of tweets as a whole. I used the LDAvis package for R to create an interactive visualization of topics within the corpus, fitting the model to 20 topics as a first stab. You can play with the visualization here. If you haven’t encountered topic modeling yet, it’s a technique that reverse engineers a corpus into the initial ‘topics’ from which the writers wrote (or could have written). So it’s worth pointing out that it’s not ‘truth’ we’re seeing here, but a kind of intellectual thought exercise: if there were 20 topics that captured the variety of discourse expressed in these tweets, what would they look like? The answer is: quite a lot of outrage, dismay, and disappointment that this TV show was aired. Look in particular at, say, topic 8 or topic 3, and ‘disgust’. Topic 1, which accounts for the largest slice of the corpus, clearly shows how the discussants on Twitter were unpacking the rebranding of this show from its previous incarnation as ‘Nazi War Diggers’, and the pointed comments directed at Clearstory UK, the producers of Battlefield Recovery.
We can also look at patterns in the corpus from the point of view of individual words, imagining the interrelationships of word use as a kind of spatial map (see Ben Schmidt, Word Embeddings). If you give it a word – or a list of words – the approach will return to you words that are close in terms of their use. It’s a complementary approach to topic models. So, I wanted to see what terms were in the same vector as the name of the show & its producers (I’m using R). I give it this:
```r
library(wordVectors)  # Ben Schmidt's package: nearest_to(), filter_to_rownames()

# the 150 words closest to the combined vector of the show's names
some_terms <- nearest_to(model, model[[c("battlefieldrecovery", "naziwardiggers", "clearstoryuks")]], 150)
plot(filter_to_rownames(model, names(some_terms)))
```
And I see the interrelationships like so:
…a pretty clear statement about what 3600 tweets felt, in aggregate, along this particular vector. Of the tweets I saw personally (I follow a lot of archaeologists), there was unequivocal agreement that what this show was doing was no better than looting. With word vectors, I can explore the space between pairs of binaries. So let’s assume that ‘archaeologist’ and ‘looter’ are opposite ends of a spectrum. I can plot this using this code:
```r
library(dplyr)    # %>% and filter()
library(ggplot2)

# a vector pointing from 'looters' toward 'archaeologists'
actor_vector <- model[["archaeologists"]] - model[["looters"]]

# score every word in the model by cosine similarity to that axis
word_scores <- data.frame(word = rownames(model))
word_scores$actor_score <- model %>% cosineSimilarity(actor_vector) %>% as.vector

# plot only the words with the strongest skew, coloured by direction
ggplot(word_scores %>% filter(abs(actor_score) > .725)) +
  geom_bar(aes(y = actor_score, x = reorder(word, actor_score), fill = actor_score < 0),
           stat = "identity") +
  coord_flip() +
  scale_fill_discrete("words associated with", labels = c("archaeologist", "looter")) +
  labs(title = "The words showing the strongest skew along the archaeologist-looter binary")
```
which gives us:
You can see some individual usernames in there; to be clear, this isn’t equating those individuals with ‘archaeologist’ or ‘looter’. Rather, tweets mentioning those individuals tend to be retweets of them, or the individuals themselves are using this language and discussing these particular aspects of the show. I’m at a loss to explain ‘muppets’; perhaps it’s a term of derision.
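For the curious, the machinery behind both queries above boils down to cosine similarity against an axis built by vector subtraction. A toy Python sketch, with invented 3-d vectors standing in for the real embedding model trained on the tweets:

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1 = same direction."""
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# Invented 3-d embeddings for illustration; the real model was trained
# on the tweet corpus with the wordVectors package in R.
vectors = {
    "archaeologists": np.array([1.0, 0.1, 0.0]),
    "looters":        np.array([-1.0, 0.2, 0.1]),
    "recording":      np.array([0.9, 0.0, 0.2]),
    "selling":        np.array([-0.8, 0.1, 0.0]),
}

# The archaeologist-looter axis: a difference vector, as in the R code.
axis = vectors["archaeologists"] - vectors["looters"]

# Score every word against the axis; positive scores skew toward
# 'archaeologists', negative toward 'looters'.
scores = {w: float(cosine_similarity(v, axis)) for w, v in vectors.items()}
print(sorted(scores, key=scores.get, reverse=True))
```

The bar chart above is just this scoring applied to the whole vocabulary, keeping only the words with the strongest skew in either direction.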
So, as far as this analysis goes – and one ought really to map how far and into what communities these messages penetrate – I’d say on balance, the twittersphere was outraged at this television ‘show’. As Nick said,