Text re-use in Instagram posts selling human remains

Lincoln Mullen has a wonderful R package on rOpenSci for detecting and measuring text reuse in a corpus of material (the kind of thing that is enormously useful if you’re interested in 19th-century print culture, for instance). I wondered to myself what I would find if I fed it the corpus of material I’ve collected (see this gist) concerning the trade in human remains on Instagram. (It’s looking for ngrams 5 words long, which means that I end up looking at 3k posts from my initial corpus of 13k.) We’re writing all of this up for submission shortly, so …
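To make the 5-word ngram idea concrete, here is a minimal sketch in Python. It is not the R package's implementation, just an illustration under my own assumptions: posts are split on whitespace, every run of five consecutive words becomes a "shingle", and two posts are compared by the Jaccard overlap of their shingle sets. The function names and the two sample posts are made up for the example. Note that a post shorter than five words yields no shingles at all, which is one plausible reason a corpus shrinks when you filter this way.

```python
def shingles(text, n=5):
    """Return the set of n-word shingles (word n-grams) in a text."""
    words = text.lower().split()
    # A text with fewer than n words produces an empty set.
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: shared items over all items."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Two invented posts, purely for illustration.
post_a = "real human skull for sale message me for price and shipping"
post_b = "antique real human skull for sale message me for details"

shared = shingles(post_a) & shingles(post_b)
score = jaccard(shingles(post_a), shingles(post_b))
print(len(shared), round(score, 3))
```

A score near 1 means two posts are nearly the same text; a score of 0 means they share no run of five words.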

Workshop on Networks & Simulation for the Humanities – Nov 9, Discovery Centre Macodrum Library

Carleton University, Ottawa, Macodrum Library Discovery Centre RM 481, 11 – 2

Understanding the complexity of past and present societies is a challenge across the humanities. Simulation and network science provide computational tools for confronting these problems. This workshop will …

Getting Data out of Open Context & Doing Useful Things With It: Coda

Previously, on tips to get stuff out of Open Context…

In part 1, I showed you how to generate a list of URLs that you could then feed into `wget` to download information. In part 2, I showed you how to use `jq` and `jqplay` – via the amazing Matthew Lincoln, from whom I’ve learned whatever small things I know about the subject – to examine the data and to filter it for exactly what you want.

Today – combining wget & jq

Today, we use wget to pipe the material through jq to get the csv of your dreams. …
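If you would rather see the same json-to-csv step outside the shell, here is a rough Python sketch of the idea. It swaps jq for Python's standard library, so it is not the wget and jq pipeline from the post. The record structure and the field names (`uri`, `label`, `extra`) are hypothetical stand-ins, not Open Context's actual schema; you would substitute whatever fields you found while exploring the json.

```python
import csv
import io
import json


def records_to_csv(json_text, fields):
    """Keep only the chosen fields from a JSON list of records, as CSV text."""
    records = json.loads(json_text)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    for record in records:
        # Missing fields become empty cells; unlisted fields are dropped.
        writer.writerow({field: record.get(field, "") for field in fields})
    return buffer.getvalue()


# Hypothetical sample data standing in for a downloaded json file.
sample = json.dumps([
    {"uri": "http://example.org/item/1", "label": "Bone 1", "extra": 1},
    {"uri": "http://example.org/item/2", "label": "Bone 2"},
])

csv_text = records_to_csv(sample, ["uri", "label"])
print(csv_text)
```

The design choice is the same as with a jq filter: name the fields you want up front, and let everything else fall away.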

Getting Data out of Open Context & Doing Useful Things With It: Part 2

If you recall, at the end of part 1 I said ‘oh, by the way, Open Context lets you download data as csv anyway’. You might have gotten frustrated with me there – why are we bothering with the json then? The reason is that the full data is exposed via json, and who knows, there might be things in there that you find you need, or that catch your interest, or need to be explored further. (Note also, Open Context has unique URIs (identifiers) for every piece of data they have; these unique URIs are captured in the …