New paper out: Towards a Method for Discerning Sources of Supply within the Human Remains Trade via Patterns of Visual Dissimilarity and Computer Vision

We have a new paper out:

Graham, S., Lane, A., Huffer, D. and Angourakis, A., 2020. Towards a Method for Discerning Sources of Supply within the Human Remains Trade via Patterns of Visual Dissimilarity and Computer Vision. Journal of Computer Applications in Archaeology, 3(1), pp.253–268. DOI:

“While traders of human remains on Instagram will give some indication, their best estimate, or repeat hearsay, regarding the geographic origin or provenance of the remains, how can we assess the veracity of these claims when we cannot physically examine the remains? A novel image analysis using convolutional neural networks in a one-shot learning architecture with a triplet loss function is used to develop a range of ‘distances’ to known ‘reference’ images for a group of skulls with known provenances and a group of images of skulls from social media posts. Comparing the two groups enables us to predict a broad geographic ‘ancestry’ for any given skull depicted, using a mixture discriminant analysis, as well as a machine-learning model, on the image dissimilarity scores. It thus seems possible to assign, in broad strokes, that a particular skull has a particular geographic ancestry. ”
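To make the 'distances to reference images' idea a little more concrete, here is a minimal sketch in Python. It is not the code from the paper: the short vectors standing in for image embeddings are invented for illustration (a real network would produce much longer ones), and `triplet_loss` and `dissimilarity_profile` are hypothetical names. The loss shown is one common form of the triplet loss; the profile is simply the set of distances from one image to each reference image.

```python
import math


def euclidean(a, b):
    """Straight-line distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """One common form of the triplet loss.

    The loss is zero once the anchor sits closer to the positive example
    than to the negative example by at least `margin`.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def dissimilarity_profile(embedding, references):
    """Distances from one image embedding to every named reference embedding.

    `references` maps a (hypothetical) reference label to its embedding.
    """
    return {name: euclidean(embedding, ref) for name, ref in references.items()}
```

Two images with similar profiles differ from the reference set in similar ways, and it is those patterns of dissimilarity that get passed on to the statistical models.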

Our code is at

The key idea: a one-shot neural network can be used to measure the web of differences in carefully selected social media images (backgrounds removed) of human skulls. Patterns of similar *dissimilarities* can then be compared with osteological or forensic materials and then we can look at what vendors say about the remains. We find that the stories told are often dubious. The web of differences also seems to imply that Indigenous North American human remains are being traded, but not labelled as such. While bonetraders will be quick to point out that ‘buying human skulls is legal’ (and we’ll write more about that in due course), trading in Indigenous human remains gets into NAGPRA territory & it’s most definitely illegal (US):



Zettelkasten to Online Wiki

I was never taught how to take notes. Periodically, I try to develop better habits. I’ll go read various blogs, forums, product pages, looking for the thing that’ll make everything come together, make my reading more effective, make my thinking so much sharper…


Some time ago I bought an iPad (‘it’ll be for research! honestly! for pdfs!’) and still my reading/note taking didn’t come together. I have Liquid Text on it and PDF Viewer. Liquid Text is pretty neat… but I find its ability to pull multiple pdfs together and all of its note taking, connecting just don’t work for me – on an iPad. Part of the problem is that I wasn’t using it the way its designers imagined a person might use it. (Apparently, it’s now available on Microsoft devices and in that context I think it would really work for me). The other part was, well, probably a discipline thing. Or lack thereof.

PDF Viewer is a nice little app for reading pdfs, and when I tied it to Zotero with zotfile… now we’re talking! Got my notes back on my writing machine, so headway.


In the past, for various projects, I’ve tried the whole one-idea-per-card note taking system called ‘Zettelkasten’. Combine that with an editor that does search and creation at the same time (like nvAlt), and I actually got kinda good at pulling stuff out and framing searches, finding connections between my notes. It’s a bit like ‘commonplace books’, at least the way I’ve been using ’em. I’ve also been thinking of these in the context of open notebook science, reproducibility and that sort of thing – Caleb McDaniel put it best:

“The truth is that we often don’t realize the value of what we have until someone else sees it. By inviting others to see our work in progress, we also open new avenues of interpretation, uncover new linkages between things we would otherwise have persisted in seeing as unconnected, and create new opportunities for collaboration with fellow travelers. These things might still happen through the sharing of our notebooks after publication, but imagine how our publications might be enriched and improved if we lifted our gems to the sunlight before we decided which ones to set and which ones to discard? What new flashes in the pan might we find if we sifted through our sources in the company of others?”

He used an open notebook powered by Gitit to write his book Sweet Taste of Liberty and it won a Pulitzer Prize! (Caleb’s original open notebook).

So how do I put these ‘zettels’ online? ‘The Archive’ is a nice little bit of software, developed on top of nvAlt, and I like how it works. I have it saving each note as an md file into a git repository on my machine. I push these things to a github repo. Now, there are plenty of static site generators that will turn a collection of markdown into a static website, but collaboration on the underlying files is still an iffy process. I spun up a wiki.js instance on Reclaim Cloud and then figured out how to connect it to the github repo (thread here).

I am now the proud owner of a wiki that my students can edit, so they can collaborate with me on some of my larger projects (they can just use the web interface, which is nice, no faffing about); whenever I git pull I have their research to hand in my preferred note taking app; whenever I push they get my stuff. And our research is out there in the open.


– configure storage to grab from github using https, not ssh

– spaces in file names will break the import/export

– set up a metadata template in ‘The Archive’ so that notes will render nicely there.

HIST3000|CLCV3000 Introduction to Digital Archaeology – Trailers!

I started scratching out ideas for what this ‘intro to digital archaeology’ class might look like as I taught my early summer course, ‘Crafting Digital History.’ Scratches became mindmaps and random scraps of paper and orphaned text files. One thing that I found really worked well with the DH course was that it had a regular beat to it. Each week, the rhythm and routine was the same, although within that there was a lot of choice about what to do and how to approach it. I want to preserve that for the digiarch class; I also want to provide more signposts along the way, so I’m planning to seed the readings with my own annotations using; I also saw someone on Twitter mention that they might embed short wee videos of themselves speaking about each reading, in the reading via annotation and I thought, ‘my god, that’s brilliant’ and so I’ll give that a try too. I have the link to the tweet somewhere, just not here as I write.

Anyway, in the interests of providing more structure and more presence, I’ve also been building trailers for the course and the modules within it. Making these has helped narrow down what it is I want to do; you can’t touch on everything, so you’d better go deep rather than wide. Without further ado…

and a bit about me…

Elegy for George Floyd

Today is the funeral of George Floyd, the man murdered by police in Minneapolis. Since his death, other instances of police brutality as the police riot have been collated in various places; one reckoning has over 400 instances (link here, kept by Greg Doucette, and just the ones that have been shared on Twitter!).

We – Andrew Reinhard and myself – wanted to honour George Floyd, and so we composed ‘Elegy for George Floyd’, a data composition built from sonifying the data in that spreadsheet and then remixing the results.

As you listen, you will hear a trumpet (police siren / police action) that waxes and wanes with the brutality of the action recorded. The reports for each incident were loaded into Voyant-Tools, where they were reorganized by the most common terms. Each word was then replaced in the report by its count; then all of the scores for each report were added up. This index value was then mapped against four octaves in D# minor, a key that invokes “…Feelings of the anxiety of the soul’s deepest distress, of brooding despair, of blackest depression, of the most gloomy condition of the soul. Every fear, every hesitation of the shuddering heart, breathes out of horrible D# minor. If ghosts could speak, their speech would approximate this key.” (source). These reports are scored into the music twice – one voice in whole notes, a second voice in arpeggiated chords to reflect the sirens and chaos of the police brutality.

Each city’s latitude and longitude and the cumulative report number were converted into chords and bass line.
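For anyone who wants to follow the scoring step, here is a rough sketch of it in Python. It only imitates what we did by hand with Voyant-Tools and the sonification app: the function names are hypothetical, the whitespace tokenizing is cruder than Voyant's, and the choice of MIDI note 39 (D#2) as the root, the natural minor scale, and the linear scaling are all assumptions made for illustration.

```python
from collections import Counter

# Semitone steps of a natural minor scale (assumed; harmonic minor would differ).
MINOR_STEPS = [0, 2, 3, 5, 7, 8, 10]
ROOT = 39  # MIDI number for D#2 (assumed starting point)


def build_scale(root=ROOT, octaves=4):
    """All notes of the minor scale across the requested number of octaves."""
    return [root + 12 * octave + step for octave in range(octaves) for step in MINOR_STEPS]


def score_reports(reports):
    """Replace each word by its count across all reports, then sum per report."""
    tokens = [report.lower().split() for report in reports]
    counts = Counter(word for words in tokens for word in words)
    return [sum(counts[word] for word in words) for words in tokens]


def scores_to_notes(scores, scale):
    """Spread the scores linearly over the scale, lowest score to lowest note."""
    low, high = min(scores), max(scores)
    if high == low:
        return [scale[0]] * len(scores)
    top = len(scale) - 1
    return [scale[round((score - low) / (high - low) * top)] for score in scores]
```

Reports that use more of the commonly repeated words score higher and so sound higher in the scale.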

The resulting sonification was then remixed, with an 808 bass line added. The 808 runs throughout the entire song, the heartbeat of George Floyd that abruptly stops at 8:46. It contrasts with the intrusive double-bass of the police line generated in the original sonification. The crescendos of all of the data tracks reflect clashes with the police. Towards the end of the song, there are instances (and then a full minute) of tracks playing backwards, which reflects how upside-down things have become.

The remixed piece is at 90 bpm which we feel adds to the gravitas of the work; it is unsettling and sad, but yet, even now, contains beauty and hope.

With respect, we offer this piece in that spirit.

Our original tracks are available at We invite you to remix and recompose your own version.

We are uploading the piece to iTunes, and any monies it might earn will be donated to #blm.

Archstopmo: the 2020 edition

How lovely that, with everything going on, some folks found the time to try their hand at archaeological stop motion! Let’s watch some films:

Buried Ship

Abby and Maggie Mullen write,

“We created this film because we like Vikings and we wanted to make something about the ocean. (Our team likes a lot of different archaeological sites, so we went through a lot of ideas before landing on this one!) We found that the two-minute limitation made it both easier and more challenging, because it’s difficult to communicate a complicated story in two minutes, with Legos, but that helped us narrow down our topic.

Our process started with research about different archaeological sites, and when we found two stories about different Viking ships found with GPR, we decided it could be fun to try to view the site from both above the ground and below it.

Our set designer painted our backdrops in watercolor and built the sets in Lego. We had to adjust the scale of our Lego models multiple times, which she built, to make our photography work. We weren’t 100% successful, but an 8yo’s attention span is limited and we can’t exactly run out to the store right now to get more supplies.

We used an iPhone to take the photographs. We set it up on a tripod with a remote shutter to make it easier to keep it mostly in the same place. We then transferred our photos to a MacBook Pro and put the photos into iMovie to create the stop-motion. Our “silent film” text slides were created in PowerPoint, and we used a song from the YouTube Studio free music collection for our soundtrack.”

Comments on Youtube include, “I really liked this! It was so interesting AND beautiful. Really well done. It made me want to learn more!” and “Great information! I did not know that Viking ships had been found so recently from so long ago. I greatly enjoyed the scene settings and photography. The accompanying music was excellent.”

The Venus of Willendorf: an archaeological yarn

Karen Miller writes,

“As a traditional women’s craft, crochet is an apt sculptural method to recreate an iconic archaeological artefact that evokes the beauty of the female body. I was excited to find the pattern at Lady Crafthole’s ‘Cabinet of Crochet Curiosities’. I filmed it on an iPad with the Stop Motion Studio app and added the title and credits in iMovie.”

Archaeological Tea-construction

Beth Pruitt writes,

“This video is about methodological theory in archaeology, created for SAA’s Online Archaeology Week after the cancellation of the planned Austin Public Archaeology Day at the 2020 SAA Annual Meeting. Through observing the attributes of the rim sherd (its curvature, decoration, etc.), archaeologists can make inferences about the rest of the whole, even when pieces remain missing. This is based on an in-person activity that I do at public archaeology events to help visitors understand laboratory methods and induction. I used the app Stop Motion Studio for taking the frame photos and strung them together in the Windows 10 Photos app. I drew the animated overlays frame-by-frame in Inkscape.”

Jury Prizes

  • To Maggie and Abby Mullen, in the ‘Story of a Site’ category
  • To Karen Miller, in the ‘Biography of an Object’ category
  • To Beth Pruitt, in the ‘Archaeological Theory’ category

Best Overall and Choix du Peuple

To be announced May 4th! Make your votes on the Choix du Peuple:

Tuesday May 5th:

And with the polls closed, looks like ‘Tea-Construction’ is the Choix du Peuple!

Searching Inside PDFs from the Terminal Prompt

I have reason, today, to want to search the Military Law Review. If you know which issue the info you’re looking for is located in, then you can just jump right in.

When do we ever know that? There’s no search-inside feature. So we’ll build one ourselves. After a bit of futzing, you can see that all of the pdfs are available in this one directory:


$ wget -A .pdf

should just download them all directly. But it doesn’t. However, you can copy the source html to a text editor, and with a bit of regex you end up with a file with just the paths directly to the pdf.  Pass that file as -i urls.txt to wget, and you end up with a corpus of materials.
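If regex in a text editor isn't your thing, the same list of links can be pulled out with a few lines of Python. This is a sketch that swaps my regex for the standard library's `html.parser`; `PdfLinkParser` and `pdf_links` are names I've made up, and the base URL you pass in is whatever directory page you saved the HTML from.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class PdfLinkParser(HTMLParser):
    """Collects the href of every <a> tag that points at a .pdf file."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href and href.lower().endswith(".pdf"):
            # Resolve relative paths against the page the HTML came from.
            self.links.append(urljoin(self.base_url, href))


def pdf_links(html, base_url):
    """Return the absolute URL of each pdf linked from the given HTML."""
    parser = PdfLinkParser(base_url)
    parser.feed(html)
    return parser.links
```

Write the returned list to urls.txt, one URL per line, and hand that file to wget as before.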

How do we search inside? This question on Stackoverflow will help us out.  But it requires pdftotext to be installed. Sigh. Always dependencies! So, following this, here we go.

On the command line (with Anaconda installed):

conda create -n envname python=3.7
conda activate envname
conda config --add channels conda-forge
conda install poppler

The pdfs are in a folder called ‘MLR’ on my machine. From one level up:

$ find ./MLR -name '*.pdf' -exec sh -c 'pdftotext "$1" - | grep --with-filename --label="$1" --color "trophies"' _ {} \;

et voila!

a quick thought on library JSON

Read this post today:

which seems very cool. In Tom’s #update 1, he points to a parser that one of his readers wrote for this imagined spec, and if you format your books according to Tom’s spec, and point the parser at the file, you get this really cool interface on your materials: see .

Anyway, the thought occurred that the ruby script that inukshuk wrote with regard to my query about adding materials to tropy notes in json (full thread here, .rb file here ) could easily be modified to produce the json from simple lists in txt.

So I might fart around with that.
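Not the ruby script, but here's roughly the shape of the idea in Python: a sketch that turns a plain text list, one book per line, into json. The field names (`title`, `author`) and the `|` separator are placeholders of my own, not Tom's spec, so they would need to be swapped for whatever the spec actually asks for.

```python
import json


def books_to_json(text, fields=("title", "author")):
    """Turn 'Title | Author' lines into a JSON array of objects.

    `fields` names the columns in order; both the names and the '|'
    separator are illustrative assumptions, not part of any real spec.
    """
    books = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between entries
        parts = [part.strip() for part in line.split("|")]
        books.append(dict(zip(fields, parts)))
    return json.dumps(books, indent=2, ensure_ascii=False)
```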


Archstopmo: An Archaeology Stop Motion Movie Festival!

April 5 – 30th, with winners revealed May 4th.

Let’s have a movie festival. Also, I like stop-motion films – y’know, like the Wallace and Gromit films (and here’s an archaeological one I just found on youtube using playmobil!). So here’s what you do –


  1. Make a stop motion film on one of the following themes:
    • a. archaeological theory
    • b. history of archaeology
    • c. the story of a site
    • d. the story of a dig
    • e. the biography of an object
  2. Make it two minutes in length
  3. Can be lego, clay, paper cut outs, whatever you like
  4. Upload to youtube
  5. Tweet it out with #archstopmo
  6. Check out twitter under the ‘archstopmo’ hashtag
  7. Prepare an artist’s statement that explains and contextualizes your work, as well as the software you’ve used, and your process.
  8. Submit your film at the form below.
  9. Have fun!

There will be a film gallery which will be updated frequently with entries and links to the artists’ statements.


Prizes (there are no prizes, only glory!) will be selected by a panel of judges, plus one audience choice.

  • Best in each category – five prizes
  • Choix des Judges – best overall
  • Choix du Peuple – best by voting

Submit Your Work


Featured image by Adi Suryanata  on unsplash

Ah, I See You Have A Policy: A Screenshot Essay on the Trade in Human Remains

Warning: There are many photographs of human remains in this post.

There is a literature on the online trade in human remains going back to at least Huxley and Finnegan’s 2004 piece on eBay in the Journal of Forensic Sciences, and since then, several academics have been active in discussing the ethical, moral, and legal dimensions of this trade, producing a steady stream of articles. At the same time, the trade was transformed by the merging of social media with marketplace and ad-driven revenue models, expanding in scope and reach. Several platforms, over the last decade, have added wording to their prohibited categories of goods that deals with human remains. Let’s walk through some of that.

I found a copy of the World Archaeological Congress 2010 Newsletter in the Internet Archive, with this one line describing a human skull seen on Etsy, and WAC’s successful request to Etsy to remove the post.

The post was not in fact removed. And can still be found online.

It sold in 2011. What’s etsy’s stance on human remains, anyway?

Etsy’s current policy on human remains. Such as it is. Human remains were added to the prohibited list in 2012.

The seller from 2010, still active, using a different skull as a prop. Still selling human remains, now points people towards her Facebook page, and since Etsy banned human remains, wants you to send private messages if you’re interested. Facebook’s good for that sort of thing, eh? Private messaging, I mean.

Facebook says no human body parts or fluids.

But here’s a Facebook store selling…. human remains.

We are not surprised to find human remains on Facebook. After all, Facebook owns Instagram, and there are any number of posts there selling human remains. Including this one. But wait, is that an Amazon box? Does Amazon have a human remains policy?

Yes, yes they do. And it seems a bit contradictory. And unenforced.

And it is trivial to find human remains being sold on Amazon. Like this skull. Displayed sideways, since the photo was taken with the seller’s cellphone.

Since I’m on, you might see advertisements interspersed in this essay. It will be interesting to see which advertisements WordPress matches to this post; it might even be hard to see the difference between those ads and these screencaptures.

eBay, 2012: [the policy prohibits] “humans, the human body, or any human body parts” but expressly permits “clean, articulated (jointed), non-Native American skulls and skeletons used for medical research.” (Marsh, 2012, HuffPost). Today?

It was on eBay that we all (the archaeological ‘we’) first twigged that human remains selling online was lucrative and booming. While their policy has changed over the years, the policy is now admirably lucid and succinct. Did this tighter, stronger, policy have any impact?

It is possible to find the ruins and remains of specialist eBay aggregator sites like this one in the Internet Archive. I spent quite a lot of time tracking as many of these down as I could, teasing out which posts were actually for human remains, and which ones were replicas or adjacent materials, and scraping the data, plotting it over time.

And I see three phases here. An early phase where there was a lot of money happening (remember, these values are approximate indications rather than absolute totals. They give us a sense of the trend rather than the exact dollar number); a phase where language is suddenly cagey about what precisely is being sold (the stand? or the skull? Remember the earlier wishy-washy policy of 2012?), and the volume drops; and then, from July 2016: eBay bans human remains outright. And human remains drop out of the aggregators completely. The ban – to judge from these numbers – worked. Graphs and underlying research: Graham, forthcoming.

Have we accomplished anything? eBay certainly has, I think, and that’s worth thinking about.  Perhaps an auction site where sales are also dependent on reputation responds better to moral suasion than the other platforms. When is it in a platform’s best interest to actually police its own policies?

Human remains are in a nebulous zone, legally. In Canada, the law to my mind seems pretty clear:

Section 182.B seems to cover it. These materials are human beings. Buying and selling humans interferes -at the very least!- with human dignity. I’m no lawyer, and I don’t think this has ever been tested in court. But: If a platform profits from a user’s breaking of the platform’s very own policies on human remains, if a platform turns a blind eye, is the platform not condoning the trade? Is this not a nudge-nudge wink-wink tacit approval of the trade? Who should want to invest in a platform that makes money from selling human beings? Should we not hold such a platform accountable?

See ACCO for more on various illicit and illegal trades happening across social media. For more on our project studying the trade in human remains, see

Posts referred to have also been saved to the Internet Archive.

a note on git-lfs

Sometimes, I have files that are larger than github’s 100 mb limit. So here’s what you need to do.

brew install git-lfs
brew upgrade git-lfs
git lfs install   # one-time setup: registers the lfs filters in your git config

Start a new git repository, and then make sure git large file storage (git lfs) is tracking the large file. For instance, I just moved a topic model visualization to a repo on github (20,000 archaeological journal articles). It has a data csv that is 135 mb. So I made a new repo on github, but didn’t initialize it on the website. Instead, after getting git-lfs installed on my machine:

git init
git lfs track "20000/data/topic_words.csv"
git add .gitattributes 20000/data/topic_words.csv
git commit -m "initial"
git add .
git commit -m "the rest"
git remote add origin
git push -u origin master
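If you're not sure which files in a folder will trip the limit, a quick check before the first commit saves some grief. Here's a small sketch in Python; `files_over_limit` is a made-up helper, and I'm treating the limit as 100 MiB.

```python
import os

# The per-file ceiling on github, treated here as 100 MiB (an assumption
# about how the 100 mb figure is counted).
LIMIT_BYTES = 100 * 1024 * 1024


def files_over_limit(root, limit=LIMIT_BYTES):
    """List files under `root` bigger than `limit`, as paths relative to `root`."""
    too_big = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Don't descend into git's own object store.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.getsize(path) > limit:
                too_big.append(os.path.relpath(path, root))
    return sorted(too_big)
```

Each path it returns is a candidate for `git lfs track` before you `git add` anything else.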

Making Nerdstep Music as Archaeological Enchantment, or, How do you Connect with People Who Lived 3000 Years Ago?

by Shawn Graham, Eric Kansa, Andrew Reinhard

What does data sound like?

Over the last few days, what began as a bit of a lark has transformed into something more profound and meaningful. We’d like to share it with you—not just the result, but also our process. And in what we’ve made, perhaps, we find a way of answering the title’s question: how do you connect with people who lived 3,000 years ago?

In the recent past, Shawn has become more and more interested in representing the patterns we might detect, at a distance, in the large collections of digital data that are becoming more and more available . . . using sound. Called ‘sonification’, this technique maps aspects of the information against things like timbre, scale, instrumentation, rhythm, and beats-per-minute to highlight aspects of the data that a visual representation might not pick up. It’s also partly about making something strange—we’ve become so used to visual representations of information that we don’t necessarily recognize the ways assumptions about it are encoded in the visual grammars of barcharts and graphs. By trying to represent historical information in sound, we have to think through all of those basic decisions and elaborate on their implications.

Last week, he was toying with mapping patterns of topics in publications from Scotland from the 18th and 19th centuries as sound, using an online app called ‘TwoTone’. He shared it on Twitter, and well, one thing led to another, and a conversation began between Shawn, Eric, and Andrew: What might archaeological data sound like?

Sing in me Muse, through thine API, of sherds and munsell colors, of stratigraphic relations, and of linked thesauri URIs!

—Eric Kansa

Get Some Data

First things first: get some data. Open Context (Eric’s pet project) carefully curates and publishes archaeological data from all over the world. He downloaded 38,000 rows of data from the excavations at the Etruscan site of Poggio Civitate (where, in a cosmic coincidence, Andrew attended field school in 1991) and began examining it for fields that could be usefully mapped to various sonic dimensions. Ultimately, it was too much data! While there are a variety of ways of performing a sonification (see Cristina Wood’s Songs of the Ottawa, for instance), TwoTone only accepts 2,000 rows. The data used therefore for this audio experiment was very simple—counts of objects from Poggio Civitate were rendered as arpeggiated piano lines over three octaves; average latitude and average longitude were calculated for each class of thing thereby making a chord, and then each class of thing had its own unique value. Shawn’s initial result of data-driven piano sonification can be listened to here.

The four original dimensions of the sonification appear above, mapped in TwoTone. The rising notes in the bottom track are the item type ids. All of the materials come from the same chronological period, thus listening (or viewing left-to-right) needed some sort of organizing principle. Whether or not it is the right principle is a matter of interpretation and debate.

Archaeology is a Remix

But what if an actual musician got a hold of these tracks? Andrew recently published a work called ‘Assemblage Theory’ where he remixed found digital music in order to explore ideas of archaeological assemblages.[1] Taking his experimentation in electronic dance music (EDM) a step beyond Assemblage Theory, he took Shawn’s four original tracks based on Eric’s 3,000-year-old data and began to play, iterating through a couple of versions, in a genre he calls ‘nerdstep’. He crafted a 5-minute piece that has movements isolating one of the four data threads, which sometimes crash together like waves of building data, yet are linked together. He opted for 120 bpm, a dance music standard, and then, noting where the waves of data subside into quiet pools, was inspired to write some lyrics. “The quiet segues are basically data reflexivity in audio form,” he says.

Data propagation
All this information
Gives me a reaction
Need time for reflection

A one-way conversation
This endless computation
Numbs me from sensation
Need time for reflection

Give me time to breathe
Give me time to think

Data raining down on me

Emotionally exhausting
How much will this cost me
I’m alone but you are watching
Look up from your screen

Give me time to breathe
Give me time to think
Look up from your screen.

Reinhard used the open source Audacity audio software application to create the song based on archaeological data sonification. The first four tracks are Shawn’s piano parts, staggered in such a way as to introduce the data bit-by-bit, and then merged with 16 other tracks—overburden or matrix. In the beginning, they are harmonious and in time, but because of subtle variations in bpm, by the time the song ends the data have become messy and frenetic, a reflection of the scattered pieces within the archaeological record, something that happens over time. Each movement in the song corresponds to an isolated data thread from one of Shawn’s piano parts, which then loops back in with the others to see how they relate.

Life is A Strange Loop

Speaking of loops, let’s think about the full loop we’ve encountered here. 3,000 years ago, at a plateau in the tufa landscape of southern Etruria, people lived their lives, only to have their debris carefully collected, studied, systematized, counted, digitized, and exposed online. No longer things but data, these counts and spaces were mapped to simple sonic dimensions using a web-toy, making a moderately pleasing experience. Remixed, the music moves us, enchants us, towards pausing and thinking through the material, the labour, the meanings, of a digital archaeology.[2] If/when this song is performed in a club (attn: John Schofield and the Theoretical Archaeology Groups [TAG] in both the UK and North America), the dancers would then be embodying our archaeological knowledge of Poggio in their movements, in the flows and subtle actions/reactions their bodies make across the floor. In dancing, we achieve a different kind of knowledge of the world, that reconnects us with the physicality of the world.[3] The eruptions of deep time into the present [4] – such as that encountered at an archaeological site – are weird and taxing and require a certain kind of trained imagination to engage with. But by turning the data into music, we let go of our authority over imagination, and let the dancers perform what they know.

For the three of us as creators, this playful sonification of data allows us to see archaeological material with fresh eyes . . . errrrrr ears . . . and by doing so restores the enchantment we once felt at the start of a new project, or of being interested in archaeology in the first place. Restoring the notion of wonder into three middle-aged archaeologists is no small feat, but the act of play enabled us to approach a wealth of artifacts from one site we know quite well, and realize that we didn’t know it quite like this. Using the new music bridges the gap between humans past and present and in dancing we (and hopefully you) embody the data we present. It’s a new connection to something old, and is experienced by bodies. This is perhaps almost as intoxicating as the work done by Patrick McGovern (U. Penn) and Sam Caglione (Dogfish Head) in their experimentation and creation of ancient ales, the first of which was “Midas Touch”, a surprisingly drinkable brew concocted from an ancient recipe recovered on excavation in Asia Minor. Archaeology is often a cerebral enterprise, which deserves—at times—a good ass-shaking derived from a driving beat.

I’m listening now and am amazed. It is really beautiful, not only as a finished product, but as a process that started with people who lived their lives almost 3000 years ago.

—Eric Kansa

Reflexivity, by KGR [5]


[1] Reinhard’s article, “Assemblage Theory: Recording the Archaeological Record,” and two responses by archaeologists Jolene Smith and Bill Caraher.

[2] An argument made by Perry, Sara. (2019). The Enchantment of the Archaeological Record. European Journal of Archaeology, 22(3), 354-371. doi:10.1017/eaa.2019.24

[3] See for instance Block, Betty, and Judith Kissel (2001). Dance: The Essence of Embodiment. Theoretical Medicine and Bioethics 22(1), 5-15. DOI: 10.1023/A:1009928504969

[4] Fredengren, Christina (2016). Unexpected Encounters with Deep Time Enchantment. Bog Bodies, Crannogs and ‘Otherworldly’ sites. The materializing powers of disjunctures in time. World Archaeology 48(4), 482-499, DOI: 10.1080/00438243.2016.1220327

[5]  Kansa-Graham-Reinhard (pronounced as either “Cager” or “Kegger”—the GIF-debate of archaeological nerdstep/nerdcore).


Block, Betty, and Judith Kissel (2001). Dance: The Essence of Embodiment. Theoretical Medicine and Bioethics 22(1), 5-15. DOI: 10.1023/A:1009928504969

Caraher, William. (2019). “Assemblage Theory: Recording the Archaeological Record: Second Response” Epoiesen

Fredengren, Christina (2016). Unexpected Encounters with Deep Time Enchantment. Bog Bodies, Crannogs and ‘Otherworldly’ sites. The materializing powers of disjunctures in time. World Archaeology 48(4), 482-499, DOI: 10.1080/00438243.2016.1220327

Perry, Sara. (2019). The Enchantment of the Archaeological Record. European Journal of Archaeology, 22(3), 354-371. doi:10.1017/eaa.2019.24

Reinhard, Andrew. (2019). “Assemblage Theory: Recording the Archaeological Record” Epoiesen

Smith, Jolene. (2019). “Assemblage Theory: Recording the Archaeological Record: First Response” Epoiesen

Tuck, Anthony (Ed.). (2012). “Murlo”. Released: 2012-07-06. Open Context.

Featured Image by Sarthak Navjivan

A Song of Scottish Publishing, 1671-1893

The National Library of Scotland has made available a collection of chapbooks printed in Scotland, from 1671 – 1893, on their website here. That’s nearly 11 million words’ worth of material. The booklets cover an enormous variety of subjects. So, what do you do with it? Today, I decided to turn it into music.

As part of writing the second edition to the Historian’s Macroscope, I’ve been re-writing the topic modeling section, and I’ve included working with this information, and building a topic model for it using R. As part of that exercise, I preprocessed all the data so that it would be a bit easier for the newcomer to work with it (which will be held in a Github repo for the purpose). Part of the preprocessing was adding a ‘publication date’ to the NLS-provided inventory file (which involved a whole bunch of command line regex etc to grab that info from the METS metadata files).

To turn this into sound – I used the Topic Modeling Tool  to build a quick topic model on the 3000 + text files containing the ocr’d text. The TMT can also match your metadata up against the topic results, which is very nice and handy, especially for turning the results into music, which I did with the TwoTone app. Drop the resulting csv onto TwoTone, and your columns are ready to map to the music; the visualization is also handy to get a sense of when your topics are most prominent (where the left hand side is my earliest date, and the right hand side is my latest date):

Then I played with the settings, filtering things so that notes are only played if they are making a meaningful contribution to the entire year’s text.
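I did that filtering by fiddling with the controls in TwoTone, but the logic is simple enough to sketch in Python. This is not TwoTone's own code: the row layout, the `audible_topics` name, and the 0.1 cut-off are all assumptions for illustration.

```python
def audible_topics(rows, threshold=0.1):
    """Keep, for each year, only the topics whose weight reaches the threshold.

    Each row is assumed to look like
    {"year": 1750, "topics": {"religion": 0.4, "bad_ocr": 0.05}}.
    """
    filtered = []
    for row in rows:
        kept = {topic: weight for topic, weight in row["topics"].items() if weight >= threshold}
        filtered.append({"year": row["year"], "topics": kept})
    return filtered
```

A topic that drops below the cut-off in a given year simply goes silent for that year, which is what makes the louder topics easier to hear.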

You can listen to it on Soundcloud.

The piano arpeggios are mapped to a topic that seems largely to be bad ocr. The pipe organ corresponds to a topic about religion. The trumpet seems to be stories of people off to make their fortune (as I read the topic words for that topic). There’s a double bass in there, which I assigned to the ‘histories’ topic (because why not). The glockenspiel is assigned to the topic that I understand as ‘folk wisdom’, while the harp is mapped to stories of love and romance (and doomed love too, for that matter).

What do we learn doing this? Well, for one thing, it forces us to think about the constructedness of our ‘visualizations’, which is never a bad thing. It foregrounds how much dirty data is in this thing. It shows change over time in Scottish publishing habits (“we could have done that with a graph, Shawn!” to which I say: So what? Now you can engage a different part of your brain and feel that change over time.)