Knowledge Management

This is my first post on wordpress.com where I have to use the Gutenberg editor and man I hate it. I should probably migrate this damned thing – 14 years old! – out of wordpress. I’ve wasted a huge amount of time just trying to get things organized so I know what the hell is going on. Yeah, I can’t stand change, once I figure out how to do something the way I like it. But, on a similar theme… I’m still trying to figure out how to *take good notes*.

When I was a PhD student, I kept incredibly dense and largely useless notes. By ‘useless’ I mean – once I’d made an entry in my great big notebook, it was incredibly hard to find it again. To see which other materials it might speak to. I had several notebooks, where I’d duly copy out interesting passages and make my observations, and then shove slips of paper or post-it notes in on those pages where I thought ‘hey, this might be important and I really need to find this again’. Because I had no structure for getting into and through those notes, I’ve never used those notebooks since. Which is a waste.

Nobody ever taught me how to take notes.

When it came time to write, I’d get a big piece of paper and try to sketch out how the Big Idea I was writing about worked. I’d write down page numbers, cryptic directions to various pages in various notebooks, half-baked references, remembrances of important things I’d read, and draw circles and lines and swoops and eventually something would emerge out of that, but it was a messy, wasteful process.

What I’ve been searching for ever since was a way where I could capture the exciting ideas I was reading, the interesting thoughts I was having, in such a way that knowledge would crystallize out of the mess. With time, I’ve started to come up with a way that works on paper – a notebook with a line down the middle of the page; observations or important phrases copied on one side, with my reflection or thoughts on the other side. A citekey scribbled at the top of the page to connect to my reference manager (I’ve used all of the reference managers, it seems). An index page at the front of the notebook. Then, when it comes time to write, my Big Page is at least a little bit tidier with references to ‘orange nb p24 re bennett 32’.

There are a number of posts on this ol’ blog about taking notes, and different systems I’ve tried to cobble together to make ’em. In recent years, I’ve really become interested in the whole Zettelkasten scene; the basic idea is one idea, one note. Sometimes I copy the relevant passage down, but most of the time, it’s just me riffing on something I’ve read; usually no more than three or four sentences. Then a system for indexing these so that notes can be compiled into larger overview notes or broken back down again. It doesn’t have to be digital, but of course, digital search and storage make life easier. I’ve used everything from Notational Velocity through to plugins and mods for Sublime Text or Atom. And these all work in the sense that I’m able to pull together all of my relevant atomic notes and sometimes – if I’ve been really switched on – the compiled overview note goes into whatever I’m writing in its entirety. The note taking process is the writing process.

I really like when that happens.

Unfortunately, I find it hard to maintain the use of these different packages for zettels consistently. I think the reason is that, despite the ability to recombine, search, and find my atomic notes, I still can’t see the connections between things very well. But, with my most recent book project finally out of the way, and a bunch of other things finally having made their way through the publishing process, I’m ready to start again.

And boy, how the landscape has changed!

Roam, Foam, Org-roam, and Obsidian

It was a chance tweet I saw by Jonathan Reeve that sent me out on this latest note-taking odyssey, by the way:

I had to investigate. The major thing that has changed, I think, is that the idea of ‘networked thought’ has really entered the note taking space. And I think that’s what I’ve always missed in my note taking process. The idea is that if you make connections as appropriate between ideas, eventually larger structures emerge; these larger structures (network structures like shortest paths, clusters of various kinds, most-central nodes of various kinds) can give insight into the nature of your thoughts/nodes and perhaps suggest insights that you might not otherwise have spotted.

There is a wide array of editors to help you with this, all of which include network visualizations of links, backlinks, and tag structures. Some, like Roam, are subscription based and keep your notes somewhere on the cloud; others, like Foam or Org-Roam, are open source and keep your notes locally as markdown files (though Org-Roam is built on emacs and life’s too short). Then there’s Obsidian, which is not open source, but does keep your notes as separate markdown files. It has a pretty slick interface, and it will publish and host your ‘vault’ (folder of notes) as a website if you so wish (for a hosting fee, which seems pretty reasonable). If you ever read Caleb McDaniel’s ‘Open Notebook History‘ that feature will be quite interesting.*

I’ve been kicking the tires on Obsidian for the last week, and I have to say, I quite like it. I have a few community plugins installed that let me ‘refactor’ (break apart or merge together) notes as appropriate, that let me insert citations from my Zotero library (or create new notes from scratch on a given resource in my Zotero library) with links back to the original pdfs/resource, and a few cosmetic tweaks. New panes can be opened at will from a variety of places, and if you have the screen real-estate, organized however you like. I grabbed my existing folder of notes and opened it within Obsidian; I created a new index note to provide some consistent points of entry:

*I keep my notes, my ‘vault’, in a git-tracked folder, pushing online to a private git repo. I was also pushing to a public wikijs instance I host on Reclaim Cloud, but the importer broke and I can’t make it work any more. Anyway, that was probably too much – if I want to make my notes available online, I can probably just gh-pages them and that’ll serve. You can automate the process of pushing new notes to github; see this post by Bryan Jenks.

When you search for keywords or phrases or tags, the results of those searches can be turned into instant notes with wiki-style links. See that graph at top right? The green nodes are tags, the blue are notes, the red are notes that I’ve created while writing other notes that remain to be filled in.

Workflow

So here’s my workflow. I have Zotero and Zotfile installed, so I can send pdfs to a folder on my ipad. On the ipad, I use pdf reader to annotate. Zotfile retrieves these and pulls them back into Zotero. I use zotero-mdnotes to push the notes to my folder (‘vault’) of notes. (If I’m reading something physical, I can just mark it up or use my paper notebooks as before, and then transfer/consolidate notes into a new note in Obsidian.)

These I can then refactor into individual atomic notes as necessary. Using BetterBibTex for Zotero, I have also exported (with constant updating) my library’s bib file (as csl json) to the vault; I can then add the cite-key to any atomic note as appropriate. I add tags as appropriate. I link to other notes as appropriate. Obsidian shows me when a given note is referenced by another or mentions another and so I can use that to guide back-linking too.

Then, I can garden. By ‘gardening’, I mean, exploring my notes and their connections and thinking about what I’m seeing. Perhaps I add new notes. Perhaps I prune or delete notes. Perhaps I add more links or tags.

I love the graph feature. But I wish I could analyze it. There is a plugin that exports your graph to Neo4j for analysis, but that’s almost too much power for what I have in mind, and besides, you need to learn the cypher query language to make sense of that kind of thing. The ‘Infranodus‘ platform might be worth exploring here, as it does network metrics and text analysis too and can ingest your notes (see for instance this post) but I didn’t feel like signing up with credit card to something I just wanted to explore a bit (Infranodus can be installed locally, but it’s a beast of a thing to configure – it depends on Neo4j! – and after wasting the better part of a day on it, I threw in the towel).

No, good ol’ gephi or cytoscape or similar is all I need. So I did a bit of digging – where does Obsidian hold all of that info? It turns out, there is a json graph in a folder called ‘ObsidianCache’ that contains the current representation of your vault and its interlinkages:

Now, I’m certain that one could write a bit of python to grab each note and its links and tags, represent as a graph, and then do a few network metrics. But I don’t know how to do that in python – yet. But I can do it with jq, and reshape the json so that I end up with note – link and note – tag pairs. Gephi doesn’t ingest json, so I use a bit of R to turn it into graphml. Hey presto, a network I can explore in Gephi! What are the most central ideas? What kinds of ‘communities’ exist? I am imagining that knowing this information would help kick start my writing, or help me detect emergent ideas I hadn’t considered yet. (Other people feed their notes into DEVONthink, which does some natural language magic to find connections in your notes. That’s another of the beautiful things about keeping your notes in plain text on your own machine.)

The relevant jq query:

jq --raw-output '.metadata[] | {title: .frontmatter.title, tag: .tags[]?.tag}'

Then a bit of regex in sublime to put commas at the end of each line, wrap in square brackets, then a bit of R:

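library(igraph)    # graph construction and export
library(jsonlite)  # JSON parsing
setwd("~/Desktop")
# read the array of {title, tag} records produced by jq
thing <- fromJSON("tag-test.json")
# the first two columns (title, tag) become the two ends of each edge
g <- graph_from_data_frame(thing, directed = FALSE)
# write graphml, which Gephi can open
write_graph(g, "tag-2mode.graphml", "graphml")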
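For anyone who would rather skip the jq, regex, and R steps, here is a minimal Python sketch of the same reshaping. It assumes the cache json has the shape the jq query implies (a `metadata` collection whose entries carry `frontmatter.title` and a `tags` list of `{tag: ...}` objects); the sample data, the function names, and the output file name are all made up for illustration. Rather than graphml, it writes a plain `Source,Target` csv, which Gephi can import as an edge list.

```python
import csv
import json
import os
import tempfile


def load_cache(path):
    """Read the cache file from disk (the path is whatever your vault uses)."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def note_tag_pairs(cache):
    """Yield (note title, tag) pairs, mirroring the jq query."""
    metadata = cache.get("metadata", [])
    # jq's `.metadata[]` iterates over array items or object values alike.
    entries = metadata.values() if isinstance(metadata, dict) else metadata
    for entry in entries:
        title = (entry.get("frontmatter") or {}).get("title")
        for tag in entry.get("tags") or []:  # notes without tags are skipped
            yield title, tag.get("tag")


def write_edge_list(pairs, path):
    """Write a Source,Target csv that Gephi can import as an edge list."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["Source", "Target"])
        writer.writerows(pairs)


# A tiny made-up cache, shaped the way the jq query implies.
sample = {
    "metadata": [
        {"frontmatter": {"title": "photography at Dura"},
         "tags": [{"tag": "#photography"}, {"tag": "#sg"}]},
        {"frontmatter": {"title": "note taking"}, "tags": [{"tag": "#sg"}]},
        {"frontmatter": {"title": "untagged note"}},
    ]
}
pairs = list(note_tag_pairs(sample))
out_path = os.path.join(tempfile.gettempdir(), "tag-2mode.csv")
write_edge_list(pairs, out_path)
print(pairs)
```

To run it on a real vault, swap `sample` for `load_cache(...)` pointed at the cache file, and check the field names against what is actually in there first.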

Open ‘er up in Gephi, using the multimode plugin to turn it from a network of notes to tags, to tags – tags by virtue of notes in common…

So obviously, there is some mucky data in my test vault, but interesting, eh? Incidentally, the ‘sg’ tag is for when I’ve had some inspiration that I want to come back to. And of course, maybe note to note by virtue of common tags would be a more interesting/useful view. Or perhaps, since a ‘tag’ could be considered as a kind of semantic note on its own, I just leave it notes – tags and treat it all as unimodal. Things to explore!
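The tags – tags view can also be sketched outside Gephi. The Python below is a hedged illustration of the general idea of a two-mode projection, not of what the multimode plugin does internally: two tags get an edge whose weight is the number of notes that carry both. The pairs are invented for the example.

```python
from collections import defaultdict
from itertools import combinations


def project_tags(pairs):
    """Turn (note, tag) pairs into weighted tag-tag edges.

    Two tags are linked once for every note that carries both of them.
    """
    tags_by_note = defaultdict(set)
    for note, tag in pairs:
        tags_by_note[note].add(tag)

    weights = defaultdict(int)
    for tags in tags_by_note.values():
        # sorted() keeps each edge in one orientation, so (a, b) == (b, a)
        for a, b in combinations(sorted(tags), 2):
            weights[(a, b)] += 1
    return dict(weights)


# Made-up pairs for illustration only.
pairs = [
    ("note A", "#photography"), ("note A", "#sg"),
    ("note B", "#photography"), ("note B", "#sg"), ("note B", "#dura"),
    ("note C", "#sg"),
]
edges = project_tags(pairs)
for (a, b), weight in sorted(edges.items()):
    print(a, b, weight)
```

Swapping the two elements of each pair before calling `project_tags` gives the other projection: note to note by virtue of common tags.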

So we’ll see how things go. This morning, I spent a happy hour refactoring and building notes from a great article about archaeological photography at Dura Europos. Baird writes,

“Taking photographs, like drawing reconstructions, was a means by which the archaeologists could attempt to understand the object and the past and to rebuild the ruin. At Dura, photography was not a passive recording device as it is thought of in most histories of archaeology; rather, it was something that seems to have been an active means of constructing a particular past (fig. 5). Time in these photographs refers both to the practice of taking the photographs— the posing and framing—and the excavator’s construction of a time in the image; thus, they reflect a temporal breach that constructed an East in which modern peoples are equated with ancient.”

Active note taking, gardening our thoughts using these digital tools, seems to me a bit like how Baird writes about photography, perhaps. But I haven’t fully fleshed out that thought yet; perhaps it’s because it lets me build something new from others’ mental excavations of their own thought. Or I’m pushing the metaphor too far. Back to the garden I go!

Some useful videos

Below is a video in which PhD student Courtney Applewhite describes how she uses Obsidian to study for her comps; something similar to this approach might be worth adapting.

New paper out: Towards a Method for Discerning Sources of Supply within the Human Remains Trade via Patterns of Visual Dissimilarity and Computer Vision

We have a new paper out:

Graham, S., Lane, A., Huffer, D. and Angourakis, A., 2020. Towards a Method for Discerning Sources of Supply within the Human Remains Trade via Patterns of Visual Dissimilarity and Computer Vision. Journal of Computer Applications in Archaeology, 3(1), pp.253–268. DOI: http://doi.org/10.5334/jcaa.59

“While traders of human remains on Instagram will give some indication, their best estimate, or repeat hearsay, regarding the geographic origin or provenance of the remains, how can we assess the veracity of these claims when we cannot physically examine the remains? A novel image analysis using convolutional neural networks in a one-shot learning architecture with a triplet loss function is used to develop a range of ‘distances’ to known ‘reference’ images for a group of skulls with known provenances and a group of images of skulls from social media posts. Comparing the two groups enables us to predict a broad geographic ‘ancestry’ for any given skull depicted, using a mixture discriminant analysis, as well as a machine-learning model, on the image dissimilarity scores. It thus seems possible to assign, in broad strokes, that a particular skull has a particular geographic ancestry. ”

Our code is at https://github.com/bonetrade/visual-dissimilarity

The key idea: a one-shot neural network can be used to measure the web of differences in carefully selected social media images (backgrounds removed) of human skulls. Patterns of similar *dissimilarities* can then be compared with osteological or forensic materials and then we can look at what vendors say about the remains. We find that the stories told are often dubious. The web of differences also seems to imply that Indigenous North American human remains are being traded, but not labelled as such. While bonetraders will be quick to point out that ‘buying human skulls is legal’ (and we’ll write more about that in due course), trading in Indigenous human remains gets into NAGPRA territory & it’s most definitely illegal (US): law.cornell.edu/uscode/text/18.

 

Zettelkasten to Online Wiki

I was never taught how to take notes. Periodically, I try to develop better habits. I’ll go read various blogs, forums, product pages, looking for the thing that’ll make everything come together, make my reading more effective, make my thinking so much sharper…

sigh.

Some time ago I bought an ipad (‘it’ll be for research! honestly! for pdfs!’) and still my reading/note taking didn’t come together. I have Liquid Text on it and pdf viewer. Liquid Text is pretty neat… but I find its ability to pull multiple pdfs together and all of its note taking, connecting just doesn’t work for me – on an iPad. Part of the problem is that I wasn’t using it the way its designers imagined a person might use it. (Apparently, it’s now available on Microsoft devices and in that context I think it would really work for me). The other part was, well, probably a discipline thing. Or lack thereof.

PDF Viewer is a nice little app for reading pdfs, and when I tied it to Zotero with zotfile… now we’re talking! Got my notes back on my writing machine, so headway.

~

In the past, for various projects, I’ve tried the whole one-idea-per-card note taking system called ‘Zettelkasten‘. Combine that with an editor that does search and creation at the same time (like nvAlt), and I actually got kinda good at pulling stuff out and framing searches, finding connections between my notes. It’s a bit like ‘commonplace books‘, at least the way I’ve been using ’em.  I’ve also been thinking of these in the context of open notebook science, reproducibility and that sort of thing – Caleb McDaniel put it best:

“The truth is that we often don’t realize the value of what we have until someone else sees it. By inviting others to see our work in progress, we also open new avenues of interpretation, uncover new linkages between things we would otherwise have persisted in seeing as unconnected, and create new opportunities for collaboration with fellow travelers. These things might still happen through the sharing of our notebooks after publication, but imagine how our publications might be enriched and improved if we lifted our gems to the sunlight before we decided which ones to set and which ones to discard? What new flashes in the pan might we find if we sifted through our sources in the company of others?”

He used an open notebook powered by Gitit to write his book Sweet Taste of Liberty and it won a Pulitzer Prize! (Caleb’s original open notebook).

So how do I put these ‘zettels’ online? ‘The Archive‘ is a nice little bit of software, developed on top of nvAlt, and I like how it works. I have it saving each note as an md file into a git repository on my machine. I push these things to a github repo. Now, there are plenty of static site generators that will turn a collection of markdown into a static website, but collaboration on the underlying files is still an iffy process. I spun up a wiki.js instance on Reclaim Cloud and then figured out how to connect it to the github repo (thread here).

I am now the proud owner of a wiki that my students can edit and collaborate with me on some of my larger projects (they can just use the web interface, which is nice, no faffing about); whenever I git pull I have their research to hand in my preferred note taking app; whenever I push they get my stuff. And our research is out there in the open.

Gotchas:

– configure storage to grab from github using https, not ssh

– spaces in file names will break the import/export

– set up a metadata template in ‘The Archive’ so that notes will render nicely there.
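The second gotcha can be headed off before pushing. Below is a minimal Python sketch, not part of the original setup, that swaps whitespace in markdown file names for hyphens; the function names are hypothetical, and it defaults to a dry run so you can see what it would do before it touches anything.

```python
from pathlib import Path


def safe_name(name):
    """Replace runs of whitespace in a file name with single hyphens."""
    return "-".join(name.split())


def rename_notes(folder, dry_run=True):
    """Rename every .md file under `folder` whose name contains whitespace.

    Returns the (old, new) path pairs; nothing is touched while dry_run is True.
    """
    changes = []
    for path in sorted(Path(folder).rglob("*.md")):
        new_path = path.with_name(safe_name(path.name))
        if new_path == path:
            continue
        if new_path.exists():
            # Never overwrite another note; leave this one for manual attention.
            continue
        changes.append((path, new_path))
        if not dry_run:
            path.rename(new_path)
    return changes
```

Call `rename_notes("path/to/notes")` to preview, then again with `dry_run=False` to apply. Any wiki-style links that point at the old file names will need updating by hand afterwards.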

HIST3000|CLCV3000 Introduction to Digital Archaeology – Trailers!

I started scratching out ideas for what this ‘intro to digital archaeology’ class might look like as I taught my early summer course, ‘Crafting Digital History.’ Scratches became mindmaps and random scraps of paper and orphaned text files. One thing that I found really worked well with the DH course was that it had a regular beat to it. Each week, the rhythm and routine was the same, although within that there was a lot of choice about what to do and how to approach it. I want to preserve that for the digiarch class; I also want to provide more signposts along the way, so I’m planning to seed the readings with my own annotations using hypothes.is; I also saw someone on Twitter mention that they might embed short wee videos of themselves speaking about each reading, in the reading via annotation and I thought, ‘my god, that’s brilliant’ and so I’ll give that a try too. I have the link to the tweet somewhere, just not here as I write.

Anyway, in the interests of providing more structure and more presence, I’ve also been building trailers for the course and the modules within it. Making these has helped narrow down what it is I want to do; you can’t touch on everything, so you’d better go deep rather than wide. Without further ado…

and a bit about me…

Elegy for George Floyd

Today is the funeral of George Floyd, the man murdered by police in Minneapolis. Since his death, other instances of police brutality as the police riot have been collated in various places; one reckoning has over 400 instances (link here, kept by Greg Doucette, and just the ones that have been shared on Twitter!).

We – Andrew Reinhard and myself – wanted to honour George Floyd, and so we composed ‘Elegy for George Floyd’, a data composition built from sonifying the data in that spreadsheet and then remixing the results.

As you listen, you will hear a trumpet (police siren / police action) that waxes and wanes with the brutality of the action recorded. The reports for each incident were loaded into Voyant-Tools, where they were reorganized by the most common terms. Each word was then replaced in the report by its count; then all of the scores for each report were added up. This index value was then mapped against four octaves in D# minor, a key that invokes “…Feelings of the anxiety of the soul’s deepest distress, of brooding despair, of blackest depression, of the most gloomy condition of the soul. Every fear, every hesitation of the shuddering heart, breathes out of horrible D# minor. If ghosts could speak, their speech would approximate this key.” (source). These reports are scored into the music twice – one voice in whole notes, a second voice in arpeggiated chords to reflect the sirens and chaos of the police brutality.

Each city’s latitude and longitude and the cumulative report number were converted into chords and bassline.

The resulting sonification was then remixed, with an 808 bass line added. The 808 runs throughout the entire song, the heartbeat of George Floyd that abruptly stops at 8:46. It contrasts with the intrusive double-bass of the police line generated in the original sonification. The crescendos of all of the data tracks reflect clashes with the police. Towards the end of the song, there are instances (and then a full minute) of tracks playing backwards, which reflects how upside-down things have become.

The remixed piece is at 90 bpm which we feel adds to the gravitas of the work; it is unsettling and sad, but yet, even now, contains beauty and hope.

With respect, we offer this piece in that spirit.

Our original tracks are available at https://github.com/shawngraham/elegy-for-George-Floyd. We invite you to remix and recompose your own version.

We are uploading the piece to itunes, and any monies it might earn will be donated to #blm.

Archstopmo: the 2020 edition

How lovely that, with everything going on, some folks found the time to try their hand at archaeological stop motion! Let’s watch some films:

Buried Ship

Abby and Maggie Mullen write,

“We created this film because we like Vikings and we wanted to make something about the ocean. (Our team likes a lot of different archaeological sites, so we went through a lot of ideas before landing on this one!) We found that the two-minute limitation made it both easier and more challenging, because it’s difficult to communicate a complicated story in two minutes, with Legos, but that helped us narrow down our topic.

Our process started with research about different archaeological sites, and when we found two stories about different Viking ships found with GPR, we decided it could be fun to try to view the site from both above the ground and below it.

Our set designer painted our backdrops in watercolor and built the sets in Lego. We had to adjust the scale of our Lego models multiple times, which she built, to make our photography work. We weren’t 100% successful, but an 8yo’s attention span is limited and we can’t exactly run out to the store right now to get more supplies.

We used an iPhone to take the photographs. We set it up on a tripod with a remote shutter to make it easier to keep it mostly in the same place. We then transferred our photos to a MacBook Pro and put the photos into iMovie to create the stop-motion. Our “silent film” text slides were created in PowerPoint, and we used a song from the YouTube Studio free music collection for our soundtrack.”

Comments on Youtube include, “I really liked this! It was so interesting AND beautiful. Really well done. It made me want to learn more!” and “Great information! I did not know that Viking ships had been found so recently from so long ago. I greatly enjoyed the scene settings and photography. The accompanying music was excellent.”

The Venus of Willendorf: an archaeological yarn

Karen Miller writes,

“As a traditional women’s craft, crochet is an apt sculptural method to recreate an iconic archaeological artefact that evokes the beauty of the female body. I was excited to find the pattern at Lady Crafthole’s ‘Cabinet of Crochet Curiosities’ https://www.crochetcuriosities.com/. I filmed it on an ipad with the Stop Motion Studio app https://apps.apple.com/au/app/stop-motion-studio/id441651297 and added the title and credits in iMovie.”

Archaeological Tea-construction

Beth Pruitt writes,

“This video is about methodological theory in archaeology, created for SAA’s Online Archaeology Week after the cancellation of the planned Austin Public Archaeology Day at the 2020 SAA Annual Meeting. Through observing the attributes of the rim sherd (its curvature, decoration, etc.), archaeologists can make inferences about the rest of the whole, even when pieces remain missing. This is based on an in-person activity that I do at public archaeology events to help visitors understand laboratory methods and induction. I used the app Stop Motion Studio for taking the frame photos and strung them together in the Windows 10 Photos app. I drew the animated overlays frame-by-frame in Inkscape.”

Jury Prizes

  • To Maggie and Abby Mullen, in the ‘Story of a Site’ category
  • To Karen Miller, in the ‘Biography of an Object’ category
  • To Beth Pruitt, in the ‘Archaeological Theory’ category

Best Overall and Choix du Peuple

To be announced May 4th! Make your votes on the Choix du Peuple:

Tuesday May 5th:

And with the polls closed, looks like ‘Tea-Construction’ is the Choix du Peuple!

Searching Inside PDFs from the Terminal Prompt

I have reason, today, to want to search the Military Law Review. If you know in which issue the info you’re looking for is located, then you can just jump right in.

When do we ever know that? There’s no search-inside feature. So we’ll build one ourselves. After a bit of futzing, you can see that all of the pdfs are available in this one directory:

https://www.loc.gov/rr/frd/Military_Law/Military_Law_Review/pdf-files/

so

$ wget https://www.loc.gov/rr/frd/Military_Law/Military_Law_Review/pdf-files/ -A .pdf

should just download them all directly. But it doesn’t. However, you can copy the source html to a text editor, and with a bit of regex you end up with a file with just the paths directly to the pdfs. Pass that file as -i urls.txt to wget, and you end up with a corpus of materials.
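If regex in a text editor isn’t your thing, the same list can be pulled out of the saved page with Python’s standard library. This is a sketch only: the class and function names are made up, and the sample html and example.org address are placeholders rather than anything from the real directory listing.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class PdfLinkParser(HTMLParser):
    """Collect the href of every <a> tag that points at a .pdf file."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href and href.lower().endswith(".pdf"):
            # urljoin turns relative paths into full URLs that wget can fetch.
            self.links.append(urljoin(self.base_url, href))


def pdf_urls(html, base_url):
    """Return the absolute URL of every pdf linked from `html`."""
    parser = PdfLinkParser(base_url)
    parser.feed(html)
    return parser.links


# Placeholder page source and address, for illustration only.
sample_html = """
<a href="example-issue.pdf">An issue</a>
<a href="/about.html">About</a>
<a href="https://example.org/other/INDEX.PDF">Index</a>
"""
urls = pdf_urls(sample_html, "https://example.org/pdf-files/")
print("\n".join(urls))
```

Write the printed lines to urls.txt and hand that to wget with -i as described above.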

How do we search inside? This question on Stack Overflow will help us out. But it requires pdftotext to be installed. Sigh. Always dependencies! So, following this, here we go.

On the command line (with Anaconda installed):

conda create -n envname python=3.7
conda activate envname
conda config --add channels conda-forge
conda install poppler

The pdfs are in a folder called ‘MLR’ on my machine. From one level up:

$ find ./MLR -name '*.pdf' -exec sh -c 'pdftotext "{}" - | grep --with-filename --label="{}" --color "trophies"' \;

et voila!
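The same search can be sketched in Python, which makes it easier to keep the hits around afterwards. This assumes pdftotext (from poppler) is on your path; the function names are hypothetical, and unlike the grep above the match here ignores case.

```python
import subprocess
from pathlib import Path


def matching_lines(text, term):
    """Return the lines of `text` that contain `term`, ignoring case."""
    needle = term.lower()
    return [line.strip() for line in text.splitlines() if needle in line.lower()]


def pdf_text(path):
    """Extract text with poppler's pdftotext; "-" sends the text to stdout."""
    result = subprocess.run(
        ["pdftotext", str(path), "-"],
        capture_output=True, text=True, errors="replace", check=True,
    )
    return result.stdout


def search_pdfs(folder, term, extract=pdf_text):
    """Map each pdf under `folder` to its matching lines, skipping misses."""
    hits = {}
    for path in sorted(Path(folder).rglob("*.pdf")):
        lines = matching_lines(extract(path), term)
        if lines:
            hits[str(path)] = lines
    return hits
```

Something like `search_pdfs("MLR", "trophies")` returns a dictionary of file names and matching lines. The `extract` argument is only there so a different text extractor can be swapped in.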

a quick thought on library JSON

Read this post today: https://tomcritchlow.com/2020/04/15/library-json/

which seems very cool. In Tom’s #update 1, he points to a parser that one of his readers wrote for this imagined spec, and if you format your books according to Tom’s spec, and point the parser at the file, you get this really cool interface on your materials: see https://bookshelves.ravern.co/shelf?url=https://tomcritchlow.com/library.json .

Anyway, the thought occurred that the ruby script that inukshuk wrote with regard to my query about adding materials to tropy notes in json (full thread here, .rb file here) could easily be modified to produce the json from simple lists in txt.

So I might fart around with that.
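A first pass at the txt-to-json idea might look like the Python sketch below, rather than a modified ruby script. Loud caveat: the `Title | Author` line format, the `title` and `author` keys, and the `books` wrapper are placeholders I have made up for illustration, not Tom’s actual spec; the real field names would need to be checked against his post before pointing the parser at the result.

```python
import json


def books_from_lines(lines):
    """Parse 'Title | Author' lines into a list of dictionaries.

    The keys used here are placeholders, not the real library.json spec.
    """
    books = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comment lines
        title, _, author = line.partition("|")
        book = {"title": title.strip()}
        if author.strip():
            book["author"] = author.strip()
        books.append(book)
    return books


# A made-up reading list in the assumed 'Title | Author' format.
reading_list = [
    "# my shelf",
    "Sweet Taste of Liberty | W. Caleb McDaniel",
    "",
    "Untitled notebook",
]
library = {"books": books_from_lines(reading_list)}
print(json.dumps(library, indent=2))
```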

 

Archstopmo: An Archaeology Stop Motion Movie Festival!

April 5 – 30th, with winners revealed May 4th.

Let’s have a movie festival. Also, I like stop-motion films – y’know, like the Wallace and Gromit films (and here’s an archaeological one I just found on youtube using playmobil!). So here’s what you do –

How?

  1. Make a stop motion film on one of the following themes:
    • a. archaeological theory
    • b. history of archaeology
    • c. the story of a site
    • d. the story of a dig
    • e. the biography of an object
  2. Make it two minutes in length
  3. Can be lego, clay, paper cut outs, whatever you like
  4. Upload to youtube
  5. Tweet it out with #archstopmo
  6. Check out twitter under the ‘archstopmo’ hashtag
  7. Prepare an artist’s statement that explains and contextualizes your work, as well as the software you’ve used, and your process.
  8. Submit your film at the form below.
  9. Have fun!

There will be a film gallery which will be updated frequently with entries and links to the artist’s statement.

Judging

Prizes (there are no prizes, only glory!) will be selected by a panel of judges, plus one audience choice.

  • Best in each category – five prizes
  • Choix des Judges – best overall
  • Choix du Peuple – best by voting

Submit Your Work

 

Featured image by Adi Suryanata on unsplash https://unsplash.com/photos/5T0bY-x9A8U

a note on git-lfs

Sometimes, I have files that are larger than github’s 100 MB per-file limit. So here’s what you need to do, starting with installing git-lfs (here via Homebrew):

brew install git-lfs
brew upgrade git-lfs

Start a new git repository, and then make sure git large file storage (git lfs) is tracking the large file. For instance, I just moved a topic model visualization to a repo on github (20,000 archaeological journal articles). It has a data csv that is 135 mb. So I made a new repo on github, but didn’t initialize it on the website. Instead, after getting git-lfs installed on my machine:

git init
git lfs install   # one-time setup of the git-lfs hooks and filters
git lfs track "20000/data/topic_words.csv"
# commit the tracking rule together with the large file first
git add .gitattributes 20000/data/topic_words.csv
git commit -m "initial"
git add .
git commit -m "the rest"
git remote add origin https://github.com/shawngraham/archae-topic-models.git
git push -u origin master
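To find out which files need this treatment before the push fails, a short Python sketch like the following can walk a folder and list anything over the limit. The function name is hypothetical, and the threshold is an assumption that treats the limit as 100 MiB.

```python
import os

GITHUB_LIMIT = 100 * 1024 * 1024  # assumed per-file limit, in bytes


def large_files(root, limit=GITHUB_LIMIT):
    """Return (relative path, size in bytes) for files larger than `limit`."""
    found = []
    for folder, subfolders, files in os.walk(root):
        # Skip git's own object store; removing it here stops os.walk descending.
        if ".git" in subfolders:
            subfolders.remove(".git")
        for name in files:
            path = os.path.join(folder, name)
            size = os.path.getsize(path)
            if size > limit:
                found.append((os.path.relpath(path, root), size))
    return sorted(found)
```

Each path that `large_files(".")` reports is a candidate for `git lfs track`.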

Making Nerdstep Music as Archaeological Enchantment, or, How do you Connect with People Who Lived 3000 Years Ago?

by Shawn Graham, Eric Kansa, Andrew Reinhard

What does data sound like?

Over the last few days, what began as a bit of a lark has transformed into something more profound and meaningful. We’d like to share it with you—not just the result, but also our process. And in what we’ve made, perhaps, we find a way of answering the title’s question: how do you connect with people who lived 3,000 years ago?

In the recent past, Shawn has become more and more interested in representing the patterns we might detect, at a distance, in the large collections of digital data that are becoming more and more available . . . using sound. Called ‘sonification’, this technique maps aspects of the information against things like timbre, scale, instrumentation, rhythm, and beats-per-minute to highlight aspects of the data that a visual representation might not pick up. It’s also partly about making something strange—we’ve become so used to visual representations of information that we don’t necessarily recognize the ways assumptions about it are encoded in the visual grammars of barcharts and graphs. By trying to represent historical information in sound, we have to think through all of those basic decisions and elaborate on their implications.

Last week, he was toying with mapping patterns of topics in publications from Scotland from the 18th and 19th centuries as sound, using an online app called ‘TwoTone’. He shared it on Twitter, and well, one thing led to another, and a conversation began between Shawn, Eric, and Andrew: What might archaeological data sound like?

Sing in me Muse, through thine API, of sherds and munsell colors, of stratigraphic relations, and of linked thesauri URIs!

—Eric Kansa

Get Some Data

First things first: get some data. Open Context (Eric’s pet project) carefully curates and publishes archaeological data from all over the world. He downloaded 38,000 rows of data from the excavations at the Etruscan site of Poggio Civitate (where, in a cosmic coincidence, Andrew attended field school in 1991) and began examining it for fields that could be usefully mapped to various sonic dimensions. Ultimately, it was too much data! While there are a variety of ways of performing a sonification (see Cristina Wood’s Songs of the Ottawa, for instance), TwoTone only accepts 2,000 rows. The data used therefore for this audio experiment was very simple—counts of objects from Poggio Civitate were rendered as arpeggiated piano lines over three octaves; average latitude and average longitude were calculated for each class of thing thereby making a chord, and then each class of thing had its own unique value. Shawn’s initial result of data-driven piano sonification can be listened to here.

The four original dimensions of the sonification appear above, mapped in TwoTone. The rising notes in the bottom track are the item type ids. All of the materials come from the same chronological period, so listening (or viewing left-to-right) needed some sort of organizing principle. Whether or not it is the right principle is a matter of interpretation and debate.

Archaeology is a Remix

But what if an actual musician got a hold of these tracks? Andrew recently published a work called ‘Assemblage Theory’ where he remixed found digital music in order to explore ideas of archaeological assemblages.[1] Taking his experimentation in electronic dance music (EDM) a step beyond Assemblage Theory, he took Shawn’s four original tracks based on Eric’s 3,000-year-old data and began to play, iterating through a couple of versions, in a genre he calls ‘nerdstep’. He crafted a 5-minute piece that has movements each isolating one of the four data threads, which sometimes crash together like waves of building data, yet are linked together. He opted for 120 bpm, a dance music standard, and then, noting where the waves of data subside into quiet pools, was inspired to write some lyrics. “The quiet segues are basically data reflexivity in audio form,” he says.

Data propagation
All this information
Gives me a reaction
Need time for reflection

A one-way conversation
This endless computation
Numbs me from sensation
Need time for reflection

Reflexivity
Give me time to breathe
Give me time to think

Reflexivity
Data raining down on me

Emotionally exhausting
How much will this cost me
I’m alone but you are watching
Look up from your screen

Reflexivity
Give me time to breathe
Give me time to think
Look up from your screen.

Reinhard used the open source Audacity audio software application to create the song based on archaeological data sonification. The first four tracks are Shawn’s piano parts, staggered in such a way as to introduce the data bit-by-bit, and then merged with 16 other tracks—overburden or matrix. In the beginning, they are harmonious and in time, but because of subtle variations in bpm, by the time the song ends the data have become messy and frenetic, a reflection of the scattered pieces within the archaeological record, something that happens over time. Each movement in the song corresponds to an isolated data thread from one of Shawn’s piano parts, which then loops back in with the others to see how they relate.
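How subtle does a variation in bpm need to be before tracks that start together end up messy? A little arithmetic shows that it does not take much. The sketch below is illustrative only: the song’s 120 bpm and 5-minute length come from the description above, but the 121 bpm comparison track is a made-up figure, since the actual tempo variations are not recorded here.

```python
def beats_elapsed(bpm, seconds):
    """Number of beats a track at `bpm` has played after `seconds`."""
    return bpm * seconds / 60


def drift_in_beats(bpm_a, bpm_b, seconds):
    """How many beats apart two tracks are after playing for `seconds`."""
    return abs(beats_elapsed(bpm_a, seconds) - beats_elapsed(bpm_b, seconds))


# A 5-minute song: one track at 120 bpm, one at a hypothetical 121 bpm.
song_length = 5 * 60
print(drift_in_beats(120, 121, song_length))  # 5.0
```

A difference of a single beat per minute leaves the two tracks five full beats apart by the end, having passed through every possible misalignment along the way.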

Life is A Strange Loop

Speaking of loops, let’s think about the full loop we’ve encountered here. 3,000 years ago, at a plateau in the tufa landscape of southern Etruria, people lived their lives, only to have their debris carefully collected, studied, systematized, counted, digitized, and exposed online. No longer things but data, these counts and spaces were mapped to simple sonic dimensions using a web-toy, making a moderately pleasing experience. Remixed, the music moves us, enchants us, towards pausing and thinking through the material, the labour, the meanings, of a digital archaeology.[2] If/when this song is performed in a club (attn: John Schofield and the Theoretical Archaeology Groups [TAG] in both the UK and North America), the dancers would then be embodying our archaeological knowledge of Poggio in their movements, in the flows and subtle actions/reactions their bodies make across the floor. In dancing, we achieve a different kind of knowledge of the world, that reconnects us with the physicality of the world.[3] The eruptions of deep time into the present [4] – such as that encountered at an archaeological site – are weird and taxing and require a certain kind of trained imagination to engage with. But by turning the data into music, we let go of our authority over imagination, and let the dancers perform what they know.

For the three of us as creators, this playful sonification of data allows us to see archaeological material with fresh eyes . . . errrrrr ears . . . and by doing so restores the enchantment we once felt at the start of a new project, or of being interested in archaeology in the first place. Restoring the notion of wonder into three middle-aged archaeologists is no small feat, but the act of play enabled us to approach a wealth of artifacts from one site we know quite well, and realize that we didn’t know it quite like this. Using the new music bridges the gap between humans past and present and in dancing we (and hopefully you) embody the data we present. It’s a new connection to something old, and is experienced by bodies. This is perhaps almost as intoxicating as the work done by Patrick McGovern (U. Penn) and Sam Caglione (Dogfish Head) in their experimentation and creation of ancient ales, the first of which was “Midas Touch”, a surprisingly drinkable brew concocted from an ancient recipe recovered on excavation in Asia Minor. Archaeology is often a cerebral enterprise, which deserves—at times—a good ass-shaking derived from a driving beat.

I’m listening now and am amazed. It is really beautiful, not only as a finished product, but as a process that started with people who lived their lives almost 3000 years ago.

—Eric Kansa

Reflexivity, by KGR [5]

Endnotes

[1] Reinhard’s article, “Assemblage Theory: Recording the Archaeological Record,” and two responses by archaeologists Jolene Smith and Bill Caraher.

[2] An argument made by Perry, Sara. (2019). The Enchantment of the Archaeological Record. European Journal of Archaeology, 22(3), 354-371. doi:10.1017/eaa.2019.24

[3] See for instance Block, Betty, and Judith Kissel (2001). Dance: The Essence of Embodiment. Theoretical Medicine and Bioethics 22(1), 5-15. DOI: 10.1023/A:1009928504969

[4] Fredengren, Christina (2016). Unexpected Encounters with Deep Time Enchantment. Bog Bodies, Crannogs and ‘Otherworldly’ sites. The materializing powers of disjunctures in time. World Archaeology 48(4), 482-499, DOI: 10.1080/00438243.2016.1220327

[5] Kansa-Graham-Reinhard (pronounced as either “Cager” or “Kegger”—the GIF-debate of archaeological nerdstep/nerdcore).

References

Block, Betty, and Judith Kissel (2001). Dance: The Essence of Embodiment. Theoretical Medicine and Bioethics 22(1), 5-15. DOI: 10.1023/A:1009928504969

Caraher, William. (2019). “Assemblage Theory: Recording the Archaeological Record: Second Response” Epoiesen http://dx.doi.org/10.22215/epoiesen/2019.10

Fredengren, Christina (2016). Unexpected Encounters with Deep Time Enchantment. Bog Bodies, Crannogs and ‘Otherworldly’ sites. The materializing powers of disjunctures in time. World Archaeology 48(4), 482-499, DOI: 10.1080/00438243.2016.1220327

Perry, Sara. (2019). The Enchantment of the Archaeological Record. European Journal of Archaeology, 22(3), 354-371. doi:10.1017/eaa.2019.24

Reinhard, Andrew. (2019). “Assemblage Theory: Recording the Archaeological Record” Epoiesen http://dx.doi.org/10.22215/epoiesen/2019.1

Smith, Jolene. (2019). “Assemblage Theory: Recording the Archaeological Record: First Response” Epoiesen http://dx.doi.org/10.22215/epoiesen/2019.5

Tuck, Anthony. (2012). “Murlo”. Anthony Tuck (Ed.). Released: 2012-07-06. Open Context. <http://opencontext.org/projects/DF043419-F23B-41DA-7E4D-EE52AF22F92F> DOI: https://doi.org/10.6078/M77P8W98 ARK (Archive): https://n2t.net/ark:/28722/k2222wm10

Featured Image by Sarthak Navjivan https://unsplash.com/photos/iTZOPe7BpTM

A Song of Scottish Publishing, 1671-1893

The National Library of Scotland has made available a collection of chapbooks printed in Scotland, from 1671 to 1893, on their website here. That’s nearly 11 million words’ worth of material. The booklets cover an enormous variety of subjects. So, what do you do with it? Today, I decided to turn it into music.

As part of writing the second edition of the Historian’s Macroscope, I’ve been re-writing the topic modeling section, and I’ve included working with this information, and building a topic model for it using R. As part of that exercise, I preprocessed all the data (which will be held in a GitHub repo for the purpose) so that it would be a bit easier for the newcomer to work with. Part of the preprocessing was adding a ‘publication date’ to the NLS-provided inventory file (which involved a whole bunch of command line regex etc. to grab that info from the METS metadata files).

To turn this into sound, I used the Topic Modeling Tool to build a quick topic model on the 3,000+ text files containing the OCR’d text. The TMT can also match your metadata up against the topic results, which is very nice and handy, especially for turning the results into music, which I did with the TwoTone app. Drop the resulting csv onto TwoTone, and your columns are ready to map to the music; the visualization is also handy to get a sense of when your topics are most prominent (where the left hand side is my earliest date, and the right hand side is my latest date):

Then I played with the settings, filtering things so that notes are only played if they are making a meaningful contribution to the entire year’s text.
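I did that filtering inside TwoTone itself, but the same idea could be applied to the csv beforehand. Here’s a rough sketch in Python of what I mean, assuming one row per year and one column per topic holding that topic’s share of the year’s text. The `year` column name, the topic names, and the 0.1 cut-off are placeholders I’ve invented for the example, not the values I actually used.

```python
def filter_topics(rows, threshold=0.1):
    """Silence any topic whose share of a year's text is below `threshold`.

    `rows` is a list of dicts with a 'year' key plus one key per topic,
    each topic value being a proportion between 0 and 1 (an assumed
    layout). Values under the threshold are set to 0.0 so that no note
    is triggered for them.
    """
    filtered = []
    for row in rows:
        kept = {"year": row["year"]}
        for topic, share in row.items():
            if topic == "year":
                continue
            kept[topic] = share if share >= threshold else 0.0
        filtered.append(kept)
    return filtered
```

Where to set the threshold is a judgment call: set it too high and the quieter topics vanish entirely, too low and everything plays all the time.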

You can listen to it on Soundcloud.

The piano arpeggios are mapped to a topic that seems largely to be bad OCR. The pipe organ corresponds to a topic about religion. The trumpet seems to be stories of people off to make their fortune (as I read the topic words for that topic). There’s a double bass in there, which I assigned to the ‘histories’ topic (because why not). The glockenspiel is assigned to the topic that I understand as ‘folk wisdom’, while the harp is mapped to stories of love and romance (and doomed love too, for that matter).

What do we learn doing this? Well, for one thing, it forces us to think about the constructedness of our ‘visualizations’, which is never a bad thing. It foregrounds how much dirty data is in this thing. It shows change over time in Scottish publishing habits (“we could have done that with a graph, Shawn!” to which I say: So what? Now you can engage a different part of your brain and feel that change over time.)

Enjoy.