Failing Gloriously and Other Essays

‘Failing Gloriously and Other Essays’, my book reflecting on what ‘failure’ means, can mean, and should mean in the digital humanities and digital archaeology, will be released on Dec 1. From the publisher’s website (where you’ll be able to get your copy in due course):

Failing Gloriously and Other Essays documents Shawn Graham’s odyssey through the digital humanities and digital archaeology against the backdrop of the 21st-century university. At turns hilarious, depressing, and inspiring, Graham’s book presents a contemporary take on the academic memoir, but rather than celebrating the victories, he reflects on the failures and considers their impact on his intellectual and professional development. These aren’t heroic tales of overcoming odds or paeans to failure as evidence for a macho willingness to take risks. They’re honest lessons laced with a genuine humility that encourages us to think about making it safer for ourselves and others to fail.

A foreword from Eric Kansa and an afterword by Neha Gupta engage the lessons of Failing Gloriously and consider the role of failure in digital archaeology, the humanities, and the social sciences.

The book will be available in print for $, and for free via pdf download.

Quinn Dombrowski has posted a wonderfully generous review over on Stanford Digital Humanities. I hope you’ll find value in it too!

scraping with rvest

We’re working on a second edition of The Historian’s Macroscope. We’re pruning dead links, updating bits and bobs, and making sure things still work the way we imagined they’d work.

But we really relied on a couple of commercial pieces of software, and while there’s nothing wrong with doing that, I really don’t want to be shilling for various companies, or trying to explain in print how to click this, then that, then look for this menu…

So, I figured, what the hell, let’s take the new-to-digital-history person by the hand and push them into the R and RStudio pool.

What shall we scrape? Perhaps we’re interested in the diaries of the second American President, John Adams. The diaries have been transcribed and put online by the Massachusetts Historical Society. The diaries are sorted by date on this page. Each diary has its own webpage, and is linked to on that index page. We would like to collect all of these links into a list, and then iterate through the list, grabbing all of the text of the diaries (without all of the surrounding html!) and copying them into both a series of text files on our machine, and into a variable so that we can do further analysis (eventually).

If you look at any of the webpages containing the diary entries, and study the source (right-click, ‘view source’), you’ll see that the text of the diary is wrapped or embraced by an opening

<div class="entry">

and closing

</div>

That’s what we’re after. If you look at the source code for the main index page listing all of the diaries, you’ll see that the links are all relative links rather than absolute ones – they just have the next bit of the url relative to a base url. Every webpage will be different; you will get used to right-clicking and ‘viewing source’ or using the ‘inspector’.

For the purposes of this exercise, it isn’t necessary to install R and RStudio on your own machine, although you are welcome to do so and you will want to do so eventually. For now we can run a version of RStudio in your browser courtesy of the Binder service – if you click the link here, a version of RStudio already preconfigured with many useful packages will (eventually) fire up in your browser, including rvest and dplyr, which we will be using shortly.

With RStudio loaded up, select file > new file > r script (or, click on the green plus sign beside the R icon).

The panel that opens is where we’re going to write our code. We’re not going to write our code from first principles, though. We’re going to take advantage of an existing package called ‘rvest’ (pronounce it as if you’re a pirate…) and we are going to reuse, but gently modify, code that Jerid Francom first wrote to scrape State of the Union addresses. By writing scripts or code to do our work (from data gathering all the way through to visualization) we enable other scholars to build on our work, to replicate our work, and to critique our work.

In the code snippets below, any line that starts with a # is a comment. Anything else is a line we run.


library(rvest)
library(dplyr)

These first two lines tell R that we want to use the rvest and dplyr packages to make things a bit easier. Put your cursor at the end of each line, and hit the ‘run’ button. R will pass the code into the console window below; if all goes well, it will just show a new prompt down there. Error messages will appear if things go wrong, of course. The cursor will move down to the next line; hit ‘run’ again. Now let’s tell R the baseurl and the main page that we want to scrape. Type:


base_url <- "https://www.masshist.org"
# Load the page
main.page <- read_html(x = "https://www.masshist.org/digitaladams/archive/browse/diaries_by_date.php")

We give a variable a name, and then use the <- arrow to tell R what goes into that variable. In the code above, we are also using rvest’s function for reading html to tell R that, well, we want it to fill the variable ‘main.page’ with the html from that location. Now let’s get some data:

# Get link URLs
urls <- main.page %>% # feed `main.page` to the next step
    html_nodes("a") %>% # find the <a> nodes with a CSS selector
    html_attr("href") # extract each link's URL from its href attribute
# Get link text
links <- main.page %>% # feed `main.page` to the next step
    html_nodes("a") %>% # find the <a> nodes with a CSS selector
    html_text() # extract the link text

In the code above, we first create a variable called ‘urls’. We feed it the html from main.page; the %>% passes the data on its left to the function on its right. In this case that’s ‘html_nodes’, which travels through the html looking for every ‘a’ (anchor) node; ‘html_attr’ then pulls out the ‘href’ attribute of each hyperlink. The urls are thus extracted. Then we do it again, but this time pass the text of each link to our ‘links’ variable. You’ve scraped some data!
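
If you’d like a quick look at what you’ve just captured (a sanity check I’m adding here; it isn’t part of Francom’s original recipe), ask R to show you the first few entries of each variable:

head(urls) # peek at the first six URLs
head(links) # peek at the first six pieces of link text
length(urls) # count how many links we grabbed

You’ll notice the urls are relative paths, which is why we’ll paste the base url onto them in a moment.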

But it’s not very usable yet. We’re going to make a ‘data frame’, or a table, of these results, creating a column for ‘links’ and a column for ‘urls’. Remember how we said earlier that the links were all relative? We’re also going to paste the base url into those links so that we get the complete path, the complete url, to each diary’s webpage.


# Combine `links` and `urls` into a data.frame
# because the links are all relative, let's add the base url with paste
diaries <- data.frame(links = links, urls = paste(base_url,urls, sep=""), stringsAsFactors = FALSE)

Here, we have created a ‘diaries’ variable, and we’ve told R that it’s actually a dataframe. In making that dataframe we are saying, ‘make a links column, and put links into it; and make an urls column, but paste the base_url and the link url together and do not put a space between them’. The ‘stringsAsFactors’ bit isn’t germane to us right now (but you can read about it here). Want to see what you’ve got so far?


View(diaries)

The uppercase ‘V’ is important; a lowercase view() doesn’t exist in R. Your dataframe will open in a new tab beside your script, and you can see what you have. But there are a couple of rows there where we’ve grabbed links like ‘home’, ‘search’, and ‘browse’, which we do not want. Every row that we do want begins with ‘John Adams’ (and in fact, if we don’t get rid of the rows we don’t want, the next bit of code won’t work!).


# but we have a few links to 'home' etc that we don't want
# so we'll filter those out with grepl and a regular
# expression that looks for 'John' at the start of
# the links field.
diaries <- diaries %>% filter(grepl("^John", links))

We are telling R to overwrite ‘diaries’ with ‘diaries’ that we have passed through a filter. The filter command has also been told how to filter: use ‘grepl’ and the regular expression (or search pattern) ^John. In English: keep only the rows whose links column begins with the word John. Try View(diaries) again. All the extra stuff should be gone now!
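
Before we turn a loop loose on every page, it’s worth testing our ‘.entry’ selector on a single diary (a quick check I’m adding here; it isn’t part of the original recipe):

# grab the first diary's page and try the .entry selector on it
read_html(diaries$urls[1]) %>% # load the first diary's page
    html_nodes(".entry") %>% # isolate the entry div
    html_text() # print the diary text to the console

If a wall of diary text appears in the console, the selector works and the loop below is safe to run.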

We still haven’t grabbed the diary entries themselves yet. We’ll do that in a moment, while at the same time writing those entries into their own folder in individual text files. Let’s create a directory to put them in:


#create a directory to keep our materials in

dir.create("diaries")

And now we’re going to systematically move through our list of diaries, one row at a time, extracting the diary entry which, when we examined the webpage source code earlier, we saw was marked by an ‘entry’ div. Here we go!


# Loop over each row in `diaries`, extracting the entry from each page
# and then writing it to file.
for(i in seq(nrow(diaries))) {
    text <- read_html(diaries$urls[i]) %>% # load the page
        html_nodes(".entry") %>% # isolate the entry div
        html_text() # get the text

    # create the file name, using the relevant link text
    filename <- paste0("diaries/", diaries$links[i], ".txt")
    sink(file = filename) # open the file for writing
    cat(text) # write the text into the file
    sink() # close the file
}

The first line sets up a loop – ‘i’ keeps track of which row in ‘diaries’ we are currently on. The code between the { and } is the code that we loop through for each row. So, we start with the first row. We create a variable called ‘text’, into which we get the read_html function from rvest to read the html for the webpage address in the urls column of ‘diaries’ at row i. We pass that html to the html_nodes function, which looks for the div that embraces the diary entry. We pass what we found there to the html_text function, which extracts the actual text.

That was part one of the loop. In part two, we create a filename variable, building a name from the link text for the webpage by pasting together the folder name diaries + link-name-from-this-row + .txt. We use the ‘sink’ command to tell R we want to drain the output into a file. ‘cat’, short for ‘concatenate’, does the writing, putting the contents of the text variable into the file. Then we close the sink. We reach the closing bracket } and the loop starts over again, moving to the next row.

Cool, eh?

You now have a folder filled with text files that we can analyze with a variety of tools or approaches, and a text variable all ready to go for more analysis right now in R.
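
If you want a taste of that further analysis right now, here is a minimal sketch (my own addition, using only base R, and assuming the loop above ran cleanly) that reads the files back in and counts the most common words:

# read every file in the diaries folder into one long lowercase string
files <- list.files("diaries", full.names = TRUE)
all_text <- tolower(paste(unlist(lapply(files, readLines)), collapse = " "))

# split on anything that isn't a letter, then tabulate word frequencies
words <- unlist(strsplit(all_text, "[^a-z]+"))
head(sort(table(words), decreasing = TRUE), 20)

You’d want to strip out stopwords (‘the’, ‘and’, and so on) before reading anything into the results, but it’s a start.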

The full code is in this github gist:

The Resurrection of Flinders Petrie

The following is an extended excerpt from my book-in-progress, “An Enchantment of Digital Archaeology: Raising the Dead with Agent Based Models, Archaeogaming, and Artificial Intelligence”, which is under contract with Berghahn Books, New York, and is to see the light of day in the summer of 2020. I welcome your thoughts. The final form of this section will no doubt change by the time I get through the entire process. I use the term ‘golems’ earlier in the book to describe the agents of agent based modeling, which I then translate into archaeogames, and which, I then muse, might be powered by neural network models of language like GPT-2.

The code that I used to generate pseudo Gibbons and pseudo Sophocles modelled the probabilities of different letters following one another. While sophisticated at the time, that approach is now little more than a toy. With the increase in computational power and complexity, these newer models open up tricky ethical issues for us, particularly if we use them to try to give our digital creations their own voice. Let me sketch out how these new models work, resurrect Flinders Petrie, and then we’ll examine the aftermath.

More complex models of language now try to work out the ‘rules’ of language by ‘masking’ whole words and working out the probabilities to deduce the hidden word from the words that sit to either side. Others try to represent a word as an ‘embedding’ in multi-dimensional space (using a model built by Google). But the most complicated model, and the most successful as of this writing, is probably the GPT-2 model, developed by the OpenAI foundation. It was trained on some 8 million webpages, found by following outward links on Reddit that Reddit users scored as highly useful. The sheer size of this data let OpenAI develop a model that contains 1.5 billion parameters. Because of the potential malicious uses of the model, OpenAI did not release this full model, but instead made available a series of smaller models: a ‘small’ model with 117 million parameters, and a ‘medium’ model with 355 million (there is a ‘large’ model available, but it is too large to run on a typical laptop).

Jesse Vig from the ‘Towards Data Science’ group blog has a post (2019) exploring how this model generates text. He begins by experimenting, trying to break the model (an ‘alchemical’ approach, as Rahimi 2017 laments), feeding it two distinct sentence prompts:

‘The dog on the ship ran’
‘The motor on the ship ran’

The model completes the first sentence with, “The dog on the ship ran off, and the dog was found by the crew.” It completes the second sentence with, “The motor on the ship ran at a speed of about 100 miles per hour.” The model understands the difference between dogs and motors, and the likely probabilities of different words completing the sentence. The model has what its creators call an attention mechanism, which focuses on the linguistic idea of the ‘head’ of the noun phrase. There are other linguistic patterns captured by the model, arranged 12 to a layer across 12 layers. It is this attention mechanism and the learning of linguistic patterns that accounts for this model’s success.

I fed the opening chapter of this book to talktotransformer.com, an interactive front-end for the GPT-2 model; the first paragraph below is what I wrote, the second is the machine’s:

This is a book about enchantment and digital archaeology. It is a book about trying to pull together the connective threads on nearly twenty years of work in simulation, agent modelling, video games, and Roman economic history. These are not, on the face of it, ‘enchanting’ topics. But hear me out. It’s about trying to remember what it was that was magical about archaeology when I first began, and why I find digital approaches to archaeology to still be magical.

In this book, I’d like to trace out some of the elements that underpin magical engagement with archaeology. I would like to show you some of the tools digital archaeology (D-arch), and computer archaeology have developed in order to be able to address specific aspects of magical engagement. But it isn’t hard to imagine that if you were an archaeologist working in the digital frontier of archaeology a lot of these new tools might not have occurred to you. To me, it’s about being able to be enchanted by the methods that a new medium has provided to the archaeological enterprise

I don’t know whether to be pleased or worried by this result. It could almost be the blurb on the back of this book’s cover. It gets a lot of things right.

Salesforce, a marketing company, has built their own model called CTRL on top of GPT-2 and released it on Github (https://github.com/salesforce/ctrl). This model has 1.6 billion parameters “trained to condition on control codes that specify domain, subdomain, entities, relationships between entities, dates, and task-specific behavior. Control codes were derived from structure that naturally co-occurs with raw text, preserving the advantages of unsupervised learning while providing more explicit control over text generation” (Salesforce, 2019). Presumably, Salesforce is not as concerned about malicious weaponization of their model as OpenAI is.

This elaboration of the GPT-2 model can be prompted with, for instance, a URL (real or fake); told that the prompt text is a ‘link’, it will generate text informed by the metadata contained in the link (Woolf, 2019). For instance, using Woolf’s code for interacting with CTRL, I fed it the fake url, https://www.cnn.com/world/09/19/2019/evidence-for-atlantis-found-in-suburban-ottawa-backyard. This fake URL does not point to anything at CNN, but it does mimic the structure of a real CNN story. The words in the page title are understood by CTRL as controls on that 1.6-billion-parameter space, collapsing the probabilities of the language it generates to the ‘news story’ dimensions. The result:

Evidence of Atlantis found in suburban Ottawa
MUST WATCH
Story highlights
The discovery was made by a man who lives on the property
It is believed to be one of only two known locations where ancient artifacts have been discovered
(CNN) A Canadian couple has uncovered evidence that could help solve an age-old mystery — and possibly prove Atlantis’ existence.
In what they say is the first time such a find has ever happened in Canada, a husband and wife from Ottawa, Ontario, stumbled upon a cairn…

I fed the same model fake URLs patterned after my university department’s faculty biography pages. I created two names, ‘Sarah O’Connor’ and ‘Stuart O’Connor’. Sarah’s biography was utterly glowing, but listed the many service organizations that Sarah belonged to, while Stuart’s biography put its energy into listing all of his books. The model understands the gendered and hidden labour of the academy, it seems – should you play with talktotransformer.com and experiment with sexualized or racialized language, do not be surprised when it generates toxic or not-safe-for-work text, reflecting our current world view back at us (at least, as represented by the internet in 2019).

The text that models like GPT-2 and CTRL can generate demonstrates an incredibly sophisticated understanding of how the English language works, and of the interrelationships of different concepts (seemingly without limit!) that might be the subject of conversation. GPT-2 and CTRL are not just language models, but knowledge models.

And somewhere in there is archaeology.

The experiment with using CTRL demonstrates that my thought experiment from earlier in this chapter, building a digital pseudo-Wheeler who I might want to interrogate about his worldview, is feasible now. What’s more, the CTRL codes show that it is possible to collapse the multiverse of possible generations to a local universe conditioned on a particular worldview. In neural network work on images, you might be familiar with the idea of photos that get ‘painted’ in the style of Monet or Picasso via ‘style-transfer’. This is the same idea, but with text.

I used Max Woolf’s code for retraining the GPT-2 ‘medium’ model (with 355 million parameters, 1.5 gb in size) on Flinders Petrie’s Ten Years’ Digging in Egypt 1881-1891, creating a model of Petrie’s language and – potentially – worldview (I would have used Wheeler, but Petrie’s writings are already available on the Project Gutenberg website, readily digitized). Then I ask this simPetrie some questions about archaeology. The dialogue below represents our ‘conversation’; each question by me was a prompt to the model to generate text. None of simPetrie’s responses have been edited. As you read simPetrie’s responses, reflect on what they (may) reveal about the doing of archaeology in the late 19th century. Given that Flinders Petrie is celebrated in archaeology for the development of both field and analytical methods, let’s ask his digital doppelgänger for its thoughts on the matter:

SG> ‘Professor Petrie, what is the key finding of your researches?’
SimPetrie> ‘The only way to find out what is the key to every mystery is to know everything about it, and then to try to discover the mystery as far as possible by deductions from what is known. When we come to the real problem, how do we find the real in the unreal?’
SG> ‘Professor Petrie, could you describe your methods for us?’
SimPetrie> ‘I have always tried to be fair to the student, and to give as much information as possible. I have tried to give as many facts as I can remember; and then try to match them with each other in the various accounts, so as to arrive at some general idea. People often ask me where the evidence for a fact comes from; and that is no easy matter, when the evidence is so scanty. I have often thought of writing a work on probability, in which every fact might be found, and the numbers by which they must be numbered, by heart, according to their weight in evidence. I believe I have found the right ratio; and the more I think of it, the more I see that it will not only be right, but will be of great service to our subject.
This method of ascertaining the amount of evidence is the only one that seems to me entirely consistent with scientific accuracy. It is easy enough to find the average of thousands of facts; but it is much harder to find the average of the facts themselves.’
SG> ‘Could you describe for us the role of local labour in your excavations and surveys?’
SimPetrie> ‘The local labourers, who live and work under me, are the most intelligent and hardy people I have met. They live by digging, and having to earn their living by selling the finished product to other local people who use it to make shoes, &c. Though they are often exploited in this business, they never trouble about it too much, and are content with what they can get for a good wage. Their only ambition is to live as cheaply as possible. The knowledge which I have gained from their labours has been of great service to me in my excavations, and I have always been glad when one of them has told me how he has improved his skill, or found a new construction, by improving the methods of his fellows. I should be sorry to see such methods go to waste for want of improvement and improvement of the methods of my neighbours, for the end in view is the same.’
SG> ‘Professor Petrie, what are the sorrows of archaeology?’
SimPetrie> ‘The sorrows of a dead man. I have seen the dead. All that is preserved is what is in their power to leave behind them. The remains of the temples, though they may be exposed, are of no use. They are only patches of stone, in a country where there are many other buildings. If the dead man had been a sculptor, he would have had no need of the temples; nor is it likely that any one could have made any use of them except by accident. The temples, then, were but a sad sacrifice to an unmeaning king.’

SimPetrie is startlingly coherent. The original Petrie’s concern for rigor and consistency in his archaeological recording and systemization of materials appears in the digital Petrie. The use/abuse of local labour is hinted at, with a touch of the colonial condescension we might expect from a 19th century Englishman. The overly (to our ears) florid language of the period comes through.

When we ask simPetrie about ‘his’ world, the neural network takes our question as a signal of how to collapse the possibilities for the generation of its response. Careful questioning and breaking could reveal the limits of that simulated world view. How does that worldview map back to the original Petrie’s? How far can it be pushed before it breaks? Much like an agent based model has to be run through all of its possible combinations of parameters to understand the simulated world view, the simulated history’s behaviour space, we have to figure out a method for doing the same for this neural networked model. One way of doing this, perhaps, might be to deploy data mining and text analysis. I could imagine asking simPetrie the same question a thousand times at each ‘temperature’ or creativity setting between 0 and 1. Then I would topic model these responses (looking for statistical patterns of co-occurrence of words across responses), and map how the discourses found therein persist or evolve over the creative space of simPetrie’s responses. That might begin to give us a map of the territory that we have stumbled upon. It will require much work and indeed play, experimentation, and the willful breaking of the models to expose the sharp edges.

Some of the things we wish to play with, like the GPT-2 and CTRL models with their billions of parameters, are perhaps too big for enchantment? Is this where we spill from enchantment to terror? These models, after all, now that they’ve been generated (and consider that the energy and environmental cost of training such a model is estimated to be five times worse than that emitted by a car over its entire lifespan, or approximately 626,000 pounds of carbon dioxide equivalent; Strubell et al 2019; Hao 2019), can be deployed so easily that a single scholar on a commercial laptop can use them. The technology behind these models is not that far removed from the technologies that can simulate and generate perfect audio and perfect video of things that never happened or were never said, so-called ‘deepfakes’ (these too depend on neural network architectures). We will need to develop methods to deal with and identify when these models are deployed, and quickly. By the time this book is in your hands, there will be new models, larger models, of text generation, of language, and they will be deployed across a range of tasks. It will be exceedingly hard to spot the work written by the machine, versus that written by a human. Our golems are getting out of control. But there are other ethical issues, too.

The Ethics of Giving the Golems a Voice

“When we teach computers to write, the computers don’t replace us any more than pianos replace pianists—in a certain way, they become our pens, and we become more than writers. We become writers of writers.” – Goodwin 2016

“The hypothesis behind invisible writings was laughably complicated.  All books are tenuously connected through L-space and, therefore, the content of any book ever written or yet to be written may, in the right circumstances, be deduced from a sufficiently close study of books already in existence.  Future books exist in potentia, as it were…” Pratchett, The Last Continent

“How do we find the real in the unreal?” – simPetrie

In a world where computers can be creative on their own, ‘authorship’ is not about putting the words down on the page, and ‘scholarship’ is not necessarily about marshalling facts about the world in a logical order to make an argument. Instead, they become an act of creative composition and recomposition, of remixing and selecting texts for training and hyperparameters to be tuned. It is in fact the same skills and techniques and scholarly work that inform the creation of agent based models. This kind of generative computational creative writing is not really about making a machine pass for a human, but, much like the agent based models discussed earlier in this volume, it is about discovering and mapping the full landscape of possibilities, the space within which Petrie could have written. These particular questions prompted the machine to collapse the possibility space around how archaeology was conducted, and whose voice mattered in that work; thus the results perhaps give us access to things that were so obvious they were never written down. What is the evidentiary status of a mapping of the behaviour space of the model? There could be a fascinating PhD thesis in that question. But this dialogue with simPetrie, for me, also raises some interesting ethical issues that so far in digital archaeology – led by the work of people like Meghan Dennis or Lorna Richardson or Colleen Morgan – we are only beginning to explore.

Tiffany Chan, for her MA thesis in English at the University of Victoria, used a recurrent neural network to map out the space of one particular author. She writes,

“[W]hat could we learn about our object of inquiry (in this case, literature) if we broke down, remade, and compared or interpreted it either alongside or as if it were the original? Articulated in Victorian terms, this project is like conducting a séance with a computer instead of a Ouija board. The computer mediates between human and machine, between the dead and the living. If, as Stephen Greenblatt suggests, literary study begins with “the desire to speak with the dead”… then [this project] begins by impelling the dead to speak.” (2017).

Colleen Morgan wrote, a decade ago, in the context of video games that use historical persons as non-player characters to decorate the games, “NPCs are nonhuman manifestations of a network of agents (polygons, “modern” humans, fiber-optics, and the dead person herself) and the relationships between these agents and as a result should be studied as such.  But does this understanding of an NPC as a network make it ethical to take such liberties with the visages of the dead? What does it mean when Joey Ramone comes back from the dead to sell Doc Martins?”

In these two passages, we find many of the threads of this book. We see ‘networks’ as both a literal series of connective technologies that thread the digital and analog worlds together. We see an impulse to raise the dead and ask them questions, and we see something of the ethical issues in making the dead speak. For instance, Petrie plainly did not say any of the things the simPetrie did in our dialogue. What if simPetrie had said something odious? It’s entirely possible that the model could extrapolate from hateful speech collected in its training corpus, triggered by passages in the small body of text of Petrie with which I perturbed the original.

What if that text gets taken out of context (an academic book or journal article) and is treated as if Petrie actually did say these things? In a conversation on Twitter about simPetrie, the computer scientist and sometimes archaeogamer John Aycock raised the issue with me of desecration: similar to the way human remains can be desecrated and ill-used in the real world, could this use of computation be a kind of desecration of a person’s intellectual remains? Lorna Richardson points out that the creation of any kind of visualization of archaeological materials or narrative ‘is a conscious choice, as well as a political act’ (Richardson, 2018). If these models are the instrument through which I ‘play’ the past as Goodwin (2016) suggests, then I am responsible for what collapses out of that possibility space. The ethical task would be to work out the ways the collapsing possibility space can do harm, and to whom.

The advertising and entertainment industries have the greatest experience so far with raising simulacra of dead celebrities to sell us things and to entertain us. Tupac Shakur raps on stage with Snoop Dogg, years after his death. Michael Jackson performs from beyond the grave at the Billboard Awards. Nat King Cole sings a duet with his daughter. Steve McQueen races a 2005 Ford Mustang. These uses of the dead, and their resurrection, are more troubling than portrayals of historical figures in films or video games, because of the aura of authenticity that they generate. Alexandra Sherlock argues that

“… The digital individual continues, irrelevant of the death of its author and prototype, and since the relationship that viewers have with this social entity was always conducted through representations and images anyway, nothing about this relationship actually changes… in popular culture the media persona becomes divorced from the actual embodied celebrity and their representations become a separate embodiment of their own – an embodiment with which people are able to identify and bond with in an authentic and real way.” (2013: 168).

These representations of dead celebrities worked because they depended upon, and continued to promote, para-social, one-sided relationships – the public was so used to the feeling of being connected with the idea of these individuals that their digital resurrection proved no obstacle, no barrier, to enjoying the performance. Sherlock discusses an episode where the digital resurrection of a celebrity did go wrong – the resurrection of Orville Redenbacher, of popcorn fame: “Rather than promoting the enchanting notion of immortality, Redenbacher’s advertising agency had accidentally and rather embarrassingly reminded viewers of the mortality of Redenbacher, and themselves by extension” (170). The advertisement fell into the uncanny valley, the term from robotics that describes when a robot is so human-like that the few errors in the depiction (lifeless eyes, for instance) generate a feeling of creepiness.

Sherlock calls this entire process of using the images of entertainers, whether as holograms or on film, ‘digital necromancy’, and attributes some of the success (or failures) to the idea that, in addition to profiting from a para-social relationship, the revenants fill a need for answers, a need for reassurance in the face of death, given that Western culture largely avoids talking about death:

“…a form of necromancy does exist today, precisely in response to the marginalization of death. One might perhaps consider the technicians who created the Bob Monkhouse advertisement [where the comedian tells the audience about his own death from cancer] as modern necromancers – reanimating the digital remains of the deceased Monkhouse to impart his knowledge concerning his own death. It is as though the ancient art of necromancy has resurfaced in the practice of digital resurrection.” (171).

All of which is to say: simPetrie could become ‘real’ in the same way the personas of entertainers and celebrities become ‘real’, and the views and opinions expressed by the digital doppelgänger given far more weight than is warranted. “Subconsciously, their appearances may appeal to embedded beliefs that the dead are wise and knowledgeable: if they speak or show themselves to us, we should pay attention. Somehow the dead seem more believable.” (172)

When 2k Games, the makers of the game Civilization, included in its sixth iteration the Cree leader Pîhtokahanapiwiyin (Poundmaker) as one of the playable leader characters, they put words in his mouth. Milton Tootoosis of the modern Poundmaker First Nation said, “[This representation] perpetuates this myth that First Nations had similar values that the colonial culture has, and that is one of conquering other peoples and accessing their land… That is totally not in concert with our traditional ways and world view.” (Chalk, 2018). While the depiction and lack of consultation with the Poundmaker First Nation is troubling enough on its own, imagine if the game-character of Pîhtokahanapiwiyin was coded in the way simPetrie was, and imagine further that the developers did not consult with the Cree on which texts to use for training – or whether to do this at all.

The danger of the neural-network-powered representation is in its liveliness, the possibility of fostering the kind of para-social bonds that make the examples drawn from the advertising and entertainment worlds work. A neural-network-powered representation of a key figure in Cree history would run the risk of becoming the version of Pîhtokahanapiwiyin that sticks; who builds and designs such a representation, and for what aim, matters. This neural network approach to giving voice to a video game’s non-player characters, or to an agent-based simulation’s agents, is exceedingly powerful. If we are building simulations of the past, whether through archaeogaming or agent modeling, we either need to make our software agents mere ciphers for actual humans, or we need to think through the ethics of consultation, of representation, and of permission in a much deeper way. The technology is racing ahead of our ability to think through its potential harms.

There is also an ethical issue in the creation of the training data for GPT-2 in the first place, the creation of the possibility space. The authors of those 8 million webpages obviously never consented to being part of GPT-2; the material was simply taken (a kind of digital colonialism/terra nullius). The use of Reddit as a starting place, and relying on Reddit users’ selection of ‘useful’ sites (by the awarding of ‘karma’ points of 3 or more to a link), does not take into account the demographics of the Reddit user community/communities. The things that white men 18-35 living in a technophilic West see as interesting or valuable may not be the kind of possibility-space that we really want to start baking into the artificial intelligences powering our world. Taking a page from information ethics, Sicart (2009) argues in the context of video games that permitting meaningful choices within a game situation is the correct ethical stance; where are the meaningful choices for me, who ‘plays’ the GPT-2 model, or for me, whose website may be somewhere inside the model?

A framework for considering the myriad ethical issues that might percolate out of this way of raising the dead and giving them a voice again might be the ‘informational ethics’ of Floridi and Sanders, as interpreted by Sicart from the perspective of video games. This perspective considers ‘beings’ in terms of their data properties. Data properties are the properties of relationships and the contingent situation of a thing. That is to say, what makes the rock on my desk a paperweight rather than merely debris is its relationship to me, our past history of a walk on the beach and the act of me picking the rock up, and the proper ways of using objects for holding down papers on desks (Sicart 2009, 246, citing Floridi 2003). Compare this with Ingold’s ‘material against materiality’, where he invites you to pick up a stone, wet it, and then come back to it a short while later:

“[…]the stone has changed as it has dried out. Stoniness, then, is not in the stone’s ‘nature’, in its materiality. Nor is it merely in the mind of the observer or practitioner. Rather, it emerges through the stone’s involvement in its total surroundings – including you, the observer – and from the manifold ways in which it is engaged in the currents of the lifeworld. The properties of materials, in short, are not attributes but histories.” (Ingold 2007, 15)

The meaning of data entities lies within the web of relationships with other data entities, and all things, whether biological or digital, are data entities (Sicart 2009, 128-130; Morgan 2009). From this perspective there is moral import because to reduce information complexity is to cause damage: “information ethics considers moral actions an information process” (Sicart 2009, 130). The information processes that give birth to simPetrie, that abstract information out of GPT-2, that collapse the parameter space to one local universe out of its multiverses, are all moral actions. For instance, these language models and these neural network technologies are predicated on an English model of the world, an English approach to language. Models like GPT-2 obtain part of their power through their inscrutability. Foucault (1999: 222) wondered what an ‘author’ might be, and concluded it emerges in the condensation of physical and cultural influences, that ‘the author function’ disappears instead to be experienced:

“What are the modes of existence of this discourse? Where has it been used, how can it circulate, and who can appropriate it for himself? What are the places in it where there is room for possible subjects? […] What difference does it make who is speaking?”

That is the ethical question posed by archaeogaming, because the ‘who’ isn’t just humans anymore.

“An Open Access Oops?” – my #patc4 source

“An Open Access Oops?”

Abstract:
I generally believe that making my research and my results open access is a moral imperative. But recently, certain events in the reception of our research on the trade in human remains online have made me wonder if there are situations where the greater good is served by _not_ making our work openly available. In this piece, I recount what happened and reflect on the contexts of archaeological openness.

Delivered: 15 minutes/ 45 secs per tweet. Below is the text I pasted into the ‘what’s new?’ box as fast as I could go. Turns out you can’t schedule a thread in tweetdeck; or if you can, I couldn’t figure it out.

Hi folks, I’m Shawn Graham; I’m a prof in the history dept @Carleton_U . Somewhere along the way I became a digital archaeologist. My #patc4 paper is “An Open Access Oops”.

Lemme tell you a little story & let me ask some little questions. /1

[gif House saying oops ]

Firstly, I became a digital archaeologist from necessity. If people shared data, I cld pretend to myself that I was ‘doing’ archae! Open access was a lifeline. Playing, exploring, & building from other people’s data allowed me to re-invent myself /2 #patc4

I’ve always felt then, aside from all the other arguments for open access, there was a moral imperative to pay it back. Right? I had benefited; now that I’m in a position to do it, I need to get my materials out there, in remembrance of the lost post-phd guy I was. /3 #patc4

Fast-forward. I never set out to study the trade in human remains http://bonetrade.github.io. But here I am, & we’ve been publishing in OA journals, making code and data freely available… Good, right? Well… here’s what happened. Let’s air what feels like a fail. /4 #PATC4

[gif ‘fail’ krusty, judges 0]

In january, the faculty did a piece on our ‘Bone Trade’ project (@damien_huffer) (here: https://m.carleton.ca/fass/story/innovative-historian-studies-the-sale-of-human-remains-on-the-internet/).

This summer, a local journalist wanted to talk to me about the project; the story was published here: https://ottawacitizen.com/news/local-news/carleton-prof-harnesses-machine-learning-to-explore-the-bone-trade-netherworld /5 #PATC4


So far, so good! Everyone wants their research to attract some attention, right? The Citizen is part of the Postmedia group, so the story got taken up by various papers across Canada.

Then a political candidate bought a human skull as a gift for her boyfriend. /6 #PATC4

[oh no < – kermit gif]


APTN, Aboriginal Peoples Television Network, broke the story and asked me for comment, having seen the other newspaper article. The APTN story was taken up by lots of other outlets, including Newsweek. Suddenly, there were interview requests everywhere /7 #PATC4

Our work made it into Wired (the politician did not) https://www.wired.co.uk/article/instagram-skull-trade . But, in trying to be ‘balanced’, it seems, the story included interviews w collectors. And they made the editorial decision to embed _in the story_ posts from Instagram selling human remains /8 #PATC4

The story was picked up and re-worked across multiple outlets. Here’s the Sun’s attempt https://www.thesun.co.uk/tech/9542441/human-remains-for-sale-instagram-black-market/. We’ve been erased from the research, and the nuance we try for in our work is lost. But the collectors are getting a lot of oxygen! /9 #patc4

A number of outlets contacted us, for interviews (including BBC), requesting that we also put them in touch with collectors. I refused to do this. If we were studying sex trafficking, would you ask us to put you in touch with pimps? /10 #patc4

[gif why monkey]

I know this is not a particularly egregious case; there are far worse out there. But we know that buyers/sellers of human remains are reading our work and adapting accordingly. With the press attention, and the celebration of the ‘eccentric’ collectors, + /11 #patc4

how much traffic have we driven to collectors? to what degree have we helped promote the trade we are studying? how have we changed their behaviour to _enhance_ their ability to trade without prying eyes? /12 #patc4

These human remains were collected in morally, ethically, legally dubious circumstances. To reduce them to clickbait is to return us to the era of ‘human zoos’. How many times will these people be dehumanized? But… we published OA. We put our material out there. /12 #patc4

It’s our fault, right? Publishing the work needs to be done openly, I thought, given how these remains were collected in the first place in secret (eg https://www.academia.edu/14663044/Harlan_I._Smiths_Jesup_Fieldwork_on_the_Northwest_Coast p154). sunlight, disinfectant?

Maybe I was wrong. /13 #PATC4

But hiding the work behind paywalls is wrong, too. Publicly funded work should be accessible by the public (which publics, SG?). We didn’t conceive the project as ‘public archae’, but if we had we would not have gotten into this mess of inadvertently promoting sellers. /14 #PATC4

A month or two later, I return to scraping Instagram, and I notice new figures active, old figures gone, & maybe the internet’s short attention span has taken care of the situation. Maybe I worry too much. But is this a case where OA is the wrong approach? /15 #patc4

Or is the error: the attracting of attention, drawing the eye of a media ecosystem addicted to both-sides-ism, an ecosystem addled by ‘engagement’ mechanics predicated on outrage? /16 #patc4

[eye of sauron]

I know I conceived this project without thinking about how, if you study things online, things online have a way of pushing back. In which case, I decided to talk about it here at #patc4, so that I can learn from wiser heads. /17

The human remains trade in its origins is part of the literal flow of human bodies from around the world into the West. As @priscillaulguim reminds us https://twitter.com/priscillaulguim/status/1169382105547202561 OA assumes I have the right to share; but not always true & the contexts are complex. /18 #PATC4

I am also from the global north, the consumer of these bodies, of these data. Unthinking OA (as @priscillaulguim alluded to last night https://twitter.com/priscillaulguim/status/1169382281485701127) allows me to profit academically from these bodies one more time. /19 #patc4

Before I was a prof, OA let me play at being an archaeologist. Now on the other side, I want to get my research out there: but naive OA, especially in archaeology, is not without its risks, as this summer has demonstrated. I need to do better. /fin #patc4

[screenshot of the thing below]

PS One more thing- The one seller, who got progressively higher and higher profile in the news stories? IG deleted his account. His webstore remains, but he’s rebuilding on Instagram. The internet makes Red Queens of us all. https://en.wikipedia.org/wiki/Red_Queen_hypothesis /really fin

[pic!]

quick visualization of tags – notes using sublime, zettelkasten, gephi, and bash

So you take your notes following the Zettelkasten method, do you? One thought per card? Cool. I was never taught how to take good notes, and I still struggle with it. Rene Schallner’s zk-sublime suits the way I like to work these days, in a text editor. I end up with a lovely folder filled with markdown notes that have internal links, tag searching, ‘friends’ searching… it’s great. As long as I’m using Sublime 3 (which is no chore).

Anyway, I was thinking to myself that it would be nice to feed the notes into a static site generator to make a nice online version that other folks could peruse. This would require converting all of the internal links to markdown links, and if I was using Jekyll etc, adding the right kind of metadata to every post. I cheated, and tried to use mdwiki, a no-longer-actively-maintained project that turns a folder into a site with the addition of a single html file (containing all of the necessary js and so on). I spent a lot of time on that; here’s a bash script that turns the directory listing of my note folder into an index.md that mdwiki can use:


#!/bin/bash
# A sample Bash script to turn the contents of a directory
# into a md file with filenames as md links


# put the directory contents into a file
echo "creating toc"
ls > index.md

# put the brackets around the line
echo "beginning line formatting"
sed -i '.bak' 's/^/[/' index.md
sed -i '.bak' 's/$/]/' index.md

# duplicate the line

sed -i '.bak' -E 's/^(.*)/\1\1/' index.md

# now to convert the SECOND [ and ] to ( )

sed -i '.bak' 's/\[/\(/2' index.md
sed -i '.bak' 's/\]/\)/2' index.md

# and this bit was the start of me trying to create a unique page for each
# tag, which eventually would end up listing all relevant
# note pages. I got the files made, at any rate; nothing in 'em yet.

grep tags *.md -R > tags.md

sed -i '.bak' 's/#/ /g' tags.md
sed -E 's/([0-9]+.)([A-Za-z ]+.)(md:tags:)//g' tags.md | tr ' ' '\n' > tags2.md
sed -i '.bak' '/^[[:space:]]*$/d' tags2.md
cat tags2.md | xargs touch
rm tags2.md
echo "done"

which was fine, but meh.

So I abandoned that, after so.many.hours. I started focusing on the tags instead, realizing that at least I could have a visualization of how my notes interconnect. Every note has ‘tags’ in the metadata, so a-grepping we go:


grep tags *.md -R > tags.md
sed -E 's/([0-9]+.)([A-Za-z ]+.)(md:tags:)/"\1 \2",/g' tags.md > net.csv

This gives me two columns: a file name in quotation marks, and the relevant tags. I cheat and use find-and-replace in Excel on the second column to replace spaces with semi-colons. This I can then open in Gephi, selecting ‘adjacency’ and ‘semi-colon’, and boom. A nice visual depiction of how my notes inter-connect.
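
If you’d rather skip the Excel step, here’s a minimal R sketch of the same substitution (my own addition; it assumes net.csv has the quoted file name in the first column and the space-separated tags in the second, and it writes to a new file name of my own choosing):

# read the two-column file, swap the spaces between tags for semi-colons,
# and write out a semi-colon-separated adjacency list for gephi
net <- read.csv("net.csv", header = FALSE, stringsAsFactors = FALSE)
net$V2 <- gsub(" +", ";", trimws(net$V2))
writeLines(paste(net$V1, net$V2, sep = ";"), "net-adjacency.csv")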

First part of the day: several hours. Second part: 30 minutes. Sigh.

SimRomanCity

Ever since I first read about the original SimCity source code being open-sourced as Micropolis (play here), I have wanted to build a course around using that code to simulate a Roman city. Students would keep open notebooks and devlogs, and together, we’d build our simulation.

To start we would spend a few weeks looking at the literature, the archaeology, and the scholarship surrounding ideas of the ancient Roman city, and from these, develop an idea of what kinds of things one would want to have in a simulation – and what kinds of questions a simulation might answer, or lessons it might teach. This would take us about four or five weeks.

SimCity has had enormous influence in games and beyond, and in many ways our everyday thinking about how cities work can be traced back to the way SimCity modeled urban systems. I would have the students look into the history of SimCity and Will Wright’s influences, and discuss what that might mean for how we understand ancient cities, and how the study of the ancient city is entangled with these particular models of modern, Western cities that SimCity represents.

The second half of the course is where things’d get really interesting. We’d take those paper designs and that understanding of SimCity-as-an-artefact and we’d build. We’d take the source code, and try to modify it to model an ancient Roman city. Is this possible? What assumptions about the ways cities work are hardbaked into the ‘SimCity’ framework ab initio? If we can just change the skin of the game, its sprites and graphics, and come up with something that functions how we imagine ancient cities did, what does this say about our ideas of the past? Maybe we’d find that some of our ideas about the past are not as true as we perhaps thought. This might be a case where we could expect failure but that would be ok, because then we could spend a few weeks on the why and how of that failure and what that tells us about the consequences of the influence of SimCity.

But alas, when I look at the Micropolis source code, I am stymied. I have no idea how to even begin. I shelved the idea.

But recently, I came across a port of the game, still in development, by Graeme McCutcheon. His port (works best in Chrome) translates the game to js/html5. And when I look at the code, it seems fairly intelligible!

So now it’s just a matter of figuring out how to build the game from his source code. After much farting around, I figured out more or less what one has to do.

1. Fork his repo.

2. Clone it to your machine.

3. Get nodejs

4. Open the micropolisjs folder in your terminal, and install the dependencies listed in the package.json file with npm install

5. You can start it up right away with npm run-script start and then going to localhost:8080 in your browser.

The various scripts and models that make up the game’s simulation are in the src folder; edit these, then use npm run-script build. And of course, all the sprites and graphics could be altered in any graphics program.

It would be a steep learning curve, but since we’d do this as a class, I think every student could find a role through which to contribute. Anyway, I’m off now to design a Roman tileset.

Don’t buy human remains

I was interviewed by Kristy Cameron for the Evan Solomon Show (radio) today. It was about my perspective on this story about a federal candidate for election who bought a human skull as a gift for her boyfriend. Short answer:

Don’t buy human remains.

In anticipation of the interview, I wrote some notes about what I wanted to say, which I’m pasting here below:

What are the ethical issues?

– there are several ethical problems with giving a skull as a gift, and they circle around what a skull is, and where these remains come from, and how they come to be traded:

1. the skull was a human person. Trading skulls reduces people to mere things.
2. many of these skulls are on the market largely as a result of white people collecting non-white people, robbing graves, collecting the bones of slaves, of prisoners, for the purposes of ‘scientific racism’, of proving the superiority of one race over another.
3. Even skulls from ‘european’ sources: did they consent? Of course not.
4. a skull is not a ‘thing’, it is a person: to many indigenous groups from whose members many human remains were stolen, to not be buried and accorded respect and dignity as appropriate to the group is a continuing harm to the group.
5. the skull has no archaeological context – the exact knowledge of the conditions of burial, the other objects or scientific information that allows us to work out the meaning of objects from the past – so the trade destroys knowledge about the past
6. from what we can see in the photograph, (Damien Huffer & I) there are some indications that make us suspicious about how this skull came to be on the market. For one thing, there looks to still be dirt on it. The skull itself seems to be flaking, which can be caused by alternating wet/dry or freeze/thaw conditions. There is also a chip on the skull that looks quite recent and doesn’t look like it was caused by an animal or natural causes; my first thought is maybe a pick or tool, as there also looks to be root marks on the skull. So, given the photograph, we think there’s reason to be concerned that this skull might only have recently been dug up. We have seen videos on Facebook of recent graves being opened. Ms. Rattée says she has documentation that it is European in origin, but that’s no guarantee.

How are they sold?

– these are bought and sold on instagram, facebook, and other social media marketplaces. Skulls were bought and sold through shops long before social media, but social media increases the reach and size of the market. Facebook of course makes money from ads served alongside these posts, so it’s in FB’s interests to facilitate the reach and ‘engagement’ with the posts.

What are my thoughts on the situation?

– it is not illegal to buy and sell human remains in Canada, but I feel it ought to be, simply by virtue of the fact that we owe it to our fellow Canadians, Indigenous Canadians, to try to right some of the wrongs we have done in the name of ‘science’. Harlan Smith, the ‘father of BC archaeology’, robbed graves in the 19th century and sent the remains to New York to go into a museum. He knew what he was doing was wrong: there’s no excuse. Social media makes human remains into entertainment. If a potential politician sees no problem with buying and selling a dead human, that does not speak well to their judgement regarding living humans.

– as far as using the skull as a model: a resin cast is surely a good enough model for drawing skulls on skin.

(featured image: israel palacio on unsplash)

HIST5706 Fall 2019: Guerilla Digital Public History

I had to cook up a course description. Coming at you Fall 2019…. Guerilla Digital Public History!

HIST 5706: Digital History – Guerrilla Public Digital History
Fall 2019


Introduction:

The sources for the history of our times are fragile. Joe Ricketts, the billionaire owner of DNAInfo and Gothamist, shut the local news publications down rather than tolerate a unionized workforce. For 11 minutes, Trump was kicked off Twitter. Ian Bogost sees in both episodes a symptom of a deeper problem:

> both are pulling on the same brittle levers that have made the contemporary social, economic, and political environment so lawless.

As public historians, what are we to do about this? There are a lot of issues highlighted here, but let’s start at the most basic. It takes nothing to delete the record. The fragility of materials online is both a danger, and an opportunity, for us. Some scholars have “gone rogue” in trying to deal with this problem. That is to say, they neither sought nor obtained permission. They just scoped out a process, and did it.

I initially called this class ‘guerrilla public digital history’ partly tongue in cheek. I imagined us doing some augmented reality type projects in public spaces. Re-programming those public spaces. Using digital techs to surface hidden histories, and insert them into spaces where they didn’t ‘belong’. Counterprogramming. That was the ‘guerilla’ bit.

I still want to do all that. But I think we’re going to have to do a bit more. Digital public historians, I suspect, have a role to play in countering this information power asymmetry. Such ways of working are impromptu, without authorization. Rogue. Improvised. And yet they have to remain ethical. What are the ethics here?

What is a ‘guerilla digital public history’? What are the stories in Ottawa that require a guerilla digital public history? What do you need to know in order to tell such a story?

I don’t know. But we’re going to find out.

Examples of previous student work in this class may be found at http://picturinglebretonflats.ca/ and https://nathpicard.github.io/Old-Chinatown-Ottawa/ . Both of these pieces were award-winning.

Class Format:

We meet once per week in a three-hour block, taking a collaborative, studio-based approach. It involves a whole lot of experimentation and making. Things will break, and will go in directions that you didn’t expect.

Aims and Goals:

Digital history is a collaborative endeavour. I want you to learn how to identify, learn, and deploy the relevant technologies suitable to the story you wish to tell; I want you to learn that different technologies promote different kinds of telling, and envision different kinds of humans who are permitted to do the telling.

Part of the learning will involve documenting your practice. I will get you started with three expressive digital media that you can use to explore what it means to do guerilla digital history in the nation’s capital. You will leave this course with an actual ‘thing’ you’ve created and deployed, and a toolkit of your own. We will do a mixture of activities, readings, and discussions to enable you to ground your guerilla digital history toolkit in the scholarship. You will build this toolkit as you put in train your own act of guerilla digital history.

This can be disappointing if you are expecting a more traditional arrangement. If you want to learn how to do computational analysis of historical texts, the self-directed, non-credit version of HIST3814o Crafting Digital History (http://craftingdigitalhistory.ca) would be more appropriate for you, and you can explore that on your own (though I’d be happy to talk you through it). But in this class, we’re doing something very different.

The logic of a guerrilla digital history holds:

• that digital history is about making things
• that the point of making is discovery, not justification
• that through making we come to understand an issue deeply, differently, divergently
• that the digital world overlays and intertwines with the physical world, and so we can’t leave it to the tech folks alone: we must engage
• that because this engagement can involve using digital tools, platforms, and data against the ways that the hegemons desire, it is political
• that because it is political, it involves an element of danger (for whom is undefined), and so the weapons of guerrilla digital history might be truth and beauty bombs

Assessment:

• 4 Oral Reports – 25% total
• 10 Devlogs – 25% total – to be kept in a timely fashion over the duration of the course
• Project – 50% total – due the last day of term
–> Paradata: 20%
–> the Thing itself: 30%

(Thus, 70% of the grade is on your process and reflection).

Text:

There is no text to purchase. Readings will be open-access on the web; links to specific texts will be on the course website.

(featured image: simson petrol on unsplash.com)

Invasion of the Digital Humanities

Earlier this academic year, I gave a talk at the Canada Science and Technology Museum about ‘the Invasion of the Digital Humanities’ and why museums might want to keep an eye on DH. The slides are at: http://j.mp/sg-oct16 but I realized I never shared the speaking notes. So here they are. You might find some mileage in ’em.

~o0o~

I am an imposter. I don’t work in a museum, I have no training in museology, and I certainly don’t face the challenges that each of you deal with every day. I like to think I’m a fellow traveller, maybe. But once I started thinking about imposters, I wondered: who are the imposters in the museum? It might well be the digital humanities.


The Plan

  1. Imposters, not invaders
  2. The care, feeding and valuing of imposters
  3. Why this matters to your institution and practice

Note:
In this talk I’m going to do 3 things.

  1. I’m going to try to avoid defining the digital humanities by focussing instead on how it (they?) make us feel, on the process. I’m going to tie it to the imposter syndrome we all feel when we first try something new in our work, and I’m going to try to imagine what it might be like for a museum-goer to encounter something DH-y for the first time
  2. Then I’m going to talk about how I try to teach digital humanities.
  3. From that, I’m going to try to distill the essential oil of DH that I hope you’ll be able to use for your own practice

in·vade  (ĭn-vād′)
    v. in·vad·ed, in·vad·ing, in·vades
    v.tr.
    1. To enter by force in order to conquer or pillage 
    2. To enter as if by invading; overrun or crowd
    3. To enter and proliferate in bodily tissue, as a pathogen
    4. To encroach or intrude on; violate

Note:
Full disclosure – I suggested the title of this talk months ago and wish I’d not used the word ‘invasion’. Invasion: it suggests power, control, dominance, colonialism, extraction, subversion, a taking over.


Note:
These are not words or connotations I want to see associated with DH. Nevertheless, even cursory googling will produce a lot of articles written from such perspectives.


Note:
(this slide left intentionally blank)

DH as the cuckoo in the nest
critiques of DH as the latest shiny thing, doing it because they can

DH as a tactical term, magical pixy fairy dust

DH as a crisis-led model: oh god we need money, if we say we’re digital maybe digital will help us save money, oh no

I’m trained as a Roman archaeologist, and in archaeology, everything is always a crisis. Dig to rescue! Heritage at risk! Nag nag nag. No one wants to listen to a nag. No one wants to eat their broccoli.


Note:
But what if DH were played instead as an enchantment-led model? What if DH opened up an emotional response?

In this museum, there’s a wonderful book in one of the galleries. A projector above it shines information down onto the pages. When it senses a hand near an image, the image comes alive; if you flip the pages, new information appears.

It’s magical. I sat for half an hour and watched kids play with it. It was enchanting. A simple Harry Potter-esque motif.


Note:
(this slide left intentionally blank)

‘Well of course, they’re digital natives,’ someone might say.

Heaven help me if I ever hear the term ‘digital native’ though. There are no such things.

I taught high school once, a decade ago, in the rural northern part of this region, about an hour and a half from here. We brought the kids to the city, to one of the museums. Several of them had never seen an escalator in real life before. An escalator. They knew they existed; they’d just never experienced them before. Indeed, that whole trip was amazing for me, because these kids just didn’t know how to interact with any of the materials on display.

They were imposters, and they knew it. They didn’t touch, they didn’t explore. They merely looked.



Note:
A museum also has an outward looking face. Its website for instance. Some are more complicated than others. Maybe there’s been a digital strategy put in place. Maybe there’s some amazing API, some SPARQL endpoint that permits me to link all the data together and ask semantic questions of that data, remixing it into something new.

Or maybe I just look at it, and feel it’s way beyond me. I’m an imposter. I don’t belong here.


Note:
Our digital interventions, whether in physical space or online space, need a pedagogical scaffolding if we want to bring the imposters amongst us out of the cold, to fold them into what we’re doing.

DH can offer that scaffolding, because it sits at that intersection between the human and the machine.


Note:
Paradoxically, I find that scaffolding, that support, where things break. As an archaeologist, it’s the broken things that teach us most. Sometimes we have to deliberately break something to understand it.


I want to propose a DH of broken things. [obvious nod to Mark Sample here]

A DH that celebrates the imposters.


Note:
DH as glorious fails

fails as engagement, fail as pedagogy

failure makes us feel like imposters, right?


1. Technological Failure
2. Human Failure
3. Failure as Artifact
4. Failure as Epistemology

Note:
kinds of fails – Croxall/Warnick types

but failing in a museum makes us imposters again; after all, we’re pros, right? And no one wants to admit something didn’t work. OK, let’s set that aside and look at it from the visitor perspective.

fails we experience as a visitor – take two seconds, and think about what these might be. Try to match these up against the typology of digital fails. 30 seconds; explain to the person beside you.


Note:
but what if failure were formalized as a culture of experimentation? An enchantment that leads to wonder?


Note:
(this slide left intentionally blank)
This is how I try to teach DH, or rather, embody a DH pedagogy in my teaching.

DH, then, is an approach to working with digital data and computational tools that exists in a reflexive cycle. We use the tools because they enable us to do interesting things; but the things they enable us to do change how we see the world.


Note:

My critical making class, for instance, uses 3d photogrammetry to scan an artefact into the digital; then, over the duration of the term, we do the full cycle of humanities computing, as Bethany Nowviskie calls it, abstracting the data further and further away until we come back again via 3d printing or augmented reality or DIY projectors. I foster a culture of open notebooks and fail logs, where the grading is not on the finished project but on the reflective process, teasing out the various meanings as we go. I should mention that all of my students feel like imposters: "If I wanted to study computers, sir, I wouldn’t have taken history."

One of the most profound projects was Matt and Marc’s study of the Terry Fox statue at Parliament Hill. Terry Fox is a Canadian icon; in the early 1980s, having lost a leg to cancer, he embarked on a fundraising run across Canada for cancer research. The statue depicts him mid-stride, his face a rictus of pain (Fox ran the equivalent of a marathon every day for several months, until a relapse finished his Marathon of Hope for good). His prosthetic leg never fit quite right; the binding of technology to man became a theme in Marc and Matt’s work as they translated the physical statue, the memory-in-bronze, into a memory-in-bits, working through the cycle back to the physical world and tying their breakages (in skills, in technology, in the ways they chose to represent Fox) to the breakages in Fox’s own body, our memory of him, and our commemoration.


DH outputs sure look shiny

  • text mining
  • image remixing, glitching
  • sonification
  • 3d photogrammetry
  • etc

and there’s nothing wrong with that.

But we need to remember


while we apply the digital to the humanities

we also apply the humanities to the digital

Note:
For instance, Allison Parrish, poet and programmer, reminds us that to encode something is to forget: encoding is an act of forgetting. So I ask my students: what has to be forgotten in order to make something digital, or to digitize something, or to use some whiz-bangy tool?

who gets forgotten? who can be hurt by this?


The essential oil


Note:
(this slide left intentionally blank)
DH in a museum, then, should enable the visitor, the stakeholder, to change and be changed by the experience, the encounter, with the space.

DH in the museum should make us remember the imposters amongst us, and bring them in out of the cold by recognizing and valuing what makes them feel like imposters, by providing a scaffolding for understanding the various ‘fails’ and pulling meaning out of them. By making it safe to fail at being a museum visitor. By encouraging touching, breaking, making, and reflection.


Note:
OK, so what’s the easiest route into DH in a museum setting?



forward the past! the cast gallery

Note:

  • casts as imposters
  • what were casts for
  • the problem of aura – DH’s answer: performing the replication gives aura!
  • 3d scanning
  • DIY AR
  • who owns a cast
    • image licensing recreates colonialism
    • so open access policies on materials becomes a kind of social justice issue

Note:
DH is not about whizzy tech, or trying to attract new audiences, or new kinds of outreach. Sure, it has a lot of that. BUT DH in a museum could be a lot more.

The Canadians and Their Pasts project from a few years back found that the public trusted museums most to learn about their world, their pasts. That’s a powerful and important role. So I’ll leave you with my wish list: I want to see DH not as a special secret sauce that gets added to placate senior admin and boards of directors; I want to see museums as powerhouses of DH research, as leading venues where this field happens, where it takes place. Digital work is necessarily public work, I think, so I see DH + museums as a nexus for leadership.

  • I want, along with Dan Pett, to see museums participating in DH research.
  • opening themselves to serendipitous encounters
  • using digital to find ways to move away from crises towards enchantment
  • using their collections to push innovative research
  • making their collections open
  • making their research reproducible (because we’ve focussed on process)

Note:

So, not so much about invasions. Invite the imposters in, instead.


thank you

i'm
shawn graham
carleton u
@electricarchaeo

image credits

Alien Invasion, Javier Rodriguez pixabay

Cigar Man, Ryan McGuire pixabay

Abstract digital art, Noonexy pixabay

Scaffolding, Jacek Dylang unsplash

Broken pottery, Cluttersnap unsplash

Legos, Efraimstochte pixabay

Terry Fox scan, by Marc Bitar and Matt Burgstaller

‘Choices’, Javier Allegue Barros unsplash

‘Crafting’, Jasmin Schreiber unsplash

‘Goal’, Tama66 pixabay

Manuscript excerpt – Tasks for Golems

I’m well underway on my next book, which I’m thinking of as the necromancy book (provisional title: ‘Digital Necromancy: Archaeological Enchantment, Agent Modeling, and Archaeogaming’). I’ve just pounded out a chapter that shows some of the sausage-making behind building an agent-based model, i.e., the messiness of it all. It comes in the final third of the book, after much discussion about enchantment and vibrant materials and sense and sensation and a whole host of other things.

Ideally, a reader who hasn’t built a model before will be able to piece something together from the code snippets bundled with NetLogo, and I show how one can take inspiration from the included models and code to build a re-implementation of my Itineraries model published in 2006. I finish it off by showing that digital modes-of-thinking transfer to the analog world, with a quick glimpse at the logic and rationale behind our board game, FORVM.
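For a flavour of what ‘agent-based’ actually means in practice, here is a toy sketch in base R – not NetLogo, and not the chapter’s code; the movement and ‘exchange’ rules here are purely illustrative assumptions. The shape is the thing: a population of agents, a world for them to wander, and a loop of time steps in which they act and interact.

# a toy agent-based model in base R: agents wander a grid and
# 'exchange' goods when they share a cell -- illustrative only,
# not the Itineraries model
set.seed(42)

n_agents <- 20
world    <- 10     # the grid is world x world, with wrapping edges
steps    <- 100

agents <- data.frame(
  x     = sample(world, n_agents, replace = TRUE),
  y     = sample(world, n_agents, replace = TRUE),
  goods = rpois(n_agents, lambda = 5)
)

for (t in seq_len(steps)) {
  # each agent takes one random step, wrapping around the edges
  agents$x <- ((agents$x - 1 + sample(c(-1, 0, 1), n_agents, replace = TRUE)) %% world) + 1
  agents$y <- ((agents$y - 1 + sample(c(-1, 0, 1), n_agents, replace = TRUE)) %% world) + 1

  # agents occupying the same cell average their goods: a crude 'exchange'
  cell <- interaction(agents$x, agents$y, drop = TRUE)
  agents$goods <- ave(agents$goods, cell, FUN = mean)
}

hist(agents$goods, main = "Goods per agent after exchange")

Everything interesting in a model like this lives in those two or three rules; NetLogo gives you the same loop plus the visualization for free, which is why the chapter builds there.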

You can read the piece and leave comments at

https://docs.google.com/document/d/1X581PPzo5XvVC5UTvwjJlU5703J2Pe8kQKCAyoTXQB4/edit?usp=sharing

This is the first draft of this section. It’s going to be rough and awkward and unpolished.

Getting Started in Digital Archaeology

I’m writing a book (a snippet of which I posted earlier) and I’m using Scrivener to do it. Scrivener allows me to write in chunks, and to rearrange the chunks as the thing develops. This morning I wrote a chunk that I’m not entirely sure goes where I currently have it. Eventually it’ll find its home; right now, it’s in an afterword. Possibly it should be an appendix. Maybe it needs to go in the beginning. For now, I thought I’d share it because it might be useful for someone out there. And if it’s not useful, better to find out now than when it’s in print, eh?

~o0o~

Digital archaeology, as I have conceived it here, is not about computation in the service of finding the answer. It is about deforming, and thinking through, the various networks and distributed agencies that tie us to the past and simultaneously make it strange, that enchant and confound us. There are any number of courses on the books at universities around the world, any number of tutorials on any number of websites, that will walk you through how to do x using software package y, and when you know exactly what it is you need to do, these can be enormously helpful.

The best strategy for deformance, however, is to play. Play around – you’re allowed! Try things out. See what happens when you do this. But we – as the academy, as the guardians of systematized knowledge – have managed to beat playfulness out of our students. What’s more, when you’re just starting out, and you’re not sure of the terminology, not sure even of what it is you’re after, what question you’re really asking, it is easy to succumb to information paralysis – too much information means you’re not able to act at all. The strategy I take with my own students is to make it safe to fail, safe to play around – what Stephen Ramsay famously called the screwmeneutical imperative. To do this, you need to have someone model productive failure, to have someone to point to who is trying things out and reporting back on what has worked and what has not. Beyond this, there is therefore no magic recipe, no silver bullet:

“Should I learn python or R or javascript or….?”

“No. You should identify the problem at hand, and then use whatever it is that works for you” is the unwelcome answer.

The truth is, you exist at this point now with access to these particular resources and this particular digital environment. You use what you have now to see more of the landscape of possibilities. Getting started then just means to fold what you already know how to do into a cycle of experimentation. If you want to get started in digital archaeology, develop the habit of note taking and reflective practice every time you sit down with the machine. Do not remove yourself from the reporting. As Mark Sample once wrote, citing the Oblique Strategies of musician Brian Eno, ‘your mistake was a vital connection’ (2015). You will find in your mistakes your own sources of enchantment, of vibrant materiality.

Guidelines for developing your own digital archaeology
Digital work is craft work, and like all craft, there is any amount of tacit and embodied knowledge that you will have to learn. These guidelines are meant to help you work it out.

  • First of all, recognize that digital work is slow. Computation itself might happen quickly, but getting to the point where you’re doing what needs to be done in order to do x, y, or z is a slow process. Understanding what the results might mean is similarly slow. Opening the black boxes of algorithms (recipes, step-by-step instructions) and understanding what someone else has coded is slow, painstaking work. To encode something necessarily means that something else has to be forgotten. Ask yourself: what has to be forgotten in order for this to work?
  • Play. Modern computers and devices are by default largely locked down by Apple, by Microsoft. You are not supposed to do anything outside of the ecosystem of apps and software that are provided to you. Push against this. Learn to open the hood. Find the terminal, find the command line. You will be pushing here against modern techno-capitalism (you anarchist!). This will be the hardest part.
  • Keep track of everything. Write down what you did, why you did it, and what aids, tutorials, blog posts, and walkthroughs you were reading.
  • Search the exact error messages you receive – copy the error message and search, with perhaps a bit of context. Chances are someone else has already had this error before and has posted the solution.
  • Share this information, within the boundaries of what is safe for you to do given your particular situation. As a white, middle-aged, tenured man, I can be far more open about my failures online than other people because I am privileged, and so I keep an open research blog at electricarchaeology.ca (and if you are a white, middle-aged, tenured man, why aren’t you making it safe for others to share what works and what hasn’t?)
  • The framework that Brian Croxall and Quinn Warnick developed for discussing ‘failure’ in the context of digital pedagogy can be usefully employed to provide structure to your notes and reflections. Understanding why and how something failed, the type of failure, is a necessary precursor to developing your digital craft.
  • Carefully detail when things do work, in the context of what hasn’t in the past. Not only will you have a handy reminder of what to do the next time this particular task presents itself, but you will have a record of your own progression, your own development over time that will help keep you motivated.
  • Attend to enchantment. These moments when you are confronted by the uncanny and the delightful are signals of deeper assemblages of distributed agency in your materials. Where you find enchantment, there you will find that you are learning something deeper about the world. Digital archaeology is amazing. Why shouldn’t you find enchantment, joy, in your work?

A vital resource for learning the tools of digital work for those of us in the humanities is The Programming Historian, a set of peer-reviewed tutorials that continues to grow (and is also available in French and Spanish). Survey the lessons there to see various hands-on walkthroughs of tools or approaches that you might wish to use on your own materials. Begin with the lessons by Ted Dawson (for PCs) and Ian Milligan and James Baker (for Macs) on the command line and the terminal. Lemercier and Zalc’s Quantitative Methods in the Humanities might also be a good spot to start, as well as the various agent-based modeling and network tutorials collected by Tom Brughmans on his blog, https://archaeologicalnetworks.wordpress.com/resources/ . For agent-based modeling in particular, download NetLogo from the Center for Connected Learning at Northwestern University (https://ccl.northwestern.edu/netlogo/) and work through its tutorials. The Open Digital Archaeology Textbook Environment covers many different computational tasks from an archaeological perspective, and also comes with prebuilt computational environments that can be launched with a single click (hence taking care of the problem of installing software packages, and allowing the reader to jump into learning rather than spending time toiling with configuring their own machine); it may be found at http://o-date.github.io.

How do you get started? These guidelines will help, but remember, there is no rule-book for digital archaeology.

Object Style Transfer

Image style transfer is a technique where a neural network extracts the elements of one photo’s style and then renders a different photo in that style. It is computationally intensive, but it has become one of the standard party tricks of computer vision work. There are several places online where you can give it a whirl; I’ve been using https://deepdreamgenerator.com/ to do style transfer onto the same image over and over again. The idea is that this way, I slowly begin to learn what the machine is actually looking at. I began with the well-known ancient portrait, the woman with a stylus:

And ran it through several different style transfers, including a photograph of our department library:

a nineteenth century photograph of four men:

A zombie:

and the Euphronios Krater:

It seems clear to me that the computer’s attention is very much attracted to the spackling/ visual noise on the left hand side of the painting, as well as her hands. Incidentally, I asked the Visual Chatbot what it thought was going on:

Image recognition algorithms will always tell you how many giraffes there are in an image, because they make guesses – the major training datasets always have giraffes in them, so even though giraffes are unlikely to be in your photographs, if you ask about giraffes, the model reasons there must be giraffes in there…

This was all quite fun. But the last example, the woman-as-styled-by-krater, got me thinking about object style transfer. So many archaeological objects are photographed against neutral backgrounds that it occurred to me the style transfer algorithm would just apply the ‘style’ of one thing onto the other thing. So I took a photo of a bust of Vespasian…

and transferred the style from the Euphronios Krater…

and achieved this:

which is extremely pleasing. I also wondered about mosaics – as the original pixel art – and what that might achieve. Given that it was Super Bowl Sunday and people were sharing pictures of #superbOwls, I borrowed an image that Sarah Bond tweeted of a 3rd-century owl mosaic:

and crossed it against the Laocoon:

and achieved this:

So what’s the lesson here? I don’t really know yet. Well, in the first instance, I was just trying to break things, to see what would happen by feeding the algorithm things it wasn’t expecting. What attracts the attention of the machines? But it’s got me thinking more about how archaeological photographs are composed, and what happens when things get remediated. Tara Copplestone once had a blog post up on photobashing as archaeological remediation, about turning photos into watercolour paintings using filters in Gimp. Remediation forces a readjustment of your archaeological eye and imagination. Here, the process is offloaded to an active thing that has its own way of seeing – notice how it focused on the inscription in the mosaic, or the red-figure foliage on the krater – ways of seeing that are independent of us and independent of its training. The automated aesthetic?
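If you’d rather experiment from code than through the web interfaces, something like the following ought to work in R, using the tensorflow and tfhub packages to call Magenta’s pretrained ‘arbitrary image stylization’ model from TensorFlow Hub. Treat it as a sketch rather than gospel: the file names are placeholders for your own content and style images, and the resize dimensions are an assumption.

# a sketch: style transfer in R via a pretrained TensorFlow Hub model
# assumes the 'tensorflow' and 'tfhub' packages are installed and working;
# the image file names below are placeholders
library(tensorflow)
library(tfhub)

style_model <- hub_load(
  "https://tfhub.dev/google/magenta/arbitrary-image-stylization-v1-256/2"
)

# read an image into a [1, h, w, 3] float32 tensor with values in [0, 1]
load_img <- function(path) {
  img <- tf$io$read_file(path)
  img <- tf$image$decode_image(img, channels = 3L, dtype = tf$float32)
  img <- tf$image$resize(img, as.integer(c(384, 384)))
  tf$expand_dims(img, 0L)
}

content <- load_img("vespasian.jpg")   # the image to be restyled (placeholder)
style   <- load_img("krater.jpg")      # the source of the 'style' (placeholder)

# the model takes (content, style) and returns the stylized image first
result   <- style_model(content, style)
stylized <- result[[1]]

Loop the same content image over a folder of styles and you have the repeated-transfer experiment above, without the web service in the middle.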