A Map of Archaeogaming

Andrew posted a mindmap of the kinds of things that fall under the ‘archaeogaming’ rubric, and mentioned that it’d be nice to have it with that xkcd aesthetic. The different kinds of archaeogaming were laid out like a Reingold-Tilford tree.

As it happens, there’s a package for R that will take the standard plot() commands and format the output – complete with stick figures – like an xkcd comic.

I took Andrew’s original map and turned it into a standard network list, with source and target columns separated by commas. I thought maybe Raw might be a good solution:

It was pretty enough, but not that xkcd flavour I was looking for. A bit more rooting around and I realized that the various network packages for R would get that layout (see this great tutorial), but by this point I’d come across this d3 network package and I thought I’d try it out. It’s quite straightforward, and could make the simple network quite easily from my csv. But I wanted what d3Network calls a ‘diagonalNetwork’ – which required (or seemed to require? The fact that I’m asking this shows that I have just enough knowledge to get myself into trouble, and not enough to solve my problems) my data in JSON format. I used a csv-to-json converter, which did me no good. So I carefully studied the example data, and turned my csv into JSON flare by hand. Here’s the file so you can do something with it.
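If you’d rather not hand-make the flare JSON, a few lines of Python will do the conversion. This is a sketch under assumptions: a two-column source,target csv with no header, describing a tree rooted at a node you name (the filename here is made up):

```python
import csv
import json

def edgelist_to_flare(csv_path, root):
    """Turn a source,target edge list into nested 'flare' JSON.

    Assumes the csv has no header row and describes a tree."""
    children = {}
    with open(csv_path, newline="") as f:
        for row in csv.reader(f):
            if len(row) == 2:
                source, target = row
                children.setdefault(source.strip(), []).append(target.strip())

    def build(name):
        node = {"name": name}
        kids = children.get(name, [])
        if kids:
            node["children"] = [build(k) for k in kids]
        return node

    return build(root)

# json.dumps(edgelist_to_flare("archaeogaming.csv", "archaeogaming"), indent=2)
```

That nested name/children shape is exactly what d3Network’s diagonalNetwork wants.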

At this point, it would’ve been easier to draw the damned thing by hand, and scan it in. But anyway.

So – to install xkcd in R – see this script. That also has the code for grabbing the json, and slapping the xkcd font on it. Now, if I wasn’t using d3network to make that graph – which outputs interactive html, remember – but instead using some of the standard network packages in R, then it’s fairly straightforward to customize all of the regular plot options to make something really cool, eg:

But anyway. I ended up with a graph, it uses the fun xkcd font, and a little cut’n’paste later we get a map of archaeogaming.
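Incidentally, Python can pull the same trick: matplotlib ships an xkcd() mode that sketchifies any plot. Not what I used here (that was R), just a minimal sketch of the idea:

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; no display needed
import matplotlib.pyplot as plt

# plt.xkcd() wobbles the lines and, if installed, uses the Humor Sans font
with plt.xkcd():
    fig, ax = plt.subplots()
    ax.plot([0, 1, 2, 3], [1, 3, 2, 4])
    ax.set_title("A Map of Archaeogaming")
    fig.savefig("xkcd-style.png")
```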



Marky Markov Moodie

I’m playing with this ruby gem https://github.com/zolrath/marky_markov which generates Markov chain texts. I’m feeding it that well known gem of Canadian literature, Susanna Moodie’s Roughing it in the Bush.

Here’s my script, once marky_markov is installed:


require 'marky_markov'
markov = MarkyMarkov::TemporaryDictionary.new
markov.parse_file "moodie.txt"
puts markov.generate_n_sentences 10
puts markov.generate_n_words 200

and I pipe it to an output file. The result is pleasing; rather sounds like Susanna’s been hitting the sauce on those long winter nights trapped in the cabin in the bush of (comparatively tame) Canadian wilderness.



By this policy liberal provision is made for free grants of wild land in Canada.

Fusileers, and had been severely wounded in the Orkney Islands; he was a great and glorious destiny.

Canada became the great land-mark for the prosperity of the woodman’s life.

Bare feet and rags are too common in the Province a noble and progressive back country, and add much to improve the agricultural interests, and have given to Canada some of the country Dominion and Local Governments are now doing much to open up the resources of Canada, by the Government to the Province a noble and progressive back country, and add much to open up What the backwoods of Canada are to the saucy familiarity of servants, who, republicans at heart, think themselves quite as good as their employers.

less restricted by the construction of public roads, and upon each other as mutual enemies who were seeking to appropriate the larger share of the Province, but one of the mightiest empires that I dared to give my opinion freely on a subject which had engrossed a great and rising country.

class; and have led to better and more productive methods of cultivation, than were formerly practised in the council for the reproduction of the iron horse, and the love of books and book-lore and realize the want of social freedom.

Intensely loyal, the emigrant officers rose to a man to defend the British emigrants, who looked upon each lot is clearing five acres and erecting thereon a small house, which will be granted to heads of families, who, by six annual instalments, will be given, without any charge whatever, under a protective Homestead Act.

Our print shops are full of the new townships has been said to prejudice intending emigrants from making Canada their home.

In their zeal to uphold British authority, they made no excuse for the space of three or four years, landed upon these shores.

They feel more independent social life than in the cultivation of a great and rising country.

Macdonald is premier—has done wonders during the last four years by means of its Immigration policy, which has been most successfully carried out by the Hon. rapid strides she has made towards the fulfilment of a great boon to the settler, but it would have been converted into fruitful fields, the rude log cabin of the excellent constitution that we now enjoy.

Our busy factories and foundries—our copper, silver and plumbago mines—our salt and petroleum—the increasing exports of native artists.

To try his fortunes in Canada.


Incidentally, I just ran the same script on some of my own writing on video games and archaeology. Very meta archaeogaming. Result is over here.


Book Launch: ‘Exploring Big Historical Data: The Historian’s Macroscope’ Nov 17

We’re launching our book on November 17th, at 11.30 in the History department lounge, 4th floor of Paterson Hall. Drop by if you’re around!

I’m also going to undertake to stream the conversation on YouTube. I’ve never set a livestream up (there seems to be an assumption around here that if you’re the digital guy, you’re also totally au courant with audio/visual tech and the ins and outs of broadcasting. This is not the case), so I’m not guaranteeing this will work. But if it does, it’ll be at:

Finally, what does a book launch entail? I have no real idea. And so, what we’ve done is this – Shawn Anctil, who is one of our PhD students and doing very cool work in digital history himself (he sometimes blogs here) will moderate a round table discussion with myself, Ian Milligan, and Scott Weingart (Scott will be skyping in). We’ll field questions from those assembled and of course via twitter (use #hmbook as a tag).

The round table conversation will last about an hour, and then Ian and I will give a bit of a digital history workshop. Our plan will be to focus on network analysis, so if this interests you, bring a laptop and have Gephi installed. Note that Gephi sometimes has Java issues – this will help. The workshop won’t be streamed, I think. Maybe it will. Really depends on whether I a) can get streaming to work and b) forget to turn it off again.

See you on Tuesday!

The humane hack – a snippet of an argument

[this is the snippet of an argument, and all that I’ve managed to produce today for #AcWriMo. I kinda like it though and offer it up for consumption, rough edges, warts, and all. It emerges out of something Shawn Anctil said recently about ‘the Laws of Cool’ when we were talking about his comps, which happen this Thursday. In an effort to get my head around what he said, I started to write. This might make it into a piece on some of my recent sound work. Alan Liu’s stuff is always wonderful to read because it turns my head inside out, and I make no warrant that I am doing justice to Alan’s ideas. It’s been a while since I last looked, and I realize I really need to block out several days to do this properly. Anyway, working in public, fail gloriously, etc etc. I give you a snippet of an argument:]

Alan Liu, in 2004, wondered what the role of the arts and humanities was in an age of knowledge work, of deliverables, of an historical event horizon that only goes back the last financial quarter. He examined the idea of ‘knowledge work’ and teased out how much of the driving force behind it is in pursuit of the ‘cool’. Through a deft plumbing of the history of the early internet (and in particular, riffing on Netscape’s ‘what’s cool?’ page from 1996 and their inability to define it except to say that they’d know it when they saw it), Liu argues that cool is ‘the aporia of information… cool is information designed to resist information [emphasis original]… information fed back into its own signal to create a standing interference pattern, a paradox pattern’ (Liu, 2004: 179). The latest web design, the latest app, the latest R package for statistics, the latest acronym on Twitter where all the digital humanists play: cool, and dividing the world.

That is, Liu argued that ‘cool’ was amongst other things a politics of knowledge work, a practice and ethos. He wondered how we might ‘challenge knowledge work to open a space, as yet culturally sterile (coopted, jejune, anarchistic, terroristic), for a more humane hack of contemporary knowledge?’ (Liu 2004: 9). Liu goes on to discuss how the tensions of ‘cool’ in knowledge work (for us, read: digital archaeology) also intersect with an ethos of the unknown, that is, of knowledge workers who work nowhere else yet somehow manage to stand outside that system of knowledge production. (Is alt-ac ‘alt’ partially because it is the cool work?) This matters for us as archaeologists. There are many ‘cool’ things happening in digital archaeology that somehow do not penetrate into the mainstream (such as it is). The utilitarian dots-on-a-map were once cool, but are now pedestrian. The ‘cool’ things that could be, linger on the fringes. If they did not, they wouldn’t be cool, one supposes. They resist.

To get that more humane hack that Liu seeks, Liu suggests that the historical depth that the humanities provides counters the shallowness of cool:

“The humanities thus have an explanation for the new arts of the information age, whose inheritance of a frantic sequence of artistic modernisms, postmodernisms, and post-postmodernisms is otherwise only a displaced encounter with the raw process of historicity. Inversely, the arts offer the humanities serious ways of engaging – both practically and theoretically – with “cool”. Together, the humanities and arts might be able to offer a persuasive argument for the humane arts in the age of knowledge work” (Liu 2004: 381).

In which case, the emergence of digital archaeologists and historians in the last decade might be the loci of the humane hacks – if we move into that space where we engage the arts.

We need to be making art.


If I could read your mind – Sonifying John Adams’ Diary

Maybe the question isn’t one of reading someone’s thoughts, but rather, listening to the overall pattern of topics within them. Topic modeling does some rather magical things. It imposes sense (it fits a model) onto a body of text. The topics that the model duly provides give us insight into the semantic patterns latent within the text (but see Ben Schmidt’s WEM approach, which focuses on systems of relationships in the words themselves – more on this anon). There are a variety of ways emerging for visualizing these patterns. I’m guilty of a few myself (principally, I’ve spent a lot of time visualizing the interrelationships of topics as a kind of network graph, eg this). But I’ve never been happy with them because they often leave out the element of time. For a guy who sometimes thinks of himself as an archaeologist or historian, this is a bit problematic.

I’ve been interested in sonification for some time: the idea that we represent data (capta) aurally. I even won an award for one experiment in this vein, repurposing the excellent scripts of the Data Driven DJ, Brian Foo. What I like about sonification is that the time dimension becomes a significant element in how the data is represented, and how the data is experienced (cf. this recent interview on Spark with composer/prof Chris Chafe). I was once the chapel organist at Bishop’s University (I wasn’t much good, but that’s a story for another day), so my interest in sonification is partly in how the colour of music, the different instrumentation and so on, can also be used to convey ideas and information (rather than using purely algorithmically generated tones; I’ve never had much formal musical training, so I know there’s a literature and language to describe what I’m thinking that I simply must go learn. Please excuse any awkwardness).

So – let’s take a body of text, in this case the diaries of John Adams. I scraped these, one line per diary entry (see this csv we prepped for our book, the Macroscope). I imported it into R and topic modeled for 20 topics. The output is a monstrous csv showing the proportion each topic contributes to each diary entry (so each row adds to 1). If you use conditional formatting in Excel, and dial the decimal places to 2, you get a pretty good visual of which topics are the major ones in any given entry (and the really minor ones just round to 0.00, so you can ignore them).
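The same Excel eyeballing can be done in a few lines of Python – a sketch, assuming a csv with a header row of topic labels and one row of proportions per diary entry (the threshold is my arbitrary cut-off):

```python
import csv

def major_topics(csv_path, threshold=0.2):
    """List, for each diary entry (row), the topics above the threshold.

    Each row's proportions sum to 1, so anything over ~0.2 is a
    major contributor; tiny proportions fall away, as in Excel."""
    entries = []
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            majors = {topic: float(p) for topic, p in row.items()
                      if float(p) >= threshold}
            entries.append(majors)
    return entries
```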

It rather looks like an old-timey player piano roll:

Player Piano Anyone?

I then used ‘Musical Algorithms‘ one column at a time to generate a midi file. I’ve got the various settings in a notebook at home; I’ll update this post with them later. I then uploaded each midi file (all twenty) into GarageBand in the order of their complexity – that is, as indicated by file size:
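I’ll post the exact settings later, but the core move is just scaling a topic’s proportion in each entry onto a range of pitches – something like this sketch (the MIDI note range is my assumption, not Musical Algorithms’ actual mapping):

```python
def proportions_to_pitches(proportions, low=48, high=84):
    """Scale topic proportions (0..1) onto MIDI note numbers.

    low=48 is C3, high=84 is C6 -- an arbitrary three-octave span."""
    span = high - low
    return [low + round(p * span) for p in proportions]

# One topic's weight across four diary entries becomes four notes:
print(proportions_to_pitches([0.0, 0.25, 0.5, 1.0]))
# -> [48, 57, 66, 84]
```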

Size of a file indicates the complexity of the source. Isn’t that what Claude Shannon taught us?

The question then becomes: which instruments do I assign to which topics? In this, I tried to select from the instruments I had readily to hand, and to select instruments whose tone/colour seemed to resonate somehow with the keywords for each topic. Which gives me a selection of brass instruments for topics relating to governance (thank you, Sousa marches); guitar for topics connected perhaps with travels around the countryside (too much country music on the radio as a child, perhaps); strings for topics connected with college and studying (my own study music as an undergrad influencing the choice here); and woodwinds for the minor topics, which chirp and peep here and there throughout the text (some onomatopoeia, I suppose).

GarageBand’s own native visualization owes much to the player piano aesthetic, and so provides a rolling visualization to accompany the music. I used QuickTime to grab the GarageBand visuals, and iMovie to marry the two together again, since QuickTime doesn’t grab the audio generated within the computer. Then I changed the name of each of the tracks to reflect the keywords for that topic.

Drumroll: I give you the John Adams 20:

A short note on mapping text

by Tsahi Levent-Levi

Kristina had a question.

So I started puttering.

We came up with this.

1. Grab the text of a blog post (but not too much, or do this in a bunch of rounds).

2. Put it at the end of this URL:

http://geotxt.org/api/1/geotxt.json?m=stanfords&q=

like so:

http://geotxt.org/api/1/geotxt.json?m=stanfords&q=Edwin Ernesto Rivera Gracias was in El Salvador and after voluntarily agreeing to return to the United States to face charges he was flown to Denver on Wednesday, according to the FBI. He surrendered to Salvadoran authorities and FBI agents on Tuesday, said FBI spokesman Dave Joly.

(For more information about this step, see: http://www.geotxt.org/api/)

That will use the Stanford NER to extract places & geocoordinates, and return them as valid GeoJSON.

3. Copy the geojson.

4. Go to geojson.io, and paste it in the right hand editing window.

5. You now have a map, congratulations! (You could’ve uploaded the geojson to many places – for instance, a github gist – but stay with me). Hit the ‘save’ button, and select KML.

6. You now have the data as KML for use with the Google environment. Go to Google’s My Maps. Create a new map.

7. Click the ‘import’ button, and drag-and-drop the KML onto the loader. Ta da! A google map that you can now annotate, share, etc etc etc.
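Steps 2 and 3 can be scripted, too. A sketch that just builds the request URL – the endpoint and the m=stanfords parameter come straight from the API described above; whether you fetch it with a browser or a script is up to you:

```python
from urllib.parse import urlencode

def geotxt_url(text, model="stanfords"):
    """Build the GeoTxt API URL for a chunk of text (step 2 above)."""
    base = "http://geotxt.org/api/1/geotxt.json"
    return base + "?" + urlencode({"m": model, "q": text})

url = geotxt_url("Edwin Ernesto Rivera Gracias was in El Salvador.")
# Fetching that URL (e.g. with urllib.request.urlopen) returns the GeoJSON.
```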

Here’s the first paragraph of Kristina’s most recent post over on Forbes all mapped out.


Stanford NER, extracting & visualizing patterns

This is just a quick note while I’m thinking about this. I say ‘visualizing’ patterns, but there are of course many ways of doing that. Here, I’m just going quick’n’dirty into a network.

Say you have the diplomatic correspondence of the Republic of Texas, and you suspect that there might be interesting patterns in the places named over time. You can use the Stanford Named Entity Recognition package to extract locations. Then, using some regular expressions, you can transform that output into a network file. BUT – and this is important – it’s a format that carries some baggage of its own. Anyway, first you’ll want the Correspondence. Over at The Macroscope, we’ve already written about how you can extract the patterns of correspondence between individuals using regex patterns. This doesn’t need the Stanford NER because there is an index to that correspondence, and the regex grabs & parses that information for you.

But there is no such index for locations named. So grab that document, and feed it into the NER as Michelle Moravec instructs on her blog here. In the terminal window, as the classifier classifies Persons, Organizations, and Locations, you’ll spot blank lines between batches of categorized items (edit: there’s a classifier that’ll grab time too; that’d be quite handy to incorporate here – SG). These blanks correspond to the blanks between the letters in the original document. Copy all of the terminal output into a new Notepad++ or TextWrangler document. We’re going to trim away every line that isn’t led by LOCATION:


and replace with nothing. This will delete everything that doesn’t have the location tag in front. Now, let’s mark those blank lines as the start of a new letter. A thread on Stack Overflow suggests this regex to find those blank lines:

^\s*$

^ is the beginning of string anchor
$ is the end of string anchor
\s is the whitespace character class
* is zero-or-more repetition

and we replace with the string new-letter.

Now we want to get all of the locations for a single letter into a single line. Replace ‘LOCATION’ with a comma. This budges everything into a single line, so we need to reintroduce line breaks, by replacing ‘new-letter’ with the new line character:

find: (new-letter)
replace: \n\1

I could’ve just replaced new-letter with a new-line, but I wanted to make sure that every new line did in fact start with new-letter. Now find and replace new-letter so that it’s removed. You now have a document with the same number of lines as original letters in the original correspondence file. Now to turn it into a network file! Add the following information at the start of the file:

DL n=100
format = nodelist1
labels embedded:
data:

DL will tell a network analysis program that we are dealing with UCINET’s DL format. N equals the number of nodes. Format=nodelist1 says, ‘this is a format where the first item on the line is connected to all the subsequent items on that line’. As a historian or archaeologist, you can see that there’s a big assumption in that format. Is it justified? That’s something to mull over. Gephi only accepts DL in format=edgelist1, that is, binary pairs. If that describes the relationship in your data, there’s a lot of legwork involved in moving from nodelist1 to edgelist1, and I’m not covering that here. Let’s imagine that, on historical grounds, nodelist1 accurately describes the relationship between locations mentioned in letters, that the first location mentioned is probably the place where the letter is being written from, or the most important place, or….

“labels embedded:” tells a network program that the labels themselves are being used as data points, and “data:” indicates that everything afterwards is the data. But how did we know how many nodes there were? You could tally up by hand; you could copy and paste your data (back when each LOCATION was listed) into a spreadsheet and use its COUNT function to find uniques; I’m lazy and just bang any old number in there, and then save it with a .dl extension. Then I open it using a small program called Keyplayer. This isn’t what the program is for, but it will give you an error message that tells you the correct number of nodes! Put that number into your DL file, and try again. If you’ve got it right, Keyplayer won’t do anything – its silence speaks volumes (you can then run an analysis in Keyplayer. If your DL file is not formatted correctly, no results!).
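Alternatively, a few lines of Python will count the unique labels and write the DL file for you, no error-message divination required. A sketch, assuming the one-location-line-per-letter file from above (and note it naively space-separates labels, so multi-word place names would need extra care):

```python
def write_dl(locations_path, out_path="letters.dl"):
    """Write a UCINET DL nodelist1 file from per-letter location lines.

    Input lines look like 'Austin,Houston,Galveston' -- the first place
    connected to all the rest, per the nodelist1 assumption."""
    with open(locations_path) as f:
        rows = [line.strip().split(",") for line in f if line.strip()]
    n = len({place for row in rows for place in row})  # unique labels
    with open(out_path, "w") as out:
        out.write("DL n=%d\n" % n)
        out.write("format = nodelist1\n")
        out.write("labels embedded:\n")
        out.write("data:\n")
        for row in rows:
            out.write(" ".join(row) + "\n")
```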

You now have a DL file that you can analyze in Pajek or UCINET. If you want to visualize in Gephi, you have to get it into a DL format that Gephi can use (edgelist) or else into .net format. Open your DL file in Pajek, and then save as Pajek format (which is .net). Then open in Gephi. (Alternatively, going back a step, you can open in Keyplayer, and then within Keyplayer, hit the ‘visualize in Pajek’ button, and you’ll automatically get that transformation). (edit: if you’re on a Mac, you have to run Pajek or Ucinet with something like Winebottler. Forgot to mention that).

Ta da!

Locations mentioned in letters of the Republic of Texas




-ing history!

Still playing with videogrep. I downloaded 25 Heritage Minute commercials (non-Canadians: a series of one-minute-or-so clips that teach us Canucks about the morally uplifting things we’ve done in the past, things we’ve invented, bad-things-we-did-but-we’ve-patched-over-now. You get the gist). I ran them through various pattern matches based on parts-of-speech tagging. It was hard to do anything more than that because the closed captioning (on which this all rests) was simply awful. Anyway, there’s a healthy dose of serendipity in all of this, as even after the search is done, the exact sequence the clips are reassembled in is more or less random.

And with that, I give you the result of my pattern matching for gerunds:

-ing history! A Heritage Minute Auto-Supercut.

Historical Maps into Minecraft


Dow’s Lake area, settlement by 1847 Map Source: Bruce Elliott, Nepean, The City Beyond, page 23, posted on http://www.bytown.net/dowslake.htm

The folks over at the New York Public Library published an excellent & comprehensive tutorial for digitizing historical maps, and then importing them into Minecraft.

First: thank you!

Unfortunately, for me, it’s not working. I document here what I’ve been doing and ideally someone far more clever than me will figure out what needs to happen…

The first parts of the tutorial – working with QGIS & Inkscape – go very well (although there might be a problem with colours, but more on that anon). Let’s look at the python script for combining the elevation map (generated from QGIS) with the blocks map (generated from Inkscape). Oh, you also need to install imagemagick, which you then run from the command line, to convert SVG to TIF.

“The script for generating the worlds uses PIL to load the TIFF bitmaps into memory, and pymclevel to generate a Minecraft world, one block at a time. It’s run successfully on both Mac OS X and Linux.”

After digitizing, looks like this.


I’ve tried both Mac and Linux, with python installed, and PIL, and pymclevel. No joy (for the same reasons as for Windows, detailed below). Like most things computational, there are dependencies that we only uncover quite by accident…

Anyway, when you’ve got python installed on Windows, you can just type the python file name at the command prompt and you’re off. So I download pymclevel, unzip it, open a command prompt in that folder (shift + right click, ‘open command prompt here’), and type ‘setup.py’. Error message. Turns out, I need setuptools. Which I obtain from:


Download, install. Works. Ok, back to the pymclevel folder, setup.py, and new error message. Looks like I need something called ‘cython’.


I download, unzip, go to that folder, setup.py. Problem. Some file called ‘vcvarsall.bat’ is needed. Solution? Turns out I need to download Microsoft Visual Studio 10. Then, I needed to create an environment variable called ‘vs90comntools’, which I did by typing this at the command prompt:

set VS90COMNTOOLS=C:\Program Files (x86)\Microsoft Visual Studio 10.0\Common7\Tools\

Wunderbar. I go back to the pymclevel folder, I run setup.py again, and hooray! It installs. I had PIL installed from a previous foray into things pythonesque, so at least I didn’t have to fight with that again.

I copy the generate_map.py script into notepad++, change the file names within it (so that it finds my own elevation.tif and features.tif files, which are called hogs-elevation.tif and hogs-features.tif; the area I’m looking at is the Hogsback Falls section of the Rideau. In the script, just change ‘fort-washington’ to ‘hogs’ or whatever your files are called). In my folder, at the command prompt, I type generate_map.py and get a whole bunch of error messages: various ‘yaml’ files can’t be found.

Did I mention PyYaml has to be installed? Fortunately, it has a windows installer.  Oh, and by the way – PyWin is also needed; I got that error message at one point (something obscure about win32api), and downloading/installing from here solved it: http://sourceforge.net/projects/pywin32/files/pywin32/

Ok, so where were we? Right, missing yaml files, like ‘minecraft.yaml’ and ‘classic.yaml’, and ‘indev.yaml’ and ‘pocket.yaml’. These files were there in the original repository, but for whatever reason, they didn’t install into the pymclevel that now lives in the Python directory. So I went to the pymclevel repo on github, copied-and-pasted the code into new documents in notepad++, and saved them under those yaml file names:


Phew. Back to where I was working on my maps, and I have my generate_map.py, which I duly enter and…. error: can’t do ‘from tree import Tree, treeObjs’. Googling around to solve this is a fool’s errand: ‘tree’ is such a common word and concept in programming that I just can’t figure out what’s going on here. So I turned that line off with a # in the code. Run it again…. and it seems to work (but is this the key glitch that kills all that follows?).

(update: as Jonathan Goodwin points out, ‘tree.py’ is there, in the NYPL repo

…so I uncommented out the line in generate_map.py, saved tree.py in the same directory, and ran the script again. Everything that follows still happens. So perhaps there’s something screwed-up with my map itself.)

The script tells me I need to tell it whether I’m creating a creative mode map or a survival mode:

so for creative mode: c:>generate_map.py map

for survival: c:>generate_map.py game

And it chugs along. All is good with the world. Then: error message. KeyError: 255 in line 241, block_id, block_data, depth = block_id_lookup[block_id]. This is the bit of code that tells the script how to map minecraft blocks to the colour scheme I used in Inkscape to paint the information from the map into my features.tif. Thing is, I never used an RGB R value of 255. Where’s it getting this from? I go back over my drawing, inspecting each element, trying to figure it out. All seems good with the drawing. So I just add this line to the code in the table:

block_id_lookup = {
    # ...existing code...
    255 : (m.Water.ID, 0, 1),
}

And run it again. Now it’s 254. And then 253. Then 249. 246. 244. 241. Now 238.

And which point, I say piss on this, and I provide you with my features tif and elevation tif and if you can please tell me what I’m doing wrong, I’d be ever so appreciative (and here’s the svg with the drawing layers, for good measure).

….when I first saw the tutorial from the NYPL, I figured, hey! I could use this with my students! I think not, at least, not yet.

(update 2: have downloaded the original map tifs that the NYPL folks used, and am running the script on them. So far, so good: which shows that, once all this stuff is installed, that it’s my maps that are the problem. This is good to know!)

Part Two:

(updated about 30 minutes after initial post) So after some to-and-fro on Twitter, we’ve got the tree.py problem sorted out. Thinking that it’s the maps where the problem is, I’ve opened the original Fort Washington features.tif in MS Paint (which is really an underappreciated piece of software). I’ve zoomed in on some of the features, and compared the edges with my own map (similarly opened and zoomed upon). In my map, there are extremely faint colour differentiations/gradations where blocks of colour meet. This, I think, is what has gone wrong. So, back to Inkscape I go…

Update the Third: looks like I made (another) silly error – big strip of white on the left hand side of my features.tif. So I’ve stripped that out. But I can’t seem to suss the pixel antialiasing issue. Grrrrr! Am now adding all of the pixels into the dictionary, thus:

block_id_lookup = {
0 : (m.Grass.ID, None, 2),
10 : (m.Dirt.ID, 1, 1), # blockData 1 == grass can't spread
11 : (m.Dirt.ID, 1, 1), # blockData 1 == grass can't spread
12 : (m.Dirt.ID, 1, 1), # blockData 1 == grass can't spread
14 : (m.Dirt.ID, 1, 1), # blockData 1 == grass can't spread
16 : (m.Grass.ID, None, 2),
20 : (m.Grass.ID, None, 2),
30 : (m.Cobblestone.ID, None, 1),
40 : (m.StoneBricks.ID, None, 3),
200 : (m.Water.ID, 0, 2), # blockData 0 == normal state of water
210 : (m.WaterActive.ID, 0, 1),
220 : (m.Water.ID, 0, 1),
49 : (m.StoneBricks.ID, None, 3),
43 : (m.StoneBricks.ID, None, 3),
}
…there’s probably a far more elegant way of dealing with this. Rounding? Range lookup? I’m not v. python-able…
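Rounding to the nearest key the dictionary already knows would be one way – a sketch, with placeholder values standing in for the real pymclevel block tuples:

```python
def snap_to_known(pixel_value, lookup):
    """Map a pixel value to the entry for the nearest known key.

    Antialiasing yields stray values (254, 253...) just off the
    intended ones (255), so snap each to its closest defined key."""
    nearest = min(lookup, key=lambda k: abs(k - pixel_value))
    return lookup[nearest]

# Placeholder table standing in for block_id_lookup:
table = {0: "grass", 200: "water", 255: "water-edge"}
print(snap_to_known(253, table))  # nearest key is 255
print(snap_to_known(197, table))  # nearest key is 200
```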

Update, 2.20pm: Ok. I can run the script on the Fort Washington maps and end up with a playable map (yay!). But my own maps continue to contain pixels of colours the script doesn’t want to play with. I suppose I could just add 255 lines worth, as above, but that seems silly. The imagemagick command, I’m told, works fine on a mac, but doesn’t seem to achieve anything on my PC. So something to look into (and perhaps try this http://www.graphicsmagick.org/ instead). In the meantime, I’ve opened the Fort Washington map in good ol’ Paint, grabbing snippets of the colours to paste into my own map (also open in Paint). Then, I use Paint’s tools to clean up the colour gradients at the edges on my map. In essence, I trace the outlines.

Then, I save, run the script and…… success!

I have a folder with everything I need (and you can have it, too.) I move it to

C:\Users\[me]\AppData\Roaming\.minecraft\saves and fire up the game:

Rideau River in Minecraft!


Does it actually look like the Hogs’ Back to Dow’s Lake section of the Rideau Canal and the Rideau River? Well, not quite. Some issues with my basic elevation points. But – BUT! – the workflow works! So now to find some better maps and to start again…

Shared Authority & the Return of the Human Curated Web

A few years ago, I wrote a piece on Why Academic Blogging Matters: A structural argument. This was the text for a presentation as part of the SAA in Sacramento that year. In the years since, the web has changed (again). It is no longer enough for us to create strong signals in the noise, trusting in the algorithms to connect us with our desired publics. (That’s the short version. The long version is rather more nuanced and sophisticated, trust me).

The war between the botnets and the SEO specialists has outstripped us.

In recent months, I have noticed an upsurge of new ‘followers’ on this blog with emails and handles that really do not seem to be those of actual humans. Similarly, on Twitter, I find odd tweets directed at me filled with gibberish web addresses (which I dare not touch). Digital Humanities Now highlighted an interesting post in recent days that explains what’s going on, discusses this ‘war’, and – in how this post came to my attention – points the way forward for the humanistic use of the web.

In ‘Crowd-Frauding: Why the Internet is Fake‘, Eric Hellman discusses a new avenue for power (assuming that power ‘derives from the ability to get people to act together’): in this case, ‘cooperative traffic generation’, or software-organized crime. Hellman was finding a surge of fake users on his site, and he began to investigate why this was. Turns out, if you want to promote your website and jack up its traffic, you can install a program that manufactures fake visitors to your site, who click around, click on adverts, register… and your machine in turn does this for other users of the software. Money is involved.

“In short, your computer has become part of a botnet. You get paid for your participation with web traffic. What you thought was something innocuous to increase your Alexa ranking has turned you into a foot-soldier in a software-organized crime syndicate. If you forgot to run it in a sandbox, you might be running other programs as well. And who knows what else.

The thing that makes cooperative traffic generation so difficult to detect is that the advertising is really being advertised. The only problem for advertisers is that they’re paying to be advertised to robots, and robots do everything except buy stuff. The internet ad networks work hard to battle this sort of click fraud, but they have incentives to do a middling job of it. Ad networks get a cut of those ad dollars, after all.

The crowd wants to make money and organizes via the internet to shake down the merchants who think they’re sponsoring content. Turns out, content isn’t king, content is cattle.”

Hellman goes on to describe how the arms race (the Red Queen effect) between these botnets and advertising models that depend on click rates will push those of us without the computing resources to fight these battles into the arms of the Googles, the Amazons, the Facebooks: and their power will increase correspondingly.

“So with the crowd-frauders attacking advertising, the small advertiser will shy away from most publishers except for the least evil ones- Google or maybe Facebook. Ad networks will become less and less efficient because of the expense of dealing with click-fraud. The rest of the internet will become fake as collateral damage. Do you think you know how many users you have? Think again, because half of them are already robots, soon it will be 90%. Do you think you know how much visitors you have? Sorry, 60% of it is already robots.”

I sometimes try explaining around the department here that when we use the internet, we’re not using a tool: we’re sharing authority with countless engineers, companies, criminals, folks-in-their-parents-basements, ordinary folks, students, and algorithms whose interactions with other algorithms can lead to rather unintended outcomes. We can’t naively rely on the goodwill of the search engine to help us get our stuff out there. This, I think, is an opportunity for a return of the human-curated web. No, I don’t mean building directories and indices. I mean a kind of supervised learning algorithm (as it were).

Digital Humanities Now provides one such model (and there are of course others, such as Reddit). A combination of algorithm and human editorial oversight, DHNow is a cybernetic attempt to bring to the surface the best of the week’s digital humanities work, wherever on the net it may reside. We should have the same in archaeology. An Archaeology Now! The infrastructure is already there. PressForward, the outfit from the RRCHNM, has developed a workflow for folding volunteer editors into the weekly task of separating the wheat from the chaff, using a custom-built plugin for WordPress. Ages ago we talked about a quarterly journal where people would nominate their own posts and we would spider the web looking for those nominations, but the technology wasn’t really there at the time (and perhaps the idea came too soon). With the example of DHNow, and the emergence of this new front of botnets, SEO, and click fraud and the dangers it poses, perhaps it’s time to revisit the idea of the human-computer curated archaeoweb?

Hollis Peirce, George Garth Graham Research Fellow

Hollis Peirce on Twitter: https://twitter.com/HollPeirce

I am pleased to announce that the first George Garth Graham Undergraduate Digital History Research Fellow will be Mr. Hollis Peirce.

Hollis is a remarkable fellow. He attended the Digital Humanities Summer Institute at the University of Victoria in the summer of 2012, where he successfully completed a course called “Digitization Fundamentals and Their Application”. In the fall semester of 2012 he was the impetus behind, and helped to organize, THATCamp Accessibility, on the subject of the impact of digital history on accessibility in every sense of the word.

Hollis writes,

Life for me has been riddled with challenges.  The majority of them coming on account of the fact that I, Hollis Peirce, am living life as a disabled individual with Congenital Muscular Dystrophy as many things are not accessible to me.  However, I have never let this fact hold me back from accomplishing my goals.  Because of this, when I first started studying history I knew I was not choosing an easy subject for a disabled individual such as myself.  All those old, heavy, books on high library shelves that history is known for made it one of the most inaccessible subjects possible to study.  All that changed however, when I discovered digital history.

It was thanks to a new mandatory class for history majors at Carleton University called The Historian’s Craft, taught by a professor named Dr Shawn Graham.  This course was aimed at teaching students all about how to become a historian, and how a historian is evolving through technology.  At that moment the idea for ‘Accessibility & Digital History’ came to mind.  From that point on many steps have been taken to advance my studies in this field, which has led to being selected as the first George Garth Graham Undergraduate Digital History Research Fellow.

Hollis and I have had our first meeting about what his project might entail. When I initially cooked this idea up, I thought it would allow students the opportunity to work on my projects, or those of my colleagues around the university. As we chatted about Hollis’ ideas (and I batted around some of my own), I realized that I had the directionality of this relationship completely backwards.

It’s not that Hollis gets to work on my projects. It’s that I get to work on his.

Here’s what we came up with.

At THATCamp Accessibility, we recorded every session. We bounced around the idea of transcribing those sessions, but realized that that was not really feasible for us. We started talking about zeroing in on certain segments, to tell a history of the future of an accessible digital humanities… and ideas started to fizz. I showed Hollis some of Jentery Sayers’s stuff, especially his work with Scalar.

Jentery writes,

the platform particularly facilitates work with visual materials and dynamic media (such as video and audio)….it enables writers to assemble content from multiple sources and juxtapose them with their own compositions.

Can we use Scalar to tell the story of THATCamp Accessibility in a way that captures the spontaneity, creativity, and excitement of that day, while highlighting the issues of accessibility that Hollis wants to explore? And if we can, how can we make it accessible for others (screenreaders, text-to-speech, etc.)? And if we focus on telling history with an eye to accessibility (oh, how our metaphors privilege certain senses, certain ways of knowing!), maybe there will be lessons for telling history, full stop?

Stay tuned! Hollis is setting up his blog this week, but he’ll be posting over at http://hollispeirce.grahamresearchfellow.org/

Historian’s Macroscope- how we’re organizing things

‘One of the sideshows was wrestling’ from National Library of Scotland on Flickr Commons; found by running this post through http://serendipomatic.org

How do you coordinate something as massive as a book project, between three authors across two countries?

Writing is a bit like sausage making. I write this, thinking of Otto von Bismarck, but Wikipedia tells me:

  • Laws, like sausages, cease to inspire respect in proportion as we know how they are made.
    • As quoted in University Chronicle. University of Michigan (27 March 1869) books.google.de, Daily Cleveland Herald (29 March 1869), McKean Miner (22 April 1869), and “Quote… Misquote” by Fred R. Shapiro in The New York Times (21 July 2008); similar remarks have long been attributed to Otto von Bismarck, but this is the earliest known quote regarding laws and sausages, and according to Shapiro’s research, such remarks only began to be attributed to Bismarck in the 1930s.

I was thinking just about the messiness, rather than the inspiring of respect; but we think there is a lot to gain when we reveal the messiness of writing. Nevertheless, there are some messy first-first-first drafts that really ought not to see the light of day. We want to do a bit of writing ‘behind the curtain’ before we make the bits and pieces visible on our Commentpress site, themacroscope.org. We are all fans of Scrivener, too, for the way it allows the bits and pieces to be moved around, annotated, rejected, resurrected and so on. Two of us are Windows folks, the other a Mac user. We initially tried using Scrivener and GitHub as a way of managing version control over time and providing simultaneous access to the latest version. This worked fine, for about three days, until I detached the HEAD.

Who knew that decapitation was possible? Then we started getting weird line breaks and dropped index cards. So we changed tack and moved our project into a shared Dropbox folder. We know that with Dropbox we absolutely can’t have more than one of us in the project at the same time. We started emailing each other to say, ‘hey, I’m in the project… now. It’s 2.05 pm’, but that got very messy. We installed yshout and set it up to log our chats. Now we can just check to see who’s in, and leave quick memos about what we were up to.
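For anyone who decapitates their own repository: a detached HEAD usually just means you checked out a commit rather than a branch, and the standard escape is to pin your current position to a branch before moving anywhere. This is a minimal sketch in a throwaway demo repository (the file and branch names are illustrative, not our actual Scrivener project):

```shell
# Build a throwaway repo and deliberately detach HEAD
cd "$(mktemp -d)"
git init -q demo && cd demo
git config user.email you@example.com && git config user.name you
echo "chapter one" > draft.txt && git add draft.txt && git commit -qm "first"
echo "chapter two" >> draft.txt && git commit -qam "second"
git checkout -q HEAD~1           # HEAD is now detached

# The fix: record the current commit on a branch so no work is lost,
# then return to the branch you came from and merge the rescue branch in
git branch rescue-work
git checkout -q -                # '-' means the previously checked-out branch
git merge -q rescue-work
git branch -v                    # rescue-work is safely recorded
```

Had we known this at the time, the detached HEAD would have cost us five minutes rather than the whole workflow.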

Once we’ve got a bit of the mess cleaned up, we’ll push bits and pieces to our Commentpress site for comments. Then we’ll incorporate that feedback back into our Scrivener project, and perhaps re-push it out for further thoughts.

One promising avenue that we are not going down, at least for now, is to use Draft.  Draft has many attractive features, such as multiple authors, side-by-side comparisons, and automatic pushing to places such as WordPress. It even does footnotes! I’m cooking up an assignment for one of my classes that will require students to collaboratively write something, using Draft. More on that some other day.