
Call for Collaborators: The Open Digital Archaeology Textbook Environment (ODATE)

The Open Digital Archaeology Textbook Environment is a collaborative writing project that I am leading with Neha Gupta, Michael Carter, and Beth Compton. (See earlier posts on this project here.) We recognize that this is a pretty big topic to tackle, so we would like to invite friends and allies to become co-authors with us. Contact us by Jan 31st; see below.

Here is the current live draft of the textbook. It is, like all live-written openly accessible texts, a thing in the process of becoming, replete with warts, errors, clunky phrasing, and odd memos-to-self. I’m always quietly terrified to share work in progress, but I firmly believe in both the pedagogical and collegial value of such endeavours. While our progress has been a bit slower than one might’ve liked, here is where we currently stand:

  1. We’ve got the framework set up to allow open review and collaboration via the Hypothes.is web annotation framework and the use of Github and gh-pages to serve up the book
  2. The book is written in the bookdown framework with R Markdown and so can have actionable code within it, should the need arise
  3. This also has the happy effect of making collaboration open and transparent (although not necessarily easy)
  4. The DHBox computational environment has been set up and is running on Carleton’s servers. It’s currently behind a firewall, but that’ll be changing at some point during this term (you can road-test things on DHBox)
  5. We are customizing it to add QGIS and VSFM and some other bits and bobs that’d be useful for archaeologists. Suggestions welcome
  6. We ran a test of the DHBox this past summer with 60 students. My gut feeling is that not only did this make teaching easier and keep all the students on the same page, but the students also came away with a better ability to roll with whatever their own computers threw at them.
  7. Of six projected chapters, chapter one is in pretty good – though rough – shape

So, while the majority of this book is being written by Graham, Gupta, Carter and Compton, we know that we are leaving a great deal of material un-discussed. We would be delighted to consider additions to ODATE, if you have particular expertise that you would like to share. As you can see, many sections in this work have yet to be written, and so we would be happy to consider contributions aimed there as well. Keep in mind that we are writing for an introductory audience (who may or may not have foundational digital literacy skills) and that we are writing for a linux-based environment. Whether you are an academic, a professional archaeologist, a graduate student, or a friend of archaeology more generally, we’d be delighted to hear from you.

Please write to Shawn at shawn dot graham at carleton dot ca by January 31st 2018 to discuss your idea and how it might fit into the overall arc of ODATE. The primary authors will discuss whether or not to invite a full draft. A full draft will need to be submitted by March 15th 2018. We will then offer feedback. The piece will go up on this draft site by the end of the month, whereupon it will enjoy the same open review as the other parts. Accepted contributors will be listed as full authors, e.g. ‘Graham, Gupta, Carter, Compton, YOUR NAME. 2018. The Open Digital Archaeology Textbook Environment. eCampusOntario…’

For help on how to fork, edit, make pull requests and so on, please see this repo.

 

Featured Image: “My Life Through a Lens”, bamagal, Unsplash


Building Epoiesen

The curtain goes up, the first pawn moves, the first shot is fired*—but that’s not the start. The play, the game, the war is just a little window on a ribbon of events that may extend back thousands of years. The point is, there’s always something before. It’s always a case of Now Read On.

Much human ingenuity has gone into finding the ultimate Before.

The current state of knowledge can be summarized thus:

In the beginning, there was nothing, which exploded.

* Probably at the first pawn.

  • Terry Pratchett, Lords and Ladies 

Epoiesen is now alive, a journal for creative engagement in history and archaeology. But when did Epoiesen begin? What was its genesis?

As I look through my notebooks and emails and miscellaneous files, I can’t find the *exact* beginning (I know that I’ve been interested in new publishing models for a while, though). I find in my inbox an email setting up a meeting with George Duimovich and Pat Moore from our Library to talk about Open Journal Systems in March of 2015. I find scribbles of ideas in notebooks going back to about 2014 (not coincidentally, shortly after my tenure and promotion portfolio was shoehorned from its born-digital format into dead PDFs). In October 2015, I find a Google Doc that I shared with some folks for an idea of something called ‘Paradata: A Journal of Digital Scholarship in Archaeology and Ancient History’. The influence of HeritageJam, I think, is clear too 😉 I wrote,

“I see this idea as being parallel to things like http://openarchaeologydata.metajnl.com/ which publishes ‘data papers’. Paradata would publish the scholarship that goes into making something of that kind of info, while the dead tree versions can be where people duke it out over the interpretations. Moreover, since link rot and website death is a real issue, Paradata would hit a real niche where projects could continue ever after”

But that idea seems to have run out of steam.  I’m not entirely sure why. I find one note that suggests we felt that our idea perhaps was too close to what DHCommons Journal had in mind.

My notes go silent for a while. Then I find scribbles in my notebooks again from around the time of my participation in MSUDAI, the Digital Archaeology Institute at MSU, concomitant with the creation of @tinyarchae my Tracery-powered dysfunctional excavation bot. That was August 2016. Then, sometime in September of last year, I find a website I built:

O what could’ve been, eh? Here, I’m clearly going for a bit of whimsy; not so much just paradata for conventional digital projects, but maybe something more. The core idea still seems to be there – a place to validate digital things.  I rather like that template, and I need to remind myself what I was using to build it. Structurally, there’s a debt here to open research notebooks done in Jekyll, so that’s probably it.  I did show this ‘Miscellaney’ to people, but there were some very strong reactions to the name, to the whimsy, as it were – see below. (I still like ‘Haptic Visions’ as a category though).

The actual email that led to Epoiesen seeing the light of day comes from October 16 2016:

Hi Pat,

As I was saying to George – and I think you and I have talked about this too on occasion – I’ve been interested to explore creating an open access journal for digital archaeology. I’ve seen the open journals platform, and while it is very cool, it’s not quite what I’m thinking of. I’m interested in something a bit more idiosyncratic that would be based on simple text files in the markdown format, and building the site/journal from that with a static site generator.

The idea is to create what amounts to a kind of literary journal, but for creative engagement with the past. I would solicit everything from twitter bots (I’ve created one that tweets out what amounts to a procedurally-generated soap opera, scenes from an excavation) to music, to art, to creative writing, to data viz… I would solicit reviews, but these would also be published alongside the work under the reviewers’ name. The Hypothesis web annotation architecture would also be built in  […] In a way, it would be a place to publish the ‘paradata’ of digital making in archaeology … Does this sound feasible? Is it something we could do? Maybe I could drop by sometime to chat.

Pat said ‘Yes’. Simple word, ‘yes’. Strong word, ‘yes’. Librarians are powerful.

From that initial meeting, many more meetings took place. Research. Emails. Phone calls. I’ll try to summarize… My first port of call was of course those folks who’ve done this kind of thing before. Martin Paul Eve published a series of posts on his blog that offered his advice on starting an open access journal, and I can’t recommend them enough. Indeed, if you’re one of the people who received an email from me about joining the editorial board, you’ll recognize that I adhered rather closely to Eve’s model.

I was still going with the ‘Smith’s’ name until about November of last year, when I find an email I wrote,

I have, on the advice of several people whose situation is far more precarious than my own, gone for a bit of a name change to signal a bit less whimsy….They rightly pointed out to me that as junior folks the perception of their work is everything, and my whimsical ’Smiths’ name would undermine rather than help them…

One of the earliest folks on board was the wonderful Sara Perry.  I find we exchanged several yonks-worth of emails, throwing ideas around about who to contact, who might be persuaded to submit, and so on. The wonderful folks of the editorial board as a group kept me grounded, found potential contributors, suggested Trello as a way of keeping track of who was doing what, and basically helped keep things on track when my enthusiasm threatened to derail things.

While all of this was going on, I continued to play with the design and platform. I eventually settled on Hexo as a static site generator. I’d been using Jekyll with my open research notebook, but Jekyll frankly is just not something I can work with.

Now, Hexo is not without its idiosyncrasies. I learned how it builds the site out of little snippets of ‘ejs’ code. I learned how to embed Hypothes.is (that is, into which ‘partial’ to paste it). I figured out where to place the bits and bobs of Tipuesearch (an open-source jQuery search-engine plugin) into the site. (It generates a full JSON representation of all the site content, which not only makes the site searchable but also lets other folks use it for remixing, data viz, whatever.) You wouldn’t believe how hard it was to work out how to make the list-of-articles page list alphabetically rather than by date. There was also a battle where the hexo deploy command for pushing everything to my test site accidentally ingested a bunch of stuff I didn’t want – super huge image files – and so I had to wade deep into the waters of git to fix it (and I thank the DHAnswers channel in DH Slack for the help!). Turns out, if you’re using Hexo and GitHub, don’t fiddle with anything via the website. Getting DOIs embedded into the page metadata, that was also difficult.

Here’s what the YAML metadata for an article looks like:

---
title: Destory History
date: 2017-09-01 20:01:04
tags: interactive fiction
cover_index: /imgs/coyne/quinten-de-graaf-258711.jpg
cover_detail: /imgs/coyne/11146352055_64c730a741_o.jpg
author: "Coyne, Lucas"
doi: "10.22215/epoiesen/2017.4"
---

The partials grab the ‘author’ for alphabetizing posts, and the doi for embedding into the metadata:

<% if (page.doi){ %>
        <meta name="dc:identifier" content="<%= page.doi %>" />
    <% } %>

That might not mean very much to you, or look very impressive, but darnit, it took several hours of reading Stack Overflow and fiddling with things, generating the site over and over again, so I’m pasting it here for posterity. Anyway, I now have a default template for creating articles, with reminders within it of the kinds of information that I need to include and where they go.
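The alphabetical list-of-articles page mentioned above boils down to sorting posts on the ‘author’ front-matter field instead of the date. A minimal sketch in plain JavaScript (the post data here is hypothetical, standing in for Hexo’s posts collection; the real site does this inside an ejs partial):

```javascript
// Hypothetical posts, standing in for Hexo's site.posts collection.
const posts = [
  { title: "Destory History", author: "Coyne, Lucas", date: "2017-09-01" },
  { title: "Another Piece", author: "Author, Another", date: "2017-10-15" },
];

// Sort a copy by the 'author' field rather than by the default date order.
const byAuthor = [...posts].sort((a, b) => a.author.localeCompare(b.author));

console.log(byAuthor.map(p => p.author));
```

The same `sort`/`localeCompare` idea is what the ejs template loop needs before it iterates over the posts.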

One year later, Epoiesen exists in the world! I announced it with a tweet…

https://twitter.com/electricarchaeo/status/913041896238436352

… we will be doing a formal ‘Ta Da!’ during open access week in October. So many people come together at just the right time to make something in this world. Serendipity, and someone says ‘Yes’, and suddenly, there is something that wasn’t before. What else is tenure for, if it isn’t to make space for someone else to do something new? I hope you’ll consider Epoiesen for your own experiments and creative engagements.

~o0o~

I’m grateful to everyone who has sent me a note or tweeted regarding the start of Epoiesen. I look forward to seeing where this adventure will lead! Thank you, all. I’m also grateful to Neville Morley, who writes about Epoiesen’s situation in the broader publishing landscape in ‘Changing the Rules’.

Moving forward, I’m very excited to work with Bill Caraher and The Digital Press at the University of North Dakota to publish the Epoiesen Annual, where all of the articles from a given year are gathered together and given (printed) flesh.

Stay tuned! Make wonderful things!


R is for Archaeology: A report on the 2017 Society of American Archaeology meeting, by B Marwick

This guest post is by Ben Marwick of the University of Washington in Seattle. He reports on the R workshop at the recent SAA meeting in Vancouver.

The Society of American Archaeology (SAA) is one of the largest professional organisations for archaeologists in the world, and just concluded its annual meeting in Vancouver, BC at the end of March. The R language has been a part of this meeting for more than a decade, with occasional citations of R Core in the posters, and more recently, the distinctive ggplot2 graphics appearing infrequently on posters and slides. However, among the few archaeologists that have heard of R, it has a reputation for being difficult to learn and use, idiosyncratic, and only suitable for highly specialized analyses. Generally, archaeology students are raised on Excel and SPSS. This year, a few of us thought it was time to administer some first aid to R’s reputation among archaeologists and generally broaden awareness of this wonderful tool. We developed a plan for this year’s SAA meeting to show our colleagues that R is not too hard to learn, it is useful for almost anything that involves numbers, and it has lots of fun and cool people that use it to get their research done quicker and easier.

Our plan had three main elements. The first element was the debut of two new SAA Interest Groups. The Open Science Interest Group (OSIG) was directly inspired by Andrew MacDonald’s work founding the ESA Open Science section, with the OSIG being approved by the SAA Board this year. It aims to promote the use of preprints (e.g. SocArXiv), open data (e.g. tDAR, Open Context), and open methods (e.g. R and GitHub). The OSIG recently released a manifesto describing these aims in more detail. At this SAA meeting we also saw the first appearance of the Quantitative Archaeology Interest Group, which has a strong focus on supporting the use of R for archaeological research. The appearance of these two groups shows the rest of the archaeological community that there is now a substantial group of R users among academic and professional archaeologists, and they are keen to get organised so they can more effectively help others who are learning R. Some of us in these interest groups were also participants in fora and discussants in sessions throughout the conference, and so had opportunities to tell our colleagues, for example, that it would be ideal if R scripts were available for certain interesting new analytical methods, or that R code should be submitted when manuscripts are submitted for publication.

The second element of our plan was a normal conference session titled ‘Archaeological Science Using R’. This was a two-hour session of nine presentations by academic and professional archaeologists that were live code demonstrations of innovative uses of R to solve archaeological research problems. We collected R markdown files and data files from the presenters before the conference, and tested them extensively to ensure they’d work perfectly during the presentations. We also made a few editorial changes to speed things up a bit, for example using readr::read_csv instead of read.csv. We were told in advance by the conference organisers that we couldn’t count on good internet access, so we also had to ensure that the code demos worked offline. On the day, the live-coding presentations went very well, with no-one crashing and burning, and some presenters even doing some off-script code improvisation to answer questions from the audience. At the start of the session we announced the release of our online book containing the full text of all contributions, including code, data and narrative text, which is online at https://benmarwick.github.io/How-To-Do-Archaeological-Science-Using-R/. We could only do this thanks to the bookdown package, which allowed us to quickly combine the R markdown files into a single, easily readable website. I think this might be a new record for the time from an SAA conference session to a public release of an edited volume. The online book also uses Matthew Salganik’s Open Review Toolkit to collect feedback while we’re preparing this for publication as an edited volume by Springer (go ahead and leave us some feedback!). There was a lot of enthusiastic chatter later in the conference about a weird new kind of session where people were demoing R code instead of showing slides. We took this as an indicator of success, and received several requests for it to be a recurring event in future meetings.

The third element of our plan was a three-hour training workshop during the conference to introduce archaeologists to R for data analysis and visualization. Using pedagogical techniques from Software Carpentry (i.e. sticky notes, live coding and lots of exercises), Matt Harris and I got people using RStudio (and discovering the miracle of tab-complete) and modern R packages such as readxl, dplyr, tidyr, ggplot2. At the end of three hours we found that our room wasn’t booked for anything, so the students requested a further hour of Q&A, which led to demonstrations of knitr, plotly, mapview, sf, some more advanced ggplot2, and a little git. Despite being located in the Vancouver Hilton, this was another low-bandwidth situation (which we were warned about in advance), so we loaded all the packages onto the students’ computers from USB sticks. In this case we downloaded package binaries for both Windows and OSX, put them on the USB sticks before the workshop, and had the students run a little bit of R code that used install.packages() to install the binaries to the .libPaths() location (for Windows) or untar’d the binaries to that location (for OSX). That worked perfectly, and seemed to be a very quick and lightweight method to get packages and their dependencies to all our students without using the internet. Getting the students started by running this bit of code was also a nice way to orient them to the RStudio layout, since they were seeing it for the first time.

This workshop was a first for the SAA, and was a huge success. Much of this is due to our sponsors who helped us pay for the venue hire (which was surprisingly expensive!). We got some major support from the Microsoft Data Science User Group (which we learned about from a post by Joseph Rickert) and from Open Context, as well as cool stickers and swag for the students from RStudio, rOpenSci, and the Centre for Open Science. We used the stickers like tiny certificates of accomplishment; for example, when our students produced their first plot, we handed out the ggplot2 stickers as a little reward.

Given the positive reception of our workshop, forum and interest groups, our feeling is that archaeologists are generally receptive to new tools for working with data, perhaps more so now than in the past (i.e. pre-tidyverse). Younger researchers seem especially motivated to learn R because they may have heard of it, but not had a chance to learn it because their degree program doesn’t offer it. If you are a researcher in a field where R (or any programming language) is only rarely used by your colleagues, now might be a good time to organise a rehabilitation of R’s reputation in your field. Our strategy of interest groups, code demos in a conference session, and a short training workshop during the meeting is one that we would recommend, and we imagine will transfer easily to many other disciplines. We’re happy to share more details with anyone who wants to try!


On Writing with Bookdown

How do you collaborate remotely with co-authors? How do you make sure that what you write is as accessible as possible to as many people as possible, and do so such that your readers become collaborators as well?

When Ian, Scott and I wrote The Macroscope, we tried a combination of a CommentPress-enabled WordPress site, coupled with a private ychat instance and a Dropbox folder filled with Word doc files. We had initially tried Scrivener with a GitHub repo (since a Scrivener project is ultimately a stack of RTF files), but we ran into sync problems, which were partly caused by using Scrivener on Windows and Mac at the same time, and… well, it didn’t work off the bat. You can read more about that experience over on the AHA.

For my current project, the Open Digital Archaeology Textbook Environment (ODATE for short; I’m rubbish at a) names and b) acronyms), Neha, Michael, Beth and I are trying to use Slack to manage our conversations, writing instead in a series of markdown files, and using a repo to manage the collaboration, resolve conflicts, etc. Right now this is going swimmingly, because for various reasons the rest of the gang can’t devote any time to this project. For my part, I’m trying to get the infrastructure and preliminary writing set up so we can hit the ground running in around mid-March.

Given our requirements outlined above, I looked around at a couple of different platforms, settling eventually on Bookdown. I like Bookdown because I can write in whatever editor strikes my fancy (switching into R to do the build) and it will give me a complete flat website (*with* search), an epub, and a PDF, grabbing my citations from a BibTeX file and formatting them appropriately (we have a collaborative Zotero library to add to while we write/research; export to BibTeX, boom!). Right now, I’m writing within RStudio. With some minimal tweaking, it also allows me to build in Hypothes.is collaborative annotation (and via the Hypothes.is API, I plan on collating annotations periodically to guide revision, and also perhaps to build user appendices into the book, but that idea is still nebulous at the moment). I’ve also run the resulting website from Bookdown through various web-accessibility tools, and the report comes back fairly positive. With some more tweaking, I think I can make the final product super-accessible, or at the very least produce PDFs that screenreaders and so on can work with.

Getting Bookdown set up was not without its idiosyncrasies. The RStudio version has to be the preview release. Then:

  1. Create a new project in RStudio (do this in a brand new folder).
  2. Run the following script to install Bookdown:
install.packages("devtools")
devtools::install_github("rstudio/bookdown")
  3. Create a new text file with metadata that describes how the book will be built. The metadata is in a format called YAML (‘YAML Ain’t Markup Language’) that uses keys and values that get passed into other parts of Bookdown:
title: "The archaeology of spoons"
author: "Graham, Gupta, Carter, & Compton"
date: "July 1 2017"
description: "A book about cocleararchaeology."
github-repo: "my-github-account/my-project"
cover-image: "images/cover.png"
url: 'https\://my-domain-ex/my-project/'
bibliography: myproject.bib
biblio-style: apa-like
link-citations: yes

This is the only thing you need to have in this file, which is saved in the project folder as index.Rmd.

  4. Write! We write the content of this book as text files, saving the parts in order. Each file should be numbered 01-introduction.Rmd, 02-a-history-of-spoons.Rmd, 03-the-spoons-of-site-x.Rmd and so on.
  5. Build the book. With Bookdown installed, there will be a ‘Build Book’ button in the RStudio build pane. This will generate the static html files for the book, the pdf, and the epub. All of these will be found in a new folder in your project, _book. There are many more customizations that can be done, but that is sufficient to get one started.

To get Hypothesis working, we have to modify the _output.yml file:


bookdown::gitbook:
  includes:
      in_header: hypothesis.html

That html file is just a file with a single line: the script src line for embedding Hypothes.is (see this guidance).
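Assuming you are using the standard embed that Hypothes.is documents, the whole of hypothesis.html would be just:

```html
<script src="https://hypothes.is/embed.js" async></script>
```

Bookdown then pastes that into the head of every generated page via the in_header setting above.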

You end up with a Gitbook-looking site, but without any of the gitbook editor flakiness. Have fun! Right now, we’re using a private Hypothesis group too to leave feedback on the various bits that have been written, so hopefully this will make for a smoother collaborative experience. Stay tuned.


Notes on running the DH-USB

Our digital archaeology textbook will be intertwined with an instance of the DHBox. One of the participants in that project is Jonathan Reeve, who has been building a version that runs off a bootable USB stick. So naturally, I had to give it a spin. I ran out, got a new USB stick and…

…had to figure out BitTorrent. Every time I went to install a client, every browser I had on every machine kept blocking it as malicious. Normally I can work around this sort of thing, but it was really pernicious. It turned out my stable of computers was quite happy with uTorrent instead. With that installed, I grabbed the torrent files from the DH-USB repository and let them do their magic. It took three hours to get the full .img file.

…had to figure out how to put that .img onto a USB stick such that it would be bootable. UNetbootin should’ve worked, but didn’t. In the end, I had to do it from the command line, per the ‘alternative instructions’:

MacOS: Identify the label of your USB drive with the command diskutil list. Then unmount the disk with diskutil unmountDisk /dev/diskX, replacing diskX with your drive name. Finally, run sudo dd if=/path/to/dh-usb.img of=/dev/rdiskX bs=1m again replacing /path/to/dh-usb.img with the path to the .img file, and diskX with the name of your disk.

Then I had to figure out how to get the damned machines to boot from the stick rather than their own hard drive. On the Mac, this was easy – just hold the alt key down while the machine powers up, and you can then select the USB stick. NB: you can also, it seems, select whatever wifi network happens to be in the air at this stage, but if you do this (I did) everything will go sproing shortly thereafter and the stick won’t boot. So don’t do this. On the Windows 10 machine I had access to, booting up from a disk or stick is no longer the straightforward ‘hold down F11’ or whatever anymore. No, you have to search for the ‘advanced startup’ options, and then find the boot-from-disk option, where you specify the USB stick. THEN the machine powers down and up again… and will tell you that the security settings won’t let you proceed any further. Apparently, there’s a setting somewhere in the BIOS that you have to switch, but as it wasn’t my machine and I’d had enough, I abandoned it. Windows folks, godspeed. Incidentally, for various reasons, computers much older than about five years are out of luck, as some key pieces of ur-code have changed in recent years:

[you need] a modern system that supports UEFI. Legacy BIOS boot may be possible, but it hasn’t been extensively tested

I had some other issues subsequent as I tried to install R and R Studio, but I’ve sorted those out with Jonathan and by the time you read this, they probably won’t be issues any more (but you can click on the ‘closed issues’ on the repo to see what my issues were). One thing that drove me nuts was trying to persuade Arch Linux to find the damned wifi.

I eventually stumbled across this re ubuntu: https://help.ubuntu.com/community/WifiDocs/Driver/bcm43xx

so tried this:

$ lspci -vvnn | grep -A 9 Network

and saw that I had kernel modules brcmfmac and wl, but none in use. So I tried this:

$ sudo modprobe brcmfmac

and ran the first command again; kernel module now in use!

$ sudo wifi-menu

…and connected. I kept getting connection errors, though; I went to Settings > Network, connected through there, and ta da!

~o0o~

There you have it. A portable DH computer on a stick, ready to go. For use in classes, it’s easy enough to imagine just buying a bunch of usb sticks and filling them up with not only the computing parts but also the data sets, supporting documentation, articles etc and distributing them in class; for my online class this summer maybe the installation-onto-the-stick steps can be made more streamlined… of course, that’s what DH-Box prime is for, so I’ve asked the kind folks over in the school of computer science if they wouldn’t mind installing it on their open stack. We shall see.


ODATE: Open Digital Archaeology Textbook Environment (original proposal)

“Never promise to do the possible. Anyone could do the possible. You should promise to do the impossible, because sometimes the impossible was possible, if you could find the right way, and at least you could often extend the limits of the possible. And if you failed, well, it had been impossible.”
Terry Pratchett, Going Postal

And so we did. And the proposal Neha, Michael, Beth, and I put together was successful. The idea we pitched to eCampusOntario is for an open textbook that would have an integral computational laboratory (DHBox!) for teaching digital archaeology. The work of the DHBox team, and their generous licensing of their code, makes this entire project possible: thank you!

We put together a pretty ambitious proposal. Right now, we’re working towards designing the minimal viable version of this. The original funding guidelines didn’t envision any sort of crowd-collaboration, but we think it’d be good to figure out how to make this less us and more all of you. That is, maybe we can provide a kernel that becomes the seed for development along the lines of the Programming Historian.

So, in the interests of transparency, here’s the meat-and-potatoes of the proposal. Comments & queries welcome at bottom, or if I forget to leave that open, on twitter @electricarchaeo.

~o0o~

Project Description

We are excited to propose this project to create an integrated digital laboratory and e-textbook environment, which will be a first for the broader field of archaeology.

Digital archaeology as a subfield rests upon the creative use of primarily open-source and/or open-access materials to archive, reuse, visualize, analyze and communicate archaeological data. Digital archaeology encourages innovative and critical use of open access data and the development of digital tools that facilitate linkages and analysis across varied digital sources. 

To that end, the proposed ‘e-textbook’ is an integrated cloud-based digital exploratory laboratory of multiple cloud-computing tools with teaching materials that instructors will be able to use ‘out-of-the-box’ with a single click, or to remix as circumstances dictate.

We are proposing to create in one package both the integrated digital exploratory laboratory and the written texts that engage the student with the laboratory. Institutions may install it on their own servers, or they may use our hosted version. By taking care of the digital infrastructure that supports learning, the e-textbook enables instructors and students to focus on core learning straight away. We employ a student-centred, experiential, and outcome-based pedagogy, where students develop their own personal learning environment (via remixing our tools and materials provided through the laboratory) networked with their peers, their course professors, and the wider digital community.

Project Overview

Digital archaeology as a field rests upon the creative use of primarily open-source and/or open-access materials to archive, reuse, visualize, analyze and communicate archaeological data. This reliance on open-source and open-access is a political stance that emerges in opposition to archaeology’s past complicity in colonial enterprises and scholarship; digital archaeology resists the digital neo-colonialism of Google, Facebook, and similar tech giants that typically promote disciplinary silos and closed data repositories. Specifically, digital archaeology encourages innovative, reflective, and critical use of open access data and the development of digital tools that facilitate linkages and analysis across varied digital sources. 

To that end, the proposed ‘e-textbook’ is an integrated cloud-based digital exploratory laboratory of multiple cloud-computing tools with teaching materials that instructors will be able to use ‘out-of-the-box’ with a single click, or to remix as circumstances dictate. The Open Digital Archaeology Textbook Environment will be the first of its kind to address methods and practice in digital archaeology.

Part of our inspiration comes from the ‘DHBox’ project from CUNY (City University of New York, http://dhbox.org), a project that is creating a ‘digital humanities laboratory’ in the cloud. While the tools of the digital humanities are congruent with those of digital archaeology, they are typically configured to work with texts rather than the material culture in which archaeologists specialise. The second inspiration is the open-access guide ‘The Programming Historian’, which is a series of how-tos and tutorials (http://programminghistorian.org) pitched at historians confronting digital sources for the first time. A key challenge scholars face in carrying out novel digital analysis is how to install or configure software; each ‘Programming Historian’ tutorial therefore explains at length and in detail how to configure software. The present e-textbook merges the best of both approaches to create a singular experience for instructors and students: a one-click digital laboratory approach, where installation of materials is not an issue, and with carefully designed tutorials and lessons on theory and practice in digital archaeology.

The word ‘e-textbook’ will be used throughout this proposal to include both the integrated digital exploratory laboratory and the written texts that engage the student with it and the supporting materials. This digital infrastructure includes the source code for the exploratory laboratory, so that faculty or institutions may install it on their own servers, or they may use our hosted version. This accessibility is a key component because one instructor alone cannot be expected to provide technical support across multiple operating systems on student machines whilst still bringing the data, tools and methodologies together in a productive manner. Moreover, at present, students in archaeology do not necessarily have the appropriate computing resources or skill sets to install and manage the various kinds of server-side software that digital archaeology typically uses. Thus, all materials will be appropriately licensed for maximum re-use. Written material will be provided as source markdown-formatted text files (this allows for the widest interoperability across platforms and operating systems; see sections 9 and 10). By taking care of the digital infrastructure that supports learning, the e-textbook enables instructors and students to focus on core learning straight away.

At our e-textbook’s website, an instructor will click once to ‘spin up’ a digital laboratory accessible within any current web browser, a unique version of the laboratory for that class, at a unique URL. At that address, students will select the appropriate tools for the tasks explored in the written materials. Thus, valuable class time is directed towards learning and experimenting with the material rather than installing or configuring software.

The e-textbook materials will be pitched at an intermediate level; appropriate remixing of the materials with other open-access materials on the web will allow the instructor to increase or decrease the learning level as appropriate. Its exercises and materials will be mapped to a typical one-semester time frame.

Rationale

Digital archaeology sits at the intersection of the computational analysis of human heritage and material culture, and rapidly developing ecosystems of new media technologies. Very few universities in Ontario have digital archaeologists as faculty, and thus digital archaeology courses are rarely offered as part of their roster. Of the ten universities in Ontario that offer substantial undergraduate and graduate programs in archaeology (see http://www.ontarioarchaeology.on.ca/archaeology-programs), only three (Western, Ryerson and Carleton) currently offer training in digital methods. Training in digital archaeology is offered on a per-project level, most often in the context of Museum Studies, History, or Digital Media programs. Yet growing numbers of students demand these skills, often seeking out international graduate programs in digital archaeology. This e-textbook would therefore be a valuable resource for this growing field, while simultaneously building on Ontario’s leadership in online learning and Open Educational Resources. Moreover, the data and informatics skills that students could learn via this e-textbook, as well as the theoretical and historiographical grounding for those skills, are in high and growing demand, which means that this e-textbook could find utility beyond the anthropology, archaeology, and cultural heritage sectors.

Our e-textbook would arrive at an opportune moment to make Ontario a leading centre for digital archaeological education. Recently, the provincial government has made vast public investment in archaeology by creating ‘Sustainable Archaeology’ (http://sustainablearchaeology.org/), a physical repository of Ontario’s archaeological materials and centre for research. While growing amounts of digitized archaeological materials are being made available online via data publishers such as Open Context (http://opencontext.org), and repositories such as tDAR (https://www.tdar.org), DINAA (http://ux.opencontext.org/archaeology-site-data/dinaa-overview/) and ADS (http://archaeologydataservice.ac.uk), materials for teaching digital archaeology have not kept pace with the sources now available for study (and print-only materials go out of date extremely quickly). Put simply, once archaeological material is online, we face the question of “so what?” and “now what?” This e-textbook is about data mining the archaeological database, reading distantly thousands of ‘documents’ at once, graphing, mapping, visualizing what we find and working out how best to communicate those findings. It is about writing archaeology in digital media that are primarily visual media. Thus, through the e-textbook, students will learn how to collect and curate open data, how to visualize meaningful patterns within digital archaeological data, and how to analyze them.

Furthermore, this e-textbook has two social goals:

  1. It agitates for students to take control of their own digital identity, and to think critically about digital data, tools and methods. This, in turn, can enable them to embody open access principles of research and communication.
  2. It promotes the creation, use and re-use of digital archaeological data in meaningful ways that deepen our understanding of past human societies.

Research materials that are online do not speak for themselves, nor are they necessarily findable or ‘democratized’. To truly make access democratic, we must equip scholars with “digital literacy” — the relevant skills and theoretical perspectives that enable critical thinking. These aims are at the heart of the liberal arts curriculum. We know that digital tools are often repurposed from commercial services and set to work for research ends in the social sciences and liberal arts. We are well aware that digital tools inherently emphasize particular aspects of data, making some more important than others. Therefore, it is essential that students think critically about the digital tools they employ. What are the unintended consequences of working with these tools? There is a relative dearth of expertise in critically assessing digital tools, and in seeing how their biases (often literally encoded in how they work) can impact the practice of archaeology.

To that end, we employ a student-centred, experiential, and outcome-based pedagogy, where students develop their own personal learning environment (via remixing our tools and materials provided through the laboratory) networked with their peers, their course professors, and the wider digital community.

Content Map

E-textbook Structure (instructional materials to support the digital exploratory laboratory)

The individual pieces (files and documents) of this e-textbook will all be made available using the distributed version control software Git (via GitHub). This granularity of control will enable interested individuals to take the project to pieces to reuse or remix those elements that make the most sense for their own practice. Since the writing is in the markdown text format, learners can create EPubs, PDFs, and webpages on demand as necessary, which facilitates easy reuse, remixing, and adaptation of the content. The granularity of control also has the added bonus that our readers/users can make their own suggestions for improvement of our code and writing, which we can then fold into our project easily. In this fashion our e-textbook becomes a living document that grows with its use and readership.

Introduction. Why Digital Archaeology?

Part One: Going Digital

  1. Project management basics
    1. Github & Version control
    2. Failing Productively
    3. Open Notebook Research & Scholarly Communication
  2. Introduction to Digital Libraries, Archives & Repositories
    1. Command Line Methods for Working with APIs
    2. Working with Open Context
    3. Working with Omeka
    4. Working with tDAR
    5. Working with ADS
  3. The Ethics of Big Data in Archaeology

The digital laboratory elements in this part enable the student to explore version control, a bash shell for command line interactions, and an Omeka installation.

Part Two: Making Data Useful

  1. Designing Data Collection
  2. Cleaning Data with OpenRefine
  3. Linked Open Data and Data publishing

The digital laboratory elements in this part continue to use the bash shell, as well as OpenRefine.

Part Three: Finding and Communicating the Compelling Story

  1. Statistical Computing with R and Python Notebooks; Reproducible code
  2. D3, Processing, and Data Driven Documents
  3. Storytelling and the Archaeological CMS: Omeka, Kora
  4. Web Mapping with Leaflet
  5. Place-based Interpretation with Locative Augmented Reality
  6. Archaeogaming and Virtual Archaeology
  7. Social media as Public Engagement & Scholarly Communication in Archaeology

The digital laboratory elements in this part include the bash shell, Omeka (with the Neatline mapping installation) and Kora installations, mapwarper, RStudio Server, Jupyter notebooks (python), Meshlab, and Blender.

Part Four: Eliding the Digital and the Physical

  1. 3D Photogrammetry & Structure from Motion
  2. 3D Printing, the Internet of Things and “Maker” Archaeology
  3. Artificial Intelligence in Digital Archaeology (agent models; machine learning for image captioning and other classificatory tasks)

The digital laboratory elements in this part include Wu’s Visual Structure from Motion package, and the TORCH-RNN machine learning package.

Part Five: Digital Archaeology’s Place in the World

  1. Marketing Digital Archaeology
  2. Sustainability & Power in Digital Archaeology

To reiterate, the digital laboratory portion of the e-textbook will contain within it a file manager; a bash shell for command line utilities (useful tools for working with CSV and JSON formatted data); a Jupyter Notebook installation; an RStudio installation; VSFM structure-from-motion; Meshlab; Omeka with Neatline; Jekyll; Mapwarper; Torch for machine learning and image classification. Other packages may be added as the work progresses. The digital laboratory will itself run on a Linux Ubuntu virtual machine. All necessary dependencies and packages will be installed and properly configured. The digital laboratory may be used from our website, or an instructor may choose to install locally. Detailed instructions will be provided for both options.

Standard
archaeology, making

Cacophony: Bad Algorithmic Music to Muse To

I was going to release this as an actual album, but I looked into the costs and it was a wee bit too pricey. So instead, let’s pretend this post is shiny vinyl, and you’re about to read the liner notes and have a listen.

~o0o~

Track 1. Listening to Watling Street

To hear the discordant notes as well as the pleasing ones, and to use these to understand something of the unseen experience of the Roman world: that is my goal. Space in the Roman world was most often represented as a series of places-that-come-next; travelling along these two-dimensional paths replete with meanings was a sequence of views, sounds, and associations. In Listening to Watling Street, I take the simple counts of numbers of epigraphs in the Inscriptions of Roman Britain website discovered in the modern districts that correspond with the Antonine Itinerary along Watling Street. I compare these to the total number of inscriptions for that county. The algorithm then selects instruments, tones, and durations according to a kind of auction based on my counts, and stitches them into a song. As we listen to this song, we hear crescendos and diminuendos that reflect a kind of place-based shouting: here are the places that are advertising their Romanness, that have an expectation to be heard (Roman inscriptions quite literally speak to the reader); as Western listeners, we have also learned to interpret such musical dynamics as implying movement (emotional, physical) or importance. The same itinerary can then be repeated using different base data – coins from the Portable Antiquities Scheme database, for instance – to generate a new tonal poem that speaks to the economic world, and, perhaps, the insecurity of that world (for why else would one bury coins?).
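
The core of that mapping, where heavier epigraphic counts become more prominent notes, can be sketched in a few lines of Python. The place names, counts, and pitch range here are invented for illustration, and this is only a sketch of the idea, not the code behind the track:

```python
# Sketch: map per-place inscription counts to MIDI-style pitches.
# Place names and counts below are invented for illustration.
counts = {"Londinium": 120, "Verulamium": 45, "Viroconium": 80}
total = sum(counts.values())

PITCH_LO, PITCH_HI = 48, 84  # C3..C6, an arbitrary playable range

def to_pitch(count, max_count):
    """A denser epigraphic habit maps to a higher pitch."""
    return PITCH_LO + round((count / max_count) * (PITCH_HI - PITCH_LO))

max_count = max(counts.values())
# (place, pitch, relative 'volume' as share of all inscriptions)
song = [(place, to_pitch(n, max_count), n / total)
        for place, n in counts.items()]
```

Durations and instrument choices would be auctioned off from the same counts; the principle is the same rescaling.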

Code: This song re-uses Brian Foo’s 2 Trains code.

~o0o~

Track 2: Mancis the Poet. (Original blog post.) A neural network trained on Cape Breton fiddle tunes in ABC notation; the output then sonified. This site converts ABC notation to a MIDI file and makes a pdf of the score; this site converts to mp3, which I then uploaded to SoundCloud.

~o0o~

Track 3: John Adams’ 20 (original post here). Topic modeling does some rather magical things. It imposes sense (it fits a model) onto a body of text. The topics that the model duly provides give us insight into the semantic patterns latent within the text… What I like about sonification is that the time dimension becomes a significant element in how the data is represented, and how the data is experienced. So – let’s take a body of text, in this case the diaries of John Adams. I scraped these, one line per diary entry (see this csv we prepped for our book, the Macroscope). I imported them into R and topic modeled for 20 topics. The output is a monstrous csv showing the proportion each topic contributes to each diary entry (so each row adds to 1). If you use conditional formatting in Excel, and dial the decimal places to 2, you get a pretty good visual of which topics are the major ones in any given entry (and the really minor ones just round to 0.00, so you can ignore them). I then used ‘Musical Algorithms‘ one column at a time to generate a midi file. I’ve got the various settings in a notebook at home; I’ll update this post with them later. I then uploaded each midi file (all twenty) into GarageBand in order of their complexity – that is, as indicated by file size.
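
The per-column step amounts to rescaling one topic’s proportions into a pitch range. A minimal sketch (the proportions here are invented, and a simple linear rescaling is my assumption about what ‘Musical Algorithms’ does with its defaults):

```python
# Sketch: turn one topic's proportions over the diary entries into pitches.
# The real input is one column of the 20-topic output, one row per entry
# (rows sum to 1 across topics); these values are invented.
topic_column = [0.02, 0.35, 0.71, 0.10, 0.55]

def scale_to_pitches(values, lo=40, hi=80):
    """Linearly rescale a column of proportions into a MIDI pitch range."""
    vmin, vmax = min(values), max(values)
    span = (vmax - vmin) or 1.0  # avoid dividing by zero on a flat column
    return [lo + round((v - vmin) / span * (hi - lo)) for v in values]

pitches = scale_to_pitches(topic_column)
```

Running this once per topic column yields the twenty voices that went into GarageBand.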

~o0o~

Track 4: Jesuit Funk

The topic modeling approach seemed promising. I took the English translation of the complete Jesuit Relations, fitted a topic model, and then set about sonifying it. This time, I explored the live-coding music environment Sonic Pi, but focussed on one topic only.

Code: https://gist.github.com/shawngraham/7ea86a33471acaaa5063

~o0o~

Track 5: PunctMoodie (original blog post here). There was a fashion, for a time, to create posters of various famous literary works as represented only by their patterns of punctuation. I used this code to reduce Susanna Moodie’s ‘Roughing it in the Bush‘ to its punctuation. Then I mapped the punctuation to its numeric ascii values, and fed the result into Sonic Pi.
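
The reduction itself is only a couple of lines; here is a sketch with a stand-in sentence (not Moodie’s actual text):

```python
import string

# Sketch: reduce a text to its punctuation, then map each mark to its
# ASCII code -- the numbers that were fed to Sonic Pi as pitches.
text = "Roughing it, in the bush; or, life in Canada!"

punct = [ch for ch in text if ch in string.punctuation]
pitches = [ord(ch) for ch in punct]
# punct   -> [',', ';', ',', '!']
# pitches -> [44, 59, 44, 33]
```

The commas and semicolons of a whole novel, strung out this way, give Sonic Pi a surprisingly structured line to play.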

Bonus track! Disco version:

~o0o~

Track 6: Human Bone Song. I have scraped several thousand photos from Instagram for a study on the trade in human remains facilitated by social media. I ran a selection (the first 1000) through Manovich’s ImagePlot; first voice is brightness, second is saturation, third (drums) is hue. Sonified with musicalgorithms.org.
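
The three voices can be sketched roughly like this. The pixel values are invented, and ImagePlot actually reports its measurements per image rather than per pixel batch, so treat this purely as an illustration of the mapping:

```python
import colorsys

# Sketch: derive the three 'voices' (brightness, saturation, hue) from a
# batch of RGB pixels. Invented values; not ImagePlot's own code.
pixels = [(200, 30, 30), (20, 180, 40), (250, 250, 240)]  # RGB, 0-255

def voices(rgb_pixels):
    """Return mean (brightness, saturation, hue), each in 0..1."""
    hsv = [colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
           for r, g, b in rgb_pixels]
    n = len(hsv)
    hue = sum(h for h, s, v in hsv) / n
    sat = sum(s for h, s, v in hsv) / n
    bright = sum(v for h, s, v in hsv) / n
    return bright, sat, hue

brightness, saturation, hue = voices(pixels)
```

Each of the three series then goes through the sonifier separately, one instrument per series.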

~o0o~

Track 7: Song of Dust and Ashes. Same rough procedure as before, but with site photos from the Kenan Tepe excavations archived in Open Context (see http://smgprojects.github.io/imageplot-opencontext/ ). Sonified via musicalgorithms.org, mixed in GarageBand. First pass.

~o0o~

Track 8: Kenentepe Colours – same data as track 7, but I used a Fibonacci series via musicalgorithms.org to perform the duration mapping. Everything else was via the presets. Instrumentation via GarageBand.
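
The duration pool, at least, is easy to sketch. Which Fibonacci values the site actually uses, and in what units, I didn’t record, so the six values here are purely illustrative:

```python
# Sketch: a Fibonacci series as the pool of note durations for the
# duration mapping. Length and units are illustrative assumptions.
def fib(n):
    seq, a, b = [], 1, 1
    for _ in range(n):
        seq.append(a)
        a, b = b, a + b
    return seq

durations = fib(6)  # data values get mapped onto these beat lengths
```

The effect is that longer notes become progressively rarer, which gives the track its lopsided rhythm.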

~o0o~

Track 9: Bad Equity. I wondered if sonification could be useful in detecting patterns in bad OCR of historical newspapers. I grabbed 1500 editions of The Shawville Equity, 1883-1914 (http://theequity.ca), using wget, from the provincial archives of Quebec. Then I measured the number of special characters in each OCR’d txt file, taking the presence of these as a proxy for bad OCR, and added them up for each file. Then to musicalgorithms, defaults. Then, because I’m a frustrated musician (and a poor one, at that), I threw a beat on it for the sake of interest. Read the full discussion and code over here.
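
The counting step looks roughly like this in Python. The regex’s notion of ‘special’ (anything outside letters, digits, whitespace, and common punctuation) and the sample pages are my assumptions, not the exact code from the post:

```python
import re

# Sketch: count 'special' characters in an OCR'd page as a proxy
# for bad OCR. Sample text is invented, not real Equity output.
def badness(text):
    return len(re.findall(r"[^A-Za-z0-9\s.,;:!?'()-]", text))

pages = {
    "1883-07-05": "Thé Shawv¡lle Equ¡ty ~ №1",   # mangled OCR
    "1883-07-12": "The Shawville Equity",          # clean OCR
}
scores = {edition: badness(txt) for edition, txt in pages.items()}
```

The per-edition scores, in date order, are what went off to musicalgorithms.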

Track 10: Lullaby

I wrote this one for my kids. It’s not algorithmic in any way. That’s me playing.

(featured image, British Library https://www.flickr.com/photos/britishlibrary/11147476496 )

Standard
archaeology, environments, making

The OpenContext & Carleton Prize for Archaeological Data Visualization

We are pleased to announce that the 1st OpenContext & Carleton University Data Visualization Prize has been awarded to the ‘Poggio Civitate VR Data Viewer’, created by the team led by Russell Alleen-Willems.

The team hacked this data viewer together over a weekend as a proof of concept. In the typical spirit of the digital humanities and digital archaeology, they developed a playful approach, exploring the materials using the potential of the HTC Vive SDK to ingest Open Context data as JSON and then place it into a relative 3D space. We particularly appreciated their candour and self-assessment of what worked and what didn’t about their project, and their plans for the future. We look forward to seeing their work progress, and hope that this prize will help them move forward. Please explore their project at https://vrcheology.github.io/ .

Congratulations to the team, and thank you to all who participated. Please keep your eyes peeled for next year’s edition of the prize!

The team members are:

  • Russell Alleen-Willems (Archaeology domain knowledge, Unity/C# Scripting)
  • Mader Bradley (JSON Data Parsing/Unity Scripting)
  • Jeanpierre Chery (UX Design, Unity/C# Scripting)
  • Blair Lyons (Unity/C# Scripting)
  • Aileen McGraw (Instructional Design and Program Storytelling)
  • Tania Pavlisak (3D modeler)
  • Jami Schwarzwalder (Git Management, Team Organization, and Social Media)
  • Paul Schwarzwalder (Unity/C# Scripting)
  • Stephen Silver (Background Music)
Standard
archaeology

Text re-use in Instagram posts selling human remains

Lincoln Mullen has a wonderful R package on rOpenSci for detecting and measuring text reuse in a corpus of material (the kind of thing that is enormously useful if you’re interested in 19th-century print culture, for instance). I wondered to myself what I would find if I fed it the corpus of material I’ve collected (see this gist) concerning the trade in human remains on Instagram. (It looks for ngrams 5 words long, which means that I end up looking at 3k posts from my initial corpus of 13k.) We’re writing all of this up for submission shortly, so this text reuse analysis isn’t in our paper yet; but anyway, a preview…

A score of ‘1’ indicates a perfect match. After running my materials through, I found many posts scoring 1. I thought, hmm, probably an error? Or perhaps duplicate entries had found their way into my corpus? But after hand-checking several, I realized, no, the image is always different. So that’s interesting: people selling this material use the same language time and time again. Let’s consider some of it. We’ll start with this post:

  • Real human skull for sale, message me for more info. #skull #skulls #skullforsale #humanskull #humanskullforsale #realhumanskull #realhumanskullforsale #curio #curiosity

A post that scored 1 for similarity has the exact same text but a vastly different photograph. I’m not going to link to the photos or posts here because I don’t want to encourage this. A post at .9375 similarity has one extra hashtag appended to the text (and of course, a different photo):

  • Real human skull for sale, message me for more info. #skull #skulls #skullforsale #humanskull #humanskullforsale #realhumanskull #realhumanskullforsale #curio #curiosity #dead

We continue on like this until we’re at around .5 for our score:

  • Skull and arm £400 for the pair. One of the fingers on the hand is missing its tip and the whole arm needs glue removing and tidying up a bit. Real human skull for sale, message me for more info. #skull #skulls #skullforsale #humanskull #humanskullforsale #realhumanskull #realhumanskullforsale #curio #curiosity #dead

These posts are all by the same individual. That one phrase, ‘Real human skull for sale, message me for more info’, and that sequence of hashtags, is as good an identifier for this individual as any username, I’m thinking. I’m still going through these results, but the thought occurs that perhaps I might find *different* users using very similar language. If I found that, it would be very interesting indeed – a sign of influence between users? A sign of community? A kind of shibboleth, a marker of belonging?
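
For the curious, the scoring behaves roughly like this: posts are split into 5-word shingles and compared by Jaccard similarity, which is the kind of score the textreuse package reports. A minimal Python sketch, with shortened stand-in posts rather than real listings:

```python
def shingles(text, n=5):
    """Split a post into word n-grams (n=5, matching the analysis above)."""
    words = text.lower().split()
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}

def jaccard(a, b, n=5):
    """Jaccard similarity of two posts' shingle sets: 1.0 is a perfect match."""
    sa, sb = shingles(a, n), shingles(b, n)
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

post_a = "real human skull for sale message me for more info"
post_b = "real human skull for sale message me for more info #dead"
score = jaccard(post_a, post_b)  # high, but below 1: one extra hashtag
```

This is also why the 5-gram minimum silently drops short posts from the corpus: a post of fewer than five words has no shingles at all.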

Other implications?

Standard
archaeology, digital history, networks, simulation

Workshop on Networks & Simulation for the Humanities – Nov 9, Discovery Centre Macodrum Library

Carleton University, Ottawa, Macodrum Library Discovery Centre RM 481, 11 – 2

Understanding the complexity of past and present societies is a challenge across the humanities. Simulation and network science provide computational tools for confronting these problems. This workshop will provide a hands-on introduction to two popular techniques, agent-based modeling and social network analysis. The workshop has been designed with humanities students in mind, so no prior computer experience is required.

The workshop is led by Tom Brughmans and Iza Romanowska of the University of Konstanz and the University of Southampton, two leading digital archaeologists. Brughmans is co-editor of the recent volume ‘The Connected Past: Challenges to Network Studies in Archaeology and History‘, published by Oxford University Press. Romanowska edits the scholarly blog ‘Simulating Complexity‘ and is a Fellow of the Software Sustainability Institute, where she promotes the use of computational methods in the humanities.

Please Pre-REGISTER at http://bit.ly/nov9-workshop

Standard