Rolling out DHBox for HIST3814

First of all, the support of Steve Zweibel of CUNY and Andrew Pullin from Carleton’s Computer Science department has been above and beyond the call of duty, and I owe them a case (if not several) of beer.

So below I offer some observations on rolling out our own DHBox at Carleton for my HIST3814o Crafting Digital History students’ use. If you’ll recall, our Open Digital Archaeology Textbook Environment (ODATE) proposes to roll a digital archaeology textbook into a customized DHBox. I took advantage of HIST3814o to dry-run DHBox for ODATE and also to solve some of the problems of a distance, online course in Digital History so that I only had to provide tech support for one kind of computer (an Ubuntu box) rather than the myriad setups that my students might use.

So here we go:

  • I don’t think there’s a DHBox outside DHBox.org in a production setting, so there are a lot of unknowns on our end involved in doing what we’ve proposed to do:
    • How to make it work on a shared server
    • What kind of load it would experience, and hence, what kind of resources it would need
    • How to add more abilities to it
    • How to on-board readers/users
  • Thus, I rewrote my HIST3814o, Crafting Digital History, so that we could try to answer these questions in a limited sandbox environment (and one from which I could recover if things went wrong). Students wouldn’t have to use their own computers to compute but would instead log into our DHBox. This gave us a maximum number of students (60) from whose experience we can extrapolate the kinds of things ODATE might encounter in terms of computational load, ongoing maintenance, and issues in terms of Carleton’s wider internet security. NB The things that these digital history students do with the DHBox are computationally less intense than the things that the archaeology textbook users will do (text analysis versus 3D reconstruction from images, for example), and thus if anything we’re simply defining a minimal baseline.
  • We’ve discovered a number of things:
    • The amount of RAM and the number of CPU cores we need are very much higher than we anticipated. So far, we have had to upsize to 224 GB of space for the students, and increase RAM 4x and CPU processors 5x. Of course, I made a very poor guess at the outset.
    • While 60 students are enrolled, it seems that only 20 – 25 are fully active in the course as of yet (but it’s only been 1 week), and so we expect we’ll probably have to increase again.
    • Adding new capabilities to the underlying image of DHBox can break things in unexpected ways. It’s a Docker container being run in an OpenStack environment, so we’ve had to get au fait with Docker, too. Fortunately, we’ve been in close communication with Steve at CUNY and have been able to find solutions. I say ‘we’, but this is all Andrew and Steve. I am the cause of work in other people.
    • Extending the default existence of a user’s DHBox instance from 1 month to 2 months caused odd errors we haven’t fully sussed yet (and in any event, will have an effect on the size of the memory we need), so we rolled that back
    • We discovered that Carleton University’s antivirus provider was/is using deep-packet sniffing, which was preventing elements of the DHBox from working properly in the on-campus computer labs. ITS has now white-listed our server for the DHBox.
    • Because of Carleton security concerns, all users of the DHBox have to be logged into the Carleton VPN in order to use the DHBox. For ODATE, this will not be acceptable, and we’ll have to find a solution (which may mean a commercial provider, which will increase costs, although we did budget for these).
    • I think that’s everything for now, but I don’t have my notes handy

I intend to reveal the first section of the ODATE text this summer, so that people can try it out with their own classes this September. We (Neha, Michael, Beth, and I) are currently writing the section on the ethics of digital work in archaeology, which I believe is crucial to get right before letting people see our drafts. While I don’t think our computational environment – the official ODATE DHBox instance – will be up and running by then (that is, NOT the one I’m using for my HIST3814o class), individuals will be able to go to DHBox.org itself and try things out there OR run things on their own machines (if they have access to Ubuntu Linux computers).

Incidentally, HIST3814o is available online at http://craftingdigitalhistory.ca ; I run a fully open-access version of it for non-Carleton folks concurrently. For OA participants, I use a separate instance of Slack as a communication space, see https://electricarchaeology.ca/2017/07/03/crafting-digital-history-open-access-version-summer-2017/ . One of the OA participants has blogged about starting the course with me at https://infoliterati.com/2017/07/09/down-the-rabbit-hole-with-crafting-digital-history/. You’re welcome to have a look!

Featured image by Patrick Lindenberg