ODATE in Perpetual Beta

Digital archaeology is always in a state of becoming. How could it be otherwise? And so, I offer up to you what we’ve managed to put together for the Open Digital Archaeology Textbook Environment (ODATE) as a perpetual beta, never finished, always open to further refinement, expansion, pruning, growth. I hope that you will take and use it in the spirit in which it is offered – of collaboration, of working things through in public.

My ambition is for this text and its associated computational notebooks/binders to become the kernel from which many versions may grow. Use those parts that are useful for your teaching or learning. Recombine with other things you find to create a solution that fits your particular context. Fork a copy of the source on GitHub (and we also provide instructions on how to do that, and what it means) and improve it. Expand it. Make a pull request back to us to fold your changes into ODATE prime – we’ll update the author roll accordingly!

Some of the pieces are not quite complete yet, but I’d rather have this out in the world, growing, than sitting quietly on my machine waiting for the right word. The perfect is the enemy of the good, they say. We flag those pieces that could use some more work right away. You will find rough edges. Pick up some sandpaper, and join the fun.

~o0o~

ODATE as a whole may be found at http://o-date.github.io

The list of computational notebooks and binders is at https://o-date.github.io/support/notebooks-toc/

How to collaborate with us: https://o-date.github.io/support/contribute/

The perpetually in beta textbook itself: https://o-date.github.io/draft/book/

The source code for the textbook: https://github.com/o-date/draft/

The repository with all our code: https://github.com/o-date

Stand-alone off-line apps replicating the textbook, for the major operating systems: https://github.com/o-date/draft/releases

PDF version of the textbook: https://o-date.github.io/draft/book/odate.pdf

From Agent Models to Archaeogaming: A Digital Archaeology

I’ve been working away on a new book. I’m sharing with you now the current state of the introduction. It doesn’t quite hang together yet, and I need to stop with the whole zombie schtick (‘golems’ are a better metaphor), but anyway. Would you read this book? I need a better title, too.

~o0o~

This book is about, in a narrow sense, the ways in which I’ve reanimated Roman society using agent-based modelling and archaeogaming. But in a larger sense, it’s about digital enchantment in the ways that scholars like Sara Perry (2018), Russell Staiff (2014), and Yannis Hamilakis (2014) have written. It’s about responding to archaeology not as a crisis to be solved, but as a source of wonder. It’s about whether digital archaeology is fast or slow, whether it is engaging or alienating, whether or not it is sensory and sensual.

What are computers for, in archaeology?

The question might seem absurd. What is a pencil for? A shovel? A database? Our tools are only ever appropriate to particular situations. Not every moment on an excavation requires a mattock or a pail; a dental pick and a dustpan might be called for. By the same token, maybe we don’t always require a computer to achieve a digital archaeology. Maybe a smartphone is all we need. Maybe an iPad. Maybe we just need what Jentery Sayers (2018, elaborating on Kershenbaum 2009) calls ‘paper computers’.

The point is, if we stop simply accepting that a computer is always necessary, we can see again some of the enchantment these amazing devices possess, and we can begin to imagine again the kinds of questions they might be best suited to. There is any amount of criticism of computing, of digital archaeology, that focuses on the alienating aspects of the work. Caraher has argued that to use a computer as part of your process, whether in the field or in the lab, is to somehow be pushed away from the tacit and sensuous ways-of-knowing that characterize the doing of archaeology (2015).

Perhaps we are asking the wrong questions of these devices. For me, the use of computation in archaeology is a kind of magic, a way of heightening my archaeological imagination to see in ways I couldn’t. It lets me raise the dead (digital zombies?) with all the terror, wonder, and ethical problems that that implies. Shouldn’t we raise the dead? Why shouldn’t we put words in their mouths, give them voices, and talk with them to find out more about their (after)lives?

In this book, I’m making an argument that a slow, reflexive, sensual, enchanted engagement with the past is possible (even desirable) when we use digital computational approaches. That is not to say that it is not a rigorous approach. The first step in this approach is a clear formalism, a clear re-statement in code of what I believe to be true about the past. It has to be that way, because the fundamental action of the computer is to copy. Decisions we take in a computational medium are multiplied and accelerated, so those initial decisions can have unintended or unforeseen consequences when they are rendered computational.

Such formalisms also have to be rendered as relationships. Research on artificial neural networks demonstrates that meaning can emerge through cascades of coordinated firings of neurones through weighted channels, backwards and forwards. These weights do not need to be known beforehand, but can be learned as the network is exposed to stimuli. To my mind, this points to a way of computing the past that does not rely on higher-level equations that describe a social phenomenon, but rather one that lets interaction precede the equation. We set up the conditions for interactions, relationships, and networks to emerge. Understand that I am not arguing for a naive use of computing, letting answers percolate out. That is nonsense. Rather, I am arguing for finding the correct level of complexity to model, to put into a simulation. The first part of this book is a consideration of networks as a substrate; the second revivifies these networks, raising the dead through simulation.

These are games that play themselves, these simulations. Wouldn’t it be interesting to enter the game ourselves? This is part of the enchantment. In the third part of this book I discuss what it takes to make this happen, and what archaeogaming, chatbots, and other playful digital toys can offer to our research and, more importantly, to the audience for whom archaeology holds wonder. I weave throughout this book my engagement with what makes digital work sensuous and enchanting in the ways that Perry and Staiff describe. It is unapologetically a personal engagement.

~o0o~

Insofar as the actual archaeological data in this book and my computational engagements with them are concerned, I have collected together and edited some of my previously published papers that employ a variety of small thought experiments, agent-based models, and toys. The computational parts are tools-to-think-with, rather than things that will prove an hypothesis. They are arranged in a logic that reflects the way that I have come to think about Roman society, especially cities and the social life within them. It seems to me that Roman cities and societies can be thought of as nodes of entangled systems, as biological processes that smear across boundaries and scales, and whose actions can be modeled upon those entanglements. With video game technologies, we can insert the researcher/student/public into the model for deeper learning and engagement: a first-person perspective. Not, I should hasten to add, a Roman perspective. Rather, a deformation of the just-so stories we tell about the past with the authority provided by a disembodied narration. If there is truth in the stories we tell, then there is truth in the embodied perspective provided by a computational rendering of that story.

I have done my best to excise that part of me that writes in impenetrable archaeo-jargon. Forgive me my failures. I write this book not so much for an academic audience invested heavily in modelling and simulation, but rather for my history students afraid to engage with digital work. It is when things break, in the cleavages, that we see most clearly the problems and potentials of technology, and so failure is a necessary part of the process.

The book shifts scales quite often. It begins with a focus on the flows of energy and materials necessary to sustain the exoskeleton of the City, its built fabric. We then expand outwards to consider the fossilized traces of the social networks that enabled that flow. Once we have a network, we consider ways in which the equifinality of networks can be used to iterate our deformations, our perspectives, and so the kinds of questions we might ask. Now that we are at a regional level, the next chapter considers a model of regional space, its interactions, and the ways local interactions give rise to global structures. The remainder of the book deals with ways we can use these simulations, and these archaeological networks, for generating insight into the social contexts of Roman power. The book returns to where we started, with the city, and concludes with new work exploring the ways that the city-builder genre conditions our understanding of ancient cities, and how we might subvert, divert, and repurpose such games to our own ends.

These particular case studies are wrapped in a larger argument about the proper role of computation in archaeology. In the end, I do not subscribe to a techno-chauvinism that sees digital responses as the obvious end-goal for archaeology, nor a techno-utopianism that describes what ought to be (cf. Broussard 2018). Rather, I see space for a creative engagement with digital tools that opens up a landscape, a taskscape, for returning some enchantment to what we do.

~o0o~

My first encounter with ‘real’ archaeology was as an 18-year-old college student on his first real adventure out of the country (out of the back woods, in truth). We were working (paying to work) on an excavation in the Peloponnesus, in the hinterland of Corinth. In the bottom of the high mountain valley of Zaraka you will find Lake Stymphalos, where Hercules defeated the Stymphalian Birds. Not much of note happened in this valley; the Romans marched through on their way to annihilating Corinth in 146 BCE; the Crusaders of the Fourth Crusade built a monastery. During the Second World War and the subsequent Greek Civil War, bitter battles were fought for control of the area. Sometime in the 15th century a person was buried with their head lopped off, for future archaeologists to find, and to feed stories of Balkan vampires; but that’s about it.

My trench? My trench was full of bricks. The trench next to mine? That was the trench with the vampire in it.

Fast forward a few years, and I’m now in Rome, hot on the trail of aqueduct remains across the Roman countryside on a Vespa scooter. Thomas Ashby and Esther van Deman had done this during the interwar years (without the Vespa), but Rome and its countryside were a very different place, then. Armed with copious photocopies, a dog-eared copy of Trevor Hodge’s Roman Aqueducts and Water Supply, and a military topographic map (thirty years out of date), I zoomed down the lanes and byways and industrial estates on the modern periphery of Rome. When I found some ruins, I tried to correlate what I found with the descriptions in Ashby and van Deman. I measured, I photographed, and I drew. The point of these exertions was a massive Excel database that used my basic understanding of the geometry of solids (is it pi-r-squared, or half the width times the height, or…) to build a beautiful mathematical model of the finished aqueduct. I spent three months pulling this model apart to figure out the quantities of human labour and materials needed to make this structure. Back on the road, to double-check, to find the missing pieces… a glorious summer of roadside picnics, coffees in truck stops, shepherd dogs chasing me from the fields, climbing down into ravines or up onto brick-lined vaults.

A few years later, and it’s just me staring at a storage shed full of bricks. Roman bricks are heavy. They are large, and they are thick. They litter the fields of Italy. When they are collected, it is sometimes to take a geochemical peek at their composition. Where might the clays come from? More often, it is because they contain very complex makers’ marks, these bricks from near Rome. They tell you a year, an estate, a brick maker, a landlord. They remind me a lot of how marks on timber floated down the Ottawa River were used by the timber barons to keep records straight, for paying for the use of timber slides, for working out who owned what. I find them interesting, but in self-defence against the teasing I receive – hey, brickstamp boy! – I play up the boring bit. Hell, we’re archaeologists, we can’t always excavate vampires, right?

Vampires.

Raising the dead.

Hmmmm….

It’s about this point where I first encounter the idea of ‘social networks’ – a full decade before Facebook – and I start to wonder what I might see if I tie these estate owners, estate names, brick makers, makers’ marks and so on together.

In the blue glow of the cathode-ray monitor, the tangled hairball of connections starts to emerge and I begin to see changing patterns over time, patterns that begin to give life to these long dead workers….

~o0o~

This is a book about the practical magic – the practical necromancy? – that digital archaeology brings to the larger field. To use computers in the course of doing archaeological research does not a digital archaeology make. Digital archaeology requires enchantment. When we are using computers, the computer is not a passive tool. It is an active agent in its own right. The way it is built and the way its code is designed carry so many elements of unconscious bias from its myriad creators (and blood: do not forget how much actual human blood is shed to obtain the rare earths and minerals upon which computing rests [reference to that alexa AI map]) that the computer is our co-creator. In a video game, the experience of the player is not the result of a passive reception of a representation made by the game’s author. The player’s active engagement with the emergent representation – the rules put in motion by the author but interpreted in the context of the local game environment – means that the meaning of the game is the product of three authors: the author, the machine, and the player. We can see this in video games, but it’s not always clear that this is also true of, say, GIS or 3D photogrammetry.

In that emergent dynamic, in that co-creation with a non-human but active agent, we might find the enchantment, the magic, that is currently lacking in archaeology. Sara Perry identifies this lack of magic, this lack of enchantment, in the ‘crisis’ model of archaeology that animates our teaching, our research, and our public outreach. If archaeology is always in danger, then every act of archaeology is an act of rescue, and every act of rescue implies a morality play, a this-is-good-for-you aesthetic to which the public should respond appropriately.

Is it any wonder that the History Channel is filled with ancient aliens nonsense rather than ‘proper’ documentaries?  [Brenna Haslett on ghost hunters?]

Archaeology – academic archaeology – has lost its grip on wonder and enchantment and romance. This is not a plea to sanitize the past, or to pander to tired tropes (but remember: most of those tropes were created by archaeologists who went out of their way to communicate their research to the public. It is not their fault that subsequent archaeologists turned their backs on the public and let those tropes fester). It is a plea to find the magic and wonder in what we do. [St george and the vampire?]

And so I offer this book, a guide to practical necromancy, in that spirit. By pulling together the connective threads of nearly twenty years of work in simulation, agent modelling, video games, and Roman economic history, I want to map out a way for digital archaeology to connect with what Andrew Reinhard has identified as ‘archaeogaming’: if I take the fossils of a Roman social network, and reanimate them with autonomous software agents, just what kind of digital archaeology have I created? What other kinds are out there?

The Making of FORVM: Trade Empires of Rome

Almost three years ago, Tom Brughmans sent me an email to see if I’d be interested in some kind of academic exchange. At the time, he was at the University of Konstanz, and there was a state–province level exchange program that we could apply to. ‘C’mon over!’ said I, and soon Tom, Iza Romanowska, and their two wee babies arrived in Ottawa.

This began one of the most productive partnerships I’ve ever enjoyed. The plan was initially to do something digital – Tom and Iza are amongst the most accomplished simulationists and digital archaeologists out there – but plans soon changed. ‘How about a board game?’ said Tom, and I was sold. (update: Tom says it was me who suggested the game! Funny thing, memory)

We began by looking at the collection of board games in MacOdrum Library. What games did we like? Why did we like them? What problem space (as Jeremiah McCall terms it) do they address, and how? What is the key issue in our own research that a board game could address? I have a giant whiteboard in my office, and we started sketching these ideas out. Tom and I have both written and created simulations of Roman economics, and we have both explored Roman archaeology from a network perspective, so it made sense to us to use these experiences as points of departure.

Earlier in the year, I had also participated in the Interactive Pasts conference, giving a paper on agent simulation as ‘games that play themselves’. This got me thinking about the differences between agent models, video games, and board games, and we started thinking about board games as ‘analog simulations’ that encourage modding, tinkering, and ‘house rules’. That is, unlike a video game, which makes you perform its creators’ ideas about how the world-space works (ideas that are thus very hard – but not impossible – to see or contest), an analog game/simulation invites reflection on its rules and system. Thus we set out to make a game that reflects our perspective on the importance of network dynamics and information asymmetry in the Roman world, but that also invites its players to reflect on, and perhaps alter or mod, those rules for their own purposes.

A board game!

We started sketching out a flow-chart of how the game would progress as if it were a Netlogo simulation. Somewhere in my office I know I still have the giant sheets of paper on which we scrawled pseudo-code, replete with crossing outs, multiple hands and colours, as we worked out how the game should be played. We used a networked representation of connectivity in the Roman world that Tom whipped up in Visone, against a map of the Mediterranean, to start building our board. Eventually, we ended up with this:

Original Board for FORVM

And we started playing. And replaying. And modifying. And playing. And fixing. I dragooned one of the MA students (Hi Elise!) to be our fourth player. And we played. And fixed. And played some more. Eventually, Tom, Iza, and the boys had to go back home. Over the next two years, we kept fiddling with the game, and Tom’s extended network of friends and family playtested and continued to refine the game. At this point, we commissioned the brilliant artist Ian Kirkpatrick to produce the artwork for the game, and to turn our board above into this:

Once we had the game manufactured, we tried to have a copy sent to Tom and Iza so that we could reveal it and play it at a workshop Tom put on in Oxford in early October… but alas, that copy is somewhere in Spain, ping-ponging between different postal sorting offices, or slid down behind a radiator somewhere. A copy did make its way to me in Ottawa, and I asked my colleague Marc Saurette (who does an amazing semester-long seminar where the students role-play medieval politics) to play-test it one last time with his students.

A slightly-blurry shot of the game in action

With a few tweaks in place, we are delighted to announce that the game is now available for purchase! It’s manufactured in the United States. If you’re ordering from a non-US location, make sure to select the international tracking option lest your copy go missing in the postal system too. The game has its own website at http://www.forvm.ca/  but you can purchase direct from the manufacturer at TheGameCrafter.com. We’re not making money on this; it’s all at cost, an exercise in getting our research knowledge out into the public sphere.

Would you like to play a game?

 

Award for Outstanding Work in Digital Archaeology – ODATE

I was pleased to find the following note in my email Thursday last from the AIA….

Previous winners of the award may be found here. Speaking for everyone on the ODATE team, we are honoured to join their company! Earlier projects that have been honoured are becoming part of the entire ecosystem of digital archaeology infrastructure, and I’m pleased that our part aimed at the teaching side of that balance has been recognized. Part of our digital pedagogy uses reproducible computational notebooks that integrate data, code, and analysis. My ambition is that through ODATE we normalize and regularize this kind of reproducible research in archaeology more generally.

I’d also like to thank my collaborators and co-writers on this project who have put up with me these past two years – Neha Gupta, Michael Carter, and Beth Compton. When the going got rough, other folks jumped in to help us complete the work – Jolene Smith, Andreas Angourakis, Andrew Reinhard, Lorna Richardson, Kate Ellenberger, Zack Batist, Joel Rivard, Ben Marwick, and Rob Blades. These folks come from all walks of archaeological life, from the library to the lecture hall, from grad school to professional archaeology. They are all wonderful scholars!

So this award is shared across a community of practice: thank you all.

I should note that this project was funded by eCampusOntario, ‘the online hub for learners and educators across Ontario’, and I’m grateful to them, and the EDC at Carleton, for supporting this somewhat different approach to what an online textbook could be.

…oh, and ODATE itself? Well, the URL for it is out there, in the aether; we’re still trying to sand off some of the rougher corners, fill in some of the bits and pieces. You can find it easily enough (ah well, here it is), but know that the official ‘ta da!’ is coming. But here are all of the computational notebooks that you can run in your browser, right now.

Thank you everyone at ODATE for coming along on this adventure, and thank you AIA!


#Archink

Katherine Cook organized an archaeology themed edition of the wider #inktober challenge (draw something, every day, for the entire month). Her prompts:

#Inktober #archink prompts

I cannot draw. But I can trace. My hands shake a lot when I try to concentrate on doing fine work. (And yet, when I play piano, they don’t. Go figure). Below are my efforts at #archink. I didn’t hit every day, but I did get a lot of them.

Fragments

It’s been a long month.

Then I was on a plane, so I tried tracing a photo I took in the Museum of London:

 

Interprets

 

Builds

 

Sites

And then I gave Google Storyboard a try, on a video about the University of Reading excavations at Roman Silchester:

…when you’re using the app, you can reload it for different layouts and effects, with the video stills transformed into sketches of different styles, etc.

 

Conserves

 

Conceals

 

Holds

 

Moves

 

Writes

 

Theorizes

Treasures

 

Colonizes

 

Decolonizes

 

Narrates

 

Collects

 

It’s been an interesting challenge. I think if I kept it up, I might eventually learn how to actually draw something ex novo. But until then, there is some satisfaction in remediating, tracing, things I find.

A Quick Note on HackMD for Collaborative Notetaking in Class

I’ve long been interested in collaborative notetaking in class as a way of making presence in class more meaningful. In my imagination, collaboratively written notes from class discussions and exercises intersect with other kinds of notes (Hypothes.is for reading, for instance, or Zotero for bibliography) to make a sort of super-Zettelkasten.

In class this term (‘Bad Archaeology’) I’m framing discussion as a series of unconferences. As part of that, I’m also making any notes that I scribble together available to the students via HackMD.io. HackMD also has a nice feature that integrates with Reveal.js so that I can quickly spin out a slide deck from a bit of markdown in a new note, like so:

---
title: Slidedeck Sept 4 Getting Started
slideOptions:
  transition: fade
  theme: night
---

## Sept 4 Getting Started

---

![an image](url to the image)

Note:
Speaking notes hide here; not visible in slide mode but visible in edit

---

and so on

This really fits well with my existing flow. You can create a ‘book’ by making a note with a list of links, then hitting ‘book mode’. The page that loads up will use your note as a table of contents on the left of the page, and the contents from the first linked page as the default first page:

Screenshot from my HackMD notebook

I’m imagining my students making many cards, then filing them all together in a book-like format. Permissions can be set on individual cards to restrict who can edit them (just the students, students plus me, the outside world, etc.). Materials can be exported to Dropbox, GitHub, ODF format, etc. YAML can be added to each note to ask Google not to index it, and so on.

HackMD has pricing for more features, more space, and so on; if the business model is good, presumably it’s going to hang around for a while. But… there’s always the fear, right? Turns out, you can deploy the whole thing to your own space too – the repository is at https://github.com/hackmdio/codimd. (There’s a desktop interface too, I see, which is neat; it’s in the organization’s repository list.) It doesn’t look easy to deploy, mind you. I have a free account with Heroku, so when I saw the ‘deploy to heroku’ button….

Reader, I pressed it.

It failed the first time, but deployed the second time, so now I have a collaborative markdown notepad of my very own.

(ha, as I look at my Heroku dashboard, I see that I set this up once before a year or two ago! Completely forgot about it…)

featured image by Aaron Burden via Unsplash

DCGAN for Archaeologists

The following is cross-posted from our project website at bonetrade.github.io

Learning about GANs 

Melvin Wevers has been using neural networks to understand visual patterns in the evolution of newspaper advertisements in Holland. He and his team developed a tool for visually searching the newspaper corpus. Melvin presented some of his research at #dh2018; he shared his poster and slides so I was able to have a look. Afterwards, I reached out to Melvin and we had a long conversation about using computer vision in historical research.

His poster is called ‘ImageTexts: Studying Images and Texts in Conjunction’, which clearly is relevant to our work in the BoneTrade. In his research, he looks for ‘bursty’ changes in the composition of the text, that is, points where the content changes ‘state’ in terms of the frequency of the word distribution. The other approach is to use Generative Adversarial Networks on the images.

So what are GANs? This post is a nice introduction, and uses this image to capture the idea:

Image from Dev Nag on Medium

In essence, you have two networks: one learning how to identify your source images, and a second learning how to fool the first by creating new images from scratch.
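
To make the two-network idea concrete, here’s a toy sketch of such an adversarial loop in PyTorch (my own illustration, emphatically not the DCGAN-tensorflow code discussed below; the network sizes and learning rates are arbitrary):

import torch
import torch.nn as nn

latent_dim, img_dim = 64, 28 * 28  # sizes chosen arbitrarily for the sketch
G = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(),
                  nn.Linear(256, img_dim), nn.Tanh())   # the forger
D = nn.Sequential(nn.Linear(img_dim, 256), nn.LeakyReLU(0.2),
                  nn.Linear(256, 1))                    # the detective
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

def train_step(real):  # real: a batch of flattened, normalized images
    b = real.size(0)
    # train the detective: real images labelled 1, generated ones 0
    fake = G(torch.randn(b, latent_dim)).detach()
    d_loss = bce(D(real), torch.ones(b, 1)) + bce(D(fake), torch.zeros(b, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()
    # train the forger: try to get its fakes labelled as real
    fake = G(torch.randn(b, latent_dim))
    g_loss = bce(D(fake), torch.ones(b, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()

Note that the generator never sees a real image directly; it only ever learns from the discriminator’s verdicts, which is the elegant (and unsettling) part.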

Why should we care about this sort of thing? For our purposes here, it is one way of learning just what features of our source images our identifiers are looking for (there are others of course). Remember that one of the points of our research is to understand the visual rhetoric of these images. If we can successfully trick the network, then we know what aspects of the network we should be paying attention to. Another intriguing aspect of this approach is that it allows a kind of ‘semantic arithmetic’ of the kind we’re familiar with from word vectors:

The easiest way to think about words and how they can be added and subtracted like vectors is with an example. The most famous is the following: king – man + woman = queen. In other words, adding the vectors associated with the words king and woman while subtracting man is equal to the vector associated with queen. This describes a gender relationship.

Another example is: paris – france + poland = warsaw. In this case, the vector difference between paris and france captures the concept of capital city.
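
The word-vector version of this arithmetic is only a couple of lines with the gensim package, assuming you have a set of pretrained vectors on disk (the filename here is a placeholder):

from gensim.models import KeyedVectors

vectors = KeyedVectors.load_word2vec_format('vectors.bin', binary=True)
# king - man + woman ~ queen
print(vectors.most_similar(positive=['king', 'woman'], negative=['man'], topn=1))
# paris - france + poland ~ warsaw
print(vectors.most_similar(positive=['paris', 'poland'], negative=['france'], topn=1))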

I will admit that I haven’t figured out quite how to do this with images yet, but I’ve found various code snippets that should permit this.

Finally, as Wevers puts it, ‘The verisimilitude of the generated images is an indication of the meaningfulness of the learned subspace’. That is, if our generated images are not much good, then that’s an indication that there’s just far too much noise going on in our source data in the first place. Garbage in, garbage out. In Melvin’s poster, the GAN “was able to learn the variances in car models, styling, color, position and photographic composition seen in the adverts themselves.”

In which case, it seems that GANs are a worthwhile avenue to explore for our research.

Dominic Monn published an article and accompanying Jupyter Notebook on building a GAN trained on one of the standard databases, the ‘CelebFaces Attributes’ dataset, which has more than 200,000 photographs of ‘celebrities’ (training dataset composition is a topic for another post). It’s probably a function of my computer, but I couldn’t get this up and running correctly (setting up and using AWS computing power will be a post and tutorial in due course). It is interesting in that it does walk you through the code, which is not as forbidding as I’d initially assumed.

I had more success with Taehoon Kim’s ‘tensorflow implementation of “Deep Convolutional Generative Adversarial Networks”’, which is available on Github at https://github.com/carpedm20/DCGAN-tensorflow. I don’t have a GPU on this particular machine, so everything was running via CPU; I had to leave my machine for a day or two, and also use the caffeinate command on my Mac to keep it from going to sleep while the process ran (quick info on this here).

I had a number of false starts. Chief amongst these was the composition of my training set.

  1. You need lots of images. Reading around, 10,000 seems to be a bare minimum for meaningful results.
  2. The images need to be thematically unified somehow. You can’t just dump in everything you’ve got. I went through a recent scrape of Instagram via the tag skullforsale and pulled out about 2300 skull images. That was enough to get the code to run but, as you’ll see, not enough for the best results. Of course, I was only trying to learn how to use the code and work out what the hidden gotchas were.

Gotchas 

Ah yes, the gotchas.

  • images have to be small. Resize them to 256 x 256 or 64 x 64 pixels. Use ImageMagick’s ‘mogrify’ command.
  • images have to be RGB.
  • weird errors about casting into an array (e.g. https://github.com/carpedm20/DCGAN-tensorflow/issues/162: ValueError: could not broadcast input array from shape (128,128,3) into shape (128,128)) mean that you have to use ImageMagick’s ‘convert’ command there too.
  • greyscale images screw things up; convert those to RGB as well.
  • running the code: use the dockerized version, and put the data inside the DATA folder.
  • running the main.py script: --crop always has to be appended.

Command snippets:

convert image1.jpg -colorspace sRGB -type truecolor image1.jpg

make sure there are no grayscale images

identify -format "%i %[colorspace]\n" *.jpg | grep -v sRGB

convert images to 64×64

mogrify -resize 64x64 *.jpg

convert to sRGB

mogrify -colorspace sRGB  *.jpg

run main.py

python main.py --dataset=skulls --data_dir data --train --crop

Results? 

I let the code run until it reached the end of its default iteration time (which is a function of the size of your images). Results were… unimpressive. With too small a dataset, the code would simply not run.

Some outputs:

after two epochs

After nearly an hour, the first visualization of the results, after a mere two epochs of iterations… a dreamy mist-scape as the machine creates.

first actual working results

In this mosaic, which represents the results from my first actual working run (20 epochs), you can, if you squint, see a nightmarish vision of monstrous skulls. Too few images, I thought (about a thousand, at this point). So I spent several hours collecting more images, and tried again…

results of a run

Maybe I’m only seeing what I want to see, but I see hints of the orbital bones around the eyes, the bridge of the nose, in the top left side of each test image in the mosaic.

So. I think this approach could prove productive, but I need (a) more computing power, (b) more images, and (c) to run for much, much longer.

I wonder if I can remove my decision making process in the creation of the corpus from this process. Could I construct a pipeline that feeds the mass of images we’ve created into a CNN, use the penultimate layer and some clustering to create various folders of similar images, and then pass the folders to the GAN to figure out what it’s looking at, and visualize the individual neurons?
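
Sketched out, with a pretrained CNN standing in for the classifier and k-means for the clustering (a guess at the pipeline, not something I’ve built yet; paths and the cluster count are placeholders), it might look like this:

import pathlib, shutil
import numpy as np
import tensorflow as tf
from sklearn.cluster import KMeans

# a pretrained CNN, chopped at the pooled convolutional features --
# effectively the 'penultimate layer' trick
model = tf.keras.applications.MobileNetV2(include_top=False, pooling='avg',
                                          weights='imagenet')

paths = sorted(pathlib.Path('images').glob('*.jpg'))  # 'images/' is a placeholder
arrays = [tf.keras.preprocessing.image.img_to_array(
              tf.keras.preprocessing.image.load_img(p, target_size=(224, 224)))
          for p in paths]
batch = tf.keras.applications.mobilenet_v2.preprocess_input(np.stack(arrays))
features = model.predict(batch)

# cluster the feature vectors, then sort the images into per-cluster folders
labels = KMeans(n_clusters=5).fit_predict(features)  # 5 clusters, arbitrarily
for path, label in zip(paths, labels):
    dest = pathlib.Path('clusters') / str(label)
    dest.mkdir(parents=True, exist_ok=True)
    shutil.copy(path, dest / path.name)

Each resulting folder could then be handed to the GAN as its own training set.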

Jupyter Notebooks for Digital Archaeology (and History too!)

As the fall academic term approaches, and we get closer to version 1.0 of the Open Digital Archaeology Textbook Environment (ODATE), I thought I would share the plethora of Jupyter notebooks we’ve put together to support the work. (A video showing the whole ODATE project is over here on YouTube.) The text of ODATE still has some rough edges and there are parts still coming together. Indeed, it will never be finished, as it is my hope that it grows and is forked and becomes the kernel for many, many coursepacks and workshops and syllabi; more on that later when we’re closer to pulling back the official curtain.

As far as these notebooks go, more will come with time. Feel free to use these in your teaching – please let me know if you do, and how it goes – and please do suggest edits (either by leaving an issue on the GitHub repo, or by making edits and a pull request on GitHub). I would be delighted to include more that other people have built, so let me know if this interests you!

These notebooks can be downloaded and run locally if you have Jupyter installed (you’ll need to pay attention to the requirements and postBuild files if you do that, in order to get all the bits and pieces installed. Use a virtual environment too!). If you click the ‘launch binder’ button, the notebooks will launch in an interactive environment hosted by Binder. (Once they’re up and running, you can also change the URL where it says ‘tree’ to ‘lab’ to have these notebooks in a Jupyter Lab interface.)

Note: Run each cell in the notebook in sequence from top to bottom; use shift+enter to run the cell or hit the ‘run’ button in the notebook toolbar. Have students work through the notebooks, then make changes, modify, or expand the notebook for themselves. When the notebook is running, there is an ‘export’ option under the ‘file’ tab. Export as jupyter notebook. The resulting text file (with .ipynb extension) could be submitted for course work (and run within the relevant binder, of course).

(featured image: Brandon Green, unsplash.com)

These links will launch the notebooks using Binder. It can sometimes take a few moments for the environment to launch; be patient. Click on the ‘status’ link when launching to see the environment build.


Introduction to Jupyter Notebooks

Binder Repository

Contains:

  • Welcome.ipynb
  • demo-R.ipynb

This notebook contains everything necessary to set up a GitHub repo that can become the basis of a Binder. Consult the repository’s Readme file to see how it can be customized for your own particular usage. Fork (make a copy of) this repo as often as is necessary! Many of the exercises in the first part of ODATE require nothing more than this.


Working with APIs

Binder Repository

Contains:

  • chronicling america api.ipynb
  • open context api.ipynb
  • Open Context Measurements.ipynb
  • mapping-with-ipyleaflet.ipynb

These notebooks demonstrate progressively more complicated ways of retrieving data via an API.


Archaeological Data into R

Binder Repository

Contains:

  • Retrieving Data from the Portable Antiquities Scheme Database.ipynb
  • archdata.ipynb

The first notebook shows how to use R to pull archaeological data from an online database. The second notebook shows how to interact with the archdata package, a collection of archaeological datasets already pulled together for usage in R.


Databases

Binder Repository

Contains:

  • intro to sql.ipynb
  • SQLite Database and R.ipynb
  • visualizing results of sql query in python.ipynb

These notebooks demonstrate how to ingest a variety of csv (or other format) files into a single SQL database, how to query the database, and how to push the results of a query into a dataframe for further analysis or visualization.
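
A minimal version of that pattern in Python, with pandas and sqlite3 (my sketch, not the notebooks’ code; the filenames are placeholders):

import sqlite3
import pandas as pd

con = sqlite3.connect('site_data.db')
# ingest each csv as its own table
for csv_file in ['finds.csv', 'contexts.csv']:
    df = pd.read_csv(csv_file)
    df.to_sql(csv_file.replace('.csv', ''), con, if_exists='replace', index=False)

# query the database, pushing the results into a dataframe
result = pd.read_sql_query('SELECT context, COUNT(*) AS n FROM finds GROUP BY context', con)
print(result.head())
con.close()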


Linked Open Data

Binder Repository

Contains:

  • sparql-intro.ipynb
  • Using R to Retrieve and Visualize Data from SPARQL.ipynb

The first notebook shows how to craft SPARQL queries for the British Museum, Wikidata, and Nomisma endpoints; it uses a SPARQL kernel that also allows for graphing the data relationships visually, and it has a ‘magic’ command for writing the data to json or csv. The second notebook demonstrates how to use the sparql package for R to query an endpoint and then manipulate the results to do some simple statistics and visualization.
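
For a taste of what such a query looks like from code, here’s a sketch using the SPARQLWrapper package (not the notebooks’ own approach; and I believe Q839954 is Wikidata’s class for archaeological sites, but check):

from SPARQLWrapper import SPARQLWrapper, JSON

endpoint = SPARQLWrapper('https://query.wikidata.org/sparql')
endpoint.setQuery('''
SELECT ?item ?itemLabel WHERE {
  ?item wdt:P31 wd:Q839954 .   # instances of 'archaeological site', if I have the QID right
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
} LIMIT 10
''')
endpoint.setReturnFormat(JSON)
for row in endpoint.query().convert()['results']['bindings']:
    print(row['itemLabel']['value'])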


Spatial Archaeology

Binder Repository

  • linlithgow_spatial.ipynb
  • canmore_survey_shetland.ipynb
  • 1_spatialarchaeology.ipynb
  • working with remote sensing data.ipynb

These notebooks are courtesy Dr. Rachel Opitz, of the University of Glasgow who is Lecturer in Spatial Archaeology. There are two binders which can be launched, ours and Dr. Opitz’s; consider launching Dr. Opitz’s as she updates the work for her own teaching, or use the ODATE version that is updated only periodically. Dr. Opitz’s version can be launched here: Binder

The Linlithgow notebook explores burial data, while the Canmore notebook explores the map of registered monuments in the Shetlands, recorded in Scotland’s Canmore database. The 1_spatialarchaeology.ipynb notebook explores data from Dr. Opitz’s team’s excavations at the ancient city of Gabii in Italy. The final notebook works with remote sensing data and explores cropmarks in hyperspectral images.


Scraping

Binder Repository

  • Extracting Data from PDFs using Tabulizer.R
  • metadigitise.R
  • Building a Scrapy Scraper.ipynb

To launch the two .R scripts, use the built-in RStudio server in this binder. This binder will take a bit of time to load up.

From the Home page for this binder, select new -> RStudio. Then open the Extracting Data from PDFs using Tabulizer.R (or the metadigitise.R) file.

Put your cursor at the first line in the script (top left window); run one line at a time.


LiDAR

Binder Repository

  • Demo using Montreal LiDAR data.ipynb
  • Avebury LiDAR.ipynb

These two notebooks show how to unzip .laz files into .las, and to visualize the data therein.

The final codeblock in both notebooks creates an animated gif from the data. That final codeblock is computationally intensive; it will take some time to run. The results will be written to a new folder called ‘export’; you can open that folder by clicking on the jupyter logo at top and then clicking on the ‘export’ folder.

You will know the code is finished when the [*] at the left of the code block changes to a number.

Start with the Montreal demo notebook. It contains some code that the Avebury notebook depends on.
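
The conversion step itself is small; here’s a sketch with the laspy library (version 2.x, with a .laz backend such as lazrs installed; not necessarily how the notebooks do it, and the filename is a placeholder):

import laspy  # needs a .laz backend, e.g. pip install laspy[lazrs]

las = laspy.read('montreal_tile.laz')  # placeholder filename
print(len(las.points), 'points; mins:', las.header.mins, 'maxs:', las.header.maxs)
las.write('montreal_tile.las')         # re-save as uncompressed .las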


Agent Based Modeling

Binder Repository

Contains:

  • Schelling Segregation Model – schelling/analysis.ipynb
  • Epstein Civil Violence Model – epstein/epstein civil violence.ipynb
  • Forest Fire Model – forest_fire/forest fire model.ipynb
  • Virus on a network – virus_on_a_network/virus.ipynb

Start with the Forest Fire model – it is one of the best known introductory models in the field. As you experiment with these models, ask yourself, ‘what would it take for this to be a model of an archaeological concept?’ The ‘Virus’ model notebook is still under development.


Agent Based Modeling with Netlogo

Binder

A notebook for running NetLogo models ‘headless’ (i.e. no GUI) by specifying parameters for an experiment. These parameters are passed to a bash script and run from within the notebook. Results are written to file for further analysis in Python. A variety of likely Python data packages are also provided for import. (Others can be added by running !pip install <package> from within a notebook.)
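
For a sense of the pattern, here’s a sketch in Python of calling NetLogo headless and reading the results back (the model and experiment names are placeholders; the experiment is a BehaviorSpace experiment saved in the model file):

import subprocess
import pandas as pd

# assumes NetLogo's netlogo-headless.sh is on your PATH
subprocess.run([
    'netlogo-headless.sh',
    '--model', 'Fire.nlogo',        # placeholder model file
    '--experiment', 'experiment1',  # a BehaviorSpace experiment defined in that model
    '--table', 'results.csv',       # write results in table format
], check=True)

# BehaviorSpace table output opens with several lines of metadata before the header
results = pd.read_csv('results.csv', skiprows=6)
print(results.head())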


Computer Vision

Binder Repository

Contains:

  • Building an archaeological image classifier with tensorflow.ipynb

This notebook demonstrates how Transfer Learning can be applied to create a neural network model trained on archaeological imagery. A mobile version that uses the results of this notebook to create an image identification app may be followed here
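
For flavour, here is a minimal transfer-learning setup in recent versions of tf.keras – a sketch in the same spirit as the notebook, not its actual code; ‘images/’ stands in for a folder with one subfolder per class:

import tensorflow as tf

# 'images/' is a placeholder: one subfolder per class of archaeological imagery
train = tf.keras.utils.image_dataset_from_directory('images/', image_size=(224, 224))
base = tf.keras.applications.MobileNetV2(include_top=False, pooling='avg',
                                         weights='imagenet')
base.trainable = False  # keep the pretrained features frozen

model = tf.keras.Sequential([
    tf.keras.layers.Rescaling(1. / 127.5, offset=-1),  # MobileNetV2 expects [-1, 1]
    base,
    tf.keras.layers.Dense(len(train.class_names), activation='softmax'),
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(train, epochs=5)

Only the final dense layer is trained; the frozen base supplies the general visual features, which is what makes this workable on small archaeological datasets.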


Clustering Images with Tensorflow

Binder Repository

Contains:

  • find-similar-images.ipynb
  • Affinity Propogation.ipynb

These notebooks take the second-last layer of the neural network and demonstrate how to study it for clustering visually similar images.


Sonification

Binder Repository

  • Intro to Sonification.ipynb

This notebook walks through the process of mapping time-series data to musical notation, to create MIDI files that can then be turned into sound.
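
The heart of the mapping is just a rescaling of data values onto pitches; a sketch with the midiutil package (not necessarily the notebook’s code):

from midiutil import MIDIFile

data = [3, 7, 2, 9, 4, 8]  # a stand-in time series
lo, hi = min(data), max(data)
# scale each value onto two octaves starting at C3 (MIDI note 48)
pitches = [48 + int((v - lo) / (hi - lo) * 24) for v in data]

midi = MIDIFile(1)
midi.addTempo(track=0, time=0, tempo=120)
for beat, pitch in enumerate(pitches):
    midi.addNote(track=0, channel=0, pitch=pitch, time=beat, duration=1, volume=100)
with open('sonified.mid', 'wb') as f:
    midi.writeFile(f)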


Creativity

Binder [https://github.com/o-date/creativity/]

  • Glitching an image with prism sorting.ipynb
  • semantic_similarity_chatbot.ipynb

This binder can take quite a bit of time to pull together. Please be patient. There are steps in the chatbot notebook that can also take a bit of time; watch for the [*] beside a running code block to disappear before moving to the next one.


Creativity 2

Binder Repository

  • World Building – model demo.ipynb
  • History Generator – ahistorygenerator.ipynb

The first notebook generates a world by simulating topography and erosion, and then uses an agent-based model to play out its history. The second is a far simpler model of state formation, fission, and fusion, but uses Tracery to generate its historical chronicle, and Graphviz to visualize it.
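
If you haven’t met Tracery, the flavour of it is easy to show; here’s a tiny grammar of my own invention, using the pytracery package:

import tracery
from tracery.modifiers import base_english

rules = {
    'origin': 'In year #year#, the #polity# #event#.',
    'year': ['312', '340', '367'],
    'polity': ['river kingdom', 'hill confederacy'],
    'event': ['split in two', 'absorbed its neighbour', 'collapsed'],
}
grammar = tracery.Grammar(rules)
grammar.add_modifiers(base_english)
print(grammar.flatten('#origin#'))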


Make a Research Compendium

Binder Repository

This repository is an experimental demonstration of how you might combine a research compendium created by rrtools with Binder, a service that creates an executable environment with RStudio in your browser.

Please read the rrtools documentation and this repository’s readme before launching this binder.

Sticking it to DCGAN

These are notes to self; a proper blog post eventually

Working with DCGAN – https://github.com/carpedm20/DCGAN-tensorflow

This is cool too, but just doesn’t work for me (final codeblock runs, no results)

->  much depends on the input data

-> more is better

-> images need to be resized; smaller than 256 x 256

conda activate env
  • convert images to 64×64
    mogrify -resize 64x64 *.jpg
  • make sure there are no grayscale images
    identify -format "%i %[colorspace]\n" *.jpg | grep -v sRGB
    
  • fix those that are:
    convert input.jpg -colorspace sRGB -type truecolor output.jpg
    

    or, to run over the whole lot, in bash (and you might be able to pipe the identify results right into this):

    for f in *.jpg; do
      convert ./"$f" -colorspace sRGB -type truecolor ./"$f"
    done
    
  • and fire that thing off:
    python main.py --dataset images --data_dir data --train --crop

 

Buy a damned GPU