The Making of FORVM: Trade Empires of Rome

Almost three years ago, Tom Brughmans sent me an email to see if I’d be interested in some kind of academic exchange. At the time, he was at the University of Konstanz, and there was a state–province exchange program that we could apply to. ‘C’mon over!’ said I, and soon Tom, Iza Romanowska, and their two wee babies arrived in Ottawa.

This began one of the most productive partnerships I’ve ever enjoyed. The plan was initially to do something digital – Tom and Iza are amongst the most accomplished simulationists and digital archaeologists out there – but plans soon changed. ‘How about a board game?’ said Tom, and I was sold. (update: Tom says it was me who suggested the game! Funny thing, memory)

We began by looking at the collection of board games in MacOdrum Library. What games did we like? Why did we like them? What problem space (as Jeremiah McCall terms it) do they address, and how? What is the key issue in our own research that a board game could address? I have a giant whiteboard in my office, and we started sketching these ideas out. Tom and I have both written and created simulations of Roman economics, and we have both explored Roman archaeology from a network perspective, so it made sense to us to use these experiences as points of departure.

Earlier in the year, I had also participated in the Interactive Pasts conferences, giving a paper on agent simulation as ‘games that play themselves’. This got me thinking about the differences between agent models, video games, and board games, and we started thinking about board games as ‘analog simulations’ that encourage modding, tinkering, and ‘house rules’. That is, unlike a video game that makes you perform its creators’ ideas about how the world-space works (ideas that are thus very hard to see or contest – but not impossible), an analog game/simulation invites reflection on its rules and system. Thus we set out to make a game that reflects our perspective on the importance of network dynamics and information asymmetry in the Roman world, but that also invites its players to reflect on and perhaps alter/mod those rules for their own purposes.

A board game!

We started sketching out a flow-chart of how the game would progress, as if it were a NetLogo simulation. Somewhere in my office I know I still have the giant sheets of paper on which we scrawled pseudo-code, replete with crossings-out, multiple hands and colours, as we worked out how the game should be played. We used a networked representation of connectivity in the Roman world that Tom whipped up in Visone, set against a map of the Mediterranean, to start building our board. Eventually, we ended up with this:

Original Board for FORVM

And we started playing. And replaying. And modifying. And playing. And fixing. I dragooned one of the MA students (Hi Elise!) to be our fourth player. And we played. And fixed. And played some more. Eventually, Tom, Iza, and the boys had to go back home. Over the next two years, we kept fiddling with the game, and Tom’s extended network of friends and family playtested and continued to refine the game. At this point, we commissioned the brilliant artist Ian Kirkpatrick to produce the artwork for the game, and to turn our board above into this:

Once we had the game manufactured, we tried to have a copy sent to Tom and Iza so that we could reveal it and play it at a workshop Tom put on in Oxford in early October… but alas, that copy is somewhere in Spain, ping-ponging between different postal sorting offices, or slid down behind a radiator somewhere. A copy did make its way to me in Ottawa, and I asked my colleague Marc Saurette (who does an amazing semester-long seminar where the students role-play medieval politics) to play-test it one last time with his students.

A slightly-blurry shot of the game in action

With a few tweaks in place, we are delighted to announce that the game is now available for purchase! It’s manufactured in the United States. If you’re ordering from a non-US location, make sure to select the international tracking option lest your copy go missing in the postal system too. The game has its own website at http://www.forvm.ca/  but you can purchase direct from the manufacturer at TheGameCrafter.com. We’re not making money on this; it’s all at cost, an exercise in getting our research knowledge out into the public sphere.

Would you like to play a game?

 


Award for Outstanding Work in Digital Archaeology – ODATE

I was pleased to find the following note in my email Thursday last from the AIA….

Previous winners of the award may be found here. Speaking for everyone on the ODATE team, we are honoured to join their company! Earlier projects that have been honoured are becoming part of the entire ecosystem of digital archaeology infrastructure, and I’m pleased that our part aimed at the teaching side of that balance has been recognized. Part of our digital pedagogy uses reproducible computational notebooks that integrate data, code, and analysis. My ambition is that through ODATE we normalize and regularize this kind of reproducible research in archaeology more generally.

I’d also like to thank my collaborators and co-writers on this project who have put up with me these past two years – Neha Gupta, Michael Carter, and Beth Compton. When the going got rough, other folks jumped in to help us complete the work – Jolene Smith, Andreas Angourakis, Andrew Reinhard, Lorna Richardson, Kate Ellenberger, Zack Batist, Joel Rivard, Ben Marwick, and Rob Blades. These folks come from all walks of archaeological life, from the library to the lecture hall, from grad school to professional archaeology. They are all wonderful scholars!

So this award is shared across a community of practice: thank you all.

I should note that this project was funded by eCampusOntario, ‘the online hub for learners and educators across Ontario’, and I’m grateful to them, and the EDC at Carleton, for supporting this somewhat different approach to what an online textbook could be.

…oh, and ODATE itself? Well, the url for it is out there, in the aether; we’re still trying to sand off some of the rougher corners, fill in some of the bits and pieces. You can find it easily enough (ah well, here it is), but know that the official ‘ta da!’ is coming.  But here are all of the computational notebooks that you can run in your browser, right now.

Thank you everyone at ODATE for coming along on this adventure, and thank you AIA!


#Archink

Katherine Cook organized an archaeology-themed edition of the wider #Inktober challenge (draw something, every day, for the entire month). Her prompts:

#Inktober #archink prompts

I cannot draw. But I can trace. My hands shake a lot when I try to concentrate on doing fine work. (And yet, when I play piano, they don’t. Go figure). Below are my efforts at #archink. I didn’t hit every day, but I did get a lot of them.

Fragments

It’s been a long month.

Then I was on a plane, so I tried tracing a photo I took in the Museum of London:

 

Interprets

 

Builds

 

Sites

And then I gave Google Storyboard a try, on a video about the University of Reading excavations at Roman Silchester:

…when you’re using the app, you can reload it for different layouts, effects, video stills transformed into sketches of different styles etc.

 

Conserves

 

Conceals

 

Holds

 

Moves

 

Writes

 

Theorizes

Treasures

 

Colonizes

 

Decolonizes

 

Narrates

 

Collects

 

It’s been an interesting challenge. I think if I kept it up, I might eventually learn how to actually draw something ex novo. But until then, there is some satisfaction in remediating, tracing, things I find.

A Quick Note on HackMD for Collaborative Notetaking in Class

I’ve long been interested in collaborative notetaking in class as a way of making presence in class more meaningful. In my imagination, collaboratively written notes from class discussions and exercises intersect with other kinds of notes (Hypothes.is, for instance, for reading; Zotero for bibliography) to make a sort of super-Zettelkasten.

In class this term (‘Bad Archaeology’) I’m framing discussion as a series of unconferences. As part of that, I’m also making any notes that I scribble together available to the students via HackMD.io. HackMD also has a nice feature that integrates with Reveal.js so that I can quickly spin out a slide deck from a bit of markdown in a new note, like so:

---
title: Slidedeck Sept 4 Getting Started
slideOptions:
  transition: fade
  theme: night
---

## Sept 4 Getting Started

---

![an image](url to the image)

Note:
Speaking notes hide here; not visible in slide mode but visible in edit

---

and so on

This really fits well with my existing flow. You can create a ‘book’ by making a note with a list of links, then hitting ‘book mode’. The page that loads up will use your note as a table of contents on the left of the page, and the contents from the first linked page as the default first page:

Screenshot from my HackMD notebook

I’m imagining my students making many cards, then filing them all together in a book-like format. Permissions can be set on individual cards to restrict who can edit them (just the students, students plus me, the outside world, etc). Materials can be exported to Dropbox, GitHub, odf format, etc. YAML can be added to each note to ask Google not to index it, and so on.

HackMD has pricing tiers for more features, more space, and so on; if the business model is sound, presumably it’s going to hang around for a while. But… there’s always the fear, right? Turns out, you can deploy the whole thing to your own space too – the repository is at https://github.com/hackmdio/codimd. (There’s a desktop interface too, I see, which is neat; it’s in the organization’s repository list.) It doesn’t look easy to deploy, mind you. I have a free account with Heroku, so when I saw the ‘deploy to heroku’ button….

Reader, I pressed it.

It failed the first time, but deployed the second time, so now I have a collaborative markdown notepad of my very own.

(ha, as I look at my Heroku dashboard, I see that I set this up once before a year or two ago! Completely forgot about it…)

featured image by Aaron Burden via Unsplash

DCGAN for Archaeologists

The following is cross-posted from our project website at bonetrade.github.io

Learning about GANs 

Melvin Wevers has been using neural networks to understand visual patterns in the evolution of newspaper advertisements in Holland. He and his team developed a tool for visually searching the newspaper corpus. Melvin presented some of his research at #dh2018; he shared his poster and slides so I was able to have a look. Afterwards, I reached out to Melvin and we had a long conversation about using computer vision in historical research.

His poster is called ‘ImageTexts: Studying Images and Texts in Conjunction’, which is clearly relevant to our work in the BoneTrade. In his research, he looks for ‘bursty’ changes in the composition of the text – that is, points where the content changes ‘state’ in terms of the frequency of the word distribution. The other approach is to use Generative Adversarial Networks on the images.

So what are GANs? This post is a nice introduction and uses this image to capture the idea:

Image from Dev Nag on Medium

In essence, you have two networks. One learning how to identify your source images, and the second learning how to fool the first by creating new images from scratch.
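To make that two-network idea concrete, here is a minimal sketch using tf.keras. It is an illustration of the structure only, not the DCGAN code discussed below; the layer sizes and latent dimension are arbitrary assumptions.

import tensorflow as tf
from tensorflow.keras import layers

latent_dim = 100

# the generator turns random noise into a 64 x 64 RGB image
generator = tf.keras.Sequential([
    layers.Dense(8 * 8 * 128, input_shape=(latent_dim,)),
    layers.Reshape((8, 8, 128)),
    layers.Conv2DTranspose(64, 4, strides=2, padding="same", activation="relu"),
    layers.Conv2DTranspose(32, 4, strides=2, padding="same", activation="relu"),
    layers.Conv2DTranspose(3, 4, strides=2, padding="same", activation="tanh"),
])

# the discriminator tries to tell real images from generated ones
discriminator = tf.keras.Sequential([
    layers.Conv2D(32, 4, strides=2, padding="same", input_shape=(64, 64, 3)),
    layers.LeakyReLU(),
    layers.Conv2D(64, 4, strides=2, padding="same"),
    layers.LeakyReLU(),
    layers.Flatten(),
    layers.Dense(1, activation="sigmoid"),
])

# training pits the two against each other: the discriminator is rewarded for
# spotting fakes, the generator for fooling it
noise = tf.random.normal((1, latent_dim))
fake_image = generator(noise)
print(discriminator(fake_image))  # the probability the discriminator assigns to 'real'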

Why should we care about this sort of thing? For our purposes here, it is one way of learning just what features of our source images our identifiers are looking for (there are others of course). Remember that one of the points of our research is to understand the visual rhetoric of these images. If we can successfully trick the network, then we know what aspects of the network we should be paying attention to. Another intriguing aspect of this approach is that it allows a kind of ‘semantic arithmetic’ of the kind we’re familiar with from word vectors:

The easiest way to think about words and how they can be added and subtracted like vectors is with an example. The most famous is the following: king – man + woman = queen. In other words, adding the vectors associated with the words king and woman while subtracting man is equal to the vector associated with queen. This describes a gender relationship.

Another example is: paris – france + poland = warsaw. In this case, the vector difference between paris and france captures the concept of capital city.

I will admit that I haven’t figured out quite how to do this yet, but I’ve found various code snippets that should permit this.
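For what it’s worth, the basic move looks something like the sketch below, assuming you already have vectors for your terms or images (from an embedding model or a GAN’s latent space); the random vectors here are just stand-ins.

import numpy as np

rng = np.random.default_rng(42)
words = ["king", "man", "woman", "queen"]
vectors = {w: rng.normal(size=100) for w in words}  # stand-ins for real embeddings

def nearest(target, candidates):
    """return the candidate whose vector is closest to target by cosine similarity"""
    def cos(a, b):
        return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return max(candidates, key=lambda w: cos(vectors[w], target))

# king - man + woman should land near 'queen' with real embeddings;
# by convention the operands themselves are excluded from the search
result = vectors["king"] - vectors["man"] + vectors["woman"]
print(nearest(result, [w for w in words if w not in ("king", "man", "woman")]))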

Finally, as Wevers puts it, ‘The verisimilitude of the generated images is an indication of the meaningfulness of the learned subspace’. That is, if our generated images are not much good, then that’s an indication that there’s just far too much noise going on in our source data in the first place. Garbage in, garbage out. In Melvin’s poster, the GAN “was able to learn the variances in car models, styling, color, position and photographic composition seen in the adverts themselves.”

In which case, it seems that GANs are a worthwhile avenue to explore for our research.

Dominic Monn published an article and accompanying Jupyter Notebook on building a GAN trained on one of the standard databases, ‘CelebFaces Attributes data set’ which has more than 200,000 photographs of ‘celebrities’ (training dataset composition is a topic for another post). It’s probably a function of my computer but I couldn’t get this up and running correctly (setting up and using AWS computing power will be a post and tutorial in due course). It is interesting in that it does walk you through the code, which is not as forbidding as I’d initially assumed.

I had more success with Taehoon Kim’s ‘tensorflow implementation of “Deep Convolutional Generative Adversarial Networks”’, which is available on Github at https://github.com/carpedm20/DCGAN-tensorflow. I don’t have a GPU on this particular machine, so everything was running via CPU; I had to leave my machine for a day or two, and also use the caffeinate command on my Mac to keep it from going to sleep while the process ran (quick info on this here).

I had a number of false starts. Chief amongst these was the composition of my training set.

  1. You need lots of images. Reading around, 10,000 seems to be a bare minimum for meaningful results.
  2. The images need to be thematically unified somehow. You can’t just dump in everything you’ve got. I went through a recent scrape of Instagram via the tag skullforsale and pulled out about 2,300 skull images. That was enough to get the code to run but, as you’ll see, not enough for the best results. Of course, I was only trying to learn how to use the code and work out what the hidden gotchas were.

Gotchas 

Ah yes, the gotchas.

  • images have to be small. Resize them to 256 x 256 or 64 x 64 pixels. Use Imagemagick’s ‘mogrify’ command.
  • images have to be RGB.
  • weird errors about casting into an array (eg https://github.com/carpedm20/DCGAN-tensorflow/issues/162: ValueError: could not broadcast input array from shape (128,128,3) into shape (128,128)) mean that we have to use Imagemagick’s ‘convert’ command there too.
  • greyscale images screw things up. Convert those to RGB as well.
  • running the code: use the dockerized version, and put the data inside the DATA folder.
  • running the main.py script: --crop always has to be appended.

Command snippets:

fix any image that isn’t truecolor sRGB:

convert image1.jpg -colorspace sRGB -type truecolor image1.jpg

make sure there are no grayscale images

identify -format "%i %[colorspace]\n" *.jpg | grep -v sRGB

convert images to 64×64

mogrify -resize 64x64 *.jpg

convert to sRGB

mogrify -colorspace sRGB  *.jpg

run main.py

python main.py --dataset=skulls --data_dir data --train --crop
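If Imagemagick isn’t handy, the same preprocessing can be sketched in Python with Pillow; this is just an alternative, not something the DCGAN repo requires, and the folder name is a placeholder.

from pathlib import Path
from PIL import Image

# resize every jpg in a folder to 64 x 64 and force RGB;
# 'training-images' is a placeholder path
for path in Path("training-images").glob("*.jpg"):
    img = Image.open(path).convert("RGB")   # catches greyscale and CMYK surprises
    img = img.resize((64, 64))
    img.save(path, "JPEG")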

Results? 

I let the code run until it reached the end of its default iteration time (which is a function of the size of your images). Results were… unimpressive. With too small a dataset, the code would simply not run.

Some outputs:

after two epochs

After nearly an hour the first visualization of the results after a mere two epochs of iterations… a dreamy mist-scape as the machine creates.

first actual working results

In this mosaic, which represents the results from my first actual working run (20 epochs), you can, if you squint, see a nightmarish vision of monstrous skulls. Too few images, I thought (about a thousand, at this point). So I spent several hours collecting more images, and tried again…

results of a run

Maybe I’m only seeing what I want to see, but I see hints of the orbital bones around the eyes, the bridge of the nose, in the top left side of each test image in the mosaic.

So. I think this approach could prove productive, but I need a) more computing power, b) more images, and c) to run the thing for much, much longer.

I wonder if I can remove my decision making process in the creation of the corpus from this process. Could I construct a pipeline that feeds the mass of images we’ve created into a CNN, use the penultimate layer and some clustering to create various folders of similar images, and then pass the folders to the GAN to figure out what it’s looking at, and visualize the individual neurons?
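A rough sketch of what such a pipeline might look like, assuming tensorflow and scikit-learn are installed; MobileNetV2, the folder name, and the number of clusters are placeholder choices for illustration, not decisions we’ve made.

from pathlib import Path
import numpy as np
import tensorflow as tf
from sklearn.cluster import KMeans

# a pretrained network, minus its classification layer, as a feature extractor
base = tf.keras.applications.MobileNetV2(weights="imagenet", include_top=False, pooling="avg")

def features(path):
    img = tf.keras.preprocessing.image.load_img(str(path), target_size=(224, 224))
    x = tf.keras.applications.mobilenet_v2.preprocess_input(
        tf.keras.preprocessing.image.img_to_array(img)[np.newaxis, ...])
    return base.predict(x)[0]

paths = sorted(Path("images").glob("*.jpg"))          # placeholder folder
X = np.array([features(p) for p in paths])

# cluster the penultimate-layer features; each cluster could then become a folder fed to the GAN
labels = KMeans(n_clusters=5, random_state=0).fit_predict(X)
for p, label in zip(paths, labels):
    print(label, p.name)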

Jupyter Notebooks for Digital Archaeology (and History too!)

As the fall academic term approaches, and we get closer to version 1.0 of the Open Digital Archaeology Text (ODATE), I thought I would share the plethora of Jupyter Notebooks we’ve put together to support the work.  (A video showing the whole ODATE project is over here on youtube). The text of ODATE still has some rough edges and there are parts still coming together. Indeed, it will never be finished as it is my hope that it grows and is forked and becomes the kernel for many many coursepacks and workshops and syllabi; more on that later when we’re closer to pulling back the official curtain.

As far as these notebooks go, more will come with time. Feel free to use these in your teaching – please let me know if you do and how it goes –  and please do suggest edits (either by leaving an issue on the github repo or by making edits and a pull request on github). I would be delighted to include more that other people have built, so let me know if this interests you!

These notebooks can be downloaded and run locally if you have Jupyter installed (you’ll need to pay attention to the requirements and postBuild files if you do that, in order to get all the bits and pieces installed. Use a virtual environment too!) If you click the ‘launch binder’ button, the notebooks will launch in an interactive environment hosted by Binder. (Once they’re up and running, you can also change the url where it says ‘tree’ to ‘lab’ to have these notebooks in a Jupyter Lab interface).

Note: Run each cell in the notebook in sequence from top to bottom; use shift+enter to run the cell or hit the ‘run’ button in the notebook toolbar. Have students work through the notebooks, then make changes, modify, or expand the notebook for themselves. When the notebook is running, there is an ‘export’ option under the ‘file’ tab. Export as jupyter notebook. The resulting text file (with .ipynb extension) could be submitted for course work (and run within the relevant binder, of course).

(featured image: Brandon Green, unsplash.com)

These links will launch the notebooks using Binder. It can sometimes take a few moments for the environment to launch; be patient. Click on the ‘status’ link when launching to see the environment build.


Introduction to Jupyter Notebooks

Binder Repository

Contains:

  • Welcome.ipynb
  • demo-R.ipynb

This notebook contains everything necessary to set up a Github repo that can become the basis of a Binder. Consult the repository’s Readme file to see how it can be customized for your own particular usage. Fork (make a copy of) this repo as often as is necessary! Many of the exercises in the first part of ODATE require nothing more than this.


Working with APIs

Binder Repository

Contains:

  • chronicling america api.ipynb
  • open context api.ipynb
  • Open Context Measurements.ipynb
  • mapping-with-ipyleaflet.ipynb

These notebooks demonstrate progressively more complicated ways of retrieving data via an API.
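As a taste of what ‘progressively more complicated’ starts from, here is a minimal query against the Chronicling America newspaper API in Python; the search term and the fields printed are just examples, and the notebooks themselves go well beyond this.

import requests

url = "https://chroniclingamerica.loc.gov/search/pages/results/"
params = {"andtext": "archaeology", "format": "json", "rows": 5}
response = requests.get(url, params=params)

# print a couple of fields from each returned newspaper page
for item in response.json().get("items", []):
    print(item.get("date"), item.get("title"))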


Archaeological Data into R

Binder Repository

Contains:

  • Retrieving Data from the Portable Antiquities Scheme Database.ipynb
  • archdata.ipynb

The first notebook shows how to use R to pull archaeological data from an online database. The second notebook shows how to interact with the archdata package, a collection of archaeological datasets already pulled together for usage in R.


Databases

Binder Repository

Contains:

  • intro to sql.ipynb
  • SQLite Database and R.ipynb
  • visualizing results of sql query in python.ipynb

These notebooks demonstrate how to ingest a variety of csv (or other format) files into a single SQL database. They show how to query the database, and how to push the results of the query into a dataframe for further analysis or visualization.
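The core move, sketched in Python with sqlite3 and pandas; the file and column names are invented for illustration.

import sqlite3
import pandas as pd

conn = sqlite3.connect("site_data.db")
# ingest a csv into a table (replacing it if it already exists)
pd.read_csv("finds.csv").to_sql("finds", conn, if_exists="replace", index=False)

# query the database and pull the result into a dataframe
df = pd.read_sql_query(
    "SELECT material, COUNT(*) AS n FROM finds GROUP BY material ORDER BY n DESC",
    conn)
print(df)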


Linked Open Data

Binder Repository

Contains:

  • sparql-intro.ipynb
  • Using R to Retrieve and Visualize Data from SPARQL.ipynb

The first notebook shows how to craft SPARQL queries for the British Museum, Wikidata, and Nomisma endpoints. It uses a SPARQL kernel that also allows for graphing the data relationships visually; it also has a ‘magic’ command for writing the data to json or csv. The second notebook demonstrates how to use the sparql package for R to query an endpoint and then manipulate the results to do some simple statistics and visualization.
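By way of illustration only, the same kind of query can be made from Python with SPARQLWrapper; the notebooks themselves use a SPARQL kernel and R rather than this package, and the entity ID below is meant to be ‘archaeological site’ but should be verified before relying on it.

from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("https://query.wikidata.org/sparql", agent="odate-example/0.1")
sparql.setQuery("""
SELECT ?item ?itemLabel WHERE {
  ?item wdt:P31 wd:Q839954 .   # 'instance of: archaeological site' -- check the Q-number
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
} LIMIT 10
""")
sparql.setReturnFormat(JSON)
for row in sparql.query().convert()["results"]["bindings"]:
    print(row["itemLabel"]["value"])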


Spatial Archaeology

Binder Repository

  • linlithgow_spatial.ipynb
  • canmore_survey_shetland.ipynb
  • 1_spatialarchaeology.ipynb
  • working with remote sensing data.ipynb

These notebooks are courtesy of Dr. Rachel Opitz of the University of Glasgow, who is Lecturer in Spatial Archaeology. There are two binders that can be launched, ours and Dr. Opitz’s; consider launching Dr. Opitz’s, as she updates the work for her own teaching, or use the ODATE version, which is updated only periodically. Dr. Opitz’s version can be launched here: Binder

The Linlithgow notebook explores burial data, while the Canmore notebook explores the map of registered monuments in the Shetlands, recorded in Scotland’s Canmore database. The 1_spatialarchaeology.ipynb notebook explores data from Dr. Opitz’s team’s excavations at the ancient city of Gabii in Italy. The final notebook works with remote sensing data and explores cropmarks in hyperspectral images.


Scraping

Binder Repository

  • Extracting Data from PDFs using Tabulizer.R
  • metadigitise.R
  • Building a Scrapy Scraper.ipynb

To launch the two .R scripts, use the built-in RStudio server in this binder. This binder will take a bit of time to load up.

From the Home page for this binder, select new -> RStudio. Then open the Extracting Data from PDFs using Tabulizer.R (or the metadigitise.R) file.

Put your cursor at the first line in the script (top left window); run one line at a time.


LiDAR

Binder Repository

  • Demo using Montreal LiDAR data.ipynb
  • Avebury LiDAR.ipynb

These two notebooks show how to convert compressed .laz files into .las, and how to visualize the data therein.

The final codeblock in both notebooks creates an animated gif from the data. That final codeblock is computationally intensive; it will take some time to run. The results will be written to a new folder called ‘export’; you can open that folder by clicking on the jupyter logo at top and then clicking on the ‘export’ folder.

You will know the code is finished when the [*] at the left of the code block changes to a number.

Start with the Montreal demo notebook. It contains some code that the Avebury notebook depends on.
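The visualization step boils down to something like the following sketch, assuming laspy 2.x and matplotlib are installed and a .las file is already on disk (the filename is a placeholder); the notebooks handle the .laz conversion first.

import laspy
import numpy as np
import matplotlib.pyplot as plt

las = laspy.read("tile.las")                     # placeholder filename
x, y, z = np.asarray(las.x), np.asarray(las.y), np.asarray(las.z)

# plot every 100th point, coloured by elevation, to keep things light
plt.scatter(x[::100], y[::100], c=z[::100], s=0.5, cmap="viridis")
plt.axis("equal")
plt.colorbar(label="elevation")
plt.show()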


Agent Based Modeling

Binder Repository

Contains:

  • Schelling Segregation Model – schelling/analysis.ipynb
  • Epstein Civil Violence Model – epstein/epstein civil violence.ipynb
  • Forest Fire Model – forest_fire/forest fire model.ipynb
  • Virus on a network – virus_on_a_network/virus.ipynb

Start with the Forest Fire model – it is one of the best known introductory models in the field. As you experiment with these models, ask yourself, ‘what would it take for this to be a model of an archaeological concept?’ The ‘Virus’ model notebook is still under development.
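If you want to see the bones of the forest fire model outside a notebook, here is a library-free sketch of its core logic in plain Python; the grid size and tree density are arbitrary.

import random

size, density = 50, 0.6
# True = a tree, False = empty ground
forest = [[random.random() < density for _ in range(size)] for _ in range(size)]

# ignite every tree in the left-hand column
burning = {(r, 0) for r in range(size) if forest[r][0]}
burned = set()

# fire spreads each step to unburned trees in the four neighbouring cells
while burning:
    burned |= burning
    nxt = set()
    for r, c in burning:
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < size and 0 <= nc < size and forest[nr][nc] and (nr, nc) not in burned:
                nxt.add((nr, nc))
    burning = nxt

trees = sum(row.count(True) for row in forest)
print(f"{len(burned)} of {trees} trees burned at density {density}")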


Agent Based Modeling with Netlogo

Binder

A notebook for running Netlogo models ‘headless’ (ie, with no GUI) by specifying parameters for an experiment. These parameters are passed to a bash script and run from within the notebook. Results are written to file for further analysis in Python. A variety of likely Python data packages are also provided for import. (Others can be added by running !pip install <package> from within a notebook.)
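The idea, sketched in Python rather than the notebook’s own bash script: call NetLogo’s headless runner with a model and a BehaviorSpace experiment, then read the results table back in. Every path and name below is a placeholder, and the number of header rows to skip may need adjusting.

import subprocess
import pandas as pd

subprocess.run([
    "./netlogo-headless.sh",           # ships with the NetLogo distribution
    "--model", "Fire.nlogo",           # placeholder model
    "--experiment", "experiment1",     # a BehaviorSpace experiment defined in that model
    "--table", "results.csv",
], check=True)

# BehaviorSpace 'table' output has several metadata lines before the header row
results = pd.read_csv("results.csv", skiprows=6)
print(results.head())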


Computer Vision

Binder Repository

Contains:

  • Building an archaeological image classifier with tensorflow.ipynb

This notebook demonstrates how Transfer Learning can be applied to create a neural network model trained on archaeological imagery. A mobile version that uses the results of this notebook to create an image identification app may be followed here
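The transfer-learning idea in miniature, sketched with the tf.keras API; this is an independent illustration rather than the notebook’s own code, the directory name is a placeholder, and the Rescaling layer assumes a recent version of TensorFlow.

import tensorflow as tf

# images organized one-folder-per-class
train = tf.keras.preprocessing.image_dataset_from_directory(
    "training-images", image_size=(224, 224), batch_size=16)

# a network pretrained on ImageNet, frozen, used as a feature extractor
base = tf.keras.applications.MobileNetV2(weights="imagenet", include_top=False, pooling="avg")
base.trainable = False

# a small new classification head is trained on top of the frozen features
model = tf.keras.Sequential([
    tf.keras.layers.Rescaling(1.0 / 127.5, offset=-1),   # scale pixels to [-1, 1]
    base,
    tf.keras.layers.Dense(len(train.class_names), activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model.fit(train, epochs=5)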


Clustering Images with Tensorflow

Binder Repository

Contains:

  • find-similar-images.ipynb
  • Affinity Propogation.ipynb

These notebooks take the second-last layer of the neural network and demonstrate how to study it for clustering visually similar images.


Sonification

Binder Repository

  • Intro to Sonification.ipynb

This notebook walks through the process of mapping time-series data to musical notation, to create MIDI files that can then be turned into sound.
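The heart of that mapping can be sketched in a few lines of Python with midiutil (pip install midiutil); the numbers below are made-up stand-ins for whatever series you care about.

from midiutil import MIDIFile

counts = [3, 7, 2, 9, 12, 5, 8]          # made-up values, e.g. artefact counts per decade
lo, hi = min(counts), max(counts)
# scale the series onto two octaves starting at C3 (MIDI note 48)
pitches = [48 + int((c - lo) / (hi - lo) * 24) for c in counts]

midi = MIDIFile(1)
midi.addTempo(track=0, time=0, tempo=120)
for beat, pitch in enumerate(pitches):
    midi.addNote(track=0, channel=0, pitch=pitch, time=beat, duration=1, volume=100)

with open("sonified.mid", "wb") as out:
    midi.writeFile(out)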


Creativity

Binder [https://github.com/o-date/creativity/]

  • Glitching an image with prism sorting.ipynb
  • semantic_similarity_chatbot.ipynb

This binder can take quite a bit of time to pull together. Please be patient. There are steps in the chatbot notebook that can also take a bit of time; watch for the [*] beside a running code block to disappear before moving to the next one.


Creativity 2

Binder Repository

  • World Building – model demo.ipynb
  • History Generator – ahistorygenerator.ipynb

The first notebook generates a world by simulating topography and erosion, and then uses an agent based model to play out its history. The second is a far simpler model of state formation, fission, and fusion, but uses Tracery to generate its historical chronicle, and Graphviz to visualize it.
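To give a flavour of how Tracery produces a chronicle, here is a tiny made-up grammar using the Python port of Tracery (pip install tracery); it is not the grammar the notebook uses.

import tracery
from tracery.modifiers import base_english

rules = {
    "origin": "In the #ordinal# year, #city# #event#.",
    "ordinal": ["third", "seventh", "twelfth"],
    "city": ["Gabii", "Veii", "Ostia"],
    "event": ["rebelled against its neighbours", "founded a colony", "suffered a great fire"],
}
grammar = tracery.Grammar(rules)
grammar.add_modifiers(base_english)

# each call expands the rules into a fresh sentence of 'history'
for _ in range(3):
    print(grammar.flatten("#origin#"))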


Make a Research Compendium

Binder Repository

This repository is an experimental demonstration of how you might combine a research compendium created by rrtools with Binder, a service that creates an executable environment with RStudio in your browser.

Please read the rrtools documentation and this repository’s readme before launching this binder.

Sticking it to DCGAN

These are notes to self; a proper blog post eventually

Working with DCGAN – https://github.com/carpedm20/DCGAN-tensorflow

This is cool too, but just doesn’t work for me (final codeblock runs, no results)

->  much depends on the input data

-> more is better

-> images need to be resized; smaller than 256 x 256

conda activate env
  • convert images to 64×64
    mogrify -resize 64x64 *.jpg
  • make sure there are no grayscale images
    identify -format "%i %[colorspace]\n" *.jpg | grep -v sRGB
    
  • fix those that are:
    convert input.jpg -colorspace sRGB -type truecolor output.jpg
    
  • and fire that thing off:
    python main.py --dataset images --data_dir data --train --crop

 

Buy a damned GPU

Exploring Tensorflow for Poets, or, Building a Pottery Classifier

reblogged from our project website at bonetrade.github.io

In our recent paper, ‘Fleshing Out the Bones’, we used the trained ‘Inception v3’ model as a way of determining clusters of images that we then studied for clues and hints: why did the machine cluster them this way? What are the common features?

Unsupervised learning: It’s not unlike reading entrails.

Diagram of the bronze Liver of Piacenza, a model sheep’s liver with Etruscan inscriptions found near Piacenza (Wikipedia)

An alternate approach is to take an existing model, and add new training data to it. Pete Warden put together a tutorial a few years ago called Tensorflow for Poets that has since been formalized as a Google CodeLabs tutorial. Last night I tried the tutorial out using a corpus of Roman fabrics and wares. The nuts-and-bolts of doing this are over in the tutorials.

The hardest part was getting the training data organized. The images need to be in a folder where each image sits in a subfolder named for its category, eg:


|
|-training-images
     |
     |-terrasig
     |-african_red_slip
     |-vernice_nera

…etc. Once that was done, it went quite smoothly. What will take some time is figuring out what the different architecture and other flags do. For instance, in the default command suggested by the tutorial, I had to determine that I needed to add the flag on validation size and set it to use the entire training set (as my training set is probably way too small).

python -m scripts.retrain \
  --bottleneck_dir=tf_files/bottlenecks \
  --how_many_training_steps=500 \
  --model_dir=tf_files/models/ \
  --summaries_dir=tf_files/training_summaries/mobilenet_0.50_224 \
  --output_graph=tf_files/retrained_graph.pb \
  --output_labels=tf_files/retrained_labels.txt \
  --architecture mobilenet_0.50_224 \
  --validation_batch_size=-1 \
  --image_dir=tf_files/gallery

“The architecture flag is where we tell the retraining script which version of MobileNet we want to use. The 1.0 corresponds to the width multiplier, and can be 1.0, 0.75, 0.50 or 0.25. The 224 corresponds to image resolution, and can be 224, 192, 160 or 128. For example, to train the smallest version, you’d use --architecture mobilenet_0.25_128.” Harvey, 2017

I’m not sure I understand exactly what this all means yet, practically speaking. Anyway, I now know what I’ll need our MA research assistants to do this fall.

  1. study our first corpus’ clustering results to come up with some training categories
  2. divide that corpus into a training dataset
  3. train a new image classifier
  4. run that classifier on our new corpus (which I’m still collecting)
  5. compare that with the clustering approach without our new classifier.
  6. compare the results of both with the posts’ text

The ambition? To be able to automatically sort the images we find, at scale, into a sensible structure.
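For step 2 in that list, a first pass at shuffling images into the per-class training folders that the retraining script expects might look like the sketch below; the labels.csv file (filename,label pairs, no header row) and the folder names are hypothetical.

import csv
import shutil
from pathlib import Path

src, dst = Path("corpus"), Path("tf_files/gallery")   # placeholder paths
with open("labels.csv", newline="") as f:             # hypothetical filename,label pairs
    for filename, label in csv.reader(f):
        target = dst / label
        target.mkdir(parents=True, exist_ok=True)     # one subfolder per category
        shutil.copy(src / filename, target / filename)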

 

Featured Image by Adrien Ledoux on Unsplash

MA in History, Public History, Digital Humanities: 2 Positions

Damien Huffer and I are working on a project that elaborates on our ‘Insta-Dead‘ work, which data mined Instagram to explore the trade in human remains. We have an opportunity for two potential MA students to start September 2018 to work with us at Carleton University. At the same time, the students would pursue their own research within the ambit of this project, which revolves around the use of various AI technologies – especially, but not limited to, various kinds of neural networks. Ideally, the students’ own research projects would push the research into other domains: for instance, historical photographs, tourist photos, advertising that uses historical imagery, or digital historical consciousness.

Interested candidates are invited to contact Dr. Shawn Graham at shawn dot graham at carleton dot ca to discuss their potential research project, and to gauge its fit with the funding envelope and other possible supplementary funding sources. Candidates are also invited to review the program requirements for the MA in History and the MA in Public History, with the collaborative Digital Humanities specialization –

https://carleton.ca/history/graduate/ma-program/program-requirements/m-a-with-specialization-in-digital-humanities/

https://carleton.ca/history/m-a-in-public-history/

Criteria:
– a good first degree in a relevant subject (history, archaeology, etc)
– existing ability in digital humanities methods or issues is desirable, but not critical. Much more important is an ability to think creatively about the problems or potentials of computational viewpoints.

Activities that the students might be involved in:

+ Writing code to generate datasets
+ Developing various NN
+ Analyzing results
+ Ground-truthing training datasets (making sure that training images are properly classified)
+ Curating and preparing materials for data publication in appropriate venues
+ Research and writing of tutorials
+ Research and writing connected with their own research interests as they intersect with this project
+ Communicating the results of research with relevant publics at conferences and other venues

 

featured image by Green Chameleon on Unsplash