Jupyter Notebooks for Digital Archaeology (and History too!)

As the fall academic term approaches, and we get closer to version 1.0 of the Open Digital Archaeology Text (ODATE), I thought I would share the plethora of Jupyter Notebooks we’ve put together to support the work. (A video showing the whole ODATE project is over here on YouTube.) The text of ODATE still has some rough edges and there are parts still coming together. Indeed, it will never be finished, as it is my hope that it grows and is forked and becomes the kernel for many, many coursepacks and workshops and syllabi; more on that later when we’re closer to pulling back the official curtain.

As far as these notebooks go, more will come with time. Feel free to use these in your teaching – please let me know if you do and how it goes – and please do suggest edits (either by leaving an issue on the GitHub repo or by making edits and opening a pull request on GitHub). I would be delighted to include more that other people have built, so let me know if this interests you!

These notebooks can be downloaded and run locally if you have Jupyter installed. (You’ll need to pay attention to the requirements and postBuild files if you do that, in order to get all the bits and pieces installed. Use a virtual environment too!) If you click the ‘launch binder’ button, the notebooks will launch in an interactive environment hosted by Binder. (Once they’re up and running, you can also change the url where it says ‘tree’ to ‘lab’ to have these notebooks in a JupyterLab interface.)

Note: Run each cell in the notebook in sequence from top to bottom; use shift+enter to run the cell or hit the ‘run’ button in the notebook toolbar. Have students work through the notebooks, then make changes, modify, or expand the notebook for themselves. When the notebook is running, there is an ‘export’ option under the ‘file’ tab. Export as jupyter notebook. The resulting text file (with .ipynb extension) could be submitted for course work (and run within the relevant binder, of course).
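Because an exported .ipynb file is plain JSON text, a submitted notebook can be given a quick once-over without launching Jupyter at all. What follows is a minimal sketch, assuming the nbformat 4 layout (a top-level "cells" list whose entries carry "cell_type", "source" and, for code cells, "execution_count"). The sample notebook is made up for illustration and summarize_notebook is a hypothetical helper, not part of Jupyter.

```python
import json

# A tiny, made-up notebook in nbformat 4 layout, standing in for a submitted file.
SAMPLE_NOTEBOOK = json.dumps({
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# My changes\n"]},
        {"cell_type": "code", "metadata": {}, "execution_count": 1,
         "outputs": [], "source": ["x = 1 + 1\n", "print(x)"]},
        {"cell_type": "code", "metadata": {}, "execution_count": None,
         "outputs": [], "source": ["print('not run yet')"]},
    ],
})


def summarize_notebook(text):
    """Count cells by type and note how many code cells were never executed."""
    nb = json.loads(text)
    summary = {"markdown": 0, "code": 0, "unexecuted_code": 0}
    for cell in nb["cells"]:
        kind = cell["cell_type"]
        if kind in ("markdown", "code"):
            summary[kind] += 1
        # A code cell that was never run has no execution count.
        if kind == "code" and cell.get("execution_count") is None:
            summary["unexecuted_code"] += 1
    return summary


summary = summarize_notebook(SAMPLE_NOTEBOOK)
print(summary)
```

For a real submission, read the file's text first (for example with open(path, encoding="utf-8").read()) and pass that to the helper.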

(featured image: Brandon Green, unsplash.com)

These links will launch the notebooks using Binder. It can sometimes take a few moments for the environment to launch; be patient. Click on the ‘status’ link when launching to see the environment build.


Introduction to Jupyter Notebooks

Binder Repository

Contains:

  • Welcome.ipynb
  • demo-R.ipynb

This notebook contains everything necessary to set up a GitHub repo that can become the basis of a Binder. Consult the repository’s Readme file to see how it can be customized for your own particular usage. Fork (make a copy of) this repo as often as is necessary! Many of the exercises in the first part of ODATE require nothing more than this.


Working with APIs

Binder Repository

Contains:

  • chronicling america api.ipynb
  • open context api.ipynb
  • Open Context Measurements.ipynb
  • mapping-with-ipyleaflet.ipynb

These notebooks demonstrate progressively more complicated ways of retrieving data via an API.
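The general pattern that API work tends to build on can be sketched in a few lines: encode a search into a URL, ask for a page of JSON, and keep asking until the service says there are no more pages. This is only an illustration and not the notebooks' own code. The endpoint, the parameter names (q, page, format) and the response fields (items, next_page) are all hypothetical, and fake_fetch stands in for the network call so that the sketch runs offline.

```python
import json
from urllib.parse import urlencode, urlparse, parse_qs

BASE_URL = "https://example.org/api/search"  # hypothetical endpoint


def build_url(term, page=1):
    """Encode the search term and page number into a query string."""
    return BASE_URL + "?" + urlencode({"q": term, "page": page, "format": "json"})


def fake_fetch(url):
    """Stand-in for an HTTP GET: returns made-up JSON spread over two pages."""
    page = int(parse_qs(urlparse(url).query)["page"][0])
    pages = {
        1: {"items": [{"title": "record A"}, {"title": "record B"}], "next_page": 2},
        2: {"items": [{"title": "record C"}], "next_page": None},
    }
    return json.dumps(pages[page])


def collect_all(term, fetch=fake_fetch):
    """Request page after page until the response reports no next page."""
    records, page = [], 1
    while page is not None:
        payload = json.loads(fetch(build_url(term, page)))
        records.extend(payload["items"])
        page = payload["next_page"]
    return records


titles = [record["title"] for record in collect_all("roman pottery")]
print(titles)
```

To point this at a real service, pass in a fetch function that wraps urllib.request.urlopen(url).read().decode("utf-8"), and adjust the parameter and field names to whatever that service's documentation specifies.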


Archaeological Data into R

Binder Repository

Contains:

  • Retrieving Data from the Portable Antiquities Scheme Database.ipynb
  • archdata.ipynb

The first notebook shows how to use R to pull archaeological data from an online database. The second notebook shows how to interact with the archdata package, a collection of archaeological datasets already pulled together for usage in R.


Databases

Binder Repository

Contains:

  • intro to sql.ipynb
  • SQLite Database and R.ipynb
  • visualizing results of sql query in python.ipynb

These notebooks demonstrate how to ingest a variety of csv (or other format) files into a single SQL database. They show how to query the database, and how to push the results of the query into a dataframe for further analysis or visualization.
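The ingest-then-query workflow can be sketched with nothing but Python's standard library. This is a minimal illustration under stated assumptions: the two CSV "files" are made-up strings, the table and column names are hypothetical, and the database lives in memory rather than on disk.

```python
import csv
import io
import sqlite3

# Two small made-up CSV files, held in strings so the sketch runs anywhere.
SITES_CSV = "site_id,name\n1,North Field\n2,River Terrace\n"
FINDS_CSV = (
    "find_id,site_id,material,weight_g\n"
    "1,1,ceramic,12.5\n2,1,bone,3.0\n3,2,ceramic,20.0\n"
)

conn = sqlite3.connect(":memory:")  # use a filename instead to keep the database
conn.execute("CREATE TABLE sites (site_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE finds (find_id INTEGER PRIMARY KEY, site_id INTEGER, "
    "material TEXT, weight_g REAL)"
)


def load_csv(text, table, columns):
    """Insert every row of a CSV into the named table, matching columns by header."""
    rows = csv.DictReader(io.StringIO(text))
    placeholders = ", ".join("?" for _ in columns)
    conn.executemany(
        f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({placeholders})",
        ([row[c] for c in columns] for row in rows),
    )


load_csv(SITES_CSV, "sites", ["site_id", "name"])
load_csv(FINDS_CSV, "finds", ["find_id", "site_id", "material", "weight_g"])

# Join the two tables and total the ceramic weight per site.
query = """
    SELECT s.name, SUM(f.weight_g) AS total_g
    FROM finds AS f JOIN sites AS s ON s.site_id = f.site_id
    WHERE f.material = 'ceramic'
    GROUP BY s.name
    ORDER BY s.name
"""
results = conn.execute(query).fetchall()
print(results)
```

If pandas is installed, pandas.read_sql_query(query, conn) returns the same result as a dataframe, ready for plotting or further analysis.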


Linked Open Data

Binder Repository

Contains:

  • sparql-intro.ipynb
  • Using R to Retrieve and Visualize Data from SPARQL.ipynb

The first notebook shows how to craft SPARQL queries for the British Museum, Wikidata, and Nomisma endpoints. It uses a SPARQL kernel that also allows for visually graphing the data relationships; it also has a ‘magic’ command for writing the data to json or csv. The second notebook demonstrates how to use the SPARQL package for R to query an endpoint and then manipulate the results to do some simple statistics and visualization.
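For readers who would rather stay in plain Python, here is a rough sketch of what happens underneath a SPARQL request: the query is sent to the endpoint as a URL parameter, and the answer comes back in the standard SPARQL JSON results layout (a "head" listing the variables, and "results" holding the "bindings"). The query is deliberately the simplest one possible. The sample response is made up, with example.org identifiers, so that the sketch runs offline; bindings_to_rows is a hypothetical helper.

```python
import json
from urllib.parse import urlencode

ENDPOINT = "https://query.wikidata.org/sparql"  # Wikidata's public endpoint

# The simplest possible pattern: any three triples, whatever they are.
QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 3"

# The URL a client would request; nothing is actually sent in this sketch.
request_url = ENDPOINT + "?" + urlencode({"query": QUERY, "format": "json"})

# Made-up response, shaped like the standard SPARQL JSON results format.
SAMPLE_RESPONSE = """
{
  "head": {"vars": ["s", "p", "o"]},
  "results": {"bindings": [
    {"s": {"type": "uri", "value": "http://example.org/coin/1"},
     "p": {"type": "uri", "value": "http://example.org/hasMaterial"},
     "o": {"type": "literal", "value": "silver"}},
    {"s": {"type": "uri", "value": "http://example.org/coin/2"},
     "p": {"type": "uri", "value": "http://example.org/hasMaterial"},
     "o": {"type": "literal", "value": "bronze"}}
  ]}
}
"""


def bindings_to_rows(response_text):
    """Flatten each binding into a plain dict of variable name -> value."""
    data = json.loads(response_text)
    variables = data["head"]["vars"]
    return [
        {var: binding[var]["value"] for var in variables if var in binding}
        for binding in data["results"]["bindings"]
    ]


rows = bindings_to_rows(SAMPLE_RESPONSE)
for row in rows:
    print(row["s"], row["p"], row["o"])
```

The flattened rows are easy to write out as csv or to load into a dataframe, which is roughly the job the kernel's ‘magic’ command does for you.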


Spatial Archaeology

Binder Repository

  • linlithgow_spatial.ipynb
  • canmore_survey_shetland.ipynb
  • 1_spatialarchaeology.ipynb
  • working with remote sensing data.ipynb

These notebooks are courtesy of Dr. Rachel Opitz, Lecturer in Spatial Archaeology at the University of Glasgow. There are two binders which can be launched, ours and Dr. Opitz’s; consider launching Dr. Opitz’s, as she updates the work for her own teaching, or use the ODATE version, which is updated only periodically. Dr. Opitz’s version can be launched here: Binder

The Linlithgow notebook explores burial data, while the Canmore notebook explores the map of registered monuments in the Shetlands, recorded in Scotland’s Canmore database. The 1_spatialarchaeology.ipynb notebook explores data from Dr. Opitz’s team’s excavations at the ancient city of Gabii in Italy. The final notebook works with remote sensing data and explores cropmarks in hyperspectral images.


Scraping

Binder Repository

  • Extracting Data from PDFs using Tabulizer.R
  • metadigitise.R
  • Building a Scrapy Scraper.ipynb

To launch the two .R scripts, use the built-in RStudio server in this binder. This binder will take a bit of time to load up.

From the Home page for this binder, select new -> RStudio. Then open the Extracting Data from PDFs using Tabulizer.R (or the metadigitise.R) file.

Put your cursor at the first line in the script (top left window); run one line at a time.


LiDAR

Binder Repository

  • Demo using Montreal LiDAR data.ipynb
  • Avebury LiDAR.ipynb

These two notebooks show how to unzip .laz files into .las, and to visualize the data therein.

The final codeblock in both notebooks creates an animated gif from the data. That final codeblock is computationally intensive; it will take some time to run. The results will be written to a new folder called ‘export’; you can open that folder by clicking on the jupyter logo at top and then clicking on the ‘export’ folder.

You will know the code is finished when the [*] at the left of the code block changes to a number.

Start with the Montreal demo notebook. It contains some code that the Avebury notebook depends on.


Agent Based Modeling

Binder Repository

Contains:

  • Schelling Segregation Model – schelling/analysis.ipynb
  • Epstein Civil Violence Model – epstein/epstein civil violence.ipynb
  • Forest Fire Model – forest_fire/forest fire model.ipynb
  • Virus on a network – virus_on_a_network/virus.ipynb

Start with the Forest Fire model – it is one of the best known introductory models in the field. As you experiment with these models, ask yourself, ‘what would it take for this to be a model of an archaeological concept?’ The ‘Virus’ model notebook is still under development.
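To give a feel for how little machinery a forest fire model needs, here is a plain-Python sketch of one common formulation: trees are scattered on a grid at a given density, the left-hand column is set alight, and fire spreads to orthogonal neighbours until it runs out of fuel. This is not the notebook's code (which may use a modelling library and different rules); the function name, grid size and defaults are my own assumptions.

```python
import random


def run_forest_fire(width=20, height=20, density=0.6, seed=1):
    """Burn a random forest from its left edge; return counts of final tree states."""
    rng = random.Random(seed)
    # Each cell is empty (None) or holds a tree that is "fine", "burning" or "burned".
    grid = [
        ["fine" if rng.random() < density else None for _ in range(width)]
        for _ in range(height)
    ]
    total_trees = sum(cell is not None for row in grid for cell in row)

    # Ignite every tree in the leftmost column.
    for row in grid:
        if row[0] == "fine":
            row[0] = "burning"

    while any(cell == "burning" for row in grid for cell in row):
        # Snapshot the burning trees so newly lit ones wait until the next step.
        burning = [(y, x) for y in range(height) for x in range(width)
                   if grid[y][x] == "burning"]
        for y, x in burning:
            # Fire spreads to the four orthogonal neighbours, then the tree burns out.
            for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                if 0 <= ny < height and 0 <= nx < width and grid[ny][nx] == "fine":
                    grid[ny][nx] = "burning"
            grid[y][x] = "burned"

    burned = sum(cell == "burned" for row in grid for cell in row)
    return {"trees": total_trees, "burned": burned, "fine": total_trees - burned}


for density in (0.3, 0.6, 0.9):
    print(density, run_forest_fire(density=density))
```

Sweeping the density parameter and watching how the burned share changes is the classic first experiment. Swapping ‘tree’ and ‘fire’ for something archaeological is one way into the question posed above.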


Agent Based Modeling with Netlogo

Binder

A notebook for running Netlogo models ‘headless’ (i.e., with no GUI) by specifying parameters for an experiment. These parameters are passed to a bash script and run from within the notebook. Results are written to file for further analysis in Python. A variety of likely Python data packages are also provided for import. (Others can be added by running !pip install <package> from within a notebook.)
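As a rough idea of what a headless run looks like from Python, the sketch below assembles a command line for NetLogo's netlogo-headless.sh launcher using its BehaviorSpace options (--model, --experiment, --table). It only builds and prints the command; nothing is executed. The launcher location, model path, experiment name and output path are all hypothetical, and the binder's own bash script may well be organised differently.

```python
import shlex


def headless_command(model, experiment, table_out,
                     launcher="./netlogo-headless.sh"):
    """Build the argument list for a headless BehaviorSpace run (nothing is executed)."""
    return [
        launcher,
        "--model", model,            # the .nlogo file to load
        "--experiment", experiment,  # name of an experiment saved in that model
        "--table", table_out,        # where to write the results table (csv)
    ]


# Hypothetical paths and experiment name, for illustration only.
cmd = headless_command(
    "models/fire.nlogo", "density-sweep", "results/density-sweep.csv"
)
print(shlex.join(cmd))
```

Actually running it would be a matter of subprocess.run(cmd, check=True), after which the results table can be opened like any other csv file.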


Computer Vision

Binder Repository

Contains:

  • Building an archaeological image classifier with tensorflow.ipynb

This notebook demonstrates how Transfer Learning can be applied to create a neural network model trained on archaeological imagery. A mobile version that uses the results of this notebook to create an image identification app may be followed here.


Clustering Images with Tensorflow

Binder Repository

Contains:

  • find-similar-images.ipynb
  • Affinity Propogation.pynb

These notebooks take the second-last layer of the neural network and demonstrate how to study it for clustering visually similar images.
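Once every image has been boiled down to a feature vector (the values coming out of that second-last layer), ‘visually similar’ becomes ‘vectors pointing in much the same direction’. Here is a small sketch of that idea using cosine similarity. The vectors are made up and only four numbers long (real ones run to hundreds or thousands of values), the file names are hypothetical, and the notebooks themselves may use a different measure or library.

```python
import math

# Made-up feature vectors; in practice these come out of the neural network.
FEATURES = {
    "amphora_01.jpg": [0.9, 0.1, 0.0, 0.2],
    "amphora_02.jpg": [0.8, 0.2, 0.1, 0.3],
    "coin_01.jpg": [0.1, 0.9, 0.7, 0.0],
    "coin_02.jpg": [0.0, 0.8, 0.9, 0.1],
}


def cosine_similarity(a, b):
    """Return 1.0 for vectors pointing the same way, 0.0 for orthogonal ones."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(name, features, top_n=1):
    """Rank every other image by similarity to the named one; return the best matches."""
    scores = [
        (cosine_similarity(features[name], vector), other)
        for other, vector in features.items()
        if other != name
    ]
    scores.sort(reverse=True)
    return [other for _, other in scores[:top_n]]


for image in FEATURES:
    print(image, "->", most_similar(image, FEATURES))
```

A clustering algorithm such as affinity propagation starts from this same kind of pairwise similarity and groups the images without being told how many groups to expect.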


Sonification

Binder Repository

  • Intro to Sonification.ipynb

This notebook walks through the process of mapping time-series data to musical notation, to create MIDI (.mid) files that can then be turned into sound.
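The heart of the mapping step can be sketched as a simple rescaling: the smallest value in the series becomes the lowest note, the largest becomes the highest, and everything else falls in between. MIDI numbers notes from 0 to 127, with 60 as middle C. The function name, the made-up counts and the chosen note range (48 to 84, from one octave below middle C to two octaves above it) are my own assumptions, and the notebook may map its data differently.

```python
def to_midi_notes(values, low_note=48, high_note=84):
    """Linearly rescale a series of numbers onto a range of MIDI note numbers."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A flat series has no shape to map; hold one note in the middle of the range.
        return [(low_note + high_note) // 2] * len(values)
    span = high_note - low_note
    return [low_note + round((v - lo) / (hi - lo) * span) for v in values]


# Made-up counts per period, standing in for a real time series.
counts = [2, 5, 9, 4, 14, 8]
notes = to_midi_notes(counts)
print(notes)
```

Writing the notes out to a .mid file needs a MIDI library on top of this; the choice of scale, note length and instrument is where the interpretive decisions come in.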


Creativity

Binder [https://github.com/o-date/creativity/]

  • Glitching an image with prism sorting.ipynb
  • semantic_similarity_chatbot.ipynb

This binder can take quite a bit of time to pull together. Please be patient. There are steps in the chatbot notebook that can also take a bit of time; wait for the [*] beside a running code block to change to a number before moving to the next one.


Creativity 2

Binder Repository

  • World Building – model demo.ipynb
  • History Generator – ahistorygenerator.ipynb

The first notebook generates a world by simulating topography and erosion, and then using an agent based model to play out its history. The second is a far simpler model of state formation, fission, and fusion, but it uses Tracery to generate its historical chronicle and graphviz to visualize it.
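Tracery-style text generation rests on a simple idea: a grammar maps each symbol to a list of possible expansions, and any #symbol# inside an expansion is itself expanded in turn. The sketch below is a tiny stand-in for that idea written in plain Python, not the Tracery library itself; the grammar, the place names and the expand function are all invented for illustration.

```python
import random
import re

# A made-up grammar: each symbol maps to a list of possible expansions.
GRAMMAR = {
    "origin": ["In the year #year#, the #polity# of #place# #event#."],
    "year": ["112", "340", "587"],
    "polity": ["kingdom", "republic", "league"],
    "place": ["Ostara", "Velm", "Kharu"],
    "event": ["split in two", "absorbed its neighbour", "collapsed"],
}


def expand(symbol, grammar, rng):
    """Pick one rule for the symbol and recursively fill in every #reference# in it."""
    template = rng.choice(grammar[symbol])
    return re.sub(
        r"#(\w+)#",
        lambda match: expand(match.group(1), grammar, rng),
        template,
    )


rng = random.Random(42)  # seeded so the same chronicle comes out each time
chronicle = [expand("origin", GRAMMAR, rng) for _ in range(3)]
for line in chronicle:
    print(line)
```

In a history generator, the events of the simulation would choose which symbol to expand, so that the chronicle narrates what the model actually did rather than picking events at random.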


Make a Research Compendium

Binder Repository

This repository is an experimental demonstration of how you might combine a research compendium created by rrtools with Binder, a service that creates an executable environment with RStudio in your browser.

Please read the rrtools documentation and this repository’s readme before launching this binder.