Gaze & Eonydis for Archaeological Data

I’m experimenting with Clement Levallois’ data mining tools ‘Gaze’ and ‘Eonydis’. I created a table with some mock archaeological data in it: artefact, findspot, and date range for the artefact. More on dates in a moment. Here’s the fake dataset.

Firstly, Gaze will take a list of nodes (source, target), and create a network where the source nodes are connected to each other by virtue of sharing a common target. Clement explains:

Paul,dog
Paul, hamster
Paul,cat
Gerald,cat
Gerald,dog
Marie,horse
Donald,squirrel
Donald,cat
… In this case, it is interesting to get a network made of Paul, Gerald, Marie and Donald (sources nodes), showing how similar they are in terms of pets they own. Make sure you do this by choosing “directed networks” in the parameters of Gaze. A related option for directed networks: you can choose a minimum number of times Paul should appear as a source to be included in the computations (useful to filter out unfrequent, irrelevant nodes: because you want only owners with many pets to appear for instance).

The output is in a nodes.dl file and an edges.dl file. In Gephi, go to the import spreadsheet button on the data table, import the nodes file first, then the edges file. Here’s the graph file.
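To make the idea concrete, here is a minimal sketch in Python of the kind of projection described above, using Clement's pet-owner example. It is not Gaze's own code: the function name `shared_target_edges` is made up for illustration, and weighting an edge by the raw number of shared targets is an assumption (Gaze may well use a different similarity measure).

```python
from collections import defaultdict
from itertools import combinations

# Clement's pet-owner example as (source, target) pairs.
pairs = [
    ("Paul", "dog"), ("Paul", "hamster"), ("Paul", "cat"),
    ("Gerald", "cat"), ("Gerald", "dog"),
    ("Marie", "horse"),
    ("Donald", "squirrel"), ("Donald", "cat"),
]

def shared_target_edges(pairs, min_source_count=1):
    """Link two sources whenever they share at least one target.

    The edge weight is the number of shared targets. That is an
    illustrative choice, not necessarily Gaze's similarity measure.
    """
    targets_by_source = defaultdict(set)
    for source, target in pairs:
        targets_by_source[source.strip()].add(target.strip())
    # Rough stand-in for the 'minimum number of times as a source' option:
    # keep only sources with at least this many distinct targets.
    kept = {s: t for s, t in targets_by_source.items()
            if len(t) >= min_source_count}
    edges = {}
    for a, b in combinations(sorted(kept), 2):
        shared = kept[a] & kept[b]
        if shared:
            edges[(a, b)] = len(shared)
    return edges

edges = shared_target_edges(pairs)
for (a, b), weight in sorted(edges.items()):
    print(a, b, weight)
```

In this sketch Marie ends up with no edges at all, since nobody else owns a horse, while Gerald and Paul get the strongest tie because they share two kinds of pet.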

Screenshot, Gaze output into Gephi, from mock archaeo-data


Eonydis on the other hand takes that same list and if it has time-stamps within it (a column with dates), will create a dynamic network over time. My mock dataset above seems to cause Eonydis to crash – is it my negative numbers? How do you encode dates from the Bronze Age in the day/month/year system? Checking the documentation, I see that I didn’t have proper field labels, so I needed to fix that. Trying again, it still crashed. I fiddled with the dates to remove the range (leaving a column to imply ‘earliest known date for this sort of thing’), which gave me this file.

Which still crashed. Now I have to go do some other stuff, so I’ll leave this here and perhaps one of you can pick up where I’ve left off. The example file that comes with Eonydis works fine, so I guess when I return to this I’ll carefully compare the two. Then the task will be to work out how to visualize dynamic networks in Gephi. Clement has a very good tutorial on this.

Postscript:

Ok, so I kept plugging away at it. I found if I put the dates yyyy-mm-dd, as in 1066-01-23 then Eonydis worked a treat. Here’s the mock data and here’s the gexf.

And here’s the dynamic animation! http://screencast.com/t/Nlf06OSEkuA

Post post script:

I took the mock data (archaeo-test4.csv) and concatenated a – in front of the dates, thus -1023-01-01 to represent dates BC. In Eonydis, where it asks for the date format, I tried this:

#yyyy#mm#dd  which accepted the dates, but dropped the negative;

-yyyy#mm#dd, which accepted the dates and also dropped the negative.

Thus, it seems to me that I can still use Eonydis for archaeological data, but I should frame my date column in relative terms rather than absolute, as absolute isn’t really necessary for the network analysis/visualization anyway.
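One way to reframe the dates in relative terms is sketched below in Python. It assumes the dates are available as signed years (negative for BC) and shifts them so that the earliest becomes year 0001 in the yyyy-mm-dd form that worked above. The function name `to_relative_dates` and its parameters are hypothetical, and I have not checked how Eonydis handles very small year values; the `start_year` argument is there so the sequence can begin at, say, 1000 instead.

```python
def to_relative_dates(years, start_year=1, month=1, day=1):
    """Map signed years (negative = BC) onto positive yyyy-mm-dd strings.

    The earliest year becomes start_year and every other year keeps its
    distance from it, so the order and spacing of events are preserved.
    """
    earliest = min(years)
    return [
        f"{year - earliest + start_year:04d}-{month:02d}-{day:02d}"
        for year in years
    ]

# Mock 'earliest known date' values, in years BC written as negatives.
bc_years = [-1023, -950, -1023, -800]
print(to_relative_dates(bc_years))
print(to_relative_dates(bc_years, start_year=1000))
```

Note that this treats the years as plain integers. If a dataset mixes BC and AD dates, the missing year zero would put the spacing out by one year across that boundary.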

Hollis Peirce, George Garth Graham Research Fellow

Hollis Peirce on Twitter: https://twitter.com/HollPeirce


I am pleased to announce that the first George Garth Graham Undergraduate Digital History Research Fellow will be Mr. Hollis Peirce.

Hollis is a remarkable fellow. He attended the Digital Humanities Summer Institute at the University of Victoria in the summer of 2012. At DHSI he successfully completed a course called “Digitization Fundamentals and Their Application”. In the fall semester of 2012 he was the impetus behind, and helped to organize,  THATCamp Accessibility on the subject of the impact of digital history on accessibility in every sense of the word.

Hollis writes,

Life for me has been riddled with challenges.  The majority of them coming on account of the fact that I, Hollis Peirce, am living life as a disabled individual with Congenital Muscular Dystrophy as many things are not accessible to me.  However, I have never let this fact hold me back from accomplishing my goals.  Because of this, when I first started studying history I knew I was not choosing an easy subject for a disabled individual such as myself.  All those old, heavy, books on high library shelves that history is known for made it one of the most inaccessible subjects possible to study.  All that changed however, when I discovered digital history.

It was thanks to a new mandatory class for history majors at Carleton University called The Historian’s Craft taught by a professor named Dr Shawn Graham.  This course was aimed at teaching students all about how to become a historian, and how a historian is evolving through technology.  At that moment the idea for ‘Accessibility & Digital History’ came to mind.  From that point on many steps have been taken to advance my studies in this field, which has led to being selected as the first George Garth Graham Undergraduate Digital History Research Fellow.

Hollis and I have had our first meeting, about what his project might entail. When I initially cooked this idea up, I thought it would allow students the opportunity to work on my projects, or those of my colleagues around the university. As we chatted about Hollis’ ideas (and as I batted around some of my own stuff), I realized that I had the directionality of this relationship completely backwards.

It’s not that Hollis gets to work on my projects. It’s that I get to work on his.

Here’s what we came up with.

At THATCamp Accessibility, we recorded every session. We bounced around the idea of transcribing those sessions, but realized that that was not really feasible for us. We started talking about zeroing in on certain segments, to tell a history of the future of an accessible digital humanities… and ideas started to fizz. I showed Hollis some of Jentery Sayers’ stuff, especially his work with Scalar.

Jentery writes,

the platform particularly facilitates work with visual materials and dynamic media (such as video and audio)….it enables writers to assemble content from multiple sources and juxtapose them with their own compositions.

Can we use Scalar to tell the story of THATCamp Accessibility that captures the spontaneity, creativity, and excitement of that day in a way that highlights the issues of accessibility that Hollis  wants to explore? And if we can, how can we make it accessible for others (screenreaders, text-to-speech, etc?) And if we focus on telling history with an eye to accessibility (oh, how our metaphors privilege certain senses, ways of knowing!) maybe there will be lessons for telling history, full stop?

Stay tuned! Hollis is setting up his blog this week, but he’ll be posting over at http://hollispeirce.grahamresearchfellow.org/

Networked Corpus Index

I found this today: http://www.networkedcorpus.com  for visualizing the results of MALLET topic models, by reference to exemplary passages. Its creators explicitly contrast this with index building by hand… more about all that later; here’s how to get it working for you.

See the documentation at https://github.com/jeffbinder/networkedcorpus
Install python 2.7, numpy, and scipy. Download the installers, and follow the instructions.

Then download the networked corpus zip file, and unzip it somewhere on your machine.

In the networked-corpus folder, there is a subfolder called ‘res’ and a script called gen-networked-corpus.py. Move these two items to your Mallet folder.

Generate a topic model as you would normally do, from the command line.

Then, type at the command prompt:

gen-networked-corpus.py --input-dir <the folder with your original texts in it> --output-dir <the name of a folder you'd like all the output to be in>

Navigate to that folder in your browser, open the index.html and voila.

 

 

Exploring Trends in Archaeology: Professional, Public, and Media Discourses

The following is a piece by Joe Aitken, a student in my CLCV3202a Roman Archaeology for Historians class at Carleton University. His slides may be found here. I asked Joe if I could share his work with the wider world, because I thought it an interesting example of using simple text analysis to explore broader trends in public archaeology. Happily, he said yes.

Exploring Trends in Archaeology: Professional, Public, and Media Discourses

An immense shift in content and terminology emerges when analysing the text of several documents relating to the archaeology of Colchester, as information grows from its genesis as an archaeological report, through the stage of public archaeology, and finally to mass media. Many inconsistencies emerge as the form in which archaeological information is presented changes.

This analysis was done with the help of Voyant Tools, “a web-based text analysis environment.”[1] Z-score, representing the number of standard deviations above the mean at which each term appears, will be used as the basic marker of frequency. Skew, “A measure of the asymmetry of relative frequency values for each document in the corpus,”[2] will also be used. Having a skew close to zero suggests that the term appears with relative consistency throughout the documents. This means that in comparison to, for example, “piggery,” with a skew of 11, terms with a low skew are not only frequent in the corpus as a whole, but are prevalent in many of the documents that make up the corpus.
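For readers who want to see what these two measures amount to, here is a small Python sketch. It uses the textbook definitions (z-score as standard deviations from the mean count, skew as the third standardized moment of a term's per-document relative frequencies); whether Voyant Tools computes them in exactly this way is an assumption, and the counts and frequencies below are invented purely for illustration.

```python
from statistics import mean, pstdev

def z_scores(term_counts):
    """Return {term: z-score} for raw term counts across a whole corpus."""
    values = list(term_counts.values())
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return {term: 0.0 for term in term_counts}
    return {term: (count - mu) / sigma for term, count in term_counts.items()}

def skewness(relative_freqs):
    """Third standardized moment of one term's per-document frequencies."""
    mu, sigma = mean(relative_freqs), pstdev(relative_freqs)
    if sigma == 0:
        return 0.0  # perfectly even across documents: no asymmetry
    return mean([((x - mu) / sigma) ** 3 for x in relative_freqs])

# Invented corpus-wide counts for three terms.
counts = {"site": 90, "pottery": 50, "piggery": 10}
z = z_scores(counts)

# Invented per-document relative frequencies for two terms.
even_term = [0.01, 0.01, 0.01, 0.01]    # appears steadily in every document
bursty_term = [0.0, 0.0, 0.0, 0.04]     # appears heavily in one document only

print(z)
print(skewness(even_term), skewness(bursty_term))
```

The steady term comes out with a skew of zero, while the term concentrated in a single document has a clearly positive skew, which is the contrast the analysis below relies on.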

A text analysis of Colchester Archaeological Trust Reports 585-743 (February 2011 to 22nd October 2013)[3] is the basis of this comparison. Dominant in this corpus are terms related to archaeological excavations. The term “report” has a z-score of 8.69, “finds” has a z-score of 6.43, and “site” has a z-score of 8.81. The same terms, respectively, have skews of 0.93, 0, and 0.88. Another relatively consistent term is “pottery,” which has a skew of 1 and a z-score of 5.26. “Brick”, with a skew of 2.17 and a z-score of 3.1, is similarly consistent.

The relevance of these figures becomes clearer upon a comparison with the public archaeological writings as they appear on the Colchester Archaeologist blog. The blog sits on the public-facing website of the Colchester Archaeological Trust, which has been blogging about its archaeological discoveries since 2011. This analysis will use the Voyant Tools difference function, which returns a value based on a comparison between the z-scores of two corpora,[4] as well as a direct comparison of the z-score and skew of each term between the two corpora.

Some of the most consistent terms from the archaeological corpus appear very infrequently in the public archaeology. “Pottery” has a skew of 9.49 and a z-score of 0.25, and appears at about 1/5 of the frequency it does in the reports. “Brick” similarly disappears: in the public archaeology, it has a skew of 9.56 and a z-score of -0.02, compared to a skew of 2.17 and a z-score of 3.1 in the archaeological reports.

Terms relating to the excavation also disappear. “Finds,” which in the archaeological reports has a skew of 0 and a z-score of 6.43, has a skew of 4.94 and a z-score of 0.42 in the public archaeology. “Report” similarly changes from a skew of 0.93 to 9.87, with its z-score dropping from 8.69 to -0.06. “Site” follows this trend to a lesser extent, although this is likely due to it appearing in the public archaeology in the context of “website,” rather than as an archaeological term. Still, the shifts in z-score and skew are significant, and in the same direction: an archaeological z-score of 8.81 to a public z-score of 3.83, and an archaeological skew of 0.88 to a public skew of 1.28. In each case, these commonly used terms from the archaeological reports appeared less frequently and less consistently in the blog.

On the other hand, some terms are much more common in the public archaeology. Compared to the corpus of archaeological reports, the public archaeology texts contain the term “circus” at 5 times the frequency. In the blog, “circus” has a z-score of 5.77 and a relatively stable skew of 1.79, compared to a minimal z-score of 0.69 and a volatile skew of 6.3 in the archaeological reports. A similar change occurs to the term “burial,” although to a lesser extent: from report to blog, the z-score rises from 0.25 to 0.86, and the skew drops from 3.84 to 3.65.

Terms with a high skew and a non-insignificant z-score in the archaeological reports seem to be the most prevalent terms altogether in the public archaeology, while terms with a skew closer to zero in the reports disappear in the public archaeology: that is, the terms that appear infrequently but in large numbers in the reports are the ones selected for representation in the blog. This emphasises rare and exciting discoveries, such as the circus and large burials, while ignoring the more regular and consistent discoveries of pottery and bricks. For terms with high skew, there is a consistent rise in z-score and drop in skew in the incidences of the term between the archaeological and public corpora. For terms with a skew closer to zero, there is a consistent decline in z-score. The two trends that terms follow with regards to their relative frequencies between the two corpora can be defined as follows: low-skew terms, which tend to disappear, and significant-z-score/high skew terms, which tend to be emphasised in the public archaeology.

Archaeology in the media seems to follow the public archaeology rather than the archaeological reports in most respects. The media corpus contains articles about the archaeology of Colchester from sources ranging from local to national media, including the BBC, the Colchester Daily Gazette, the Essex County Standard, and the Independent, in addition to international archaeological publications. In these articles, “circus” has a low skew of 1.51, although its z-score, at 1.64, isn’t as overwhelmingly high as it is in the public archaeology. Still, it is much greater than the z-score of 0.69 for “circus” in the reports, and this z-score most likely reflects a greater lexical variety rather than a focus on other aspects of the archaeology, as this is the fifth-highest z-score in the entire media corpus. Still, there is less emphasis on the circus here than in the blog.

In common between the public and media corpora is their near complete removal of non-Roman archaeological terminology. The term “medieval” appears 1555 times in the archaeological corpus, with a z-score of 3.42 and a skew of 2.64. In the public corpus, the same term appears twice, with a z-score of -0.09 and a skew of 10.30. In the selection of news about the archaeology of Colchester, the term never appears. This follows the same trends of selection as the public archaeology: “medieval,” a low-skew term in the archaeological corpus, is ignored in favour of high-skew terms.

Although the media and public corpora contain writings about the same discoveries and use similar language, the frequency at which they do so differs. The media, unlike the blog, is unlikely to repeatedly write about the circus even when no new information is available. Rather, each media outlet seems to be inspired by the archaeological reports, but takes its information from the public archaeology. That is, instead of repeating the public archaeology, the media takes inspiration from the actual archaeological discovery, but takes its information about this archaeology from the blog rather than directly from the report.

Altogether, archaeological writing about Colchester appears to become much narrower over time. While the archaeological reports presumably reflect what is found accurately, the public archaeology and, in turn, the media do not. Instead, they focus on more marketable and exciting aspects of the archaeology: these can be recognized as the high-skew/high-z-score terms in the analysis. As a result, the particulars of the excavation, as well as the majority of findings, are de-emphasised; these are the low-skew terms. By the stage of public presentation, only a very narrow view of the archaeology of Colchester has been presented. It is almost exclusively monumental and Roman, and is at odds with the multiplicity of archaeological findings that are seen in the reports.

Corpora

Archaeological Reports: http://voyant-tools.org/?corpus=1385952648533.7651

Public Archaeology: http://voyant-tools.org/?corpus=1385952090402.1310

Archaeology in Media: http://voyant-tools.org/?corpus=1385743429982.2427

Academic Archaeology: http://voyant-tools.org/?corpus=1385756548766.8274

All reports, blog posts, articles, papers, corpora, and a list of stopwords used are available at: https://www.dropbox.com/sh/kdj0ez8mwep0c7e/ZKViQxSG99.

Professional Bibliography

“Colchester Archaeological Trust – Online Report Library.” CAT Reports 585-743. http://cat.essex.ac.uk/all-reports.html

Public Bibliography

“News | The Colchester Archaeologist.” All posts since 2013-11-30. http://www.thecolchesterarchaeologist.co.uk/?cat=11

Media Bibliography

Anonymous. “Colchester dig uncovers ‘spearmen’ skeletons.” BBC, 18 April 2011.

—-. “Colchester Roman circus visitor centre a step closer.” BBC, 14 May 2012.

—-. “Roman ruins to go on display as part of new restaurant.” Essex County Standard, 31 December 2012.

—-. “Colchester archaeology shares in £250,000 funding boost.” Daily Gazette, 27 March 2013.

—-. “2,000-Year-Old Warrior Grave & Spears Unearthed.” Archaeology, 18 September 2013.

Brading, Wendy. “Roman history all set to be revealed.” Daily Gazette, 19 June 2012.

—-. “Excavations to find out Colchester life – Roman style.” Daily Gazette, 11 October 2012.

—-. “Experts discover new Roman graves.” Daily Gazette, 16 January 2013.

—-. “Warrior grave found in excavation.” Essex County Standard, 16 September 2013.

Calnan, James. “Archaeologists discover 900-year-old-abbey.” Daily Gazette, 22 February 2011.

—-. “Uncovered: The remains of two Roman soldiers.” Daily Gazette, 14 April 2011.

—-. “Colchester Archaeological Trust unearths English Civil War star fort.” Daily Gazette, 26 August 2011.

—-. “Roman Circus site may open next summer.” Daily Gazette, 16 December 2011.

Cox, James. “Roman road found beneath the Southwell Arms.” Daily Gazette, 30 July 2012.


[1] “Voyeur Tools: See Through Your Texts,” http://hermeneuti.ca/voyeur

[2] Voyant Tools, mouseover text.

[3] “Colchester Archaeological Trust,” http://cat.essex.ac.uk/all-reports.html.

[4] Brian Croxall, “Comparing Corpora in Voyant Tools.” http://www.briancroxall.net/2012/07/18/comparing-corpora-in-voyant-tools/.

Visualizing texts using Overview

I’ve come across an interesting tool called ‘Overview’. It’s meant for journalists, but I see no reason why it can’t serve historical/archaeological ends as well. It does recursive adaptive k-means clustering, not topic modeling as I’d initially assumed (more on process here). You can upload texts as pdfs or within a table. One of the columns in your table could be a ‘tags’ column, whereby, for example, you indicate the year in which the entry was made (if you’re working with a diary). Then, Overview sorts your documents or entries into nested folders of similarity. You can then see how your tags (decades, say) play out across similar documents. In the screenshot below, I’ve fed the text of ca 600 historical plaques into Overview:

At the broadest level of similarity, Overview divides the historical plaques into the following groups:

‘church, school, building, toronto, canada, street, first, house, canadian, college’ (545 plaques)

‘road, john_graves, humber, graves_simcoe, lake, river, trail, plant’ (41 plaques)

‘community’ with ‘italian, north_york,  lansing, store, shepard, dempsey, sheppard_avenue’, 13 documents

‘: years’ with ‘years_ago, glacier, ice, temperance, transported, found, clay, excavation’, 11 documents.

That’s interesting information to know. In terms of getting the info back out, you can export a spreadsheet with tags attached. Within Overview, you might want to tag all documents together that sort into similar groupings, which you could then visualize with some other program. You can also search documents, and tag them manually. I wondered how plaques concerned with ‘children’, ‘women’, ‘agriculture’, ‘industry’, etc might play out, so I started using Overview’s automatic tagger (search for a word or phrase, apply that word or phrase as a tag to everything that is found). One could then visually explore the way various tags correspond with particular folders of similar documents (as in this example). That first broad group of ‘church school building canada toronto first york house street canadian’ just is too darned big, and so my tagging is hidden (see the image)- but it does give you a sense that the historical plaques in Toronto really are concerned with the first church, school, building, house, etc in Toronto (formerly, York). Architectural history trumps all. It would be interesting to know if these plaques are older than the other ones: has the interest in spaces/places of history shifted over time from buildings to people? Hmmm. I’d better check my topic models, and do some close reading.

Anyway, leaving that aside for now, I exported my tagged texts, and did a quick and dirty network visualization of tags connected to other tags by virtue of shared plaques. I only did this for 200 of the plaques, because, frankly, it’s Friday evening and I’d like to go home.

Here’s what I saw [pdf version]:

visualizing-tags-via-overview

So a cluster with ‘elderly’, ‘industry’, ‘doctor’, ‘medical’, ‘woman’…. I don’t think this visualization that I did was particularly useful.

Probably, it would be better to generate tags that collect everything together in the groups that the tree visualization in Overview generates, export that, and visualize as some kind of dendrogram. It would be good if the groupings could be exported without having to do that though.
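For anyone wanting to repeat the tag-to-tag network described above, here is a rough Python sketch of how tags can be linked by virtue of shared documents. The spreadsheet layout is an assumption (a ‘tags’ column holding comma-separated tags); the real export may be laid out differently. The sample rows are invented, and the Source/Target/Weight header is simply what I would use for a Gephi edge list.

```python
import csv
import io
from collections import Counter
from itertools import combinations

# Stand-in for an exported spreadsheet. The column names and the
# comma-separated 'tags' field are assumptions, and the rows are invented.
EXPORT = """id,text,tags
1,"First church in York","church,women"
2,"Garment factory","industry,women"
3,"Hospital founded by women","medical,women,industry"
"""

def tag_cooccurrence(rows, tag_column="tags", sep=","):
    """Count how many documents each pair of tags shares."""
    weights = Counter()
    for row in rows:
        tags = sorted({t.strip() for t in row[tag_column].split(sep) if t.strip()})
        weights.update(combinations(tags, 2))
    return weights

weights = tag_cooccurrence(csv.DictReader(io.StringIO(EXPORT)))

# Write a weighted edge list that a network program could import.
out = io.StringIO()
writer = csv.writer(out)
writer.writerow(["Source", "Target", "Weight"])
for (a, b), w in sorted(weights.items()):
    writer.writerow([a, b, w])
print(out.getvalue())
```

Sorting the tags before pairing them means each pair is always stored in the same order, so ‘industry’ and ‘women’ are counted as one edge however they were entered.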

Getting Historical Network Data into Gephi

I’m running a workshop next week on getting started with networks & Gephi. Below, please find my first pass at a largely self-directed tutorial. This may eventually get incorporated into the Macroscope.

Data files for this tutorial may be found here. There’s a pdf/pptx with the images below, too.

The data for this exercise comes from Peter Holdsworth’s MA dissertation research, which Peter shared on Figshare here. Peter was interested in the social networks surrounding ideas of commemoration of the centenary of the War of 1812, in 1912. He studied the membership rolls for women’s service organizations in Ontario both before and after that centenary. By making his data public, Peter enables others to build upon his own research in a way not commonly done in history. (Peter can be followed on Twitter at https://twitter.com/P_W_Holdsworth).

On with the show!

Download and install Gephi. (What follows assumes Gephi 0.8.2). You will need the MultiMode Projection plugin installed.

To install the plugin – select Tools >> Plugins  (across the top of Gephi you’ll see ‘File Workspace View Tools Window Plugins Help’. Don’t click on this ‘plugins’. You need to hit ‘tools’ first. Some images would be helpful, eh?).

In the popup, under ‘available plugins’ look for ‘MultimodeNetworksTransformation’. Tick this box, then click on Install. Follow the instructions, ignore any warnings, click on ‘finish’. You may or may not need to restart Gephi to get the plugin running. If you suddenly see on the far right of the Gephi window a new tab beside ‘statistics’ and ‘filters’, called ‘Multimode Network’, then you’re ok.


Getting the Plugin

Assuming you’ve now got that sorted out,

1. Under ‘file’, select -> New project.
2. On the data laboratory tab, select ‘Import spreadsheet’, and in the pop-up, make sure to select ‘Edges table’ under ‘As table’. Select women-orgs.csv. Click ‘next’, then click ‘finish’.

(On the data table, have ‘edges’ selected. This is showing you the source and the target for each link (aka ‘edge’). This implies a directionality to the relationship that we just don’t know – so down below, when we get to statistics, we will always have to make sure to tell Gephi that we want the network treated as ‘undirected’. More on that below.)


Loading your csv file, step 1.


Loading your CSV file, step 2

3. Click on ‘copy data to other column’. Select ‘Id’. In the pop-up, select ‘Label’
4. Just as you did in step 2, now import NODES (Women-names.csv)

(nb. You can always add more attribute data to your network this way, as long as you always use a column called Id so that Gephi knows where to slot the new information. Make sure never to tick the box labeled ‘force nodes to be created as new ones’.)

Adding new columns


5. Copy ID to Label
6. Add new column, make it boolean. Call it ‘organization’

Filtering & ticking off the boxes


7. In the Filter box, type [a-z], and select Id – this filters out all the women.
8. Tick off the check boxes in the ‘organization’ columns.

Save this as ‘women-organizations-2-mode.gephi’.

Now, we want to explore how women are connected to other women via shared membership.

Setting up the transformation.


Make sure you have the Multimode networks projection plugin installed.

On the multimode networks projection tab,
1. click load attributes.
2. in ‘attribute type’, select organization
4. in left matrix, select ‘false – true’ (or ‘null – true’)
5. in right matrix, select ‘true – false’. (or ‘true – null’)
(do you see why this is the case? what would selecting the inverse accomplish?)

6. select ‘remove edges’ and ‘remove nodes’.

7. Hit ‘run’. Organizations will be removed from your bipartite network, leaving you with a single-mode network.

8. save as ‘women to women network.csv’

…you can reload your ‘women-organizations-2-mode.gephi’ file and re-run the multimode networks projection so that you are left with an organization to organization network.

! if your data table is blank, your filter might still be active. make sure the filter box is clear. You should be left with a list of women.

9. You can add the ‘women-years.csv’ table to your gephi file, to add the number of organizations the woman was active in, by year, as an attribute. You can then begin to filter your graph’s attributes…

10. Let’s filter by the year 1902. Under filters, select ‘attributes – equal’ and then drag ‘1902’ to the queries box.
11. in ‘pattern’ enter [0-9] and tick the ‘use regex’ box.
12. click ok, click ‘filter’.

You should now have a network with 188 nodes and 8728 edges, showing the women who were active in 1902.

Let’s learn something about this network. On statistics,
13. Run ‘avg. path length’ by clicking on ‘run’
14. In the pop up that opens, select ‘undirected’ (as we know nothing about directionality in this network).
15. click ok.

16. run ‘modularity’ to look for subgroups. make sure ‘randomize’ and ‘use weights’ are selected. Leave ‘resolution’ at 1.0

Let’s visualize what we’ve just learned.

17. On the ‘partition’ tab, over on the left hand side of the ‘overview’ screen, click on nodes, then click the green arrows beside ‘choose a partition parameter’.
18. Click on ‘choose a partition parameter’. Scroll down to modularity class. The different groups will be listed, with their colours and their % composition of the network.
19. Hit ‘apply’ to recolour your network graph.

20. Let’s resize the nodes to show off betweenness centrality (to figure out which woman was in the greatest position to influence flows of information in this network). Click ‘ranking’.
21. Click ‘nodes’.
22. Click the down arrow on ‘choose a rank parameter’. Select ‘betweenness centrality’.
23. Click the red diamond. This will resize the nodes according to their ‘betweenness centrality’.
24. Click ‘apply’.

Now, down at the bottom of the middle panel, you can click the large black ‘T’ to display labels. Do so. Click the black letter ‘A’ and select ‘node size’.

Mrs. Mary Elliot-Murray-Kynynmound and Mrs. John Henry Wilson should now dominate your network. Who were they? What organizations were they members of? Who were they connected to? To the archives!
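If you are curious what betweenness centrality is actually measuring, here is a small, self-contained Python sketch of Brandes' algorithm for an unweighted, undirected network. It is offered as an illustration only: the five-node network and its node names are invented, the function name is my own, and Gephi's reported values may be scaled differently.

```python
from collections import deque

def betweenness_centrality(adjacency):
    """Brandes' algorithm for an unweighted, undirected graph.

    adjacency maps each node to the set of its neighbours. Returns
    unnormalised scores, counting each unordered pair of nodes once.
    """
    scores = {node: 0.0 for node in adjacency}
    for source in adjacency:
        # Breadth-first search from source, counting shortest paths.
        order = []
        predecessors = {node: [] for node in adjacency}
        path_count = {node: 0 for node in adjacency}
        distance = {node: -1 for node in adjacency}
        path_count[source] = 1
        distance[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adjacency[v]:
                if distance[w] < 0:
                    distance[w] = distance[v] + 1
                    queue.append(w)
                if distance[w] == distance[v] + 1:
                    path_count[w] += path_count[v]
                    predecessors[w].append(v)
        # Walk back from the farthest nodes, accumulating dependencies.
        dependency = {node: 0.0 for node in adjacency}
        for w in reversed(order):
            for v in predecessors[w]:
                dependency[v] += path_count[v] / path_count[w] * (1 + dependency[w])
            if w != source:
                scores[w] += dependency[w]
    # Each undirected path was found from both ends, so halve the totals.
    return {node: score / 2 for node, score in scores.items()}

# Invented toy network: A - B - C, with D and E both hanging off C.
toy = {
    "A": {"B"},
    "B": {"A", "C"},
    "C": {"B", "D", "E"},
    "D": {"C"},
    "E": {"C"},
}
centrality = betweenness_centrality(toy)
print(centrality)
```

In the toy network, C sits on the shortest path between most pairs of nodes, so it scores highest; the nodes on the edge of the network score zero.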

Congratulations! You’ve imported historical network data into Gephi, manipulated it, and run some analyses. Play with the settings on ‘preview’ in order to share your visualization as svg, pdf, or png.

Now go back to your original gephi file, and recast it as organizations to organizations via shared members, to figure out which organizations were key in early 20th century Ontario…
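The projection used in this tutorial can also be thought of as matrix multiplication, which may help with the question about the left and right matrices. The Python sketch below is a minimal illustration under that assumption, not a description of the plugin's internals; the women, organizations, and memberships in it are invented.

```python
def transpose(matrix):
    """Swap rows and columns of a list-of-lists matrix."""
    return [list(column) for column in zip(*matrix)]

def multiply(left, right):
    """Plain matrix product of two lists of lists."""
    right_columns = transpose(right)
    return [
        [sum(a * b for a, b in zip(row, col)) for col in right_columns]
        for row in left
    ]

women = ["Woman A", "Woman B", "Woman C"]   # hypothetical names
organizations = ["Org X", "Org Y"]          # hypothetical names

# membership[i][j] is 1 when women[i] belongs to organizations[j].
membership = [
    [1, 1],
    [1, 0],
    [0, 1],
]

# (women x orgs) times (orgs x women) gives a women-to-women matrix.
women_to_women = multiply(membership, transpose(membership))
# Swapping the order gives an organization-to-organization matrix instead.
orgs_to_orgs = multiply(transpose(membership), membership)

print(women_to_women)
print(orgs_to_orgs)
```

Off the diagonal, each cell counts shared memberships (or shared members), which is the edge weight in the single-mode network. The diagonal just counts how many organizations a woman belongs to, or how many members an organization has.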

The George Garth Graham Undergraduate Digital History Research Fellowship is Go!

Over the Thanksgiving weekend, the FutureFunder campaign in honour of my grandfather, to create an undergraduate research fellowship in digital history, achieved its funding goal.

I wanted to thank everyone who contributed. Whether that contribution was through donations, through sharing on social media, or through sending me emails making ‘hey, you should really talk to [person x]…’ connections, this could not have happened without the support and buy-in of the DH community, my colleagues, and the alumni of the History department. Kylie, Pia, and Ryan in University Advancement were also tremendous supporters, helping garner national media attention, making connections, and coming up with novel ideas about how to promote it further.

Thank you, all!

Now begins the *really* fun part! Over the coming weeks, I’ll be working with University Advancement, our department’s undergraduate committee, and digitally-inclined folks hither and yon to set up the formal parameters for awarding the fellowship. One of the conditions of the fellowship would be for the student to maintain an active research blog, where she or he would detail their work, their reflections, their explorations and experiments. It would become the locus for managing their digital online identity as a scholar. I think I will recommend that the student use Reclaim Hosting to do this, since RH’s whole raison d’etre and my own sense of what students need to be doing online mesh very well (and see this post on the future for RH!)

We already have great, digitally-inclined undergraduate students in the history department here. Rob is a HASTAC Scholar. Hollis organized a THATCamp. Oliver is getting into data mining. These are just the three easiest for me to link to. Others like Devin and Joe have done fantastic things using Voyant Tools. Matthew, Julia, and Zack have gone into networks in a big way. Allison has been developing expertise with Omeka.

I’m excited to see what’ll happen next. Thank you, everyone, for supporting this Fellowship!

A thousand worlds: sci-fi networks in archaeology

A Guest Post by Tom Brughmans, PhD Student, University of Southampton

Here is a common plot in sci-fi literature and movies (based on a popular physics model): the world you know is but one in an endless range of parallel universes, where each one is slightly different. Who would ever have thought this would be a good starting point for archaeological discussions? Yet the meeting in Durham I recently attended showed that parallel universes might have more in common with archaeology than we think.

I was invited by Rune Rattenborg to join a workshop in Durham called ‘A Thousand Worlds: Network Models in Archaeology’. This concept of a thousand worlds can be interpreted in an archaeological research context in different ways. On the one hand, and most similar to the sci-fi parallel universes plot, you could think about the many different reconstructions of past realities that could all explain a single archaeological pattern. Literally thousands of hypotheses could be raised to explain a certain pattern, each of them suggesting different mechanisms driving human behaviour and ultimately its expression in the archaeological record. On the other hand, you could think about the many academic perspectives archaeologists find useful for understanding the past. Perspectives ranging from highly quantitative (you can place me in that camp) to very qualitative, from local to global, from scientific to philosophical, and from an explicitly present-day perspective to attempting to recreate past perspectives. Each one of these is a valid way of thinking about past human behaviour and behavioural change (or rather every configuration or combination of these perspectives).

Both of these interpretations motivated Rune to title his workshop ‘A Thousand Worlds’. He noticed that archaeologists interested in questions of past connectivity and those of us using network perspectives often address the challenges we are faced with in very different ways. The only common ground of most network perspectives seems to be that the relationships between entities are considered crucial to understanding the behaviour of these entities. For example, the romantic relationship between two individuals will affect the decision to stay in and watch a Hugh Grant romantic comedy or to go out for a beer with the guys. But Rune also noticed that each perspective allows for a wide variety of reconstructions of past realities. These two issues seem to confuse archaeologists who might be interested in using such a network perspective in their archaeological research. I totally agree with Rune’s motivation to create some order in this chaos. The main questions of this workshop therefore were: what different network perspectives are out there? What rules govern them? What do they allow us to do that we could not do before? And what are their limitations?

To some extent the meeting was successful in addressing these questions. A number of very different perspectives were discussed by selected proponents: I introduced an extremely formal network science approach, which was discussed rather more pragmatically by Anna Collar; Michelle de Gruchy highlighted some interesting challenges in a geographical context; another group of presenters (Kristoffer Damgaard, Eivind Heldaas Seland, Sofie Laurine Albris, Rune Rattenborg) used the concept of connectivity and explored how it could be reflected in archaeological and literary sources. Finally Ronan O’Donnell introduced the actor-network theory (ANT) perspective through a fascinating case study on a post-Medieval landscape in Northumberland, UK, from which the strong difference between the aims of the ANT and network science research perspectives became particularly clear.

Nevertheless, by the end of the meeting it became clear that we were not entirely successful in addressing the many questions we set out to answer. Eivind Heldaas Seland skilfully summarised each paper and formulated three key questions that require more attention: how can these different perspectives and approaches usefully work together? What is the added value of some of these compared to a more traditional description of our sources? How can we better use these perspectives in the future? The fact that we were unsuccessful at addressing these questions shows how complex and non-trivial they are (and we also ran out of discussion time). But for what it’s worth, I take this opportunity to share some of my thoughts on these questions, combined with some of the points I picked up from others during the discussion.

First of all, I believe the first question presents the false impression that the different network perspectives can and need to work together. I would argue that many network perspectives do not need to work together, and most of them do not work well together at all. This is because some of them (like ANT and network science) are designed to address very different questions. But even those approaches that have more in common, like the quantitative vs. qualitative use of network science, don’t necessarily need to be combined into an almighty network approach. There is no need for a great unifying theory or method in archaeology, not even for one that just focuses on questions of connectivity. Rather, I consider the different network perspectives as tools that function according to certain rules, and once these rules are known the tools have a potential to make small but crucial contributions to our knowledge of the past. I believe that if we are to ever achieve the full potential of these exciting new approaches for archaeology we will need to first critically explore them in isolation.

Secondly, the added value of these perspectives is more obvious than how they should be applied. Many in the audience seemed to agree that the concept of the network itself is a powerful tool to think with. It forces us to consider the potentially important role played by relationships between entities (however defined: humans, molecules, parallel universes), which might allow us to ask and answer new questions. For me the added value lies in the recognition that all archaeologists make assumptions about the nature of such relationships when they formulate hypotheses about past phenomena. It can be useful to think about these assumptions in terms of network concepts and, most importantly, there is a real need to be critically aware of their existence and formulate them clearly. Network science can help archaeologists to think about their assumptions of past relationships, to formally express them (in words and/or in numbers), and to evaluate their implications for past behavioural change and its reflection in the archaeological record.

Finally, the “better use” of such approaches and perspectives is not optional; it is necessary if they are ever to become useful within an archaeological research context. However, a critical use and application is not just a critical awareness of the rules that govern them. Rather, an equal if not larger effort should be afforded to the archaeological interpretation of network science results, or the differences in the interpretative process that a networks perspective implies. I believe none of the scholars that attended the Durham meeting would disagree with that. The studies they presented could be roughly divided into two groups: those that THINK through networks and those that DO networks. I believe the former is more important than the latter, because there can be no doing without thinking. Although this sounds like an obvious statement, it is worth emphasising because the use of quantitative network analysis is too often treated like a “black box” approach, which it is not. Every network science study in archaeology, no matter how quantitative, aims to better understand (aspects of) past phenomena. When doing so, the scholar formulates a hypothesis, expresses their assumptions about past relationships and their roles, or at least clearly defines what they mean by the network concepts they use. Only after this phase of network thinking can a scholar proceed to network doing, which involves representing hypotheses/assumptions/the archaeological record as network data (points and lines, and what they mean). The ability to use advanced quantitative tools should not be an excuse for the post-hoc imposition of a theoretical framework that fits the results nicely; nor should the appeal of using fashionable network concepts lead to reluctance to formally express what is meant by them and to evaluate their implications for understanding past phenomena.
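To make the step from network thinking to network doing a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the records are invented, the function name `cooccurrence_edges` is hypothetical, and the rule that two findspots are tied when they share an artefact type is exactly the kind of assumption about past relationships that needs to be stated before any analysis, not a method presented at the workshop.

```python
from collections import defaultdict
from itertools import combinations

# Invented (artefact type, findspot) records, for illustration only.
records = [
    ("fibula_A", "site_1"),
    ("fibula_A", "site_2"),
    ("fibula_A", "site_3"),
    ("amphora_B", "site_2"),
    ("amphora_B", "site_3"),
]


def cooccurrence_edges(records):
    """Turn (artefact, findspot) pairs into weighted findspot-findspot edges.

    Assumption: two findspots are linked when they share an artefact type,
    and the edge weight counts how many types they share.
    """
    # Group the findspots at which each artefact type occurs.
    sites_by_artefact = defaultdict(set)
    for artefact, site in records:
        sites_by_artefact[artefact].add(site)

    # Every pair of findspots sharing a type gains one unit of edge weight.
    weights = defaultdict(int)
    for sites in sites_by_artefact.values():
        for a, b in combinations(sorted(sites), 2):
            weights[(a, b)] += 1
    return dict(weights)


edges = cooccurrence_edges(records)
for (a, b), weight in sorted(edges.items()):
    print(a, b, weight)
```

The points are findspots and the lines are shared artefact types. Changing what a line means (for instance, requiring two shared types, or overlapping date ranges) would produce a different network from the same records, which is why the definition has to come first.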

Even though none of the three key questions about the role of the networks perspective in archaeology can be conclusively answered at this time, I felt that its future is nevertheless bright. The diversity of possible approaches and perspectives is encouraging and will lead to critical research that promises to help archaeologists better evaluate what approach is useful for their studies of past connectivity, and what is not. Some of these approaches might require multi-disciplinary collaboration, especially the more scary and maths-heavy techniques in the network science toolbox. But archaeologists should never be tempted to outsource the network thinking part of the process. Critical knowledge of the archaeological literature and data leads to an awareness of the relevant research questions, and the same knowledge will lead to valuable interpretations of analytical results and research processes. There might be a thousand pasts out there, and there might be a thousand ways of reaching them, but this quest will always need to be undertaken by archaeologists.

Selected relevant publications:
Brandes, U., Robins, G., McCranie, A., & Wasserman, S. 2013. What is network science? Network Science 1(01): p.1–15.

Brughmans, T. 2013. Thinking through networks: A Review of Formal Network Methods in Archaeology. Journal of Archaeological Method and Theory.

Knappett, C. 2011. An archaeology of interaction. Network perspectives on material culture and society. Oxford – New York: Oxford University Press.

About the author:
Tom Brughmans is currently finishing a PhD in archaeology as a member of the Archaeological Computing Research Group at the University of Southampton and at the University of Leuven. His main research interest is to explore the potential of network science for the archaeological discipline. Tom blogs at archaeologicalnetworks.wordpress.com
Twitter: @tombrughmans

Personal Learning Repository in Omeka.net; exhibit building assignment

In my HIST2809, Historian’s Craft this term, I’ve been asking for students to maintain a repository of their learning using Omeka.net. Every time we do an assignment or an exercise, that work is meant to go into their repository. The final exercise in the course is to build an exhibit of their learning progress. Here is the assignment prompt; I thought folks might be interested.

Omeka is not just for storing items. It is also for exhibiting them. Exhibits are built around the idea that you are telling a story with these items. You will have collected many different items over the course of this term.

Your exhibit should tell the story of your learning in HIST2809.

Help in building the exhibit is available here. An example of an exhibit built by a student at Carleton is ‘Black History in Canada’.

Your exhibit should:

  • be built around at least five items, including your assignment 1 & 2 originals
  • incorporate (either copy or link to) 2 items from SOMEBODY ELSE’S Omeka repository, providing citation to the original item location (see the forum at the top of cuLearn for the URLs to other people’s repositories)
  • link outwards to at least three other sites or sources (eg an item in a library catalogue, public zotero page, Wordle, Voyant Tools corpus, existing online exhibit, photo gallery… etc).

The point of this exercise is:

  1. To learn how to make an exhibit in Omeka, which is an industry standard in cultural heritage circles.
  2. To see how the assumptions built into the platform constrain or enable various kinds of storytelling. ALL digital resources have assumptions about how the world should work built into them, from Google to JSTOR to the Digital Public Library of America. Working inside Omeka.net gives you a glimpse of how these things work from the creator, rather than consumer, side.
  3. To learn how to analyze digital sources as we would any other source.

WHAT YOU WILL SUBMIT:

1. A 500 – 1000 word reflection that analyzes your exhibit under PAPER headings, with the URL to your exhibit, with a final section discussing your process in building the exhibit.

We will be grading this document, not the exhibit itself.

So: have a title page with your name on it, your exhibit title, and the direct URL to that exhibit.

Then, for the reflection/analysis, discuss your exhibit AS IF you were considering it as a primary resource, explicitly using the PAPER headings.

You will tell us about

  • your purpose (obviously, you want to tell us about the evolution of your learning, but you might have other goals, too, that are expressed through careful use of colour, or … ),
  • your argument (the way you arrange things, force particular paths through the material, or…),
  • presuppositions (your worldview as it pertains to the role/value of digital work, perhaps; you might feel that this is a waste of time, or you might love playing and learning with digital tools; or you might be ok with digital but see them as mere tools whereas someone else might think of them more like paint & clay, as things to create with: how does that affect what you’ve done or reflect within it?),
  • epistemology (what has been chosen? what has been left out? why? to what end?)
  • and of course, related…

(The ‘R’ part might be the hardest: read, cite, and consider your exhibition in the light of this article http://dare.uva.nl/document/215092 Jose van Dijck, Search Engines and the production of academic knowledge. International Journal of Cultural Studies,13(6):574–592.)

  • INCLUDE a final section that tells us about the problems/potentials you experienced in building this assignment. In what ways does Omeka lean towards particular kinds of stories or paths through material? Does this matter?

A rubric will be provided. The balance of points will be towards your reflection/analysis, rather than the aesthetics of your exhibit.

-> You could have a bare-bones, ugly exhibit: that would be perfectly ok. We’re not grading on aesthetics. But aesthetics do make a difference for the visitor to your site: an analysis of a bare-bones, ugly exhibit would need to reflect on what that design choice does, for the visitor, in terms of your purpose, argument…

I certainly want you to be thinking especially carefully about the ways ‘argument’, ‘epistemology’, and ‘related’ are reflected in your exhibit.