Formal social network analysis (and theories of evolving networks) is a powerful way of analyzing the relationships within archaeological materials. I’ve done a wee bit of this in my thesis work, and in subsequent agent modeling simulations. In my thesis work, I extracted the social networks by hand, comparing shared names on brick stamps to develop a picture of the evolving social worlds around land exploitation. My agent modeling work automates this a bit more, exporting the resulting social networks as a text file that can then be analyzed in Ucinet or similar. A recent paper discusses how social network data was extracted automatically from literary texts (from Mining the Humanities):
This year’s conference of the Association for Computational Linguistics, the most prestigious event in computational linguistics, had a paper that got me very excited. It’s called Extracting Social Networks from Literary Fiction [pdf], and here’s the abstract (emphasis added):
We present a method for extracting social networks from literature, namely, nineteenth-century British novels and serials. We derive the networks from dialogue interactions, and thus our method depends on the ability to determine when two characters are in conversation. Our approach involves character name chunking, quoted speech attribution and conversation detection given the set of quotes. We extract features from the social networks and examine their correlation with one another, as well as with metadata such as the novel’s setting. Our results provide evidence that the majority of novels in this time period do not fit two characterizations provided by literary scholars. Instead, our results suggest an alternative explanation for differences in social networks.
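The paper’s actual pipeline (name chunking, quote attribution, conversation detection) is well beyond a blog snippet, but the final step is easy to picture. As a toy sketch only, assuming quotes have already been attributed to speakers, you could link successive distinct speakers as conversational partners; the names and data here are invented for illustration:

```python
from collections import Counter

# Toy attributed quotes, in text order. Speakers are invented; a real
# system would produce this list via quote-attribution, per the paper.
quotes = ["Elizabeth", "Darcy", "Elizabeth", "Jane", "Elizabeth", "Darcy"]

def conversation_edges(speakers):
    """Link successive distinct speakers as conversational partners,
    weighting each pair by how often they alternate."""
    pairs = Counter()
    for a, b in zip(speakers, speakers[1:]):
        if a != b:
            pairs[tuple(sorted((a, b)))] += 1
    return pairs

net = conversation_edges(quotes)
# net is a weighted edge list: pairs of characters with interaction counts
```

The result is a weighted character network of exactly the kind the abstract describes, ready for the feature extraction the authors go on to do.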
So here’s the big thought – this method (see the post for the details) could usefully be applied to any digital corpus of materials. I’m imagining for instance Google’s newspaper archive search. My algorithm would
1. Search the archive for events at a particular place
2. Extract the people featuring in the articles
3. Search again for those people
4. Extract the places featuring in *those* articles
5. Repeat 1-4 n times until dead ends are reached
6. Stitch together the social network of places connected by actors (or actors connected by places).
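The steps above amount to an alternating crawl: places yield people, people yield new places, and so on until nothing new turns up. Here’s a minimal sketch of that loop, with a stand-in `search` function and made-up articles in place of a real archive API:

```python
from collections import defaultdict

# Stand-in for an archive query; a real version would call a newspaper
# archive API. Both the articles and the names are hypothetical.
ARTICLES = [
    {"places": {"Rome"}, "people": {"Livia", "Marcus"}},
    {"places": {"Ostia"}, "people": {"Marcus"}},
    {"places": {"Ostia", "Rome"}, "people": {"Gaius"}},
]

def search(field, value):
    """Return articles mentioning `value` in `field` ('places' or 'people')."""
    return [a for a in ARTICLES if value in a[field]]

def crawl(seed_place, max_rounds=4):
    """Alternate place -> people -> places, collecting place-person edges
    (steps 1-5), until no new places appear or max_rounds is hit."""
    edges = set()
    places, people = {seed_place}, set()
    for _ in range(max_rounds):
        new_people = set()
        for place in places:                      # step 1: search by place
            for art in search("places", place):
                for person in art["people"]:      # step 2: extract people
                    edges.add((place, person))
                    new_people.add(person)
        new_places = set()
        for person in new_people - people:        # step 3: search by person
            for art in search("people", person):
                new_places.update(art["places"])  # step 4: extract places
        people |= new_people
        grown = new_places - places
        places |= new_places
        if not grown:                             # step 5: dead end reached
            break
    return edges                                  # step 6 stitches from these

net = crawl("Rome")
```

The returned edge set is the bipartite place-person network; step 6 is then a matter of projecting it one way or the other.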
…for example. Alternatively, something like the Corpus Inscriptionum Latinarum could also be usefully processed this way…?
What could you do with the resulting network? You could map it; you could imagine various flows through it; you could explore how resilient the network is; you could see how it changes over time… indeed, most of this could be programmed so that once you point the algorithm at your corpus, it spits out a story automatically: procedural history, where the rhetoric of the argument is generated by the process.
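To make the resilience idea concrete: project the place-person edges onto a network of places connected by shared actors, then knock out a node and see whether the network falls apart. A small stdlib-only sketch, with hypothetical edges of the kind the crawl might yield:

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical place-person edges, as a crawl over an archive might yield.
edges = [("Rome", "Livia"), ("Rome", "Marcus"),
         ("Ostia", "Marcus"), ("Portus", "Gaius"), ("Ostia", "Gaius")]

def project_places(edges):
    """One-mode projection: link places that share at least one actor."""
    by_person = defaultdict(set)
    for place, person in edges:
        by_person[person].add(place)
    adj = defaultdict(set)
    for places in by_person.values():
        for a, b in combinations(sorted(places), 2):
            adj[a].add(b)
            adj[b].add(a)
    return adj

def components(adj, removed=frozenset()):
    """Count connected components after deleting `removed` nodes (BFS),
    a crude resilience measure: more pieces = more fragile network."""
    seen, count = set(removed), 0
    for start in adj:
        if start in seen:
            continue
        count += 1
        stack = [start]
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            stack.extend(adj[node] - seen)
    return count

g = project_places(edges)
```

In this toy data, removing the broker place splits the network in two, which is exactly the kind of observation a procedural history could narrate automatically.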
Edit: As Aditi points out below, that is in fact rather what her New York Times Visual Explorer does, a fascinating tool! Below is the image of the search using it for ‘the 3 places most mentioned with the 4 years most mentioned with archeology’ (nb, ‘archaeology’ provides a different result, and only seems to appear in the NYT since about 1999.)
Hi Shawn, thanks for linking to my blog.
The newspaper article social-network extraction algorithm you mention sounds like a previous project of mine, the New York Times visual explorer. You can see it here:
http://bebop.berkeley.edu/nytimes
More information at http://data.nytimes.com and http://developer.nytimes.com
-Aditi
Hi Aditi!
And there I was thinking I was all original :) Thank you for clarifying that… In truth I don’t think I had heard of it, but I think the technique is v. cool and I look forward to exploring with it!
-Shawn