Reading Inscriptions Algorithmically

Inscriptions are complicated beasts. Although they are frequently quite small and incomplete, epigraphers are able to extract an enormous amount of information from them, especially when they have other inscriptions with which to compare and contrast. Let us look at the inscriptions from Aphrodisias, which are published online following EpiDoc conventions. Because of this, we are able to do some data-mining on them with a minimum of pre-processing.

(Joyce Reynolds, Charlotte Roueché, Gabriel Bodard, Inscriptions of Aphrodisias (2007), available <>, ISBN 978-1-897747-19-3.)

The first one looks like this, when the xml tags are stripped away:

Creative Commons licence Attribution 2.5 (
All reuse or distribution of this work must contain somewhere a link back to the URL
Originally published in Reynolds (1982).
English French German Ancient Greek Transliterated Greek Modern Greek Italian Latin Spanish Turkish 2007-07-04cmrDONE2007-04-02Charlotte Tupmanhand tidiedGBhand tidied 2007-03-15Elliott HallBatch converted Word2XML

Description of MonumentUpper right corner of a white marble block (0.36 x 0.24 x 0.34).
Description of TextInscribed on one face.
LettersLate Republican or Augustan; ave 0.02. rho in ll. 1, 3, 6 has a very small stroke slanting rightwards from the junction of the bowl with the vertical.
Date Late Republican or Augustan (lettering, content)
Edition οὗτος ὁ τόπος ἱερὸς ἄσυλος ὡς ἔκριναν ὁ μέγας Καῖσαρ ὁ δικτάτωρ καὶ ὁ υἱὸς αὐτοῦ αὐτοκράτωρ Καῖσαρ καὶ ἡ σύνκλητος καὶ ὁ δῆμος ὁ Ῥωμαίων καθὼς καὶ τὰ φιλάνθρωπα καὶ δελτογραφήματα καὶ ἐπικρίματα περιέχει ἀνέστησεν δὲ τοὺς ὅρους Γάϊος Ἰούλιος Ζωΐλος ὁ ἱερεὺς τῆς Ἀφροδείτης
Apparatus For the supplements, compare the partner inscription 1.38.
Translation [?This area is] the sacred asylum [?as defined by] the great [?Caesar, the] Dictator, and [?his son] Imperator [Caesar and the ] Senate [and People] of Rome, [as is also contained in the] grants of privilege, the public documents [and decrees. C. Iulius Zoilos priest of Aphrodite set up the boundary stones.]
Commentary See , 159-160.
Locations Stray find. Temple/Church temenos. Museum (1977)
Text Constituted From Transcription (Reynolds)
History of Recording Recorded by the NYU expedition in 1963 (63.596)
Bibliography Published by Reynolds, , doc. 35, whence SEG 1982.1097, BE 1983.388, 1984.878, McCabe 379, R. R. R. Smith, (Mainz, 1993) T5.


Face (1977)

There’s a lot of meta information that goes along with a single inscription, above and beyond its transcription and translation, all of which is necessary to understand its possible significance. I don’t think there’s a better illustration of what ‘close reading’ might mean in archaeology than the epigrapher’s art.

What might we spot if we look at a corpus of inscriptions from a macro level? What patterns might exist? Is there something going on related to geography? Researcher? Language of the inscription? Publication history? Dating? This is where the algorithms of topic modeling might be useful. My go-to tool for this is MALLET. MALLET allows one to strip out all of the xml tags (see MALLET’s help from the command line for import-dir), so I can download the xml files as a zip from the Inscriptions of Aphrodisias site and begin exploring for patterns. I also optimize the interval when I train the topic model, to shake out the utility of the resulting ‘topics’. I began by modeling 50 topics.
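For anyone who would rather inspect the stripped text before handing it to MALLET, the tag-stripping step can be approximated in a few lines of Python. This is only a sketch of the idea, not what MALLET does internally; the sample markup is a hypothetical, heavily simplified stand-in for an EpiDoc (TEI XML) file.

```python
import xml.etree.ElementTree as ET


def strip_tags(xml_text: str) -> str:
    """Return the text content of an XML document with every tag removed."""
    root = ET.fromstring(xml_text)
    # itertext() yields the text and tail strings of every element in order.
    words = " ".join(root.itertext()).split()
    # Collapse runs of whitespace so the result is one clean line.
    return " ".join(words)


# Hypothetical, simplified sample; real EpiDoc files are far richer.
sample = (
    '<TEI xmlns="http://www.tei-c.org/ns/1.0"><text><body>'
    '<div type="description"><p>Upper right corner of a white marble block</p></div>'
    '<div type="edition"><ab><lb n="1"/>οὗτος ὁ τόπος</ab></div>'
    '</body></text></TEI>'
)
plain = strip_tags(sample)
print(plain)
```

Note that this keeps everything, metadata included, which is exactly why the publication boilerplate ends up dominating the strongest topics.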

You can download the MALLET file and results here, to play with and explore for yourself.

When I look at the results (inscriptionkeys.txt), the ‘strongest’ topics all relate to metadata regarding their online publication (the top 3). The next few clearly relate to the researchers who are behind the Inscriptions of Aphrodisias website, so they are not overly useful for me here. The next few seem to be a mixture of findspot information and publishing history:

topic6 0.34603 unpublished fragment reynolds face museum version lettering inscription born digital joyce unknown marble expedition white centuries nyu inscribed stray
topic32 0.23776 face upper moulding left side lettering part lower expedition museum white marble nyu broken aphrodisias asia corner inscribed front
topic39 0.15711 south walls east face west block part wall gate expedition findspot city stretch tupman mama lettering depth measurable marble
topic7 0.14493 mama gaudin published reinach mccabe cormack kubitschek squeeze notebook phi expedition records originally aphrodisias reichel publications recorded charlotte representations
topic43 0.13213 mccabe published originally bodard gabriel rouech aphrodisias phi description findspot reported subsequently charlotte unknown preliminary inscription tidied publication funerary

The remaining topics all deal explicitly with the inscriptions themselves, their texts and their findspots (it seems).

topic47 0.06106 son honoured honours people council claudius priest diogenes family tiberius man high cl public gerousia lived virtue life zenon
topic8 0.05807 roman family wife names father aphrodisian case daughter suggests citizenship early century reference possibly menodotos clear named civic late
topic38 0.05137 son zenon adrastos attalos dionysios athenagoras artemidoros apollonios hypsikles aphrodite diogenes daughter early tupman menestheus cf goddess grandson sons
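Ranking topics by ‘strength’ just means sorting the keys file on its weight column. A minimal sketch, assuming each line has the layout quoted above (a topic label, a weight, then the key words, separated by whitespace); MALLET’s raw output may be laid out slightly differently, and the sample lines are truncated copies of the ones shown in this post.

```python
def parse_keys(lines):
    """Parse 'label weight word word ...' lines and sort by weight, strongest first."""
    topics = []
    for line in lines:
        parts = line.split()
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        label, weight, words = parts[0], float(parts[1]), parts[2:]
        topics.append((label, weight, words))
    return sorted(topics, key=lambda t: t[1], reverse=True)


# Truncated lines from the listings above, deliberately out of order.
sample = [
    "topic47 0.06106 son honoured honours people council",
    "topic6 0.34603 unpublished fragment reynolds face museum",
    "topic39 0.15711 south walls east face west",
]
ranked = parse_keys(sample)
strongest = [label for label, _, _ in ranked]
print(strongest)
```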

Groupings in Inscriptions of Aphrodisias

Every file is composed of all these different topics, to differing degrees. I would like to visualize the paths of these discourses through the corpus, so I transform the inscriptioncomp.txt file so that I retain at least nine-tenths of each document’s composition (in practice, this means cutting and pasting the inscriptioncomp.txt file so that I end up with a single list of source-document, target-topic, and weight). I also filtered out the strongest topics described above (5, 6, 7, 9, 16, 17, 29, 39, 43).
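The cutting and pasting can also be scripted. The sketch below is one possible reading of that step, not a record of what I actually did by hand: it takes a document’s topic proportions, keeps the heaviest topics until 90% of the composition is covered, drops the excluded metadata topics, and writes a source, target, weight list. The proportions are invented for illustration, the order of the two filters is an assumption, and so are the column names in the header row.

```python
import csv
import io

# Topics filtered out in the post as publication metadata.
EXCLUDED = {5, 6, 7, 9, 16, 17, 29, 39, 43}


def edges_for_document(doc, topic_weights, coverage=0.9, excluded=EXCLUDED):
    """Keep the heaviest topics until `coverage` of the document is accounted for."""
    edges = []
    covered = 0.0
    for topic, weight in sorted(topic_weights, key=lambda tw: tw[1], reverse=True):
        if covered >= coverage:
            break
        covered += weight  # excluded topics still count towards coverage (assumption)
        if topic not in excluded:
            edges.append((doc, f"topic{topic}", weight))
    return edges


def write_edge_list(edges, handle):
    """Write edges as CSV; the header names are assumed, not taken from the post."""
    writer = csv.writer(handle)
    writer.writerow(["Source", "Target", "Weight"])
    writer.writerows(edges)


# Invented proportions for a single document, for illustration only.
composition = [(0, 0.5), (6, 0.25), (37, 0.125), (10, 0.0625), (1, 0.0625)]
edges = edges_for_document("iAph050118", composition)
buffer = io.StringIO()
write_edge_list(edges, buffer)
print(buffer.getvalue())
```

In practice the same function would be run over every row of the composition file and the results concatenated into one list.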

I imported this list into Gephi, and set about trying to find groupings of topics and inscriptions, based on the shared patterns (and the weighting) of relationships. I coloured it by group (modularity) and resized nodes based on ‘betweenness’. What does betweenness mean here? I think it means the principal ideas (the discourse) that tie this entire collection together. In this case, topic 0:

statue base honours shaft ll moulding set city sbi council feature honoured capital aurelius prosopography top moulded ligatures antonius

followed closely by 1 and 37:

topic1 sarcophagus funerary inscription front lid standard necropolis aurelius forms buried tomb east formula elements aur rim burial end line

topic37 city face village house inscribed recording edition wall block text transliterated unknown large line greek area lettering viii marble
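To make ‘betweenness’ a little more concrete: a node scores highly when many shortest paths between other nodes run through it. The sketch below uses Brandes’ algorithm on a toy inscription-topic graph; it ignores edge weights and is not Gephi’s implementation, and the node names (i1 to i4, topic0, topic1) are hypothetical.

```python
from collections import deque


def betweenness(adj):
    """Unnormalised betweenness for an undirected, unweighted graph (Brandes' algorithm)."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        stack = []
        preds = {v: [] for v in adj}   # predecessors on shortest paths from s
        sigma = {v: 0 for v in adj}    # number of shortest paths from s
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:                   # breadth-first search from s
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        while stack:                   # accumulate dependencies, farthest nodes first
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each undirected pair is visited from both ends, so halve the totals.
    return {v: c / 2 for v, c in score.items()}


def adjacency(edges):
    """Build a symmetric adjacency dict from (node, node) pairs."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


# Hypothetical toy network: three inscriptions share topic0, two share topic1.
toy = adjacency([("i1", "topic0"), ("i2", "topic0"), ("i3", "topic0"),
                 ("i3", "topic1"), ("i4", "topic1")])
result = betweenness(toy)
print(sorted(result.items(), key=lambda kv: kv[1], reverse=True))
```

Here topic0 comes out on top because every path between i1, i2 and the rest of the graph has to pass through it, which is the sense in which a topic ‘ties the collection together’.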

Topics - topics, Inscriptions of Aphrodisias

It might be that these most ‘between’ topics are not the ones that are archaeologically interesting. This is of course a 2-mode network (inscriptions-topics), so it might be desirable to consider this data as two 1-mode networks: inscriptions – inscriptions by virtue of shared topics, and topics – topics by virtue of shared inscriptions. When we take topics – topics, running our familiar grouping and betweenness metrics, topic 37 comes out on top, followed by 10 and 33:

topic10 building reynolds blocks block son architrave published theatre decoration papers fasciae dedication end aphrodite people aphrodisias fascia found demos

topic33 ii iii iv cut text left cross fortune monogram mccabe letters end triumphs broken vi acclamation texts drawing vii
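The conversion from one 2-mode network into two 1-mode networks can be sketched as follows. This is a generic projection, not necessarily the routine used for the graphs in this post: two topics are linked once for every inscription they share (and vice versa), and using that count as the edge weight is an assumption. The inscription and topic pairs are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def project(edges, onto="topic"):
    """Project (inscription, topic) pairs onto one mode.

    Returns {(a, b): n}, where n is the number of nodes of the other mode
    that a and b have in common.
    """
    groups = defaultdict(set)
    for inscription, topic in edges:
        if onto == "topic":
            groups[inscription].add(topic)   # topics co-occurring in an inscription
        else:
            groups[topic].add(inscription)   # inscriptions sharing a topic
    shared = defaultdict(int)
    for members in groups.values():
        for a, b in combinations(sorted(members), 2):
            shared[(a, b)] += 1
    return dict(shared)


# Hypothetical 2-mode edge list.
two_mode = [("i1", "topic0"), ("i1", "topic37"),
            ("i2", "topic0"), ("i2", "topic37"),
            ("i3", "topic37"), ("i3", "topic10")]
topic_net = project(two_mode, onto="topic")
inscription_net = project(two_mode, onto="inscription")
print(topic_net)
print(inscription_net)
```

Because every pair of inscriptions sharing even one topic gets an edge, the inscriptions – inscriptions projection grows very quickly as the corpus gets larger.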

When we turn the 2-mode network into an inscriptions – inscriptions network by virtue of shared topics, we end up with a monster of a graph: 1505 nodes (inscriptions), with 241,002 relationships! The inscription with the highest betweenness is iAph050118:

Building inscription of Helladios
Charlotte M. Roueché2007
Creative Commons licence Attribution 2.5 (
All reuse or distribution of this work must contain somewhere a link back to the URL
Originally published in Roueché (2004).
English French Ancient Greek Transliterated Greek Latin AsiaTurkey
2004-06-08Gabriel BodardChecked and fixed all image divs and refs 2004-03-16 Gabriel Bodard Completed lemmatisation, checked figure ids, tagged keywords 2003-11-04John LavagninoConverted beta code to Unicode 2003-05-27 Gabriel Bodard tidied and corrected 2003-04-30 Juan Garcés tidied and corrected 2003-06-22CMRtagged, tidied and corrected2003-07-14JLGLemmatised2003-08-20CMRname tags reduced2004-01-16CMRtidied; image refs2003-05-27Gabriel BodardTyped and marked-up Greek Description of Monument

A rectangular white marble block, perhaps from a lintel (0.285 × 0.665 × 0.50) with simple moulding above and below on one face. Chipped to the right, but complete.
Description of Text

Inscribed on the moulded face, in one line on the surface between the mouldings, which is slightly concave. The text must have continued onto an adjacent block.

Description of Letters
Flowing style, similar to 5.302, 5.119 and 4.120; 0.05-0.06.

First half of the fourth century (lettering, prosopography).
ὁ ἁγνός

Me also Helladios the pure

For Helladios see also 1.131, 4.120 and discussion at II.35.

Hadrianic Baths: central chamber. Unknown. Findspot (1972)..

History of Recording
Excavated by the NYU expedition.
Bibliography Published by Roueché, Aphrodisias in Late Antiquity, no. 18, whence PHI 605.
Text Constituted From Transcription (Roueché).
Photographs Face (1972)

Seems a bit underwhelming, no? But look at what is in this inscription – a personal name, the central chamber of the Baths, links outward to other inscriptions… reading the inscriptions algorithmically doesn’t absolve us from having to jump back in to do the close reading. Instead, we have to bounce back and forth between the micro and the macro. The modularity routine suspects that there are around 52 distinct subgroups in this material. That’s probably where the most interest will lie for scholars of this material. Are these groups related to context of discovery, or named individuals appearing in multiple inscriptions, or…? Five groups account for 1456 inscriptions. (It’s easier to load the ‘inscriptions-inscriptions-inscriptions-of-aphrodisias [Nodes].csv’ file to examine all of these.) What might be causing the ‘big five’ to group together? I will leave it up to the epigraphers to examine them…

The 47 inscriptions which the modularity routine found so odd that each was put into its own group are curious indeed. The first of these uniques is inscription iAph080906:

ὁ μεγαλοπρεπέστατος πολιτευόμενος σὺν θεῷ πατὴρ τῆς πόλεως

Up with Theopompos, magnificentissimus, member of the council and, with God’s help, pater civitatis

…which seems to be a good place to draw this note to a close. Up with Theopompos indeed! One wonders if he won the election. The remainder (checked at random) seem to have no translations associated with them. So perhaps what really sets these apart is simply that they haven’t been translated. If so, it is rather astonishing that this should be visible from a combination of topic model and graph visualization.

