Topic Modeling Greek Consumerism

I’m experimenting. Here’s what I did today. 1. Justin Walsh published the data on which his book, ‘Consumerism in the Ancient World’, rests. 2. I downloaded it, and decided I would topic model it. The table, ‘Greek Vases’, has one row = one vase. Let’s start with that, though I think it might be more […]

Topic Modeling #dh2013 with Paper Machines

I discovered the PDF with all of the abstracts from #dh2013 on a memory-stick-cum-swag this AM. What can I do with these? I know! I’ll topic model them using Paper Machines for Zotero. Iteration 1. 1. Drop the PDF into a Zotero collection. 2. Create a parent item from it. 3. Add a date (July […]

Topic modeling the things that fell out of pockets

Topic modeling is very popular at the moment in the digital humanities. Ian, Scott and I described topic models as tools for extracting topics or injecting semantic meaning into vocabularies: “Topic models represent a family of computer programs that extract topics from texts. A topic to the computer is a list of words that occur in […]

Topic Modeling an Archaeological Database 2

Some things I have learned in recent days: Data must be cleaned. Really. It’s probably still too noisy, even when you think it isn’t. Eliminate frequently occurring meta-notes (as it were). All citations to Guest & Wells on Coins in the UK, for instance, really muck things up. You can enter a single CSV file […]

Topic Modeling the Portable Antiquities Scheme

I got my hands on the latest build of the Portable Antiquities Scheme database. I want to topic model the items in this database, to look for patterns in the small material culture of Britain, across time and space. The data comes in a single CSV, with approximately 500 000 individual rows. The data’s a […]

Topic Modeling With the JAVA GUI + Gephi

I’ve been having an interesting conversation with Ben Marwick, in the comments thread of my initial ‘Getting Started with Topic Modeling’ post. Ben pointed me to an interesting GUI for Mallet, which may be downloaded here. I’ve been trying it out this morning, and I like what I’m seeing. Topic modeling is becoming more and […]

Getting Started with MALLET and Topic Modeling

UPDATE! September 19th 2012: Scott Weingart, Ian Milligan, and I have written an expanded ‘how to get started with Topic Modeling and MALLET’ for the Programming Historian 2. Please do consult that piece for detailed step-by-step instructions for getting the software installed, getting your data into it, and thinking through what the results might mean. […]

Topic Modeling the Day of Archaeology – Part 1

I’ve been topic modeling all of the posts from the Day of Archaeology. Topic modeling looks at the patterning of words to determine ‘topics’: Topic models provide a simple way to analyze large volumes of unlabeled text. A “topic” consists of a cluster of words that frequently occur together. Using contextual clues, topic models can […]