I’m running a workshop next week on getting started with networks & Gephi. Below, please find my first pass at a largely self-directed tutorial. This may eventually get incorporated into the Macroscope.
The data for this exercise comes from Peter Holdsworth’s MA dissertation research, which Peter shared on Figshare here. Peter was interested in the social networks surrounding ideas of commemoration of the centenary of the War of 1812, in 1912. He studied the membership rolls for women’s service organizations in Ontario both before and after that centenary. By making his data public, Peter enables others to build upon his own research in a way not commonly done in history. (Peter can be followed on Twitter at https://twitter.com/P_W_Holdsworth.)
On with the show!
Download and install Gephi. (What follows assumes Gephi 0.8.2.) You will need the MultiMode Projection plugin installed.
To install the plugin – select Tools >> Plugins (across the top of Gephi you’ll see ‘File Workspace View Tools Window Plugins Help’. Don’t click on this ‘plugins’. You need to hit ‘tools’ first. Some images would be helpful, eh?).
In the popup, under ‘available plugins’ look for ‘MultimodeNetworksTransformation’. Tick this box, then click on Install. Follow the instructions, ignore any warnings, click on ‘finish’. You may or may not need to restart Gephi to get the plugin running. If you suddenly see on the far right of the Gephi window a new tab beside ‘statistics’ and ‘filters’, called ‘Multimode Network’, then you’re ok.
Assuming you’ve now got that sorted out,
1. Under ‘file’, select -> New project.
2. On the data laboratory tab, select Import-spreadsheet, and in the pop-up, make sure to select under ‘as table’: EDGES table. Select women-orgs.csv. Click ‘next’, click ‘finish’.
(On the data table, have ‘edges’ selected. This is showing you the source and the target for each link (aka ‘edge’). This implies a directionality to the relationship that we just don’t know – so down below, when we get to statistics, we will always have to make sure to tell Gephi that we want the network treated as ‘undirected’. More on that below.)
3. Click on ‘copy data to other column’. Select ‘Id’. In the pop-up, select ‘Label’
4. Just as you did in step 2, now import NODES (Women-names.csv)
(nb. You can always add more attribute data to your network this way, as long as you always use a column called Id so that Gephi knows where to slot the new information. Make sure you never tick the box labeled ‘force nodes to be created as new ones’.)
5. Copy ID to Label
6. Add new column, make it boolean. Call it ‘organization’
7. In the Filter box, type [a-z], and select Id – this filters out all the women.
8. Tick the check boxes in the ‘organization’ column.
Save this as ‘women-organizations-2-mode.gephi’.
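If it helps to see what those import steps amount to, here is a minimal Python sketch of a two-mode table. The sample rows are invented placeholders (not Peter’s data), and the sketch assumes the source column holds women and the target column holds organizations; the ‘organization’ flag mirrors the boolean column from step 6. This is an illustration only, not how Gephi stores things internally.

```python
import csv
import io

# Invented sample in the shape of an edges table: Source,Target.
# Placeholder rows only; these are NOT taken from Peter Holdsworth's data.
# Assumption: Source = a woman, Target = an organization.
SAMPLE_EDGES = """Source,Target
Mrs. A,ORG ONE
Mrs. B,ORG ONE
Mrs. B,ORG TWO
Mrs. C,ORG TWO
"""


def load_two_mode(text):
    """Return (edges, nodes); nodes maps Id -> {'Label', 'organization'}."""
    edges = []
    nodes = {}
    for row in csv.DictReader(io.StringIO(text)):
        woman, org = row["Source"], row["Target"]
        # Store each link as an unordered pair: we do not know its direction.
        edges.append(frozenset((woman, org)))
        # Copy Id to Label, and flag which mode each node belongs to.
        nodes[woman] = {"Label": woman, "organization": False}
        nodes[org] = {"Label": org, "organization": True}
    return edges, nodes


edges, nodes = load_two_mode(SAMPLE_EDGES)
```

Storing each edge as an unordered pair is the same idea as telling Gephi to treat the network as ‘undirected’.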
Now, we want to explore how women are connected to other women via shared membership.
Make sure you have the Multimode networks projection plugin installed.
On the multimode networks projection tab,
1. click load attributes.
2. in ‘attribute type’, select organization
4. in left matrix, select ‘false – true’ (or ‘null – true’)
5. in right matrix, select ‘true – false’. (or ‘true – null’)
(do you see why this is the case? what would selecting the inverse accomplish?)
6. select ‘remove edges’ and ‘remove nodes’.
7. Hit ‘run’. Organizations will be removed from your bipartite network, leaving you with a single-mode network.
8. save as ‘women to women network.csv’
…you can reload your ‘women-organizations-2-mode.gephi’ file and re-run the multimode networks projection so that you are left with an organization to organization network.
! if your data table is blank, your filter might still be active. make sure the filter box is clear. You should be left with a list of women.
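What the projection is doing can be sketched in a few lines of Python. The memberships below are invented placeholders, and `project` is a hypothetical helper of my own, not the plugin’s code. Weighting each new edge by the number of shared neighbours is an assumption of the sketch.

```python
from collections import defaultdict
from itertools import combinations

# Invented (woman, organization) memberships; placeholders only.
MEMBERSHIPS = [
    ("Mrs. A", "ORG ONE"),
    ("Mrs. B", "ORG ONE"),
    ("Mrs. B", "ORG TWO"),
    ("Mrs. C", "ORG TWO"),
    ("Mrs. A", "ORG TWO"),
]


def project(pairs, keep="left"):
    """Collapse a two-mode edge list to a single mode.

    keep="left"  links women who share an organization.
    keep="right" links organizations that share a member.
    Edge weight = number of shared neighbours (an assumption of this sketch).
    """
    groups = defaultdict(set)
    for left, right in pairs:
        if keep == "left":
            groups[right].add(left)
        else:
            groups[left].add(right)
    weights = defaultdict(int)
    for members in groups.values():
        # Every pair inside a group shares that group's node.
        for a, b in combinations(sorted(members), 2):
            weights[(a, b)] += 1
    return dict(weights)


women_to_women = project(MEMBERSHIPS, keep="left")
orgs_to_orgs = project(MEMBERSHIPS, keep="right")
```

Swapping `keep` is the counterpart of swapping the left and right matrix selections: the same memberships give you either a women-to-women or an organization-to-organization network.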
9. You can add the ‘women-years.csv’ table to your Gephi file, to add the number of organizations each woman was active in, by year, as an attribute. You can then begin to filter your graph’s attributes…
10. Let’s filter by the year 1902. Under filters, select ‘attributes – equal’ and then drag ‘1902’ to the queries box.
11. in ‘pattern’ enter [0-9] and tick the ‘use regex’ box.
12. click ok, click ‘filter’.
You should now have a network with 188 nodes and 8728 edges, showing the women who were active in 1902.
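The logic of that filter, sketched in Python: keep every woman whose 1902 value contains a digit, then keep only the edges between the women who remain. The table and edges are invented placeholders, and `filter_by_attribute` is a hypothetical helper. The sketch uses `re.fullmatch`, so `[0-9]` accepts single-digit values only; whether Gephi anchors the pattern the same way is an assumption here.

```python
import re

# Invented attribute table: woman -> {year: number of organizations}.
# An empty string stands for "no record that year". Placeholders only.
YEARS = {
    "Mrs. A": {"1902": "2", "1913": ""},
    "Mrs. B": {"1902": "", "1913": "1"},
    "Mrs. C": {"1902": "1", "1913": "3"},
}
EDGES = [("Mrs. A", "Mrs. B"), ("Mrs. A", "Mrs. C"), ("Mrs. B", "Mrs. C")]


def filter_by_attribute(attrs, edges, column, pattern):
    """Keep nodes whose `column` value matches `pattern`, and edges among them."""
    regex = re.compile(pattern)
    kept = {
        node
        for node, row in attrs.items()
        if regex.fullmatch(row.get(column, ""))
    }
    kept_edges = [(a, b) for a, b in edges if a in kept and b in kept]
    return kept, kept_edges


active_1902, edges_1902 = filter_by_attribute(YEARS, EDGES, "1902", "[0-9]")
```

Dropping a node also drops every edge attached to it, which is why the edge count falls along with the node count when you filter.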
Let’s learn something about this network. On statistics,
13. Run ‘avg. path length’ by clicking on ‘run’
14. In the pop up that opens, select ‘undirected’ (as we know nothing about directionality in this network).
15. click ok.
16. run ‘modularity’ to look for subgroups. make sure ‘randomize’ and ‘use weights’ are selected. Leave ‘resolution’ at 1.0
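For the curious, here is roughly what those two statistics measure, as a small Python sketch on an invented toy network (two triangles joined by one bridge). The function names are my own. Note that `modularity` only scores a partition you hand it, at resolution 1.0 and without edge weights; Gephi’s routine also searches for a good partition, which this sketch does not attempt.

```python
from collections import deque

# Invented undirected toy network: two triangles joined by a bridge (c-d).
EDGES = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]


def adjacency(edges):
    """Build an undirected adjacency map: each edge is stored both ways."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def average_path_length(adj):
    """Mean shortest-path length over all reachable ordered pairs of nodes."""
    total = pairs = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from `source`
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


def modularity(adj, groups):
    """Newman's Q for an unweighted graph and a GIVEN partition."""
    m = sum(len(nbrs) for nbrs in adj.values()) / 2  # number of edges
    q = 0.0
    for group in groups:
        inside = sum(1 for u in group for v in adj[u] if v in group) / 2
        degree = sum(len(adj[u]) for u in group)
        q += inside / m - (degree / (2 * m)) ** 2
    return q


adj = adjacency(EDGES)
apl = average_path_length(adj)
q = modularity(adj, [{"a", "b", "c"}, {"d", "e", "f"}])
```

A higher Q means more edges fall inside the groups than you would expect by chance, which is what makes a set of ‘modularity classes’ worth colouring.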
Let’s visualize what we’ve just learned.
17. On the ‘partition’ tab, over on the left hand side of the ‘overview’ screen, click on nodes, then click the green arrows beside ‘choose a partition parameter’.
18. Click on ‘choose a partition parameter’. Scroll down to modularity class. The different groups will be listed, with their colours and their % composition of the network.
19. Hit ‘apply’ to recolour your network graph.
20. Let’s resize the nodes to show off betweenness centrality (to figure out which woman was in the greatest position to influence flows of information in this network). Gephi calculated this value for every node when you ran ‘avg. path length’ in step 13. Click ‘ranking’.
21. Click ‘nodes’.
22. Click the down arrow on ‘choose a rank parameter’. Select ‘betweenness centrality’.
23. Click the red diamond. This will resize the nodes according to their ‘betweenness centrality’.
24. Click ‘apply’.
Now, down at the bottom of the middle panel, you can click the large black ‘T’ to display labels. Do so. Click the black letter ‘A’ and select ‘node size’.
Mrs. Mary Elliot-Murray-Kynynmound and Mrs. John Henry Wilson should now dominate your network. Who were they? What organizations were they members of? Who were they connected to? To the archives!
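If you are wondering what the number behind those node sizes actually counts, here is a minimal Python sketch of betweenness centrality using Brandes’ algorithm on an invented toy network. It is unnormalized and assumes an unweighted, undirected graph; the scaling Gephi applies to its own values may differ, so compare rankings rather than raw numbers.

```python
from collections import deque

# Invented undirected toy network: two triangles joined by a bridge (c-d).
EDGES = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]


def adjacency(edges):
    """Build an undirected adjacency map: each edge is stored both ways."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def betweenness(adj):
    """Brandes' algorithm: shortest paths passing through each node."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        order = []                      # nodes by non-decreasing distance from s
        preds = {v: [] for v in adj}    # predecessors on shortest paths from s
        sigma = {v: 0 for v in adj}     # number of shortest paths from s
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            u = queue.popleft()
            order.append(u)
            for v in adj[u]:
                if dist[v] < 0:         # first time we reach v
                    dist[v] = dist[u] + 1
                    queue.append(v)
                if dist[v] == dist[u] + 1:
                    sigma[v] += sigma[u]
                    preds[v].append(u)
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):       # accumulate from the farthest nodes back
            for u in preds[w]:
                delta[u] += sigma[u] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each undirected pair was counted once from each end, so halve.
    return {v: b / 2 for v, b in score.items()}


scores = betweenness(adjacency(EDGES))
```

In the toy network the two bridge nodes, c and d, get all of the betweenness, because every path between the two triangles has to pass through them. The women who dominate your graph are playing the same brokering role.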
Congratulations! You’ve imported historical network data into Gephi, manipulated it, and run some analyses. Play with the settings on ‘preview’ in order to share your visualization as svg, pdf, or png.
Now go back to your original Gephi file, and recast it as organizations to organizations via shared members, to figure out which organizations were key in early 20th century Ontario…