What follows is our draft chapter for ‘Seeing the Past’, a colloquium hosted by Kevin Kee at Brock University. The chapter will eventually be published in ‘Seeing the Past: Augmented Reality and Computer Vision in History’ http://kevinkee.ca/seeing-the-past/book-abstract/
Hearing the Past – S Graham, S Eve, C Morgan, A Pantos
This volume is about seeing the past. But ‘to see’ does not necessarily imply vision. To see something can also mean to understand it. We frequently see things that do not exist, in this sense. “I see your point” or “I see what you’re saying”. ‘I hear you’ we sometimes say, also meaning, I understand.
In which case, how should we “see” the past? You can’t see the past. You can only see the present. You might believe something of what you’re looking at as being ‘from’ the past, but it still lives in the here-and-now. Thus, there is always a cognitive load, a ‘break in presence’ [Turner, 2007] that interrupts what we are seeing with awkward details. This is why we talk of the historical imagination, or the archaeological eye. To understand the past through augmented reality might not require vision. Yet, the majority of augmented reality apps currently available privilege the visual, overlaying reconstructions or text on an image of the present through a keyhole, the viewport offered by our small screens. The clumsiness of our interfaces also creates a break in presence. Visual overlays are clunky, with low-resolution 2D graphics, all of which further contribute to breaks in presence.
In short, they do not help us see the past – to understand it – in any meaningful way.
In this chapter, we suggest that ‘hearing’ the past is a more effective and affective way of providing immersive augmented reality. We argue from cognitive and perceptual grounds that audio – spoken word, soundscapes, acoustic horizons and spaces, and spatialized audio – should be a serious area of inquiry for historians exploring the possibilities of new media to create effective immersive augmented reality. We explore some of the phenomenology of archaeological landscapes and the idea of an ‘embodied GIS’ [Eve, 2014] as a platform for delivering an acoustic augmented reality. Drawing on Phil Turner’s work on ‘presence’ in an artificial environment [Turner, 2007], we explore ‘breaks’ in presence that occur in augmented, mixed, and virtual environments. The key idea is that presence is created via a series of relationships between humans and objects, and that these relationships form affordances. When these relationships are broken, presence and immersion are lost. We argue that because the sense of hearing depends on attention, audio AR is particularly effective in maintaining what Turner calls ‘affective’ and ‘cognitive/perceptual’ intentionality. In short, the past can be ‘heard’ more easily than it can be ‘seen’. We explore three case studies that offer possible routes forward for an augmented historical audio reality.
‘Eh? Can you speak up?’ The sense of hearing
The sense of hearing, and the active cognition that hearing requires, has not been studied to the same degree or in the same depth as the visual [Baldwin, 2012: 3]. Hearing – and understanding – is also a tactile, haptic experience. Sound waves actually touch us. They move the tiny hairs of the ear canal, and the tiny bones within, and the various structures of the middle and inner ear turn these into the electro-chemical pulses that light up the various parts of our brain. Sound is a kind of tele-haptic:
“…the initial stage of hearing operates as a mechanical process. Later the mechanical energy of sound converts to hydraulic energy as the fluids play a larger vibratory role. Thus at its source, touch operates with and causes sound, and it is only through touch-at-a-distance that we have sound at all. The famous story of Edison’s ears bleeding from his aural experiments makes visceral this tele-touch, which is not always a gentle stroke, no matter how pleasant the sounds, voice or music we might encounter.” [Bishop 2011, 25-6]
But intentional hearing – listening – requires attention. Consider: in the crowded foyer of a cinema, it can be quite difficult to make out what the person opposite is saying. You have to pay attention; the act is tiring. One can try to read lips, trying to match visual cues with auditory cues. In the quiet of a classroom, with the teacher’s back turned, the teacher can hear the surreptitious whisper that, while much quieter, speaks volumes. Hearing, unlike sight, requires attention that divides our ability to make semantic or emotional sense of what’s being said, or even to remember quite what was said, when the original audio signal is poor [Baldwin, 2012: 6]. What’s more, our brain is processing the spatial organization of the sound (near sounds, far sounds, sounds that move from left to right) and how it is being said, not just what is being said [Baldwin, 2012: 6].
Bishop goes on to argue that touch and vision are senses that can only know the surface; sound waves transcend surfaces, they cause surfaces to vibrate, to amplify (but also, to muffle). And so,
“Sound provides the means to access invisible, unseeable, untouchable interiors. If we consider the import of vision to the general sensorium and metaphorization of knowledge, then the general figurative language of “insight” runs counter to surface vs. deep understanding of the world. Sound, it would seem, not vision or touch, would lead us to the more desired deep understanding of an object or text.” [Bishop, 2011, 26]
Sound permeates and transgresses surfaces; sound gives access to the unseen. Bishop is discussing Karlheinz Stockhausen’s ‘Helicopter String Quartet’. Bishop goes on to argue that the piece exposes the ways that sound and touch blur (and “slur”) into a kind of synaesthesia, which defies the ‘assumed neatness of the sensorium’ [Bishop, 2011, 28]. With our western rationality, we assume that the senses neatly cleave. With our western focus on the visual, we prioritize one way of knowing over the others. Chris Gosden, in an introductory piece to an issue of World Archaeology on the senses in the past, argues that our western ‘sensorium’ (what we think of as the five senses) influences and conditions how we understand material culture. He advocates unlearning and unpacking the privileged position of sight [Gosden, 166], what others have called ‘ocularcentrism’ [Thomas 2008].
The effect of structured sound (let’s call it ‘music’) on movement is another interesting area where the haptic qualities of sound may be perceived. Interestingly, there are aspects to music that seem to translate into movement ‘primitives’. A recent study explored the relationship of musical structure (dynamics, harmony, timbre) to guided walking (mobile tours) [Hazzard et al, 2014]. The authors note that a focus on structure in music sits between the thematic (where the emotional content of the music is manipulated) and the sonic (which is associated with spatial cues). Thus, they wondered what aspects of structure would be perceived by their non-musically trained study subjects (western, university undergraduates at an Anglophone university) and how the subjects would translate these into movement. The subjects listened to four distinct compositions, each designed to emphasize one aspect of musical structure, as they moved around an open field. The subjects were observed and interviewed afterwards. Why did they stop in certain places? Why did they back-track, or move in a spiral?
The authors found that silence in the music was often interpreted as signalling a time to stop, while crescendi (a rising intensity in the music) impelled movement forward. A diminuendo, a lessening, did not imply movement away; rather it signalled the impending end of movement altogether. Musical punctuation caused listeners to try to understand the significance of the particular spot they were standing on. Timbre ‘coloured’ different areas. ‘Harmonic resolution’ signalled ‘arrival’ [Hazzard et al, 2014: 609-613]. As will be seen in our case studies, this interplay of silence and crescendo can also be a powerful affective tool to convey the density or paucity of historical information in an area.
Sound requires cognition to make sense; there is nothing ‘natural’ about understanding the sounds that reach our ears. This act of attentiveness can elide other breaks in presence. Sound is tactile. It engages pathways in the brain similar to those involved with processing visual imagery.
Culture & Soundscape
‘As a little red-headed Metis kid, it never occurred to me that the city could sound different to anyone else.’ [Todd, 2014] Zoe Todd recently wrote a moving piece in Spacing on ‘Creating citizen spaces through Indigenous soundscapes’, where she describes amongst other things the profound effect of a flash mob occupying the West Edmonton Mall’s replica Santa Maria, Columbus’ flagship. “The sounds of Indigenous music, language and drumming soaring high up into the mall’s glass ceiling was a revelation: decolonization of our cities is not merely a physical endeavor, but also an aural one.” [Todd, 2014]
Work on the cognitive basis of memory has shown that, rather than being like a filing cabinet from which we retrieve a memory, the act of recollection actively re-writes the memory in the present: our memories are as much about our present selves as they are about the past. Thus, cognitive scientists working in the field of post-traumatic stress disorder are finding that they can reprogram the emotional content of traumatic memories by altering the contexts within which those memories are recalled. Sound very much plays a role in all of this [see S Hall’s 2013 review article on the state of research, http://www.technologyreview.com/featuredstory/515981/repairing-bad-memories/].
Soundscapes affect us profoundly, and as Todd demonstrates, can be used to radically reprogram, repatriate, decolonize, and contest spaces. Work on the cognitive foundations of memory suggests that sound can literally re-wire our brains and our understanding of memory. Tim Ingold talks about the ‘meshworks’ of interrelationships that create spaces and bind them in time [ref]. Can soundscapes help us ‘visualize’ the past, or at least, surface different patterns in the meshwork? Can we reprogram collective memories of place with sound?
The soundscape has been explored in a historical context by a number of scholars, and in particular amongst archaeologists as the study of archaeoacoustics. Most work on archaeoacoustics has explored the properties of enclosed spaces [see Blesser & Salter 2007] such as caves [Reznikoff 2008], theatres [Lisa et al. 2004] and churches [Fausti et al. 2003]. For an excellent review of the increasingly extensive literature, see Mills. In particular, Mlekuz has investigated the soundscape of church bells in an area of Slovenia. He takes up Schafer’s definition of the soundscape, which sets it in direct opposition to an acoustic space: where an acoustic space is the profile of the sound over a landscape, the soundscape is a sonic environment, with the emphasis being put on the way it is perceived and understood by the listener [Mlekuz 2004, para.2.2.1]. This clear distinction between the mechanics and properties of the sound (the acoustic nature) and the effect it has on the listener (the soundscape) fits perfectly with Turner’s idea of the Arc of Intentionality. While we may be able to recreate the sounds of the historical past, we may not be able to recreate how these sounds came together to create the soundscape of a person existing in that past. The soundscape is a combination of the acoustic properties of sound, space and the individual. However, the acoustic nature of historical sounds will affect us as human beings and will evoke some kind of emotional/affective response – even if it could be argued that this response is not ‘historically authentic’.
The next question to ask, then, is that if sounds, music and voices from the past can affect us in certain ways – how do we deliver those sounds using Augmented Reality, to enable an in-situ experience?
Aural Augmented Reality
The audio tour, delivered on a handheld device rented or borrowed from a museum to guide a visitor through the exhibition, is a staple of many museums and heritage sites. Audio tours have been in use since the 1950s [see http://www.acoustiguide.com/ and http://www.duvaws.com/company/profile]. Once tied to a bulky device that had to be curated and maintained by the museum or heritage site, audio tours are quickly taking advantage of the smartphone and are now released as downloadable apps or podcasts. This is democratizing the audio tour, allowing new and alternative tours of museums and cities to be released and followed, and potentially undermining the ‘truth’ of the official tour. While we certainly do not deny that the humble audio tour is a form of Aural Augmented Reality (AAR), experienced in-situ and influencing the way the user experiences a space, such tours serve as a narrative-led experience of a space (much as a tour guide in book form would) and do not often explore the haptic or more immersive properties of AAR.
Some applications have taken the idea of the audio guide further, such as the SoundWalk project [http://soundwalk.com/], which offers alternative tours of the Louvre with a Da Vinci Code theme, or walking tours of the Hassidic areas of Williamsburg narrated by famous actors and members of the community. What makes the SoundWalk tours different is that they are GPS-powered, and so specific to the place (for instance you are told to open specific doors when they are in front of you, or to look left or right to see individual features). They are also produced with a very high quality of narration, sound-recording and music/sound effects. In addition they play with the notion of yourself melding with the narrator: “…ok, for today you are Joseph, that’s my Hebrew name, that’s my Jewish name and that’s your name, for today we are one.” [extract from the Williamsburg Men Hassidic tour http://www.soundwalk.com/#/TOURS/williamsburgmen/]. The SoundWalk tours attempt to create a feeling of immersion by effectively giving a ‘high-resolution’ aural experience: the acting, sound effects, music and beguiling narrative all come together to allow you to get lost in the experience, following the voice in your head.
An application that also uses the immersive aspect of storytelling to good effect is the fitness app, ‘Zombies, Run!’ [https://www.zombiesrungame.com/]. The app is designed to aid a fitness regime by making running training more interesting. When you log into the app, you take on the role of ‘Runner 5’, a soon-to-be hero who is going to save the world from the Zombie Apocalypse. The app uses your GPS location and compass to direct you on a run around your local neighbourhood, but all the time you are being pursued by virtual zombies. Go too slowly and the sounds of the zombies will catch you up, their ragged breath chasing you around the park. As part of the run you can also collect virtual medical supplies or water bottles (indicated to you by an in-game voice) that all help to stave off the Apocalypse. By using the very visceral sounds of a pursuer getting closer, combined with the affective power of physically being out of breath, tired and aching, the run becomes an immersive experience: you are not just trying to better your time, you are escaping zombies and trying to save the world. This app works so well mainly because you don’t have to look at the screen; the suspense of the situation is created mainly through sound [see Perron 2004].
Three Archaeological/Historical Aural Augmented Reality Case Studies
The examples of AAR applications provided so far were not specifically created with an ear to exploring and experimenting with historical sounds or soundscapes. Instead, they provide an immersive narrative (audio tours) or gamify a journey through an alternate present (Zombies, Run!). Historians and archaeologists are currently experimenting with the technology not just as a means to simply tell a story, but to allow users to ‘feel’ the sounds and to be affected by what they are hearing. Each of the case-study applications eschews any kind of visual interface, concentrating instead on the power of sound to direct, affect and allow alternate interpretations. The case studies are examples of prototype applications, proofs-of-concept, rather than fully-fledged applications with many users. However, even these experimental models demonstrate the potential benefits of hearing the past.
Using Aural Augmented Reality to explore Bronze Age Roundhouses
As part of his research using the embodied GIS to explore a Bronze Age settlement on Bodmin Moor, Cornwall, United Kingdom, Stuart Eve used a form of Aural Augmented Reality to aid navigation and immersion in the landscape [Eve 2014]. By using the Unity3D gaming engine (which can spatialize sound), Eve created a number of 3D audio sources that corresponded to the locations of the settlement’s houses. As the resulting app was geo-located, the user could walk around the settlement in-situ and hear the augmented sounds of the houses (indistinct voices, laughing, babies crying, etc.) getting louder as they approached each sound source and quieter as they moved away. The houses in the modern landscape are barely visible on the ground as circles of stones and rocks, making it hard to discern where each house is. Eve then introduced virtual models of the houses to act as audio occlusion layers, simulating the effect of the house walls and roofs in dampening the sounds coming from within, and only allowing unoccluded sound to emit from the doorways:
“At first, the occlusion of the sounds by the mass of the houses was a little disconcerting, as [visually] the buildings themselves do not exist. However, the sounds came to act as auditory markers as to where the doorways of the houses are. This then became a new and unexpected way of exploring the site. Rather than just looking at the remains of the houses and attempting to discern the doorway locations from looking at the in situ stones, I was able to walk around the houses and hear when the sounds got louder – which indicated the location of the doorway” [Eve 2014:114]
Eve then goes on to suggest that by modelling sound sources and relating them to the archaeological evidence, questions can be asked about the usage of the site, and can be explored in situ. For instance, if some of the houses were used for rituals (as is indicated by the archaeological evidence) what sort of sounds might these rituals make and how would this sound permeate across the settlement? More prosaically, if animals were kept in a certain area within the settlement, how would the sound of them affect the inhabitants? How far could people communicate across the settlement area using calls or shouts?
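The behaviour described above, sound that fades with distance and is dampened by virtual walls except at the doorway, can be illustrated with a small sketch. This is not Eve's Unity3D implementation; it is a minimal approximation under our own assumptions. The inverse-distance rolloff, the `occlusion_factor` of 0.25, the 40-degree doorway arc and all function names are hypothetical choices made for illustration only.

```python
import math


def gain_at(listener, source, min_distance=1.0, occluded=False,
            occlusion_factor=0.25):
    """Return a 0..1 volume multiplier for one sound source.

    Positions are (x, y) pairs in metres on a flat local grid.
    The rolloff curve and occlusion_factor are illustrative assumptions.
    """
    distance = math.hypot(source[0] - listener[0], source[1] - listener[1])
    # Inverse-distance rolloff: full volume inside min_distance, then 1/d.
    gain = min_distance / max(distance, min_distance)
    if occluded:
        # A virtual wall between listener and source dampens the sound.
        gain *= occlusion_factor
    return gain


def facing_doorway(listener, house_centre, door_bearing_deg,
                   door_width_deg=40.0):
    """True if the listener stands within the arc of the doorway.

    Bearings are measured clockwise from grid north (the +y axis).
    """
    bearing = math.degrees(math.atan2(listener[0] - house_centre[0],
                                      listener[1] - house_centre[1])) % 360
    # Smallest angular difference between the two bearings (0..180).
    diff = abs((bearing - door_bearing_deg + 180) % 360 - 180)
    return diff <= door_width_deg / 2


def house_gain(listener, house_centre, door_bearing_deg):
    """Volume of a house's interior sounds as heard by the listener."""
    occluded = not facing_doorway(listener, house_centre, door_bearing_deg)
    return gain_at(listener, house_centre, occluded=occluded)
```

With a doorway facing north, a listener standing four metres north of the house centre hears the sound at a quarter of full volume, while a listener four metres to the east hears it dampened further by the virtual wall. Walking around the house, the jump in volume marks the doorway, which is the effect Eve describes.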
Eve’s use of AAR to ask archaeological questions of a landscape highlights the exploratory power of an Augmented Reality approach. A different application, Historical Friction, explores the power of AAR to inform us about our surroundings and make us question what is beneath our feet.
‘Historical Friction’ was directly inspired by the work of Ed Summers (of the Maryland Institute for Technology in the Humanities), filtered through the example of ‘Zombies, Run!’. Summers programmed a web-app called ‘Ici’, French for ‘Here’. Ici uses the native abilities of the browser to ‘know’ where it is in space to search and return all of the Wikipedia articles that were geotagged within a radius of that location [http://inkdroid.org/ici/]. In its current iteration, it returns the articles as points on a map, with the status of each article (stub, ‘needs citations’, ‘needs image’, etc.) indicated. In its original form, it returned a list with a brief synopsis of each article. Summers’ intent was that the app could work as a call-to-action, to encourage users to expand the coverage of the area in Wikipedia.
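For readers curious about how such a lookup might work, the sketch below builds a query against the MediaWiki API's `geosearch` list, which returns pages geotagged within a given radius of a coordinate. We do not know that Ici uses this exact endpoint or these parameters; the function names, the default radius and the sample payload are our own assumptions for illustration.

```python
from urllib.parse import urlencode

# English Wikipedia's public API endpoint.
API = "https://en.wikipedia.org/w/api.php"


def geosearch_url(lat, lon, radius_m=1000, limit=50):
    """Build a URL asking for articles geotagged near (lat, lon)."""
    params = {
        "action": "query",
        "list": "geosearch",
        "gscoord": f"{lat}|{lon}",   # latitude|longitude
        "gsradius": radius_m,        # search radius in metres
        "gslimit": limit,            # maximum number of pages returned
        "format": "json",
    }
    return f"{API}?{urlencode(params)}"


def titles_from_response(payload):
    """Extract (title, distance in metres) pairs from a decoded JSON reply."""
    return [(item["title"], item["dist"])
            for item in payload["query"]["geosearch"]]


# A hand-written payload in the shape the API returns (values are invented).
sample = {"query": {"geosearch": [
    {"pageid": 1, "ns": 0, "title": "Example Place",
     "lat": 45.42, "lon": -75.69, "dist": 120.5},
]}}
print(geosearch_url(45.4215, -75.6972))
print(titles_from_response(sample))
```

The number of results returned for a location is one crude measure of the ‘thickness’ of its digital annotation.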
Visually, it can be impressive to see the dots-on-the-map as an indication of the ‘thickness’ of the digital annotations of our physical world. Initially, we wanted to make that ‘thickness’ literal, to make it actually physically difficult to move through places dense with historical information by exploiting the haptic nature of sound.
We tried to make it painful, to increase the noise and discords, so that the user would be forced to stop still in their tracks, to take the headphones off, and to look at the area with new eyes. Initially, we took the output from ‘Ici’ and fed it through a musical generator called ‘Musical Algorithms’. The idea was that the resulting ‘music’ would be an acoustic soundscape of quiet/loud, pleasant/harsh as one moved through space, a kind of cost surface, a slope. We wondered if it would push the user from noisy areas to quiet areas. Would the user discover places they hadn’t known about? Would the quiet places begin to fill up as people discovered them? As we iterated, we switched to a text-to-speech algorithm. As ‘Ici’ loads the pages, the text-to-speech algorithm whispers the abstracts of the wiki articles, all at once, at slightly different speeds and tones. ‘Historical Friction’ may be found at https://github.com/shawngraham/historicalfriction.
Historical Friction deliberately plays with the idea of creating a break in presence – a cacophony of voices that haptically forces the user to stop in her tracks – as a way of focussing attention on those areas that are thick and thin with digital annotations about the history of a place.
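The idea of ‘friction’ as a function of annotation density can be sketched in a few lines. This is not the code in the Historical Friction repository; it is a hypothetical illustration in which the logarithmic loudness scale, the saturation point of 50 articles, and the ranges for speech rate and pitch are all our own assumptions.

```python
import math
import random


def friction_level(article_count, saturation=50):
    """Map the number of nearby articles to a 0..1 loudness level.

    A logarithmic scale (an assumption) keeps a handful of articles
    audible while capping the level once `saturation` is reached.
    """
    if article_count <= 0:
        return 0.0
    return min(1.0, math.log1p(article_count) / math.log1p(saturation))


def voice_settings(abstracts, seed=0):
    """Give each abstract its own slightly different speed and tone.

    Rate and pitch are multipliers around 1.0; a text-to-speech engine
    would be handed these values when speaking all voices at once.
    """
    rng = random.Random(seed)  # seeded so the same place sounds the same
    return [
        {
            "text": text,
            "rate": round(rng.uniform(0.85, 1.15), 2),
            "pitch": round(rng.uniform(0.8, 1.2), 2),
        }
        for text in abstracts
    ]
```

A place with no geotagged articles is silent, a place with a few is a murmur, and a densely annotated place approaches the maximum level, where the overlapping voices become difficult to walk through.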
During the inaugural University of York ‘Heritage Jam’, an annual cultural heritage ‘hack-fest’, a group of archaeologists/artists/coders took the Historical Friction application as inspiration and created an AAR app called Voices Recognition.
“Voices Recognition is an app designed to augment one’s interaction with York Cemetery, its spaces and visible features, by giving a voice to the invisible features that represent the primary reason for the cemetery’s existence: accommodation of the bodies buried underground” [Eve, Hoffman, et al., 2014].
This is achieved with a smartphone-based app that again uses the GPS and compass to geo-locate the user within the cemetery. Each of the graves in the cemetery is also geo-located and is attached to a database of online census data, burial records and available biographies of the persons buried within the cemetery. The app then plays the contents of this database for every grave within 10m of the user. In the example application the data themselves are voiced by actors; however, in the full application it is likely these will be computer-generated voices (due to the sheer amount of data attached and the number of graves in the cemetery). (A video of the app in action may be viewed at http://www.youtube.com/watch?v=wAdbynt4gyw). The net result is, in places, a deafening cacophony of voices (especially in the areas of the mass graves) and in other places single stories being told. The unmarked mass burials literally shout and clamour to be heard, whereas the grandiose individual monuments whisper their single stories. The usual experience of a cemetery is completely inverted [Eve, S., Hoffman, K., et al. 2014].
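The selection of graves ‘within 10m of the user’ can be illustrated with a short distance filter. This is a sketch under our own assumptions, not the Voices Recognition code: we use the haversine formula for distance, and the record layout (`name`, `lat`, `lon`) and function names are hypothetical.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def graves_in_earshot(user, graves, radius_m=10.0):
    """Return the names of graves within radius_m of the user, nearest first.

    `user` is a (lat, lon) pair; each grave is a dict with hypothetical
    keys "name", "lat" and "lon".
    """
    lat, lon = user
    nearby = []
    for grave in graves:
        distance = haversine_m(lat, lon, grave["lat"], grave["lon"])
        if distance <= radius_m:
            nearby.append((distance, grave["name"]))
    return [name for _, name in sorted(nearby)]
```

Because every grave within the radius speaks at once, a mass burial with dozens of records inside ten metres produces a clamour, while an isolated monument yields a single voice.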
The Voices Recognition app uses augmented audio to represent abstract data in a visceral and tactile way. The subject matter of the app, the deceased, is perhaps an extreme example of information that could potentially have strong emotional impact on visitors. Careful thought is required for the appropriate presentation and distribution of material suitable for the intended cultural sphere, to avoid unnecessary upset if such an app were to be made live. However, the concept highlights the opportunity to relate to a cultural location at a much closer and more personal level through audio than can be achieved through the more ‘removed’ visual overlay and presentation. The Voices Recognition project, as well as the SoundWalk project described earlier, highlights the power of using sound not just as a way of exploring dense historical data, but also of presenting this in an engaging and unusual way. As the Voices project states, the app is part pedagogical and part an artistic soundscape. It uses the overlapping voices as a representation akin to a ‘heat-map’ of the clustered data because “it’s eminently possible to render delicate distinctions between layers/concentrations, and [for] the human ear to identify them more distinctly than they can colour, light or smell” [Eve, S., Hoffman, K., et al., 2014].
Building an Aural, Haptic, Augmented Reality to Hear the Past
In a guest lecture to a digital history class at Carleton University in the Fall 2014 semester, Colleen Morgan recounted her experience with the ‘Voices Recognition’ app when it was being tested: ‘Voices, in the cemetery, was certainly the most powerful augmented reality I’ve experienced’.
Building a convincing visual AR experience that does not cause any breaks in presence is the holy grail of Augmented Reality studies, and something that is virtually impossible to achieve. A break in presence will occur due to the mediation of the experience through a device (Head-Mounted Display, tablet computer, smartphone, etc.); the quality of the rendering of the virtual objects; the level of latency in software that delivers the experience to the eyes; the list is endless and scale-less – once you ‘solve’ one break in presence, then another occurs. The goal then can never be to completely eliminate breaks in presence, but instead to recognise them and treat them with an historian’s caution. Indeed, we can play with them deliberately, using their inevitability to underline the broader historical points we wish to make. For example, the use of artificial crescendo and diminuendo (such as with the Historical Friction and Voices Recognition applications) arrests the user, making them stop and consider why the sounds are getting louder or quieter. By inserting prehistoric sounds into the modern landscape, Eve is creating an anachronistic environment. This is a clear break in presence, as that sound should never be heard in the present. However, the alien nature of that particular sound in that landscape jars our cognitive intentional state and again prompts us to examine what that sound might be and why it might have been placed in that particular location.
In this way the case studies presented are showing that AAR does not always have to be a ‘recreation’ or a fully immersive experience. Instead, much as we would treat the written word as the result of a process of bias and production, we should treat any augmented reality experience as the result of a process of bias (what is represented), production (the quality of the experience) and delivery (the way in which it is delivered). Hearing the past requires that we pay attention not just to effect but also affect, and in so doing, it prompts the kind of historical thinking that we should wish to see in the world.