
Streaming Together: Audiobooks as shared reading 

Linda Daley and Brigid Magner on audiobooks

This essay is in conversation with ‘A Third Space: Giving voice to works in progress’ by Romy Ash and Rose Michael.


Reading-as-listening 

When we were asked to write a piece for The Conversation in 2021 about whether audiobook reading is ‘cheating’, we found ourselves deluged with comments from audiobook readers, showing the immense popularity of this medium. Along with a multitude of scholars of reading, we firmly believe that the production and reception of audiobooks support an array of active, participatory and social reading practices. In this essay, we consider audiobook reading as a collective enterprise which augments and invigorates our sensory consciousness of narratives.

Audiobooks have the potential to expand our ways of living in the world – this is what we do in the humanities – so it’s not surprising that we are engaged by the practice of reading-as-listening. Through our experience as audiobook readers and Literary Studies educators, we are convinced that streamed audiobooks provide new insights into the complex and sometimes mysterious aspects of reading that are at once neurological, psychological and cognitive, as well as affective and intellectual. Over recent months, we’ve been thinking about how the technology of audiobook production, and the reading practices associated with it, encourages an overturning of the still-dominant images of the lone ‘genius’ author on the one hand, and the solo, private reader on the other. Streaming technology has intensified this disruption, enabling a more communal narrative experience, even if the listener absorbs the narrative alone, with or without headphones.

In many ways, the experience of listening to a streamed audiobook is akin to the reading aloud performed by people in the Middle Ages. Medievalists know that in cultures of aural reading it wasn’t only the one reading aloud, usually male, who was literate. Rebecca Krug (2017) places late medieval aural reading in the context of singing and recitation, one among several communal practices of spiritual relationship. During the fifteenth century, long before the silent, isolated and optical engagement with a written text that came to characterise our modern practice, aural reading was common.

The Book of Margery Kempe (c. 1439), for example, was not only written collaboratively with a series of amanuenses, scribes who listened to and transcribed episodes recounted from her life, it was also inspired by Kempe’s many years of aural reading with a ‘lister’, as she called the young priest with whom she read devotional books for a period of about eight years. She came to view this priest as like a son. It’s probable but by no means certain, says Krug, that Kempe was illiterate, as aural reading was common among the highly literate. Kempe’s was an extraordinary life considering the strictures of medieval Christianity that she both embraced and challenged.

It would be too easy, suggests Krug, to think that Kempe’s relation with listers and scribes placed her in a subordinate and passive position vis-a-vis the texts they were reading, and with the text that they were producing. Krug tells us that aural reading was a reading practice that was constitutive of both Kempe’s self-relation and her relations with others. Reading made Kempe into a writer at the same time that it enabled social bonds through the reading of devotional books with others semi-publicly.

Today aural reading, practised aloud and collectively, seems strange, or an activity only for children. Mark McGurl observes that every act of reading is a return to something ‘primal’, the ‘gently bowed’ page of a book figuring as a kind of breast. For McGurl, the audiobook also facilitates ‘the return of sorts of the bedtime story in the life of the adult commuter.’

The skills of reading and writing, which many of us take for granted as necessarily paired, were not always so, according to medievalists. When Margery Kempe determined that she would author her life through the production of a book, it was an extraordinary intention for social and religious reasons. Practically, however, it may not have seemed so extraordinary at a time when authorship had not yet essentially been tied to the ability to write.

These medieval reading and writing practices remind us that having an aural relation to a written text is not merely remedial. Sometimes ‘book-reading’ is framed as an activity that allows us to gain independence, or more likely, solitude from others, yet older practices show that reading usually involved joining with others. The affiliative nature of aural reading was as central as its interpretive aspect. More so, it was a formative practice that could transform the reader into a writer. For Margery Kempe, auditory attention played an important part in her aspiration to become an author and in its practical realisation. This auditory dimension has since diminished, if not completely disappeared, in adults’ relation to texts.

Pre-modern aural reading practices help us to rethink the role of technologised aural reading that streamed audiobooks potentially enable in our own era. Martyn Lyons points out that audiobooks are not a modern invention, since literature was being shared orally long before recording and streaming existed.

The business model of streamed audiobooks would seem to be pitched at the individual consumer, reinforcing the modern sense of silent and isolated engagement with a printed text. Indeed, Amazon’s dominance of the audiobook market is premised on isolated listening experiences rather than the sharing of narratives with multiple others. The advent of lightweight headphones and earbuds also enables a continuation of the silence and isolation of conventional book reading practice.

When and why do we turn away from the codex to the audiobook, or combine it with the audiobook? As constant readers, we find it difficult to separate our professional from our recreational reading as they are almost always interrelated. Here, we reflect on our own experiences with audiobook titles that may not necessarily make it onto our course reading lists. The listening experience, nonetheless, enables us to consider our reading assumptions and habits more broadly for pedagogic possibilities.


Ghosts in the text 

Linda: I began listening to Spare by Prince Harry in the summer holidays of 2023, mainly out of curiosity about his authoring of the memoir as part of a strategic transformation from a working senior royal into a celebrity ‘man-of-the-people.’ The commercial streamer I subscribe to reminded me that six (book) credits were about to expire, and Spare was available as a title. Harry and Meghan had been cast by the British media as increasingly at odds with public expectations of royalty, before standing down from their roles and fleeing the UK. Preceding the memoir, the tell-all Oprah Winfrey interview and the release of the Netflix documentary made it clearer that the couple were seeking an escape not only from the hostile British media, but also from the royal family itself. With the memoir approaching publication, I was curious: what could Harry possibly add in his ‘own words’ that hadn’t already been said? How could the story of his life be told in a way that would sell well enough to ensure that he would never need to sell it again? I also had a more literary curiosity regarding the book’s pre-publicity that gave more than usual prominence to Spare’s ghost writer, J.R. Moehringer – a prominence that potentially undercut the book’s status as a memoir. Whose story was this going to be?

I was conflicted about the prospect of reading Spare. How might I satisfy my curiosity without supporting the royal institution that Harry was maintaining his allegiance to, even while fleeing from it? My conflict was later assuaged by some of the commentary the book inspired, including an article by three eminent Australian historians that read the Duke and Duchess of Sussex’s multi-media campaign as having potential implications for Australian republicanism. Moreover, I wanted to satiate my curiosity without adding to the stratospheric first-day sales record Spare had achieved.

Hearing Harry’s voice seemed to override the fact that the autobiographical text he was reading was composed by (or with) another, reminiscent of Kempe and her amanuenses. His posh, well-modulated voice employs plenty of colloquial turns of phrase, occasional F-bombs, and a good supply of self-deprecation. For example, the tale about his Eton tutor giving him a ruler on which all the monarchs of England are named – a kind of crib tool for the struggling and uninterested prince to learn about his ancestors and also pass his history test. He makes occasional swipes at the new King, depicted as an emotionally detached parent, and at William, portrayed as an uptight, controlling older brother. The memoir is filled with anecdotes, many giving a personal version of what was already on the public record, in addition to recounting his life as a soldier and his whirlwind romance with Meghan.

On the whole, Spare portrays a family drama like so many others, albeit told from the point of view of one who is not meant to tell. Hearing Harry’s engagingly vocalised reading of this ghost-written text succeeded in making me feel that I was listening to Harry’s story rather than listening to Harry’s performance of his story. Spare pulled off the feat of enabling me to suspend my disbelief that I was listening to ‘his words’ even as I knew they were composed by another. The calibre of his vocal performance, along with the calibre of Moehringer’s writing, neutralised the several factors piquing my curiosity about the memoir, potentially over-determining my response to it. This is the power of audiobooks: their affecting capacity to temporarily suspend critical disbelief through the sensuous appeal of the voice and its grain. Listening in the way that streamed audiobooks allow has the potential to amplify or intensify engagement with the text itself.

When the audiobook narrator has a public or professional profile outside that of the text being listened to, all these external associations come to bear on the experience of listening. Emily Wilson’s recent translation of The Odyssey (2020), for example, can be heard differently to earlier versions of the classic not only because of the freshness and vitality of Wilson’s translation, but also because of its narrator: Claire Danes. I listen more attentively because of my familiarity with Danes’ voice from the US TV drama, Homeland. In spite of the vastly different media and story worlds, Danes’ sometimes quavering, occasionally fragile voice registers these past associations for me.  The more distinctive the pitch, timbre, and pace of the voice, the more likely it will be marked in our memory. That memory contains traces of a voice’s former locations and embodiment, eliciting our attention to it on subsequent occasions.


Mesmerising vocalisations 

Brigid: I chose to listen to Shaun Prescott’s Bon and Lesley, narrated by Tamblyn Lord, after reading Bonny Cassidy’s compelling review of the novel. Lord offers an Australian middle-of-the-road ‘everyman’ quality to the narration – a resonant but monotone voice, which is slightly unconvincing and cartoonish when it comes to female characters yet otherwise compelling. It seems the perfect voice to capture the disturbing suburban charade that unfolds as the surrounding world disintegrates.

Prescott’s debut novel The Town was set in a New South Wales locality where holes opened up suddenly, driving the two central characters to escape to Sydney at the end. In his latest work, there’s no escape from the town of Newnes, although the characters speak idly of a kind of unreachable oasis named Sofala. It centres on the experience of Bon, a white-collar worker who travels three hours a day for work and steps off the train at the regional town of Newnes where he meets Steven. Lesley, another newcomer, plays a maternal role in relation to ‘the boys’ – Bon, Steven and his brother Jack – the infantilised men in the household they assemble. The town is a liminal zone they cannot seem to leave because ‘everything is being packed away’, including public transport. This found family enacts uncanny rituals of settler normality, including regular meals sourced from a service station and strict bedtimes enforced by Lesley. The irony is that Bon has left his own children – and the duties involved with their care – to fall into the role of passive man-child in a dysfunctional family unit.

I mostly listened to Bon and Lesley as I walked the dog around my suburb at night, surrendering to the mesmerising flow of sound. Often, I needed to rewind to the previous chapter to figure out what exactly had occurred in the story because I had been focussing on the cadence of Lord’s voice and the often surreal dialogue. Sometimes lulled into a trance-like state during the nine hours (approximately) of the experience, I had tears spring into my eyes unexpectedly at the end. When Bon walks back to the home he deserted months before, thinking that it might not exist by the time he reaches it, he cuts across the same ‘lines of shops and interminable roads’ and ‘vaulting gauntlets of backyards’, alarming suburban residents along the way. He is disturbed that nothing seems different: ‘That it remained much the same was devastating, despicable’. Due to my own intense anxieties about the future, I read this as an analogy for climate inaction during the long reign of the Liberal government.

I was ambushed by a rush of emotion when Bon looks through the windows of his former house and makes out the shape of his sleeping daughter and imagines his baby son stretched across the parental bed he has deserted:

Even if he hadn’t lost his key back in Newnes, he could still no longer use it. Even if he had it and used it anyway and even if he got away with it, even if he was forgiven, he knew that something mysterious lay in their future, something unpredictable, far beyond his ability to fight.

In this passage, the decline of the world is bleakly foreshadowed and the children’s fate is uncertain. Outside while his family are inside, Bon sleeps rough in a park on the corner, where he is poked by an ‘elderly’ with a ‘jabbing stick’. Officious, hostile rituals continue unabated as the world unravels.

After this moving denouement, and the heartfelt acknowledgements, the upbeat advertisement for Bolinda audio, also voiced by Lord, felt extremely jarring to me: ‘Our audiobooks are becoming increasingly popular among travellers, families and people who are on the go …’ This may be due to the gap between Prescott’s nihilistic vision and the light tone required for such promotional messages. It reminded me that audiobooks, no matter how off-beat or counter-cultural, are nevertheless deeply implicated in the global publishing economy.


Audiobook production

The audiobook is a capacious format for a variety of texts, from narratives performed by amateur readers to dramatised multi-cast performances, akin to radio plays, with high production values. The making of an audiobook is an intense and focused collective process. Penguin Random House, one of the ‘big 4’ publishers, records many of its audiobooks in-house or pays to have them made by other companies. In order to cast the narrator, audio editors will read each book before pulling together a shortlist ready for discussion with the author. Once a narrator is chosen and booked in for the recording, they will be paired with a producer/director who will work with them on the book and direct the performance in studio. As Sam Halstead, editorial director for Penguin Random House Audio, comments:

The recording time can vary greatly depending on the length of the book but on average these are around four full days in studio, but can last much longer. […] It would be impossible to read any book in one take, so there is then a lot of skilled editing work that takes place once the recording is complete to remove any misreads and tidy up any noises in the background that may have crept in.

Both narrator and producer need to know the text intimately ahead of the recording to be ready for any surprises in the narrative – any tricky pronunciations, and character accents. They may liaise with the author as part of this preparation to ensure everything is performed just as intended. The author may also be the narrator, especially with memoir, as in the case of Helen Garner’s diaries.

It can be gruelling to record an entire audiobook alone – as Maria Issaris commented in a panel discussion at the Small Press Network conference in 2022, ‘once you start to get tired, there’s nowhere to hide’. The usual model is for one narrator to tackle all of the voices in a particular text, yet some audiobooks, such as George Saunders’ Lincoln in the Bardo, bring together a huge number of voices. The audiobook cast of Lincoln in the Bardo is made up of 166 narrators including David Sedaris, Lena Dunham, Ben Stiller, Susan Sarandon and Don Cheadle. There were so many people involved that Penguin Random House Audio applied (unsuccessfully) for a Guinness World Record for greatest number of voices on a single audiobook. Over a six-month period, people recorded lines for the audiobook in 17 studios across America. Emphasising the vocal element, George Saunders himself said of it: ‘I love the way that the variety of contemporary American voices mimics and underscores the feeling I tried to evoke in the book: a sort of American chorale.’ The solo narrated audiobook might be compared to a dramatic monologue while a well-peopled (expensive) version such as Penguin Random House’s Lincoln in the Bardo is an ‘ensemble’ or ‘full-cast’ production – ‘monstrous hybrids of book and theater’, in the words of the Washington Post’s Katherine A. Powers, in which the text is usurped by performance, disappearing into ‘thespian clamor’.


Publishing economies 

The audiobook market is dominated by the Amazon behemoth – the largest and most ‘customer-centric’ retailer on earth. Given that the data collected by Amazon is not available to scholars, it’s difficult to identify patterns and practices of listening. As Simone Murray argues, ‘Clearly the leviathan that is Amazon exerts immense influence on the global book trade, but how are scholars to document, much less critique, algorithmic culture’s self-reinforcing effects on cultural selection if denied access to the workings of the algorithm’s engine-room?’

More intensely aware of the realities of reading than publishing has ever been before, Amazon pays royalties to Kindle Unlimited writers based on the number of pages the Kindle user actually reads of their work rather than on the number of downloads alone. This distinction partly inspired Amazon’s introduction in 2017 of Amazon Charts, an in-house weekly best-seller list that divided the category of ‘Most Sold’ from ‘Most Read’. Amazon retains detailed records of Kindle and Audible reading sessions, which it justifies by saying that such data allows customers to pick up where they left off.

While Amazon data is not publicly available, there’s greater access to data through StoryTel, the second most popular audio-streaming service. Karl Berglund has studied the listening habits of StoryTel users, finding that what people read doesn’t seem to immensely affect when they read. Most notably, night readers are often heavy readers. StoryTel books are repeatedly bought and played night after night, presumably as a sort of ‘comfort listen’ or audio-sedative. The accessibility of StoryTel data compared with Amazon provides a more nuanced understanding of how people use their audiobooks – as an ongoing soundtrack or aural wallpaper rather than an epiphanic experience. For Berglund, the patterns are clear: the higher the consumption, the more users were biased toward night reading. The scholar of audiobooks might attempt to interpret the data of a dispersed collective of private StoryTel readers, as Berglund has done, but they can never know exactly why or how these books were chosen and absorbed without detailed follow-up interviews.


Beyond the human narrator? 

In the future, the ‘book’ part of the audiobook may not always be central. Matthew Rubery has argued that ‘books no longer seem essential to the audiobook’s future’, pointing as an example to The Chopin Manuscript, a serialised thriller written collectively by fifteen different writers and then read by a single narrator. The lead contributor, Jeffery Deaver, went on to write The Starling Project, another mystery written exclusively for audio and described by the author as a ‘nonvisual play’. Audiobooks may be produced collectively, even bypassing the codex altogether, yet they may still offer a ‘literary’ experience.

AI-voiced audiobooks (such as those from BookBaby) are already making inroads into the market. Audible has long held an anti-synthetic-narration policy, but has been called out by narrators and others who discovered and flagged several AI-narrated titles listed on its site, which were subsequently taken down. AI is obviously a cheaper alternative to live narration, yet without the interpretation work done by human narrators, listeners are likely to have more difficulty making sense of the text.

Undeniably, AI voices are not as ‘listenable’ as human audiobook narrators’ voices – they do not have ‘the grain’ that Roland Barthes identifies as essentially corporeal. Barthes compares the voice to a wood-type, a textural signifier of both the singular tree and the species or genus, the tree-type, from which a plank of wood or a piece of furniture derives that shows or carries the trace of its material form. When it comes to the grain of the voice, it is ‘the materiality of the body speaking’. Barthes tells us that the voice’s grain communicates, yet does so in a way that is alien to verbal communication. The grain is ‘the body in the singing voice, in the writing hand, in the performing limb’.


An Aboriginal prosody: Audiobook pedagogy 

As Daniel Willingham observes, we often hear voices in our heads as we read. This effect can be notable, especially when we know the sound of the author’s voice (Prince Harry or Michelle Obama, for example). ‘For audio books, the reader doesn’t need to supply the prosody – whoever is reading the book aloud does so. For difficult-to-understand texts, prosody can be a real aid to understanding.’ One of the courses offered at our university has Alexis Wright’s Carpentaria at its core. As a polyphonic epic set in the Gulf of Carpentaria in northern Australia, Wright’s novel depicts not only a chorale of Indigenous voices, but also multiple speech patterns, variegated verbal rhythms and regionally distinctive accents that may not be familiar to metropolitan and non-Indigenous readers. Alongside reading the print or e-book, students are encouraged to listen to the audiobook version narrated by Noongar actor and dramaturg, Isaac Drandic. Through a range of Indigenous accents and intonations, Drandic’s narration marks the text’s multiple and often unidentified or unidentifiable voices, making audible the multiple points of view around which the novel’s storytelling is built. Drandic’s skilled vocalisation, guided by Wright’s orchestration of the spoken and sonic soundscapes, offers a rich opportunity for encountering the beauty and range of Carpentaria’s prosody.

Audiobooks enable us to expand our habits of textual sense-making through sonic, and particularly vocal, perception beyond a narrowly visual field. Considering the transformations in the humanities that have escalated since the pandemic forced rapid shifts in our communicative practices, why should we not include the incorporation of audiobooks as one of the shifts in the tertiary literary studies classroom? With its multiple scenes of listening to the telling of stories small and large, a novel like Carpentaria invites an expanded pedagogical field where audiobooks facilitate the encounter with a text and its vocal rhythms.

Yet there are practical barriers to this expanded pedagogy. We encourage students to gain access to the streamed version of the novel through their local library via the BorrowBox or Libby apps or, for those with a commercial streaming subscription, to download and listen to the audiobook while reading the print or digital versions. Unfortunately, these means of audio access are not straightforward to implement for an entire class. The first option requires students to be familiar with municipal libraries when most seem not to be. The commercial streamer subscription costs just over $16 a month, an expense that few students can afford. It also locks them into a proprietary system to which some of them have strong objections. In addition, the most striking difference between the paid and free streaming systems is the instant availability of purchased audiobooks. By contrast, public access through free library streaming systems involves waiting in a queue for popular titles. It’s less convenient to document your own reading with BorrowBox or Libby because you cannot take your own notes within the app, whereas the Audible app allows for note-taking at regular intervals throughout.

As Tully Barnett notes, Amazon introduced the Highlights function for ebooks in 2010, allowing readers to select passages and store and access these selections on the device or on the ‘cloud’. The function also displays the number of readers who have ‘highlighted’ a particular passage (shown with a faint underlining and a numeral). The number of highlights visible in popular texts represents a form of asynchronous collective reading. The Kobo e-reader has a similar function with an added element of gamification, with readers able to track statistics about their habits and receive spontaneous awards for reading. These affordances have not been enabled for audiobook reading at the time of writing, although dedicated listener-readers have posted tips for how to record reading reflections outside of the apps they use.

The social reading of audiobooks is booming, along with app subscriptions. The Goodreads social reading platform, also owned by Amazon, allows contributors to give star ratings and add reviews of audiobooks and has compiled a list of Best Audiobooks Ever with almost 2000 entries. Although few distinctions are made between audiobooks, ebooks and print books on these platforms, the quality of narration is inevitably spotlit in reviews of audiobooks.


Intimacies of the ear 

As educators, we can see the immense opportunities for technology-supported social reading offered by the audiobook. Imagine how differently our courses could enact the encounter between orality and textuality that audiobooks enable if it were possible for university libraries to have a licence for titles that are commercially available? Programs like Perusall enable close readings and responses to audio segments and podcasts, yet we are hampered by the lack of legal access to full audio texts. We work around this technological/legal restriction by selecting passages that we read aloud in class. When we listen to the novel, we share it together, almost as they did in the Middle Ages.

Using audiobooks as class resources enables collective embodied responses, with the possibility of augmenting the codex-reading experience. James F. English argues that curricular book listening could ‘come out of the closet’ if audiobooks were set as reading options. This would prompt us to ‘rethink some old assumptions about literariness and mediation’. He sees many positive aspects to using audiobooks, remarking that they could even promote a mode of teaching that encourages more university students to connect deeply with literature. As we’ve already said, narrators can bring depth and intelligence to their voicing of literary texts, illuminating elements that readers may otherwise skim over, or not fully apprehend. Their interpretations can open up possibilities for extended discussions in class. Moreover, in the interests of equity, it’s only fair that a range of formats are made available for those with differing learning styles and neurodivergences.

Audiobooks create an effect of intimacy, or at least a sense of familiarity with a personal voice, a particular narrative, an image or mood. Or all of these at once! Perhaps it is this kind of proximity, this very closeness to a work of crafted writing by way of the ear rather than the eye, that is the source of literary studies’ resistance to audiobooks (‘cheating’!). Whether as part of a pedagogy or in critical discourse, having the intimacy of the ear as the sensory mode of engagement challenges the notion of critical objectivity at the heart of much scholarship in the humanities and literary studies. Proximity or intimacy to the work of art is thought to risk over-valuing the personal, the merely expressive, and the inner world over and against the distancing and estrangement that literary criticism conventionally upholds.

Rita Felski describes the distance mandated by critical thinking in the humanities as a mythology – not false, only partial. Rethinking our attachments to works of art, she says, will reinvigorate and expand critical engagement by opening up other, more positive forms of responsiveness and relationality, intellectual, ethical and institutional. In our classrooms we are rethinking the value of audiobooks for how and why we teach what we do. Streamed audiobooks foster attention and attachment through an intimate, aural experience with a professional narrator’s performance of crafted longform writing. These professional narrators enable an enrichment of textual experience and therefore of the pedagogic possibilities for critical understanding. For us, audiobooks expand both the sensory and cognitive range of our engagements with a text, increasing our attunement to voices beyond our own.