AI and the Future of Literary Studies
Compare the following two paragraphs. One is a mission statement for a university in regional Australia. The other is generated by AI from the prompt ‘write a mission statement for university based in Australia, with a regional focus’.
Our innovation and excellence in both education and research generate ideas that transform lives and communities. We will be the region’s most progressive and responsive university, leading in blending digital capability with our distinctive campus precincts. We will leverage strong partnerships to maximise the social, cultural and economic impact we deliver regionally, nationally, and globally.
Our university is dedicated to providing quality education with a regional focus. We strive to prepare our students to become responsible citizens who positively impact their communities, while also fostering a sense of global awareness. Our goal is to produce graduates who are well-rounded, critical thinkers with the skills necessary to succeed in an ever-changing world.
Which was written by AI, and which by humans?
At least to my ears, the first statement sounds less human. While the terms ‘innovation’ and ‘excellence’ are thrown around a lot in universities – and normally in ways that obfuscate meaningful goals or projects – their use here feels to me at best only life-like. It seems improbable that university managers would be so tin-eared as to take these words straight out of senior team meetings and put them into a mission statement that is meant to speak to students, faculty, staff, politicians, and the wider public.
I pause too over the phrase ‘distinctive campus precincts’. Even though this phrase makes a claim for difference (‘distinctive’), it in fact has no place or history. It is what you might say if, like a machine, you did not have a felt sense of where your university is and what distinguishes it. The two tricolons (‘social, cultural and economic impact’; ‘deliver regionally, nationally, and globally’) similarly feel drawn from the dream-like archive of management rhetoric. The university is to be all things to everyone all at once – and, conversely, it is to commit to nothing at all.
By contrast, the second statement is more direct. The first sentence indicates what the university actually does: it gives students from the region access to education. The second sentence, while certainly more in the idiom of the mission statement than the previous, is bolder than any of the sentences contained in that first statement. Regional universities are responsible both to their communities and the wider world, and we can imagine that a mission statement might reflect these humble yet bold aims. The final sentence is more rote again, with its focus on well-rounded and critical thinkers for an ‘ever-changing world’. But it is still a clear and even laudable aim for a regional university to seek to translate from the local to the global.
Reading this second statement over, I realise that this is the university that I would rather work for. It is focused on the student as a person (‘well-rounded, critical thinkers’). This is what I want out of my classroom, as I help my students from all variety of backgrounds to be wider readers, clearer thinkers, and more engaged citizens of the world they are entering. After having encountered real students in the second statement, I am led to realise just how absent they are from the first. The human element of education has disappeared behind robotic claims about the institution’s ‘social, cultural, and economic impact’. There are no people teaching, learning, or researching here.
As you may have guessed by now, it is in fact the second statement that is generated entirely by AI. My only intervention was to remove the phrase ‘in Australia’ from its opening sentence. The first statement, on the other hand, is how a large regional university in Australia phrases its ‘ambition’ in its ‘strategic plan’. The robotic here has come about naturally, as it were, without machine input.
We have long presumed that writing is one of the most human of activities, that it is one of the least available to machine replication. For many of us, it feels as close to the heart as love is – and machines cannot love. Certainly, some of the greatest accounts of human creativity encourage us to think this way. In The Prelude, William Wordsworth feels a ‘mild creative breeze’ passing over him, which soon becomes ‘a tempest, a redundant energy / Vexing its own creation’. This ‘storm’, as he describes it, breaks up the ‘long-continued frost’ of his existence and expression, bringing with it the promise of a new spring. ‘The hope of active days, of dignity and thought’ are opened up to him, a dream of ‘Pure passions, virtue, knowledge, and delight, / The holy life of music and of verse’. In this moment of profound breakthrough, he pours his ‘Song’ out into the open fields and finds his soul tumbling out with it. ‘Great hopes were mine,’ he writes, ‘My own voice cheared me, and, far more, the mind’s / Internal echo of the imperfect sound’.
Yet generative AI calls into question much that we think we know about the relationship between writing and the self. Or, to put it more precisely, generative AI surfaces the extent to which language is a set of patterns, ones that, though they may be invisible to us, can in fact be recognised by a machine when trained on an unimaginably large corpus. In our own learning, we have internalised from childhood the rules of a human language – a ‘mother tongue’ – until we have now forgotten that we are operating by any rules at all. It often feels uncanny how generative AI recognises which words are most likely to follow others, as it identifies patterns and then predicts them.
The result is a synthesis that fools us into feeling that it is drawn from life. In the two vision statements I quoted, GPT3 calls on the corpus of the internet to find the language likely to be used here. There will be phrases about students, communities, impact, critical thought, and the constancy of change. The fact that the first statement – that is, the genuine one – is so out of step with wider patterns might be enough to give us pause. Innovation, excellence, digital capability, and partnerships turn out to be the unusual fixation of this ‘distinctive campus precinct’.
The synthetic qualities of AI-produced language recall an approach to creativity in literary thought that is dramatically different from Wordsworth’s (and indeed from most of my students’). This view, which can now only gain greater currency, radicalises skepticism about human rationality and assumes that authors are not, in effect, in control of what they write. Wordsworth’s image of attempting to listen to the ‘internal echo’ of his imperfect expression gestures in this direction: even when the muse is upon him, finding his true poetic voice remains an immense struggle. The arguments against intention span very different intellectual projects and histories, from Wimsatt and Beardsley’s essay on what they call ‘the intentional fallacy’ to Roland Barthes, for whom texts have ‘no other origin than language itself, language which ceaselessly calls into question all origins.’
Most famous of all – and, as it happens, most congenial to generative AI – is Jacques Derrida’s dictum, ‘il n’y a pas de hors-texte’, translated by Gayatri Spivak as ‘there is nothing outside the text’. This phrase has been treated as emblematic of the anti-humanism of poststructural thinkers, as they leave us as little more than automatons swimming around in a sea of language. Michel Foucault concluded that Derrida was granting ‘the voice of the masters […] unlimited sovereignty’ to ‘indefinitely […] resay the text’. Much of this dispute has now been resolved via synthesis: there may be nothing outside ‘the text’, it seems, but ‘the text’ has been taken to mean all discourse, and all that shapes it – economics and power. In this version, ‘there is nothing outside the text’ is more akin to ‘nothing is non-textual’, as Leslie Hill has it. For generative AI, of course, there is indeed nothing outside the text and nothing is non-textual – it is just that ‘the text’ is everything that has been written and put on the internet.
By contrast, human learning draws on everything we have available to us – our senses, experience, and language – and is shaped by and expressed through neurobiology. Nowhere are the consequences of the distinction between human and machine learning clearer than when AI begins to hallucinate. I asked GPT3 to ‘write a paragraph explaining the meaning of the armadillo in Elizabeth Bishop’s “The Armadillo”’:
In Elizabeth Bishop’s poem ‘The Armadillo,’ the armadillo serves as a symbol of transience and the fleeting nature of life. The poem describes a funeral procession for an armadillo that has been hit by a car, drawing attention to the animal’s vulnerability and mortality. Through the use of imagery and metaphor, Bishop explores the idea that death is an inevitable part of life, and that everything, no matter how seemingly tough or impermeable, will eventually pass away. The armadillo’s armour-like exterior, which serves as a protective barrier in life, becomes a metaphor for the fragility of life itself. Ultimately, the armadillo in ‘The Armadillo’ represents the transience of all living things and serves as a reminder of the ephemeral nature of existence.
Those who know the poem are likely to laugh at this reading: Bishop’s marvellous account of watching fire balloons rising on a saint’s day in Brazil is in no way about a funeral procession for an animal hit by a car. There is no car and the armadillo is fine. Neither does Bishop use the armadillo to reflect on ephemerality and fragility – in fact, quite the opposite. Yet GPT3’s missteps are symptomatic, as they surface patterns of language that are hidden from us, ones that operate both locally and at a colossal level of abstraction.
GPT3 associates armadillos with a car accident because of the many articles on the internet, especially from the United States, in which the creatures are mentioned in relation to collisions and tyre punctures. ‘Spend a few months careering around Texas roads, and odds are you’ll encounter an armadillo or two. These small, armored creatures look like something out of a science fiction story, or a throwback from prehistoric times’, says one website. The armadillo only comes up once in Bishop’s poem, and there are few close readings available online. As a consequence, GPT3 draws on the patterns it knows – armadillos have armour and are hit by cars – to fill in the gaps.
More interesting again is GPT3’s mastery of high school poetry blather. It has recognised that a work of literature is under discussion, so it draws on the terms and phrases that are likely to be used in this context. The main sources for these materials are websites aimed at helping students with their high school literature examinations. This example features the basics of poetry analysis (‘imagery’; ‘metaphor’) fused with reflections on mortality and the passing away of all things. The pattern here is one that teachers of literary studies are unlikely to want to acknowledge: when poetry comes up, students are rewarded for switching to a language of high seriousness. It is this very pastiche element of GPT3’s account of the poem – it feels like poetry analysis, but it does not know anything about either the poem or the human experience it describes – that makes this writing simultaneously so right and wrong. All poems, it seems, must be about something profound and devastating, rather than about having a nice sandwich – or indeed bottle of Coke.
Much of what literary scholars do when we teach is behave in the manner of an academic discussing books intelligently, demonstrating to students the language that they should then use to discuss literature, the sensibility that they should internalize and then reproduce if they too want to be the right kind of attentive reader. Exemplaris: the modelling of how to do something, the giving of an example, is one of the oldest ideas of what it means to be an educator. Reading over GPT3’s account of the poem is in that sense oddly familiar. The feeling is akin to reading a student essay that is trying too hard to be ‘academic’, right down to the heavy-handed pursuit of death and mortality. Terms proliferate but meaning is deferred; plausible sentences appear but understanding is not to be found. In GPT3’s poetry analysis, I feel as though I am drowning in a sea of discourse, unreal and endlessly productive.
Where generative AI is especially weak, though, is where it comes up against that which cannot be synthesized out of discourse itself. The armadillo has a meaning that is more specific than generalized poetry blather allows. In the absence of online resources, and short of actually understanding the poem, the AI identifies patterns that are not true and then confidently parrots them. It is a kind of accelerated ‘topic modelling’ familiar from digital research, but without the hand editing, safety checks, and indeed the knowledge, that together lend this approach legitimacy.
All of which brings me back to university mission statements. Of course, they too are their own kind of pastiche, in this case drawn down from the global spectral unconscious of management rhetoric. Just like my students learning how to discuss literature, university leaders at some stage had to figure out how to speak as though they were competent managers, until they too could confidently reproduce the language of strategy, vision, partnerships, leverage, stakeholders, and so on. Yet, unlike my students interpreting poetry, they are not limited by the reality principle. In fact, it is more than that: management-speak cannot survive reality because it is designed to shield its speakers from it. This is normally its strength.
One of the most important early archives for research into natural language processing was the Enron email corpus (which is available for download and manipulation by data scientists). After the US Federal Energy Regulatory Commission finished its investigation of how the company came to collapse so spectacularly in 2001, it decided that it was in the public interest to make the materials widely accessible online. The emails show that weeks before the company’s failure, executives were telling each other that the firm was not only in good shape but that it would break all profit expectations. These cheery despatches from the frictionless world of make-believe were structured in and by the language of corporate confidence. The communications are in that sense much more than lies: they are testaments to a language of pure fantasy. Truth departed the scene and executives limitlessly remixed each other’s stories. When reality eventually intruded on these witless leaders – much too late, and not as a consequence of anyone at Enron telling the truth – the discourse collapsed.
The Enron emails are still present as ghosts in the machine of AI text generation. In all applied computational research, one of the central issues is how to gain access to high quality and machine-readable data. After the emails were made publicly available, AI researchers had at their disposal an enormous archive of people writing in natural language in a workplace setting. It was an irresistible opportunity. The 1.6 million emails have long been the dataset of choice for natural language processing projects. In other words, the most significant material on which natural language processing was based was the trail left behind from the largest bankruptcy in American corporate history at the time. Crucially, the archive is from the period leading up to the collapse. This was a period of such remarkable wrongdoing that Enron’s CEO and COO, along with numerous other executives from the firm, were found guilty of multiple counts of fraud and conspiracy. One of the world’s largest corporate audit firms, Arthur Andersen, had its CPA licence revoked in the aftermath.
Where GPT3 is strongest is precisely in this realm of make-believe, where things cannot be tested or known. Think of it as a firm that is bankrupt and trading but that never collapses – Enron without the Wall Street Journal. Il n’y a pas de hors-texte, it turns out, is true not so much for literary writing and its interpretation, but for the forms of expression that were never more than synthetic in the first place. This is why large language models perform so much more believably when they attempt to pastiche university mission statements and other corporate dejecta than when they attempt to read poems. Such statements are already pastiches that cannot survive outside the entirely internal world of their own discourse. They can be mimicked effortlessly because they ultimately refer to no reality, to nothing concrete, nothing that is meaningfully there. It is language all the way down. In fact, that is their point.
If this is a surprising place to arrive – the suggestion that poetry is real and testable, while university mission statements are not – it is also one that gives a new relevance to the discipline in which I have spent my career working. Generative AI writes convincing bullshit at a speed never seen before. It is a firehose of discourse, university mission statements until the end of time. The patience to understand what words actually mean, where they come from, what value they might have, and whether something is true or simply feels true – these are skills that can only become more significant when anyone can make a machine say anything.
As we move into a new era of writing, the archive of training data and neural patterns available to generative AI will become nearly limitless. While AI system designers will attempt to ensure that large language models are trained solely on human-generated content, there will be limitations to such efforts – not least because humans will use AI to write text that they will then lightly edit. AI text will be internalized by AI models, tending toward what John Barth described in the 1960s as a funhouse of language. More and more and more and more, endless distortions and reflections of each other, turned around and put back online and back into the model, forever. The result will not be consciousness, but rather, and very literally, internet brain, operating at extreme speed.
By contrast, the domain of literary thought remains resolutely human. Language, at least as I explore it in my classroom, is a way of understanding the world and ourselves. It is drawn from what we think, feel, and know: the place where the human contacts reality and attempts to account for it in words. Standing at the front of my class, working through a poem by Sylvia Plath with my students, I will continue to introduce young people to a world of thought and feeling that they have only just begun to understand, to read. It is this space where we become something other than machines maximising our ‘social, cultural and economic impact’. As absurd and naïve as this might sound, it is in this space of teaching where my students find their own voices, as opposed to that which has already been prepared for them by a world only too keen to make them ‘resay the text’.
As academic literary studies faces the challenge of generative AI, we should focus on what this kind of teaching has done well in one form or another for over a century: help students to encounter the good, the beautiful, and the true, and learn how to understand texts, themselves, and the world. For far too long we have accepted the parameters defined by our managers, spun and remixed from the top offices of our ‘distinctive campus precincts’. However, it has only become clearer how inimical the forms of value explored in literary studies are to management discourse – a discourse that, like AI, cannot recognise literary values as in any way distinctive. In our universities, costs have been driven down, and academics have been made to offer less and less to our students, and the fundamental human engagement that is at the heart of literary pedagogy has been circumscribed.
Assessment, of course, will have to change. The academic essay will no longer be the sine qua non of examination in university literature departments. It is too easy for students to synthesise material through AI and there will be limited evidence that they have genuinely learnt anything. While there are some situations where it might be appropriate to go back to pen and paper, if the purpose is simply to police students then we are doing both them and our disciplines a disservice. The gains to academic ‘integrity’ would come at the cost of making the examination process ever more artificial and removed from what our students will do as graduates. The written exam does little to help with developing a long-term practice of writing as thinking, as it instead promotes regurgitation. Moreover, in years to come, our graduates will be writing and reading in an environment permeated with AI. Microsoft has invested US$10 billion in OpenAI: its major products, from Word to Outlook to Teams, will soon incorporate the technology as standard. Our students will have the option to reduce the level of AI assistance, but for much of what they do a high level of autopiloting will not only be acceptable but preferable. Unless it is truly integral to the assessment that students write without external inputs, returning to pen and paper can only be a temporary measure until faculties figure out a coherent response.
When I first presented some of the ideas in this essay, a colleague suggested that generative AI now makes it possible to complete university study without reading, writing, or thinking. I said, ‘yes, but how truly different is that to now?’ Essays have been preferred in the mass university because they are scalable and cheap to assess, not necessarily because they reflect good pedagogy. I have found that all too often, the outcomes of such assessments do not reflect classroom dynamics, nor reward genuine engagement and understanding. As I tally up marks at the end of term, this is often a source of frustration to me. We can now change that.
There are any number of better options available to us right now than rounds of submitted essays. We can actually test the skills that we think we are helping students to gain through their educations – ‘authentic assessment’, as contemporary edu-speak has it. In my courses, for example, I want my students to be able to read poems and passages drawn from the course theme and to discuss them coherently. Among other things, they should be able to situate course materials in wider contexts, both historical and theoretical. These activities are what they will do when they become professionals, literary or not, and it is what they will do with books for the rest of their lives. I want my courses to feature a range of assessment options, such as: short student presentations on passages or poems followed by questions and answers from me; video reports with voiceover and creatively chosen images; creative extensions of course materials with elements of reflection and exegesis integrated; and, yes, an end-of-course essay. These essays, though, will be based more on research that the students have done than has previously been the norm. I will expect my students to argue for something, to show me something new. I will help them to get there by working with them on abstracts and drafts. All of these options support extended and long-term development processes – which are in their own ways forms of writing.
What is holding us back from a better teaching and learning paradigm is the cost cutting that has been rife in universities over the last few decades. Teaching students and assessing them in genuine ways is expensive. Innovative assessment strategies will only be possible with meaningful support from the university – the same institution that is of course hoping to return the ‘efficiency gains’ of AI to itself rather than to its students. The structure of the Australian university system does not make me hopeful on these fronts: our institutions have systematically diverted their resources away from students and researchers and towards administration. The NTEU reports that in the last three years, Australian higher education workers have been underpaid by $83.4 million, and a series of wage theft claims are still dragging through court. Meanwhile, administrators have hired ever more provosts and deans, while also giving themselves large pay rises. Unless we struggle against it, it is likely that efficiency gains from AI will simply end up in the pockets of university leaders.
The biggest challenge of generative AI is hence less a matter of whether students will still want to learn to read books, talk about them, understand themselves, and even write. They will, in whatever form, for the duration of my lifetime anyway. The challenge is that academics have next to no power over how the university uses its resources. In a moment when it is harder than ever to tell if an actual human has read something, much less understood it, we should demand again to connect meaningfully with our students. We can ask them: what do you know about this poem and how did you come to know it? How do you understand this passage in the context of the literary tradition we are helping you to encounter? How does this novel think? Explain it to us. Help us to see it. Think with us and think again – but this time for yourself.