[A new piece in the Economist] There's a killer app for Large Language Models
They're prayer wheels for organizational ritual
[Commercial announcement: My and Abraham Newman’s book, Underground Empire: How America Weaponized the World Economy is still available for $2.99 on Amazon Kindle. Also, it is about to come out in paperback in the UK and US. We now return you to your scheduled programming].
Marion Fourcade and I have a new piece up in the Economist. Here are some gift links. We make an argument that’s obvious in retrospect, but that I haven’t seen anyone else making. Large Language Models’ most straightforward application is as engines of organizational ritual.*
ARTHUR C. CLARKE wrote a story in which the entire universe was created so that monks could ritually write out the nine billion names of God. The monks buy a computer to do this faster and better, with unfortunate consequences for the rest of us. … Rituals aren’t just about God, but about people’s relations with each other. Everyday life depends on ritual performances such as being polite, dressing appropriately, following proper procedure and observing the law. … Organisations couldn’t work without rituals. When you write a reference letter for a former colleague or give or get a tchotchke on Employee Appreciation Day, you are enacting a ceremony, reinforcing the foundations of a world in which everyone knows the rules and expects them to be observed …
People already use [LLMs] to produce boilerplate language, write mandatory statements and end-of-year reports, or craft routine emails. … Because LLMs have no internal mental processes they are aptly suited to answering such ritualised prompts, spinning out the required clichés with slight variations. As Dan Davies, a writer, puts it, they tend to regurgitate “maximally unsurprising outcomes”. For the first time, we have non-human, non-intelligent processes that can generatively enact ritual at high speed and industrial scale, varying it as needed to fit the particular circumstances.
My bits of the article were inspired by two reinforcing pieces of information. One was a conversation with a friend, who works for an organization that requires Diversity, Equity and Inclusion statements. The friend described how they had spent hours writing a thoughtful, serious statement, and then spun up ChatGPT to generate one. The friend ended up submitting their own handcrafted statement, but couldn’t help feeling that their organization would have preferred the bland inanity of what ChatGPT had put out, which more perfectly matched the organization’s expectations and the general form of the thing.
The second was an observation from playing around with ChatGPT, which I’m sure other people have struck on, but which I haven’t seen explicitly reported. ChatGPT is very good at generating Schelling Points. The Nobel Prize-winning game theorist Thomas Schelling (whose book, The Strategy of Conflict, everyone ought to read) talks about how human beings can often solve incredibly difficult coordination problems by looking to shared information. He asked his students how they would meet someone in New York City on a particular day if they couldn’t communicate with the other party and hadn’t agreed a time and place: they tended to converge on Grand Central Station at midday.
Places like Grand Central Station and times like midday are salient: they are culturally prominent in such a way that people know that other people know they are important. Hence, if you are trying to guess what the other person will guess, and what the other person will guess that you are guessing, you will converge on an ‘obvious’ place and an ‘obvious’ time.
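To see how much work salience does, here is a toy sketch in Python; the candidate spots and the ‘salience’ weights are invented purely for illustration. Two players each pick a meeting place without communicating: choosing at random over five equally good spots coordinates about one time in five, while both reasoning their way to the most salient spot coordinates essentially always.

```python
# Toy illustration of focal-point coordination. The spots and salience
# weights below are made up for the example.
import random

salience = {
    "Grand Central Station": 0.6,
    "Times Square": 0.2,
    "Empire State Building": 0.1,
    "Central Park": 0.05,
    "Statue of Liberty": 0.05,
}

def pick_random():
    # Ignore salience: choose any spot with equal probability.
    return random.choice(list(salience))

def pick_focal():
    # "Pick the spot we both know everyone knows": the most salient one.
    return max(salience, key=salience.get)

def coordination_rate(strategy, trials=100_000):
    # Two independent players use the same strategy; count how often they match.
    return sum(strategy() == strategy() for _ in range(trials)) / trials

print("random choice:", coordination_rate(pick_random))  # ~0.2
print("focal point:  ", coordination_rate(pick_focal))   # 1.0
```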
ChatGPT is very good at finding such ‘obvious’ solutions, even for coordination problems that it can’t plausibly have been trained on directly. And, as the game theorist Michael Chwe argues, one of the key social functions of ritual is to generate or discover this kind of common knowledge.
Public rituals, rallies, and ceremonies generate the necessary common knowledge. A public ritual is not just about the transmission of meaning from a central source to each member of an audience; it is also about letting audience members know what other audience members know.
This helps explain why LLMs’ outputs so readily reinforce organizational ceremony and ritual: they converge very easily on what is expected, easing social coordination.
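One rough way to check that convergence for yourself is to put the same coordination question to a model in several independent sessions and see how tightly the answers cluster. Below is a sketch assuming OpenAI’s Python client; the model name, the prompt wording and the number of trials are illustrative choices, and the exercise is suggestive rather than rigorous.

```python
# Rough check of whether an LLM acts as a focal-point generator: ask the
# same coordination question in independent sessions and compare answers.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

PROMPT = (
    "You must meet a stranger in New York City tomorrow. You cannot "
    "communicate with them and have not agreed a time or place. "
    "Name the single place and time you would choose, as briefly as possible."
)

answers = []
for _ in range(5):
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative choice; any chat model would do
        messages=[{"role": "user", "content": PROMPT}],
    )
    answers.append(response.choices[0].message.content.strip())

# If the observation above holds, the replies should cluster heavily on
# something like "Grand Central Terminal at noon".
for answer in answers:
    print(answer)
```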
Equally - and this is the burden of our criticism - they tend to disconnect the performance of ritual from people’s personal beliefs and private knowledge. Much organizational ritual involves not just coordination, but the creation or refining of knowledge. For example, many rituals are about assessing others: annual reports, letters of recommendation, academic peer review. And there is a lot of evidence (some circumstantial, some scientific) to suggest that people are increasingly farming out these ceremonial functions to LLMs.
Ethan Mollick is cautiously optimistic that this could have benefits:
And yet, we may not want to dismiss the idea of AI helping with peer review. Recent experiments suggest AI peer reviews tend to be surprisingly good, with 82.4% of scientists finding AI peer reviews more useful than at least some of the human reviews they received on a paper, and other work suggests AI is reasonably good at spotting errors, though not as good as humans, yet.
although he worries that “the scientific publishing system was not made to support AI writers writing to AI reviewers for AI opinions for papers later summarized by AI.”
We’re more pessimistic. It isn’t just that LLMs are likely to do terribly at, for example, identifying genuinely innovative work, but that deploying them may disrupt the implicit social bargains and expectations that the system depends on. These systems work because they are seen as legitimate, and their legitimacy depends on the assumption that the assessor has taken real care in making their decisions. As we say in the Economist:
A bad performance evaluation is one thing if you think the manager has sweated over it, but quite another if you suspect he farmed it out to an algorithm. … Letters of recommendation, peer reviews and even scientific papers themselves will become less trustworthy.
A couple of days ago, Ted Underwood pointed us to this great paper by Zachary Wojtowicz and Simon DeDeo, which makes a similar argument using both game theory and results from computer science. We think that organizational sociology (which is Marion’s area of expertise) has a lot to say too, and hope to do more academic work developing these intuitions in the near future.
* The mechanization of ceremony isn’t new. Prayer wheels were very likely the inspiration for Clarke’s story: his imaginary Tibetan Buddhist monastery has already used a diesel generator to make them more efficient. And Christianity has its own Fordist impulses towards achieving efficiency gains. As a teenager in rural Ireland, I remember finding a pamphlet at home (my parents must have picked it up somewhere) with a prayer for souls in purgatory. It made much of the fact that it was a short prayer: the preamble promised that you could redeem five souls (or some such number) from the painful rigors of purgation in the time it took to boil an egg.