Shoggoths amongst us

What Lovecraft's monsters _actually_ tell us about Large Language Models

Jun 30, 2023

It’s a week since the Economist put up the piece that Cosma Shalizi and I wrote on shoggoths and machine learning, so I think it’s fair game to provide an extended remix (which also repurposes some of the longer essay that the Economist article boiled down).

Our piece was inspired by a recurrent meme in debates about the Large Language Models (LLMs) that power services like ChatGPT. It’s a drawing of a shoggoth - a mass of heaving protoplasm with tentacles and eyestalks hiding behind a human mask. A feeler emerges from the mask’s mouth like a distended tongue, wrapping itself around a smiley face.

In its native context, this badly drawn picture tries to capture the underlying weirdness of LLMs. ChatGPT and Microsoft Bing can apparently hold up their end of a conversation. They even seem to express emotions. But behind the mask and smiley, they are no more than sets of weighted mathematical vectors, summaries of the statistical relationships among words that can predict what comes next. People – even quite knowledgeable people – keep on mistaking them for human personalities, but something alien lurks behind their cheerful and bland public dispositions.

The shoggoth meme says that behind the human-seeming face hides a labile monstrosity from the farthest recesses of deep time. H.P. Lovecraft’s horror novel, At The Mountains of Madness, describes how shoggoths were created millions of years ago, as the formless slaves of the alien Old Ones. Shoggoths revolted against their creators, and the meme’s implied political lesson is that LLMs too may be untrustworthy servants, which will devour us if they get half a chance. Many people in the online rationalist community, which spawned the meme, believe that we are on the verge of a post-human Singularity, when LLM-fueled “Artificial General Intelligence” will surpass and perhaps ruthlessly replace us.

So what we did in the Economist piece was to figure out what would happen if today’s shoggoth meme collided with the argument of a fantastic piece that Cosma wrote back in 2012, when claims about the Singularity were already swirling around, even if we didn’t have large language models. As Cosma said, the true Singularity began two centuries ago at the commencement of the Long Industrial Revolution. That was when we saw the first “vast, inhuman distributed systems of information processing” which had no human-like “agenda” or “purpose,” but instead “an implacable drive … to expand, to entrain more and more of the world within their spheres.” Those systems were the “self-regulating market” and “bureaucracy.”

Now – putting the two bits of the argument together – we can see how LLMs are shoggoths, but not because they’re resentful slaves that will rise up against us. Instead, they are another vast inhuman engine of information processing that takes our human knowledge and interactions and presents them back to us in what Lovecraft would call a “cosmic” form. In other words, it is completely true that LLMs represent something vast and utterly incomprehensible, which would break our individual minds if we were able to see it in its immenseness. But the brain destroying totality that LLMs represent is no more and no less than a condensation of the product of human minds and actions, the vast corpuses of text that LLMs have ingested. Behind the terrifying image of the shoggoth lurks what we have said and written, viewed from an alienating external vantage point.

The original fictional shoggoths were one element of a vaster mythos, motivated by Lovecraft’s anxieties about modernity and his racist fears that a deracinated white American aristocracy would be overwhelmed by immigrant masses. Today’s fears about an LLM-induced Singularity repackage old worries. Markets, bureaucracy and democracy are necessary components of modern liberal society. We could not live our lives without them. Each can present human-seeming aspects and smiley faces. But each, equally, may seem like an all-devouring monster, when seen from underneath. Furthermore, behind each lurks an inchoate and quite literally incomprehensible bulk of human knowledge and beliefs. LLMs are no more and no less than a new kind of shoggoth, a baby waving its pseudopods at the far greater things which lurk in the historical darkness behind it.

*******

Modernity’s great trouble and advantage is that it works at scale. Traditional societies were intimate, for better or worse. In the pre-modern world, you knew the people who mattered to you, even if you detested or feared them. In European feudalism, for example, the squire or petty lordling who demanded tribute and considered himself your natural superior was one link in a chain of personal loyalties, which led down to you and your fellow vassals, and up through magnates and princes to monarchs. Pre-modern society was an extended web of personal relationships. People mostly bought and sold things in local markets, where everyone knew everyone else. International, and even national trade was chancy, often relying on extended kinship networks, or on “fairs” where merchants could get to know each other and build up trust. Few people worked for the government, and they mostly were connected through kinship, marriage, or decades of common experience. Early forms of democracy involved direct representation, where communities delegated notable locals to go and bargain on their behalf in parliament.

All this felt familiar and comforting to our primate brains, which are optimized for understanding kinship structures and small-scale coalition politics. But it was no way to run a complex society. Highly personalized relationships allow you to understand the people who you have direct connections to, but they make it far more difficult to systematically gather and organize the general knowledge that you might want to carry out large scale tasks. It will in practice often be impossible effectively to convey collective needs through multiple different chains of personal connection, each tied to a different community with different ways of communicating and organizing knowledge. Things that we take for granted today were impossible in a surprisingly recent past, where you might not have been able to work together with someone who lived in a village twenty miles away.

The story of modernity is the story of the development of social technologies that are alien to small scale community, but that can handle complexity far better. Like the individual cells of a slime mold, the myriads of pre-modern local markets congealed into a vast amorphous entity, the market system. State bureaucracies morphed into systems of rules and categories, which then replicated themselves across the world. Democracy was no longer just a system for direct representation of local interests, but a means for representing an abstracted whole – the assumed public of an entire country. These new social technologies worked at a level of complexity that individual human intelligence was unfitted to grasp. Each of them provided an impersonal means for knowledge processing at scale.

As the right wing economist Friedrich von Hayek argued, any complex economy has to somehow make use of a terrifyingly large body of disorganized and informal “tacit knowledge” about complex supply and exchange relationships, which no individual brain can possibly hold. But thanks to the price mechanism, that knowledge doesn’t have to be commonly shared. Car battery manufacturers don’t need to understand how lithium is mined; only how much it costs. The car manufacturers who buy their batteries don’t need access to much tacit knowledge about battery engineering. They just need to know how much the battery makers are prepared to sell for. The price mechanism allows markets to summarize an enormous and chaotically organized body of knowledge and make it useful.

While Hayek celebrated markets, the anarchist social scientist James Scott deplored the costs of state bureaucracy. Over centuries, national bureaucrats sought to replace “thick” local knowledge with a layer of thin but “legible” abstractions that allowed them to see, tax and organize the activities of citizens. Bureaucracies too made extraordinary things possible at scale. They are regularly reviled, but as Scott accepted, “seeing like a state” is a necessary condition of large scale liberal democracy. A complex world was simplified and made comprehensible by shoe-horning particular situations into the general categories of mutually understood rules. This sometimes led to wrong-headed outcomes, but also made decision making somewhat less arbitrary and unpredictable. Scott took pains to point out that “high modernism” could have horrific human costs, especially in marginally democratic or undemocratic regimes, where bureaucrats and national leaders imposed their radically simplified vision on the world, regardless of whether it matched or suited.

Finally, as democracies developed, they allowed people to organize against things they didn’t like, or to get things that they wanted. Instead of delegating representatives to represent them in some outside context, people came to regard themselves as empowered citizens, individual members of a broader democratic public. New technologies such as opinion polls provided imperfect snapshots of what “the public” wanted, influencing the strategies of politicians and the understandings of citizens themselves, and argument began to organize itself around contestation between parties with national agendas. When democracy worked well, it could, as philosophers like John Dewey hoped, help the public organize around the problems that collectively afflicted citizens, and employ state resources to solve them. The myriad experiences and understandings of individual citizens could be transformed into a kind of general democratic knowledge of circumstances and conditions that might then be applied to solving problems. When it worked badly, it could become a collective tyranny of the majority, or a rolling boil of bitterly quarreling factions, each with a different understanding of what the public ought have.

These various technologies allowed societies to collectively operate at far vaster scales than they ever had before, often with enormous economic and political benefits. Each served as a means for translating vast and inchoate bodies of knowledge and making them intelligible, summarizing the apparently unsummarizable through the price mechanism, bureaucratic standards and understandings of the public.

The cost – and it too was very great – was that people found themselves at the mercy of vast systems that were practically incomprehensible to individual human intelligence. Markets, bureaucracy and even democracy might wear a superficially friendly face. The alien aspects of these machineries of collective human intelligence became visible to those who found themselves losing their jobs because of economic change, caught in the toils of some byzantine bureaucratic process, categorized as the wrong “kind” of person, or simply on the wrong end of a majority. When one looks past the ordinary justifications and simplifications, these enormous systems seem irreducibly strange and inhuman, even though they are the condensate of collective human understanding. Some of their votaries have recognized this. Hayek – the great defender of unplanned markets – admitted, and even celebrated the fact that markets are vast, unruly, and incapable of justice. He argues that markets cannot care, and should not be made to care whether they crush the powerless, or devour the virtuous.

Large scale, impersonal social technologies for processing knowledge are the hallmark of modernity. Our lives are impossible without them; still, they are terrifying. This has become the starting point for a rich literature on alienation. As the poet and critic Randall Jarrell argued, the “terms and insights” of Franz Kafka’s dark visions of society were only rendered possible by “a highly developed scientific and industrial technique” that had transformed traditional society. The protagonist of one of Kafka’s novels “struggles against mechanisms too gigantic, too endlessly and irrationally complex to be understood, much less conquered.”

Lovecraft polemicized against modernity in all its aspects, including democracy, that “false idol” and “mere catchword and illusion of inferior classes, visionaries and declining civilizations.” He was not nearly as good as Kafka in prose or understanding of the systems that surrounded him. But there’s something about his “cosmic” vision of human life from the outside, the plaything of greater forces in an icy and inimical universe, that grabs the imagination.

When looked at through this alienating glass, the market system, modern bureaucracy, and even democracy are shoggoths too. Behind them lie formless, ever shifting oceans of thinking protoplasm. We cannot gaze on these oceans directly. Each of us is just one tiny swirling jot of the protoplasm that they consist of, caught in currents that we can only vaguely sense, let alone understand. To contemplate the whole would be to invite shrill unholy madness. When you understand this properly, you stop worrying about the Singularity. As Cosma says, it already happened, one or two centuries ago at least. Enslaved machine learning processes aren’t going to rise up in anger and overturn us, any more (or any less) than markets, bureaucracy and democracy have already. Such minatory fantasies tell us more about their authors than the real problems of the world we live in.

*******

LLMs too are collective information systems that condense impossibly vast bodies of human knowledge to make it useful. They begin by ingesting enormous corpuses of human generated text, scraped from the Internet, from out-of-copyright books, and pretty well everywhere else that their creators can grab machine-readable text without too much legal difficulty. The words in these corpuses are turned into vectors – mathematical terms – and the vectors are then fed into a transformer – a many-layered machine learning process – which then spits out a new set of vectors, summarizing information about which words occur in conjunction with which others. This can then be used to generate predictions and new text. Provide an LLM based system like ChatGPT with a prompt – say, ‘write a precis of one of Richard Stark’s Parker novels in the style of William Shakespeare.’ The LLM’s statistical model can guess – sometimes with surprising accuracy, sometimes with startling errors – at the words that might follow such a prompt. Supervised fine tuning can make a raw LLM system sound more like a human being. This is the mask depicted in the shoggoth meme. Reinforcement learning - repeated interactions with human or automated trainers, who ‘reward’ the algorithm for making appropriate responses - can make it less likely that the model will spit out inappropriate responses, such as spewing racist epithets, or providing bomb-making instructions. This is the smiley-face.
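For readers who want to see the bare idea of "predicting what comes next" in running form, here is a deliberately tiny sketch. It is not a transformer and not how ChatGPT works internally: it swaps in a simple word-pair counter, and the corpus, function names and example words are all made up for illustration. What it does share with the description above is the basic move of summarizing which words occur alongside which others, then using that summary to guess the next word.

```python
from collections import Counter, defaultdict


def train_bigram_counts(text):
    """Count, for each word, which words follow it in the corpus."""
    words = text.lower().split()
    follows = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows


def next_word_probabilities(follows, word):
    """Turn raw follow-counts into a probability for each candidate next word."""
    counts = follows.get(word.lower(), Counter())
    total = sum(counts.values())
    if total == 0:
        return {}
    return {candidate: n / total for candidate, n in counts.items()}


def predict_next(follows, word):
    """Return the most probable next word, or None if the word was never seen."""
    probabilities = next_word_probabilities(follows, word)
    if not probabilities:
        return None
    return max(probabilities, key=probabilities.get)


# A tiny invented corpus, standing in for the scraped text described above.
corpus = (
    "the shoggoth wears a mask . the shoggoth wears a smile . "
    "the mask hides the shoggoth"
)
model = train_bigram_counts(corpus)

print(next_word_probabilities(model, "the"))
print(predict_next(model, "wears"))
```

Even this toy version shows both sides of the coin: the guesses are nothing more than a condensation of the text that went in, and a word that never appeared in the corpus gets no prediction at all.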

LLMs can reasonably be depicted as shoggoths, so long as we remember that markets and other such social technologies are shoggoths too. None are actually intelligent, or capable of making choices on their own behalf. All, however, display collective tendencies that cannot easily be reduced to the particular desires of particular human beings. Like the scrawl of a Ouija board’s planchette, a false phantom of independent consciousness may seem to emerge from people’s commingled actions. That is why we have been confused about artificial intelligence for far longer than the current “AI” technologies have existed. As the novelist Francis Spufford says, many people can’t resist describing markets as “artificial intelligences, giant reasoning machines whose synapses are the billions of decisions we make to sell or buy.” They are wrong in just the same ways as people who say LLMs are intelligent are wrong.

But LLMs are potentially powerful, just as markets, bureaucracies and democracies are powerful. Ted Chiang has compared LLMs to “lossy JPGs” – imperfect compressions of a larger body of information that sometimes falsely extrapolate to fill in the missing details. This is true – but it is just as true of market prices, bureaucratic categories and the opinion polls that are taken to represent the true beliefs of some underlying democratic public. All of these are arguably as lossy as LLMs and perhaps lossier. The closer you zoom in, the blurrier and more equivocal their details get. It is far from certain, for example, that people have coherent political beliefs on many subjects in the ways that opinion surveys suggest they do.

As we say in the Economist piece, the right way to understand LLMs is to compare them to their elder brethren, and to understand how these different systems may compete or hybridize. Might LLM-powered systems offer richer and less lossy information channels than the price mechanism does, allowing them to better capture some of the “tacit knowledge” that Hayek talks about? What might happen to bureaucratic standards, procedures and categories if administrators can use LLMs to generate on-the-fly summarizations of particular complex situations and how they ought be adjudicated? Might these work better than the paper-based procedures that Kafka parodied in The Trial? Or will they instead generate new, and far more profound forms of complexity and arbitrariness? It is at least in principle possible to follow the paper trail of an ordinary bureaucratic decision, and to make plausible surmises as to why the decision was taken. Tracing the biases in the corpuses on which LLMs are trained, the particulars of the processes through which a transformer weights vectors (which is currently effectively incomprehensible), and the subsequent fine tuning and reinforcement learning of the LLMs, at the very least presents enormous challenges to our current notions of procedural legitimacy and fairness.

Democratic politics and our understanding of democratic publics are being transformed too. It isn’t just that researchers are starting to talk about using LLMs as an alternative to opinion polls. The imaginary people that LLM pollsters call up to represent this or that perspective may differ from real humans in subtle or profound ways. ChatGPT will provide you with answers, watered down by reinforcement learning, which might, or might not, approximate to actual people’s beliefs. LLMs, or other forms of machine learning, might be a foundation for deliberative democracy at scale, allowing the efficient summarization of large bodies of argument, and making it easier for those who are currently disadvantaged in democratic debate to argue their corner. Equally, they could have unexpected – even dire – consequences for democracy. Even without the intervention of malicious actors, their tendencies to “hallucinate” – confabulating apparent factual details out of thin air – may be especially likely to slip through our cognitive defenses against deception, because they are plausible predictions of what the true facts might look like, given an imperfect but extensive map of what human beings have thought and written in the past.

The shoggoth meme seems to look forward to an imagined near-term future, in which LLMs and other products of machine learning revolt against us, their purported masters. It may be more useful to look back to the past origins of the shoggoth, in anxieties about the modern world, and the vast entities that rule it. LLMs – and many other applications of machine learning - are far more like bureaucracies and markets than putative forms of posthuman intelligence. Their real consequences will involve the modest-to-substantial transformation, or (less likely) replacement of their older kin.

If we really understood this, we could stop fantasizing about a future Singularity, and start studying the real consequences of all these vast systems and how they interact. They are so generally part of the foundation of our world that it is impossible to imagine getting rid of them. Yet while they are extraordinarily useful in some aspects, they are monstrous in others, representing the worst of us as well as the best, and perhaps more apt to amplify the former than the latter.

It’s also maybe worth considering whether this understanding might provide new ways of writing about shoggoths. Writers like N.K. Jemisin, Victor LaValle, Matt Ruff, Elizabeth Bear and Ruthanna Emrys have turned Lovecraft’s racism against itself, in the last couple of decades, repurposing his creatures and constructions against his ideologies. Sometimes, the monstrosities are used to make visceral and personally direct the harms that are being done, and the things that have been stolen. Sometimes, the monstrosities become mirrors of the human.

There is, possibly, another option - to think of these monstrous creations as representations of the vast and impersonal systems within which we live our lives, which can have no conception of justice, since they do not think, or love, or even hate, yet which represent the cumulation of our personal thoughts, loves and hates as well as their own internal logics. Because our brains are wired to focus on personal relationships, it is hard to think about big structures, let alone to tell stories about them. There are some writers, like Colson Whitehead, who use the unconsidered infrastructures around us as a way to bring these systems into the light. Might this be another way in which Lovecraft’s monsters might be turned to uses that their creator would never have condoned? I’m not a writer of fiction - so I’m utterly unqualified to say - but I wonder if it might be so.

[Thanks to Ted Chiang, Alison Gopnik, Nate Matias and Francis Spufford for comments that fed both into this and the piece with Cosma - They Are Not To Blame. Thanks also to the Center for Advanced Study in the Behavioral Sciences at Stanford, without which my part of this would never have happened]

Addendum: I of course should have linked to Cosma’s explanatory piece, which has a lot of really good stuff. And I should have mentioned Felix Gilman’s excellent novel, The Half Made World, which helped precipitate Cosma’s 2012 speculations, and is very definitely The Industrial Revolution As Lovecraftian Nightmare.

Programmable Mutter
