Fair warning: this is a long essay. It lays out three apparently isolated observations. It then argues that they are all connected. Finally, it waves towards a broader way to think about the political economy of AI* by looking at the more specific relationship between Large Language Models (LLMs) and intellectual property.
Observation One: Which is just another riff on the regularly repeated thesis of this newsletter. Mainstream debates about LLMs are mostly wrongheaded. LLMs are not going to give rise to autonomous self-willed actors, and are not really comparable to individual human intelligences. Instead, they are “cultural technologies,” as Alison Gopnik would put it. This means that they are potentially useful in much the same ways as other such technologies are useful. They provide new ways to access, order and remix human generated information. They are also potentially problematic, just as those technologies are often problematic too.
Observation Two: The phrase, “the map is not the territory,” was coined by Alfred Korzybski, who created “General Semantics,” a quasi-scientific theory that emanated from the same loose intellectual milieu as cybernetics (good!) and Scientology (bad!). But it was popularized by Stewart Brand, whose Whole Earth Catalogue was one of the chief intellectual sources for the ideologies of Silicon Valley. Over the last decade or so, the phrase has come to seem increasingly ironic, as informational maps have started to engulf the territories that they purportedly describe. Once upon a time, search engines provided links to resources, with brief descriptive snippets that helped users figure out whether a given resource was the one they wanted or not. Now, they regularly scrape information so as to summarize and cannibalize the commercially valuable aspects of the websites that they are “linking” to, right on the search page, so that you don’t have to click through to the website itself. These summarizations are hence skewed by the profit motive - modern search guides your attention towards things that make money for the search engine’s owner, and away from those that don’t.
Observation Three: Without anyone really talking about it, there has been a massive shift in the fights around big tech and intellectual property. Up until about a decade ago, there was a loose but obvious alliance between technology focused lefties who wanted free access to cultural resources and tech companies like Google, as they faced off against common enemies like Disney. Activists like my sorely missed friend Aaron Swartz would get excited about “Free the Mouse” bumper stickers, which protested Disney’s role in getting copyright protections extended. Today, Disney is still a problem (it doesn’t like to pay authors if it can get away with it), but the Mouse is Free. Lefties today are more likely to distrust platform companies, and to side with culture producers and those who want authors to be compensated. As they see it, Google has gone from Don’t Be Evil to Doctor Evil. There is an entire coalition devoted to disentangling research on technology from the funding priorities of big platform and search companies.
You can combine these three observations to generate a simple account of the political economy of AI, which is very different from the standard stories that I usually see. In order: if you understand that AIs (or more precisely the LLMs that people increasingly come into contact with every day) rely on human generated knowledge, you begin to notice the actual struggles for power that are partly obscured by the rhetoric. If you begin to see current information politics as a struggle between the map-makers and the owners and inhabitants of the territories they are mapping, then you develop a different understanding of the stakes. This understanding makes it obvious why the political coalitions around IP and big tech have shifted so dramatically.
This simple account - like all simple accounts - leaves a lot of important stuff out. It’s my first-take approximation of a theory about a big and complicated phenomenon - if there are things I get wrong, yell at me about them! But it also potentially brings out the connections between developments that initially look disconnected. I’ll go through each in turn, and then talk more about how they connect at the end.
LLMs as cultural technologies
If you’ve read this newsletter for a while, you’ll already know more than you ever wanted about the Case Against Both Accelerationism and X-Risk. It would be crude and wrong to suggest that AI Boomerism and Doomerism are deliberately manufactured ideological flimflam. Many people sincerely believe in the one or the other. Still, it isn’t just that imaginary futures distract from the politics of the present, but that the particular futures that are presented systematically obscure how the politics is working out. They incline us to see “AI” as a precursor to “AGI” - artificial general intelligence - conscious, goal-oriented activity that is equivalent to, or better than, human-level thinking. They promise delightful virtual companions who will make us happier, and devoted servants who will do all the things we don’t like doing, or alternatively they warn of rebellious and vengeful escaped slaves, or vast self-modifying intelligences that eliminate us as an accidental side effect of their vast projects.
Some suggest that the technologies will just innately be wonderful and awesome, because the Goddess of Progress has decreed it so. Others admit that there should be some rules - but want them to be aimed nearly exclusively at aligning the goals of these posited future intelligences with what we want them to do.
If you understand LLMs instead as cultural technologies - or alternatively as the successor to existing Lovecraftian monstrosities such as markets and bureaucratic states - you become a lot less impressed by these rather hazy visions. Instead, you can start to understand how LLMs and other forms of AI work as technologies that process human generated information.
Markets aren’t intelligent. Nor, for that matter, are ministries of defense. But both markets and ministries process information more or less well, with enormous implications for our lives. LLMs are another such technology; fancier in some ways, more primitive in others. Are they and other forms of AI as profound a social invention as large scale markets and bureaucracy? I doubt it myself, though it is much too early to say. But asking that question at least suggests the kinds of comparison we ought be making.
One helpful framework for comparison is Herbert Simon’s book, The Sciences of the Artificial. Simon treats markets, businesses, governments as though they were roughly analogous, and then talks about how they relate to artificial intelligence. All of the former are systems that allow human individuals, with our squishy, low bandwidth organic brains, to coordinate, achieving vast and sometimes extraordinary things together. The specific kinds of artificial intelligence (rules based systems) that Simon is riffing on are very different from the LLMs we see today. But his broader framework works for them too, and arguably works even better than for the earlier kinds of AI he wrote about. LLMs too serve as information transmitting and coordinating devices.
This implies that just as we can regulate firms and market actors, or collectively call errant government ministries to account, so too ought we regulate LLMs to achieve collective goals. Equally, there may be places where we want to rely on markets or democratic choice to constrain LLMs instead, or alternatively use LLMs to constrain the various other monsters of information. Political economy in advanced industrial societies is all about shoggoth handling - deploying these vast, sometimes inimical forces not only for useful purposes, but also, as needed, to constrain and counteract each other. Not so much HPL, then, as JKG.
Everything I’ve said so far recapitulates things that Cosma and I have written already, together or separately, to the annoyance of Singularity cultists. But if you pay close attention to these arguments, you’ll see that they are also potentially discomfiting to many of the lefties who denounce LLMs.
I see a lot of people whose attitude to LLMs sort of resembles the notorious Borscht Belt joke-complaint about the restaurant where the food is terrible - and such small portions! Many lefties argue that LLMs are fundamentally useless - they don’t do anything that is conceivably valuable. But at the same time they worry that these technologies will become ubiquitous, fundamentally reshaping the economy around themselves.
There isn’t any absolute logical contradiction between the two claims, and occasionally, quite stupid technologies have spread widely. Still, it’s unlikely that LLMs will become truly ubiquitous if they are truly useless. And there are lots of people who find them useful! Me included, obviously, given the LLM-generated art at the top of this post (I have fwiw sought to engineer the prompt so that no living artists’ incomes were directly harmed in its making; likely with only partial success).
My broader bet is that LLMs, like other big cultural technologies, will turn out to have (a) lots of socially beneficial uses, (b) costs and problems associated with these uses, and (c) some uses that aren’t plausibly socially beneficial at all. Unless a cultural technology is all bad, or mostly bad (which is possibly true of LLMs; but I’ve not seen the case made well), the challenge is to figure out how to make the most of the benefits, while mitigating the problems.
Daron Acemoglu and Simon Johnson make this point well in their recent book, Power and Progress. Technological progress doesn’t happen in a vacuum. Technologies, their trajectories of development and their consequences are shaped by the political, social and economic contexts in which they’re deployed. Rather than appealing to some vague notion of the awesomeness of progress, or the malignity of technology, we want, collectively, to coordinate on paths of technological development that will spread benefits as broadly as possible, while mitigating or compensating for the costs. That, inevitably, means that we need politics and collective action to shape both the deployment and future development of technologies such as LLMs.
The map is eating the territory
To direct these politics, we need to know more about the underlying political economy. So here is my best stab at one aspect of what has been happening over the last couple of decades. Over this period, we have been seeing the rise of new technologies of summarization - technologies that make it cheap and easy to summarize information (or things that can readily be turned into information). As these technologies get better, the summaries can increasingly substitute for the things they purportedly represent.
This explains why they are of general value - usable maps and summaries of big inchoate bodies of information can be incredibly helpful. It also explains why they are politically divisive. When there is money at stake - and there is - there will be ferocious fights between those who want to make money from the summaries, and those who fear that their livelihoods are being summarized out of existence.
For starters, this helps us understand how the politics of search is changing. Once, and not so long ago either, search primarily relied on a stripped down representation of the relationships between websites. Google’s original secret sauce - the PageRank algorithm - treated the number of incoming links that a web page gets as a signal of its usefulness and relevance to a particular topic (I simplify here: but not too much). When you searched Google on that topic, you got a page with a series of links, each accompanied by a brief snippet of text that would likely help you figure out whether this page would give you what you were looking for, with ads confined to the side of the page, so you wouldn’t confuse them with the information you were looking for.
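To make the link-counting intuition concrete, here is a toy sketch in Python of PageRank-style power iteration over a made-up three-site graph; the site names, damping factor and iteration count are illustrative only, not a description of Google’s production system.

```python
# Toy PageRank-style power iteration over a made-up link graph.
# A page's score rises when other pages - especially well-scored
# ones - link to it. Purely illustrative, not Google's real system.
links = {
    "recipes.example": ["reviews.example"],
    "reviews.example": ["recipes.example", "blog.example"],
    "blog.example": ["recipes.example"],
}

damping = 0.85  # standard damping factor from the original PageRank paper
ranks = {page: 1 / len(links) for page in links}

for _ in range(50):  # iterate until the scores roughly stabilize
    new_ranks = {page: (1 - damping) / len(links) for page in links}
    for page, outgoing in links.items():
        share = ranks[page] / len(outgoing)  # split a page's score among its links
        for target in outgoing:
            new_ranks[target] += damping * share
    ranks = new_ranks

print(sorted(ranks.items(), key=lambda kv: -kv[1]))  # most "authoritative" pages first
```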
The problem of course, was that websites looking for eyeballs could game this, using linkspam, web rings, and other means to fool Google into paying more attention to them than they deserved. This created a Red Queen’s race between Google’s algorithm, which regularly evolved to frustrate the manipulators, and the shady people trying to manipulate it (LLMs are transforming this fight in ways that I’ll talk about in another post).
Over time, Google’s incentives have shifted, as it looks to monetize its effective monopoly. It isn’t just that you have to skip through lots of ads and sponsored links to find the results that are useful. Now, instead of just mapping links to websites, Google is increasingly replacing the websites themselves with summaries which are skewed to guide users towards services that help Google’s bottom line. If I search on my phone for a local restaurant, I’ll see a prominent ‘order online’ button. If I click on it, I won’t find the restaurant’s own delivery service, even if it has one and even if its own delivery is basically free (as I just established by looking at a local restaurant). Instead, it will direct me to Doordash, Seamless and Grubhub (all of which presumably cut Google in on the proceeds if I make the mistake of clicking through). If you look through payments industry websites, there is a lot about this kind of “integration,” though less, unsurprisingly, on how the proceeds get divvied up among the parties.
Google might claim - not altogether wrongly - that people would prefer to have restaurant information rendered cleanly and simply on the search page, so that they don’t have to hunt through idiosyncratically designed websites that are often flung together on the cheap. And you can reasonably see the move towards Doordash and its competitors as a story of cheaper and generalized outsourced infrastructure replacing bespoke DIY. But Google’s power to decide what goes into a search summarization and what gets left out has consequences. It means that the proceeds of standardization are likely to be distributed in some highly unequal ways. And the problem goes even deeper than that. The summarizations generated by search engines are the nexuses through which consumers look for stuff, and sellers try to find buyers. If you can gimmick these summarizations, you can effectively define the market around your own desired model of profit and monopoly.
Search engines began as maps but have now become monsters. They devour the territories that they are supposed to represent, relentlessly guiding the user toward the place where the mapmaker can maximize profits, rather than where the user really wants to go. This is one of the major drivers of what Cory Doctorow calls “enshittification.” And LLMs are in some ways a much more powerful (though as yet less reliable) generator of summarizations than are search engines. They take a huge corpus of human generated cultural information, summarize it as weighted vectors, and spit out summaries and remixes of it.
The reason why many writers and artists are upset with LLMs is not that different in kind from the unhappiness, say, that news organizations had with Google News, or that restaurants have with the Google search/Doordash Storefront chimera. LLMs can be useful. If you, as a punter, are faced by 50,000 words of text that you have to absorb, and an LLM can reduce it down (with reasonable though not perfect reliability) to 500 words, focused on whatever specific aspect of the text you are interested in, it will save you a lot of time and work. But do you really want to buy the 50,000 word book, if you can get the summary on the Internets for free or for cheap? And if you don’t, what happens to books like that in the future?
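To make the use case concrete, here is a minimal sketch of that kind of focused summarization, assuming the OpenAI Python SDK; the model name, prompt wording and word limit are illustrative placeholders rather than a recommendation.

```python
# Minimal sketch: ask an LLM to compress a long text into a short summary
# focused on one aspect. Assumes the OpenAI Python SDK is installed and
# OPENAI_API_KEY is set; model name and prompt wording are illustrative.
from openai import OpenAI

client = OpenAI()

def summarize(text: str, focus: str, max_words: int = 500) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[
            {"role": "system",
             "content": f"Summarize the user's text in at most {max_words} words, "
                        f"focusing on {focus}."},
            {"role": "user", "content": text},
        ],
    )
    return response.choices[0].message.content

# e.g. summarize(long_report, focus="what this implies for copyright law")
```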
Like search engines, the summarizations that LLMs generate threaten to devour the territories they have been trained on (and both OpenAI and Google expect that they will devour traditional search too). They are increasingly substitutable for the texts and pictures that they represent. The output may be biased; it may not be able to represent some things that would come easily to a human writer, artist or photographer. But it will almost certainly be much, much cheaper. And over time, will their owners resist the temptation of tweaking them with reinforcement learning, so that their outputs skew systematically towards providing results that help promote corporate interests? The recent history of search would suggest a fairly emphatic answer. They will not.
All this is leading to the emergence of new economic divides between those who control the means of summarization, and those whose property or livelihood risks being summarized into effective non-existence. Large swathes of our old political economy risk being torn up at the roots, as maps infect the territories they delineate. It isn’t surprising that those who stand to benefit from this are loudly proclaiming the virtues of technological progress. Nor is it surprising that those who stand to be hurt are pressing back.
Perhaps surprisingly, there isn’t very much in the way of traditional organized political contention around this divide right now. The SAG-AFTRA strike is the most important example that I’ve seen, where actors pressed back against studios over control of their AI representations. And I hope we see more of it in the future!
Instead, we have two poorish alternatives. First: a lot of people pressing back individually on social media, saying for example that they will boycott people who use AI content. I don’t think that this is likely to do much, and at the limit it risks becoming the kind of political “hobbyism” that Eitan Hersh complains about, where people confuse ‘complaining really loudly on the Internet’ with ‘taking effective steps to change the world.’ Individual complaints rarely work without collective politics behind them. Don’t moan, organize!
Second, there are fights in the law courts that only imperfectly map onto broader notions of the public interest, because they are waged by private actors fighting over their shares of the take. Most obviously, the New York Times is suing OpenAI over OpenAI’s use of copyrighted material to train its LLMs. These battles are the opposite of people yelling on social media in both a bad sense and a good one. They are unrepresentative of ordinary people’s worries, but they are much more likely to be consequential. As usually happens when there isn’t a really organized public voice, big, powerful self-interested entities are clashing, and those on the sidelines have to decide which side they ought be on.
The coalitions are changing. And they should
At this point, I suspect that most readers will be able to guess where all this is going. The reason why the political coalitions have changed - why left-leaning activists are no longer on the side of Google and the big platforms calling for weaker copyright controls - is that the valences of intellectual property have shifted. To many on the left, the monopoly control of information that Google, OpenAI, Meta and their rivals are looking to achieve through social networks, search and LLMs looks like a bigger threat than the old enemy, Big Content.
And I think they are right, but for complicated reasons! Some provisos: I am not an intellectual property lawyer. I am neither qualified to comment on the likely chances of the various lawsuits that are roiling the AI industry, nor particularly interested in the legal niceties that they will largely turn on. What I am interested in are the political questions that lie beneath. What should we want intellectual property systems (whether they involve individual ownership, collective ownership, copyleft or what have you) to do?
My entirely unoriginal answer (lots of people on the left and the right have said this already, in different ways, including the framers of the U.S. Constitution) is that we want systems that will encourage the creation of useful knowledge, of engaging, challenging or otherwise valuable art, and of other forms of cultural production that make our lives better and more interesting. Of course, people will sharply disagree with each other about what is useful knowledge, valuable art and so on, and they will disagree too on how best to encourage it.
Combining this very broad notion with the LLMs-as-cultural-technologies perspective has at least one important plausible implication. If these technologies are valuable, so too is the human generated knowledge that they summarize. In a world that is increasingly complex, we are likely to need all the tools for managing complexity that we can get. But tools like LLMs are likely to be valuable precisely to the extent that they provide an interface that condenses, remixes, and provides access to high quality human knowledge. They may condense and make visible connections across this body of knowledge that would otherwise be hard to see. But they don’t and can’t provide a miraculous solution to the garbage-in, garbage-out problem. If they are trained on crap - whether that be lousy human generated information, or lousy synthetic data - they will produce crap.
This suggests that LLMs should not be viewed as a substitute for high quality human generated knowledge. They should instead be viewed as an obligate complement to such knowledge - a means of making it more useful, which doesn’t have much independent worth without it. And that is important for our collective choices over intellectual property systems. If you want LLMs to have long term value, you need to have an accompanying social system in which humans keep on producing the knowledge, the art and the information that makes them valuable. Intellectual property systems without incentives for the production of valuable human knowledge will render LLMs increasingly worthless over time.
This suggests that one of the key arguments of OpenAI, Andreessen Horowitz and the like has it exactly backwards. For example, in its comments to the U.S. Copyright Office, Andreessen Horowitz claims that
The bottom line is this: imposing the cost of actual or potential copyright liability on the creators of AI models will either kill or significantly hamper their development.
arguing that pretty well any scheme for compensating creators would not only be unworkable, but would also allow the Chinese Communist Party to displace good old US technological dominance. But if LLMs rely on reasonably high quality knowledge to keep on working, this is the exact opposite of true. The actual “bottom line” is that declining to acknowledge the cost of producing such knowledge will either kill or significantly hamper the technology’s development.
Ensuring the continued production of high quality knowledge is hard. It is especially hard when big monopolies or wannabe monopolies have every incentive to batten on the knowledge production systems that they rely on, cannibalizing expensive and laborious systems for producing knowledge by flooding the market with cheaper summarizations. Social movements, activists and policy makers should push back hard against arguments that are, in the long run, both specious and self-undermining. Any division of rights and proceeds ought surely recognize that LLMs are valuable engines of summarization - but only in conjunction with high quality knowledge that can valuably be summarized. In other words, enough of the proceeds need to go to the actual knowledge producers for the system to be self-sustaining.
Pulling together the last few thousand words, then: LLMs are cultural technologies of summarization, whose value depends on people continuing to actually produce culture that can usefully be summarized. Absent intervention, LLMs will likely develop, as other technologies such as Internet search have, in ways that benefit their makers at the expense of others. The summarizations that they produce risk supplanting the culture that they feed on.
This would be a terrible outcome. Borges wrote a famous and very short story about what happens when the map comes fully to displace the territory.
In that Empire, the Art of Cartography attained such Perfection that the map of a single Province occupied the entirety of a City, and the map of the Empire, the entirety of a Province. In time, those Unconscionable Maps no longer satisfied, and the Cartographers Guilds struck a Map of the Empire whose size was that of the Empire, and which coincided point for point with it. The following Generations, who were not so fond of the Study of Cartography as their Forebears had been, saw that that vast Map was Useless, and not without some Pitilessness was it, that they delivered it up to the Inclemencies of Sun and Winters. In the Deserts of the West, still today, there are Tattered Ruins of that Map, inhabited by Animals and Beggars; in all the Land there is no other Relic of the Disciplines of Geography.
The maps that are eating our world might end up being useless for completely different reasons. But the resulting Deserts of the West would not be any more hospitable to intellectual life.
Two provisos to all of this. First, obviously, these arguments are all at a pretty broad level of generalization. I don’t pretend to offer any specific guidance as to which system or systems of intellectual property we ought turn to as alternatives. And there are many aspects of LLMs and their consequences that can’t be reduced down to intellectual property. Still, clarifying broad principles can clear away a lot of the brush that otherwise obscures our vision.
Second, I could be wrong! If you agree with the premises that I offer as to what LLMs are, and how they work, then (I think) my claims more or less follow. But the premises are incomplete - there are certainly other reasonable ways you might think about LLMs. And different technical options could open up, or there may be technical aspects I don’t get properly, or am mistaken about (I read as much as I can - but I am not a computer scientist). If, for example, LLMs can somehow generate their own high quality knowledge, then my argument fails. They may turn out to be much less generally useful than I suggest. Alternatively, they may be fatally flawed in ways I don’t acknowledge.
Even if so, I think it is valuable to do what I’m at least trying to do. Many of the people that I read start from the passions - they love LLMs, or they hate them. But how these technologies develop and get regulated has a lot more to do with the constellations of interests that they disturb and that they create. If there is one big lesson you should take from this post, it is that the political economy of new technologies, like the political economy of everything, is mostly about who gets what. Understanding that, mapping its consequences, and thinking about both the possibilities that it opens up, and those that it makes harder, is the first step towards making things better.
* I use the term AI here, since most other people do, but under intellectual duress.
A rich and timely essay. I especially liked the Borges bit... I cannot think of a fiction writer more worth re-reading in light of generative AI.
You don't specifically call attention to schooling as an example of the Lovecraftian monstrosities that LLMs may replace, but it strikes me that LLMs were introduced to the general public through a moral panic about ChatGPT eating homework that has now subsided into a low boil of worries about the "disruption" of education. Now Khanmigo is about as likely to replace teachers as we are to see AGI, but the question of how exactly generative AI will change (or not) the bureaucratic structures and internal practices of education seems relevant.
Collective action in this space feels urgent, especially in light of your insight about the importance of the continued production [and distribution] of high quality knowledge.
Thanks for this thoughtful piece. Although there are other uses for LLMs than summarisation, and other AIs such as image generators and multimodal AIs with other uses, the basic logic - that the societal usefulness of generative AI depends on knowledge and culture (or, more correctly, on embodiments of knowledge and culture in digital form) - seems sound.
You are right therefore that societal failure to acknowledge the costs of so-called knowledge and cultural production will ultimately threaten the technology (or, more correctly, our society).
How society funds this production, and the role AI corporates should play in it (if any), is the real question.
Confronting this question isn’t strictly in contradiction with the suggestion that imposing costs on AI creators for using digitally embodied knowledge and culture via copyright liability will threaten or kill AI development. There are as many problems with the idea that (traditional) copyright is a good fit for this job as there are with its use to fund public interest journalism. Big media corporations, as you suggest, often see knowledge and culture as a private commodity to be bought and sold, and they have the muscle to cut favourable licensing deals with the biggest search, social or (now) AI platforms.
Yet independent creators lack this power, placing them at a competitive disadvantage to big media.
Copyright licensing requirements also place independent or publicly-funded AI research at a disadvantage.
Together, copyright favours the dominant corporations and incumbents of both the media and AI industries, entrenching their dominant positions further.
And yet consider also, public interest news - and I’d suggest knowledge - needs to be freely accessible to benefit the public, and funded (one way or another) by the people it’s meant to serve.
Western publics generally contribute huge amounts to charities, and their governments contribute significant amounts to academic research and the arts - and sometimes to news provision - through public grants, as these activities and cultural creations are deemed valuable public goods.
Far better, then, that increased levies on the profits of AI corporations (and tech companies more generally) be used to directly stimulate public-interest knowledge and cultural production, according to publicly directed needs; and to allow this produce to be freely used for AI training.