11 Comments
Henry Farrell

So this is in part a spin-off from the Dan Davies Extended Universe, as the namecheck in passing suggests. I don't talk about accountability sinks, since I'm trying to get away from the LLMs-as-agents discussion, but there are, I think, some interesting further questions about dealing with variety. I mention in passing the problem that LLMs tend to select against variety, but Dan has an interesting podcast interview with Patrick McKenzie where he suggests that they could be handy in making organizations better able to deal with complex environments. I suspect that neither of us has thought this through in huge depth, but since I'll be meeting up with him this afternoon, I may ask him ...

I'm guessing that the hallucination problem is less of a problem for ideology, so long as you can be sure that you are not actually peddling fake quotes (as mentioned, NotebookLM has workarounds for this, but not knowing the Chinese system I can't say for sure). Explicating Chairman Xi's thought is a ritual performance, where the maximally unsurprising outcome of some extrapolation of it is often going to be a useful thing to know.

Gerben Wierda

I think 'summarisation' is not the best concept here (though I can understand seeing these systems as a lossy compression of language or culture). Approximation is the better one. The token statistics (based on output from human understanding) can be used to approximate what the result of human understanding could be. These systems do not really summarise (see https://ea.rna.nl/2024/05/27/when-chatgpt-summarises-it-actually-does-nothing-of-the-kind/), even if you can use them to do so. They do a mix of 'ignoring what is to be summarised and generating mostly from parameters' and 'shortening', which isn't summarising (see link). Approximation also covers the behaviour of LLMs better outside the summarising use case.

GenAI also does not hallucinate; that is labelling it from the human perspective. The systems approximate, and some approximations are wrong (but still valid *approximations*). The errors aren't errors; they are fundamental features of the systems.

Thirdly, they don't need to be good to be disruptive. Innovation can be either "doing something heretofore impossible" (AlphaFold) or "doing something more cheaply" (whether better, good, or 'good enough'). Most LLM use is 'doing something cheaply' (both in terms of money and of result). Klarna replacing graphic artists with GenAI is an example of this (see https://ea.rna.nl/2024/07/27/generative-ai-doesnt-copy-art-it-clones-the-artisans-cheaply/).

Lastly, *the* issue everyone is still ignoring is the fact that automation offers us a speed/efficiency gain, but the price paid is less agility. All organisations these days suffer from becoming less agile because they have been welded to 'large landscapes of brittle machine logic'. Such landscapes are ever harder to change because logic is brittle, and as a result they have 'inertia'. This is an overall automation/IT issue. We may expect landscapes (or even language itself) that become welded to LLMs to slow down in variation and change. Fun observation: human intelligence is also mostly 'mental automation', and this delivers (from an evolutionary perspective) necessary speed and efficiency, but shows the same price paid in agility. Our convictions, beliefs and assumptions are our mental automation; they provide speed and efficiency, but they do not change easily. (See https://www.youtube.com/watch?v=3riSN5TCuoE). We're not moving towards a singularity point but a complexity crunch.

The providers tune their systems such that there is not so much randomness ('temperature') that even grammar fails, because we humans use 'good language' as a quick proxy for 'intelligence'. So you cannot push these systems beyond a certain temperature: they become more creative, but less convincing. Grammar, it turns out, is easier to approximate with tokens than meaning. See https://youtu.be/9Q3R8G_W0Wc
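
To make the temperature knob concrete, here is a minimal sketch of how temperature reshapes a next-token distribution before sampling. The tiny vocabulary and logits are invented purely for illustration; this is not any provider's actual serving code.

```python
# Minimal sketch: temperature-scaled softmax sampling over a toy vocabulary.
# The vocabulary and logits below are invented for illustration only.
import math
import random

def sample_next_token(logits, temperature=1.0):
    """Apply temperature to logits, softmax them, and sample one token index."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)                                # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    idx = random.choices(range(len(logits)), weights=probs, k=1)[0]
    return idx, probs

vocab = ["the", "cat", "sat", "flux", "qubit"]     # toy vocabulary
logits = [4.0, 2.5, 2.0, 0.5, 0.2]                 # toy model scores

for t in (0.2, 1.0, 2.0):
    idx, probs = sample_next_token(logits, temperature=t)
    dist = ", ".join(f"{w}={p:.2f}" for w, p in zip(vocab, probs))
    print(f"T={t}: sampled '{vocab[idx]}'  ({dist})")

# Low temperature concentrates probability on the likeliest (most 'grammatical')
# tokens; high temperature flattens the distribution, so output gets more varied
# but less coherent.
```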

Alan

Very good points, especially on the efficiency part. I am currently in a battle about this issue in the company where I work: they want software to improve efficiency, but don't realize that 80% of the gains they will get from software will come from the actual definition, standardization and enforcement of processes in all countries. And of course this means way less flexibility all across the board as well. I want to add that I am responsible for all things software and data in the company, so I am not a "Luddite" (in the common sense; I actually think they had a few good points, but I digress...), and I am leaving aside the whole "efficiency at all costs" thing, which is just bullsh*t: do you want maximum efficiency that is unbeatable by everyone? Just don't do the thing, that's maximum efficiency by definition; 0/0 will always be unbeatable...

Gerben Wierda

0/0 will give you a new dimension (only partly kidding): https://gctwnl.wordpress.com/2016/12/23/how-much-is-zero-divided-by-zero/

Sorry, could not resist. But it was just too much fun to write that 8 years ago.

Jack Shanahan

As someone who led military organizations varying in size from 100 to 25,000+ people, this is the best description I've seen of how LLMs could/will affect future organizational design and daily functions.

Your four categories are not so handwavy! I'm hard-pressed to think of anything you might have missed. The vital 'translation' function is rarely, if ever, discussed in the way you describe it here. In Marisa Tomei's words, it's "dead-on balls accurate."

Henry Farrell

Thank you for the kind words - bits and pieces I've heard from people with Pentagon experience are actually part of the intellectual DNA of this argument! Coincidentally, Ethan Mollick also briefly mentioned the importance of translation in a post today - https://www.oneusefulthing.org/p/15-times-to-use-ai-and-5-not-to?utm_source=post-email-title&publication_id=1180644&post_id=152600543&utm_campaign=email-post-title&isFreemail=true&r=byas&triedRedirect=true&utm_medium=email

Alex Tolley

If LLMs are more efficient than humans, will that reduce the number of employees?

Ed Zitron suggests that AI companies are not able to convince enough corporate customers to pay for their offerings. If they are so useful, why would that be? (because they do not reduce overhead?)

Having just read Dan Davies's book "The Unaccountability Machine", would LLMs create more "accountability sinks"? Would their regurgitation of existing knowledge reduce the organization's ability to handle the increasing variety in their environment, or conversely, provide more time to respond to it creatively?

Any idea how the mentioned Chinese system avoided "hallucinations" in getting Xi's thoughts correctly interpreted and referenced? "Hallucinations" remain a problem for LLMs, so far with no prospect of eliminating them with current technology, although the Allen Institute's curated Ai2 system seems to work well in this regard (smaller is better?).

John Quiggin

We had an interesting discussion on AI and bullshit jobs on Crooked Timber in the early days of ChatGPT: https://crookedtimber.org/2022/10/08/ai-is-coming-for-bullsht-jobs/

Claire Hartnell

There is, & always will be, a tension between Taylorist management & Deming management that plays out today in agile vs command & control. In the Taylorist world, the system is mechanical & can be controlled top down through routinised work, sanctions, rewards etc. In Deming's paradigm, businesses are complex systems & the role of management is to co-ordinate human initiative to offer adaptive responses to points of failure.

It is true that modern corporations consume & splurge vast amounts of data, & I have no doubt that management consultants are feeding 30 years of business decks through LLMs to produce industry / business summaries. But it's still more information that must be consumed & understood. I started my career in the days of trips to Companies House to photocopy company accounts & plodding visits to business libraries to read long-winded market reports. These have all been replaced with automated services that will slice & dice data into uniform categories. But that didn't mean consultants / businesses needed fewer people! It was even harder to find an edge in a digitised world full of commoditised data than in a paper + tube trip world. There was no competitive advantage from this explosion of data because everyone had it. So as always, the edge has to come from humans finding signal in bland, generic noise.

LLMs are no different. The utter blandness of their content may replace some ritualised duties (this is really a good point), but these duties are only ritualised in big, ordered, top-down monoliths. And these, as Deepseek has just shown, are very bad at finding signal in informational noise. So, sure - LLMs may be able to write lots of commodified stuff - training manuals, codes of practice, mission statements. But not one of these things is business enhancing. As Deming found, the best way to enhance a business is to get 10 people, with local knowledge, to ponder a problem under time constraints. The role of management is to co-ordinate that & then remove all barriers to diffusion of the viable solution. LLMs will never outcompete this role (unless there's a step change in their abilities), & my own guess is that after some low hanging fruit is picked, they'll end up being an alt-tab for fiddly cognitive assistance (memory, summary, feedback) without changing anything about the core competitive advantage of businesses: which is to effectively deploy humans to solve problems / seize opportunities & then diffuse these quickly.

Alan

The main issue not taken into account is that LLMs are stochastic in nature. This means that randomly and erratically changing answers and "bullsh*tting" (in the Frankfurtian sense: https://press.princeton.edu/books/hardcover/9780691122946/on-bullshit) is a FEATURE, not a bug. This is the key point that not many people understand: it's not possible to completely remove it from the models, because that's how they work. What does this mean? That we'll be able to use LLMs if and only if we are already sure about what we want as an answer. That's the point of the "they are useless" camp, and being afraid that most people don't understand this fundamental matter and will misuse LLMs is not crazy, but almost a certainty.

Alex Tolley

One can handle the LLM's stochastic nature by limiting it to the interface rather than letting it create the content. The content is extracted from curated databases, and the LLM then does its best to summarize it and supply the relevant reference[s]. It is the difference between a university freshman talking in class about something they barely know, and having the freshman go to the library, get the relevant information, and tell the class what they found, reading from notes. The latter is going to be far more accurate.
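
A minimal sketch of that "LLM as interface over a curated store" pattern: retrieve passages first, then ask the model to answer only from what was retrieved, citing the sources. Everything here (the toy corpus, the naive retrieve function, and the call_llm placeholder) is invented for illustration; it is not the Allen Institute's or anyone else's actual pipeline.

```python
# Sketch: answers come from a curated corpus; the model only summarises
# the retrieved passages and cites them. All names below are hypothetical.

CURATED_CORPUS = {
    "doc-001": "Widget throughput rose 12% after the 2023 process change.",
    "doc-002": "The 2023 process change standardised intake forms across regions.",
}

def retrieve(query, corpus, k=2):
    """Naive keyword-overlap retrieval, standing in for a real search index."""
    scored = [(sum(w in text.lower() for w in query.lower().split()), doc_id)
              for doc_id, text in corpus.items()]
    return [doc_id for score, doc_id in sorted(scored, reverse=True)[:k] if score > 0]

def call_llm(prompt):
    # Placeholder: a real implementation would call a hosted or local model here.
    return "(model output constrained to the retrieved passages)"

def answer(query, corpus):
    doc_ids = retrieve(query, corpus)
    if not doc_ids:
        return "No supporting documents found."          # refuse rather than guess
    context = "\n".join(f"[{d}] {corpus[d]}" for d in doc_ids)
    prompt = (f"Answer using ONLY the passages below, citing their ids.\n"
              f"{context}\n\nQuestion: {query}")
    return call_llm(prompt)

print(answer("What did the 2023 process change do?", CURATED_CORPUS))
```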

What disappoints me is that Google's Gemini, which inserts its AI-generated text at the top of searches [unless blocked], has returned the correct numbers to a question but the wrong units. Google could surely do better with its resources. So far I have found the Allen Institute's approach, using a small curated corpus of journal papers, a better way to answer questions within its domain.
