Discussion about this post

Kaleberg:

I think you'd do better to listen to the mathematicians. They admit they work with signs and symbols, but they know that those signs and symbols have meaning in some real world. The classic example is Euclid's geometry. If you ditch the Fifth Postulate, there's no way to tell if the theorems refer to the geometry of the plane, the sphere or the hyperboloid. To assign a more specific meaning, a mathematician can impose an appropriate postulate.

This is why mathematicians say that when an android proves a theorem, nothing happens. Androids work in the domain of signs and symbols. Mathematicians work with mathematical objects. This doesn't mean that automatic theorem proving is worthless, just that, as the sage said, man is the measure of all things.

To borrow yet another page from the mathematicians and move this discussion into a new domain: LLMs deal with signs and symbols. Authors deal with real worlds and real people. When an LLM writes a novel, nothing happens. It is left to the reader to assign meaning. When a human writes a novel, there is a real person trying to convey meaning in some real world. A reader may take a message from an LLM-generated novel, but the sender had nothing to say.

A lot of this flows from work on the foundations of mathematics early in the 20th century. It turns out that one cannot pin down meaning with signs and symbols alone. Since mathematicians were involved, the result is stated dryly, but it came as a surprise, and I wonder how much of it leaked into literary theory. (That, and so many authors trying to excuse themselves for their earlier adulation of Stalin or Hitler.)

P.S. How dry is mathematical language? My favorite quote: "Not for us rose-fingers, the riches of the Homeric language. Mathematical formulae are the children of poverty." And, even then, ambiguity is at its heart.

Cosma:

_Pace_ Weatherby, an LLM based on transformers _is_ an example of a generative grammar in Chomsky's sense. It's just that, since it's a (higher-order) Markov chain, it sits at the lowest level of the Chomsky hierarchy, that of "regular languages". (*) We knew --- Chomsky certainly knew! --- that regular languages can approximate languages higher up the hierarchy, but it's still a bit mind-blowing to see it demonstrated, and to see it demonstrated with only tractable amounts of training data.

That said, Chomsky's whole approach to generative grammar was a bet on it being scientifically productive to study syntax in isolation from meaning! This is part of the argument between him and Lakoff (who wanted to insist on semantics)! (There was a reason "colorless green ideas sleep furiously" was so deliberately nonsensical.) This isn't the same as the structuralist approach, but if you want that kind of alienating detachment from ordinary concerns, and from thinking of language as a cozy thing affirming ordinary humanity, Uncle Noam can provide it just as well as Uncle Roman. Gellner has an essay on Chomsky from the late '60s or early '70s which brings this out very well --- I'll see if I can dig it up.

*: Strictly speaking, finite-order Markov chains form only a subset of regular languages, with even less expressive power. To use a go-to example, the "even process", in which even-length blocks of 1s are separated by blocks of 0s of any length, is regular, because it can be generated using just two (hidden) states, but it is not finite-order Markov: whether the next symbol may be a 0 depends on the parity of the whole current run of 1s, and that run can be longer than any fixed history. A transformer cannot generate this language perfectly, though obviously you can get better and better approximations by using longer and longer context windows.
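
For concreteness, here is a minimal sketch of that two-state generator; the Python, the function name `even_process`, and the 1/2 emission probability are illustrative choices, not anything specified in the comment:

```python
import random

def even_process(n_steps, p_zero=0.5, seed=None):
    """Sample from the 'even process': runs of 1s always have even length,
    separated by runs of 0s of arbitrary length.

    Two hidden states suffice: in state A (between runs) emit 0 with
    probability p_zero, or emit 1 and move to state B; in state B (an odd
    number of 1s emitted so far in the current run) emit 1 and return to A.
    """
    rng = random.Random(seed)
    out, state = [], "A"
    while len(out) < n_steps or state == "B":   # close any half-open run of 1s
        if state == "A":
            if rng.random() < p_zero:
                out.append(0)                    # stay between runs
            else:
                out.append(1)                    # start a run: parity now odd
                state = "B"
        else:
            out.append(1)                        # even up the run
            state = "A"
    return out

# Why no finite-order Markov chain (or fixed-window predictor) gets this
# exactly right: whether a 0 is allowed next depends on the parity of the
# *entire* current run of 1s, and that run can exceed any fixed window.
print("".join(map(str, even_process(40, seed=1))))
```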

