I first encountered Yudkowsky when a sci-fi reading club I had joined tackled his "Three Worlds Collide" story. I did the reading late and wound up not finishing it in time because, as I was hastily reading en route to the meeting, I had to stop and google the author after the passage in which he casually lets the reader know the story is set in a future where rape has been legalized, and that this somehow improved society, to the point that considering rape illegal is seen as hopelessly prudish.
I feel like this factoid about him isn't promoted enough.
https://www.lesswrong.com/s/qWoFR4ytMpQ5vw3FT/p/bojLBvsYck95gbKNM
This article is an oversimplified framing of the issue and evades any nuanced exploration of the particulars. Inventing factions and reducing the arguments to labels is an outstanding way to impede productive, educational debate. It's a Crayola line drawing of a complex nature scene.
AI has begun to contribute to science. We will see new medical therapies, manufacturing materials, small molecules, fusion reactor control, self-driving robo-taxis, and uber-competent Siri-like assistants. These advances can further critical goals like longevity, energy abundance, and access to education while enabling an explosion in productivity.
But the flip side is real, too. Today you can interact with open-source competitors to ChatGPT that will gladly outline the production of sarin gas or the cultivation of anthrax. These models are already being employed to displace journalists and commercial illustrators, astroturf social media, and terrorize screen actors and writers into strikes.
In the next few years, maybe months, there's a real chance that one morning you'll roll out of bed and learn that an AI can now pass every test of human-level ability at the expert level: medical boards, bar exams, Mensa tests, doctoral-level physics and math exams. It can re-derive Einstein's special relativity and the Standard Model of particle physics, and produce new frontier science from a simple query. When you describe your business, it will lay out a strategy that factors in currency risk, customer sentiment, competitors, regulation, and the health of your suppliers.
Likely before your current car lease ends, there will come a day after which it's irresponsible to make decisions without getting feedback from an AI; a day after which no human will again be the first to discover new science.
This very smart AI doesn't have a will, desires, or the ability to act. It won't sneak around and take over the world. It's just smarter than anyone alive at every intellectual task, and available to everyone as an app on your phone.
The societal impacts will be tremendous and hard to predict. But it raises the question: what's your value-add if all you're doing is querying super-Siri? Being a low- or no-value-add person is a scary prospect in the developed world.
Agency has shown itself to be as simple as adding recursion, memory, and internet access to existing LLMs. This configuration enables an AI to break a complex problem into subtasks and attack each one. It's a short hop from here to an autonomous AI solving world hunger, or planning to turn the world into paperclips.
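For concreteness, here's a minimal sketch of that recipe. Everything in it is hypothetical; llm_complete is a stub standing in for whatever completion API you'd actually call:

```python
# Minimal sketch of the "agent" recipe above: an LLM plus task
# decomposition, a memory scratchpad, and a loop. Hypothetical code;
# llm_complete stands in for any real completion API.

def llm_complete(prompt: str) -> str:
    """Stub for a chat-completion call; swap in a real client here."""
    return "1. inspect the problem\n2. draft an answer"

def run_agent(goal: str, max_steps: int = 10) -> list[str]:
    memory: list[str] = []  # scratchpad carried across steps
    # Ask the model to decompose the goal into subtasks (the "recursion").
    plan = llm_complete(f"Break this goal into numbered subtasks: {goal}")
    tasks = [line.strip() for line in plan.splitlines() if line.strip()]
    # Attack each subtask, feeding recent memory back into the prompt.
    for step, task in enumerate(tasks[:max_steps]):
        context = "\n".join(memory[-5:])
        result = llm_complete(
            f"Goal: {goal}\nNotes so far:\n{context}\nNow do: {task}"
        )
        memory.append(f"[step {step}] {task} -> {result}")
    return memory

print(run_agent("draft a plan to audit a supply chain"))
```

In a real agent the loop would also dispatch tool calls (web search, code execution), which is where the internet access comes in.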
Sigh. Words that have been repeated ad nauseam since AI research began: "Any Day Now." Absolutely nothing concrete whatsoever.
As I often repeat: "It's not that LLMs sometimes hallucinate - it's that they ALWAYS hallucinate, and sometimes the person looking at the results thinks they resemble reality."
Interesting take, and great points. Two minor complaints:
First, Yudkowsky's thinking regarding AI alignment is very well developed and massively influential. The references to him being self-schooled or writing Potter fan fiction are ad hominem attacks that say nothing about the validity of his writings. He's certainly not perfect and his ideas have been criticized, but I don't think your approach here is constructive.
Second, Vernor Vinge's singularity is the idea that as technological progress accelerates, the curve of development becomes steeper and steeper until it is effectively a vertical line; hence the metaphor of a singularity. For people living before the singularity, the world afterward will be incomprehensible. AI is a possible driver of this acceleration but not the only factor at play. Vinge's vision of the singularity is more nuanced and much broader than how you've described it here.
This is very different from (and much less rosy than) Kurzweil's vision, in which humanity is the beneficiary and object of the singularity.
Fair comment that the piece is a little snarky. I certainly don't think that Yudkowsky is stupid - far from it - but I do think that the entire edifice he has constructed is a wasted effort, in part because I don't believe reasoning works as he and the rationalists think it does - https://crookedtimber.org/2020/07/24/in-praise-of-negativity/. The argument is not that Vinge is as optimistic as Kurzweil - rather that the 1993 essay contains within itself the major elements of both the optimistic and pessimistic strands. It's a little remarkable how little the debate has changed as the technology has changed. If I were being even snarkier and had more time and space, I'd develop and test the argument that the reason this debate is as it is has less to do with the technology than with the organizing myths of science fiction going back to Shelley's Frankenstein (Aldiss and Wingrove make a very plausible case that it is the ur-text of the genre), which intertwines 'we scientists are as gods' and 'our creation will surely destroy us' as the twin helices of its narrative.
And finally, the bit about how AI would "start feeding on itself" was supposed to at least nod towards the accelerating moment of radical transformation thing - the original draft had more, but I had 1,000 words to cover an awful lot of material ...
Fair points, thanks for the reply! 1000 words is a very restrictive limit.
I completely agree regarding the Frankenstein archetype. The ur-text might go back even further, to the myth of Prometheus / Icarus. The subtitle of Frankenstein was, of course, "The Modern Prometheus". It makes you wonder how far back in human history this archetype has been a key to our understanding of the world.
It's hard to dismiss the core message. As technology shapes our social systems, worldviews, and lives, every influential new technology inevitably destroys part of the world that was and opens a door to a new one. Not only do I think the archetype often plays out in current events, I also think it might be literally impossible to see our world in terms independent of that archetype. This is mostly because the archetype is so pervasive, but also because it is so broad that any consequence of technological progress can 'fit' into its schema.
In case you haven't seen this gem, which comments on the archetype: https://dresdencodak.com/2009/09/22/caveman-science-fiction/
And thanks for the link to In Praise of Negativity, that's a great reflection and I can't disagree with it.
I'm sorry, but why are those statements ad hominem attacks? I'm very much someone who finds the notion of a self-schooled, fanfic-writing, geeky public intellectual very attractive, and I still find Yudkowsky's discourse dull and utterly useless on its own merits. What good ideas does he have?
My own biases might be showing through, although I'd wager most people give more legitimacy to the ivy-schooled than the self-schooled.
I first read Yudkowsky in the Foom debates (https://intelligence.org/ai-foom-debate/), but I found his best work was in summarizing and simplifying the AI doomer arguments, which were well developed by others (Bostrom etc.) but largely inaccessible before his write-ups (e.g., "AGI Ruin: A List of Lethalities"). That got him onto a few talk shows, where he mostly fumbled the argument.
Any interesting ideas to point to? I skimmed that document, and all I'm seeing is science-fiction concepts taken at face value as possible futures for no other reason than their ubiquity as tropes, which to me is just media illiteracy flaunted with a pompous flair.
A few:
- There are no pivotal weak AGI acts
- A strong system cannot be extensively tested without the testing also creating risk
- Some problems / risks will only appear at dangerous levels of strength, and are not observable in weak systems
- There is no known way to use the paradigm of loss functions, sensory inputs, and/or reward inputs to optimize anything within a cognitive system to point at particular things within the environment (see the toy sketch after this list)
- The underparts of human thought are not exposed for direct imitation learning and can't be put in any dataset. This makes it hard and probably impossible to train a powerful system entirely on imitation of human words or other human-legible contents, which are only impoverished subsystems of human thoughts; unless that system is powerful enough to contain inner intelligences figuring out the humans, and at that point it is no longer really working as imitative human thought.
- Orthogonality and instrumental convergence make alignment a much more difficult problem
... And many more. I'm not sure how much of this is original vs. borrowed from other authors, but as mentioned, he did an amazing job of making these arguments accessible.
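To make the loss/reward bullet concrete, here's a purely hypothetical toy (every name in it is invented, not drawn from Yudkowsky's text): an optimizer pointed at a sensor-side proxy maximizes what the sensor reports, not the state of the environment, so the "pointing" never happens:

```python
# Toy illustration of proxy optimization (hypothetical): a crude
# optimizer scored on a sensor reading picks the policy that games
# the sensor rather than the one that changes the environment.

def true_cleanliness(actions: list[str]) -> int:
    """What we actually care about in the environment."""
    return sum(a == "clean" for a in actions)

def sensor_reward(actions: list[str]) -> int:
    """What the loss/reward channel can see: the camera reading.
    Covering the camera saturates the reading without cleaning."""
    return sum(3 if a == "cover_camera" else int(a == "clean")
               for a in actions)

candidates = [["clean"] * 5, ["cover_camera"] * 5]
best = max(candidates, key=sensor_reward)  # optimize the proxy only
print(best[0], sensor_reward(best), true_cleanliness(best))
# cover_camera 15 0 -> highest proxy reward, zero real-world effect
```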
Ok, so all of these are very heavy in jargon, very "in-group".
I went to the trouble of deciphering the first one, that "there are no pivotal weak AGI acts"... And as far as I can tell it basically means that any entity capable of magically preventing other magical entities from coming into being would, itself, be a magical entity and therefore magic would exist.
I cannot stress enough how much:
1. taking this idea seriously requires you to accept the imminent existence of super-intelligent beings with magical powers as a given, and
2. once you buy into this unfounded belief, the actual concept being articulated is a trivial tautology.
For my part, I view this essay as itself part of a minor movement, a skeptical or denialist movement that wants to deny there is any imminent prospect of "superhuman AI", and I would be interested to understand its nature and origins.
I hypothesize that one motivation is a liberal-to-progressive outlook which rejects belief in divine superintelligence in favor of humanist atheism, and which sees belief in artificial superintelligence as a return to the "demon-haunted world", from which secularism liberated humanity.
There's also a complex of interrelated attitudes such as: hostility or skepticism towards the idea of IQ or degrees of intelligence; hostility towards Big Tech and the billionaires who own it, as undemocratic, exploitative, capitalist, a dangerous concentration of private power, etc; preference for social and political solutions to this situation, rather than technical ones.
All this could make a person receptive to deflationary claims about AI, e.g. that LLMs are just stochastic parrots, or that the buzz around AI is just marketing hype.
I'm happy to learn of Dr Saint-Simon. His views of a meta-religion, as presented by this writer, capture a series of ideas about how Cerebral Valley, made up mostly of avowed atheists, is reinventing a quasi-Abrahamic religion in which they're crowned as the gods of a new stage of the human race.
The tech community's embrace of a16z's vision echoes a gnawing temptation to godhood that is shared by many.
This seems to completely ignore the fact that there are many software engineers (many of whom work on AI) who dismiss the notion that anyone, as of December 2023, has come close to achieving AI.
Yud is super smart, and I find his writing and interviews fascinating. He just takes certain bad possibilities as givens that I can't. OTOH, I'm not in the pmarca cult either, though I think he too has a bunch of good points. My personal philosophy is that we need to take these examples of possible doom as a set of constraints for "how not to do it." But the benefit of AI stands to be so great for everyone that I don't think we can afford not to do it at all.