ClaytonB
The Disconnect
Here's a representative sample of a typical AI hype-bro's reaction to the Apple paper:
Let me explain The Disconnect that is happening here, as simply as I can:
AI hype-bro: AI is currently at or beyond human intelligence!!
Me: Hmm, OK, so if I ask the AI to do something that a human can do, it can do it, right?
AI hype-bro: Dude, of course -- ask the AI anything you want, I guarantee it will blow your mind with its PhD-level reasoning skills!
Me: OK, cool. AI, here is the grade-school multiplication technique <here>, and I want you to use that technique to multiply these two 30-digit numbers. Take your time. You can reason or show your steps if you want.
AI: <pukes out completely wrong answer>
Me: OK, Hype-bro, it totally didn't get this correct at all. Any human who knows how to multiply would have answered this correctly given enough time and a quiet place to concentrate.
AI hype-bro: No, no, you're doing it all wrong! You can't just tell the AI to multiply two really long numbers and expect it to work! It's just a trained neural net, it's a probabilistic machine, so if it repeats one task too many times, sooner or later, it's going to fail! Duh! How can you be so stupid!?
Me: Yeah, well I'm not the one claiming that Transformer-based AI has 155 IQ and is at parity with human intelligence on essentially all mental tasks.
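For the record, the procedure being asked for in that exchange is nothing exotic. Here's a minimal sketch of grade-school long multiplication in Python (the function name and digit-list bookkeeping are just my own illustration); it never "loses the thread", no matter how many digits you feed it:

```python
def long_multiply(a: str, b: str) -> str:
    """Grade-school long multiplication on decimal strings: multiply by each
    digit, shift by place value, carry, and sum -- exactly as taught in school."""
    result = [0] * (len(a) + len(b))            # room for every digit plus carries
    for i, da in enumerate(reversed(a)):         # digits of a, least significant first
        for j, db in enumerate(reversed(b)):     # digits of b, least significant first
            result[i + j] += int(da) * int(db)
            result[i + j + 1] += result[i + j] // 10   # propagate the carry upward
            result[i + j] %= 10
    digits = "".join(map(str, reversed(result))).lstrip("0")
    return digits or "0"

# Two 30-digit numbers, picked arbitrarily for illustration:
x = "123456789012345678901234567890"
y = "987654321098765432109876543210"
assert long_multiply(x, y) == str(int(x) * int(y))   # deterministic, always correct
```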
The Disconnect is that we are asked to simultaneously believe that AI is at parity with human intelligence, or even beyond it, and that you have to speak to the AI in exactly "such and such" a manner or it will "obviously" blow up, and that's all your fault for not asking correctly. The whole point of a tool that is supposed to be indistinguishable from a human is that if I ask it to do a task that I could reasonably expect any human to figure out from my description and query (or to respond with a meaningful request for clarification), it can do it.
The idea that you can get human intelligence from pre-training is ludicrous. It's not just wrong, it's so wrong that anyone who knows anything at all about machine learning ought to know that it's laughable. Which makes you wonder why the supposed "world leaders" in AI today have almost perfect unanimity on this idea that pre-training-based AI is at parity with human intelligence, or even beyond it.
Let's break down a relatively modest list of the features of human intelligence which are glaringly absent from today's pre-trained AIs:
- Flexibility -- the human ability to reason about unusual tasks/objects in a zero-shot setting. For example, consider those obstacle courses for kids on Nickelodeon (I don't know if they still have those; I'm thinking of the show from the '90s). The obstacle course is designed to be bewildering, and half the challenge is just figuring out what is even supposed to be done. Not only humans but many animals handle these kinds of challenges no sweat. AI, on the other hand, can be made to score something like 50 points lower on all those PhD exams it is supposedly "acing" just by re-ordering the multiple-choice answers (see the sketch after this list). That's called the illusion of intelligence.
- Stability over reasoning chains -- This is probably the most visible failure of pre-training-based AIs, and it will continue to be, because pre-training is effectively a futile attempt to walk out every possible reasoning chain in advance and compress all those patterns into a fixed database. It'll never work; you can't compress infinity into a finite space. This shows up in the multiplication example, where the AI becomes unstable but is unaware that this is the case. If you ask someone who isn't good at math to multiply two 30-digit numbers, they might start with the first few digits, then throw up their hands and say, "I give up, I told you I'm no good at math." That is stability over reasoning chains, because the subject has recognized their own incompetence rather than steaming ahead into gibberish with quixotic certitude.
- Bona fide generalization -- GPTs can generalize, there is no doubt about it. But the kind of generalization they do I prefer to describe as interpolation. Interpolation is like coloring in a drawing, rather than producing a drawing from a blank page. Even in cases where AIs have supposedly exhibited stunning originality, such as in SD-based image generation, it has frequently been shown that the AI is literally just copying the composition and layout from Drawing A and then applying some other style (perhaps borrowed from Drawings B, C, D, etc.) onto it. So the awesome heroic pose supposedly "originally created" by the AI turns out to be a paper-doll collage where slight stylistic modifications have been made to a drawing pilfered from somewhere in its ocean-scale database of imagery. That is not generalization, that is interpolation, and Hype-Bros Inc. intentionally keep conflating these two very different things.
- Inference (recognition of absurdities) -- Let's suppose I tell you, "I washed my car with a basketball today". At the very least, you will immediately understand that I'm trying to be funny or ironic, but in no case will you imagine that I literally washed my car with a basketball. Maybe the joke is that I was holding the basketball while washing my car. Or who knows what. In any case, the statement is absurd on its face. The LLM cannot tell an absurd statement from a merely unusual one. This is so important that it's almost impossible to exaggerate. A black swan is rare, but not impossible. Impossible things are not just rare, they're impossible! You literally cannot wash a car with a basketball... it's simply impossible! And the very reason we are told to forgive the LLM for botching 30-digit multiplication, even though we provided the steps, and even though it's a task most humans could complete if they really tried -- namely, that it's just a very fancy probabilistic algorithm -- is the same reason the LLM cannot distinguish the mere improbability of a black swan from the true impossibility of washing a car with a basketball. The LLM has certainly never encountered the phrase "I washed my car with a basketball" before, but there is an infinitude of such phrases it has never encountered; to the LLM, the statement is merely improbable, just another kind of black swan it hasn't heard before. This is why the LLM cannot have any unrecantable ground beliefs -- it literally cannot know the difference between something that is truly impossible and something that is merely rare and did not occur in its training data.
- Short-term and long-term memory (not just database queries, actual memory) -- Just as the term "reasoning" has a special meaning in today's ML that is completely disconnected from what we mean by "reasoning" in the vernacular, so also the term "memory" has its own special meaning. When they say "memory", they don't mean the kind of integrated memory that you and I have; they mean a weird sort of database of news clippings, like the inside of Mel Gibson's apartment in the movie Conspiracy Theory. That's what they mean by "memory". This is not actual memory and it doesn't function like actual memory. And it shows.
- In situ learning of new facts -- Pre-training-based AIs are, by definition, incapable of learning any new facts in situ. The only way they can be updated is by a new weight-training cycle. Thus, when you inform the AI that, for example, "Donald Trump and Elon Musk have fallen out with one another", it can only treat this as some kind of hypothetical scenario which might have happened in your parallel universe... or not. In other words, when you share a news event with a Transformer, it has no idea how that is any different from LARPing. That's because it cannot actually learn anything in situ, and in situ learning is a core feature of human cognition.
- Corrigibility -- as a corollary to the previous point, pre-training-based AIs are incorrigible, by definition. The weights, once trained, are fixed forever. Every interaction a user has with the model is just a roll-out of the language-model, triggered by the user's input prompt. Nothing more, nothing less. It should be obvious from even cursory introspection that this is wholly incongruous with how human thinking works.
- Absence of wariness -- Another core component of human cognition is wariness... knowing when you're out of your depth. No one can know everything, and no one knows all the tools of thinking, no matter how many copyrighted books they scanned during training. Thus, there is always someone out there who knows something you don't and can do something you can't. The strutting, cock-a-doodle-doo confidence of LLMs (arising from a complete absence of any wariness) is not a feature, it's a bug. It's a behavior that is not only out of step with human cognition, it's simply embarrassing. Hallucination is the most common manifestation of this trait.
- Meta-cognition, genuine self-reflection, knowing one's own limits -- I'm lumping these together as a single bullet point, but they each deserve their own. Another feature of human cognition glaringly absent from LLMs is meta-cognition, the ability to think about one's own thinking, as well as the thinking of others. That's because the LLM is not thinking at all, not in any sense whatsoever. Rather, the LLM is a giant neural-net decompression algorithm that is decompressing meta-patterns it learned during training. These compressions have been described as "natural language programs", and I think that's an excellent description. Pre-training, then, consists of the Transformer learning countless such "natural language programs" that allow the training data to be efficiently encoded in the model's weights for later recall and use at inference time. But that's the whole problem -- human cognition essentially consists in the de novo synthesis of new natural language programs on demand. They may be very short, simple programs, but they aren't all stored in our brain in the form of a giant pre-trained blob. (Some of them are, and we call those reflexes.) The difference between my mind and a Transformer is that the Transformer literally stores every sentence it would ever say to me in response to statement XYZ. My mind does not store every response I might make to you (not even in compressed form); rather, it is able to synthesize a response, de novo, on the fly. This is the essence of meta-cognition; this is why I can reason about myself reasoning about myself reasoning, ad nauseam, and also why I can reason about you reasoning about whatever, and so on.
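To make the flexibility point above concrete (the one about re-ordering multiple-choice answers), here is a minimal sketch of the kind of probe I have in mind. The `ask_model` function is a placeholder for whatever chat API is being tested, not a real library call; everything else is plain bookkeeping. If a model's accuracy depends on where the correct option sits rather than on what it says, that's position bias, not understanding:

```python
import random

def ask_model(question: str, options: list[str]) -> int:
    """Placeholder: send the question plus the lettered options to the LLM
    under test and return the index of the option it picks."""
    raise NotImplementedError

def order_sensitivity(question: str, options: list[str], correct: int,
                      trials: int = 10, seed: int = 0) -> float:
    """Re-ask the same question with the options shuffled each time and report
    how often the model still lands on the correct *content*. A model that
    understands the question should be insensitive to the ordering."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        order = list(range(len(options)))
        rng.shuffle(order)                      # a fresh permutation of the options
        shuffled = [options[k] for k in order]
        picked = ask_model(question, shuffled)
        if order[picked] == correct:            # map the pick back to the original option
            hits += 1
    return hits / trials                        # 1.0 = order-invariant; less = position bias
```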
Finally, all of this manifests in the LLM's glaringly obvious ignorance of its own limitations. The "count the number of r's in strawberry" challenge (or pick some other similar challenge, now that they've manually patched this one) exposes this flaw. When we reason about a problem like this, we construct a disposable mental model and use it during the reasoning process. The LLM has to somehow "just know", because it has no place in its "mind" in which to do this kind of calculation. And this is why current AIs have absolutely zero sense of wariness. They don't know when they don't know. The Rober experiment with a self-driving Tesla running through a Wile E. Coyote mural sums this problem up. Sure, they patched that, but that's just a game of whack-a-mole. As a human, why would that not happen to me? Is it because my eyesight is so excellent? Nope, those cameras can see far more detail than my eyes can. Rather, my mind is hyper-sensitive to any sort of incongruency in my environment. Whenever anything seems "out of place", my mind picks up on it almost instantly and alerts my nervous system that "something isn't right!" long before my conscious mind can react. This is why we involuntarily flinch or jump when surprised by unexpected events or objects, such as coming around a corner and finding a snake. Once again, this is a core component of human cognition, and it's connected to the fact that I will instantly recognize the absurdity of the statement, "I washed my car with a basketball today". That's as incongruent as four 90-degree angled lines appearing in the sky in the shape of a wall, even though I cannot yet make out with my eyesight that there's an actual wall/mural there. It will cause an instantaneous trigger in my mind that "something isn't right!!" Transformers have no such ability, nor can they.
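The strawberry example is telling precisely because the underlying computation is trivial once you have a scratch space to do it in: walk the letters, keep a tally. A human runs that loop in a disposable mental model; a pure token-predictor has nowhere to set it up. For contrast, here is the deterministic version (a trivial illustration, nothing more):

```python
# Count the r's explicitly: iterate over the letters and tally the matches.
word = "strawberry"
tally = sum(1 for ch in word.lower() if ch == "r")
print(word, "contains", tally, "r's")   # -> strawberry contains 3 r's
```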