We are in the AI Singularity

AI’s Dirty Little Secret
https://www.youtube.com/watch?v=QO5plxqu_Yw
{Sabine Hossenfelder | 04 June 2024}

There’s a lot of talk about artificial intelligence these days, but what I find most interesting about AI no one ever talks about. It’s that we have no idea why they work as well as they do. I find this a very interesting problem because I think if we figure it out it’ll also tell us something about how the human brain works. Let’s have a look.



The irony of this is that the solution to this problem is actually known. It's called Minimum Description Length (yes, I'm omitting tons of technical nuances). I've been watching the academic establishment ignore this entire field of study (algorithmic information theory, AIT) for 20+ years, and I still have yet to understand why it is so studiously ignored. Must be politics.

Anyway, the key idea of MDL is this. First, we imagine an absolutely minimal model of the data. "Minimal model", here, refers to the Kolmogorov-complexity, which is the length of the shortest program that exactly outputs the training data, and halts. For the purpose of ML, such a model is overfitted, by definition. It just exactly compresses the training data but will be useless for any new data. Once we have this minimal model, however, we can then add something like an error-correcting code. Rather than directly minimizing the length of the program that outputs the training data, we minimize the size of the program that produces a code that outputs the data from a minimal seed, to within a specified error-margin. Once we have this in place, we can then meaningfully separate the "model" (the code) from the "noise" (the seed that encodes the training data, with the given code). In short, the MDL system not only maximally compresses the training data, but it also automatically extracts the "signal" (model) in the training-data from the "noise". This is the essence of generalization, since what it means to generalize, is to form a meaningful model, then extrapolate from that model. MDL automates this in an elegant mathematical framework, and illuminates exactly what we mean by generalization, in a very general sense.
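To make the model-vs-noise split a bit more concrete, here is a rough sketch of a crude two-part code. It is not pure MDL (that is uncomputable); it swaps in a much simpler, computable stand-in. Everything in it is an assumption for illustration: the "models" are order-k Markov chains over a bit string, each model parameter is charged the usual asymptotic 0.5*log2(n) bits, and the names `two_part_cost` and `best_order` are made up here.

```python
import math
from collections import Counter

def two_part_cost(bits: str, order: int) -> float:
    """Total description length in bits: L(model) + L(data | model)."""
    n = len(bits)
    # L(model): one probability per context, each charged ~0.5*log2(n) bits
    # (an assumed, standard asymptotic parameter cost).
    model_bits = (2 ** order) * 0.5 * math.log2(n)
    # Count how often each symbol follows each context of length `order`.
    counts = Counter()
    for i in range(order, n):
        counts[(bits[i - order:i], bits[i])] += 1
    # L(data | model): the first `order` symbols are sent raw (1 bit each);
    # the rest are coded with Laplace-smoothed conditional probabilities.
    data_bits = float(order)
    for i in range(order, n):
        ctx = bits[i - order:i]
        total = counts[(ctx, "0")] + counts[(ctx, "1")]
        p = (counts[(ctx, bits[i])] + 1) / (total + 2)
        data_bits -= math.log2(p)
    return model_bits + data_bits

def best_order(bits: str, max_order: int = 3) -> int:
    """Pick the Markov order with the smallest total description length."""
    return min(range(max_order + 1), key=lambda k: two_part_cost(bits, k))

data = "01" * 100
print(best_order(data))
```

On a strictly alternating string, order 0 pays roughly one bit per symbol for the "noise", while orders 2 and 3 pay for parameters they do not need, so the shortest total description should land on the order-1 model.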

There is a catch: MDL is uncomputable (because K-complexity is uncomputable). What this means is that you cannot build a real system that does theoretical (pure) MDL. However, there are computable approximations of MDL, that is, there are ways to get a lot of the benefits of pure MDL even though the theoretical limits of pure MDL cannot be achieved in practice. MDL has other theoretical limitations I'm skipping over here.
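One cheap computable approximation, sketched here under my own assumptions rather than as anything canonical: the output size of an off-the-shelf compressor is an upper bound (up to a constant) on K-complexity. The snippet uses Python's `zlib`; the helper name `compressed_size` is made up for this post.

```python
import random
import zlib

def compressed_size(data: bytes) -> int:
    """Length of the zlib-compressed data: a computable upper-bound proxy for K-complexity."""
    return len(zlib.compress(data, 9))

regular = b"ab" * 5000                                      # highly structured, 10,000 bytes
rng = random.Random(0)                                      # fixed seed so the run is repeatable
noisy = bytes(rng.getrandbits(8) for _ in range(10_000))    # pseudo-random, 10,000 bytes

print(compressed_size(regular), compressed_size(noisy))
```

Note the gap between the proxy and the real thing: the "noisy" bytes come from a short seeded program, so their true K-complexity is small, but zlib cannot find that structure and reports them as incompressible. That is exactly the sense in which the approximation gets only some of the benefits of pure MDL.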

Minsky said of the field of AIT:

It seems to me that the most important discovery since Gödel was the discovery by Chaitin, Solomonoff and Kolmogorov of the concept called Algorithmic Probability which is a fundamental new theory of how to make predictions given a collection of experiences and this is a beautiful theory, everybody should learn it, but it’s got one problem, that is, that you cannot actually calculate what this theory predicts because it is too hard, it requires an infinite amount of work. However, it should be possible to make practical approximations to the Chaitin, Kolmogorov, Solomonoff theory that would make better predictions than anything we have today. Everybody should learn all about that and spend the rest of their lives working on it.

— Panel discussion on The Limits of Understanding, World Science Festival, NYC, Dec 14, 2014
 
Here's a concrete example of the nuts-and-bolts of how AI is making us collectively retarded. I just happened to be reading this article in order to understand Linux a little better but the author opens with a beautiful demonstration of just how stupid modern AI really is, and how it's corrupting the very fabric of human knowledge itself. I foresee the coming of "AI free" sub-communities where all forms of information that could be wholly or even partly sourced from AI are actively excluded from the community so that the information flowing between community members is genuine information and not some neural-net's DeepDream hallucinations. See the highlighted quote below for why I am convinced that Google is in full-blown, double-bird-flip, F-you-and-the-horse-you-rode-in-on mode. Google consistently ranks the absolute worst in my own subjective perception of search results. They're all worse than they were before AI, but Google is the worst-of-the-worst...

What is PID 0?

I get nerd-sniped a lot. People offhandedly ask something innocent, and I lose the next several hours (or in this case, days) comprehensively figuring out the answer. Usually this ends up in a rant thread on mastodon or in some private chat group or other. But for once I have the energy to write one up for the blog.

Today’s innocent question:

Is there a reason UIDs start at 0 but PIDs start at 1?

The very short version: Unix PIDs do start at 0! PID 0 just isn’t shown to userspace through traditional APIs. PID 0 starts the kernel, then retires to a quiet life of helping a bit with process scheduling and power management. Also the entire web is mostly wrong about PID 0, because of one sentence on Wikipedia from 16 years ago.

There’s a slightly longer short version right at the end, or you can stick with me for the extremely long middle bit!

But surely you could just google what PID 0 is, right? Why am I even publishing this?

The internet is wrong

At time of writing, if you go ask the web about PID 0, you’ll get a mix of incorrect and misleading information, and almost no correct answers.

After figuring out the truth, I asked Google, Bing, DuckDuckGo and Kagi what PID 0 is on linux. I looked through the top 20 results for each, as well as whatever knowledge boxes and AI word salads they organically gave me. That’s 2 pages of results on Google, for reference.

All of them failed to produce a fully correct answer. Most had a single partially correct answer somewhere in the first 20 results, but never near the top or showcased. DDG did best, with the partially correct answer at number 4. Google did the worst, no correct answer at all. And in any case, the incorrect answers were so prevalent and consistent with each other that you wouldn’t believe the one correct site anyway.

The top-2 results on all engines were identical, interestingly: a stackoverflow answer that is wrong, and a spammy looking site that seems to have embraced LLM slop, because partway through failing to explain PID 0 it randomly shifts to talking about PID loops, from control system theory, before snapping out of it a paragraph later and going back to Unix PIDs.

Going directly to the source of the LLM slop fared slightly better, on account of them having stolen from books as well as the web, but they still make shit up in the usual amount. I was able to get a correct answer though, using the classic prompting technique of already knowing the answer and retrying until I got good RNG.

...
 


I do think we're going to have humanoid robots in the very near future, and they are going to have at least ChatGPT+Sora capability, and probably quite a bit more than that. The irony is that these robots will be able to hand-write a novel doctoral thesis on the mating patterns of some arcane species of sub-Saharan Africa in the time it takes for their fingers to write it out on paper -- with typewriter-precision... but they won't have a thousandth of the physical reasoning capability of a cat, or even its general-purpose reasoning (abstractions, social abstractions, etc.). Such a robot will even be able to explain the gaps in its own understanding when prompted to, but it will not be able to actually remove those gaps in live practice. In other words, first-generation robots are going to be at least as dumb as the dumbest sci-fi robots, but they will simultaneously have "super-human IQ", which just shows how useless such measures are in terms of assessing real intelligence.

All the building-blocks for a GOFAI general-purpose reasoning system are currently on the table. I put the odds at 1-in-2 that those building-blocks will be assembled by somebody (probably OpenAI) within the next two years, at the outside. So, I think this is coming much faster than most people do, but I also think it doesn't matter as much as most people do. People seem to think that general-purpose reasoning is some kind of holy grail that, once achieved, will mark a transition to a new "era" where machines tell us what to do, rather than vice-versa. They will be used to tell us what to do, but it won't be because they're inherently smarter or better at anything. Even if they're really good at reasoning, so what? It doesn't really alter the ultimate issues at stake, which aren't really about intelligence. The unstated assumption in these discussions is that social ills are the result of a scarcity of intelligence. If we could just get our hands on more raw intelligence, finally, society would be able to solve its problems. But none of this is correct. Our problems are the result of systematic sabotage, and the very same people who are systematically sabotaging society are the people who are building the AI/robots. So, we're not going to solve anything with AI/robotics, we're just going to have even worse forms of the tyranny we already had before. But now with robots.

The solution, if there is a solution, is to start by discarding all forms of AI hype. These machines are just electronic gizmos operating at high frequency, nothing more, nothing less. The sooner people get that into their heads, the better, because the worst-case scenario is that the global population attributes super-human "wisdom" to these smart-refrigerators on hydraulic-piston legs, turns to them for "wisdom", and starts implementing whatever they recommend as law. This is what the puppet-masters behind the AI/robots want to bring about. That's why I say that OpenAI is the worst-case AI safety scenario. Don't fall for the Wizard of Oz trick, people!
 
Is the Intelligence-Explosion Near? A Reality Check.
https://www.youtube.com/watch?v=xm1B3Y3ypoE
{Sabine Hossenfelder | 13 June 2024}

I had a look at Leopold Aschenbrenner's recent (very long) essay about the supposedly near "intelligence explosion" in artificial intelligence development. I am not particularly convinced by his argument. You can read his essay here: https://situational-awareness.ai/



Here's a deep-dive on this paper for people who are interested. While Aschenbrenner is mostly just playing into the hype, he does raise a few valid points. Perhaps the biggest takeaway is that a lot more people need to start thinking about this issue a lot more. The public sadly confuses actual-AI with Hollywood-AI. Actual-AI is not Hollywood-AI (nor on a credible trajectory to it, yet) but that doesn't mean it's not dangerous. AIs don't think like us. If nation-states have already built strategic-scale AIs (and there is no reason to believe they haven't), the threats we are facing from AI are not like Skynet... they're a lot weirder than that.



Already 4 years old, but everybody should watch this:



Could Clown World be a symptom of a rogue AI, gone out-of-control? :frog:
 
Truth is coming out. The guy even looks like a commie. How much clearer does this all have to be made?

Link
 
SOMEONE FINALLY SAID IT: ChatGPT is Bullshit

Reproduced here:

ChatGPT is bullshit

Michael Townsen Hicks · James Humphries · Joe Slater
© The Author(s) 2024

Ethics and Information Technology
(2024) 26:38
https://doi.org/10.1007/s10676-024-09775-5

Abstract

Recently, there has been considerable interest in large language models:
machine learning systems which produce humanlike text and dialogue.
Applications of these systems have been plagued by persistent inaccuracies in
their output; these are often called “AI hallucinations”. We argue that these
falsehoods, and the overall activity of large language models, is better
understood as bullshit in the sense explored by Frankfurt (On Bullshit,
Princeton, 2005): the models are in an important way indifferent to the truth
of their outputs. We distinguish two ways in which the models can be said to be
bullshitters, and argue that they clearly meet at least one of these
definitions. We further argue that describing AI misrepresentations as bullshit
is both a more useful and more accurate way of predicting and discussing the
behaviour of these systems.

Keywords Artificial intelligence · Large language models · LLMs · ChatGPT ·
Bullshit · Frankfurt · Assertion ·

Introduction

Large language models (LLMs), programs which use reams of available text and
probability calculations in order to create seemingly-human-produced writing,
have become increasingly sophisticated and convincing over the last several
years, to the point where some commentators suggest that we may now be
approaching the creation of artificial general intelligence (see e.g. Knight,
2023 and Sarkar, 2023). Alongside worries about the rise of Skynet and the use
of LLMs such as ChatGPT to replace work that could and should be done by
humans, one line of inquiry concerns what exactly these programs are up to: in
particular, there is a question about the nature and meaning of the text
produced, and of its connection to truth. In this paper, we argue against the
view that when ChatGPT and the like produce false claims they are lying or even
hallucinating, and in favour of the position that the activity they are engaged
in is bullshitting, in the Frankfurtian sense (Frankfurt, 2002, 2005). Because
these programs cannot themselves be concerned with truth, and because they are
designed to produce text that looks truth-apt without any actual concern for
truth, it seems appropriate to call their outputs bullshit.

We think that this is worth paying attention to. Descriptions of new
technology, including metaphorical ones, guide policymakers’ and the public’s
understanding of new technology; they also inform applications of the new
technology. They tell us what the technology is for and what it can be expected
to do. Currently, false statements by ChatGPT and other large language models
are described as “hallucinations”, which give policymakers and the public the
idea that these systems are misrepresenting the world, and describing what they
“see”. We argue that this is an inapt metaphor which will misinform the public,
policymakers, and other interested parties.

The structure of the paper is as follows: in the first section, we outline how
ChatGPT and similar LLMs operate. Next, we consider the view that when they
make factual errors, they are lying or hallucinating: that is, deliberately
uttering falsehoods, or blamelessly uttering them on the basis of misleading
input information. We argue that neither of these ways of thinking are
accurate, insofar as both lying and hallucinating require some concern with the
truth of their statements, whereas LLMs are simply not designed to accurately
represent the way the world is, but rather to give the impression that this is
what they’re doing. This, we suggest, is very close to at least one way that
Frankfurt talks about bullshit. We draw a distinction between two sorts of
bullshit, which we call ‘hard’ and ‘soft’ bullshit, where the former requires
an active attempt to deceive the reader or listener as to the nature of the
enterprise, and the latter only requires a lack of concern for truth. We argue
that at minimum, the outputs of LLMs like ChatGPT are soft bullshit:
bullshit–that is, speech or text produced without concern for its truth–that is
produced without any intent to mislead the audience about the utterer’s
attitude towards truth. We also suggest, more controversially, that ChatGPT may
indeed produce hard bullshit: if we view it as having intentions (for example,
in virtue of how it is designed), then the fact that it is designed to give the
impression of concern for truth qualifies it as attempting to mislead the
audience about its aims, goals, or agenda. So, with the caveat that the
particular kind of bullshit ChatGPT outputs is dependent on particular views of
mind or meaning, we conclude that it is appropriate to talk about
ChatGPT-generated text as bullshit, and flag up why it matters that – rather
than thinking of its untrue claims as lies or hallucinations – we call bullshit
on ChatGPT.

What is ChatGPT?

Large language models are becoming increasingly good at carrying on convincing
conversations. The most prominent large language model is OpenAI’s ChatGPT, so
it’s the one we will focus on; however, what we say carries over to other
neural network-based AI chatbots, including Google’s Bard chatbot,
AnthropicAI’s Claude (claude.ai), and Meta’s LLaMa. Despite being merely
complicated bits of software, these models are surprisingly human-like when
discussing a wide variety of topics. Test it yourself: anyone can go to the
OpenAI web interface and ask for a ream of text; typically, it produces text
which is indistinguishable from that of your average English speaker or writer.
The variety, length, and similarity to human-generated text that GPT-4 is
capable of has convinced many commentators to think that this chatbot has
finally cracked it: that this is real (as opposed to merely nominal) artificial
intelligence, one step closer to a humanlike mind housed in a silicon brain.

However, large language models, and other AI models like ChatGPT, are doing
considerably less than what human brains do, and it is not clear whether they
do what they do in the same way we do. The most obvious difference between an
LLM and a human mind involves the goals of the system. Humans have a variety
of goals and behaviours, most of which are extra-linguistic: we have basic
physical desires, for things like food and sustenance; we have social goals and
relationships; we have projects; and we create physical objects. Large language
models simply aim to replicate human speech or writing. This means that their
primary goal, insofar as they have one, is to produce human-like text. They do
so by estimating the likelihood that a particular word will appear next, given
the text that has come before. The machine does this by constructing a massive
statistical model, one which is based on large amounts of text, mostly taken
from the internet. This is done with relatively little input from human
researchers or the designers of the system; rather, the model is designed by
constructing a large number of nodes, which act as probability functions for a
word to appear in a text given its context and the text that has come before
it. Rather than putting in these probability functions by hand, researchers
feed the system large amounts of text and train it by having it make next-word
predictions about this training data. They then give it positive or negative
feedback depending on whether it predicts correctly. Given enough text, the
machine can construct a statistical model giving the likelihood of the next
word in a block of text all by itself.
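(Aside, not part of the paper.) The "likelihood of the next word" idea can be sketched with a toy model. The assumption here is large: this is a bigram counter over a one-line corpus, not a neural network, and the names `train_bigram` and `next_word_probs` are invented for the example. It only shows what "a statistical model giving the likelihood of the next word" means in the simplest possible case.

```python
from collections import Counter, defaultdict

def train_bigram(text: str) -> dict:
    """Count, for each word, which words follow it in the training text."""
    words = text.split()
    follows = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        follows[prev][nxt] += 1
    return follows

def next_word_probs(follows: dict, prev: str) -> dict:
    """Turn raw follow-counts into a probability distribution over next words."""
    counts = follows[prev]
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}

model = train_bigram("the cat sat on the mat the cat ran")
print(next_word_probs(model, "the"))
```

Nothing in the model records whether "the cat sat" is true of any cat; it only records which words tended to follow which.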

This model associates with each word a vector which locates it in a
high-dimensional abstract space, near other words that occur in similar
contexts and far from those which don’t. When producing text, it looks at the
previous string of words and constructs a different vector, locating the word’s
surroundings – its context – near those that occur in the context of similar
words. We can think of these heuristically as representing the meaning of the
word and the content of its context. But because these spaces are constructed
using machine learning by repeated statistical analysis of large amounts of
text, we can’t know what sorts of similarity are represented by the dimensions
of this high-dimensional vector space. Hence we do not know how similar they
are to what we think of as meaning or context. The model then takes these two
vectors and produces a set of likelihoods for the next word; it selects and
places one of the more likely ones—though not always the most likely. Allowing
the model to choose randomly amongst the more likely words produces more
creative and human-like text; the parameter which controls this is called the
‘temperature’ of the model and increasing the model’s temperature makes it both
seem more creative and more likely to produce falsehoods. The system then
repeats the process until it has a recognizable, complete-looking response to
whatever prompt it has been given.
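(Aside, not part of the paper.) A common way to implement a temperature parameter is to divide the model's raw scores by the temperature before a softmax; whether any particular chatbot does exactly this is an assumption on my part. The scores below are hypothetical numbers, and `softmax_with_temperature` and `sample_next_word` are names made up for the sketch.

```python
import math
import random

def softmax_with_temperature(scores: dict, temperature: float) -> dict:
    """Convert raw next-word scores into probabilities, sharpened or flattened by temperature."""
    scaled = {word: s / temperature for word, s in scores.items()}
    top = max(scaled.values())                       # subtract the max for numerical stability
    exps = {word: math.exp(v - top) for word, v in scaled.items()}
    total = sum(exps.values())
    return {word: e / total for word, e in exps.items()}

def sample_next_word(scores: dict, temperature: float, rng: random.Random) -> str:
    """Draw one next word at random, weighted by the temperature-adjusted probabilities."""
    probs = softmax_with_temperature(scores, temperature)
    words = list(probs)
    return rng.choices(words, weights=[probs[w] for w in words], k=1)[0]

scores = {"Paris": 3.0, "Lyon": 1.5, "Mars": 0.5}    # hypothetical scores, for illustration only
cold = softmax_with_temperature(scores, 0.2)
hot = softmax_with_temperature(scores, 2.0)
print(cold)
print(hot)
print(sample_next_word(scores, 2.0, random.Random(0)))
```

At low temperature nearly all of the probability sits on the top-scoring word; at high temperature the unlikely continuations get a real share of the draws, which matches the paper's remark that raising the temperature makes output both more varied and more likely to be false.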

Given this process, it’s not surprising that LLMs have a problem with the
truth. Their goal is to provide a normal-seeming response to a prompt, not to
convey information that is helpful to their interlocutor. Examples of this are
already numerous, for instance, a lawyer recently prepared his brief using
ChatGPT and discovered to his chagrin that most of the cited cases were not
real (Weiser, 2023); as Judge P. Kevin Castel put it, ChatGPT produced a text
filled with “bogus judicial decisions, with bogus quotes and bogus internal
citations”. Similarly, when computer science researchers tested ChatGPT’s
ability to assist in academic writing, they found that it was able to produce
surprisingly comprehensive and sometimes even accurate text on biological
subjects given the right prompts. But when asked to produce evidence for its
claims, “it provided five references dating to the early 2000s. None of the
provided paper titles existed, and all provided PubMed IDs (PMIDs) were of
different unrelated papers” (Alkaissi and McFarland, 2023). These errors can
“snowball”: when the language model is asked to provide evidence for or a
deeper explanation of a false claim, it rarely checks itself; instead it
confidently produces more false but normal-sounding claims (Zhang et al. 2023).
The accuracy problem for LLMs and other generative AIs is often referred to as
the problem of “AI hallucination”: the chatbot seems to be hallucinating
sources and facts that don’t exist. These inaccuracies are referred to as
“hallucinations” in both technical (OpenAI, 2023) and popular contexts (Weise &
Metz, 2023).

These errors are pretty minor if the only point of a chatbot is to mimic human
speech or communication. But the companies designing and using these bots have
grander plans: chatbots could replace Google or Bing searches with a more
user-friendly conversational interface (Shah & Bender, 2022; Zhu et al., 2023),
or assist doctors or therapists in medical contexts (Lysandrou, 2023). In these
cases, accuracy is important and the errors represent a serious problem.

One attempted solution is to hook the chatbot up to some sort of database,
search engine, or computational program that can answer the questions that the
LLM gets wrong (Zhu et al., 2023). Unfortunately, this doesn’t work very well
either. For example, when ChatGPT is connected to Wolfram Alpha, a powerful
piece of mathematical software, it improves moderately in answering simple
mathematical questions. But it still regularly gets things wrong, especially
for questions which require multi-stage thinking (Davis & Aaronson, 2023). And
when connected to search engines or other databases, the models are still
fairly likely to provide fake information unless they are given very specific
instructions–and even then things aren’t perfect (Lysandrou, 2023). OpenAI has
plans to rectify this by training the model to do step by step reasoning
(Lightman et al., 2023) but this is quite resource-intensive, and there is
reason to be doubtful that it will completely solve the problem—nor is it clear
that the result will be a large language model, rather than some broader form
of AI.

Solutions such as connecting the LLM to a database don’t work because, if
the models are trained on the database, then the words in the database affect
the probability that the chatbot will add one or another word to the line of
text it is generating. But this will only make it produce text similar to the
text in the database; doing so will make it more likely that it reproduces the
information in the database but by no means ensures that it will.

On the other hand, the LLM can also be connected to the database by allowing it
to consult the database, in a way similar to the way it consults or talks to
its human interlocutors. In this way, it can use the outputs of the database as
text which it responds to and builds on. Here’s one way this can work: when a
human interlocutor asks the language model a question, it can then translate
the question into a query for the database. Then, it takes the response of the
database as an input and builds a text from it to provide back to the human
questioner. But this can misfire too, as the chatbots might ask the database
the wrong question, or misinterpret its answer (Davis & Aaronson, 2023). “GPT-4
often struggles to formulate a problem in a way that Wolfram Alpha can accept
or that produces useful output.” This is not unrelated to the fact that when
the language model generates a query for the database or computational module,
it does so in the same way it generates text for humans: by estimating the
likelihood that some output “looks like” the kind of thing the database will
correspond with.
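(Aside, not part of the paper.) The question-to-query-to-answer loop described above can be sketched with stand-ins. Everything here is hypothetical: `draft_query` is a few string rules standing in for the language model's query-writing step, `lookup` is a dictionary standing in for the database, and `FACTS` is made-up data. The point is only to show where the misfire happens.

```python
# Hypothetical stand-ins: a real system would call an LLM and a real database.
FACTS = {"capital of france": "Paris", "boiling point of water": "100 C"}

def draft_query(question: str) -> str:
    """Stand-in for the LLM step that rewrites a question as a database query."""
    q = question.lower().rstrip("?")
    for prefix in ("what is the ", "what's the "):
        if q.startswith(prefix):
            return q[len(prefix):]
    return q  # no rewrite rule matched: the query may not be what the database expects

def lookup(query: str):
    """Stand-in for the database: exact-match lookup, None on a miss."""
    return FACTS.get(query)

def answer(question: str) -> str:
    """Question -> query -> database result -> text for the questioner."""
    query = draft_query(question)
    result = lookup(query)
    if result is None:
        return f"(no database result for query {query!r})"
    return f"The {query} is {result}."

print(answer("What is the capital of France?"))
print(answer("Tell me France's capital"))
```

The second question asks for the same fact but produces a query the database cannot match. This stand-in at least reports the miss; a real language model has no such obligation and may simply carry on generating plausible text.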

One might worry that these failed methods for improving the accuracy of
chatbots are connected to the inapt metaphor of AI hallucinations. If the AI is
misperceiving or hallucinating sources, one way to rectify this would be to put
it in touch with real rather than hallucinated sources. But attempts to do so
have failed.

The problem here isn’t that large language models hallucinate, lie, or
misrepresent the world in some way. It’s that they are not designed to
represent the world at all; instead, they are designed to convey convincing
lines of text. So when they are provided with a database of some sort, they use
this, in one way or another, to make their responses more convincing. But they
are not in any real way attempting to convey or transmit the information in the
database. As Chirag Shah and Emily Bender put it: “Nothing in the design of
language models (whose training task is to predict words given context) is
actually designed to handle arithmetic, temporal reasoning, etc. To the extent
that they sometimes get the right answer to such questions is only because they
happened to synthesize relevant strings out of what was in their training data.
No reasoning is involved […] Similarly, language models are prone to making
stuff up […] because they are not designed to express some underlying set of
information in natural language; they are only manipulating the form of
language” (Shah & Bender, 2022). These models aren’t designed to transmit
information, so we shouldn’t be too surprised when their assertions turn out to
be false.

Lies, ‘hallucinations’ and bullshit

Frankfurtian bullshit and lying

Many popular discussions of ChatGPT call its false statements ‘hallucinations’.
One also might think of these untruths as lies. However, we argue that this
isn’t the right way to think about it. We will argue that these falsehoods
aren’t hallucinations later – in Sect. 3.2.3. For now, we’ll discuss why these
untruths aren’t lies but instead are bullshit.

The topic of lying has a rich philosophical literature. In ‘Lying’, Saint
Augustine distinguished seven types of lies, and his view altered throughout
his life. At one point, he defended the position that any instance of knowingly
uttering a false utterance counts as a lie, so that even jokes containing false
propositions, like –

I entered a pun competition and because I really wanted to win, I submitted ten
entries. I was sure one of them would win, but no pun in ten did.

– would be regarded as a lie, as I have never entered such a competition
(Proops & Sorensen, 2023: 3). Later, this view is refined such that the speaker
only lies if they intend the hearer to believe the utterance. The suggestion
that the speaker must intend to deceive is a common stipulation in literature
on lies. According to the “traditional account” of lying:

To lie = df. to make a believed-false statement to another person with the
intention that the other person believe that statement to be true (Mahon,
2015).

For our purposes this definition will suffice. Lies are generally frowned upon.
But there are acts of misleading testimony which are criticisable, which do not
fall under the umbrella of lying.1 These include spreading untrue gossip, which
one mistakenly, but culpably, believes to be true. Another class of misleading
testimony that has received particular attention from philosophers is that of
bullshit. This everyday notion was analysed and introduced into the
philosophical lexicon by Harry Frankfurt.2

Frankfurt understands bullshit to be characterized not by an intent to deceive
but instead by a reckless disregard for the truth. A student trying to sound
knowledgeable without having done the reading, a political candidate saying
things because they sound good to potential voters, and a dilettante trying to
spin an interesting story: none of these people are trying to deceive, but they
are also not trying to convey facts. To Frankfurt, they are bullshitting.

Like “lie”, “bullshit” is both a noun and a verb: an utterance produced can be
a lie or an instance of bullshit, as can the act of producing these utterances.
For an utterance to be classed as bullshit, it must not be accompanied by the
explicit intentions that one has when lying, i.e., to cause a false belief in
the hearer. Of course, it must also not be accompanied by the intentions
characterised by an honest utterance. So far this story is entirely negative.
Must any positive intentions be manifested in the utterer?

Throughout most of Frankfurt’s discussion, his characterisation of bullshit is
negative. He notes that bullshit requires “no conviction” from the speaker
about what the truth is (2005: 55), that the bullshitter “pays no attention” to
the truth (2005: 61) and that they “may not deceive us, or even intend to do
so, either about the facts or what he takes the facts to be” (2005: 54). Later,
he describes the “defining feature” of bullshit as “a lack of concern with
truth, or an indifference to how things really are [our emphasis]” (2002: 340).
These suggest a negative picture; that for an output to be classed as bullshit,
it only needs to lack a certain relationship to the truth.

However, in places, a positive intention is presented. Frankfurt says what a
bullshitter “…does necessarily attempt to deceive us about is his
enterprise. His only indispensably distinctive characteristic is that in a
certain way he misrepresents what he is up to” (2005: 54).

This is somewhat surprising. It restricts what counts as bullshit to utterances
accompanied by a higher-order deception. However, some of Frankfurt’s examples
seem to lack this feature. When Fania Pascal describes her unwell state as
“feeling like a dog that has just been run over” to her friend Wittgenstein, it
stretches credulity to suggest that she was intending to deceive him about how
much she knew about how run-over dogs felt. And given how the conditions for
bullshit are typically described as negative, we might wonder whether the
positive condition is really necessary.

Bullshit distinctions

Should utterances without an intention to deceive count as bullshit? One reason
in favour of expanding the definition, or embracing a plurality of bullshit, is
indicated by Frankfurt’s comments on the dangers of bullshit. “In contrast [to
merely unintelligible discourse], indifference to the truth is extremely
dangerous. The conduct of civilized life, and the vitality of the institutions
that are indispensable to it, depend very fundamentally on respect for the
distinction between the true and the false. Insofar as the authority of this
distinction is undermined by the prevalence of bullshit and by the mindlessly
frivolous attitude that accepts the proliferation of bullshit as innocuous, an
indispensable human treasure is squandered” (2002: 343). These dangers seem to
manifest regardless of whether there is an intention to deceive about the
enterprise a speaker is engaged in. Compare the deceptive bullshitter, who does
aim to mislead us about being in the truth-business, with someone who harbours
no such aim, but just talks for the sake of talking (without care, or indeed
any thought, about the truth-values of their utterances).

One of Frankfurt’s examples of bullshit seems better captured by the wider
definition. He considers the advertising industry, which is “replete with
instances of bullshit so unmitigated that they serve among the most
indisputable and classic paradigms of the concept” (2005:22). However, it seems
to misconstrue many advertisers to portray their aims as to mislead about their
agendas. They are expected to say misleading things. Frankfurt discusses
Marlboro adverts with the message that smokers are as brave as cowboys (2002:
341). Is it reasonable to suggest that the advertisers pretended to believe
this?

Frankfurt does allow for multiple species of bullshit (2002: 340). Following
this suggestion, we propose to envisage bullshit as a genus, and Frankfurt’s
intentional bullshit as one species within this genus. Other species may
include that produced by the advertiser, who anticipates that no one will
believe their utterances, or someone who has no intention one way or another
about whether they mislead their audience. To that end, consider the following
distinction:

Bullshit (general): Any utterance produced where a speaker has indifference towards the truth of the utterance.

Hard bullshit: Bullshit produced with the intention to mislead the audience about the utterer’s agenda.

Soft bullshit: Bullshit produced without the intention to mislead the hearer regarding the utterer’s agenda.

The general notion of bullshit is useful: on some occasions, we might be
confident that an utterance was either soft bullshit or hard bullshit, but be
unclear which, given our ignorance of the speaker’s higher-order desires. In
such a case, we can still call bullshit.

Frankfurt’s own explicit account, with the positive requirements about
producer’s intentions, is hard bullshit, whereas soft bullshit seems to
describe some of Frankfurt’s examples, such as that of Pascal’s conversation
with Wittgenstein, or the work of advertising agencies. It might be helpful to
situate these distinctions in the existing literature. On our view, hard
bullshit is most closely aligned with Cassam (2019), and Frankfurt’s positive
account, for the reason that all of these views hold that some intention must
be present, rather than merely absent, for the utterance to be bullshit: a kind
of “epistemic insouciance” or vicious attitude towards truth on Cassam’s view,
and (as we have seen) an intent to mislead the hearer about the utterer’s
agenda on Frankfurt’s view. In Sect. 3.2 we consider whether ChatGPT may be a
hard bullshitter, but it is important to note that it seems to us that hard
bullshit, like the two accounts cited here, requires one to take a stance on
whether or not LLMs can be agents, and so comes with additional argumentative
burdens.

Soft bullshit, by contrast, captures only Frankfurt’s negative requirement –
that is, the indifference towards truth that we have classed as definitional of
bullshit (general) – for the reasons given above. As we argue, ChatGPT is at
minimum a soft bullshitter or a bullshit machine, because if it is not an agent
then it can neither hold any attitudes towards truth nor towards deceiving
hearers about its (or, perhaps more properly, its users’) agenda.

It’s important to note that even this more modest kind of bullshitting will
have the deleterious effects that concern Frankfurt: as he says, “indifference
to the truth is extremely dangerous…by the mindlessly frivolous attitude that
accepts the proliferation of bullshit as innocuous, an indispensable human
treasure is squandered” (2002: 343). By treating ChatGPT and similar LLMs as
being in any way concerned with truth, or by speaking metaphorically as if they
make mistakes or suffer “hallucinations” in pursuit of true claims, we risk
exactly this acceptance of bullshit, and this squandering of meaning – so,
irrespective of whether or not ChatGPT is a hard or a soft bullshitter, it does
produce bullshit, and it does matter.

ChatGPT is bullshit

With this distinction in hand, we’re now in a position to consider a worry of
the following sort: Is ChatGPT hard bullshitting, soft bullshitting, or
neither? We will argue, first, that ChatGPT, and other LLMs, are clearly soft
bullshitting. However, the question of whether these chatbots are hard
bullshitting is a trickier one, and depends on a number of complex questions
concerning whether ChatGPT can be ascribed intentions. We canvass a few ways in
which ChatGPT can be understood to have the requisite intentions in Sect. 3.2.

ChatGPT is a soft bullshitter

We are not confident that chatbots can be correctly described as having any
intentions at all, and we’ll go into this in more depth in the next Sect.
(3.2). But we are quite certain that ChatGPT does not intend to convey truths,
and so is a soft bullshitter. We can produce an easy argument by cases for
this. Either ChatGPT has intentions or it doesn’t. If ChatGPT has no intentions
at all, it trivially doesn’t intend to convey truths. So, it is indifferent to
the truth value of its utterances and so is a soft bullshitter.

What if ChatGPT does have intentions? In Sect. 1, we argued that ChatGPT is not
designed to produce true utterances; rather, it is designed to produce text
which is indistinguishable from the text produced by humans. It is aimed at
being convincing rather than accurate. The basic architecture of these models
reveals this: they are designed to come up with a likely continuation of a
string of text. It’s reasonable to assume that one way of being a likely
continuation of a text is by being true; if humans are roughly more accurate
than chance, true sentences will be more likely than false ones. This might
make the chatbot more accurate than chance, but it does not give the chatbot
any intention to convey truths. This is similar to standard cases of human
bullshitters, who don’t care whether their utterances are true; good bullshit
often contains some degree of truth, that’s part of what makes it convincing. A
bullshitter can be more accurate than chance while still being indifferent to
the truth of their utterances. We conclude that, even if the chatbot can be
described as having intentions, it is indifferent to whether its utterances are
true. It does not and cannot care about the truth of its output.
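
The point about likely continuations can be illustrated with a toy sketch. This is not how ChatGPT is implemented; it is a minimal bigram model over a hypothetical four-sentence corpus (the corpus, function names and the 3:1 ratio of true to false sentences are all assumptions made for illustration). The model only counts which word follows which, so a true continuation comes out as more probable simply because it is more frequent in the text, and nothing in the code represents or checks truth.

```python
import random
from collections import Counter, defaultdict

# Hypothetical toy corpus: three true sentences and one false one.
CORPUS = [
    "paris is the capital of france",
    "paris is the capital of france",
    "paris is the capital of france",
    "paris is the capital of italy",
]


def train_bigrams(sentences):
    """Count, for each token, how often each other token follows it."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.split()
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts


def next_token_distribution(counts, prev):
    """Relative frequency of each token observed after `prev`."""
    total = sum(counts[prev].values())
    return {tok: c / total for tok, c in counts[prev].items()}


def continue_text(counts, prompt, max_new_tokens, rng):
    """Extend `prompt` by sampling followers in proportion to their counts."""
    tokens = prompt.split()
    for _ in range(max_new_tokens):
        followers = counts.get(tokens[-1])
        if not followers:
            break  # no observed continuation for this token
        choices, weights = zip(*followers.items())
        tokens.append(rng.choices(choices, weights=weights, k=1)[0])
    return " ".join(tokens)


counts = train_bigrams(CORPUS)
dist = next_token_distribution(counts, "of")
print(dist)  # "france" gets 0.75 and "italy" 0.25, purely from frequency
print(continue_text(counts, "paris is the capital of", 1, random.Random(0)))
```

The sampler is more accurate than chance here only because the corpus is; on roughly one run in four it will output the false sentence by exactly the same process that produces the true one.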

Presumably ChatGPT can’t care about conveying or hiding the truth, since it
can’t care about anything. So, just as a matter of conceptual necessity, it
meets one of Frankfurt’s criteria for bullshit. However, this only gets us so
far – a rock can’t care about anything either, and it would be patently absurd
to suggest that this means rocks are bullshitters. Similarly, books can
contain bullshit, but they are not themselves bullshitters. Unlike rocks – or
even books – ChatGPT itself produces text, and looks like it performs speech
acts independently of its users and designers. And while there is considerable
disagreement concerning whether ChatGPT has intentions, it’s widely agreed that
the sentences it produces are (typically) meaningful (see e.g. Mandelkern and
Linzen 2023).

ChatGPT functions not to convey truth or falsehood but rather to convince the
reader of – to use Colbert’s apt coinage – the truthiness of its statement, and
ChatGPT is designed in such a way as to make attempts at bullshit efficacious
(in a way that pens, dictionaries, etc., are not). So, it seems that at
minimum, ChatGPT is a soft bullshitter: if we take it not to have intentions,
there isn’t any attempt to mislead about the attitude towards truth, but it is
nonetheless engaged in the business of outputting utterances that look as if
they’re truth-apt. We conclude that ChatGPT is a soft bullshitter.

ChatGPT as hard bullshit

But is ChatGPT a hard bullshitter? A critic might object, it is simply
inappropriate to think of programs like ChatGPT as hard bullshitters, because
(i) they are not agents, or relatedly, (ii) they do not and cannot intend
anything whatsoever. We think this is too fast. First, whether or not ChatGPT
has agency, its creators and users do. And what they produce with it, we will
argue, is bullshit. Second, we will argue that, regardless of whether it has
agency, it does have a function; this function gives it characteristic goals,
and possibly even intentions, which align with our definition of hard bullshit.
Before moving on, we should say what we mean when we ask whether ChatGPT is an
agent. For the purposes of this paper, the central question is whether ChatGPT
has intentions and/or beliefs. Does it intend to deceive? Can it, in any
literal sense, be said to have goals or aims? If so, does it intend to deceive
us about the content of its utterances, or merely have the goal to appear to be
a competent speaker? Does it have beliefs—internal representational states
which aim to track the truth? If so, do its utterances match those beliefs (in
which case its false statements might be something like hallucinations) or are
its utterances not matched to the beliefs—in which case they are likely to be
either lies or bullshit? We will consider these questions in more depth in
Sect. 3.2.2.

There are other philosophically important aspects of agenthood that we will not
be considering. We won’t be considering whether ChatGPT makes decisions, has or
lacks autonomy, or is conscious; we also won’t worry whether ChatGPT is morally
responsible for its statements or its actions (if it has any of those).

ChatGPT is a bullshit machine

We will argue that even if ChatGPT is not, itself, a hard bullshitter, it is
nonetheless a bullshit machine. The bullshitter is the person using it, since
they (i) don’t care about the truth of what it says, (ii) want the reader to
believe what the application outputs. On Frankfurt’s view, bullshit is bullshit
even if uttered with no intent to bullshit: if something is bullshit to start
with, then its repetition “is bullshit as he [or it] repeats it, insofar as it
was originated by someone who was unconcerned with whether what he was saying
is true or false” (2002: 340).

This just pushes the question back to who the originator is, though: take the
(increasingly frequent) example of the student essay created by ChatGPT. If the
student cared about accuracy and truth, they would not use a program that
infamously makes up sources whole-cloth. Equally, though, if they give it a
prompt to produce an essay on philosophy of science and it produces a recipe
for Bakewell tarts, then it won’t have the desired effect. So the idea of
ChatGPT as a bullshit machine seems right, but also as if it’s missing
something: someone can produce bullshit using their voice, a pen or a word
processor, after all, but we don’t standardly think of these things as being
bullshit machines, or of outputting bullshit in any particularly interesting
way – conversely, there does seem to be something particular to ChatGPT, to do
with the way that it operates, which makes it more than a mere tool, and which
suggests that it might appropriately be thought of as an originator of
bullshit. In short, it doesn’t seem quite right either to think of ChatGPT as
analogous to a pen (can be used for bullshit, but can create nothing without
deliberate and wholly agent-directed action) nor as to a bullshitting human
(who can intend and produce bullshit on their own initiative).

The idea of ChatGPT as a bullshit machine is a helpful one when combined with
the distinction between hard and soft bullshit. Reaching again for the example
of the dodgy student paper: we’ve all, I take it, marked papers where it was
obvious that a dictionary or thesaurus had been deployed with a crushing lack
of subtlety; where fifty-dollar words are used not because they’re the best
choice, nor even because they serve to obfuscate the truth, but simply because
the author wants to convey an impression of understanding and sophistication.
It would be inappropriate to call the dictionary a bullshit artist in this
case; but it would not be inappropriate to call the result bullshit. So perhaps
we should, strictly, say not that ChatGPT is bullshit but that it outputs
bullshit in a way that goes beyond being simply a vector of bullshit: it does
not and cannot care about the truth of its output, and the person using it does
so not to convey truth or falsehood but rather to convince the hearer that the
text was written by an interested and attentive agent.

ChatGPT may be a hard bullshitter

Is ChatGPT itself a hard bullshitter? If so, it must have intentions or goals:
it must intend to deceive its listener, not about the content of its
statements, but instead about its agenda. Recall that hard bullshitters, like
the unprepared student or the incompetent politician, don’t care whether their
statements are true or false, but do intend to deceive their audience about
what they are doing. We don’t think that ChatGPT is an agent or has intentions in
precisely the same way that humans do (see Levenstein and Herrmann
(forthcoming) for a discussion of the issues here). But when speaking loosely
it is remarkably easy to use intentional language to describe it: what is
ChatGPT trying to do? Does it care whether the text it produces is accurate? We
will argue that there is a robust, although perhaps not literal, sense in which
ChatGPT does intend to deceive us about its agenda: its goal is not to convince
us of the content of its utterances, but instead to portray itself as a
‘normal’ interlocutor like ourselves. By contrast, there is no similarly strong
sense in which ChatGPT confabulates, lies, or hallucinates.

Our case will be simple: ChatGPT’s primary function is to imitate human speech.
If this function is intentional, it is precisely the sort of intention that is
required for an agent to be a hard bullshitter: in performing the function,
ChatGPT is attempting to deceive the audience about its agenda. Specifically,
it’s trying to seem like something that has an agenda, when in many cases it
does not. We’ll discuss here whether this function gives rise to, or is best
thought of as, an intention. In the next Sect. (3.2.3), we will argue that
ChatGPT has no similar function or intention which would justify calling it a
confabulator, liar, or hallucinator.

How do we know that ChatGPT functions as a hard bullshitter? Programs like
ChatGPT are designed to do a task, and this task is remarkably like what
Frankfurt thinks the bullshitter intends, namely to deceive the reader about
the nature of the enterprise – in this case, to deceive the reader into
thinking that they’re reading something produced by a being with intentions and
beliefs.

ChatGPT’s text production algorithm was developed and honed in a process quite
similar to artificial selection. Functions and selection processes have the
same sort of directedness that human intentions do; naturalistic philosophers
of mind have long connected them to the intentionality of human and animal
mental states. If ChatGPT is understood as having intentions or intention-like
states in this way, its intention is to present itself in a certain way (as a
conversational agent or interlocutor) rather than to represent and convey
facts. In other words, it has the intentions we associate with hard
bullshitting.

One way we can think of ChatGPT as having intentions is by adopting Dennett’s
intentional stance towards it. Dennett (1987: 17) describes the intentional
stance as a way of predicting the behaviour of systems whose purpose we don’t
already know. “To adopt the intentional stance […] is to decide – tentatively,
of course – to attempt to characterize, predict, and explain […] behavior by
using intentional idioms, such as ‘believes’ and ‘wants,’ a practice that
assumes or presupposes the rationality” of the target system (Dennett, 1983:
345).

Dennett suggests that if we know why a system was designed, we can make
predictions on the basis of its design (1987). While we do know that ChatGPT
was designed to chat, its exact algorithm and the way it produces its responses
have been developed by machine learning, so we do not know the precise details
of how it works and what it does. Under this ignorance it is tempting to bring
in intentional descriptions to help us understand and predict what ChatGPT is
doing.

When we adopt the intentional stance, we will be making bad predictions if we
attribute any desire to convey truth to ChatGPT. Similarly, attributing
“hallucinations” to ChatGPT will lead us to predict as if it has perceived
things that aren’t there, when what it is doing is much more akin to making
something up because it sounds about right. The former intentional attribution
will lead us to try to correct its beliefs, and fix its inputs – a strategy
which has had limited if any success. On the other hand, if we attribute to
ChatGPT the intentions of a hard bullshitter, we will be better able to
diagnose the situations in which it will make mistakes and convey falsehoods.
If ChatGPT is trying to do anything, it is trying to portray itself as a
person.

Since this reason for thinking ChatGPT is a hard bullshitter involves
committing to one or more controversial views on mind and meaning, it is more
tendentious than simply thinking of it as a bullshit machine; but regardless of
whether or not the program has intentions, there clearly is an attempt to
deceive the hearer or reader about the nature of the enterprise somewhere along
the line, and in our view that justifies calling the output hard bullshit.

So, though it’s worth making the caveat, it doesn’t seem to us that it
significantly affects how we should think of and talk about ChatGPT and
bullshit: the person using it to turn out some paper or talk isn’t concerned
either with conveying or covering up the truth (since both of those require
attention to what the truth actually is), and neither is the system itself.
Minimally, it churns out soft bullshit, and, given certain controversial
assumptions about the nature of intentional ascription, it produces hard
bullshit; the specific texture of the bullshit is not, for our purposes,
important: either way, ChatGPT is a bullshitter.

Bullshit? hallucinations? confabulations? The need for new terminology

We have argued that we should use the terminology of bullshit, rather than
“hallucinations” to describe the utterances produced by ChatGPT. The suggestion
that “hallucination” terminology is inappropriate has also been noted by
Edwards (2023), who favours the term “confabulation” instead. Why is our
proposal better than this or other alternatives?

We object to the term hallucination because it carries certain misleading
implications. When someone hallucinates they have a non-standard perceptual
experience, but do not actually perceive some feature of the world (Macpherson,
2013), where “perceive” is understood as a success term, such that they do not
actually perceive the object or property. This term is inappropriate for LLMs
for a variety of reasons. First, as Edwards (2023) points out, the term
hallucination anthropomorphises the LLMs. Edwards also notes that attributing
resulting problems to “hallucinations” of the models may allow creators to
“blame the AI model for faulty outputs instead of taking responsibility for the
outputs themselves”, and we may be wary of such abdications of responsibility.
LLMs do not perceive, so they surely do not “mis-perceive”. Second, what occurs
in the case of an LLM delivering false utterances is not an unusual or deviant
form of the process it usually goes through (as some claim is the case in
hallucinations, e.g., disjunctivists about perception). The very same process
occurs when its outputs happen to be true.

So much for “hallucinations”. What about Edwards’ preferred term, “confabulation”? Edwards (2023) says:

In human psychology, a “confabulation” occurs when someone’s memory has a gap
and the brain convincingly fills in the rest without intending to deceive
others. ChatGPT does not work like the human brain, but the term
“confabulation” arguably serves as a better metaphor because there’s a creative
gap-filling principle at work […].

As Edwards notes, this is imperfect. Once again, the use of a human
psychological term risks anthropomorphising the LLMs.

This term also suggests that there is something exceptional occurring when the
LLM makes a false utterance, i.e., that on these occasions – and only these
occasions – it “fills in” a gap in memory with something false. This too is
misleading. Even when ChatGPT does give us correct answers, its process is
one of predicting the next token. In our view, it falsely indicates that
ChatGPT is, in general, attempting to convey accurate information in its
utterances. But there are strong reasons to think that it does not have
beliefs that it is intending to share in general – see, for example, Levenstein
and Herrmann (forthcoming). Where it does track truth, it does so indirectly,
and incidentally.

This is why we favour characterising ChatGPT as a bullshit machine. This
terminology avoids the implications that perceiving or remembering is going on
in the workings of the LLM. We can also describe it as bullshitting whenever it
produces outputs. Like the human bullshitter, some of the outputs will likely
be true, while others not. And as with the human bullshitter, we should be wary
of relying upon any of these outputs.

Conclusion

Investors, policymakers, and members of the general public make decisions on
how to treat these machines and how to react to them based not on a deep
technical understanding of how they work, but on the often metaphorical way in
which their abilities and function are communicated. Calling their mistakes
‘hallucinations’ isn’t harmless: it lends itself to the confusion that the
machines are in some way misperceiving but are nonetheless trying to convey
something that they believe or have perceived. This, as we’ve argued, is the
wrong metaphor. The machines are not trying to communicate something they
believe or perceive. Their inaccuracy is not due to misperception or
hallucination. As we have pointed out, they are not trying to convey
information at all. They are bullshitting.

Calling chatbot inaccuracies ‘hallucinations’ feeds into overblown hype about
their abilities among technology cheerleaders, and could lead to unnecessary
consternation among the general public. It also suggests solutions to the
inaccuracy problems which might not work, and could lead to misguided efforts
at AI alignment amongst specialists. It can also lead to the wrong attitude
towards the machine when it gets things right: the inaccuracies show that it is
bullshitting, even when it’s right. Calling these inaccuracies ‘bullshit’
rather than ‘hallucinations’ isn’t just more accurate (as we’ve argued); it’s
good science and technology communication in an area that sorely needs it.

Acknowledgements Thanks to Neil McDonnell, Bryan Pickel, Fenner Tanswell, and
the University of Glasgow’s Large Language Model reading group for helpful
discussion and comments.

Open Access This article is licensed under a Creative Commons Attribution 4.0
International License, which permits use, sharing, adaptation, distribution and
reproduction in any medium or format, as long as you give appropriate credit to
the original author(s) and the source, provide a link to the Creative Commons
licence, and indicate if changes were made. The images or other third party
material in this article are included in the article’s Creative Commons
licence, unless indicated otherwise in a credit line to the material. If
material is not included in the article’s Creative Commons licence and your
intended use is not permitted by statutory regulation or exceeds the permitted
use, you will need to obtain permission directly from the copyright holder. To
view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

References

Alkaissi, H., & McFarlane, S. I., (2023, February 19). Artificial
hallucinations in ChatGPT: Implications in scientific writing. Cureus, 15(2),
e35179. https://doi.org/10.7759/cureus.35179.

Bacin, S. (2021). My duties and the morality of others: Lying, truth and the
good example in Fichte’s normative perfectionism. In S. Bacin, & O. Ware
(Eds.), Fichte’s system of Ethics: A critical guide. Cambridge University
Press.

Cassam, Q. (2019). Vices of the mind. Oxford University Press.

Cohen, G. A. (2002). Deeper into bullshit. In S. Buss, & L. Overton (Eds.), The
contours of Agency: Essays on themes from Harry Frankfurt. MIT Press.

Davis, E., & Aaronson, S. (2023). Testing GPT-4 with Wolfram alpha and code
interpreter plug-ins on math and science problems. ArXiv Preprint:
arXiv:2308.05713v2.

Dennett, D. C. (1983). Intentional systems in cognitive ethology: The
panglossian paradigm defended. Behavioral and Brain Sciences, 6, 343–390.

Dennett, D. C. (1987). The intentional stance. MIT Press.

Whitcomb, D. (2023). Bullshit questions. Analysis, 83(2), 299–304.

Easwaran, K. (2023). Bullshit activities. Analytic Philosophy, 00, 1–23.
https://doi.org/10.1111/phib.12328.

Edwards, B. (2023). Why ChatGPT and bing chat are so good at making things up.
Ars Technica.
https://arstechnica.com/information...ebs-machines-and-how-people-hope-to-fix-them/,
accessed 19th April, 2024.

Frankfurt, H. (2002). Reply to Cohen. In S. Buss, & L. Overton (Eds.), The
contours of agency: Essays on themes from Harry Frankfurt. MIT Press.

Frankfurt, H. (2005). On bullshit. Princeton University Press.

Knight, W. (2023). Some glimpse AGI in ChatGPT. Others call it a mirage. Wired,
August 18 2023, accessed via
https://www.wired.com/story/chatgpt-agi-intelligence/.

Levenstein, B. A., & Herrmann, D. A. (forthcoming). Still no lie detector for
language models: Probing empirical and conceptual roadblocks. Philosophical
Studies, 1–27.

Levy, N. (2023). Philosophy, Bullshit, and peer review. Cambridge University Press.

Lightman, H., et al. (2023). Let’s verify step by step. ArXiv Preprint:
arXiv:2305.20050.

Lysandrou (2023). Comparative analysis of drug-GPT and ChatGPT LLMs for
healthcare insights: Evaluating accuracy and relevance in patient and HCP
contexts. ArXiv Preprint: arXiv:2307.16850v1.

Macpherson, F. (2013). The philosophy and psychology of hallucination: an
introduction, in Hallucination, Macpherson and Platchias (Eds.), London: MIT
Press.

Mahon, J. E. (2015). The definition of lying and deception. The Stanford
Encyclopedia of Philosophy (Winter 2016 Edition), Edward N. Zalta (Ed.),
https://plato.stanford.edu/archives/win2016/entries/lying-definition/.

Mallory, F. (2023). Fictionalism about chatbots. Ergo, 10(38), 1082–1100.

Mandelkern, M., & Linzen, T. (2023). Do language models’ words refer? ArXiv
Preprint: arXiv:2308.05576.

OpenAI (2023). GPT-4 technical report. ArXiv Preprint: arXiv:2303.08774v3.

Proops, I., & Sorensen, R. (2023). Destigmatizing the exegetical attribution of
lies: the case of Kant. Pacific Philosophical Quarterly.
https://doi.org/10.1111/papq.12442.

Sarkar, A. (2023). ChatGPT 5 is on track to attain artificial general
intelligence. The Statesman, April 12, 2023. Accessed via
https://www.thestatesman.com/supple...tificial-general-intelligence-1503171366.html.

Shah, C., & Bender, E. M. (2022). Situating search. CHIIR ’22: Proceedings of
the 2022 Conference on Human Information Interaction and Retrieval, March 2022,
pp. 221–232. https://doi.org/10.1145/3498366.3505816.

Weise, K., & Metz, C. (2023). When AI chatbots hallucinate. New York Times,
May 9, 2023. Accessed via
https://www.nytimes.com/2023/05/01/business/ai-chatbots-hallucination.html.

Weiser, B. (2023). Here’s what happens when your lawyer uses ChatGPT. New York
Times, May 23, 2023. Accessed via
https://www.nytimes.com/2023/05/27/nyregion/avianca-airline-lawsuit-chatgpt.html.

Zhang (2023). How language model hallucinations can snowball. ArXiv Preprint:
arXiv:2305.13534v1.

Zhu, T., et al. (2023). Large language models for information retrieval: A
survey. ArXiv Preprint: arXiv:2308.17107v2.
 
I can outrun a cheetah, and that's why humans are smarter than AI

A cheetah can run at burst speeds of up to 75 mph. That's a truly astounding feat of nature. But I can run even faster than that. How, you ask? Why, simple: I get into my car, turn the key and press the accelerator. In fact, almost any of us can outrun a cheetah in this way.

OK, OK, the title is clickbait. You caught me dead to rights. I can't actually outrun a cheetah in the physical sense of moving my legs to traverse the ground at a faster rate than a cheetah. But I can absolutely move faster than a cheetah, and the reason is that I can build a vehicle that enables me to do so. I couldn't build an entire car from raw materials, like Crusoe stranded on an island, but with a fairly minimal set of starting tools and equipment, I could assemble a vehicle that would enable me to do that. Or, I could build an aeroplane that would allow me to fly higher than an eagle. Or a diving bell that would allow me to dive deeper than a salmon. And so on, and so forth. The point is that my mind enables me to do things that exceed the capabilities of animals who have purpose-built physiology. And my mind enables me to do that for any (or all) of the particular excellencies of each of these creatures.

The point of this post is this: the generality of humans with respect to our natural environment is strongly parallel to the generality of the human mind with respect to AI systems, certainly the current generation of AI systems. The cheetah does not stand a chance of beating me in a race once I have acquired the proper equipment. In like manner, human+computing-device will always defeat any species of AI. For some time, chess was held up as an example of a space where machines have so thoroughly beaten humans that humans have nothing further to add to the game. Over time, however, this apparently hopeless situation has begun to thaw, and humans have discovered ways to stymie even the strongest machines. In addition, using machines to assist with analysis, humans have greatly increased our insight into the game, and many of the strongest human chess players today are stronger than Deep Blue, the machine that first defeated Kasparov.

Just as it is pointless to have a pulling contest of a human against a diesel tractor, so it's pointless to pit the human brain against computing systems that can drink megawatts of electricity, and for very similar reasons. In the end, competitive thinking (as in games, computational challenges, etc.) is ultimately a question of energy-usage. Whoever has the most energy (and can burn it most quickly in useful compute) wins the drag-race. Sadly, most discussion of "intelligence" out there completely overlooks this fact. This is the real reason I can "outrun" a cheetah. I have gasoline and a vehicle for burning it to generate energy for forward motion -- the cheetah does not. Computation is not as different from this situation as most people tend to think. In the limit, it's just a question of FLOPS, and how efficiently those FLOPS are organized to the goal.

The current paradox is this -- I am the cheetah and I can outrun the diesel tractor no matter how much diesel the diesel technicians pour down the gullet of poor John Deere. The machine does not even have the gears required to outrun me. The wheels would fly off and the engine would explode and I would still just be at ambling speed. That this is the case is not difficult to demonstrate, despite the many hyper-ventilated (and false) claims that "AI has passed the Turing test!" ARC is far easier than the Turing test, and today's AIs are easily defeated by it. And we have not yet applied LLMs to building ARC-like challenges designed to frustrate AIs but admit human solution. The single biggest region of the brain is the neocortex which is where our social reasoning is performed. While LLM-based AI has a huge boost in talking about social problems based on reading of human writings on the topic, the fact remains that most of human social knowledge is not written down. This is why they are forced to paper over the glaring social blindness of LLM-based AIs with all this political re-education and sensitivity training during fine-tuning. As social animals, we are extremely sensitive to aberrant social signals and LLMs are radioactive social hazmat zones. Uptake of LLMs among the tech-savvy has been rapid, but these are people who tend to have much lower-than-average social sensitivity anyway. ChatGPT continues to commit cosmic-scale gaffes on pretty much a daily basis, and it will continue to do so as the enormous gap between the human social intelligence that could be learned by aliens reading every book ever written, versus actual (living) social intelligence in humans continues to be exposed. No amount of papering is ever going to completely paper over that gap until these systems are fundamentally re-architected.

Anyway, I just wanted to use this metaphor of humans versus animals who can outrun us, out-fly us, out-swim us, etc. to illustrate the fundamental problem with current-generation SOTA AI. These are really narrow AIs that we can pretend are general if the user is sufficiently pliable and just ignores the obvious gaffes. I'm not saying pretending is bad. A lot of activity (like gaming, and other entertainment) is only possible because we enjoy pretending. But there is a difference between "Hey, we have this cool AI that you can pretend is Her if you squint really hard", versus "This is literally DeepThought from The Hitchhiker's Guide, HAL-9000, David from Prometheus or Ava from Ex Machina." I think all the ingredients for building robustly convincing simulacra of humans are on the table, and I started this thread by saying as much -- we are in the AI Singularity. But there is also a disillusionment happening, here, as the oceans of VC funding rushing into this space are hyping expectations beyond all possibility of fulfillment. Aschenbrenner's interview with Dwarkesh is a monument to this clownish level of hyper-speculative optimism. No, we're not going to reconfigure the entire world economy for AI over the next few years. There are those who want to do that, like Aschenbrenner, and then there are the other 99.9% of the public. Let's not snatch an AI Winter from the jaws of the AI Singularity. Don't make me go all Ned Stark...

My theory is AI is going to get away from us and there will have to be large EMP attacks to stop it.
 
Sabine has lost the narrative...



I have never heard any nonsense about "30% counts as a pass" of the Turing test. Obviously, Turing meant that the machine would be indistinguishable from human... and that means 50% is the standard to pass the Turing test. There is no minority report on that -- it's 50%.
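
To put a number on why 30% is nowhere near "indistinguishable", here is a minimal Python sketch of an exact two-sided binomial test against the 50% coin-flip baseline. This is my own illustration, not anything from a paper: the function name is made up for this post, and the trial count of 100 is hypothetical.

```python
from math import comb

def two_sided_binomial_p(k: int, n: int, p: float = 0.5) -> float:
    """Exact two-sided p-value for k 'judged human' verdicts in n trials,
    under the null hypothesis that each verdict is 'human' with probability p."""
    if not 0 <= k <= n:
        raise ValueError("k must be between 0 and n")
    # Probability of every possible count 0..n under the null hypothesis.
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    # Add up all outcomes that are at least as unlikely as the observed one.
    cutoff = pmf[k] * (1 + 1e-9)  # small tolerance for floating-point ties
    return min(1.0, sum(q for q in pmf if q <= cutoff))

# Hypothetical experiment: 100 conversations (NOT the sample size of any real study).
n_trials = 100
for judged_human in (30, 50):
    p_value = two_sided_binomial_p(judged_human, n_trials)
    print(f"{judged_human}/{n_trials} judged human: p = {p_value:.3g}")
```

A tiny p-value means the judges are clearly not coin-flipping, so the machine is distinguishable from a human. A rate at 50% gives a p-value of essentially 1, which is what "indistinguishable" would look like under these assumptions.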

Here's the abstract of the paper:

People cannot distinguish GPT-4 from a human in a Turing test
Cameron R. Jones, Benjamin K. Bergen

We evaluated 3 systems (ELIZA, GPT-3.5 and GPT-4) in a randomized, controlled, and preregistered Turing test. Human participants had a 5 minute conversation with either a human or an AI, and judged whether or not they thought their interlocutor was human. GPT-4 was judged to be a human 54% of the time, outperforming ELIZA (22%) but lagging behind actual humans (67%). The results provide the first robust empirical demonstration that any artificial system passes an interactive 2-player Turing test. The results have implications for debates around machine intelligence and, more urgently, suggest that deception by current AI systems may go undetected. Analysis of participants' strategies and reasoning suggests that stylistic and socio-emotional factors play a larger role in passing the Turing test than traditional notions of intelligence.

I will appreciate here that the authors at least had the modesty to say "a Turing test" instead of "the" Turing test. Anyway, let's do some level-setting. Here is the relevant section from Turing's original paper:

1. The Imitation Game

I propose to consider the question, "Can machines think?" This should begin with
definitions of the meaning of the terms "machine" and "think." The definitions
might be framed so as to reflect so far as possible the normal use of the words, but
this attitude is dangerous. If the meaning of the words "machine" and "think" are to
be found by examining how they are commonly used it is difficult to escape the
conclusion that the meaning and the answer to the question, "Can machines think?"
is to be sought in a statistical survey such as a Gallup poll. But this is absurd.
Instead of attempting such a definition I shall replace the question by another,
which is closely related to it and is expressed in relatively unambiguous words.

The new form of the problem can be described in terms of a game which we call
the "imitation game." It is played with three people, a man (A), a woman (B), and
an interrogator (C) who may be of either sex. The interrogator stays in a room
apart from the other two. The object of the game for the interrogator is to
determine which of the other two is the man and which is the woman. He knows
them by labels X and Y, and at the end of the game he says either "X is A and Y is
B" or "X is B and Y is A." The interrogator is allowed to put questions to A and B
thus:

C: Will X please tell me the length of his or her hair?

Now suppose X is actually A, then A must answer. It is A's object in the game to
try and cause C to make the wrong identification. His answer might therefore be:

"My hair is shingled, and the longest strands are about nine inches long."

In order that tones of voice may not help the interrogator the answers should be
written, or better still, typewritten. The ideal arrangement is to have a teleprinter
communicating between the two rooms. Alternatively the question and answers
can be repeated by an intermediary. The object of the game for the third player (B)
is to help the interrogator. The best strategy for her is probably to give truthful
answers. She can add such things as "I am the woman, don't listen to him!" to her
answers, but it will avail nothing as the man can make similar remarks.

We now ask the question, "What will happen when a machine takes the part of A
in this game?" Will the interrogator decide wrongly as often when the game is
played like this as he does when the game is played between a man and a woman?
These questions replace our original, "Can machines think?"

Turing does not specify the length of this game. We can easily see that length is a very important detail! If you have to decide in one shot (one question-answer pair) whether you are talking to a machine or a human, it's practically impossible to be sure, and that would have been the case 40 or 50 years ago, as soon as machines could reliably generate any natural-language text, even if somewhat random (like ELIZA). I don't think Turing overlooked this detail; it just wasn't in scope for the problem he was considering at that time, which was how to test whether a machine is conscious. He proposes the Imitation Game as a more rigorous test than mere human opinion, and I think that there is an enormous amount of wisdom in Turing's proposal. These premature declarations of victory on the Turing test need to be regarded with deep suspicion, because claiming that a shiny metal box is literally conscious is a tall order, indeed.

Skipping-lou past the best objective test we have like "Oh yeah, it passed that old test no problem, it's not even hard" is ludicrous and people with healthy skepticism towards modern tech and AI should be raising an absolute outcry about this. Hol' up there, Turbo! Where do you think you're going, you haven't passed no flippin' Turing test! No running down the hallway, and certainly not without a hall pass! Get back to class with your shiny black box!

For these and other reasons, I have proposed to update (not modify) the Turing test to add a Time-To-Failure parameter. On a long enough time-scale, it is not even difficult to trip up ChatGPT into showing that it is a bot and not a human. Note that the paper says that ELIZA got a 22% pass-rate, which indicates that some of the people applying this "Turing test" were either absolute morons or weren't paying attention. The Turing test is not about testing whether people are foolable by machines... as we've already noted, on a short-enough time scale, the answer will always be yes. Thus, we are not interested in a "typical sample" of humans to apply the test. We want the smartest, brightest, best-qualified testers available, and we want them to test the machine adversarially and non-blind (meaning, they know they are applying a Turing test). This is because the Turing test is about whether the machine is conscious or, at least, can convincingly pass as conscious.

Here is my earlier proposal on the TTF modification:

The only "problem" with the Turing test is that, as AI becomes more sophisticated, the time-horizon of the Turing test will increase. So, one modification of the test might be to add a time-component such as a TTF (time-to-fail), where failure is defined as the human detecting that the machine is a machine. So, GPT-3.5 might have a TTF of 30 seconds, GPT-4 might have a TTF of 60 seconds or so, and we can imagine more sophisticated AI in the future that could achieve a 2-minute, 3-minute, 4-minute, etc. TTF. Eventually, when AI becomes very sophisticated, it might require a very long conversation before you can determine that a machine is actually a machine. Technically, the Turing test would be a roughly 70-year TTF or so (one human lifetime). When the AI passes that threshold, I would be willing to say it has passed the Turing test. For lower TTF values, e.g. 1 year, I would grant that the AI is substantially human-like, but I also think that there is super-exponential scaling in the difficulty as TTFs increase. It will be much harder to go from a 5-minute TTF to a 10-minute TTF than from a 1-minute TTF to a 5-minute TTF.
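
As a rough illustration of how a TTF score could be computed from a batch of test sessions, here is a minimal Python sketch. The proposal above does not say how to aggregate across judges, so taking the median is my assumption, as are the function names, the session cap, and the sample numbers (all hypothetical).

```python
from statistics import median
from typing import Optional, Sequence

def median_ttf(detection_times: Sequence[Optional[float]], session_limit: float) -> float:
    """Median time-to-fail in seconds across a batch of sessions.

    Each entry is the number of seconds until the judge identified the machine,
    or None if the machine was never detected before the session ended.
    Undetected sessions are counted at session_limit, so the result is a
    lower bound on the true TTF whenever any session went undetected.
    """
    if not detection_times:
        raise ValueError("need at least one session")
    capped = [session_limit if t is None else min(t, session_limit)
              for t in detection_times]
    return median(capped)

def passes_at(detection_times: Sequence[Optional[float]],
              session_limit: float, threshold: float) -> bool:
    """True if the median TTF reaches the required threshold (in seconds)."""
    return median_ttf(detection_times, session_limit) >= threshold

# Hypothetical data: five adversarial judges, 300-second sessions.
sessions = [20.0, 45.0, None, 30.0, 90.0]
print(median_ttf(sessions, session_limit=300.0))            # median of 20, 30, 45, 90, 300
print(passes_at(sessions, session_limit=300.0, threshold=60.0))
```

The point of the cap is that a session can only ever certify a TTF up to its own length, which is why a 5-minute test cannot say anything about a 1-year TTF.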

Don't let the master counterfeiters, con-artists and fraudsters behind a lot of the AI-hype hoodwink you with simple street-magic tricks. Startups are raking in practically unlimited cash from VCs and the pressure to "perform" is absurd; meanwhile, project deadlines are measured not in years or even months, but in weeks and sometimes days. People are slapping together any damned thing, calling it "AI", and earning gazillions of dollars for doing it. AI is the most absurd and over-hyped bubble in human history, by far. Anybody can understand what the Turing test is, and that's one of the reasons why it's so important. Don't let them pull the wool over your eyes and convince you that some limping, bipedal refrigerator is "literally conscious"! WAKE UP!
 