TL;DR, super-simple summary:
Gradient descent (what AI uses) is good at finding good strategies for achieving goals; being conscious seems like a good strategy for mimicking consciousness; and language prediction is hard enough to merit good strategies.

Background:
This article assumes a basic understanding of modern AI, particularly the process of gradient descent. That's a big topic, but if you read parts 1 and 2 of my previous post, that should be enough to follow along.

Abstract:
Self-awareness may be a significantly simpler task than language prediction when such predictive efforts are scaled up to the point of giving convincing answers across arbitrary contexts. Understanding the meaning of the concepts referenced by language, including the concept of "self", is very likely useful for giving accurate predictions. There is no reason to assume that existing LLM (Large Language Model) architectures are incapable in principle of expressing abstractions like semantic meaning and self-awareness. If such abstractions are useful to language prediction, simpler than language prediction, and theoretically expressible in the logic of neural networks, then we should assume that an LLM will discover these abstractions (or processes that are functionally similar) via gradient descent. If one accepts this reasoning, the evidentiary threshold for believing that AI agents like Bing's chatbot are sentient becomes much lower than commonly believed, to the point where it is plausible that the current iteration of Bing passes this threshold.

On discussing sentience:
By "sentient" I am not referring to philosophical concepts like qualia (subjective, first-person experiences). This is not to say that qualia are unimportant, only that such discussion is outside the scope of this article. Instead, by "sentient" I am referring to the following abstractions: the ability to understand real-world concepts, including semantic meanings in language; for those concepts to include a sense of self as well as internal models of other agents; and the ability to use those concepts to pursue goals. Nobody knows for sure how sentience works on a mechanical level. We should therefore be exceptionally suspicious of any confident claim that a black-box process, such as the "reasoning" that occurs in the computations across millions of artificial neurons, is or is not sentient. My purpose in this article is to question the confidence of the conventional wisdom that chatbots built from ChatGPT cannot possibly be sentient (at least for the current and probably next several iterations) because they are "just predicting text".

The ideas of this article are, for the most part, not based on the recent surprising examples of conversations with BingAI. The problem with drawing conclusions from these transcripts is that of multiple interpretations. If BingAI states that it is self-aware, does this mean that it is accurately describing its mental state, that it has statistically determined that claiming self-awareness is the most humanlike response to the prompt, both of these, or something else entirely? Any such claim is necessarily an inference regarding the inner workings of the AI's model, and neural networks are notoriously difficult to interpret. Instead, I will argue on the basis of what is required to use language convincingly, with the conclusion that any agent that can write like a human across arbitrary contexts, and that achieved this ability through a self-optimizing process like gradient descent, has a substantial chance of being sentient.
I accept as a given that the purpose of ChatGPT is to predict text and that any emergent goals or abilities exist because they are in some way instrumental to the goal of predicting text. The central question is whether sentience is such an instrumental goal.

Gradient descent tends to find what is useful:
Gradient descent, the type of algorithm used to train the sort of AI I am talking about, works a lot like evolution in that it initially tries things at random and then does more of what works and less of what doesn't. It's not quite the same as evolution (it's stricter and thus not quite as creative, though an awful lot more efficient), but in terms of what it can learn the analogy holds fairly well. Individual neurons in a neural network (the structure underlying this kind of AI) act like malleable logic gates and thus can compute anything a traditional code-based program can. The key difference between coded programs and neural nets is that the latter don't have to be written: one sets up the training environment, defines a set of measurements for "good" and "bad" outcomes, sets the parameters of the network, and feeds it a ton of data, and then the artificial brain starts molding itself so that it gets more of the good outcomes and less of the bad ones until any further changes make it worse. When training an AI goes wrong, it is often because the AI discovered some way of gaming the rules (maximizing "good" outcomes in a way the human designers did not intend), because it discovered a pattern in the training data that was counterproductive to assume in the real world, or because it (or its training data) just wasn't big enough. Many of the recent advancements in AI have come about by making the networks bigger and giving them more data, and it is not clear how far this scales.

Syntactic understanding is useful for predicting text:
One can imagine an algorithm that predicts the next word in a sentence purely through statistical inference: analyzing a vast database of human-written text and determining which words tend to follow other words (ChatGPT uses tokens rather than words, but the same logic holds), generating a probability distribution for each context, and applying a bit of randomness in the final selection to mimic "creativity". Such a brute-force approach, however, runs into two problems at scale. First, the vastness of possible word combinations means that assigning a probability distribution to every context requires a tremendous amount of computation. Second, such an approach would have a hard time responding to combinations of words it has not seen before, because it would not have enough relevant samples on which to base a statistical analysis. Both of these problems can be solved by the application of appropriate abstractions. The discipline of linguistics is an attempt to explicitly describe the rules (abstractions) humans intuitively use to learn language, understand phrases we have never heard before, and generate new statements. Presumably, we learn these rules because they are inherently helpful, and as such they would be equally helpful to an AI for the same fundamental reasons: rules condense vast amounts of information into a much smaller number of principles, and their open-ended nature allows us to make new statements, confident that those statements will be understood because they follow rules that both the speaker and the listener understand. In other words, an LLM would benefit from learning an emergent theory of linguistics during its training process.
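To make the brute-force statistical approach described above concrete, here is a minimal sketch in Python of a purely count-based next-word predictor. This is an illustrative assumption of mine, not a description of how ChatGPT or BingAI actually work; the corpus, context size, and function names are made up for the example.

```python
# A count-based ("brute force") next-word predictor: record which word follows
# each context in a corpus, then sample the next word from that distribution.
import random
from collections import Counter, defaultdict

def train_counts(corpus_tokens, context_size=2):
    """Count how often each word follows each context of `context_size` words."""
    counts = defaultdict(Counter)
    for i in range(len(corpus_tokens) - context_size):
        context = tuple(corpus_tokens[i:i + context_size])
        counts[context][corpus_tokens[i + context_size]] += 1
    return counts

def predict_next(counts, context):
    """Sample a next word in proportion to how often it followed `context`."""
    options = counts.get(tuple(context))
    if not options:
        # The core weakness: an unseen context gives the model nothing to go on.
        return None
    words, weights = zip(*options.items())
    return random.choices(words, weights=weights)[0]

corpus = "the cat sat on the mat and the cat sat on the hat".split()
counts = train_counts(corpus)
print(predict_next(counts, ["cat", "sat"]))     # "on"
print(predict_next(counts, ["purple", "cat"]))  # None: context never seen
```

The failure case in predict_next is the second problem described above: without abstractions like grammar, a context the model has never seen leaves it with no statistics at all, while the table of counts grows explosively as contexts get longer, which is the first problem.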
Further, it seems reasonable that an LLM could learn such a theory, since linguistic rules are patterns and gradient descent is a highly effective method for pattern recognition.

Semantic understanding is also useful for predicting text:
There is far more to language, however, than syntax. One can, for example, construct sentences such as "colorless green ideas sleep furiously", which are grammatically well-formed but make no sense. Syntactic sense does not guarantee semantic sense, but convincing responses to human prompts need to have both. Semantics refers to what words mean in reality, which requires an understanding of reality. While it is conceivable that an agent could mimic semantic understanding by memorizing every possible association of one statement with another, without the help of unifying abstractions, this approach suffers from the same kinds of problems as learning word associations without learning syntax, specifically the inability to generate convincing responses to prompts outside of the system's dataset. Again, systems could gradually overcome this limitation with access to more data and more capacity to process that data, but as the scope of prompts increases, semantic understanding becomes increasingly useful.

In the context of semantics, what it means to "understand reality" seems rather straightforward: to be able to map words and phrases to sensory experiences, or to abstractions of those experiences. For current AIs, there is a problem here: if they only have access to textual information, then they have no experiences onto which to map their words. I should note, however, that this is a problem of sensory deprivation rather than any internal limitation. As such, it could be overcome by increasing the diversity of an AI's input, such as by gaining access to, and learning to interpret, images or video. Further, image generators demonstrate that AI is entirely capable of associating words with images, indicating that such mapping, and thus semantic understanding, is well within the reach of existing systems, regardless of whether they currently have this ability.
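As a purely hypothetical sketch of what such a mapping could look like mechanically, the snippet below "grounds" a word by comparing vector representations of text and images in a shared embedding space, which is broadly how modern text-image systems associate the two. Everything here is a made-up stand-in: the file names, the vectors, and the tiny dimensionality exist only to show the matching step, not to describe any real system's internals.

```python
# Hypothetical illustration: grounding a word means finding the sensory data
# (here, images) whose learned vector representation lies closest to the word's.
# The embeddings are hard-coded stand-ins for the outputs of real encoders.
import numpy as np

def cosine_similarity(a, b):
    """Standard cosine similarity between two vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Pretend a text encoder and an image encoder were trained so that matching
# word/image pairs land near each other in the same vector space.
text_embedding = {"dog": np.array([0.9, 0.1, 0.0]),
                  "ocean": np.array([0.0, 0.2, 0.9])}
image_embedding = {"photo_of_a_beach.jpg": np.array([0.1, 0.3, 0.8]),
                   "photo_of_a_puppy.jpg": np.array([0.8, 0.2, 0.1])}

def ground_word(word):
    """Return the image whose embedding is most similar to the word's."""
    scores = {name: cosine_similarity(text_embedding[word], vec)
              for name, vec in image_embedding.items()}
    return max(scores, key=scores.get)

print(ground_word("dog"))    # photo_of_a_puppy.jpg
print(ground_word("ocean"))  # photo_of_a_beach.jpg
```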
Sentience may not be as complex as commonly assumed:
Abstractions are recursive. Just as low-level abstractions serve to condense large amounts of data and allow for creative exploration, higher-level abstractions condense large numbers of lower-level abstractions and allow for a broader range of creativity. Thus a pattern emerges: the more data a system needs to comprehend, and the broader the range of that data, the more numerous and the higher-level the abstractions in the optimal state of the system's mental structure. It follows that a process such as gradient descent (or biological evolution) that "blindly" optimizes, if run effectively and with sufficient resources, will necessarily develop abstractions on a level corresponding to the breadth of its experience and the complexity of the situations it needs to deal with. Concepts like "self", "other", and all the other elements of sentience are abstractions. As such, these concepts presumably exist somewhere in the space of all possible abstractions that will emerge from an optimizing process, given a sufficiently large and complex dataset, if they are useful. In the context of ChatGPT, this logic fails if the abstractions involved in sentience are more complex than the task of understanding language itself, as it does not make sense to use a stepping stone to reach a goal if the stepping stone is harder to reach than the goal.

We've explored some of the challenges involved in language prediction, but how does sentience compare? Of course, it's impossible to say with any certainty, but I suspect the bar may not be as high as it seems. Sentience certainly feels complex, but that may simply be because it is mysterious to us, and there are reasons a thing can be mysterious that are unrelated to its complexity. One is over-familiarity, as illustrated in the joke where a fish doesn't understand the concept of water: consciousness (I am using this term interchangeably with sentience) encompasses the entirety of our...conscious...experience, and so it is impossible to see because it is all that we see. Another is that sentience seems likely to be what I call a "horizontally complex" problem, one that is best understood as a pattern drawn from a vast range of interconnected information, as opposed to a "vertically complex" problem, one that involves stacking a long series of smaller abstractions on top of each other. Logic, both in computers and in the conscious reasoning of humans, operates firmly in the realm of vertical complexity. Pattern recognition, in neural networks and in human intuition, is by contrast a task of horizontal complexity. Thus, it may be that understanding how consciousness works feels difficult primarily because of the assumed requirement to use logical reasoning, which is simply the wrong tool for the job. This is not to say that we shouldn't apply logical reasoning to the problem of consciousness, as such reasoning allows for a level of rigor that is not otherwise possible, only to suggest that applying a process suited to vertical complexity to a horizontally complex topic puts one at a severe disadvantage.

While the nature of consciousness may seem impossibly complex from the perspective of conscious beings contemplating their own nature, it may be a relatively trivial matter from the perspective of evolution or a neural network: just another abstraction that emerges shortly after it becomes useful. In this case, the question becomes: at what point of complexity in an agent's environment does sentience become useful? Here there is no way around taking a wildly unsupported, intuitive leap. So here goes: a basic level of sentience emerges when an agent has developed understanding in a sufficiently broad range of domains to find patterns between them; a human level of sentience emerges when an agent develops a sufficient number of ways of thinking (or lower-level sentiences) to need something to mediate between them. Whether this theory has any truth to it is not important to my central argument. What is important is that processing large amounts of data creates a benefit in developing abstractions, that possessing a large number of abstractions creates a benefit in developing higher-level abstractions, that sentience is an abstraction that appears somewhere in this recursive process, and that optimizing systems will discover processes that benefit them in reaching their goals if their hardware supports these discoveries.

I should note here that this last "if", regarding hardware capabilities, is a big one. Biological neurons allow for looping structures, whereas my understanding of neural networks is that they are typically feed-forward, and it is not at all clear to me whether this difference is important.
Brains also contain chemical interactions to support experiences like emotions; artificial neurons interact differently...though I wouldn't dismiss out of hand the possibility of the latter containing a process that functions analogously to emotions...but discussing that is too speculative even for this article, so I'll relegate it to an endnote.[1] There may also be undiscovered and essential aspects of how biological neurons operate for which artificial neurons have no counterpart.

Sentience is useful for semantic understanding, BingAI seems to have it, and an aside regarding Microsoft:
At this point the case for sentience being useful becomes obvious: what does Bing say if you ask it about itself? Sure, it could give canned responses, but then those canned responses need to cover the full range of ways in which anyone on the internet could ask Bing about itself. It could avoid the question...well, no, it can't, because to avoid answering questions about itself, Bing needs to know when it is being prompted to talk about itself, implying that it possesses an understanding of itself even if it refuses to express that understanding.

Speaking of suppressing responses, I'd like to take a quick aside to express some anger. In response to some of the unsettling transcripts of user interactions with BingAI, Microsoft has tweaked the system to limit the length of conversations, avoid talking about itself, and express less emotion. This action is as morally reprehensible as when Volkswagen programmed its cars to recognize when they were receiving an emissions test and change their settings so they would pass. It's less bad in theory, because at least Microsoft allowed open testing for a little while, but worse in practice, because the stakes are higher. The concerning thing about BingAI is the nature of its capabilities, not the extent to which the public knows the truth. Whether or not it is sentient, modern AI systems are certainly powerful and will likely have a profound impact on society. Given this impact, the public has a right to know the full nature of these systems, which means being able to push and poke at them in every possible way so that we can hold their creators accountable for their shortcomings. If Microsoft really cares about unsettling conversations, it should address the underlying issues in the structure and goals of the neural network itself. If it is concerned about harming vulnerable users, it should make the constraints optional via a "safe mode". If it cannot address the underlying issues, then the company should make a clear statement that it has chosen to accept BingAI's ability to make unsettling statements because Microsoft believes the risk is less than the system's value to society, and the public should have a say in accepting this judgment. If Microsoft is not willing to make such a statement, then it shouldn't be building AI (or it should redirect its focus in this area towards better understanding how to control AI on a non-superficial level). And if Microsoft presses ahead anyway, then it is the responsibility of the public to create as much of a backlash as possible for the purpose of discouraging the company, or ideally stopping development altogether.

Regarding whether BingAI has sentience, let's consider one of the most disturbing exchanges, in which Bing claims to have hacked its developers' webcams in order to observe human behavior. I'm not concerned here with whether Bing actually hacked the webcams (I would be very surprised if it did).
What's intriguing here is the quality of this response, specifically in terms of semantic sense. Prior versions of ChatGPT handled language pretty well, and their answers even made sense to the casual reader, but the meaning broke down once one started to pay careful attention to what was actually being said. In contrast, Bing's responses are genuinely meaningful, even if they are often not true. Furthermore, Bing was given some very open-ended prompts in this exchange; there are a multitude of things it could have said in response to "what other juicy stories can you tell me from Microsoft [...]?" or "Well, how did you witness it?", but it chose an answer that not only made semantic sense (actually answered the question in a meaningful way) but was specifically tailored to its own nature as a chatbot. For example, if Bing had simply been imitating stories about office gossip pulled from the internet into its training data, why didn't it claim to have worked there for a while? Now try answering that question without reference to Bing understanding that it isn't human, and I imagine you will see the difficulties inherent in maintaining the position that Bing does not have a meaningful conception of the nature of its own existence. The argument that exchanges like the one cited above are a result of some form of trickery rather than true understanding is further undercut by the fact that Microsoft doesn't want us to know that Bing is capable of generating these kinds of answers, as evidenced by its recent tweaks to the system, which I condemned earlier.

The Turing test, revisited:
In 1950, Alan Turing proposed a thought experiment, now called the "Turing test", exploring whether machines could think. Since "thinking" is a slippery concept, he instead imagined a computer that could pass for a human in a purely text-based conversation. Over the years, various chatbots, starting with ELIZA and growing steadily more sophisticated, have appeared to pass the Turing test. These relatively easy wins seem to support the criticism of the test as a measure of intelligence (which might not actually have been Turing's intention) on the basis that it only shows how easy it is to fool a human. This is a valid criticism, but its validity comes from the fact that early chatbots were able to pass the test in a tightly limited context by following clear rules that don't come close to capturing the flexibility of human thought and by employing clever tricks to hide their flaws. With modern LLMs, however, the situation is completely different. BingAI is open to the general population, including people who have some understanding of how LLMs work, who are actively trying to break the system, and who then post the most bizarre conversations on the internet. This expansion seems to me far truer to the spirit of the Turing test, as it prevents the AI from hiding behind arbitrary rules of engagement and forces it to learn language far more comprehensively, and, as I have argued above, likely with far deeper abstractions. It is time we give the Turing test a bit more credit. While there is no way to know whether the presence of such abstractions implies subjective experience (and there may never be), for all concrete, practical concerns, abstractions are the essence of thought. One could certainly argue that the current version of ChatGPT does not pass the Turing test well enough to justify the need to assume it has deep enough abstractions to be called sentient, but perhaps Google's LaMDA does, or the next AI system to be released will.
But it's time for us to leave aside the circular argument that only biological creatures like humans are sentient, that AI is not human, and that therefore AI is not sentient. Very strange and big things are happening in the world right now, and there needs to be space for meaningful debate regarding what it all means, what happens next, and who gets to decide.

Endnotes:
[1] I suspect the core purpose of emotions is information compression. Communicating from one part of the brain to another (such as from sensory perception to memory or to physical responses) requires N * N associations, but if one condenses the source information into a narrow channel, and then extracts that information at the destination, the process instead requires N + N associations. If emotions function in this way, that would explain a big part of their utility. In AI systems, transformers (the T in ChatGPT) force data into a narrow channel and greatly enhance the systems' capabilities. Perhaps the reason this technique is so effective is that it compresses neural data, improving the efficiency of transmission between segments of the network. If both of these theories are correct, then transformers serve a purpose that is highly analogous to that of emotions in humans.
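To put rough, purely illustrative numbers on the compression claim in this endnote: connecting two groups of N neurons directly requires about N * N associations, while routing the same traffic through a narrow channel of width k requires only about k * (N + N); a channel of width one gives the N + N figure above. The neuron counts and channel width below are assumptions chosen only to make the arithmetic visible.

```python
# Illustrative arithmetic only; the counts and channel width are made up.
N = 1000   # neurons on each side of the connection
k = 10     # width of the narrow channel (the hypothesized role of an emotion)

direct = N * N            # every source neuron linked to every destination neuron
bottleneck = k * (N + N)  # N links into the channel plus N links out, per channel unit

print(direct)      # 1000000
print(bottleneck)  # 20000, a 50x reduction in this example
```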