Editor note, March 27th, 2024: This post used to be titled "Neural Networks, Lensa, and the Past, Present, and Future of Computing." A bit over a year later, Lensa isn't really a big deal anymore, but the bulk of the post, which is about how AI works, still holds up. So I've revised the post to focus on explanation, added some new material in that vein, and pulled the political stuff into a separate post.
AI has been improving so fast that I'm not even going to bother linking to examples of what it can do because those will be irrelevant by the time you read this. The core principles of how the technology works, however, are still relevant, since technology builds on itself and new innovations rely on the old. As such, I expect this gentle-yet-thorough explanation of how AI works to become increasingly incomplete over time, but not necessarily obsolete. This post may contain some minor inaccuracies because my focus is on creating a useful overall mental picture (and also because I learned most of this quite a while ago). I will be avoiding jargon wherever possible, or taking a moment to define it when I have to. To describe my background on the subject, and for anyone who prefers video content over text, here are a couple of explanatory videos of neural network visualizers I built a couple years ago:
Part 1: how computers work. Included as background and as a point of contrast for what comes next. It all starts with the humble transistor. For our purposes, you can think of it like a light switch, but instead of using your finger to move a lever that connects and disconnects a wire, you apply high or low voltage to a control point. Connect a few of these transistors together and you can make a variety of simple circuits, such as logic gates. With an AND gate, for example, electricity passes through only if you apply high voltage to both of the control points. There are also OR gates (either point = good to go), NOT (output is the opposite of the input), XOR (one or the other, not both), and a few others. Although these gates are ultimately just groups of transistors, treating them as basic units, or building blocks, makes it easier to design more complex circuits, such as latches (which have two stable states and can thus store a bit of information), counters, adders, and so on. These circuits can in turn be thought of as building blocks, simplifying the design of even more complex circuits.

Now imagine a bunch of these electrical circuits arranged on a grid. Each of them is connected by a switchboard (another basic circuit), so that you can send a signal to any one of the circuits by using part of the signal as an identifier saying which circuit the rest of the signal should be directed towards. Now imagine a giant grid of data, all stored as stable points of high and low values—on a hard disk, this takes the form of positive and negative magnetization; an SSD stores it as trapped electrical charge in memory cells; a memory chip holds it as high and low voltages in latch circuits. With those two images in mind, here is how the CPU (Central Processing Unit) in a computer works: it takes in a series of instructions, each of which is basically two identifiers, one being the address of a piece of data and the other identifying the circuit that data should be sent to, producing new data. Such instructions, being a set of binary values describing a location in memory and a circuit identifier, can themselves be expressed as data and can be "coded" by a human—that's what a "computer program" is.

Decades ago, people made programs by hole-punching paper, which the computer would translate into electrical signals. This got super tedious, so people developed Assembly Language, which let programmers write instructions as text that then got translated into electrical signals in pretty much the same way. This was still tedious, so people invented programming languages. Whereas Assembly is a small set of core building blocks, each of which translates directly to something the computer actually does, a programming language consists of a much larger set of instructions, each of which can be directly converted into a series of instructions in Assembly. This development not only created a super convenient shorthand, but it also allowed programmers to think in more abstract ways about their programs—describing what they are trying to accomplish rather than worrying about all the details that need to happen inside the computer. The last step in this progression was the advent of Object-Oriented programming languages, which improved the process of writing programs that could serve as building blocks for other programs. I keep using the phrase "building blocks" to emphasize a pattern: everything made by programming becomes a resource to simplify the creation of new things.
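To make the building-block idea concrete, here's a minimal sketch in Python (my own illustration; real hardware obviously isn't written in Python): a few logic gates written as tiny functions, then combined into a one-bit adder, which could in turn become a block in something bigger.

```python
# A toy illustration of "building blocks": logic gates built from simple
# operations, then an adder built from the gates. (Python stands in for
# wires and transistors here.)

def AND(a, b): return a & b
def OR(a, b):  return a | b
def NOT(a):    return 1 - a
def XOR(a, b): return AND(OR(a, b), NOT(AND(a, b)))  # one or the other, not both

def half_adder(a, b):
    """Adds two bits: returns (sum_bit, carry_bit)."""
    return XOR(a, b), AND(a, b)

def full_adder(a, b, carry_in):
    """Adds two bits plus a carry from the previous column."""
    s1, c1 = half_adder(a, b)
    s2, c2 = half_adder(s1, carry_in)
    return s2, OR(c1, c2)

# 1 + 1 with no incoming carry: sum bit 0, carry bit 1 (binary 10)
print(full_adder(1, 1, 0))  # -> (0, 1)
```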
This is why technology keeps accelerating rather than getting bogged down by increasing complexity that no single human is capable of understanding. With each development, we can move further away from the concrete reality of electrical circuits (which is itself an abstraction over physics), and closer to the abstract realm of human thought. But as wild as that sounds, there is nothing mystical here: if you have ever used a cookbook, then you understand the essence of what "code" is—recipes all the way down.

Part 2: neural networks. The basis for modern AI, and how it compares to code...oh, and speaking of "AI", that's actually a catch-all term that has been used to refer to pretty much any non-biological decision-making process, from poorly-scripted characters in video games to metal superhumans in Science Fiction movies. Neural nets are just the subset of AI that has been getting a lot of attention lately, so given that context and the informal tone I am going for here, I'm going to use the terms "AI", "neural-network-based AI", and "machine learning" mostly interchangeably. Anyways, with the view of computers from Part 1 in mind...forget it, neural networks are nothing like that. Sure, they are ultimately built inside of computers, but conceptually a neural net is so different from the building-block-recipe model described above that I like to think of machine learning as the opposite of code.

If the transistor is the atom of a traditional computer, then the artificial neuron is the atom of a neural net. Here is how an artificial neuron works: it takes in a set of numbers, typically in a continuous range between -1 and +1. It multiplies each of these numbers by a "weight", again a continuous number in roughly the same range. Then it adds all of those weighted numbers together and adds a "bias" to the sum. The result gets modified by a little more math in the form of an activation function—this last step isn't really important in this conceptual overview, but it makes the whole thing work better for technical reasons. That modified result then gets sent somewhere else, either as an output signal or to other neurons. The last thing to know about an artificial neuron is that it can change its weights and bias in response to feedback—yet another continuous positive or negative number communicating how "wrong" the neuron's last output was, and in which direction.

The way neurons adapt to feedback is a really critical point, so I'd like to illustrate it with an analogy. Suppose you spend a regular amount of time each day watching the news from a variety of different channels. Some of those channels are reliable, some are actively lying to you, and some are meaningless babble. But you don't know which is which, so you start off by guessing randomly, then go out into the world and act on that information. This causes you trouble and embarrassment because much of what you guessed was wrong. Fortunately, you were paying attention, and you update your beliefs about the reliability of your news channels—you treat the ones that told you the truth as more reliable, you assume the opposite of whatever the liars say, and you ignore the channels that are totally random. This doesn't happen all at once; the channels aren't totally consistent and it takes a while to see the patterns, but ultimately you settle on an understanding that works pretty well.
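Here's that neuron as a minimal Python sketch; the specific numbers and the simple update rule are illustrative assumptions on my part, not any particular library's implementation.

```python
import math

def neuron(inputs, weights, bias):
    """One artificial neuron: weighted sum, plus bias, squashed by an activation."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return math.tanh(total)  # activation function keeps the output between -1 and +1

# Three "news channels" feeding in signals, with guessed reliabilities (weights).
inputs  = [0.9, -0.2, 0.4]
weights = [0.5, -0.8, 0.1]
bias    = 0.0

output = neuron(inputs, weights, bias)

# Feedback: a positive or negative number saying how wrong the output was.
# A crude update nudges each weight in proportion to its input's contribution.
error = -0.3          # "your answer was a bit too high"
learning_rate = 0.1
weights = [w + learning_rate * error * x for w, x in zip(weights, inputs)]
bias    = bias + learning_rate * error

print(output, weights, bias)
```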
Now think about what the potential results of this approach sound like: "story X is true if sources A and B say it is and C says it isn't, or if D says it is true and E says it isn't." Sound familiar? It's logic gates, the basic building blocks of traditional computers! An artificial neuron can thus be thought of as a plastic set of logic gates, forming itself to the needs of the situation—kind of like a stem cell becoming whatever body cell it needs to be, based on its context. And just as the most complex program is ultimately a set of instructions, if you string enough of these neurons together then you have a plastic computer program, becoming whatever set of instructions it needs to be. Wow, with these amazing self-assembling programs, why aren't programmers obsolete? Well, maybe someday we will be, but for now there are some major tradeoffs, the most notable being local minima and rule-gaming.

To understand the problem of "local minima", think of a ball rolling down a hill. The ball's "goal" is to get to the lowest possible point, ideally sea level. The ball is blind, so it relies on gravity—moving in the direction of the steepest downward slope in its immediate vicinity and repeating until every direction is up. More likely than not, however, the ball will not reach sea level, merely the bottom of whatever hill it happened to be on when it started rolling. Neural networks face the same problem: because their feedback system moves them inexorably towards a smaller and smaller error, they will stop changing once they reach a state where every possible incremental change makes the result worse, even if that state isn't ideal—or even close. Now, there are a lot of tricks that can be employed to mitigate the problem of local minima: in the analogy of the ball, adding some random jitter to its movement that decreases over time, comparing the results of multiple starting points, and so on. A great deal of AI research has gone into these sorts of techniques, but it involves too much technical detail for me to cover here.

As an example illustrating local minima, when I first started playing with toy neural networks in a game engine, I created the following scenario: there is a platform, a character that can move in any direction, and a goal. The character receives positive feedback for reaching the goal and negative feedback for falling off the platform. To encourage the character to move towards the goal as efficiently as possible, I also gave it a small negative feedback for every second spent in the process. The result? The character moved in a straight line...off the platform. What happened? Well, at first the character moved randomly and fell off the platform, because that was far more likely than wandering aimlessly into the goal. On subsequent random wanderings, the character fell off the platform in different ways. Over time, it noticed the pattern that straight lines off the platform resulted in less overall negative feedback than roundabout ones, so it optimized towards shorter and shorter paths. And before it had a chance to find the goal by accident, it had already locked in a strategy for efficient failure. I solved this problem by eliminating the time penalty and instead slowly decreasing the positive feedback for reaching the goal over time, such that it approached but never reached zero. Let that be a lesson for authority figures everywhere: even for computers, rewards work better than punishments!
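Going back to the rolling ball for a moment, here's a small Python sketch of getting stuck; the bumpy landscape, the starting points, and the step size are all toy assumptions of mine. Plain "roll downhill" steps started in two different places settle into two different valleys, and only one of them is the deepest.

```python
import math

# A bumpy 1-D "landscape": a deep valley near x = -0.7 and a shallower one near x = 2.2.
def height(x):
    return 0.1 * x**2 + math.sin(2 * x)

def slope(x, eps=1e-5):
    return (height(x + eps) - height(x - eps)) / (2 * eps)  # numerical derivative

def roll(x, steps=2000, lr=0.01):
    """Blind ball: repeatedly take a small step in the downhill direction."""
    for _ in range(steps):
        x -= lr * slope(x)
    return x

for start in (-2.0, 1.0):
    end = roll(start)
    print(f"start {start:+.1f} -> settles at x = {end:+.2f}, height = {height(end):+.2f}")
```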
The problem of rule-gaming comes from the fundamental lack of human control over the process by which neural networks form themselves into functioning models. A classical program (which, again, is essentially a complex recipe) always does exactly what the programmer told it to. The programmer may have made a mistake (understatement!) so the program doesn't follow their intent, but it is still following human-written instructions in every sense. A neural net, being a plastic, self-forming program—merely given an architecture, a set of goals, and a context—behaves in a fundamentally unpredictable manner that may involve undesirable side-effects or even directly contradict the spirit of the task. That's not a bug, it's a feature; being able to solve problems that don't have a clear path to being solved using traditional programming techniques (more on that shortly) is the whole point of using neural nets in the first place. There are some researchers working on creating tools to interpret the inner workings of neural nets after the fact, but this lags far, far behind the development of the neural nets themselves, since this "interpretability" has limited, or at least difficult-to-assess, commercial value.

Here is one example of rule-gaming. A neural net was once tasked with designing a particular type of electrical circuit with as few parts as possible. It succeeded, creating a circuit with fewer parts than the most efficient human-designed circuit ever made. But there was a segment of the circuit with no discernible purpose—it wasn't even connected to the rest! On closer inspection, the engineer running the experiment discovered that the mystery segment was pulling random noise out of the air, modifying the rest of the circuit's operation with electromagnetic fields. Merely moving the circuit to a different part of the room would cause it to stop functioning. Thus, the end result succeeded according to the AI's explicit goals, but the circuit it created was worthless in any practical context.

Here's another, much creepier, hypothetical example. Netflix uses machine learning to give you recommendations on shows to watch based on what you have watched in the past. But that's really hard, because people are unpredictable. What if there were some pattern of recommendations Netflix could give you that made you more predictable, thus making its job easier and its recommendations more effective? If such a pattern exists, Netflix has probably already found it and is using it on you right now (if you are a member). Not to pick on Netflix, of course; any other AI-driven recommendation system—including Facebook's—is doing the same thing. Is this specific dynamic actually happening? Probably not, but I wouldn't be at all surprised if something like it is. And given the inherent lack of interpretability of neural networks, there is no way for anyone, not even the people who created the algorithms, to know for sure.

OK, given those downsides, why are we using neural nets for anything? Well, to appreciate the full benefits of neural nets over traditional programming, let's consider a task for which the latter falls flat: cat pictures. If you were to write a program that identifies whether a picture contains a cat, where would you start? Keep in mind that from a computer's perspective, an image is a grid of numbers describing the color of each pixel in the image.
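To see what that means in practice, here's a tiny made-up "image" in Python: nothing but a grid of brightness numbers, which is all the computer ever gets to work with.

```python
# A 5x5 grayscale "image" exactly as the computer sees it: nothing but numbers
# (0 = black, 255 = white). The bright diagonal might be a whisker, an edge,
# or nothing at all; deciding which is the hard part.
image = [
    [ 12,  15,  10, 200, 210],
    [ 14,  11, 205, 220,  13],
    [ 10, 198, 215,  12,  11],
    [195, 210,  14,  10,  12],
    [205,  13,  11,  15,  10],
]

for row in image:
    print(" ".join(f"{pixel:3d}" for pixel in row))
```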
Well, you could look for sudden color changes to identify lines, then have a series of mathematical equations that describe shapes that can be formed by lines, making sure to account for the full range of variation of each type of shape and specifying exactly when it blends into a different shape and...this is already an impossibly massive task, absolutely riddled with vague concepts and special cases.

A rule I follow when programming is that if something feels too complicated to make progress on, it's because I am looking at it the wrong way and need to change my perspective. Often, this involves finding clever ways to break a big problem apart into smaller, more manageable problems so that I can solve each of them individually—create building blocks to create more building blocks until the final assembly becomes easy. This process is effective because it works around a limitation of the human mind: we can only hold a few disparate pieces of information in our consciousness at once. Indeed, all of the advancements in programming, from assembly to low-level languages to modern object-oriented languages to every other technology programmers use, exist for the convenience of humans. The computer doesn't care; all those fancy abstract concepts just get translated into the same kind of machine code that used to be punched onto paper cards.

The fundamental limitation of this entire approach is that it assumes a problem can be broken down into independent parts and then built back up with layers upon layers of abstractions. But what if the very process of breaking ideas down into discrete, independent parts is itself the flawed perspective that needs to be changed? What if the way to identify cats isn't a complex recipe, where each step can be broken down into other recipes, such that you never have to hold more than a dozen things in your head to make progress? In other words, what happens when everything is connected? When the problem, by its fundamental nature, isn't vertical (stacking layers of building blocks on top of each other), but horizontal (finding patterns in a vast sea of data)? In short, computers fail and humans succeed. Our brains are built for pattern recognition, for problems with horizontal complexity. Our capacity for vertical, sequential reasoning, by contrast, is unparalleled in the animal kingdom, but it's ultimately just a hacked-together pattern-recognition process sitting in the frontal lobe. Well, when I said "computers fail", that only applied to traditional programming. Neural networks are built for horizontal problems like pattern recognition. Suddenly, biology isn't quite so special anymore...

Part 3: RL & GANs; robots & art. I spent a lot of time on the foundations of computing and neural nets because if you understand those, everything else is just a bag of tricks. The platform scenario I built and described above used "reinforcement learning", or RL, an add-on to neural networks that allows them to learn despite sparse feedback and introduces rewards and punishments rather than just an error signal. It basically works by recording actions over time and, when there is a reward or punishment, retroactively sending an error signal to all of those actions, based on their contribution to the end result. If you have seen those goofy videos of computer-generated humanoid puppets wildly flailing their arms while running across an obstacle course—or real robots walking over complex surfaces—that's reinforcement learning.
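Here's a toy Python sketch of that retroactive credit assignment; the episode, the reward value, and the discount factor are all made-up assumptions of mine. Every recorded action gets a share of the final reward, with earlier actions getting a smaller share.

```python
# Toy credit assignment: an episode of recorded actions ends with one reward,
# and each earlier action receives a discounted share of it. This discounted
# signal is what would nudge the network's weights up or down for each action.
GAMMA = 0.9  # discount factor: how much earlier actions share in later rewards

def assign_credit(actions, final_reward, gamma=GAMMA):
    """Return (action, credit) pairs, with credit shrinking the further the
    action was from the reward."""
    n = len(actions)
    return [(a, final_reward * gamma ** (n - 1 - i)) for i, a in enumerate(actions)]

episode = ["step forward", "turn left", "step forward", "reach goal"]
for action, credit in assign_credit(episode, final_reward=+1.0):
    print(f"{action:12s} -> feedback {credit:+.2f}")
```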
Netflix and other automated recommendation systems use Unsupervised Learning. Whereas more basic neural networks (using Supervised Learning) require clear feedback about the meaning of the data they are exposed to during training, so they know what error signal to send back to the neurons to update their weights and biases, Unsupervised Learning sorts unlabeled data into categories in order to find patterns.

Image generators are the result of GANs, or Generative Adversarial Networks, which are two neural networks competing with each other. Here's how that works. First, you gather a bunch of data similar to what you are trying to create—cat pictures, paintings in the style of Van Gogh, etc. The first neural network, the Generator, creates images. The gathered real images and the generated images are then sent to the second network, the Discriminator, which guesses whether each image is real or fake. At first, when both networks are untrained, the Generator makes random noise, but the Discriminator still only guesses right 50% of the time. This is a win for the Generator and a loss for the Discriminator, and so the latter sends error signals to its neurons prompting them to change. Eventually, by a mix of random chance and doing more of what works, the neurons change in a way that causes the Discriminator to guess correctly more often than not. Now the Discriminator is sitting pretty and the Generator starts passing an error signal to its neurons, until they change in a way that is able to fool the Discriminator. This process continues, back and forth, until either the Discriminator can't possibly win—the Generator's images are so similar to the given data set that there is no point of differentiation left to detect—or one or both systems fall into a local minimum.

Part 4: Large Language Models. Or, how ChatGPT helps students cheat on their homework. Let's go back to basics with a neural network that recognizes a cat. It uses the simplest training setup for a neural network, called "Supervised Learning"—comparing guesses against labelled data (pictures that a person has tagged as "cat" or "not cat") to make better guesses until it eventually becomes reliable enough to use on data it has never seen before. That's a pretty modest goal, so let's consider how to make it more powerful. First, the more diverse the data, the more abstract the lessons that the AI needs to learn, and the wider the range of situations it can be useful in. Second, the more the AI needs to learn, the more data it needs to train on in order to mold its neurons into the perfect set of weights and biases. So if one were to build the most powerful AI possible, one would need access to the biggest, most diverse data set available. But there's a problem...actually, there already was a problem that has just gotten a lot bigger: where do you get all of that labelled data? For an AI to learn, it needs to be able to compare its guesses against a "right answer" so that it can measure the error of its guess and update its weights and biases such that its next guess is less wrong. Creating those right answers requires someone who already knows the answer to fill out a LOT of labels. For a relatively narrow task like identifying handwritten characters, you might be able to get a bunch of high school students to do it, but that's going to become a giant money-pit for anything much more ambitious. What would be really great is if there were some ginormous, super-diverse, already-labelled dataset just lying around somewhere... Enter self-supervised learning: take some data, redact part of it, have the AI guess what was in the redacted part, and then compare its answer against the original version (and then repeat by redacting a different part of the data).
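Here's a tiny Python sketch of that trick (the sentence is just a placeholder, and here the redacted part is always the next word, which is roughly the flavor chat-style models use): raw text turns into (context, hidden word) training pairs with no human labeller involved.

```python
# Self-supervised labelling: turn raw text into (context, hidden word) training
# pairs by redacting each word in turn. No human labeller needed.
text = "the cat sat on the mat"
words = text.split()

examples = []
for i in range(1, len(words)):
    context = " ".join(words[:i])   # what the model gets to see
    target  = words[i]              # the redacted word it must guess
    examples.append((context, target))

for context, target in examples:
    print(f"guess the next word after: '{context}'  ->  answer: '{target}'")
```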
With self-supervised learning, one can automatically generate labelled data from any old data—no high school students required. So now we just need a dataset that is big, diverse, freely available, and ideally also easily lends itself to automated self-supervised learning. Hey, how about ALL OF THE TEXT ON THE INTERNET! There's a lot of text out there, you can get it for free, and people write about all kinds of crazy things. And it turns out that predicting text is a really subtle task. Good text prediction requires understanding the structure of language, great prediction requires understanding the meaning of words, and really super-fantastic prediction requires actually thinking like the humans who wrote the text in the first place. Historical note: OpenAI didn't originally set out to build a chatbot; their goal was to build AGI, an AI system general enough to automate most jobs (including yours!), and they figured that language was the most direct route. Now, there are some experts who argue that AIs are just stochastic parrots, meaning that they operate on a low level of abstraction, associating words with other words based on how often they appear together. Since an AI isn't directly programmed and its internal process isn't a series of logical statements, it's very difficult to know what level of abstraction it is actually operating on, and therefore impossible to know whether it is really "thinking" just by looking at its behavior. But if AIs really are stochastic parrots, that will limit their ability to generalize to new contexts and make them less useful, so we should expect AIs to operate on increasingly high levels of abstraction over time as they improve. You know that things have gotten crazy when it's this hard to talk about AI without philosophizing about the nature of consciousness...

Self-supervised learning is the first stage of a three-stage process used by Large Language Models, or LLMs, which power familiar applications like ChatGPT. The problem with self-supervised learning alone is that nobody actually wants a thing that predicts how the average Reddit user would finish your sentences. No, people want a thing that writes high school students' essays for them—a rather fitting payback for labelling all those handwritten letters! The second stage of the training process for LLMs is "finetuning," which basically means re-running self-supervised learning on the best data, like the New York Times (all the news that's fit to plagiarize), so that the AI learns to copy its primo style. By running self-supervised learning on the entire internet and then finetuning on a small subset, you get the best of both worlds: an AI that understands the structure of language well enough to predict text written by anyone while also being refined enough not to sound like some dirty Reddit troll. But since finetuning is a vastly less intensive process than the initial self-supervised learning, you can always re-finetune the model relatively cheaply if you don't want it to sound like some snooty NYT journalist.
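Here's a crude Python illustration of that two-stage idea, with word counts standing in for a neural network and a few made-up sentences standing in for the corpora; real LLMs adjust billions of weights, not counts, but the pattern is the same: pretrain on a big messy pile of text, then keep training on a smaller, cleaner sample, and the model's habits shift without starting over.

```python
from collections import Counter, defaultdict

def train(counts, text):
    """Update which word tends to follow which (a stand-in for training)."""
    words = text.lower().split()
    for a, b in zip(words, words[1:]):
        counts[a][b] += 1

def predict(counts, word):
    """Predict the most common follower of a word seen during training."""
    followers = counts[word.lower()]
    return max(followers, key=followers.get) if followers else "???"

counts = defaultdict(Counter)

# Stage 1: "pretraining" on a big, messy pile of text.
train(counts, "the weather is bad and the food is bad and the service is bad")
print(predict(counts, "is"))  # -> "bad"

# Stage 2: "finetuning" on smaller, more polished text shifts the model's
# habits without throwing away what it already learned.
train(counts, "the food is excellent the service is excellent")
train(counts, "the dessert is excellent and the coffee is excellent")
print(predict(counts, "is"))  # -> "excellent"
```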
The third stage of training is Reinforcement Learning from Human Feedback, or RLHF for short. This is the cherry on top of the LLM: the AI generates multiple responses to prompts, and actual people rate those responses from best to worst. The AI helps this process along by using reinforcement learning (RL) to find patterns in the human feedback (HF) and generate more ratings automatically. Together, these ratings create a dataset from which the AI learns to give answers that make OpenAI's PR team's job easier. Remember Bing's chatbot that went off the rails and threatened its users? That's what happens when you neglect RLHF.

Conclusions: If you want to be involved in building AI, this post doesn't even scratch the surface of what you need to know. For that, I would recommend taking an online course and working with AI yourself. But if you just want to get a better intuition for how AI works, so you have enough background to understand and participate meaningfully in the societal discussions around it, what you have read here is pretty much all you need to know on the technical side. Having built a neural network visualizer from scratch (including independently figuring out the math), I learned basically nothing I didn't already know that is relevant to what to expect from AI, to demystifying how it works, to its potential dangers to society, or to what sort of regulations we should be considering. Everyone I know who has gone further than me and built a transformer architecture (the 'T' in ChatGPT, and a key architectural innovation that makes it work) has had a similar experience. Congratulations, you may now consider yourself AI-informed ;)