I have many conversations with people about Large Language Models like ChatGPT and Copilot. The idea that “it makes convincing sentences, but it doesn’t know what it’s talking about” is a difficult concept to convey or wrap your head around, precisely because the sentences are so convincing.

Any good examples on how to explain this in simple terms?

Edit: some good answers already! I find especially that the emotional barrier is difficult to break. If an AI says something malicious, our brain immediately jumps to “it has intent”. How can we explain this away?

  • @[email protected]

    It’s all just fancy statistics. It turns words into numbers. Then it finds patterns in those numbers. When you enter a prompt, it finds numbers that are similar and spits out an answer.

    You can get into vectors and back propagation and blah blah blah but essentially it’s a math formula. We call it AI but it’s not fundamentally different than solving 2x + 4 = 10 for x.
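
    A tiny sketch of that “words into numbers” idea (the three-number vectors and the little vocabulary are invented for illustration; real models learn thousands of dimensions from data):

    ```python
    import numpy as np

    # Toy "embeddings": each word is just a list of numbers.
    # These values are invented; real models learn them from data.
    vectors = {
        "cat":   np.array([0.9, 0.1, 0.0]),
        "dog":   np.array([0.8, 0.2, 0.1]),
        "car":   np.array([0.1, 0.9, 0.3]),
        "truck": np.array([0.2, 0.8, 0.4]),
    }

    def cosine(u, w):
        return float(np.dot(u, w) / (np.linalg.norm(u) * np.linalg.norm(w)))

    def most_similar(word):
        """Return the other word whose numbers point in the closest direction."""
        v = vectors[word]
        return max((w for w in vectors if w != word), key=lambda w: cosine(v, vectors[w]))

    print(most_similar("cat"))  # -> "dog": similar numbers, zero idea what a cat is
    ```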

  • Deconceptualist

    You could maybe just share a meme like this one.

    Some folks in the comments there share actual LLM results, a few of which are sensible, but plenty of which aren’t far off from the joke.

    • @[email protected]

      LMAO! I tried it, and it said:

      The man takes the goat across the river first, then he returns alone and takes the boat across again. Finally, he brings the goat’s friend, Mr. Cabbage, across the river.

      LMAO

      I asked what if the man can’t swim…

      In that case, the man can still take the goat across the river first. Then, he ties the goat to the boat and pulls the boat across the river using a rope, ensuring he stays on the side of the river where he can safely manage the boat without swimming.

      I asked who Mr. Cabbage is…

      My apologies for the confusion! Mr. Cabbage is a playful addition. In the classic riddle, there’s usually no mention of Mr. Cabbage. It’s just the man, the goat, and perhaps another item or animal depending on the version of the riddle. I just added Mr. Cabbage for some extra fun!

      Then I asked what some other additions could be…

      Certainly! You can add various characters or items to make the riddle more entertaining. Here are a few ideas:

      1. Mr. Wolf: The man, the goat, and Mr. Wolf need to cross the river. But if the man leaves the wolf alone with the goat, the wolf will eat the goat.

      2. Ms. Hen: The man, the goat, and Ms. Hen need to cross the river. But if the man leaves the hen alone with the goat, the goat will eat the hen.

      3. Professor Fox: The man, the goat, and Professor Fox need to cross the river. But if the man leaves the fox alone with the goat, the fox will eat the goat.

      4. Princess Diamond: The man, the goat, and Princess Diamond’s precious gem need to cross the river. But Princess Diamond has warned that if anyone other than herself touches the gem, they’ll be cursed.

      Feel free to mix and match or come up with your own additions!

      And the “solution”…

      Ah, that’s a fun twist! Let’s see:

      1. The man takes Mr. Cabbage across first, then returns alone.
      2. Next, he takes Ms. Hen across but brings Mr. Cabbage back.
      3. He then takes Princess Diamond’s gem across, leaves it on the other side, and takes Mr. Wolf across.
      4. The man leaves Mr. Wolf with Ms. Hen on the other side and returns alone.
      5. He takes Princess Diamond across, but before he leaves, he swaps Mr. Cabbage with Princess Diamond’s gem.
      6. The man brings Mr. Cabbage across, leaving Princess Diamond’s gem with Mr. Wolf.
      7. Finally, he takes the goat across.

      Everyone safely crosses the river without any mishaps!

  • @[email protected]

    Imagine you were asked to start speaking a new language, eg Chinese. Your brain happens to work quite differently to the rest of us. You have immense capabilities for memorization and computation but not much else. You can’t really learn Chinese with this kind of mind, but you have an idea that plays right into your strengths. You will listen to millions of conversations by real Chinese speakers and mimic their patterns. You make notes like “when one person says A, the most common response by the other person is B”, or “most often after someone says X, they follow it up with Y”. So you go into conversations with Chinese speakers and just perform these patterns. It’s all just sounds to you. You don’t recognize words and you can’t even tell from context what’s happening. If you do that well enough you are technically speaking Chinese but you will never have any intent or understanding behind what you say. That’s basically LLMs.
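
    The “notes” in that analogy amount to a lookup table of continuations. A toy sketch of the idea (the example exchanges are invented; a real system would digest millions of them and far subtler patterns):

    ```python
    from collections import Counter, defaultdict

    # Invented example exchanges standing in for "millions of conversations".
    conversations = [
        ("how are you", "fine thanks"),
        ("how are you", "fine thanks"),
        ("how are you", "not bad"),
        ("nice weather", "yes lovely"),
    ]

    # "When one person says A, the most common response by the other person is B."
    responses = defaultdict(Counter)
    for prompt, reply in conversations:
        responses[prompt][reply] += 1

    def respond(prompt):
        # No understanding involved: just replay the most frequently recorded reply.
        return responses[prompt].most_common(1)[0][0]

    print(respond("how are you"))  # -> "fine thanks"
    ```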

  • @[email protected]

    Imagine making a whole chicken out of chicken-nugget goo.

    It will look like a roast chicken. It will taste alarmingly like chicken. It absolutely will not be a roast chicken.

    The sad thing is that humans do a hell of a lot of this, a hell of a lot of the time. Look how well a highschooler who hasn’t actually read the book can churn out a book report. Flick through, soak up the flavour and texture of the thing, read the blurb on the back to see what it’s about, keep in mind the bloated over-flowery language that teachers expect, and you can bullshit your way to an A.

    Only problem is, you can’t use the results for anything productive, which is what people try to use GenAI for.

  • DarkThoughts

    Someone once described it as T9 on steroids. It’s like your mobile keyboard suggesting follow-up words, just a lot bigger and more complex.

    If an AI says something malicious, our brain immediately jumps to “it has intent”. How can we explain this away?

    The more you understand the underlying concept of LLMs, the more the magic fades away. LLMs are certainly cool and can be fun but the hype around them seems very artificial and they’re certainly not what I’d describe as “AI”. To me, an AI would be something that actually has some form of consciousness, something that actually can form its own thoughts and learn things on its own through observation or experimentation. LLMs can’t do any of those things. They’re static and always wait for your input to even do anything. For text generation you can even just regenerate an answer to the same previous text and the replies can and will vary greatly. If they say something mean or malicious, it’s simply because it is based on whatever they were trained on and whatever parameters they are following (like if you told them to roleplay a mean person).

  • @[email protected]

    it’s a spicy autocomplete. it doesn’t know anything, does not understand anything, it does not reason, and it won’t stop until your boss thinks it’s good enough at your job for “restructuring” (it’s not). any illusion of knowledge comes from the fact that its source material mostly is factual. when you’re drifting off into niche topics or something that was missing out of training data entirely, spicy autocomplete does what it does best, it makes shit up. some people call this hallucination, but it’s closer to making shit up confidently while not knowing any better. humans do that too, but at least they know when they do that

    • HucklebeeOP

      Hmm, now that I read this, I have a thought: it might also be hard to wrap our heads around this issue because we all talk about AI as if it were an entity. Even the sentence “it makes shit up” gives the AI a kind of credit, as if it “thinks” about things. It doesn’t make shit up; it is doing exactly what it is programmed to do: create good sentences. It succeeds.

      Maybe the answer is just to stop talking about AI’s as “saying” things, and start talking about GenAI as “generating sentences”? That way, we emotionally distance ourselves from “it” and it’s more difficult to ascribe consciousness to an AI.

  • Hegar

    Part of the problem is hyperactive agency detection - the same biological bug/feature that fuels belief in the divine.

    If a twig snaps, it could be nothing or someone. If it’s nothing and we react as if it was someone, no biggie. If it was someone and we react as if it was nothing, potential biggie. So our brains are biased towards assuming agency where there is none, to keep us alive.

  • @[email protected]
    • Compression algorithms can reduce most written text to about 20–25% of its original size—implying that that’s the amount of actual unique information it contains, while the rest is predictable filler.

    • Empirical studies have found that chimps and human infants, when looking at test patterns, will ignore patterns that are too predictable or too unpredictable—with the sweet spot for maximizing attention being patterns that are about 80% predictable.

    • AI researchers have found that generating new text by always predicting the most likely continuation of the given input results in text that sounds monotonous and obviously robotic. Through trial and error, they found that, instead of always choosing the most likely result, sampling within around an 80% likelihood threshold produces results judged most interesting and human-like (see the sketch below).

    The point being: AI has stumbled on a method of mimicking the presence of meaning by imitating the ratio of novelty to predictability that characterizes real human thought. But we know that the actual content of that novelty is randomly chosen, rather than being a deliberate message.
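
    The third point is roughly what is now called nucleus (top-p) sampling. A minimal sketch with invented probabilities, contrasting “always pick the most likely word” with “sample within a roughly 80% cutoff”:

    ```python
    import random

    # Invented next-word probabilities, purely for illustration.
    next_word_probs = {"the": 0.40, "a": 0.25, "this": 0.15, "banana": 0.12, "quantum": 0.08}

    def pick_greedy(probs):
        # Always take the single most likely word: fluent but monotonous.
        return max(probs, key=probs.get)

    def pick_top_p(probs, p=0.8):
        # Nucleus-style sampling: keep the smallest set of words whose
        # probabilities add up to about p, then sample among them.
        kept, total = [], 0.0
        for word, prob in sorted(probs.items(), key=lambda kv: -kv[1]):
            kept.append((word, prob))
            total += prob
            if total >= p:
                break
        words, weights = zip(*kept)
        return random.choices(words, weights=weights)[0]

    print(pick_greedy(next_word_probs))  # always "the"
    print(pick_top_p(next_word_probs))   # usually "the" or "a", occasionally "this"
    ```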

  • @[email protected]

    It’s your phone’s ‘predictive text’, but if it were trained on the internet.

    It can guess what the next word should be a lot of the time, but it’s also easy for it to go off the rails.

  • @[email protected]

    I am an LLM researcher at MIT, and hopefully this will help.

    As others have answered, LLMs have only learned the ability to autocomplete given some input, known as the prompt. Functionally, the model is strictly predicting the probability of the next word+ (more precisely, the next token), with some randomness injected so the output isn’t exactly the same for any given prompt.

    The probability of the next word comes from what was in the model’s training data, combined with a very complex mathematical method, called self-attention, that computes the impact of every previous word on every other previous word and on the newly predicted word. You can think of this as a computed relatedness factor.

    This relatedness factor is very computationally expensive and grows quadratically with the number of words, so models are limited in how many previous words can be used to compute relatedness. This limitation is called the Context Window. The recent breakthroughs in LLMs come from the use of very large context windows to learn the relationships of as many words as possible.
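
    To make the relatedness factor concrete, here is a stripped-down single step of self-attention over three made-up word vectors (the learned projection matrices that real models use are omitted). The n-by-n table of scores is where the quadratic cost comes from:

    ```python
    import numpy as np

    def softmax(x):
        e = np.exp(x - x.max(axis=-1, keepdims=True))
        return e / e.sum(axis=-1, keepdims=True)

    # Three made-up word vectors, four numbers each (real models use thousands).
    words = np.random.rand(3, 4)

    # Every word is scored against every other word: an n-by-n table of
    # "relatedness" values, which is why cost grows with the square of n.
    scores = words @ words.T / np.sqrt(words.shape[1])
    weights = softmax(scores)   # each row sums to 1
    mixed = weights @ words     # each word becomes a weighted blend of all the words

    print(weights.shape)  # (3, 3): one relatedness score for every pair of words
    ```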

    This process of predicting the next word is repeated iteratively until a special stop token is generated, which tells the model to stop generating more words. So, quite literally, the model builds its entire response one word at a time, from left to right.
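
    That loop, in schematic form. The predict_next_word function and its tiny hard-coded table are placeholders standing in for the real model, purely so the shape of the process is runnable:

    ```python
    import random

    STOP = "<stop>"

    def predict_next_word(words):
        # Placeholder for the real model: returns a probability for every
        # possible next token given everything generated so far.
        # Hard-coded here only to make the loop runnable.
        table = {
            ("The",): {"cat": 0.7, "dog": 0.3},
            ("The", "cat"): {"sat": 0.9, STOP: 0.1},
            ("The", "cat", "sat"): {STOP: 1.0},
            ("The", "dog"): {STOP: 1.0},
        }
        return table.get(tuple(words), {STOP: 1.0})

    def generate(prompt_words):
        words = list(prompt_words)
        while True:
            probs = predict_next_word(words)                      # next-token probabilities
            choices, weights = zip(*probs.items())
            token = random.choices(choices, weights=weights)[0]   # injected randomness
            if token == STOP:                                     # special stop token ends generation
                return " ".join(words)
            words.append(token)                                   # build the answer one word at a time

    print(generate(["The"]))  # e.g. "The cat sat"
    ```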

    Because all future words are predicated on the previously stated words in either the prompt or subsequent generated words, it becomes impossible to apply even the most basic logical concepts, unless all the components required are present in the prompt or have somehow serendipitously been stated by the model in its generated response.

    This is also why LLMs tend to work better when you ask them to work out all the steps of a problem instead of jumping to a conclusion, and why the best models tend to rely on extremely verbose answers to give you the simple piece of information you were looking for.

    From this fundamental understanding, hopefully you can now reason about the LLM’s limitations in factual understanding as well. For instance, if a given fact was never mentioned in the training data, or an answer simply doesn’t exist, the model will make it up, inferring the next most likely word to create a plausible-sounding statement. Essentially, the model has been faking language understanding so well that even when it has no factual basis for an answer, it can easily trick an unwitting human into believing the answer to be correct.

    ---

    + More specifically, these words are tokens, which usually represent some smaller part of a word. For instance, understand and able would be represented as two tokens that, when put together, become the word understandable.
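
    As a toy illustration of the footnote, a greedy longest-match splitter over an invented vocabulary (real tokenizers are learned from data, e.g. via byte-pair encoding, and are more involved):

    ```python
    # Invented subword vocabulary; real models learn tens of thousands of pieces.
    vocab = {"understand", "able", "under", "stand", "un", "a", "b", "l", "e"}

    def tokenize(word):
        """Greedily take the longest vocabulary piece matching the front of the word."""
        tokens = []
        while word:
            for end in range(len(word), 0, -1):
                if word[:end] in vocab:
                    tokens.append(word[:end])
                    word = word[end:]
                    break
            else:
                raise ValueError("no matching piece")
        return tokens

    print(tokenize("understandable"))  # ['understand', 'able']
    ```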

    • @[email protected]

      I think that a good starting place to explain the concept to people would be to describe a Travesty Generator. I remember playing with one of those back in the 1980s. If you fed it a snippet of Shakespeare, what it churned out sounded remarkably like Shakespeare, even if it created brand “new” words.

      The results were goofy, but fun because it still almost made sense.

      The most disappointing source text I ever put in was TS Eliot. The output was just about as much rubbish as the original text.
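
      For anyone who wants to try it, a character-level travesty generator fits in a few lines; the sample string below stands in for the Shakespeare snippet:

      ```python
      import random
      from collections import defaultdict

      def travesty(source, order=3, length=200):
          """Character-level imitation: the next character is drawn from those
          that followed the same `order` characters somewhere in the source."""
          followers = defaultdict(list)
          for i in range(len(source) - order):
              followers[source[i:i + order]].append(source[i + order])
          out = source[:order]
          for _ in range(length):
              options = followers.get(out[-order:])
              if not options:
                  break
              out += random.choice(options)
          return out

      sample = "to be or not to be that is the question whether tis nobler in the mind"
      print(travesty(sample))  # recombined nonsense in the flavour of the source, sometimes with "new" words
      ```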

    • Sabata11792

      As a nerd playing with various AI models at home with no formal training, any wisdom you think is worth sharing?

    • @[email protected]

      In the sense that the “argument” is an intuition pump. As an anti-AI argument it’s weak: you could replace the operator in the Chinese room with an operator inside an individual neuron and conclude that our brains don’t know anything, either.

  • @[email protected]

    Like parrots, LLMs learn to imitate language (only, unlike parrots, it’s done in a learning mode, not from mere exposure, and from billions or even trillions of examples) without ever understanding its primary meaning, much less secondary, more subtle meanings (such as how a person’s certainty and formal education shape their choice of words on a subject).

    As we humans tend to see patterns in everything, even when they’re not there (like spotting a train in the clouds or Christ in a piece of burnt toast), when confronted with the parroted output of an LLM we tend to “spot” subtle patterns and from them conclude characteristics of the writer of those words, as we would if the writer were human.

    Subconsciously we’re using a cognitive process meant to derive conclusions about other humans from their words, and applying it to words from non-humans. Of course, such a process only ever yields human characteristics, so this shortcut ascribes human characteristics to non-humans. In logical terms, it’s as if we’re going “assuming this is from a human, here are the human characteristics of the writer of these words”; because it’s all subconscious, we don’t notice that we’re presuming humanity up front in order to conclude the presence of human traits, i.e. circular logic.

    This kind of natural cognitive shortcut is commonly and purposefully taken advantage of by all good scammers, including politicians and propagandists, to lead people into reaching specific conclusions, since we’re much more wedded to conclusions we (think we) reached ourselves than to those others told us about.

  • DaDragon

    The shorthand answer I’d try to give people is “it’s statistics”. Based on training data, there’s a certain chance of certain words being in proximity to each other. There’s no reasoning behind their placement, other than whatever pattern is discernible from known situations.

  • @[email protected]

    Harry Frankfurt’s influential 2005 book On Bullshit (based on his 1986 essay) offers a description of what bullshit is.

    When we say a speaker tells the truth, that speaker says something true that they know is true.

    When we say a speaker tells a lie, that speaker says something false that they know is false.

    But bullshit is when the speaker says something to persuade, not caring whether the underlying statement is true or false. The goal is simply to persuade the listener; the truth of the statement is beside the point.

    The current generation of AI chat bots are basically optimized for bullshit. The underlying algorithms reward the models for sounding convincing, not necessarily for being right.