LLMs Will Always Hallucinate

@[email protected]

If humans are neural networks, yet humans know when they don’t know, and AI is also a neural network, can’t it also have the ability to know when it is wrong? Maybe not LLMs specifically, but there must be an AI system that could be made that knows when it is wrong.

@[email protected]

Imagine this: the simple solar-powered calculator in a ruler and your PC are both computers. That’s why your comparison makes no sense.

And yes, it could. But I don’t think it needs neurons to work.

Edit: sorry, this sounds a lot more stern than intended.

@[email protected]

Yeah, of course humans are waay smarter and have way more neurons than LLMs, but my point was that it could work in theory. I guess not with large language models though.

@[email protected]

This is a feature not a bug. Right wing oligarchs, a lot of them in tech, have been creaming their pants on the fantasy of shaping general consensus and privatizing culture for decades. LLM hallucination is just a wrench they are throwing on the machinery of human subjectivity.

Scrubbles

Uh, no. You want to be mad at something like that look into how they’re training models without a care for bias (or adding in their own biases).

Hallucination is a completely different thing that is mathematically proven to happen regardless of who or what made it. Even if the model only knows about fluffy puppies and kitties it will still always hallucinate to some extent, just in that case it will be hallucinating fluffy puppies and kitties. It’s just random data at the end.
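To make the puppies-and-kitties point concrete, here is a minimal sketch using a toy bigram (Markov chain) model rather than a real LLM. The function names and the two-sentence corpus are made up for illustration. It only shows that a next-token model trained purely on on-topic text can still emit on-topic sentences nobody ever wrote.

```python
from collections import defaultdict

START, END = "<s>", "</s>"

def build_bigram_table(corpus):
    """Map each token to the set of tokens that followed it in training."""
    table = defaultdict(set)
    for sentence in corpus:
        tokens = [START] + sentence.split() + [END]
        for prev, nxt in zip(tokens, tokens[1:]):
            table[prev].add(nxt)
    return table

def reachable_sentences(table, max_len=6):
    """Enumerate every sentence the chain can emit with nonzero probability."""
    results = set()
    stack = [(START, [])]
    while stack:
        token, words = stack.pop()
        for nxt in table[token]:
            if nxt == END:
                results.add(" ".join(words))
            elif len(words) < max_len:  # guard against cycles in larger corpora
                stack.append((nxt, words + [nxt]))
    return results

# Hypothetical training data: nothing but fluffy puppies and kitties.
corpus = ["puppies are fluffy", "kitties are cute"]
table = build_bigram_table(corpus)

# Sentences the model can produce that were never in the training data.
novel = reachable_sentences(table) - set(corpus)
print(sorted(novel))
```

The novel sentences stay on topic, which matches the comment: the model invents within its training distribution, while off-topic output such as hate speech would have to come from the training data.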

That isn’t some conspiracy. Now if you expected a model that’s fluffy kitties and puppies and you’re mad because it starts spewing out hate speech - that’s not hallucination. That’s the training data.

If you’re going to rage about something like that, you might as well rage about the correct thing.

I’m getting real tired here of the “AI is the boogieman”. AI isn’t bad. We’ve had AI and Models for over 20 years now. They can be really helpful. The bias that is baked into them and how they’re implemented and trained has always been and will continue to be the problem.

AmbiguousProps

The AI we’ve had for over 20 years is not an LLM. LLMs are a different beast. This is why I hate the “AI” generalization. Yes, there are useful AI tools. But that doesn’t mean that LLMs are automatically always useful. And right now, I’m less concerned about the obvious hallucination that LLMs constantly do, and more concerned about the hype cycle that is causing a bubble. This bubble will wipe out savings, retirement, and make people starve. That’s not to mention the people currently, right now, being glazed up by these LLMs and falling to a sort of psychosis.

The execs causing this bubble say a lot of things similar to you (with a lot more insanity, of course). They generalize and lump all of the different, actually very useful tools (such as models used in cancer research) together with LLMs. This is what allows them to equate the very useful, well studied and tested models to LLMs. Basically, because some models and tools have had actual impact, that must mean LLMs are also just as useful, and we should definitely be melting the planet to feed more copyrighted, stolen data into them at any cost.

That usefulness is yet to be proven in any substantial way. Sure, I’ll take that they can be situationally useful for things like making new functions in existing code. They can be moderately useful for helping to get ideas for projects. But they are not useful for finding facts or the truth, and unfortunately, that is what the average person uses them for. They also are nowhere near able to replace software devs, engineers, accountants, etc., primarily because of how they are built to hallucinate a result that looks statistically correct.

LLMs also will not become AGI, they are not capable of that in any sort of capacity. I know you’re not claiming otherwise, but the execs that say similar things to your last paragraph are claiming that. I want to point out who you’re helping by saying what you’re saying.

@[email protected]

Nah don’t put words in my mouth, I’m mad so much money is wasted on this useless LLM shit. I didn’t use the word “AI” once on my post, so the fact that we’ve had AI for 20 years is beside the point.

The dream of the tech oligarchs is to privatize and centralize everything. LLMs are their tool. Fuck techbros and their Large language bullshit

Scrubbles

Not related at all to the arguments above.

@[email protected]

So it’s really both.

LLMs may always hallucinate, and bad actors are also going to poison models they have control over, but even “good” or “neutral” LLMs are useful to fascists, because part of the fascist playbook is to remove meaning and facts from the language they use. Often their appeal to recruits is that they are telling them what they want to hear or feel, sometimes based on a truth or fact, but it doesn’t matter to the fascists, just like it doesn’t matter to the LLMs.

Ex Nummis

Then they will always be useless as standalone trustworthy agents.

@[email protected]

People make mistakes too.

☆ Yσɠƚԋσʂ ☆

It’s worth noting that humans aren’t immune to the problem either. The real solution will be to have a system that can do reasoning and have a heuristic for figuring out what’s likely a hallucination or not. The reason we’re able to do that is because we interact with the outside world, and we get feedback when our internal model diverges from it that allows us to bring it in sync.

@[email protected]

LLMentalist is a mandatory read.

Stop making LLMs happen, we don’t need energy hungry bullshit generators for anything.

There are so many more important AIs that need attention and funding to help us with real problems.

LLMs won’t solve anything.

☆ Yσɠƚԋσʂ ☆

There is a lot of hype around LLMs, and other forms of AI certainly should be getting more attention, but arguing that this tech has no value is simply disingenuous. People really need to stop perseverating over the fact that this tech exists because it’s not going anywhere.

@[email protected]

Any benefits are by far outweighed by the cost and dangers.

Tell me more about the value when every LLM company is hemorrhaging money.

☆ Yσɠƚԋσʂ ☆

You seem to have a very US centric perspective on this tech. The situation in China looks to be quite different. Meanwhile, whether you personally think the benefits are outweighed by whatever dangers you envision, the reality is that you can’t put toothpaste back in the tube at this point. LLMs will continue to be developed. The only question is how that’s going to be done and who will control this tech. I’d much rather see it developed in the open.

@[email protected]

You dense motherfucker.

No LLMs are being developed in the open.

Even provided weights mean nothing.

It’s not knowledge LLMs retain, just the ingested text.

LLMs should be skipped after confirming that they are indeed a dead end they always were. And the entire world should focus on anything else.

☆ Yσɠƚԋσʂ ☆

You’re such an angry little ignoramus. The GPT-NeoX repo on GitHub is the actual codebase they used to train these models. They also open-sourced the training data, checkpoints, and all the tools.

However, even if you were right that the weights were worthless, which they’re obviously not, and there were no open projects which there are, the solution would be to develop models from scratch in the open instead of screeching at people and pretending this tech is just going to go away because it offends you personally.

And nobody says LLMs are anything other than Markov chains at a fundamental level. However, just like Markov chains themselves, they have plenty of real world uses. Some very obvious ones include doing translations, generating subtitles, doing text to speech, and describing images for the visually impaired. There are plenty of other uses for these tools.

I love how you presumed to know better than the entire world what technology to focus on. The megalomania is absolutely hilarious. Like all these researchers can’t understand that this tech is a dead end, it takes the brilliant mind of some lemmy troll to figure it out. I’m sure your mommy tells you you’re very special every day.

DigitalStefan

@msage @yogthos I don’t know if I agree 100% with this, but I do like what you’re saying.

It seems like all the AI companies are simply hoping AGI emerges from it and nobody is doing the actual research to make that happen.

People were researching it when I was a child and I suspect they’ll still be researching it when I’m collecting my pension.

☆ Yσɠƚԋσʂ ☆

Again, this is a very US centred perspective. I highly urge you to watch this interview with the Alibaba cloud founder on how this tech is being approached in China https://www.youtube.com/watch?v=X0PaVrpFD14

@[email protected]

I’m not saying every AI company is bad, just the generative ones.

Specially the token models.

We did good things for much cheaper, but because of the LLMentalist effect everybody rich lost their minds and now believes AGI will come from LLMs.

@[email protected]

Wanting an LLM to not hallucinate is like wanting a heater to not generate heat.

@[email protected]

Yes. Like people, if you want the nuggets of gold, you need to go dig them out of the turds.

@[email protected]

You hang out with a lot of people who eat gold nuggets, do you?

@[email protected]

Or wanting LLMs to not produce heat

@[email protected]

“Hallucinate.” A nice euphemism for the term ‘lie.’

@[email protected]

Hopefully there are people still working on non-LLM types of general AI, because I don’t think we’re ever going to get there with LLMs. The architecture just seems wrong to ever get there, and even Altman has said they probably can’t solve hallucinations. We can probably go very far down this road and get them pretty good, but it’s the wrong road if you want a real AI.

@[email protected]

That’s what I hate most about LLMs.

They’re syphoning away all the funding from real AI research, causing people to hate AI (when they have absolutely nothing to do with AI other than their poorly chosen marketing name), and, once the bubble pops, will keep investors from putting money into anything even remotely sounding like AI (frankly, I wouldn’t be surprised if we end up going full Butlerian jihad and banning anything more complex than a calculator).

The bastards selling this shit have probably set humanity’s progress back by centuries. Doomed us to a new dark age from which we’ll never recover (global warming will kill us first, and even if we survive there are no resources left to start a new technologically advanced civilization). They’ve murdered us all, for short term profits.

@[email protected]

when they have absolutely nothing to do with AI other than their poorly chosen marketing name

I worked somewhere once where they had an algorithm that placed items according to rules it was given, and it would output variations based on the rules to give the user some output options to work with. Think A or B could go here, and the different outcomes based on if you started with A or B.

It was pretty complex, but ultimately it was just a deterministic outcome of many possible deterministic outcomes based off the rules and what you started with.

They marketed that shit as AI.

It infuriated me.

No machine learning, no neural nets, no reinforcement learning, or learning of any kind, just placing things based off rules.

And don’t get me wrong, it was good, just not AI.

@[email protected]

The word “hallucination” itself is a marketing term. Being used frequently in the technical literature doesn’t make it free of problems. It’s used because it highlights a problem (namely that some LLM output is not factually correct), but the very name is wrong. Hallucination implies there is someone, perceiving and with a world model, who typically via heuristics (for efficient interfaces, like Donald Hoffman suggests) perceives incorrectly, leading to bad decisions about the current problem to solve.

So… sure, “it” (trying not to use the term) is structural, but that is simply because LLMs have no notion of veracity or truth (or anything else, to be clear). They have no simulation against which to verify whether the output they propose (the tokens out, the sentence the user gets) is correct or not; it is solely highly probable based on their training data.

@[email protected]

Brand new example: “Skills” by Anthropic (https://www.anthropic.com/news/skills). Even though the audience here is technical, it is still a marketing term. Why? Because the entire phrasing implies agency. There is no “one” getting new skills here. It’s as if I were adding bash scripts to my ~/bin directory, but instead of saying “The first script will use regex to start the appropriate script” I named my process “Theodore” and said that I was “teaching” it new “abilities”. It would be literally the same thing: functionally equivalent, with an identical implementation… but users, specifically non-technical users, would assume that there is more than just branching options. They would also assume errors are just “it” in the process of “learning”.

It’s really a brilliant marketing trick, but it’s nothing more.
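The “first script uses regex to start the appropriate script” setup can be sketched in a few lines. This is a hypothetical illustration in Python rather than bash; the handler names, patterns, and return strings are invented, and it says nothing about how Anthropic’s feature is actually implemented.

```python
import re

# Hypothetical "skills": plain functions standing in for scripts in ~/bin.
def resize_image(request):
    return "running resize script"

def convert_pdf(request):
    return "running pdf script"

# The "teaching" step is just adding another (pattern, handler) pair here.
ROUTES = [
    (re.compile(r"\bresize\b", re.IGNORECASE), resize_image),
    (re.compile(r"\bpdf\b", re.IGNORECASE), convert_pdf),
]

def dispatch(request):
    """Run the first handler whose pattern matches; plain branching, no agency."""
    for pattern, handler in ROUTES:
        if pattern.search(request):
            return handler(request)
    return "no matching script"

print(dispatch("Please resize this photo"))
print(dispatch("Convert the report to PDF"))
print(dispatch("Write me a poem"))
```

Calling this process “Theodore” would not change a single line of it.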

@[email protected]

Also your scripts will always do what they were meant to do.

LLMs will do whatever.

@[email protected]

To be clear, I’m not saying the word itself shouldn’t be used but I bet that 99% of the time if it’s not used by someone with a degree in AI or CS it’s going to be used incorrectly.

@[email protected]

The thing that always bothered me about the Halting Problem is that the proof of it is so thoroughly convoluted and easy to fix (simply add the ability to return “undecidable”) that it seems wanky to try applying it as part of a proof for any kind of real world problem.

(Edit: jfc, fuck me for trying to introduce any kind of technical discussion in a pile-on thread. I wasn’t even trying to cheerlead for LLMs, I just wanted to talk about comp sci)

@[email protected]

How do you know something is truly undecidable and not deterministically solvable with more computation?

@[email protected]

Mathematically you might be able to prove I don’t always (and I’m not convinced of that even; I don’t think there is an inherent contradiction like the one used for the proof of Halting), but the bar for acceptable false positives is sufficiently low and the scenario is such an edge case of an edge case of an edge case, that anyone trying to use the whole principle to argue anything about real-world applications is grasping at straws.

@[email protected]

I suggest you re-read through the proof of the halting problem, and consider precisely what it’s saying. It really has been mathematically proven.

But fair enough, you probably wouldn’t ever encounter the program constructed in the halting problem proof. But the consequence is that if you were trying to write an algorithm that solves the halting problem, you would have to sacrifice some level of correctness. Technically, any algorithm you write would fail or loop forever on an infinite number of programs, and surely one of them would be useful. Consider the Collatz conjecture. I severely doubt anyone would be able to “decide” whether the Collatz conjecture program halts without it being a very specific proof of the conjecture (with maybe some generalisations).
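For reference, the Collatz program is tiny. Below is a minimal sketch with a step budget added, which is the “return undecidable” idea in practice. The `max_steps` parameter and the `None` return value are illustrative choices, not part of any standard formulation.

```python
def collatz_steps(n, max_steps=10_000):
    """Return the number of steps for n to reach 1, or None if the budget runs out."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    steps = 0
    while n != 1:
        if steps >= max_steps:
            # "Undecided": we gave up, which is not a proof that it never halts.
            return None
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return steps

print(collatz_steps(27))                # 111
print(collatz_steps(27, max_steps=50))  # None
```

The budget keeps the checker itself from looping forever, but a `None` only means “not settled within the budget”. Deciding the general case would still take a proof of the conjecture.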

@[email protected]

That’s not “a fix”, that’s called “a practical workaround” which is used in the real world all the time.

@[email protected]

How would a token prediction machine arrive at undecidable? I mean, would you just add a percentage threshold? Static or calculated? How would you calculate it?
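A static threshold would look roughly like the sketch below. It assumes we already have one score (logit) per candidate answer; the function names, the candidate list, and the 0.7 cutoff are all made up for illustration. It also assumes the model’s probabilities are meaningful, which is exactly the part that is in question.

```python
import math

def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for numerical stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def answer_or_abstain(candidates, logits, threshold=0.7):
    """Return the top candidate, or abstain if its probability is below the threshold."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < threshold:
        return "I don't know"
    return candidates[best]

cities = ["Paris", "Lyon", "Nice"]
print(answer_or_abstain(cities, [5.0, 1.0, 0.5]))  # one clear favourite
print(answer_or_abstain(cities, [1.0, 0.9, 0.8]))  # nearly uniform scores
```

The catch is that a high probability only means the output is likely under the training data, not that it is true, so a confidently wrong answer passes the threshold untouched.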

(Why jfc? Because two people downvoted you? Dood, grow some.)

@[email protected]

It’s easy to be dismissive because you’re talking from the frame of reference of current LLMs. The article is positing a universal truth about all possible technological advances in future LLMs.

@[email protected]

Then I’m confused: what is your point on the Halting Problem vis-a-vis hallucinations being an un-mitigable quality of LLMs? Did I misunderstand that you proposed “return undecided (somehow magically, bypassing Halting Problem)” as the solution?

@[email protected]

First, there’s no “somehow magically” about it, the entire logic of the halting problem’s proof relies on being able to set up a contradiction. I’ll agree that returning undecidable doesn’t solve the problem as stated because the problem as stated only allows two responses.

My wider point is that the Halting problem as stated is a purely academic one that’s unlikely to ever cause a problem in any real world scenario. Indeed, the ability to say “I don’t know” to unsolvable questions is a hot topic of ongoing LLM research.

@[email protected]

They are errors, not hallucinations. Use the right words and then you can talk about the error rate and the acceptable error rate, the same way we do everything else.

@[email protected]

An “error” could be like it did a grammar wrong or used the wrong definition when interpreting, or something like an unsanitized input injection. When we’re talking about an LLM trying to convince the user of completely fabricated information, “hallucination” conveys that idea much more precisely, and IMO differentiating the phenomenon from a regular mis-coded software bug is significant.

@[email protected]

But calling it an error implies that it can be solved. I’d call it a fundamental design flaw.

@[email protected]

LLMs are basically read-only Mr. Meeseeks, except rather than being present for the whole conversation like Mr. Meeseeks, each new question in the conversation is a new Mr. Meeseeks that has to read the previous convo as context and answer. It’s no surprise they hallucinate.

setVeryLoud(true);

And each new Mr. Meeseeks is told that it can’t let it be known to the user that this is a new Mr. Meeseeks, so make shit up.

@[email protected]

Generally, hallucinations are frequent in pure chatbots such as ChatGPT because they rely on their own knowledge base and LLM, so if they don’t know an answer, they invent one based on their data set. AI with web access is different: it doesn’t have its own knowledge base and retrieves its answers in real time from web content, so its reliability is similar to that of traditional search engines. The advantage is that it finds relevant sites related to the context of the question, lists sources, and summarizes the contents in a direct answer, instead of returning 390,000 pages of sites that have nothing to do with the question, as traditional keyword search does. IMHO these are the only AI apps that are useful for normal users: a search assistant, not a chatbot that tells me BS.
