Aren’t they processing high quality data from multiple sources?
Here’s where the misunderstanding comes in, I think. And it’s not the high quality data or the multiple sources. It’s the “processing” part.
It’s a natural human assumption to imagine that a thinking machine with access to a huge repository of data would have little trouble providing useful and correct answers. But the mistake here is in treating these things as thinking machines.
That’s understandable. A multi-billion dollar propaganda machine has been set up to sell you that lie.
In reality, LLMs are word prediction machines. They try to predict the words that would likely follow other words. They’re really quite good at it. The underlying technology is extremely impressive, allowing them to approximate human conversation in a way that is quite uncanny.
But what you have to grasp is that you’re not interacting with something that thinks. There isn’t even an attempt to approximate a mind. Rather, what you have is a confabulation engine: a machine for producing plausible fictions. It does this by representing words as points in unbelievably huge vector spaces - thousands of axes at once, many times more than we have letters in the alphabet - and learning, across billions of parameters, which words are probabilistically associated with which. It’s all very clever, but what it produces is 100% fake, made up, totally invented.
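If you want a feel for how mechanical that is, here’s a toy sketch - a hypothetical, hand-made word table, nothing to do with any real model - of what “predicting the next word” amounts to:

```python
import random

# Toy sketch: a hand-made "model" that, given the last word, assigns
# probabilities to possible next words. A real LLM learns these
# probabilities over tens of thousands of tokens and billions of
# parameters, but the job is the same: score likely continuations.
NEXT_WORD_PROBS = {
    "the":     {"cat": 0.4, "moon": 0.3, "answer": 0.3},
    "cat":     {"sat": 0.7, "is": 0.3},
    "moon":    {"is": 0.6, "landing": 0.4},
    "answer":  {"is": 1.0},
    "sat":     {"down": 1.0},
    "is":      {"made": 0.5, "correct": 0.5},
    "landing": {"happened": 1.0},
    "made":    {"up": 1.0},
}

def generate(start: str, max_words: int = 6) -> str:
    """Keep picking a plausible next word until we run out of options."""
    words = [start]
    while len(words) < max_words:
        options = NEXT_WORD_PROBS.get(words[-1])
        if not options:
            break
        choices, weights = zip(*options.items())
        # Sample in proportion to the probabilities: plausibility, not truth.
        words.append(random.choices(choices, weights=weights)[0])
    return " ".join(words)

print(generate("the"))  # e.g. "the moon is made up" - fluent, not fact-checked
```

Nothing in that loop knows or cares whether “the moon is made up” is true; it only knows which words tend to go together. Scale the table up to billions of learned parameters and you have the general idea.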
Now, because of the training data they’ve been fed, those made-up answers will, depending on the question, sometimes end up being right. For certain types of question they can actually be right quite a lot of the time. For other types, almost never. But the point is, they’re only ever right by accident. The “AI” is always, always constructing a fiction. That fiction just sometimes aligns with reality.
I assume by “thinking engine” you mean “Reasoning AI”.
Reasoning AI is just more bullshit. What actually happens is that the model generates a “chain of thought” the same way it generates everything else - by guessing at a sequence of words that is statistically adjacent to the input it’s given - and then generates the final answer as a continuation of that chain of thought. It’s pure statistical word association all the way down; the “reasoning” is just more predicted text, not a record of any thinking. There’s a little extra stuff going on where they sort of check their own output, but in essence that’s just done by running the model multiple times and picking the answer the runs converge on. So, just weighting the randomness, basically.
I’m simplifying a lot here obviously, but that’s pretty much what’s going on.
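To make the “run it several times and pick what they converge on” bit concrete, here’s a toy sketch of that kind of majority voting (ask_model is a made-up stand-in for any LLM call, not a real API):

```python
import random
from collections import Counter

def ask_model(question: str) -> str:
    """Made-up stand-in for one LLM call. Real models sample their output,
    so asking the same question repeatedly can give different answers."""
    return random.choice(["42", "42", "42", "41", "7"])

def pick_convergent_answer(question: str, samples: int = 10) -> str:
    """Ask the same question several times and keep the most common answer.
    This is majority voting over random guesses, not verification."""
    answers = [ask_model(question) for _ in range(samples)]
    return Counter(answers).most_common(1)[0][0]

print(pick_convergent_answer("What is six times seven?"))  # usually "42" - by consensus, not by checking
```

Voting like that makes the output more stable, but it’s still just the most popular guess; at no point does anything check the answer against reality.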