Tech experts are starting to doubt that ChatGPT and A.I. ‘hallucinations’ will ever go away: ‘This isn’t fixable’::Experts are starting to doubt it, and even OpenAI CEO Sam Altman is a bit stumped.

  • kromem@lemmy.world
    11 months ago

    This is a common misconception that I’ve even seen from people who have a background in ML but just haven’t been keeping up to date on the emerging research over the past year.

    If you’re interested in the topic, this article from a joint MIT/Harvard team of researchers on their work looking at what a toy model of GPT would end up understanding in its neural network might be up your alley.

    The TLDR is that it increasingly seems that once a network reaches a certain complexity, the emergent version that best predicts text is not one that simply maps some sort of frequency table, but one that actually performs more abstracted specialization in line with what generated the original training materials in the first place.
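
    For contrast, here is a minimal sketch of what "simply mapping a frequency table" could look like: a bigram counter that predicts the next word purely from how often it followed the previous one. The function names and the tiny corpus are made up for illustration and are not taken from the linked article.

```python
from collections import Counter, defaultdict

def build_bigram_table(tokens):
    """Count how often each token follows each other token."""
    table = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        table[prev][nxt] += 1
    return table

def predict_next(table, prev):
    """Return the most frequent follower of `prev`, or None if `prev` was never seen."""
    followers = table.get(prev)
    if not followers:
        return None
    return followers.most_common(1)[0][0]

# Toy corpus, purely illustrative.
corpus = "the cat sat on the mat and the cat slept".split()
table = build_bigram_table(corpus)

print(predict_next(table, "the"))  # most frequent word seen after "the"
print(predict_next(table, "dog"))  # unseen context: the table has nothing to offer
```

    A table like this has no answer for any context it has not literally seen, which is the limitation the research above suggests larger networks get past.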

    So while yes, it trains on being the best to predict text, that doesn’t mean the thing that best does that can only predict text.

    You, homo sapiens, were effectively trained across many rounds of “don’t die and reproduce.” And while you may be very good at doing that, you picked up a lot of other skills along the way as complexity increased which helped accomplish that result, like central air conditioning and Netflix to chill with.

    • dudeami0@lemmy.dudeami.win
      11 months ago

      In my humble opinion, we too are simply prediction machines. The main difference is how efficient our brains are, for their size and energy requirements, at the large number of tasks they have to accomplish. No matter how complex the network is, it is still a mapped outcome; the number of factors weighed is just extremely large, and that gives a more intelligent response. You can see this with each increment in GPT models, which use larger and larger parameter sets and give more and more intelligent answers. The fact that we call these “hallucinations” shows how effective the predictive math is: it mimics the human ability to make things up on the fly when we don’t have a solid knowledge base to back it up.

      I do like this quote from the linked paper:

      “As we will discuss, we find interesting evidence that simple sequence prediction can lead to the formation of a world model.”

      That is to say, you don’t need complex solutions to map complex problems; you just need to have learned how you got there. These are never purely random attempts at the problem. They are always predictive attempts that try to map the expected outcomes and learn by getting it right and wrong.

      “At this point, it seems fair to conclude the crow is relying on more than surface statistics. It evidently has formed a model of the game it has been hearing about, one that humans can understand and even use to steer the crow’s behavior.”

      Which is to say that it has a predictive model based on previous games. This does not mean it must rigidly follow previous games, but that by playing many games it can see how each move affects the next. This is a simpler example, because most board games are simpler than language and have fewer possible outcomes. This isn’t to say that the crow is now a grandmaster at the game, but it has the reasoning to understand possible next moves, knows which moves are illegal, and knows to take the most advantageous move based on its current model. This is all predictive in nature, with “illegal” moves being assigned a very low probability because the model has learned that those moves never happen. This also allows for possible unknown moves that a different model wouldn’t consider, but overall it provides what is statistically the best move based on its model. This allows the crow to be placed into unknown situations and give an intelligent response instead of just going “I don’t know this state, I’ll do something random”. This does not mean the prediction is always correct, but it will most likely be a legal move and, more often than not, a statistically sound one.
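
      As a rough sketch of what I mean by illegal moves getting a very low probability: suppose a trained model outputs a score for every move in some board state. The scores and move names below are invented for illustration and do not come from the article. A softmax turns the scores into probabilities, and the move the model has learned never happens ends up with almost no weight.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits.values())  # subtract the max for numerical stability
    exps = {move: math.exp(score - m) for move, score in logits.items()}
    total = sum(exps.values())
    return {move: e / total for move, e in exps.items()}

# Hypothetical scores for one board state; "a1" stands in for an illegal move
# that the model has learned to score very low.
logits = {"d3": 2.0, "c4": 1.5, "f5": 0.5, "a1": -8.0}

probs = softmax(logits)
best_move = max(probs, key=probs.get)

for move, p in sorted(probs.items(), key=lambda kv: -kv[1]):
    print(f"{move}: {p:.5f}")
print("chosen move:", best_move)
```

      Nothing here hard-codes the rules. The illegal move is not forbidden, it is just so unlikely that it is effectively never picked.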

      Overall, we aren’t totally sure what “intelligence” is; we are just an organism that has developed more and more capabilities to process information based on a need to survive. But getting down to it, we know each neuron takes inputs and gives an output based on what it perceives to be the best response for the given input, and when enough of these are added together we get “intelligence”. In my opinion it’s still all predictive; it’s how the networks are trained and gain meaning from the data that isn’t always obvious. It’s only when you blindly accept any answer as correct that you run into the issues we’ve seen with ChatGPT.

      Thank you for sharing the article; it was interesting and helped clarify my understanding of the topic.