“* People ask LLMs to write code
* LLMs recommend imports that don’t actually exist
* Attackers work out what these imports’ names are, and create & upload them with malicious payloads
* People using LLM-written code then auto-add malware themselves”
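
To make the quoted attack chain concrete, here is a minimal sketch (Python against PyPI’s public JSON API, which returns 404 for unregistered names) of checking whether an LLM-suggested package name even exists before installing it. The names in `suggested` are hypothetical stand-ins for LLM output.

```python
# Minimal sketch: check whether each package an LLM suggested is even
# registered on PyPI before installing it. The names in `suggested`
# are hypothetical stand-ins for LLM output.
import urllib.error
import urllib.request

def exists_on_pypi(package):
    """True if PyPI's JSON API knows the name, False on a 404."""
    url = f"https://pypi.org/pypi/{package}/json"
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.status == 200
    except urllib.error.HTTPError as err:
        if err.code == 404:  # name not registered on PyPI at all
            return False
        raise

suggested = ["requests", "totally-made-up-http-lib"]  # hypothetical LLM output
for name in suggested:
    verdict = "exists" if exists_on_pypi(name) else "not on PyPI"
    print(f"{name}: {verdict}")
```

Note the limit of such a check: it only catches names nobody has registered. The attack described above works precisely because attackers register the hallucinated names themselves, after which a bare existence check passes.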

  • Waltekin@kbin.social · 1 year ago

    What so many people don’t understand: LLMs like ChatGPT are nothing but statistical engines. They break their incoming text into tokens and see which tokens usually follow which others. When they generate output, they just roll the dice: after tokens A, B, and C, a D usually comes next.
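
    A toy illustration of that dice roll, with a made-up corpus (real models use vastly larger data and longer contexts): count which token tends to follow a given context, then sample the next token weighted by those counts.

    ```python
    # Toy version of "after tokens A, B, and C, a D usually comes next":
    # count which token follows each context in a tiny corpus, then roll
    # weighted dice. The corpus is made up for the example.
    import random
    from collections import Counter, defaultdict

    corpus = "import requests import numpy import requests import fakelib".split()

    follows = defaultdict(Counter)  # context token -> Counter of what came next
    for prev, nxt in zip(corpus, corpus[1:]):
        follows[prev][nxt] += 1

    def next_token(context):
        tokens, weights = zip(*follows[context].items())
        return random.choices(tokens, weights=weights)[0]  # the dice roll

    print(next_token("import"))  # usually "requests"; sometimes "numpy" or "fakelib"
    ```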

    The point is: they have no understanding. If their training data included a good code example, they might regurgitate it. If their training data included broken code, they might regurgitate that. Or they could mix it all together and produce something weird. It’s a lottery, based on what they sucked out of StackOverflow and other places.