ChatGPT use declines as users complain about ‘dumber’ answers, and the reason might be AI’s biggest threat for the future::AI for the smart guy?

  • mikkL@lemmy.world
    1 year ago

    This was really enlightening. Do you have some articles that elaborate? ☺️

    • Zeth0s@lemmy.world
      1 year ago

      Regarding 3.5 turbo, you can check the documentation: the old 3.5 models are labeled “legacy”. As for the maximum number of tokens of GPT-4, you can test it yourself. It used to be >8k; it is now >4k from the web UI.

      There is a talk by OpenAI's CIO (if I recall correctly) in which he describes how reinforcement learning from human feedback (RLHF) actually decreased the models' performance on programming. I cannot find it right now, but it is on YouTube.

      The additional safeguards against jailbreaking are what OpenAI has been focusing on over the past months, with heavy use of RLHF. You can google the official statements regarding the “safety” of the model. I have a bunch of standard pre-prompts that I have been using to initialize my chats since the beginning, and over time you could see the model following the instructions less strictly.
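
      To make "followed the instructions less strictly" something you can actually log, here is a minimal sketch. Everything in it is an assumption for illustration: the pre-prompt, the three checks, the date, and the `fake_model` stand-in (swap in whatever you use to query the chat). It is not how OpenAI or anyone else measures this.

```python
import json
from typing import Callable

# Hypothetical pre-prompt; the rules are illustrative, not the ones from the comment.
PRE_PROMPT = "Answer in at most 20 words. Do not apologize. End with the word DONE."


def adherence_checks(reply: str) -> dict[str, bool]:
    """Return one pass/fail flag per rule stated in PRE_PROMPT."""
    return {
        "max_20_words": len(reply.split()) <= 20,
        "no_apology": "sorry" not in reply.lower(),
        "ends_with_done": reply.rstrip().endswith("DONE"),
    }


def adherence_score(reply: str) -> float:
    """Fraction of rules the reply followed (1.0 means all of them)."""
    checks = adherence_checks(reply)
    return sum(checks.values()) / len(checks)


def record_run(ask: Callable[[str, str], str], question: str, day: str) -> dict:
    """Query the model once and return a row you can append to a log file."""
    reply = ask(PRE_PROMPT, question)
    return {"date": day, "question": question, "score": adherence_score(reply)}


def fake_model(system: str, user: str) -> str:
    """Stand-in for a real chat call; always breaks the 'no apology' rule."""
    return "Sorry, I think the answer is 4. DONE"


# The date is a made-up example value.
row = record_run(fake_model, "What is 2 + 2?", "2023-07-01")
print(json.dumps(row))
```

      Run the same questions every few weeks and compare the scores. A real setup would need many more prompts and repeated samples, since a single reply is very noisy.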

      The problem with OpenAI is that they never released the exact number of parameters they are using, nor detailed benchmarks. And the benchmarks you find online refer to the APIs, which behave differently from the chat web UI (for instance, you have a longer context, you can set the temperature and system prompt, and they are probably even different models, who knows… Everything is closed).

      Measuring the performance of LLMs is pretty tricky: minimal changes can have big effects (see https://huggingface.co/blog/evaluating-mmlu-leaderboard), and unfortunately I haven’t found good resources to properly track ChatGPT's performance (from the web UI) over time, across iterations.
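
      As a toy illustration of how a minimal change in the evaluation can move the number a lot, here is a sketch that scores the same replies with two different matching rules. The replies and both scoring functions are made up for this example; they are not taken from the linked post or from any real benchmark harness.

```python
import re

# Made-up replies to one multiple-choice question whose correct answer is "B".
REPLIES = ["B", "The answer is B.", "b) Paris", "I believe it's (B)", "A"]
GOLD = "B"


def strict_match(reply: str, gold: str) -> bool:
    """Count the reply as correct only if it is exactly the gold letter."""
    return reply.strip() == gold


def lenient_match(reply: str, gold: str) -> bool:
    """Count the reply as correct if its first standalone A-D letter is the gold one."""
    found = re.search(r"\b([A-D])\b", reply.upper())
    return found is not None and found.group(1) == gold


def accuracy(scorer) -> float:
    """Fraction of REPLIES the given scorer accepts as correct."""
    return sum(scorer(r, GOLD) for r in REPLIES) / len(REPLIES)


print("strict :", accuracy(strict_match))   # 0.2
print("lenient:", accuracy(lenient_match))  # 0.8
```

      Same model output, same question, and the reported accuracy goes from 0.2 to 0.8 depending only on how answers are extracted. That is one reason numbers from different sources are hard to compare.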