ChatGPT use declines as users complain about ‘dumber’ answers, and the reason might be AI’s biggest threat for the future

L4sBot@lemmy.world · 2 years ago

mikkL@lemmy.world · 2 years ago

This was really enlightening. Do you have some articles that elaborate? ☺️

Zeth0s@lemmy.world · edit-2 2 years ago

For 3.5 Turbo, you can check the documentation: the old 3.5 models are labeled “legacy”. As for the max number of tokens of GPT-4, you can try it yourself. It used to be >8k, it is now >4k from the web UI.

There is a talk from OpenAI's CIO (if I recall correctly) where he describes how reinforcement learning from human feedback (RLHF) actually decreased the performance of the models when it comes to programming. I cannot find it now, but it is around on YouTube.

The additional safeguards against jailbreaking are what OpenAI has been focusing on for the past months, with heavy use of RLHF. You can google the official statements regarding the “safety” of the model. I have a bunch of standard pre-prompts I have been using to initialize my chats since the beginning, and over time you could see the model following the instructions less strictly.

The problem with OpenAI is that they never released the exact number of parameters they are using, nor detailed benchmarks. And the benchmarks you find online refer to the APIs, which behave differently from the chat web UI (for instance, you have a longer context, you set the temperature and system prompt, and they are probably even different models, who knows… It is all closed).

Measuring the performance of LLMs is pretty tricky: minimal changes can have big effects (see https://huggingface.co/blog/evaluating-mmlu-leaderboard), and unfortunately I haven’t found good resources to properly track ChatGPT's performance (from the web UI) over time, across iterations.
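
If you want to track this yourself, here is a minimal sketch of how it could be done, not an established benchmark. It assumes you keep a fixed set of prompts with known answers and rerun them periodically. Everything named here (`TEST_CASES`, `score_model`, `record_run`, `stub_model`) is hypothetical, and `stub_model` is a placeholder for however you actually get answers (pasting from the web UI, an API call, etc.). Substring matching is a crude scoring rule chosen only to keep the example short.

```python
from datetime import date
from typing import Callable, Dict, List

# Hypothetical fixed test set: (prompt, text expected in a correct answer).
TEST_CASES = [
    ("What is 17 * 3? Answer with the number only.", "51"),
    ("Name the capital of France in one word.", "Paris"),
]


def score_model(ask_model: Callable[[str], str], pre_prompt: str = "") -> float:
    """Return the fraction of test cases whose answer contains the expected text."""
    passed = 0
    for prompt, expected in TEST_CASES:
        answer = ask_model(pre_prompt + prompt)
        # Case-insensitive substring check; a real evaluation needs stricter scoring.
        if expected.lower() in answer.lower():
            passed += 1
    return passed / len(TEST_CASES)


def record_run(history: List[Dict], score: float, note: str = "") -> None:
    """Append a dated score so runs can be compared across model iterations."""
    history.append({"date": date.today().isoformat(), "score": score, "note": note})


def stub_model(prompt: str) -> str:
    # Placeholder standing in for the real model; replace with your own lookup.
    return "51" if "17 * 3" in prompt else "I am not sure."


history: List[Dict] = []
record_run(history, score_model(stub_model), note="stub run")
print(history)
```

Keeping the prompts and the pre-prompt identical between runs matters, since small formatting changes alone can shift the score.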

mikkL@lemmy.world · 2 years ago

Thank you for the detailed reply 👍🏻