Multiple LLMs voting together on content validation catch each other’s mistakes to achieve 95.6% accuracy.

Lugh@futurology.today · 26 days ago

DavidGarcia@feddit.nl · 26 days ago

For the small ones running on GPUs, a couple hundred watts while generating. For the large ones, somewhere between 10 and 100 times that.

With specialty hardware, maybe 10x less.

Pennomi@lemmy.world · 26 days ago

A lot of the smaller LLMs don’t require a GPU at all; they run just fine on a normal consumer CPU.

copygirl@lemmy.blahaj.zone · 26 days ago

Wouldn’t running on a CPU (while possible) make it less energy efficient, though?

Pennomi@lemmy.world · 25 days ago

It depends. A lot of LLMs are memory-constrained. If you’re constantly thrashing the GPU memory it can be both slower and less efficient.

DavidGarcia@feddit.nl · 24 days ago

Yeah, but 10x slower, at speeds that just don’t work for many use cases. When you compare energy consumption per token, there isn’t much difference.
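To make the per-token comparison concrete, here is a rough sketch of the arithmetic: energy per token is just average power draw divided by throughput. The function name and all the wattage and tokens-per-second figures below are made-up illustrative assumptions, not measurements of any real hardware.

```python
def joules_per_token(power_watts: float, tokens_per_second: float) -> float:
    """Energy per generated token = average power draw / throughput."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return power_watts / tokens_per_second


# Hypothetical figures for illustration only (not benchmarks):
# a GPU drawing 250 W at 50 tokens/s, and a CPU drawing 25 W at 5 tokens/s.
gpu = joules_per_token(power_watts=250.0, tokens_per_second=50.0)
cpu = joules_per_token(power_watts=25.0, tokens_per_second=5.0)

print(f"GPU: {gpu:.1f} J/token, CPU: {cpu:.1f} J/token")
```

With these assumed numbers, a device that is 10x slower but draws 10x less power ends up at the same energy per token, so the practical difference is mostly the speed. Real results would depend on the actual power draw and throughput you measure.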

kippinitreal@lemmy.world · 26 days ago

Good god. Thanks for the info.