If Meta used an illegal source (which is extremely stupid, like using drug money to open a bank), it does not mean Google or OpenAI did the same.
The Meta model is not public, probably for that reason; they just trained it on dirty data for research, to see whether it was feasible.
For fun, I searched for the most obscure and niche recent book that I could think of: 9791280546517 “Vado e tornerò da voi. Riflessioni sulla Pasqua e sulla Pentecoste”. It’s so niche that it’s impossible to find a pirated or even a legit ebook copy. Even though it was published only a few months ago, Bing AI was able to produce an excerpt and even a short review.
The Meta model is not public, probably for that reason; they just trained it on dirty data for research, to see whether it was feasible.
Meta’s LLaMA model actually is publicly available; they released it widely to anyone with a .edu email address, and of course it soon ended up on BitTorrent. Here is the 🧲 link (which you can also, hilariously, still find in this pull request, despite the DMCA takedowns they’ve sent elsewhere about it).