Blocking AI crawlers on the fediverse

cecep@fedia.io · 9 months ago

Blocking AI crawlers on the fediverse

will_a113@lemmy.ml · 9 months ago

I wonder if content should carry some license automatically. Like, if you agree to the TOS of an instance, your comments are all automatically licensed as CC BY or CC0, or whatever more restrictive license the instance owner chooses.

hollyberries@programming.dev · 9 months ago

There’s someone running around Lemmy with a Creative Commons ShareAlike link as a signature. Quite funny, to be honest. I can’t remember the username, though. They’re bound to show up sooner or later :)

Rentlar@lemmy.ca · 9 months ago

Oh yeah it was @onlinepersona@programming.dev

You go, champ! If an AI starts ending its posts with a CC BY-NC-SA license, I know who to credit!

onlinepersona@programming.dev · 9 months ago

You’re welcome

CC BY-NC-SA 4.0

ArbitraryValue@sh.itjust.works · 9 months ago

I don’t think that would make much of a difference. Training AI on copyright-protected data appears to be fair use.

FaceDeer@kbin.social · 9 months ago

Yup. There are dumps of Reddit’s entire archive of comments and posts available via torrent. I suspect the only reason Reddit’s getting paid for that stuff right now is that licensing it is comparatively cheap legal ass-covering. Anyone who’s a little daring could use the dumps to train an LLM, and if they prep the data well enough it’d be hard to even notice.