• roguetrick
      28
      4 months ago

      Well, they can (and will) still scrape us if they want. Just nobody’s making a buck off of it.

      • @stoly@lemmy.world
        1
        4 months ago

        That’s going to be a lot more work since comments and posts are decentralized here. You can probably easily get some of it but it will be hard to get all of it.

        • roguetrick
          1
          4 months ago

          It’s actually even easier than that. Instead of setting up a tool to make requests to the API, you can just set up a bridge that will dump everything right into your database. The wonders of federation.

          • @LWD@lemm.ee
            1
            4 months ago

            If you can set up a Lemmy instance and apply a little elbow grease to manually follow a few instances, that’s pretty much all you need to have the data come in automatically. You’d probably need more knowledge about how to actually get the data out of the DB than the initial setup, which could be done by somebody just copying and pasting text.

    • The Bard in Green
      12
      4 months ago

      The reality though is I can train LLMs off Lemmy data all I want and I don’t have to pay ANYONE a dime…

    • I wish people like spez and zuck cried themselves to sleep, but those beds of cash are probably pretty comfortable. The only real hope is that they’re pilloried so thoroughly in history books that, at the ends of their lives, they’re bitterly angry at the injustice of how they’ll be remembered. The good news is that this is something the public can influence. The bad news is that 99% of the public don’t give a shit. Musk might be the only one in this crop of unethical sociopaths who might end up railing about his legacy; the rest are just going to get away with raping the public and be generally recognized as being “shrewd business men.” And it’s only the men; the women who do this tend to end more poorly - fired by boards, or spending time in jail.

      • @hangukdise@lemmy.ml
        7
        4 months ago

        America is truly exceptional… Nonagenarian politicians serve as lawmakers of an economy they barely understand, as part of a system of legalized bribery that reinforces their lack of interest in understanding it, while a septuagenarian Supreme Court interprets and applies laws made in the aftermath of the Civil War but is free to bend the meaning of laws as its members’ personal political biases allow, and octogenarian presidents wield extreme unchecked power.

        In this system, laws against abuse of personal information and exploitation of data will only be written in 2080 or later, after many lives of common people are damaged, until it damages the life of a congressman and then change happens.

        • Not for the first time do I wish Lemmy had github-like responses. Up/downvotes are utterly inadequate; why didn’t Lemmy learn this lesson from Reddit?

          Anyway, I love how succinctly you summed up the state we’re in. I’ve joked before that America would be well-served by the introduction of Carousel; I’m well past the Last Day age, but the older I get, the less it becomes a joke to me. It’d be better for the environment, too.

          • @hangukdise@lemmy.ml
            1
            4 months ago

            Oh, the Carousel. Anyway, I just wish voters would vote more consciously, but even that has been rigged so that people vote for those who appeal to their own fears and anger 😞

  • @Substance_P@lemmy.world
    20
    4 months ago

    Brilliant. AI does the heavy lifting, takes data for free, then resells access to it, while those of us who contributed for the last decade don’t get a dime.

    • AwkwardLookMonkeyPuppet
      English
      1
      4 months ago

      Those contributing to it are forced to view ads or pay money for the right to contribute without having ads forced upon them.

  • Gamma
    English
    16
    4 months ago

    Wow, I bet the writing focused communities will love this!

  • Eager Eagle
    English
    13
    4 months ago

    Well, they already made it very clear to everyone back in May that the content created by the community does not belong to the community. Anyone still using that dump deserves to be explored.

    • Zellith
      15
      4 months ago

      Anyone still using that dump deserves to be explored.

      ( ͡° ͜ʖ ͡°)

    • @muhyb@programming.dev
      15
      edited
      4 months ago

      PowerDeleteSuite. I used this when things went hot with Reddit. You can even edit your comments before deleting them; the best part is that you don’t have to delete them. (Hopefully Reddit hasn’t countered this.)

        • AwkwardLookMonkeyPuppet
          English
          2
          4 months ago

          It works, but it takes a long time, and then Reddit un-deletes your comments. Make sure you set it up to edit your comments before deletion. A message like the one in the image is a pretty good choice.

      • @laverabe@lemmy.world
        English
        1
        edited
        4 months ago

        Is there a more effective one, that slowly edits all your comments a little bit at a time so it misses their detection over a period of weeks/months? Like scrambling/nonsense sentences.

        There was a book whose card when blunk when they looked up.

        Like completely nonsensical but a real sentence so it would be hard to detect.
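
        A minimal sketch of that idea, assuming nothing about Reddit's API or its detection: it only builds grammatical nonsense sentences and picks a day offset for each comment so the edits are spread over a period. The word lists and the function names (`nonsense_sentence`, `spread_over_days`) are hypothetical, made up for illustration; actually submitting the edits is left out.

```python
import random

# Hypothetical word lists; any vocabulary would do.
NOUNS = ["book", "card", "lamp", "river", "ticket", "window"]
VERBS = ["blinked", "folded", "wandered", "hummed", "melted"]
ADJECTIVES = ["quiet", "purple", "borrowed", "hollow", "early"]


def nonsense_sentence(rng: random.Random) -> str:
    """Return a grammatical but meaningless sentence."""
    subject = f"{rng.choice(ADJECTIVES)} {rng.choice(NOUNS)}"
    other = rng.choice(NOUNS)
    return f"The {subject} {rng.choice(VERBS)} while the {other} {rng.choice(VERBS)}."


def spread_over_days(n_comments: int, days: int, rng: random.Random) -> list[int]:
    """Pick a day offset (0..days-1) for each comment so edits are spread out."""
    return sorted(rng.randrange(days) for _ in range(n_comments))


if __name__ == "__main__":
    rng = random.Random()  # pass a seed here for repeatable output
    for day in spread_over_days(5, 60, rng):
        print(f"day {day}: {nonsense_sentence(rng)}")
```

        Whether edits like this would actually go unnoticed is an open question; the sketch only shows the scrambling and scheduling part.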

    • @online@lemmy.ml
      English
      1
      4 months ago

      Look at the issues and you will notice it only works on comments visible from the profile page, and that not all of them are visible. It appears that someone made a Python script to solve this problem, but you need an API key to use it.

  • AutoTL;DR
    English
    7
    4 months ago

    This is the best summary I could come up with:


    Reddit will let “an unnamed large AI company” have access to its user-generated content platform in a new licensing deal, according to Bloomberg yesterday.

    The deal, “worth about $60 million on an annualized basis,” the outlet writes, could still change as the company’s plans to go public are still in the works.

    The news also follows an October story that Reddit had threatened to cut off Google and Bing’s search crawlers if it couldn’t make a training data deal with AI companies.

    Last year, it successfully stonewalled its way out of the biggest protest in its history after changes to its third-party API access pricing caused developers of the most popular Reddit apps to shut down.

    As Bloomberg writes, Reddit’s year-over-year revenue was up by 20 percent by the end of 2023, but it was still $200 million shy of a $1 billion target it had set two years prior.

    The company was reportedly advised to seek a $5 billion valuation when it opens up for public investment, which is expected to happen in March.


    The original article contains 346 words, the summary contains 175 words. Saved 49%. I’m a bot and I’m open source!

  • @9point6@lemmy.world
    6
    4 months ago

    Here comes a new wave of users, I guess

    Kinda thought they’d manage to go a bit longer than the few months they did

  • @mrcleanup@lemmy.world
    6
    4 months ago

    Time to delete my old accounts, I guess. Is there a bot that will go through and delete all posts and comments too? That would be helpful.

  • zeluko
    5
    edited
    4 months ago

    I don’t see why someone would need this deal anyway… most of it is already available, and most of the new stuff probably is too, even without API access.
    I also expect the fediverse to be crawled and used for training. That’s just the thing about publicly available stuff: it gets used, whether we like it or not…