GenAI tools ‘could not exist’ if firms are made to pay copyright

  • Valen
    English
    120
    5 months ago

    So they’re admitting that their entire business model requires them to break the law. Sounds like they shouldn’t exist.

    • @Even_Adder@lemmy.dbzer0.com
      English
      48
      5 months ago (edited)

      It likely doesn’t break the law. You should check out this article by Kit Walsh, a senior staff attorney at the EFF, and this one by Katherine Klosek, the director of information policy and federal relations at the Association of Research Libraries.

      Headlines like these let people assume that it’s illegal, rather than educate people on their rights.

      • @jacksilver@lemmy.world
        English
        2
        5 months ago

        The Kit Walsh article purposefully handwaves around a couple of points that could become bigger problems as lawsuits in this arena continue.

        1. The article says that, given the size of the training data and the model, only about a byte of data per image could be stored in any compressed format, but this assumes all training data is treated equally. It’s very possible that certain images are compressed/stored in the weights more than others.

        2. These models don’t produce exact copies. Beyond the Getty issue, nytimes recently released an article about a near duplicate - https://www.nytimes.com/interactive/2024/01/25/business/ai-image-generators-openai-microsoft-midjourney-copyright.html.

        I think some of the article’s points are valid, but it makes a lot of assumptions about what is actually going on in these models, which we either don’t know for certain or have evidence against.

        I didn’t read Katherine’s article so maybe there is something more there.

          • @jacksilver@lemmy.world
            English
            2
            5 months ago

            I’m not sure she does. I just read the article, and it focuses primarily on what models can train on. However, the real meat of the issue with GenAI, at least I think, is what it produces.

            For example, if I built a model that just spit out exact frames from “Space Jam”, I don’t think anyone would argue that wouldn’t be a problem. The question is: where is the line?

            • @Even_Adder@lemmy.dbzer0.com
              English
              2
              5 months ago (edited)

              This part does:

              It’s not surprising that the complaints don’t include examples of substantially similar images. Research regarding privacy concerns suggests it is unlikely that a diffusion-based model will produce outputs that closely resemble one of the inputs.

              According to this research, there is a small chance that a diffusion model will store information that makes it possible to recreate something close to an image in its training data, provided that the image in question is duplicated many times during training. But the chances of an image in the training data set being duplicated in output, even from a prompt specifically designed to do just that, is literally less than one in a million.

              The linked paper goes into more detail.

              On the note of output, I think you’re responsible for infringing works, whether you used Photoshop, copy & paste, or a generative model. Also, specific instances will need to be evaluated individually, and there might be models that don’t qualify. Midjourney’s new model is so poorly trained that it’s downright easy to get these bad outputs.

              • @jacksilver@lemmy.world
                English
                1
                5 months ago

                This goes back to my previous comment about handwaving away the details. There is a model out there that is clearly reproducing copyrighted material almost identically (the nytimes article), and we also have issues with models spitting out training data: https://www.wired.com/story/chatgpt-poem-forever-security-roundup/. Clearly, the people studying these models don’t fully know what is actually possible.

                Additionally, it only takes one instance to show that these models, in general, can and do have issues with regurgitating copyrighted data. Whether that passes the bar for legal consequences we’ll have to see, but I think it’s dangerous to take at face value a couple of statements made by people who don’t seem to understand the unknowns in this space.

                • @FatCrab@lemmy.one
                  English
                  4
                  5 months ago

                  The ultimate issue is that the models don’t encode the training data in any way that we have historically considered infringement of copyright. This is true for both transformer architectures (GPT) and diffusion ones (most image generators). From a lay perspective, it’s probably good and relatively accurate for our purposes to imagine the models themselves as enormous nets that learn vague, muddled impressions of multiple portions of multiple pieces of the training data at arbitrary locations within the net. Now, this may still have IP implications for the outputs, and here music copyright is pretty instructive, albeit very case-by-case. If a piece is too “inspired” by a particular previous work, even if it is not explicit copying, it may still be regarded as infringement of copyright. But, like I said, this is very case-specific and precedent cuts both ways on it.

                • @Even_Adder@lemmy.dbzer0.com
                  English
                  1
                  5 months ago

                  The article dealt with Stable Diffusion, the only open model that allowed people to study it. If there were more problems with Stable Diffusion, we’d’ve heard of them by now. These are the critical solutions Open-source development offers here. By making AI accessible, we maximize public participation and understanding, foster responsible development, as well as prevent harmful control attempts.

                  As it stands, she was much better informed than you are and is an expert in law to boot. On the other hand, you’re making a sweeping generalization right into an appeal to ignorance. It’s dangerous to assert a proposition just because it has not been proven false.

    • @Marcbmann@lemmy.world
      English
      38
      5 months ago

      Reproduction of copyrighted material would be breaking the law. Studying it and using it as reference when creating original content is not.

      • @1Fuji2Taka3Nasubi@lemmy.zip
        English
        8
        5 months ago

        Reproduction of copyrighted material would be breaking the law. Studying it and using it as reference when creating original content is not.

        I’m curious why we think otherwise when it is a student obtaining an unauthorized copy of a textbook to study, or researchers getting papers from sci-hub. Probably because it benefits corporations and they say so?

        • @hglman@lemmy.ml
          English
          11
          5 months ago

          So if a tool is involved, it’s no longer ok? So, people with glasses cannot consume copyrighted material?

        • @hedgehog@ttrpg.network
          English
          7
          5 months ago

          Copyright can only be granted to works created by a human, but I don’t know of any such restriction for fair use. Care to share a source explaining why you think only humans are able to use fair use as a defense for copyright infringement?

          • @phdepressed@sh.itjust.works
            English
            -6
            5 months ago

            Because a human has to use talent+effort to make something that’s fair use. They adapt a product into something that, while similar, is noticeably different. With AI:

            1. It will make things that are not just similar but not noticeably different.

            2. There’s no effort in creation. There’s human thought behind a prompt, but not in the AI following it.

            3. If allowed to, AI companies will basically copyright everything…

        • @LainTrain@lemmy.dbzer0.com
          English
          3
          5 months ago

          What’s the difference? Humans are just the intent suppliers, the rest of the art is mostly made possible by software, whether photoshop or stable diffusion.

        • @Marcbmann@lemmy.world
          English
          0
          5 months ago

          I don’t agree. The publisher of the material does not get to dictate what it is used for. What are we protecting at the end of the day and why?

          In the case of a textbook, someone worked hard to explain certain materials in a certain way to make the material easily digestible. They produced examples to explain concepts. Reproducing and disseminating that material would be unfair to the author who worked hard to produce it.

          But the author does not have jurisdiction over the knowledge gained. They cannot tell the reader that they are forbidden from using the knowledge gained to tutor another person in calculus. That would be absurd.

          IP law protects the works of the creator. The author of a calculus textbook did not invent calculus. As such, copyright law does not apply.

      • @wewbull@feddit.uk
        English
        1
        5 months ago

        The model itself is a derivative work. Its existence is what is under dispute. It’s not about using the model to produce further works.

        • @Marcbmann@lemmy.world
          English
          4
          5 months ago

          Then every single student graduating college produces derivative work.

          Everything that required the underlying knowledge gained from the textbooks studied, or research papers read, is derivative work.

          At the core of this, what are we saying? Your machine could only explain calculus because it was provided information from multiple calculus textbooks? Well, that applies to literally everyone.

      • @Telodzrum@lemmy.world
        English
        15
        5 months ago

        This ruling only applies to the 2nd Circuit, and SCOTUS has yet to take up a case. As soon as there’s a good fact pattern for the Supreme Court or a circuit split, you’ll get nationwide information. You’ll also note that the decision is deliberately written to provide an extremely narrow precedent and is likely restricted to Google Books and near-identical sources of information.

        • @hedgehog@ttrpg.network
          English
          5
          5 months ago

          Has there been any US ruling stating something along the lines of “The training of general purpose LLMs and/or image generation AIs does not qualify as fair use,” even in a lower court?

        • @Eccitaze@yiffit.net
          English
          1
          5 months ago

          Hell, that article is also all about Google Books, which is an entirely different beast from generative AI. One of the key points from the circuit judge was that Google Books’ use of copyrighted material “…[maintains] respectful consideration for the rights of authors and other creative individuals, and without adversely impacting the rights of copyright holders.” The appeals court, in upholding the ruling that Google Books’ use of copyrighted content is fair use, ruled “the revelations do not provide a significant market substitute for the protected aspects of the originals.”

          If you think that gen AI doesn’t provide a significant market substitute for the artwork created by the artists and authors used to train these models, or that it doesn’t adversely impact their rights, then you’re utterly delusional.

    • iquanyin
      English
      1
      5 months ago

      i don’t think it’s need rules against the law…

  • Dran
    English
    56
    5 months ago (edited)

    you know what? I like this argument. Software/Streaming services are “too complex and costly to work in practice” therefore my viewership/participation “could not exist” if I were forced to pay for them.

  • Queen HawlSera
    English
    55
    5 months ago

    I do love how AI has gotten Corporate Giants to start attacking the Copyright System they’ve used to beat down the little man for generations

    • Dr. Moose
      English
      12
      5 months ago

      Maybe because it’s not the same corporations? We might be seeing a giant powershift from IP hoarders to makers.

      • @wewbull@feddit.uk
        English
        16
        5 months ago

        Makers use the copyright system to their advantage as well though. If I write code and place it on github, the only thing stopping a mega corp stealing it is the copyright I hold.

        Abolishing copyright is not a win.

        • Dr. Moose
          English
          15
          5 months ago (edited)

          Let’s not kid ourselves that the copyright is stopping mega corporations from stealing your github code.

          What’s stopping them from hiring an engineer that basically rewrites your code? No one would ever know.

          Copyleft enforcement is laughable at best, and that’s with legitimate nonprofits (like the FSF) working on it, and only when it comes to direct library use without modifications. There’s basically no history of prosecution or penalties for partial code copying (nor should there be, imo), even when 1:1 code has been found!

          I feel like copyright has been doing very little in the modern age, and I have yet to see any science that contradicts my opinion here. Most copyright holders (like high 90%) are mega corporations like Getty Images that hardly contribute back to society.

        • @sir_reginald@lemmy.world
          English
          4
          5 months ago (edited)

          it definitely is, tho.

          of course they could steal your code, but you could steal theirs. you could steal their software. you could steal their paywalled articles. you could steal all those things that are affected by artificial scarcity. They have much more to lose if copyright gets abolished.

          In an ideal world, I’d make copyright to be 5 years for individuals and abolished for corporations. But the world is far from ideal and individuals have much to win if copyright gets abolished as a whole.

        • @Doomsider@lemmy.world
          English
          -2
          5 months ago

          Science and the arts existed just fine without Intellectual Property. Abolishing copyright would be a huge win for society as a whole and a minor loss to corporations.

  • @flop_leash_973@lemmy.world
    English
    38
    5 months ago

    Not that I am a fan of the current implementation of copyright in the US, but I know that if I were planning on building my business around something that couldn’t exist without violating copyright, I would surely have thought of that fairly early on.

    • @beebarfbadger@lemmy.world
      English
      19
      5 months ago

      “My profits from fencing your wallet could not exist if stealing your wallet were punished.”

      “Ah, you’re right, how silly of me, carry on.”

    • @Even_Adder@lemmy.dbzer0.com
      English
      -1
      5 months ago

      You should check out this article by Kit Walsh, a senior staff attorney at the EFF, and this one by Katherine Klosek, the director of information policy and federal relations at the Association of Research Libraries.

      • @nymwit@lemm.ee
        English
        2
        5 months ago

        The LCA principles also make the careful and critical distinction between input to train an LLM, and output—which could potentially be infringing if it is substantially similar to an original expressive work.

        from your second link. I don’t often see this brought up in discussions. The problem of models trained on copyrighted info is definitely different than what you do with that model/output from it. If you’re making money from infringing, the fair use arguments are historically less successful. I have less of an issue with the general training of a model vs. commercial infringing use.

          • @nymwit@lemm.ee
            English
            3
            5 months ago

            I don’t disagree with that statement. I’m having trouble seeing how that fits with what I said, though. Can you elaborate?

            • @Even_Adder@lemmy.dbzer0.com
              English
              -1
              5 months ago

              It doesn’t really, I was just kind of restating what you quoted. Since no one factor of fair use is more important than the others, and it is possible to have a fair use defense even if you do not meet all the criteria of fair use, do you have data to back up your claims about moneymaking infringement?

              • @nymwit@lemm.ee
                English
                2
                5 months ago

                Cool. Thanks. I can see it now. No, not really. It’s just that the pieces I’ve read over time on what wins fair use protection when challenged often discuss the interpretations involved, and profit-making was generally seen as detracting from a fair use defense when the extent of the transformative nature was in question.

                This mentions it, but of course it isn’t data on what has been granted protections vs. denials of protection. Harvard counsel primer on copyright and fair use

                Noncommercial use is more likely to be deemed fair use than commercial use, and the statute expressly contrasts nonprofit educational purposes with commercial ones. However, uses made at or by a nonprofit educational institution may be deemed commercial if they are made in connection with content that is sold, ad-supported, or profit-making. When the use of a work is commercial, the user must show a greater degree of transformation (see below) in order to establish that it is fair.

  • @space@lemmy.dbzer0.com
    English
    35
    5 months ago (edited)

    Another reason why copyright should be shortened… Society has changed massively in the last 100 years, but every expression of our modern society is locked behind copyright.

  • Mr PoopyButthole
    English
    30
    5 months ago

    I keep thinking how great it would be if the federal government made a central server system to access digital content for free via taxes.

    All public domain and publicly funded research and content, all in one place. Could also host owned content for people/entities and pay out royalties automatically based on consumption.

    There are ways to make this fairly affordable to everyone via taxes, but maybe the big opportunity is it could also allow companies to train AI on all the data for a fat, but fair subscription. The value of that could easily pay for enough to shrink any tax costs for the public.

    • @kromem@lemmy.world
      English
      14
      5 months ago (edited)

      In general, if the US government were smart (and not currently tearing itself apart) it would be creating a generative AI public service like the postal service, potentially even relying on public government documents and the library system for training.

      Offer it effectively at cost for the public to use. It would drive innovation and development, nothing produced by it would be copyrightable, and it would put pressure on private options to compete against it.

      We can still have the FedEx or DHL of gen AI out there, but they would need to contend with the public option being cheaper and more widely available for use.

      • @Goldmage263@sh.itjust.works
        English
        3
        5 months ago

        In addition to the US government actually needing to do work, the senators would need to understand how to turn a computer on and off.

      • Queen HawlSera
        English
        -4
        5 months ago (edited)

        Yeah, but the attempts to kill off the USPS have been surprisingly bi-partisan

        I literally don’t think the US will ever see a new socialized industry ever happen

        Edited for clarity, to make it more obvious I’m not blaming just one party for this

  • @InvaderDJ@lemmy.world
    English
    19
    5 months ago

    I’d be fine with this argument if these generative tools were only being used by non-profits. But they aren’t.

    So I think there has to be some compromise here. Some type of licensing fee should be paid by these generative AI tools.

    • @LarmyOfLone@lemm.ee
      English
      5
      5 months ago

      You’re basically arguing for making any free use of them illegal, thereby giving a monopoly to the richest and most powerful capitalists.

      Humans won’t be able to compete, and you won’t be able to use the means of generation either.

      • @InvaderDJ@lemmy.world
        English
        4
        5 months ago

        I’m arguing for free commercial use being illegal, absolutely.

        And that fee should scale based on who is using it for commercial purposes. Microsoft and Google should be paying far, far out the ass for their data.

          • @InvaderDJ@lemmy.world
            English
            1
            5 months ago

            Do you mean the whole thing or something specific like free commercial use being illegal?

            I think the answer to both is the people who created the art, text, etc. that these generative AI tools are going to make mostly obsolete.

            • @LarmyOfLone@lemm.ee
              English
              2
              5 months ago (edited)

              Open source or open use AI will be practically illegal. Research will be practically impossible. It will be exclusively controlled by super rich and powerful corporations.

              It won’t benefit the creators. It will only benefit those with the most capital that can buy up the training data needed and then can set the market so they make almost all of the money. For example you’d need to buy all the user content from reddit, facebook and twitter to train an AI. That will cost many millions because it’s a precious commodity (and only they own it). So only a few will control the “means of generation” and they will (have to) use it to make profit for themselves. This will make it practically illegal to make a free or an independent AI because you don’t have access to training data. This sets the rules and will lead to incredibly bad outcomes. For example anti-consumerist thinking or dissent could be suppressed, or other more subtle biases. Anything that reduces profit from advertising or threatens the shareholders. And they can manipulate the training data behind closed doors.

              But that is how it’s going to go and it’s going to make the effects of AI generation on our civilization extra bad. We are so fucked :(

  • @General_Effort@lemmy.world
    English
    19
    5 months ago (edited)

    So… This may be an unpopular question. Almost every time AI is discussed, a staggering number of posts support very right-wing positions, e.g., on topics like this one: unearned money for capital owners. It’s all Ayn Rand and not Karl Marx. Posters seem to be unaware of that, though.

    Is that the “neoliberal Zeitgeist” or what you may call it?

    I’m worried about what this may mean for the future.

    ETA: 7 downvotes after 1 hour with 0 explanation. About what I expected.

    • @fhqwgads@possumpat.io
      English
      9
      5 months ago

      I think it’s a conflation of the ideas of what copyright should be and actually is. I don’t tend to see many people who believe copyright should be abolished in its entirety, and if people write a book or a song they should have some kind of control over that work. But there’s a lot of contention over the fact that copyright as it exists now is a bit of a farce, constantly traded and sold and lasting an aeon after the person who created the original work dies.

      It seems fairly morally consistent to think that something old and part of the zeitgeist should not be under copyright, but that the system needs an overhaul when companies are using your live journal to make a robot call center.

      • @General_Effort@lemmy.world
        English
        3
        5 months ago

        Lemmy seems left-wing on economics in other threads. But on AI, it’s private property all the way, without regard for the consequences on society. The view on intellectual property is that of Ayn Rand. Economically, it does not get further to the right than that.

        My interpretation is that people go by gut feeling and never think of the consequences. The question is, why does their gut give them a far-right answer? One answer is that somehow our culture, at present, fosters such reactions; that it is the zeitgeist. If that’s the truth (and this reflects a wider trend) then inequality will continue to increase as a result of voters’ demands.

        • @hedgehog@ttrpg.network
          English
          5
          5 months ago

          My interpretation is that people go by gut feeling and never think of the consequences.

          Often, yes.

          The question is, why does their gut give them a far-right answer?

          The political right exploits fear, and the fear of AI hits close to home. Many people either have been impacted, could be impacted, or know someone who could be impacted, either by AI itself or by something that has been enabled by or that has been blamed on AI.

          When you’re afraid and/or operating from a vulnerable position, it’s a lot easier to jump on the anti-AI bandwagon. This is especially true when the counter-arguments address their flawed reasoning rather than the actual problems. They need something to fix the problem, not a sound argument about why a particular attempt to do so is flawed. And when this problem is staring you in the face, the implications of what it would otherwise mean just aren’t that important to you.

          People are losing income because of AI and our society does not have enough safety nets in place to make that less terrifying. If you swap “AI” for “off-shore outsourcing” it’s the same thing.

          The people arguing in favor of AI don’t have good answers for them about what needs to happen to “fix the problem.” The people arguing against AI don’t need to have sound arguments to appeal to these folks since their arguments sound like they could “fix the problem.” “If they win this lawsuit against OpenAI, ChatGPT and all the other LLMs will be shut down and companies will have to hire real people again. Anthropic even said so, see!”

          UBI would solve a lot of the problems, but it doesn’t have the political support of our elected officials in either party and the amount of effort to completely upend the makeup of Congress is so high that it’s obviously not a solution in the short term.

          Unions are a better short-term option, but that’s still not enough.

          One feasible solution would be legislation restricting or taxing the use of AI by corporations, particularly when that use results in the displacement of human laborers. If those taxes were then used to support those same displaced laborers, then that would both encourage corporations to hire real people and lessen the sting of getting laid off.

          I think another big part of this is that there’s a certain amount of feeling helpless to do anything about the situation. If you can root for the folks with the lawsuit, then that’s at least something. And it’s empowering to see that people like you - other writers, artists, etc. - are the ones spearheading this, as opposed to legislators.

          But yes, the more that people’s fear is exploited and the more that they’re misdirected when it comes to having an actual solution, the worse things will get.

          • @General_Effort@lemmy.world
            English
            1
            5 months ago

            The fear angle makes a lot of sense, but I wonder how many people are really so immediately threatened that it would cloud their judgment.

            • @hedgehog@ttrpg.network
              English
              1
              5 months ago

              Well, when you consider that more than 60% of Americans are living paycheck to paycheck - I’d say a lot of them.

        • @LainTrain@lemmy.dbzer0.com
          English
          3
          5 months ago

          Yeah I think that this is showing a lot of people only really care about espousing anti-privatization ideas as long as it suits their personal interests and as long as they feel they have more to gain than to lose. People are selfish, and a lot of progressive, or really any kind of passionate rhetoric is often conveniently self-serving and emotionally driven, rather than truly principled.

          • @General_Effort@lemmy.world
            English
            0
            5 months ago

            You’re not wrong, but how many people here are actually pursuing their own personal interest? Most people here are probably wage-earners. Yet so many people support giving more money to property owners without any kind of requirement or incentive for work. Just a rent for property owners. It feels like this should be met with knee-jerk rejection.

        • @wewbull@feddit.uk
          English
          1
          5 months ago

          I’m not sure what you’re referring to as a far-right position?

          • AI corporations should have the right to all works in order to train their AIs.
          • Copyright needs to be enforced.

          The first is very pro-corporation in one way, but can lead to an argument for all intellectual works to be public domain.

          The second is pro-mega-rights-owners, but also allows someone to write a story, publish it themselves, and make money without having it stolen from them.

          • @General_Effort@lemmy.world
            English
            1
            5 months ago

            Fair use has always been a thing in the US.

            The US Constitution allows Congress to limit the freedom of the press with these words: To promote the Progress of Science and useful Arts, by securing for limited Times to Authors and Inventors the exclusive Right to their respective Writings and Discoveries.

            This has no room for fuck you, I got mine. Framing the abolition of fair use as enforcing copyright is an absolute lie.

            The view of copyright as some sort of absolute property right that can be exercised against the public is a far right position. (I’d argue that’s true for all property rights but that’s a different subject.) What makes it far right is that it implies unfettered, heritable power for a small elite. Saying that everyone has an equal right to property, as such, is so inane that it is worthy only of ridicule. The law, in its majestic equality, forbids rich and poor alike to sleep under bridges, to beg in the streets, and to steal their bread.

            The NYT is suing for money. It owns the copyright to all those articles published in the last century, all already paid for. Every cent of licensing fees is pure profit for the owners; beautiful shareholder value. Benefit to society? Zero. But you have to enforce copyright. It’s property! You wouldn’t want some corporation to steal the cardboard boxes of the homeless.

    • JackGreenEarth
      link
      fedilink
      English
      65 months ago

      I see way too many people advocating for copyright. I understand in this case it benefits big companies rather than consumers, but if you disagree with copyright, as I do, you should be consistent.

      • Optional
        link
        fedilink
        English
        75 months ago

        Copyright law should benefit humans, not machines, not corporations. And no, corporations are not people. Anthony Kennedy can get bent.

      • @General_Effort@lemmy.world
        link
        fedilink
        English
        55 months ago

        You don’t have to be against copyright, as such. Fair Use is part of copyright law. It exists to prevent copyrights from being abused against the interests of the general public.

        • JackGreenEarth
          link
          fedilink
          English
          15 months ago

          But I am against any copyright beyond forcing attribution to the original creator.

            • JackGreenEarth
              link
              fedilink
              English
              25 months ago

              AI creators, at least the open source ones, are usually pretty open about where they got the training data for their model

            • @Zoboomafoo@slrpnk.net
              link
              fedilink
              English
              25 months ago

              Here’s your works cited for any generative AI:

              Humanity. “The Entire Publicly Accessible Internet.” The World Wide Web, 1 Jan. 1983, WWW.org.

            • @assassin_aragorn@lemmy.world
              link
              fedilink
              English
              05 months ago

              At the very least, every AI should be able to spit out a comprehensive list of all the material it used for training. And it should be capable of removing any specific item and regenerating its algorithm.

              This is a fundamental requirement of the technology itself to function. What happens if one of the training materials has a retraction? Or if the authors admit they used AI to generate it? You need to purge that knowledge to keep the AI healthy and accurate.

    • @kromem@lemmy.world
      link
      fedilink
      English
      5
      edit-2
      5 months ago

      It’s interesting, as these are many of the same attitudes the MPAA/RIAA had towards Napster/BitTorrent, but now directed at gen AI.

      I think it reflects the generational shift in who considers themselves content creators. Tech allowed for the long tail to become profitable content producers, so now there’s a large public audience that sees this from what’s historically been a corporate perspective.

      Of course, they are making the same mistakes because they don’t know their own history and thus are doomed to repeat it.

      They are largely unaware that the MPAA/RIAA fighting against online sharing of media meant they ceded the inevitable tech to other companies like Apple and Netflix that developed platforms that navigated the legality alongside the tech.

      So, for example, right now voice actors are largely opposing gen AI rather than realizing they should probably have their union develop, or partner on, their own offering that maximizes member revenue from usage and can dictate fair terms.

      In fact, the only way many of today’s mass content creators have platforms to create content is because the corporate fights to hold onto IP status quo failed with platforms like YouTube, etc.

      Gen AI should exist in a social construct such that it is limited in being able to produce copyrighted content. But policing training/education of anything (human or otherwise) doesn’t serve us and will hold back developments that are going to have much more public good than most people seem to realize.

      Also, it’s unfortunate that we’ve effectively self-propagandized for nearly a century around ‘AI’ being the bad guy: at odds with humanity, misaligned with our interests, an existential threat, etc. There’s such an incredible priming bias right now that it’s effectively become the Boogeyman rather than being correctly identified as a tool that, like every other tool in human history, can be used for good or bad depending on the wielder (though unlike past tools this one may actually have a slight inherent and unavoidable bias towards good, as Musk and Gab recently found out when their AI efforts denounced their creators’ personally held beliefs on release).

    • @CurbsTickle@lemmy.world
      link
      fedilink
      English
      45 months ago

      I’d say the main reason is companies are profiting off the work of others. It’s not some grand positive motive for society, but taking the work of others, from other companies, sure, but also from small time artists, writers, etc.

      Then selling access to the information they took from others.

      I wouldn’t call it a right wing position.

      • @General_Effort@lemmy.world
        link
        fedilink
        English
        25 months ago

        Wanting to abolish the IRS is a right-wing policy that will benefit the rich. That doesn’t change when some marketing genius talks about how the IRS takes money from small time artists, writers, etc. Same thing. It’s about substance and not manipulative framing.

        • @CurbsTickle@lemmy.world
          link
          fedilink
          English
          15 months ago

          That isn’t remotely similar…

          The IRS takes a portion of income. This is taking away someone’s income, then charging access to it.

          Like it or not, these people need money to survive. Calling it right wing to think these individuals deserve to be paid for someone taking their work, then using it for a product they sell access to, is absolutely insane to me.

            • @CurbsTickle@lemmy.world
              link
              fedilink
              English
              15 months ago

              One is a percentage of income that everyone pays into.

              The other is stealing someone’s work then using that person’s work for profit.

              Recognizing that this is stealing someone’s work is not a right-wing position.

              How is this complicated?

              • @General_Effort@lemmy.world
                link
                fedilink
                English
                1
                edit-2
                5 months ago

                I see. Thanks for explaining.

                This view of property rights as absolute is what right-libertarians, anarcho-capitalists, etc… espouse. Usually the cries of “theft” come when it gets to taxes, though. Is it supposed to be not right because it’s about intellectual property?

                Property rights are not necessarily right-wing (communism notwithstanding). What is definitely right-wing is (heritable) privilege and that’s implied in these views of property.

                ETA: Just to make sure that I really understand what you are saying: When you say “stealing someone’s work” you do mean the unauthorized copying of copyrighted expression, yes? Do you actually understand that copyright is intellectual property and that property is not usually called work? Labor and capital are traditionally considered opposites, of a sort, particularly among the left.

                • @CurbsTickle@lemmy.world
                  link
                  fedilink
                  English
                  15 months ago

                  So… You think their art or writing was created by what then? Magic? Do you think no time was expended in the creation of books, research, drawings, painted canvases, etc?

                  Do you think they should starve because we currently live in a world driven entirely around money?

                  I don’t get your point even remotely.

    • nickwitha_k (he/him)
      link
      fedilink
      English
      35 months ago

      Yeah…it’s pretty weird. Feels like some folks have really dived into LLMs regardless of ethics and will do any amount of hand waving to avoid criticism of a for-profit company openly attacking creatives’ livelihood with their own uncompensated works. In an ideal world where it wasn’t a case of “earn or die cold and alone in the streets”, sure, but this is just robbing those workers of the fruits of their labor and burning the ladder while

      I think the “neoliberal zeitgeist” thought may be correct as neoliberal ideology devalues anything and everything that is not solely profit-driven, including just about everything that humans have historically found to make life meaningful.

    • Magical Thinker
      link
      fedilink
      English
      25 months ago

      As an aside, when I browse TheGatewayPundit comments on AI articles, it is a lot more open, against legislation, and woke than I would expect!

    • @Doomsider@lemmy.world
      link
      fedilink
      English
      15 months ago

      Every single poster here has relied on disruptive technologies in their life. They don’t even realize that they couldn’t even make these arguments here if it was not for people before them pushing the envelope.

      They don’t know the history of their technology nor corporate law. If they did they would just roll their eyes every time an entrenched economic interest started saber rattling about the next disruptive technology that is going to steal their profits.

      The posters here are the people who complained about horsewhip manufacturers that were going out of business because of cars. They are ignorant and act like the few sound bites they heard make them experts.

    • @PsychedSy@sh.itjust.works
      link
      fedilink
      English
      05 months ago

      It’s important to recognize that IP is conceptually fucky to begin with. They’re seeing what it’s claimed to be (creator ‘ownership’ of their creations) rather than what it really is (corporations using the government to enact violence on non-violent people).

      It means nothing interesting. The position they feel they’re taking is “corporation bad” which is in line, they just haven’t analyzed how IP works in the real world.

      • @Eccitaze@yiffit.net
        link
        fedilink
        English
        25 months ago

        So because corps abuse copyright, that means I should be fine with AI companies taking whatever I write–all the journal entries, short stories, blog posts, tweets, comments, etc.–and putting it through their model without being asked, and with no ability to opt out? My artist friends should be fine with their art galleries being used to train the AI models that are actively being used to deprive them of their livelihood without any ability to say “I don’t want the fruits of my labor to be used in this way?”

        • @BURN@lemmy.world
          link
          fedilink
          English
          15 months ago

          This is the problem people have

          They don’t see artists and creators as worth protecting. They’d rather screw over every small creator and take away control of their works, just because “it’d be hard to train without copyrighted data”

          Plenty of creators would opt in if given the option, but I’m going to guess a large portion will not.

          I don’t want my works training what will replace me, and right now copyright is the only way we can defend what was made.

          • @Eccitaze@yiffit.net
            link
            fedilink
            English
            25 months ago

            It’s like nobody here actually knows someone who is actually creative or has bothered making anything creative themselves

            I don’t even have a financial interest in it because there’s no way my job could be automated, and I don’t have any chance of making any kind of money off my trash. I still wouldn’t let LLMs train with my work, and I have a feeling that the vast majority of people would do the same

        • @PsychedSy@sh.itjust.works
          link
          fedilink
          English
          05 months ago

          The concept of copyright is insane to begin with. Corps don’t make it bad - it starts out bad.

          It’s an invented right.

        • @General_Effort@lemmy.world
          link
          fedilink
          English
          05 months ago

          I don’t know if your fears about your friends’ livelihood are justified, but cutting down on fair use will not help at all. In fact, it would make their situation worse. Think through what would actually happen.

          When you publish something you have to accept that people will make it their own to some degree. Think parody or R34. It may be hurtful, but the alternative is so much worse.

          • @Eccitaze@yiffit.net
            link
            fedilink
            English
            15 months ago

            Huh? How does that follow at all? Judging that the specific use of training LLMs–which absolutely flunks the “amount and substantiality of the portion taken” (since it’s taking the whole damn work) and “the effect on the market” (fucking DUH) tests–isn’t fair use in no way impacts parody or R34. It’s the same kind of logic the GOP uses when they say “if the IRS cracks down on billionaires evading taxes then Blue Collar Joe is going to get audited!”

            Fuck outta here with that insane clown logic.

            • @General_Effort@lemmy.world
              link
              fedilink
              English
              05 months ago

              I think you would find it easier to help your friends if you approached the matter with reason rather than emotion. Your take on fair use is missing a lot, but that’s beside the point.

              Assume you get what you are asking for. What then?

              • @Eccitaze@yiffit.net
                link
                fedilink
                English
                15 months ago

                Yeah, no, stop with the goddamn tone policing. I have zero interest in vagueposting and high-horse riding.

                As for what I want, I want generative AI banned entirely, or at minimum restricted to training on works that are either in the public domain, or that the person creating the training model received explicit, opt-in consent to use. This is the supposed gold standard everyone demands when it comes to the widescale collection and processing of personal data that they generate just through their normal, everyday activities. Why should it be different for the widescale collection and processing of the stuff we actually put our effort into creating?

                • @General_Effort@lemmy.world
                  link
                  fedilink
                  English
                  05 months ago

                  As for what I want, I want generative AI banned entirely,

                  Well, you can see the moral (and political!) problem here. Maybe the people who crunched numbers before electric computers wanted them banned. Maybe people who make diesel engines want EVs banned. That’s asking the public to take a hit for the benefit of a small group. Morality aside, it’s politically unlikely.

                  or at minimum restricted to training on works that are either in the public domain, or that the person creating the training model received explicit, opt-in consent to use.

                  This is somewhat more likely. But what then?

                  I’ll start. Opt-in means that you have to obtain a license to train an AI on something. You have to pay the owner of the intellectual property. What does this mean in our economy? What happens?

    • @LainTrain@lemmy.dbzer0.com
      link
      fedilink
      English
      -35 months ago

      I don’t know what you’re on about, the majority of the thread is pro open source AI and anti-capitalist, which is as left a stance as it gets, it’s not called “copyleft” for no reason. No one here wants to see AI banned and the already insane IP laws expanded to the benefit of the few corpos like the NYT at the expense of broader society.

      • @General_Effort@lemmy.world
        link
        fedilink
        English
        -15 months ago

        IDK. I have seen a number of pro-corpo copyleft takes. It’s absolutely crazy to me. The pitch is that expansive copyright makes for expansive copyleft. It seems neo-feudal to me. The lords have their castles but the peasants have their commons.

    • @TheFriar@lemm.ee
      link
      fedilink
      English
      35 months ago

      Right? Like…I don’t give a shit. That’s not a threat or a fact that bothers me at all. They are only a tool for amassing more power and money. So what the fuck do I care.

    • @doylio@lemmy.ca
      link
      fedilink
      English
      35 months ago

      And what about the open source models? Or the AI companies in countries that have more lax copyright laws? (Japan for example)

      This technology exists now. We can’t put the genie back in the bottle. Copyright came out of the printing press, which allowed cheap copies to be made. Now a new technology has emerged, so we likely need a new set of rules to replace the role that copyright performed, which was incentivizing artistic creation.

  • @satanmat@lemmy.world
    link
    fedilink
    English
    145 months ago

    I’m just trying to think about how refined AI would be if it could only use public domain data.

    ChatGPT channels Jane Austen and Shakespeare.

    • @kromem@lemmy.world
      link
      fedilink
      English
      45 months ago

      That’s not really how it would work.

      If you want that outcome, it’s better to train on as massive a data set as possible initially (which does regress towards the mean but also manages to pick up remarkable capabilities and relationships around abstract concepts), and then use fine tuning to bias it back towards an exceptional result.

      If you only trained it on those works, it would suck at pretty much everything except specifically completing those specific works with those specific characters. It wouldn’t model what the concerns of a prince in general were, but instead model that a prince either wants to murder his mother (Macbeth) or fuck her (Oedipus).

  • YeetPics
    link
    fedilink
    English
    12
    edit-2
    5 months ago

    If they can’t afford a thing they want, that’s too bad.

    If their dream AI ‘can’t exist’ without stealing from everyone, there is only one message to bounce back from the rest of us:

    ‘good’

  • @dangblingus@lemmy.dbzer0.com
    link
    fedilink
    English
    105 months ago

    Huh. You’d think in a situation where copyright is threatened by a lack of AI regulation, Disney would be all over this. Oh wait. They’re trying to use generative AI to make movies cheaper. Never mind.