It’s a blow to the big closed-source AI companies, sure, but hardly a knockout one. If a small company can use a million dollars to produce a neat model, perhaps a big company can use those same techniques and a billion dollars to produce a really neat model. Or at least build a lot more of the infrastructure that goes around those models and makes use of them. Code Copilot isn’t just selling a raw LLM API; they’re selling its integration into the Microsoft coding ecosystem. They may have wasted some money on their current-generation AIs, but that’s just sunk cost. They’ve got more money to spend on future AIs.
The main problem will be if Western AI companies are prevented from adopting the techniques being used by these Chinese AI companies, for example by onerous regulations on what training data can be used or requirements for extreme “safety guardrails.” The United States seems likely to be getting rid of a lot of those sorts of obstructions over the next few years, though, so I wouldn’t count the West out yet.
The site producing the nonsense has to produce lots of it any time a bot comes along, while the trainers only have to filter it once. As others have pointed out, it’s likely easy for an automated filter to spot. I don’t see it as being a clear win.