The Open Source Initiative has defined what it believes constitutes “open source AI” (https://opensource.org/ai/open-source-ai-definition). This includes detailed descriptions of the training data and an explanation of how it was obtained, selected, labeled, processed and filtered. As long as a company uses any model trained on unspecified data, I will assume that data is either stolen or otherwise unlawfully obtained from non-consenting users.
To be clear, I have not read up on Deepseek yet, but I have a hard time believing their training data is specified according to OSI, since no big model has done so yet. Releasing the model source code means little for AI compared to releasing all its training data.
I just like the analogy of a dashboard with knobs. Input text on one side, output text on the other. “Training” AI is simply letting the knobs adjust themselves based on feedback on the output. AI never “learns”; it only produces output based on how the knobs are dialed in. It’s not a magic box, it’s just a lot of settings converting data to new data.
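To make the knob analogy concrete, here is a toy sketch in Python. It is not how any real model is built: the two-knob "dashboard", the `train` function, the learning rate and the sample data are all made up for illustration. The knobs start at zero and get nudged a little after every output, depending on how far off that output was.

```python
# Toy "dashboard" with two knobs: output = knob_a * x + knob_b.
# Everything here (names, learning rate, data) is an illustrative assumption.

def train(samples, steps=2000, lr=0.01):
    """Adjust the two knobs using feedback on the output (plain gradient descent)."""
    knob_a, knob_b = 0.0, 0.0  # both knobs start in an arbitrary position
    for _ in range(steps):
        for x, target in samples:
            output = knob_a * x + knob_b   # what the dashboard produces right now
            error = output - target        # feedback: how wrong was the output?
            knob_a -= lr * error * x       # nudge each knob against the error
            knob_b -= lr * error
    return knob_a, knob_b

# Made-up data that follows the rule target = 3 * x + 1.
samples = [(x, 3 * x + 1) for x in range(-5, 6)]
knob_a, knob_b = train(samples)
print(knob_a, knob_b)  # the knobs end up close to 3 and 1
```

Nothing in there "knows" the rule. The knobs just end up in positions that turn the input into the expected output, which is the whole point of the analogy. A real model does the same thing with billions of knobs instead of two.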
As a new programmer who actively treads carefully in date-data-dabbling territories it is amusing to see how shit commercially available code is, especially in the e-commerce world.
I started my first programming job 2 years ago, building Magento 2 sites. I already knew Magento was a horrible mess to begin with, but I took whatever I could get. After witnessing the publishers, developers and code of 99% of the plugins available (which are, btw, quality certified on Adobe’s marketplace), I can safely say that there is so much shit code squirted out every second by self-taught developers working in shitty small companies with CEOs trying to earn a quick buck.
It is actually insane how bad the code was; I cannot describe in words how bad it was. Every time I felt impostor syndrome I would just open up vendor and look at a random plugin to confirm that I am at least not a Magento plugin developer.
If you’re running Magento, change, preferably 10 years ago, but change.
Typical Nintendo move. So sad to see Yuzu possibly going down this way. Even looks like Nintendo might win this one. I’m just gonna download the entire source from GitHub just in case.
I wish this would just go full hydra mode if it goes down though. Start popping up new anonymous accounts releasing the source code everywhere.
The more bullshit like this I read about YouTube, the more I despise them. I already use GrayJay on mobile, and I’m using uBlock Origin + uMatrix on Librewolf to control cookie usage on desktop. So far I’ve been able to escape the video player block by clearing cache.
I’m just waiting for the day they “force” me onto another frontend.
As I wrote in my comment, I have not read up on Deepseek. If this is true, it is definitely a step in the right direction.
I am not saying I expect any company of significant scale to follow OSI since, as you say, it is too high risk. I do still believe that if you cannot prove to me that your AI is not abusing artists or creators by using their art, or not using data non-consensually acquired from users of your platform, you are not providing an ethical or moral service. This is my main concern with AI. Big tech keeps showing us, time and time again, that they really don’t care about these topics, and this needs to change.
Imo AI today is developing and expanding way too fast for the general consumer to understand it, and by extension also for the legal and justice systems. We need more laws in place regarding how to handle AI and the data it uses and produces. We need more education on what AI is actually doing.