Open Source AI needs to require data to be viable

stefano · June 6, 2024, 9:24am

That’s exactly what @pchestek is referring to: You’re admitting and even formalizing that there is a degree of “openness” and by doing that you’re playing exactly into the hands of the open washers. Granted, there is no formal definition of Open Source AI yet but we’re pretty close to having one. It’d be great to add one more column to your paper: after showing the availability of components, a final column that puts a checkmark on whether that system passes or fails the “Open Source AI Definition.”

The OSI has been playing this game for 26 years: There are many, many organizations arguing daily that there is a degree of openness and it’s all “open source”. Examples range from Microsoft’s Shared Source Initiative in the early 2000s to more recent companies and VCs sharing software with agreements that preclude unrestricted use and modifications. It’s an old game that is being replicated in AI. Companies play the game because there is tremendous market value in being considered Open Source. And now with the AI Act excluding “open source AI” from some obligations, it’s even more appetizing to win the open washing game.

Open Source is a binary concept: you’re either granting end users and developers the right to have self-sovereignty over the technology or you’re not. Same with Open Source AI. And sure, there are AI systems out there that will be “almost” Open Source AI. But they’re not, and they all go in the same pile over there with a big FAIL stamp on them. There is no “oh but this is slightly better, more open because …” Nope, it goes in the same pile over there: NOT Open Source AI. That’s what the OSI has been entrusted to do by its stakeholders, and it will continue doing so.