I apologize for being late to the party. I’ve just joined to get an understanding of the work that has been going on, so I’ve read quite a few documents and by now 350 posts on the forum.
I’m mainly coming to this out of concern about the potential for open-washing. I’m also concerned about v.0.0.8’s “preferred form to make modifications” language, as I would have liked to see open data as part of it. Reading the comments, I have come to appreciate the counter-arguments. I think weighty arguments have been presented, though I am not entirely convinced. Principally, I fear that not requiring open data will lead to a lot of open-washing.
Nevertheless, I feel there’s something unspoken here, as there always has been in Open Source. I remember attending a talk by ESR around 2000 where he spoke about the taboos of open source. For example, you could take somebody else’s code, repackage it under the same license and call it your own. But nobody does that: it would break a taboo, and you’d be ostracized for it. In a similar vein, regarding the discussion of whether testing data should be available, we should note that you don’t need to distribute your tests with your code, but people do it anyway. Even though it could be a viable business model to keep a private test suite to provide extra QA to paying customers, projects bundle their test suites. To do otherwise would be taboo.
Such unspoken norms are all over open source and always have been. As the stakes get higher, relying on them may not serve us that well going forward.
I think everyone here agrees that open-washing is horribly damaging and it is important for OSAID to be clear enough to alleviate that concern. But here’s another big unspoken problem: Governance has never been part of the OSD, even though community management has always been extremely important. And community management spans from toxic BDFLs to highly efficient meritocracies.
There are so many ways to undermine people’s right to self-determination even with open source technology that it really comes down to good governance models, well beyond even the best meritocracies of today.
I think it would be a good idea to speak the unspoken, perhaps to say that the governance of open-source AI is out of scope for OSAID, and that one should not have an expectation that all or even most forms of open-washing will be addressed by the definition. What do you all think about that?
I have specifically started to talk more about “Digital Commons” recently. Within that framework, Open Source primarily provides goods that are non-exclusive and non-subtractive. The framework then marries the rich community management traditions of Open Source with Elinor Ostrom’s economic theories, as the latter imply elaborate governance models.
When the EU AI Act mentioned Open Source, it should probably have talked about Digital Commons instead. When some commentators talk about open source AI, they treat Open Source as a completely ungoverned space. That is false for practical purposes, but since its governance isn’t formalized and supported by democratic institutions, it is an easy target.
I therefore feel that if we could make the scope clearer and make it clear that there is a division of labour here, where a Digital Commons conversation has to take place on the heels of this work, I would feel more comfortable.
Now, I really wouldn’t know where this would take us in the training data access debate, but it may readjust the expectations of the community.