We heard you: let's focus on substantive discussion

zack · September 25, 2024, 8:43am

Wait, what? I have consistently voted stating that the availability of training datasets was a requirement for exercising both the freedom to study and the freedom to modify. (But I’ve been outvoted.)

So I’m not sure where you got the above conclusion from. (I apologize, but I haven’t yet found time to digest all the details you have provided. I am participating in this process as a volunteer, with limited time availability.)

I’ve also raised earlier on this forum (don’t have the link at hand, sorry) the concern that the votes cast could not be interpreted as consensus in favor of making training datasets optional, but rather that they denoted a 50/50 split on the data matter.

But, honestly, I think discussing the voting details is beside the point, and that’s why I have stopped arguing about them. Not only because voting is not a good way to decide on complex technical matters, but because I think OSI’s decision not to mandate access to training data is a political decision, one that they are entirely entitled to make. It’s even a pragmatic one, in the tradition of the organization, for better or for worse.

(In fact, I even think that OSAID 0.9 is potentially a good definition, one that could improve the state of model freedom in the industry. The main problem it has is one of naming/branding, as I consider that something as broad as “open source AI” should be more “radical” and require training data availability. I am still hopeful that we can obtain a multi-tier definition, either via the D-/D+ classification, or via some even stricter split between “open weight” and “open source” labels. I planned to write more broadly about this later on, but it will not be here.)