Open Source AI needs to require data to be viable

I don’t really follow, @pchestek , as we don’t propose “a variety of meanings” for the notion “open source AI”. Rather, in our paper, we observe that the current lack of clarity is the result of important dimensions of openness being overlooked (by regulators) or obfuscated (by model providers). I should note that our paper is more descriptive (aiming to make sense of the complexity in a domain) than prescriptive (proposing specific labels).

As the image upthread shows, we note how the various possible operationalizations of openness each distort this complex reality in their own way, and some are more vulnerable to open-washing than others.

I agree that it may be desirable to have a strict definition of “open source AI” with a small attack surface. But we can also predict that model providers will try to wriggle their way out of this by insisting on watering down the notion (as happened with “fair trade”) or retreating to broader terms (as in “open”). The purpose of our paper is to sketch this broader landscape and propose dimensions of openness that can help pin down what providers are doing even when they are playing these word games.