The Open Source(ish) AI Definition (OSAID)

Well, in fact they do comply with OSAID 0.0.9.

All developers have to do to comply is provide plausible “Data Information” that pretends the system was trained on a publicly available dataset.

Such a “facade dataset” might even be a proper subset of the dataset actually used, but unless the whole dataset actually used in training is available, nobody can really study or modify that AI system.

Well, when the datasets cannot be distributed but are available for anyone to use in machine learning under the same legal terms that allowed their use during the system's training, the system can still be considered Open Source AI, since both the freedom to study and the freedom to modify are preserved.

But being available is different from being made available.
Without a requirement to distribute the dataset, all legal issues disappear.

What do you think about such a solution?

Sure, if anyone can recreate the whole and exact dataset used during the training of a system, such a system could be defined as Open Source AI.
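
To make "whole and exact" something anyone could check, here is a minimal sketch. It assumes the developers publish a manifest of SHA-256 digests for every file used in training; that manifest is a hypothetical requirement of mine, not something OSAID asks for, and the function names are illustrative only.

```python
import hashlib
from pathlib import Path


def build_manifest(root):
    """Map each file path (relative to root) to the SHA-256 of its bytes."""
    root = Path(root)
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            manifest[path.relative_to(root).as_posix()] = digest
    return manifest


def compare_manifests(published, recreated):
    """Return (missing, extra, changed) file names between two manifests.

    published: hypothetical manifest released with the model
    recreated: manifest built from the dataset someone reassembled
    """
    missing = sorted(set(published) - set(recreated))
    extra = sorted(set(recreated) - set(published))
    changed = sorted(
        name
        for name in set(published) & set(recreated)
        if published[name] != recreated[name]
    )
    return missing, extra, changed


def is_exact_recreation(published, recreated):
    """True only if every file is present, unmodified, and nothing was added."""
    return compare_manifests(published, recreated) == ([], [], [])
```

With such a check, a facade dataset that is only a proper subset of the real one would show up as a non-empty `missing` list, provided the published manifest honestly covers everything used in training. The sketch cannot detect a manifest that is itself incomplete.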