Hold on, the possibilities here aren’t at all well understood. There is a difference between opening the floodgates to open-washing and accommodating open source medical models.
If we could agree now on a definition that could command broad consensus, and then iterate on it once that distinction is better understood, we wouldn’t be in a situation where open-washing risks flooding in.
I fully agree that a pure mandate for open datasets cannot be the end result, for exactly that reason. But I’m also concerned that the current definition will bring Open Source into serious disrepute, as it invites security holes and backdoors and makes effective regulation difficult.
If, OTOH, someone managed to build a system that could do federated learning on sensitive data in such a way that it can be proven the original data cannot be reconstructed, then certainly, that would be Open Source AI, even in the absence of an open dataset.
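To make the idea concrete, here is a toy sketch of the kind of aggregation such a system might use: per-client update clipping plus server-side noise, in the spirit of differentially private federated averaging. All names and parameter choices are my own illustration, not any standard API, and real reconstruction-resistance guarantees require a careful formal privacy analysis that this sketch does not provide.

```python
import numpy as np

def clip_update(update, max_norm):
    """Clip a client's model update to bound its L2 norm, limiting how
    much any single client's (sensitive) data can move the shared model."""
    norm = np.linalg.norm(update)
    if norm > max_norm:
        return update * (max_norm / norm)
    return update

def dp_federated_average(client_updates, max_norm=1.0, noise_std=0.1, rng=None):
    """Aggregate clipped client updates and add Gaussian noise.

    The server only ever sees the noisy average of bounded updates,
    never the raw training data held by each client. (Illustrative
    only: choosing noise_std to get a provable privacy guarantee is
    the hard part and is out of scope here.)
    """
    rng = np.random.default_rng(0) if rng is None else rng
    clipped = [clip_update(u, max_norm) for u in client_updates]
    avg = np.mean(clipped, axis=0)
    noise = rng.normal(0.0, noise_std * max_norm / len(clipped), size=avg.shape)
    return avg + noise
```

The point of the sketch is only that the aggregation step, not the data, is what crosses the trust boundary, which is exactly the property that would let such a model qualify without an open dataset.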
By closing the discussion now, you are ensuring that we cannot even pose, let alone work towards, such a goal. I’m sure academia will keep working on it, but that work needs to be a central part of the understanding of Open Source AI.