Data Transparency in Open Source AI: Protecting Sensitive Datasets

Originally published at: Data Transparency in Open Source AI: Protecting Sensitive Datasets – Open Source Initiative

The Open Source Initiative (OSI) is running a series of stories about a few of the people involved in the Open Source AI Definition (OSAID) co-design process. Today, we are featuring Tarunima Prabhakar, one of the volunteers who have helped shape, and are still shaping, the OSAID.

A very interesting read.

We are trying to find a happy medium that lets us balance the numerous concerns: recognition of effort and effectiveness of the data on one hand, and transparency, adaptability and extensibility on the other.
[…]
For our project in particular, we are considering the option of staggered data release: older data is released under an open data license, while the newest data requires users to request access.

This reminds me of the ongoing work at Sentry around Fair Source Software: maybe Tattle simply needs a Fair Source AI definition that turns their system into Open Source AI when the corresponding datasets are released under an open data license?

Or has the OSI recently approved licenses like FSL, FCL and BSL?
Do they match the OSD?

If not, I can’t see why an AI system that delays the freedom to study the system should qualify as Open Source AI.

I fail to see any justification for labeling something “Open Source” when it obviously does not respect the basic principles the community holds.

I also fail to see any drama in an AI not being “Open Source AI” because some of its datasets or model networks are not compliant.

If it’s not “Open Source AI”, it’s not “Open Source AI”. Period.

Over the past years, I’ve seen countless attempts at newer licenses get invalidated. They never made it to the OSI list.

No one really cared: the proponents carried on with their business, they just wouldn’t be allowed to market it as “Open Source”, and their customers didn’t care.

The same happens here: if an AI system can’t have all its components respecting the values of the Open Source community, it’s not “Open Source AI”. Period.

Now change your dates, redo the presentations, and start correcting things on the 0.0.10 version.
