I am concerned about relying on the “Model Openness Framework” as the basis for a technology taxonomy to identify an AI system, regardless of whether it has the approval and backing of another large organization.
The Framework itself, as presented, is not agnostic or neutral in how it categorizes, groups, and classifies different kinds of artifacts (data, data information, documentation, weights, etc.).
While the paper broadly outlines how this framework comes together, the authors take a distinct position on which kinds of IP law apply to each kind of component (and whether any applies at all).
The Framework as presented does not align easily or readily with current release and sharing practices among developers at all levels. It is not usable in its current form as a practical guide. This by itself is not a red flag: “Datasheets for Datasets” (2018) remains a foundational research paper that has guided the creation of functional, industry-leading data cards integrated into standard practices and workflows. But that research team sought feedback from the wider industry and research community, worked with developers and researchers at all levels for years, and iterated on their ideas and operational frameworks until they could be integrated into practice. I am concerned that this framework and the ideas it presents cannot simply be picked up and dropped into practical workflows.
I look forward to hearing further discussion here on whether this framework is immediately usable for other folks, whether you are developing, using, creating, or otherwise researching existing ML/AI systems.
The only part of OSAID that currently relies on the MOF is the classification of components. As I understand it, OSAID does not depend on the IP laws the MOF paper says should be applied to each component, but rather considers the legal framework to be applied based on the OSD.
I am Japanese, living in Japan, and in this forum I am discussing things based on Japanese law. Japan has an intellectual property law system different from both US and EU law, but I feel the discussion stays consistent by basing it on the OSD.
Is your claim that the classification of components by the MOF is not suitable for the purposes of OSAID?
My feedback is that, as one of the reviewers walking through the current integration of the MOF into the OSAID and the evaluative method built on the MOF, I was not able to reconcile the current proposal with my best knowledge of ML/AI systems, currently released technology, or my role as someone who would implement it in industry practice.
I would find it useful to move the discussion from theory to pragmatic application: how have others evaluated this framework against their own workflows? Where is it working? Where is it not?
This is a very good question… In my mind, the document we’re co-designing (the drafts of the Open Source AI Definition) has three main parts: the Preamble, the Definition and the Checklist. The first two parts should be well thought out and, as much as possible, represent timeless principles. Together with the FAQ, they should continue driving the interpretation of the openness of AI systems in the future, when the technology and the legal landscape may change. The principle that AI developers need the model parameters, code and data information to recreate a substantially equivalent system shouldn’t change over time.
The Checklist may change more frequently: I see it as a working document that reviewers of AI systems use to evaluate whether a system is Open Source AI or not. Right now I think it serves this purpose. Given how quickly things change, it may become obsolete, or other types of checklists may be required.
I’ll also propose some small improvements to the current checklist in v0.0.9, based on comments on the draft.
I’m expecting the Linux Foundation to show how they’re putting this into production… Given that this coming week there is AI_Dev, maybe we’ll see something demoed there.
I join @amcasari’s call to see examples used in practice.
How about separating the checklist from OSAID? Similar to how the Debian Social Contract and the Debian Free Software Guidelines exist as separate documents, separating the checklist would reduce confusion even if it is updated relatively frequently.