Context
We’re aiming to use the Open Source AI Definition (OSAID) to review approximately ten AI systems before releasing RC1 at the end of June.
To this end, we convened four workgroups at the beginning of this year to review an initial group of AI systems self-described as open: BLOOM, Pythia, Llama 2, and OpenCV. These workgroups were composed of system creators and unaffiliated volunteers, and their results were announced in early April.
Reviewed Systems
To continue towards our ten-system review goal, in early May we posted a call for volunteers on this forum. The ask was to help us validate additional AI systems using v.0.0.8 of the OSAID.
That call for volunteers resulted in the following system list and volunteer reviewers. (Previously reviewed systems* are included to give a complete list of AI systems we are analyzing.) Reviewers completed their analysis on this public spreadsheet.
1. Arctic
- Jesús M. Gonzalez-Barahona – Universidad Rey Juan Carlos
2. BLOOM*
- Danish Contractor – BLOOM Model Gov. Work Group
- Jaan Li – University of Tartu, One Fact Foundation
3. Falcon
- Casey Valk – Nutanix
- Jean-Pierre Lorre – LINAGORA, OpenLLM-France
4. Grok
- Victor Lu – independent database consultant
- Karsten Wade – Open Community Architects
5. Llama 2*
- Davide Testuggine – Meta
- Jonathan Torres – Meta
- Stefano Zacchiroli – Polytechnic Institute of Paris
- Victor Lu – independent database consultant
6. Mistral
- Mark Collier – OpenInfra Foundation
- Jean-Pierre Lorre – LINAGORA, OpenLLM-France
- Cailean Osborne – University of Oxford, Linux Foundation
7. OLMo
- Amanda Casari – Google
- Abdoulaye Diack – Google
8. OpenCV*
- Rasim Sen – Oasis Software Technology Ltd.
9. Phi-2
- Seo-Young Isabelle Hwang – Samsung
10. Pythia*
- Seo-Young Isabelle Hwang – Samsung
- Stella Biderman – EleutherAI
- Hailey Schoelkopf – EleutherAI
- Aviya Skowron – EleutherAI
11. T5
- Jaan Li – University of Tartu, One Fact Foundation
Initial Findings and Obstacles
Unlike the earlier review of BLOOM, Pythia, Llama 2, and OpenCV, in which system creators contributed to system analysis, the review of the seven additional systems did not include creators. This left wide knowledge gaps and made the review considerably harder. As a result, this initial report conveys findings about obstacles in the review process as well as findings about the definition:
As this example shows, it was difficult to find documents describing each component (column D) and thus to complete the subsequent analysis (columns F-I).
- Elusive Documents: Not having system creators in the process meant that reviewers were on their own in searching for the legal documents associated with each component. As one reviewer noted in her feedback email, “There is no ‘one and done’ place to see artifacts, licenses, and terms & conditions attached to each component…” As a result, her system and most others had many blanks in both their document list and the subsequent use/study/modify/share analysis. (See example, above.)
- One Component, Many Artifacts and Documents: A related challenge was that some components are associated with multiple artifacts and multiple documents. Another reviewer noted that, for example, “Source code could be in several repos, and documentation could be in several tech reports or blog posts.”
- Compounded Components: Part of the above problem arises because some components in the checklist combine multiple artifacts in a single list item, such as training, validation and testing code; supporting libraries and tools; and information on training methodologies and techniques. This also made it difficult to analyze the status of any one artifact. As one reviewer noted, “compounding of different kinds of artifacts together made it challenging to track down the legal document for a specific component…”
- Compliant? Conformant? Of the eleven required components, six require a legal framework which is “compliant” or “conformant” with the Open Source Definition (OSD), rather than simply having an OSI-approved license. Though definitions of conformant (applied to model parameters) and compliant (applied to data information components) were shared during the review process, there was a request for further guidance on how to review components that are not software.
- Reverting to the License: Document analysis (columns F-I) currently requires the reviewer to independently assess whether the legal document (column D) guarantees the right to use, study, modify, and share the component. In the interest of simplifying the process, one reviewer suggested that “we could revert to the analysis of the license,” meaning that if a license or other legal document is OSI-approved, conformant, or compliant, then study, use, modification, and sharing are already guaranteed and no further analysis is necessary.
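To make the “revert to the license” suggestion concrete, here is a rough sketch of how that shortcut might fill in the use/study/modify/share cells for one row of the spreadsheet. It is only an illustration of the reviewer's idea, not an agreed process: the `ComponentReview` class, the status labels, the cell values, and the function name are all hypothetical and do not come from the OSAID or the review spreadsheet.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Hypothetical labels for the three qualifying statuses named in the post.
QUALIFYING_STATUSES = {"osi-approved", "conformant", "compliant"}
FREEDOMS = ("use", "study", "modify", "share")  # columns F-I


@dataclass
class ComponentReview:
    component: str                 # checklist component name
    legal_document: Optional[str]  # column D; None when no document was found
    status: Optional[str] = None   # reviewer's judgment of that document


def freedoms_from_license(review: ComponentReview) -> Dict[str, str]:
    """Fill the four freedom cells from the license status alone."""
    if review.legal_document is None:
        # No document found: the row stays blank, as in many current reviews.
        cell = "blank"
    elif review.status in QUALIFYING_STATUSES:
        # Shortcut: a qualifying document is taken to guarantee all four.
        cell = "guaranteed"
    else:
        # Anything else still needs the reviewer's independent assessment.
        cell = "needs manual review"
    return {freedom: cell for freedom in FREEDOMS}


# Made-up example rows, not taken from any reviewed system.
row = ComponentReview("Model parameters", "Apache-2.0", "osi-approved")
print(freedoms_from_license(row))
```

Under this sketch, the reviewer's effort moves from assessing four freedoms per component to making one judgment about each legal document, which is where the request for guidance on “conformant” and “compliant” would matter most.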
Help Us Fill in the Blanks
Given the above, we’re making a call to both AI system creators and unaffiliated volunteers to help us fill in the gaps in the system reviews started by our valiant reviewers.
- AI System Creators: If you see your system in the list above, and see blanks in your system’s spreadsheet, please comment or DM me to help us fill them in. If you don’t see your system listed above, but you’d like to put it through the review process, as the LLM360 team did recently, please also let us know.
This could even be a permanent solution for the elusive document challenge. As one reviewer suggested, “In the cases when there is cooperation by the organization publishing the model, it would be great if they can fill in the links to the different artifacts…” In that case, “The review is just checking their licenses and verify[ing] that they really are available and correspond to the artifact category.”
- Independent Volunteers: We’d also love to have the help of unaffiliated folks. If you did not create an AI model, but are very knowledgeable about it, please do speak up and help us fill in the blanks. Raise your hand by commenting or DMing me.
Let’s see how many of the above questions become clearer – or murkier – once those most familiar with the systems identify the legal documents that describe each component. More to come…