Where to find the description of the "components"

The team at the Linux Foundation AI & Data Generative AI Commons group has published the article on which we based the list of components of ML systems. If you’re wondering what Training, Validation and Testing Code is, you should read this article.

Matt White, Ibrahim Haddad, Cailean Osborne, Xiao-Yang Liu (Yanglet), Ahmed Abdelmonsef, and Sachin Varghese have worked with other LF members to establish a ranked classification system that rates machine learning models based on their completeness and openness, following principles of open science, open source, open data, and open access. This work helps clarify the environment in which the Open Source AI Definition needs to operate.

The relevant piece for us is section 5, the Model Openness Framework (MOF) Components:

  1. Datasets
  2. Data Preprocessing Code
  3. Model Architecture
  4. Model Parameters
  5. Model Metadata
  6. Training, Validation and Testing Code
  7. Inference Code
  8. Evaluation Code
  9. Evaluation Data
  10. Evaluation Results
  11. Supporting Libraries and Tools
  12. Technical Report
  13. Model Card
  14. Data Card
  15. Research Paper
  16. Sample Model Outputs
  17. Model Openness Framework Configuration File

and their definitions. You’ll recognize the terms from draft 0.0.6.


I certainly recognize those terms.
Are we to share our definitions of these technical terms with LF AI&Data or are we to cite LF AI&Data’s definitions?

If we are going to cite them, I think it would be better to make it clear where we are citing them from, since these are important terms.
Yes, I understand that the paper was only published a few days ago. I mean going forward.

We should cite the paper. Now that it’s public, we can do that :slight_smile: Going forward we’ll make sure of that.

Unfortunately, the paper itself is not open source, given its non-commercial clause.
I had considered writing an article about it; however, as the site I operate is ad-supported, I would feel the coverage is very disconnected from the material. Even trying to structure guides and details around the information would be more difficult than it has to be.

A paper is not “open source”: there are better reference frameworks than the OSD, which refers to software. You should check the Definition of Open.

I think you’re misunderstanding the role of the license of the paper: you can cite, quote and criticize not only that paper but basically anything, even content and works that are distributed with a non-commercial clause. That’s part of how copyright works.

This conversation is off-topic.

Of course you are correct. Thank you for bringing clarity to the concern. (I won’t pursue the conversation further, though I might point out that TeX source is included and TeX is a Turing-complete language.)