A new draft of the Open Source AI Definition: v.0.0.6 is available for comments

lumin · March 14, 2024, 2:38pm

Code : The code used for pre-processing data, the code used for training, validation and testing, the supporting libraries like tokenizers and hyperparameters search code (if used), the inference code, and the model architecture.

The “model architecture” overlaps with, and is largely a subset of, the “training, validation, and testing” code. Without the architecture defined, that code cannot correctly load a model at all.
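To illustrate the dependency, here is a minimal, framework-free sketch. Everything in it (`TinyMLP`, `expected_shapes`, `load_weights`, the parameter names, and the toy checkpoint) is hypothetical and made up for this example, not taken from the draft or from any real library. The point is only that a checkpoint is a bag of named arrays, and it is the architecture that says which names and shapes are expected and how they are wired together.

```python
# Hypothetical illustration only: no real framework is used here.
from dataclasses import dataclass, field


@dataclass
class TinyMLP:
    """The 'architecture': declares which parameters exist and their shapes."""
    in_dim: int
    hidden_dim: int
    out_dim: int
    params: dict = field(default_factory=dict)

    def expected_shapes(self):
        # (rows, columns) of each weight matrix implied by the architecture.
        return {
            "fc1.weight": (self.hidden_dim, self.in_dim),
            "fc2.weight": (self.out_dim, self.hidden_dim),
        }

    def load_weights(self, checkpoint):
        # Loading only works if the checkpoint matches the architecture.
        expected = self.expected_shapes()
        for name, shape in expected.items():
            if name not in checkpoint:
                raise KeyError(f"missing parameter: {name}")
            matrix = checkpoint[name]
            actual = (len(matrix), len(matrix[0]))
            if actual != shape:
                raise ValueError(f"{name}: expected {shape}, got {actual}")
        self.params = {name: checkpoint[name] for name in expected}

    def forward(self, x):
        # Linear layer -> ReLU -> linear layer, without bias terms.
        hidden = [
            max(0.0, sum(w * v for w, v in zip(row, x)))
            for row in self.params["fc1.weight"]
        ]
        return [
            sum(w * h for w, h in zip(row, hidden))
            for row in self.params["fc2.weight"]
        ]


# A toy "checkpoint": just named nested lists, meaningless on their own.
checkpoint = {
    "fc1.weight": [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],  # 3 x 2
    "fc2.weight": [[1.0, 1.0, 1.0]],                     # 1 x 3
}

model = TinyMLP(in_dim=2, hidden_dim=3, out_dim=1)
model.load_weights(checkpoint)
output = model.forward([1.0, 2.0])
```

With a mismatched architecture, e.g. `TinyMLP(in_dim=2, hidden_dim=4, out_dim=1)`, the same `load_weights(checkpoint)` call raises a `ValueError`, which is the situation described above: the training, validation, and testing code has nothing to load the weights into unless the architecture comes first.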

Thus, I suggest at least reordering the phrasing as follows:

Code : The code used for pre-processing data, the model architecture, the code used for training, validation and testing, the inference code, as well as the supporting libraries like tokenizers and hyperparameters search code (if used), ~~the inference code, and the model architecture~~.

This makes the items logically organized.