Share your thoughts about draft v0.0.9

jplorre · August 28, 2024, 5:10pm

Regarding this new version, we at LINAGORA have the following comments:

Indeed, there is no significant difference between v0.0.9 and v0.0.8. The idea is still not to explicitly require that training data be published, but to require that sufficient information be provided so that an equivalent system can be recreated.
Although this choice does not correspond to our initial position, we understand that it is a path of compromise.

However, we note that the notion of “equivalent system” is not specified, which leaves it open to different interpretations. We therefore propose adding a sentence to clarify this notion. Such a sentence could be: “two systems are said to be equivalent if they produce the same outputs given identical inputs”.

Another point concerns the requirement for “supporting libraries like tokenizers and hyperparameters search code” in the code bullet of the “Preferred form to make modifications to machine-learning systems” section. We think that tokenizers are very specific to LLM systems and do not apply to other generative AI systems, so we propose removing this specific reference and keeping only “supporting libraries and hyperparameters search code”.

Given these considerations, LINAGORA is prepared to support the proposed definition.