The OSAID requires training data to be shared

Another pertinent point.

@stefano just asked “What does available mean[…]?” I suggest it should mean accessible via URL over a common protocol (http, ftp, torrent, etc.), without authentication, click-through agreement, or other impediments. That matters for users, and also for practicality: our own validation scripts need to be able to check (i.e., HTTP status/headers), download, and hash the dependencies listed in mof.json (section 7.3) or our own equivalent. This enables self-service and avoids turning the OSI into a bottleneck.
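To make that concrete, here is a rough sketch of what such a check could look like in Python. It is only an illustration under assumptions: the function name `check_dependency` and the idea of pairing each URL with an expected SHA-256 are mine, not the actual mof.json schema, and the standard library's `urlopen` handles http(s) and ftp but not torrents.

```python
import hashlib
import urllib.request


def check_dependency(url, expected_sha256, opener=urllib.request.urlopen, timeout=30):
    """Fetch `url` with no credentials and compare its SHA-256 to the expected one.

    Hypothetical helper: `opener` is injectable so the check can be exercised
    without network access. Returns (ok, message).
    """
    try:
        with opener(url, timeout=timeout) as resp:
            # HTTP responses expose .status; other schemes (e.g. ftp) may not.
            status = getattr(resp, "status", None)
            if status is not None and status != 200:
                return False, f"unexpected status {status}"
            digest = hashlib.sha256()
            # Stream in 1 MiB chunks so large datasets do not need to fit in memory.
            for chunk in iter(lambda: resp.read(1 << 20), b""):
                digest.update(chunk)
    except OSError as exc:
        # urllib.error.URLError and HTTPError (401, 403, 404, ...) are OSError subclasses.
        return False, f"unreachable: {exc}"
    actual = digest.hexdigest()
    if actual != expected_sha256.lower():
        return False, f"hash mismatch: got {actual}"
    return True, "ok"
```

Anything that demands a login or a click-through would surface here as a failed fetch or a hash mismatch (you would be hashing the agreement page, not the data), which is the point: a script can only validate what it can reach unattended.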

Water seeks its own level, and legalese is similar to scripting: it is satisfied by the minimum that passes. So when you interpret the current checklist, the bar is effectively set at “data card” for 0.0.9:

At least one of these data components is required, in decreasing order of importance:

  1. Datasets
  2. Research paper
  3. Technical report
  4. Data card
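
Read the way a script would read it, that rule comes out roughly as follows. This is my own literal rendering of the quoted list, not any official validation logic, and the names `DATA_COMPONENTS`, `strongest_data_component`, and `meets_data_requirement` are made up for illustration.

```python
# Components from the quoted checklist, in decreasing order of importance.
DATA_COMPONENTS = ["Datasets", "Research paper", "Technical report", "Data card"]


def strongest_data_component(provided):
    """Return the most important listed component that was provided, or None."""
    for name in DATA_COMPONENTS:
        if name in provided:
            return name
    return None


def meets_data_requirement(provided):
    """'At least one of these' passes as soon as any single component is present."""
    return strongest_data_component(provided) is not None
```

Under this reading, `meets_data_requirement({"Data card"})` is true: the ordering expresses a preference, but nothing in "at least one" enforces it.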

I don’t think there’s a person here who wouldn’t agree this is inadequate for protecting the four freedoms.
