Can a derivative of non-open-source AI be considered Open Source AI?

I think this issue is clearly described; thanks, everyone, for sharing your knowledge. The lack of transparency about the data used in the original model cannot be compensated for by transparency downstream. Derivatives of non-Open Source AI cannot be Open Source AI.

I don’t think we should be concerned about theoretical copyright violations when there are very concrete issues with lack of transparency on data.

The copyright issues may not exist in the US, or in other parts of the world such as Japan.

On the other hand, it’s very likely that opaquely trained models contain privacy issues, biases, harmful content, and possibly backdoors and other security issues. Tactically, I think it’s a bad move to lead with copyright issues: it’s like playing with fire, as expanding the scope of copyright is very likely to backfire on the open movement.