Let’s state the reality. Japan, China, and South Korea are each developing language models tailored to their own countries, and those models are actually being used. Even my company was developing LLMs with performance almost comparable to GPT-3.5 up until that time. Many Japanese companies, including mine, have since shifted to using or improving upon Western foundation models because they realized it is more efficient for their business. Yes, Open Source AI is important.
Also, there were Japanese people among the key figures involved when Common Crawl was started. The fact that Common Crawl is based in the United States does not have much significance. While English is the most common language in Common Crawl’s content, I believe Russian is the second most common, and Japanese makes up about 5%.
Yes, Japan is a member of WIPO, and there is even a WIPO office located in Tokyo. Article 30-4 of Japan’s Copyright Act, as mentioned earlier, was established after thorough examination to ensure full compliance with the three-step test of the Berne Convention and the WIPO Copyright Treaty.