AI: when China creates its own ChatGPT

“Crypto is libertarian, AI is communist.” This is what the famous investor Peter Thiel said in 2018 during a debate with LinkedIn founder Reid Hoffman at Stanford University. At the time, the remark was aimed specifically at artificial intelligence, a field in which China's advance seemed unstoppable. In 2019, China overtook the United States and Japan in the number of artificial intelligence-related patent applications filed worldwide.

Peter Thiel’s comment mostly reflected the fact that training artificial intelligence required resources – computing power and datasets – accessible only to centralized organizations. The much-publicized launch of ChatGPT, along with image-generating AIs and large language models, has shattered this idea. What is freely available on the internet turns out to be a sufficient primary dataset for training very convincing models.

Even though OpenAI has had immense resources – more than $250 million in funding, nearly $1 billion spent on Microsoft Azure cloud infrastructure, and a thousand people hand-feeding the algorithm to improve it – the technology remains accessible to many other entities. Add to that the rise of collaborative artificial-intelligence research platforms such as Hugging Face and the interest taken in it by large authoritarian states, and the technology is now widely dispersed. It should be noted in passing that, given the interdependence and lack of transparency of the platforms, the libertarian myth of crypto has also taken a big hit.

The growing accessibility of artificial intelligence technologies has prompted states to react. In the United States, at the end of January, the National Institute of Standards and Technology (NIST) published a voluntary framework advising companies on the use, design and deployment of artificial intelligence systems. The European Union has, as usual, chosen to be much stricter, with a bill regulating the uses of artificial intelligence that is expected to be voted on in the spring (see the column of October 21, 2021). But it is the Chinese state that has probably reacted most strongly to these developments.

Beijing requires AIs to sign their works

Beijing’s internet regulator, the Cyberspace Administration of China, issued regulations in early December on what it calls “deep synthesis” technology, covering software that generates images, sound and text through artificial intelligence. In particular, the use of this technology to create content likely to disrupt the economy or national security is prohibited, which echoes concerns about deepfakes, content produced en masse and convincing enough on the surface. Beijing’s new rules require AI-generated content to be visibly labeled for users and to carry a digital watermark.

But the Chinese government is also trying not to be left behind by essentially American initiatives whose training sets are biased and less fine-grained on Chinese culture. To that end, it is relying on its private companies, which are increasingly tightly controlled, as well as on its research institutions. Widely reported by state media, Chinese internet search giant Baidu has announced that it will launch an artificial intelligence-based chatbot service similar to OpenAI’s ChatGPT in March. Baidu has invested heavily in recent years in its Ernie-ViLG model, which has over 10 billion parameters and is trained on a dataset of 145 million Chinese image-text pairs. The company also uses it in autonomous driving.

The other model gaining momentum in China is Taiyi, from IDEA, a research laboratory led by the renowned computer scientist Harry Shum, formerly of Microsoft. The open-source model is trained on over 20 million filtered Chinese image-text pairs and has 1 billion parameters. Close to the Beijing Academy of Artificial Intelligence and the Shenzhen local government, IDEA probably enjoys more research freedom.
