the new trend in generative artificial intelligence – L’Express

The “mini” fashion is back. Not in the ready-to-wear department this time, but in artificial intelligence. After racing to deliver the largest language model (OpenAI’s GPT-3 has 175 billion parameters, and GPT-4 is said to have many more), the big names in generative AI are now taking the opposite tack and trying to develop small ones. In December, Google released two Nano versions of its Gemini language model: one with 1.8 billion parameters, the other with 3.25 billion. The same month, Microsoft unveiled its Phi-2 model, with 2.7 billion parameters.

The field of “mini-AI” is, in fact, full of promise. Compared with large language models (LLMs), small language models (SLMs) are simpler, and therefore less expensive, to train. And some AI professionals believe they could prove not only easier to understand and audit, but also easier to customize for a given industry’s specific needs. Finally, these small language models are better suited to running on modest corporate infrastructure, or even on individual devices. No need for a data center: the 7B model (7 billion parameters) from France’s Mistral AI runs, for example, on a laptop.
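To make that concrete, here is a minimal sketch of what running such a model on an ordinary laptop can look like, assuming the open-source llama-cpp-python library is installed and a quantized copy of the Mistral 7B weights has already been downloaded (the file name below is only a placeholder, not something specified by Mistral AI):

# Minimal sketch: running a quantized 7B model locally with llama-cpp-python.
# Assumes `pip install llama-cpp-python`; the GGUF file name is a placeholder.
from llama_cpp import Llama

llm = Llama(
    model_path="mistral-7b-instruct.Q4_K_M.gguf",  # roughly 4 GB once quantized to 4 bits
    n_ctx=2048,      # modest context window, to fit in laptop RAM
    n_threads=8,     # uses the laptop's CPU cores; no data center required
)

output = llm(
    "Explain in one sentence why small language models matter.",
    max_tokens=64,
)
print(output["choices"][0]["text"])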

Google’s Gemini Nano is gradually being integrated into Pixel 8 Pro smartphones. The research firm Counterpoint Research estimates that smartphones featuring generative AI will account for 40% of the market by 2027. An exciting prospect: even without an Internet connection, we could have permanent access to a powerful form of intelligence in our pocket, albeit an imperfect one.

Mini-AIs versus giant models

Mini-AIs highlight an important point: the uses of AI are extremely diverse. Certain sectors (health, defense, etc.) require the very best. “But if I just want some creative marketing content for a brand, to generate product descriptions or to handle standard customer interactions in unregulated industries […] is it really necessary to use a model trained on the entire World Wide Web?” quips Nisheeth Srivastava, executive vice president and head of innovation and technology at Capgemini India, in a post published in early November. Especially since these small language models sometimes turn out to be more effective than the XXL versions. “With only 2.7 billion parameters, Phi-2 outperforms the Mistral and Llama-2 models at 7 billion and 13 billion parameters on many benchmarks. It also performs better than the Llama-2-70B model, which is 25 times larger, on tasks requiring multi-step reasoning, such as computer programming and mathematics,” Microsoft researchers write in the blog post presenting the model.

This competition is reminiscent of the race for large batteries in smartphones. Phone manufacturers like to highlight the capacity of their batteries, showing off higher milliamp-hour figures each year. Yet some manufacturers manage to do more with less, for example by optimizing the software so that it draws little power. In AI, too, it is possible to do more with less. One way to do this is to better sort and arrange the data on which the language model is trained.
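As a rough illustration of what “sorting and arranging” training data can mean in practice, here is a deliberately naive sketch, given purely for illustration rather than describing any company’s actual pipeline: it keeps only reasonably long, mostly ASCII documents and discards exact duplicates. Real curation pipelines add quality classifiers, fuzzy deduplication and much more.

import hashlib

def clean_corpus(documents):
    """Naive curation pass: drop very short texts, apply a crude
    non-English filter, and remove exact duplicates by hashing."""
    seen_hashes = set()
    kept = []
    for text in documents:
        text = text.strip()
        if len(text.split()) < 50:                 # too short to teach the model much
            continue
        ascii_ratio = sum(c.isascii() for c in text) / len(text)
        if ascii_ratio < 0.9:                      # crude proxy for "probably not English"
            continue
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if digest in seen_hashes:                  # exact duplicate of a page already kept
            continue
        seen_hashes.add(digest)
        kept.append(text)
    return kept

# `documents` could be, for example, text extracted from a Common Crawl dump.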

“Very few people know where to find interesting data. Most use a corpus called Common Crawl, but there are many other, lesser-known ones. Also, few people know how to organize training data optimally. It is a real work of art,” Vincent Luciani, CEO of Artefact, a consulting firm specializing in AI, explained to L’Express in mid-December. Although “mini-AIs” are full of promise, they will not replace the big ones. The language model that still crushes all the others by far is the giant GPT-4, estimated at more than a trillion parameters. And other techniques may, in the future, make it possible to run large language models on small devices.
