AI: the threat hanging over ChatGPT


Artificial intelligence (AI) has its star. OpenAI took the world by storm in November 2022 when it released ChatGPT, its chatbot that has an answer to almost everything. With the “wow” effect now over, the company co-founded by Sam Altman continues to reap the rewards. According to the American outlet The Information, OpenAI expects $1 billion in revenue for 2023, a year ahead of its initial projections. The firm is now going after businesses with a version of ChatGPT that improves on confidentiality and accuracy, two of its weak points so far. It is a racing car launched at full speed, with competitors such as Google and China’s Baidu, which has just unveiled its “bot” Ernie, in the rearview mirror.

However, two little words are starting to resonate in the AI ecosystem: open source. In short, this English expression means that the code of a program is open for others to consult and, above all, to use free of charge. The principle is as old as the Internet. A number of sharing rules are set by the Open Source Initiative, an association founded in California in 1998. Easily accessible, open-source building blocks are found in operating systems (Linux is a fully open-source example) and in cloud computing. The list is not exhaustive.

The generative AI boom is no exception. The vitality of a platform like HuggingFace has become the symbol of open source development in this field. To date, it hosts more than 300,000 open models and 50,000 datasets, two of the essential ingredients for designing any generative AI product. This “hub” of French origin raised 215 million euros a few days ago, bringing its valuation to more than 4 billion euros. Nvidia, Google and Amazon took part in the round. OpenAI did not.

“ChatGPT can be limited”

HuggingFace offers the luxury of not starting from scratch when designing software, paid or not. Anyone can try it, provided they know a little about the subject. The AI specialist LightOn took advantage of this to develop its latest program, “Alfred”, launched at the end of July. Using Falcon, a large language model (LLM) published free of charge by the Technology Innovation Institute in Abu Dhabi, the French company quickly built this assistant, which helps companies write prompts, the instructions given to AIs.

“Our bet is that our customers will not put all their eggs in one basket. There are subjects on which they will be very happy to use ChatGPT, because it still performs very well. But it can also be limited on certain tasks,” explains Laurent Daudet, the head of LightOn. Open source fills that gap, thanks to LLMs such as Falcon or Bloom, the latter born in France from a collaborative effort bringing together several hundred engineers and HuggingFace. These models hold their own against GPT, as does LLaMA-2, recently made available by Meta.

These are not their only advantages. “The principle of open source allows the community to identify flaws in the code, to correct errors, to improve programs,” notes Alice Pannier, a researcher in the geopolitics of technology at Ifri (the French Institute of International Relations). It also responds to “a real need for transparency on the part of the general public and customers,” according to Laurent Daudet, whereas OpenAI stands out for its opacity, which has already earned it a few lawsuits. Finally, open source can be seen as a way to demystify AI and its potential. “The best protection lies in knowledge and understanding of the models by as many people as possible,” Julien Chaumond, co-founder of HuggingFace, told L’Express.

But above all, these solutions save time and money. Training models like LLaMA requires months of work and sophisticated computer chips (GPUs), the best of which Nvidia sells for around $40,000 each. OpenAI has spent no less than a billion dollars to equip itself with a supercomputer that lives up to its ambitions. The current open source efforts could therefore change where the value of generative artificial intelligence has so far been found. For Julien Chaumond, the future “is no longer for very large, expensive models”. “Wealth is in the hands of those who know how to configure them effectively,” adds Tariq Krim, entrepreneur and founder of Cybernetica, a think tank on digital geopolitics. LightOn’s attempt, far from an isolated one, is evidence of this.

Meta in blaster

Of course, OpenAI has some real assets. Valued at 29 billion dollars and used by 1.6 billion people each month, ChatGPT appears indispensable for the time being. Committed to open source at the start (as its name suggests), OpenAI made a major shift on the matter in 2019 and chose to keep its recipes secret, in the hope of being more competitive and more secure. The idea at the time was to keep control of a technology that raises fears about employment, disinformation and democracy. “The downside of open models is the spread of these advanced tools. Now even Vladimir Putin has access to the potential of generative AI,” says Tariq Krim.

As for Meta’s recent turn toward open source in AI, it may not be as selfless as it sounds. “The incredibly important decisions about model weights, data, how those models are calibrated and trained, that are in the hands of big companies right now, are never open source,” Meredith Whittaker, president of Signal, the encrypted, open-source messaging service, recently lamented in an interview with a Swiss magazine. Three Dutch researchers from the University of Nijmegen have also ranked the models according to their degree of openness and transparency. Meta sits at the bottom of the ranking. It is hard not to see a deliberate strategy there. “Using their tool creates a form of dependence on them,” says Alice Pannier. It is a way for Meta to gain influence in this area without launching a direct competitor to ChatGPT.

In the end, Julien Chaumond “does not believe” in a definitive victory for one camp over the other, nor in a black-and-white battle pitting OpenAI and Google, which is also rather closed, against the rest of the tech industry. “They use open source tools, like everyone else,” he says. Moreover, in the field of generative AI, “innovation comes mainly from Big Tech and especially from OpenAI,” says Tariq Krim. “Open source models are currently lower-performing copies.”

A card to play for Europe?

Not everyone shares this opinion. According to some voices, OpenAI still has reason to worry. “We are now seeing open models that rival closed-source alternatives,” the prominent American venture capital fund Andreessen Horowitz wrote recently, even though it was among Sam Altman’s earliest backers. An engineer at Google publicly expressed concern about the dynamism of this community in a blog post published in the spring. “We are not in a position to win the AI race. And neither is OpenAI… Open source is overtaking us. Our lead is melting away at incredible speed. Open source models are faster, more customizable and more capable.” The general public can see this for itself in image generation. The open Stable Diffusion has nothing to envy DALL-E, the similar tool designed by OpenAI, whose manufacturing secrets remain opaque. “All the internet technologies that have ultimately prevailed are based on open source,” Yann LeCun, Meta’s AI chief, said in the spring in the magazine Usbek & Rica.

Europe, and France in particular, sees an opportunity in this. At the VivaTech trade show in Paris in June, Emmanuel Macron himself argued for the development of open LLMs. “This digital approach can be seen as a third political and diplomatic path, between the American model, oriented toward proprietary systems, and the Chinese techno-nationalist model,” says Alice Pannier, the Ifri expert. The strategy is fairly consistent with the push for technological transparency in the recent Digital Services Act (DSA), which forces the major social networks to open up their algorithms. It is also a pragmatic response to European firms’ lack of resources and appeal compared with their rivals.

One of the most promising French start-ups, Mistral, recently raised 100 million euros to build an LLM capable of coming close to OpenAI’s. Open source, of course. That is good fuel, provided Europe maneuvers with tact. HuggingFace, among others, criticized the first draft of the AI Act, Europe’s regulatory text. According to several entrepreneurs, the heavy legal constraints it could impose on open source developers risk bringing the sector to a halt, and clearing the road for OpenAI.
