When Rodolphe Saadé (CMA-CGM), Xavier Niel (Iliad) and Eric Schmidt (ex-Google) take the stage at Station F on November 17, 2023, the French artificial intelligence sector holds its breath. The result lives up to expectations: the three entrepreneurs announce the creation of Kyutai, a non-profit AI research laboratory endowed with 300 million euros. Its mission? To become a reference in the field. A year later, the general director of Kyutai, Patrick Pérez, is all smiles when he meets L’Express. Rightly so: in one year, the laboratory has already pulled off several feats, the most important of which is undoubtedly Moshi. The voice chatbot, a demo of which was unveiled in July and whose source code was published in September, is impressively fast.
A European player facing OpenAI
From the start, Kyutai sought to revolutionize the field of voice assistants, previously infamous for their slowness and verbal rigidity. “When ChatGPT came out, we were all impressed that we could talk to an AI. But as soon as we switched to voice, there was a lag. The interaction wasn’t good, the latency was too long, it didn’t work. It was neither fluid nor expressive, nothing like a real conversation,” explains the general director. Kyutai therefore set to work, with success. So much so that a daring Moshi sometimes cut off its interlocutors during its presentation in July. The public online demo has since been used nearly 500,000 times. “People played with Moshi, and were very surprised by its responsiveness and fluidity,” comments Patrick Pérez with a smile.
Between the creation of the laboratory and the deployment of Moshi, “their speed is still exceptional. The feedback from the ecosystem on them is very good. Everyone has tested Moshi,” observes Mehdi Triki with pride. For this head of public and institutional relations at Hub France IA, the association bringing together players in the sector, Kyutai’s work on Moshi places it in the same league as OpenAI, even if the American giant, founded in 2015, valued at $157 billion and staffed by 1,700 employees, is far more imposing than the French laboratory and its fifteen researchers.
“We want to grow,” confirms Patrick Pérez. “We are always looking for sharp profiles, but it is more important to us to grow carefully than quickly.” The competition to attract talent is, admittedly, fierce in this booming sector. To win candidates over, Kyutai relies on what sets it apart. “In the French and European landscape, there are no other non-profit organizations with a similar budget. And the team has a very high level,” argues Patrick Pérez. Not to mention that Kyutai has access to the supercomputer, equipped with Nvidia chips, of Scaleway, the cloud provider that belongs to the portfolio of Iliad and Xavier Niel.
The very high cost of training models is, however, a challenge for an entity such as Kyutai. The same problem prompted the leader OpenAI, which had a similar status, to review its structure and open a for-profit subsidiary. “For the moment, Kyutai does not have a business model,” acknowledges Patrick Pérez. Moshi, impressive as it is, doesn’t make money: the training codes and data are open source, available for free. “The question of viability is important in the long term, but we have time to see it coming. Our endowment is large, and we believe we can attract other donors. Moshi is a tour de force which has impressed, so it may convince other institutions.” Other options, such as the creation of a spin-off, will be studied in the longer term if necessary, but the laboratory says it has not reached that stage of thinking. Above all, the director rules out developing a commercial arm inside the laboratory, as OpenAI did.
Money, the sinews of war
“The ability to conquer the business and general public market will be decisive,” warns Mehdi Triki, however. Because the competition promises to be tough. When OpenAI launched ChatGPT, the technology quickly became essential, and the company a reference. And in the new technology sector, the first entrant often enjoys an advantage. “The other question that Kyutai must answer is: what’s next after Moshi?” adds the head of public and institutional relations at Hub France IA.
Kyutai is fortunately not short of ideas. “For Moshi, we are thinking of use cases in call centers, satisfaction surveys, but also in education or accessibility for people with disabilities,” reveals Patrick Pérez. Moshi could thus become “the voice and the eyes” of many people.
Above all, Moshi has only taken its first steps. “We want to make it speak other languages and give it sight,” explains the director of Kyutai. The laboratory is working on multimodal models, but also on smaller, more compact models. In the longer term, Kyutai dreams of transforming the very architecture of the models, for the moment all based on a very specific type of neural network, called “transformers”. “These networks work well, but they have many drawbacks, and they are perhaps not destined to embody the future,” believes Patrick Pérez. “We want to build this future, the ‘transformers 2.0’.”