Millions here, billions there. The frenzy around generative AI, which began with the release of ChatGPT eighteen months ago, is not waning. It even seems to have shifted up a gear. At the opening of the VivaTech show on May 22, a new French start-up announced that it had raised $220 million in its seed round, its very first fundraising. An extremely rare event: the start-up, simply called “H” and founded by DeepMind veterans, even had the luxury of outdoing the much-hyped Mistral AI at the same stage.
This splashy debut can be explained by H’s ambition: a run at artificial general intelligence (AGI), the Holy Grail set by the sector’s leader, OpenAI. It is a race the young company intends to join through “agentic models”, it writes in a press release: a technology aimed at shaping specialized “agents” capable of reasoning, planning actions and collaborating with humans, or with one another. It is the big trend of the moment in AI. “In my opinion, agentic AI offers immense possibilities and promises to exceed the current capabilities of GenAI,” Daniel Dines, chief executive of UiPath, a major investor in H, told L’Express. Around a hundred companies – including, of course, OpenAI – are actively working on the subject, according to a recent study by the Dutch technology firm Prosus Group.
To understand this craze, a short step back is necessary. Current generative AI tools, like ChatGPT, rely on large language models, or LLMs, such as GPT-4. For each query, the chatbot produces an almost entirely statistical response, based on the probability that one word will follow another. Thanks to the colossal amount of data these programs are trained on, the method gives more than decent – some would say stunning – results, but far from perfect ones, note a number of leading scientists today, such as the American Andrew Ng (Coursera) or the head of Meta’s AI research, the Frenchman Yann LeCun, who consider current LLMs limited to very general writing or document-analysis requests. What’s more, LLMs regularly make mistakes and sometimes invent answers when they do not know. They are then said to be “hallucinating”.
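To picture that word-by-word statistical choice, here is a purely illustrative toy in Python; the sentence, the candidate words and their probabilities are invented for the example and bear no relation to any real model.

```python
# Toy illustration (not a real LLM): choosing the next word from
# hand-made probabilities, the way a language model picks the most
# statistically likely continuation. The sentence and numbers are invented.
import random

# Hypothetical probabilities a model might assign after "The cat sat on the"
next_word_probs = {"mat": 0.62, "sofa": 0.21, "roof": 0.12, "moon": 0.05}

def sample_next_word(probs: dict[str, float]) -> str:
    """Draw one word according to its assigned probability."""
    words = list(probs)
    weights = list(probs.values())
    return random.choices(words, weights=weights, k=1)[0]

print("The cat sat on the", sample_next_word(next_word_probs))
```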
This is where agents come in, by going “a stage above” what LLMs offer, says Aymeric Roucher, an engineer specializing in the subject at Hugging Face. Roughly speaking, these programs, still powered by LLMs but configured in a specific way, “are capable of calling external tools (software, Web services, etc.) or reflecting on what they generate in order to arrive at the best result,” the expert summarizes. They rely on what is called “iteration”: repeating a computation until a performance or quality condition is met. Agents thus prove more effective on the industry’s most advanced reference benchmarks, observes Aymeric Roucher.
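As a rough picture of that loop of tool calls and iteration, here is a minimal Python sketch; the “LLM” is a toy stub, and every name (toy_llm, TOOLS, run_agent) is a placeholder rather than the interface of any real agent framework.

```python
# Minimal sketch of an "agentic" loop: a model is asked for an action,
# an external tool is called, and the cycle repeats (the "iteration")
# until a satisfactory answer exists or a step budget runs out.

TOOLS = {
    # Stand-in for real external tools (web search, code execution, etc.)
    "calculator": lambda expr: str(eval(expr, {"__builtins__": {}})),
}

def toy_llm(context: str) -> dict:
    """Toy stand-in for an LLM: asks for the calculator once, then answers."""
    if "Observation:" not in context:
        return {"tool": "calculator", "tool_input": "21 * 2"}
    result = context.rsplit("Observation:", 1)[1].strip()
    return {"final_answer": f"The result is {result}."}

def run_agent(task: str, max_steps: int = 5) -> str:
    context = task
    for _ in range(max_steps):
        decision = toy_llm(context)                  # the model decides the next step
        if "final_answer" in decision:
            return decision["final_answer"]          # stop once an answer is reached
        tool = TOOLS[decision["tool"]]               # otherwise call the chosen external tool
        observation = tool(decision["tool_input"])
        context += f"\nObservation: {observation}"   # feed the result back and iterate
    return "No satisfactory answer within the step budget."

print(run_agent("What is 21 times 2?"))  # prints: The result is 42.
```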
“More automation”
The goal for agents is to take on more complex tasks that cannot be solved in a single request, or “prompt”. “Imagine you want to plan a trip: a personal agent can book everything, from the flight to the hotel, and even pay,” explains Florian Douetteau, CEO of the unicorn Dataiku. Unlike LLMs, agents favor action over mere recommendation. The general public has already started to experiment with them, whether through the custom GPTs offered by OpenAI, now available for free, or through RAG (retrieval-augmented generation), an agent-like mechanism that lets chatbots such as Google’s or Perplexity’s search for recent sources to back up their answers, whereas an LLM’s knowledge stops at its training date, generally several months in the past.
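To make the RAG idea concrete, here is a minimal sketch: fetch the documents most relevant to the question, then build the prompt a chatbot would hand to its LLM. The tiny corpus and the word-overlap scoring are stand-ins invented for the example; real systems use search indexes or vector embeddings.

```python
# Minimal sketch of retrieval-augmented generation (RAG): retrieve the
# most relevant documents, then assemble an augmented prompt that asks
# the model to answer using those sources.

CORPUS = {
    "doc1": "H raised $220 million in its seed round, announced at VivaTech.",
    "doc2": "RAG combines a retrieval step with text generation.",
    "doc3": "LLMs have a knowledge cut-off set by their training data.",
}

def retrieve(question: str, k: int = 2) -> list[str]:
    """Rank documents by naive word overlap with the question (toy retriever)."""
    q_words = set(question.lower().split())
    scored = sorted(
        CORPUS.values(),
        key=lambda doc: len(q_words & set(doc.lower().split())),
        reverse=True,
    )
    return scored[:k]

def build_prompt(question: str) -> str:
    """Assemble the augmented prompt that would be sent to the chatbot's LLM."""
    sources = "\n".join(f"- {doc}" for doc in retrieve(question))
    return f"Answer using these sources:\n{sources}\n\nQuestion: {question}"

print(build_prompt("How much did H raise in its seed round?"))
```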
Companies are also getting on board, little by little. Laurent Daudet, who heads LightOn, builds professional agents with tailor-made roles. At the moment, he is working for a client on “an agent that automatically responds to calls for tenders”. A typically “agentic” job, since it requires access to different tools, textual or numerical data, and a series of actions to reach a result. “Understanding the call for tenders, consulting the internal documentation, comparing it to other calls for tenders handled in the past…” Laurent Daudet lists. Once again, the objective is for the agent to carry out the operation from A to Z. “For companies, the challenge of this new stage of generative AI is to provide more automation on specific tasks,” points out Florian Douetteau, of Dataiku.
Towards agent teams
The new so-called “agentic” model architectures, such as the one H promises, should make agents ever more efficient and autonomous, with better memory and greater capacity for action and planning. “This seems a necessary step for generative AI to be useful in the real world,” says Aymeric Roucher. The hope, then, is to get them to work as a team. This is what is known as “multi-agent” systems. “This approach is currently being tested in particular for the development of Web applications,” relates Dataiku’s Florian Douetteau. “One agent will play the role of the product manager, a second that of the developer, a third the tester, and a fourth, finally, will synchronize them all… And they will discuss among themselves in order to create the best possible application.” A sort of technological twist on the adage: “Alone, we go faster; together, we go further.”
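As a schematic picture of that division of roles, here is a toy Python sketch in which role-playing agents take turns on a shared transcript until the tester is satisfied; the agents are hard-coded functions standing in for LLM-backed ones, and none of this reflects any particular framework.

```python
# Schematic sketch of a "multi-agent" setup: role-playing agents
# (product manager, developer, tester) take turns on a shared
# conversation, with a coordinator deciding when to stop.

def product_manager(transcript: list[str]) -> str:
    return "Spec: a web page with a button that shows a greeting."

def developer(transcript: list[str]) -> str:
    return "Code: <button onclick=\"alert('Hello')\">Greet</button>"

def tester(transcript: list[str]) -> str:
    return "Test: clicking the button shows 'Hello' - PASS"

ROLES = [("PM", product_manager), ("Dev", developer), ("Tester", tester)]

def coordinator(max_rounds: int = 3) -> list[str]:
    """Synchronize the agents: run them in turn until the tester reports PASS."""
    transcript: list[str] = []
    for _ in range(max_rounds):
        for name, agent in ROLES:
            message = agent(transcript)          # each agent reacts to the shared history
            transcript.append(f"{name}: {message}")
            if name == "Tester" and "PASS" in message:
                return transcript                # stop once the tester is satisfied
    return transcript

for line in coordinator():
    print(line)
```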
Although research is progressing very quickly at the moment, helped by major open-source code libraries such as AutoGen (Microsoft), LlamaIndex and CrewAI, multi-agent systems are not yet ready to take over the world. Not only do agents need access to a considerable number of tools to be as autonomous as possible, but above all they need to learn how to use them. “For this, you need records of action sequences carried out by humans. Actions that succeeded and others that failed. It’s not that simple,” says Florian Douetteau.
Some are trying to fill this gap. That is the case of Rabbit and its futuristic little device, the “R1”, presented as a new kind of voice-controlled assistant. The machine runs a LAM – for “large action model” – which imitates the interactions a user can have with applications like Airbnb in order to book a trip. As if looking over the user’s shoulder, the machine trains itself to understand how to carry out tasks in their place, so that it can imitate them afterwards.
Once these obstacles are overcome, Hugging Face engineer Aymeric Roucher “sees no limits to what agents can accomplish.” According to him, they “will be able to do everything that we do behind a computer”. A tempting prospect, which nevertheless points to the challenges generative AI is likely to pose to our societies, particularly in the organization of work. Sam Altman, head of OpenAI, imagines agents as “super-competent colleagues who know absolutely everything about [our] life, every email, every conversation,” he told the MIT Technology Review. Other ethical issues inevitably arise if these agents, to whom we entrust our personal data and the power to act on our behalf, were ever hijacked for malicious purposes. “New user interfaces need to be imagined,” believes Florian Douetteau, to allow for human validation at different stages. So as to maintain control over these machines, which seem to aspire, more and more, to do without us.