“Generative AI models cannot do addition” – L’Express

How do we create AI that is more useful? And more reliable? These questions are two of the great challenges of our century, and ones that Ece Kamar, general manager of the AI Frontiers lab at the tech giant Microsoft, works on daily with her team. For this expert, memory will be a decisive building block to add to artificial intelligence in order to make it progress. But there is also a lot to learn about how the recent reasoning models (OpenAI's o3, DeepSeek's R1, etc.) work. Interview.

Read also: George Lee (Goldman Sachs): “We are very impressed by Mistral”

L’Express: How can memory help AIs progress?

Ece Kamar: My laboratory is very interested in the pieces of the puzzle we need to add in order to create AI capable of solving our problems: AI that understands us and reliably lightens our daily tasks. And it is becoming obvious that memory is an important building block, because one of the limits of large language models (LLMs) is that they do not learn; they are static. Even if we point out their mistakes or give them details about ourselves, they do not adapt their answers to our profile and our needs. Without memory, LLMs do not remember what they are told, which makes them static, boring and much less useful. Giving AI "memory" spares us from having to repeat the same things constantly.
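To make the idea concrete, here is a minimal sketch in Python of the kind of memory layer described above: facts about the user are stored between sessions and prepended to each new prompt, so they never need to be repeated. The file name and prompt format are illustrative assumptions, not Microsoft's implementation.

```python
# Minimal sketch of a conversational memory layer (illustrative only).
import json
from pathlib import Path

MEMORY_FILE = Path("user_memory.json")  # hypothetical storage location

def load_memory() -> list[str]:
    """Load previously remembered facts about the user, if any."""
    if MEMORY_FILE.exists():
        return json.loads(MEMORY_FILE.read_text())
    return []

def remember(fact: str) -> None:
    """Persist a new fact so future sessions can reuse it."""
    facts = load_memory()
    facts.append(fact)
    MEMORY_FILE.write_text(json.dumps(facts, indent=2))

def build_prompt(user_message: str) -> str:
    """Prepend stored facts so the model does not have to be told again."""
    context = "\n".join(f"- {fact}" for fact in load_memory())
    return f"Known facts about the user:\n{context}\n\nUser: {user_message}"

# Usage: remember("The user prefers answers in French.") once, and every later
# build_prompt("Summarize this article.") carries that preference along.
```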

https://www.youtube.com/watch?v=Aeqcq75qj_Q

By definition, a generative AI's answer to the same question varies. That is part of what gives the exchange its flavor, but doesn't it greatly complicate the execution of certain tasks?

Machine learning models built with statistical learning techniques are intrinsically probabilistic, which makes them stochastic. Every time a machine learning model runs, it makes a prediction about the best thing to say, and it does so on the basis of a probability distribution. Imagine that the probability of saying word A rather than word B is 90% versus 10%. In 90% of cases, the AI system will produce word A, but in 10% of cases it will produce word B. That can unsettle a human being who would systematically opt for answer A. Yet because of the probabilistic nature of these models, there will be random cases where the system gives answer B. And for tasks that require predictability, this becomes a problem to be solved.
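Below is a minimal sketch of the sampling step she describes, assuming a toy two-token vocabulary; real models sample over tens of thousands of tokens, but the mechanism is the same.

```python
# Toy illustration of stochastic next-token sampling (not a real model).
import random

def sample_next_token(distribution: dict[str, float]) -> str:
    """Draw one token according to its probability."""
    tokens = list(distribution.keys())
    weights = list(distribution.values())
    return random.choices(tokens, weights=weights, k=1)[0]

# With P(A) = 0.9 and P(B) = 0.1, roughly one run in ten returns "B".
distribution = {"A": 0.9, "B": 0.1}
samples = [sample_next_token(distribution) for _ in range(1000)]
print(samples.count("A"), samples.count("B"))  # approximately 900 vs 100
```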

A calculator seems a much better tool than generative AI for carrying out mathematical operations. So how can these "unpredictable" AIs help scientific research?

Because these AI models are stochastic, they are indeed, by nature, not good at deterministic calculations such as addition. But what is very interesting is that we can train them to use the appropriate tools. We can teach the models not to attempt the calculation themselves, but to call on a calculator to carry out the operation and then reintegrate the result into a better-supported answer. Why is this useful? Because solving a sophisticated mathematical problem actually involves several subtasks. You have to think about the different solution techniques, execute one of them, do the calculations and, finally, be able to validate the correctness of the solution.
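A minimal sketch of the tool-calling pattern she describes: instead of the model predicting the digits itself, it would emit a structured calculator call, the host code evaluates it deterministically, and the exact result is folded back into the answer. The `answer_with_tool` helper and the call format are assumptions for illustration.

```python
# Illustrative tool-use step: delegate arithmetic to a deterministic calculator.
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.truediv}

def run_calculator(a: float, op: str, b: float) -> float:
    """Deterministic tool: always returns the exact result."""
    return OPS[op](a, b)

def answer_with_tool(question: str, tool_call: tuple[float, str, float]) -> str:
    """Pretend the model emitted `tool_call` rather than guessing the number,
    then reintegrate the exact result into a natural-language answer."""
    a, op, b = tool_call
    result = run_calculator(a, op, b)
    return f"{question} The result is {result}."

# For "What is 1234 + 5678?", a tool-using model would emit (1234, "+", 5678)
# instead of predicting the sum token by token.
print(answer_with_tool("What is 1234 + 5678?", (1234, "+", 5678)))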

Read also: Artificial intelligence: Mistral, the last hope of Europe against OpenAI and DeepSeek

Current models are suited to some of these tasks, but not to others. For a calculation, for example, we will teach the model that it is best to use a calculator. Similarly, if the system needs to see previous solutions in order to learn from them, it can go on the web, run a search, collect that information and put it in memory, as we mentioned earlier.

The fundamental challenge in mathematics, however, is knowing how to consider different solution techniques, implement them, check whether the promising technique gives you the right answer and, if it does not, be able to backtrack and move toward a different solution. Traditional language models were poor at this. But recently, new generations of AI have generated real enthusiasm: reasoning models such as o1 and o3 (Editor's note: OpenAI) or the R1 model (Editor's note: DeepSeek). Their innovative architecture allows them to carry out complex thinking, to consider several ways of solving a problem, to assess the correctness of their approach and, if necessary, to try alternative solutions. These are very important capabilities for scientific discovery, for solving complex mathematical problems and for solving complex tasks in general, capabilities that traditional LLMs did not have until now. To come back to your question of how these models can help scientists: I think we are all looking for an entity that lets us exchange ideas in order to be more creative in our solution techniques.
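The propose-verify-backtrack loop she describes can be sketched as explicit host code, as below; real reasoning models do this implicitly inside their chain of thought. The `techniques` and `verify` callables here are hypothetical stand-ins.

```python
# Illustrative solve loop: try candidate techniques, verify, backtrack on failure.
from typing import Callable, Optional

def solve_with_backtracking(
    problem: str,
    techniques: list[Callable[[str], Optional[str]]],
    verify: Callable[[str, str], bool],
) -> Optional[str]:
    """Try each candidate technique; keep the first answer that verifies."""
    for attempt in techniques:
        candidate = attempt(problem)
        if candidate is not None and verify(problem, candidate):
            return candidate          # verified: stop here
        # otherwise backtrack and try the next technique
    return None                       # no technique produced a verified answer

# Toy usage: two "techniques" for summing 1..100, one flawed and one correct.
problem = "sum of integers from 1 to 100"
techniques = [
    lambda p: "5000",                        # flawed shortcut
    lambda p: str(sum(range(1, 101))),       # correct enumeration
]
def verify(p: str, ans: str) -> bool:
    return ans == str(100 * 101 // 2)        # closed-form check

print(solve_with_backtracking(problem, techniques, verify))  # -> "5050"
```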

How do these new models that "reason" work differently from previous generations?

Until now, large language models (LLMs) learned by trying to predict the next word, being penalized or rewarded according to how well they predicted it. This has its limits when it comes to solving reasoning problems where you need to think in several steps, check the correctness of your approach and, when necessary, backtrack. The "reasoning" models are trained through experience. This is called reinforcement learning (Editor's note: RL). It essentially consists of learning by trial and error: the models practice solving problems and check whether they reach the right answer.
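A minimal sketch of the trial-and-error signal described here: the model produces an attempt, a checker scores it 1 if the final answer is correct and 0 otherwise, and that reward is what a training update would reinforce. The `generate_attempt` and `update_policy` functions are hypothetical placeholders, not an actual training recipe.

```python
# Illustrative outcome-based reward, the core signal in RL-style training.
import random

def generate_attempt(question: str) -> str:
    """Hypothetical stand-in for the model producing an answer."""
    return random.choice(["12", "13", "14"])

def reward(answer: str, correct: str) -> float:
    """1.0 if the final answer is right, 0.0 otherwise (trial and error)."""
    return 1.0 if answer.strip() == correct else 0.0

def update_policy(question: str, answer: str, r: float) -> None:
    """Placeholder for the gradient update that reinforces rewarded attempts."""
    print(f"question={question!r} answer={answer!r} reward={r}")

question, correct = "What is 7 + 6?", "13"
for _ in range(5):                      # a handful of practice attempts
    attempt = generate_attempt(question)
    update_policy(question, attempt, reward(attempt, correct))
```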

Read also: Lila Ibrahim (Google DeepMind): "AI now shows how it thinks"

Curiously, fairly simple training seems to teach the models complex patterns of thought. I think that, as scientists, we still have a lot of work to do to understand why. We need to solidify these approaches and make sure they work reliably in the areas that interest us. It is an exciting adventure for us and for many teams around the world. It brings a new way of thinking about AI models, which is very, very exciting.

AI has sometimes been described as a black box. Is this still the case today?

The most basic definition of transparency is being able to open up a machine learning model, look at its activations and weights, and retrace the paths: for a given input, here is how the machine makes its decisions, which activations fire, which mathematical functions run to produce a given output. That level of transparency is no longer realistic for the large foundation models we are studying. But the interesting thing about current models is that it is possible to dialogue with them. You can ask the model to explain the logic underlying its decisions. You can challenge it by asking how its answer would have varied if the data it was given had been slightly different. The fact that we can converse with these models is a different level of transparency.
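A minimal sketch of the dialogue-based probing mentioned above: the same hypothetical `call_llm` helper is asked first for the reasoning behind an answer, then for a counterfactual. The function and the example question are placeholders for whichever chat API and use case are at hand.

```python
# Illustrative dialogue-based transparency probe (call_llm is a placeholder).
def call_llm(prompt: str) -> str:
    """Stand-in for a real chat-completion call; returns a canned reply here."""
    return f"[model reply to: {prompt[:60]}...]"

answer = call_llm("Based on the data provided, should this transaction "
                  "be flagged as suspicious?")
explanation = call_llm(f"You answered: {answer}. Explain the logic behind this decision.")
counterfactual = call_llm("How would your answer change if the transaction "
                          "amount were ten times smaller?")

print(answer, explanation, counterfactual, sep="\n")
```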

What do we need today to make AI reliable enough to be used in areas such as health care?

These models must have the capacity to assess the correctness of their own answers. Am I on the right track? How confident am I in my answer? When we humans solve problems, we are intrinsically aware of the quality of our work; we know whether we are stuck or making good progress. Until now, the models have not had these capacities. But we are finding ways to complement these systems with techniques that check correctness. This is a key step toward building reliable systems for critical uses such as health care. But human interaction remains very important. In the health sphere in particular, the point is to empower people rather than replace them. We do not want a world where medical decisions are taken by AI alone. Doctors are the real experts. They must stay in charge.
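One simple way to approximate the self-assessment described here, sketched below with a hypothetical `call_llm` helper: sample the same question several times and treat the agreement rate of the majority answer as a rough confidence score, deferring to a human when agreement is low. This is one illustrative technique (often called self-consistency), not necessarily the correctness checks her lab uses.

```python
# Illustrative confidence estimate via repeated sampling (self-consistency style).
from collections import Counter
import random

def call_llm(question: str) -> str:
    """Hypothetical stand-in for a stochastic model call."""
    return random.choice(["42", "42", "42", "41"])   # mostly consistent toy output

def answer_with_confidence(question: str, n_samples: int = 10) -> tuple[str, float]:
    """Return the majority answer and the fraction of samples that agree with it."""
    answers = [call_llm(question) for _ in range(n_samples)]
    best, count = Counter(answers).most_common(1)[0]
    return best, count / n_samples

answer, confidence = answer_with_confidence("What is 6 * 7?")
if confidence < 0.8:                     # low agreement: defer to a human expert
    print(f"Low confidence ({confidence:.0%}); escalate to a clinician.")
else:
    print(f"Answer: {answer} (confidence {confidence:.0%})")
```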
