The new OpenAI tool that clones a voice in 15 seconds – L'Express


"Voice Engine" is the name of the new software presented on Friday, March 29 by OpenAI, the generative artificial intelligence (AI) giant and publisher of ChatGPT. According to an OpenAI press release on the results of a small-scale test, the tool can clone a voice from a 15-second audio sample and have it read out a text. OpenAI has released initial samples in which it is very difficult to distinguish the AI-generated voice from the reference one.

"Voice Engine" can read a text in the speaker's own language or in another one. But the release of this new software raises many questions about security and ethics: fraudulent uses, risks of plagiarizing artists, spread of false information. OpenAI has announced that use of "Voice Engine" will be restricted, to prevent fraud and crimes such as identity theft.

For what uses?

The company says it is taking "a cautious and informed approach" before any wider release of the new tool, "due to the potential for misuse of synthetic voices". OpenAI says it is working "with American and international partners from government, media, entertainment, education, civil society and other sectors". "We are taking their feedback into account as we develop the tool," the publisher said.


For now, only around ten developers have access to this technology, including "educational technology company Age of Learning, visual storytelling platform HeyGen, health software maker Dimagi, AI communication app creator Livox and the Lifespan health system", reports the specialized site The Verge. In one of the released samples, Age of Learning can be heard using the text-reading tool to produce educational content and answers to students' questions.

A guarantee of traceability

OpenAI specified that the partners testing "Voice Engine" have agreed to rules requiring, among other things, the explicit and informed consent of anyone whose voice is duplicated, as well as transparency for listeners: it must be made clear to them that the voices they hear are generated by AI. "We have implemented a range of security measures, including a watermark to trace the origin of all audio generated by Voice Engine, as well as proactive monitoring of its use," the company stressed.


Last October, the White House unveiled rules and principles to govern the development of AI, including a principle of transparency. Joe Biden expressed concern about criminals using the technology to trap people by posing as members of their family. OpenAI also suggested several measures that could limit the risks associated with tools of this type: legislation protecting people's voices from misuse through AI, better training in recognizing content generated by these technologies, notably deepfakes, and the development of traceability systems for these creations.

Fears in an election year

These precautionary measures come as disinformation researchers fear abusive uses of generative AI applications (automated production of text, images, etc.), and in particular of voice-cloning tools, in a year when the world faces several crucial elections. "We recognize that the ability to generate human-like voices carries serious risks, which are particularly significant in this election year," the San Francisco-based company acknowledged.


Recently, for example, a rival of Joe Biden in the Democratic primary deployed an automated calling program that impersonated the American president, who is campaigning for re-election. The voice imitating Joe Biden's urged voters to abstain in the New Hampshire primary. The United States has since banned calls using AI-generated cloned voices, in order to combat political and commercial scams.

OpenAI is not the only publisher interested in AI-generated speech: Podcastle and ElevenLabs have also developed voice-cloning techniques. But given the problems raised by the generation of human voices, most developers focus instead on designing instrumental or natural sounds.


