It did not take long for a voice-focused AI system to be abused


It took the internet world very little time to abuse a promising voice-focused artificial intelligence system.

Recently, the voice- and speech-focused artificial intelligence startup ElevenLabs released, in beta, a platform that lets users do voiceovers, create entirely new synthetic voices if desired, or clone someone's voice. Once the system was opened for testing, it took the internet world (especially users on 4chan) only a few days to abuse it. The company felt compelled to make a statement on Twitter, saying that it had to take precautions against these abuses. According to reports, clips of famous people saying very offensive things spread rapidly on 4chan. Using this system, users prepared audio clips in which celebrities made homophobic, transphobic, violent and racist remarks. As you may recall, "deepfake" technology caused a similar stir when it first emerged: deepfakes were used to add the faces of female celebrities to a great deal of pornographic content.

Before this, Microsoft also made news in artificial intelligence and voice. In recent weeks, the company unveiled VALL-E. This system focuses on text-to-speech: it can analyze a person's voice from a recording of only 3 seconds and make it usable in long vocalizations. According to initial explanations, although it uses only 3 seconds of data, the system can deliver natural, non-robotic text-to-speech, and it is based on the "EnCodec" codec technology. EnCodec, an artificial intelligence-assisted audio compression method, can heavily compress audio without noticeable loss of quality.

In developing VALL-E, Microsoft also drew on Meta's data (60 thousand hours of speech, to be exact). With this infrastructure, Microsoft analyzes how a person's voice sounds during speech, breaks this information into separate components to make it usable, and uses the data it was trained on to work out how speech would sound beyond the three-second sample. It is reported that the system can imitate even the speaker's intonation and the acoustics of the environment in the input recording. VALL-E is still under development and shows great potential for the future.


