Google DeepMind is developing a system that produces music and dialogues

Google DeepMind, which works on artificial intelligence, is now developing a system that produces music and dialogue for videos.

The system, which Google DeepMind researchers call V2A, is currently in development. The researchers note that systems that generate video from text are advancing rapidly but can only produce silent output, and DeepMind wants to add audio to that output with V2A. The V2A technology generates sound effects, music, and dialogue from text prompts. It is also said to be able to analyze a video and synchronize the generated audio with it without any text description. The model, which stands apart from its competitors in this important respect, will not be released to the public in the near future, according to DeepMind’s statement.

Many audio-focused artificial intelligence systems have come to the fore recently. For example, in the past year Stability AI introduced its generative AI system Stable Audio 2.0. This system lets people create music up to three minutes long (44.1 kHz) from text input. In addition to creating music from text prompts, it can analyze uploaded royalty-free music and create similar tracks if desired. The system is still not at a level to replace musicians and professional sound artists, and Stability AI followed it with Stable Audio Open.

Stable Audio Open is designed as open source and can create short music samples and sound effects up to 47 seconds long from text prompts. The system is reported to have been trained on more than 486,000 samples. It can be especially useful for content creators.

Last week, ElevenLabs introduced a service that generates sound effects from text. The system is trained on content from Shutterstock and can produce genuinely usable output. At this stage it can generate sound effects up to 22 seconds long and can also include human voices and music. The system, which can be used free of charge, could become an indispensable tool for many content creators.
