New system that can turn photos into video: Stable Video Diffusion

Stable Video Diffusion, a new system that can turn uploaded photos into videos, is making waves today. The company behind it is Stability AI.

Developed by Stability AI, the company known for Stable Diffusion, which generates images from written prompts, Stable Video Diffusion brings together two different generative AI models in an open-source package and can be run locally on systems with Nvidia graphics cards. The system analyzes the photos uploaded to it and can create short videos of about 4 seconds from them. It can make the people or objects in its videos move, or animate the background directly. For now its range of uses is very limited, and it is still in early-stage testing. As testing continues, the underlying technology is expected to mature and much longer videos should become possible.

There are other examples in this area. One of them is Gen-2, an artificial intelligence system developed by Runway that can create videos from text. The promotional video for that system is here, and the details we covered previously are here.

An earlier video on this subject, featuring Will Smith, sparked a lot of discussion. It was both funny and unsettling. The video, which simulates Smith eating spaghetti in an incredibly strange way, came from a Reddit user named “Chaindrop”.

Reportedly, 10 two-second clips, each generated independently of the others, were stitched together to make the 20-second video. Each clip shows a simulated Will Smith greedily eating spaghetti from a different angle, and the technology behind the process is still under development.

The video was reportedly made with ModelScope, an artificial intelligence tool from DAMO Vision Intelligence Lab, a research division of Alibaba. ModelScope is based on a “text2video” model, a system trained to create videos from written text by analyzing millions of photos and thousands of videos in datasets such as LAION5B, ImageNet and Webvid.
