Alibaba Cloud Launches AI Video Generation Tool Live Portrait, Making Photos Speak with Ease
2023-08-17
On August 16, it was reported that Alibaba Cloud had recently launched a digital human video generation tool called Live Portrait. Given a photo and a passage of text or audio, the tool generates a digital human video in which the mouth movements are synchronized with the voice.


The feature has a wide range of application scenarios, including live video streaming, chatbots, and enterprise marketing. The tool is currently open for public trial in the ModelScope community.


In recent years, as research on generative AI has deepened, the industry has turned its attention to applications in more modalities, and AI video generation is among the areas drawing the most interest. By converting text or audio into facial motion information, Live Portrait generates realistic animations of the person in a photo and lowers the barrier to video shooting and production.


Live Portrait consists of a motion module and a generation module. It uses a mouth-shape prediction algorithm developed in-house by Alibaba Cloud, which greatly improves lip-sync accuracy compared with traditional methods. Explicit pose control was also added in the training phase, so videos with arbitrary motion can be generated without a template video. This makes the digital human's speech more lifelike and its movements more natural.


In addition, Live Portrait uses active eye control to add natural eye movement, bringing the generated results closer to how a real person looks. Technologies related to Live Portrait have reportedly been accepted at top international AI conferences such as CVPR and ICCV.


According to the ModelScope community page, after uploading a photo to Live Portrait, users can choose between text-driven and audio-driven modes. In text-driven mode, the tool offers 28 voices, including Mandarin, English, Cantonese, and children's voices. Live Portrait also offers a lightweight model option for faster video generation.


Zhang Bang, the algorithm lead for the tool, said: "Live Portrait integrates several innovative technologies developed in-house by the team, such as generating realistic facial animation from just a single image, breaking through the limitations of traditional generative adversarial networks. As the technology iterates further, image-to-video generation has huge room for application and is expected to become a production tool that helps enterprises cut costs and improve efficiency."


The team's research reportedly covers digital humans, AI generation of 3D models, high-fidelity rendering, and natural human-computer interaction, and the team has published more than 50 papers at top international academic conferences.