Google unveiled its new artificial intelligence-backed video generator Veo, signaling a significant expansion in the world of AI art.
Google went all in on its Gemini AI model at the company’s I/O keynote on Tuesday, and there were plenty of photo and video updates on display. In addition to Veo, Google announced Imagen 3, an upgrade to its image creation tool, and expanded Gemini use cases with “Ask Photos,” which works with Google Photos.
Now that AI photos have become almost omnipresent, video feels ripe for disruption from large language models like Gemini. Just a few months ago, OpenAI made a big splash in the tech and video worlds when it released several short videos created with its AI video generation tool Sora. Even though these clips weren’t perfect, they were still considered hyper-realistic and highly detailed.
Google says Veo can create “high-quality 1080p resolution videos in a wide range of cinematic and visual styles that can go beyond a minute,” and at I/O it showcased footage that the company says is unedited raw output. Additionally, the tech giant says Veo can mimic real-world physics and understands queries that include specifications for time-lapses and aerial footage.
To create a video using Veo, users can opt for text, video, or image prompts, and additional prompts can be used for edits.
For now, Veo will be available only to “select creators in private preview in VideoFX,” the company’s new “experimental tool,” over the next few weeks. A waitlist is open now for anyone else who is interested.
While AI image and video generation have drawn criticism over whose art is used to train these models, and whether those artists are aware of or compensated for that use, Google didn’t address those matters. Instead, it leaned heavily into the positives and how the technology will bring video creation to the masses. And it’s not just Google saying it: famous actor, writer, rapper, and director Donald Glover is saying it too, at Google’s I/O keynote.
“Everybody’s going to become a director, and everybody should be a director,” Glover says in a video where he, along with his creative studio Gilga, works on a short film using Veo. “Because at the heart of all of this is just storytelling. The closer we are to being able to tell each other our stories, the more we’ll understand each other.”
Google Launches Imagen 3
Not to be forgotten, Google’s AI text-to-image model Imagen also got some love during the I/O keynote with the announcement of the tool’s third iteration. Imagen 3 is meant to better understand “natural language, the intent behind your prompt and incorporates small details from longer prompts,” according to Google.
Details are meant to be richer, and the results less wonky (sorry, results contain “fewer visual artifacts”). Further, the update means Imagen 3 is less likely to forget about the smaller details users might include in prompts. It’s also much better at rendering text. Users can sign up to try Imagen 3 in ImageFX, and it is “coming soon” to developers and enterprise customers in Vertex AI.
Image credits: Google