It’s been 90 days since Google surprised us with Flow at its annual I/O developer conference. According to Elias Roman, senior director of product management for Flow at Google Labs, most of the time since then has been spent “just working to meet demand.”
Flow is a departure from Google’s previous generative AI work. For years, the company’s AI efforts were focused on Gemini, its all-in-one chatbot. The company has also infused its products with artificial intelligence, such as AI Overviews in Search and AI-generated summaries in Gmail. Its research assistant tool NotebookLM, with an AI audio generator that can turn documents into personal podcasts, is constantly introducing new features.
The industry leader has spent billions of dollars trying to win the race to develop the most advanced artificial intelligence, with tools aimed at researchers, developers and, yes, even artists and creators. Reaching 100 million AI-generated videos is a milestone for the company, and it shows us what the future of AI-powered creativity could look like.
Immerse yourself in the flow of AI
To compete with Midjourney and Stable Diffusion, Google has created a series of AI image models called Imagen (pronounced “imagine”). Its earlier generative media models were better suited to amateur or enthusiast creators than to professionals, and they did not dominate the creative AI space. That changed with Veo 3.
Google unveiled Veo 3, its latest AI video model, at its I/O conference in May. With Veo 3, it moved ahead of the competition with a somewhat obvious but industry-first feature: AI video with synchronized AI-generated audio. The model attracted a lot of attention online, and Google reported that more than 40 million AI videos had been generated seven weeks later.
“I see Veo 3 as enabling a much larger group of people to create highly immersive videos that immediately engage all the senses. There was no need to build a toolbox,” said Roman. “I think it’s also a great opportunity to be able to do the foley (ambient sound), the sound effects, the soundtrack, the dialogue and everything else without the user having to think about each of these modes in a certain way.”
Veo 3 is one of many AI models you can use in the filmmaking tool. Designed for creative and professional filmmakers, Flow goes far beyond the simple image and video generation possible with Gemini. Google deliberately moved away from the original ImageFX nomenclature and evolved the interface, Roman said. He wanted Flow to combine the more advanced Imagen and Veo models with Gemini, which was used in the Veo line and “basically speaks native Veo.”
Flow is a way to combine all these AI models and components, unifying Google’s various generative AI models for seamless video creation and editing.
What makes Flow different from Veo and Imagen?
Flow is designed to focus on consistency, or the ability to maintain visual identity from one clip to another. If you have a 90-second video of your character drinking coffee in a cafe, you don’t want their hair length or eye color to change every 8 seconds between scenes. This consistency is important for professional projects, but it is also difficult to achieve. Roman called it the “Achilles heel of AI video.”
Flow has several tools to help you maintain this consistency, and in my testing, they give you a new level of control over your work that Google’s AI tools previously lacked. I can best describe Flow as an enhanced version of simple video generation interfaces, with the ability to export multiple clips into a simplified version of a Premiere Pro-style timeline.
AI tools are often updated in hopes of making them more useful to professional creators, even if that target audience isn’t automatically drawn to using them. Generative AI is a controversial topic in the creative industries, especially when it comes to generating text, images and videos. While AI enthusiasts praise the creativity and speed of AI models, creators continue to raise legitimate concerns about how AI is trained and deployed. That’s why publishers and artists have filed copyright infringement lawsuits against AI companies. Meanwhile, workers in data-intensive industries are concerned about job security as managers try to cut costs.
Another problem with AI is the kind of images it can create. Last year, users found that Gemini could produce images of Black people in Nazi soldier uniforms. Google apologized for what the company described as “inaccuracies in some historical images” and said work was underway to immediately improve these representations.
(Google’s guidelines prohibit creating offensive and illegal AI content. Roman said that improved enforcement of its security policy will be supported by technology updates as well as real-world usage and reporting.)
According to Roman, the Flow team is working on expanding Veo 3’s capabilities, improving consistency and adding new features like custom voices for character work. The core of the project is to make creativity more accessible to people.
“We can break down the barriers that prevent a much larger group of people from telling stories through video, and we can raise the bar for the kinds of stories that can be told through video,” Roman said. “Some of them will be fun and crazy, like wild street interviews or Yeti ASMR bloggers, and some of them will be really powerful.”
How to use Google Flow for AI videos
Flow, part of Google Labs and available through the AI Test Kitchen, requires a paid Google AI subscription: the Pro plan for $20 per month or the Ultra plan for $250 per month (currently $125 off for three months). Google Labs’ privacy notice states that human reviewers read, annotate and process Labs interactions and tool outputs to improve AI models. (By default, your Labs data is retained for up to 18 months, and the company recommends that you do not upload or send sensitive information. Google’s general Privacy Center has more information.)
I spent some time testing Flow, creating clips and merging them with Scene Builder. Here are several tools that are available only in Flow.
Ingredients to video: There are several ways to prompt video generation, including standard text-to-video and image-to-video. Ingredients to video is a newer method worth exploring. With it, you upload specific images and add a text prompt, and Flow combines the individual elements. For example, you can upload a photo of a man, a product photo of a specific jacket and a panoramic background, and Flow can combine them into an animated video.
Extend clips and smooth transitions: Extend lets you lengthen clips. In the Scene Builder timeline, extend the end of a clip to the desired length. If you’re creating a new video and want a smooth transition, I recommend going to the end of the first clip and clicking the plus button above the cursor to save the last frame to your library. You can then use this image in an image-to-video prompt to maintain consistency from clip to clip.
Doodle and edit: If you edit a frame or image in a separate program, you can upload the edited image to Flow and let the model make the changes. You can also do this with pictures you’ve drawn, bringing your doodles to life. This is an experimental feature (a new prototype is currently being developed), but it’s really fun to extend Flow’s capabilities in this way.
Prompt with Gemini: There’s no way to have Gemini automatically create or improve your prompts directly in Flow (which will hopefully change in a future update), but you can use the chatbot to help you craft the perfect prompt. If you have trouble fleshing out more detailed ideas, let Gemini help you.
To learn more, check out the best AI image generators and a guide to writing the best AI image prompts.