Google has added yet another creative tool to Gemini, allowing the chatbot to convert any photo into a dynamic video with sound.
The new tool, powered by Google’s Veo 3 model, takes AI video generation a step further. Instead of typing out every detail about how objects should appear, you can upload a photo and ask Gemini to animate the elements and add some sound.
For now, the tool can generate only 8-second videos. To get started, select Video from Gemini’s prompt box and upload a photo. Provide instructions for the animation and the audio, then wait for the chatbot to generate the output.
“You can get creative by animating everyday objects, bringing your drawings and paintings to life or adding movement to nature scenes,” Google says.
Once the video is complete, you can share or download it. All videos generated using this tool will include both a visible and an invisible SynthID digital watermark to indicate that the footage is AI-generated.
At launch, the photo-to-video tool will be available on the web and is limited to Google AI Pro and AI Ultra subscribers in select countries. Google AI Pro costs $19.99 per month, while its AI Ultra package costs $249.99 per month.
The same photo-to-video capability has been extended to Google’s AI filmmaking tool, Flow. Here, users can also make the elements talk to each other.
Since Veo 3 launched in May, the model has been used to generate 40 million videos, says Google CEO Sundar Pichai. However, not all of them may have been benign. TikTok was recently flooded with AI-generated racist videos, which are suspected to have been created using Veo 3.


