Google introduces Whisk, its innovative answer to generative AI image creation

Press Release

Generative AI opens up a world of creative possibilities with its ability to make unique content. Following the success of its earlier AI tools, Google is taking another significant step with Whisk, its newest experiment exploring the capabilities of gen AI.

Whisk is a fun and engaging tool that makes AI image generation more accessible, especially to those without prior AI knowledge. Instead of typing out long, detailed text prompts, people can use images as prompts. They simply drag in images and start creating.

Generative AI uses deep-learning models to make high-quality content based on existing data. Sometimes, certain design elements get lost in text prompts. With Whisk, anyone can easily customize how the image will come out: it accepts separate image inputs for the main subject, the scene, and the preferred art style.

Behind the scenes, the Gemini model automatically writes a detailed caption for each chosen image. Whisk then feeds those descriptions into Google’s latest image generation model, Imagen 3. This process captures the subject’s essence rather than creating an exact replica. That, in turn, lets people remix their subjects, scenes, and styles in novel ways to create something that is uniquely theirs, from character concepts to enamel pin designs.

It’s also important to highlight that Whisk extracts only a few key characteristics from images, so generated content may differ from one’s expectations. Fortunately, it allows users to view and edit the underlying prompts at any time to get the output they want.
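
For readers who think in code, the flow described above (caption each image, combine the captions into one prompt, optionally edit it, then generate) might be sketched roughly as follows. This is only an illustration under stated assumptions: Whisk's internals are not public, and every name here (`WhiskInputs`, `build_prompt`, `remix`, and the stand-in captioner and generator) is hypothetical and is not part of any Google API.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical stand-ins: in Whisk these roles are played by Gemini
# (captioning) and Imagen 3 (generation); their real interfaces are
# not shown here.
Captioner = Callable[[str], str]   # image path -> text caption
Generator = Callable[[str], str]   # text prompt -> generated image reference


@dataclass
class WhiskInputs:
    subject: str  # path to the subject image
    scene: str    # path to the scene image
    style: str    # path to the style image


def build_prompt(inputs: WhiskInputs, caption: Captioner) -> str:
    """Caption each input image and merge the captions into one prompt."""
    return (
        f"Subject: {caption(inputs.subject)}. "
        f"Scene: {caption(inputs.scene)}. "
        f"Style: {caption(inputs.style)}."
    )


def remix(
    inputs: WhiskInputs,
    caption: Captioner,
    generate: Generator,
    edit: Optional[Callable[[str], str]] = None,
) -> str:
    """Run the caption -> (optional edit) -> generate flow."""
    prompt = build_prompt(inputs, caption)
    if edit is not None:
        # Mirrors the step where a user views and edits the prompt.
        prompt = edit(prompt)
    return generate(prompt)


# Toy stand-ins so the sketch runs without any external service.
def fake_caption(path: str) -> str:
    return f"caption of {path}"


def fake_generate(prompt: str) -> str:
    return f"<image for: {prompt}>"


if __name__ == "__main__":
    inputs = WhiskInputs("cat.png", "beach.png", "enamel_pin.png")
    print(remix(inputs, fake_caption, fake_generate))
```

Because only the text captions reach the generator in this sketch, the output reflects a description of each image rather than the image itself, which is one way to picture why results capture an essence instead of an exact copy.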

Play with Whisk here: https://labs.google/fx/tools/whisk.
