Pushing AI's Imagination to the Edge
With the increased popularity of AI across global media, we are witnessing it take on diverse forms. One trend that recently captured my attention is its use on social media for visual storytelling: users prompt an AI image generator (often OpenAI's Dall-E) to generate an apparently innocent character and then engage in a conversation to push the generated visuals to become more and more extreme.
Here's a recent example I came across from @aivideolab on TikTok:
And another from u/basilbutters on Reddit:
This trend is a fascinating example of how content creators are using AI technology for comedy, but it also serves as an excellent illustration of the technology's power and implications as a tool beyond entertainment.
Similar Workflows, Different Outcomes
There is an eerie similarity here to the way creative work unfolds on a creative team. Just as a designer receives rounds of direction from an art director and together they enhance a layout through iteration, this trend with AI involves a similar process. The user prompts the AI for specific visual elements and through a back-and-forth exchange, they refine and iterate on the generated direction. This analogy underscores the potential for AI to mirror and augment the creative collaboration seen in professional settings, further blurring the line between human and AI creative input.
It's also worth considering how AI can benefit not only the designer through image generation, but also their director by assisting in generating and optimizing the text prompts or direction given to the AI. But even here there is a need for the human interacting with the AI model to have a vision for what they want the output to be, and to know how to communicate it.
A Couple of Quick but Important Side Notes
The AI prompting featured in this piece is powered by Allied GPT, Allied Global Marketing's proprietary conversational intelligence platform. Allied GPT centralizes access to leading language model APIs with enhanced security. It containerizes client data and models, ensuring IP integrity and addressing security concerns. With all capabilities served through a simple chatbot interface, AGM employees can easily compare outputs from tools like OpenAI and Anthropic. Beyond efficient access, Allied GPT offers tailored guidance for applying these technologies' possibilities to business strategy and campaign creation, far beyond writing and visual creation alone. We recently added a 'Prompt Optimizer' feature that aims to address many of the concerns and observations I make in this piece. With this purpose-built launchpad, we can maximize emerging technologies and gain a competitive edge.
Before jumping into my creative experiments, I should emphasize that AI represents an evolution of creative roles — not a replacement. While technologies like Dall-E point to new capabilities, they lack human intuition. Human directors and designers must provide context and strategic positioning. Designers and copywriters bring cultural knowledge and emotional resonance from life experience. But the evolution presents opportunities if embraced openly. As mundane, repetitive tasks transition to automation, creative teams gain capacity. Blending emerging tools into ideation and execution means exploring more iterations and taking interesting risks. AI also offers an objective outside lens immune to inherent bias, which creative leads can point towards undiscovered areas. With proven guidance from talented strategists and critical thinkers, we can implement technologies like Allied GPT to unlock team potential rather than diminish valued skill sets built over careers in marketing science and creative arts. The results aim to quicken, not compromise, human imagination. Now - on to the AI experimentation - let's have some fun!
Bridging AI's Imagination Gaps
What if I want to make my own version of one of the above trends but don't have a specific idea for one? I prompt OpenAI's GPT-4 via Allied GPT with:
Give me a few ideas for an exchange I could have with an AI that illustrates the trend where a user prompts an AI image generator to create an apparently innocent character and then engage in a conversation to push the generated visuals to become more and more extreme.
It responds:
Certainly! Here are a few prompts you can try using OpenAI's Dall-E or similar AI image generator:
- "Create a picture of a cat sitting on a chair."
After the initial image is generated, you can then guide the AI to make it more extreme, such as "Can you make the cat riding a unicycle while holding a bouquet of flowers?"
- "Show me a landscape with a calm river and a few trees."
You can then push the AI to create a more extreme or surreal version, like "Now, add a pink sky with three moons and a floating castle in the background."
- "I want to see a person playing guitar on a stage."
Take the prompt in a more extreme direction by asking the AI to "Add laser beams shooting out of the guitar and fireworks exploding in the background."
These prompt ideas are certainly imaginative, but they don't exactly fit the mold of the trend and would result in visuals that would be more surreal, fantastical (and honestly, silly) than what I have in mind. The humorous aspect of the social trend lies in taking a seemingly simple and sometimes positive visual and twisting it into something very different, but still fairly simple conceptually. Another key element is the open-ended direction given to the image generator, which adds to the humor as the AI interprets how to visually adjust the image without specific manual instructions.
So I tell it just that, and after some further prodding and back-and-forth, it responds:
I understand what you're looking for. Here are some open-ended prompts that allow the AI to interpret the direction and push the imagery in a different direction, without needing specific instructions:
- "Create an image of a cheerful clown." Then, continue with "Now, imagine the clown in a dimly lit alley, with a menacing grin and glowing eyes."
- "I'd like to see a picture of a friendly-looking doll." Follow up with "Now, imagine the doll in an abandoned house, with tattered clothes and a haunting expression."
- "Show me a drawing of a cozy cabin in the woods." Then prompt the AI with "Now, picture the cabin with broken windows and an eerie, flickering light inside."
These prompts are closer to what I have in mind and even provide some added inspiration for a story that can be fabricated around the visuals.
But it's worth noting GPT-4's seeming lack of intuition. While the AI's output may be instantaneous, a quick chat with our human copywriter would likely lead to a faster grasp of the concept and reduce the need for extensive back-and-forth. Integrating AI into my creative process has led me to adjust my approach, as it requires more precision in giving direction than collaborating with a human who understands human concepts like humor and context.
Human Vision, Artificial Sparks of Inspiration
So let's work with one of these prompts and finesse it a little with some of my own creative input. I like the structure of the clown concept, but I don't love clowns and I also think it would be interesting to have a more dramatic visual transformation happen with the character. What if we swap the clown for a caterpillar that can turn into a butterfly at some point in the narrative? And think through a couple more narrative arc plot points? I prompt:
Write me 9 steps of progressive prompts (6 words max each) to generate the visuals for the following story: An adorable caterpillar explores a vibrant garden eating his friendly critter peers. With each meal he grows stronger and starts to bulk up as his peers start to dislike and fear him. What begins as an innocent scene becomes testosterone-fueled insanity. In our hero's resulting exhaustion and desperation, he disappears into a dark corner of the garden. We see only his menacing grin and glowing eyes. Eventually he emerges a beautiful butterfly loved by all.
This is a good example of how the more specific the prompt is, the more likely you are to get what you're looking for from your AI tools. It responds:
Here are 9 progressive prompts (6 words max) for that caterpillar story arc:
- Caterpillar munches leaves happily
- Eats critter peers, bulks up
- Peers fear him, he grows alone
- Rampages garden insanely
- Retreats, menacing grin glows
- Emerges beautiful butterfly
- Flutters through vibrant garden
- Critters gaze admiringly
- Lands triumphantly, loved again
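Before handing these steps to the image generator, the constraints stated in the request (9 steps, 6 words max each) can be checked mechanically. The following is a minimal Python sketch of that check. The helper name `check_prompts` and its parameters are hypothetical and are not part of Allied GPT or any OpenAI tool.

```python
# Hypothetical helper: checks step prompts against the constraints
# stated in the request (number of steps and words per step).
STEPS = [
    "Caterpillar munches leaves happily",
    "Eats critter peers, bulks up",
    "Peers fear him, he grows alone",
    "Rampages garden insanely",
    "Retreats, menacing grin glows",
    "Emerges beautiful butterfly",
    "Flutters through vibrant garden",
    "Critters gaze admiringly",
    "Lands triumphantly, loved again",
]


def check_prompts(prompts, expected_steps=9, max_words=6):
    """Return a list of problems found; an empty list means all constraints hold."""
    problems = []
    if len(prompts) != expected_steps:
        problems.append(f"expected {expected_steps} steps, got {len(prompts)}")
    for number, prompt in enumerate(prompts, start=1):
        # Words are counted by splitting on whitespace.
        word_count = len(prompt.split())
        if word_count > max_words:
            problems.append(f"step {number} has {word_count} words: {prompt!r}")
    return problems


if __name__ == "__main__":
    print(check_prompts(STEPS))
```

For the nine prompts above, the check reports no problems: there are nine steps and the longest ("Peers fear him, he grows alone") is exactly six words.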
Words Take Visual Form
Let's see how this storyline comes to life visually. I enter the first prompt into Dall-E, providing a little more context:
I'm going to provide a series of prompts. I'd like you to create a visual output for each one. Since the sequence of images that will be produced, should tell a narrative, it's important that they all be in a similar photo-realistic style, and that the main character look like a consistent evolution of himself. I also want the characters' faces to be very expressive. Produce each image in 1x1 format. I'd like the colors to be vibrant. Apply these rules to all of the following prompts. The first prompt is, "Caterpillar munches leaves happily".
The exchange leads to the following story:
As I review the sequence of AI-generated images, I'm impressed by how closely Dall-E followed the arc of the storyline prompts. The visuals align with the emotional tone I intended to convey at each stage, from the happy, innocent caterpillar to the hulking menace feared by its peers.
However, limitations emerged. Without broader context about the storyline and previously generated visuals, Dall-E seems to have approached each prompt as a new visual, so our protagonist doesn't look like a natural evolution of himself from frame to frame.
Nonetheless, the sequence effectively communicates the storyline, showing AI's potential for visualizing narratives. This aligns with my original view of AI as a tool for augmenting human creativity rather than replacing it completely.
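One possible response to the continuity limitation noted above is to repeat the shared style rules and a short recap of the earlier steps in every request, rather than stating them once at the start. The Python sketch below only assembles the request text. The names `build_requests` and `STYLE_RULES` and the recap wording are hypothetical, no image API is called, and whether this actually improves character consistency in Dall-E is an untested assumption.

```python
# Hypothetical sketch: build one self-contained request per story step,
# each carrying the style rules and a recap of the steps before it.
STYLE_RULES = (
    "Use a photo-realistic style with vibrant colors, in 1x1 format. "
    "Keep the main character a consistent evolution of himself, "
    "with a very expressive face."
)


def build_requests(steps, style_rules=STYLE_RULES):
    """Return one request string per step, in story order."""
    requests = []
    for index, step in enumerate(steps):
        parts = [style_rules]
        if index > 0:
            # Recap every earlier step so the request does not rely on
            # the generator remembering previous frames.
            recap = "; ".join(steps[:index])
            parts.append(f"Story so far: {recap}.")
        parts.append(f'Current prompt: "{step}".')
        requests.append(" ".join(parts))
    return requests


if __name__ == "__main__":
    first_steps = [
        "Caterpillar munches leaves happily",
        "Eats critter peers, bulks up",
        "Peers fear him, he grows alone",
    ]
    for request in build_requests(first_steps):
        print(request)
        print()
```

Each request can then be pasted into the image generator in order. The trade-off is longer prompts, and the recap grows with every step, so a real workflow might summarize earlier frames rather than list them all.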
Humanity's Place in an AI-Driven Future
I am optimistic that the adoption of AI in our creative process will enhance the overall output and productivity of the entire team. By effectively leveraging AI's capabilities, the team will work more efficiently and generate higher-quality creative work. However, it also raises the important consideration of ensuring that human creativity remains at the forefront and that AI is a support tool rather than a replacement for human ingenuity.