Pixel Perfect? Assessing AI Imagery

David Atlas, Assoc. Creative Director, 02.07.24

Since the recent advent of AI image generation, we at Allied Global Marketing have looked for ways to incorporate this groundbreaking technology into our workflow. And while AI text generators have certainly made their mark on so many facets of our business, image generation has revealed new creative possibilities as well.

Benefits of AI image generation

Adobe has built AI into Photoshop, public platforms like Midjourney have arisen, and we have begun hosting our own sandboxed versions of DALL-E and Stable Diffusion. What has come from these advances is unprecedented. Painstaking retouching tasks that would have taken hours can now be done in seconds. Original images that once required stock photo sourcing, silhouetting and image manipulation (with limited success) can now be produced in a fraction of the time, often with spectacular outcomes. And with a bit of trial and error, prompts can be finessed to yield countless options and versions until the desired images are produced.

During the brainstorming stage of creative assignments, we are able to use AI to help "sketch" and "storyboard" ideas and get a real sense of how they may appear, without extensive research and comping time.

As the software behind image-generating AI continues to evolve and improve, the quality and accuracy of the results get better as well. And, like its text-only counterparts, it allows for extreme detail in the prompts it will accept. Not only can it understand a description of the image itself, including point of view, illustrative style, lighting conditions and mood, but it can also interpret camera lens, focal length, aperture, depth of field, aspect ratio and more.

Recently we were given a key art assignment for a stage production of a holiday-themed acrobatics show. Since we were working concurrently with the production development, photographic assets were limited; yet we had descriptions of the plot, key characters, the setting, etc. So with a few keystrokes, we were able to help the producers (and ourselves) to imagine costume options, poses and other character attributes. Versions of the images not only made it into the final key art, but also informed the actual production and costume design.

In another instance, we were pitching a new client with a spec presentation for which we were given no assets whatsoever. Historically in these instances, we would present mood boards and references to the client, along with a verbal description of our ideas, or put together mock concepts featuring images cobbled together from stock, pencil sketches or marker comps. These options often fell short of describing our vision. This time, we were able to build fully finished-looking comps, complete with (seemingly) photographic assets, customized to the client's aesthetic and tone.

An imperfect world

However, with the positive often comes the negative. Despite the impressive images we were able to conjure, misunderstandings by the AI, visual aberrations and outright errors abounded.

In both of the projects described above, bodies often appeared with three legs, or with nine or eleven fingers. In one image, an acrobat's entire torso was backwards. On occasion, the AI would misrepresent or "misunderstand" the prompts it was given. In one case, multiple attempts to produce an image of a seemingly simple background character for a piece of key art yielded poor to unusable results. The prompts describing the character's build and pose were simply ignored.

Because AI imagery is largely an amalgamation drawn from an admittedly massive library of reference material and assembled to satisfy a prompt, it sometimes lacks the intuition to understand nuance the way a human artist would.

New types of challenges

In addition, AI does not have the ability (yet, at the time of this writing) to generate accurate, usable typography. In all forms of communication, images (and video footage, which is also being generated by AI now) are only part of the picture. Nearly every piece of creative we produce has some form of copy or type elements. And while AI can generate typographic mood images, the content is usually nondescript and cannot be customized. Other AI models are now popping up that can produce a phrase in a selection of typefaces, but the integration of this ability into the image generators is not quite there yet. So, for instance, prompting an image of a person holding a sign with a particular phrase or word, in a particular style or typeface, is not yet achievable.

Last minute update

As mentioned above, at the time of this writing, AI image generators did not have a dependable capability to create custom, usable typography within their images. However, in truly modern fashion, the day AFTER this article was completed, Google Gemini Ultra was released, which enables the user to do just that (among other things). This just reinforces the idea that in the world of AI, updates, new features and breakthroughs can happen rapidly and unexpectedly, seemingly more so than in most other fields. So I can predict with confidence that other advances will most likely address today's shortcomings in the near future.

The value of AI

Of course, image generation is merely a tool, one in an ever-growing toolbox we as creative professionals use to solve creative challenges for our clients. Like the advent of digital photo manipulation and page layout software some 30 years ago, it enhances our capabilities, but does not replace our individual creativity or thought processes. It is for this reason that we can embrace these new tools, remaining cautiously optimistic that our human creativity will always be needed to drive the process, and continue to be in high demand.

Taken at face value, AI imagery has provided a quantum leap for creative advertising. What it has enabled us as artists to do has had a lasting effect on our work process, and on the creativity we can offer our clients. But it is not a solution to every creative challenge, and cannot solve every visual problem.

Yet.

To learn more about how we integrate new technologies into our workstream, get in touch.
