A few months ago, a colleague posted on LinkedIn that they had just completed a project that involved over 1,000 prompts. I resisted the urge to ask how many iterations each prompt required. Six seemed like a fair guess. That is 6,000 prompts to build something that will eventually get thrown out once the engineers get involved.
To me, that colossal amount of wasted time points out a fundamental disconnect between humans and Generative AI.
As designers, our job is to bring something new to life, whether it be a UI, an image, a video or words. The AI is clueless when it comes to the word “new”. It is trained on what has been done. The word “new” to an AI is as foreign as my first immersion in the Chinese language in Beijing. The disconnect between the designer and the AI is language.
Generative AI requires us to describe our vision. The problem is that after you outline your vision to your clients and team members, they will look at what you produced and say, “That isn’t what I was expecting.”
This is understandable because they all imagined what it would look like in their “mind’s eye” and, being human, each expectation was unique to that individual. As I have often said, “You can’t describe it. You have to show it.” 6,000 prompts is the designer saying to the AI, “That isn’t what I expected,” 5,000 times. Michael Polanyi identified this disconnect in 1966. In his book, The Tacit Dimension, he said, “I shall reconsider human knowledge by starting from the fact that we can know more than we can tell.” As designers, we live in the gap between knowing something and knowing what something looks like.
When I make a rough sketch on paper, I have an “idea” of how something will look. There is an enormous amount of implicit information on that piece of paper (proportion, rhythm, negative space) that simply can’t be spoken or written down. As I am fond of saying when sharing sketches, “Here is what I am thinking.” Write it down and feed it into an AI, and language becomes the JPEG compression of visual ideas. You have moved from the high-resolution vision in your mind’s eye to a lower-resolution result based on the AI’s subjective definition of such keywords as “modern,” “sleek,” or “intuitive.”
This same inability to deal with ambiguity also exists in client briefs, team communication and even design reviews that drag an AI into the process. The AI is simply unable to deal with language as a design specification. The result is six iterations for each of 1,000 prompts to get close to the final vision. The AI promoters, in their boosterism, are missing the point. If we rely on language, eventually everything appears to be the AI’s “best guess,” and people start asking, “Is it AI?” or saying, “More AI slop.”
There are ways of getting closer to the designer’s intent. The ability to include reference material (sketches, mood boards, photographs and so on) is the Generative AI models’ acknowledgment that language alone is not enough. You are “showing it” to the AI rather than telling it. Iterative refinement loops are becoming standard elements of the workflow because we know that the first result is not the vision. Iteration is what helps the designer patiently articulate what he or she meant. In many respects, each result becomes the vocabulary for the next iteration of the prompt.
If our role is shifting away from sketching and using design tools, are we transitioning from Creators to Curators and Editors? Providing reference images is a form of curation — we are selecting and showcasing rather than creating. Writing and refining a prompt is a form of editing — patiently narrowing the gap between what the AI generated and what we truly intended.