Prompt engineering for generative AI restores the art of word setting to prominence. Would the old masters of poetry reign supreme in the generative arts today?
Long before we all started using words to paint pictures with DALL-E, Midjourney, and Stable Diffusion, there was a creative craft called poetry with the exact same purpose. But instead of pixels being churned out onto digital canvases, the painting was formed in the mind of the reader. Has the history of the creative arts come full circle, to the point where "poetry is back in business" and we once again need to master the craft of writing to paint pictures? Maybe so, indeed.
From Words to Paintings
Creative AIs like DALL-E, Stable Diffusion, and Midjourney require the "painter" to describe the image they want to create in text strings, or prompts, as they are called. The art of "prompting" heavily influences whether the output moves the viewers' hearts or turns into a jumbled mess of blobs (which, of course, can be art as well).
Skilled AI prompt writers blend the richness of words with style and their own expertise to produce paintings fit for any art gallery's walls without ever touching a brush or picking up a palette. It turns out that, thanks to the AI models mentioned above, the realm of words is already easily linked with the realm of visuals.
This is possible because these AIs have been trained on massive amounts of tagged visual data. Stable Diffusion's training data, for example, comes from a dataset called LAION-5B, which contains more than 5 billion pairings of images and text tags. This data collection is most likely the largest publicly accessible dataset of its kind, providing AIs with huge creative power.
Resurrecting the Poets of the Past
While current AI prompt writers strive to describe the output as fully as possible, the goal of poetry has never been to precisely describe what the reader should see or feel. The best poetry will gently but surely ignite the process of imagination in the readers' minds and hearts, creating as many unique "mental paintings" as there are enthusiastic readers.
But is AI capable of "imagining" the same way as humans do? Are we going to "see" the same images?
Let's find out!
In the experiment below, we test DALL-E and Stable Diffusion with famous poems written by giants of poetry to see what kind of images they trigger in the "minds of AI". You can read the poems and see whether the imagery in your mind corresponds to the imagery the AI has created.
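If you want to repeat the experiment yourself, the only preparation a poem needs is to be turned into a single prompt string. The sketch below is one way to do that in Python, assuming the stanza is passed to the image generator word for word. The names `stanza_to_prompt` and `DICKINSON` are made up for this illustration and do not come from any of the tools mentioned above.

```python
def stanza_to_prompt(stanza: str) -> str:
    """Collapse a multi-line stanza into one single-line prompt string.

    Assumption: the poem is used verbatim, with no extra style keywords.
    """
    # Trim each line and drop the empty ones left by blank lines.
    lines = [line.strip() for line in stanza.splitlines()]
    return " ".join(line for line in lines if line)


# Hypothetical constant holding the Emily Dickinson stanza quoted below.
DICKINSON = """
Because I could not stop for Death,
He kindly stopped for me;
The carriage held but just ourselves
And Immortality.
"""

prompt = stanza_to_prompt(DICKINSON)
print(prompt)
```

The printed line can then be pasted into the prompt field of whichever image generator you use. Keeping the poet's words untouched is a deliberate choice here: the point of the experiment is to see what the poem alone evokes, not what a prompt writer adds to it.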
The Raven
Edgar Allan Poe
Deep into that darkness peering,
Long I stood there, wondering, fearing,
Doubting, dreaming dreams no mortals
Ever dared to dream before;
But the silence was unbroken,
And the stillness gave no token,
And the only word there spoken
Was the whispered word, "Lenore!"
This I whispered, and an echo
Murmured back the word, "Lenore!"
Merely this, and nothing more.
Because I could not stop for Death
Emily Dickinson
Because I could not stop for Death,
He kindly stopped for me;
The carriage held but just ourselves
And Immortality.
She Walks in Beauty
Lord Byron
She walks in beauty, like the night
Of cloudless climes and starry skies;
And all that’s best of dark and bright
Meet in her aspect and her eyes;
Thus mellowed to that tender light
Which heaven to gaudy day denies.
As you can see from the images above, the AIs' results can range from quite exact depictions of the words to more abstract interpretations (see DALL-E's illustration of Emily Dickinson's poem). But in every case, we can see the "feelings" in the AI-generated artworks.
This is both terrifying and breathtaking. The obvious next question is, can AIs have feelings like humans, or are we just staring into a fancy mirror that operates on the mind-boggling intricacy of computer algorithms and deep datapools of human creativity?
PS. It is really hard to define who the author of these images is. It could be argued that the author is the poet (long dead when the image was created), the person who actually copied the poem into the AI application as a prompt, or maybe the AI itself.