The image generator DALL·E 3 is here and available to everyone, and initial experiments suggest it is one of the best AI image creators at the moment. To help you get up to speed with DALL·E 3 faster than I did, I've written down some practical tips and introduced different types of prompts that help you coax higher-quality results out of it. This post is suitable both for those taking their first steps in AI image creation and for those who want to improve their DALL·E 3 skills.

What is DALL·E 3 and how can it be used?

DALL·E 3 is an artificial intelligence model created by OpenAI (yes, the same company that's behind ChatGPT) that generates images from textual instructions, or prompts. Earlier versions of DALL·E already existed, but version 3 is much more capable and versatile than its predecessors. As a result, DALL·E 3 is offering increasingly strong competition to Midjourney, the current market leader, and is finding more and more applications, from artistic imagery to specific diagrams, illustrations and advertising materials. As you can read below, DALL·E 3 is not just a digital pencil: it lets users visualize complex ideas quite easily, giving them tools for the graphical representation of concepts and narratives.

DALL·E 3 is available to all ChatGPT Plus subscribers and to Bing Image Creator users.

In ChatGPT, DALL·E 3 looks like this:

Dall-E 3 in ChatGPT


Common problems when writing DALL·E 3 image prompts

Finding the balance between precision and abstraction

One of the most common challenges when writing prompts for DALL·E 3 is finding the right balance between precision and abstraction. An overly abstract prompt leaves too much room for interpretation (for the AI just as for a human), which can lead to visuals that do not meet the user's expectations.

For example, if a user enters the prompt "natural landscape", DALL·E 3 could generate anything from a desert vista to a mountainous terrain. On the other hand, an excessively long and detailed prompt can be equally constraining for DALL·E 3 and for the user themselves, as it leaves little room for creative surprise. Therefore, it is important to find a reasonable compromise that contains enough details to guide the system, while still allowing room for artistic flourishes.

Lack of constraints and/or context

Clearly defining context and constraints is also important. A prompt that lacks context or is too open-ended can result in unwanted or unpredictable images. For example, if you enter the prompt "dog with ball", DALL·E 3 may create an image where the dog is chewing on the ball rather than catching it. Adding context and constraints, such as "a dog catching a flying ball at sunset", helps you reach the desired visual more quickly.

Ambiguity of style and composition

When possible, specify the desired style and composition in the prompt. For example, you may want an image done in watercolor technique or in a cubist style. If such details are left out, the resulting style and composition are unpredictable. Before writing the prompt, it is also worth considering whether the angle of view, lighting, or distance from the object matters for the desired visual. If so, write those instructions into the prompt as precisely as possible.

How to create better prompts for DALL·E 3?

So how do you actually avoid these problems and generate better visuals? Below I outline some thoughts. If you want to experiment while reading, log into ChatGPT or Bing Image Creator right away and start trying things out :)

An AI-generated visual is already pretty cool by itself, but you will often want to make the image more interesting or adjust some detail to your liking.

Here are some tips on how to do that better:

Be as precise as possible

If you have a clear vision of the desired result, describe it as precisely as possible. Precision is usually not about the length of the text but about clearly articulated expectations. For example, instead of writing "bird on tree", you could say "blue bird sitting on an oak branch". This way you can be more certain that the generated image matches your expectations.

AI generated image with Dall-E 3, blue bird on oak branch
Image created with Dall-E 3

Use descriptive adjectives

Using adjectives helps give the image more depth and context. For example, "radiant sunset" or "mystical forest" can evoke a much stronger visual impression than simply "sunset" or "forest".

Describe the background or story of the scene depicted in the image

It is often forgotten that in addition to the desired object, there is usually something around or behind it. Describing the context and situation of the created "protagonist" also helps reach the desired result faster. For example, "a child playing with a ball on the beach" gives the AI much more information than simply "child and ball".

Specify the angle of view and framing

Specifying the angle of view can significantly affect the composition of the image: whether the object is depicted up close or from afar, from above or from below. If these choices matter to you, add them to the prompt right away. If the visual is almost right but something feels "off", try extending the prompt with a description of the angle of view or framing.

Add examples of known works and/or styles

If you happen to have a creative block, you can use the characteristics of well-known artists or art movements to gain inspiration. Referencing famous works of art or photos can help generate more sophisticated and interesting images. For example, you could use the prompt "Portrait of a yellow dog, in the style of Salvador Dali".

AI generated image with Dall-E 3, dog in Salvador Dali style
Image created with OpenAI Dall-E 3

Prefer a shorter prompt if possible

A precise and concrete prompt does not mean that every thought in your head needs to be written out in detail. An overly long and complex prompt can confuse the algorithm. For best results, keep the prompt concise yet rich in detail. In my experience, the ideal prompt length for image creation is about 3-4 sentences.
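The tips above can be combined in a small helper that assembles a prompt from separate ingredients. This is only a minimal sketch: the names `PromptParts` and `build_prompt` are hypothetical, and joining the parts with commas is an assumption about what reads well, not a rule DALL·E 3 requires.

```python
from dataclasses import dataclass


@dataclass
class PromptParts:
    """Hypothetical container for the prompt ingredients discussed above."""
    subject: str         # what the image shows, stated precisely
    setting: str = ""    # background or story of the scene
    viewpoint: str = ""  # angle of view and framing
    style: str = ""      # known artist, art movement, or technique


def build_prompt(parts: PromptParts) -> str:
    """Join the non-empty parts into one comma-separated prompt."""
    pieces = [parts.subject, parts.setting, parts.viewpoint, parts.style]
    return ", ".join(p.strip() for p in pieces if p.strip())


prompt = build_prompt(PromptParts(
    subject="blue bird sitting on an oak branch",
    setting="in a mystical forest at sunset",
    viewpoint="close-up from below",
    style="in watercolor technique",
))
print(prompt)
```

Leaving a field empty simply drops it from the prompt, which makes it easy to start short and add one detail at a time when the result feels "off".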

How to phrase these prompts for DALL·E 3?

Different types of prompts allow the user to express different goals and desires for generating visuals. Here are five main types of prompts that are useful when using the DALL·E 3 image generator.

Descriptive prompts

Descriptive prompts focus on describing objects and scenes. They are usually very concrete and contain many details that help construct a precise image.

Example: "A red bicycle standing in front of a yellow house with purple curtains in the window."

Narrative prompts

Narrative prompts add story or context to an image. They can be longer and string together several sentences connecting different elements.

Example: "A child sitting on the shore watching the sunset, while their dog plays with a ball on the beach."

Metaphorical prompts

Metaphorical prompts allow using figurative language to evoke more abstract or deeper meanings.

Example: "A clock ticking slowly in the deep ocean, representing the relativity of time."

AI generated image with Dall-E 3
Image created with Dall-E 3


Conceptual prompts

Conceptual prompts focus on broader ideas or themes. They can be more abstract and may require more interpretation from the viewer.

Example: "The human rights tree, with branches being different fundamental rights and roots being democracy."

Style-based prompts

Style-based prompts focus on a specific artistic style or technique. They can be used to influence the overall look of the image, not just its content.

Example: "A night view of Paris in an impressionist style."

Each prompt type brings with it unique capabilities and constraints. It’s important to understand their differences and use cases in order to achieve your desired results with DALL·E 3 or other image generators.

Additional tips for fine-tuning DALL·E 3 images

If you've already experimented a bit and your expectations have grown, here are some tips, in addition to the advice above, on how to improve DALL·E 3 images.

Specifying the format

Many DALL·E 3 versions and/or user interfaces allow you to also specify the image format.

If you use DALL·E 3 through ChatGPT, you can specify whether the image is generated in square (1024x1024 pixels), portrait (1024x1792 pixels) or landscape (1792x1024 pixels) format. To do so, name the format in your prompt, in English or Estonian.

For example: "A meadow in spring bloom, in the morning mist, landscape format image".
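If you write many prompts, the format hint can be appended automatically. The sketch below assumes the three formats and pixel dimensions listed above; the helper names `with_format` and `size_string` are hypothetical, and the exact wording of the hint is just one phrasing that follows the example.

```python
# Pixel dimensions as listed in the text for DALL·E 3 in ChatGPT.
FORMAT_SIZES = {
    "square": (1024, 1024),
    "portrait": (1024, 1792),
    "landscape": (1792, 1024),
}


def with_format(prompt: str, image_format: str) -> str:
    """Append a format hint to a prompt; raises on unknown formats."""
    if image_format not in FORMAT_SIZES:
        raise ValueError(f"unknown format: {image_format!r}")
    # Strip a trailing period or space so the hint joins cleanly.
    return f"{prompt.rstrip('. ')}, {image_format} format image"


def size_string(image_format: str) -> str:
    """Return the dimensions as a 'WIDTHxHEIGHT' string."""
    width, height = FORMAT_SIZES[image_format]
    return f"{width}x{height}"


print(with_format("A meadow in spring bloom, in the morning mist", "landscape"))
print(size_string("landscape"))
```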

Using variations

DALL·E 3 usually allows generating several different variations from the same prompt. If most of the generated visual is pleasing, but something catches the eye, then... Remember that if you use DALL·E 3 through ChatGPT, ChatGPT itself already varies your prompts slightly.

Improving image resolution (upscaling)

If the generated image does not have the resolution you need, it can be upscaled with various image editing tools. Those with an Adobe license can find upscaling in Adobe Lightroom, for example. Topaz Labs' upscaler has also received praise. I personally use the open source software SwinIR, which also gives very good results.
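Before reaching for an upscaler, it helps to know how much upscaling you actually need. The following is a rough sketch of that arithmetic: the function names are hypothetical, 300 DPI is an assumed print quality, and rounding up to a whole-number factor is an assumption based on upscalers typically offering fixed steps such as 2x or 4x.

```python
import math
from typing import Tuple


def print_pixels(width_in: float, height_in: float, dpi: int = 300) -> Tuple[int, int]:
    """Pixels needed to print at the given size in inches (300 DPI is an assumed default)."""
    return (round(width_in * dpi), round(height_in * dpi))


def required_upscale(source_px: Tuple[int, int], target_px: Tuple[int, int]) -> int:
    """Smallest whole-number factor so the source covers the target in both dimensions."""
    src_w, src_h = source_px
    tgt_w, tgt_h = target_px
    factor = max(tgt_w / src_w, tgt_h / src_h)
    return max(1, math.ceil(factor))


# A square 1024x1024 image printed at 10 x 10 inches needs 3000 x 3000 pixels.
target = print_pixels(10, 10)
print(target)
print(required_upscale((1024, 1024), target))
```

If the factor comes out as 1, the image already has enough pixels and upscaling is unnecessary.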

Summary

In summary, there are no special secret tricks in creating images with AI. Perhaps the most difficult part is articulating your vision into concrete wording. The classic "make this picture cooler!" does not help either a human designer or artificial intelligence.

Hopefully these few tips above will help you take your first steps and avoid some typical mistakes (which I've made myself), but to achieve the best results, you simply need to try and practice. :)

So, good luck experimenting and practicing!