Prompting for Image Generation

In prompt engineering for image generation, there are two main aspects to consider: the content and the formatting of the image.

Description of image content

Pay attention to the level of detail in your prompt. A helpful rule of thumb is to describe the image you want as you would describe a scene in front of you to someone who cannot see it. When prompted for 'a university professor' or 'a nurse', most image generators tend to return an old white man or a young white woman, respectively. This can largely be explained by bias in the training data (i.e., there are more pictures of old, white, male professors than of young, Black, female ones). You can counteract this by specifying more detail, e.g. the gender, age, time period, skin colour, etc. of the person you want to visualise.

On the left, a photorealistic image of an old, white, male professor in front of a chalkboard. Caption: "'a professor', DALL-E 3". On the right, a young, Black, female professor in front of a chalkboard. Caption: 'a young, Black, female professor'.
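The advice above can be sketched as a small helper that builds a subject description from explicit attributes instead of leaving them to the model's defaults. This is only an illustration: the function `describe_person` and its parameter names are hypothetical and not part of any image generator's API. It simply returns a string you could paste into a prompt field.

```python
def describe_person(role, age=None, skin_colour=None, gender=None,
                    setting=None, period=None):
    """Build a prompt fragment for a person from the details that were given.

    Hypothetical helper for illustration only; it calls no image generator.
    """
    # Personal attributes precede the role: "young, Black, female professor".
    attributes = [a for a in (age, skin_colour, gender) if a]
    subject = f"{', '.join(attributes)} {role}" if attributes else role

    # Optional context (setting, time period) follows the subject.
    parts = [subject]
    if setting:
        parts.append(setting)
    if period:
        parts.append(period)
    return ", ".join(parts)


vague = describe_person("professor")
detailed = describe_person(
    "professor",
    age="young",
    skin_colour="Black",
    gender="female",
    setting="in front of a chalkboard",
)
print(vague)
print(detailed)
```

The vague prompt leaves every attribute to the model, which is where training-data bias tends to show up. The detailed one states them explicitly.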

Image formatting and style

In image generation, the formatting and style possibilities are endless. Below are some examples of the options available.

Art style

You could specify the art style of the output. For example, do you want a pencil drawing, an oil painting, or a photorealistic image?

On the left, 'a pencil drawing of a tree', created by DALL-E 3. On the right, 'an oil painting of a tree', also created by DALL-E 3.

Amplification

You could add (subjective) adjectives or adverbs to change the output: beautiful, sweet, etc.

On the left, a photorealistic image of a snail. Prompt 'a snail'. In the middle, a cartoon-style image of a hybrid lollipop-snail on a leaf, accompanied by a ladybug. Prompt 'a sweet snail'. On the right, a cartoon-style pink snail with a cupcake on top of its shell, with sparkles and a rainbow in the background. Prompt 'a very sweet snail'. All images created by DALL-E 3.

Angle and lighting

Especially for photorealistic images, it can help to specify the camera angle and the type of lighting you want. For example, you could ask for a bird's eye view or a portrait shot, lit with soft side lighting or a spotlight.

On the left, a photorealistic image of the Eiffel Tower as seen from above. Prompt 'Eiffel tower. Bird's eye view.' On the right, a photorealistic image of the Eiffel Tower as seen from below. Prompt 'Eiffel tower. Worm's eye view'. Both images created by DALL-E 3.
On the left, a plant in a spotlight. Prompt 'a potted plant on a table in a spotlight'. On the right, a plant in soft ambient lighting. Prompt 'a potted plant on a table in ambient lighting'. Both images created by DALL-E 3.
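To compare formatting options side by side, as the examples above do, it can help to generate one prompt per combination of options. The sketch below is a minimal illustration under that assumption. The function name `prompt_variants` and the prompt template are hypothetical choices, and the code only builds strings without calling any image generator.

```python
from itertools import product


def prompt_variants(subject, styles, lightings):
    """Return one prompt for every (style, lighting) combination.

    Hypothetical helper; the template "<style> of <subject>, <lighting>"
    is just one possible way to phrase the prompt.
    """
    return [
        f"{style} of {subject}, {lighting}"
        for style, lighting in product(styles, lightings)
    ]


variants = prompt_variants(
    "a potted plant on a table",
    styles=["a pencil drawing", "an oil painting"],
    lightings=["in a spotlight", "in ambient lighting"],
)
for prompt in variants:
    print(prompt)
```

Changing one option at a time while keeping the subject fixed makes it easier to see which part of the prompt caused which change in the output.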

Positive and negative prompting

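Some image generators, such as Stable Diffusion, support 'negative prompting' in addition to positive prompting (i.e. telling the model what you want to see in its output). With a negative prompt, you specify what you don't want to see. This tends to work better than negating something inside the positive prompt: in the example below, 'without moustache' still produces a moustache, whereas the negative prompt 'moustache' does not.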

Three portrait photos of men: the left and middle ones have moustaches, the right one does not. Prompts, from left to right: 'Portrait photo of a man', 'Portrait photo of a man without moustache', and 'Portrait photo of a man' with negative prompt 'moustache'. All images generated by Stable Diffusion.
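In code, a negative prompt is usually passed as a separate argument rather than written into the prompt text. The sketch below assumes the Hugging Face `diffusers` library, whose Stable Diffusion pipelines accept `prompt` and `negative_prompt` keyword arguments. The helper `build_generation_args` is hypothetical and only assembles those arguments, so it runs without loading a model.

```python
def build_generation_args(prompt, unwanted=()):
    """Keep unwanted elements out of the positive prompt.

    Hypothetical helper: returns keyword arguments in the shape that
    Stable Diffusion pipelines in `diffusers` accept.
    """
    args = {"prompt": prompt}
    # Drop empty entries and duplicates while preserving order.
    terms = list(dict.fromkeys(t.strip() for t in unwanted if t.strip()))
    if terms:
        args["negative_prompt"] = ", ".join(terms)
    return args


args = build_generation_args("Portrait photo of a man", unwanted=["moustache"])
print(args)

# Assuming `diffusers` is installed and `pipe` is an already loaded
# Stable Diffusion pipeline (not shown here), the call would look like:
# image = pipe(**args).images[0]
```

The positive prompt stays free of the word 'moustache', which matches the right-hand example above.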

The text of this page is based largely on a document written by Tijmen Kerstens.