
What common errors should I look out for in Image Generation?

Written by Christopher John
Updated over 3 weeks ago

Introduction

AI Image Generation has opened up many new creative possibilities; with some simple text-based prompting it's possible to create unique and richly detailed visuals that would previously have taken hours of work.

Alongside these impressive results, even the most advanced models used by Pencil can make certain fairly predictable mistakes. Knowing what to look out for will save you time and help you decide whether to adjust your prompt, or whether stepping in with manual edits is your best route to an optimal result.

This article covers the most frequent Image Generation issues, explains why they happen, and offers some considerations for working around them.

Models are improving fast, and such issues are becoming less common, though possibly harder to spot.

Anatomical mistakes

One of the most noticeable and unsettling errors in AI-generated images is anatomical inaccuracy. Extra fingers, twisted limbs, backward joints, and inconsistent facial features are all common, and can be an issue when generating people or animals. These mistakes stem from the fact that models learn from millions of images without fully understanding the structure or function of a human or animal body - they match patterns rather than comprehend them.

You can minimise these issues by being more specific in your prompt (e.g. 'a kitten with a single tail') or, where appropriate, using post-processing tools like Photoshop to fix distorted parts. Don't forget that you can generate multiple variations with a single click, increasing your chances of an issue-free image, or allowing you to choose the cleanest.
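If you're scripting generation outside Pencil (Pencil handles this with a single click in its UI), the same 'multiple candidates' idea can be sketched with the open-source Hugging Face diffusers library. This is a minimal illustration, not Pencil's own API; the model checkpoint, prompt and file names are placeholder assumptions.

```python
import torch
from diffusers import StableDiffusionPipeline

# Minimal sketch using the open-source diffusers library (not Pencil's API).
# The model checkpoint and prompt are placeholder assumptions.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Ask for several candidates in one call, then keep the cleanest one.
images = pipe(
    "a kitten with a single tail, sitting on a windowsill",
    num_images_per_prompt=4,
).images

for i, img in enumerate(images):
    img.save(f"candidate_{i}.png")
```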

Incoherent text in images

Another common issue to look out for comes when you ask AI to generate images that include text - like signs, book covers, or product labels. The result often resembles language at first glance, but on closer inspection may contain symbols, misspellings or meaningless letters. This is because most image models don’t understand language the way text-based models do - they learn what text looks like in images, not what it says.

To work around this, we recommend using Templates to combine text and images coherently and at scale. In some instances, working the text into the image afterwards using editing software might be best.

Lighting and perspective inconsistencies

AI models often struggle with spatial and lighting coherence. Even a simple prompt such as 'a person standing by a window at sunset' can result in incorrectly lit parts of the image, or shadows that are missing, misplaced or duplicated.

These inconsistencies occur because image models aren’t built on a 3D understanding of space - they draw on a vast dataset to stitch together visual elements based on what typically appears near each other, not on real-world geometry or physics.

The best way to mitigate such issues is to include as much detail as possible in your prompt - for example, naming the light source and its direction ('lit from the left by a low sun through the window') gives the model less to guess at. See this guide for tips.

Cluttered or 'melting' backgrounds

Even when the main subject of a generated image is as desired, the background can often reveal subtle flaws. You might notice environments that feel overly busy, abstract, or 'melting' - where objects and textures blend together in unnatural ways, or objects in the distance appear half-formed. This happens because background elements are typically lower priority in the model’s attention, and because it’s harder for the AI to maintain spatial consistency across a wide canvas.

These issues are especially common in prompts that ask for complex scenes with numerous elements, or wide landscapes, where the model has to juggle many visual elements at once.

To improve results, you may consider simplifying your prompt so that it focuses on fewer key elements. You can also include negative prompts such as 'excessive detail' or 'cluttered background', as in the sketch below.
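If you're working with an open-source pipeline rather than Pencil's UI, a negative prompt is typically passed alongside the main prompt. A minimal sketch with the diffusers library, re-using the placeholder pipeline from the earlier example:

```python
import torch
from diffusers import StableDiffusionPipeline

# Same placeholder checkpoint as in the earlier sketch.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# The negative_prompt steers sampling away from the listed concepts.
image = pipe(
    prompt="a cyclist on a quiet street, soft morning light",
    negative_prompt="excessive detail, cluttered background, melting textures",
).images[0]
image.save("decluttered.png")
```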

Finally, you may wish to crop, blur, or otherwise manually edit backgrounds in post-processing, especially if your goal is to put the spotlight on a subject rather than render a detailed environment.
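As a simple illustration of that last option, here is a sketch using the Pillow library: it blurs the whole frame, then restores a sharp central region. The file names, and the assumption that the subject sits in the centre, are hypothetical.

```python
from PIL import Image, ImageFilter

img = Image.open("generated.png")  # hypothetical output file

# Soften the whole frame, then paste the sharp subject region back on top,
# so the subject stays crisp while the cluttered background recedes.
blurred = img.filter(ImageFilter.GaussianBlur(radius=6))
w, h = img.size
box = (w // 4, h // 4, 3 * w // 4, 3 * h // 4)  # assumed subject location
blurred.paste(img.crop(box), box)
blurred.save("generated_spotlight.png")
```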

Style mismatch and prompt drift

Another frustration you might encounter in image generation is when the output drifts away from your intended style or theme. For example, your prompt might ask for a 'whimsical children's book illustration' but produce something eerily realistic. This kind of 'prompt drift' often happens when certain keywords dominate the model's training associations, or when multiple concepts in a single prompt pull in conflicting directions.

To reduce this, you have a few options:

  • Test prompts incrementally, adding elements bit-by-bit, possibly using Seeds if you have the basis of an image you are happy with (see the sketch after this list).

  • You may consider specifying a particular medium or art-inspired style, e.g. 'in the style of a watercolour painting' or 'in classic comic book art style'.

  • If you are looking to generate multiple images in a specific style you may wish to consider training a LoRA, or custom model, provided you have sufficient reference images.
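To illustrate the first option, here is a minimal sketch of seed-pinned, incremental prompting with the diffusers library. Re-using the same seed keeps the overall composition stable while you extend the prompt, which makes drift easier to spot. The checkpoint, seed value and prompts are placeholder assumptions, not Pencil's own API.

```python
import torch
from diffusers import StableDiffusionPipeline

# Placeholder checkpoint, as in the earlier sketches.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

SEED = 1234  # arbitrary; re-using it keeps the composition stable

# Step 1: a base image you are happy with.
gen = torch.Generator(device="cuda").manual_seed(SEED)
base = pipe("children's book illustration of a fox", generator=gen).images[0]

# Step 2: same seed, prompt extended bit-by-bit, so any style drift
# comes from the new words rather than a new random starting point.
gen = torch.Generator(device="cuda").manual_seed(SEED)
refined = pipe(
    "children's book illustration of a fox, watercolour, soft pastel palette",
    generator=gen,
).images[0]
```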

Conclusion

AI Image Generation is both an art and a science. While good models are incredibly powerful and draw on vast datasets, they're not perfect and they don't 'think' in the way a human does.

Knowing what can go wrong is one of the first steps to making better use of them. Whatever brief you are working on, or content you are looking to create, you should expect a few flawed results along the way.

The key is adjusting prompts, iterating, trying different models, and considering some of the additional tools mentioned in this article: Templates, Seeds, Negative prompts, and LoRAs. Don't forget to check our other extensive Image Generation resources.
