AI Image Generation Guide: DALL-E, Midjourney & Stable Diffusion


How AI Image Generation Works

Modern AI image generators use diffusion models — a technique that starts with random noise and gradually removes it, guided by your text prompt, until a coherent image emerges. This process typically runs 20–50 denoising steps.

In latent diffusion models such as Stable Diffusion, the key components are a text encoder (usually CLIP) that converts your prompt into embedding vectors, a U-Net that iteratively denoises the latent image, and a VAE (Variational Autoencoder) that converts between pixel space and the compressed latent space where diffusion happens.
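The loop below is a toy sketch of the "start with noise, remove it step by step" idea, not a real diffusion model. The names `predictNoise` and `denoise` are made up for illustration, and the stand-in denoiser cheats by knowing the target values, which here play the role of the guidance a trained network would get from the prompt.

```javascript
// Toy illustration of an iterative denoising loop (NOT a real diffusion model).
// Assumption: `predictNoise` stands in for the trained denoising network and
// "cheats" by knowing the target it should converge to.
function predictNoise(sample, target) {
  return sample.map((value, i) => value - target[i]);
}

function denoise(target, steps = 30) {
  // Start from pure random noise in [-1, 1).
  let sample = target.map(() => Math.random() * 2 - 1);
  for (let step = 0; step < steps; step++) {
    const noise = predictNoise(sample, target);
    const remaining = steps - step; // steps left, including this one
    // Remove 1/remaining of the current noise, so the noise shrinks
    // linearly and reaches zero on the final step.
    sample = sample.map((value, i) => value - noise[i] / remaining);
  }
  return sample;
}

const target = [0.2, 0.8, 0.5, 0.1]; // stand-in for a tiny 4-value "image"
const result = denoise(target, 30);
console.log(result.map((v) => v.toFixed(3)));
```

A real pipeline would run this loop in the VAE's latent space with a learned noise predictor and a proper noise schedule; the sketch only shows the shape of the iteration.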

DALL-E 3 vs Midjourney vs Stable Diffusion

DALL-E 3 (OpenAI)

PROS

+ Best prompt adherence: follows instructions very literally
+ Available via API ($0.04–$0.12 per image)
+ Integrated with ChatGPT
+ Good at text in images

CONS

− Less artistic/stylized than Midjourney
− More conservative content policy
− Slower than some alternatives

Midjourney

PROS

+ Best aesthetic quality for artistic images
+ Excellent for concept art and illustrations
+ Active community and style references
+ V6 has great photorealism

CONS

− No official public API; access is through Discord or the Midjourney web app
− Subscription required ($10–$120/month)
− Less precise prompt following

Stable Diffusion (open source)

PROS

+ Free to run locally
+ Fully customizable: thousands of community models
+ No content restrictions (run locally)
+ ControlNet for precise control

CONS

− Requires GPU hardware or cloud credits
− Steeper learning curve
− Quality varies by model/settings

Writing Effective Image Prompts

Image prompts follow a different structure than text prompts. The most effective formula:

[Subject], [Style/Medium], [Lighting], [Composition], [Quality modifiers]

Weak prompt

a cat

Strong prompt

A fluffy orange tabby cat sitting on a windowsill, golden hour sunlight, shallow depth of field, photorealistic, 85mm lens, soft bokeh background, 4K
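The formula above can be sketched as a small helper that joins the parts in order and skips any that are missing. `buildImagePrompt` and its field names are hypothetical, chosen to mirror the formula; they are not part of any image-generation SDK.

```javascript
// Hypothetical helper: assembles a prompt in the order
// [Subject], [Style/Medium], [Lighting], [Composition], [Quality modifiers].
function buildImagePrompt({ subject, style, lighting, composition, quality = [] }) {
  if (!subject) throw new Error('A subject is required');
  return [subject, style, lighting, composition, ...quality]
    .filter((part) => typeof part === 'string' && part.trim() !== '') // drop missing parts
    .map((part) => part.trim())
    .join(', ');
}

const prompt = buildImagePrompt({
  subject: 'A fluffy orange tabby cat sitting on a windowsill',
  style: 'photorealistic',
  lighting: 'golden hour sunlight',
  composition: 'shallow depth of field, 85mm lens, soft bokeh background',
  quality: ['4K'],
});
console.log(prompt);
```

Keeping the parts separate makes it easy to swap one element, such as the lighting, while holding the rest of the prompt constant.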

Style Keywords That Work

Photography

photorealistic, DSLR, 85mm lens, bokeh, golden hour, studio lighting, HDR

Art Styles

oil painting, watercolor, digital art, concept art, anime, pixel art, impressionist

Lighting

cinematic lighting, rim lighting, volumetric light, neon lights, candlelight, overcast

Quality

highly detailed, 8K, sharp focus, intricate details, masterpiece, award-winning

Using DALL-E 3 via API

import OpenAI from 'openai';

// Reads the API key from the OPENAI_API_KEY environment variable by default.
const openai = new OpenAI();

// Top-level await requires an ES module (a .mjs file or "type": "module").
const image = await openai.images.generate({
  model: 'dall-e-3',
  prompt: 'A futuristic city skyline at night with flying cars, cyberpunk aesthetic, neon lights reflecting on wet streets, cinematic lighting, 4K',
  n: 1,                    // DALL-E 3 only supports n=1
  size: '1792x1024',       // 1024x1024 | 1024x1792 | 1792x1024
  quality: 'hd',           // 'standard' | 'hd' (1.5–2x cost, depending on size)
  style: 'vivid',          // 'vivid' | 'natural'
  response_format: 'url',  // 'url' | 'b64_json'
});

console.log(image.data[0].url);
// Returns a temporary URL valid for 1 hour
// Download and store it yourself for permanent access

// DALL-E 3 may rewrite your prompt; this is the version it actually used.
console.log(image.data[0].revised_prompt);
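Because the returned URL is temporary, one way to keep the image is to download it right after generation. The sketch below assumes Node.js 18 or later (for the global `fetch`); `saveImageFromUrl` and its `fetchImpl` parameter are illustrative names, not part of the OpenAI SDK.

```javascript
// Sketch: persist a generated image before its temporary URL expires.
// Assumption: Node.js 18+ (global fetch). `fetchImpl` is injectable so the
// function can be exercised without a network call.
async function saveImageFromUrl(url, filePath, fetchImpl = fetch) {
  // Dynamic import works in both CommonJS and ES module files.
  const { writeFile } = await import('node:fs/promises');
  const response = await fetchImpl(url);
  if (!response.ok) {
    throw new Error(`Download failed with HTTP status ${response.status}`);
  }
  const bytes = Buffer.from(await response.arrayBuffer());
  await writeFile(filePath, bytes);
  return bytes.length; // number of bytes written
}
```

With the response from the example above, a call might look like `await saveImageFromUrl(image.data[0].url, 'skyline.png')`. Requesting `response_format: 'b64_json'` instead avoids the second HTTP request, at the cost of a larger API response.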

Image Generation Pricing

Model	Size	Price per image
DALL-E 3 Standard	1024×1024	$0.040
DALL-E 3 Standard	1024×1792 or 1792×1024	$0.080
DALL-E 3 HD	1024×1024	$0.080
DALL-E 3 HD	1024×1792 or 1792×1024	$0.120
DALL-E 2	1024×1024	$0.020
Stable Diffusion (self-hosted)	Any	$0.00 (GPU cost only)
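To budget a batch of images, the table can be turned into a small lookup. This is a sketch that hard-codes the DALL-E 3 prices listed above, which may change over time; `estimateCostUsd` and the `PRICE_CENTS` table are hypothetical names. Prices are stored in whole cents to avoid floating-point rounding surprises.

```javascript
// Prices in US cents per image, copied from the table above (may be outdated).
const PRICE_CENTS = {
  standard: { '1024x1024': 4, '1024x1792': 8, '1792x1024': 8 },
  hd: { '1024x1024': 8, '1024x1792': 12, '1792x1024': 12 },
};

// Hypothetical helper: estimated cost in US dollars for `count` DALL-E 3 images.
function estimateCostUsd({ quality = 'standard', size = '1024x1024', count = 1 }) {
  const cents = PRICE_CENTS[quality]?.[size];
  if (cents === undefined) {
    throw new Error(`No price listed for quality "${quality}" at size "${size}"`);
  }
  return (cents * count) / 100;
}

console.log(estimateCostUsd({ quality: 'hd', size: '1792x1024', count: 10 })); // 1.2
console.log(estimateCostUsd({ count: 250 })); // 10
```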


DevBench Editorial Team

Software Developers & Technical Writers

The DevBench team builds and maintains 90+ free developer tools used by thousands of developers daily. We write practical, no-fluff guides covering web development, APIs, security, data formats, and AI tools.
