AI Image Generation Guide: DALL-E, Midjourney & Stable Diffusion


How AI Image Generation Works

Modern AI image generators use diffusion models, a technique that starts with random noise and gradually removes it, guided by your text prompt, until a coherent image emerges. This process typically runs 20–50 denoising steps.

In latent diffusion models such as Stable Diffusion, the key components are a text encoder (usually CLIP) that converts your prompt into embedding vectors, a U-Net that iteratively denoises the latent image, and a VAE (Variational Autoencoder) that converts between pixel space and the compressed latent space where diffusion happens.
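To make the denoising loop concrete, here is a deliberately simplified sketch that runs it on a four-number "image". It is a toy, not a real diffusion model: the hypothetical `predictNoise` function stands in for the U-Net and simply compares against a known target, whereas a trained network would estimate the noise from the noisy latent, the timestep, and the prompt embedding. The names `predictNoise` and `denoise` and the step schedule are assumptions made for illustration.

```javascript
// Toy stand-in for the U-Net: "predicts" the noise in the current sample.
// A real model has no access to the target; it learns to estimate the noise.
function predictNoise(sample, target) {
  return sample.map((value, i) => value - target[i]);
}

// Start from random noise and remove a growing fraction of the predicted
// noise at each step, so the sample drifts toward the target.
function denoise(target, steps = 30) {
  let sample = target.map(() => Math.random() * 2 - 1); // pure noise in [-1, 1)
  for (let step = 0; step < steps; step++) {
    const noise = predictNoise(sample, target);
    const fraction = 1 / (steps - step); // reaches 1 on the final step
    sample = sample.map((value, i) => value - fraction * noise[i]);
  }
  return sample;
}

const target = [0.2, 0.8, 0.5, 0.1]; // the "image" the prompt describes
const result = denoise(target, 30);
console.log(result.map((v) => v.toFixed(3)));
```

The only point of the sketch is the shape of the loop: noise in, many small corrections, a clean sample out. Real samplers use learned noise predictions and carefully designed noise schedules.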

DALL-E 3 vs Midjourney vs Stable Diffusion

DALL-E 3 (OpenAI)
PROS
+ Best prompt adherence: follows instructions very literally
+ Available via API ($0.04–$0.12 per image)
+ Integrated with ChatGPT
+ Good at text in images
CONS
− Less artistic/stylized than Midjourney
− More conservative content policy
− Slower than some alternatives

Midjourney
PROS
+ Best aesthetic quality for artistic images
+ Excellent for concept art, illustrations
+ Active community and style references
+ V6 has great photorealism
CONS
− Discord-only interface (no API)
− Subscription required ($10–$120/month)
− Less precise prompt following

Stable Diffusion (open source)
PROS
+ Free to run locally
+ Fully customizable: thousands of community models
+ No content restrictions (run locally)
+ ControlNet for precise control
CONS
− Requires GPU hardware or cloud credits
− Steeper learning curve
− Quality varies by model/settings

Writing Effective Image Prompts

Image prompts follow a different structure than text prompts: instead of full instructions, they read as a comma-separated list of descriptors. A formula that works well:

[Subject], [Style/Medium], [Lighting], [Composition], [Quality modifiers]

Weak prompt: a cat

Strong prompt: A fluffy orange tabby cat sitting on a windowsill, golden hour sunlight, shallow depth of field, photorealistic, 85mm lens, soft bokeh background, 4K
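If you generate prompts in code, the formula above maps naturally onto a small helper. The sketch below is one possible approach; `buildPrompt` and its field names are hypothetical and simply mirror the five slots of the formula.

```javascript
// Assemble a prompt from the five slots in the formula above.
// Empty or missing slots are skipped so partial prompts still read cleanly.
function buildPrompt({ subject, style, lighting, composition, quality }) {
  return [subject, style, lighting, composition, quality]
    .filter((part) => typeof part === 'string' && part.trim() !== '')
    .map((part) => part.trim())
    .join(', ');
}

const prompt = buildPrompt({
  subject: 'A fluffy orange tabby cat sitting on a windowsill',
  style: 'photorealistic, 85mm lens',
  lighting: 'golden hour sunlight',
  composition: 'shallow depth of field, soft bokeh background',
  quality: '4K',
});
console.log(prompt);
```

Keeping the slots separate makes it easy to vary one element at a time, such as the lighting, while holding the rest of the prompt constant.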

Style Keywords That Work

Photography
photorealistic, DSLR, 85mm lens, bokeh, golden hour, studio lighting, HDR

Art Styles
oil painting, watercolor, digital art, concept art, anime, pixel art, impressionist

Lighting
cinematic lighting, rim lighting, volumetric light, neon lights, candlelight, overcast

Quality
highly detailed, 8K, sharp focus, intricate details, masterpiece, award-winning

Using DALL-E 3 via API

import OpenAI from 'openai';

const openai = new OpenAI();

const image = await openai.images.generate({
  model: 'dall-e-3',
  prompt: 'A futuristic city skyline at night with flying cars, cyberpunk aesthetic, neon lights reflecting on wet streets, cinematic lighting, 4K',
  n: 1,                    // DALL-E 3 only supports n=1
  size: '1792x1024',       // 1024x1024 | 1024x1792 | 1792x1024
  quality: 'hd',           // 'standard' | 'hd' (costs more, see pricing below)
  style: 'vivid',          // 'vivid' | 'natural'
  response_format: 'url',  // 'url' | 'b64_json'
});

console.log(image.data[0].url);
// Returns a temporary URL valid for 1 hour
// Download and store it yourself for permanent access
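
Because the returned URL is temporary, a common follow-up is to download the bytes right away. The sketch below assumes Node.js 18 or later (for the built-in `fetch`). `saveImage` and its `fetchImpl` parameter are hypothetical names introduced here and are not part of the OpenAI SDK.

```javascript
// Download a generated image and write it to disk before the URL expires.
// `fetchImpl` is injectable so the function can be tested without network access.
async function saveImage(url, filePath, fetchImpl = fetch) {
  const { writeFile } = await import('node:fs/promises');
  const response = await fetchImpl(url);
  if (!response.ok) {
    throw new Error(`Download failed with HTTP ${response.status}`);
  }
  const bytes = Buffer.from(await response.arrayBuffer());
  await writeFile(filePath, bytes);
  return bytes.length; // number of bytes written
}

// Usage, continuing from the call above:
// await saveImage(image.data[0].url, 'skyline.png');
```

Alternatively, requesting `response_format: 'b64_json'` returns the image data inline, which avoids the second HTTP request at the cost of a larger API response.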

Image Generation Pricing

Model | Size | Price per image
DALL-E 3 Standard | 1024×1024 | $0.040
DALL-E 3 Standard | 1792×1024 | $0.080
DALL-E 3 HD | 1024×1024 | $0.080
DALL-E 3 HD | 1792×1024 | $0.120
DALL-E 2 | 1024×1024 | $0.020
Stable Diffusion (self-hosted) | Any | $0.00 (GPU cost only)
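
For budgeting a batch job, the table above can be turned into a small lookup. This is a minimal sketch that hard-codes only the prices listed in this article. `PRICE_CENTS` and `estimateCost` are hypothetical names, and actual prices may change, so check the provider's current pricing before relying on the numbers.

```javascript
// Prices from the table above, stored in cents to avoid floating-point drift.
const PRICE_CENTS = {
  'dall-e-3': {
    standard: { '1024x1024': 4, '1792x1024': 8 },
    hd: { '1024x1024': 8, '1792x1024': 12 },
  },
  'dall-e-2': {
    standard: { '1024x1024': 2 },
  },
};

// Estimate the API cost in dollars for `count` images.
// Throws for any combination that is not listed in the table.
function estimateCost({ model = 'dall-e-3', quality = 'standard', size = '1024x1024', count = 1 }) {
  const cents = PRICE_CENTS[model]?.[quality]?.[size];
  if (cents === undefined) {
    throw new Error(`No price listed for ${model} ${quality} ${size}`);
  }
  return (cents * count) / 100;
}

// 50 HD landscape images at $0.12 each
console.log(estimateCost({ quality: 'hd', size: '1792x1024', count: 50 })); // 6
```

Note from the table that HD doubles the price at 1024×1024 but adds only 50% at 1792×1024.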

Image Tools on DevBench

Resize, compress, and convert your AI-generated images for web use.