AI Image Generation Guide: DALL-E, Midjourney & Stable Diffusion
12 min read · AI & Machine Learning
How AI Image Generation Works
Modern AI image generators use diffusion models, a technique that starts with random noise and gradually removes it, guided by your text prompt, until a coherent image emerges. This process typically runs 20 to 50 denoising steps.
The key components are a text encoder (usually CLIP) that converts your prompt into a vector, a U-Net that iteratively denoises the image, and a VAE (Variational Autoencoder) that converts between pixel space and the compressed latent space where diffusion happens.
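The denoising loop described above can be sketched in a few lines. This is a toy illustration only, not how any real generator is implemented: `predictNoise` is a hypothetical stand-in for the text encoder plus U-Net (here it simply "knows" the target the prompt maps to), and the linear step schedule is an assumption chosen for readability rather than a real noise schedule.

```typescript
type Latent = number[];

// Hypothetical stand-in for the U-Net: estimates the noise as the gap
// between the current latent and the target implied by the prompt.
function predictNoise(latent: Latent, target: Latent): Latent {
  return latent.map((value, i) => value - target[i]);
}

// Start from a noisy latent and remove a share of the predicted noise
// at every step, as a diffusion sampler does conceptually.
function denoise(start: Latent, target: Latent, steps: number): Latent {
  if (start.length !== target.length) {
    throw new RangeError('start and target must have the same length');
  }
  if (!Number.isInteger(steps) || steps < 1) {
    throw new RangeError('steps must be a positive integer');
  }
  let latent = [...start];
  for (let step = 0; step < steps; step++) {
    const noise = predictNoise(latent, target);
    // Assumed schedule: remove 1/(remaining steps) of the noise, so the
    // final step removes everything that is left.
    const fraction = 1 / (steps - step);
    latent = latent.map((value, i) => value - fraction * noise[i]);
  }
  return latent;
}

// Usage: a 3-value "latent" converges on the target after 20 steps.
const finalLatent = denoise([1, -2, 0.5], [0.2, 0.4, 0.6], 20);
console.log(finalLatent);
```

In a real pipeline the latent is a multi-channel tensor, the noise prediction comes from a trained network conditioned on the prompt embedding, and the VAE decodes the final latent into pixels.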
DALL-E 3 vs Midjourney vs Stable Diffusion
DALL-E 3 (OpenAI)
PROS
+ Best prompt adherence: follows instructions very literally
+ Available via API ($0.04 to $0.12 per image)
+ Integrated with ChatGPT
+ Good at text in images
CONS
- Less artistic/stylized than Midjourney
- More conservative content policy
- Slower than some alternatives
Midjourney
PROS
+ Best aesthetic quality for artistic images
+ Excellent for concept art, illustrations
+ Active community and style references
+ V6 has great photorealism
CONS
- Discord-only interface (no API)
- Subscription required ($10 to $120/month)
- Less precise prompt following
Stable Diffusion (open source)
PROS
+ Free to run locally
+ Fully customizable: thousands of community models
+ No content restrictions (run locally)
+ ControlNet for precise control
CONS
- Requires GPU hardware or cloud credits
- Steeper learning curve
- Quality varies by model/settings
Writing Effective Image Prompts
Image prompts follow a different structure than text prompts. The most effective formula:
[Subject], [Style/Medium], [Lighting], [Composition], [Quality modifiers]
Weak prompt
a cat
Strong prompt
A fluffy orange tabby cat sitting on a windowsill, golden hour sunlight, shallow depth of field, photorealistic, 85mm lens, soft bokeh background, 4K
Style Keywords That Work
Photography
photorealistic, DSLR, 85mm lens, bokeh, golden hour, studio lighting, HDR
Art Styles
oil painting, watercolor, digital art, concept art, anime, pixel art, impressionist
Lighting
cinematic lighting, rim lighting, volumetric light, neon lights, candlelight, overcast
Quality
highly detailed, 8K, sharp focus, intricate details, masterpiece, award-winning
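As a rough sketch, the prompt formula and keyword groups above can be assembled programmatically. The `PromptParts` interface and `buildPrompt` helper are hypothetical names introduced for illustration; they are not part of any image generator's API.

```typescript
// Hypothetical shape mirroring the formula:
// [Subject], [Style/Medium], [Lighting], [Composition], [Quality modifiers]
interface PromptParts {
  subject: string;
  style?: string[];
  lighting?: string[];
  composition?: string[];
  quality?: string[];
}

// Joins the parts in formula order, skipping empty entries.
function buildPrompt(parts: PromptParts): string {
  const ordered = [
    parts.subject,
    ...(parts.style ?? []),
    ...(parts.lighting ?? []),
    ...(parts.composition ?? []),
    ...(parts.quality ?? []),
  ];
  return ordered
    .map((part) => part.trim())
    .filter((part) => part.length > 0)
    .join(', ');
}

// Usage: the cat example expressed as structured parts.
const catPrompt = buildPrompt({
  subject: 'A fluffy orange tabby cat sitting on a windowsill',
  style: ['photorealistic', '85mm lens'],
  lighting: ['golden hour sunlight'],
  composition: ['shallow depth of field', 'soft bokeh background'],
  quality: ['4K'],
});
console.log(catPrompt);
```

Keeping the parts separate makes it easy to swap one group, such as the lighting keywords, while holding the subject fixed when comparing results.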
Using DALL-E 3 via API
import OpenAI from 'openai';
const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment
const image = await openai.images.generate({
  model: 'dall-e-3',
  prompt: 'A futuristic city skyline at night with flying cars, cyberpunk aesthetic, neon lights reflecting on wet streets, cinematic lighting, 4K',
  n: 1, // DALL-E 3 only supports n=1
  size: '1792x1024', // 1024x1024 | 1024x1792 | 1792x1024
  quality: 'hd', // 'standard' | 'hd' (higher cost)
  style: 'vivid', // 'vivid' | 'natural'
  response_format: 'url', // 'url' | 'b64_json'
});
// Optional chaining guards against a missing result
console.log(image.data?.[0]?.url);
// Returns a temporary URL valid for 1 hour
// Download and store it yourself for permanent access
Image Generation Pricing
| Model | Size | Price per image |
|---|---|---|
| DALL-E 3 Standard | 1024×1024 | $0.040 |
| DALL-E 3 Standard | 1792×1024 | $0.080 |
| DALL-E 3 HD | 1024×1024 | $0.080 |
| DALL-E 3 HD | 1792×1024 | $0.120 |
| DALL-E 2 | 1024×1024 | $0.020 |
| Stable Diffusion (self-hosted) | Any | $0.00 (GPU cost only) |
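To budget a batch of images, the DALL-E 3 rows of the table can be turned into a small lookup. This is a minimal sketch: `estimateCost` is a hypothetical helper, the prices are copied from the table and may change, and the portrait size (1024x1792) is assumed to cost the same as the landscape size.

```typescript
type Quality = 'standard' | 'hd';
type Size = '1024x1024' | '1024x1792' | '1792x1024';

// USD per image, copied from the table above.
// Assumption: 1024x1792 is priced like 1792x1024.
const DALLE3_PRICES: Record<Quality, Record<Size, number>> = {
  standard: { '1024x1024': 0.04, '1024x1792': 0.08, '1792x1024': 0.08 },
  hd: { '1024x1024': 0.08, '1024x1792': 0.12, '1792x1024': 0.12 },
};

// Returns the estimated total in USD for `count` images.
function estimateCost(quality: Quality, size: Size, count: number): number {
  if (!Number.isInteger(count) || count < 0) {
    throw new RangeError('count must be a non-negative integer');
  }
  // Work in tenths of a cent to avoid floating-point drift.
  const perImageMills = Math.round(DALLE3_PRICES[quality][size] * 1000);
  return (perImageMills * count) / 1000;
}

// Usage: 250 HD landscape images.
console.log(estimateCost('hd', '1792x1024', 250)); // 30
```

Note that HD doubles the price at 1024×1024 but adds 50% at 1792×1024, so the quality setting matters more for square images in relative terms.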
Image Tools on DevBench
Resize, compress, and convert your AI-generated images for web use.