How AI image prompts are structured
A reliable prompt has four layers: subject, style, lighting, and
composition. State the subject plainly (“a red fox sitting in snow”), then layer a
medium or art style (“oil painting”, “35mm film photo”, “low-poly 3D render”), then a
lighting term, then framing and camera details. Many models tend to give earlier words
more weight, so lead with what matters most. Across tools the vocabulary is shared, but the
syntax differs: Midjourney likes comma-separated keyword stacks, DALL-E 3 prefers full
sentences, and Stable Diffusion front ends accept keyword stacks plus weighting like (term:1.3).
Style and medium words that work
- Photography: 35mm photo, DSLR, cinematic still, editorial photo, macro shot, tilt-shift, analog film grain
- Painting and illustration: oil painting, watercolor, gouache, ink wash, vector illustration, flat design, concept art, matte painting
- 3D and digital: octane render, unreal engine, isometric 3D, clay render, low-poly, pixel art, voxel art
- Era and movement: art nouveau, bauhaus, ukiyo-e, cyberpunk, vaporwave, 1970s sci-fi cover
Pick one dominant style. Stacking several (“watercolor oil painting 3D render”) produces muddy, averaged results because the model tries to satisfy all of them.
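As a rough illustration of the "one dominant style" rule, the sketch below scans a prompt for known style words and reports how many it finds. The `STYLE_TERMS` list and the `styles_in` helper are hypothetical names invented for this example, and the plain substring match is an assumption that will miss synonyms and misspellings.

```python
# Hypothetical, deliberately short list of style/medium terms to look for.
STYLE_TERMS = [
    "oil painting", "watercolor", "gouache", "ink wash",
    "octane render", "3d render", "pixel art", "35mm photo",
]


def styles_in(prompt):
    """Return the known style terms that appear in the prompt (case-insensitive)."""
    text = prompt.lower()
    return [term for term in STYLE_TERMS if term in text]


found = styles_in("watercolor oil painting 3D render")
if len(found) > 1:
    print(f"Found {len(found)} competing styles: {found}. Consider keeping one.")
```

A check like this is only a nudge before you submit a prompt; it cannot tell you which style to keep.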
Lighting, camera, and composition modifiers
Lighting is often the highest-impact lever. Useful terms include golden hour,
blue hour, soft diffused light, rim lighting, backlit, volumetric light,
neon glow, and dramatic chiaroscuro. For camera control, add focal-length and depth
cues: wide-angle, 85mm portrait lens, shallow depth of field, bokeh,
aerial view, low angle, Dutch angle. For composition, use rule of thirds,
centered symmetrical, negative space, close-up, or extreme wide shot.
Quality boosters like highly detailed, 8k, sharp focus, and trending on ArtStation still
help with some Stable Diffusion checkpoints, but newer models such as Midjourney v6 and
DALL-E 3 mostly ignore them, so spend those words on real description instead.
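If you reuse old Stable Diffusion prompts with a newer model, one option is to strip the booster terms automatically. The sketch below assumes a comma-separated prompt; `QUALITY_BOOSTERS` and `drop_boosters` are hypothetical names, and the set contains only the four terms named above.

```python
# The booster terms named in the text, lowercased for comparison.
QUALITY_BOOSTERS = {"highly detailed", "8k", "sharp focus", "trending on artstation"}


def drop_boosters(prompt):
    """Remove booster terms from a comma-separated prompt, keeping the other terms in order."""
    terms = [t.strip() for t in prompt.split(",")]
    kept = [t for t in terms if t and t.lower() not in QUALITY_BOOSTERS]
    return ", ".join(kept)


cleaned = drop_boosters("a red fox sitting in snow, golden hour, 8k, Highly Detailed")
print(cleaned)
```

The freed-up space is better used for concrete description, such as the texture of the fur or the color of the sky.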
Tool-specific syntax and parameters
Midjourney v6 uses parameters after the prompt: --ar 16:9 (aspect ratio),
--stylize 250 (artistic license), --chaos 30 (variation), and --no text to
exclude elements. Keep the prompt itself a concise keyword stack.
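The parameter layout can be sketched as a small string builder. This only assembles the text you would paste into Midjourney; it does not call any Midjourney API, and the function name `midjourney_prompt` and its argument names are assumptions made for this example.

```python
def midjourney_prompt(keywords, ar="16:9", stylize=None, chaos=None, exclude=None):
    """Build a keyword-stack prompt followed by Midjourney-style parameters."""
    text = ", ".join(keywords)
    params = [f"--ar {ar}"]
    if stylize is not None:
        params.append(f"--stylize {stylize}")
    if chaos is not None:
        params.append(f"--chaos {chaos}")
    if exclude:
        # --no takes a comma-separated list of things to keep out of the image.
        params.append("--no " + ", ".join(exclude))
    return text + " " + " ".join(params)


mj = midjourney_prompt(
    ["red fox in snow", "35mm photo", "golden hour"],
    stylize=250,
    chaos=30,
    exclude=["text"],
)
print(mj)
```

Keeping the parameters in one place makes it easy to vary a single value, such as the chaos setting, while the keyword stack stays fixed.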
DALL-E 3 rewrites whatever you type into a longer internal prompt, so write a clear descriptive sentence. If you want literal control, add an instruction such as “I NEED the exact prompt I gave you, do not rewrite it”. It excels at following relationships (“a cat on top of a blue box, not inside it”).
Stable Diffusion XL supports negative prompts (list what to exclude: blurry, extra
fingers, watermark), per-model trigger words from LoRAs or fine-tuned checkpoints, and,
in front ends such as AUTOMATIC1111 and ComfyUI, prompt weighting (keyword:1.4). Always
pair a positive prompt with a negative prompt for clean results.
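A minimal sketch of building a weighted positive prompt and its negative prompt as plain strings. It assumes your Stable Diffusion interface parses the (term:weight) form shown above; the helper names `weighted` and `sd_prompts` are hypothetical and nothing here calls a real image-generation library.

```python
def weighted(term, weight=1.0):
    """Wrap a term as (term:weight); leave it bare when the weight is the default 1.0."""
    return term if weight == 1.0 else f"({term}:{weight:g})"


def sd_prompts(terms, negatives):
    """Return (positive, negative) prompt strings from weighted terms and exclusions."""
    positive = ", ".join(weighted(term, weight) for term, weight in terms)
    negative = ", ".join(negatives)
    return positive, negative


positive, negative = sd_prompts(
    [("a red fox sitting in snow", 1.0), ("oil painting", 1.4), ("rim lighting", 1.0)],
    ["blurry", "extra fingers", "watermark"],
)
print(positive)
print(negative)
```

The two strings go into the separate positive and negative prompt fields of whichever interface you use.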
A practical template that works almost everywhere:
[subject], [one style], [lighting], [camera/composition], [mood/color palette].
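The template can be filled in programmatically. The sketch below is one possible way to do it; the function name `build_prompt` and its parameter names are assumptions, and empty slots are simply skipped.

```python
def build_prompt(subject, style, lighting, composition, mood=""):
    """Join the template slots in order, skipping any that are empty."""
    parts = [subject, style, lighting, composition, mood]
    return ", ".join(p.strip() for p in parts if p and p.strip())


prompt = build_prompt(
    subject="a red fox sitting in snow",
    style="oil painting",
    lighting="golden hour",
    composition="close-up, shallow depth of field",
    mood="muted warm palette",
)
print(prompt)
```

Because the subject comes first, the most important words stay at the front of the prompt however many optional slots are filled.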