Guide
How to Create an AI Prompt From a Photo
A five-part framework for turning any image into a reproducible prompt for Midjourney, DALL·E, Stable Diffusion and ChatGPT vision models.
Most "image-to-prompt" tools spit out generic descriptions like "a beautiful photo of a landscape". That doesn't reproduce anything. The fix is a structured framework — five layers, in order, every time.
1. Identify the subject
Look at the photo and answer: what is this a picture of? Keep it to 2-4 words. 'A red fox in snow' beats 'an animal in winter weather'. Specificity is the single biggest lever in image prompting.
2. Capture composition
Note the shot type (close-up, wide shot, overhead), the camera angle (eye-level, low angle, Dutch tilt), the framing (rule of thirds, centered, negative space), and depth of field (shallow, deep, bokeh). These four numbers are what make a photo look like a photo.
3. Describe lighting & color
Light source (golden hour, overcast, neon, studio softbox), direction (backlit, side-lit, rim light), mood (moody, airy, cinematic) and palette (warm pastels, high-contrast monochrome, teal-and-orange). Lighting is what separates 'AI-looking' from 'real-looking'.
4. Add medium & style
Is it a photograph (35mm, medium format, Polaroid)? An illustration (watercolor, ink, pixel art)? A 3D render (Octane, Unreal, claymation)? Optionally reference a known artist or studio for consistent style anchoring.
5. Assemble & test
Combine in order: [subject], [composition], [lighting], [medium], [modifiers]. Generate, compare to the original, then tighten the weakest part. Two or three iterations usually nails it.
The prompt template
Copy this and fill in the brackets:
[Subject] in [Setting], [Shot type] [camera angle], [framing], [Lighting] [mood], [Medium] in the style of [Reference], --ar 16:9 --style raw
Example: A red fox curled in fresh snow, close-up low angle, shallow depth of field, soft golden-hour backlight, cinematic warmth, 85mm photograph in the style of Annie Leibovitz, --ar 16:9 --style raw
Faster: let a vision model do step 1-4
Upload the photo to GPT-4o, Claude 3.5 Sonnet or Gemini with this meta-prompt:
Analyze this image and output a single Midjourney prompt using: Subject, Setting, Shot type + angle, Lighting + mood, Medium + style reference, then aspect ratio. Be specific, no generic adjectives.
The output is usually 80% there — tweak the lighting and reference, and you're done.
FAQ
Can AI analyze a photo and generate a prompt automatically?
Yes — vision-capable models (GPT-4o, Claude 3.5, Gemini) can describe and prompt-ify an image. The manual framework above produces sharper, more reproducible results.
Does this work for Midjourney, DALL·E and Stable Diffusion?
Yes. The five-part framework is portable across all major image models — only the parameter flags change.
Why does my output look different from the original?
Models add their own bias. Pin focal length, lighting and medium explicitly, then lock a seed for reproducibility.
Skip the work — buy a proven prompt
Browse thousands of expert prompts tested on real images, with example outputs included.
Browse prompts