Guide

How to Create an AI Prompt From a Photo

A five-part framework for turning any image into a reproducible prompt for Midjourney, DALL·E, Stable Diffusion and ChatGPT vision models.

Most "image-to-prompt" tools spit out generic descriptions like "a beautiful photo of a landscape". That doesn't reproduce anything. The fix is a structured framework — five layers, in order, every time.

1. Identify the subject

Look at the photo and answer: what is this a picture of? Keep it to 2-4 words. 'A red fox in snow' beats 'an animal in winter weather'. Specificity is the single biggest lever in image prompting.

2. Capture composition

Note the shot type (close-up, wide shot, overhead), the camera angle (eye-level, low angle, Dutch tilt), the framing (rule of thirds, centered, negative space), and depth of field (shallow, deep, bokeh). These four numbers are what make a photo look like a photo.

3. Describe lighting & color

Light source (golden hour, overcast, neon, studio softbox), direction (backlit, side-lit, rim light), mood (moody, airy, cinematic) and palette (warm pastels, high-contrast monochrome, teal-and-orange). Lighting is what separates 'AI-looking' from 'real-looking'.

4. Add medium & style

Is it a photograph (35mm, medium format, Polaroid)? An illustration (watercolor, ink, pixel art)? A 3D render (Octane, Unreal, claymation)? Optionally reference a known artist or studio for consistent style anchoring.

5. Assemble & test

Combine in order: [subject], [composition], [lighting], [medium], [modifiers]. Generate, compare to the original, then tighten the weakest part. Two or three iterations usually nails it.

The prompt template

Copy this and fill in the brackets:

[Subject] in [Setting],
[Shot type] [camera angle], [framing],
[Lighting] [mood],
[Medium] in the style of [Reference],
--ar 16:9 --style raw

Example: A red fox curled in fresh snow, close-up low angle, shallow depth of field, soft golden-hour backlight, cinematic warmth, 85mm photograph in the style of Annie Leibovitz, --ar 16:9 --style raw

Faster: let a vision model do step 1-4

Upload the photo to GPT-4o, Claude 3.5 Sonnet or Gemini with this meta-prompt:

Analyze this image and output a single Midjourney prompt using:
Subject, Setting, Shot type + angle, Lighting + mood,
Medium + style reference, then aspect ratio.
Be specific, no generic adjectives.

The output is usually 80% there — tweak the lighting and reference, and you're done.

FAQ

Can AI analyze a photo and generate a prompt automatically?

Yes — vision-capable models (GPT-4o, Claude 3.5, Gemini) can describe and prompt-ify an image. The manual framework above produces sharper, more reproducible results.

Does this work for Midjourney, DALL·E and Stable Diffusion?

Yes. The five-part framework is portable across all major image models — only the parameter flags change.

Why does my output look different from the original?

Models add their own bias. Pin focal length, lighting and medium explicitly, then lock a seed for reproducibility.

Skip the work — buy a proven prompt

Browse thousands of expert prompts tested on real images, with example outputs included.

Browse prompts