Video Prompts
Video generation in PrePrompt is image-to-video: you start from a still Frame, then prompt the model to animate it. This flips how you think about prompting. The image already shows the scene, the lighting, and the style. Your prompt is only about what moves.
Good video prompts come from good stills. A muddy or poorly composed starting frame produces a muddy or poorly composed clip. If you’re not happy with the still, reshoot it before you animate.
What to describe
Keep it to three things:
Subject → Motion → Camera
Who or what moves, what they do, and how the camera responds. That’s the whole shape.
“The woman turns her head slowly from facing forward to looking over her right shoulder. Her eyes shift first, then her head follows. Static camera, no movement.”
Twenty-seven words. One subject, one motion, one camera direction. The model can execute that reliably.
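If you assemble prompts in a script, the Subject → Motion → Camera shape can be sketched as a small helper. This is a minimal illustration only: `build_video_prompt` and its parameters are hypothetical names, not part of PrePrompt, and the static-camera default is an assumption chosen so a camera direction is never left out.

```python
def build_video_prompt(
    subject: str,
    motion: str,
    camera: str = "Static camera, no movement",
) -> str:
    """Join subject, motion, and camera into one short prompt.

    Hypothetical helper for illustration; not a PrePrompt API.
    """
    # Subject and motion form one sentence; the camera direction is its own sentence.
    parts = [f"{subject.strip()} {motion.strip()}", camera.strip()]
    # Make sure every sentence ends with exactly one period.
    sentences = [part.rstrip(".") + "." for part in parts if part]
    return " ".join(sentences)


prompt = build_video_prompt(
    subject="The woman",
    motion="turns her head slowly from facing forward to looking over her right shoulder",
)
print(prompt)
```

Keeping the three parts as separate arguments makes it harder to slip scene description into the prompt by accident.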
What not to describe
Do not redescribe the scene. The input frame already shows it.
Don’t say:
- “A woman in a red dress standing in a kitchen” — visible in the frame
- “Warm afternoon light through the window” — inherited from the frame
- “Cinematic, shot on Kodak Portra” — inherited from the frame
Saying it again wastes prompt budget and can confuse the model about what should change versus what should stay.
Motion
Describe movement with defined endpoints. Open-ended motion is the most common reason generations hang or drift.
Works: “The character walks from left to right across the frame, starting off-screen left and exiting off-screen right.”
Doesn’t work: “The character walks.” Where from? Where to? The model has no anchor.
Speed words matter too: “glides smoothly,” “jerks to a halt,” “slowly rises,” “rapid pan,” “gentle tilt,” “crash zoom.” Without a speed cue, the model defaults to a mid-pace that may not match the scene.
Simple motion is reliable. One primary action, one secondary action at most. Four or five simultaneous movements overload the model and produce distortion.
Camera movement
Every camera movement maps to a filmmaking term the models understand.
| Movement | Prompt phrasing |
|---|---|
| Pan (horizontal rotation) | “Camera pans slowly from left to right” |
| Tilt (vertical rotation) | “Camera tilts up from the character’s feet to their face” |
| Dolly in (moves closer) | “Slow dolly push in toward the subject” |
| Dolly out (moves back) | “Camera pulls back to reveal the full scene” |
| Zoom (focal length change) | “Slow zoom into the character’s eyes” |
| Tracking (follows a moving subject) | “Camera tracks the character as they walk right” |
| Orbit (circles the subject) | “Camera slowly orbits the character from front to three-quarter view” |
Combinations work too: “Dolly push while tilting up” for a dramatic reveal. “Rise and pan right” to show a cityscape. “Dolly out while zooming in” for the classic unsettling vertigo effect.
Always include a camera direction — even “static camera, no movement.” If you leave it out, you get a flat, barely-moving clip.
Pacing and duration
Shorter clips are more reliable. Two to three seconds for a single gesture or expression. Five to six seconds for a standard scene beat. Ten seconds for a full performance or reveal.
Push past twelve seconds and you’re asking for drift — the model has to hold character identity and scene coherence over a long stretch, and small inconsistencies accumulate.
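When a sequence needs to run longer than a single reliable clip, one way to plan it is to divide the total running time into equal clips that each stay under a cap. The sketch below assumes a ten-second cap, taken from the guidance above; `split_into_clips` is a hypothetical planning helper, not a PrePrompt feature.

```python
import math


def split_into_clips(total_seconds: float, max_clip_seconds: float = 10.0) -> list[float]:
    """Split a sequence into equal-length clips no longer than the cap.

    Hypothetical helper; the 10-second default is an assumption.
    """
    if total_seconds <= 0 or max_clip_seconds <= 0:
        raise ValueError("durations must be positive")
    # Smallest number of clips that keeps each one at or under the cap.
    count = math.ceil(total_seconds / max_clip_seconds)
    clip_length = round(total_seconds / count, 2)
    return [clip_length] * count


print(split_into_clips(24))  # three clips of 8 seconds each
```

Each resulting clip still needs its own starting frame and its own prompt.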
Character performance
Performance beats come from specific physical direction, not emotional adjectives.
Weak: “She looks sad.”
Strong: “Her eyes lower, the corners of her mouth tighten, she exhales slowly.”
The model can animate what you describe physically. It guesses at what you describe emotionally — and the guess is often wrong.
For dialogue scenes, pair the video generation with a voice line from an Audio Node. Lip sync is approximate: most models produce plausible mouth shapes rather than matching phonemes exactly. If precise sync matters, frame your clip so the character isn’t lit harshly from below.
Atmosphere and weather
You can direct atmosphere changes over time, even though the static elements are inherited from the frame:
- “Rain begins to fall, starting light and building to heavy”
- “Fog slowly rolls in from the background”
- “Wind picks up, blowing the character’s hair and coat”
- “Lightning flash illuminates the scene briefly”
- “A car’s headlights sweep across the scene from left to right”
These read as events the model can time within the clip.
What doesn’t work
- Too many elements. Five-plus simultaneous actions = distortion.
- Open-ended motion. “Things move around” or “the character walks” without endpoints causes hangs.
- Redescribing the frame. The frame is already the input. Describe what changes.
- Contradictions. “Slow rapid pan” or “static camera tracking the subject” confuses the model.
- Vague spatial language. “Somewhere in the scene” won’t land. Say “from the left side of the frame” or “in the foreground.”
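Some of these failure modes can be caught with a rough keyword check before you generate. The sketch below is a heuristic under stated assumptions: `lint_video_prompt`, the word lists, and the warning strings are all hypothetical, and plain keyword matching will miss cases or raise false alarms (for example, a subject that "tracks" something).

```python
import re

# Camera-movement terms from the table above (assumed word list).
CAMERA_MOVES = re.compile(
    r"\b(pan(s|ning)?|tilt(s|ing)?|dolly|zoom(s|ing)?|track(s|ing)?|orbit(s|ing)?)\b"
)
# Phrases that suggest vague or open-ended direction (assumed list).
VAGUE_PHRASES = ("somewhere", "move around", "moves around")


def lint_video_prompt(prompt: str) -> list[str]:
    """Return warnings for common prompt problems. Hypothetical helper."""
    text = prompt.lower()
    warnings = []
    has_static = "static camera" in text
    has_move = CAMERA_MOVES.search(text) is not None
    if not has_static and not has_move:
        warnings.append("no camera direction")
    if has_static and has_move:
        warnings.append("contradiction: static camera with camera movement")
    if re.search(r"\bslow(ly)? rapid(ly)?\b", text):
        warnings.append("contradiction: slow and rapid on the same move")
    if any(phrase in text for phrase in VAGUE_PHRASES):
        warnings.append("vague spatial language")
    return warnings


print(lint_video_prompt("Static camera tracking the subject somewhere in the scene."))
```

A check like this only flags wording. It cannot tell whether the motion itself has clear endpoints, so read the prompt against the list above as well.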
Multi-character scenes
Two characters, two separate motion descriptions. Keep them spatially distinct in the starting frame.
“Character A on the left slowly extends their hand toward Character B on the right. Character B hesitates, then reaches out to take it. Both characters maintain eye contact throughout. Static camera, medium shot framing both.”
Three or more characters in one clip is doable but drift-prone. For tightly choreographed group scenes, consider breaking the action into two or three clips from different angles and cutting between them in the Timeline.
Related
Why does my clip look worse than the still it came from? Usually one of three things: the starting frame was low quality, the prompt asked for too much motion at once, or the duration was too long. Shorten the clip and simplify the motion.
How long can a video clip be? Most clips generate best between two and ten seconds. Push past twelve and you risk visible drift in character features or scene coherence. Chain short clips in the Timeline for longer sequences.
Do I have to describe the scene in the prompt? No — and you shouldn’t. The scene is already in the input frame. Describe only what moves.
Why did my character’s face change halfway through the clip? Identity drift at longer durations. Shorten the clip, keep the motion simple, and make sure the input frame shows the character clearly and well-lit.
Can I animate between two frames? Some video models support a first-frame / end-frame setup — the clip interpolates from one still to the other. Useful for deliberate transitions like “character sitting at desk” to “character standing at the window.”
What if the video model rejects my prompt? Certain words trigger safety filters even in innocent contexts. See Safety Blocks for how to rephrase.