WanImageToVideoApi - ComfyUI Built-in Node Documentation

This documentation was AI-generated. If you find any errors or have suggestions for improvement, please feel free to contribute!

The Wan Image to Video node generates a video from a single input image and a text prompt. It uses the provided image as the first frame and creates a video sequence based on the description, with options for resolution, duration, audio, and other advanced settings.

Inputs

Parameter	Data Type	Required	Range	Description
`model`	COMBO	Yes	"wan2.5-i2v-preview", "wan2.6-i2v"	Model to use (default: "wan2.6-i2v").
`image`	IMAGE	Yes	-	Input image that serves as the first frame for video generation. Exactly one image is required.
`prompt`	STRING	Yes	-	Prompt describing the elements and visual features. Supports English and Chinese (default: empty).
`negative_prompt`	STRING	No	-	Negative prompt describing what to avoid (default: empty).
`resolution`	COMBO	No	"480P", "720P", "1080P"	Video resolution quality (default: "720P"). The Wan 2.6 model does not support 480P.
`duration`	INT	No	5-15 (step: 5)	Duration of the generated video in seconds. A 15-second duration is supported only by the Wan 2.6 model (default: 5).
`audio`	AUDIO	No	-	Optional audio input. It must contain a clear, loud voice with no extraneous noise or background music. When provided, its duration must be between 3.0 and 29.0 seconds.
`seed`	INT	No	0-2147483647	Seed to use for generation (default: 0).
`generate_audio`	BOOLEAN	No	-	If no audio input is provided, generate audio automatically (default: False).
`prompt_extend`	BOOLEAN	No	-	Whether to enhance the prompt with AI assistance (default: True).
`watermark`	BOOLEAN	No	-	Whether to add an AI-generated watermark to the result (default: False).
`shot_type`	COMBO	No	"single", "multi"	Specifies the shot type for the generated video, that is, whether the video is a single continuous shot or multiple shots with cuts. This parameter takes effect only when `prompt_extend` is True (default: "single").
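
As a rough illustration of how the inputs in the table fit together, the sketch below builds an API-format workflow fragment in Python using the documented defaults. Several details are assumptions rather than facts from this page: the `class_type` string is assumed to match the page title, the node IDs `"1"` and `"2"` are arbitrary, the upstream `LoadImage` node and the file name `first_frame.png` are placeholders, and the prompt text is invented for the example.

```python
import json

# Values for the Wan Image to Video node, using the defaults from the table.
# The optional `audio` input is simply left unconnected here.
wan_inputs = {
    "model": "wan2.6-i2v",
    "image": ["1", 0],  # assumed link: output 0 of the node with ID "1"
    "prompt": "A paper boat drifting down a rainy street",  # example text only
    "negative_prompt": "",
    "resolution": "720P",
    "duration": 5,
    "seed": 0,
    "generate_audio": False,
    "prompt_extend": True,
    "watermark": False,
    "shot_type": "single",  # only takes effect because prompt_extend is True
}

# Hypothetical two-node fragment: an image loader feeding the video node.
workflow = {
    "1": {"class_type": "LoadImage", "inputs": {"image": "first_frame.png"}},
    "2": {"class_type": "WanImageToVideoApi", "inputs": wan_inputs},
}

print(json.dumps(workflow, indent=2))
```

The fragment stops at the video node; a complete workflow would still need a node that consumes the `VIDEO` output.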

Constraints:

Exactly one input image is required for video generation.
The Wan 2.6 model (wan2.6-i2v) does not support 480P resolution.
A 15-second duration is supported only by the Wan 2.6 model (wan2.6-i2v).
When audio is provided, it must be between 3.0 and 29.0 seconds in duration.
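
The constraints above can be checked before a job is submitted. The following is a minimal sketch of such a pre-flight check; `check_wan_i2v_inputs` is a hypothetical helper written for this page, not part of ComfyUI, and it only encodes the ranges and rules listed here. The node's own validation may differ in wording or order.

```python
MODELS = ("wan2.5-i2v-preview", "wan2.6-i2v")
RESOLUTIONS = ("480P", "720P", "1080P")
DURATIONS = (5, 10, 15)  # range 5-15 with a step of 5


def check_wan_i2v_inputs(model, resolution="720P", duration=5,
                         image_count=1, audio_seconds=None):
    """Return a list of constraint violations (empty if none are found)."""
    problems = []
    if model not in MODELS:
        problems.append(f"unknown model: {model}")
    if resolution not in RESOLUTIONS:
        problems.append(f"unknown resolution: {resolution}")
    if duration not in DURATIONS:
        problems.append(f"duration must be one of {DURATIONS}")
    if image_count != 1:
        problems.append("exactly one input image is required")
    if model == "wan2.6-i2v" and resolution == "480P":
        problems.append("wan2.6-i2v does not support 480P")
    if duration == 15 and model != "wan2.6-i2v":
        problems.append("a 15-second duration requires wan2.6-i2v")
    # Audio is optional; the length rule applies only when it is provided.
    if audio_seconds is not None and not (3.0 <= audio_seconds <= 29.0):
        problems.append("audio must be between 3.0 and 29.0 seconds long")
    return problems


# Example: print any violations for a 15-second request on the 2.5 preview model.
for problem in check_wan_i2v_inputs("wan2.5-i2v-preview", duration=15):
    print(problem)
```

Returning a list instead of raising on the first failure lets a caller report every violated constraint at once.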

Outputs

Output Name	Data Type	Description
`output`	VIDEO	Generated video based on the input image and prompt.

Source fingerprint (SHA-256): b8a75e324f7436e8a376e4a058b0a32556cafbe8e7975148cbc6302638f52058
