StabilityAudioInpaint - ComfyUI Built-in Node Documentation

Regenerates a selected section of an existing audio sample from a text prompt. The section between `mask_start` and `mask_end` is “inpainted” according to the prompt, while the rest of the audio is preserved.

Inputs

Parameter	Description	Data Type	Required	Range
`model`	The AI model to use for audio inpainting.	STRING	Yes	`"stable-audio-2.5"`
`prompt`	Text description guiding how the audio should be transformed (default: empty).	STRING	Yes	Up to 10,000 characters
`audio`	Input audio file to transform.	AUDIO	Yes	6 to 190 seconds long
`duration`	Controls the duration in seconds of the generated audio (default: 190).	INT	No	1 to 190
`seed`	The random seed used for generation (default: 0).	INT	No	0 to 4294967294
`steps`	Controls the number of sampling steps (default: 8).	INT	No	4 to 8
`mask_start`	Starting position in seconds for the audio section to transform (default: 30).	INT	No	0 to 190
`mask_end`	Ending position in seconds for the audio section to transform (default: 190).	INT	No	0 to 190

Note: `mask_end` must be greater than `mask_start`. The input audio must be between 6 and 190 seconds long.
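
The constraints in the table and note above can be checked before queuing a workflow. The following is a minimal sketch, not part of ComfyUI: `InpaintSettings` and `validate_settings` are hypothetical names introduced here for illustration, and the limits are copied from the table. It assumes you already know the input audio length in seconds.

```python
from dataclasses import dataclass
from typing import List

# Limits copied from the Inputs table; adjust them if the node changes.
MIN_AUDIO_SECONDS = 6
MAX_AUDIO_SECONDS = 190
MAX_PROMPT_CHARS = 10_000
MAX_SEED = 4294967294


@dataclass
class InpaintSettings:
    """Hypothetical container mirroring the node's inputs (not a ComfyUI class)."""
    prompt: str = ""
    model: str = "stable-audio-2.5"
    duration: int = 190
    seed: int = 0
    steps: int = 8
    mask_start: int = 30
    mask_end: int = 190


def validate_settings(settings: InpaintSettings, audio_seconds: float) -> List[str]:
    """Return a list of problems; an empty list means every documented check passed."""
    problems = []
    if settings.model != "stable-audio-2.5":
        problems.append("model must be 'stable-audio-2.5'")
    if not MIN_AUDIO_SECONDS <= audio_seconds <= MAX_AUDIO_SECONDS:
        problems.append("input audio must be between 6 and 190 seconds long")
    if len(settings.prompt) > MAX_PROMPT_CHARS:
        problems.append("prompt must be at most 10,000 characters")
    if not 1 <= settings.duration <= 190:
        problems.append("duration must be between 1 and 190")
    if not 0 <= settings.seed <= MAX_SEED:
        problems.append("seed must be between 0 and 4294967294")
    if not 4 <= settings.steps <= 8:
        problems.append("steps must be between 4 and 8")
    if not 0 <= settings.mask_start <= 190:
        problems.append("mask_start must be between 0 and 190")
    if not 0 <= settings.mask_end <= 190:
        problems.append("mask_end must be between 0 and 190")
    if settings.mask_end <= settings.mask_start:
        problems.append("mask_end must be greater than mask_start")
    return problems


if __name__ == "__main__":
    settings = InpaintSettings(prompt="add a soft piano melody", mask_start=10, mask_end=40)
    for problem in validate_settings(settings, audio_seconds=60.0):
        print(problem)
```

Returning a list of messages rather than raising on the first failure lets a caller report every out-of-range value at once. Whether the node itself rejects or clamps such values is not stated here, so treat this as a pre-flight check only.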

Outputs

Output Name	Description	Data Type
`audio`	The transformed audio output with the specified section modified according to the prompt.	AUDIO

This documentation was AI-generated. If you find any errors or have suggestions for improvement, please feel free to contribute!

Source fingerprint (SHA-256): 3c180043c538311b1808cddd84b0c0ab22a6fa1d943b7f9ddc9edab0fb3413ad
