Meta’s SAM Audio Brings AI-Powered Sound Separation to Creators and Editors

Meta’s new SAM Audio model isolates individual sounds from a recording in response to text, video, or timeline prompts, with the aim of simplifying audio workflows.

Meta has launched SAM Audio, a new open‑source artificial intelligence model designed to make advanced audio editing and sound isolation more intuitive and accessible. Part of Meta’s Segment Anything family of AI models, SAM Audio extends the “segment anything” philosophy from images and video into the audio domain, enabling users to isolate specific sounds from complex recordings using natural prompts.

Traditional audio editing often requires specialist tools and manual work to separate vocals, background noise, instruments, or sound effects. SAM Audio introduces a unified approach that supports three types of user prompts: text, visual, and time‑span.

With text prompts, users can type descriptions like “dog barking” or “lead vocals” to extract the corresponding sound from a track. Visual prompting lets users click on a source in a video frame to isolate its audio component, while time‑span prompts guide the model to identify a sound based on where it appears in the timeline. These prompt methods can be used alone or in combination for more precise control.
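As a rough illustration of the time‑span idea, the sketch below converts a span given in seconds into a range of sample indices within a recording. The function name and parameters are hypothetical and are not part of SAM Audio's interface; how the model itself consumes a time span is not covered here.

```python
def span_to_sample_range(start_s, end_s, sample_rate, total_samples):
    """Map a time span in seconds to a half-open [start, end) sample range.

    Hypothetical helper for illustration only; not a SAM Audio function.
    """
    if start_s < 0 or end_s <= start_s:
        raise ValueError("span must satisfy 0 <= start_s < end_s")
    start = int(round(start_s * sample_rate))
    end = int(round(end_s * sample_rate))
    # Clamp to the length of the recording so the range stays valid.
    return min(start, total_samples), min(end, total_samples)


# A 5-second clip at 16 kHz has 80,000 samples; select seconds 1.5 to 3.0.
sample_range = span_to_sample_range(1.5, 3.0, 16_000, 80_000)
```

A half-open range is used so that adjacent spans do not share a sample.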

Under the hood, SAM Audio employs a transformer‑based architecture and Meta’s Perception Encoder Audiovisual (PE‑AV) engine to analyze text, audio, and visual inputs together. It outputs the separated target audio along with a residual track containing the remaining sound. The architecture aims to deliver state‑of‑the‑art audio separation across real‑world scenarios, from speech and music to environmental sounds.
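One way to picture the relationship between target and residual: if the two are assumed to add back up to the original mixture (a common convention in source separation, and an assumption here rather than a documented property of SAM Audio), the residual is simply what remains after subtracting the target. The function below is a hypothetical illustration on plain lists of samples.

```python
def residual_from(mixture, target):
    """Return mixture minus target, sample by sample.

    Illustrative only: assumes target + residual == mixture, which is
    an assumption and not a stated property of SAM Audio's outputs.
    """
    if len(mixture) != len(target):
        raise ValueError("mixture and target must have the same length")
    return [m - t for m, t in zip(mixture, target)]


mixture = [0.5, -0.25, 0.75, 0.0]   # original recording (toy values)
target = [0.25, -0.25, 0.5, 0.0]    # isolated sound (toy values)
residual = residual_from(mixture, target)
```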

The model is available through Meta’s Segment Anything Playground, where users can upload their own audio and video or use sample assets to test separation tasks. It can also be downloaded for integration into production workflows.

Meta positions SAM Audio as a foundational tool that could streamline workflows in music production, podcast editing, film post‑production, accessibility applications, and research, potentially reducing reliance on multiple specialized audio tools and making professional‑quality sound isolation more widely available.

