Silence removal

6 minutes of dead air.
Gone in 30 seconds.

Sapari transcribes your video, finds every pause longer than your threshold, and cuts it. You watch the timeline, not a stopwatch.

Start free trial

7 days · 30 AI minutes · No credit card

Before

47:23

~14% dead air · sparse peaks · long flat sections

Sapari · one pass

↓

After

38:41

Tight peaks · minimal gaps · 8m 42s saved

The problem

The worst kind
of editing labor.

Manual silence cutting is scrub, find a pause, cut, cross-fade, repeat — for hours. A 30-minute recording with natural pauses is 90 minutes of work before you've started on captions or audio.

That is most of the "four hours of editing per hour of footage" trap, handled by one feature.

The math

Typical talking-head dead air: 12–18%

10-min footage → silence ~1.5 min

45-min footage → silence ~6–8 min

Manual cut time per final minute: 3–5 min
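The figures above are plain proportions, so they are easy to check for your own footage. The sketch below is illustrative arithmetic only, not Sapari code; the function names are made up for this example, and the ranges are the ones quoted above.

```python
def estimate_dead_air(footage_min, dead_air_share=(0.12, 0.18)):
    """Return the (low, high) minutes of silence expected in a recording.

    dead_air_share is the 12-18% range quoted for talking-head footage.
    """
    low, high = dead_air_share
    return round(footage_min * low, 1), round(footage_min * high, 1)


def estimate_manual_cut_time(final_min, min_per_final_min=(3, 5)):
    """Return the (low, high) minutes of manual cutting for a final length.

    min_per_final_min is the 3-5 minutes of work per final minute quoted above.
    """
    low, high = min_per_final_min
    return final_min * low, final_min * high


if __name__ == "__main__":
    print(estimate_dead_air(10))         # silence range for 10 minutes of footage
    print(estimate_dead_air(45))         # silence range for 45 minutes of footage
    print(estimate_manual_cut_time(30))  # manual work for a 30-minute final cut
```

For 45 minutes of footage this gives roughly 5.4 to 8.1 minutes of silence, in line with the ~6–8 min figure. A 30-minute cut comes out at 90 to 150 minutes of manual work.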

How it works

Two signals.
Better than either alone.

01 · Transcribe

Word-level audio

Sapari runs the audio through speech-to-text with word-level timing.

02 · Detect

Word gap + acoustic

Combines gaps between spoken words and acoustic silence. Catches paused speech and buried filler.
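For the curious, here is one way two signals like these could be combined. This is a minimal sketch under stated assumptions, not Sapari's actual detector: it assumes word timings arrive as start and end seconds, that acoustic analysis yields a list of quiet intervals, and that a silence counts only where a word gap and a quiet interval overlap. All names here are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    text: str
    start: float  # seconds
    end: float    # seconds


def word_gaps(words, min_gap):
    """Gaps between consecutive words that last at least min_gap seconds."""
    gaps = []
    for prev, nxt in zip(words, words[1:]):
        if nxt.start - prev.end >= min_gap:
            gaps.append((prev.end, nxt.start))
    return gaps


def detect_silences(words, quiet_intervals, min_gap=0.4):
    """Keep only the parts of word gaps that are also acoustically quiet.

    Assumption: intersecting the two signals is how they are combined.
    A gap with sound in it (a filler noise the transcript missed) is
    trimmed down to its quiet part or dropped.
    """
    silences = []
    for gap_start, gap_end in word_gaps(words, min_gap):
        for quiet_start, quiet_end in quiet_intervals:
            start = max(gap_start, quiet_start)
            end = min(gap_end, quiet_end)
            if end - start >= min_gap:
                silences.append((start, end))
    return silences


if __name__ == "__main__":
    words = [Word("hello", 0.0, 0.5), Word("world", 2.0, 2.4), Word("again", 2.5, 3.0)]
    quiet = [(0.6, 1.9), (2.4, 2.5)]
    print(detect_silences(words, quiet))
```

In the sample data, the long pause after "hello" survives, trimmed to its quiet part. The short gap before "again" is under the threshold and is left alone.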

03 · Cut

Reviewable cards

Every silence becomes a card on the timeline. Keep it, dismiss it, or drag the boundary.

Controls

One slider.
Off to aggressive.

Pacing slider

Off · Natural · Balanced · Hyper

Off

Keep natural pauses. Threshold: 3000 ms.

Natural / Podcast

Preserves breath and rhythm.

Balanced

The default. Tightens without rushing.

Hyper / TikTok

Removes anything that isn't speech. Threshold: 400 ms.

You also get edge padding — how much silence to leave around each speech segment so cuts don't sound choppy. Higher pacing reduces padding; lower pacing keeps it.
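Edge padding is easiest to picture as shrinking each detected silence from both sides before cutting. The sketch below assumes exactly that and is not Sapari's implementation; the function names and the 0.15-second padding value are hypothetical, chosen only for illustration.

```python
def apply_edge_padding(silences, padding):
    """Shrink each (start, end) silence by `padding` seconds on both sides.

    The padding stays in the video as breathing room around speech.
    Silences too short to survive the padding are not cut at all.
    """
    cuts = []
    for start, end in silences:
        cut_start, cut_end = start + padding, end - padding
        if cut_end > cut_start:
            cuts.append((cut_start, cut_end))
    return cuts


def time_saved(cuts):
    """Total seconds removed by a list of (start, end) cuts."""
    return sum(end - start for start, end in cuts)


if __name__ == "__main__":
    silences = [(0.6, 1.9), (3.0, 3.2)]
    cuts = apply_edge_padding(silences, padding=0.15)
    print(cuts)              # the 0.2 s silence is too short to cut
    print(time_saved(cuts))  # seconds removed in total
```

With more padding, fewer and shorter cuts survive, which is why lower pacing settings sound more relaxed.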

The numbers

Fast, long-form,
frame-tight.

10-min video

~3 min

to analyze.

45-min video

~13 min

to analyze.

Max length

No limit

Podcasters run 2-hour episodes through it.

Cut precision

Word-level

Cuts land between words, never mid-syllable.

In the pipeline

One step of one analysis.

Silence removal runs alongside false start detection, caption generation, audio cleanup, and B-roll placement — all from the same pass.

See the full pipeline →

Before you ask

Common questions.

What if I want natural pauses?

Set the slider to Natural/Podcast or turn silence removal off entirely. You keep full control.

Will it cut mid-sentence?

No. Cuts land in gaps between words, not during them. If the speaker paused inside a sentence, Sapari detects the gap but knows the sentence isn't over — you can dismiss that specific cut.

What about breath sounds?

Breath is usually below the word-gap threshold at most pacing settings. At Hyper, it gets cut — which most short-form creators want.

Can I remove silence from a video that's already edited?

Yes. Upload the edited version as a new project and Sapari treats it like any recording.

Does it work on non-English audio?

Yes. Transcription supports English, Spanish, Portuguese, and French. Silence detection is language-independent.

Cut the dead air.
Keep the rest.

7 days. 30 AI minutes. No credit card.