Captions in the same pass as the edit.
Sapari transcribes your audio word by word, aligns each word to its timestamp, and generates captions that stay in sync after every cut. The caption margin adapts automatically to each aspect ratio.
7 days · 30 AI minutes · No credit card
Why captions matter
Most platforms auto-mute on scroll.
A video without captions loses viewers in the first second. Doing them manually means syncing word by word, and nobody has time for that. Running a separate captioning tool means another upload, another export, another sync issue.
Sapari generates captions as part of the main analysis. No separate step.
How it works
Word-by-word.
Locked to time.
Speech-to-text
Word-level precision. Every word recognized with start and end timestamps.
Bound to the millisecond
Every word is locked to its exact start and end timestamps. Captions stay in sync after silence and false-start cuts.
Burn or toggle
Captions burn into the export or stay toggleable based on your setting.
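To illustrate how word-level timestamps let captions stay in sync after cuts, here is a minimal sketch. It is not Sapari's actual implementation: the `Word` type, the `remap_words` function, and the cut format (non-overlapping `(start_ms, end_ms)` ranges on the original timeline) are all assumptions made for the example. It also takes the simplest policy of dropping any word that overlaps a cut.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    """One recognized word with its timestamps (hypothetical type)."""
    text: str
    start_ms: int
    end_ms: int


def remap_words(words, cuts):
    """Move word timestamps from the original timeline to the edited one.

    `cuts` is a list of (start_ms, end_ms) ranges removed from the
    original timeline, assumed not to overlap each other.
    """
    cuts = sorted(cuts)
    remapped = []
    for w in words:
        # Drop any word that overlaps a removed range.
        if any(w.start_ms < c_end and w.end_ms > c_start for c_start, c_end in cuts):
            continue
        # Shift the word left by the total duration cut before it starts.
        removed = sum(c_end - c_start for c_start, c_end in cuts if c_end <= w.start_ms)
        remapped.append(Word(w.text, w.start_ms - removed, w.end_ms - removed))
    return remapped


words = [Word("So", 0, 200), Word("um", 300, 500), Word("hello", 1500, 1900)]
cuts = [(250, 550), (600, 1400)]  # a false start, then a silence
print(remap_words(words, cuts))
```

In this example "um" falls inside the first cut and is dropped, and "hello" moves 1100 ms earlier, so the caption still lines up with the spoken word in the edited video.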
Styling
What you can change.
Save your preferred combination as a preset and reuse it across projects.
Margin auto-adapts
One setting, three ratios.
Sapari scales the caption's vertical margin per aspect ratio so captions clear platform UI on 9:16, with no separate position settings to adjust for each export.
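One way to picture a single margin setting that scales per aspect ratio is the sketch below. Everything in it is an assumption for illustration: the `MARGIN_SCALE` multipliers, the 16:9 and 1:1 entries, and the `caption_margin_px` function are hypothetical, and Sapari's real values are not stated here.

```python
# Hypothetical multipliers: vertical video gets the largest margin so
# captions sit above platform UI at the bottom of the frame.
MARGIN_SCALE = {"16:9": 1.0, "1:1": 1.5, "9:16": 2.5}


def caption_margin_px(base_margin_pct, frame_height_px, aspect_ratio):
    """Return the bottom caption margin in pixels for one export.

    `base_margin_pct` is the single user setting, expressed as a
    percentage of frame height before per-ratio scaling.
    """
    scale = MARGIN_SCALE[aspect_ratio]
    return round(frame_height_px * base_margin_pct / 100 * scale)


print(caption_margin_px(5, 1080, "16:9"))  # 54
print(caption_margin_px(5, 1920, "9:16"))  # 240
```

The same 5% setting yields a larger margin on the vertical export, which is the behavior the section describes.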
Languages
Four today.
Caption generation uses the detected language from your audio.
Profanity in captions
Three modes.
Independent of audio censoring. Bleep audio and leave captions clean, or the other way around.
In the pipeline
Captions move with everything else.
Credit cost
Half a credit if that's all you need.
Full AI pipeline (captions + silence + false starts + audio + B-roll).
Captions-only mode (transcription + captions).
Manual editing and rendering. Always free.
Before you ask
Common questions.
Are captions included in every plan?
Yes, including the free trial. They're part of the analysis, not a premium add-on.
Can I edit the text after generation?
Yes. Every caption line is editable: fix a misheard word, change capitalization, tweak punctuation.
Do captions work with thick accents or background noise?
Accent support is good; heavy background noise hurts accuracy. Running Clean Sweep in the same analysis helps, because it denoises before transcription.
Can I download the transcript separately?
SRT export is on the roadmap. Today, captions render into the exported video.
What's the maximum video length?
Same as the rest of Sapari: no hard upper limit. Long-form podcasters run 2-hour episodes.