Inconsistent loudness is the biggest giveaway.
The biggest giveaway that a recording is from a creator and not a studio is inconsistent loudness. A viewer scrolls past your video and it's quieter than the one before. They bump the volume up. The next video auto-plays louder and they wince. They remember yours as the quiet one.
Every major streaming platform normalizes audio to a target loudness, measured in LUFS (Loudness Units relative to Full Scale). For YouTube and Spotify, that target is about −14 LUFS.
Most raw creator recordings land anywhere from −28 LUFS (too quiet) to −8 LUFS (too loud, often clipped). Normalizing every video to −14 LUFS is the single change that most reliably makes your audio sound consistent with everything else in the feed. The goal is not simply to make your video louder; it is to match what the platform expects, so your video plays at the same perceived level as everything around it.
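The arithmetic behind normalizing to a target is simple enough to sketch. The example below assumes you already have an integrated loudness reading from a meter; the function names are illustrative and not part of any particular tool. A static gain shifts the loudness reading by the same number of decibels, so the required gain is just the difference between target and measurement.

```python
TARGET_LUFS = -14.0  # the target used throughout this article


def gain_db_to_target(measured_lufs, target_lufs=TARGET_LUFS):
    """Static gain (in dB) that moves a recording onto the target loudness."""
    return target_lufs - measured_lufs


def db_to_linear(gain_db):
    """Convert a dB gain to the amplitude multiplier applied to each sample."""
    return 10 ** (gain_db / 20)


def needs_limiting(peak_dbfs, gain_db, ceiling_dbfs=-1.0):
    """True if applying the gain would push the peak above the ceiling.

    The -1.0 dBFS ceiling is an assumed safety margin, not a platform rule.
    """
    return peak_dbfs + gain_db > ceiling_dbfs


quiet_gain = gain_db_to_target(-28.0)  # quiet recording: needs a boost
loud_gain = gain_db_to_target(-8.0)    # hot recording: needs a cut

print(f"quiet: {quiet_gain:+.1f} dB (x{db_to_linear(quiet_gain):.2f})")
print(f"loud:  {loud_gain:+.1f} dB (x{db_to_linear(loud_gain):.2f})")
print("quiet clip needs a limiter:", needs_limiting(-10.0, quiet_gain))
```

The last line is the practical catch: boosting a quiet recording whose peaks already sit near full scale would clip, which is why normalizers usually pair gain with a limiter.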
Don't count on the platform to turn you up. YouTube turns down videos that are too loud, but it doesn't turn up videos that are too quiet. Spotify can raise quiet audio, but only as far as the track's peak headroom allows. If your −24 LUFS recording plays next to a −14 LUFS one on YouTube, yours stays quieter. Normalize before uploading.
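A simplified model of downward-only playback normalization makes the asymmetry concrete. This is a sketch of the behavior described above, not any platform's actual implementation; the function names and the example feed are made up for illustration.

```python
def playback_gain_db(measured_lufs, target_lufs=-14.0):
    """Downward-only policy: attenuate loud uploads, never boost quiet ones."""
    return min(0.0, target_lufs - measured_lufs)


def played_back_lufs(measured_lufs, target_lufs=-14.0):
    """Loudness the viewer actually hears after the platform's adjustment."""
    return measured_lufs + playback_gain_db(measured_lufs, target_lufs)


# Hypothetical feed: three uploads at different measured loudness levels.
feed = {"quiet upload": -24.0, "on target": -14.0, "hot upload": -8.0}
played = {name: played_back_lufs(lufs) for name, lufs in feed.items()}

for name, lufs in played.items():
    print(f"{name}: plays at {lufs:.1f} LUFS")
```

Under this model the hot upload is pulled down to the target, while the quiet upload is left where it was, 10 LU below its neighbors.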
Your recording has a noise floor.
Your recording has a noise floor whether you hear it or not. HVAC, laptop fans, the faint hiss of a cheap preamp, keyboard clicks. All of it sits under your voice, and the viewer's brain registers it as "cheap" even when they can't consciously name what's wrong.
Denoising lowers the noise floor while leaving the voice largely intact. Tools that do this well (iZotope RX, Adobe Enhance Speech, Sapari's Clean Sweep, and similar) estimate the ambient noise and remove it from the speech signal, classically with FFT-based processing. At default settings the difference is subtle; at aggressive settings the voice picks up audible artifacts and sounds like someone speaking through a filter.
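For intuition, here is a toy version of the classic technique, spectral subtraction: measure the average noise spectrum from a stretch of room tone, then subtract that magnitude from every frame of the recording. It is a deliberately minimal sketch (naive DFT, tiny non-overlapping frames, a sine wave standing in for a voice) and is not how any of the named products are implemented. The `strength` parameter is a hypothetical stand-in for an "aggressiveness" control.

```python
import cmath
import math
import random

FRAME = 64  # samples per frame; tiny so the naive O(n^2) DFT stays fast


def dft(frame):
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]


def frames(samples):
    return [samples[i:i + FRAME] for i in range(0, len(samples) - FRAME + 1, FRAME)]


def noise_profile(noise_samples):
    """Average magnitude per frequency bin across frames of pure room tone."""
    spectra = [dft(f) for f in frames(noise_samples)]
    return [sum(abs(s[k]) for s in spectra) / len(spectra) for k in range(FRAME)]


def denoise(samples, profile, strength=1.0):
    """Subtract the noise magnitude from each bin, keeping the original phase."""
    out = []
    for f in frames(samples):
        cleaned = []
        for k, x in enumerate(dft(f)):
            mag = max(abs(x) - strength * profile[k], 0.0)  # never go negative
            cleaned.append(cmath.rect(mag, cmath.phase(x)))
        out.extend(idft(cleaned))
    return out


def rms(samples):
    return math.sqrt(sum(s * s for s in samples) / len(samples))


rng = random.Random(0)  # fixed seed so the demo is repeatable
room_tone = [rng.uniform(-0.05, 0.05) for _ in range(FRAME * 8)]
speech = [0.5 * math.sin(2 * math.pi * 4 * t / FRAME) + rng.uniform(-0.05, 0.05)
          for t in range(FRAME * 8)]

profile = noise_profile(room_tone)
cleaned_room = denoise(room_tone, profile)
cleaned_speech = denoise(speech, profile)

print(f"room tone RMS: {rms(room_tone):.4f} -> {rms(cleaned_room):.4f}")
print(f"speech RMS:    {rms(speech):.4f} -> {rms(cleaned_speech):.4f}")
```

Raising `strength` above 1.0 removes more of the noise but also bites into the signal, which is the same trade-off as the aggressive settings described above.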
The test is A/B comparison with good headphones. If the denoised version sounds tighter without sounding processed, you're at the right aggressiveness. If it sounds too clean (like studio audio that lost the texture of your room), back off.
Consistency between hosts and guests.
If you do interviews, volume mismatch between participants is the third biggest thing that reads as amateur. One person recorded on a Shure SM7B locally, the other on their laptop mic on Zoom. Without normalization, the viewer adjusts their volume every time someone speaks.
The fix is running both tracks through the same loudness target. If you're capturing in Riverside or Zencastr with separate tracks per guest, normalize each to −14 LUFS independently and the mix balances itself. If you only have a mixed recording, tools that do per-segment normalization can partially correct the mismatch, though separate tracks are always better.
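The per-track idea can be sketched in a few lines. Note the loud assumption: this example uses plain RMS level in dBFS as a stand-in for loudness, because a real LUFS measurement (ITU-R BS.1770) adds frequency weighting and gating that would not fit in a short example. The sine-wave "tracks" and function names are illustrative only.

```python
import math


def rms_dbfs(samples):
    """RMS level in dBFS. A simplified stand-in for LUFS, not a real meter."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    if rms == 0:
        raise ValueError("silent track has no defined level")
    return 20 * math.log10(rms)


def normalize(samples, target_db=-14.0):
    """Apply one static gain so the track's RMS level lands on the target."""
    gain = 10 ** ((target_db - rms_dbfs(samples)) / 20)
    return [s * gain for s in samples]


def tone(amplitude, n=4800, freq=220.0, rate=48000):
    """Test signal standing in for a recorded voice track."""
    return [amplitude * math.sin(2 * math.pi * freq * t / rate) for t in range(n)]


host = tone(0.6)    # hot local recording
guest = tone(0.05)  # quiet laptop mic

host_norm = normalize(host)
guest_norm = normalize(guest)

print(f"before: host {rms_dbfs(host):.1f} dB, guest {rms_dbfs(guest):.1f} dB")
print(f"after:  host {rms_dbfs(host_norm):.1f} dB, guest {rms_dbfs(guest_norm):.1f} dB")
```

Because each track gets its own gain, a gap of more than 20 dB between host and guest collapses to zero without anyone touching a fader.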
How to do it in Sapari.
Clean Sweep is a single toggle that runs denoising plus loudness normalization to −14 LUFS on every clip in the project. It runs in the same analysis pass as silence removal and captioning.
Upload the recording
MP4, MOV, or any common video format.
Toggle Clean Sweep on
Single toggle in the analysis settings. Runs FFT denoising plus EBU R128 normalization to −14 LUFS.
Let the pass run
Output is normalized, denoised audio across every clip in the project. Runs in the same analysis pass as silence removal and captioning.
Review in the preview
If anything sounds over-processed, toggle Clean Sweep off at the project level and re-export.
For separate-track podcasts (Riverside, Zencastr), per-track processing is on the roadmap but not live today. Mix down before uploading.
Common questions.
Will denoising make my voice sound robotic?
At default settings, no. Denoising at high aggressiveness introduces artifacts, especially on voices recorded near a loud noise source. Sapari's default is tuned to preserve speech transients.
What mic should I use?
A capable mic genuinely matters. A Blue Yeti, Shure MV7, or SM7B beats a built-in laptop mic by a wide margin. Dynamic mics like the SM7B pick up less room sound, which makes post-production easier. The mic isn't the only factor, but mics aren't interchangeable either.
Why do I still need to normalize if platforms normalize on playback?
Platform normalization reliably turns loud content down, but not quiet content up. If your recording is below −14 LUFS, YouTube leaves it there, and Spotify raises it only as far as the track's peak headroom allows.
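If you want to normalize before upload from a script, one option is ffmpeg's `loudnorm` filter. The sketch below assumes ffmpeg is installed and on your PATH; the file names are hypothetical, and the true-peak and loudness-range values are example settings rather than recommendations from any platform. It only builds the command by default, so nothing runs until you call `normalize_file` yourself.

```python
import subprocess


def loudnorm_command(src, dst, target_lufs=-14.0, true_peak_db=-1.5, lra=11.0):
    """Build an ffmpeg command that normalizes audio and copies video untouched."""
    audio_filter = f"loudnorm=I={target_lufs}:TP={true_peak_db}:LRA={lra}"
    return ["ffmpeg", "-i", src, "-af", audio_filter, "-c:v", "copy", dst]


def normalize_file(src, dst):
    """Run the command; raises CalledProcessError if ffmpeg reports a failure."""
    subprocess.run(loudnorm_command(src, dst), check=True)


# Hypothetical file names, shown only to illustrate the resulting command.
cmd = loudnorm_command("episode.mp4", "episode_normalized.mp4")
print(" ".join(cmd))
```

This is the single-pass form of the filter, which adjusts gain as it goes. ffmpeg also supports a two-pass workflow that measures first and then applies a more exact correction.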
Can I do this in CapCut or Premiere?
Yes. Premiere has Essential Sound with loudness metering. CapCut has basic normalization. The question is whether you want to learn and run those tools per clip, or run a single toggle per project.