How to Clean Up Zoom and Remote Interview Recordings
Remote work changed how audio content gets made. Podcasts, video interviews, webinars, online courses, documentary footage — all of it now routinely involves participants recording from their homes, offices, and wherever else life takes them.
The result: hours and hours of audio that sounds like what it is — compressed, inconsistent, recorded over internet connections in rooms that were never designed for recording.
But bad Zoom audio isn't unfixable. This guide covers exactly what causes the problems, what you can realistically fix, and the most effective tools and techniques for cleaning up remote interview recordings.
Why Zoom Recordings Sound the Way They Do
Internet Compression
Zoom, Teams, Google Meet, and every other video conferencing platform compress audio in real time so it can travel over internet connections. This compression discards audio information that the codec considers non-essential. The result is the characteristic "Zoom sound": slightly robotic, dulled, with a certain flatness.
When you record from Zoom's built-in recorder, you're capturing this already-compressed audio. The compression artifacts are baked in.
The fix at source: Use platforms designed for podcast/interview recording that capture each participant's audio locally (before compression) and sync the tracks afterward. Riverside.fm, SquadCast, and Zencastr all do this. The recordings sound dramatically better because you're getting locally recorded, uncompressed audio (typically 48kHz WAV, depending on the platform and plan) from each participant rather than Zoom's compressed output.
But that's prevention. If you already have Zoom recordings you need to fix, read on.
Participant Microphones
Most Zoom participants aren't using quality microphones. They're using:
- Laptop built-in microphones (capture everything in the room equally)
- Earbuds with inline mics (tinny, distance from mouth varies)
- Webcam built-in microphones (often positioned below the screen, facing away from the mouth)
- Bluetooth headsets (apply their own compression before the audio even hits Zoom)
Each of these creates different audio problems, and in a multi-person recording, you have multiple participants with different setups — meaning inconsistent audio quality throughout the recording.
Room Acoustics
Home office environments are generally acoustically terrible for recording. Hard floors, bare walls, glass windows — all highly reflective surfaces that create echo and reverb. You can hear a person's room character clearly in most Zoom calls: some sound boxy, some sound like they're in a bathroom, some are heavily echoed.
Zoom's Automatic Processing
Zoom applies its own automatic noise suppression and automatic gain control. These features help in a casual meeting but often damage audio quality for professional use:
- Automatic gain control constantly adjusts volume, creating pumping and inconsistent levels
- Noise suppression can create artifacts when music or other non-speech sound is present
- Echo cancellation can sometimes remove parts of the voice signal along with the echo
What You Can Realistically Fix in Zoom Audio
You can fix:
- Background noise (AC, traffic, computer fans)
- Mild-to-moderate room echo and reverb
- Inconsistent volume between participants
- Electrical hum (50/60Hz)
- Pops and clicks
- Sibilance and harshness
You can partially improve:
- Moderate internet compression artifacts (reduce but not eliminate)
- Voice recorded at significant distance from mic
- Heavy room reverb (reduce significantly, residual artifacts remain)
You cannot fix:
- Severe internet glitching where audio cuts out or becomes unintelligible
- Overlapping speech in recordings where multiple people share one track (the voices cannot be cleanly separated or processed individually)
- Audio so heavily compressed that original signal detail is gone
Tools for Cleaning Up Zoom Recordings
iZotope RX (Most Effective)
iZotope RX is the industry standard for dialogue cleanup. For Zoom recordings specifically:
De-noise: Reduces consistent background noise. The adaptive mode continuously tracks the noise floor — better than profile-based tools for variable noise.
De-reverb: Specifically targets and reduces room echo. RX's de-reverb is far more effective than anything built into video editors.
Dialogue Isolation: Uses machine learning to separate the speech signal from background. On difficult recordings, this can be transformative.
De-hum: Removes electrical interference at 50/60Hz fundamental and harmonics without affecting voice.
Adobe Podcast Enhance (Free)
Available at podcast.adobe.com. Upload your audio, download a cleaned version. The AI processing is impressive for a free tool: it separates speech from noise and reduces background noise significantly.
Descript Studio Sound
If you're using Descript for transcription and editing, its Studio Sound feature applies one-click AI enhancement.
Audacity (Free, Basic)
For Zoom recordings with straightforward background noise:
- Find and select a section with pure background noise (no one speaking)
- Effect → Noise Reduction → Get Noise Profile
- Select all audio → Effect → Noise Reduction → Apply (12–15dB, sensitivity 6)
- Effect → Compressor to even out levels
- Effect → Normalize
Step-by-Step: Cleaning a Multi-Person Zoom Recording
Before You Start
Separate the tracks. If possible, get individual track recordings. Zoom can record a separate audio file for each participant, but the option must be enabled in the recording settings before the call; it cannot be applied afterward.
Identify each participant's problems. One participant might have AC hum; another might be in an echoing room; a third might be quiet and distant. Treating each track separately gives much better results.
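One way to check whether a track's hum sits at 50Hz or 60Hz is to compare the energy at those two frequencies in a speech-free stretch. The sketch below uses the Goertzel algorithm for that. The function names are hypothetical, it assumes samples are floats in the -1.0 to 1.0 range, and it only compares the two candidates; it does not confirm that hum is actually present.

```python
import math

def goertzel_power(samples, sample_rate, freq):
    """Normalized power at the DFT bin nearest `freq` (Goertzel algorithm)."""
    n = len(samples)
    k = round(n * freq / sample_rate)      # nearest DFT bin index
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    power = s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2
    return power / (n * n)

def guess_hum_frequency(samples, sample_rate):
    """Return 50 or 60, whichever candidate carries more energy."""
    p50 = goertzel_power(samples, sample_rate, 50.0)
    p60 = goertzel_power(samples, sample_rate, 60.0)
    return 50 if p50 > p60 else 60

# Synthetic example: one second of weak 60Hz hum under a louder 300Hz tone.
fs = 8000
track = [0.1 * math.sin(2 * math.pi * 60 * i / fs)
         + 0.5 * math.sin(2 * math.pi * 300 * i / fs) for i in range(fs)]
print(guess_hum_frequency(track, fs))  # prints 60
```

Run it on a second or more of room tone rather than on speech, since voiced speech also carries energy in this range.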
Processing Each Track
Step 1: High-pass filter
Cut everything below 80–100Hz. This removes rumble, traffic vibration, and other low-frequency noise.
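For readers who want to see what this step does under the hood, here is a minimal sketch of a second-order high-pass filter using the widely published "Audio EQ Cookbook" biquad formulas. It is an illustration in plain Python, not how any particular editor implements its filter, and the function names are hypothetical.

```python
import math

def highpass_coeffs(cutoff_hz, sample_rate, q=0.7071):
    """Biquad high-pass coefficients (Audio EQ Cookbook), normalized so a[0] == 1."""
    w0 = 2.0 * math.pi * cutoff_hz / sample_rate
    alpha = math.sin(w0) / (2.0 * q)
    cos_w0 = math.cos(w0)
    a0 = 1.0 + alpha
    b = [(1.0 + cos_w0) / 2.0 / a0, -(1.0 + cos_w0) / a0, (1.0 + cos_w0) / 2.0 / a0]
    a = [1.0, -2.0 * cos_w0 / a0, (1.0 - alpha) / a0]
    return b, a

def biquad(samples, b, a):
    """Run samples through one biquad section (direct form I)."""
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b[0] * x + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out

# Example: filter a track (list of floats) at 80Hz, assuming 48kHz audio.
b, a = highpass_coeffs(80.0, 48000)
filtered = biquad([0.0, 0.5, 0.25, -0.25], b, a)
```

A single section like this rolls off at about 12dB per octave, so content just below the cutoff is reduced rather than removed outright. Steeper slopes come from cascading sections.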
Step 2: De-noise
Apply noise reduction conservatively — start at 50–60% of maximum strength. Check for artifacts.
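To put a number on "conservatively", it can help to measure the noise floor of the same speech-free section before and after processing. The sketch below assumes samples are floats in the -1.0 to 1.0 range and reports RMS level in dB relative to full scale (with this convention a full-scale sine reads about -3dB). The function names are hypothetical.

```python
import math

def rms_dbfs(samples):
    """RMS level of a block of samples in dB relative to full scale (1.0)."""
    if not samples:
        raise ValueError("need at least one sample")
    mean_square = sum(x * x for x in samples) / len(samples)
    if mean_square == 0.0:
        return float("-inf")
    return 10.0 * math.log10(mean_square)

def noise_reduction_db(before, after):
    """How many dB the noise floor dropped between two versions of a silent section."""
    return rms_dbfs(before) - rms_dbfs(after)

# Example with made-up values: the same room-tone section before and after de-noise.
room_tone_before = [0.01, -0.012, 0.009, -0.011]
room_tone_after = [0.003, -0.004, 0.003, -0.003]
print(round(noise_reduction_db(room_tone_before, room_tone_after), 1))
```

If the measured drop is large and the voice starts to sound watery or metallic, back the strength off and measure again.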
Step 3: De-reverb (if needed)
For participants with noticeable room echo, apply de-reverb at moderate settings.
Step 4: De-hum (if needed)
For tracks with a constant tonal hum, apply de-hum targeting 50Hz (Europe) or 60Hz (North America).
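Conceptually, de-hum is a set of very narrow notch filters at the hum fundamental and its harmonics. The sketch below shows that idea with cookbook-style biquad notches in plain Python. It is not how iZotope RX or any other product implements de-hum, and the function names, the Q of 30, and the harmonic count are illustrative assumptions.

```python
import math

def notch_coeffs(freq_hz, sample_rate, q=30.0):
    """Biquad notch coefficients (Audio EQ Cookbook), normalized so a[0] == 1."""
    w0 = 2.0 * math.pi * freq_hz / sample_rate
    alpha = math.sin(w0) / (2.0 * q)
    cos_w0 = math.cos(w0)
    a0 = 1.0 + alpha
    b = [1.0 / a0, -2.0 * cos_w0 / a0, 1.0 / a0]
    a = [1.0, -2.0 * cos_w0 / a0, (1.0 - alpha) / a0]
    return b, a

def biquad(samples, b, a):
    """Run samples through one biquad section (direct form I)."""
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b[0] * x + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out

def remove_hum(samples, sample_rate, fundamental_hz=60.0, harmonics=4, q=30.0):
    """Cascade notches at the fundamental and its first few harmonics."""
    out = list(samples)
    for h in range(1, harmonics + 1):
        freq = fundamental_hz * h
        if freq >= sample_rate / 2.0:   # stay below the Nyquist frequency
            break
        b, a = notch_coeffs(freq, sample_rate, q)
        out = biquad(out, b, a)
    return out
```

Call it as `remove_hum(track, 48000, fundamental_hz=50.0)` for a 50Hz region. A higher Q makes each notch narrower, which touches less of the voice but takes longer to settle at the start of the file.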
Step 5: EQ
- Cut: resonant peaks in the 200–400Hz range (reduces boxiness)
- Boost: gentle presence boost at 3–5kHz (improves intelligibility)
Step 6: Compression
Brings up quieter moments and controls loud peaks. 2:1 to 3:1 ratio with moderate attack works well for interview speech.
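The ratio is easier to reason about with a worked example. This sketch computes only the static input-to-output curve of a compressor, ignoring attack and release. The -18dB default threshold is borrowed from the quick reference table, and the function name is hypothetical.

```python
def compressed_level_db(input_db, threshold_db=-18.0, ratio=2.0, makeup_db=0.0):
    """Output level for a given input level on a hard-knee compressor curve."""
    if input_db <= threshold_db:
        level = input_db                      # below threshold: unchanged
    else:
        # above threshold: every `ratio` dB of input yields 1 dB of output
        level = threshold_db + (input_db - threshold_db) / ratio
    return level + makeup_db

print(compressed_level_db(-6.0))              # 2:1 -> -12.0
print(compressed_level_db(-6.0, ratio=3.0))   # 3:1 -> -14.0
print(compressed_level_db(-30.0))             # below threshold -> -30.0
```

A peak 12dB over the threshold comes out 6dB over at 2:1 and 4dB over at 3:1. The quieter moments come up only when makeup gain is added afterward, which is what `makeup_db` stands in for here.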
Step 7: Level matching
Bring all tracks to a consistent loudness before mixing. Aim for -18 to -16 LUFS per track.
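Once a loudness meter has given you an integrated LUFS reading per track, the matching gain is simple arithmetic. The sketch below assumes the readings come from your meter (it does not measure loudness itself), and the track names and values are made up for illustration.

```python
def gain_to_target_db(measured_lufs, target_lufs=-16.0):
    """Static gain in dB that moves a track from its measured loudness to the target."""
    return target_lufs - measured_lufs

def db_to_linear(gain_db):
    """Convert a dB gain to the multiplier applied to each sample."""
    return 10.0 ** (gain_db / 20.0)

# Hypothetical meter readings for three participants.
readings = {"host": -19.5, "guest_1": -24.0, "guest_2": -14.0}
for name, lufs in readings.items():
    gain = gain_to_target_db(lufs, target_lufs=-17.0)
    print(f"{name}: {gain:+.1f} dB (x{db_to_linear(gain):.2f})")
```

A static gain change shifts integrated loudness by essentially the same amount, so one measurement per track is usually enough. Measure again after compression, since compression changes loudness.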
Step 8: Final limiter
A brickwall limiter with its ceiling set to -1dB (measured as true peak if your limiter offers it) prevents clipping in the final mix.
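To illustrate what the ceiling means, here is a deliberately simple limiter sketch: instant attack, exponential release, no lookahead. It works on sample peaks only, so it does not account for inter-sample (true) peaks, and production limiters are considerably more refined. The function name and the 50ms release are assumptions.

```python
import math

def limit(samples, sample_rate, ceiling_db=-1.0, release_ms=50.0):
    """Keep every output sample at or below the ceiling (sample peak, not true peak)."""
    ceiling = 10.0 ** (ceiling_db / 20.0)      # -1 dB is roughly 0.891 linear
    release = math.exp(-1.0 / (sample_rate * release_ms / 1000.0))
    gain = 1.0
    out = []
    for x in samples:
        peak = abs(x)
        needed = ceiling / peak if peak > ceiling else 1.0
        if needed < gain:
            gain = needed                                 # clamp immediately
        else:
            gain = needed + (gain - needed) * release     # recover gradually
        out.append(x * gain)
    return out

limited = limit([0.5, 1.2, -1.5, 0.3], 48000)
print(max(abs(s) for s in limited) <= 10.0 ** (-1.0 / 20.0) + 1e-12)  # prints True
```

Because the gain never exceeds what the current sample needs, the output cannot pass the ceiling. The trade-off for having no lookahead is more audible distortion on hard peaks.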
Remote Interview Audio for Different Use Cases
Podcast
Standard target: -16 LUFS integrated (stereo) or -19 LUFS (mono). Export as MP3 at 128kbps (mono) or 192kbps (stereo).
YouTube and Video
Target: around -14 LUFS. YouTube turns louder uploads down to roughly this level but does not turn quieter ones up. Export as WAV for the video edit.
Corporate Video / Training Content
-23 LUFS is the EBU R128 broadcast standard. Check with the distributor for their specific requirements.
Transcription and Legal Use
If the recording is needed for transcription or legal purposes, clarity is everything. Focus on de-noise and de-reverb to maximize speech intelligibility.
When to Send Zoom Recordings to a Professional
Remote interview audio cleanup is time-intensive. For regular production — weekly podcasts, ongoing series, corporate video — the math changes:
- A typical 1-hour Zoom interview requires 1–2 hours of cleanup time with decent setups, 3–4+ hours if setups were poor
- Professional cleanup services often cost less than the time value of doing it yourself
- Consistency is harder to maintain DIY — professional services deliver standardized results file after file
WefixSound cleans up Zoom and remote interview recordings with a free sample before commitment. Upload your most difficult track, get the first 60 seconds back clean, and decide based on actual results. Bulk rates are available for ongoing podcast or video production work.
Quick Reference: Zoom Audio Cleanup
| Problem | Tool | Settings |
|---|---|---|
| Background noise | De-noise | Start at 50–60% strength, adaptive |
| Room echo | De-reverb | Low-medium reduction |
| Electrical hum | De-hum | 60Hz (US) or 50Hz (EU) + harmonics |
| Volume inconsistency | Compression + leveling | 2:1 to 3:1 ratio, -18dB threshold |
| Boomy/boxy voice | EQ | Cut 200–400Hz |
| Unclear speech | EQ | Boost 3–5kHz gently |