How to Clean Up Interview Audio: Noise, Echo, and Clarity Issues Solved

Interview recordings suffer from background noise, echo, inconsistent levels, and bad guest setups. This guide covers every technique for cleaning up interview audio for podcasts, journalism, and documentary work.

May 12, 2025 · 8 min read · By WefixSound Engineers


Interview recordings are the backbone of podcasting, journalism, documentary filmmaking, and corporate content. They're also consistently some of the most challenging audio to clean up.

The fundamental problem: you can control your own recording setup. You can't control your interview subject's environment, equipment, acoustics, or how they position themselves relative to a microphone.

This guide covers every aspect of cleaning up interview audio — whether it's a remote Zoom call, an in-person sit-down, a phone conversation, or archival interview footage.


Types of Interview Recordings and Their Specific Challenges

Remote Video Call Interviews (Zoom, Teams, Riverside)

The dominant interview format today. Challenges:

  • Internet compression applied to audio at both ends
  • Guest using laptop internal microphone or earbuds
  • Guest's room acoustics (usually untreated home or office)
  • Zoom's own processing (automatic gain control, noise suppression) interfering with audio
  • Inconsistent levels between host and guest

In-Person Sit-Down Interviews

Location sound with controlled microphones. Challenges:

  • Location noise (traffic, HVAC, ambient environment)
  • Room acoustics at the interview location
  • Distance from subject if boom mic is used
  • Wind noise for exterior interviews
  • Competing sounds from the interview location

Phone Interviews

Narrowband audio (300Hz–3.4kHz) with heavy compression. Challenges:

  • Inherent bandwidth limitation
  • Compression artifacts from phone network
  • Connection quality variation
  • Background noise from the subject's environment

Archival and Historical Interviews

Often recorded on obsolete formats (tape, film) in conditions not optimized for audio quality. Challenges:

  • Media degradation over time
  • Original equipment limitations
  • Environmental noise of the era

Diagnostic Step: Identify What You're Working With

Before applying any processing, listen carefully to the interview and make notes:

  1. Is there consistent background noise? (AC hum, traffic, computer fan) — fixable with noise reduction
  2. Is there room echo? (hollow, distant sound) — fixable with de-reverb
  3. Are levels inconsistent between speakers? — fixable with compression and level matching
  4. Is there electrical hum? (constant tonal buzz) — fixable with de-hum
  5. Is the voice muffled or lacking clarity? — fixable with EQ
  6. Is there severe audio distortion or compression artifacts? — partially fixable, limits apply

This diagnostic step tells you exactly which tools to reach for and in what order.
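The checklist above can be kept as a small lookup so that listening notes translate directly into a tool list. This is only a sketch: the symptom labels are hypothetical names chosen for the example, and the fixes are the ones named in the checklist.

```python
# Symptom -> tool mapping, taken from the checklist above.
# The keys are hypothetical labels chosen for this sketch.
FIXES = {
    "background_noise": "noise reduction",
    "room_echo": "de-reverb",
    "uneven_levels": "compression and level matching",
    "electrical_hum": "de-hum",
    "muffled_voice": "EQ",
    "distortion": "partial repair only (limits apply)",
}


def tools_for(notes):
    """Return the tool suggested for each symptom noted while listening."""
    unknown = [n for n in notes if n not in FIXES]
    if unknown:
        raise ValueError(f"unrecognized symptom labels: {unknown}")
    return [FIXES[n] for n in notes]


print(tools_for(["electrical_hum", "room_echo"]))
```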


Processing Order: Why It Matters

Audio processing order affects results significantly. Apply in this sequence:

1. High-pass filter (80–100Hz)
Remove subsonic content and low-frequency rumble before anything else. This prevents low-frequency noise from confusing later processes.

2. De-hum (if present)
Remove electrical interference at 50/60Hz and harmonics. A constant tonal hum is best addressed early before compression, which would make the hum louder relative to the signal.

3. Noise reduction
Apply broadband noise reduction to address consistent background noise. Do this before de-reverb — noise can confuse de-reverb algorithms if present at high levels.

4. De-reverb (if needed)
For echoing rooms. After noise reduction, the de-reverb algorithm works on a cleaner signal.

5. EQ
Frequency shaping after noise and echo processing, when the signal is cleaner. Cuts first (remove unwanted frequencies), then boosts (add back needed frequencies).

6. De-esser (if needed)
After EQ, because EQ changes may affect the sibilance profile.

7. Compression
Level management comes near the end of the processing chain, after you've cleaned up what you can.

8. Loudness normalization
Final step. Target the level your distribution platform expects (-16 LUFS for podcasts, -14 LUFS for YouTube, -23 LUFS for broadcast under EBU R128).
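One way to keep the sequence honest is to let code sort whatever steps a recording needs into the order listed above. The sketch below assumes the high-pass filter and loudness normalization are always applied and everything else is optional. That is a simplification made for the example, and the step names are just labels.

```python
# Order from the list above; step names are labels for this sketch only.
CHAIN_ORDER = [
    "high-pass filter",
    "de-hum",
    "noise reduction",
    "de-reverb",
    "EQ",
    "de-esser",
    "compression",
    "loudness normalization",
]

# Loudness targets quoted above, in LUFS.
LOUDNESS_TARGETS = {"podcast": -16.0, "youtube": -14.0, "broadcast": -23.0}


def build_chain(needed, platform):
    """Sort the chosen steps into the recommended order.

    Assumption of this sketch: the high-pass filter and loudness
    normalization are always included; other steps only when requested.
    """
    wanted = set(needed) | {"high-pass filter", "loudness normalization"}
    unknown = wanted - set(CHAIN_ORDER)
    if unknown:
        raise ValueError(f"unknown steps: {sorted(unknown)}")
    chain = [step for step in CHAIN_ORDER if step in wanted]
    return chain, LOUDNESS_TARGETS[platform]


chain, target = build_chain(["compression", "de-hum", "EQ"], "podcast")
print(chain, target)
```

Passing the steps in any order returns them in the recommended sequence, so a de-hum pass can never end up after compression by accident.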


Tool Recommendations by Budget

Free Tier

Audacity: Handles the fundamentals — noise reduction (profile-based), basic EQ, compressor, normalize. Adequate for recordings with straightforward problems in controlled environments.

DaVinci Resolve (free version): Good for interview audio extracted from video. Fairlight audio module includes noise reduction and basic processing.

Adobe Podcast Enhance: Free web tool from Adobe. AI-based processing — upload, get enhanced version. Impressive on mild-to-moderate noise problems. Can over-process.

Mid-Tier ($100–300)

iZotope RX Elements: Entry-level version of the professional standard. Includes De-noise, De-hum, and De-click. Missing the more advanced modules (De-reverb, Dialogue Isolate) available in higher tiers.

Waves Clarity Vx: AI-based noise and reverb reduction plugin. Simpler than RX but effective, and works directly in DAWs as a plugin.

Professional Tier ($400+)

iZotope RX Advanced: The full toolkit, with every module, including Dialogue Isolate (AI-based speech separation), De-reverb, Spectral Repair, and Voice De-noise. Used by professional post-production studios.


Specific Scenarios and Solutions

"The guest has constant air conditioning noise"

Classic podcast problem. Apply noise reduction targeting the AC frequency range (typically a broadband hiss with energy concentrated around 1–4kHz). In Audacity: select a stretch of pure AC noise between sentences, capture it as the noise profile in the Noise Reduction effect, then apply the effect to the whole track. In RX: use adaptive De-noise, which continuously tracks and removes the noise profile.

Conservative reduction (8–12dB) sounds natural. Aggressive reduction (15dB+) creates artifacts. Find the setting where the AC is not distracting rather than completely absent.
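To put those reduction amounts in perspective, decibels convert to a plain amplitude ratio with 10^(-dB/20). The short sketch below shows how much of the original noise amplitude remains at each setting; the function name is just for illustration.

```python
def db_to_amplitude_ratio(reduction_db):
    """Amplitude factor left after reducing a noise floor by reduction_db."""
    return 10 ** (-reduction_db / 20)


for db in (8, 12, 15):
    left = db_to_amplitude_ratio(db)
    print(f"{db} dB reduction leaves {left:.0%} of the noise amplitude")
```

At 8 dB roughly 40% of the noise amplitude remains, at 12 dB about 25%, and at 15 dB about 18%. The extra few decibels buy relatively little audible noise removal, while the risk of artifacts grows.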

"The guest sounds like they're in a bathroom"

Heavy reverb from an echoing room. Apply De-reverb at 50–65% reduction. The voice will sound drier and more present. Check for artifacts — if the voice sounds metallic or "watery," reduce the strength.

For severe bathroom reverb, RX's Dialogue Isolate module is often more effective than De-reverb: it separates the speech signal from everything else rather than trying to subtract the reverb component.

"One speaker is much louder than the other"

Level mismatch between interviewer and subject. For multi-track recordings, treat each track separately and bring them to a consistent LUFS level before mixing. For single-track recordings, use automation (manual volume adjustments) to bring the quieter speaker up, or a leveler or compressor that responds to the level differences between the voices.
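For the single-track case, the manual volume adjustments amount to applying a different gain to each speaker's turns. A minimal sketch, assuming the turns have already been marked by hand as sample ranges and using plain RMS level rather than a true LUFS measurement:

```python
import math


def rms_db(samples):
    """RMS level in dB relative to full scale (1.0)."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10 * math.log10(mean_square) if mean_square > 0 else float("-inf")


def level_segments(samples, segments, target_db):
    """Scale each (start, end) segment so its RMS level hits target_db.

    `segments` is a hand-marked list of sample ranges, one per speaker turn.
    """
    out = list(samples)
    for start, end in segments:
        level = rms_db(samples[start:end])
        if level == float("-inf"):
            continue  # silent segment, nothing to scale
        gain = 10 ** ((target_db - level) / 20)
        for i in range(start, end):
            out[i] = samples[i] * gain
    return out


# Hypothetical track: a loud speaker followed by a quiet one.
track = [0.5, -0.5] * 100 + [0.05, -0.05] * 100
leveled = level_segments(track, [(0, 200), (200, 400)], target_db=-12.0)
print(round(rms_db(leveled[:200]), 2), round(rms_db(leveled[200:]), 2))
```

The gain here changes abruptly at each segment boundary, which can click on real audio. Volume automation in an editor ramps between values over a few milliseconds instead.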

"There's a constant electrical hum"

Apply de-hum targeting 50Hz (most of Europe, Asia, Africa, and Australia) or 60Hz (North America, much of South America) plus harmonics (100Hz, 150Hz and up, or 120Hz, 180Hz and up). Narrow notches remove the interference with little audible effect on the voice. In Audacity: Effect → Filter Curve EQ → apply narrow notches at the hum fundamental and each harmonic.
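Under the hood, a de-hum pass can be approximated by a chain of narrow notch filters at the mains frequency and its harmonics. The sketch below uses the widely published RBJ "Audio EQ Cookbook" notch formula in pure Python. The Q of 30, the choice of four harmonics, and the helper names are assumptions for illustration, not the settings of any particular tool.

```python
import math


def hum_frequencies(mains_hz, count=4):
    """Fundamental plus harmonics, e.g. 60, 120, 180, 240 Hz."""
    return [mains_hz * k for k in range(1, count + 1)]


def notch(samples, freq_hz, sample_rate, q=30.0):
    """Apply one second-order notch (RBJ cookbook form) at freq_hz."""
    w0 = 2 * math.pi * freq_hz / sample_rate
    alpha = math.sin(w0) / (2 * q)
    a0 = 1 + alpha
    b0, b1, b2 = 1 / a0, -2 * math.cos(w0) / a0, 1 / a0
    a1, a2 = -2 * math.cos(w0) / a0, (1 - alpha) / a0
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out


def de_hum(samples, mains_hz, sample_rate, harmonics=4):
    """Run the signal through one notch per hum frequency."""
    for f in hum_frequencies(mains_hz, harmonics):
        samples = notch(samples, f, sample_rate)
    return samples


def rms(samples):
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def tone(freq_hz, seconds, sample_rate):
    """Test signal: a unit-amplitude sine wave."""
    n = int(seconds * sample_rate)
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]


hum = de_hum(tone(60, 2.0, 8000), 60, 8000)
voice_band = de_hum(tone(1000, 2.0, 8000), 60, 8000)
# Compare levels after the filters have settled (last half second).
print(rms(hum[-4000:]), rms(voice_band[-4000:]))
```

The 60Hz tone is driven close to silence once the filters settle, while the 1kHz tone passes essentially unchanged. A pure-Python loop like this is slow on long files; it is meant to show the idea, not to replace a dedicated de-hum tool.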

"The interview was recorded through a window or wall"

High-frequency content is absorbed by the surface, leaving a muffled, low-frequency-heavy sound that is also colored by whatever acoustic environment was on the other side. This is one of the harder restoration scenarios. Apply: high-frequency boost (3–8kHz), de-reverb for the echo, noise reduction. Results depend heavily on how much signal was captured through the obstacle.

"The phone interview sounds tinny and compressed"

Phone bandwidth is limited to 300Hz–3.4kHz. You can enhance clarity within that range (presence boost at 2–3kHz) but can't add back frequency content that was never captured. Focus on reducing background noise from the subject's environment and leveling. Accept that it will sound like a phone call — this is a known convention in audio content.


Multi-Person Interview Audio: Specific Considerations

Separating Overlapping Speech

This is a fundamental limitation: if two people speak simultaneously on a single microphone or a mixed track, they cannot be cleanly separated. Attempts to isolate one voice from the other tend to damage both. The only reliable prevention is capturing each speaker on a separate track.

For remote recordings, use Riverside.fm, SquadCast, or Zencastr — these capture separate local tracks for each participant. For in-person recordings, use separate microphones on separate recording channels.

Syncing Multi-Mic Recordings

When two or more microphones record the same conversation, slight timing differences create phase issues when mixed together. Use a clap or countdown at the start of recording to sync tracks, then verify alignment using a sharp transient (like a handclap sound) visible in both tracks' waveforms.
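The visual check against a clap can also be done numerically: slide one track against the other and keep the shift where the two waveforms line up best (a brute-force cross-correlation). The sketch below is a hypothetical helper, intended to be run on a short excerpt around the clap rather than on whole recordings.

```python
def best_offset(reference, other, max_shift):
    """Return the shift (in samples) that best aligns `other` to `reference`.

    A positive result means `other` is late by that many samples, so
    dropping that many samples from its start lines the tracks up.
    """
    def score(shift):
        total = 0.0
        for i, r in enumerate(reference):
            j = i + shift
            if 0 <= j < len(other):
                total += r * other[j]
        return total

    return max(range(-max_shift, max_shift + 1), key=score)


# Hypothetical clap: a short transient surrounded by silence.
clap = [0.0] * 50 + [1.0, -0.8, 0.5] + [0.0] * 50
late_copy = [0.0] * 7 + clap  # the second mic heard it 7 samples later

shift = best_offset(clap, late_copy, max_shift=20)
aligned = late_copy[shift:]
print(shift)
```

The cost grows with excerpt length times the number of shifts tried, which is why a short window around the clap is enough and a full-length search is not needed.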

Level Matching Across Multiple Speakers

After individual processing, measure the average loudness (LUFS) of each track before mixing. Bring them to a consistent target level (around -18 LUFS pre-mix) before combining. This creates a balanced conversation where no speaker dominates.
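Once each track's loudness has been measured with a loudness meter, the matching step is simple arithmetic: the gain to apply is the target minus the measured value. A sketch with hypothetical track names and readings (the measurement itself is not shown):

```python
def matching_gains(measured_lufs, target_lufs=-18.0):
    """Gain in dB to apply to each track so it lands on target_lufs."""
    return {name: target_lufs - level for name, level in measured_lufs.items()}


def db_to_linear(gain_db):
    """Convert a dB gain to the multiplier applied to the samples."""
    return 10 ** (gain_db / 20)


# Hypothetical meter readings for two tracks.
gains = matching_gains({"host": -20.0, "guest": -26.5})
for name, gain_db in gains.items():
    print(f"{name}: {gain_db:+.1f} dB (x{db_to_linear(gain_db):.2f})")
```

Here the host needs +2.0 dB and the guest +8.5 dB. A static gain shifts integrated loudness by the same number of decibels to a close approximation, so one pass is normally enough; re-measure afterwards to confirm.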


Interview Audio for Different Purposes

Journalism

Authenticity matters. Apply noise reduction and leveling, but avoid heavy processing that could be perceived as altering what was said. Document all processing applied. The goal is intelligibility, not perfection.

Podcast

Professional quality expected. Apply the full processing chain. Guest audio that is noticeably worse than host audio is accepted by listeners, who understand that not everyone has a professional setup. Try to narrow the gap, but don't over-process in an attempt to eliminate it.

Documentary Film

High stakes, high expectations. Professional audio restoration for interviews that need to meet broadcast standards. Multiple rounds of processing may be needed. Location sound engineers on documentary shoots often work with post-production audio specialists specifically for difficult interview cleanup.

Corporate Training and Video

Consistent, professional quality required. Training material will be watched many times; any audio quality issue compounds with repeated exposure. Full cleanup and consistent mastering across all videos in a series.


When Professional Restoration Is the Right Call

For journalism, documentary work, and high-stakes corporate content, professional audio restoration for key interviews:

  • Achieves results that consumer tools can't match on difficult material
  • Provides consistent quality across all interviews in a project
  • Saves significant time that would otherwise be spent on difficult cleanup
  • Delivers the technical quality expected for broadcast, distribution, and licensing

WefixSound cleans up interview audio for podcasters, journalists, and video producers. A free 60-second sample shows you the result before payment — submit your most difficult interview clip to see what's achievable.


Related articles: How to Clean Up Zoom Recordings · How to Fix Poor Quality Phone Recordings · Podcast Audio Cleanup Guide
