Izotope RX Voice De-noise In Depth - Remove Room Noise

Video Statistics and Information

Captions
[Music] In this episode we demonstrate how to reduce fairly steady-state noise, like fan noise or steady traffic noise outside a building, from a dialogue recording using RX's Voice De-noise. Now, good news: this is available in all versions of RX, including Elements, Standard, and Advanced. What's also nice about it is that there is no latency, so you can actually use it for live audio processing as well, if you need to do that for a live stream, for example, or any other sort of live show where you can use a VST plug-in. It also comes in AU and VST3.

I actually have a secret here: there is a set of numbers that generally works well in a number of circumstances, and those are the settings I have right now. You still need to tune them if you're really trying to do a good, careful job, but they work quite well in a lot of circumstances, again, where you have steady-state noise like a fan. If you have discontinuous noise, it's a completely different thing, so we'll have to come back to that in another episode.

But let's go ahead and start here. Let me play the first few seconds so you can hear where we're starting: "The other thing we say, though, is, uh, you know, parenting is really built on two pillars. One is structure, guidance; the other is love and warmth." Okay, so that's what the dialogue sounds like. If you want to hear what the noise itself sounds like, it's this. So obviously some fans running there, some other things going on.

So let's dive into the settings so you can understand what each of them does, and that'll make it a lot easier for you to apply this to your own situations. But before we do that, what is this thing actually doing? Let me explain in very broad terms how it works. First, it uses 64 psychoacoustically spaced band-pass filters, which together act as a multi-band gate. Now you might be thinking to yourself, "What in the world did he just say? I don't understand what that means." Let me try to describe
it a little bit more in the spectrogram view here, the orange trace that you see. The x-axis represents time, and the vertical axis represents audio frequency. So we have our low frequencies, our bass frequencies, down here at the bottom and our higher frequencies up here at the top. You can see, for example, as this person is talking right here, we have the low frequencies represented here and then the higher frequencies, so you can see what their voice is doing. Sometimes the frequencies don't really extend beyond about 4,000 Hz, but when they do, they're probably saying something like the letter S, which you can hear has a lot more high-frequency content.

Now, what's happening behind the scenes with the Voice De-noise plug-in is basically that they have set 64 different filters along this audio spectrum. I don't know exactly what they cover, but my guess is somewhere from around 20 Hz up to 20,000 Hz. They've spaced them out in a way that makes sense from a psychoacoustics point of view; that is to say, they probably spaced them so they appropriately cover the frequencies our ears are generally very sensitive at picking up. And then what happens is, anytime sound falls below a certain amplitude (you could think of it in terms of the height of the waveform, if we go back to waveform view here), it applies that filter for that particular frequency and pushes the noise down by however much you specify in the Reduction setting. So that's the high level.

Let's go ahead and talk about the settings so it makes a little bit more sense. I'm going to go back to this view, just because it makes more sense from a noise-reduction point of view. First we have this Learn button and the Adaptive Mode checkbox. I'm going to uncheck that here. If I uncheck it, I am now in manual mode, and
in manual mode what you have to do is highlight a period of time within the clip that is just noise, nothing else. The reason you do that is you're telling the plug-in, "This is noise, so build your noise profile based on this selection." If you select one section, you can select another by holding the Shift key down and highlighting an additional section. Be careful if you do highlight another section, because a lot of the time those gaps are breaths in between words. This clip is really helpful because we actually have a couple of seconds of just noise. You generally want at least a second, preferably a couple of seconds; that's going to get you the best results in most cases.

Once I've highlighted that, I can click Learn, and you can see it changed this profile right here. This blue line with the big dots on it is our threshold, and again, any audio along the spectrum that falls below this threshold gets reduced by however much we set the reduction to. What's interesting is that in this mode I can actually tweak it. I can move these dots to change the threshold profile. If I really want to get aggressive on reducing the low frequencies, I can bump the threshold up here, so anything that falls below minus 33.9 dB in this frequency range gets reduced by 6 dB. You have to be careful, of course; this can start sounding really crazy if you do that. Playing around with these a whole lot is generally not something I would recommend for beginners, but it is an option available to you as you start to train your ear and can make better judgments about changing the threshold profile.

Now, what's really neat is that if you go into adaptive mode (I'm going to go ahead and click that), you can see the Learn button disappears. We're not learning
anymore. Now the plug-in is building the noise profile on its own and automatically adjusting that threshold for you. Let me play back the audio again, and watch what happens to the threshold here in the real-time analyzer. You'll see the blue line move around as the plug-in adapts while it's playing back and processing the audio. Here we go: "The other thing we say, though, is, you know, parenting is really built on two pillars. One is structure..." So that's pretty interesting, and it does a really good job in my experience. This is sort of the magic, from my point of view, of the RX Voice De-noise plug-in. I don't know that a lot of other denoising plug-ins necessarily do things like that, so this really makes it quite effective. I will say that if I'm working on something like this audio clip, where I have just this really steady-state fan running in the background, I will use adaptive mode 99.9% of the time.

All right, next setting: Optimize For. We have two choices, Dialogue or Music. Now you might ask yourself, what's the difference? The main difference is that you're telling the plug-in to work a little bit differently depending on the type of material you're working with. We are working with spoken word here, so we choose Dialogue. Dialogue is generally different from music or singing in that you don't generally sustain an individual tone for a long time when you are speaking, and that influences how the plug-in works when it's set to Dialogue. If I set it to Music, it's going to assume that there will more likely be sustained tones, because that's what we do when we sing. So that's the difference. Again, if you're just working on spoken-word audio, you'll choose Dialogue.

Next up we have our Filter Type. This is a very interesting one, and it's basically a big trade-off. Let me explain the two options you have: Surgical and
Gentle. Surgical's advantage is that it can reduce a whole lot more noise. However, it comes with the trade-off that it can be a little too aggressive sometimes, resulting in artifacts and changing the sound of the dialogue in ways that aren't necessarily pleasing to the ear. So it's really useful if you have to get rid of a lot of noise, say, for example, you're recording something on a factory floor, but you have to be willing to accept that it may make the dialogue sound a little bit unnatural. The trade-off with the other setting, Gentle, is kind of the inverse. It doesn't generally remove quite as much noise, but it generally does a better job of retaining the quality of the audio, and it doesn't affect the dialogue as severely in terms of removing frequencies that you still want to retain. It generally won't change the timbre of the spoken-word audio. So that's the trade-off there.

I will say this: my philosophy generally is that I don't want to absolutely eliminate all noise. I'm more interested in just making the noise less distracting to the audience. So if, for example, I'm doing a corporate video, I don't have to have it perfectly clean; I just don't want it to be distracting, and a lot of the time I'll therefore use the Gentle setting. Now, if I were processing some audio that had been recorded on a factory floor, Surgical might make more sense. If it's to the point where I can't really make out what the person's saying, I might be willing to sacrifice some of the quality of the recording so you can hear what they're saying, and I might not be too concerned about whether this is a very nice or accurate representation of the person's voice, because I just want to hear what they're saying. So that's where you might make that choice. For this case, I think we can understand them
pretty well, so I'm going to go with Gentle, because I don't need to do anything super heroic here. I just need to make the noise a little bit less distracting.

Next up, in this section right here, we have our real-time analyzer (RTA). This is a little bit different from the spectrogram: along the x-axis in this case we're showing the frequencies, and on the y-axis, the vertical axis, we're showing the amplitude, or how tall the waveform is at any given time. Some people like to call it loudness or volume. It's not technically that, but if you want to think of it in those terms, that's kind of what's happening here. So if we go back to the waveform view, it refers to how tall each of these waveforms is. In addition to that, we have a light gray line and a white line. The light gray line is the input, the original audio, and the white line represents the audio after it's been processed by the Voice De-noise plug-in, so you can visually see what's happening.

Let me give you an example. I'm going to crank up some of the settings here; we'll go Surgical. "...and we say, though, uh..." You can see it a little bit better here. There's a significant difference in the higher frequencies. In this case, again, the audio fell below the threshold, and so it applied, or at least is starting to apply, the 20 dB of attenuation we're telling it to do, reducing the higher-frequency sounds right here. You can see there's a little bit going on down here in the lower frequencies, too. But that's generally what the RTA is for: it's a way to visually see how the Voice De-noise plug-in is affecting your audio and where it's removing noise, where that noise exists according to the plug-in. So that can be helpful. Of course, it's absolutely critical to use headphones so you can hear what the changes you're applying do, whatever settings
you're using, and how they're affecting the audio. That's the most important thing, but it's nice to have some visuals as well to supplement that.

All right, let's move to the next settings. Whether we're in adaptive mode or in manual mode, this next setting, Threshold, moves the threshold either down or up. If we want to get really aggressive and really stomp on that noise, we can push it up, or if we want to be a little more gentle, or more transparent, I guess I should say, we can push it down so it's not going to do as much attenuation. What I will say is that when I'm in adaptive mode, I almost always just leave it at zero, because the plug-in itself is adjusting the threshold, but you do have this as another setting to fine-tune things and tell it to push harder or not be so aggressive.

The final setting here, Reduction, in dB, tells the plug-in how much to reduce a particular frequency once the audio falls below the threshold. We have it set right now to 20 dB. That is a massive amount of reduction. In my experience, it generally makes a lot more sense to be much more conservative on the overall reduction, and I generally start at about 6 dB. I might even apply less in some cases. What I find is that if you do have to remove a lot of noise, it might make more sense to apply fewer dB of attenuation and do multiple passes with the plug-in.

So let me give you a demonstration. Again, you've heard where we're starting. Let me go ahead and play through with the settings we currently have. Let me demonstrate Surgical first, just so you can hear what it sounds like and why I might not use it in this particular case. I'm going to really accentuate it so it's very obvious to hear. I'm going to preview it, and I'll bypass it while we're playing back as well so you can hear what the original audio sounds like and the
processed audio. We'll start with the processed audio: "The other thing we say, though, is, you know, parenting is really built on two pillars. One is structure, guidance; the other is love and warmth. And what we always try to help parents is to figure out the right balance of those two things. Discipline is not an act of punishment; it's an act of protection." Okay, so hopefully you get a sense for what that was like. To me, when I do 20 dB of attenuation in Surgical mode on this particular audio clip, it's starting to sound a little bit unnatural. It sounds really dry, I guess I should say. It doesn't sound like he's in a room. That may be what you want in particular circumstances, but my philosophy is: don't push it too hard. If I'm seeing someone in a video talking in a room, I don't want really distracting noise, but I also don't want it to sound like they're in an anechoic chamber, where there's no reflected sound or reverb and it just starts to feel otherworldly.

So let's do another pass, this time with the Filter Type set to Gentle, and I want to drop down to 6 dB of attenuation. Let's see how that goes: "The other thing we say, though, is, you know, parenting is really built on two pillars. One is structure, guidance; the other is love and warmth. And what we always try to help parents is to figure out the right balance of those two things. Discipline is not an act of punishment; it's an act of protection. And we try to help parents think about how to..." I don't know if you heard that, but when I pushed this Reduction dB up, you started to hear some artifacting, even in Gentle mode. That's why I think it's important not to push this too hard, and it's generally better to do a smaller amount of attenuation and do multiple passes. It almost started to have a pumping sound to it, so again, that's the kind of thing I would generally try to avoid. But let's go ahead
and render this, and now I can do additional passes. Actually, I should undo that; I'm going to unrender it. What I want to do is measure our noise floor first. I'm going to pull up our waveform stats. That noise floor is sitting at minus 46 dB max RMS; that's generally the metric I'll use to measure a noise floor. Let's see what happens once I reapply the Voice De-noise. Okay, that's reapplied. Let's measure this area now. Now it came in at minus 49.78, so almost 4 dB of attenuation overall. It has pulled that noise floor down a little bit. That's what it sounds like now, and this is what it sounded like before. There's very much a noise floor still there, but it's not nearly as prominent as it was before.

Let's do another pass and see how that sounds: "The other thing we say, though, is, uh, you know, parenting is really built on two pillars. One is structure, guidance..." Okay, we'll render that out. Now our noise floor has fallen to minus 52.26 max RMS, so it's pushing it down just a little bit at a time. This is what it sounds like now: "The other thing we say, though, is, uh, you know, parenting is really built on two pillars. One is structure, guidance..." So, sounding pretty good, I think. If we went much farther than that, it's going to start to sound... you know, let's do another round. Just quickly render that and pick it up from where he was talking: "...the other is love and warmth, and what we always try to..." See, this is where I start to feel like we're getting a pumping sound a little bit, so I probably wouldn't go three rounds. I'd just do maybe one or two.

Now, there are some other things you can do. Once I've done that, I might come back and apply a plain old high-pass filter to help reduce some of that noise. So here I'm in the EQ. I've got a high-pass filter set to 80 Hz, 48 dB per octave. Watch what happens down here in these low frequencies when we apply
that. Great, you can see it definitely cleared it up. Let's highlight our noise floor area here, where our max RMS is now minus 58.1, so that makes a substantial difference. It now sounds like this. Pretty nice. And this is what the overall dialogue sounds like now. I'll play it post-processing, and then I'll go back to the original sound and play that: "The other thing we say, though, is, you know, parenting is really built on two pillars. One is structure, guidance; the other is love and warmth." "The other thing we say, though, is, you know, parenting is really built on two pillars. One is structure, guidance..." So that's the difference you can expect with the Voice De-noise plug-in in iZotope RX, again in the Elements, Standard, and Advanced versions, so you can use it in any version.

I hope that was helpful. If you have any questions, go ahead and leave those down below. If you haven't already subscribed, be sure to do that, and we'll be sure to get you more great videos on how to improve your lighting and sound for video. Talk to you soon. [Music]
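
The gate rule and the noise-floor measurements described in the captions can be sketched in a few lines of Python. This is only a rough illustration under stated assumptions, not iZotope's algorithm: the function names (`gate_band_levels`, `max_rms_db`, `pass_by_pass_drop`) are hypothetical, the gate works on per-band levels that have already been measured rather than on real audio, there is no smoothing between gate states, and the three band levels are made up. Only the max RMS readings are taken from the video.

```python
import math


def gate_band_levels(levels_db, thresholds_db, reduction_db):
    """Lower every band whose level is below its threshold by reduction_db.

    Hypothetical helper: operates on already-measured per-band levels (dB)
    instead of audio, and applies no attack/release smoothing.
    """
    return [
        level - reduction_db if level < threshold else level
        for level, threshold in zip(levels_db, thresholds_db)
    ]


def max_rms_db(samples, window):
    """Highest RMS level, in dB relative to full scale (1.0), over consecutive windows."""
    peak_rms = 0.0
    for start in range(0, len(samples) - window + 1, window):
        chunk = samples[start:start + window]
        rms = math.sqrt(sum(s * s for s in chunk) / window)
        peak_rms = max(peak_rms, rms)
    return 20.0 * math.log10(peak_rms) if peak_rms > 0 else float("-inf")


def pass_by_pass_drop(readings_db):
    """Difference between consecutive noise-floor readings (positive = quieter)."""
    return [
        round(before - after, 2)
        for before, after in zip(readings_db, readings_db[1:])
    ]


# Three illustrative bands (made-up levels), one shared threshold, 6 dB reduction.
gated = gate_band_levels([-50.0, -30.0, -55.0], [-40.0, -40.0, -40.0], 6.0)

# Max RMS readings quoted in the video: original, after pass 1, after pass 2,
# and after a further pass plus the 80 Hz high-pass filter.
drops = pass_by_pass_drop([-46.0, -49.78, -52.26, -58.1])

print(gated)
print(drops)
```

With these inputs the two bands below the -40 dB threshold come out 6 dB lower while the -30 dB band is untouched, and the quoted readings work out to drops of 3.78 dB, 2.48 dB, and 5.84 dB, which matches the "almost 4 dB" remark for the first pass.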
Info
Channel: Curtis Judd
Views: 19,340
Keywords: Audio, sound, video, Izotope, RX, Izotope RX, Denoise, De-noise, Voice Denoise, Voice De-noise, Tutorial, how to, in depth, dialogue, spoken word, podcast, edit, cleanup, fix, clean up
Id: csCOPUVmnho
Length: 20min 8sec (1208 seconds)
Published: Sun Aug 22 2021