Every couple of months or so, I sign with a new radio station as their voice guy. They send their copy, generally asking for the “Z100 read”…I voice and post it for download…and invariably get an email questioning my sanity. It’s too harsh. It’s too brittle. It’s always too something. Yup, yup and yup. It’s all that, and it’s the “Z100 read.” On Z100 I push my voice through a 400Hz Hi-Pass filter and then compress the living heck out of it. The waveform is almost completely flattened by clipping and is thus the audiophile’s worst nightmare.
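For the curious, here is a rough Python sketch of that kind of chain: roll off everything below 400Hz, then boost and flatten what is left. It is not the actual Z100 processing. The first-order filter, the drive value, the hard clip standing in for heavy compression, and every function name below are assumptions made purely for illustration.

```python
import math

SAMPLE_RATE = 44100  # assumed sample rate, in Hz


def high_pass(samples, cutoff_hz=400.0, sample_rate=SAMPLE_RATE):
    """Gentle first-order (6 dB/octave) high-pass filter."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate
    alpha = rc / (rc + dt)
    out = []
    prev_in = 0.0
    prev_out = 0.0
    for x in samples:
        # Standard discrete RC high-pass recurrence.
        y = alpha * (prev_out + x - prev_in)
        out.append(y)
        prev_in, prev_out = x, y
    return out


def smash(samples, drive=8.0, ceiling=1.0):
    """Crude stand-in for heavy compression: boost, then hard-clip."""
    return [max(-ceiling, min(ceiling, s * drive)) for s in samples]


def rms(samples):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def tone(freq_hz, seconds=0.5, amp=0.5, sample_rate=SAMPLE_RATE):
    """Plain sine wave used as a test signal."""
    n = int(seconds * sample_rate)
    return [amp * math.sin(2.0 * math.pi * freq_hz * i / sample_rate)
            for i in range(n)]


if __name__ == "__main__":
    for freq in (100, 400, 3000):
        dry = tone(freq)
        wet = high_pass(dry)
        print(freq, "Hz keeps", round(rms(wet) / rms(dry), 2), "of its level")
    voiced = smash(high_pass(tone(1000)))
    flat = sum(1 for s in voiced if abs(s) == 1.0) / len(voiced)
    print("share of samples pinned at the ceiling:", round(flat, 2))
```

A real chain would use a steeper filter and a proper compressor with attack and release times, but the shape of the result is the same: very little low end, and a waveform that spends most of its time pinned at the ceiling.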
Most of the time, the local producer will then ask for the tracks “in the clear,” without any processing. OK. I’m pretty easygoing as voiceover guys go, so I give them just a touch of compression to control those pesky little spikes and forgo the Hi-Pass filter. Within a few days, I’ll get another email saying it doesn’t “cut” that way. (If I didn’t know better, I’d have such a complex.) Then I reply with what I’m about to tell you here: the science behind that “harsh” sounding VO.
Close your eyes for a moment and listen. (I’ll wait.) OK. What did you hear? Your CPU fan…some hiss from the monitor system…your own breathing…a page out in the hall for someone to please pick up their pizza at the front desk…maybe you heard your chair squeaking. (Hello, engineering? I need some WD-40 right now.) Whatever you heard, you heard it in context. Your brain automatically isolated the various sounds for identification, but you heard them all mixed together.
Now, for a moment, I’m going to travel a road you might not know, but stick with me…all will become clear.
Here’s a word you may not know. It’s confabulate (intransitive verb. L. confabulatus, past participle of confabulari, from com- + fabulari to talk, from fabula story.) In this context, it means “to fill in gaps in memory by fabrication.” The brain is a marvelous instrument that we underestimate all the time. If you see the beginning of a sequence of events and see the end of the same sequence of events, your brain will fill in the missing pieces. For example, if you see a car about to crash into another car, but turn away before the actual impact, once you turn back to see the carnage, your brain will go through a series of deductive procedures to tell you what happened. “The red car was speeding toward the intersection. I heard a crash. There are two crumpled cars in the intersection now. There was a car accident.” Even though you didn’t actually see the crash, you are convinced that you’re an eyewitness.
Confabulation is what makes “Theatre Of The Mind” (TOM) a very real experience for most listeners. If you add a simple sound effect to a piece of copy, it becomes much more real to the listener. In fact, a TOM producer only has to suggest a situation for it to work. In the most famous piece of TOM, The War Of The Worlds, the lid of a water tank in the bathroom was scraped across the tank itself, but because of the context of what was being said and other sound effects, that scraping toilet tank lid became the door of a flying saucer opening in the swamps of New Jersey. Nobody said that’s what the sound was…the listener confabulated and it became exactly what the producer intended.
Psychoacoustics also plays a large role in this. This field of study was really hot in the late 1940s. At the time, sound reproduction was not nearly as “perfect” as it is now. If you listen to an old 78rpm record on a vintage player of the day (usually a Victrola), you will immediately notice that there IS no bottom end. But you’re spoiled. At the time, listeners really didn’t know it could be better. Acoustic engineers were trying to expand the audible spectrum of recorded music when they discovered that if the human brain hears the harmonics of a bass tone, it would confabulate the primary tone. This is what makes those teeny, tiny ear buds on your iPod boom with bass notes, when in fact, they are incapable of producing the actual tones with any kind of volume. The harmonics for those notes are all there, so your brain manufactures the bass notes.
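To see why the harmonics alone are enough of a clue, here is a small Python sketch (an illustration under stated assumptions, not a model of how hearing actually works). It builds a tone from harmonics 2 through 5 of a 100Hz bass note, leaves the 100Hz fundamental out entirely, and then checks two things: that there is no energy at 100Hz, and that the waveform still repeats 100 times a second. The 8000Hz sample rate, the choice of harmonics and all function names are assumptions picked to keep the arithmetic tidy.

```python
import math

SAMPLE_RATE = 8000  # assumed; makes one 100Hz cycle exactly 80 samples


def harmonics_only(fundamental_hz, harmonic_numbers, n_samples):
    """Sum of sine waves at the given harmonics; the fundamental is absent
    unless harmonic number 1 is explicitly included."""
    return [
        sum(math.sin(2.0 * math.pi * fundamental_hz * h * i / SAMPLE_RATE)
            for h in harmonic_numbers)
        for i in range(n_samples)
    ]


def level_at(samples, freq_hz):
    """Amplitude of one frequency, measured with a single DFT bin."""
    n = len(samples)
    re = sum(s * math.cos(2.0 * math.pi * freq_hz * i / SAMPLE_RATE)
             for i, s in enumerate(samples))
    im = sum(s * math.sin(2.0 * math.pi * freq_hz * i / SAMPLE_RATE)
             for i, s in enumerate(samples))
    return 2.0 * math.hypot(re, im) / n


def repeat_period(samples, min_lag, max_lag):
    """Lag, in samples, at which the waveform best matches itself.
    Uses circular autocorrelation, which is fine here because the
    buffer holds a whole number of cycles."""
    n = len(samples)

    def score(lag):
        return sum(samples[i] * samples[(i + lag) % n] for i in range(n))

    return max(range(min_lag, max_lag + 1), key=score)


if __name__ == "__main__":
    signal = harmonics_only(100, [2, 3, 4, 5], 800)  # 200, 300, 400, 500 Hz
    print("level at 100 Hz:", round(level_at(signal, 100), 6))
    print("level at 200 Hz:", round(level_at(signal, 200), 6))
    lag = repeat_period(signal, 20, 120)
    print("waveform repeats at", SAMPLE_RATE / lag, "Hz")
```

The repetition rate of the waveform matches the missing bass note even though a spectrum analyzer would show nothing at that frequency, and that repetition is the kind of cue the brain can use to supply the note itself.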
This is also the basic principle behind the WAVES plug-in called MaxxBass. It adds extra harmonics throughout the spectrum of sound you’re producing, boosting the apparent bass without adding a lot of muddy sounding real bass to the signal. This keeps the balance intact while seeming to make the bass line jump out. This plug-in is particularly helpful if you’re preparing audio for distribution on the web, where bandwidth can make processing really tricky.
OK. So how does all this apply to a “harsh” sounding VO? You have to put it in context. By itself, the track does sound harsh, even grating to the ear. There is no bottom end. However, once you put it in context, with all the music, zings, zaps and zoodads…the human brain kicks in and the bottom end is suddenly there. In the meantime, the message cuts through all the clutter of the noise you’ve added to the voice track. The message is clear, and that is the bottom line. Oh sure, now that you know my voice has nothing below 400Hz, you’ll still know the difference, but I’ll guarantee you the audience either doesn’t know or doesn’t care.
On the back-end of all this, you should also know that the human voice is roughly in the same frequency range as solo instruments in music. (Some are higher or lower, but in the main the human voice is considered the solo instrument in most music.) You DON’T include the vocals in the music on your promos or commercials except when it is pertinent to the message. If it’s a music image promo, you really need to have the hook of the song included, but the VO isn’t running while the hook is in the mix. A really good producer knows not to put VO on top of music with a solo instrument (including the voices of the singers). It just clashes and makes the VO difficult to hear. Oddly, the solo instruments almost never dip below 400Hz. Popping the VO into that little frequency pocket makes it blend nicely into the music without getting lost in it. The bass line of the music can flow without interruption from the VO, making the whole piece sound much more “designed.”
In the end, almost every producer decides to have me cut the tracks using my own concoction of filters and compression, and with very few exceptions, the program director ends up liking the new imaging sound. (I would say every time, but there was that one station in…well, never mind. They didn’t renew and I’m glad because they made me sound wimpy anyway.)
Audio purists are all pretty disgusted with me already, and that’s OK with me. I’m not producing whale songs, railroads or the sounds of buffalo in heat. I’m producing for a hot rockin’, flame throwing Z100, baby. I want it big and powerful, punchy and bright. On the other hand, if you’re working for a classical or AC station, you’ll really want to keep it a lot more open. Otherwise, you’d be swatting a fly with a ten-pound sledgehammer. When you’re ready to turn up the heat and give your production a lot more verve, you now know what to do.