Close your eyes for a moment and listen. (I’ll wait.) OK. What did you hear? Your CPU fan…some hiss from the monitor system…your own breathing…a page out in the hall for someone to please pick up their pizza at the front desk…maybe you heard your chair squeaking. (Hello, engineering? I need some WD-40 STAT!) Whatever you heard, you heard it in context. Your brain automatically isolated the various sounds for identification, but you heard them all mixed together.
Now I’m going to travel down a road you might not know, but stick with me for a moment…the signpost ahead promises all will become clear.
Here’s a word you may not know. It’s confabulate (intransitive verb. L. confabulatus, past participle of confabulari, from com- + fabulari to talk, from fabula story.) In this context, it means “to fill in gaps in memory by fabrication.” The brain is a marvelous instrument that we underestimate all the time. If you see the beginning of a sequence of events and see the end of the same sequence of events, your brain will fill in the missing pieces. For example, if you see a car about to crash into another car but turn away before the actual impact, then once you turn back to see the carnage, your brain will go through a series of deductive procedures to tell you what happened. “The red car was speeding toward the intersection. I heard a crash. There are two crumpled cars in the intersection now. There was a car accident.” Even though you didn’t actually see the crash, you are convinced that you’re an eyewitness.
Confabulation is what makes “Theatre Of The Mind” (TOM) a very real experience for most listeners. If you add a simple sound effect to a piece of copy, it becomes much more real to the listener. In fact, a TOM producer only has to suggest a situation for it to work. In the most famous piece of TOM, The War Of The Worlds, the lid of a water tank in the bathroom was scraped across the tank itself, but because of the context of what was being said and other sound effects, that scraping toilet tank lid became the door of a flying saucer opening in the swamps of New Jersey. Nobody said that’s what the sound was…the listener confabulated and it became exactly what the producer intended.
Psychoacoustics also plays a large role in this. This field of study was really hot in the late 1940s. At the time, sound reproduction was not nearly as “perfect” as it is now. If you listen to an old 78rpm record on a vintage player of the day (usually a Victrola), you will immediately notice that there IS no bottom end. But you’re spoiled. At the time, listeners really didn’t know it could be better. Acoustic engineers were trying to expand the audible spectrum of recorded music when they discovered that if the human brain hears the harmonics of a bass tone, it will confabulate the primary tone (the fundamental). This is what makes those teeny, tiny ear buds boom with bass notes when, in fact, they are incapable of producing the actual tones with any kind of volume. The harmonics for those notes are all there, so your brain manufactures the bass notes.
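If you'd like to see the harmonics trick on paper, here is a rough sketch in plain Python. It builds a tone from the 2nd, 3rd and 4th harmonics of a 100Hz note with no 100Hz in it at all, then checks how often the waveform repeats. The sample rate, the 100Hz note and the function names are all my own assumptions for illustration. It only shows that the 100Hz pattern survives in the signal; it does not model what the ear and brain actually do with it.

```python
import math

SAMPLE_RATE = 8000        # assumed; kept low so the pure-Python loops stay quick
FUNDAMENTAL_HZ = 100.0    # assumed bass note for the demo
WINDOW = 800              # ten full cycles of 100 Hz at 8 kHz


def harmonics_only(f0, harmonic_numbers, n_samples):
    """Sum of equal-level sine harmonics of f0; f0 itself is left out by the caller."""
    return [
        sum(math.sin(2 * math.pi * k * f0 * i / SAMPLE_RATE) for k in harmonic_numbers)
        for i in range(n_samples)
    ]


def level_at(signal, freq):
    """Amplitude of one frequency, measured by correlating with a sine and cosine."""
    n = len(signal)
    re = sum(x * math.cos(2 * math.pi * freq * i / SAMPLE_RATE) for i, x in enumerate(signal))
    im = sum(x * math.sin(2 * math.pi * freq * i / SAMPLE_RATE) for i, x in enumerate(signal))
    return 2 * math.hypot(re, im) / n


def repeat_rate(signal, lo_hz, hi_hz):
    """Rate (in Hz) at which the waveform best lines up with a delayed copy of itself."""
    min_lag = int(SAMPLE_RATE / hi_hz)
    max_lag = int(SAMPLE_RATE / lo_hz)

    def match(lag):
        return sum(signal[i] * signal[i + lag] for i in range(WINDOW))

    best_lag = max(range(min_lag, max_lag + 1), key=match)
    return SAMPLE_RATE / best_lag


# 200, 300 and 400 Hz only: the 100 Hz fundamental is never generated.
tone = harmonics_only(FUNDAMENTAL_HZ, [2, 3, 4], 1000)

print("level at 100 Hz:", level_at(tone[:WINDOW], 100.0))
print("level at 200 Hz:", level_at(tone[:WINDOW], 200.0))
print("waveform repeats at (Hz):", repeat_rate(tone, 70.0, 180.0))
```

The measured level at 100Hz is effectively zero, yet the waveform still repeats 100 times a second. That repeating pattern is the cue the harmonics carry for the missing note.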
This is the basic principle behind the WAVES plug-in called MaxxBass™. It adds extra harmonics throughout the spectrum of sound you’re producing, boosting the apparent bass without adding a lot of muddy sounding real bass to the signal. This keeps the balance intact while seeming to make the bass line jump out. This plug-in is particularly helpful if you’re preparing audio for distribution on the phone (read: answering machine jobs) where bandwidth can make processing really tricky.
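To be clear, I don't know what goes on inside MaxxBass, and the following is NOT its algorithm. It is just a minimal sketch of the general idea of adding harmonics above a bass note while leaving the note's own level alone. It assumes a clean, full-scale 60Hz cosine as the "bass," and the waveshaping polynomials and mix amounts are values I picked for the demo.

```python
import math

SAMPLE_RATE = 8000   # assumed for the demo
BASS_HZ = 60.0       # assumed bass note
N = 800              # exactly six cycles of 60 Hz at 8 kHz


def level_at(signal, freq):
    """Amplitude of one frequency, measured by correlating with a sine and cosine."""
    n = len(signal)
    re = sum(x * math.cos(2 * math.pi * freq * i / SAMPLE_RATE) for i, x in enumerate(signal))
    im = sum(x * math.sin(2 * math.pi * freq * i / SAMPLE_RATE) for i, x in enumerate(signal))
    return 2 * math.hypot(re, im) / n


def add_harmonics(signal, second=0.5, third=0.25):
    """Hypothetical harmonic generator using Chebyshev waveshaping.

    For a cosine that peaks at exactly 1.0, (2x^2 - 1) is its 2nd harmonic and
    (4x^3 - 3x) is its 3rd. Real program material would need level detection
    and band-splitting first; none of that is shown here.
    """
    return [x + second * (2 * x * x - 1) + third * (4 * x ** 3 - 3 * x) for x in signal]


bass = [math.cos(2 * math.pi * BASS_HZ * i / SAMPLE_RATE) for i in range(N)]
enhanced = add_harmonics(bass)

for freq in (60.0, 120.0, 180.0):
    print(freq, "Hz  before:", round(level_at(bass, freq), 3),
          " after:", round(level_at(enhanced, freq), 3))
```

The 60Hz level is the same before and after. All that changed is the new energy at 120Hz and 180Hz, which even small speakers have a better chance of reproducing.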
Longtime VO clients of mine have gotten used to my tracks sounding a bit “harsh.” They are. (Although I also give them a flat response version with every file, for those who’d rather customize the sound on their own.) There are actually TWO reasons for the “harsh” sounding version, which I’ll now explain.
By itself, the track does sound harsh, even grating to the ear. There is no bottom end because I cut off all frequencies below 400Hz and pre-compress. However, once you put it in context, with all the music and effects…the human brain kicks in and the bottom end is suddenly there, because the harmonics are all still intact. In the meantime, the message cuts through all the clutter of the noise you’ve added to the voice track. The message is clear, and that is the bottom line. Oh sure, now that you know my voice has nothing below 400Hz, you’ll still know the difference, but I’ll guarantee you the audience never hears the difference.
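For the curious, a 400Hz high-pass can be sketched in a few lines of plain Python. This is not my actual processing chain. It is a generic second-order filter built from the widely published "Audio EQ Cookbook" formulas, with an assumed 44.1kHz sample rate and Butterworth Q. A filter like this rolls the lows off gradually rather than chopping everything below 400Hz, and the pre-compression step is not shown.

```python
import math

SAMPLE_RATE = 44100   # assumed


def highpass_coefficients(cutoff_hz, q=1 / math.sqrt(2)):
    """Second-order high-pass coefficients (Audio EQ Cookbook form), normalized by a0."""
    w0 = 2 * math.pi * cutoff_hz / SAMPLE_RATE
    alpha = math.sin(w0) / (2 * q)
    cos_w0 = math.cos(w0)
    a0 = 1 + alpha
    b = [(1 + cos_w0) / 2 / a0, -(1 + cos_w0) / a0, (1 + cos_w0) / 2 / a0]
    a = [-2 * cos_w0 / a0, (1 - alpha) / a0]
    return b, a


def highpass(signal, cutoff_hz=400.0):
    """Run the signal through the filter one sample at a time."""
    b, a = highpass_coefficients(cutoff_hz)
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in signal:
        y = b[0] * x + b[1] * x1 + b[2] * x2 - a[0] * y1 - a[1] * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out


def rms(signal):
    return math.sqrt(sum(x * x for x in signal) / len(signal))


def gain_at(freq_hz):
    """Output/input level for a steady test tone, measured after the filter settles."""
    n = SAMPLE_RATE // 2
    tone = [math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE) for i in range(n)]
    filtered = highpass(tone)
    settled = n // 2
    return rms(filtered[settled:]) / rms(tone[settled:])


for freq in (100.0, 400.0, 2000.0):
    print(freq, "Hz gain:", round(gain_at(freq), 3))
```

A 100Hz tone comes out at a small fraction of its original level, while 2000Hz passes almost untouched. If you want a steeper cut, running the signal through `highpass` twice is one simple option.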
The second reason is even more important: because my VO doesn’t have any frequencies below 400Hz, it doesn’t fight with anything the music is pushing, like the bass line or bigger effects.
The scientific reason this all works is fascinating to me. Every LEAD instrument sits in the exact same frequency range as the human voice. Guitar, saxophone and piano, when acting as lead instruments, use the same frequencies as the human voice. Bass does not. Rhythm instruments like drums, rhythm guitar and the left hand of a piano do not.
A drum ‘fill’ happens when the voice is not singing, or in my case, speaking. Otherwise the kick and toms hit below that range and the snare is either muted or only hitting on the main beats. Cymbals are generally above the range of the human voice. Strings might use some of the same range, but when the voice is active, they are generally subdued.
A really good piece of ‘production’ music has no lead instrument, making room for the spoken word on the frequency spectrum. Honestly, listening to a production track by itself is kinda boring because there’s no real melody. If you’re using current pop music as a background for a concert spot, you’re hopefully using the instrumental version as the track, with the vocal only popping in for the hook. The instrumental version, like production music, has no real melody because that’s where the original vocal was in the finished song. TIP: If there isn’t an instrumental version available, look for Karaoke versions. There are usually 2 or 3 really good ones for just about every hit. Clearly, Karaoke versions don’t have a vocal track either.
The result is music and voice that play nicely together, without any cross-frequency interference. It can all be processed together, making a well balanced sound that emphasizes the message clearly and strongly. And the BEST part for you, as the producer: you won’t need to duck the music nearly as much when the VO is rolling, giving you a full spectrum, award-winning sound that has an immense impact on the listener. True story.
The ONE time you should use a flat response VO is when there IS no music. Trust me, it doesn’t sound weird at all. In fact if you use the Hi-Pass track during the silence, THAT will sound weird.
Audio purists are all pretty disgusted with me already, and that’s OK with me. I’m not producing whale songs, railroads or the sounds of buffalo in heat. When I’m doing the voice for a promo, I want it big and powerful, punchy and bright. If you’re working for a classical or AC station, you’ll probably want to keep it a lot more open; otherwise, you’d be swatting a fly with a ten-pound sledgehammer. When you’re ready to turn up the heat and give your production a lot more verve, you now know what to do.