Look What They've Done to My Song
This is a companion piece to a tutorial series of Jay's at ProVideoCoalition.com.
by Jay Rose, CAS
You might want to read those articles first.
A slightly different version of this article was originally published here at JayRose.com, August 2005.
I'm convinced that mp3 has gotten a bad reputation for quality because it's often misused by the rip-and-share crowd, and because there is some poor encoding software out there. When done right, it can be tolerable. And there's a reasonable way to evaluate exactly what mp3 is doing to your sound.
Bear in mind, mp3 is a data encoding standard (ISO-MPEG Audio Layer-3, IS 11172-3 and IS 13818-3), not a particular algorithm or piece of software. When it first appeared in the early 1990s, mp3-encoding software was limited by what was then practical in desktop computers or DSP chips, and this affected the sound. As computers and chips got more powerful, software designers could use more processing cycles to get better quality.
There are more powerful standards now, such as AAC, but mp3 survives for more than just casual music listening. In fact, it's often the default format used by broadcast and film professionals (including yours truly) for real-time audio transfers over ISDN lines, using multi-thousand-dollar dedicated hardware codecs.
So What's Missing?
Any psychoacoustic data reduction scheme, including mp3 and AAC, has to clobber some audio data; these aren't lossless compression like PKZIP. In the most basic terms, mp3 analyzes the incoming audio by modeling human hearing, and attempts to remove the details that most people's nervous systems would never pass to their brains. It then applies a zip-like algorithm to what remains. There's a more complete explanation at the Fraunhofer Institute... they invented the format. (AAC builds on mp3, by applying an additional set of rules about human hearing.)
The technology is scalable, in that you can tell the encoder how much detail you're willing to throw away. This is what the BITRATE and SAMPLE RATE settings in mp3 software do. As you'd expect, the lower those rates, the smaller the file and the greater the sacrifice to the audio. At extreme settings (8 kbps, 6 kHz sampling) the resulting file is slightly more than 1% of its original size. Of course, nobody would pipe important sound through something that extreme. But portable music players often run at 64 or 128 kbps... a significant reduction, when you consider that normal music CDs have a data rate of 1,411 kbps (after error correction and other overhead; the gross rate is around 4 megabits per second).
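If you'd like to check those ratios yourself, here's a minimal sketch of the arithmetic. It assumes standard CD audio (44.1 kHz, 16 bits, two channels); the function names are just illustrative, and real mp3 files carry a little extra header data that this ignores.

```python
# Back-of-envelope data rates: uncompressed CD audio versus mp3 bitrates.
CD_SAMPLE_RATE_HZ = 44_100   # samples per second, per channel
CD_BITS_PER_SAMPLE = 16
CD_CHANNELS = 2


def pcm_kbps(sample_rate_hz, bits_per_sample, channels):
    """Data rate of uncompressed PCM audio, in kilobits per second."""
    return sample_rate_hz * bits_per_sample * channels / 1000


def percent_of_original(mp3_kbps, original_kbps):
    """Encoded data rate as a percentage of the original's data rate."""
    return 100 * mp3_kbps / original_kbps


cd_stereo = pcm_kbps(CD_SAMPLE_RATE_HZ, CD_BITS_PER_SAMPLE, CD_CHANNELS)
cd_mono = pcm_kbps(CD_SAMPLE_RATE_HZ, CD_BITS_PER_SAMPLE, 1)

for rate in (8, 32, 64, 128):
    print(f"{rate:>3} kbps: "
          f"{percent_of_original(rate, cd_stereo):5.2f}% of stereo, "
          f"{percent_of_original(rate, cd_mono):5.2f}% of mono")
```

By this arithmetic, 128 kbps is about 9% of the stereo CD rate and 64 kbps is about 4.5%. An 8 kbps file works out to roughly 1.1% of a mono CD-rate original, or roughly 0.6% of a stereo one.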
How important is the data that mp3 takes away at those settings? Will having it missing hurt your soundtrack?
To find out, I made a montage of male and female speech and singing, along with music in various genres. The full resolution version is on the CD included in my book "Audio Postproduction". If you don't have the book, you can hear a very well encoded approximation here. (The music is from DeWolfe Music New York, protected by copyright, and used by permission.)
I encoded the original, full-res version as an mp3 at various bitrates. I then compared each decoded result to the original mathematically, by subtracting it from the original. The subtraction is actually fairly simple: invert the polarity of one file and mix it with the other.
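The same subtraction can be sketched in a few lines of code. This isn't how the tests in this article were run (those were done in an audio editor); it's only an illustration of the idea, using plain lists of sample values. The two signals are made-up stand-ins: an "original" with a loud 440 Hz tone plus a faint 15 kHz tone, and a "decoded" version where the faint tone has been thrown away.

```python
import math

SAMPLE_RATE = 44_100  # one second of audio at the CD sample rate


def invert_polarity(samples):
    """Flip every sample's sign, like a polarity-invert button on a track."""
    return [-s for s in samples]


def mix(a, b):
    """Sum two equal-length tracks sample by sample."""
    if len(a) != len(b):
        raise ValueError("tracks must be the same length and aligned")
    return [x + y for x, y in zip(a, b)]


def rms(samples):
    """Root-mean-square level of a track."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def level_db(residual, reference):
    """Residual level relative to the reference, in dB (negative = quieter)."""
    r = rms(residual)
    if r == 0:
        return float("-inf")  # perfect null: nothing was taken away
    return 20 * math.log10(r / rms(reference))


def tone(freq_hz, amplitude):
    """One second of a sine wave (hypothetical test signal)."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * n / SAMPLE_RATE)
            for n in range(SAMPLE_RATE)]


# Stand-in signals, NOT real mp3 output.
decoded = tone(440, 1.0)
original = mix(decoded, tone(15_000, 0.01))

# The null test: original plus polarity-inverted decoded version.
residual = mix(original, invert_polarity(decoded))
print(f"Residual is {level_db(residual, original):.1f} dB relative to the original")
```

With these stand-in signals, what's left is just the faint 15 kHz tone, about 40 dB below the original. Mixing a file with its own inverted copy leaves pure silence, which is a handy sanity check before comparing real files.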
You can understand the technique easily with a graphic example, using complementary colors as a visual equivalent of audio polarity. Let's say my original file is a string of letters...
We run that file through data reduction software - in this case, an imaginary visual equivalent of mp3, set to remove all vowels:
In order to tell exactly what's been taken away, we flip the polarity of the compressed version - making red into cyan, and cyan into red:
Now it's just a simple matter of adding them, so the cyan areas combine with the red ones to make white. What's left is what the compression took away:
Piece of cake.
In audio it's almost as simple, using a multi-track audio program that lets you flip the polarity of one channel. I did it in the cross-platform, open source Audacity.
Hear the results
As a baseline, you might want to listen to the original montage, described earlier.
I turned it into a 128 kbps mp3 file, decoded it, and compared the two. Here's what's left, or what the mp3 process took away from the original.
If you don't hear anything, it's not because the file or the link is bad. You just have to listen very carefully: the tiny amount of sound is the total difference between the high-res and the mp3 versions!
(If you don't see a player at all, you'll need the QuickTime plug-in, free for Mac or Windows from Apple. I encoded these results files with Apple Lossless - which I can prove doesn't affect audio quality at all - to save download time.)
Even at 64 kbps, you don't lose much. Here's the result.
You can hear a difference when encoded at 32 kbps (like this), but notice that it's strictly high frequencies. That's because the standard for 32 kbps lowers the sample rate as well. But nobody recommends this bitrate for music.
Do it yourself
Feel free to try this with your own source material. If you want to replicate my tests exactly, use an encoder built on the open source LAME library. (I did mine in Bias Peak.)
Also, it's absolutely critical that you align the original and decoded versions perfectly, to the sample. Audacity makes this easy by zooming and slipping one track against the other:
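If you'd rather find the alignment numerically than by eye, here's a rough sketch of one way to do it: try each possible slip and keep the one that leaves the least residual. It assumes the decoded file only ever comes back later than the original (files that have been through an encoder often pick up extra samples at the head), and it assumes both tracks are already loaded as lists of sample values. The 37-sample delay and the noise signal are invented for the demonstration.

```python
import random


def residual_energy(original, decoded, offset):
    """Mean squared difference when decoded[i + offset] is lined up with original[i]."""
    n = min(len(original), len(decoded) - offset)
    return sum((original[i] - decoded[i + offset]) ** 2 for i in range(n)) / n


def find_offset(original, decoded, max_offset):
    """Try every slip from 0 to max_offset samples; return the one with the smallest residual."""
    return min(range(max_offset + 1),
               key=lambda k: residual_energy(original, decoded, k))


# Hypothetical test material: noise as the "original", and a copy
# delayed by 37 samples of silence standing in for the decoded file.
random.seed(1)
original = [random.uniform(-1.0, 1.0) for _ in range(2000)]
decoded = [0.0] * 37 + original

offset = find_offset(original, decoded, max_offset=100)
aligned = decoded[offset:offset + len(original)]
print("Slip the decoded track earlier by", offset, "samples")
```

A brute-force search like this is slow on full-length files, so in practice you'd run it on a short excerpt with a clear transient, then apply the slip it finds to the whole track. Being off by even one sample leaves a lot of high-frequency residue that the encoder never removed.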
Once you've aligned the decoded mp3 track against the original, mix the two together and listen. Try different kinds of source material... you'll probably find that certain kinds of music, like overly processed and very loud dance pieces with vocals, lose more through mp3 than quiet acoustic pieces. If you think you've guessed the reason for this, .
Want to learn more? I've written a couple of books on the subject...
Posted 8/16/08 © 2008 Jay Rose. May be linked to in its entirety, but not otherwise copied. Individual graphics or audio files may not be separately linked, copied, or excerpted.