Audio Compression
Compact Disc Quality Audio
Since the mid-1980s, the standard for high-quality audio has been the compact disc (abbreviated "CD"). Many mechanisms and formats for storing audio existed before the compact disc, but all of them were analog; the compact disc was the first transportable medium available to the consumer that offered digital storage. The advantages of digital storage are immense, but the most impressive is the nature of the digital medium itself: unlike analog media, which wear a little every time they are played or copied, a CD kept in good condition sounds the same on its thousandth play as on its first.
The Size Problem
It is possible (and fairly easy) to "rip" the audio data from a CD and store it in "WAV" files on a computer, and these files can be played back on demand. Ideally, you'd want to hear your music at this quality everywhere, since it's the highest quality you can typically purchase. You'd want copies of your music, at this quality, in your car, on your computer, in your portable music player, and in your stereo. Why is this not currently feasible? The answer is size.
A little math can reveal the space required to store sound information at this quality. Each sample is 16 bits, or two bytes. There are 44,100 samples each second, and since modern music is recorded in stereo, there is both a left and a right channel. This results in ( 2 x 44100 x 2 ) = 176,400 bytes to store one second's worth of samples. This means 10,584,000 bytes or approximately 10 megabytes to store just one minute of CD-quality audio.
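If you'd like to check the arithmetic yourself, here is a minimal Python sketch of the same calculation. The constant and function names are made up for this illustration; the numbers are the ones given above.

```python
# CD-quality parameters as described in the text.
BYTES_PER_SAMPLE = 2    # 16 bits per sample
SAMPLE_RATE = 44_100    # samples per second, per channel
CHANNELS = 2            # stereo: left and right

def cd_audio_bytes(seconds):
    """Bytes needed to store `seconds` of uncompressed CD-quality audio."""
    return BYTES_PER_SAMPLE * SAMPLE_RATE * CHANNELS * seconds

print(cd_audio_bytes(1))    # 176400
print(cd_audio_bytes(60))   # 10584000
```

By the same formula, a typical four-minute song comes to about 42 million bytes.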
If you wanted to download that over the Internet with a typical 28.8 kbps modem, it would take you about 49 minutes, even with the modem running at full speed the whole time. Just to download one minute of music!
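The download estimate can be worked out the same way. The sketch below assumes an idealized 28.8 modem that sustains its full 28,800 bits per second with no protocol overhead, which is an assumption made for this illustration; a real connection would be somewhat slower. The function name is hypothetical.

```python
def download_minutes(num_bytes, link_bits_per_second):
    """Minutes needed to transfer `num_bytes` over an ideal link of the given speed."""
    return num_bytes * 8 / link_bits_per_second / 60

one_minute_of_cd_audio = 10_584_000   # bytes, from the calculation in the text
print(download_minutes(one_minute_of_cd_audio, 28_800))   # 49.0
```

That is the same ballpark as the figure quoted above: the better part of an hour of downloading for each minute of listening.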
Fortunately, there is a solution. Compression is the technique of making a file take up less space while still containing the same, or very nearly the same, information. There are two categories of compression: lossless and lossy.
Lossless Compression
Lossless compression means that the compressed, smaller file can be expanded back into the original file without losing any information whatsoever. That is: take a file, compress it, and uncompress it again. If the result is bit-for-bit identical to the original file, 100% of the time, for any given input file, then the compression scheme is lossless. No information is lost.
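The round-trip test described above is easy to try. Here is a minimal sketch using Python's standard zlib module (the same DEFLATE family of compression that gzip uses); the helper name is invented for this example.

```python
import zlib

def is_lossless_roundtrip(data: bytes) -> bool:
    """Compress, uncompress, and check the result is bit-for-bit identical."""
    compressed = zlib.compress(data)
    restored = zlib.decompress(compressed)
    return restored == data

print(is_lossless_roundtrip(b"the same bytes come back " * 100))   # True
print(is_lossless_roundtrip(bytes(range(256))))                    # True
```

Passing on a handful of inputs doesn't prove a scheme is lossless, of course. The definition requires it to hold for every possible input, which is a property of the algorithm's design rather than something a few trials can establish.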
Unfortunately, compressing audio losslessly is hard. General-purpose compression programs like WinZip and gzip only manage about a 5% reduction in file size on average. Even "next-generation" utilities like WinRAR and bzip2 only manage a few percent more.
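You can measure this sort of thing for yourself. The sketch below uses Python's zlib module as a stand-in for a general-purpose compressor and feeds it a synthetic signal (a 440 Hz tone plus random noise, stored as 16-bit samples). Both the signal and the helper names are assumptions made for this illustration, so the percentage it prints won't match real music exactly; for a fair test, substitute the sample data from an actual WAV file.

```python
import math
import random
import zlib

def percent_reduction(original_size, compressed_size):
    """Percentage by which compression shrank the data (negative if it grew)."""
    return 100 * (original_size - compressed_size) / original_size

def synthetic_pcm(num_samples=44_100, seed=0):
    """Hypothetical stand-in for real audio: one second of a 440 Hz tone
    plus noise, as 16-bit little-endian mono samples."""
    rng = random.Random(seed)
    out = bytearray()
    for n in range(num_samples):
        tone = int(12_000 * math.sin(2 * math.pi * 440 * n / 44_100))
        value = tone + rng.randint(-2_000, 2_000)
        out += value.to_bytes(2, "little", signed=True)
    return bytes(out)

pcm = synthetic_pcm()
packed = zlib.compress(pcm, 9)   # 9 = maximum compression effort
print(f"{percent_reduction(len(pcm), len(packed)):.1f}% smaller")
```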
There are special-purpose compressors (like flac) which were designed solely for losslessly compressing audio, but even they only manage about a 50% reduction in file size on average. While this is enough for some, for music files to be truly portable they must be even smaller.
Lossy Compression
Lossy compression is any compression which causes information to be lost. Compressing and then uncompressing a file results in something similar, but not identical, to the original file. This is no good for things which must be interpreted by a computer, like executable programs/applications or most computer-readable data, but is often just fine for things where the interpretation is being done by a human (like photographs or sounds). The trick is to remove little bits of information in places where it can't be perceived.
Lossy audio compression works using a psychoacoustic model. That is, by modeling how your ears (and your brain) hear sound, it is possible to find places to remove information that you wouldn't have perceived anyway. A full treatment of these techniques is beyond the scope of this document, but here are two simple examples:
Though humans can technically hear tones up to 20 kHz in pitch, most can't hear anything above 15 kHz, especially when other sounds are present. However, most CD-quality audio contains information for reproducing these tones anyway. By filtering out tones outside this range, you reduce the amount of information that has to be stored without affecting the perceived sound quality.
Similarly, if a piece of music contains a loud bass drum hit (as most rock and roll does, a couple of times each second), the ear is too busy reacting to the percussive hit of the drum to register quieter sounds for a few milliseconds afterward. By storing the audio immediately after such sounds in much less detail, less information can be stored while still maintaining the same perceived sound quality.
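To make the first example concrete, here is a rough sketch of a low-pass filter with a 15 kHz cutoff, written in plain Python. It uses a textbook windowed-sinc design with a Hamming window, which is an assumption chosen for this illustration and not how any particular encoder does it (real encoders work on frequency bands rather than filtering samples like this). All function names are hypothetical.

```python
import math

SAMPLE_RATE = 44_100   # CD sample rate in Hz
CUTOFF_HZ = 15_000     # upper limit of hearing suggested in the text

def lowpass_kernel(cutoff_hz, sample_rate, num_taps=101):
    """Windowed-sinc low-pass FIR kernel (Hamming window), unity gain at 0 Hz."""
    fc = cutoff_hz / sample_rate          # cutoff in cycles per sample
    mid = (num_taps - 1) // 2
    kernel = []
    for n in range(num_taps):
        x = n - mid
        ideal = 2 * fc if x == 0 else math.sin(2 * math.pi * fc * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        kernel.append(ideal * window)
    total = sum(kernel)
    return [k / total for k in kernel]

def apply_filter(signal, kernel):
    """Convolve, keeping only outputs where the kernel fully overlaps the signal."""
    taps = len(kernel)
    return [sum(signal[i + j] * kernel[taps - 1 - j] for j in range(taps))
            for i in range(len(signal) - taps + 1)]

def tone(freq_hz, num_samples=2_000):
    """A pure sine tone at the given pitch."""
    return [math.sin(2 * math.pi * freq_hz * n / SAMPLE_RATE) for n in range(num_samples)]

def rms(samples):
    """Root-mean-square level, a simple measure of loudness."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

kernel = lowpass_kernel(CUTOFF_HZ, SAMPLE_RATE)
for freq in (1_000, 18_000):
    before = rms(tone(freq))
    after = rms(apply_filter(tone(freq), kernel))
    print(f"{freq} Hz: {100 * after / before:.1f}% of original level")
```

The 1 kHz tone should come through essentially untouched, while the 18 kHz tone, which sits above the cutoff, is reduced to a tiny fraction of its original level.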
Using sophisticated techniques such as these, lossy audio compression formats such as MP3 can achieve results which most listeners cannot distinguish from the original, CD-quality sound but are a mere 10 to 20% of the size.
And what's even better, being more aggressive with these techniques can result in files which are less than 5% of the original size but still sound quite good on normal equipment.
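To put those percentages in more familiar terms, the sketch below converts them into bytes per minute and into a bit rate, starting from the 176,400 bytes per second worked out earlier. The function names are invented for this example, and "kbps" here means 1,000 bits per second.

```python
CD_BYTES_PER_SECOND = 2 * 44_100 * 2            # 176,400 bytes each second
CD_BITS_PER_SECOND = CD_BYTES_PER_SECOND * 8    # 1,411,200 bits each second

def compressed_bytes_per_minute(percent_of_original):
    """Bytes per minute if the file is `percent_of_original` % of CD size."""
    return CD_BYTES_PER_SECOND * 60 * percent_of_original // 100

def compressed_kbps(percent_of_original):
    """Equivalent bit rate in kbps (1 kbps = 1,000 bits per second)."""
    return CD_BITS_PER_SECOND * percent_of_original / 100 / 1000

for percent in (20, 10, 5):
    print(percent, compressed_bytes_per_minute(percent), round(compressed_kbps(percent), 1))
# 20 2116800 282.2
# 10 1058400 141.1
# 5 529200 70.6
```

At 5% of the original size, one minute of music is roughly half a megabyte instead of ten.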