Bit Rates When Encoding MP3 Files

The Southern California OS/2 User Group
USA

February 2002

MP3 Bit Rates Explained for the Layman

by Darryl Sperber

Darryl wrote the following in response to a question on comp.os.os2.multimedia. It's a great education in MP3 encoding and we think you'll find his article fascinating.

Background: An MP3 file is a sound file (often taken from a WAV file or a track from an audio CD) that is compressed to make it smaller. Much of the music you download from Napster and similar sites is encoded into MP3 files.

Welcome to the mystical topic of choosing a "bit rate" when creating an MP3 sound file!

When you make an MP3 file you have to make two choices: the type of bit rate to use and the value of that bit rate. The type is one of two kinds, either fixed/constant bit rate ("CBR") or variable bit rate ("VBR").

Fixed Bit Rate Encoding

In the case of a fixed bit rate, a person creates the MP3 file using an "encoder" program where the desired constant bit rate is specified in advance. It is usually selected as a command-line parameter value but can also be specified as some GUI setting if that is the way the software was written.

Generally speaking, the higher the fixed MP3 bit rate selected at encoding time, the closer the result will be to the original CD/WAV source, and the "better" the resulting MP3 file will sound when played back (assuming you're using a high-quality player program). And naturally, the higher the fixed bit rate, the larger the resulting output file.

With CBR, the same fixed/constant number of bits per second is written to the encoded MP3 output file no matter what the source file contains, so the output MP3 file size can be predicted in advance from the running time of the track being encoded.

And the same number of bits per second is used regardless of the amount of "information" in the track being encoded. So a one minute WAV file in which a constant tone alternates between two seconds on and two seconds off (totally silent) will generate an output file whose size is fixed: approximately the bit rate times the length of the track.

If the one minute file instead has that same constant tone for the entire one minute, without the alternating silent periods, the output CBR MP3 file will again be exactly the same size. The size of these two CBR-encoded MP3 files will be identical.

And if instead you encoded a one minute musical WAV selection which had completely normal musical variations during that one minute, it would again produce an output file which is identical in size to the file produced as described above, from the alternating/constant/silent tone one minute WAV file.

One minute of CBR MP3 is always the same size (depending on the bit rate selected), regardless of what is in the input file.

In other words a "constant bit rate" encoding uses that many bits per second regardless of what the input source is -- sound or silence, fixed or varying sound frequency, whatever the volume and intensity.
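The size arithmetic for CBR can be sketched in a few lines of Python. This is only a back-of-the-envelope estimate: the function name is made up for illustration, and it ignores small extras such as tags and frame padding.

```python
def cbr_size_bytes(bit_rate_bps, seconds):
    """Estimate CBR MP3 size: bits per second times seconds, at 8 bits per byte."""
    return bit_rate_bps * seconds // 8


# One minute of audio at three common bit rates. The content of the
# track never enters the calculation, only its length.
for rate in (128_000, 160_000, 192_000):
    print(rate, "bps ->", cbr_size_bytes(rate, 60), "bytes")
```

At 128,000 bits per second, one minute comes to 960,000 bytes whether the track holds a tone, silence, or music.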

Variable Bit Rate Encoding

In contrast, "variable bit rate" (VBR) encoding uses a dynamic algorithm whose bit rate at any point is based on "need" and varies as a function of the input sound source data values.

With VBR encoding it takes very few bits to represent a section of a WAV sound that is constant, because the sound doesn't change and there is little new to describe. Conceptually, all that's needed is to describe the particular sound or silence by its frequency, volume, intensity, absence, etc., and that description stays short for as long as the sound stays the same.

(This is a simplified picture. An MP3 stream is actually cut into short "frames", each about 26 milliseconds long for CD-rate audio, and a VBR encoder chooses a bit rate for each frame separately.)

So the more the input WAV source appears "quiet" or "constant" or "non-varying", the greater is the "mathematical advantage" of the VBR encoding approach. During each of these periods of constant sound (or silence) very few bits are required. Only when the sound once again becomes more complex does the encoder need to step the bit rate back up.

Clearly, then, VBR files are "unpredictable" in output size. The size is a function of the input source WAV data, since all of the factors (frequency, loudness, etc.) which make up that sound will vary. But VBR-encoded files are almost always smaller than their CBR counterparts encoded at the same maximum bit rate, especially for slower, quieter, more non-varying input files.

There is some additional information required in the VBR output datastream (namely the bit rate in force for each stretch of sound), but it is tiny compared with the sound data itself. The bits are spent where the sound is busy, and very few are spent during the intervening quiet or constant periods.

When you do VBR encoding, you give the encoder program a "range" of bit rates (really just a specification of the maximum bit rate you will allow it to go up to) which you will permit it to vary across while performing the MP3 encoding. The mathematical design of the process will determine how many bits per second are required to adequately represent the sound being encoded at that moment, up to the maximum you allow.

And of course the mathematical design of the VBR encoding algorithm has much to do with the resulting audio quality when played back. Intuitively, more elaborate and more sophisticated analysis algorithms will take more CPU power and will be slower, while faster algorithms might make some compromises.

Depending on the source material, the encoded VBR result might be almost entirely at the maximum bit rate allowed (if the "variability" of the source is extremely high) or considerably less than that (if the source is much more "quiet" and "non-varying").
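To make the idea concrete, here is a toy model in Python. It is a sketch under loud assumptions, not how any real encoder decides: the "need" score (0.0 for silence, 1.0 for the busiest sound) and the function names are invented for illustration, whereas real encoders use psychoacoustic analysis. What is real is the list of bit rates an MPEG-1 Layer III frame may use and the 1152 samples each frame covers.

```python
# Bit rates (bits per second) that an MPEG-1 Layer III frame may use.
LAYER3_BIT_RATES = [32_000, 40_000, 48_000, 56_000, 64_000, 80_000, 96_000,
                    112_000, 128_000, 160_000, 192_000, 224_000, 256_000, 320_000]

# One frame covers 1152 samples; a 44.1 kHz (CD-rate) source is assumed.
FRAME_SECONDS = 1152 / 44_100


def pick_bit_rate(need, max_bps):
    """Hypothetical rule: scale 'need' to the cap, then round up to a legal rate."""
    allowed = [r for r in LAYER3_BIT_RATES if r <= max_bps]
    wanted = need * max_bps
    for rate in allowed:
        if rate >= wanted:
            return rate
    return allowed[-1]


def vbr_summary(needs, max_bps):
    """Return (average bit rate, approximate size in bytes) for a list of frame needs."""
    rates = [pick_bit_rate(n, max_bps) for n in needs]
    total_bits = sum(r * FRAME_SECONDS for r in rates)
    return sum(rates) / len(rates), total_bits / 8


quiet = [0.0] * 1000   # roughly 26 seconds of silence
busy = [1.0] * 1000    # roughly 26 seconds of very complex sound
print(vbr_summary(quiet, 192_000))
print(vbr_summary(busy, 192_000))
```

The busy track sits at the 192,000 cap for every frame, just like a CBR file, while the quiet track drops to the lowest legal rate and comes out about one sixth the size.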

As expected, the resulting sound quality of the VBR output can be no better than what the maximum bit rate permitted can deliver. But if the input source data can be easily "represented" within that constraint you probably won't notice the difference and the file size will be much smaller than its CBR counterpart.

If the upper limit permitted in a VBR (or CBR, for that matter) encoding isn't high enough, there's no question that playback sound quality will suffer.

Bit Rate Displayed By MP3 Player

The bit rate that an MP3 player program displays when it reads CBR input files is obviously the CBR value itself. 128,000 bits per second (128 kbps) is usually the minimum default value for "acceptable" quality, although file size may also be on the mind of the person running the encoder program, so the choice of this value is a tradeoff.

The VBR value of the encoded MP3 data changes from moment to moment as the file plays. So what the MP3 player program displays is sort of up to the software author. The player program could be designed to display a "VBR indicator" with an updated value every second as the VBR bit rate changes. It wouldn't make any CPU-cycle sense to try to "average" the variable bit rates encountered during a time interval and then display that average, since you want to listen to an MP3 file, not watch the VBR bit rate value change.
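For the curious, here is roughly where a player finds the number it displays. Every MP3 frame begins with a 4-byte header, and four bits of the third byte are an index into a table of bit rates. The Python sketch below handles only MPEG-1 Layer III headers; the function name and the sample header bytes are made up for illustration.

```python
# Bit rate table for MPEG-1 Layer III, in kbps. Index 0 means "free format"
# and index 15 is invalid, so both are treated as unusable here.
LAYER3_KBPS = [None, 32, 40, 48, 56, 64, 80, 96, 112,
               128, 160, 192, 224, 256, 320, None]


def frame_bit_rate(header):
    """Return the bit rate in bits per second from a 4-byte frame header."""
    # The header starts with 11 sync bits, all set to 1.
    if len(header) < 4 or header[0] != 0xFF or (header[1] & 0xE0) != 0xE0:
        raise ValueError("no frame sync")
    # Version bits 11 (MPEG-1) followed by layer bits 01 (Layer III).
    if (header[1] & 0x1E) != 0x1A:
        raise ValueError("not MPEG-1 Layer III")
    kbps = LAYER3_KBPS[header[2] >> 4]
    if kbps is None:
        raise ValueError("free-format or invalid bit rate index")
    return kbps * 1000


# Two hand-made headers: a VBR file could contain both kinds of frame.
for raw in (bytes([0xFF, 0xFB, 0x90, 0x00]), bytes([0xFF, 0xFB, 0xB0, 0x00])):
    print(frame_bit_rate(raw))
```

A CBR file repeats the same index in every frame, so the display never changes; a VBR file may carry a different index from one frame to the next.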

A more worthwhile use of CPU cycles in VBR decoder/players would be such things as calculating an EQ (tone control) to improve sound quality still further or driving a visualization plugin.

Also, a VBR playback involves some extra "overhead" compared with a CBR playback, because the player must read the bit rate of each new section of MP3 input data before it can decode that section and convert it to analog sound output. The extra work is small next to the decoding itself, but on a slow machine CBR playback may leave a few more CPU cycles free for the EQ or visualization plugin.

Bit Rate versus Sound Quality

Note that there are definite sonic differences in an encoded MP3 sound file as the bit rate upper limit is increased. For example, 192,000 bits per second is the smallest bit rate at which my DOS Fraunhofer encoder will produce "true stereo" L/R output. Anything less than that provides some "compromise" in resulting stereo quality with some "mixing" of L/R data.
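The "mixing" of L/R data is most likely what the MP3 format calls joint stereo; that is an assumption, since the encoder's documentation isn't quoted here. Whichever mode an encoder picked, it is recorded in the top two bits of the fourth byte of each frame header, so it can be checked. A minimal Python sketch, with a made-up function name and hand-made sample headers:

```python
# Channel mode values, in the order used by the frame header's two mode bits.
CHANNEL_MODES = ["stereo", "joint stereo", "dual channel", "mono"]


def channel_mode(header):
    """Return the channel mode named in a 4-byte MPEG audio frame header."""
    # The header starts with 11 sync bits, all set to 1.
    if len(header) < 4 or header[0] != 0xFF or (header[1] & 0xE0) != 0xE0:
        raise ValueError("no frame sync")
    return CHANNEL_MODES[header[3] >> 6]


print(channel_mode(bytes([0xFF, 0xFB, 0xB0, 0x00])))  # a 192 kbps frame, plain stereo
print(channel_mode(bytes([0xFF, 0xFB, 0x90, 0x40])))  # a 128 kbps frame, joint stereo
```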

Also, my Fraunhofer is a CBR encoder, not a VBR encoder. So the 192,000 bit rate at which I do my MP3 encoding is definitely going to produce larger MP3 files than if I used 128,000 or 160,000. But my personal primary objective is "sonic quality", not smaller file sizes. These MP3 files are for my own use and enjoyment, and 192,000 produces a noticeably superior result when played back.

I also run my Fraunhofer CBR encoder at "maximum quality", which makes the audio analysis algorithms run at their slowest. Thus my MP3 encoding time is longer than it might be but the audio quality result is what I'm shooting for. You only produce this encoded MP3 file once, but you listen to it forever. So why not make it sound its best and not be concerned with a little extra time to do that?

In general, the higher the CBR bit rate or VBR upper-limit bit rate, the better will be the audio quality result. This is common sense.

Happy encoding!

If you have comments or questions, you can reach the author by email.

The Southern California OS/2 User Group
P.O. Box 26904
Santa Ana, CA 92799-6904, USA