February 2002

Bit Rates When Encoding MP3 Files
MP3 Bit Rates Explained for the Layman
by Darryl Sperber
 Darryl wrote the following in response to a question on comp.os.os2.multimedia.
It's a great education in MP3 encoding and we think you'll find his article fascinating.
         
Background:
An MP3 file is a sound file (often taken from a WAV file or a track from an audio CD)
that is compressed to make it smaller.
Much of the music you download from Napster and similar sites is encoded into MP3 files.
         
 
Welcome to the mystical topic of choosing a "bit rate" when creating an MP3 sound file!
 
When you make an MP3 file you have to make two choices: the type of bit rate to use
and the value of that bit rate (the number of bits per second).
The bit rate used when encoding an MP3 file is one of two types, either fixed/constant 
bit rate ("CBR") or variable bit rate ("VBR").
 
Fixed Bit Rate Encoding

In the case of a fixed bit rate, a person creates the MP3 file using an "encoder" program
where the desired constant bit rate is specified in advance.
It is usually selected as a command-line parameter value but can also be specified as 
some GUI setting if that is the way the software was written.
Generally speaking, the higher the fixed MP3 bit rate selected at encoding time, the
closer the result will be to the original CD/WAV source, and the "better" the resulting
MP3 file will sound when played back (assuming you're using a high-quality player
program).
And naturally, the higher the fixed bit rate the larger the resulting output file.
 
With CBR, the same fixed/constant number of bits per second is generated into the
encoded MP3 file output no matter what the source file, so that the output MP3 file size
can actually be predicted in advance based on the time length of the track being
encoded.
 
And the same number of bits per second is used regardless of the amount of
"information" in the track being encoded.
So a one minute WAV file which has a constant tone that alternates between two seconds
on and two seconds off (totally silent) will generate an output file of fixed size,
approximately the bit rate times the length of the track.
 
If the one minute file instead has that same constant tone for the entire one minute,
without the alternating silent periods, the output CBR MP3 file will again be exactly the
same size: the two CBR-encoded MP3 files will be identical in size.
 
And if instead you encoded a one minute musical WAV selection which had completely 
normal musical variations during that one minute, it would again produce an output file 
which is identical in size to the file produced as described above, from the 
alternating/constant/silent tone one minute WAV file.
 
One minute of CBR MP3 is always the same size (depending on the bit rate selected), 
regardless of what is in the input file.
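
The size prediction described above is simple arithmetic: bits per second times seconds, divided by 8 to get bytes. Here is a minimal Python sketch. The function name `cbr_size_bytes` is only illustrative, and the result covers the audio data only, ignoring any tags or other extras an encoder might add.

```python
def cbr_size_bytes(bit_rate_bps: int, seconds: float) -> int:
    """Approximate size of the audio data in a CBR MP3 file, in bytes."""
    # bits per second * seconds = total bits; 8 bits per byte
    return int(bit_rate_bps * seconds / 8)


# One minute of audio at three common constant bit rates.
# The content of the track (tone, silence, music) plays no part in the result.
for rate in (128_000, 160_000, 192_000):
    size = cbr_size_bytes(rate, 60)
    print(f"{rate:>7} bits/s for 60 s -> {size:,} bytes")
```

At 128,000 bits per second, one minute comes to 960,000 bytes, whatever the input sounds like.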
 
In other words a "constant bit rate" encoding uses that many bits per second regardless 
of what the input source is -- sound or silence, fixed or varying sound frequency, 
whatever the volume and intensity.
 
Variable Bit Rate Encoding

In contrast, "variable bit rate" (VBR) encoding uses a dynamic algorithm whose bit rate
at any point is based on "need" and varies as a function of the input sound source data
values.
With VBR encoding it takes very little to represent a section of a WAV sound that is 
constant because the sound doesn't change and the encoder can just tell the player to 
keep repeating each cycle of the sound.
All that's conceptually needed is to describe the particular sound or silence by its 
frequency, volume, intensity, absence, etc., plus an indication of the time duration for 
which that combination of values (or absence) is to be maintained, until the next change 
in the sound.
 
Once the sound is "described", these encoded values will "apply" until a new value of 
VBR data is encountered indicating that the sound state values should change.
 
So the more the input WAV source appears "quiet" or "constant" or "non-varying", the 
greater is the "mathematical advantage" of the VBR encoding approach.
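During each of these periods of constant sound (or silence), conceptually, essentially NO
bits are required except during the very first cycle.
Only when the sound once again changes from its last state is new encoded data
required.
(In an actual MP3 file every frame still carries some data, so a real encoder drops to
its lowest bit rate for such passages rather than to zero.)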
 
Clearly, then, VBR files are "unpredictable" in output size.
Their size is a function of the input source WAV data, since all of the factors
(frequency, loudness, etc.) which make up that sound data will vary.
But VBR-encoded files are almost always smaller than their CBR counterparts, 
especially for slower, quieter, more non-varying input files.
 
There is some additional information required in the VBR output datastream (namely the
time duration for which each described set of sound values is to be maintained), but it
is only required at the beginning of that sound's time interval.
New information is only required when the sound state changes, with zero additional 
information required during the intervening constant state period.
 
When you do VBR encoding, you give the encoder program a "range" of bit rates (really
just a specification of the maximum bit rate you will allow it to go up to) across which
it may vary while performing the MP3 encoding.
The mathematical design of the process will determine how many bits per second are 
required to adequately represent the sound being encoded at that moment, up to the 
maximum you allow.
 
And of course the mathematical design of the VBR encoding algorithm has much to do 
with the resulting audio quality when played back.
Intuitively, more elaborate and more sophisticated analysis algorithms will take
more CPU power and will be slower, while faster algorithms might make some
compromises.
 
Depending on the source material, the encoded VBR result might be almost entirely at 
the maximum bit rate allowed (if the "variability" of the source is extremely high) or 
considerably less than that (if the source is much more "quiet" and "non-varying").
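
To make the idea of "varying up to a maximum" concrete, here is a toy Python sketch. It is not how any real encoder decides. The `complexity` score (0.0 for silence, 1.0 for very busy sound) and the `pick_kbps` rule are hypothetical stand-ins for the real audio analysis. The list of allowed rates and the frame size formula (144 * bit rate / sample rate, ignoring padding) are those of MPEG-1 Layer III at the CD sample rate.

```python
SAMPLE_RATE = 44_100  # CD audio, in Hz
# Bit rates (in kbps) that an MPEG-1 Layer III frame may use.
ALLOWED_KBPS = [32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320]


def pick_kbps(complexity: float, max_kbps: int) -> int:
    """Hypothetical rule: lowest allowed rate that meets the 'need', capped at max_kbps.

    Assumes max_kbps is at least 32, the lowest allowed rate.
    """
    wanted = 32 + complexity * (320 - 32)
    candidates = [r for r in ALLOWED_KBPS if r <= max_kbps]
    for rate in candidates:
        if rate >= wanted:
            return rate
    return candidates[-1]  # need exceeds the cap, so use the maximum allowed


def frame_bytes(kbps: int) -> int:
    """Size of one frame in bytes (padding byte ignored)."""
    return 144 * kbps * 1000 // SAMPLE_RATE


def vbr_size_bytes(complexities, max_kbps: int) -> int:
    """Total size of a run of frames, one complexity score per frame."""
    return sum(frame_bytes(pick_kbps(c, max_kbps)) for c in complexities)


quiet = [0.05] * 1000  # 1000 frames of nearly constant sound
busy = [0.90] * 1000   # 1000 frames of highly varying sound

print("quiet:", vbr_size_bytes(quiet, 192), "bytes")
print("busy: ", vbr_size_bytes(busy, 192), "bytes")
```

With a 192 kbps ceiling, the busy input sits at the maximum for every frame and comes out the same size as a 192 kbps CBR file would, while the quiet input comes out far smaller.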
 
As expected, the resulting sound quality of the VBR output can be no better than what
the maximum bit rate permits.
But if the input source data can be easily "represented" within that constraint you 
probably won't notice the difference and the file size will be much smaller than its CBR 
counterpart.
 
If the upper limit permitted in a VBR (or CBR, for that matter) encoding isn't high 
enough, there's no question that playback sound quality will suffer.
 
Bit Rate Displayed By MP3 Player

The bit rate that an MP3 player program displays when it reads CBR input files is
obviously the CBR value itself.
128,000 bits per second (128 kbps) is usually the default, and the minimum for
"acceptable" quality, although file size may also be on the mind of the person running
the encoder program, so the choice of this value is a tradeoff.
With VBR, the bit rate of the encoded MP3 data changes as playback proceeds.
So what the MP3 player program displays is really up to the software author.
The player program could be designed to display a "VBR indicator" with an updated 
value every second as the VBR bit rate changes.
It wouldn't make any CPU-cycle sense to try to "average" the variable bit rates
encountered during a time interval and then display that average, since you want to
listen to an MP3 file, not watch the VBR bit rate value change.
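
For the curious, the value a player would show for "the current bit rate" comes from the MP3 stream itself: each frame begins with a 4-byte header that includes a bit rate index. The following is a minimal Python sketch, assuming MPEG-1 Layer III audio only (the usual case for MP3s made from CDs). The function name `frame_bit_rate` is illustrative and not taken from any particular player.

```python
# Bit rate table for MPEG-1 Layer III, indexed by the 4-bit field in the header.
# Index 0 means "free format" and index 15 is invalid; both are treated as unknown here.
BITRATES_KBPS = [None, 32, 40, 48, 56, 64, 80, 96, 112,
                 128, 160, 192, 224, 256, 320, None]


def frame_bit_rate(header: bytes):
    """Return the bit rate in bits per second for an MPEG-1 Layer III frame header.

    Returns None if the bytes are not such a header or the rate is unknown.
    """
    if len(header) < 4:
        return None
    b0, b1, b2 = header[0], header[1], header[2]
    # The first 11 bits of every frame header are all ones (the sync pattern).
    if b0 != 0xFF or (b1 & 0xE0) != 0xE0:
        return None
    version = (b1 >> 3) & 0x3  # 3 means MPEG-1
    layer = (b1 >> 1) & 0x3    # 1 means Layer III
    if version != 3 or layer != 1:
        return None
    kbps = BITRATES_KBPS[b2 >> 4]  # top 4 bits of the third byte
    return kbps * 1000 if kbps else None


print(frame_bit_rate(bytes([0xFF, 0xFB, 0x90, 0x00])))  # a 128 kbps frame
print(frame_bit_rate(bytes([0xFF, 0xFB, 0xB0, 0x00])))  # a 192 kbps frame
```

In a CBR file every frame reports the same value; in a VBR file the value can differ from one frame to the next.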
 
A more worthwhile use of CPU cycles in VBR decoder/players would be such things as 
calculating an EQ (tone control) to improve sound quality still further or driving a 
visualization plugin.
 
Also, more CPU cycles are spent in "overhead" for a VBR playback than a CBR 
playback, as the extra VBR bit rate values must be utilized to determine and then use 
the bit rate of the next section of MP3 input data, insofar as how it is to be decoded and 
converted to analog sound output.
CBR playback therefore leaves more spare CPU cycles for use by the EQ or
visualization plugin.
VBR thus needs a stronger machine to "sound good", because of the additional VBR
overhead.
 
Bit Rate versus Sound Quality

Note that there are definite sonic differences in an encoded MP3 sound file as the bit
rate upper limit is increased.
For example, 192,000 bits per second is the lowest bit rate at which my DOS Fraunhofer
encoder will produce "true stereo" L/R output.
Anything less than that provides some "compromise" in resulting stereo quality with
some "mixing" of L/R data.
Also, my Fraunhofer is a CBR encoder, not a VBR encoder.
So the 192,000 bit rate at which I do my MP3 encoding is definitely going to produce 
larger MP3 files than if I used 128,000 or 160,000.
But my personal primary objective is "sonic quality", not smaller file sizes.
These MP3 files are for my own use and enjoyment, and 192,000 produces a noticeably 
superior result when played back.
 
I also run my Fraunhofer CBR encoder at "maximum quality", which makes the audio 
analysis algorithms run at their slowest.
Thus my MP3 encoding time is longer than it might be but the audio quality result is 
what I'm shooting for.
You only produce this encoded MP3 file once, but you listen to it forever.
So why not make it sound its best and not be concerned with a little extra time to do 
that?
 
In general, the higher the CBR bit rate or VBR upper-limit bit rate, the better will be the 
audio quality result.
This is common sense.
 
Happy encoding!
 
 If you have comments or questions, you can reach the author by 
email.
 
 The Southern California OS/2 User Group
 P.O. Box 26904
 Santa Ana, CA  92799-6904, USA
 
Copyright 2002 the Southern California OS/2 User Group.  ALL RIGHTS 
RESERVED. 
 
SCOUG, Warp Expo West, and Warpfest are trademarks of the Southern California OS/2 User Group.
OS/2, Workplace Shell, and IBM are registered trademarks of International 
Business Machines Corporation.
All other trademarks remain the property of their respective owners.