<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
<head>
<meta http-equiv="Content-Type" content="application/xhtml+xml; charset=UTF-8" />
<meta name="generator" content="AsciiDoc 8.6.3" />
<title>TwoLAME: MPEG Audio Layer II VBR</title>
<link rel="stylesheet" href="./twolame.css" type="text/css" />
<script type="text/javascript">
/*<![CDATA[*/
window.onload = function(){asciidoc.footnotes();}
/*]]>*/
</script>
<script type="text/javascript" src="./asciidoc-xhtml11.js"></script>
</head>
<body class="article">
<div id="header">
<h1>TwoLAME: MPEG Audio Layer II VBR</h1>
<span id="revnumber">version 0.3.13</span>
</div>
<div id="content">
<div class="sect1">
<h2 id="_contents">Contents</h2>
<div class="sectionbody">
<div class="ulist"><ul>
<li>
<p>
Introduction
</p>
</li>
<li>
<p>
Usage
</p>
</li>
<li>
<p>
Bitrate Ranges for various Sampling frequencies
</p>
</li>
<li>
<p>
Why can’t the bitrate vary from 32kbps to 384kbps for every file?
</p>
</li>
<li>
<p>
Short Answer
</p>
</li>
<li>
<p>
Long Answer
</p>
</li>
<li>
<p>
Tech Stuff
</p>
</li>
</ul></div>
</div>
</div>
<div class="sect1">
<h2 id="_introduction">Introduction</h2>
<div class="sectionbody">
<div class="paragraph"><p>VBR mode works by selecting a different bitrate for each frame. Frames
which are harder to encode are allocated more bits, i.e. a higher bitrate.</p></div>
<div class="paragraph"><p>LayerII VBR is a complete hack - the ISO standard actually says that decoders are not
required to support it. As a hack, its implementation is a pain to try and understand.
If you’re mega-keen to get full range VBR working, either (a) send me money or (b) grab the
ISO standard and a C compiler and email me.</p></div>
</div>
</div>
<div class="sect1">
<h2 id="_usage">Usage</h2>
<div class="sectionbody">
<div class="literalblock">
<div class="content">
<pre><tt>twolame -v [level] inputfile outputfile</tt></pre>
</div></div>
<div class="paragraph"><p>A level of 5 works very well for me.</p></div>
<div class="paragraph"><p>The level value is a measure of quality: the higher
the level, the higher the average bitrate of the resulting file.
See "Tech Stuff" for a better explanation of what the value does.</p></div>
<div class="paragraph"><p>The confusing part of my implementation of LayerII VBR is that it’s different from MP3 VBR.</p></div>
<div class="ulist"><ul>
<li>
<p>
The range of bitrates used is controlled by the input sampling frequency. (See "Bitrate Ranges" below.)
</p>
</li>
<li>
<p>
The tendency to use higher bitrates is governed by the &lt;level&gt;.
</p>
</li>
</ul></div>
<div class="paragraph"><p>For example, say you have a 44.1kHz stereo file. In VBR mode, the bitrate can range from 192 to 384 kbps.</p></div>
<div class="paragraph"><p>Using "-v -5" will bias the encoder towards the lower bitrates.</p></div>
<div class="paragraph"><p>Using "-v 5" will bias the encoder towards the upper bitrates.</p></div>
<div class="paragraph"><p>The value can actually be <strong>any</strong> integer, e.g. -27, 233 or 47. The larger the number, the greater
the bias towards higher bitrates.</p></div>
</div>
</div>
<div class="sect1">
<h2 id="_bitrate_ranges">Bitrate Ranges</h2>
<div class="sectionbody">
<div class="paragraph"><p>When making a VBR stream, the bitrate is only allowed to vary within
set limits:</p></div>
<div class="literalblock">
<div class="content">
<pre><tt>48kHz
Stereo: 112-384kbps Mono: 56-192kbps</tt></pre>
</div></div>
<div class="literalblock">
<div class="content">
<pre><tt>44.1kHz &amp; 32kHz
Stereo: 192-384kbps Mono: 96-192kbps</tt></pre>
</div></div>
<div class="literalblock">
<div class="content">
<pre><tt>24kHz, 22.05kHz &amp; 16kHz
Stereo/Mono: 8-160kbps</tt></pre>
</div></div>
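<div class="paragraph"><p>As a quick reference, the limits above can be written as a small lookup. This is only a sketch in Python, not code from the encoder: the function name <tt>vbr_range</tt> and its arguments are made up for illustration, and the numbers are simply the ranges listed above.</p></div>
<div class="listingblock">
<div class="content">
<pre><tt>
```python
def vbr_range(sample_rate_hz, channels):
    """Return (min_kbps, max_kbps) for a VBR stream, using the limits listed above.

    Hypothetical helper for illustration only; it is not part of the encoder.
    """
    if channels not in (1, 2):
        raise ValueError("channels must be 1 (mono) or 2 (stereo)")
    if sample_rate_hz in (24000, 22050, 16000):
        # Lower sampling frequencies: one range for both mono and stereo.
        return (8, 160)
    if sample_rate_hz == 48000:
        return (112, 384) if channels == 2 else (56, 192)
    if sample_rate_hz in (44100, 32000):
        return (192, 384) if channels == 2 else (96, 192)
    raise ValueError("unsupported sampling frequency")


if __name__ == "__main__":
    print(vbr_range(44100, 2))
    print(vbr_range(48000, 1))
```
</tt></pre>
</div></div>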
</div>
</div>
<div class="sect1">
<h2 id="_why_doesn_8217_t_the_vbr_mode_work_the_same_as_mp3vbr_the_short_answer">Why doesn’t the VBR mode work the same as MP3 VBR? The Short Answer</h2>
<div class="sectionbody">
<div class="paragraph"><p><strong>Why can’t the bitrate vary from 32kbps to 384kbps for every file?</strong></p></div>
<div class="paragraph"><p>According to the standard (ISO/IEC 11172-3:1993) Section 2.4.2.3</p></div>
<div class="literalblock">
<div class="content">
<pre><tt>"In order to provide the smallest possible delay and complexity, the
decoder is not required to support a continuously variable bitrate when
in layer I or II. Layer III supports variable bitrate by switching the
bitrate index."</tt></pre>
</div></div>
<div class="paragraph"><p>and</p></div>
<div class="literalblock">
<div class="content">
<pre><tt>"For Layer II, not all combinations of total bitrate and mode are allowed."</tt></pre>
</div></div>
<div class="paragraph"><p>Hence, most LayerII coders would not have been written with VBR in mind, and
LayerII VBR is a hack. It works for limited cases. Getting it to work to
the same extent as MP3-style VBR will be a major hack.</p></div>
<div class="paragraph"><p>(If you <strong>really</strong> want better bitrate ranges, read "The Long Answer" and submit your mega-patch.)</p></div>
</div>
</div>
<div class="sect1">
<h2 id="_why_doesn_8217_t_the_vbr_mode_work_the_same_as_mp3vbr_the_long_answer">Why doesn’t the VBR mode work the same as MP3 VBR? The Long Answer</h2>
<div class="sectionbody">
<div class="paragraph"><p><strong>Why can’t the bitrate vary from 32kbps to 384kbps for every file?</strong></p></div>
<div class="sect2">
<h3 id="_reason_1_the_standard_limits_the_range">Reason 1: The standard limits the range</h3>
<div class="paragraph"><p>As quoted above from the standard for 48/44.1/32kHz:</p></div>
<div class="literalblock">
<div class="content">
<pre><tt>"For Layer II, not all combinations of total bitrate and mode are allowed. See
the following table."</tt></pre>
</div></div>
<div class="literalblock">
<div class="content">
<pre><tt>Bitrate (kbps)   Allowed modes
32               mono only
48               mono only
56               mono only
64               all modes
80               mono only
96               all modes
112              all modes
128              all modes
160              all modes
192              all modes
224              stereo only
256              stereo only
320              stereo only
384              stereo only</tt></pre>
</div></div>
<div class="paragraph"><p>So based upon this table alone, you <strong>could</strong> have VBR stereo encoding which varies
smoothly from 96 to 384kbps. Or you could have VBR mono encoding which varies from
32 to 192kbps. But since the top and bottom bitrates don’t apply to all modes, it would
be impossible to have a stereo file encoded from 32 to 384 kbps.</p></div>
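<div class="paragraph"><p>The ranges quoted in the previous paragraph can be checked mechanically. The Python sketch below transcribes the mode table and finds the longest run of consecutive bitrates that a given mode may use. The names <tt>MODE_TABLE</tt> and <tt>longest_unbroken_range</tt> are invented for this example and do not exist in the encoder.</p></div>
<div class="listingblock">
<div class="content">
<pre><tt>
```python
# (bitrate in kbps, allowed modes), in ascending order, transcribed from the table above.
MODE_TABLE = [
    (32, "mono"), (48, "mono"), (56, "mono"), (64, "all"), (80, "mono"),
    (96, "all"), (112, "all"), (128, "all"), (160, "all"), (192, "all"),
    (224, "stereo"), (256, "stereo"), (320, "stereo"), (384, "stereo"),
]


def longest_unbroken_range(mode):
    """Return (low, high) of the longest run of consecutive table entries valid for mode.

    mode is "mono" or "stereo". Hypothetical helper, for illustration only.
    """
    best = []
    run = []
    for kbps, allowed in MODE_TABLE:
        if allowed in (mode, "all"):
            run.append(kbps)
            if len(run) > len(best):
                best = list(run)
        else:
            run = []  # a bitrate this mode may not use breaks the run
    return (best[0], best[-1])


if __name__ == "__main__":
    print("stereo:", longest_unbroken_range("stereo"))
    print("mono:  ", longest_unbroken_range("mono"))
```
</tt></pre>
</div></div>
<div class="paragraph"><p>Note that 64kbps is allowed for stereo, but 80kbps is not, which is why the unbroken stereo range only starts at 96kbps.</p></div>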
<div class="paragraph"><p>But this isn’t what is really limiting the allowable bitrate range - the bit allocation
tables are the major hurdle.</p></div>
</div>
<div class="sect2">
<h3 id="_reason_2_the_bit_allocation_tables_don_8217_t_allow_it">Reason 2: The bit allocation tables don’t allow it</h3>
<div class="paragraph"><p>From the standard, Section 2.4.3.3.1 "Bit allocation decoding"</p></div>
<div class="literalblock">
<div class="content">
<pre><tt>"For different combinations of bitrate and sampling frequency, different bit
allocation tables exist."</tt></pre>
</div></div>
<div class="paragraph"><p>These bit allocation tables are pre-determined tables (in Annex B of the standard) which
indicate</p></div>
<div class="ulist"><ul>
<li>
<p>
how many bits to read for the initial data (2, 3 or 4)
</p>
</li>
<li>
<p>
these bits are then used as an index back into the table to
find the number of quantization levels for the samples in this subband
</p>
</li>
</ul></div>
<div class="paragraph"><p>But the table used (and hence the number of bits and the calculated index) is different
for different combinations of bitrate and sampling frequency.</p></div>
<div class="paragraph"><p>I will use Table B.2a as an example.</p></div>
<div class="paragraph"><p>Table B.2a applies to the following combinations:</p></div>
<div class="literalblock">
<div class="content">
<pre><tt>Sampling freq (kHz)   Bitrates (kbps/channel) [emphasis: this is a PER CHANNEL bitrate]
48                    56, 64, 80, 96, 112, 128, 160, 192
44.1                  56, 64, 80
32                    56, 64, 80</tt></pre>
</div></div>
<div class="paragraph"><p>If we have a STEREO 48kHz input file, and we use this table, then the bitrates
we could calculate from this would be 112, 128, 160, 192, 224, 256, 320 and 384 kbps.</p></div>
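<div class="paragraph"><p>The arithmetic here is just the per-channel bitrate multiplied by the number of channels. A small Python sketch, with the per-channel values copied from the Table B.2a list above (the names <tt>TABLE_B2A</tt> and <tt>total_bitrates</tt> are made up for this example):</p></div>
<div class="listingblock">
<div class="content">
<pre><tt>
```python
# Per-channel bitrates (kbps) covered by Table B.2a, keyed by sampling frequency in kHz.
# Values are copied from the list above.
TABLE_B2A = {
    48.0: [56, 64, 80, 96, 112, 128, 160, 192],
    44.1: [56, 64, 80],
    32.0: [56, 64, 80],
}


def total_bitrates(sample_rate_khz, channels):
    """Total stream bitrates (kbps) reachable with Table B.2a for this input."""
    return [per_channel * channels for per_channel in TABLE_B2A[sample_rate_khz]]


if __name__ == "__main__":
    print(total_bitrates(48.0, 2))  # stereo 48kHz
    print(total_bitrates(48.0, 1))  # mono 48kHz
```
</tt></pre>
</div></div>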
<div class="paragraph"><p>This table contains no information on how to encode at bitrates below 112kbps
(for a stereo file). You would have to load allocation table B.2c to encode stereo at
64kbps and 96kbps.</p></div>
<div class="paragraph"><p>Since it would be a MAJOR piece of hacking to get the different tables shifted in and out
during the encoding process, once an allocation table is loaded <strong>IT IS NOT CHANGED</strong>.</p></div>
<div class="paragraph"><p>Hence, the best table is picked at the start of the encoding process, and the encoder
is stuck with it for the rest of the encode.</p></div>
<div class="paragraph"><p>For twolame-02j, I have picked the table it loads for different
sampling frequencies in order to optimize the range of bitrates possible.</p></div>
<div class="literalblock">
<div class="content">
<pre><tt>48 kHz - Table B.2a
Stereo Bitrate Range: 112 - 384
Mono Bitrate Range : 56 - 192</tt></pre>
</div></div>
<div class="literalblock">
<div class="content">
<pre><tt>44.1/32 kHz - Table B.2b
Stereo Bitrate Range: 192 - 384
Mono Bitrate Range: 96 - 192</tt></pre>
</div></div>
<div class="literalblock">
<div class="content">
<pre><tt>24/22.05/16 kHz - LSF Table (Standard ISO/IEC 13818-3:1995 Annex B, Table B.1)
There is only 1 table for the Lower Sampling Frequencies
All modes (mono and stereo) are allowable at all bitrates
So at the Lower Sampling Frequencies you *can* have a completely variable
bitrate over the entire range.</tt></pre>
</div></div>
</div>
</div>
</div>
<div class="sect1">
<h2 id="_tech_stuff">Tech Stuff</h2>
<div class="sectionbody">
<div class="paragraph"><p>The VBR mode is mainly centered around the main_bit_allocation() and
a_bit_allocation() routines in encode.c.</p></div>
<div class="paragraph"><p>The limited range of VBR is due to my particular implementation, which restricts
ranges to within one alloc table (see tables B.2a, B.2b, B.2c and B.2d in ISO/IEC 11172-3).
The VBR range for 32/44.1kHz lies within table B.2b, and the 48kHz VBR range lies within table B.2a.</p></div>
<div class="paragraph"><p>I’m not sure whether it is worth extending these ranges down to lower bitrates.
The work required to switch alloc tables <strong>during</strong> the encoding is major.</p></div>
<div class="paragraph"><p>In the case of silence, it might be worth doing a quick check for very low signals
and writing a pre-calculated <strong>blank</strong> 32kbps frame. [Probably also a lot of work.]</p></div>
</div>
</div>
<div class="sect1">
<h2 id="_how_cbr_works">How CBR works</h2>
<div class="sectionbody">
<div class="ulist"><ul>
<li>
<p>
Use the psycho model to determine the MNRs for each subband
[MNR = the ratio of "masking" to "noise"].
(From an encoding perspective, a bigger MNR in a subband means that
it sounds better, since the noise is more masked.)
</p>
</li>
<li>
<p>
Calculate the available data bits (adb) for this bitrate.
</p>
</li>
<li>
<p>
Based upon the MNR (Masking:Noise Ratio) values, allocate bits to each
subband
</p>
</li>
<li>
<p>
Keep increasing the bits to whichever subband currently has the min MNR
value until we have no bits left.
</p>
</li>
<li>
<p>
This mode does not guarantee that all the subbands are without noise,
i.e. there may still be subbands with MNR less than 0.0 (noisy!)
</p>
</li>
</ul></div>
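<div class="paragraph"><p>The steps above can be sketched as a greedy loop. This is a toy model in Python, not the code in encode.c: the function name <tt>cbr_allocate</tt> is invented, and the cost of 36 bits per step (one extra bit for each of the 36 samples a subband has in a frame) and the gain of 6 dB per step are simplifying assumptions. The real encoder takes its step sizes from the allocation tables.</p></div>
<div class="listingblock">
<div class="content">
<pre><tt>
```python
def cbr_allocate(mnr_db, available_bits, step_cost_bits=36, step_gain_db=6.0):
    """Greedy CBR-style bit allocation (toy model).

    mnr_db          -- starting MNR per subband, as a psycho model would report it
    available_bits  -- bit budget for the frame (adb)
    step_cost_bits  -- assumed cost of one allocation step (not an encoder constant)
    step_gain_db    -- assumed MNR gain of one allocation step (not an encoder constant)

    Returns (steps per subband, final MNR per subband).
    """
    mnr = list(mnr_db)
    steps = [0] * len(mnr)
    remaining = available_bits
    while remaining >= step_cost_bits:
        worst = mnr.index(min(mnr))  # subband that currently has the minimum MNR
        mnr[worst] += step_gain_db
        steps[worst] += 1
        remaining -= step_cost_bits
    return steps, mnr


if __name__ == "__main__":
    steps, mnr = cbr_allocate([-10.0, 2.0, -4.0], available_bits=36)
    print(steps, mnr)
    if 0.0 > min(mnr):
        print("ran out of bits: at least one subband is still noisy")
```
</tt></pre>
</div></div>
<div class="paragraph"><p>With a small budget the loop stops while some subbands still have a negative MNR, which is the "no guarantee" case in the last point above.</p></div>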
</div>
</div>
<div class="sect1">
<h2 id="_how_vbr_works">How VBR works</h2>
<div class="sectionbody">
<div class="ulist"><ul>
<li>
<p>
Pretend we have lots of bits to spare, and work out the bits which would
raise the MNR in each subband to the level given by the argument on the
command line "-v [int]".
</p>
</li>
<li>
<p>
Pick the bitrate which has more bits than the required_bits we just calculated.
</p>
</li>
<li>
<p>
Calculate a_bit_allocation().
</p>
</li>
<li>
<p>
VBR "guarantees" that all subbands have MNR &gt; VBRLEVEL or that we have
reached the maximum bitrate.
</p>
</li>
</ul></div>
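<div class="paragraph"><p>A rough Python sketch of the bitrate selection described above, for a 48kHz stereo input. It is not the encoder's code. It assumes the same toy cost model of 36 bits and 6 dB per allocation step, and it counts the whole frame (1152 samples for Layer II) as available, ignoring header and side information. All function and constant names are made up for illustration.</p></div>
<div class="listingblock">
<div class="content">
<pre><tt>
```python
import math

# Bitrates (kbps) available to a 48kHz stereo VBR stream, from the ranges in this document.
BITRATES_48K_STEREO = [112, 128, 160, 192, 224, 256, 320, 384]


def frame_bits(bitrate_kbps, sample_rate_hz):
    """Total bits in one Layer II frame of 1152 samples (header and side info included)."""
    return bitrate_kbps * 1000 * 1152 // sample_rate_hz


def required_bits(mnr_db, vbr_level, step_cost_bits=36, step_gain_db=6.0):
    """Bits needed to lift every subband's MNR above vbr_level (toy cost model)."""
    total = 0
    for mnr in mnr_db:
        if vbr_level >= mnr:
            # Smallest number of steps that takes this subband strictly above the level.
            steps = math.floor((vbr_level - mnr) / step_gain_db) + 1
            total += steps * step_cost_bits
    return total


def pick_bitrate(mnr_db, vbr_level, sample_rate_hz=48000, bitrates=BITRATES_48K_STEREO):
    """Lowest bitrate whose frame holds more than the required bits, else the maximum."""
    need = required_bits(mnr_db, vbr_level)
    for kbps in bitrates:  # ascending order
        if frame_bits(kbps, sample_rate_hz) > need:
            return kbps
    return bitrates[-1]  # reached the maximum bitrate


if __name__ == "__main__":
    subbands = [-7.0] * 32
    print(pick_bitrate(subbands, vbr_level=-5))
    print(pick_bitrate(subbands, vbr_level=5))
```
</tt></pre>
</div></div>
<div class="paragraph"><p>In this model a higher level asks for more bits per frame, so the same input lands on a higher bitrate, up to the top of the allowed range.</p></div>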
</div>
</div>
<div class="sect1">
<h2 id="_future">Future</h2>
<div class="sectionbody">
<div class="ulist"><ul>
<li>
<p>
With this VBR mode, we know the bits aren’t going to run out, so we can
just assign them "greedily".
</p>
</li>
<li>
<p>
VBR_a_bit_allocation() is yet to be written :)
</p>
</li>
</ul></div>
</div>
</div>
</div>
<div id="footnotes"><hr /></div>
<div id="footer">
<div id="footer-text">
Version 0.3.13<br />
Last updated 2011-01-01 22:51:38 GMT
</div>
</div>
</body>
</html>