diff options
Diffstat (limited to 'vorbis/doc/framing.html')
-rw-r--r-- | vorbis/doc/framing.html | 431 |
1 files changed, 0 insertions, 431 deletions
diff --git a/vorbis/doc/framing.html b/vorbis/doc/framing.html deleted file mode 100644 index 857b292..0000000 --- a/vorbis/doc/framing.html +++ /dev/null @@ -1,431 +0,0 @@ -<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> -<html> -<head> - -<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-15"/> -<title>Ogg Vorbis Documentation</title> - -<style type="text/css"> -body { - margin: 0 18px 0 18px; - padding-bottom: 30px; - font-family: Verdana, Arial, Helvetica, sans-serif; - color: #333333; - font-size: .8em; -} - -a { - color: #3366cc; -} - -img { - border: 0; -} - -#xiphlogo { - margin: 30px 0 16px 0; -} - -#content p { - line-height: 1.4; -} - -h1, h1 a, h2, h2 a, h3, h3 a { - font-weight: bold; - color: #ff9900; - margin: 1.3em 0 8px 0; -} - -h1 { - font-size: 1.3em; -} - -h2 { - font-size: 1.2em; -} - -h3 { - font-size: 1.1em; -} - -li { - line-height: 1.4; -} - -#copyright { - margin-top: 30px; - line-height: 1.5em; - text-align: center; - font-size: .8em; - color: #888888; - clear: both; -} -</style> - -</head> - -<body> - -<div id="xiphlogo"> - <a href="http://www.xiph.org/"><img src="fish_xiph_org.png" alt="Fish Logo and Xiph.Org"/></a> -</div> - -<h1>Ogg logical bitstream framing</h1> - -<h2>Ogg bitstreams</h2> - -<p>The Ogg transport bitstream is designed to provide framing, error -protection and seeking structure for higher-level codec streams that -consist of raw, unencapsulated data packets, such as the Vorbis audio -codec or Theora video codec.</p> - -<h2>Application example: Vorbis</h2> - -<p>Vorbis encodes short-time blocks of PCM data into raw packets of -bit-packed data. These raw packets may be used directly by transport -mechanisms that provide their own framing and packet-separation -mechanisms (such as UDP datagrams). For stream based storage (such as -files) and transport (such as TCP streams or pipes), Vorbis uses the -Ogg bitstream format to provide framing/sync, sync recapture -after error, landmarks during seeking, and enough information to -properly separate data back into packets at the original packet -boundaries without relying on decoding to find packet boundaries.</p> - -<h2>Design constraints for Ogg bitstreams</h2> - -<ol> -<li>True streaming; we must not need to seek to build a 100% - complete bitstream.</li> -<li>Use no more than approximately 1-2% of bitstream bandwidth for - packet boundary marking, high-level framing, sync and seeking.</li> -<li>Specification of absolute position within the original sample - stream.</li> -<li>Simple mechanism to ease limited editing, such as a simplified - concatenation mechanism.</li> -<li>Detection of corruption, recapture after error and direct, random - access to data at arbitrary positions in the bitstream.</li> -</ol> - -<h2>Logical and Physical Bitstreams</h2> - -<p>A <em>logical</em> Ogg bitstream is a contiguous stream of -sequential pages belonging only to the logical bitstream. A -<em>physical</em> Ogg bitstream is constructed from one or more -than one logical Ogg bitstream (the simplest physical bitstream -is simply a single logical bitstream). We describe below the exact -formatting of an Ogg logical bitstream. Combining logical -bitstreams into more complex physical bitstreams is described in the -<a href="oggstream.html">Ogg bitstream overview</a>. The exact -mapping of raw Vorbis packets into a valid Ogg Vorbis physical -bitstream is described in the Vorbis I Specification.</p> - -<h2>Bitstream structure</h2> - -<p>An Ogg stream is structured by dividing incoming packets into -segments of up to 255 bytes and then wrapping a group of contiguous -packet segments into a variable length page preceded by a page -header. Both the header size and page size are variable; the page -header contains sizing information and checksum data to determine -header/page size and data integrity.</p> - -<p>The bitstream is captured (or recaptured) by looking for the beginning -of a page, specifically the capture pattern. Once the capture pattern -is found, the decoder verifies page sync and integrity by computing -and comparing the checksum. At that point, the decoder can extract the -packets themselves.</p> - -<h3>Packet segmentation</h3> - -<p>Packets are logically divided into multiple segments before encoding -into a page. Note that the segmentation and fragmentation process is a -logical one; it's used to compute page header values and the original -page data need not be disturbed, even when a packet spans page -boundaries.</p> - -<p>The raw packet is logically divided into [n] 255 byte segments and a -last fractional segment of < 255 bytes. A packet size may well -consist only of the trailing fractional segment, and a fractional -segment may be zero length. These values, called "lacing values" are -then saved and placed into the header segment table.</p> - -<p>An example should make the basic concept clear:</p> - -<pre> -<tt> -raw packet: - ___________________________________________ - |______________packet data__________________| 753 bytes - -lacing values for page header segment table: 255,255,243 -</tt> -</pre> - -<p>We simply add the lacing values for the total size; the last lacing -value for a packet is always the value that is less than 255. Note -that this encoding both avoids imposing a maximum packet size as well -as imposing minimum overhead on small packets (as opposed to, eg, -simply using two bytes at the head of every packet and having a max -packet size of 32k. Small packets (<255, the typical case) are -penalized with twice the segmentation overhead). Using the lacing -values as suggested, small packets see the minimum possible -byte-aligned overheade (1 byte) and large packets, over 512 bytes or -so, see a fairly constant ~.5% overhead on encoding space.</p> - -<p>Note that a lacing value of 255 implies that a second lacing value -follows in the packet, and a value of < 255 marks the end of the -packet after that many additional bytes. A packet of 255 bytes (or a -multiple of 255 bytes) is terminated by a lacing value of 0:</p> - -<pre><tt> -raw packet: - _______________________________ - |________packet data____________| 255 bytes - -lacing values: 255, 0 -</tt></pre> - -<p>Note also that a 'nil' (zero length) packet is not an error; it -consists of nothing more than a lacing value of zero in the header.</p> - -<h3>Packets spanning pages</h3> - -<p>Packets are not restricted to beginning and ending within a page, -although individual segments are, by definition, required to do so. -Packets are not restricted to a maximum size, although excessively -large packets in the data stream are discouraged; the Ogg -bitstream specification strongly recommends nominal page size of -approximately 4-8kB (large packets are foreseen as being useful for -initialization data at the beginning of a logical bitstream).</p> - -<p>After segmenting a packet, the encoder may decide not to place all the -resulting segments into the current page; to do so, the encoder places -the lacing values of the segments it wishes to belong to the current -page into the current segment table, then finishes the page. The next -page is begun with the first value in the segment table belonging to -the next packet segment, thus continuing the packet (data in the -packet body must also correspond properly to the lacing values in the -spanned pages. The segment data in the first packet corresponding to -the lacing values of the first page belong in that page; packet -segments listed in the segment table of the following page must begin -the page body of the subsequent page).</p> - -<p>The last mechanic to spanning a page boundary is to set the header -flag in the new page to indicate that the first lacing value in the -segment table continues rather than begins a packet; a header flag of -0x01 is set to indicate a continued packet. Although mandatory, it -is not actually algorithmically necessary; one could inspect the -preceding segment table to determine if the packet is new or -continued. Adding the information to the packet_header flag allows a -simpler design (with no overhead) that needs only inspect the current -page header after frame capture. This also allows faster error -recovery in the event that the packet originates in a corrupt -preceding page, implying that the previous page's segment table -cannot be trusted.</p> - -<p>Note that a packet can span an arbitrary number of pages; the above -spanning process is repeated for each spanned page boundary. Also a -'zero termination' on a packet size that is an even multiple of 255 -must appear even if the lacing value appears in the next page as a -zero-length continuation of the current packet. The header flag -should be set to 0x01 to indicate that the packet spanned, even though -the span is a nil case as far as data is concerned.</p> - -<p>The encoding looks odd, but is properly optimized for speed and the -expected case of the majority of packets being between 50 and 200 -bytes (note that it is designed such that packets of wildly different -sizes can be handled within the model; placing packet size -restrictions on the encoder would have only slightly simplified design -in page generation and increased overall encoder complexity).</p> - -<p>The main point behind tracking individual packets (and packet -segments) is to allow more flexible encoding tricks that requiring -explicit knowledge of packet size. An example is simple bandwidth -limiting, implemented by simply truncating packets in the nominal case -if the packet is arranged so that the least sensitive portion of the -data comes last.</p> - -<h3>Page header</h3> - -<p>The headering mechanism is designed to avoid copying and re-assembly -of the packet data (ie, making the packet segmentation process a -logical one); the header can be generated directly from incoming -packet data. The encoder buffers packet data until it finishes a -complete page at which point it writes the header followed by the -buffered packet segments.</p> - -<h4>capture_pattern</h4> - -<p>A header begins with a capture pattern that simplifies identifying -pages; once the decoder has found the capture pattern it can do a more -intensive job of verifying that it has in fact found a page boundary -(as opposed to an inadvertent coincidence in the byte stream).</p> - -<pre><tt> - byte value - - 0 0x4f 'O' - 1 0x67 'g' - 2 0x67 'g' - 3 0x53 'S' -</tt></pre> - -<h4>stream_structure_version</h4> - -<p>The capture pattern is followed by the stream structure revision:</p> - -<pre><tt> - byte value - - 4 0x00 -</tt></pre> - -<h4>header_type_flag</h4> - -<p>The header type flag identifies this page's context in the bitstream:</p> - -<pre><tt> - byte value - - 5 bitflags: 0x01: unset = fresh packet - set = continued packet - 0x02: unset = not first page of logical bitstream - set = first page of logical bitstream (bos) - 0x04: unset = not last page of logical bitstream - set = last page of logical bitstream (eos) -</tt></pre> - -<h4>absolute granule position</h4> - -<p>(This is packed in the same way the rest of Ogg data is packed; LSb -of LSB first. Note that the 'position' data specifies a 'sample' -number (eg, in a CD quality sample is four octets, 16 bits for left -and 16 bits for right; in video it would likely be the frame number. -It is up to the specific codec in use to define the semantic meaning -of the granule position value). The position specified is the total -samples encoded after including all packets finished on this page -(packets begun on this page but continuing on to the next page do not -count). The rationale here is that the position specified in the -frame header of the last page tells how long the data coded by the -bitstream is. A truncated stream will still return the proper number -of samples that can be decoded fully.</p> - -<p>A special value of '-1' (in two's complement) indicates that no packets -finish on this page.</p> - -<pre><tt> - byte value - - 6 0xXX LSB - 7 0xXX - 8 0xXX - 9 0xXX - 10 0xXX - 11 0xXX - 12 0xXX - 13 0xXX MSB -</tt></pre> - -<h4>stream serial number</h4> - -<p>Ogg allows for separate logical bitstreams to be mixed at page -granularity in a physical bitstream. The most common case would be -sequential arrangement, but it is possible to interleave pages for -two separate bitstreams to be decoded concurrently. The serial -number is the means by which pages physical pages are associated with -a particular logical stream. Each logical stream must have a unique -serial number within a physical stream:</p> - -<pre><tt> - byte value - - 14 0xXX LSB - 15 0xXX - 16 0xXX - 17 0xXX MSB -</tt></pre> - -<h4>page sequence no</h4> - -<p>Page counter; lets us know if a page is lost (useful where packets -span page boundaries).</p> - -<pre><tt> - byte value - - 18 0xXX LSB - 19 0xXX - 20 0xXX - 21 0xXX MSB -</tt></pre> - -<h4>page checksum</h4> - -<p>32 bit CRC value (direct algorithm, initial val and final XOR = 0, -generator polynomial=0x04c11db7). The value is computed over the -entire header (with the CRC field in the header set to zero) and then -continued over the page. The CRC field is then filled with the -computed value.</p> - -<p>(A thorough discussion of CRC algorithms can be found in <a -href="http://www.ross.net/crc/download/crc_v3.txt">"A -Painless Guide to CRC Error Detection Algorithms"</a> by Ross -Williams <a href="mailto:ross@ross.net">ross@ross.net</a>.)</p> - -<pre><tt> - byte value - - 22 0xXX LSB - 23 0xXX - 24 0xXX - 25 0xXX MSB -</tt></pre> - -<h4>page_segments</h4> - -<p>The number of segment entries to appear in the segment table. The -maximum number of 255 segments (255 bytes each) sets the maximum -possible physical page size at 65307 bytes or just under 64kB (thus -we know that a header corrupted so as destroy sizing/alignment -information will not cause a runaway bitstream. We'll read in the -page according to the corrupted size information that's guaranteed to -be a reasonable size regardless, notice the checksum mismatch, drop -sync and then look for recapture).</p> - -<pre><tt> - byte value - - 26 0x00-0xff (0-255) -</tt></pre> - -<h4>segment_table (containing packet lacing values)</h4> - -<p>The lacing values for each packet segment physically appearing in -this page are listed in contiguous order.</p> - -<pre><tt> - byte value - - 27 0x00-0xff (0-255) - [...] - n 0x00-0xff (0-255, n=page_segments+26) -</tt></pre> - -<p>Total page size is calculated directly from the known header size and -lacing values in the segment table. Packet data segments follow -immediately after the header.</p> - -<p>Page headers typically impose a flat .25-.5% space overhead assuming -nominal ~8k page sizes. The segmentation table needed for exact -packet recovery in the streaming layer adds approximately .5-1% -nominal assuming expected encoder behavior in the 44.1kHz, 128kbps -stereo encodings.</p> - -<div id="copyright"> - The Xiph Fish Logo is a - trademark (™) of Xiph.Org.<br/> - - These pages © 1994 - 2005 Xiph.Org. All rights reserved. -</div> - -</body> -</html> |