summaryrefslogtreecommitdiffhomepage
path: root/vorbis/doc/02-bitpacking.tex
diff options
context:
space:
mode:
Diffstat (limited to 'vorbis/doc/02-bitpacking.tex')
-rw-r--r--vorbis/doc/02-bitpacking.tex246
1 files changed, 0 insertions, 246 deletions
diff --git a/vorbis/doc/02-bitpacking.tex b/vorbis/doc/02-bitpacking.tex
deleted file mode 100644
index 905dcaf..0000000
--- a/vorbis/doc/02-bitpacking.tex
+++ /dev/null
@@ -1,246 +0,0 @@
-% -*- mode: latex; TeX-master: "Vorbis_I_spec"; -*-
-%!TEX root = Vorbis_I_spec.tex
-\section{Bitpacking Convention} \label{vorbis:spec:bitpacking}
-
-\subsection{Overview}
-
-The Vorbis codec uses relatively unstructured raw packets containing
-arbitrary-width binary integer fields. Logically, these packets are a
-bitstream in which bits are coded one-by-one by the encoder and then
-read one-by-one in the same monotonically increasing order by the
-decoder. Most current binary storage arrangements group bits into a
-native word size of eight bits (octets), sixteen bits, thirty-two bits
-or, less commonly other fixed word sizes. The Vorbis bitpacking
-convention specifies the correct mapping of the logical packet
-bitstream into an actual representation in fixed-width words.
-
-
-\subsubsection{octets, bytes and words}
-
-In most contemporary architectures, a 'byte' is synonymous with an
-'octet', that is, eight bits. This has not always been the case;
-seven, ten, eleven and sixteen bit 'bytes' have been used. For
-purposes of the bitpacking convention, a byte implies the native,
-smallest integer storage representation offered by a platform. On
-modern platforms, this is generally assumed to be eight bits (not
-necessarily because of the processor but because of the
-filesystem/memory architecture. Modern filesystems invariably offer
-bytes as the fundamental atom of storage). A 'word' is an integer
-size that is a grouped multiple of this smallest size.
-
-The most ubiquitous architectures today consider a 'byte' to be an
-octet (eight bits) and a word to be a group of two, four or eight
-bytes (16, 32 or 64 bits). Note however that the Vorbis bitpacking
-convention is still well defined for any native byte size; Vorbis uses
-the native bit-width of a given storage system. This document assumes
-that a byte is one octet for purposes of example.
-
-\subsubsection{bit order}
-
-A byte has a well-defined 'least significant' bit (LSb), which is the
-only bit set when the byte is storing the two's complement integer
-value +1. A byte's 'most significant' bit (MSb) is at the opposite
-end of the byte. Bits in a byte are numbered from zero at the LSb to
-$n$ ($n=7$ in an octet) for the
-MSb.
-
-
-
-\subsubsection{byte order}
-
-Words are native groupings of multiple bytes. Several byte orderings
-are possible in a word; the common ones are 3-2-1-0 ('big endian' or
-'most significant byte first' in which the highest-valued byte comes
-first), 0-1-2-3 ('little endian' or 'least significant byte first' in
-which the lowest value byte comes first) and less commonly 3-1-2-0 and
-0-2-1-3 ('mixed endian').
-
-The Vorbis bitpacking convention specifies storage and bitstream
-manipulation at the byte, not word, level, thus host word ordering is
-of a concern only during optimization when writing high performance
-code that operates on a word of storage at a time rather than by byte.
-Logically, bytes are always coded and decoded in order from byte zero
-through byte $n$.
-
-
-
-\subsubsection{coding bits into byte sequences}
-
-The Vorbis codec has need to code arbitrary bit-width integers, from
-zero to 32 bits wide, into packets. These integer fields are not
-aligned to the boundaries of the byte representation; the next field
-is written at the bit position at which the previous field ends.
-
-The encoder logically packs integers by writing the LSb of a binary
-integer to the logical bitstream first, followed by next least
-significant bit, etc, until the requested number of bits have been
-coded. When packing the bits into bytes, the encoder begins by
-placing the LSb of the integer to be written into the least
-significant unused bit position of the destination byte, followed by
-the next-least significant bit of the source integer and so on up to
-the requested number of bits. When all bits of the destination byte
-have been filled, encoding continues by zeroing all bits of the next
-byte and writing the next bit into the bit position 0 of that byte.
-Decoding follows the same process as encoding, but by reading bits
-from the byte stream and reassembling them into integers.
-
-
-
-\subsubsection{signedness}
-
-The signedness of a specific number resulting from decode is to be
-interpreted by the decoder given decode context. That is, the three
-bit binary pattern 'b111' can be taken to represent either 'seven' as
-an unsigned integer, or '-1' as a signed, two's complement integer.
-The encoder and decoder are responsible for knowing if fields are to
-be treated as signed or unsigned.
-
-
-
-\subsubsection{coding example}
-
-Code the 4 bit integer value '12' [b1100] into an empty bytestream.
-Bytestream result:
-
-\begin{Verbatim}[commandchars=\\\{\}]
- |
- V
-
- 7 6 5 4 3 2 1 0
-byte 0 [0 0 0 0 1 1 0 0] <-
-byte 1 [ ]
-byte 2 [ ]
-byte 3 [ ]
- ...
-byte n [ ] bytestream length == 1 byte
-
-\end{Verbatim}
-
-
-Continue by coding the 3 bit integer value '-1' [b111]:
-
-\begin{Verbatim}[commandchars=\\\{\}]
- |
- V
-
- 7 6 5 4 3 2 1 0
-byte 0 [0 1 1 1 1 1 0 0] <-
-byte 1 [ ]
-byte 2 [ ]
-byte 3 [ ]
- ...
-byte n [ ] bytestream length == 1 byte
-\end{Verbatim}
-
-
-Continue by coding the 7 bit integer value '17' [b0010001]:
-
-\begin{Verbatim}[commandchars=\\\{\}]
- |
- V
-
- 7 6 5 4 3 2 1 0
-byte 0 [1 1 1 1 1 1 0 0]
-byte 1 [0 0 0 0 1 0 0 0] <-
-byte 2 [ ]
-byte 3 [ ]
- ...
-byte n [ ] bytestream length == 2 bytes
- bit cursor == 6
-\end{Verbatim}
-
-
-Continue by coding the 13 bit integer value '6969' [b110 11001110 01]:
-
-\begin{Verbatim}[commandchars=\\\{\}]
- |
- V
-
- 7 6 5 4 3 2 1 0
-byte 0 [1 1 1 1 1 1 0 0]
-byte 1 [0 1 0 0 1 0 0 0]
-byte 2 [1 1 0 0 1 1 1 0]
-byte 3 [0 0 0 0 0 1 1 0] <-
- ...
-byte n [ ] bytestream length == 4 bytes
-
-\end{Verbatim}
-
-
-
-
-\subsubsection{decoding example}
-
-Reading from the beginning of the bytestream encoded in the above example:
-
-\begin{Verbatim}[commandchars=\\\{\}]
- |
- V
-
- 7 6 5 4 3 2 1 0
-byte 0 [1 1 1 1 1 1 0 0] <-
-byte 1 [0 1 0 0 1 0 0 0]
-byte 2 [1 1 0 0 1 1 1 0]
-byte 3 [0 0 0 0 0 1 1 0] bytestream length == 4 bytes
-
-\end{Verbatim}
-
-
-We read two, two-bit integer fields, resulting in the returned numbers
-'b00' and 'b11'. Two things are worth noting here:
-
-\begin{itemize}
-\item Although these four bits were originally written as a single
-four-bit integer, reading some other combination of bit-widths from the
-bitstream is well defined. There are no artificial alignment
-boundaries maintained in the bitstream.
-
-\item The second value is the
-two-bit-wide integer 'b11'. This value may be interpreted either as
-the unsigned value '3', or the signed value '-1'. Signedness is
-dependent on decode context.
-\end{itemize}
-
-
-
-
-\subsubsection{end-of-packet alignment}
-
-The typical use of bitpacking is to produce many independent
-byte-aligned packets which are embedded into a larger byte-aligned
-container structure, such as an Ogg transport bitstream. Externally,
-each bytestream (encoded bitstream) must begin and end on a byte
-boundary. Often, the encoded bitstream is not an integer number of
-bytes, and so there is unused (uncoded) space in the last byte of a
-packet.
-
-Unused space in the last byte of a bytestream is always zeroed during
-the coding process. Thus, should this unused space be read, it will
-return binary zeroes.
-
-Attempting to read past the end of an encoded packet results in an
-'end-of-packet' condition. End-of-packet is not to be considered an
-error; it is merely a state indicating that there is insufficient
-remaining data to fulfill the desired read size. Vorbis uses truncated
-packets as a normal mode of operation, and as such, decoders must
-handle reading past the end of a packet as a typical mode of
-operation. Any further read operations after an 'end-of-packet'
-condition shall also return 'end-of-packet'.
-
-
-
-\subsubsection{reading zero bits}
-
-Reading a zero-bit-wide integer returns the value '0' and does not
-increment the stream cursor. Reading to the end of the packet (but
-not past, such that an 'end-of-packet' condition has not triggered)
-and then reading a zero bit integer shall succeed, returning 0, and
-not trigger an end-of-packet condition. Reading a zero-bit-wide
-integer after a previous read sets 'end-of-packet' shall also fail
-with 'end-of-packet'.
-
-
-
-
-
-