diff options
Diffstat (limited to 'vorbis/doc/a1-encapsulation-ogg.tex')
-rw-r--r-- | vorbis/doc/a1-encapsulation-ogg.tex | 184 |
1 files changed, 184 insertions, 0 deletions
diff --git a/vorbis/doc/a1-encapsulation-ogg.tex b/vorbis/doc/a1-encapsulation-ogg.tex new file mode 100644 index 0000000..8bbd31b --- /dev/null +++ b/vorbis/doc/a1-encapsulation-ogg.tex @@ -0,0 +1,184 @@ +% -*- mode: latex; TeX-master: "Vorbis_I_spec"; -*- +%!TEX root = Vorbis_I_spec.tex +\section{Embedding Vorbis into an Ogg stream} \label{vorbis:over:ogg} + +\subsection{Overview} + +This document describes using Ogg logical and physical transport +streams to encapsulate Vorbis compressed audio packet data into file +form. + +The \xref{vorbis:spec:intro} provides an overview of the construction +of Vorbis audio packets. + +The \href{oggstream.html}{Ogg +bitstream overview} and \href{framing.html}{Ogg logical +bitstream and framing spec} provide detailed descriptions of Ogg +transport streams. This specification document assumes a working +knowledge of the concepts covered in these named backround +documents. Please read them first. + +\subsubsection{Restrictions} + +The Ogg/Vorbis I specification currently dictates that Ogg/Vorbis +streams use Ogg transport streams in degenerate, unmultiplexed +form only. That is: + +\begin{itemize} + \item + A meta-headerless Ogg file encapsulates the Vorbis I packets + + \item + The Ogg stream may be chained, i.e., contain multiple, contigous logical streams (links). + + \item + The Ogg stream must be unmultiplexed (only one stream, a Vorbis audio stream, per link) + +\end{itemize} + + +This is not to say that it is not currently possible to multiplex +Vorbis with other media types into a multi-stream Ogg file. At the +time this document was written, Ogg was becoming a popular container +for low-bitrate movies consisting of DivX video and Vorbis audio. +However, a 'Vorbis I audio file' is taken to imply Vorbis audio +existing alone within a degenerate Ogg stream. A compliant 'Vorbis +audio player' is not required to implement Ogg support beyond the +specific support of Vorbis within a degenrate Ogg stream (naturally, +application authors are encouraged to support full multiplexed Ogg +handling). + + + + +\subsubsection{MIME type} + +The MIME type of Ogg files depend on the context. Specifically, complex +multimedia and applications should use \literal{application/ogg}, +while visual media should use \literal{video/ogg}, and audio +\literal{audio/ogg}. Vorbis data encapsulated in Ogg may appear +in any of those types. RTP encapsulated Vorbis should use +\literal{audio/vorbis} + \literal{audio/vorbis-config}. + + +\subsection{Encapsulation} + +Ogg encapsulation of a Vorbis packet stream is straightforward. + +\begin{itemize} + +\item + The first Vorbis packet (the identification header), which + uniquely identifies a stream as Vorbis audio, is placed alone in the + first page of the logical Ogg stream. This results in a first Ogg + page of exactly 58 bytes at the very beginning of the logical stream. + + +\item + This first page is marked 'beginning of stream' in the page flags. + + +\item + The second and third vorbis packets (comment and setup + headers) may span one or more pages beginning on the second page of + the logical stream. However many pages they span, the third header + packet finishes the page on which it ends. The next (first audio) packet + must begin on a fresh page. + + +\item + The granule position of these first pages containing only headers is zero. + + +\item + The first audio packet of the logical stream begins a fresh Ogg page. + + +\item + Packets are placed into ogg pages in order until the end of stream. + + +\item + The last page is marked 'end of stream' in the page flags. + + +\item + Vorbis packets may span page boundaries. + + +\item + The granule position of pages containing Vorbis audio is in units + of PCM audio samples (per channel; a stereo stream's granule position + does not increment at twice the speed of a mono stream). + + +\item + The granule position of a page represents the end PCM sample + position of the last packet \emph{completed} on that + page. The 'last PCM sample' is the last complete sample returned by + decode, not an internal sample awaiting lapping with a + subsequent block. A page that is entirely spanned by a single + packet (that completes on a subsequent page) has no granule + position, and the granule position is set to '-1'. + + + Note that the last decoded (fully lapped) PCM sample from a packet + is not necessarily the middle sample from that block. If, eg, the + current Vorbis packet encodes a "long block" and the next Vorbis + packet encodes a "short block", the last decodable sample from the + current packet be at position (3*long\_block\_length/4) - + (short\_block\_length/4). + + +\item + The granule (PCM) position of the first page need not indicate + that the stream started at position zero. Although the granule + position belongs to the last completed packet on the page and a + valid granule position must be positive, by + inference it may indicate that the PCM position of the beginning + of audio is positive or negative. + + + \begin{itemize} + \item + A positive starting value simply indicates that this stream begins at + some positive time offset, potentially within a larger + program. This is a common case when connecting to the middle + of broadcast stream. + + \item + A negative value indicates that + output samples preceeding time zero should be discarded during + decoding; this technique is used to allow sample-granularity + editing of the stream start time of already-encoded Vorbis + streams. The number of samples to be discarded must not exceed + the overlap-add span of the first two audio packets. + + \end{itemize} + + + In both of these cases in which the initial audio PCM starting + offset is nonzero, the second finished audio packet must flush the + page on which it appears and the third packet begin a fresh page. + This allows the decoder to always be able to perform PCM position + adjustments before needing to return any PCM data from synthesis, + resulting in correct positioning information without any aditional + seeking logic. + + + \begin{note} + Failure to do so should, at worst, cause a + decoder implementation to return incorrect positioning information + for seeking operations at the very beginning of the stream. + \end{note} + + +\item + A granule position on the final page in a stream that indicates + less audio data than the final packet would normally return is used to + end the stream on other than even frame boundaries. The difference + between the actual available data returned and the declared amount + indicates how many trailing samples to discard from the decoding + process. + +\end{itemize} |