summaryrefslogtreecommitdiffhomepage
path: root/vorbis/doc/a1-encapsulation-ogg.tex
diff options
context:
space:
mode:
Diffstat (limited to 'vorbis/doc/a1-encapsulation-ogg.tex')
-rw-r--r--vorbis/doc/a1-encapsulation-ogg.tex184
1 files changed, 184 insertions, 0 deletions
diff --git a/vorbis/doc/a1-encapsulation-ogg.tex b/vorbis/doc/a1-encapsulation-ogg.tex
new file mode 100644
index 0000000..8bbd31b
--- /dev/null
+++ b/vorbis/doc/a1-encapsulation-ogg.tex
@@ -0,0 +1,184 @@
+% -*- mode: latex; TeX-master: "Vorbis_I_spec"; -*-
+%!TEX root = Vorbis_I_spec.tex
+\section{Embedding Vorbis into an Ogg stream} \label{vorbis:over:ogg}
+
+\subsection{Overview}
+
+This document describes using Ogg logical and physical transport
+streams to encapsulate Vorbis compressed audio packet data into file
+form.
+
+The \xref{vorbis:spec:intro} provides an overview of the construction
+of Vorbis audio packets.
+
+The \href{oggstream.html}{Ogg
+bitstream overview} and \href{framing.html}{Ogg logical
+bitstream and framing spec} provide detailed descriptions of Ogg
+transport streams. This specification document assumes a working
+knowledge of the concepts covered in these named backround
+documents. Please read them first.
+
+\subsubsection{Restrictions}
+
+The Ogg/Vorbis I specification currently dictates that Ogg/Vorbis
+streams use Ogg transport streams in degenerate, unmultiplexed
+form only. That is:
+
+\begin{itemize}
+ \item
+ A meta-headerless Ogg file encapsulates the Vorbis I packets
+
+ \item
+ The Ogg stream may be chained, i.e., contain multiple, contigous logical streams (links).
+
+ \item
+ The Ogg stream must be unmultiplexed (only one stream, a Vorbis audio stream, per link)
+
+\end{itemize}
+
+
+This is not to say that it is not currently possible to multiplex
+Vorbis with other media types into a multi-stream Ogg file. At the
+time this document was written, Ogg was becoming a popular container
+for low-bitrate movies consisting of DivX video and Vorbis audio.
+However, a 'Vorbis I audio file' is taken to imply Vorbis audio
+existing alone within a degenerate Ogg stream. A compliant 'Vorbis
+audio player' is not required to implement Ogg support beyond the
+specific support of Vorbis within a degenrate Ogg stream (naturally,
+application authors are encouraged to support full multiplexed Ogg
+handling).
+
+
+
+
+\subsubsection{MIME type}
+
+The MIME type of Ogg files depend on the context. Specifically, complex
+multimedia and applications should use \literal{application/ogg},
+while visual media should use \literal{video/ogg}, and audio
+\literal{audio/ogg}. Vorbis data encapsulated in Ogg may appear
+in any of those types. RTP encapsulated Vorbis should use
+\literal{audio/vorbis} + \literal{audio/vorbis-config}.
+
+
+\subsection{Encapsulation}
+
+Ogg encapsulation of a Vorbis packet stream is straightforward.
+
+\begin{itemize}
+
+\item
+ The first Vorbis packet (the identification header), which
+ uniquely identifies a stream as Vorbis audio, is placed alone in the
+ first page of the logical Ogg stream. This results in a first Ogg
+ page of exactly 58 bytes at the very beginning of the logical stream.
+
+
+\item
+ This first page is marked 'beginning of stream' in the page flags.
+
+
+\item
+ The second and third vorbis packets (comment and setup
+ headers) may span one or more pages beginning on the second page of
+ the logical stream. However many pages they span, the third header
+ packet finishes the page on which it ends. The next (first audio) packet
+ must begin on a fresh page.
+
+
+\item
+ The granule position of these first pages containing only headers is zero.
+
+
+\item
+ The first audio packet of the logical stream begins a fresh Ogg page.
+
+
+\item
+ Packets are placed into ogg pages in order until the end of stream.
+
+
+\item
+ The last page is marked 'end of stream' in the page flags.
+
+
+\item
+ Vorbis packets may span page boundaries.
+
+
+\item
+ The granule position of pages containing Vorbis audio is in units
+ of PCM audio samples (per channel; a stereo stream's granule position
+ does not increment at twice the speed of a mono stream).
+
+
+\item
+ The granule position of a page represents the end PCM sample
+ position of the last packet \emph{completed} on that
+ page. The 'last PCM sample' is the last complete sample returned by
+ decode, not an internal sample awaiting lapping with a
+ subsequent block. A page that is entirely spanned by a single
+ packet (that completes on a subsequent page) has no granule
+ position, and the granule position is set to '-1'.
+
+
+ Note that the last decoded (fully lapped) PCM sample from a packet
+ is not necessarily the middle sample from that block. If, eg, the
+ current Vorbis packet encodes a "long block" and the next Vorbis
+ packet encodes a "short block", the last decodable sample from the
+ current packet be at position (3*long\_block\_length/4) -
+ (short\_block\_length/4).
+
+
+\item
+ The granule (PCM) position of the first page need not indicate
+ that the stream started at position zero. Although the granule
+ position belongs to the last completed packet on the page and a
+ valid granule position must be positive, by
+ inference it may indicate that the PCM position of the beginning
+ of audio is positive or negative.
+
+
+ \begin{itemize}
+ \item
+ A positive starting value simply indicates that this stream begins at
+ some positive time offset, potentially within a larger
+ program. This is a common case when connecting to the middle
+ of broadcast stream.
+
+ \item
+ A negative value indicates that
+ output samples preceeding time zero should be discarded during
+ decoding; this technique is used to allow sample-granularity
+ editing of the stream start time of already-encoded Vorbis
+ streams. The number of samples to be discarded must not exceed
+ the overlap-add span of the first two audio packets.
+
+ \end{itemize}
+
+
+ In both of these cases in which the initial audio PCM starting
+ offset is nonzero, the second finished audio packet must flush the
+ page on which it appears and the third packet begin a fresh page.
+ This allows the decoder to always be able to perform PCM position
+ adjustments before needing to return any PCM data from synthesis,
+ resulting in correct positioning information without any aditional
+ seeking logic.
+
+
+ \begin{note}
+ Failure to do so should, at worst, cause a
+ decoder implementation to return incorrect positioning information
+ for seeking operations at the very beginning of the stream.
+ \end{note}
+
+
+\item
+ A granule position on the final page in a stream that indicates
+ less audio data than the final packet would normally return is used to
+ end the stream on other than even frame boundaries. The difference
+ between the actual available data returned and the declared amount
+ indicates how many trailing samples to discard from the decoding
+ process.
+
+\end{itemize}