summaryrefslogtreecommitdiffhomepage
path: root/vorbis/doc/vorbis-clip.txt
diff options
context:
space:
mode:
authorAki <please@ignore.pl>2021-09-29 22:52:49 +0200
committerAki <please@ignore.pl>2021-09-29 22:52:49 +0200
commit760f65d35df281b04d99843958623d99ab35dcaf (patch)
tree76f6f05695822256bbf8097fa0aa6b5d2a34369b /vorbis/doc/vorbis-clip.txt
parentbdb934044a10bcccdea4ae5e9b067a2e764e0e7f (diff)
parent74f4b1bc3b627ba4c7e03498234d88cacdfbe97b (diff)
downloadstarshatter-760f65d35df281b04d99843958623d99ab35dcaf.zip
starshatter-760f65d35df281b04d99843958623d99ab35dcaf.tar.gz
starshatter-760f65d35df281b04d99843958623d99ab35dcaf.tar.bz2
Merge commit '74f4b1bc3b627ba4c7e03498234d88cacdfbe97b' as 'vorbis'
Diffstat (limited to 'vorbis/doc/vorbis-clip.txt')
-rw-r--r--vorbis/doc/vorbis-clip.txt139
1 files changed, 139 insertions, 0 deletions
diff --git a/vorbis/doc/vorbis-clip.txt b/vorbis/doc/vorbis-clip.txt
new file mode 100644
index 0000000..2e67034
--- /dev/null
+++ b/vorbis/doc/vorbis-clip.txt
@@ -0,0 +1,139 @@
+Topic:
+
+Sample granularity editing of a Vorbis file; inferred arbitrary sample
+length starting offsets / PCM stream lengths
+
+Overview:
+
+Vorbis, like mp3, is a frame-based* audio compression where audio is
+broken up into discrete short time segments. These segments are
+'atomic' that is, one must recover the entire short time segment from
+the frame packet; there's no way to recover only a part of the PCM time
+segment from part of the coded packet without expanding the entire
+packet and then discarding a portion of the resulting PCM audio.
+
+* In mp3, the data segment representing a given time period is called
+ a 'frame'; the roughly equivalent Vorbis construct is a 'packet'.
+
+Thus, when we edit a Vorbis stream, the finest physical editing
+granularity is on these packet boundaries (the mp3 case is
+actually somewhat more complex and mp3 editing is more complicated
+than just snipping on a frame boundary because time data can be spread
+backward or forward over frames. In Vorbis, packets are all
+stand-alone). Thus, at the physical packet level, Vorbis is still
+limited to streams that contain an integral number of packets.
+
+However, Vorbis streams may still exactly represent and be edited to a
+PCM stream of arbitrary length and starting offset without padding the
+beginning or end of the decoded stream or requiring that the desired
+edit points be packet aligned. Vorbis makes use of Ogg stream
+framing, and this framing provides time-stamping data, called a
+'granule position'; our starting offset and finished stream length may
+be inferred from correct usage of the granule position data.
+
+Time stamping mechanism:
+
+Vorbis packets are bundled into into Ogg pages (note that pages do not
+necessarily contain integral numbers of packets, but that isn't
+inportant in this discussion. More about Ogg framing can be found in
+ogg/doc/framing.html). Each page that contains a packet boundary is
+stamped with the absolute sample-granularity offset of the data, that
+is, 'complete samples-to-date' up to the last completed packet of that
+page. (The same mechanism is used for eg, video, where the number
+represents complete 2-D frames, and so on).
+
+(It's possible but rare for a packet to span more than two pages such
+that page[s] in the middle have no packet boundary; these packets have
+a granule position of '-1'.)
+
+This granule position mechaism in Ogg is used by Vorbis to indicate when the
+PCM data intended to be represented in a Vorbis segment begins a
+number of samples into the data represented by the first packet[s]
+and/or ends before the physical PCM data represented in the last
+packet[s].
+
+File length a non-integral number of frames:
+
+A file to be encoded in Vorbis will probably not encode into an
+integral number of packets; such a file is encoded with the last
+packet containing 'extra'* samples. These samples are not padding; they
+will be discarded in decode.
+
+*(For best results, the encoder should use extra samples that preserve
+the character of the last frame. Simply setting them to zero will
+introduce a 'cliff' that's hard to encode, resulting in spread-frame
+noise. Libvorbis extrapolates the last frame past the end of data to
+produce the extra samples. Even simply duplicating the last value is
+better than clamping the signal to zero).
+
+The encoder indicates to the decoder that the file is actually shorter
+than all of the samples ('original' + 'extra') by setting the granule
+position in the last page to a short value, that is, the last
+timestamp is the original length of the file discarding extra samples.
+The decoder will see that the number of samples it has decoded in the
+last page is too many; it is 'original' + 'extra', where the
+granulepos says that through the last packet we only have 'original'
+number of samples. The decoder then ignores the 'extra' samples.
+This behavior is to occur only when the end-of-stream bit is set in
+the page (indicating last page of the logical stream).
+
+Note that it not legal for the granule position of the last page to
+indicate that there are more samples in the file than actually exist,
+however, implementations should handle such an illegal file gracefully
+in the interests of robust programming.
+
+Beginning point not on integral packet boundary:
+
+It is possible that we will the PCM data represented by a Vorbis
+stream to begin at a position later than where the decoded PCM data
+really begins after an integral packet boundary, a situation analagous
+to the above description where the PCM data does not end at an
+integral packet boundary. The easiest example is taking a clip out of
+a larger Vorbis stream, and choosing a beginning point of the clip
+that is not on a packet boundary; we need to ignore a few samples to
+get the desired beginning point.
+
+The process of marking the desired beginning point is similar to
+marking an arbitrary ending point. If the encoder wishes sample zero
+to be some location past the actual beginning of data, it associates a
+'short' granule position value with the completion of the second*
+audio packet. The granule position is associated with the second
+packet simply by making sure the second packet completes its page.
+
+*(We associate the short value with the second packet for two reasons.
+ a) The first packet only primes the overlap/add buffer. No data is
+ returned before decoding the second packet; this places the decision
+ information at the point of decision. b) Placing the short value on
+ the first packet would make the value negative (as the first packet
+ normally represents position zero); a negative value would break the
+ requirement that granule positions increase; the headers have
+ position values of zero)
+
+The decoder sees that on the first page that will return
+data from the overlap/add queue, we have more samples than the granule
+position accounts for, and discards the 'surplus' from the beginning
+of the queue.
+
+Note that short granule values (indicating less than the actually
+returned about of data) are not legal in the Vorbis spec outside of
+indicating beginning and ending sample positions. However, decoders
+should, at minimum, tolerate inadvertant short values elsewhere in the
+stream (just as they should tolerate out-of-order/non-increasing
+granulepos values, although this too is illegal).
+
+Beginning point at arbitrary positive timestamp (no 'zero' sample):
+
+It's also possible that the granule position of the first page of an
+audio stream is a 'long value', that is, a value larger than the
+amount of PCM audio decoded. This implies only that we are starting
+playback at some point into the logical stream, a potentially common
+occurence in streaming applications where the decoder may be
+connecting into a live stream. The decoder should not treat the long
+value specially.
+
+A long value elsewhere in the stream would normally occur only when a
+page is lost or out of sequence, as indicated by the page's sequence
+number. A long value under any other situation is not legal, however
+a decoder should tolerate both possibilities.
+
+