Re: DVI Incompatibility
Author:   Van Jacobson <van@ee.lbl.gov>
Date:     1995/06/22
Forum:    ucb.digital-video

On 10 Jun 1995, Henning Schulzrinne said:
> There are (at least) four possibilities: (1) Put the first
> sample unencoded into the header and encode 160 bytes. The
> unencoded sample is simply used as a predictor for the first
> sample (which happens to be the same). [This appears to be the
> vat approach and is the NeVoT approach.]
>
> (2) Same as 1), but encode only the following 159 samples. The
> last four bits in the packet are meaningless (zero). The
> receiver, unfortunately, can't tell unless it knows that each
> packet contains 160 ms (or at least knows that packets contain
> an even number of samples). A sender doing (2) actually works
> reasonably well with a receiver doing (1).
>
> (3) Use 161 samples, conforming to the 'DVI standard'.
> Conformance is rather useful if either hardware or system
> libraries produce that format. 161 samples obviously don't fit
> well with the rest and may not agree with certain hardware
> restrictions.

To which Jack Jansen replied:
> I would go for the first solution, put the predictor in the
> header. After all, we don't really need the predictor if we had
> an error-free link, it is just there because the sample-stream
> can be broken, so it can be seen as part of the transport
> protocol, not part of the adpcm sound protocol.

It looks to me as if Jack made the same mistake I did: he
mentally translated Henning's option 1 into something sensible,
then said "yes, we should do this sensible thing" (which just
happens to be what Jack's coder & the vat coder derived from it
do).  Unfortunately, what Henning said & the text he put in the
profile document are not at all sensible & not what the existing
coders do (except, possibly, NeVoT's).  The existing coders put
the *prediction* (i.e., what the coder thought the next sample
would be at the time it coded the last sample of the previous
frame) into the header.  This means every sample's nybble is
treated identically in both the coding & decoding loops.
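
To make the distinction concrete, here is a minimal Python sketch
of the prediction-in-header layout. It assumes the commonly
published IMA/DVI ADPCM step and index tables and 16-bit input.
The names (code_sample, decode_nybble, encode_frame, decode_frame)
and the choice to carry the step index next to the prediction in
the header are assumptions for illustration, not taken from vat
or from the profile text.

```python
import math

# Step-size and index-adjust tables as commonly published for IMA/DVI ADPCM.
STEP = [7, 8, 9, 10, 11, 12, 13, 14, 16, 17, 19, 21, 23, 25, 28, 31, 34, 37,
        41, 45, 50, 55, 60, 66, 73, 80, 88, 97, 107, 118, 130, 143, 157, 173,
        190, 209, 230, 253, 279, 307, 337, 371, 408, 449, 494, 544, 598, 658,
        724, 796, 876, 963, 1060, 1166, 1282, 1411, 1552, 1707, 1878, 2066,
        2272, 2499, 2749, 3024, 3327, 3660, 4026, 4428, 4871, 5358, 5894,
        6484, 7132, 7845, 8630, 9493, 10442, 11487, 12635, 13899, 15289,
        16818, 18500, 20350, 22385, 24623, 27086, 29794, 32767]
INDEX_ADJ = [-1, -1, -1, -1, 2, 4, 6, 8]

def clamp(v, lo, hi):
    return max(lo, min(hi, v))

def decode_nybble(nyb, pred, index):
    """Apply one 4-bit code to the state; return (new_pred, new_index)."""
    step = STEP[index]
    vpdiff = step >> 3
    if nyb & 4:
        vpdiff += step
    if nyb & 2:
        vpdiff += step >> 1
    if nyb & 1:
        vpdiff += step >> 2
    pred = clamp(pred - vpdiff if nyb & 8 else pred + vpdiff, -32768, 32767)
    return pred, clamp(index + INDEX_ADJ[nyb & 7], 0, 88)

def code_sample(sample, pred, index):
    """Quantize one sample against the prediction; return (nybble, pred, index)."""
    step = STEP[index]
    diff = sample - pred
    nyb = 8 if diff < 0 else 0          # sign bit
    diff = abs(diff)
    if diff >= step:
        nyb |= 4
        diff -= step
    if diff >= step >> 1:
        nyb |= 2
        diff -= step >> 1
    if diff >= step >> 2:
        nyb |= 1
    # The coder updates its state exactly as the decoder will.
    pred, index = decode_nybble(nyb, pred, index)
    return nyb, pred, index

def encode_frame(samples, state):
    """Header is the coder state *before* this frame; no sample is special."""
    pred, index = state
    header = (pred, index)
    nybbles = []
    for s in samples:
        nyb, pred, index = code_sample(s, pred, index)
        nybbles.append(nyb)
    return header, nybbles, (pred, index)

def decode_frame(header, nybbles):
    pred, index = header
    out = []
    for nyb in nybbles:
        pred, index = decode_nybble(nyb, pred, index)
        out.append(pred)
    return out

# Two 160-sample frames (20 ms each at 8 kHz) of a 440 Hz tone.
pcm = [int(8000 * math.sin(2 * math.pi * 440 * n / 8000)) for n in range(320)]
state = (0, 0)
frames = []
for start in (0, 160):
    header, nybbles, state = encode_frame(pcm[start:start + 160], state)
    frames.append((header, nybbles))
```

Because the header is just a snapshot of the coder state, decoding
each frame from its own header gives the same output as decoding
the whole nybble stream in one pass from the initial state.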

What Henning's text actually says is to put the first sample of
the current frame in the header (presumably this means the coder
should discard the final prediction of the previous frame & code
the first sample's delta nybble as '0', but god only knows if
this is what Henning thought he was suggesting when he said "The
unencoded sample is simply used as a predictor for the first
sample (which happens to be the same)").  This makes for a more
complicated coder (since you treat the first sample of a frame
specially) and is almost guaranteed to cause audible artifacts
in the output stream, since every 20 ms (50 Hz) you quantize one
sample with different rules than are used for all the others.
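
One possible reading of the first-sample-in-header scheme can be
sketched in Python, assuming the commonly published IMA/DVI ADPCM
tables. The function name encode_frame_first_sample and the step
index carried in the header are hypothetical. Whether this is
what the profile text intends is exactly the open question; the
sketch only shows what changes at a frame boundary.

```python
import math

# Step-size and index-adjust tables as commonly published for IMA/DVI ADPCM.
STEP = [7, 8, 9, 10, 11, 12, 13, 14, 16, 17, 19, 21, 23, 25, 28, 31, 34, 37,
        41, 45, 50, 55, 60, 66, 73, 80, 88, 97, 107, 118, 130, 143, 157, 173,
        190, 209, 230, 253, 279, 307, 337, 371, 408, 449, 494, 544, 598, 658,
        724, 796, 876, 963, 1060, 1166, 1282, 1411, 1552, 1707, 1878, 2066,
        2272, 2499, 2749, 3024, 3327, 3660, 4026, 4428, 4871, 5358, 5894,
        6484, 7132, 7845, 8630, 9493, 10442, 11487, 12635, 13899, 15289,
        16818, 18500, 20350, 22385, 24623, 27086, 29794, 32767]
INDEX_ADJ = [-1, -1, -1, -1, 2, 4, 6, 8]

def code_sample(sample, pred, index):
    """Quantize one sample against the prediction; return (nybble, pred, index)."""
    step = STEP[index]
    diff = sample - pred
    sign = 8 if diff < 0 else 0
    diff = abs(diff)
    delta = 0
    vpdiff = step >> 3
    if diff >= step:
        delta |= 4
        diff -= step
        vpdiff += step
    if diff >= step >> 1:
        delta |= 2
        diff -= step >> 1
        vpdiff += step >> 1
    if diff >= step >> 2:
        delta |= 1
        vpdiff += step >> 2
    pred = pred - vpdiff if sign else pred + vpdiff
    pred = max(-32768, min(32767, pred))
    index = max(0, min(88, index + INDEX_ADJ[delta]))
    return delta | sign, pred, index

def encode_frame_first_sample(samples, index):
    """Header carries the frame's first raw sample (one reading of option 1)."""
    pred = samples[0]        # the previous frame's final prediction is discarded
    header = (pred, index)
    nybbles = []
    for s in samples:        # first pass always sees diff == 0, so nybble 0
        nyb, pred, index = code_sample(s, pred, index)
        nybbles.append(nyb)
    return header, nybbles, index

# Two 160-sample frames (20 ms each at 8 kHz) of a 440 Hz tone.
pcm = [int(8000 * math.sin(2 * math.pi * 440 * n / 8000)) for n in range(320)]
index = 0
frames = []
for start in (0, 160):
    header, nybbles, index = encode_frame_first_sample(pcm[start:start + 160], index)
    frames.append((header, nybbles))
```

Under this reading the first nybble of every frame is 0 whatever
the signal does, and that nybble still moves the step index down
by one (unless it is already 0), so the frame's first sample does
not go through the same adaptation as the other 159.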

I agree with what Jack suggested (which we have 3 years of
experience with & which is known to work well) -- put the
predictor in the header, not the first sample.

 - Van