Calling convention violator broke streaming loop pretty far away

A really nasty problem coming from MainConcept AVC/H.264 SDK Encoder was destroying media streaming pipeline. SDK is somewhat old (9.7.9.5738) and the problem might be already fixed, or might be not. The problem is a good example of how a small bug could become a big pain.

The problem was coming up in 64-bit Release builds only. Win32 build? OK. Debug build where you can step things through? No problem.

The bug materialized in GDCL MP4 Demultiplexer filter streaming (Demultiplexer filter in the pipeline below) generating media samples with incorrect time stamps.

Pipeline

Initial start and stop time are okay, and further go as _I64_MIN (incorrect).

Clipbrd3

The problem appears to be SSE optimization and x64 calling convention related. This explains why it’s only 64-bit Release build suffering from the issue. MS compiler decided to use XMM7 register for dRate variable in this code fragment:

REFERENCE_TIME tStart, tStop;
double dRate;
m_pParser->GetSeekingParams(&tStart, &tStop, &dRate);

[...]

for(; ; )
{
    [...]

    tSampleStart = REFERENCE_TIME(tSampleStart / dRate);
    tSampleEnd = REFERENCE_TIME(tSampleEnd / dRate);

dRate is the only floating point thing here and it’s clear why the compiler optimized the variable into register: no other floating point activity around.

However sample delivery goes pretty deep into other functions and modules reaching MainConcept H.264 encoder. One of its functions is violating x64 calling convention and does not preserve XMM6+ register values. OOPS! Everything is about working right, but after media sample delivery dRate value is destroyed and further media samples receive incorrect time stamps.

It is not really a problem of MP4 demultiplexer, of course, however media sample delivery might involve a long delivery chain where any violator would break streaming loop. In the same time, it is not really a big expense to de-optimize the floating point math in the demultiplexer for those a few time stamp adjustment operations. A volatile specifier breaks compiler optimization and makes the loop resistant to SSE2 register violators:

// HOTFIX: Volatile specifier is not really necessary here but it fixes a nasty problem with MainConcept AVC SDK violating x64 calling convention;
//         MS compiler might choose to keep dRate in XMM6 register and the value would be destroyed by the violating call leading to incorrect 
//         further streaming (wrong time stamps)
volatile DOUBLE dRate;
m_pParser->GetSeekingParams(&tStart, &tStop, (DOUBLE*) &dRate);

This makes H.264 this build of encoding SDK unstable and the problem is hopefully already fixed. The SDK indeed gave other troubles on specific architectures leading to undefined behavior.

Leave a Reply