Bug in Media Foundation MPEG-4 File Source related to timestamping video frames of a fragmented MP4 file

A recent update to the Media Foundation platform introduced a bug related to fragmented MP4 files with H.264 video. The bug shows up consistently with these file versions:

  • mfplat.dll – 10.0.14393.351 (rs1_release_inmarket.161014-1755)    15-Oct-16 05:48
  • mfmp4srcsnk.dll – 10.0.14393.351 (rs1_release_inmarket.161014-1755)    15-Oct-16 05:45

The nature of the problem is that the MPEG-4 File Source time stamps the data incorrectly: frames seem to get wrong durations and increments, and the time stamps then quickly jump into the future. On playback this leads to freezes with no obvious cause. As Media Foundation is used by Windows Media Player and the Windows 10 Movies & TV player, the bug is present there as well.

The original report is on MSDN Forums.

Presumably it is possible to roll back the responsible Windows Update package; otherwise one has to wait for Microsoft to fix the problem and deliver the fix in a new update.

DirectShowSpy: REGDB_E_CLASSNOTREG with IMMDevice::Activate

A DirectShow developer complained about a sudden failure of the Core Audio IMMDevice::Activate call that is supposed to instantiate a DirectShow filter for a given device.

The problem turned out to be related to an installed DirectShowSpy and its interference with the API calls. The symptom was as follows: when Activate was called for different types of objects, all calls succeeded except the one for interoperation with DirectShow (activation for IBaseFilter), e.g. in EnumerateAudioDevices output:

    IAudioClient            0x00000000
    IAudioEndpointVolume    0x00000000
    IAudioMeterInformation  0x00000000
    IAudioSessionManager    0x00000000
    IAudioSessionManager2   0x00000000
    IBaseFilter             REGDB_E_CLASSNOTREG
    IDeviceTopology         0x00000000
    IMFTrustedOutput        0x00000000

When Core Audio is requested to do DirectShow activation, the API creates an instance of System Device Enumerator and forwards the activation call to it. DirectShowSpy intercepts these calls, but it did not support unknown COM interfaces, and in particular the undocumented IMMDeviceActivator interface, which the APIs use internally to forward the activation call.

So, System Device Enumerator implements the documented ICreateDevEnum and also the undocumented internal IMMDeviceActivator. The entire call sequence is as follows:

// Top level code:

CComPtr<IMMDevice> pDevice = ...; // Audio endpoint interface
pDevice->Activate(..., __uuidof(IBaseFilter), ...)

// API:

STDMETHOD(Activate)(...)
{
    // ...
    if(requested is IBaseFilter)
    {
        CComPtr<IMMDeviceActivator> pDeviceActivator;
        pDeviceActivator.CoCreateInstance(CLSID_SystemDeviceEnum);
        return pDeviceActivator->Activate(pDevice, ...);
    }
    // ...
}

DirectShowSpy’s failure to provide IMMDeviceActivator resulted in the symptom in question; this is fixed in version 1.0.0.2106 and later. The failure code is not very descriptive, but of course the APIs did not expect an external hook, and failure at this point is not an outcome they were designed for.
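The shape of the fix can be illustrated with a simplified, COM-free model. The names below (IDeviceActivator, RealEnumerator, SpyEnumerator) are hypothetical stand-ins, not the actual DirectShowSpy or Core Audio declarations. The point is only that a hook wrapping an object has to expose and forward every interface its callers rely on, including internal ones.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the undocumented activator interface
struct IDeviceActivator
{
    virtual ~IDeviceActivator() = default;
    // Returns 0 on success, mimicking an HRESULT-style result
    virtual long Activate(const std::string& interfaceName) = 0;
};

// Hypothetical stand-in for the real System Device Enumerator
struct RealEnumerator : IDeviceActivator
{
    long Activate(const std::string&) override { return 0; }
};

// Hypothetical stand-in for the spy: wraps the real object, forwards the call and logs it
class SpyEnumerator : public IDeviceActivator
{
public:
    explicit SpyEnumerator(IDeviceActivator* inner) : m_inner(inner)
    {
        assert(m_inner != nullptr);
    }
    long Activate(const std::string& interfaceName) override
    {
        const long result = m_inner->Activate(interfaceName);
        m_log.push_back("Activate: " + interfaceName + ", result " + std::to_string(result));
        return result;
    }
    const std::vector<std::string>& Log() const { return m_log; }

private:
    IDeviceActivator* m_inner;
    std::vector<std::string> m_log;
};
```

A caller holding a SpyEnumerator that wraps a RealEnumerator sees the same result as it would from the real object, and the spy keeps a record of the call. Without the forwarding Activate method the wrapper would have nothing to offer for this interface, which is roughly the situation behind the REGDB_E_CLASSNOTREG symptom.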

System Device Enumerator matches the known devices to the provided Core Audio device and creates an instance of respective filter – this is how APIs work together. DirectShowSpy prints these calls out to its log.

roatlbase.h(1582): TraceModuleVersion: "D:\...\DirectShowSpy-Win32.dll" version is 1.0.0.2107, running in "D:\...\EnumerateAudioDevices-Win32.exe" at 0x63210000
dllmain.h(36): CDirectShowSpyModule::CDirectShowSpyModule: this 0x633963A4
SystemDeviceEnumeratorSpy.h(669): CSystemDeviceEnumeratorSpyT<...>::CSystemDeviceEnumeratorSpyT: this 0x02F1DA68
SystemDeviceEnumeratorSpy.h(681): CSystemDeviceEnumeratorSpyT<...>::FinalConstruct: pszPath "D:\...\EnumerateAudioDevices-Win32.exe", this 0x02F1DA68, m_dwRef 1
SystemDeviceEnumeratorSpy.h(49): CSystemDeviceEnumeratorSpyT<...>::InternalQueryInterface: 0x02F1DA68, Interface {3B0D0EA4-D0A9-4B0E-935B-09516746FAC0}, Result 0x00000000
SystemDeviceEnumeratorSpy.h(49): CSystemDeviceEnumeratorSpyT<...>::InternalQueryInterface: 0x02F1DA68, Interface {3B0D0EA4-D0A9-4B0E-935B-09516746FAC0}, Result 0x00000000
SystemDeviceEnumeratorSpy.h(808): CSystemDeviceEnumeratorSpyT<...>::Activate: this 0x02F1DA68, InterfaceIdentifier {56A86895-0AD4-11CE-B03A-0020AF0BA770}, pMmDevice 0x0054E7F8
SystemDeviceEnumeratorSpy.h(815): CSystemDeviceEnumeratorSpyT<...>::Activate: nActivateResult 0x00000000 
SystemDeviceEnumeratorSpy.h(673): CSystemDeviceEnumeratorSpyT<...>::~CSystemDeviceEnumeratorSpyT: this 0x02F1DA68

Calling convention violator broke streaming loop pretty far away

A really nasty problem coming from the MainConcept AVC/H.264 SDK Encoder was destroying a media streaming pipeline. The SDK is somewhat old (9.7.9.5738) and the problem may or may not be fixed already. It is a good example of how a small bug can become a big pain.

The problem was coming up in 64-bit Release builds only. Win32 build? OK. Debug build where you can step things through? No problem.

The bug materialized in the streaming of the GDCL MP4 Demultiplexer filter (the Demultiplexer filter in the pipeline below), which generated media samples with incorrect time stamps.

Pipeline

The initial start and stop times are okay, but subsequent ones come out as _I64_MIN (incorrect).

Clipbrd3

The problem appears to be related to SSE optimization and the x64 calling convention, which explains why only the 64-bit Release build suffers from the issue. The MS compiler decided to use the XMM7 register for the dRate variable in this code fragment:

REFERENCE_TIME tStart, tStop;
double dRate;
m_pParser->GetSeekingParams(&tStart, &tStop, &dRate);

[...]

for(; ; )
{
    [...]

    tSampleStart = REFERENCE_TIME(tSampleStart / dRate);
    tSampleEnd = REFERENCE_TIME(tSampleEnd / dRate);

dRate is the only floating point value here, and it is clear why the compiler keeps the variable in a register: there is no other floating point activity around.

However, sample delivery goes pretty deep into other functions and modules, reaching the MainConcept H.264 encoder. One of its functions violates the x64 calling convention and does not preserve XMM6+ register values. OOPS! Everything seems to work right, but after media sample delivery the dRate value is destroyed and further media samples receive incorrect time stamps.

It is not really a problem of the MP4 demultiplexer, of course; however, media sample delivery might involve a long delivery chain in which any violator would break the streaming loop. At the same time, it is not a big expense to de-optimize the floating point math in the demultiplexer for those few time stamp adjustment operations. A volatile specifier prevents the compiler optimization and makes the loop resistant to SSE2 register violators:

// HOTFIX: Volatile specifier is not really necessary here but it fixes a nasty problem with MainConcept AVC SDK violating x64 calling convention;
//         MS compiler might choose to keep dRate in XMM6 register and the value would be destroyed by the violating call leading to incorrect 
//         further streaming (wrong time stamps)
volatile DOUBLE dRate;
m_pParser->GetSeekingParams(&tStart, &tStop, (DOUBLE*) &dRate);
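As a self-contained illustration of the pattern (not the demultiplexer's actual code), the sketch below keeps the rate in a volatile local and re-reads it from memory for each adjustment. GetSeekingParams here is a hypothetical free function standing in for the parser method, and REFERENCE_TIME is redefined locally so that the fragment builds without DirectShow headers. The sketch copies the value through a plain temporary instead of casting away volatile, because writing to a volatile object through a non-volatile pointer is formally undefined behavior in standard C++.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

typedef std::int64_t REFERENCE_TIME; // 100 ns units, as in DirectShow

// Hypothetical stand-in for m_pParser->GetSeekingParams
inline void GetSeekingParams(REFERENCE_TIME* start, REFERENCE_TIME* stop, double* rate)
{
    *start = 0;
    *stop = 10000000;
    *rate = 2.0;
}

// Scales sample times by the playback rate, re-reading the rate from memory each time
inline std::vector<REFERENCE_TIME> AdjustSampleTimes(const std::vector<REFERENCE_TIME>& times)
{
    REFERENCE_TIME start, stop;
    double parsedRate;
    GetSeekingParams(&start, &stop, &parsedRate);
    // volatile turns every read into a real memory access, so the value cannot live
    // only in an XMM register across whatever calls happen inside the loop
    volatile double rate = parsedRate;
    std::vector<REFERENCE_TIME> result;
    for(const REFERENCE_TIME time : times)
    {
        const double currentRate = rate; // fresh load from the stack slot
        assert(currentRate != 0.0);
        result.push_back(static_cast<REFERENCE_TIME>(time / currentRate));
    }
    return result;
}
```

The cost is one extra memory load per sample, which is negligible next to the work of delivering a media sample downstream.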

This makes this build of the H.264 encoding SDK unstable, and hopefully the problem is already fixed. The SDK indeed gave other trouble on specific architectures, leading to undefined behavior.

Windows 10 AVI Splitter bug

There were a few reports that Windows 10 is unable to play AVI files which played fine in earlier versions of Windows, and that this affects AVI files specifically.

OK, the problem does exist. Moreover, the problem exists in the Windows component that implements the AVI Splitter DirectShow filter. One of the reporters mentioned he had a problem with a DV AVI file. I built one and it indeed showed the problem:

AVI Splitter bug in GraphStudioNext

Playback stops at the same frame every time the filter graph is run. The error is 0x8004020D VFW_E_BUFFER_OVERFLOW “The buffer is not big enough” coming from AVI Splitter’s worker thread. The buffers on the memory allocators look appropriate, so the bug looks related to AVI Splitter implementation details, specifically the CBaseMSRWorker class that reads from the file and delivers frames downstream.

AVI Splitter bug call stack

The problem exists in both 32-bit and 64-bit versions, but not in Media Foundation. With some luck Microsoft will fix the problem on their side.

PolyTextOut API – Does It Work?

As MSDN says,

The PolyTextOut function draws several strings using the font and text colors currently selected in the specified device context.

The article also mentions ExtTextOut as a simpler sister function:

To draw a single string of text, the application should call the ExtTextOut function.

It looks like the API is not so Unicode friendly. Code as simple as

PolyTextOut(L"Мама мыла раму");
PolyTextOut(L"Mother washed window");
PolyTextOut(L"ママソープフレーム");
PolyTextOut(L"დედა საპნის კარკასი");

ExtTextOut(L"Мама мыла раму");
ExtTextOut(L"Mother washed window");
ExtTextOut(L"ママソープフレーム");
ExtTextOut(L"დედა საპნის კარკასი");

outputs correctly in the case of ExtTextOut, while PolyTextOut stumbles on the strings in Japanese and Georgian. All right, so why did it handle Russian?

PolyTextOut Sample

Media Foundation MPEG-4 Property Handler might report incorrect Video Frame Rate

To follow up on the previous post about a Media Foundation bug, here goes another one, related to the property handler for MPEG-4 files (.MP4) and specifically the PKEY_Video_FrameRate property, which reports the frame rate of a given media file.

This is the object responsible for filling columns in Explorer; visually the bug might look like this:

Image001

The values of the properties are also accessible programmatically using IPropertyStore::GetValue API, in which case they are:

  • PKEY_Video_FrameWidth: 1280 (VT_UI4) // 1,280
  • PKEY_Video_FrameHeight: 720 (VT_UI4) // 720
  • PKEY_Video_FrameRate: 1091345 (VT_UI4) // 1,091,345
  • PKEY_Video_Compression: {34363248-0000-0010-8000-00AA00389B71} (VT_LPWSTR) // FourCC H264
  • PKEY_Video_FourCC: 875967048 (VT_UI4) // 875,967,048
  • PKEY_Video_HorizontalAspectRatio: 1 (VT_UI4) // 1
  • PKEY_Video_VerticalAspectRatio: 1 (VT_UI4) // 1
  • PKEY_Video_StreamNumber: 2 (VT_UI4) // 2
  • PKEY_Video_TotalBitrate: 12123288 (VT_UI4) // 12,123,288
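The PKEY_Video_FourCC value is easier to read once unpacked into characters. A minimal sketch, assuming the usual little-endian FourCC packing (first character in the lowest byte); the helper names are made up for illustration:

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Unpacks a FourCC code into four characters, lowest byte first
inline std::string DecodeFourCC(std::uint32_t code)
{
    std::string text(4, ' ');
    for(int index = 0; index < 4; index++)
        text[index] = static_cast<char>((code >> (8 * index)) & 0xFF);
    return text;
}

// Packs four characters back into a FourCC code
inline std::uint32_t EncodeFourCC(const std::string& text)
{
    assert(text.size() == 4);
    std::uint32_t code = 0;
    for(int index = 0; index < 4; index++)
        code |= static_cast<std::uint32_t>(static_cast<unsigned char>(text[index])) << (8 * index);
    return code;
}
```

DecodeFourCC(875967048) yields "H264", and 875,967,048 is 0x34363248 in hexadecimal, which is also the first field of the PKEY_Video_Compression GUID above.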

The actual frame rate of the file is 50 fps. The file plays well in every media player, so the problem is in the reporting itself. Let us look inside the file to possibly identify the cause. The mdhd box for the video track shows the following information:

Image003

Let us do some math now:

  • Time Scale: 10,000,000
  • Duration: 4,501,200,000 (around 7.5 minutes)
  • Video Sample Count: 22,506

This gives the correct frame rate of 50 fps (sample count divided by duration in seconds). However, the duration value itself is pretty big and exceeds the 32-bit range. Now let us try this one:

22506 / (4501200000 & ((1 << 32) - 1)) * 10000000

And we get 1,091. Bingo! Arithmetic overflow in the property handler then…
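The same arithmetic as a short sketch. It is only a guess at what happens inside the property handler; the function names are made up for illustration, and the truncation to 32 bits is modeled explicitly with a cast.

```cpp
#include <cassert>
#include <cstdint>

// Frame rate computed from the full 64-bit duration
inline double FrameRate(std::uint64_t sampleCount, std::uint64_t duration, std::uint64_t timeScale)
{
    assert(duration != 0);
    return static_cast<double>(sampleCount) * timeScale / duration;
}

// Same computation, with the duration first truncated to its low 32 bits
// (assumed here to model the suspected overflow)
inline double FrameRateTruncated(std::uint64_t sampleCount, std::uint64_t duration, std::uint64_t timeScale)
{
    const std::uint32_t truncated = static_cast<std::uint32_t>(duration);
    assert(truncated != 0);
    return static_cast<double>(sampleCount) * timeScale / truncated;
}
```

With the numbers above (22,506 samples, duration 4,501,200,000, time scale 10,000,000), FrameRate gives 50 and FrameRateTruncated gives about 1091.3, because the truncated duration is 206,232,704.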

Bonus tool: FilePropertyStore application which reads properties of the file you drag and drop onto it, Win32 and x64 versions.

Image002

Something went terribly wrong in x64 build of Windows Media Video 9 Decoder

Unfortunately, the 64-bit version of Windows Media Video 9 Decoder is not as good as its 32-bit sister. The 32-bit version is used an order of magnitude more frequently and does not give trouble. The 64-bit version offers a similar feature set, but it is pretty hard to see it in action: it takes a 64-bit media application to host it, and most media applications are 32-bit (there is often a good reason for this). Even Windows SDK 7.0 topoedit comes pre-built as a Win32 application only (it is provided with source code though, so one can build an x64 peer after fixing buildability issues and adding an x64 configuration manually).

The decoder is available as a dual DMO/MFT, which enables it for both DirectShow and Media Foundation APIs and exposes the problem in both as well.

Once in a while, 64-bit version of the decoder might be producing incorrect output, adding white dots where they are not supposed to be.

02

Because the problem resides supposedly in the decoder itself, it affects everything decoding Windows Media Video of this flavor, including

  • Windows Media Player 64-bit
  • GraphEdit x64 from Windows SDK
  • TopoEdit x64 from Windows SDK (built manually)

The only exception is DXVA-accelerated decoding, where the bug is sidestepped by hardware-assisted decoding and the output is correct.

Image002

One of the ways to easily see and reproduce the problem in action is to re-encode the content in 64-bit version of GraphEdit into anything else: video decoder there will decode in software and burn the artifacts in.
