DirectShow VMR-7 bug in Windows 10

The DirectShow Video Mixing Renderer (VMR-7) filter exhibits a bug (a regression?) on Windows 10 systems. When aspect ratio preservation is enabled in VMR_ARMODE_LETTER_BOX mode, which quite often makes sense as the default mode, the letterboxing does not work as expected.

The problem is easy to reproduce with the well-known DShowPlayerSDK sample application, with an edit enforcing VMR-7 mode. Once video is started, just resize the window and the parts of the window not covered by video will not be erased as expected.
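For reference, the edit in question looks roughly like the sketch below (pFilterGraph and the exact wiring are my assumptions, not the sample's code; error handling is reduced to asserts): add CLSID_VideoMixingRenderer explicitly and request VMR_ARMODE_LETTER_BOX.

#include <dshow.h>
#include <atlbase.h>

CComPtr<IBaseFilter> pBaseFilter;
ATLVERIFY(SUCCEEDED(pBaseFilter.CoCreateInstance(CLSID_VideoMixingRenderer))); // VMR-7
ATLVERIFY(SUCCEEDED(pFilterGraph->AddFilter(pBaseFilter, L"Video Mixing Renderer")));
CComQIPtr<IVMRAspectRatioControl> pAspectRatioControl(pBaseFilter);
ATLASSERT(pAspectRatioControl);
// With this mode the renderer is supposed to erase the letterbox bars itself,
// which is exactly what stops happening on Windows 10 after a window resize
ATLVERIFY(SUCCEEDED(pAspectRatioControl->SetAspectRatioMode(VMR_ARMODE_LETTER_BOX)));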

Apparently this worked well earlier.

UpdateVersionInfoGit: Multiple references/hashes

Some time ago I shared an application which I have been using to embed a git reference into binary resources, especially as a post-build event in an automated manner: Embedding a Git reference at build time.

This time I needed a small amendment related to the use of a git repository as a sub-module of another repository. To make troubleshooting easier, when a project is built as a part of a bigger build through a sub-module repository reference, the git details of both the repository and its parent might be embedded into the resources.

"$(ProjectDir)..\_Bin\Third Party\UpdateVersionInfoGit-$(PlatformName).exe" path "$(ProjectDir).." path "$(ProjectDir)..\.." binary "$(TargetPath)"


The utility accepts multiple path arguments, goes over all of them and concatenates the “git log” output. When multiple paths are given, it is okay for some of them to be invalid or unrelated to git repositories.
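For illustration only (this is not the actual utility source, just a sketch of the multi-path logic with names of my own choosing): each path gets a “git log” attempt, paths that are not repositories are skipped quietly, and the successful outputs are concatenated.

#include <cstdio>
#include <string>
#include <vector>

std::string QueryGitLog(std::vector<std::string> const& Paths)
{
    std::string Text;
    for(auto&& Path: Paths)
    {
        // -C makes git treat the path as the working directory
        std::string const Command = "git -C \"" + Path + "\" log -1";
        FILE* Stream = _popen(Command.c_str(), "r");
        if(!Stream)
            continue;
        std::string Output;
        char Buffer[512];
        while(fgets(Buffer, sizeof Buffer, Stream))
            Output += Buffer;
        // Non-zero exit status means the path is not a git repository - just skip it
        if(_pclose(Stream) == 0)
            Text += Output;
    }
    return Text;
}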

Download links

Infrared Camera in Media Foundation

Surface Pro (5th Gen) infrared camera streamed into Chrome browser in H.264 encoding over WebSocket connection

The screenshot above shows the Surface Pro tablet’s infrared camera (known as “Microsoft IR Camera Front” on the device) captured live, encoded and streamed (everything up to this point hosted by a Microsoft Media Foundation Media Session) over the network using WebSockets into Chrome’s HTML5 video tag by means of Media Source Extensions (MSE).

Why? Because why not.

Unfortunately, Microsoft did not publish or document an API to access infrared and depth (time-of-flight) cameras, so traditional desktop applications cannot use these hardware capabilities. Nevertheless, the functionality is available in the Universal Windows Platform (UWP), see Windows.Media.Capture.Frames and friends.
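For the record, this is roughly what the documented UWP route looks like from C++/WinRT (a sketch only; error handling and MediaCapture initialization are omitted): enumerate the frame source groups and pick a source of the Infrared kind.

#include <winrt/Windows.Foundation.h>
#include <winrt/Windows.Foundation.Collections.h>
#include <winrt/Windows.Media.Capture.Frames.h>

using namespace winrt;
using namespace Windows::Media::Capture::Frames;

IAsyncAction FindInfraredSourceAsync()
{
    // Frame source groups are served by the frame server (unlike my undocumented route)
    auto const Groups = co_await MediaFrameSourceGroup::FindAllAsync();
    for(auto const& Group: Groups)
        for(auto const& SourceInfo: Group.SourceInfos())
            if(SourceInfo.SourceKind() == MediaFrameSourceKind::Infrared)
            {
                // "Microsoft IR Camera Front" shows up here on the Surface Pro;
                // initialize MediaCapture with this group/source to receive the frames
            }
}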

The UWP implementation is apparently using Media Foundation in its backyard, so the functionality could certainly be published for desktop applications as well. Another interesting thing is that my [undocumented] way to access the device seems to bypass the frame server and talk to the device directly, including video.

It does not look like Microsoft is planning to extend the visibility of these new features to the desktop Media Foundation API, since they keep adding new features without exposing them for public use outside UWP. The UWP API itself is eclectic and I can’t imagine how one could get a good understanding of it without having a good grip on the underlying API layers.

Media Foundation MP4 Media Source gets a bit too tired when doing too much work

It appears there is a sort of limitation (read: “a bug”) in the Media Foundation MPEG-4 File Source implementation when it comes to reading long fragmented MP4 files.

When the respective media source is used to read such a file (for which, by the way, it does not offer seeking), the source issues an MF_SOURCE_READERF_ENDOFSTREAM before reaching the actual end of the file.

When some software sees a full hour of video in the file…

… the Media Foundation primitive, after reading frame 00:58:35.1833333, issues an “oh, gimme a break” event and reports end of stream.
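In Source Reader terms the behavior looks like this (a sketch; creation of pSourceReader from the file is omitted): the loop below terminates with MF_SOURCE_READERF_ENDOFSTREAM well before the last frame of the hour-long file.

#include <mfapi.h>
#include <mfreadwrite.h>
#include <atlbase.h>

for(; ; )
{
    DWORD nStreamIndex, nStreamFlags;
    LONGLONG nTime;
    CComPtr<IMFSample> pSample;
    ATLVERIFY(SUCCEEDED(pSourceReader->ReadSample(MF_SOURCE_READER_FIRST_VIDEO_STREAM, 0,
        &nStreamIndex, &nStreamFlags, &nTime, &pSample)));
    if(nStreamFlags & MF_SOURCE_READERF_ENDOFSTREAM)
        break; // Reported around 00:58:35 even though the file continues past it
    // ... consume pSample
}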

NVIDIA Video Codec SDK encoder initialization memory leak

It appears that re-initialization of an encoding session with NVIDIA Video Codec SDK is, or at least might be, producing an unexpected memory leak.

So, how does it work exactly?

NVENCSTATUS Status;
Status = m_ApiFunctionList.nvEncInitializeEncoder(m_Encoder, &InitializeParams);
assert(Status == NV_ENC_SUCCESS);
// NOTE: Another nvEncInitializeEncoder call
Status = m_ApiFunctionList.nvEncInitializeEncoder(m_Encoder, &InitializeParams);
assert(Status == NV_ENC_SUCCESS); // Still success
...
Status = m_ApiFunctionList.nvEncDestroyEncoder(m_Encoder);
assert(Status == NV_ENC_SUCCESS);

The root cause of the problem is the second nvEncInitializeEncoder call. Alright, it might be not exactly how the API is designed to be used, but the returned statuses all indicate success, so it would be a bit hard to justify the leak by saying that the second initialization call was not expected in the first place. Apparently the implementation overwrites internally allocated resources without accurately releasing or reusing them. And without triggering any sort of warning.

Another part of the problem is the eclectic design of the API in the first place. You open a “session” and obtain an “encoder” as a result. Then you initialize the “encoder” and when you are finished you destroy the “encoder”. Do you destroy the “session”? Oh no, you don’t have any session at all, except that the API that opens a “session” actually opens an “encoder”.

So when I get into a situation where I want to initialize an encoder that is already initialized, what I do is destroy the existing “encoder”, open a new “session” and only then initialize the session-encoder with the new initialization parameters.
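A sketch of this workaround, following the variable names of the snippet above (m_Device and the Direct3D device type are my assumptions about the setup):

NVENCSTATUS Status;
// Destroy the already initialized "encoder" instead of re-initializing it in place
Status = m_ApiFunctionList.nvEncDestroyEncoder(m_Encoder);
assert(Status == NV_ENC_SUCCESS);
m_Encoder = nullptr;
// Open a fresh "session", which is really a fresh "encoder"
NV_ENC_OPEN_ENCODE_SESSION_EX_PARAMS SessionParams { NV_ENC_OPEN_ENCODE_SESSION_EX_PARAMS_VER };
SessionParams.device = m_Device; // The same Direct3D 11 device as before
SessionParams.deviceType = NV_ENC_DEVICE_TYPE_DIRECTX;
SessionParams.apiVersion = NVENCAPI_VERSION;
Status = m_ApiFunctionList.nvEncOpenEncodeSessionEx(&SessionParams, &m_Encoder);
assert(Status == NV_ENC_SUCCESS);
// A single nvEncInitializeEncoder call per session does not exhibit the leak
Status = m_ApiFunctionList.nvEncInitializeEncoder(m_Encoder, &InitializeParams);
assert(Status == NV_ENC_SUCCESS);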

MFCreateVideoSampleFromSurface’s IMFTrackedSample offering

The IMFTrackedSample interface is available/allowed in UWP applications. The interface is a useful one when one implements a pool of samples and needs a notification when a certain instance can be recycled.

Use this interface to determine whether it is safe to delete or re-use the buffer contained in a sample. One object assigns itself as the owner of the video sample by calling SetAllocator. When all objects release their reference counts on the sample, the owner’s callback method is invoked.

The notification is asynchronous, meaning that when a sample becomes available the notification is scheduled for delivery via a standard (for Media Foundation) IMFAsyncCallback::Invoke call. This is quite convenient.

When this method is called, the sample holds an additional reference count on itself. When every other object releases its reference counts on the sample, the sample invokes the pSampleAllocator callback method. To get a pointer to the sample, call IMFAsyncResult::GetObject on the asynchronous result object given to the callback’s IMFAsyncCallback::Invoke method.
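A minimal sketch of this handshake (CSamplePool, TrackSample and Recycle are hypothetical names of mine, not anything standard): the pool registers itself as the owner when it hands a sample out, and Invoke returns the sample once nobody else references it.

#include <evr.h>
#include <atlbase.h>

VOID CSamplePool::TrackSample(IMFSample* pSample)
{
    CComQIPtr<IMFTrackedSample> pTrackedSample(pSample);
    ATLASSERT(pTrackedSample);
    // The pool implements IMFAsyncCallback and assigns itself as the sample owner
    ATLVERIFY(SUCCEEDED(pTrackedSample->SetAllocator(this, nullptr)));
}

STDMETHODIMP CSamplePool::Invoke(IMFAsyncResult* pAsyncResult)
{
    // The recycled sample travels as the result object
    CComPtr<IUnknown> pUnknown;
    ATLVERIFY(SUCCEEDED(pAsyncResult->GetObject(&pUnknown)));
    CComQIPtr<IMFSample> pSample(pUnknown);
    Recycle(pSample); // No outside references are left, the sample can be reused
    return S_OK;
}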

I would not have mentioned this if it was that simple, would I?

One could start feeling the problems already while looking at the MSDN page:

Requirements

Minimum supported client – Windows Vista [desktop apps | UWP apps]

Minimum supported server – Windows Server 2008 [desktop apps | UWP apps]

Header – Evr.h

Library – Strmiids.lib

Oh really, Strmiids.lib?

So the problem is that even though the interface itself is whitelisted for UWP and is a Media Foundation interface by its nature, it is implemented along with the EVR and is effectively exposed to the public via the MFCreateVideoSampleFromSurface API. That is, the only API function that provides access to the UWP-friendly interface is a UWP-unfriendly function. Bummer.

It took me less than 300 lines of code to implement a video sample class with an IMFTrackedSample implementation that mimics the standard one (good bye, stock implementation!), so it is not difficult. However, it would be better if the OS implementation were nicely available in the first place.

Intel H.264 Video Encoder MFT is ignoring texture synchronization too

Some time ago I wrote about a bug in AMD’s H.264 Video Encoder MFT, where the implementation fails to synchronize access to a Direct3D 11 texture. It turns out Intel’s implementation has exactly the same problem: the Intel® Quick Sync Video H.264 Encoder MFT processes input textures/frames without acquiring synchronization and can lose actual content.

It is pretty hard to reproduce this problem because it is hardware dependent, and in most cases the data arrives in the texture before the encoder starts processing it, so the problem remains hidden. But on certain systems the bug comes up easily and produces a stuttering effect. Since input textures are pooled, when new data is late to arrive into a texture, the H.264 encoder encodes an old video frame, and the H.264 output is technically valid: it just produces a stuttering effect on playback because the wrong content was encoded.

For a Media Foundation API consumer it is not really easy to work around the problem because Media Foundation does not provide access to the data streamed between the primitives internally. A high-level application might not even be aware that the primitives are exchanging synchronization-enabled textures, so it is unclear where the source of the problem is.

Possible solutions to the problem (applicable or not depending on the specific case):

  1. do not use synchronization-enabled textures; do a copy from the properly synchronized texture into a new plain texture before feeding it into the encoder (see the sketch after this list); this might require an additional/special MFT inserted into the pipeline before the encoder
  2. implement a customized Media Session (Sink Writer) alike subsystem with control over the streamed data so that, in particular, one could synchronize (or duplicate) the data before it is fed to the encoder’s IMFTransform::ProcessInput
  3. avoid using vendor-supplied video encoder MFTs as buggy…
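A sketch of option 1 (the function name and the exact wiring are mine; the plain staging texture is assumed to be pre-created with matching format and dimensions): acquire the keyed mutex of the synchronization-enabled texture, copy the content out, and hand the copy to the encoder.

#include <d3d11.h>
#include <atlbase.h>

VOID CopyOutOfSynchronizedTexture(ID3D11DeviceContext* pDeviceContext,
    ID3D11Texture2D* pSynchronizedTexture, ID3D11Texture2D* pPlainTexture)
{
    CComQIPtr<IDXGIKeyedMutex> pKeyedMutex(pSynchronizedTexture);
    ATLASSERT(pKeyedMutex);
    // Wait until the producer releases the texture, so the new content is really there
    ATLVERIFY(SUCCEEDED(pKeyedMutex->AcquireSync(0, INFINITE)));
    pDeviceContext->CopyResource(pPlainTexture, pSynchronizedTexture);
    ATLVERIFY(SUCCEEDED(pKeyedMutex->ReleaseSync(0)));
    // pPlainTexture is now safe to wrap into IMFSample/IMFMediaBuffer and feed into
    // the encoder's IMFTransform::ProcessInput without risking stale content
}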