Microsoft Media Foundation code samples online

Media Foundation Team Blog (2009-2011) lost connection with the community some time ago, and its sample code hosted at http://code.msdn.microsoft.com/mfblog passed away too.

Four of the five samples were saved and made available again by user mofo77 (originally three, with both MFSimpleEncode and MFManagedEncode missing), and one of the two is still wanted.

I put the sample code online at GitHub here. If someone happens to have the missing project, please post there or email me to have it pushed to the repository. Feel free to use the samples if, for whatever reason, Media Foundation is what you decided to mess with.

Also, be aware that older Windows SDK Media Foundation samples can be found in:

As this is turning into a collection of Media Foundation related links, here we go with bonus reading:

Now you are well set, GOOD LUCK!

Effects of IMFVideoProcessorControl2::EnableHardwareEffects

IMFVideoProcessorControl2::EnableHardwareEffects method:

Enables effects that were implemented with IDirectXVideoProcessor::VideoProcessorBlt.

[…] Specifies whether effects are to be enabled. TRUE specifies to enable effects. FALSE specifies to disable effects.

All right, it is apparently not IDirectXVideoProcessor, and the MSDN link behind the identifier takes one to the correct Direct3D 11 API method: ID3D11VideoContext::VideoProcessorBlt.

The worse news is that with the effects “enabled”, the transform (the whole thing belongs to Media Foundation’s Swiss army knife of conversion [with just one blade and no scissors though], the Video Processing MFT) fails to deliver proper output and produces a green/black fill instead of the proper image.

Or, perhaps, this counts as a hardware effect.
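
For reference, a minimal sketch of the scenario (using the same error handling macros as the snippets below; media type setup and the actual ProcessInput/ProcessOutput loop are omitted):

// Video Processor MFT (CLSID_VideoProcessorMFT, mfidl.h) with "hardware effects" on
CComPtr<IMFTransform> pTransform;
ATLENSURE_SUCCEEDED(pTransform.CoCreateInstance(CLSID_VideoProcessorMFT));
CComPtr<IMFVideoProcessorControl2> pVideoProcessorControl2;
ATLENSURE_SUCCEEDED(pTransform->QueryInterface(&pVideoProcessorControl2));
ATLENSURE_SUCCEEDED(pVideoProcessorControl2->EnableHardwareEffects(TRUE));
// ...set input/output media types and process samples as usual; with the call
// above in place the output is a green/black fill rather than the image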

No conversion with MF_CONNECT_ALLOW_CONVERTER

Microsoft Media Foundation Media Session API topology resolution is far less transparent than DirectShow. The API takes over a part of the component connection process and makes it less visible to the API consumer. Then, while DirectShow Intelligent Connect is use-scenario agnostic, Media Foundation Media Session apparently targets playback scenarios, and its topology resolution process is tuned accordingly.

MF_CONNECT_ALLOW_DECODER

Add a decoder transform upstream upstream from this node, if needed to complete the connection. The numeric value of this flag includes the MF_CONNECT_ALLOW_CONVERTER flag. Therefore, setting the MF_CONNECT_ALLOW_DECODER flag sets the MF_CONNECT_ALLOW_CONVERTER flag as well.

[…] If this attribute is not set, the default value is MF_CONNECT_ALLOW_DECODER.

Well, that’s a double “upstream”, and it suggests that the thing is impressively reliable. However, it is not.

In a non-playback topology, if a direct connection is impossible and the required conversion is not typical for playback, the MF_CONNECT_ALLOW_CONVERTER flag does not seem to help topology resolution.
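
For illustration, this is roughly what the setup looks like (a sketch; pSinkNode stands for the downstream topology node being connected):

// IMFTopologyNode is an IMFAttributes; ask the topology loader to allow a
// converter, but not a decoder, when resolving the connection into this node
ATLENSURE_SUCCEEDED(pSinkNode->SetUINT32(MF_TOPONODE_CONNECT_METHOD, MF_CONNECT_ALLOW_CONVERTER));
// In a non-playback topology the Media Session still does not insert the
// converter when the required conversion is not a typical playback one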

Apparently, Microsoft does have suitable code, notably the one used in the Sink Writer API; however, it does not seem to be available in any form other than a package deal with the Sink Writer object and its own limitations. The Media Session API does not implement this (non-playback, that is) style of topology resolution and node connection. The Transcode API also has the necessary topology resolution code, but again it comes with its own constraints, making it useless unless you want to do something really simple.
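
For comparison, a rough sketch of the Sink Writer route, where the conversion chain is resolved internally (the output path and the pOutputMediaType/pInputMediaType variables are assumptions for the example):

// The Sink Writer matches the uncompressed input type against the compressed
// output type itself, inserting the necessary encoder/converter MFTs internally
CComPtr<IMFSinkWriter> pSinkWriter;
ATLENSURE_SUCCEEDED(MFCreateSinkWriterFromURL(L"output.mp4", nullptr, nullptr, &pSinkWriter));
DWORD nStreamIndex;
ATLENSURE_SUCCEEDED(pSinkWriter->AddStream(pOutputMediaType, &nStreamIndex)); // e.g. H.264
ATLENSURE_SUCCEEDED(pSinkWriter->SetInputMediaType(nStreamIndex, pInputMediaType, nullptr)); // e.g. NV12
ATLENSURE_SUCCEEDED(pSinkWriter->BeginWriting());
// ...WriteSample calls follow; the price is living within the Sink Writer's own constraints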

 

Reference Signal Source: audio as Media Foundation source

The reference signal source for DirectShow already received a Media Foundation source interface for its video part earlier.

This time, the update implements a separate Media Foundation source for audio. The MfGenerate2 sample code gives an idea of how to initialize the source:

using namespace AlaxInfoDirectShowReferenceSource;
CComPtr<IAudioMediaSource> pSource;
__C(pSource.CoCreateInstance(__uuidof(AudioMediaSource)));
__C(pSource->SetMediaType(NULL, g_nSampleRate, g_nChannelCount, g_nBitDepth));
__C(pSource->put_Duration((DOUBLE) g_nDuration));
CComPtr<IMFMediaSource> pAudioMediaSource = pSource;

The source can be given a specific format using the Media Foundation stream descriptor’s media type handler, or it can be set up via a private COM interface.

The source accepts a sampling rate, a channel count (all channels receive the same signal) and a bit depth for PCM audio formats (8, 16..32 bits), or the 32-bit IEEE floating point format.
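
The stream descriptor route looks approximately like this (a sketch, assuming the audio source behaves like any other IMFMediaSource with a single stream):

// Describe the desired PCM format...
CComPtr<IMFMediaType> pMediaType;
ATLENSURE_SUCCEEDED(MFCreateMediaType(&pMediaType));
ATLENSURE_SUCCEEDED(pMediaType->SetGUID(MF_MT_MAJOR_TYPE, MFMediaType_Audio));
ATLENSURE_SUCCEEDED(pMediaType->SetGUID(MF_MT_SUBTYPE, MFAudioFormat_PCM));
ATLENSURE_SUCCEEDED(pMediaType->SetUINT32(MF_MT_AUDIO_SAMPLES_PER_SECOND, 44100));
ATLENSURE_SUCCEEDED(pMediaType->SetUINT32(MF_MT_AUDIO_NUM_CHANNELS, 2));
ATLENSURE_SUCCEEDED(pMediaType->SetUINT32(MF_MT_AUDIO_BITS_PER_SAMPLE, 16));
// ...and push it through the stream descriptor's media type handler
CComPtr<IMFPresentationDescriptor> pPresentationDescriptor;
ATLENSURE_SUCCEEDED(pAudioMediaSource->CreatePresentationDescriptor(&pPresentationDescriptor));
BOOL bSelected;
CComPtr<IMFStreamDescriptor> pStreamDescriptor;
ATLENSURE_SUCCEEDED(pPresentationDescriptor->GetStreamDescriptorByIndex(0, &bSelected, &pStreamDescriptor));
CComPtr<IMFMediaTypeHandler> pMediaTypeHandler;
ATLENSURE_SUCCEEDED(pStreamDescriptor->GetMediaTypeHandler(&pMediaTypeHandler));
ATLENSURE_SUCCEEDED(pMediaTypeHandler->SetCurrentMediaType(pMediaType));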

Video and audio streams can also be combined into an aggregate source (video+audio) to produce multi-track output. The MfGenerate2 sample shows the approach as well:

__D(pVideoMediaSource || pAudioMediaSource, E_UNNAMED);
if(pVideoMediaSource && pAudioMediaSource)
{
    CComPtr<IMFCollection> pCollection;
    __C(MFCreateCollection(&pCollection));
    __C(pCollection->AddElement(pVideoMediaSource));
    __C(pCollection->AddElement(pAudioMediaSource));
    __C(MFCreateAggregateSource(pCollection, &pMediaSource));
} else
    pMediaSource = pVideoMediaSource ? pVideoMediaSource : pAudioMediaSource;
_A(pMediaSource);

The sample project is capable of producing output of this kind:

Download links

IMFAttributes::CopyAllItems freeze on copying to self

An attempt to copy a Media Foundation attribute collection to itself results in a deadlock. Well, it is neither a good idea nor a practical one to do nonsense like this, but the implementation should still be resistant to such use, especially to avoid an unexpected deadlock.

#include "stdafx.h"
#include <mfapi.h>

#pragma comment(lib, "mfplat.lib")
#pragma comment(lib, "mfuuid.lib")

int main()
{
    _ATLTRY
    {
        ATLENSURE_SUCCEEDED(MFStartup(MF_VERSION));
        CComPtr<IMFAttributes> pAttributes;
        ATLENSURE_SUCCEEDED(MFCreateAttributes(&pAttributes, 1));
        ATLENSURE_SUCCEEDED(pAttributes->SetGUID(MF_MT_SUBTYPE, GUID_NULL));
        ATLENSURE_SUCCEEDED(pAttributes->CopyAllItems(pAttributes)); // <<--- Freeze
    }
    _ATLCATCHALL()
    {
    }
    return 0;
}

The freeze takes place around SRW locks: presumably, the implementation locks the data for reading and then immediately attempts to lock the same object again for writing; since SRW locks are not re-entrant, the second acquisition on the same thread never succeeds.

    ntdll.dll!_NtWaitForAlertByThreadId@8()    Unknown
    ntdll.dll!RtlAcquireSRWLockExclusive()  Unknown
    mfplat.dll!CMFAttributesImpl<struct IMFAttributes,class CMFSRWLock>::DeleteAllItems(void)   Unknown
    mfplat.dll!CMFAttributesImpl<struct IMFAttributes,class CMFSRWLock>::_CloneAllAttributes(struct IMFAttributes *)    Unknown
    mfplat.dll!CMFAttributesImpl<struct IMFAttributes,class CMFSRWLock>::CopyAllItems(struct IMFAttributes *)   Unknown
>   MfSample01.exe!main() Line 15   C++
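
Until this is fixed, a trivial guard on the application side works around the problem (a sketch; CopyAllItemsSafe is a hypothetical helper, not an API function):

// Skip the copy when source and destination are the same COM object, which
// would otherwise deadlock inside mfplat.dll; the copy is a no-op anyway
inline HRESULT CopyAllItemsSafe(IMFAttributes* pSource, IMFAttributes* pDestination)
{
    ATLASSERT(pSource && pDestination);
    CComPtr<IMFAttributes> pSourceIdentity = pSource;
    if(pSourceIdentity.IsEqualObject(pDestination))
        return S_OK;
    return pSource->CopyAllItems(pDestination);
}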

Media Foundation’s MFT_MESSAGE_SET_D3D_MANAGER with Frame Rate Converter DSP

It might look weird why someone would even try Direct3D mode with a DSP, which is not supposed to be Direct3D aware, but still. I am omitting the part about how I got to such a scenario in the first place. The documentation says a few things about MFT_MESSAGE_SET_D3D_MANAGER:

  • This message applies only to video transforms. The client should not send this message unless the MFT returns TRUE for the MF_SA_D3D_AWARE attribute (MF_SA_D3D11_AWARE for Direct3D 11).
  • Do not send this message to an MFT with multiple outputs.
  • An MFT should support this message only if the MFT uses DirectX Video Acceleration for video processing or decoding.
  • If an MFT supports this message, it should also implement the IMFTransform::GetAttributes method and return the value TRUE…
  • If an MFT does not support this message, it should return E_NOTIMPL from ProcessMessage. This is an exception to the general rule that an MFT can return S_OK from any message it ignores.

The Frame Rate Converter DSP is a hybrid DMO/MFT, which basically means that the “legacy” DMO is upgraded to an MFT using a specialized wrapper. It is not supposed to be Direct3D aware, nor is it documented as such.

However, it could presumably normalize the frame rate of Direct3D aware samples by dropping or duplicating them as needed. It could easily be Direct3D aware since, in its simplest implementation, it does not need to touch the data at all. It is easy to see that the MFT satisfies the other conditions: it is a single-output video transform.

The MFT correctly and expectedly does not advertise itself as Direct3D aware. It does not have transform attributes.

However, it fails to comply with the documented behavior of returning E_NOTIMPL for the MFT_MESSAGE_SET_D3D_MANAGER message. The message is defined to be an exception to the general rule, but the DSP implementation seems to ignore that. The wrapper could possibly have been created even before the exception was introduced in the first place.

The DSP does not make an exception: it returns a success code as if it did handle the message, and then does not act as documented.
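
A quick probe illustrating the observation (a sketch; CLSID_CFrameRateConvertDmo is the DSP’s class identifier from wmcodecdsp.h, and MFCreateDXGIDeviceManager supplies a device manager to hand over):

// Frame Rate Converter DSP wrapped as an MFT (hybrid DMO/MFT)
CComPtr<IMFTransform> pTransform;
ATLENSURE_SUCCEEDED(pTransform.CoCreateInstance(CLSID_CFrameRateConvertDmo));
// No attribute store is exposed, hence no MF_SA_D3D11_AWARE advertised either
CComPtr<IMFAttributes> pAttributes;
const HRESULT nGetAttributesResult = pTransform->GetAttributes(&pAttributes);
// Per documentation a non-supporting MFT is expected to answer E_NOTIMPL to
// MFT_MESSAGE_SET_D3D_MANAGER, yet a success code comes back instead
UINT nResetToken = 0;
CComPtr<IMFDXGIDeviceManager> pDeviceManager;
ATLENSURE_SUCCEEDED(MFCreateDXGIDeviceManager(&nResetToken, &pDeviceManager));
const HRESULT nSetManagerResult = pTransform->ProcessMessage(MFT_MESSAGE_SET_D3D_MANAGER, reinterpret_cast<ULONG_PTR>(pDeviceManager.p));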

Intel Quick Sync Video Consumption by Applications

I wrote a few posts on hardware H.264 encoding (e.g. this one and the latest one, Applying Hardsubs to H.264 Video with GPU). A blog reader asked a question regarding availability of the mentioned Intel Quick Sync Video support with a low-end Intel x5-Z8300 CPU.

[…] Intel has advertised that the Cherry Trail CPUs support H264 encoding and / or QSV, but nowhere have I seen a demo of this being used […].
What did you use to encode the video? Is the QSV codec available in the x5-z8300 for possible 720p realtime encoding? I’d like to see this checked in regards to using software like FFmpeg with qsv_h264 -codec and OBS. […]

The picture below explains how applications consume Intel’s hardware video compression offering in Windows.

Intel QSV includes a hardware implementation of the encoder and the corresponding drivers, which provide a frontend API to software. This includes a component that integrates the codec with Microsoft’s Media Foundation API. Applications choose between interfacing with the codec through the Windows API – this is the way stock Microsoft applications work, and this is the route I used for the video encoding development mentioned on this blog – and interfacing through the Intel Media SDK, an alternate route that ends up at the same hardware-backed services.
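
On the Windows API side the hardware encoder is discoverable like any other MFT; a sketch of the enumeration (MFStartup and error handling of the friendly name query omitted):

// Enumerate hardware H.264 encoder MFTs; on QSV-enabled systems the Intel
// encoder is listed here without any extra Intel runtime or SDK installed
MFT_REGISTER_TYPE_INFO OutputType = { MFMediaType_Video, MFVideoFormat_H264 };
IMFActivate** ppActivates = nullptr;
UINT32 nActivateCount = 0;
ATLENSURE_SUCCEEDED(MFTEnumEx(MFT_CATEGORY_VIDEO_ENCODER, MFT_ENUM_FLAG_HARDWARE | MFT_ENUM_FLAG_SORTANDFILTER, nullptr, &OutputType, &ppActivates, &nActivateCount));
for(UINT32 nIndex = 0; nIndex < nActivateCount; nIndex++)
{
    // The friendly name reads e.g. "Intel® Quick Sync Video H.264 Encoder MFT"
    CComHeapPtr<WCHAR> pszFriendlyName;
    UINT32 nFriendlyNameLength = 0;
    ppActivates[nIndex]->GetAllocatedString(MFT_FRIENDLY_NAME_Attribute, &pszFriendlyName, &nFriendlyNameLength);
    ppActivates[nIndex]->Release();
}
CoTaskMemFree(ppActivates);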

The Intel x5-Z8300 system in question has H.264 video encoding support integrated right into the Windows API, and the services can be consumed without an additional Intel runtime and/or development kit. The codec, according to the benchmarks made earlier, is fast enough to handle real-time 720p video encoding even though the device is a budget thing.