Media Foundation Video/Audio Capture Capabilities

Just like with DirectShow video capture capability information, it is helpful to understand what exactly the Media Foundation video capture offering is. Specifically:

H.264 related attributes in media types might not be so obvious, and a device might report too many types.

  • Major Type: MFMediaType_Video
  • Compressed: 1
  • 25 Attributes
    • MF_MT_MAJOR_TYPE: MFMediaType_Video
    • MF_MT_SUBTYPE: MFVideoFormat_H264_ES
    • MF_MT_COMPRESSED: 1 (Type VT_UI4)
    • MF_MT_ALL_SAMPLES_INDEPENDENT: 0 (Type VT_UI4)
    • MF_MT_FIXED_SIZE_SAMPLES: 0 (Type VT_UI4)
    • MF_MT_FRAME_SIZE: 755914244240 (Type VT_UI8) // Width 176, Height 144
    • MF_MT_PIXEL_ASPECT_RATIO: 4294967297 (Type VT_UI8) // Numerator 1, Denominator 1
    • MF_MT_INTERLACE_MODE: 2 (Type VT_UI4) // MFVideoInterlace_Progressive
    • MF_MT_FRAME_RATE: 128849018881 (Type VT_UI8) // Numerator 30, Denominator 1
    • MF_MT_FRAME_RATE_RANGE_MIN: 128849018881 (Type VT_UI8) // Numerator 30, Denominator 1
    • MF_MT_FRAME_RATE_RANGE_MAX: 128849018881 (Type VT_UI8) // Numerator 30, Denominator 1
    • MF_MT_AVG_BITRATE: 6003500 (Type VT_UI4)
    • MF_MT_AM_FORMAT_TYPE: {2017BE05-6629-4248-AAED-7E1A47BC9B9C}
    • MF_MT_VIDEO_PROFILE: 257 (Type VT_UI4) // eAVEncH264VProfile_UCConstrainedHigh
    • MF_MT_VIDEO_LEVEL: 40 (Type VT_UI4) // eAVEncH264VLevel4
    • MF_MT_H264_MAX_MB_PER_SEC: F5 00 00 00 00 00 00 00 F5 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    • MF_MT_H264_SUPPORTED_USAGES: 3 (Type VT_UI4)
    • MF_MT_H264_SUPPORTED_RATE_CONTROL_MODES: 15 (Type VT_UI4)
    • MF_MT_H264_SUPPORTED_SYNC_FRAME_TYPES: 10 (Type VT_UI4)
    • MF_MT_H264_SIMULCAST_SUPPORT: 0 (Type VT_UI4)
    • MF_MT_H264_CAPABILITIES: 40 (Type VT_UI4)
    • MF_MT_H264_SUPPORTED_SLICE_MODES: 14 (Type VT_UI4)
    • MF_MT_H264_RESOLUTION_SCALING: 3 (Type VT_UI4)
    • MF_MT_H264_MAX_CODEC_CONFIG_DELAY: 1 (Type VT_UI4)
    • MF_MT_H264_SVC_CAPABILITIES: 1 (Type VT_UI4)
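
The 64-bit values in the printout above (frame size, pixel aspect ratio, frame rates) each pack two 32-bit numbers: the first in the high half, the second in the low half. Below is a minimal sketch of that packing in portable C++; the names PackedPair, UnpackPair and PackPair are made up for illustration and are not part of any SDK.

```cpp
#include <cstdint>

// Two 32-bit values stored in one 64-bit attribute value.
struct PackedPair {
    std::uint32_t high; // width, or ratio numerator
    std::uint32_t low;  // height, or ratio denominator
};

// Splits a 64-bit attribute value into its two halves.
constexpr PackedPair UnpackPair(std::uint64_t value) {
    return PackedPair{
        static_cast<std::uint32_t>(value >> 32),
        static_cast<std::uint32_t>(value & 0xFFFFFFFFu)
    };
}

// Reverse operation: builds the 64-bit value from two halves.
constexpr std::uint64_t PackPair(std::uint32_t high, std::uint32_t low) {
    return (static_cast<std::uint64_t>(high) << 32) | low;
}

// Values taken from the media type printout.
static_assert(UnpackPair(755914244240ULL).high == 176, "frame width");
static_assert(UnpackPair(755914244240ULL).low == 144, "frame height");
static_assert(UnpackPair(128849018881ULL).high == 30, "frame rate numerator");
static_assert(UnpackPair(4294967297ULL).low == 1, "aspect ratio denominator");
```

In real Media Foundation code there is no need to do this by hand: MFGetAttributeSize and MFGetAttributeRatio read such attributes and return the two halves directly.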

MediaFoundationCaptureCapabilities

CLSID_VideoInputDeviceCategory and Media Foundation

Media Foundation as a video capture API is inflexible. Besides the standard Media Foundation problems of backward compatibility, availability of developer tools and overall awkwardness, Microsoft decided to no longer offer video capture extensibility with Media Foundation. Be happy with MFEnumDeviceSources and don’t go anywhere else. They explain that they already provide support for devices backed by kernel streaming drivers:

Starting in Windows 7, Media Foundation automatically supports audio and video capture devices. For video, the device must provide a kernel streaming (KS) minidriver in the video capture category. Media Foundation uses the PnP path to enumerate the device. For audio, Media Foundation uses the Windows Multimedia Device (MMDevice) API to enumerate audio endpoint devices. If the device meets these criteria, there is no need to implement a custom media source.

The next paragraph there is sly:

However, you might want to implement a custom media source for some other type of device or other live data source. There are only a few differences between a live source and other media sources.

Indeed, you can implement a custom media source. However, you cannot implement a backing object (Media Foundation Transform, see below) that the standard media source would use, and you cannot make your own video source discoverable, so that it shows up as a new option in video capture enabled applications using Media Foundation.

Over the years developers have been eagerly interested in various aspects of video capture on the Windows platform, using VFW and then DirectShow. This specifically includes implementing a virtual camera device, for which Microsoft provided the Push Source Filters Sample, which was then extended into the popular VCam sample that “publishes” a video source device and makes it available to applications enumerating video capture hardware. The latest API, Media Foundation, blocked the opportunity to provide a custom video source.

The interesting thing though is that there is no fundamental problem in allowing such extensibility: just a few pieces are missing.

For starters, MFTEnum enumerates objects in, well, DirectShow’s CLSID_VideoInputDeviceCategory category. This is not documented, but this shows how tightly Media Foundation and DirectShow (and related kernel drivers) are connected.

Category: CLSID_VideoInputDeviceCategory {860BB310-5D01-11D0-BD3B-00A0C911CE86}

Logitech Webcam C930e #0
    MFT_ENUM_HARDWARE_URL_Attribute: \\?\usb#vid_046d&pid_0843&mi_00#6&2314864d&0&0000#{65e8773d-8f56-11d0-a3b9-00a0c9223196}\global (Type VT_LPWSTR)
    MFT_TRANSFORM_CLSID_Attribute: {8AC3587A-4AE7-42D8-99E0-0A6013EEF90F} (Type VT_CLSID)
    MFT_OUTPUT_TYPES_Attributes: 
        MFMediaType_Video MFVideoFormat_YUY2
        MFMediaType_Video MFVideoFormat_MJPG
    MF_TRANSFORM_FLAGS_Attribute: MFT_ENUM_FLAG_HARDWARE

Blackmagic WDM Capture #1
    MFT_ENUM_HARDWARE_URL_Attribute: \\?\decklink#avstream#5&2db0fd5&1&0000#{65e8773d-8f56-11d0-a3b9-00a0c9223196}\decklinkcapture1 (Type VT_LPWSTR)
    MFT_TRANSFORM_CLSID_Attribute: {8AC3587A-4AE7-42D8-99E0-0A6013EEF90F} (Type VT_CLSID)
    MFT_OUTPUT_TYPES_Attributes: 
        MFMediaType_Video MFVideoFormat_UYVY
        MFMediaType_Video MFVideoFormat_v210
        MFMediaType_Video FourCC HDYC
        MFMediaType_Audio MFAudioFormat_PCM
    MF_TRANSFORM_FLAGS_Attribute: MFT_ENUM_FLAG_HARDWARE

Any questions? What MFEnumDeviceSources API does is enumeration in this category, and building device COM objects on top of existing MFTs. Using MFT for video source is actually a smart move. This should have been of course done in DirectShow many years ago, and with DMOs instead of MFTs.

DirectX Media Objects (DMOs) got a compact and powerful form factor. Video and audio source implementation can be nicely put in “zero input one output” DMO and then used by standard objects on top of that. Similarly to DirectShow DMO Wrapper Filter but for source filters. This was never done in DirectShow, unfortunately. In Media Foundation DMOs got their obese brother class: Media Foundation Transform, which is pretty much the same, just bloated.

This time the Media Foundation guys implemented their base block, MFT, over video capture hardware items, which APIs like MFEnumDeviceSources and MFCreateDeviceSource pick up and use in their backyard.

Frontend code that activates a media source ends up enumerating formats right in the inner MFT, in its IMFTransform::GetOutputAvailableType, through the standard Media Foundation implementation of the video device source, mfcore’s CDeviceSource class.

MyTransform::GetOutputAvailableType(unsigned long nOutputStreamIdentifier, unsigned long nTypeIndex, IMFMediaType * * ppMediaType) Line 1033 C++
mfcore.dll!CDeviceSource::GetDeviceStreamType(unsigned long) Unknown
mfcore.dll!CDeviceSource::CreateStreams(void) Unknown
mfcore.dll!CDeviceSource::CDeviceSource(struct IMFTransform *,struct _GUID,struct IMFAttributes *,long *) Unknown
mfcore.dll!CDeviceSource::CreateInstance(struct IMFTransform *,struct _GUID,struct IMFAttributes *,struct IMFMediaSource * *) Unknown
mfcore.dll!MFCreateDeviceSource() Unknown

Capture of frames takes place on a work queue worker thread (RTWorkQ.dll in the stack below) via IMFTransform::ProcessOutput:

MyTransform::ProcessOutput(unsigned long nFlags, unsigned long nBufferCount, MFT_OUTPUT_DATA_BUFFER * pBuffers, unsigned long * pnStatus) Line 1281 C++
mfcore.dll!CDeviceSource::OnMFTEventReceived(struct IMFAsyncResult *) Unknown
mfcore.dll!CDeviceSource::OnMFTEventReceivedAsyncCallback::Invoke(struct IMFAsyncResult *) Unknown
RTWorkQ.dll!CSerialWorkQueue::QueueItem::ExecuteWorkItem(struct IMFAsyncResult *) Unknown
RTWorkQ.dll!CBaseWorkQueue::HandleConcurrentMMCSSEnter(class CRealTimeState *) Unknown
ntdll.dll!TppWorkpExecuteCallback() Unknown
ntdll.dll!TppWorkerThread() Unknown
kernel32.dll!BaseThreadInitThunk() Unknown
ntdll.dll!RtlUserThreadStart() Unknown

That is, the base building block for video capture in Media Foundation is MFT. Excellent! So do they allow registering your own MFT to provide the applications with a custom video device? Not really. The operation of CDeviceSource and Microsoft’s implementation for the MFT (“Device Proxy MFT”) is based on intimate assumptions between the two, and is not documented. When/if this goes public, we will start implementing virtual cameras the same way we did with good old DirectShow.

Not so good H.264 media type

MainConcept’s MP4 Demultiplexer in Annex B mode looks, well… slightly excessively broken.

MainConcept MP4 Demultiplexer Properties

  1. H.264 media type with start codes (H264 FourCC, but here they use the legacy subtype informally known as MEDIASUBTYPE_H264_bis) does not require parameter sets as a part of the MPEG2VIDEOINFO structure. If they nevertheless decided to provide the NAL units, they have to be length-prefixed, without start codes. MainConcept does it the Annex B way – not good.
  2. Zero BITMAPINFOHEADER::biSize?
  3. BITMAPINFOHEADER::biBitCount of 24 is hardly correct, but it is not fatal
  4. Additionally, they set up a memory allocator with the default buffer capacity of 64K and then stream larger samples…
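
Item 1 contrasts Annex B start codes with NAL units stored without them. Here is a minimal sketch of converting one into the other in portable C++, assuming a 2-byte big-endian length prefix per NAL unit (that prefix size is my assumption for illustration; check the H.264 media type documentation for the exact layout expected in the format block). The function name AnnexBToLengthPrefixed is made up.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Splits an Annex B byte stream on 00 00 01 / 00 00 00 01 start codes and
// re-emits every NAL unit with a 2-byte big-endian length prefix instead.
std::vector<std::uint8_t> AnnexBToLengthPrefixed(const std::vector<std::uint8_t>& input) {
    std::vector<std::uint8_t> output;
    const std::size_t size = input.size();
    std::size_t nalStart = size; // size means "no NAL unit started yet"

    auto emit = [&](std::size_t nalEnd) {
        // Zero bytes in front of a start code (including the first byte of
        // a 4-byte start code) do not belong to the NAL unit.
        while (nalEnd > nalStart && input[nalEnd - 1] == 0) --nalEnd;
        if (nalStart >= nalEnd) return;
        const std::size_t length = nalEnd - nalStart;
        if (length > 0xFFFF) throw std::length_error("NAL unit too long for a 2-byte prefix");
        output.push_back(static_cast<std::uint8_t>(length >> 8));
        output.push_back(static_cast<std::uint8_t>(length & 0xFF));
        output.insert(output.end(), input.begin() + nalStart, input.begin() + nalEnd);
    };

    for (std::size_t i = 0; i + 3 <= size; ) {
        if (input[i] == 0 && input[i + 1] == 0 && input[i + 2] == 1) {
            emit(i);          // close the previous NAL unit, if any
            nalStart = i + 3; // the next one starts right after the start code
            i += 3;
        } else {
            ++i;
        }
    }
    emit(size); // the last NAL unit runs to the end of the buffer
    return output;
}
```

Fed with a typical SPS/PPS pair in Annex B form, this yields the same parameter sets back to back, each preceded by its length and with no start codes left.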

Oh.

Needless to mention that this sort of connection simply has no chance to succeed:

Trying to Connect MainConcept MP4 Demultiplexer and Microsoft H.264 Decoder

Windows 10 AVI Splitter bug

There were a few reports that Windows 10 is unable to play AVI files, specifically ones that played fine in earlier versions of Windows.

OK, the problem does exist. Moreover, the problem is in the Windows component that implements the AVI Splitter DirectShow filter. One of the reporters mentioned he had a problem with a DV AVI file. I built one and it indeed showed the problem:

AVI Splitter bug in GraphStudioNext

Playback stops at the same frame every time the filter graph is run. The error is 0x8004020D VFW_E_BUFFER_OVERFLOW “The buffer is not big enough” coming from AVI Splitter’s worker thread. The buffers on the memory allocators look appropriate, so the bug looks related to AVI Splitter implementation details, CBaseMSRWorker class that reads from file and delivers frames downstream.
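
For those who prefer to read such codes by their parts, here is a minimal sketch of splitting an HRESULT into severity, facility and code in portable C++. The struct and function names are made up for illustration; on Windows the SDK macros HRESULT_FACILITY and HRESULT_CODE serve the same purpose.

```cpp
#include <cstdint>

struct HresultParts {
    bool failure;           // bit 31: severity, set for errors
    std::uint16_t facility; // bits 16..26: who defines the code
    std::uint16_t code;     // bits 0..15: the code itself
};

// Splits a raw 32-bit HRESULT value into its fields.
constexpr HresultParts SplitHresult(std::uint32_t hr) {
    return HresultParts{
        (hr >> 31) != 0,
        static_cast<std::uint16_t>((hr >> 16) & 0x7FF),
        static_cast<std::uint16_t>(hr & 0xFFFF)
    };
}

// The error reported by AVI Splitter's worker thread.
static_assert(SplitHresult(0x8004020Du).failure, "failure bit is set");
static_assert(SplitHresult(0x8004020Du).facility == 4, "FACILITY_ITF");
static_assert(SplitHresult(0x8004020Du).code == 0x020D, "code part");
```

Facility 4 is FACILITY_ITF, that is, the code is defined by the interface (here, DirectShow) and not by the system, which is why generic error lookup tools may not know the value.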

AVI Splitter bug call stack

The problem exists in 32 and 64 bit versions, but not in Media Foundation. With certain luck Microsoft will fix the problem on their side.

Blackmagic Design’s “Decklink Video Capture” filters

Pulling this out from Blackmagic Design Forum thread:

Generally, the recommended interface to the capture cards is the DeckLink API.

A DirectShow interface is available, but provides a subset of the functionality available from the complete DeckLink API.

Please note that the older, user-space DirectShow filters (DeckLink Video Capture) are deprecated in favour of the WDM filters (Blackmagic WDM Capture).

The WDM filters added support for 4K modes in Desktop Video 10.5+.

So the “Decklink Video Capture” filters that wrap the DeckLink SDK and provide convenient DirectShow interface are at their end of life.

Certainly, the most efficient and flexible way to interface Blackmagic Design hardware is to use their SDK (which is good and easy to use), however it does not give immediate connectivity to Windows APIs. User mode filters were a good wrapper and provided typical functionality for capture and playback. They had their own issues (e.g. no VideoInfo2 support: interlaced formats were treated as progressive, and there was no support for progressive formats that collide with interlaced ones), and some also reported the 64-bit versions to be not quite stable.

WDM filters have been around for some time; specifically, they offer a 32-bit audio capture option which the other filters did not have. From what I remember they are lacking other capabilities available through the SDK (update – e.g. no timecode support).

Apparently WDM filters do not offer a playback option via DirectShow. This is not even mentioning the unfortunate Media Foundation – even though “Blackmagic WDM Render” is somehow around and with a certain luck is listed through MFTEnum:

    Blackmagic WDM Render #3
        MFT_ENUM_HARDWARE_URL_Attribute: \\?\decklink#avstream#5&2db0fd5&1&0000#{65e8773e-8f56-11d0-a3b9-00a0c9223196}\decklinkrender1 (Type VT_LPWSTR)
        MFT_INPUT_TYPES_Attributes: 
            MFMediaType_Video MFVideoFormat_UYVY
            MFMediaType_Video MFVideoFormat_v210
            MFMediaType_Video MFVideoFormat_UYVY
            MFMediaType_Video MFVideoFormat_v210
            MFMediaType_Video MFVideoFormat_UYVY
            MFMediaType_Video MFVideoFormat_v210
            MFMediaType_Video MFVideoFormat_UYVY
            MFMediaType_Video MFVideoFormat_v210
            MFMediaType_Video MFVideoFormat_UYVY
            MFMediaType_Video MFVideoFormat_v210
            MFMediaType_Video FourCC HDYC
            MFMediaType_Video FourCC HDYC
            MFMediaType_Video FourCC HDYC
            MFMediaType_Video FourCC HDYC
            MFMediaType_Video FourCC HDYC
            MFMediaType_Video FourCC HDYC
            MFMediaType_Video FourCC HDYC
            MFMediaType_Video FourCC HDYC
            MFMediaType_Audio MFAudioFormat_PCM
            MFMediaType_Audio MFAudioFormat_PCM
            
        MFT_TRANSFORM_CLSID_Attribute: {8AC3587A-4AE7-42D8-99E0-0A6013EEF90F} (Type VT_CLSID)
        MF_TRANSFORM_FLAGS_Attribute: MFT_ENUM_FLAG_HARDWARE

For more or less serious DirectShow development the best way was and is to wrap the DeckLink SDK with a custom filter and have all options that SDK provides.

Logitech C930e camera and Media Foundation

Logitech’s C930e camera is the first one to be compliant with UVC 1.5 specification:

First 1080p HD webcam to support H.264 with Scalable Video Coding and UVC 1.5 encoding technology. […] The result is a smoother video stream in applications like Skype for Business and Microsoft® Lync® 2013.

More marketing information is there at Logitech. More interesting is what the new capabilities look like from the API side, programmatically. In addition to the well known Motion JPEG (FourCC MJPG) and YUY2 video, the camera delivers H.264 (FourCC H264) video.

Logitech C930e Webcam

Lync (Skype for Business) is presumably modified to accept that and it communicates to the camera using Media Foundation API.

The camera’s H.264 capabilities are accessible using both APIs, DirectShow and Media Foundation, and there is apparently a mess with driver versions and operating system versions as well. The best results are achieved with stock driver from Microsoft (without installing Logitech driver, this information is in good standing: “The only way I was able to get that stream under Windows 8.x was by NOT USING LOGITECH DRIVERS. This is a UVC 1.5 compatible camera and it will be configured automatically by the OS. With that driver (from Microsoft), use pin 1 (not 0) and you will get a ton of H264 formats.”).

A printout of DirectShow capabilities using DirectShowCaptureCapabilities is available here (note KS_H264VIDEOINFO structure). This time it is about what it looks like when one’s doing Media Foundation.

As a Media Source, exposed are a few attributes and a great deal of media types (216 + 476), greater amount compared to DirectShow as it seems:

    • MF_DEVSOURCE_ATTRIBUTE_MEDIA_TYPE: 76 69 64 73 00 00 10 00 80 00 00 AA 00 38 9B 71 59 55 59 32 00 00 10 00 80 00 00 AA 00 38 9B 71
    • MF_DEVSOURCE_ATTRIBUTE_SOURCE_TYPE_VIDCAP_SYMBOLIC_LINK: \\?\usb#vid_046d&pid_0843&mi_00#6&2314864d&0&0000#{e5323777-f976-4f5b-9b55-b94699c46e44}\global (Type `VT_LPWSTR`)
    • MF_DEVSOURCE_ATTRIBUTE_FRIENDLY_NAME: Logitech Webcam C930e (Type `VT_LPWSTR`)
    • MF_DEVSOURCE_ATTRIBUTE_SOURCE_TYPE_VIDCAP_CATEGORY: KSCATEGORY_VIDEO_CAMERA
    • MF_DEVSOURCE_ATTRIBUTE_SOURCE_TYPE: MF_DEVSOURCE_ATTRIBUTE_SOURCE_TYPE_VIDCAP_GUID
    • MF_DEVSOURCE_ATTRIBUTE_SOURCE_TYPE_VIDCAP_HW_SOURCE: 4 (Type `VT_UI4`)
  • Characteristics: MFMEDIASOURCE_IS_LIVE | MFMEDIASOURCE_CAN_PAUSE
  • Stream 0: Default Selected, Identifier 0x0, Major Type MFMediaType_Video, 216 Media Types
  • Stream 1: Identifier 0x1, Major Type MFMediaType_Video, 476 Media Types
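
The MF_DEVSOURCE_ATTRIBUTE_MEDIA_TYPE bytes above are readable once you notice they are two GUIDs of the FourCC-based family XXXXXXXX-0000-0010-8000-00AA00389B71. A minimal sketch of decoding them in portable C++, assuming the blob is simply a sequence of 16-byte GUIDs in their in-memory little-endian layout (major type, then subtype); the helper name FourCcsFromGuidBlob is made up for illustration.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Last 12 bytes shared by FourCC-based GUIDs, as they are laid out in memory.
static const std::array<std::uint8_t, 12> kFourCcGuidTail = {
    0x00, 0x00, 0x10, 0x00, 0x80, 0x00, 0x00, 0xAA, 0x00, 0x38, 0x9B, 0x71
};

// Returns the FourCC of every 16-byte GUID in the blob; an empty string
// marks a GUID that does not belong to the FourCC-based family.
std::vector<std::string> FourCcsFromGuidBlob(const std::vector<std::uint8_t>& blob) {
    std::vector<std::string> result;
    for (std::size_t offset = 0; offset + 16 <= blob.size(); offset += 16) {
        bool tailMatches = true;
        for (std::size_t i = 0; i < kFourCcGuidTail.size(); ++i) {
            if (blob[offset + 4 + i] != kFourCcGuidTail[i]) {
                tailMatches = false;
                break;
            }
        }
        std::string fourCc;
        if (tailMatches) {
            // The first four bytes are the FourCC characters in order.
            fourCc.assign(blob.begin() + offset, blob.begin() + offset + 4);
        }
        result.push_back(fourCc);
    }
    return result;
}
```

For the bytes in the listing the first GUID reads "vids", which is MFMediaType_Video, and the second reads "YUY2", which is MFVideoFormat_YUY2.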

The H.264 formats are marked with subtypes of MFVideoFormat_H264 and MFVideoFormat_H264_ES. A raw print out is downloadable:

Specifically, it is interesting what attributes are there, since with Media Foundation it is a tricky thing to find out quickly. The keys/identifiers are listed below.

Common

  • MF_MT_ALL_SAMPLES_INDEPENDENT
  • MF_MT_AM_FORMAT_TYPE
  • MF_MT_AVG_BITRATE
  • MF_MT_FIXED_SIZE_SAMPLES
  • MF_MT_FRAME_RATE
  • MF_MT_FRAME_RATE_RANGE_MAX
  • MF_MT_FRAME_RATE_RANGE_MIN
  • MF_MT_FRAME_SIZE
  • MF_MT_INTERLACE_MODE
  • MF_MT_MAJOR_TYPE
  • MF_MT_PIXEL_ASPECT_RATIO
  • MF_MT_SUBTYPE

MFVideoFormat_H264, MFVideoFormat_H264_ES

  • MF_MT_COMPRESSED
  • MF_MT_H264_CAPABILITIES
  • MF_MT_H264_MAX_CODEC_CONFIG_DELAY
  • MF_MT_H264_MAX_MB_PER_SEC
  • MF_MT_H264_RESOLUTION_SCALING
  • MF_MT_H264_SIMULCAST_SUPPORT
  • MF_MT_H264_SUPPORTED_RATE_CONTROL_MODES
  • MF_MT_H264_SUPPORTED_SLICE_MODES
  • MF_MT_H264_SUPPORTED_SYNC_FRAME_TYPES
  • MF_MT_H264_SUPPORTED_USAGES
  • MF_MT_H264_SVC_CAPABILITIES
  • MF_MT_VIDEO_LEVEL
  • MF_MT_VIDEO_PROFILE

MFVideoFormat_MJPG

  • MF_MT_SAMPLE_SIZE
  • MF_MT_VIDEO_CHROMA_SITING
  • MF_MT_VIDEO_LIGHTING
  • MF_MT_VIDEO_NOMINAL_RANGE
  • MF_MT_VIDEO_PRIMARIES
  • MF_MT_YUV_MATRIX

MFVideoFormat_YUY2

  • MF_MT_DEFAULT_STRIDE
  • MF_MT_SAMPLE_SIZE
  • MF_MT_VIDEO_CHROMA_SITING
  • MF_MT_VIDEO_LIGHTING
  • MF_MT_VIDEO_NOMINAL_RANGE
  • MF_MT_VIDEO_PRIMARIES
  • MF_MT_YUV_MATRIX

Registration-Free COM dependencies and COM reference isolation

Visual Studio offers COM reference isolation to applications, so that a COM dependency is used in the usual way while there is no need to register it. Another copy of the COM server might be registered system wide, or per user, and the application would still prefer its local copy of the COM server.

Isolated Property

The advantage is obvious: no more COM registration hell, and the application can be distributed with a lower risk of conflicts with other installed software and without the risk of affecting other applications by registering an unwanted piece of software. Also, the COM dependency can be used without the elevated privileges needed to perform COM registration.

The feature is using reg-free COM and is not new. Articles on internet on using the feature date back to 2007 and earlier (e.g. Isolated COM), reg-free COM existed earlier. The feature is cool and offers a one click access to an incredibly powerful option with complicated technology underneath.

Problem 1: It is fall 2015 today and Visual Studio 2013 still does not have this – as complicated as Enabled/Disabled option – working right.

Once enabled, the option has the following effect on the project:

  1. the manifest file is detached from the binary and is written to external file (Client.exe + Client.exe.manifest as opposed to Client.exe with manifest embedded as resource)
  2. the manifest receives assembly/file elements that establish a registration-free link to COM dependency; the content of the element is repeating registration of the COM dependency normally written into registry system wide
  3. the compiler uses RegOverridePredefKey and friends API to check COM dependency registration keys and update the manifest file (see previous item) respectively

Apparently the COM server has to be registered at compile time, so that compiler could convert the registration into manifest file. For whatever reason, Visual Studio 2013 looks for 32-bit COM server when it is doing 64-bit build. That is, building x64 configuration with x64 COM server registered and supposed to be used further fails if you don’t have a similar Win32 COM server registered. Bummer.

This simple solution ComIsolation01 (Trac, SVN) has two projects: C++ COM server with 32 and 64 bit configurations, and C# client consumer. A build of Debug/Release x64 configuration successfully builds Server.dll, registers it, then attempts to build Client.exe and fails:

2>C:\Program Files (x86)\MSBuild\12.0\bin\Microsoft.Common.CurrentVersion.targets(2234,5): warning MSB3284: Cannot get the file path for type library “ae2714e3-e8be-44c7-b737-5510e5f8abed” version 1.0. Library not registered. (Exception from HRESULT: 0x8002801D (TYPE_E_LIBNOTREGISTERED))
2>D:\Projects\Alax.Info\Repository-Public\Utilities\Miscellaneous\ComIsolation01\Client\Program.cs(14,13,14,22): error CS0246: The type or namespace name ‘ServerLib’ could not be found (are you missing a using directive or an assembly reference?)
2>D:\Projects\Alax.Info\Repository-Public\Utilities\Miscellaneous\ComIsolation01\Client\Program.cs(14,51,14,60): error CS0246: The type or namespace name ‘ServerLib’ could not be found (are you missing a using directive or an assembly reference?)

TYPE_E_LIBNOTREGISTERED, really? Because it looks for 32-bit type library and there is only 64-bit one registered. Build Win32 configuration once, and x64 builds are fixed. In other aspects, x64 build of Server.dll is correctly picked up.

Problem 2: Inflexible. The only COM reference isolation offered is a link with size and hash specification of the dependency.

Strict dependency check in manifest file

Why on earth? Okay it might be good for some people, perhaps. The only scenario I want to ever use is a link without checks for whether dependency is exactly as at build time. It is already isolated and the isolated file will be picked up. I would like to retain an option to patch it quickly by simply substituting a new file there, without an annoying need to patch manifest respectively. I don’t have an option like this.

Another post soon will show a solution for the problem, as well as easy way to apply isolation to C++ clients as well.