H.265/HEVC Video Decoder issues in Windows 10 Fall Creators Update

Microsoft has released a new Windows 10 update, and once again there is a new tsunami approaching.

Media Foundation H.265 decoder: the MSDN article no longer carries the “preliminary documentation” notice. The decoder is a substandard piece of software, but it covers more or less what it has to. In particular, it does support H.265 video playback, including with DXVA2 where applicable. One could get the impression that the technology was maturing and, who knows, might even be brought up to at least the quality and feature set of the H.264 decoder.

To make life more interesting and challenging, Fall Creators Update introduces a breaking change. The changes are not yet documented, and they are important: the H.265 decoder is taken away. Luckily, not completely: the decoder (along with the encoder) is moved to a separate Windows Store download, HEVC Video Extension.

Even though it looks like the same decoder, it is packaged differently, and consumers might need to update their code to keep using the decoder.

Even though the above makes at least some sense, there is also an obvious bug on top of all this: the 32-bit version of the decoder is BROKEN.

The released variant of the decoder is sealed with a digital signature and enforces integrity checks. This works well with the 64-bit version of the software: in particular, the stock Movies and TV Player application can indeed play H.265 video on 64-bit Windows, because the 64-bit version of the application is used, which in turn pulls in the 64-bit version of the decoder. However, the 32-bit version of the DLL is broken and does not work, hence 32-bit applications relying on H.265 video decoding capabilities are going to stop working with Fall Creators Update.

The problem is apparently the integrity check: if you manage to remove it, the 32-bit version of the H.265/HEVC decoder is operational.

It will take some time for Microsoft to identify and fix the problem. It will be fixed though, so if it is important to keep 32-bit apps functional with respect to H.265 video decoding/playback, one should rather postpone Fall Creators Update.

Further reading:

Greetings from H.265 / HEVC Video Decoder Media Foundation Transform

The H.265 / HEVC Video Decoder Media Foundation Transform has been around for a while, but with Media Foundation overall, one step off the straightforward basic paths is like walking through a minefield.

The twenty-liner below hits a memory access violation inside IMFTransform::GetInputStreamInfo: “Exception thrown at 0x6D1E71E7 (hevcdecoder.dll) in MfHevcDecoder01.exe: 0xC0000005: Access violation reading location 0x00000000.”

#include "stdafx.h"
#include <mfapi.h>
#include <mftransform.h>
#include <wmcodecdsp.h>
#include <atlbase.h>
#include <atlcom.h>

#pragma comment(lib, "mfplat.lib")
#pragma comment(lib, "mfuuid.lib")
#pragma comment(lib, "wmcodecdspuuid.lib")

int main()
{
    ATLVERIFY(SUCCEEDED(CoInitialize(NULL)));
    ATLVERIFY(SUCCEEDED(MFStartup(MF_VERSION)));
    CComPtr<IMFTransform> pTransform;
#if 1
    // Path 1: enumerate video decoders that accept HEVC input and activate the first match
    static const MFT_REGISTER_TYPE_INFO InputTypeInformation = { MFMediaType_Video, MFVideoFormat_HEVC };
    IMFActivate** ppActivates;
    UINT32 nActivateCount = 0;
    ATLVERIFY(SUCCEEDED(MFTEnumEx(MFT_CATEGORY_VIDEO_DECODER, 0, &InputTypeInformation, NULL, &ppActivates, &nActivateCount)));
    ATLASSERT(nActivateCount > 0);
    ATLVERIFY(SUCCEEDED(ppActivates[0]->ActivateObject(__uuidof(IMFTransform), (VOID**) &pTransform)));
#else
    // Path 2: create the transform directly by CLSID, bypassing IMFActivate
    // (note that the CLSID name used here refers to the H.265 encoder MFT)
    ATLVERIFY(SUCCEEDED(pTransform.CoCreateInstance(CLSID_CMSH265EncoderMFT)));
#endif
    // The access violation is raised inside this call when the transform comes from path 1
    MFT_INPUT_STREAM_INFO InputInformation;
    ATLVERIFY(SUCCEEDED(pTransform->GetInputStreamInfo(0, &InputInformation)));
    return 0;
}

Interestingly, the alternative path that bypasses IMFActivate (see #if above) seems to work fine.

AMD started offering hardware H.265/HEVC video encoder for Media Foundation

It should be good news for those interested in hardware-assisted video encoding: AMD extends the offering in its new hardware and provides an H.265 encoder in the already well-known form factor, as a Microsoft Media Foundation Transform “AMDh265Encoder”:

# System

* Version: 10.0.14393, Windows 10, VER_SUITE_SINGLEUSERTS, VER_NT_WORKSTATION
* Product: PRODUCT_PROFESSIONAL

[…]

# Display Devices

* AMD Radeon (TM) RX 480
* Instance: PCI\VEN_1002&DEV_67DF&SUBSYS_0B371002&REV_C7\4&2D78AB8F&0&0008
* DEVPKEY_Device_Manufacturer: Advanced Micro Devices, Inc.
* DEVPKEY_Device_DriverVersion: 21.19.137.514

[…]

# Category `MFT_CATEGORY_VIDEO_ENCODER`

[…]

## AMDh265Encoder

15 Attributes:

* MFT_TRANSFORM_CLSID_Attribute: {5FD65104-A924-4835-AB71-09A223E3E37B} (Type VT_CLSID)
* MF_TRANSFORM_FLAGS_Attribute: MFT_ENUM_FLAG_HARDWARE
* MFT_ENUM_HARDWARE_VENDOR_ID_Attribute: VEN_1002 (Type VT_LPWSTR)
* MFT_ENUM_HARDWARE_URL_Attribute: AMDh265Encoder (Type VT_LPWSTR)
* MFT_INPUT_TYPES_Attributes: MFVideoFormat_NV12, MFVideoFormat_ARGB32
* MFT_OUTPUT_TYPES_Attributes: MFVideoFormat_HEVC
* MFT_CODEC_MERIT_Attribute: 8 (Type VT_UI4)
* MFT_SUPPORT_DYNAMIC_FORMAT_CHANGE: 1 (Type VT_UI4)
* MF_TRANSFORM_ASYNC: 1 (Type VT_UI4)
* MF_SA_D3D11_AWARE: 1 (Type VT_UI4)
* MF_SA_D3D_AWARE: 1 (Type VT_UI4)
* MF_TRANSFORM_ASYNC_UNLOCK: 0 (Type VT_UI4)
* MFT_GFX_DRIVER_VERSION_ID_Attribute: 1.2.3.4

This follows Intel’s H.265/HEVC hardware compression offering, which is also available in the MFT form factor:

## Intel® Hardware H265 Encoder MFT

12 Attributes:

* MFT_TRANSFORM_CLSID_Attribute: {BC10864D-2B34-408F-912A-102B1B867B6C} (Type VT_CLSID)
* MF_TRANSFORM_FLAGS_Attribute: MFT_ENUM_FLAG_HARDWARE
* MFT_ENUM_HARDWARE_VENDOR_ID_Attribute: VEN_8086 (Type VT_LPWSTR)
* MFT_ENUM_HARDWARE_URL_Attribute: AA243E5D-2F73-48c7-97F7-F6FA17651651 (Type VT_LPWSTR)
* MFT_INPUT_TYPES_Attributes: {3231564E-3961-42AE-BA67-FF47CCC13EED}, MFVideoFormat_NV12, MFVideoFormat_ARGB32
* MFT_OUTPUT_TYPES_Attributes: MFVideoFormat_HEVC
* MFT_CODEC_MERIT_Attribute: 7 (Type VT_UI4)
* MFT_SUPPORT_DYNAMIC_FORMAT_CHANGE: 1 (Type VT_UI4)
* MF_TRANSFORM_ASYNC: 1 (Type VT_UI4)
* MFT_GFX_DRIVER_VERSION_ID_Attribute: 0.0.0.3

Encoding H.264 video using hardware MFTs

Some time ago there were some pictures explaining performance and other properties of a software H.264 encoder (x264). This time it is the turn of hardware H.264 encoders and, moreover, two of them side by side. Neither encoder is new: Intel® Quick Sync Video H.264 Encoder and NVIDIA H.264 Encoder have already been around for a while. Some would say it is already time for H.265 encoders.

Either way, on my test machine both encoders are available without additionally installed software (that is, no need for Intel Media SDK, NVIDIA NVENC, redistributable files etc.). Out of the box, Windows 10 offers a stock software-only encoder, plus hardware encoders in the form factor of a Media Foundation Transform (MFT).

Environment:

  • OS: Windows 10 Pro
  • CPU: Intel i7-4790
  • Video Adapter 1: Intel HD Graphics 4600 (on-board, not connected to monitors)
  • Video Adapter 2: NVIDIA GeForce GTX 750

It is neither convenient nor fun to do things with Media Foundation, but the good news is that Media Foundation components are easily separable. A wrapper over MFTs converts them into DirectShow filters and makes them available to DirectShow, where it is much easier to run various tests. The pictures below show metrics for encoder defaults only (bitrate, profiles and many other options create a great number of encoding modes). Still, the pictures do show that both encoders are well usable for many scenarios, including HD processing, simultaneous data processing etc.

Video Encoder MFT Wrapper in GraphStudioNext

Test runs are as simple as taking a reference video source signal with varying properties, pushing it through the encoder filter, and either writing the output to a file (to inspect the footage) or sending it to a Null Renderer filter to measure performance.

Intel® Quick Sync Video H.264 Encoder produces files like these: 720×480.mp4, 2556×1440.mp4, which are of decent quality (considering the low bitrate and the “hard to handle” background changes). NVIDIA H.264 Encoder produces somewhat better output, supposedly by choosing a higher bitrate. Either way, both encoders offer a number of ways to fine-tune the encoding process: not just bitrate, profile, GOP length and B-frame settings, but even more sophisticated parameters.

Intel® Quick Sync Video H.264 Encoder MFT

CODECAPI_AVEncCommonRateControlMode: VT_UI4 0, default VT_UI4 0, modifiable // eAVEncCommonRateControlMode_CBR = 0
CODECAPI_AVEncCommonQuality: minimal VT_UI4 0, maximal VT_EMPTY, step VT_EMPTY
CODECAPI_AVEncCommonBufferSize: VT_UI4 3131961357, default VT_UI4 0, modifiable
CODECAPI_AVEncCommonMaxBitRate: default VT_UI4 0
CODECAPI_AVEncCommonMeanBitRate: VT_UI4 3131961357, default VT_UI4 2222000, modifiable
CODECAPI_AVEncCommonQualityVsSpeed: VT_UI4 50, default VT_UI4 50, modifiable
CODECAPI_AVEncH264CABACEnable: modifiable
CODECAPI_AVEncMPVDefaultBPictureCount: VT_UI4 0, default VT_UI4 0, modifiable
CODECAPI_AVEncMPVGOPSize: VT_UI4 128, default VT_UI4 128, modifiable
CODECAPI_AVEncVideoEncodeQP: 
CODECAPI_AVEncVideoForceKeyFrame: VT_UI4 0, default VT_UI4 0, modifiable
CODECAPI_AVLowLatencyMode: VT_BOOL 0, default VT_BOOL 0, modifiable
CODECAPI_AVEncVideoLTRBufferControl: VT_UI4 65536, values { VT_UI4 65536, VT_UI4 65537, VT_UI4 65538, VT_UI4 65539, VT_UI4 65540, VT_UI4 65541, VT_UI4 65542, VT_UI4 65543, VT_UI4 65544, VT_UI4 65545, VT_UI4 65546, VT_UI4 65547, VT_UI4 65548, VT_UI4 65549, VT_UI4 65550, VT_UI4 65551, VT_UI4 65552 }, modifiable
CODECAPI_AVEncVideoMarkLTRFrame: 
CODECAPI_AVEncVideoUseLTRFrame: 
CODECAPI_AVEncVideoEncodeFrameTypeQP: default VT_UI8 111670853658, minimal VT_UI8 0, maximal VT_UI8 219046674483, step VT_UI8 1
CODECAPI_AVEncSliceControlMode: VT_UI4 0, default VT_UI4 2, minimal VT_UI4 2, maximal VT_UI4 2, step VT_UI4 0, modifiable
CODECAPI_AVEncSliceControlSize: VT_UI4 0, minimal VT_UI4 0, maximal VT_UI4 8160, step VT_UI4 1, modifiable
CODECAPI_AVEncVideoMaxNumRefFrame: minimal VT_UI4 0, maximal VT_UI4 16, step VT_UI4 1, modifiable
CODECAPI_AVEncVideoTemporalLayerCount: default VT_UI4 1, minimal VT_UI4 1, maximal VT_UI4 3, step VT_UI4 1, modifiable
CODECAPI_AVEncMPVDefaultBPictureCount: VT_UI4 0, default VT_UI4 0, modifiable
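
A side note on the dump above: the value 3131961357 reported for CODECAPI_AVEncCommonBufferSize and CODECAPI_AVEncCommonMeanBitRate reads as 0xBAADF00D in hexadecimal. That bit pattern is commonly associated with uninitialized heap memory on Windows, so the number is possibly not a meaningful setting at all. This is a guess rather than documented encoder behavior. A minimal sketch of flagging such readings while dumping properties (the helper name LooksUninitialized is hypothetical):

```cpp
#include <cstdint>

// Hypothetical helper: true when a 32-bit property value equals 0xBAADF00D,
// a pattern often seen in uninitialized heap memory on Windows.
// Treating it as "no real value" is an assumption, not documented behavior.
constexpr bool LooksUninitialized(std::uint32_t value)
{
    return value == 0xBAADF00Du;
}

// 3131961357 is the CODECAPI_AVEncCommonBufferSize value from the Intel dump.
static_assert(LooksUninitialized(3131961357u), "3131961357 == 0xBAADF00D");
// 2222000 is the documented default mean bitrate from the same dump.
static_assert(!LooksUninitialized(2222000u), "a regular value is not flagged");
```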

NVIDIA H.264 Encoder MFT

CODECAPI_AVEncCommonRateControlMode: VT_UI4 0
CODECAPI_AVEncCommonQuality: VT_UI4 65
CODECAPI_AVEncCommonBufferSize: VT_UI4 8923353
CODECAPI_AVEncCommonMaxBitRate: VT_UI4 8923353
CODECAPI_AVEncCommonMeanBitRate: VT_UI4 2974451
CODECAPI_AVEncCommonQualityVsSpeed: VT_UI4 33
CODECAPI_AVEncH264CABACEnable: VT_BOOL -1
CODECAPI_AVEncMPVGOPSize: VT_UI4 50
CODECAPI_AVEncVideoEncodeQP: VT_UI8 26
CODECAPI_AVEncVideoForceKeyFrame: 
CODECAPI_AVEncVideoMinQP: VT_UI4 0, minimal VT_UI4 0, maximal VT_UI4 51, step VT_UI4 1
CODECAPI_AVLowLatencyMode: VT_BOOL 0
CODECAPI_AVEncVideoLTRBufferControl: VT_UI4 0, values { VT_I4 65537, VT_I4 65538 }
CODECAPI_AVEncVideoMarkLTRFrame: 
CODECAPI_AVEncVideoUseLTRFrame: 
CODECAPI_AVEncVideoEncodeFrameTypeQP: VT_UI8 111670853658
CODECAPI_AVEncSliceControlMode: VT_UI4 2, minimal VT_UI4 0, maximal VT_UI4 2, step VT_UI4 1
CODECAPI_AVEncSliceControlSize: VT_UI4 0, minimal VT_UI4 0, maximal VT_UI4 3, step VT_UI4 1
CODECAPI_AVEncVideoMaxNumRefFrame: VT_UI4 1, minimal VT_UI4 0, maximal VT_UI4 16, step VT_UI4 1
CODECAPI_AVEncVideoMeanAbsoluteDifference: VT_UI4 0
CODECAPI_AVEncVideoMaxQP: VT_UI4 51, minimal VT_UI4 0, maximal VT_UI4 51, step VT_UI4 1
CODECAPI_AVEncVideoROIEnabled: VT_UI4 0
CODECAPI_AVEncVideoTemporalLayerCount: minimal VT_UI4 1, maximal VT_UI4 3, step VT_UI4 1
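
The large CODECAPI_AVEncVideoEncodeFrameTypeQP numbers in both dumps become readable once split into 16-bit fields. The sketch below assumes the layout of I-frame QP in bits 0..15, P-frame QP in bits 16..31 and B-frame QP in bits 32..47; the struct and function names are made up for illustration.

```cpp
#include <cstdint>

// Per-frame-type quantization parameters taken from one 64-bit value.
struct FrameTypeQp
{
    std::uint16_t i; // bits 0..15, I frames (assumed layout)
    std::uint16_t p; // bits 16..31, P frames (assumed layout)
    std::uint16_t b; // bits 32..47, B frames (assumed layout)
};

// Splits the packed VT_UI8 value into its three 16-bit fields.
constexpr FrameTypeQp UnpackFrameTypeQp(std::uint64_t packed)
{
    return FrameTypeQp {
        static_cast<std::uint16_t>(packed & 0xFFFF),
        static_cast<std::uint16_t>((packed >> 16) & 0xFFFF),
        static_cast<std::uint16_t>((packed >> 32) & 0xFFFF)
    };
}

// Reverse operation: builds the 64-bit value from the three fields.
constexpr std::uint64_t PackFrameTypeQp(FrameTypeQp qp)
{
    return static_cast<std::uint64_t>(qp.i)
        | (static_cast<std::uint64_t>(qp.p) << 16)
        | (static_cast<std::uint64_t>(qp.b) << 32);
}
```

Under this assumption, 111670853658 (0x1A001A001A) unpacks to QP 26 for I, P and B frames alike, which agrees with the CODECAPI_AVEncVideoEncodeQP value of 26 in the NVIDIA dump, and the Intel maximum of 219046674483 unpacks to 51 for each frame type.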

An important property of a hardware encoder is that even though it does consume some CPU time, most of the complexity is offloaded to video hardware. In all single-stream test runs, the CPU (four cores, eight logical processors) was loaded not more than 30%, including the time required to synthesize the image using WIC and Direct2D and to convert it to YUV format on the CPU. That is, offloading video encoding to the GPU is a convenient way to free the CPU for real-time video processing applications.

I was mostly interested in how well the encoders can process real-time data, especially when they are used to record lengthy sessions. Both encoders appear to be fast enough to handle 1920×1080 HD video at frame rates of 60 and higher. The test encoded at the highest rate possible, and 100% on the charts corresponds to the situation where it took one second to synthesize and encode one second of video, regardless of the effective CPU/GPU load. That is, values below 100% indicate the ability to encode video content in real time.
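
The percentage used on the charts can be expressed as a small calculation. This is only a sketch of the metric as described above; the function names are invented here and the actual test harness is not shown in this post.

```cpp
#include <cassert>

// Share of real time spent producing the video, in percent:
// 100 means one second of processing per one second of video.
inline double RealTimeLoadPercent(double processingSeconds, double mediaSeconds)
{
    assert(mediaSeconds > 0.0);
    return 100.0 * processingSeconds / mediaSeconds;
}

// True when the measured pipeline keeps up with a live source.
inline bool CanEncodeInRealTime(double processingSeconds, double mediaSeconds)
{
    return RealTimeLoadPercent(processingSeconds, mediaSeconds) < 100.0;
}
```

For example, a run that needs 30 seconds to synthesize and encode 60 seconds of video scores 50%, leaving headroom for a second stream or for other processing.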

Intel and NVidia Hardware H.264 Encoders Side by Side

Basically, the numbers show that both encoders are fast enough to reliably encode a 1080p60 stream.

Looking at it from the standpoint of processing two or more H.264 encoding sessions at once, the encoder from NVIDIA has an important limitation of two sessions per system (there is a supposedly related thread; for this or another reason, a test run with three streams fails).

Intel and NVidia H.264 Encoders in Concurrent Encoding

Both encoders are hardly suitable for reliable encoding of two 1080p60 streams simultaneously (though perhaps some fine-tuning, such as choosing an appropriate encoding mode, might make things faster). However, both look fine for encoding a 1080p or lower-resolution stream. Clearly, Intel’s encoder can be used to encode multiple low-resolution streams in parallel, or to mix real-time encoding with background encoding (provided that the background encoding is throttled to let the real-time stream run fast enough). If real-time encoding is not necessary, both encoders can do the job as well; with NVIDIA the application needs to make sure that only two sessions are running simultaneously, while Intel’s encoder can be used in a more flexible way.

Also, NVIDIA’s encoder is slightly faster; however, Intel’s allows 3+ concurrently encoded streams and also accepts RGB input directly, without conversion to YUV.

There is also the Intel® Hardware H265 Encoder MFT available for H.265 encoding, but this is going to be another story some time later.
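
One way for an application to respect the two-session limit is to gate encoder creation behind a small counter. The class below is a hypothetical sketch, not part of any NVIDIA or Media Foundation API; the limit of two comes from the observation in these tests and may differ with other hardware or drivers.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>

// Caps the number of concurrently open encoder sessions.
class SessionLimiter
{
public:
    explicit SessionLimiter(std::size_t maxSessions) : m_max(maxSessions), m_active(0) {}

    // Reserves a slot; returns false when the limit is already reached.
    bool TryAcquire()
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        if (m_active >= m_max)
            return false;
        ++m_active;
        return true;
    }

    // Frees a slot previously reserved with TryAcquire.
    void Release()
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        assert(m_active > 0);
        --m_active;
    }

    // Number of slots currently reserved.
    std::size_t ActiveCount() const
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        return m_active;
    }

private:
    mutable std::mutex m_mutex;
    const std::size_t m_max;
    std::size_t m_active;
};
```

An application would call TryAcquire before creating an NVIDIA encoder instance and Release after shutting it down; when TryAcquire returns false, it can queue the job or route it to Intel’s encoder instead.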