AMD H.264 Video Encoder MFT buggy processing of synchronization-enabled input textures

Even though the AMD H.264 Video Encoder Media Foundation Transform (MFT), AKA AMDh264Encoder, is generally a reasonably well made piece of software, it still has a few awkward bugs worth mentioning. This time I am going to show one of them: the video encoder transform fails to acquire synchronization on input textures.

The problem comes up when keyed mutex aware textures arrive at the input of the transform. That is, the Media Foundation samples carry textures created with the D3D11_RESOURCE_MISC_SHARED_KEYEDMUTEX flag, which MSDN describes this way:

[…] You can retrieve a pointer to the IDXGIKeyedMutex interface from the resource by using IUnknown::QueryInterface. The IDXGIKeyedMutex interface implements the IDXGIKeyedMutex::AcquireSync and IDXGIKeyedMutex::ReleaseSync APIs to synchronize access to the surface. The device that creates the surface, and any other device that opens the surface by using OpenSharedResource, must call IDXGIKeyedMutex::AcquireSync before they issue any rendering commands to the surface. When those devices finish rendering, they must call IDXGIKeyedMutex::ReleaseSync. […]
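In other words, every writer and reader of such a surface is expected to bracket its access with AcquireSync/ReleaseSync. A minimal sketch of the documented protocol follows; the function name and structure are mine for illustration, not taken from the transform, and error handling is reduced to asserts:

```cpp
// Windows-only sketch of the keyed mutex protocol around a shared texture
#include <cassert>
#include <atlbase.h>
#include <d3d11.h>

void RenderIntoSharedTexture(ID3D11Texture2D* pTexture)
{
	// IDXGIKeyedMutex is available because the texture was created with
	// D3D11_RESOURCE_MISC_SHARED_KEYEDMUTEX
	CComQIPtr<IDXGIKeyedMutex> const pKeyedMutex = pTexture;
	assert(pKeyedMutex);
	// Acquire before issuing any commands against the surface...
	const HRESULT nAcquireResult = pKeyedMutex->AcquireSync(0, INFINITE);
	assert(nAcquireResult == S_OK);
	// ... rendering or copy commands involving pTexture go here ...
	// ... and release when done, so that other devices can acquire in turn
	pKeyedMutex->ReleaseSync(0);
}
```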

A video encoder MFT is supposed to pay attention to this flag and acquire synchronization before it takes a video frame to encoding. The AMD implementation fails to do so, and this is a bug, a pretty important one, and it has been around for a while.

The following code snippet (see also text at the bottom of the post) demonstrates the incorrect behavior of the transform.

Execution reaches the breakpoint position and produces an H.264 sample even though the input texture fed into the transform is made inaccessible by the AcquireSync call in line 104.

By contrast, Microsoft’s H.264 Video Encoder implementation, AKA CLSID_MSH264EncoderMFT, implements the correct behavior and triggers a DXGI_ERROR_INVALID_CALL (0x887A0001) failure in line 112.

While preparing the SSCCE above and writing this blog post I hit another AMD MFT bug, which is perhaps less important but still shows inaccuracy in the internal implementation.

An attempt to send the MFT_MESSAGE_NOTIFY_START_OF_STREAM message in line 96 above without input and output media types set triggers a memory access violation:

‘Application.exe’ (Win32): Loaded ‘C:\Windows\System32\DriverStore\FileRepository\c0334550.inf_amd64_cd83b792de8abee9\B334365\atiumd6a.dll’. Symbol loading disabled by Include/Exclude setting.
‘Application.exe’ (Win32): Loaded ‘C:\Windows\System32\DriverStore\FileRepository\c0334550.inf_amd64_cd83b792de8abee9\B334365\atiumd6t.dll’. Symbol loading disabled by Include/Exclude setting.
‘Application.exe’ (Win32): Loaded ‘C:\Windows\System32\DriverStore\FileRepository\c0334550.inf_amd64_cd83b792de8abee9\B334365\amduve64.dll’. Symbol loading disabled by Include/Exclude setting.
Exception thrown at 0x00007FF81FC0E24B (AMDh264Enc64.dll) in Application.exe: 0xC0000005: Access violation reading location 0x0000000000000000.

Here is the code snippet from the screenshot above, in text form:

#pragma region DXGI Adapter
DXGI::CFactory2 pFactory2;
pFactory2.DebugCreate();
DXGI::CAdapter1 pAdapter1;
__C(pFactory2->EnumAdapters1(0, &pAdapter1));
#pragma endregion 
#pragma region D3D11 Device
CComPtr<ID3D11Device> pDevice;
CComPtr<ID3D11DeviceContext> pDeviceContext;
UINT nFlags = 0;
#if defined(_DEBUG)
	nFlags |= D3D11_CREATE_DEVICE_DEBUG;
#endif // defined(_DEBUG)
static const D3D_FEATURE_LEVEL g_pFeatureLevels[] = 
{
	D3D_FEATURE_LEVEL_12_1,
	D3D_FEATURE_LEVEL_12_0,
	D3D_FEATURE_LEVEL_11_1,
	D3D_FEATURE_LEVEL_11_0,
	D3D_FEATURE_LEVEL_10_1,
	D3D_FEATURE_LEVEL_10_0,
	D3D_FEATURE_LEVEL_9_3,
	D3D_FEATURE_LEVEL_9_2,
	D3D_FEATURE_LEVEL_9_1,
};
D3D_FEATURE_LEVEL FeatureLevel;
__C(D3D11CreateDevice(pAdapter1, D3D_DRIVER_TYPE_UNKNOWN, NULL, nFlags, g_pFeatureLevels, DIM(g_pFeatureLevels), D3D11_SDK_VERSION, &pDevice, &FeatureLevel, &pDeviceContext));
const CComQIPtr<ID3D11Multithread> pMultithread = pDevice;
__D(pMultithread, E_NOINTERFACE);
pMultithread->SetMultithreadProtected(TRUE);
#pragma endregion 
MF::CStartup Startup;
MF::CDxgiDeviceManager pDeviceManager;
pDeviceManager.Create();
pDeviceManager.Reset(pDevice);
MF::CTransform pTransform;
#if TRUE
	__C(pTransform.m_p.CoCreateInstance(__uuidof(AmdH264Encoder)));
	{
		MF::CAttributes pAttributes;
		__C(pTransform->GetAttributes(&pAttributes));
		_A(pAttributes);
		_A(pAttributes.GetUINT32(MF_TRANSFORM_ASYNC));
		pAttributes[MF_TRANSFORM_ASYNC_UNLOCK] = (UINT32) 1;
	}
	_W(pTransform.ProcessSetD3dManagerMessage(pDeviceManager));
#else
	__C(pTransform.m_p.CoCreateInstance(CLSID_MSH264EncoderMFT));
#endif
static const UINT32 g_nRateNumerator = 50, g_nRateDenominator = 1;
#pragma region Media Type
MF::CMediaType pInputMediaType;
pInputMediaType.Create();
pInputMediaType[MF_MT_MAJOR_TYPE] = MFMediaType_Video;
pInputMediaType[MF_MT_SUBTYPE] = MFVideoFormat_NV12;
pInputMediaType[MF_MT_ALL_SAMPLES_INDEPENDENT] = (UINT32) 1;
pInputMediaType[MF_MT_FRAME_SIZE].SetSize(1280, 720);
pInputMediaType[MF_MT_INTERLACE_MODE] = (UINT32) MFVideoInterlace_Progressive;
pInputMediaType[MF_MT_FRAME_RATE].SetRatio(g_nRateNumerator, g_nRateDenominator);
pInputMediaType[MF_MT_FIXED_SIZE_SAMPLES] = (UINT32) 1;
MF::CMediaType pOutputMediaType;
pOutputMediaType.Create();
pOutputMediaType[MF_MT_MAJOR_TYPE] = MFMediaType_Video;
pOutputMediaType[MF_MT_SUBTYPE] = MFVideoFormat_H264;
pOutputMediaType.CopyFrom(pInputMediaType, MF_MT_FRAME_SIZE);
pOutputMediaType.CopyFrom(pInputMediaType, MF_MT_INTERLACE_MODE);
pOutputMediaType.CopyFrom(pInputMediaType, MF_MT_FRAME_RATE);
pOutputMediaType[MF_MT_AVG_BITRATE] = (UINT32) 1000 * 1000;
pTransform.SetOutputType(pOutputMediaType);
pTransform.SetInputType(pInputMediaType);
#pragma endregion
_W(pTransform.ProcessStartOfStreamNotifyMessage());
CD3D11_TEXTURE2D_DESC TextureDescription(DXGI_FORMAT_NV12, 1280, 720);
TextureDescription.MipLevels = 1;
TextureDescription.MiscFlags |= D3D11_RESOURCE_MISC_SHARED_KEYEDMUTEX;
CComPtr<ID3D11Texture2D> pTexture;
__C(pDevice->CreateTexture2D(&TextureDescription, NULL, &pTexture));
_A(D3D11::IsKeyedMutexAware(pTexture));
DXGI::CKeyedMutexLock KeyedMutexLock(pTexture);
_W(KeyedMutexLock.AcquireSync(0));
for(UINT nIndex = 0; nIndex < 20; nIndex++)
{
	MF::CSample pSample;
	pSample.Create();
	pSample.AddTextureBuffer(pTexture);
	pSample->SetSampleTime(MFllMulDiv(nIndex * 1000 * 10000i64, g_nRateDenominator, g_nRateNumerator, 0));
	pSample->SetSampleDuration(MFllMulDiv(1 * 1000 * 10000i64, g_nRateDenominator, g_nRateNumerator, 0));
	__C(pTransform->ProcessInput(0, pSample, 0));
	_A(pTransform.GetOutputStreamInformation().dwFlags & MFT_OUTPUT_STREAM_PROVIDES_SAMPLES);
	MFT_OUTPUT_DATA_BUFFER OutputDataBuffer = { };
	DWORD nStatus;
	if(SUCCEEDED(pTransform->ProcessOutput(0, 1, &OutputDataBuffer, &nStatus)))
	{
		_A(OutputDataBuffer.pSample);
		reinterpret_cast<MF::CSample&>(OutputDataBuffer.pSample).Trace();
		break;
	}
}
