{"id":1886,"date":"2018-10-27T17:10:08","date_gmt":"2018-10-27T15:10:08","guid":{"rendered":"https:\/\/alax.info\/blog\/?p=1886"},"modified":"2018-10-27T17:15:45","modified_gmt":"2018-10-27T15:15:45","slug":"amd-h-264-video-encoder-mft-buggy-processing-of-synchronization-enabled-input-textures","status":"publish","type":"post","link":"https:\/\/alax.info\/blog\/1886","title":{"rendered":"AMD H.264 Video Encoder MFT buggy processing of synchronization-enabled input textures"},"content":{"rendered":"<p>Even though the AMD H.264 Video Encoder Media Foundation Transform (MFT), AKA <code>AMDh264Encoder<\/code>, is generally a reasonably well-made piece of software, it still has a few awkward bugs worth mentioning. This time I am going to show one of them: the video encoder transform fails to acquire synchronization on input textures.<\/p>\n<p>The problem comes up when keyed mutex aware textures arrive at the input of the transform. The Media Foundation samples carry textures created with the <code>D3D11_RESOURCE_MISC_SHARED_KEYEDMUTEX<\/code> flag, which MSDN <a href=\"https:\/\/docs.microsoft.com\/en-us\/windows\/desktop\/api\/d3d11\/ne-d3d11-d3d11_resource_misc_flag\">describes this way<\/a>:<\/p>\n<blockquote><p>[&#8230;] You can retrieve a pointer to the <code>IDXGIKeyedMutex<\/code> interface from the resource by using <a href=\"https:\/\/msdn.microsoft.com\/en-us\/library\/ms682521(v=VS.85).aspx\"><code>IUnknown::QueryInterface<\/code><\/a>. The <code>IDXGIKeyedMutex<\/code> interface implements the <a href=\"https:\/\/msdn.microsoft.com\/en-us\/library\/Ff471339(v=VS.85).aspx\"><code>IDXGIKeyedMutex::AcquireSync<\/code><\/a> and <a href=\"https:\/\/msdn.microsoft.com\/en-us\/library\/Ff471340(v=VS.85).aspx\"><code>IDXGIKeyedMutex::ReleaseSync<\/code><\/a> APIs to synchronize access to the surface. 
The device that creates the surface, and any other device that opens the surface by using <a href=\"https:\/\/msdn.microsoft.com\/en-us\/library\/Ff476531(v=VS.85).aspx\"><code>OpenSharedResource<\/code><\/a>, must call <code>IDXGIKeyedMutex::AcquireSync<\/code> before they issue any rendering commands to the surface. When those devices finish rendering, they must call <code>IDXGIKeyedMutex::ReleaseSync<\/code>. [&#8230;]<\/p><\/blockquote>\n<p>A video encoder MFT is supposed to pay attention to the flag and acquire synchronization before the video frame is taken for encoding. The AMD implementation fails to do so. This is a bug, a pretty important one, and it has been around for a while.<\/p>\n<p>The following code snippet (see also the text version at the bottom of the post) demonstrates the incorrect behavior of the transform.<\/p>\n<p><a href=\"https:\/\/alax.info\/blog\/wp-content\/uploads\/2018\/10\/01.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-large wp-image-1887\" src=\"https:\/\/alax.info\/blog\/wp-content\/uploads\/2018\/10\/01-758x600.png\" alt=\"\" width=\"758\" height=\"600\" srcset=\"https:\/\/alax.info\/blog\/wp-content\/uploads\/2018\/10\/01-758x600.png 758w, https:\/\/alax.info\/blog\/wp-content\/uploads\/2018\/10\/01-320x253.png 320w, https:\/\/alax.info\/blog\/wp-content\/uploads\/2018\/10\/01-768x608.png 768w, https:\/\/alax.info\/blog\/wp-content\/uploads\/2018\/10\/01-600x475.png 600w, https:\/\/alax.info\/blog\/wp-content\/uploads\/2018\/10\/01.png 1658w\" sizes=\"auto, (max-width: 758px) 100vw, 758px\" \/><\/a><\/p>\n<p>Execution reaches the breakpoint position and produces an H.264 sample even though the input texture fed into the transform is made inaccessible by the <code>AcquireSync<\/code> call in line 104.<\/p>\n<p>By contrast, Microsoft&#8217;s <a href=\"https:\/\/docs.microsoft.com\/en-us\/windows\/desktop\/medfound\/h-264-video-encoder\">H.264 Video Encoder<\/a> implementation, AKA <code>CLSID_MSH264EncoderMFT<\/code>, implements the correct 
behavior and triggers a <code>DXGI_ERROR_INVALID_CALL<\/code> (0x887A0001) failure in line 112.<\/p>\n<p>In the process of preparing the SSCCE above and writing this blog post I hit another AMD MFT bug, which is perhaps less important but still shows sloppiness in the internal implementation.<\/p>\n<p>An attempt to send the <code>MFT_MESSAGE_NOTIFY_START_OF_STREAM<\/code> message in line 96 above without input and output media types set triggers a memory access violation:<\/p>\n<blockquote><p>&#8216;Application.exe&#8217; (Win32): Loaded &#8216;C:\\Windows\\System32\\DriverStore\\FileRepository\\c0334550.inf_amd64_cd83b792de8abee9\\B334365\\atiumd6a.dll&#8217;. Symbol loading disabled by Include\/Exclude setting.<br \/>\n&#8216;Application.exe&#8217; (Win32): Loaded &#8216;C:\\Windows\\System32\\DriverStore\\FileRepository\\c0334550.inf_amd64_cd83b792de8abee9\\B334365\\atiumd6t.dll&#8217;. Symbol loading disabled by Include\/Exclude setting.<br \/>\n&#8216;Application.exe&#8217; (Win32): Loaded &#8216;C:\\Windows\\System32\\DriverStore\\FileRepository\\c0334550.inf_amd64_cd83b792de8abee9\\B334365\\amduve64.dll&#8217;. 
Symbol loading disabled by Include\/Exclude setting.<br \/>\nException thrown at 0x00007FF81FC0E24B (AMDh264Enc64.dll) in Application.exe: 0xC0000005: Access violation reading location 0x0000000000000000.<\/p><\/blockquote>\n<p><!--more--><\/p>\n<p>Better code snippet for the screenshot above:<\/p>\n<p><code><\/p>\n<pre>#pragma region DXGI Adapter\r\nDXGI::CFactory2 pFactory2;\r\npFactory2.DebugCreate();\r\nDXGI::CAdapter1 pAdapter1;\r\n__C(pFactory2->EnumAdapters1(0, &pAdapter1));\r\n#pragma endregion \r\n#pragma region D3D11 Device\r\nCComPtr<ID3D11Device> pDevice;\r\nCComPtr<ID3D11DeviceContext> pDeviceContext;\r\nUINT nFlags = 0;\r\n#if defined(_DEBUG)\r\n\tnFlags |= D3D11_CREATE_DEVICE_DEBUG;\r\n#endif \/\/ defined(_DEBUG)\r\nstatic const D3D_FEATURE_LEVEL g_pFeatureLevels[] = \r\n{\r\n\tD3D_FEATURE_LEVEL_12_1,\r\n\tD3D_FEATURE_LEVEL_12_0,\r\n\tD3D_FEATURE_LEVEL_11_1,\r\n\tD3D_FEATURE_LEVEL_11_0,\r\n\tD3D_FEATURE_LEVEL_10_1,\r\n\tD3D_FEATURE_LEVEL_10_0,\r\n\tD3D_FEATURE_LEVEL_9_3,\r\n\tD3D_FEATURE_LEVEL_9_2,\r\n\tD3D_FEATURE_LEVEL_9_1,\r\n};\r\nD3D_FEATURE_LEVEL FeatureLevel;\r\n__C(D3D11CreateDevice(pAdapter1, D3D_DRIVER_TYPE_UNKNOWN, NULL, nFlags, g_pFeatureLevels, DIM(g_pFeatureLevels), D3D11_SDK_VERSION, &pDevice, &FeatureLevel, &pDeviceContext));\r\nconst CComQIPtr<ID3D11Multithread> pMultithread = pDevice;\r\n__D(pMultithread, E_NOINTERFACE);\r\npMultithread->SetMultithreadProtected(TRUE);\r\n#pragma endregion \r\nMF::CStartup Startup;\r\nMF::CDxgiDeviceManager pDeviceManager;\r\npDeviceManager.Create();\r\npDeviceManager.Reset(pDevice);\r\nMF::CTransform pTransform;\r\n#if TRUE\r\n\t__C(pTransform.m_p.CoCreateInstance(__uuidof(AmdH264Encoder)));\r\n\t{\r\n\t\tMF::CAttributes pAttributes;\r\n\t\t__C(pTransform->GetAttributes(&pAttributes));\r\n\t\t_A(pAttributes);\r\n\t\t_A(pAttributes.GetUINT32(MF_TRANSFORM_ASYNC));\r\n\t\tpAttributes[MF_TRANSFORM_ASYNC_UNLOCK] = (UINT32) 
1;\r\n\t}\r\n\t_W(pTransform.ProcessSetD3dManagerMessage(pDeviceManager));\r\n#else\r\n\t__C(pTransform.m_p.CoCreateInstance(CLSID_MSH264EncoderMFT));\r\n#endif\r\nstatic const UINT32 g_nRateNumerator = 50, g_nRateDenominator = 1;\r\n#pragma region Media Type\r\nMF::CMediaType pInputMediaType;\r\npInputMediaType.Create();\r\npInputMediaType[MF_MT_MAJOR_TYPE] = MFMediaType_Video;\r\npInputMediaType[MF_MT_SUBTYPE] = MFVideoFormat_NV12;\r\npInputMediaType[MF_MT_ALL_SAMPLES_INDEPENDENT] = (UINT32) 1;\r\npInputMediaType[MF_MT_FRAME_SIZE].SetSize(1280, 720);\r\npInputMediaType[MF_MT_INTERLACE_MODE] = (UINT32) MFVideoInterlace_Progressive;\r\npInputMediaType[MF_MT_FRAME_RATE].SetRatio(g_nRateNumerator, g_nRateDenominator);\r\npInputMediaType[MF_MT_FIXED_SIZE_SAMPLES] = (UINT32) 1;\r\nMF::CMediaType pOutputMediaType;\r\npOutputMediaType.Create();\r\npOutputMediaType[MF_MT_MAJOR_TYPE] = MFMediaType_Video;\r\npOutputMediaType[MF_MT_SUBTYPE] = MFVideoFormat_H264;\r\npOutputMediaType.CopyFrom(pInputMediaType, MF_MT_FRAME_SIZE);\r\npOutputMediaType.CopyFrom(pInputMediaType, MF_MT_INTERLACE_MODE);\r\npOutputMediaType.CopyFrom(pInputMediaType, MF_MT_FRAME_RATE);\r\npOutputMediaType[MF_MT_AVG_BITRATE] = (UINT32) 1000 * 1000;\r\npTransform.SetOutputType(pOutputMediaType);\r\npTransform.SetInputType(pInputMediaType);\r\n#pragma endregion\r\n_W(pTransform.ProcessStartOfStreamNotifyMessage());\r\nCD3D11_TEXTURE2D_DESC TextureDescription(DXGI_FORMAT_NV12, 1280, 720);\r\nTextureDescription.MipLevels = 1;\r\nTextureDescription.MiscFlags |= D3D11_RESOURCE_MISC_SHARED_KEYEDMUTEX;\r\nCComPtr<ID3D11Texture2D> pTexture;\r\n__C(pDevice->CreateTexture2D(&TextureDescription, NULL, &pTexture));\r\n_A(D3D11::IsKeyedMutexAware(pTexture));\r\nDXGI::CKeyedMutexLock KeyedMutexLock(pTexture);\r\n_W(KeyedMutexLock.AcquireSync(0));\r\nfor(UINT nIndex = 0; nIndex < 20; nIndex++)\r\n{\r\n\tMF::CSample 
pSample;\r\n\tpSample.Create();\r\n\tpSample.AddTextureBuffer(pTexture);\r\n\tpSample->SetSampleTime(MFllMulDiv(nIndex * 1000 * 10000i64, g_nRateDenominator, g_nRateNumerator, 0));\r\n\tpSample->SetSampleDuration(MFllMulDiv(1000 * 10000i64, g_nRateDenominator, g_nRateNumerator, 0));\r\n\t__C(pTransform->ProcessInput(0, pSample, 0));\r\n\t_A(pTransform.GetOutputStreamInformation().dwFlags & MFT_OUTPUT_STREAM_PROVIDES_SAMPLES);\r\n\tMFT_OUTPUT_DATA_BUFFER OutputDataBuffer = { };\r\n\tDWORD nStatus;\r\n\tif(SUCCEEDED(pTransform->ProcessOutput(0, 1, &OutputDataBuffer, &nStatus)))\r\n\t{\r\n\t\t_A(OutputDataBuffer.pSample);\r\n\t\treinterpret_cast<MF::CSample&#038;>(OutputDataBuffer.pSample).Trace();\r\n\t\tbreak;\r\n\t}\r\n}<\/pre>\n<p><\/code><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Even though the AMD H.264 Video Encoder Media Foundation Transform (MFT), AKA AMDh264Encoder, is generally a reasonably well-made piece of software, it still has a few awkward bugs worth mentioning. This time I am going to show one of them: the video encoder transform fails to acquire synchronization on input textures. 
The problem comes&hellip; <\/p>\n<p><a class=\"moretag\" href=\"https:\/\/alax.info\/blog\/1886\">Read the full article<\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[553,63,379,424,426],"class_list":["post-1886","post","type-post","status-publish","format-standard","hentry","category-uncategorized","tag-amd","tag-bug","tag-h-264","tag-media-foundation","tag-mft"],"_links":{"self":[{"href":"https:\/\/alax.info\/blog\/wp-json\/wp\/v2\/posts\/1886","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/alax.info\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/alax.info\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/alax.info\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/alax.info\/blog\/wp-json\/wp\/v2\/comments?post=1886"}],"version-history":[{"count":0,"href":"https:\/\/alax.info\/blog\/wp-json\/wp\/v2\/posts\/1886\/revisions"}],"wp:attachment":[{"href":"https:\/\/alax.info\/blog\/wp-json\/wp\/v2\/media?parent=1886"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/alax.info\/blog\/wp-json\/wp\/v2\/categories?post=1886"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/alax.info\/blog\/wp-json\/wp\/v2\/tags?post=1886"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}