The previous post focused on problems with the hardware decoder MFT provided as a part of the video driver package. This time I am going to share some data on how the inefficiency affects video capture performance, using a high frame rate 260 FPS camera as a test stand. The effect is more visible at high frame rates because CPU and GPU hardware is already fast enough to hide it when processing a less demanding signal.
There is already some interest from AMD's end (why this is exceptional deserves a separate post of its own), and some bug fixes are already under way.
The performance problem is easy to miss because the decoder otherwise works without fatal issues and produces the expected output: no failures, no error codes, no deadlocks, and neither the CPU nor the GPU engine is maxed out, so things look more or less fine at first glance… The test application uses the Media Foundation Source Reader API to read textures in hardware MFT enabled mode and discards the textures, just printing out the frame rate.
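For readers who want to reproduce the measurement in their own capture loop, here is a minimal, language-neutral sketch of the "count samples, print the rate" part, written in Python for brevity. It is not the code of the test application: the `RateMeter` name, the `on_sample` method and the two-second window are all assumptions (the half-sample steps in the output below merely suggest a two-second window).

```python
import time


class RateMeter:
    """Counts delivered samples and reports samples per second once per window."""

    def __init__(self, window_seconds=2.0, clock=time.monotonic):
        # window_seconds is an assumed value; the clock is injectable so the
        # meter can be exercised without a real camera attached.
        self._window = window_seconds
        self._clock = clock
        self._window_start = clock()
        self._count = 0

    def on_sample(self):
        """Call once per delivered sample.

        Returns the measured rate when the current window has elapsed,
        otherwise None.
        """
        self._count += 1
        now = self._clock()
        elapsed = now - self._window_start
        if elapsed < self._window:
            return None
        rate = self._count / elapsed
        # Start the next window at the moment this one was closed.
        self._window_start = now
        self._count = 0
        return rate
```

In a real reader, `on_sample()` would be called once for every sample the Source Reader delivers, and any non-`None` return value would be printed; the sample itself can be released right away, which is what "discards the textures" amounts to.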
AMD MFT MJPEG Decoder
    C:\...\MjpgCameraReader\bin\x64\Release>MjpgCameraReader.exe
    Using camera HD USB Camera
    Using adapter Radeon RX 570 Series
    Using video capture format 640x360@260.004 MFVideoFormat_MJPG
    Using hardware decoder MFT AMD MFT MJPEG Decoder
    Using video frame format 640x384@260.004 MFVideoFormat_YUY2
    72.500 video samples per second captured
    134.000 video samples per second captured
    135.000 video samples per second captured
    134.500 video samples per second captured
    135.500 video samples per second captured
    134.000 video samples per second captured
    134.000 video samples per second captured
    135.000 video samples per second captured
    134.500 video samples per second captured
    133.500 video samples per second captured
    134.000 video samples per second captured

With no sign of hitting a bottleneck, the reader process receives only ~134 FPS from the video capture device, about half of the camera's nominal 260 FPS.
Alax.Info MJPG Video Decoder for AMD Hardware
My replacement for the hardware decoder MFT decodes the same signal and, generally, shares a lot with AMD’s own decoder: both MFTs are built on top of the Advanced Media Framework (AMF) SDK. The driver package installs a runtime for this SDK and also installs a decoder MFT which is statically linked against a copy of the runtime (according to an AMD representative, the statically linked copy shares the same codebase).
    C:\...\MjpgCameraReader\bin\x64\Release>MjpgCameraReader.exe
    Using camera HD USB Camera
    Using adapter Radeon RX 570 Series
    Using video capture format 640x360@260.004 MFVideoFormat_MJPG
    Using substitute decoder Alax.Info MJPG Video Decoder for AMD Hardware
    Using video frame format 640x360@260.004 MFVideoFormat_YUY2
    74.000 video samples per second captured
    261.000 video samples per second captured
    261.000 video samples per second captured
    261.000 video samples per second captured
    261.000 video samples per second captured
    260.500 video samples per second captured
    261.000 video samples per second captured
    261.000 video samples per second captured
    261.000 video samples per second captured
    261.000 video samples per second captured
    260.500 video samples per second captured

CPU and GPU utilization levels are similar, but the frame rate is higher. Actually, it is the expected frame rate, because this is the rate the camera is supposed to operate at.
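To put a number on the difference between the two 640x360@260 runs, the console output can be summarized with a small script. This is only an illustrative sketch: the `steady_state_rate` helper and the choice to drop the first reading as warm-up are my assumptions, and the readings are copied from the two AMD adapter runs above.

```python
import re

# Matches the per-window lines printed by the test application.
RATE_LINE = re.compile(r"(\d+(?:\.\d+)?) video samples per second captured")

NOMINAL_FPS = 260.004  # from "Using video capture format 640x360@260.004"

AMD_LOG = """
72.500 video samples per second captured
134.000 video samples per second captured
135.000 video samples per second captured
134.500 video samples per second captured
135.500 video samples per second captured
134.000 video samples per second captured
134.000 video samples per second captured
135.000 video samples per second captured
134.500 video samples per second captured
133.500 video samples per second captured
134.000 video samples per second captured
"""

SUBSTITUTE_LOG = """
74.000 video samples per second captured
261.000 video samples per second captured
261.000 video samples per second captured
261.000 video samples per second captured
261.000 video samples per second captured
260.500 video samples per second captured
261.000 video samples per second captured
261.000 video samples per second captured
261.000 video samples per second captured
261.000 video samples per second captured
260.500 video samples per second captured
"""


def steady_state_rate(log_text, skip=1):
    """Mean of the reported rates, ignoring the first `skip` warm-up readings."""
    rates = [float(value) for value in RATE_LINE.findall(log_text)]
    steady = rates[skip:]
    if not steady:
        raise ValueError("no steady-state readings found")
    return sum(steady) / len(steady)


if __name__ == "__main__":
    for name, log in (("AMD MFT", AMD_LOG), ("substitute", SUBSTITUTE_LOG)):
        rate = steady_state_rate(log)
        print(f"{name}: {rate:.1f} FPS, {rate / NOMINAL_FPS:.1%} of nominal")
```

With these readings the AMD MFT averages about 134.4 samples per second, roughly 52% of the nominal rate, while the substitute decoder averages about 260.9, which is the full camera rate within measurement noise.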
1280×720@120 Mode
Interestingly, in the lower FPS mode the AMD MFT threading issues are still present and, on top of that, the MFT exhibits two other issues (one of them is a “just ignore” one, per AMD's comment). At the same time the video capture rate is no longer reduced: the horsepower of the hardware hides the implementation inefficiency.
    Using camera HD USB Camera
    Using adapter Radeon RX 570 Series
    Using video capture format 1280x720@120.000 MFVideoFormat_MJPG
    Using hardware decoder MFT AMD MFT MJPEG Decoder
    Using video frame format 1280x736@120.000 MFVideoFormat_YUY2
    18.500 video samples per second captured
    120.000 video samples per second captured
    120.000 video samples per second captured
    120.000 video samples per second captured
    120.000 video samples per second captured
    120.000 video samples per second captured
    120.000 video samples per second captured
    120.000 video samples per second captured
    120.000 video samples per second captured
Intel Hardware M-JPEG Decoder MFT
AMD is not the only GPU vendor out there, and my development system is equipped with an integrated GPU from Intel as well, so why not give it a try?
In AMD's defence, Intel’s decoder also exhibits subpar performance:
    C:\...\MjpgCameraReader\bin\x64\Release>MjpgCameraReader.exe
    Using camera HD USB Camera
    Using adapter Intel(R) UHD Graphics 630
    Using video capture format 640x360@260.004 MFVideoFormat_MJPG
    Using hardware decoder MFT Intel® Hardware M-JPEG Decoder MFT
    Using video frame format 640x368@260.004 MFVideoFormat_YUY2
    24.000 video samples per second captured
    63.500 video samples per second captured
    63.500 video samples per second captured
    64.000 video samples per second captured
    63.500 video samples per second captured
    63.000 video samples per second captured
    63.500 video samples per second captured
    62.000 video samples per second captured
    63.500 video samples per second captured
    64.000 video samples per second captured
    63.500 video samples per second captured
At lower relative utilization levels and, again, without visibly hitting any bottleneck, the capture rate is reduced.
And this happens even without the threading problem that I could at least see in AMD’s case.
The 120 FPS mode is doing well:
    C:\...\MjpgCameraReader\bin\x64\Release>MjpgCameraReader.exe
    Using camera HD USB Camera
    Using adapter Intel(R) UHD Graphics 630
    Using video capture format 1280x720@120.000 MFVideoFormat_MJPG
    Using hardware decoder MFT Intel® Hardware M-JPEG Decoder MFT
    Using video frame format 1280x720@120.000 MFVideoFormat_YUY2
    77.000 video samples per second captured
    119.000 video samples per second captured
    120.000 video samples per second captured
    121.000 video samples per second captured
    119.000 video samples per second captured
    121.000 video samples per second captured
    120.000 video samples per second captured
    120.000 video samples per second captured
    120.500 video samples per second captured
    119.500 video samples per second captured
    120.000 video samples per second captured
That is, there is an obvious performance issue in Intel’s implementation, since it fails to process the lower resolution signal at its original rate, or even at the rate it achieves for the higher resolution signal!
The 1920×1080@60 mode is doing well too:
    C:\...\MjpgCameraReader\bin\x64\Release>MjpgCameraReader.exe
    Using camera HD USB Camera
    Using adapter Intel(R) UHD Graphics 630
    Using video capture format 1920x1080@60.000 MFVideoFormat_MJPG
    Using hardware decoder MFT Intel® Hardware M-JPEG Decoder MFT
    Using video frame format 1920x1088@60.000 MFVideoFormat_YUY2
    49.500 video samples per second captured
    60.500 video samples per second captured
    59.500 video samples per second captured
    60.000 video samples per second captured
    60.000 video samples per second captured
    60.000 video samples per second captured
    60.000 video samples per second captured
    60.000 video samples per second captured
    60.000 video samples per second captured
    60.000 video samples per second captured
    60.000 video samples per second captured
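A quick back-of-the-envelope calculation makes the point about the three Intel runs more concrete. The sketch below converts each run into decoded pixels per second and time per delivered frame; the helper names are mine, and 63.5 is taken as a typical steady-state reading from the 640x360 run, so treat the result as an estimate rather than a measurement.

```python
# (width, height, delivered samples per second) for the three Intel runs.
# 63.5 is a typical steady-state reading from the 640x360 run.
INTEL_RUNS = [
    (640, 360, 63.5),
    (1280, 720, 120.0),
    (1920, 1080, 60.0),
]


def megapixels_per_second(width, height, fps):
    """Decoded pixel throughput, in millions of pixels per second."""
    return width * height * fps / 1e6


def frame_interval_ms(fps):
    """Average time between delivered frames, in milliseconds."""
    return 1000.0 / fps


if __name__ == "__main__":
    for width, height, fps in INTEL_RUNS:
        print(
            f"{width}x{height}: "
            f"{megapixels_per_second(width, height, fps):.1f} Mpx/s, "
            f"{frame_interval_ms(fps):.2f} ms per frame"
        )
```

The 640x360 run comes out at about 14.6 Mpx/s, against about 110.6 Mpx/s for 1280x720 and 124.4 Mpx/s for 1920x1080. The same decoder moves over seven times more pixels per second in the larger modes, so raw pixel throughput does not look like the limit; the numbers are more consistent with a fixed per-frame cost somewhere in the software layer, although this calculation alone cannot tell where.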
In closing
The bottom line is that hardware ASICs are generally good, but the quality of the software MFT layer is not something GPU vendors care much about.
The application below runs the test on the first available GPU and assumes you have a video capture device compatible with the Media Foundation API. The application uses the highest frame rate MJPG format of the camera and a hardware decoder MFT associated with the GPU.
One more thing to mention is that video capture takes place through the so-called Microsoft Windows Camera Frame Server (FrameServer) service, notorious and undocumented. Frame Server virtualizes the video capture device, adding processing overhead and cross-process synchronization.
Some time later I will compare the performance of capturing through Frame Server and through the Media Foundation default implementation of the video capture device proxy. I expect, though, that there is no visible performance difference, as those are, ultimately, done well.
Download links
Binaries:
- 64-bit: MjpgCameraReader.exe (in .7z archive)
- License: This software is free to use