Screen recording using Desktop Duplication API and hardware H.264 encoder

The application takes advantage of three powerful Windows APIs at once.

MediaFoundationDesktopRecorder initializes a desktop duplication session and sends the captured desktop images to an H.264 video encoder, producing a standard MP4 recording. Optionally, it can add an audio track captured from one of the standard audio inputs.

The best performance is achieved with a hardware H.264 encoder: not only does the hardware encoder itself perform better, but desktop images are also transferred to the encoder efficiently, without being copied through system memory. With suitable hardware, recording is quite efficient.

There are certain limitations: the Desktop Duplication API requires Windows 8 or later, and encoder availability depends on hardware and OS version. The application lets the API pick the encoder automatically and in the worst case falls back to a software encoder, which typically means a performance hit.

When started, the application prints initial information, especially about the availability of devices, and appends further lines as actions and events take place.

The application uses a configuration file with the same name and location as the application and an .INI extension. Changes to the configuration file take effect when the application is restarted.

The application registers the Win+F5 and Win+F8 hotkeys globally to start and stop recording while the application is in the background (that is, while the user interacts with another application).

The application generates .MP4 files in its own directory. Each file has a video track and, depending on settings, one optional audio track. Video is taken from one of the monitors, and audio from one of the available standard audio input devices.

The application also generates log files at one of the following locations:

  • C:\ProgramData\MediaFoundationDesktopRecorder.log
  • C:\Users\$(UserName)\AppData\Local\MediaFoundationDesktopRecorder.log (in case the first path above is inaccessible, esp. due to insufficient permissions)

Configuration

The configuration .INI file might contain a few settings that set up and alter the behavior of the application:

[Input]
;Video Adapter Description=NVIDIA GeForce GTX 750
Video Output Device Name=\\.\DISPLAY2
;Audio Friendly Name=Stereo Mix (Realtek High Definition Audio)

When started, the application enumerates the available video and audio inputs (“found video…”, “found audio…”). These discoveries are compared against the configuration file settings to identify the monitor to record and, possibly, the audio input device.

The default behavior, when the settings do not instruct otherwise, is to take the first available monitor. By default, no audio is recorded; audio is recorded and added to the resulting file only if an input device is specified explicitly.

The application also prints which devices are taken for further recording (“using adapter…”).
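The selection rules above can be sketched roughly as follows. This is a minimal Python illustration, not the application's actual code: the function name select_inputs, the exact string comparison, and the fallback to the first monitor when the configured name is not among the enumerated ones are all assumptions.

```python
import configparser

# Sample taken from the [Input] section shown above.
SAMPLE_INI = r"""
[Input]
;Video Adapter Description=NVIDIA GeForce GTX 750
Video Output Device Name=\\.\DISPLAY2
;Audio Friendly Name=Stereo Mix (Realtek High Definition Audio)
"""


def select_inputs(ini_text, monitors, audio_devices):
    """Return (monitor, audio_device_or_None) for the enumerated devices.

    Hypothetical helper: exact-match comparison is an assumption.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    settings = parser["Input"] if parser.has_section("Input") else {}
    wanted_monitor = settings.get("Video Output Device Name")
    wanted_audio = settings.get("Audio Friendly Name")
    # Default: first enumerated monitor unless a setting names another one.
    monitor = wanted_monitor if wanted_monitor in monitors else monitors[0]
    # Audio is recorded only when a device is named explicitly and present.
    audio = wanted_audio if wanted_audio in audio_devices else None
    return monitor, audio
```

With the sample above and monitors \\.\DISPLAY1 and \\.\DISPLAY2 enumerated, the sketch picks \\.\DISPLAY2 and records no audio, because the audio line is commented out.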

[Format]
;Video Frame Rate=30000
;Video Frame Rate Denominator=1001
Video Bitrate=4096000
Video Texture Pool Capacity=24
Video Throttle=70
Audio Bitrate=192000

The default behavior is to identify the monitor’s refresh rate and produce an output file with video at the same frame rate. The Video Frame Rate and Video Frame Rate Denominator settings override the target file frame rate. With only the former value, it is the frame rate itself. With both values, they define a ratio, e.g. values of 30000 and 1001 result in a 29.97 fps file.
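The arithmetic behind these two settings is easy to check. The sketch below is only an illustration in Python; the function name target_frame_rate is hypothetical, and treating a missing denominator as 1 is my reading of the description above rather than a confirmed implementation detail.

```python
from fractions import Fraction


def target_frame_rate(refresh_rate, numerator=None, denominator=None):
    """Return the output frame rate as an exact ratio.

    refresh_rate: monitor refresh rate, used when no override is configured.
    numerator:    value of "Video Frame Rate", if present.
    denominator:  value of "Video Frame Rate Denominator", if present.
    """
    if numerator is None:
        # No override: follow the monitor refresh rate.
        return Fraction(refresh_rate)
    # Assumption: a missing denominator means the numerator is the rate itself.
    return Fraction(numerator, denominator or 1)


# 30000/1001 is the familiar 29.97 fps ratio.
ntsc = target_frame_rate(Fraction(59954, 1000), 30000, 1001)
print(float(ntsc))
```

Keeping the rate as a ratio rather than a rounded float avoids accumulating drift when frame timestamps are derived from it.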

Frame rate reduction is a good way to reduce encoding complexity and overall graphics subsystem load.

Bitrate values define respective bitrates for the encoded content.

Details

As recording proceeds, the application grabs new desktop snapshots and sends them to the encoder. There are no specific guarantees about frame rate stability or about frame rate reduction when the graphics subsystem is overloaded. When the complexity is excessive, some frames might be lost, but without breaking the playability of the output file.

The application provides additional information when it creates a file, for example:

Using Direct3D 11 at feature level D3D_FEATURE_LEVEL_11_0
Using Desktop Duplication mode: Resolution 1680 x 1050, Refresh Rate 59954/1000, Format DXGI_FORMAT_B8G8R8A8_UNORM
Using path “D:\Projects\...\Output\20160707-070707.mp4”
Using video transform Direct3D 11 Aware, Category MFT_CATEGORY_VIDEO_PROCESSOR, Input MFVideoFormat_ARGB32, Output MFVideoFormat_NV12
Using video transform NVIDIA H.264 Encoder MFT, Direct3D 11 Aware, Category MFT_CATEGORY_VIDEO_ENCODER, Input MFVideoFormat_NV12, Output MFVideoFormat_H264
Started writing…
PPP frames written (QQQ frame timeouts, RRR early frame skips, SSS late frame skips)
Stopped writing
Output file size is TTT bytes

Once recording has started, the application might run into a condition where a certain hardware resource is no longer available, e.g. when the user locks the desktop. The application will then close the file and attempt to automatically restart recording into a new file. The attempts continue until the user explicitly stops recording.

The application has the following limitations (among things it could do):

  • it records from one monitor only; to record from two at a time it is possible to start several instances, however the resulting recordings will not be synchronized
  • it does not provide options to record a single window, to crop a section of the monitor image or to scale the image down
  • it does not offer a choice of video encoders (e.g. when there are two or more hardware H.264 encoders); it always uses the encoder picked by the system
  • bitrate is the only video encoding setting it offers
  • it does not provide flexibility in audio encoding settings, and it expects the audio device to remain available throughout the entire recording session (in particular, not to be unplugged during recording)

Build Incrementing for Visual Studio C++ Projects

For a long time I used an automatic build incrementer add-in for Visual Studio C++ projects, and it proved helpful. With increments in the file version information, the binaries were easy to identify, and it was easy to find the matching symbol information. Long story short, a tool like this has been a must.

The add-in had downsides though. It kept patching the .RC source and touched it even when nothing else in the build had changed; touching the source forced rebuilds on its own and reloaded resource-related files open in Visual Studio editors. It was annoying, even though more or less acceptable.

Visual Studio 2015 Community Edition does not support add-ins, either because it is 2015 or because it is Community Edition. Either way, it was time to update the incrementer to make things nicer overall.

This time I preferred to change things a bit. No more source code patching: the incrementer can be attached as a post-build event and patch the VERSIONINFO resource in the built binary. This requires that the current build number is kept somewhere other than the .RC text, so I am using an additional .INI file. The good thing is that this file can still be included in the version control system, so the version history can be tracked relatively easily. No more source code modification, which made the code base dirty and forced another rebuild.

Command line syntax:

C:\>IncrementBuild-Win32
Syntax: IncrementBuild-Win32.exe argument [argument...]

Arguments:
  help - displays syntax
  configuration <path> - path to .INI file holding configuration information (mandatory)
  binary <path> - path to binary to be patched with file version update (mandatory)
  string <name> <value> - add, update or remove specific version information string (optional; multiple arguments possible)
  dump - print version information data block dump before and after update

An additional feature is that the incrementer can attach extra version strings (see the example below, which adds the build configuration as a version information string).

Setting up is easy. First, the project should have a version information resource, so that the binary has data to patch in the first place.

Then, there should be an .INI file which tracks version numbers. The binary will be built with the .RC numbers, and then the incrementer will apply the least significant number from the .INI file, incrementing it along the way.

[General]

[VersionInformation]
;Language=1033 ;MAKELANGID(LANG_ENGLISH, SUBLANG_ENGLISH_US)
;Version String Format=%d.%d.%d.%d
Current Build Number=4
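The bookkeeping on the .INI side can be sketched as below. This is a Python illustration of the idea only, not the tool's code: the function name increment_build is hypothetical, and incrementing first and then applying the new number is an assumption (the tool might equally apply the stored number and then store the next one). Patching the VERSIONINFO resource itself is not shown.

```python
import configparser
import io

SAMPLE_INI = "[General]\n\n[VersionInformation]\nCurrent Build Number=4\n"


def increment_build(ini_text, rc_version):
    """Bump the stored build number and compose the new file version string.

    rc_version is the four-part version compiled from the .RC file; only its
    least significant part is replaced. Returns (file_version, new_ini_text).
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep the original key capitalization
    parser.read_string(ini_text)
    section = parser["VersionInformation"]
    # Assumption: increment first, then apply the incremented number.
    build = int(section.get("Current Build Number", "0")) + 1
    section["Current Build Number"] = str(build)
    major, minor, revision, _ = rc_version
    file_version = "%d.%d.%d.%d" % (major, minor, revision, build)
    buffer = io.StringIO()
    parser.write(buffer, space_around_delimiters=False)
    return file_version, buffer.getvalue()
```

Note that configparser drops comment lines when writing the file back, so a real tool would want an update method that preserves the commented-out settings.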

Next thing, project post-build event needs a command for patching:

"$(AlaxInfo_Common)\..\Utilities\IncrementBuild\_Bin\IncrementBuild-$(PlatformName).exe" configuration "$(ProjectDir)Module.ini" binary "$(TargetPath)" string "ConfigurationName" "$(ConfigurationName)" 

The command takes Module.ini from the project’s directory as the configuration file, patches the build output and also attaches the build configuration as an additional version information string.

Build output looks like this:

—— Rebuild All started: Project: EnumerateTransforms, Configuration: Release Win32 ——
stdafx.cpp
Application.cpp
Generating code
Finished generating code
EnumerateTransforms.vcxproj -> D:\Projects…_Bin\Win32\Release\EnumerateTransforms.exe
EnumerateTransforms.vcxproj -> D:\Projects…_Bin\Win32\Release\EnumerateTransforms.pdb (Full PDB)
Configuration Path: D:\Projects…\Module.ini
Binary Path: D:\Projects…_Bin\Win32\Release\EnumerateTransforms.exe
Incrementing build number, product version 1.0.0.1, file version 1.0.0.4
Applying version information string, name “ConfigurationName”, value “Release”

Presumably, it is not necessary to use a tool of the same bitness as the binary, since the version information patching API should be able to patch resources of a mismatching build, but I normally use a matching tool anyway. Why not?

KB3176938’s Frame Server update visually

  1. M-JPEG and H.264 media types are available again (good)
  2. Even when connected, H.264 video is not processed correctly; new bug or old one? Not clear. Even though it sort of works, in DirectShow it looks broken in another new way, perhaps collateral damage that may never be fixed…
  3. There is no camera sharing between applications, even though it was the justification for the changes in the first place. For now Frame Server is just useless overhead, which adds bad stuff, has been polished a bit to do less harm, and may turn out to be good some time later.
    • for the record, the camera works in Skype when it is not consumed elsewhere concurrently

BTW the hack that bypasses FrameServer survived the update and remains in good standing.

DirectShowCaptureCapabilities and MediaFoundationCaptureCapabilities: API version and EnableFrameServerMode state

Both tools now include the exact version of the API and also an export of the registry key related to Frame Server.

mfcore.dll version of 10.0.14393.105 corresponds to Cumulative Update for Windows 10 Version 1607: August 31, 2016 also known as KB3176938 with DirectShow and Media Foundation improvement for Windows 10 Anniversary Update that restores availability of compressed media types.

See:

Enumeration of DirectShow Capture Capabilities (Video and Audio)

Media Foundation Video/Audio Capture Capabilities

Number of streams served by IMFSourceReader interface

It is confusing that the IMFSourceReader interface does not offer a dedicated method to find out the number of streams behind it. There is an IMFMediaSource instance behind the reader, and its streams are available through the IMFMediaSource::CreatePresentationDescriptor and IMFPresentationDescriptor::GetStreamDescriptorCount method calls.

I am under the impression that such a source reader method just has to be there, even though I do not see it in the list of methods. Okay, there are other methods, namely IMFSourceReader::GetStreamSelection, which takes either an ordinal stream index or an alias as the first argument and returns MF_E_INVALIDSTREAMNUMBER once you run out of streams. The problem, however, is that this is associated with an internal exception, and I consider exceptions to be exceptional conditions the code should not normally hit. I would expect a legitimate exception-free way to find out the number of streams. I use a debugger that breaks on the exception or at least pollutes the output log for no reason, and I use other tools that intercept and log exceptions as something that needs attention; getting the number of streams is nowhere near that.

Even though it is not a real drawback of the API, since it is still possible to get the data and the API acts as documented, I still think someone overlooked this, and an API like this should have a normal method or argument to request the number of streams explicitly. Or I am just not seeing it, even though I have looked thoroughly.
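For clarity, the probing workaround discussed in this post boils down to the loop below. It is a language-neutral sketch written in Python against a stand-in callable, not real Media Foundation code: get_stream_selection is a hypothetical wrapper that returns an (HRESULT, selected) pair the way IMFSourceReader::GetStreamSelection reports them, and the probe limit is an arbitrary safety assumption.

```python
S_OK = 0
MF_E_INVALIDSTREAMNUMBER = 0xC00D36B3  # HRESULT value as defined in mferror.h


def count_streams(get_stream_selection, limit=256):
    """Probe stream indices 0, 1, 2... until the reader rejects the index.

    get_stream_selection(index) is a hypothetical stand-in returning a tuple
    (hresult, selected); the first rejected index equals the stream count.
    """
    for index in range(limit):
        hresult, _selected = get_stream_selection(index)
        if hresult == MF_E_INVALIDSTREAMNUMBER:
            return index
        if hresult != S_OK:
            raise OSError("GetStreamSelection failed: 0x%08X" % hresult)
    raise RuntimeError("stream count exceeds the probe limit")


def fake_reader(index):
    """Stand-in for a reader with three streams, for illustration only."""
    if index < 3:
        return S_OK, True
    return MF_E_INVALIDSTREAMNUMBER, False


print(count_streams(fake_reader))
```

The loop always ends on a failed call, which is exactly the internal exception complained about above; going through the presentation descriptor avoids it at the cost of reaching behind the reader.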

Logitech camera video freezes in Skype after Windows 10 Anniversary Update

Windows 10 Anniversary Update broke Skype video conferencing in the classic Skype desktop application for many users.

Video from Logitech cameras is freezing in Skype

A number of software pieces run together to power video conferencing, and from in front of the end-user application people cannot see whose fault the broken video is. There is Microsoft, Logitech and Skype (oh wait, Skype was acquired by Microsoft, so some think that this somehow changed the internals and the core of the application that existed at the time of the acquisition).

This really confuses me. The Logitech cameras got a skype certificate. But then MS decides to make a change that renders those devices uselss? […]

The reality is that Anniversary Update makes an extremely extravagant change to an API used by many applications, that it changed the behavior of those applications, and that it constrained the capabilities of such nice cameras as Logitech’s C920, C930e and others. Yet the reality is also that video capture using these Logitech cameras still functions well in updated Windows. The update made the highest modes unavailable and destroyed the ability to capture M-JPEG and H.264 video, but the cameras still work smoothly in the other, simpler raw video modes.

So what the heck is the problem with this videoconferencing? Skype’s implementation of video capture in the desktop application has been terrible for years. There was no one there to ask why they support some devices and not others, or why they were unable to work with certain cameras while other applications worked with them well. Instead, they hid the problem by offering certified cameras. This leads nowhere in the long term, and this case of broken videoconferencing is a good example. The video capture was not done right in the first place and relied on things it should not have relied on.

Another point is that videoconferencing is one of the key Skype features, and Insider builds of the new Windows had been available for months.

… this behavior was planned, designed, tested, and flighted out to our partners and Windows Insiders around the end of January of this year. We worked with partners to make sure their applications continued to function throughout this change…

Somehow it turned out that the Skype people were unable to respond in time to the upcoming update deployment and ended up with a tornado of customer complaints. Who are the partners Microsoft worked with to make sure that the change was not fatal for their applications and hardware? Logitech and Skype don’t seem to be on that list.

German users discuss it here (translated):

This is also a perfect example of how ignorant developers and users are of the opportunities Microsoft offers.

I am not concerned with the technical aspect, which is indeed debatable, and even if Microsoft has noble intentions (simultaneous access to the camera by several apps etc.; it is all in the MS forum link), this is a breaking change that caught many completely off guard and probably overshot the mark.

One just has to ask why. Microsoft does admit that the change could have been documented better, but it has been live in the Insider builds since January! In the MS forum users complain that thousands of their customers have problems with their products from one day to the next because of this change, and I just wonder: what has the company been doing for the half year since the change became available for testing? Their customer base cannot be that important to them… Anyone can get Insider builds free of charge, and developers should use the opportunity to test their products early. I do not understand this ignorance: waiting until an update arrives and hoping nothing has changed. In the past that was unfortunately the rule, and with every major OS release it took quite a while until software vendors fixed incompatibilities, but precisely thanks to the Insider program that should have declined a lot. Instead vendors still wait and prefer patchwork fixes over proactively adapting their software in advance, so that their customers have a working system right at the launch of the new OS version.

The forum also shows how willing Microsoft is to talk, how quickly it responds to user feedback, and how promptly it accordingly rolls back some of the system’s restrictions on its side (MJPEG is allowed again as a first step). But the discussion comes half a year too late, not on Microsoft’s part but on the part of the developers concerned, who did not use the opportunities offered to test the changes early.

OK, Skype is an acquisition, but what about Skype for Business (previously known as Lync)?

… Optimized for Skype for Business, the C930e Webcam supports H.264 with Scalable Video Coding and UVC 1.5 encoding to minimize its dependence on computer and network resources.

I suppose the Lync team is also not on the mentioned list of partners. From what I know of Lync’s internals, the Anniversary Update is likely to block the optimization cited above. As a video capture source, Logitech C930e cameras no longer offer H.264 video, and Skype for Business is unlikely to be able to utilize it; the camera is likely to operate in fallback mode, just like other cameras. Perhaps someone from Lync should have mentioned that in January or earlier. Or maybe they did, who knows.

Long story short, a Skype update will supposedly fix the freezing soon (the good news is also that it is not delivered through Windows Update, hence the fix will come faster), and then the Media Foundation team will provide a solution that restores H.264 optimizations a few months later.

Anniversary Webcam Saga: It’s clear who’s guilty, now what to do? (Updated)

As more and more people discover the Windows 10 Anniversary Update breaking changes (and, as expected, get mad), let’s reiterate the possible solutions:

  1. You don’t like the idea that the video sharing service adds latency and man-in-the-middle access to (spying on) the video feed.
    See #6 below.
  2. You are consuming raw video from the camera using one of the uncompressed modes within USB 2.0 bandwidth.
    You are likely not affected by the changes.
  3. You are consuming raw video from the camera, but the resolution/rate combination makes raw capture impossible, so you captured M-JPEG instead and decoded that, via the DirectShow API.
    It is no longer possible, but you can use the Media Foundation API instead. Or someone will develop a wrapper that re-exposes Media Foundation captured video via DirectShow.
  4. Same as #3 above but via the Media Foundation API.
    You have the option to consume already decoded video; the new subsystem will automatically capture M-JPEG and decode it into NV12.
  5. You take advantage of the compressed format of the captured video (DirectShow or Media Foundation) so that you don’t need to compress it for storage or network transmission purposes.
    Compressed captured video is no longer available, see #6 below.
  6. You take advantage of H.264 video capture offered by a UVC 1.5 device, including fine tuning of hardware H.264 compression.
    Just as in #1 and #5 above, you are in trouble. Windows Camera Frame Server no longer offers access to such a video feed. You need a hack (yes, it’s confirmed to be possible) that restores the original behavior of the video capture hardware.

These and other reasons, related to the fact that applications no longer talk to the real capture device but rather to a Frame Server Client that proxies a web camera, will possibly require that video capture applications be updated in order to work well in the new version of the operating system.

It is unclear if and how Microsoft and the Media Foundation team will respond to the customers’ pain. At first it looked like a bug, and one could expect a response and a fix. But with the information from the Windows Camera Team it looks completely different. No, they did not break it accidentally; it was a planned change. Then they tied the new behavior to new Microsoft products, which rely on it. And they did a few nasty things, not just one: added a proxy service, killed UVC-related compression control over the device, reduced the range of operation modes for DirectShow (which they are looking for ways to deprecate), and conceptually removed compressed video capture modes. I think there is no way back: Windows Camera Frame Server is the new reality. The best to expect is that some of the mentioned problems are eased by the platform offering greater flexibility. Maybe they will add some sort of exclusive mode for video capture, or “professional” hardware which offers more through the API. In any event these changes are unlikely to appear soon, as they will go through the full development cycle and take months to be delivered. Public pressure might make that happen sooner rather than later, but I don’t think that is what is going to happen.

16 Aug update: The Windows Camera Team reported that they see the customer pain and will do something to ease it shortly. As I see it, they will address scenarios #3, #4 and #5 above for MJPG video, allowing compressed formats to pass through Frame Server so that users can consume them from their applications, with Frame Server able to release frames not just after the shared decoder but also before it. Also, as use of H.264 is limited, it might not be included in the hotfix at all, or it will be included much later after more serious consideration (which might end up as dropped support in DirectShow and something new introduced for Media Foundation).

19 Aug update: Someone took time to locate a registry value. User WithinRafael on MSDN Forums:

Try opening up

HKLM\SOFTWARE\Microsoft\Windows Media Foundation\Platform (32- and 64-bit OS)
HKLM\SOFTWARE\WOW6432Node\Microsoft\Windows Media Foundation\Platform (64-bit OS)

and add a DWORD value with name EnableFrameServerMode. Set its value to 0 and try again.

Put a sticky note on your monitor to revisit this if/when Microsoft issues a fix.
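For those who prefer a file they can review and import over manual editing, the quoted change can be expressed as a .reg file. The Python sketch below merely generates its text; the function name frame_server_reg_file is hypothetical, the two key paths and the value name are the ones quoted above, and on a 32-bit OS the WOW6432Node entry is simply not needed.

```python
# Registry keys quoted above, spelled with the full hive name used by .reg files.
KEYS = [
    r"HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows Media Foundation\Platform",
    r"HKEY_LOCAL_MACHINE\SOFTWARE\WOW6432Node\Microsoft\Windows Media Foundation\Platform",
]


def frame_server_reg_file(enabled=False):
    """Return .reg file text that sets EnableFrameServerMode under both keys."""
    lines = ["Windows Registry Editor Version 5.00", ""]
    for key in KEYS:
        lines.append("[%s]" % key)
        # DWORD values are written as eight hexadecimal digits.
        lines.append('"EnableFrameServerMode"=dword:%08x' % int(enabled))
        lines.append("")
    return "\r\n".join(lines)


print(frame_server_reg_file())
```

Importing the resulting file requires administrative rights, and the same caveat applies: revisit it once Microsoft ships a fix.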
