YouTube video contains content… blocked on copyright grounds

The previous post embedded a video clip, hosted on YouTube: a 37-second recording of a Microsoft Edge window playing another YouTube clip. The video clearly showed the browser UI and the YouTube web page around the video. The audio was a re-encoded re-capture of the playback audio; however, a matching algorithm would obviously report a positive here because the audio was captured excellently. The clip itself had a link to the original content in its description and also linked the blog post. Essentially, the 37 s fragment of a 205 s song demonstrated that capturing and recording a separate window using OS APIs and tooling produces a decent MP4 file.

Then this happened:

After a manual review, a copyright owner has claimed some material in your video.

As a result, your video has been blocked, and can no longer be played on YouTube.

This is not a copyright strike. This claim does not affect your account status.

Video title: DxgiTakeWindowSnapshot’s WGC & Loopback Audio Recording

So nowadays it is copyright infringement to show a half-minute screen capture of the playback of about 1/6th of a music video, with a clear indication of source and purpose, and a MANUAL REVIEWER confirmed that.

But okay, the song is rather good.

DxgiTakeWindowSnapshot & Window Recording w/ Audio

I sometimes use a rework of the earlier DxgiTakeSnapshot application for one specific purpose mentioned below. In addition to the Desktop Duplication API, recent versions of Windows offer a similar (in the sense of acquiring external video content) API: Windows.Graphics.Capture (hereinafter “WGC”), and the new rework uses this API to capture visual content as a snapshot or, now, as a video stream.

I will skip the technical details and just link Robert Mikhayelyan’s Win32CaptureSample project on GitHub, which is perhaps the best place to ask questions about this new API.

Today’s application takes a snapshot of a given monitor, window or process windows (each window separately), similar to the original DxgiTakeSnapshot application. This might be worth a separate post, but for now I just want to mention a quick hack with the -Video command line argument.

The hack uses the API, captures the audio currently being played as well, and records the audiovisual stream into an MP4 file until you stop it with Control+Break.

For example, this is how I recorded the video below: by recording a browser window with YouTube playback. I started the app first to quickly identify the HWND of interest.

C:\>DxgiTakeWindowSnapshot.exe -Process msedge.exe
DxgiTakeWindowSnapshot.exe 20210717.1-16-g34e6c97 (Release)
34e6c972dad5568347d44e58ab0338b7daa1dba7
HEAD -> video, origin/video
2021-11-20 21:07:46 +0200
--
Found process 023228 "C:\Program Files (x86)\Microsoft\Edge\Application\msedge.exe"
Found 2 windows for process 023228
Trying to capture window 0x0000000000A608E0, Filatov & Karas - ????? ? ?????? (Live @????????? ) - YouTube - Personal - Microsoft??? Edge
Trying to capture window 0x00000000016D066E, Filatov & Karas - Don???t Be So Shy (RuSongTV - Turkey) - YouTube and 3 more pages - Personal - Microsoft??? Edge
Found process 013944 "C:\Program Files (x86)\Microsoft\Edge\Application\msedge.exe"
Found 0 windows for process 013944

C:\>DxgiTakeWindowSnapshot.exe -Window 0x0000000000A608E0 -Video
DxgiTakeWindowSnapshot.exe 20210717.1-16-g34e6c97 (Release)
34e6c972dad5568347d44e58ab0338b7daa1dba7
HEAD -> video, origin/video
2021-11-20 21:07:46 +0200
--
Trying to capture window 0x0000000000A608E0, Filatov & Karas - ????? ? ?????? (Live @????????? ) - YouTube - Personal - Microsoft??? Edge
Stopping: CTRL_C_EVENT

(see next post on why video below is blocked here)

Original content: Filatov & Karas – ????? ? ?????? (Live @????????? )

The application currently hardcodes the produced video as 1920×1080@60 with a 10 Mbps bitrate, and is subject to certain limitations because of the quick and hacky implementation:

  • as mentioned, 1920×1080@60 with 10 Mbps video bitrate (& max documented bitrate for the stock AAC audio encoder)
  • video might be letterboxed to preserve aspect ratio (default behavior of Media Foundation XVP)
  • a compatible Media Foundation video encoder is expected to be discoverable as the default one (hardware encoding is used as designed)
  • a pretty much regular audio device is expected; certain not so usual settings like 5.1 audio as shared format are likely to cause exceptions
  • the application picks the default multimedia audio device for recording (actual code line is auto const Device = DefaultAudioEndpoint(eRender, eMultimedia);)
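To illustrate the letterboxing item in the list above, here is a minimal sketch of an aspect ratio preserving fit of a captured window into the fixed 1920×1080 frame. This is not the code of the Media Foundation XVP or of the application; Rect and FitLetterbox are hypothetical names used for illustration only.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type, for illustration only
struct Rect { uint32_t X, Y, Width, Height; };

// Fits a SourceWidth x SourceHeight picture into a TargetWidth x TargetHeight frame,
// preserving aspect ratio and centering the picture; the rest of the frame is black bars
inline Rect FitLetterbox(uint32_t SourceWidth, uint32_t SourceHeight, uint32_t TargetWidth, uint32_t TargetHeight)
{
    assert(SourceWidth && SourceHeight);
    // Compare aspect ratios by cross multiplication to stay in integer arithmetic
    uint64_t const A = static_cast<uint64_t>(SourceWidth) * TargetHeight;
    uint64_t const B = static_cast<uint64_t>(SourceHeight) * TargetWidth;
    Rect Result { 0, 0, TargetWidth, TargetHeight };
    if(A > B)
    {
        // Source is relatively wider: full width, bars at the top and bottom
        Result.Height = static_cast<uint32_t>(B / SourceWidth);
        Result.Y = (TargetHeight - Result.Height) / 2;
    }
    else if(A < B)
    {
        // Source is relatively taller: full height, bars on the left and right
        Result.Width = static_cast<uint32_t>(A / SourceHeight);
        Result.X = (TargetWidth - Result.Width) / 2;
    }
    return Result;
}
```

With these assumptions a 1280×1024 window lands in a 1350×1080 area with 285 pixel bars on each side, and a 2560×1080 window lands in a 1920×810 area with 135 pixel bars above and below.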

So the one specific purpose the application is good for is meeting recording: record a window with a Google Meet session, Microsoft Teams or the like, and you have a convenient copy of the content to review or share.

As of now the application records only the audio playback device, not the microphone; I will get to this some time later.

If you want something more convenient, there is also Robert’s Simple Screen Recorder in the Windows Store; it records video only, but its source code is available on GitHub.

DxgiTakeWindowSnapshot uses Media Foundation the old-school way: it creates a Microsoft Media Foundation Media Session pipeline with a few custom primitives and some stock ones (esp. the video encoder, video processor and audio encoder), and takes all this stuff off to record media…

Download links

Binaries:

Virtual Camera API in Windows 11 (Build 22000)

There is a new API coming with Windows 11. Finally we will get a well-defined way to register virtual cameras (perhaps for applications built against the Windows Media Foundation API, not DirectShow): MFCreateVirtualCamera.

Creates a virtual camera object which can be used by the caller to register, unregister, or remove the virtual camera from the system.

The frame server reference is a good sign and suggests that an application might be able to register its own implementation; a system-wide service would then act as a proxy and expose the implementation to video capture applications built to work with cameras.

MFVirtualCameraType_SoftwareCameraSource “The virtual camera is a software camera source.”

There already is a sample on GitHub for this API: Windows-Camera/Samples/VirtualCamera at master · microsoft/Windows-Camera (github.com)

See also:

Some other interesting things are also coming, e.g. “virtual audio device that supports audio loopback based on a process ID instead of the device interface path of a physical audio device” (AUDIOCLIENT_ACTIVATION_TYPE_PROCESS_LOOPBACK and friends). We will be able to re-capture individual process audio, which is a cool new feature, but be patient: the new stuff is scheduled for Windows 10 Build 20348.

PlayReady DRM in StreamingServer application via MSE/EME

One more video streaming scenario is added to the PoC StreamingServer application: the ability to stream DRM-enabled content with playback via the Media Source Extensions (MSE) and Encrypted Media Extensions (EME) interfaces, with frame-by-frame data appending in JavaScript.

DRM scenarios are typically handled by JavaScript streaming media players such as castLabs PRESTOplay for Web Apps, where the JavaScript player brings everything together: the player, HTML MSE/EME, support for streaming media formats, player experience, advanced features, visual styling, DRM server integration.

In contrast, StreamingServer generates a PlayReady-protected video stream live, and the client side handles EME in vanilla JavaScript.

Microsoft Edge, navigated to http://localhost/hls/playready-A.html, loads a page with the code, which fetches live-generated video frame by frame from an application integrated with the web server, then feeds the data into an HTML5 video element, handling EME events as they appear.

HTML5 MSE/EME PlayReady DRM aware player

The video above, of course, does not show the video itself because DRM-enabled content is rendered with restrictions: no snapshots are allowed, and APIs like Desktop Duplication and Windows.Graphics.Capture have the respective regions blacked out. Simple Screen Recorder, used to take the video above, shows black where the content is physically visible.

Download links

Binaries:

  • 64-bit: StreamingServer.exe (in .ZIP archive)
  • License: This software is free to use; builds have time based expiration

WebCodecs in StreamingServer for JavaScript H.264 decoding

One more small addition to the StreamingServer showcase/development application: verification of WebCodecs API video streaming. The WebCodecs API offers browser applications video decoding capabilities:

The WebCodecs API gives web developers low-level access to the individual frames of a video stream and chunks of audio. It is useful for web applications that require full control over the way media is processed. For example, video or audio editors, and video conferencing.

The API ships starting with Chrome version 94 (explainer is here). In a nutshell, JavaScript code can handle raw uncontainerized video data and convert it into video frames which can, in particular, be drawn on an HTML canvas. This provides a lower-level video decoding capability compared to Media Source Extensions (MSE): the video stream does not need to be containerized, yet the browser provides an interface into hardware-accelerated video decoding for efficient video data processing.

StreamingServer now handles two types of requests in its HTTP/HTTPS interface: /webcodecs-videodecoder-A.html, with JavaScript code controlling the WebCodecs API for decoding followed by rendering of the obtained frames on a timer callback, and /webcodecs-videodecoder-A?frame=, which sends an individual video frame H.264-encoded on the fly. Altogether, the code simulates video playback, receiving H.264 frames from the HTTP server one by one.

The setup is a proof of concept: it generates and encodes the full frame set on the original request, without actual per-frame on-demand encoding, so be aware of this if you happen to request a long sequence.

To check things out, have StreamingServer started and open Chrome Canary version 94+¹, then navigate to one of the following:

  • http://localhost/hls/webcodecs-videodecoder-A.html
  • http://localhost/hls/webcodecs-videodecoder-A.html?FrameSizeW=720&FrameSizeH=480&FrameRateN=30000&FrameRateD=1001&SegmentDuration=15

The second URL shows the available parameters for video encoding. The JavaScript code can be inspected directly from Chrome’s Developer Tools.
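The FrameRateN and FrameRateD parameters form a rational frame rate, so 30000/1001 is the usual NTSC rate of about 29.97 fps. Below is a minimal sketch of how such a query string could be split and how frame timestamps follow from the rational rate, using 100 ns units as Media Foundation does. The helper names ParseQuery, FrameTime and FrameCount are hypothetical, and the rounding in FrameCount is an assumption; the actual server code might behave differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>

// Splits "a=1&b=2" into name/value pairs; there is no percent-decoding,
// which is sufficient for the numeric parameters in question
inline std::map<std::string, std::string> ParseQuery(std::string const& Query)
{
    std::map<std::string, std::string> Result;
    size_t Position = 0;
    while(Position < Query.size())
    {
        size_t End = Query.find('&', Position);
        if(End == std::string::npos)
            End = Query.size();
        std::string const Pair = Query.substr(Position, End - Position);
        size_t const Separator = Pair.find('=');
        if(Separator != std::string::npos)
            Result[Pair.substr(0, Separator)] = Pair.substr(Separator + 1);
        else if(!Pair.empty())
            Result[Pair] = std::string();
        Position = End + 1;
    }
    return Result;
}

// Start time of frame Index in 100 ns units for the rational frame rate N/D
inline int64_t FrameTime(uint64_t Index, uint32_t N, uint32_t D)
{
    assert(N && D);
    return static_cast<int64_t>(Index * 10000000ull * D / N);
}

// Number of frames that start within DurationSeconds (rounding is an assumption)
inline uint64_t FrameCount(uint32_t DurationSeconds, uint32_t N, uint32_t D)
{
    assert(N && D);
    return (static_cast<uint64_t>(DurationSeconds) * N + D - 1) / D;
}
```

With the parameters of the second URL, frame 30 starts at 10010000 (1.001 s), and a 15 second segment holds 450 frames under this rounding.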

¹ Microsoft Edge self-updated today to Version 94.0.992.31 and it has WebCodecs API available in it as well!

Download links

Binaries:

  • 64-bit: StreamingServer.exe (in .ZIP archive)
  • License: This software is free to use; builds have time based expiration

MPEG-DASH trick play adaptation set

Just a small addition to the MPEG-DASH server: a separate trick play video track at 1 fps consisting of IDR frames only.

The “trick mode” itself is essentially this:

3.2.9. Trick Mode Support

Trick Modes are used by DASH clients in order to support fast forward, seek, rewind and other operations in which typically the media, especially video, is displayed in a speed other than the normal playout speed. In order to support such operations, it is recommended that the content author adds Representations at lower frame rates in order to support faster playout with the same decoding and rendering capabilities.

However, Representations targeted for trick modes are typically not be suitable for regular playout.

The application extends its manifest with an additional “trick” video track when the requested URL is http://localhost/hls/manifest.mpd?trickplay
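A track of IDR frames only means that every access unit can be decoded on its own, which is what lets a player jump between frames during fast forward or rewind. As a rough illustration, here is a minimal sketch of checking whether an encoded H.264 frame carries an IDR slice. It assumes Annex B byte stream formatting (start code prefixed NAL units); whether StreamingServer uses this formatting internally is not something the post states, and ContainsIdrSlice is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Returns true if an H.264 Annex B buffer contains an IDR slice NAL unit (nal_unit_type 5)
inline bool ContainsIdrSlice(uint8_t const* Data, size_t Size)
{
    assert(Data || !Size);
    for(size_t Index = 0; Index + 3 < Size; Index++)
    {
        // A start code is 00 00 01; the four byte form 00 00 00 01 ends with the same three bytes
        if(Data[Index] == 0x00 && Data[Index + 1] == 0x00 && Data[Index + 2] == 0x01)
        {
            // The low five bits of the byte after the start code are nal_unit_type
            uint8_t const NalUnitType = Data[Index + 3] & 0x1F;
            if(NalUnitType == 5)
                return true;
            Index += 2; // together with the loop increment this moves to the NAL header byte
        }
    }
    return false;
}
```

A frame that starts with SPS and PPS followed by an IDR slice yields true, while a frame with a regular non-IDR slice (nal_unit_type 1) yields false.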

Download links

Binaries:

  • 64-bit: StreamingServer.exe (in .ZIP archive)
  • License: This software is free to use; builds have time based expiration

MPEG-DASH content in StreamingServer application

MPEG-DASH is the ISO/IEC 23009 “Dynamic Adaptive Streaming over HTTP” specification. It is widely used to stream audiovisual content over the internet, as opposed to playback of static content such as a downloaded clip.

The StreamingServer application I published some time ago generated test content using HTTP Live Streaming protocol, which is, well, similar.

So I extended StreamingServer a bit and made it expose the media as MPEG-DASH content as well. The feature set is way narrower than in the case of HLS: it’s just a VOD asset, but a somewhat sophisticated one, multi-period with three periods and a not so obvious internal layout. Experimental, sort of.

I will use the space of this post to document steps to enable playback of this content.

Once again, what does the application do in the first place? Once started, the application (or a service, if converted to run as a Windows service) hooks into the Windows HTTP Server API (so you might need to run it with elevated privileges) and extends the built-in web server by providing content. If executed with no arguments, it attaches to the http://localhost/hls/ node and is ready to serve http://localhost/hls/master.m3u8 for HLS playback, and now also http://localhost/hls/manifest.mpd for MPEG-DASH playback. http://localhost/hls/about has some embedded documentation.

Serving the requests, the application prepares audio and video content on the fly. For video it leverages the NVIDIA GPU hardware video encoder if available, but it also has a fallback code path that uses the Microsoft software encoder. The application is not designed for concurrent access by multiple clients, and of course real-time video encoding has its own capacity limits too. The application is rather a verification tool; internally it runs a few Microsoft Media Foundation pipelines (media sessions) for various things: to obtain RFC 6381 “codecs” data, initialization and media segments etc.
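For H.264 video, the RFC 6381 “codecs” value typically has the form avc1.PPCCLL, where the three hexadecimal bytes are profile_idc, the constraint flags byte and level_idc taken from the sequence parameter set. Below is a minimal sketch of this formatting; it assumes the avc1 sample entry and an SPS NAL unit without a start code, FormatAvcCodecs is a hypothetical name, and it is not meant to describe how StreamingServer obtains the value.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Formats "avc1.PPCCLL" from profile_idc, the constraint flags byte and level_idc
inline std::string FormatAvcCodecs(uint8_t ProfileIdc, uint8_t ConstraintFlags, uint8_t LevelIdc)
{
    char Text[16];
    std::snprintf(Text, sizeof Text, "avc1.%02X%02X%02X",
        static_cast<unsigned>(ProfileIdc), static_cast<unsigned>(ConstraintFlags), static_cast<unsigned>(LevelIdc));
    return Text;
}

// Same, taking an SPS NAL unit without start code: byte 0 is the NAL unit header,
// bytes 1..3 are profile_idc, constraint flags and level_idc
inline std::string FormatAvcCodecs(uint8_t const* Sps, size_t Size)
{
    assert(Sps && Size >= 4 && (Sps[0] & 0x1F) == 7); // nal_unit_type 7 is SPS
    return FormatAvcCodecs(Sps[1], Sps[2], Sps[3]);
}
```

For example, High profile at level 4.0 without constraint flags gives avc1.640028, and an SPS starting with bytes 67 42 E0 1E gives avc1.42E01E.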

To play an MPEG-DASH asset, perhaps the most popular player is Shaka Player, which has a convenient online demo. It has a custom content section where the manifest URL http://localhost/hls/manifest.mpd can be added for playback.

One problem here is CORS with security and permissions for browser code. The demo runs over HTTPS and so it can’t consume an HTTP media asset. To work around this, StreamingServer needs to be started with these command line switches to register on both the HTTP and HTTPS nodes of the web server.

StreamingServer.exe -Location http://+:80/hls/ -Location https://+:443/hls/

In order to use the application non-locally over HTTPS you might need to configure IIS first and add a certificate there. A self-signed certificate works fine as long as you trust it on the client side.

What happens next? We are good to go.

The blue, green and red parts represent separate periods which are stitched smoothly during playback (it is easy to see what’s inside by downloading the manifest and opening it in your favorite XML editor).

Some more perks:

The rest of the properties of video and audio are hardcoded for MPEG-DASH.

Further reading:

Download links

Binaries:

  • 64-bit: StreamingServer.exe (in .ZIP archive)
  • License: This software is free to use; builds have time based expiration