Lync (Skype for Business) and H.264

A blog reader asked an interesting question about H.264 support in Lync. There have been a number of announcements of H.264 integration in multiple pieces of software:

Fantastic news! I will just leave the links here (perhaps not all of them are even relevant) for the time when I want to revisit this pile of issues (or, hopefully, that never happens). Some of them are from quite a while back. At present, however, people are having a really hard time getting these pieces to work with one another:

Let’s just make one thing clear. I do have a Windows 8.1 system, which is supposed to pick up UVC 1.5 compliant cameras, and I do have the “1st of New Generation of UVC 1.5 H.264 WebCams”, a Logitech Webcam C930e.

More than that, I do see that it does do H.264 (in a weird way, but still – and the camera itself is really cool, I like it!):

H.264 Video Capture with Logitech C930e

As a user, I expect Lync 2013 (Skype for Business 2015) to somehow figure the H.264 support out and do the right thing out of the box. This is what the original question I was asked was about. What I do see – and this is what I was able to “touch” myself to avoid guesswork – is that the preview video feed from the device is NOT using H.264:

Video Preview with Lync 2013 and Logitech C930e

Well, maybe it leverages H.264 in a real conference? Not sure.

Perhaps Skype can capture H.264? Oops, no it cannot (this is from a real video call session):

Skype Video Call and Logitech C930e

Okay, it simply does not all work yet. My guess is that this worked with the Windows 7 style of H.264 webcams, and then the way Windows 8 offered support for UVC 1.5 devices broke it (if it ever worked at all!).

Reference signal source for DirectShow

Every so often there are tasks that need certain reference video or video/audio footage with specific properties: resolution, frame rate, frame accuracy with content identifying the specific frame, motion in view, an amount of motion that is “hard” for an encoder tuned for natural video, specific video to audio synchronization.

There is, of course, some content available, and sometimes it’s really nice:

Bipbopbipbop video on Youtube

However, once in a while you need 59.94 fps and not 60; another time you’d go with 50, so that millisecond times are well aligned and every second has an equal number of frames; then next time you need a specific aspect ratio override; and then you’d prefer a longer clip to a short one.
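The 59.94 vs. 50 fps point above can be illustrated with simple arithmetic over the 100 ns units DirectShow timestamps use (a plain C++ sketch; `FrameDuration` and `IsExactFrameDuration` are illustrative helper names, not a library API):

```cpp
#include <cstdint>

// DirectShow timestamps (REFERENCE_TIME) count 100 ns units
constexpr int64_t UNITS = 10000000; // 100 ns units per second

// Average frame duration for a rational frame rate of
// Numerator/Denominator frames per second; integer division truncates
constexpr int64_t FrameDuration(int64_t Numerator, int64_t Denominator)
{
    return UNITS * Denominator / Numerator;
}

// The frame duration is exact only when the division leaves no remainder
constexpr bool IsExactFrameDuration(int64_t Numerator, int64_t Denominator)
{
    return (UNITS * Denominator) % Numerator == 0;
}
```

50 fps yields exactly 200,000 units (20 ms) per frame, so second and millisecond boundaries land precisely on frame boundaries; NTSC-style 59.94 fps (60000/1001) truncates to 166,833 units per frame and cannot be represented exactly in 100 ns units.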

I converted one of my reference signal sources into DirectShow filters, which can be used to produce an infinite signal, or to generate a file of a specific format with specific properties.

The filters are Reference Video Source and Reference Audio Source, regular filters registered in a separate category (not virtual video/audio source devices – yet?), available for instantiation programmatically or in GraphStudioNext/GraphEdit.

DirectShowReferenceSource filters in GraphStudio

The filters come in both 32- and 64-bit versions, with hardcoded default properties (yet?): 1280×720@50 32-bit RGB for video and 16-bit PCM mono at 48 kHz for audio. Programmatically, however, the filters can be tuned flexibly using an IAMStreamConfig::SetFormat call:

  • Video:
    • Any resolution
    • 32-bit RGB, top-to-bottom only (the filter internally uses Direct2D/WIC to generate the images)
    • Any positive frame rate
    • Aspect ratio can be overridden using VIDEOINFOHEADER2 format, e.g. to force SD video to be 4:3
  • Audio:
    • Any sample rate
    • 16-bit PCM or 32-bit IEEE floating point format
    • Mono
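For illustration, a hedged sketch of such programmatic tuning (Windows-only; assumes the filter is already instantiated and `pVideoOutputPin` is its video output pin; whether the filter wants positive or negative biHeight for its top-to-bottom RGB is not verified here):

```cpp
#include <windows.h>
#include <dshow.h>
#include <atlbase.h>

// Sketch: switch the reference video source to a custom resolution, frame
// rate, and a forced 4:3 picture aspect ratio via VIDEOINFOHEADER2
HRESULT ConfigureVideo(IPin* pVideoOutputPin, LONG nWidth, LONG nHeight, REFERENCE_TIME nFrameDuration)
{
    CComQIPtr<IAMStreamConfig> pStreamConfig = pVideoOutputPin;
    if(!pStreamConfig)
        return E_NOINTERFACE;
    VIDEOINFOHEADER2 VideoInfoHeader2 { };
    VideoInfoHeader2.AvgTimePerFrame = nFrameDuration; // 100 ns units per frame
    VideoInfoHeader2.dwPictAspectRatioX = 4; // aspect ratio override, e.g. SD as 4:3
    VideoInfoHeader2.dwPictAspectRatioY = 3;
    VideoInfoHeader2.bmiHeader.biSize = sizeof VideoInfoHeader2.bmiHeader;
    VideoInfoHeader2.bmiHeader.biWidth = nWidth;
    VideoInfoHeader2.bmiHeader.biHeight = nHeight;
    VideoInfoHeader2.bmiHeader.biPlanes = 1;
    VideoInfoHeader2.bmiHeader.biBitCount = 32; // 32-bit RGB only
    VideoInfoHeader2.bmiHeader.biCompression = BI_RGB;
    AM_MEDIA_TYPE MediaType { };
    MediaType.majortype = MEDIATYPE_Video;
    MediaType.subtype = MEDIASUBTYPE_RGB32;
    MediaType.formattype = FORMAT_VideoInfo2;
    MediaType.bFixedSizeSamples = TRUE;
    MediaType.lSampleSize = nWidth * nHeight * 4;
    MediaType.cbFormat = sizeof VideoInfoHeader2;
    MediaType.pbFormat = reinterpret_cast<BYTE*>(&VideoInfoHeader2);
    return pStreamConfig->SetFormat(&MediaType);
}
```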

The video filter generates an image sequence with properties burnt in: frame number, 100ns time, time with frame number within the second, and a circle with a sector filled to reflect the current sub-second time. There is Uh/Oh text inside the circle at sharp second times, and the background color is in continuous transition between colors.

The audio filter beeps during the first 100 ms of every second, with a different tone for every fifth and every tenth second.

DirectShowReferenceSource filters running in GraphStudio

Both filters support the IAMStreamControl interface, and the IAMStreamControl::StopAt method in particular, which allows limiting signal duration and can be used for time-accurate file creation.
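A hedged sketch of such duration limiting (Windows-only; assumes the output pins of both source filters are already obtained; see the IAMStreamControl::StopAt documentation for the bSendExtra semantics):

```cpp
#include <windows.h>
#include <dshow.h>
#include <atlbase.h>
#include <initializer_list>

// Sketch: schedule both generated streams to stop at exactly nStopTime,
// so a written file has a time-accurate duration
HRESULT StopStreamsAt(IPin* pVideoOutputPin, IPin* pAudioOutputPin, REFERENCE_TIME nStopTime)
{
    for(IPin* pPin : { pVideoOutputPin, pAudioOutputPin })
    {
        CComQIPtr<IAMStreamControl> pStreamControl = pPin;
        if(!pStreamControl)
            return E_NOINTERFACE;
        // bSendExtra FALSE: end of stream is delivered right at the stop time
        const HRESULT nResult = pStreamControl->StopAt(&nStopTime, FALSE, 0);
        if(FAILED(nResult))
            return nResult;
    }
    return S_OK;
}
```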

This comes with a sample project that demonstrates ASF file generation with specific properties and duration. The output file generated by the sample is Output.asf.

The ASF file format and WM ASF Writer are chosen for code brevity and to reference a stock multiplexer. This has the downside that the multiplexer re-scales video to the profile’s resolution and frame rate, of course. Those interested in generating their own content would perhaps use something like their favorite H.264 and AAC encoders with an MP4 or MKV multiplexer. A nicer output would then look like Output.mp4.

A good thing about publishing these filters is that while preparing the test project, I hit a thread safety bug in the GDCL MP4 multiplexer, which is presumably present in all/most versions of mp4mux.dll out there: if the filter graph is stopped during video streaming, before end-of-stream is reached on the video leg (which is often the case, because the upstream connection would be an H.264 encoder whose internal queue of frames is still draining on worker threads while the stop request is processed), the multiplexer might generate a memory access violation by trying to use a NULL track object which is already gone.

Download links


CLSID_FilterGraphNoThread and IMediaEvent::WaitForCompletion

An interesting find about the CLSID_FilterGraphNoThread version of the DirectShow filter graph implementation is that its WaitForCompletion method, available through the IMediaEvent and IMediaEventEx interfaces, is not implemented: the immediately returned value is E_NOTIMPL.

CLSID_FilterGraphNoThread itself is not well documented and is rather a spin-off of the baseline implementation (my guess would be that it was introduced at some later stage and was necessary for windowless filter graphs running on worker threads; perhaps the Windows Media Player team requested this at some point of WMP development):

… creates the Filter Graph Manager on the application’s thread. If you use this CLSID, the thread that calls CoCreateInstance must have a message loop that dispatches messages; otherwise, deadlocks can occur. Also, before the application thread exits, it must release the Filter Graph Manager and all graph objects (such as filters, pins, reference clocks, and so forth).

This filter graph does not have its internal worker window and is unable to wait for completion in the usual manner. Hence the error. Those who use the CLSID_FilterGraphNoThread version of the filter graph are supposed to handle EC_COMPLETE events to detect completion.
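A hedged sketch of what that handling could look like (Windows-only; a simple poll-and-pump loop on the thread that created the graph, since CLSID_FilterGraphNoThread requires that thread to dispatch messages):

```cpp
#include <windows.h>
#include <dshow.h>
#include <atlbase.h>

// Sketch: run a CLSID_FilterGraphNoThread graph to completion by watching
// for EC_COMPLETE instead of calling IMediaEvent::WaitForCompletion
HRESULT RunToCompletion(IGraphBuilder* pFilterGraph)
{
    CComQIPtr<IMediaControl> pMediaControl = pFilterGraph;
    CComQIPtr<IMediaEvent> pMediaEvent = pFilterGraph;
    if(!pMediaControl || !pMediaEvent)
        return E_NOINTERFACE;
    const HRESULT nRunResult = pMediaControl->Run();
    if(FAILED(nRunResult))
        return nRunResult;
    for(; ; )
    {
        // The creating thread must dispatch messages, per the documentation
        MSG Message;
        while(PeekMessage(&Message, NULL, 0, 0, PM_REMOVE))
            DispatchMessage(&Message);
        long nEventCode;
        LONG_PTR nParameter1, nParameter2;
        if(SUCCEEDED(pMediaEvent->GetEvent(&nEventCode, &nParameter1, &nParameter2, 100)))
        {
            pMediaEvent->FreeEventParams(nEventCode, nParameter1, nParameter2);
            if(nEventCode == EC_COMPLETE)
                break; // completion detected the manual way
        }
    }
    return pMediaControl->Stop();
}
```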

I suppose this is not intended, and not even foreseen, behavior. But who uses CLSID_FilterGraphNoThread anyway, except Windows Media Player and those who can cope with the problem?

IMediaObject::Discontinuity while Windows Media Video 9 Encoder has data to process

This is presumably a bug in Windows Media Video 9 Encoder in versions up to and including Windows 7 (fixed in Windows 8.1 at the very least – wmvencod.dll 6.3.9600.17415).

An IMediaObject::Discontinuity call destroys input the DMO already holds: it reports success and handles the discontinuity correctly. It even drains output as it should. But if at the same time it already has input to process, that input is gone, and the typical outcome is that a frame (or possibly more?) at the end of the stream is trimmed away.

The call itself is legal and reports S_OK. The method should have returned DMO_E_NOTACCEPTING if it was too early to report a discontinuity, but the DMO does not do that.
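A possible workaround sketch for the affected versions (Windows-only): drain buffered data through ProcessOutput before signaling the discontinuity, so the DMO has nothing left to lose. `ProcessPendingOutput` is a hypothetical helper, not a DMO API; the S_FALSE convention for "no more output" is an assumption of this sketch.

```cpp
#include <windows.h>
#include <dmo.h>

// Hypothetical helper: calls IMediaObject::ProcessOutput once with suitable
// buffers and consumes whatever is produced; assumed to return S_FALSE when
// the DMO produced no more output
HRESULT ProcessPendingOutput(IMediaObject* pMediaObject);

// Sketch: drain the DMO first, then report the discontinuity
HRESULT SafeDiscontinuity(IMediaObject* pMediaObject, DWORD nInputStreamIndex)
{
    for(; ; )
    {
        const HRESULT nResult = ProcessPendingOutput(pMediaObject);
        if(FAILED(nResult))
            return nResult;
        if(nResult == S_FALSE)
            break; // nothing buffered anymore, discontinuity is safe now
    }
    return pMediaObject->Discontinuity(nInputStreamIndex);
}
```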

The good news is that it’s fixed in the most recent version, so it is not a cold case.

GDCL MPEG-4 filters update

As mentioned recently in the DirectShowSpy media sample trace update, I uploaded a fork of the MPEG-4 filters developed by Geraint Davies, which includes a few updates made over time. They are worth mentioning in a separate post:

  • the projects received type libraries, and it is now easier to reference the filters by #importing the type library into your project, with class and interface definitions
  • added support for seeking to key frame, positioning loop without thread re-creation, ability to query for all frame times in the file including frame flags – all together this makes a base for playback with good scrubbing performance and user experience
  • fixed a few bugs (the resulting file might not be playable in WMP; seeking, esp. when scrubbing, results in media samples offset by a small amount of time; maybe others)
  • ability to adjust the multiplexer memory allocator and its properties
  • support for raw video formats, so that the MP4 container can be used as temporary storage for raw captured video (a good choice because the container format is not bloated and has no file size constraints, unlike e.g. AVI implementations)
  • ability to write a secondary helper temporary file to keep important data, which lets one fix broken capture sessions; an incomplete MP4 file is typically a piece of garbage, but with this file it can be restored up to the crash point (think of multi-hour recordings) – this needs to be elaborated, and at some time in the future there could possibly be a tool that does the job
  • last but not least, DirectShowSpy integration; the code is easy to remove or disable because it is put inside #if defined(ALAXINFODIRECTSHOWSPY_AVAILABLE) sections; the integration serves both as an example of how to leverage DirectShowSpy media sample traces and to provide pre-built filters with tracing enabled

Source

Range-based for with ATL Collections

ATL collection classes did not receive an update related to new C++11 range-based for loops and other fancy features. It’s a pity, because writing

for(auto&& pPin: AudioPinList)
  pPin-> //...

compared to

for(POSITION Position = AudioPinList.GetHeadPosition(); Position; AudioPinList.GetNext(Position))
{
  IPin* pPin = AudioPinList.GetAt(Position);
  pPin-> //...
}

is indeed a relief. Namespace std is, of course, good, but ATL is still here as well. Here is a take on adding range-based for loops to ATL collection classes.

  • CRoIterativeCollectionT class is a traits-based uniform base for lists and arrays, adding (again, uniform) helpers to iterate through a collection:
    • GetCountThat, GetThat, RemoveThat, RemoveFirst, RemoveLast
    • Find, FindThat, FindFirst, FindLast, FindFirstThat, FindLastThat
    • ForEach
    • begin, end implementing support for range-based for loops
  • CRoArrayT is a class extending CAtlArray, adding the above through inheritance from CRoIterativeCollectionT
  • CRoListT is a class extending CAtlList, adding the above through inheritance from CRoIterativeCollectionT
  • CRoAssignableListT, CRoAssignableArrayT, CRoAssignableMapT inherit the collections and allow assignment operators on them (through duplication of elements), esp. to make it easy to use the collections as class members eligible for assignment copies
  • CRoFixedArrayT, CRoFixedMapT are compatible collection classes backed by stack memory only, with no allocations (old stuff as is; it was used somewhere in a handler where immediate response assumed no memory allocation operations)
  • CRoMapT is essentially a thin extension of CAtlMap, which however also adds GetPositions() and GetValues() methods that can be used for range-based for loops
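To illustrate the begin/end idea without pulling in ATL, here is a hypothetical, simplified mock of a POSITION-based list with a minimal iterator adapter (CPositionList and Iterator are made-up names standing in for the CAtlList/CRoIterativeCollectionT machinery):

```cpp
#include <cassert>

// A stand-in for CAtlList: a singly linked list iterated the ATL way with
// POSITION, plus begin()/end() so range-based for works against it
template <typename T>
class CPositionList
{
private:
    struct Node
    {
        T Value;
        Node* pNext;
    };

    Node* m_pHead = nullptr;
    Node* m_pTail = nullptr;

public:
    using POSITION = Node*;

    ~CPositionList()
    {
        for(Node* pNode = m_pHead; pNode; )
        {
            Node* pNextNode = pNode->pNext;
            delete pNode;
            pNode = pNextNode;
        }
    }
    void AddTail(const T& Value)
    {
        Node* pNode = new Node { Value, nullptr };
        (m_pTail ? m_pTail->pNext : m_pHead) = pNode;
        m_pTail = pNode;
    }
    POSITION GetHeadPosition() const { return m_pHead; }
    // ATL-style: returns the element at Position and advances Position
    T& GetNext(POSITION& Position) const
    {
        T& Value = Position->Value;
        Position = Position->pNext;
        return Value;
    }
    T& GetAt(POSITION Position) const { return Position->Value; }

    // Minimal iterator adapter over POSITION for range-based for
    class Iterator
    {
    public:
        Iterator(POSITION Position) : m_Position(Position) { }
        bool operator != (const Iterator& That) const { return m_Position != That.m_Position; }
        T& operator * () const { return m_Position->Value; }
        Iterator& operator ++ () { m_Position = m_Position->pNext; return *this; }
    private:
        POSITION m_Position;
    };

    Iterator begin() const { return Iterator(m_pHead); }
    Iterator end() const { return Iterator(nullptr); }
};
```

With this in place, `for(auto&& Value : List)` walks the same nodes the classic GetHeadPosition/GetNext loop would; the real CRoListT applies the same adapter on top of CAtlList instead of a private node list.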

Download

DirectShowSpy: Media Sample Traces

Overview

DirectShow filters pass media samples (portions of data) through graphs, and the details of how exactly the streaming happens are important for debugging, troubleshooting, and development of DirectShow graphs. A developer needs a clear understanding of the parts of the streaming process, the importance of which increases with multiple streams, threads, parallelization, cut-off times, and multiple graphs working simultaneously.

The details of streaming are typically hidden from the top-level control over the graph: the application controls the state of the process overall, and the filters are on their own sending data through.

DirectShowSpy provides an API for filters to register media samples as well as other details of the streaming process, including comments and application-defined emphasis (highlighting); it stores the traces and provides UI to review and export them for analysis and troubleshooting.

Similar tracing is implemented by the GraphStudioNext Analyzer Filter.

05

DirectShowSpy trace is different in several ways:

  1. DirectShowSpy is a drop-in module and adds troubleshooting capabilities to an already built and existing application; in particular, it is suitable for temporary troubleshooting in a production environment
    • DirectShowSpy offers tracing for filters which are private and not registered globally
    • DirectShowSpy tracing better reproduces the production application environment
  2. DirectShowSpy allows supplementary application-defined comments, which are registered chronologically along with the media sample traces
    • it is possible to trace not only at filter boundaries/granularity, but also internal events and steps
  3. DirectShowSpy combines tracing from multiple graphs and multiple processes, and presents them in a single log

DirectShowSpy media sample tracing is a sort of logging capability implemented with small overhead. The traces reside in RAM, backed by the paging file, and are automatically released with the release and destruction of the filter graph. The important exception, however, is the media sample tracing UI of DirectShowSpy. While the UI is active, (a) each manual refresh of the view and (b) each destruction of a filter graph in an analyzed process makes the UI add a reference to the trace data, and the data lifetime is extended up to the closing of the DirectShowSpy UI.

An important consequence of the behavior mentioned above is that media tracing data might outlive the processes that host the filter graphs. As long as the UI is active, processes, including terminated ones, expose their media sample traces for interactive review.

Basically, the feature is to review the details of a streaming session, obtained from the filters which registered the respective events. For example, this filter graph

01

has two filters, MPEG-4 demultiplexer and multiplexer, which register streaming events. Because the trace is chronological, it in particular allows seeing how the “Stuff” filter is doing the processing: threads, timings. If the “Stuff” filter registers its own events, the picture becomes even more complete.

02

Using

To leverage media sample traces, a filter developer obtains the ISpy interface from the filter graph (which succeeds when DirectShowSpy is registered and hooked between the application and the DirectShow API) and creates an IMediaSampleTrace interface using an ISpy::CreateMediaSampleTrace call. An example of such integration is shown in a fork of the GDCL MPEG-4 filters here, in the DemuxOutputPin::Active method.

It does not matter whether filters and pins share IMediaSampleTrace pointers. Each CreateMediaSampleTrace call creates a new trace object, which is thread safe on its own, and the data is combined in the UI from all sources of tracing anyway.

With no DirectShowSpy registered, the QueryInterface for ISpy fails, and this is the only expense/overhead of the media sample tracing integration in production code.
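A hedged sketch of this integration pattern (Windows-only; the ISpy and IMediaSampleTrace declarations come from the DirectShowSpy type library, e.g. via #import, and the exact CreateMediaSampleTrace signature below is an assumption; see the DemuxOutputPin::Active example for the authoritative version):

```cpp
#include <windows.h>
#include <dshow.h>
#include <atlbase.h>

// ISpy, IMediaSampleTrace are DirectShowSpy interfaces, assumed to be
// brought in from its type library; signatures here are illustrative
void CreateTrace(IFilterGraph* pFilterGraph, IMediaSampleTrace** ppMediaSampleTrace)
{
    *ppMediaSampleTrace = nullptr;
    // When DirectShowSpy is not registered this QueryInterface simply fails,
    // tracing silently stays disabled, and that is the only overhead
    CComQIPtr<ISpy> pSpy = pFilterGraph;
    if(!pSpy)
        return;
    pSpy->CreateMediaSampleTrace(ppMediaSampleTrace);
}
```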

A developer is typically interested in registering the following events:

  • Segment starts, media sample receives and deliveries, end-of-stream events; DirectShowSpy introduces respective methods on the IMediaSampleTrace interface: RegisterNewSegment, RegisterMediaSample, RegisterEndOfStream
  • Application defined comments, using RegisterComment method

All methods automatically track process and thread information associated with the source of the event. Other parameters include:

  • filter interface
  • an abstract stream name, which is a string value and can be anything; typically it makes sense to use the pin name/type, or it can be the pin name with an appended stage of processing if the developer wants to track processing steps as they happen in the filter; the UI offers filtering by stream values, and a separate column in the exported text, so that a filter can be applied in spreadsheet software such as Excel when reviewing the log
  • user defined comment and highlighting option

The RegisterMediaSample method can be used for anything associated with a media sample, not necessarily exactly one event per processing call. The method logs media sample data (it takes an AM_SAMPLE2_PROPERTIES pointer as a byte array pointer) and makes it available for review with its flags and other data.

Comments can be anything and hold supplementary information for events happening in a certain relation to streaming:

03

An application can automatically highlight log entries to draw attention to certain events. For example, if data is streamed out of order and the filter registers the event with highlighting, the entry immediately draws attention during UI review. The interactive user can then change the highlighting interactively as well:

04

The media trace data can be conveniently filtered right in the DirectShowSpy UI, which is invoked by the exported function DoMediaSampleTracePropertySheetModal, and can be copied to the clipboard or saved to a file in tab-separated values format. The file can be opened with Microsoft Excel for further review.

Limitations

  • there is a global limit on in-memory trace storage; there is no specific size in samples (it’s 8 MB for the global registry of data), and the storage is large enough to hold the streaming of a movie with multiple tracks; however, once in a while it is possible to hit the limit, and there is no automatic recycling of data: the data is released when the last copy of the UI is closed and the graphs are released in the context of their respective processes
  • traces are visible from within the same session only; in particular, processes with elevated privileges are “visible” only to a similarly started (elevated) DirectShowSpy UI, and vice versa
  • 32-bit process traces are visible from the 32-bit DirectShowSpy UI, and the same applies to 64 bits; technically it is possible to share them, but this is not implemented

Download links

Additional stuff

A fork of the GDCL MPEG-4 filters has been uploaded to GitHub; it in particular includes the media sample tracing integration and pre-built binaries, in 32- and 64-bit versions.