DirectShowSpy: Media Sample Traces

Overview

DirectShow filters pass media samples (portions of data) through graphs, and the details of how exactly the streaming happens are important for debugging, troubleshooting and development of DirectShow graphs. A developer needs a clear understanding of the parts of the streaming process, and its importance increases with multiple streams, threads, parallelization, cut-off times and multiple graphs working simultaneously.

The details of streaming are typically hidden from the top-level control over the graph: the application controls the state of the process overall, and the filters are on their own when sending data through.

DirectShowSpy provides an API for filters to register media samples as well as other details of the streaming process, including comments and application-defined emphasis (highlighting). It stores the traces and provides UI to review and export them for analysis and troubleshooting.

A similar tracing is implemented by GraphStudioNext Analyzer Filter.


DirectShowSpy trace is different in several ways:

  1. DirectShowSpy is a drop-in module and adds troubleshooting capabilities to an already built and existing application, which also makes it suitable for temporary troubleshooting in a production environment
    • DirectShowSpy offers tracing for filters which are private and not registered globally
    • DirectShowSpy tracing better reproduces the production application environment
  2. DirectShowSpy allows supplementary application-defined comments, which are registered chronologically along with media sample tracing
    • it is possible to trace not only at filter boundaries/granularity, but also internal events and steps
  3. DirectShowSpy combines tracing from multiple graphs and multiple processes, and presents it in a single log

DirectShowSpy media sample trace is a kind of logging capability implemented with small overhead. The traces reside in RAM, backed by the paging file, and are automatically released with the release and destruction of the filter graph. The important exception, however, is the media sample tracing UI of DirectShowSpy. While the UI is active, (a) each manual refresh of the view and (b) each destruction of a filter graph in an analyzed process make the UI add a reference to the trace data, and the data lifetime is extended up to the closing of the DirectShowSpy UI.
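The lifetime behavior can be illustrated with a minimal sketch; the types below are hypothetical stand-ins, not the actual DirectShowSpy implementation:

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-ins for the shared trace storage. The point being
// illustrated: once the UI takes a reference, the trace data lifetime
// extends past the destruction of the graph that produced it.
struct TraceData {
    std::vector<std::string> Events;
};

struct Graph {
    std::shared_ptr<TraceData> Trace = std::make_shared<TraceData>();
};

struct Ui {
    std::vector<std::shared_ptr<TraceData>> Captured;
    // On refresh of the view or destruction of a graph, the UI adds a reference
    void Capture(const Graph& SourceGraph) { Captured.push_back(SourceGraph.Trace); }
};
```

Destroying the graph after a capture leaves the events reviewable through the UI's reference, mirroring the behavior described above.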

An important consequence of the behavior mentioned above is that media sample trace data outlives, or might outlive, the processes that host the filter graphs. As long as the UI is active, processes, including terminated ones, expose their media sample traces for interactive review.

Essentially, the feature lets one review the details of a streaming session as registered by the participating filters. For example, this filter graph


has two filters, an MPEG-4 demultiplexer and a multiplexer, which register streaming events. Because the trace is chronological, it in particular allows seeing how the “Stuff” filter is doing its processing: threads, timings. If the “Stuff” filter registers its own events, the picture becomes even more complete.


Using

To leverage media sample traces, a filter developer obtains the ISpy interface from the filter graph (which succeeds when DirectShowSpy is registered and hooks in between the application and the DirectShow API) and creates an IMediaSampleTrace interface using the ISpy::CreateMediaSampleTrace call. An example of such integration is shown in a fork of GDCL MPEG-4 filters here, in the DemuxOutputPin::Active method.

It does not matter whether filters and pins share IMediaSampleTrace pointers. Each CreateMediaSampleTrace call creates a new trace object, which is thread safe on its own, and the data from all sources of tracing is combined on the UI side anyway.
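A simplified, hypothetical sketch of why sharing pointers is unnecessary: if each trace object stamps its entries from a process-wide sequence counter, entries from any number of independently created trace objects can be merged chronologically afterwards (the types and the counter scheme here are illustrative, not DirectShowSpy internals):

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstdint>
#include <mutex>
#include <string>
#include <vector>

struct TraceEntry {
    uint64_t Sequence;
    std::string Stream, Comment;
};

std::atomic<uint64_t> g_Sequence { 0 };

class MediaSampleTrace {
public:
    void RegisterComment(const std::string& Stream, const std::string& Comment) {
        std::lock_guard<std::mutex> Lock(m_Mutex); // each object is thread safe on its own
        m_Entries.push_back(TraceEntry { g_Sequence++, Stream, Comment });
    }
    std::vector<TraceEntry> Entries() const {
        std::lock_guard<std::mutex> Lock(m_Mutex);
        return m_Entries;
    }
private:
    mutable std::mutex m_Mutex;
    std::vector<TraceEntry> m_Entries;
};

// The reviewing side merges entries from all trace objects by sequence number
std::vector<TraceEntry> Merge(const std::vector<const MediaSampleTrace*>& Traces) {
    std::vector<TraceEntry> All;
    for (auto* Trace : Traces)
        for (auto& Entry : Trace->Entries())
            All.push_back(Entry);
    std::sort(All.begin(), All.end(),
        [] (const TraceEntry& A, const TraceEntry& B) { return A.Sequence < B.Sequence; });
    return All;
}
```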

With no DirectShowSpy registered, the QueryInterface call for ISpy fails, and this is the only expense/overhead of integrating media sample tracing into production code.

A developer is typically interested in registering the following events:

  • Segment starts, media sample receives and deliveries, end-of-stream events; DirectShowSpy introduces respective methods on the IMediaSampleTrace interface: RegisterNewSegment, RegisterMediaSample, RegisterEndOfStream
  • Application defined comments, using RegisterComment method

All methods automatically track process and thread information associated with the source of the event. Other parameters include:

  • filter interface
  • an abstract stream name, which is a string value and can be anything; typically it makes sense to use the pin name/type, or the pin name with an appended stage of processing if the developer wants to track processing steps as they happen inside the filter; the UI offers filtering capability for stream values, and the exported text has a separate column so that a filter can be applied in spreadsheet software such as Excel when reviewing the log
  • user-defined comment and highlighting option

The RegisterMediaSample method can be used for anything associated with a media sample, not necessarily one event per processing call. The method logs the media sample data (it takes an AM_SAMPLE2_PROPERTIES pointer as a byte array pointer) and makes it available for review along with its flags and other data.

Comments can be anything; they hold supplementary information for events happening in a certain relation to streaming.


An application can automatically highlight log entries to draw attention to certain events. For example, if data is streamed out of order and the filter registers the event with highlighting, the entry immediately draws attention during UI review. The interactive user can then change the highlighting interactively as well.


The media sample trace data can be conveniently filtered right in the DirectShowSpy UI, which is invoked by the DoMediaSampleTracePropertySheetModal exported function, or copied to the clipboard, or saved as a file in tab-separated values format. The file can be opened with Microsoft Excel for further review.
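The export is plain tab-separated text, one row per event, with the stream name in its own column so spreadsheet filtering applies to it. A minimal sketch of such an export (the column layout here is illustrative; the actual DirectShowSpy columns may differ):

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical row layout for a tab-separated trace export
struct LogRow {
    std::string Process, Thread, Stream, Event, Comment;
};

std::string ExportTabSeparated(const std::vector<LogRow>& Rows) {
    std::ostringstream Output;
    Output << "Process\tThread\tStream\tEvent\tComment\n"; // header row
    for (const LogRow& Row : Rows)
        Output << Row.Process << '\t' << Row.Thread << '\t' << Row.Stream << '\t'
               << Row.Event << '\t' << Row.Comment << '\n';
    return Output.str();
}
```

A file produced this way opens directly in Excel, with the Stream column available for AutoFilter.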

Limitations

  • there is a global limit on in-memory trace storage; there is no specific size in samples (it is 8 MB for the global registry of data here), and the storage is large enough to hold the streaming of a movie with multiple tracks; however, once in a while it is possible to hit the limit, and there is no automatic recycling of data: the data is released once the last copy of the UI is closed and the graphs are released in the context of their respective processes
  • traces are visible from the same session only; in particular, processes with elevated privileges are only “visible” to a similarly started DirectShowSpy UI, and vice versa
  • 32-bit process traces are visible from the 32-bit DirectShowSpy UI, and the same applies to 64 bits; technically it is possible to share the data across bitness, but this is not implemented

Download links

Additional stuff

A fork of GDCL MPEG-4 filters is uploaded to GitHub; in particular, it has integration with media sample tracing and includes pre-built binaries in 32- and 64-bit versions.

Enumeration of DirectShow Capture Capabilities (Video and Audio)

The tool appears to be unmentioned here so far, and this post is to fix that.

DirectShowCaptureCapabilities application enumerates video and audio capture devices and lists their typical DirectShow properties, specifically:

  • Moniker names (including USB identification)
  • Pins and property pages
  • Supported interfaces
  • Formats and capabilities available through IAMStreamConfig interface
  • Video and audio devices

DirectShowCaptureCapabilities Screenshot

The utility allows saving the output and posting it over the Internet, which some users already did, and here are the capabilities of some hardware.

Some of the files might be useful to provide sample data for AM_MEDIA_TYPE structures for typical YUY2, UYVY, MJPG, H264 formats.
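The formats named above are identified by FOURCC codes, which for these YUV and MJPEG formats also appear as Data1 of the media subtype GUID in AM_MEDIA_TYPE. A small helper (the function name is mine, not from any SDK) shows how the four characters pack into the 32-bit value:

```cpp
#include <cassert>
#include <cstdint>

// A FOURCC such as 'YUY2' is packed little-endian into a 32-bit value;
// e.g. MEDIASUBTYPE_YUY2 has Data1 equal to 0x32595559
constexpr uint32_t MakeFourCc(char A, char B, char C, char D) {
    return static_cast<uint32_t>(static_cast<uint8_t>(A)) |
        static_cast<uint32_t>(static_cast<uint8_t>(B)) << 8 |
        static_cast<uint32_t>(static_cast<uint8_t>(C)) << 16 |
        static_cast<uint32_t>(static_cast<uint8_t>(D)) << 24;
}
```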

Download:


PolyTextOut API – Does It Work?

As MSDN says,

The PolyTextOut function draws several strings using the font and text colors currently selected in the specified device context.

The article also mentions ExtTextOut as a simpler sister function:

To draw a single string of text, the application should call the ExtTextOut function.

It looks like the API is not so Unicode friendly. Code as simple as

PolyTextOut(L"Мама мыла раму");
PolyTextOut(L"Mother washed window");
PolyTextOut(L"ママソープフレーム");
PolyTextOut(L"დედა საპნის კარკასი");

ExtTextOut(L"Мама мыла раму");
ExtTextOut(L"Mother washed window");
ExtTextOut(L"ママソープフレーム");
ExtTextOut(L"დედა საპნის კარკასი");

outputs correctly in the case of ExtTextOut, while PolyTextOut stumbles on the strings in Japanese and Georgian. All right, so why did it handle the Russian one?

PolyTextOut Sample

Blackmagic Design Intensity Pro 4K Issues

The new board is inexpensive, cool (well, actually it is hot, see below) and easy to interface with, but it has severe issues.

The Intensity Pro 4K is great for video editors that need a realtime preview on a big screen TV, people doing live streaming presentations, or for those trying to save family videos from old VHS tapes.
Customers can capture NTSC, PAL, 720HD, 1080HD and Ultra HD.

Issue 1. The cooling fan is totally annoying. The 40 mm fan runs at a constant high (maximal?) speed without dynamic speed control. The noise level is absolutely unacceptable, and the board is a “no go” until the problem is solved. There is a grayed-out check box to enable the board’s control over spinning, so we might expect (and hope!) that a firmware update starts doing what it is supposed to do from the start.

Issue 2. Blackmagic Design Desktop Video 10.4 is unstable. Internal problems partially disable board capabilities, and certain modes become unavailable. In particular, the software can no longer capture an Xbox signal. Blackmagic Design is yet to release a properly operational version of the software.

Issue 3. DeckLink SDK memory leak (applies to 10.3.7 and supposedly earlier versions as well; reference code). IDeckLinkInput does not properly manage internal video frame buffers and leaks them once in a while. The problem does not happen if you:

  • reuse IDeckLinkInput interfaces
  • use a custom memory allocator (which is preferred because the stock allocator is also way too memory greedy)

Windows Media Player encountered a problem while playing ASF/WMV file with multiple audio tracks

This is not really obvious: Windows Media Player refuses to open a Windows Media (ASF) file with a nondescript error message: “Windows Media Player encountered a problem while playing the file”.

Broken WMV file in Windows Media Player

The problem, however, is that the file is actually good, more or less. The file plays well with the DirectShow and Media Foundation APIs. There is an unusual thing, of course: the file contains two audio tracks. The tracks are mutually exclusive, as they should be (exclusion of language type – MFASFMutexType_Language).

Image003

Image004

It appears that Windows Media Player does strict checking of language strings and expects them to be RFC 1766 compatible. They are free-style tags in this file:

Image002

Not a valid language string? OK, no playback then.
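The kind of syntax check that appears to be applied can be sketched as follows. RFC 1766 defines a tag as a primary tag of one to eight ASCII letters, optionally followed by hyphen-separated subtags of one to eight letters each (e.g. "en" or "en-us"); the actual test Windows Media Player performs is undocumented, so this is only an approximation:

```cpp
#include <cassert>
#include <cctype>
#include <sstream>
#include <string>

// Approximate RFC 1766 syntax check: 1-8 letter tags separated by hyphens
bool IsRfc1766LanguageTag(const std::string& Value) {
    if (Value.empty() || Value.front() == '-' || Value.back() == '-')
        return false;
    std::istringstream Stream(Value);
    std::string Tag;
    while (std::getline(Stream, Tag, '-')) {
        if (Tag.empty() || Tag.size() > 8)
            return false;
        for (char Character : Tag)
            if (!std::isalpha(static_cast<unsigned char>(Character)))
                return false;
    }
    return true;
}
```

A free-style track label with spaces or digits fails such a check, while "en" or "en-us" passes, which matches the observed refusal to play.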

IP Video Source: Compatibility Issues

I received a few emails recently with questions about compatibility issues between IP Video Source and other applications. The compatibility issues typically fall into the following classes:

  1. An application is sensitive about the device being a real camera and makes specific requests, e.g. to be able to change the resolution of video capture
  2. An application explicitly requires that the video device be backed by a kernel driver
  3. Inaccurate use of the DirectShow API on the application side, bugs and incorrect assumptions

For a DirectShow developer, the reference video capture application is AMCap from the Windows SDK. If a device works with AMCap, then an application should basically be able to pick it up as well, or otherwise it has certain code issues. AMCap ships in source as an SDK sample, so its code is available for consultation.

Specific titles which had compatibility issues with IP Video Source were: Skype, Unity 3D, Open Broadcaster Software (OBS). I have no plans to investigate compatibility with Skype: something is really wrong on the Skype side, and they prevent attaching a debugger to their application. Unity 3D was unreasonably assuming that a video device has to be backed by a WDM Video Capture Filter, ignoring it otherwise. OBS was attempting to blacklist a certain group of devices (for specific reasons I am not aware of), and the way it was done was also blacklisting IP Video Source.

IP Video Source 1.0.3 works around this the best way it can. If you are already a Unity 3D/Open Broadcaster Software user and you have IP Video Source installed, you want to re-add the cameras in IP Video Source. The easiest way is to open the camera management dialog, copy everything into the clipboard, then delete all cameras and paste them back. New cameras get the correct configuration right from the start.

Additionally, IP Video Source 1.0.3 updates include:

  • JpegVideoSourceFilterRegistry COM class, which allows adding/managing cameras programmatically (no sample code, but the COM interfaces on the type library are self-describing)
  • automatic conversion to YUY2, UYVY on application request in Windows Vista+, to better mimic real cameras, especially laptop webcams, which often deliver 4:2:2 YUV video
  • added EC_DYNAMICMEDIATYPECHANGE event to notify the graph owner that the filter hit the necessity to change video resolution
  • added IQualityControl implementation on the output pin (e.g. the DeckLink SDK video renderer filter makes the unreasonable assumption that this optional interface is implemented)
  • worked around a bug in parsing D-Link DCS-930LB1 camera output (for the second time! there was a related issue a few years ago with a predecessor of this camera)

There is also some sample C++ and C# code published (rather, code snippets) that demonstrates basic operations.

Download links

Audio playback at non-standard rates in DirectShow

DirectShow streaming, and playback in particular, offers flexible playback rates for scenarios where playback is requested to take place slower or faster than real time. For a DirectShow developer, the outer interface is pretty straightforward: IMediaPosition::put_Rate takes the playback rate and that’s it.

Playback rate. Must not be zero.

The playback rate is expressed as a ratio of the normal speed. Thus, 1.0 is normal playback speed, 0.5 is half speed, and 2.0 is twice speed. For audio streams, changing the rate also changes the pitch.

Even after taking out the case of reverse playback, which is not supported out of the box and requires some DirectShow magic to implement, there is a nasty problem for those who want to be able to change the playback rate flexibly on the go.

Rates greater than one are faster than normal. Rates between zero and one are slower than normal. Negative rates are defined as backward playback, but in practice most filters do not support it. Currently none of the standard DirectShow filters support reverse playback.

The problem comes up when an audio-enabled file/stream is being played back and there is an audio renderer in the pipeline. The filter graph would connect and play excellently, but once you try to change the playback rate too much, the request might fail unexpectedly with the 0x8004025C VFW_E_UNSUPPORTED_AUDIO “Cannot play back the audio stream: the audio format is not supported.” error.

An application that “almost does everything right” is unable to do something as simple as fast-forward playback!

The root of the problem is in the audio renderer. Requests to change the playback rate propagate through the filter graph via the IMediaSeeking interface, and the Filter Graph Manager sends the new rates to renderers, from which they travel upstream. The audio renderer refuses to accept rates it does not support, and this breaks the whole thing.

Earlier implementations supposedly had a limit of a 50%..200% rate range (“But I cannot call SetRate with more than 2, it returns VFW_E_UNSUPPORTED_AUDIO.”), and since Vista the actual range is somewhat relaxed. Having no documentation reference, my educated guess is that the actual playback rate limit is defined by the ability of the renderer to resample the data into a format accepted by the underlying device. That is, a device taking up to 192 kHz audio could be used to play 44.1 kHz content at rates up to 435%.
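The 435% figure follows directly from that guess, as a ratio of the two sample rates. A one-line model of the estimate (this is my educated-guess formula from the paragraph above, not a documented limit):

```cpp
#include <cassert>

// Educated-guess model: playback can be sped up only as far as resampling
// keeps the output within the device's maximal sample rate, so the maximal
// rate is the ratio of the device limit to the content sample rate.
double EstimateMaximalPlaybackRate(unsigned DeviceMaximalSampleRate, unsigned ContentSampleRate) {
    return static_cast<double>(DeviceMaximalSampleRate) / ContentSampleRate;
}
```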

The nasty part of the problem is that even though one might want to mute the audio at such rates, or exclude the audio substream altogether, this is only possible with a transition through the stopped state (due to supposed changes in filter graph topology); otherwise the audio renderer blocks the rate change with the mentioned error code.

So, is there any way to fix the VFW_E_UNSUPPORTED_AUDIO issue with reuse of existing components and a smooth user experience on the UI side? One of the approaches is to customize the behavior of the standard audio renderer, the DirectSound Renderer Filter.

The Filter Graph Manager would use its IMediaSeeking/IMediaPosition interfaces directly, so the filter cannot be added into the filter graph as is. The following is the checklist of required updates:

  • IMediaSeeking needs to be intercepted to accept a wide range of rates, to pass some of them transparently, and to fake acceptance of the rest in “muted” mode
  • IPin, IMemInputPin interfaces need to be intercepted to accept incoming media samples, and to pass them through or suppress them and replace them with IPin::EndOfStream in “muted” mode

The mentioned tasks make it impossible to have the standard audio renderer as a normal participant of the filter graph; however, a wrapper COM object can achieve the planned behavior just fine without a single line of code doing audio. The figure below shows how the standard DirectSound renderer is different from its wrapper.

Wrapper

The complete list of tasks to do in the wrapper:

  • IPin::QueryPinInfo needs to properly report the wrapper filter
  • IPin::EndOfStream needs to suppress the EOS call in case we already “muted” artificially
  • IPin::NewSegment needs to replace the rate argument with 1.0 before forwarding to the real renderer in case we decided to “mute” the stream
  • IMemInputPin::Receive and IMemInputPin::ReceiveMultiple need to replace media sample delivery with an EOS in case we are muting the stream
  • IBaseFilter::EnumPins and IBaseFilter::FindPin should properly expose the pin wrapper
  • IMediaSeeking::SetRate accepts any rate and decides on muted or transparent operation, then forwards a real or fake value to the real renderer managed internally
  • IMediaSeeking::GetRate reports the accepted rate

As the list says, the wrapper filter can accept any rate (including negative!) and decide on transparent playback or muted operation for unsupported or otherwise unwanted rates. No filter graph re-creation or stopping is required when changing rates or toggling muting.

A DirectSound renderer filter, added to the graph automatically or otherwise as a part of normal graph construction, needs to be replaced by the wrapper in the following way:

IBaseFilter* pBaseFilter = FilterArray[nIndex];
CLSID ClassIdentifier;
if(FAILED(pBaseFilter->GetClassID(&ClassIdentifier)))
    continue;
// NOTE: DirectSound Renderer Filter, CLSID_DSoundRender
//       http://msdn.microsoft.com/en-us/library/windows/desktop/dd375473%28v=vs.85%29.aspx
if(ClassIdentifier != CLSID_DSoundRender)
    continue;
// Remember the upstream connection and media type before removing the renderer
const CComPtr<IPin> pInputPin = _FilterGraphHelper::GetFilterPin(pBaseFilter);
const CComPtr<IPin> pOutputPin = _FilterGraphHelper::GetPeerPin(pInputPin);
const CMediaType pMediaType = _FilterGraphHelper::GetPinMediaType(pInputPin);
const CStringW sName = _FilterGraphHelper::GetFilterName(pBaseFilter);
__C(FilterGraph.RemoveFilter(pBaseFilter));
// Wrap the original renderer and re-add it under a substitute name
CObjectPtr<CFilter> pFilter;
pFilter.Construct();
pFilter->Initialize(pBaseFilter);
__C(FilterGraph.AddFilter(pFilter, sName + _T(" (Substitute)")));
// Reconnect the upstream pin directly to the wrapper with the same media type
__C(FilterGraph.ConnectDirect(pOutputPin, pFilter->GetInputPin(), pMediaType));