Enumeration of DirectShow Capture Capabilities (Video and Audio)

This tool does not appear to have been mentioned here before, and this post is to fix that.

The DirectShowCaptureCapabilities application enumerates video and audio capture devices and lists their typical DirectShow properties, specifically:

  • Moniker names (including USB identification)
  • Pins and property pages
  • Supported interfaces
  • Formats and capabilities available through IAMStreamConfig interface
  • Video and audio devices

DirectShowCaptureCapabilities Screenshot

The utility lets you save the output and post it online, which some users have already done, so here are the capabilities of some hardware.

Some of the files might be useful as sample data for AM_MEDIA_TYPE structures for the typical YUY2, UYVY, MJPG and H264 formats.

Download:


PolyTextOut API – Does It Work?

As MSDN says,

The PolyTextOut function draws several strings using the font and text colors currently selected in the specified device context.

The article also mentions ExtTextOut as a simpler sister function:

To draw a single string of text, the application should call the ExtTextOut function.

It looks like the API is not so Unicode friendly. Code as simple as

PolyTextOut(L"Мама мыла раму");
PolyTextOut(L"Mother washed window");
PolyTextOut(L"ママソープフレーム");
PolyTextOut(L"დედა საპნის კარკასი");

ExtTextOut(L"Мама мыла раму");
ExtTextOut(L"Mother washed window");
ExtTextOut(L"ママソープフレーム");
ExtTextOut(L"დედა საპნის კარკასი");

outputs correctly in the case of ExtTextOut, while PolyTextOut stumbles on the Japanese and Georgian strings. All right, so why did it handle the Russian one?

PolyTextOut Sample

Blackmagic Design Intensity Pro 4K Issues

The new board is inexpensive, cool (well, actually it is hot, see below) and easy to interface with, but it has severe issues.

The Intensity Pro 4K is great for video editors that need a realtime preview on a big screen TV, people doing live streaming presentations, or for those trying to save family videos from old VHS tapes.
Customers can capture NTSC, PAL, 720HD, 1080HD and Ultra HD.

Issue 1. The cooling fan is totally annoying. The 40 mm fan runs at a constant high (max?) speed without dynamic speed control. The noise level is absolutely unacceptable and the board is a “no go” until the problem is solved. There is a grayed-out check box to enable the board’s control over fan speed, so we might expect (and hope!) that a firmware update will make it do what it was supposed to do from the start.

Issue 2. Blackmagic Design Desktop Video 10.4 is unstable. Internal problems partially disable board capabilities, and certain modes are no longer available. In particular, the software can no longer capture an Xbox signal. Blackmagic Design has yet to release a good operational version of the software.

Issue 3. DeckLink SDK memory leak (applies to 10.3.7 and supposedly earlier versions as well; reference code). IDeckLinkInput does not properly manage its internal video frame buffers and leaks them once in a while. The problem does not happen if you:

  • reuse IDeckLinkInput interfaces
  • use a custom memory allocator (which is preferred because the stock allocator is also way too memory greedy)

Windows Media Player encountered a problem while playing ASF/WMV file with multiple audio tracks

This is not really obvious: Windows Media Player refuses to open a Windows Media (ASF) file with an undescriptive error message: “Windows Media Player encountered a problem while playing the file”.

Broken WMV file in Windows Media Player

The problem, however, is that the file is actually good, more or less. The file plays well through both the DirectShow and Media Foundation APIs. There is one unusual thing, of course: the file contains two audio tracks. The tracks are mutually exclusive, as they should be (exclusion by language type, MFASFMutexType_Language).

Image003

Image004

It appears that Windows Media Player performs strict checking of language strings and expects them to be RFC 1766 compatible. In this file they are free-style tags:

Image002

Not a valid language string? OK, no playback then.
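What exactly Windows Media Player validates is not documented here, so the following is only a sketch of the syntactic part of RFC 1766 (a primary tag followed by optional subtags, each made of 1 to 8 ASCII letters and separated by hyphens). The function name IsRfc1766LanguageTag is made up for illustration, and the check does not verify that the tag is a registered language code.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: checks only the RFC 1766 grammar
//   Language-Tag = Primary-tag *( "-" Subtag ), every tag being 1*8ALPHA.
// It does not check that the tag names a real language.
bool IsRfc1766LanguageTag(const std::string& tag)
{
    std::size_t length = 0; // Letters seen in the current tag or subtag
    for(const char character : tag)
    {
        if(character == '-')
        {
            if(length == 0)
                return false; // Empty tag or subtag
            length = 0;
        }
        else if((character >= 'A' && character <= 'Z') || (character >= 'a' && character <= 'z'))
        {
            if(++length > 8)
                return false; // Tag or subtag is too long
        }
        else
            return false; // Spaces, digits, punctuation are not allowed
    }
    return length != 0; // Rejects empty strings and trailing hyphens
}
```

With this check a tag such as "en-US" passes, while a free-style string such as "English (United States)" (a made-up example, not the actual string from the file) does not.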

IP Video Source: Compatibility Issues

I received a few emails recently with questions about compatibility issues between IP Video Source and other applications. The compatibility issues typically fall into three classes:

  1. An application expects the device to be a real camera and makes camera-specific requests, e.g. to change the resolution of video capture
  2. An application explicitly requires that the video device is backed by a kernel driver
  3. Inaccurate use of the DirectShow API on the application side: bugs and incorrect assumptions

For a DirectShow developer, the reference video capture application is AMCap from the Windows SDK. If a device works with AMCap, then an application should basically be able to pick it up as well; otherwise the application has code issues of its own. AMCap is also available in source form as an SDK sample.

Specific titles which had compatibility issues with IP Video Source were Skype, Unity 3D and Open Broadcaster Software (OBS). I have no plans to investigate compatibility with Skype: something is really wrong on the Skype side, and they prevent attaching a debugger to their application. Unity 3D was unreasonably assuming that a video device has to be backed by the WDM Video Capture Filter, and ignored it otherwise. OBS was attempting to blacklist a certain group of devices (for specific reasons I am not aware of), and the way it was done also blacklisted IP Video Source.

IP Video Source 1.0.3 works around this as far as it can. If you are already a Unity 3D/Open Broadcaster Software user and you have IP Video Source installed, you will want to re-add your cameras in IP Video Source. The easiest way is to open the camera management dialog, copy everything to the clipboard, delete all cameras and then paste them back. New cameras get the correct configuration right from the start.

Additionally, IP Video Source 1.0.3 updates include:

  • JpegVideoSourceFilterRegistry COM class, which allows adding/managing cameras programmatically (no sample code, but the COM interfaces in the type library are self-describing)
  • Automatic conversions to YUY2 and UYVY on application request in Windows Vista+, to better mimic real cameras, especially laptop webcams, which often deliver 4:2:2 YUV video
  • EC_DYNAMICMEDIATYPECHANGE event to notify the graph owner that the filter needs to change video resolution
  • IQualityControl implementation on the output pin (e.g. the DeckLink SDK video renderer filter makes the unreasonable assumption that this optional interface is implemented)
  • Workaround for a parsing bug with the D-Link DCS-930LB1 camera (for the second time! there was a related issue a few years ago with a predecessor of this camera)

There is also some sample C++ and C# code published (code snippets, rather) that demonstrates basic operations.

Download links

Audio playback at non-standard rates in DirectShow

DirectShow streaming, and playback in particular, offers flexible playback rates for scenarios where playback is requested to take place slower or faster than real time. For a DirectShow developer, the outer interface is pretty straightforward: IMediaPosition::put_Rate takes the playback rate and that’s it.

Playback rate. Must not be zero.

The playback rate is expressed as a ratio of the normal speed. Thus, 1.0 is normal playback speed, 0.5 is half speed, and 2.0 is twice speed. For audio streams, changing the rate also changes the pitch.

Even after taking out the case of reverse playback, which is not supported out of the box and requires some DirectShow magic to implement, there is a nasty problem for those who want to be able to change the playback rate flexibly on the go.

Rates greater than one are faster than normal. Rates between zero and one are slower than normal. Negative rates are defined as backward playback, but in practice most filters do not support it. Currently none of the standard DirectShow filters support reverse playback.

The problem comes up when an audio-enabled file/stream is being played back and there is an audio renderer in the pipeline. The filter graph would connect and play excellently, but once you try to change playback rate too much, the request might fail unexpectedly with 0x8004025C VFW_E_UNSUPPORTED_AUDIO “Cannot play back the audio stream: the audio format is not supported.” error.

An application that “almost does everything right” is unable to do a small thing as simple as fast forward playback!

The root of the problem is in the audio renderer. Requests to change the playback rate propagate through the filter graph via the IMediaSeeking interface: Filter Graph Manager sends the new rate to the renderers, which pass it upstream. The audio renderer refuses the rates it does not support, and this breaks the whole thing.

Earlier implementations supposedly had a rate range limit of 50% to 200% (“But I cannot call SetRate with more than 2, it returns VFW_E_UNSUPPORTED_AUDIO.”), and since Vista the actual range has been somewhat relaxed. Having no documentation reference, my educated guess is that the actual playback rate limit is defined by the ability of the renderer to resample the data into a format accepted by the underlying device. That is, a device taking up to 192 kHz audio could be used to play 44.1 kHz content at rates up to 435% (192 / 44.1 ≈ 4.35).
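The guess above boils down to a single ratio. The sketch below only restates that arithmetic; the function name is made up for illustration and the model itself is the unverified assumption from the paragraph above, not documented renderer behavior.

```cpp
#include <cmath>

// Hypothetical model of the guess above: the highest playback rate, in percent,
// is the device's maximum sample rate divided by the content's sample rate.
long MaximumPlaybackRatePercent(double deviceMaximumSampleRate, double contentSampleRate)
{
    return std::lround(deviceMaximumSampleRate / contentSampleRate * 100.0);
}
```

Under this model, a 192 kHz device playing 44.1 kHz content gives 435, and the same device playing 48 kHz content gives 400.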

The nasty part of the problem is that even though one might want to mute the audio at such rates, or exclude the audio substream altogether, this is only possible with a transition through the stopped state (because it supposedly changes the filter graph topology); otherwise the audio renderer blocks the rate change with the mentioned error code.

So, is there any way to fix the VFW_E_UNSUPPORTED_AUDIO issue while reusing existing components and keeping a smooth user experience on the UI side? One of the approaches is to customize the behavior of the standard audio renderer, the DirectSound Renderer Filter.

Filter Graph Manager would use the renderer's IMediaSeeking/IMediaPosition interfaces directly, so the filter cannot be added to the filter graph as is. The following is the checklist of required updates:

  • IMediaSeeking needs to be intercepted to accept a wide range of rates, to pass some of them through transparently and to fake those accepted in “muted” mode
  • IPin and IMemInputPin interfaces need to be intercepted to accept incoming media samples, and either pass them through or suppress them and replace them with IPin::EndOfStream in “muted” mode

The mentioned tasks make it impossible to have the standard audio renderer as a normal participant of the filter graph; however, a wrapper COM object can achieve what is planned just fine without a single line of code doing audio. The figure below shows how the standard DirectSound renderer differs from its wrapper.

Wrapper

The complete list of tasks to do in the wrapper:

  • IPin::QueryPinInfo needs to properly report wrapper filter
  • IPin::EndOfStream needs to suppress EOS call in case we already “muted” artificially
  • IPin::NewSegment needs to replace rate argument with 1.0 before forwarding to real renderer in case we decided to “mute” the stream
  • IMemInputPin::Receive and IMemInputPin::ReceiveMultiple need to replace media sample delivery with an EOS in case we are muting the stream
  • IBaseFilter::EnumPins and IBaseFilter::FindPin should properly expose pin wrapper
  • IMediaSeeking::SetRate accepts any rate and decides on muting or transparent operation, then forwards the real or a fake value to the real renderer managed internally
  • IMediaSeeking::GetRate reports accepted rate

As the list says, the wrapper filter can accept any rate (including negative!) and decide on transparent playback or muted operation for unsupported or otherwise unwanted rates. No filter graph re-creation or stopping is required when changing rates or toggling muting.

A DirectSound renderer filter added to the graph, automatically or otherwise, as part of normal graph construction needs to be replaced by the wrapper in the following way:
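The rate decision from the list above can be modeled without any COM code. The sketch below is an illustration only: the class name, its members and the supported range passed to the constructor are assumptions, and the real wrapper would make the same decision inside its IMediaSeeking and IPin implementations.

```cpp
// Hypothetical, COM-free model of the wrapper's rate handling; the class and
// its members are illustrative names, not the actual wrapper code.
class CRateDecision
{
public:
    // minimumRate..maximumRate: range the real renderer is assumed to accept
    CRateDecision(double minimumRate, double maximumRate) :
        m_minimumRate(minimumRate),
        m_maximumRate(maximumRate)
    {
    }
    // Models IMediaSeeking::SetRate on the wrapper: any non-zero rate is accepted
    bool SetRate(double rate)
    {
        if(rate == 0.0)
            return false; // A playback rate must not be zero
        m_acceptedRate = rate;
        // Rates outside of the supported range, negative ones included, are muted
        m_muted = rate < m_minimumRate || rate > m_maximumRate;
        return true;
    }
    // Models IMediaSeeking::GetRate: reports the rate accepted from the caller
    double GetRate() const { return m_acceptedRate; }
    // Rate forwarded to the real renderer (SetRate, NewSegment): 1.0 while muted
    double GetForwardedRate() const { return m_muted ? 1.0 : m_acceptedRate; }
    // While muted, Receive/ReceiveMultiple would deliver EOS instead of samples
    bool IsMuted() const { return m_muted; }
private:
    double m_minimumRate;
    double m_maximumRate;
    double m_acceptedRate = 1.0;
    bool m_muted = false;
};
```

For example, with an assumed supported range of 0.5 to 2.0, a request for rate 8.0 is accepted and reported back as 8.0, while the real renderer keeps seeing 1.0 and the stream is muted; a later request for 1.5 switches back to transparent operation.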

// NOTE: pBaseFilter is assumed to refer to the filter being examined, FilterArray[nIndex]
CLSID ClassIdentifier;
if(FAILED(FilterArray[nIndex]->GetClassID(&ClassIdentifier)))
    continue;
// NOTE: DirectSound Renderer Filter, CLSID_DSoundRender
//       http://msdn.microsoft.com/en-us/library/windows/desktop/dd375473%28v=vs.85%29.aspx
if(ClassIdentifier != CLSID_DSoundRender)
    continue;
// Remember the renderer's input pin, the upstream pin connected to it,
// the media type of the connection and the name of the filter
const CComPtr<IPin> pInputPin = _FilterGraphHelper::GetFilterPin(pBaseFilter);
const CComPtr<IPin> pOutputPin = _FilterGraphHelper::GetPeerPin(pInputPin);
const CMediaType pMediaType = _FilterGraphHelper::GetPinMediaType(pInputPin);
const CStringW sName = _FilterGraphHelper::GetFilterName(pBaseFilter);
// Take the real renderer out of the graph
__C(FilterGraph.RemoveFilter(pBaseFilter));
// Create the wrapper and hand it the real renderer to manage internally
CObjectPtr<CFilter> pFilter;
pFilter.Construct();
pFilter->Initialize(pBaseFilter);
__C(FilterGraph.AddFilter(pFilter, sName + _T(" (Substitute)")));
// Reconnect the upstream pin to the wrapper using the original media type
__C(FilterGraph.ConnectDirect(pOutputPin, pFilter->GetInputPin(), pMediaType));

Media Foundation MPEG-4 Property Handler might report incorrect Video Frame Rate

To follow up on the previous post about a Media Foundation bug, here goes another one, related to the property handler for MPEG-4 files (.MP4) and specifically to the PKEY_Video_FrameRate property, which reports the frame rate of a given media file.

This is the object responsible for filling columns in Explorer, so visually the bug might look like this:

Image001

The values of the properties are also accessible programmatically using IPropertyStore::GetValue API, in which case they are:

  • PKEY_Video_FrameWidth: 1280 (VT_UI4) // 1,280
  • PKEY_Video_FrameHeight: 720 (VT_UI4) // 720
  • PKEY_Video_FrameRate: 1091345 (VT_UI4) // 1,091,345
  • PKEY_Video_Compression: {34363248-0000-0010-8000-00AA00389B71} (VT_LPWSTR) // FourCC H264
  • PKEY_Video_FourCC: 875967048 (VT_UI4) // 875,967,048
  • PKEY_Video_HorizontalAspectRatio: 1 (VT_UI4) // 1
  • PKEY_Video_VerticalAspectRatio: 1 (VT_UI4) // 1
  • PKEY_Video_StreamNumber: 2 (VT_UI4) // 2
  • PKEY_Video_TotalBitrate: 12123288 (VT_UI4) // 12,123,288

The actual frame rate of the file is 50 fps. The file plays well in every media player, so the problem is in the reporting itself. Let us look inside the file to possibly identify the cause. The mdhd box for the video track shows the following information:

Image003

Let us do some math now:

  • Time Scale: 10,000,000
  • Duration: 4,501,200,000 (around 7.5 minutes)
  • Video Sample Count: 22,506

This gives the correct rate of 50 fps (sample count divided by the duration in seconds, 22,506 / 450.12). However, the duration number itself is a pretty big one and exceeds the 32-bit range. Now let us try this one:

22506 / (4501200000 & ((1 << 32) - 1)) * 10000000

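And we get 1,091. PKEY_Video_FrameRate is expressed in frames per thousand seconds, so this is in line with the reported value of 1,091,345. Bingo! Arithmetic overflow in the property handler then…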

See also:

Bonus tool: FilePropertyStore application which reads properties of the file you drag and drop onto it, Win32 and x64 versions.

Image002