Continuous realloc()

A colleague raised the question of whether realloc does better than free + malloc: the idea is that the allocated memory block is never actually shrunk, so reallocations to a smaller size followed by reallocations back to a larger size (still not exceeding an earlier size) would avoid heap locks and actual reallocation of the underlying heap memory block.

While this is technically possible within the contract declared by the API, it does not seem likely that the runtime would hold on to unused memory indefinitely. It is also highly probable that heap managers implement advanced tricks to reduce the impact of heap locks during allocations. At the same time, realloc must copy the payload data in full to the new location whenever the reallocated block is moved; if this copying is not actually required and the block is large, it becomes an unwanted performance hit.

The details of the API operation are likely documented somewhere, but a related question is how to take the measurement programmatically and get a hint of what is going on internally.

PSAPI offers the GetProcessMemoryInfo function to obtain process memory metrics, and the returned PROCESS_MEMORY_COUNTERS_EX::PrivateUsage field shows private memory in use. malloc'ed memory is eventually mapped onto process private memory, so the API is good for seeing approximate memory usage (approximate because, due to fragmentation, process memory use is always higher than the sum of the actually allocated block sizes).
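
The PrintPrivateUsage helper used in the fragment below is not shown; a minimal sketch of it on top of GetProcessMemoryInfo might look like this (the function name and the output format are assumptions made here to match the printed results):

#include <windows.h>
#include <psapi.h>
#include <tchar.h>
#pragma comment(lib, "psapi.lib")

static VOID PrintPrivateUsage()
{
    PROCESS_MEMORY_COUNTERS_EX Counters = { sizeof Counters };
    // GetProcessMemoryInfo fills the extended structure when given its size
    if(GetProcessMemoryInfo(GetCurrentProcess(), (PROCESS_MEMORY_COUNTERS*) &Counters, sizeof Counters))
        _tprintf(_T("PrivateUsage: %d MB\n"), (INT) (Counters.PrivateUsage >> 20));
}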

If we allocate 1 MB blocks, then reallocate them down to 4 KB each, and then allocate additional memory, observing the process private memory usage tells us whether realloc does release unused memory.

The code is as simple as:

PrintPrivateUsage();
VOID* ppvItemsA[256];
static const SIZE_T g_nSizeA1 = 1 << 20; // 1 MB
_tprintf(_T("Allocating %d MB\n"), (_countof(ppvItemsA) * g_nSizeA1) >> 20);
for(SIZE_T nIndex = 0; nIndex < _countof(ppvItemsA); nIndex++)
    ppvItemsA[nIndex] = malloc(g_nSizeA1);
PrintPrivateUsage();
static const SIZE_T g_nSizeA2 = 4 << 10; // 4 KB
_tprintf(_T("Reallocating to %d MB\n"), (_countof(ppvItemsA) * g_nSizeA2) >> 20);
for(SIZE_T nIndex = 0; nIndex < _countof(ppvItemsA); nIndex++)
    ppvItemsA[nIndex] = realloc(ppvItemsA[nIndex], g_nSizeA2);
PrintPrivateUsage();
VOID* ppvItemsB[256];
static const SIZE_T g_nSizeB1 = 16 << 10; // 16 KB
_tprintf(_T("Allocating %d MB more\n"), (_countof(ppvItemsB) * g_nSizeB1) >> 20);
for(SIZE_T nIndex = 0; nIndex < _countof(ppvItemsB); nIndex++)
    ppvItemsB[nIndex] = malloc(g_nSizeB1);
PrintPrivateUsage();

And the output is:

PrivateUsage: 0 MB
Allocating 256 MB
PrivateUsage: 258 MB
Reallocating to 1 MB
PrivateUsage: 3 MB // <<--- (*)
Allocating 4 MB more
PrivateUsage: 7 MB

This shows that reallocating to a smaller size does free the unused space.


Double right angle bracket kills Visual C++ source code outlining in IDE versions 2008, 2010, 2012

An amusing bug which seems to affect three versions of Visual Studio in a row: 2012, 2010 and 2008. A double right angle bracket closing (or simply present in) the declaration of a templated base class breaks the Visual Studio outlining capability (code scout? IntelliSense? whatever it is).

Have a space there and you are fine.
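
For illustration, a declaration of roughly this shape is enough to trigger the problem; the class names below are made up, and the second variant with the extra space is the one that keeps outlining working:

#include <atlbase.h>
#include <atlwin.h>

// Nested template arguments closed with ">>" and no space in between break outlining
class CVideoWindow : public CWindowImpl<CVideoWindow, CAxWindowT<CWindow>>
{
public:
    BEGIN_MSG_MAP(CVideoWindow)
    END_MSG_MAP()
};

// A space between the closing angle brackets avoids the problem
class CVideoWindow2 : public CWindowImpl<CVideoWindow2, CAxWindowT<CWindow> >
{
public:
    BEGIN_MSG_MAP(CVideoWindow2)
    END_MSG_MAP()
};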

Enumerating Media Foundation Transforms (MFTs)

Matthew van Eerde already made a similar wrapper over MFTEnumEx in How to enumerate Media Foundation transforms on your system; this one extends it with enumeration of attributes, listing them in a human-friendly way. This sort of code should perhaps have been among the Media Foundation SDK samples, but we have what we have.

Media Foundation Transforms (MFTs) are registered in and accessed through the registry, and are available for enumeration with or without qualifying criteria. Some of the transforms are dual DMO/MFT, some are MFT only, which makes their useful functionality unavailable directly to a DirectShow pipeline. Luckily, the interface is similar to that of DMOs, making it reasonably possible to wrap one into another. Comparison of MFTs and DMOs shows how the two form factors compare to one another.

The enumeration tool/utility shows the availability of registered MFTs in the system. For example, the output on a Windows 7 workstation is provided below.

The output is a good cheat sheet for seeing support of media types in Windows components.
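
The core of such an enumeration is a single MFTEnumEx call per category. A minimal sketch for one category (video decoders), printing only the friendly name attribute, could look like the following; the variable names are assumptions, error handling is trimmed, and COM/Media Foundation are assumed to be initialized already (CoInitialize, MFStartup):

#include <mfapi.h>
#include <mfidl.h>
#include <mftransform.h>
#include <tchar.h>
#pragma comment(lib, "mfplat.lib")
#pragma comment(lib, "mfuuid.lib")

// ...

IMFActivate** ppActivates = NULL;
UINT32 nActivateCount = 0;
// Enumerate everything registered in the category, including asynchronous and hardware MFTs
MFTEnumEx(MFT_CATEGORY_VIDEO_DECODER, MFT_ENUM_FLAG_ALL, NULL, NULL, &ppActivates, &nActivateCount);
for(UINT32 nIndex = 0; nIndex < nActivateCount; nIndex++)
{
    IMFActivate* pActivate = ppActivates[nIndex];
    WCHAR pszFriendlyName[256] = { 0 };
    UINT32 nFriendlyNameLength = 0;
    pActivate->GetString(MFT_FRIENDLY_NAME_Attribute, pszFriendlyName, _countof(pszFriendlyName), &nFriendlyNameLength);
    _tprintf(_T("%ls\n"), pszFriendlyName);
    // ... walking all attributes via GetCount/GetItemByIndex would list them in addition to the name
    pActivate->Release();
}
CoTaskMemFree(ppActivates);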


GPS Location/Coordinate Converter: Fractional Seconds, More Shortcuts

This adds a small update to the recently published GPS Location/Coordinate Converter utility:

  • Seconds in Degrees, Minutes & Seconds notation are shown and are accepted as floating point numbers
  • More shortcuts to popular online map services (note that only Google Maps and Yandex Maps are still accepted as input via clipboard):
    • Bing Maps
    • Yahoo Maps
    • Open Street Map
    • WikiMapia

The latter makes the tool an easy-to-use converter between the services for a GPS POI.
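
For reference, the degrees, minutes and seconds notation with fractional seconds is a straightforward conversion from decimal degrees; a minimal sketch (not the utility's actual code):

#include <cmath>
#include <cstdio>

void PrintDegreesMinutesSeconds(double fValue)
{
    const double fAbsoluteValue = fabs(fValue);
    const int nDegrees = (int) fAbsoluteValue;
    const int nMinutes = (int) ((fAbsoluteValue - nDegrees) * 60);
    // Seconds are kept as a floating point number instead of being rounded to an integer
    const double fSeconds = (fAbsoluteValue - nDegrees) * 3600 - nMinutes * 60;
    printf("%s%d deg %d' %.3f\"\n", fValue < 0 ? "-" : "", nDegrees, nMinutes, fSeconds);
}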

A binary [Win32] and partial Visual C++ .NET 2010 source code are available from SVN.

Sample: Simultaneous Audio Playback via Waveform Audio (waveOut) API

This minimalistic sample demonstrates the [deprecated] Waveform Audio API's support for multiple simultaneous playback streams.

Depending on command line parameters, the application starts threads which open the audio hardware using separate waveOutOpen calls and stream one or more generated sine waves:

  • 1,000 Hz sine wave as 22,050 Hz, Mono, 16-bit PCM (command line parameter “a”)
  • 5,000 Hz sine wave as 32,000 Hz, Mono, 16-bit PCM (command line parameter “b”)
  • 15,000 Hz sine wave as 44,100 Hz, Mono, 16-bit PCM (command line parameter “c”)
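
Each playback thread first fills a WAVEFORMATEX structure describing its PCM stream and then opens the device and queues data, as in the original fragment below. The structure setup is not part of that fragment; a minimal sketch for the first variant (the variable name matches the code that follows, the rest is an assumption):

WAVEFORMATEX WaveFormatEx;
ZeroMemory(&WaveFormatEx, sizeof WaveFormatEx);
WaveFormatEx.wFormatTag = WAVE_FORMAT_PCM;
WaveFormatEx.nChannels = 1; // Mono
WaveFormatEx.nSamplesPerSec = 22050; // variant "a"; 32000 and 44100 for "b" and "c" respectively
WaveFormatEx.wBitsPerSample = 16;
WaveFormatEx.nBlockAlign = WaveFormatEx.nChannels * WaveFormatEx.wBitsPerSample / 8;
WaveFormatEx.nAvgBytesPerSec = WaveFormatEx.nSamplesPerSec * WaveFormatEx.nBlockAlign;
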
Check(waveOutOpen(&hWaveOut, WAVE_MAPPER, &WaveFormatEx, NULL, NULL, CALLBACK_NULL));
ATLASSERT(hWaveOut);
WAVEHDR* pWaveHeader;
HGLOBAL hWaveHeader = GlobalAlloc(GMEM_MOVEABLE | GMEM_SHARE, sizeof *pWaveHeader + WaveFormatEx.nAvgBytesPerSec * 10);
pWaveHeader = (WAVEHDR*) GlobalLock(hWaveHeader);
ATLENSURE_THROW(pWaveHeader, E_OUTOFMEMORY);
pWaveHeader->lpData = (LPSTR) (BYTE*) (pWaveHeader + 1);
pWaveHeader->dwBufferLength = WaveFormatEx.nAvgBytesPerSec * 10;
//pWaveHeader->dwUser = 
pWaveHeader->dwFlags = 0;
pWaveHeader->dwLoops = 0;
#pragma region Generate Actual Data
{
    SHORT* pnData = (SHORT*) pWaveHeader->lpData;
    SIZE_T nDataCount = pWaveHeader->dwBufferLength / sizeof *pnData;
    for(SIZE_T nIndex = 0; nIndex < nDataCount; nIndex++)
        pnData[nIndex] = (SHORT) (32000 * sin(1.0 * nIndex / WaveFormatEx.nSamplesPerSec * nFrequency * 2 * M_PI));
}
#pragma endregion 
Check(waveOutPrepareHeader(hWaveOut, pWaveHeader, sizeof *pWaveHeader)); 
Check(waveOutWrite(hWaveOut, pWaveHeader, sizeof *pWaveHeader)); 
GlobalUnlock(hWaveHeader);

The operating system is supposed to mix the waveforms, and the mixing is easily heard. It is possible to run the application with multiple waveforms within a single process, e.g. with the “abc” command line parameter, and/or to start multiple instances of the application.

A binary [Win32] and partial Visual C++ .NET 2010 source code are available from SVN.

A tricky EVR bug was tracked down: input pin may falsely report disconnected state

Crime

An application which builds a DirectShow graph unexpectedly started failing with VFW_E_NOT_CONNECTED (0x80040209) error code.

Scene

The problem takes place during DirectShow graph building, while the graph is still in stopped state. The specific call which appeared to give out the error in the first place is the EVR input pin’s IPin::ConnectionMediaType, and the problem is also specific to the Enhanced Video Renderer (observed on Windows 7, but not necessarily only that version).

Investigation

The problem does not appear to be persistent. On the contrary, it takes place for just a few milliseconds after pin connection. Once the problem is gone, it does not seem to ever come up again unless the filter graph is built again from the beginning.

EVR pin connection always reports success, so the subsequent error code VFW_E_NOT_CONNECTED “The operation cannot be performed because the pins are not connected.” goes against documented behavior, and is thus a bug.

Depending on the time between pin connection and media type polling, the call can reach the EVR:

  • before it starts showing the problem – stage A
  • at the time the call fails – stage B
  • after the failure time interval, when the call is successful from then on – stage C

Thus, the problem is limited to specific use cases:

  • the application should care about media type on EVR input
  • the unexpected failure takes place when the call arrives during stage B
  • also found: the clipping window for the EVR has to belong to a non-primary monitor

If an application keeps polling for the media type in a loop, the result may look like the following:

UINT nStageA = 0, nStageB = 0, nStageC = 0;
// [...]
for(; ; )
{
    AM_MEDIA_TYPE MediaType;
    ZeroMemory(&MediaType, sizeof MediaType);
    const HRESULT nConnectionMediaTypeResult = pInputPin->ConnectionMediaType(&MediaType);
    if(SUCCEEDED(nConnectionMediaTypeResult))
    {
        CoTaskMemFree(MediaType.pbFormat); // release the format block here so it is not leaked on the break below
        if(nStageB)
        {
            nStageC++;
            break;
        } else
            nStageA++;
    } else
    {
        ATLASSERT(nConnectionMediaTypeResult == VFW_E_NOT_CONNECTED);
        nStageB++;
    }
}
// [...]
CString sMessage;
sMessage.Format(_T("Bingo!\r\n\r\n") _T("nStageA %d, nStageB %d - 0x%08x, nStageC %d\n"), nStageA, nStageB, nResult, nStageC);
AtlMessageBox(m_hWnd, (LPCTSTR) sMessage, _T("Result"), MB_ICONERROR);

Workaround

An obvious, straightforward workaround is to follow EVR connection with a wait for stage B to pass, or a timeout – whichever takes place first.
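
A minimal sketch of such a wait, assuming a hypothetical helper which is called right after the EVR input pin is connected (the name, timeout and polling interval are made up):

#include <dshow.h>

static HRESULT WaitForConnectionMediaType(IPin* pInputPin, DWORD nTimeoutTime = 500) // milliseconds
{
    const DWORD nStartTime = GetTickCount();
    for(; ; )
    {
        AM_MEDIA_TYPE MediaType;
        ZeroMemory(&MediaType, sizeof MediaType);
        const HRESULT nResult = pInputPin->ConnectionMediaType(&MediaType);
        if(SUCCEEDED(nResult))
        {
            CoTaskMemFree(MediaType.pbFormat);
            return S_OK; // stage B, if any, has passed
        }
        if(nResult != VFW_E_NOT_CONNECTED || GetTickCount() - nStartTime >= nTimeoutTime)
            return nResult; // unexpected failure or timeout
        Sleep(10); // give the EVR background reconnection a chance to complete
    }
}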

Also, vote for the bug on Microsoft Connect.

More Details

Video renderer filters are notorious for re-negotiating media types and being fretful about memory allocators and media types (for a good reason, though!). So it makes sense to suggest that the problem takes place while the filter is doing something related, such as starting a background activity immediately after connection in order to discover the upstream peer's capabilities.

In order to possibly get more detail on this, it is possible to raise an exception as soon as stage B is detected and take a look at the thread states using a debugger. Indeed, one of the background threads is engaged in EVR reconnection activity.

Yes, it does the reconnection; but even though it is expected to do this work under cover and transparently, it still lets a failure surface on the outer API:

     evr.dll!GetSourceRectFromMediaType() + 0x37 bytes    
     evr.dll!CEVRInputPin::CheckMediaType() + 0x81 bytes    
     evr.dll!CBasePin::ReceiveConnection() + 0x61 bytes    
     evr.dll!CEVRInputPin::ReceiveConnection() + 0x1fc2d bytes    
     quartz.dll!CBasePin::AttemptConnection() - 0x21 bytes    
     quartz.dll!CBasePin::TryMediaTypes() + 0x60 bytes    
     quartz.dll!CBasePin::AgreeMediaType() + 0x54 bytes    
     quartz.dll!CBasePin::Connect() + 0x46 bytes    
     quartz.dll!CFilterGraph::ConnectDirectInternal() + 0x83 bytes    
     quartz.dll!CFilterGraph::ConnectRecursively() + 0x2c bytes    
     quartz.dll!CFilterGraph::ConnectInternal() + 0xde bytes    
     quartz.dll!CFilterGraph::Connect() + 0x17 bytes    
     quartz.dll!CFGControl::WorkerDisplayChanged() + 0xf1 bytes    
     quartz.dll!CFGControl::CGraphWindow::OnReceiveMessage() + 0x2e2a bytes    
>    quartz.dll!WndProc() + 0x3e bytes    
     user32.dll!_InternalCallWinProc@20() + 0x23 bytes    
     user32.dll!_UserCallWinProcCheckWow@32() + 0xb7 bytes    
     user32.dll!_DispatchMessageWorker@8() + 0xed bytes    
     user32.dll!_DispatchMessageW@4() + 0xf bytes    
     quartz.dll!ObjectThread() + 0x65 bytes

A test Visual C++ .NET 2010 application is available from SVN. The code requires a media file, and refers to 352×288 I420.avi, which is included in the ZIP file attached to the MS Connect feedback.

IP Video Source: Pure JPEG URLs and Software Version

This does not update the software with new features, but there are a few simple things worth mentioning explicitly.

The first is that the virtual DirectShow camera device can be set up with both M-JPEG and JPEG URLs. That is, IP cameras which do not implement M-JPEG, or implement it in a buggy way (there is a *huge* number of those out there), can still be set up to send video as individual frames/images as long as they implement JPEG snapshots. This often takes place at a lower frame rate, but still works.

The driver automatically detects the type of URL (by the response from the device) and chooses the best access method for the given URL.
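
A hypothetical sketch of how such detection could be done from the HTTP response header (an illustration only, not the driver's actual logic): M-JPEG streams are typically served with a multipart content type, while a snapshot URL answers with a single JPEG image.

#include <string>

enum URLKIND { URLKIND_UNKNOWN, URLKIND_MJPEG, URLKIND_JPEG };

URLKIND DetectUrlKind(const std::string& sContentType)
{
    if(sContentType.find("multipart/x-mixed-replace") != std::string::npos)
        return URLKIND_MJPEG; // continuous M-JPEG stream
    if(sContentType.find("image/jpeg") != std::string::npos)
        return URLKIND_JPEG; // single snapshot, polled once per frame
    return URLKIND_UNKNOWN;
}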

The second is that if you are looking for the IP Video Source software version, such as to check against available updates, it is here in the UI (right-click the caption):