Category Archives: Technology - Page 2

So, how many EVRs you can do?

Direct3D based DirectShow video renderers – Video Mixing Renderer 9 and Enhanced Video Renderer – have been notoriously known for consuming resources in a way that you can run at most X simultaneously. There has been no comment published on the topic and questions (e.g. like this: How many VMR 9 can a PC support concurrently) remain unanswered for a long time. Video Mixing Renderer 7 was a good alternative for some time in past until it was cut down to be unable to support hardware scaling (thanks Microsoft!).

The trendy way to render video nowadays is using Enhanced Video Renderer, a Media Foundation subsystem with an interface into DirectShow to take over state of the art video rendering capabilities. So, how many EVRs one can run simultaneously? Chances are that it is less than one could suppose.

The interesting part is that there is no obvious evidence on type of resource running out, which causing next EVR instance to fail to run. And not even run, the failure seem to be coming up at an earlier stage of just connecting pins in stopped state. The failure might be accompanied with errors like E_INVALIDARG, ERROR_FILE_NOT_FOUND, E_UNEXPECTED. The actual limit appear to be loosely correlating to parameters of video output, such as resolution and bitness.

Desperately waiting for clarification, I am sharing the tool to estimate the limit:

  • multiple EVR instances at once, hosted by multiple windows, which can be distributed across multiple monitors
  • choices of resolutions and formats
  • a double click on an individual renderer pops up property page set displaying effective frame rate
  • 32- and 64-bit versions

A binary [Win32x64] and partial Visual C++ .NET 2010 partial source code are available from SVN.

File Mappings: Virtual Memory and Virtual Address Space

More and more applications hit the Windows limit of available address space for 32-bit applications, and the whole concept becomes more important for understanding due to necessity to work things around.

A thing, which is more or less easy to understand, is that a user mode 32-bit application can address 2^32 addresses. The addresses are not directly physical RAM and the operating system is responsible for management of the mapping addresses into RAM as a part of virtual memory manager operation. Paged memory organization is well documented on MSDN, and the questions has been raised numerous times. An interesting question is whether a 32-bit application can effectively manage memory amounts exceeding address space limits.

Back in 80386 times, the systems could address megabytes of RAM in 16-bit code through XMS and EMS services. The application could access “high” memory addresses by requesting mapping portions of RAM into lower megabyte address space. In some way similar technique is also here for 32-bit applications in Windows through use of file mappings.

A regular memory backed file mapping requests Windows to reserve a memory block which becomes available for mapping into address space of one or more processes. Creating file mapping itself does not imply mapping and this leaves a great option for the owner to allocate more data than it can actually map into address space: if 32-bit process virtual address space is fundamentally constrained, the file mapping allocation space is more loosely limited by amount of physical memory and paging file. The application can allocate 2, 3, 4 and more gigabytes of memory – it just cannot still map it all together into address space and make it available simultaneously.

The FileMappingVirtualAddress utility does a simple thing:

  • on startup it allocates (CreateFileMapping) as many 256 MB file mappings as operating system would allow, and shows it in a list
  • each time a user checks a box, the application maps (MapViewOfFile) corresponding file mapping into address space; unchecking a box unmaps the view
  • the caption shows currently used and maximal available virtual address space

A plain 32-bit version of the application allocated 51 blocks for me (which totals in 13 GB of memory, with 8 GB physical RAM installed in the system). The allocation takes place immediately because the operating system does not actually make all this memory prepared for use – the actual pages would be allocated and ready to use on demand when the application requires them.

The most important part made so obvious is that the 32-bit application succeeds in allocating well over 4 GB, which is maximal virtual address space it can ever get.

The virtual address space in use is only 1641 MB and another request to map an additional section with MapViewOfFile would fail (the default address space limit is 2 GB) – space fragmentation make mapping unavailable earlier than we actually use the whole space, since the API would need to allocate contiguous range of addresses to satisfy the request.

32-bit application built with /LARGEADDRESSAWARE parameter might manage to do more allocations: 64-bit versions of Windows provide 4 GB of addresses to 32-bit processes. 32-bit operating systems might also be extending the limit in case of 4GB RAM Tuning (which would typically be 3 GB of space for a process).

Finally, 64-bit build of the application is free from virtual address space limit as the limit is 8 terabytes. The mapping is again instantaneous because actual RAM will be supplied on first request to mapped pages only.

A binary [Win32, Win32 with /LARGEADDRESSAWARE, x64] and partial Visual C++ .NET 2010 partial source code are available from SVN.

DirectShow SDK at Crime Scene Investigation Service

According to CSI: Crime Scene Investigation, Season 06 Episode 13 (17:04), investigators are using snake scope camera with DirectShow AMCAP tool to present the captured image:

Hardware assisted memory corruption detection

So you got a memory corruption issue with a piece of software. It comes in a unique scenario along the line of having a huge pile of weird code running well most of the time and then, right out of the blue, a corruption takes place followed by unexpected code execution and unstable software state in general.

The biggest problem with memory corruption is that a fragment of code is modifying a memory block which it does not own, and it has no idea who actually is the owner of the block, while the real owner has no timely way to detect the modification. You only face the consequences being unable to capture the modification moment in first place.

To get back to the original cause, an engineer has to drop into a time machine, turn back time and step back to where the trouble took originally place. As developers are not actually given state-of-the-art time machines, the time turning step is speculative.

CVirtualHeapPtr Class: Memory with Exception-on-Write access mode

At the same time a Windows platform developer is or might be aware of virtual memory API which among other things provides user mode application with capabilities to define memory protection modes. Having this on hands opens unique opportunity to apply read-only protection (PAGE_READONLY) onto a memory block and have exception raised at the very moment of unexpected memory modification, having call stack showing up a source of the problem. I refer to this mode of operation as “hardware assisted” because the access violation exception/condition would be generated purely in hardware without any need to additionally do any address comparison in code.

Needless to say that this way is completely convenient for the developer as he does not need to patch the monstrous application all around in order to compare access addresses against read-only fragment. Instead, a block defined as read-only will be immediately available as such for the whole process almost without any performance overhead.

As ATL provides a set of memory allocator templates (CHeapPtr for heap backed memory blocks, allocated with CCRTAllocator, alternate options include CComHeapPtr with CComAllocator wrapping CoTaskMemAlloc/CoTaskMemFree API), let us make an alternate allocator option that mimic well-known class interface and would facilitate corruption detection.

Because virtual memory allocation unit is a page, and protection mode is defined for the whole page, this would be the allocation granularity. For a single allocated byte we would need to request SYSTEM_INFO::dwPageSize bytes of virtual memory. Unlike normal memory heap manager, we have no way to share pages between allocations as we would be unable to effectively apply protection modes. This would definitely increase application pressure onto virtual memory, but is still acceptable for the sacred task of troubleshooting.

We define a CVirtualAllocator class to be compatible with ATL’s CCRTAllocator, however based on VirtualAlloc/VirtualFree API. The smart pointer class over memory pointer would be defined as follows:

template <typename T>
class CVirtualHeapPtr :
    public CHeapPtr<T, CVirtualAllocator>
{
public:
// CVirtualHeapPtr
    CVirtualHeapPtr() throw();
    explicit CVirtualHeapPtr(_In_ T* pData) throw();
    VOID SetProtection(DWORD nProtection)
    {
        // TODO: ...
    }
};

The SetProtection method is to define memory protection for the memory block. Full code for the classes is available on Trac here (lines 9-132):

  • CGlobalVirtualAllocator class is a singleton querying operating system for virtual memory page size, and provides alignment method
  • CVirtualAllocator class is a CCRTAllocator-compatible allocator class
  • CVirtualHeapPtr class is smart template class wrapping a pointer to allocated memory

Use case code will be as follows. “SetProtection(PAGE_READONLY)” enables protection on memory block and turns on exception generation at the moment memory block modification attempt. “SetProtection(PAGE_READWRITE)” would restore normal mode of memory operation.

CVirtualHeapPtr<BYTE> p;
p.Allocate(2);
p[1] = 0x01;
p.SetProtection(PAGE_READONLY);
// NOTE: Compile with /EHa on order to catch the exception
_ATLTRY
{
    p[1] = 0x02;
    // NOTE: We never reach here due to exception
}
_ATLCATCHALL()
{
    // NOTE: Catching the access violation for now to be able to continue execution
}
p.SetProtection(PAGE_READWRITE);
p[1] = 0x03;

Given the information what data gets corrupt, the pointer allocator provides an efficient opportunity to detect the violation attempt. The only thing remained is to keep memory read-only, and temporarily revert to write access when the “legal” memory modification code is about to be executed.

Read more »

GPS Location/Coordinate Converter: Multiple Locations at Once

Today’s update lets you convert multiple locations at once with a single click. Here is the story behind the update and use case scenario.

In rally raid sport events (so called baja), a team gets a road book for the next competition day in a few hours before actual start. The GPS coordinates are printed on one of the pages of the roadbook and are not available in any electronic format.

There were just a few times when the organizer also uploaded a copy of a file with the coordinates and shared a link to download from, but this was rather an exception. Another alternate option was a dedicated person to upload the coordinates (they were earlier full tracks, but at some point tracks were no longer available at all) to pilots’ hardware, but in a state of pre-start рфыеу and variety of GPS hardware, formats, cable etc. this created lines of people. The most one can rely on is a sheet of paper with GPS coordinates. The mistery does not end even here as you don’t know whether you are to get Degrees only, or Degrees and Minutes, or Degrees, Minutes and Seconds. Everything depends on software the organizer uses.

As soon as you get a hard copy of this, the idea is to upload it into device as quickly as possible because there are other things to do and the time is normally 11 PM when the race is to start 7 AM next day tens of miles away from you. The time interval will be shared by uploading data, sleeping and transfer to start location.

The utility is here to grant extra sleep time. Since it is capable to accept various separators on the input, a convenient way is to quickly type in the text in Microsoft Excel, check the data against the hardcopy, and copy into clipboard to transfer to the utility.

A hotkey with conversion transfers data into format of interest, and single “Find and Replace” operation creates a good OziExplorer waypoint file which is good for upload onto portable navigation device.

The whole thing take a few minutes to do with minimal routine typing in.

A binary [Win32] and partial Visual C++ .NET 2010 partial source code are available from SVN.

Bonus picture, rally raid Suzuki is on the way to score the victory and the rally promotional teaser:

Rally Raid Suzuki Samurai on the Way

Common Controls: Versions, Compatibility, WTL

An application appears to be not working in Windows XP in a weird way: rebar control appeared to fail showing up.

Where the application is expected to look much nicer with rebar control as a container for menu (implemented as command bar WTL control) and toolbar with buttons:

A WTL sample project generated by a Visual Studio wizard would never give such effect, and the bug was a combination of factors:

  1. An application built with a newer version of Windows SDK, which includes support for features (Windows Vista+ Common Controls) that are more recent than production environment (Windows XP); the application targets to Windows Vista+ environment too (_WIN32_WINNT >= 0×0600)
  2. Compatibility issues of Common Controls library
  3. WTL version (7.5), which did not yet include a workaround for the problem

The problem, which caused the bug directly was the REBARBANDINFO structure and its use as an argument with Common Controls API. As MSDN shows, the structure was amended twice with additional fields.

One of the way to support multiple versions of the structure definition, and to resolve compatibility issues, is to embed structure size into structure payload. In fact, REBARBANDINFO::cbSize member is there exactly for this reason.

The application is normally filling cbSize with the maximal known structure size and fills the rest of the fields respectively. The API is expected to be checking cbSize member and be detecting API version compatibility scenarios:

  1. cbSize holds exactly the value the API expects (that is, the maximal value known/defined to the API) – the simplest scenario where the API and the application are on the same page, both are using the same versions of the “protocol”/interface.
  2. cbSize is smaller than API can support – the API sees that it is dealing with a sort of legacy application which cannot utilize all available features, and the API acts respectively supporting the older part of the protocol, and keeping defaults or “old look” for the rest of implementation. This addresses backward compatibility: the newer API works with apps designed for older version of the API
  3. cbSize is greater then API can support – the API sees that the application is already aware of newer version API and is possibly requesting some of the missing features. The API might be ignoring the unsupported part in assumption that API evolution tool place keeping some compatibility in mind, and still do the best it can with the existing implementation. Or, the API might just fail to work.

The latter item #3 is the scenario here with rebar control. The application is using Windows Vista version of REBARBANDINFO structure and Windows XP implementation choses to completely fail.

While it does not seem to be directly a bug, this attitude is definitely not developer friendly: there is no reason for the control to not work in its best and default way. Having API acting this way, each developer using the API needs to take care of the situation explicitly: whenever Windows Vista enabled application needs to be able to run in Windows XP system, the code around REBARBANDINFO would look like this:

REBARBANDINFO BandInformation = { sizeof BandInformation, RBBIM_LPARAM };
#if _WIN32_WINNT >= 0x0600
if(GetOsVersion() < 0x00060000 || GetCommCtrlVersion() < 0x00060000) // pre-Vista, Common Controls pre-6.0
    BandInformation.cbSize = REBARBANDINFO_V6_SIZE;
#endif// _WIN32_WINNT >= 0x0600
const BOOL bGetBandInfoResult = Rebar.GetBandInfo(0, &BandInformation);

If the API was nicer to developers, the code would be plain and simple:

REBARBANDINFO BandInformation = { sizeof BandInformation, RBBIM_LPARAM };
const BOOL bGetBandInfoResult = Rebar.GetBandInfo(0, &BandInformation);

To address this problem, WTL 8.0 comes up with RunTimeHelper namespace and its SizeOf_REBARBANDINFO function. It takes care of details for the developer choosing the proper size of the structure on runtime. The code is being taken back to a simpler shape:

REBARBANDINFO BandInformation = { RunTimeHelper::SizeOf_REBARBANDINFO(), RBBIM_LPARAM };
const BOOL bGetBandInfoResult = Rebar.GetBandInfo(0, &BandInformation);

All in all:

  • be aware of compatibility issues (same scenario exists with other SDK structures: LVGROUP, LVTILEINFO, MCHITTESTINFO, NONCLIENTMETRICS and other).
  • use latest version of WTL to have things worked around for you where Microsoft developers were not kid enough to provide perfect API
  • be aware and take advantage of WTL’s RunTimeHelper class

GPS Location/Coordinate Converter: Fractional Seconds, More Shortcuts

This adds a small update to the recently published GPS Location/Coordinate Converter utility:

  • Seconds in Degrees, Minutes & Seconds notation are shown and are accepted as floating point numbers
  • More shortcuts to popular online map services (note that only Google Maps and Yandex Maps are still accepted as input via clipboard):
    • Bing Maps
    • Yahoo Maps
    • Open Street Map
    • WikiMapia

The latter makes the tool an easy to use converted between the services for a GPS POI.

A binary [Win32] and partial Visual C++ .NET 2010 partial source code are available from SVN.