So you got a memory corruption issue with a piece of software. It typically comes in a scenario along the lines of a huge pile of weird code running well most of the time, and then, right out of the blue, a corruption takes place, followed by unexpected code execution and a generally unstable software state.
The biggest problem with memory corruption is that a fragment of code modifies a memory block it does not own and has no idea who the actual owner is, while the real owner has no timely way to detect the modification. You only face the consequences, unable to capture the moment of modification in the first place.
To get back to the original cause, an engineer has to drop into a time machine, turn back time and step back to where the trouble originally took place. As developers are not actually given state-of-the-art time machines, the time-turning step is speculative.
CVirtualHeapPtr Class: Memory with Exception-on-Write access mode
At the same time, a Windows platform developer is, or might be, aware of the virtual memory API, which among other things lets a user mode application define memory protection modes. Having this at hand opens a unique opportunity to apply read-only protection (PAGE_READONLY) to a memory block and have an exception raised at the very moment of an unexpected memory modification, with the call stack showing the source of the problem. I refer to this mode of operation as “hardware assisted” because the access violation condition is detected purely in hardware, without any need for additional address comparison in code.
Needless to say, this way is very convenient for the developer, who does not need to patch the monstrous application all over in order to compare access addresses against the read-only fragment. Instead, a block defined as read-only is immediately treated as such by the whole process, with almost no performance overhead.
As ATL provides a set of memory allocator templates (CHeapPtr for heap-backed memory blocks allocated with CCRTAllocator; alternate options include CComHeapPtr with CComAllocator wrapping the CoTaskMemAlloc/CoTaskMemFree API), let us make an alternate allocator option that mimics the well-known class interface and facilitates corruption detection.
Because the virtual memory allocation unit is a page, and the protection mode is defined for the whole page, the page is our allocation granularity. For a single allocated byte we need to request SYSTEM_INFO::dwPageSize bytes of virtual memory. Unlike a normal heap manager, we cannot share pages between allocations, because protection could then not be applied to one block independently of the others. This definitely increases the application's pressure on virtual memory, but is still acceptable for the sacred task of troubleshooting.
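The rounding described above can be sketched as a small helper. AlignToPageSize is a hypothetical name, not the alignment method from the linked source, and the 4096-byte page size used in the comments is an assumption that is typical for x86/x64 Windows; real code would read SYSTEM_INFO::dwPageSize through GetSystemInfo.

```cpp
#include <cstddef>

// Hypothetical helper (not from the linked source): rounds nSize up to
// a whole number of pages of nPageSize bytes each.
// Assuming a 4096-byte page:
//   AlignToPageSize(1, 4096)    == 4096
//   AlignToPageSize(4096, 4096) == 4096
//   AlignToPageSize(4097, 4096) == 8192
inline std::size_t AlignToPageSize(std::size_t nSize, std::size_t nPageSize)
{
    return ((nSize + nPageSize - 1) / nPageSize) * nPageSize;
}
```

The division-based form is used here so that the helper does not depend on the page size being a power of two.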
We define a CVirtualAllocator class to be compatible with ATL’s CCRTAllocator, but based on the VirtualAlloc/VirtualFree API. The smart pointer class over the memory pointer is defined as follows:
template <typename T>
class CVirtualHeapPtr :
    public CHeapPtr<T, CVirtualAllocator>
{
public:
// CVirtualHeapPtr
    CVirtualHeapPtr() throw();
    explicit CVirtualHeapPtr(_In_ T* pData) throw();
    VOID SetProtection(DWORD nProtection)
    {
        // TODO: ...
    }
};
The SetProtection method defines the memory protection for the memory block. Full code for the classes is available on Trac here (lines 9-132):
- CGlobalVirtualAllocator class is a singleton that queries the operating system for the virtual memory page size and provides an alignment method
- CVirtualAllocator class is a CCRTAllocator-compatible allocator class
- CVirtualHeapPtr class is a smart pointer template class wrapping a pointer to allocated memory
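As a rough illustration of what such classes have to do underneath, the sketch below allocates whole pages, switches their protection and releases them. The function names (AllocatePages, ProtectPages, FreePages) are hypothetical and not taken from the linked source. On Windows they map to VirtualAlloc, VirtualProtect and VirtualFree; the non-Windows branch is an assumption added only so that the sketch can be compiled elsewhere.

```cpp
#include <cstddef>
#ifdef _WIN32
#include <windows.h>
#else
#include <sys/mman.h>
#endif

// Hypothetical helpers, not the classes from the linked source.

// Allocates read-write pages covering nSize bytes; returns NULL on failure.
// The system rounds the size up to a whole number of pages.
inline void* AllocatePages(std::size_t nSize)
{
#ifdef _WIN32
    return VirtualAlloc(NULL, nSize, MEM_RESERVE | MEM_COMMIT, PAGE_READWRITE);
#else
    void* pData = mmap(NULL, nSize, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return (pData == MAP_FAILED) ? NULL : pData;
#endif
}

// Makes the pages read-only or read-write; returns true on success.
inline bool ProtectPages(void* pData, std::size_t nSize, bool bReadOnly)
{
#ifdef _WIN32
    DWORD nPreviousProtection = 0; // VirtualProtect requires a valid output pointer
    return VirtualProtect(pData, nSize, bReadOnly ? PAGE_READONLY : PAGE_READWRITE, &nPreviousProtection) != FALSE;
#else
    return mprotect(pData, nSize, bReadOnly ? PROT_READ : (PROT_READ | PROT_WRITE)) == 0;
#endif
}

// Releases the pages obtained from AllocatePages; returns true on success.
inline bool FreePages(void* pData, std::size_t nSize)
{
#ifdef _WIN32
    (void) nSize; // With MEM_RELEASE the size argument must be zero
    return VirtualFree(pData, 0, MEM_RELEASE) != FALSE;
#else
    return munmap(pData, nSize) == 0;
#endif
}
```

A SetProtection method could then be little more than a call along the lines of ProtectPages on the wrapped pointer, with the block size remembered at allocation time.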
The use case code is as follows. “SetProtection(PAGE_READONLY)” enables protection on the memory block and turns on exception generation at the moment of a memory block modification attempt. “SetProtection(PAGE_READWRITE)” restores the normal mode of memory operation.
CVirtualHeapPtr<BYTE> p;
p.Allocate(2);
p[1] = 0x01;
p.SetProtection(PAGE_READONLY);
// NOTE: Compile with /EHa in order to catch the exception
_ATLTRY
{
    p[1] = 0x02;
    // NOTE: We never reach here due to exception
}
_ATLCATCHALL()
{
    // NOTE: Catching the access violation for now to be able to continue execution
}
p.SetProtection(PAGE_READWRITE);
p[1] = 0x03;
Given the information about which data gets corrupted, the pointer allocator provides an efficient opportunity to detect the violation attempt. The only thing that remains is to keep the memory read-only and temporarily revert to write access when the “legal” memory modification code is about to be executed.
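One possible way to make the temporary revert hard to forget is a scope guard. The sketch below is an assumption, not part of the linked source: CScopedProtection is a hypothetical helper that works with any pointer class exposing a SetProtection method, such as CVirtualHeapPtr, and restores the given protection when the scope is left, including when it is left through a C++ exception.

```cpp
// Hypothetical RAII guard (not from the linked source): applies a temporary
// protection mode in the constructor and restores another one in the destructor.
// TPointer is any class with a SetProtection(protection) method.
template <typename TPointer>
class CScopedProtection
{
public:
    CScopedProtection(TPointer& Pointer, unsigned long nTemporaryProtection, unsigned long nRestoredProtection) :
        m_Pointer(Pointer),
        m_nRestoredProtection(nRestoredProtection)
    {
        m_Pointer.SetProtection(nTemporaryProtection);
    }
    ~CScopedProtection()
    {
        m_Pointer.SetProtection(m_nRestoredProtection);
    }

private:
    // Non-copyable: exactly one guard is responsible for restoring protection
    CScopedProtection(const CScopedProtection&);
    CScopedProtection& operator = (const CScopedProtection&);

    TPointer& m_Pointer;
    unsigned long m_nRestoredProtection;
};

// Intended use with the pointer from the use case above:
//
//   {
//       CScopedProtection<CVirtualHeapPtr<BYTE> > Guard(p, PAGE_READWRITE, PAGE_READONLY);
//       p[1] = 0x04; // "legal" modification
//   }
//   // p is read-only again here
```

Passing both protection values explicitly keeps the guard independent of Windows headers and leaves the choice of the restored mode to the caller.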