Tag Archives: Audio

Sample: Simultaneous Audio Playback via Waveform Audio (waveOut) API

This minimalistic sample demonstrates that the [deprecated] Waveform Audio API supports multiple simultaneous playback streams.

Depending on command line parameters, the application starts one thread per requested waveform. Each thread opens the audio hardware with its own waveOutOpen call and streams a generated sine wave:

  • 1,000 Hz sine wave as 22,050 Hz, Mono, 16-bit PCM (command line parameter “a”)
  • 5,000 Hz sine wave as 32,000 Hz, Mono, 16-bit PCM (command line parameter “b”)
  • 15,000 Hz sine wave as 44,100 Hz, Mono, 16-bit PCM (command line parameter “c”)
// Open the default output device (WAVE_MAPPER) without a completion callback
Check(waveOutOpen(&hWaveOut, WAVE_MAPPER, &WaveFormatEx, NULL, NULL, CALLBACK_NULL));
ATLASSERT(hWaveOut);
// Allocate the WAVEHDR and 10 seconds of audio data as a single block
const DWORD nDataSize = WaveFormatEx.nAvgBytesPerSec * 10;
HGLOBAL hWaveHeader = GlobalAlloc(GMEM_MOVEABLE | GMEM_SHARE, sizeof (WAVEHDR) + nDataSize);
ATLENSURE_THROW(hWaveHeader, E_OUTOFMEMORY);
WAVEHDR* pWaveHeader = (WAVEHDR*) GlobalLock(hWaveHeader);
ATLENSURE_THROW(pWaveHeader, E_OUTOFMEMORY);
pWaveHeader->lpData = (LPSTR) (pWaveHeader + 1); // Audio data follows the header
pWaveHeader->dwBufferLength = nDataSize;
pWaveHeader->dwFlags = 0;
pWaveHeader->dwLoops = 0;
#pragma region Generate Actual Data
{
    // Fill the buffer with a 16-bit sine wave at nFrequency Hz, peak amplitude 32000
    SHORT* pnData = (SHORT*) pWaveHeader->lpData;
    const SIZE_T nDataCount = pWaveHeader->dwBufferLength / sizeof *pnData;
    for(SIZE_T nIndex = 0; nIndex < nDataCount; nIndex++)
        pnData[nIndex] = (SHORT) (32000 * sin(1.0 * nIndex / WaveFormatEx.nSamplesPerSec * nFrequency * 2 * M_PI));
}
#pragma endregion
// Hand the buffer over to the driver; waveOutWrite returns immediately while playback continues,
// so the block is not freed here and has to outlive playback
Check(waveOutPrepareHeader(hWaveOut, pWaveHeader, sizeof *pWaveHeader));
Check(waveOutWrite(hWaveOut, pWaveHeader, sizeof *pWaveHeader));
GlobalUnlock(hWaveHeader);

The operating system is expected to mix the streams, and the mixing is easy to hear. It is possible to play multiple waveforms within a single process, e.g. with the “abc” command line parameter, and/or to start multiple instances of the application.
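
The buffer size in the excerpt above depends on WAVEFORMATEX fields that are derived from the sampling rate, channel count, and bit depth. The sketch below shows that arithmetic for the three streams. `PcmFormat`, `MakePcmFormat`, and `BufferBytes` are hypothetical names introduced for illustration; the struct is a portable stand-in for WAVEFORMATEX and is not taken from the sample's source.

```cpp
#include <cstdint>

// Portable stand-in for the WAVEFORMATEX fields used by the sample
// (hypothetical; the real structure comes from the Windows SDK).
struct PcmFormat {
    std::uint16_t channels;
    std::uint32_t samplesPerSec;
    std::uint16_t bitsPerSample;
    std::uint16_t blockAlign;      // bytes per sample frame (all channels)
    std::uint32_t avgBytesPerSec;  // bytes per second of audio
};

// For PCM the derived fields follow from the basic ones:
//   blockAlign     = channels * bitsPerSample / 8
//   avgBytesPerSec = samplesPerSec * blockAlign
inline PcmFormat MakePcmFormat(std::uint32_t samplesPerSec,
                               std::uint16_t channels,
                               std::uint16_t bitsPerSample) {
    PcmFormat format{};
    format.channels = channels;
    format.samplesPerSec = samplesPerSec;
    format.bitsPerSample = bitsPerSample;
    format.blockAlign = static_cast<std::uint16_t>(channels * bitsPerSample / 8);
    format.avgBytesPerSec = samplesPerSec * format.blockAlign;
    return format;
}

// Size in bytes of a buffer that holds the given number of seconds of audio
inline std::uint32_t BufferBytes(const PcmFormat& format, std::uint32_t seconds) {
    return format.avgBytesPerSec * seconds;
}
```

For example, `BufferBytes(MakePcmFormat(22050, 1, 16), 10)` gives 441,000 bytes for the “a” stream, which is the data size the excerpt allocates after the header.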

A binary [Win32] and partial Visual C++ .NET 2010 source code are available from SVN.

Utility Clearance: Enumerate Audio ‘MMDevice’s

The utility and its code do a straightforward enumeration of MMDevices (Vista+, check MSDN for MMDevice API availability), that is, the audio endpoints exposed through the MMDevice API, WASAPI, and the Core Audio APIs. The code itself is simple, and a ready-to-use binary lets you quickly look up data of interest.

The data is detailed and comes in an Excel-friendly format (via Copy/Paste).

The code also automatically looks up the names of Windows SDK constants, such as PKEY_Device_FriendlyName:

    Identifier    {0.0.1.00000000}.{4c1a7642-3f91-43e5-8fcf-b4b1e803d3f9}
    State    DEVICE_STATE_DISABLED    0x02
    Properties:
        {B3F8FA53-0004-438E-9003-51A46E139BFC}, 15    16 bytes of BLOB, DA 07 03 00 02 00 09 00 0E 00 39 00 16 00 C5 02    65
        PKEY_Device_DeviceDesc    Stereo Mix    31
        {B3F8FA53-0004-438E-9003-51A46E139BFC}, 6    Realtek High Definition Audio    31
        {B3F8FA53-0004-438E-9003-51A46E139BFC}, 2    {1}.HDAUDIO\FUNC_01&VEN_10EC&DEV_0888&SUBSYS_80860034&REV_1002\4&37D44F2F&0&0201    31
        {83DA6326-97A6-4088-9453-A1923F573B29}, 3    oem29.inf:AzaliaManufacturerID.NTamd64.6.0:IntcAzAudModel:6.0.1.5964:hdaudio\func_01&ven_10ec&dev_0888    31
        PKEY_Device_BaseContainerId    {00000000-0000-0000-FFFF-FFFFFFFFFFFF}    72
        PKEY_Device_ContainerId    {00000000-0000-0000-FFFF-FFFFFFFFFFFF}    72
        PKEY_Device_EnumeratorName    HDAUDIO    31
        PKEY_AudioEndpoint_FormFactor    10    19
        PKEY_AudioEndpoint_JackSubType    {DFF21FE1-F70F-11D0-B917-00A0C9223196}    31
        PKEY_DeviceClass_IconPath    %windir%\system32\mmres.dll,-3018    31
        {840B8171-B0AD-410F-8581-CCCC0382CFEF}, 0    316 bytes of BLOB, 01 00 00 00 38 01 00 00 ... 00 00 00 00    65
        PKEY_AudioEndpoint_Association    {00000000-0000-0000-0000-000000000000}    31
        PKEY_AudioEndpoint_Supports_EventDriven_Mode    1    19
        {24DBB0FC-9311-4B3D-9CF0-18FF155639D4}, 3    0    11
        {24DBB0FC-9311-4B3D-9CF0-18FF155639D4}, 4    -1    11
        {9A82A7DB-3EBB-41B4-83BA-18B7311718FC}, 1    65536    19
        {233164C8-1B2C-4C7D-BC68-B671687A2567}, 1    {2}.\\?\hdaudio#func_01&ven_10ec&dev_0888&subsys_80860034&rev_1002#4&37d44f2f&0&0201#{6994ad04-93ef-11d0-a3cc-00a0c9223196}\rtstereomixwave    31
        {5A9125B7-F367-4924-ACE2-0803A4A3A471}, 0    1610612916    19
        {B3F8FA53-0004-438E-9003-51A46E139BFC}, 0    3    19
        PKEY_Device_FriendlyName    Stereo Mix (Realtek High Definition Audio)    31
        PKEY_DeviceInterface_FriendlyName    Realtek High Definition Audio    31
        PKEY_AudioEndpoint_GUID    {4C1A7642-3F91-43E5-8FCF-B4B1E803D3F9}    31
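
The trailing number on each property row (65, 31, 19, 72, 11) appears to be the PROPVARIANT type code of the value; that is my reading of the dump rather than something stated above. Under that assumption, the following sketch maps the codes seen in the dump, and the device state mask, to their Windows SDK names. `VarTypeName` and `DeviceStateName` are hypothetical helpers; the numeric values are the ones defined by the Windows SDK.

```cpp
#include <string>

// Names for the VARTYPE codes that occur in the dump above
inline std::string VarTypeName(unsigned vt) {
    switch (vt) {
    case 11: return "VT_BOOL";    // -1 is VARIANT_TRUE, 0 is VARIANT_FALSE
    case 19: return "VT_UI4";
    case 31: return "VT_LPWSTR";
    case 65: return "VT_BLOB";
    case 72: return "VT_CLSID";
    default: return "(" + std::to_string(vt) + ")";  // not covered by this sketch
    }
}

// Decodes a DEVICE_STATE_xxx bit mask into a readable string
inline std::string DeviceStateName(unsigned state) {
    static const struct { unsigned mask; const char* name; } kStates[] = {
        { 0x1, "DEVICE_STATE_ACTIVE" },
        { 0x2, "DEVICE_STATE_DISABLED" },
        { 0x4, "DEVICE_STATE_NOTPRESENT" },
        { 0x8, "DEVICE_STATE_UNPLUGGED" },
    };
    std::string result;
    for (const auto& entry : kStates) {
        if (state & entry.mask) {
            if (!result.empty())
                result += " | ";
            result += entry.name;
        }
    }
    return result.empty() ? "0" : result;
}
```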

A binary [Win32, x64] and partial Visual C++ .NET 2010 source code are available from SVN.

Utility Clearance: Generate PCM .WAV File

The GeneratePcmWavFile tool generates PCM .WAV files with the requested parameters, either silent or containing sine wave data. The utility uses the WavDest SDK sample as a multiplexer, so the sample is expected to be available on the system.

It is possible to define the following audio data parameters:

  • sampling frequency, number of samples per second, such as 44100 or 48000
  • number of channels; the utility does not constrain this to be stereo or 5.1, it will be able to create 64 or 128 channel audio data as well
  • 8 or 16 bit audio samples
    • 16-bit PCM only: sine wave signal parameters, that is, frequency in Hz and amplitude/loudness relative to full scale. 0 dB is maximal loudness, and an argument of 23 produces audio at -23 dB (that is, -23 dBFS; see also the EBU R128 specification: depending on frequency, the signal may be used as a reference source for normalized -23 LUFS audio)
  • file duration in seconds
Microsoft Windows [Version 6.1.7601]
Copyright (c) 2009 Microsoft Corporation.  All rights reserved.

C:\>GeneratePcmWavFile.exe
Syntax: GeneratePcmWavFile <options> <output-path>
  /s:N: Sampling Rate N, Hz
  /c:N: Channel Count N
  /b:N: Sample Bit Count N, 8 or 16
  /t:N: Length N, seconds
  /f:N: Sine Signal Frequency N, Hz
  /l:N: Sine Signal Loudness N, dB below full scale
  /n:N: Noise Signal Loudness N, dB below full scale
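
The /l loudness argument is given in dB below full scale, so it has to be converted to a linear amplitude before samples are generated. A minimal sketch of that conversion for 16-bit PCM follows. It assumes full scale is 32767 and that the value is rounded to the nearest integer; the tool's actual choices are not stated here, and `AmplitudeFromLoudness` and `GenerateSine` are hypothetical names.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Converts "dB below full scale" to a linear peak amplitude:
//   amplitude = fullScale * 10^(-dB / 20)
// Assumption: 16-bit full scale is taken as 32767.
inline double AmplitudeFromLoudness(double dbBelowFullScale, double fullScale = 32767.0) {
    return fullScale * std::pow(10.0, -dbBelowFullScale / 20.0);
}

// Generates mono 16-bit sine samples at the given frequency and loudness
inline std::vector<std::int16_t> GenerateSine(unsigned sampleRate, double frequency,
                                              double dbBelowFullScale, double seconds) {
    const double kTwoPi = 6.283185307179586;
    const double amplitude = AmplitudeFromLoudness(dbBelowFullScale);
    const std::size_t count = static_cast<std::size_t>(sampleRate * seconds);
    std::vector<std::int16_t> samples(count);
    for (std::size_t index = 0; index < count; ++index) {
        const double phase = kTwoPi * frequency * index / sampleRate;
        samples[index] = static_cast<std::int16_t>(std::lround(amplitude * std::sin(phase)));
    }
    return samples;
}
```

With these assumptions, an argument of 20 yields a peak of about 3277, one tenth of full scale, and an argument of 23 yields a peak of about 2320.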

A binary [Win32] and partial Visual C++ .NET 2010 source code are available from SVN.

UPDATE: An extra /n parameter lets the application add random noise at the given loudness.

Adobe Flash Media Live Encoder 3.1

This is the first time ever (probably because I am not as experienced as Geraint) that I have witnessed an IMediaSample interface being available without IMediaSample2. One might be curious what kind of software could provide such a weirdo in 2010. It is the latest and greatest Adobe Flash Media Live Encoder 3.1.

In addition, they decided to pass garbage in the AM_MEDIA_TYPE::formattype field of the media type given to IAMStreamConfig::SetFormat. Perhaps they learned that the standard Audio Capture Filter ignores it anyway, so why bother?

With that fixed, the Tone Source Filter based virtual audio device is now compatible with Adobe Flash Media Live Encoder 3.1, and it is possible to send audio to a remote Flash Media Server, such as the Ustream.tv service (over the RTMP protocol, as implemented by FMLE).

Audio Oscillogram DirectShow Filter

Unlike video, audio is difficult to troubleshoot: by the time an issue has been noticed, it is already in the past, and it may be difficult to reproduce, explain, or forward as a screenshot. So audio needs a visualization in order to check specifics of the media stream.

Here we go with an Audio Oscillogram Filter, which may be inserted into the audio part of the graph. It insists on PCM audio (multichannel is OK, 8 or 16 bits per sample), so decoders might be required upstream and will be inserted automatically.

Once the filter is running, it shows a tool window with a visualization of the waveform coming through. It also marks a discontinuity sample (that is, a sample with the AM_SAMPLE_DATADISCONTINUITY flag) with a thick red vertical line, and a mismatch (gap or overlap) between neighboring samples' stop and start times with a thin red line.

Alax.Info Audio Oscillogram Filter
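
The two kinds of markers described above can be sketched as a simple pass over sample timestamps. This is an illustration of the idea, not the filter's code: `SampleTimes`, `Marker`, and `ClassifySamples` are hypothetical names, and the exact-equality comparison of times is an assumption (the filter may well allow some tolerance).

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Start/stop times of a media sample in 100 ns units, plus its discontinuity flag
struct SampleTimes {
    std::int64_t start;
    std::int64_t stop;
    bool discontinuity;
};

enum class Marker {
    None,
    Discontinuity,  // thick red line: the sample is flagged as a discontinuity
    TimeMismatch    // thin red line: start time differs from the previous stop time
};

// Decides which marker, if any, each sample would get
inline std::vector<Marker> ClassifySamples(const std::vector<SampleTimes>& samples) {
    std::vector<Marker> markers;
    markers.reserve(samples.size());
    for (std::size_t index = 0; index < samples.size(); ++index) {
        Marker marker = Marker::None;
        if (samples[index].discontinuity)
            marker = Marker::Discontinuity;
        else if (index > 0 && samples[index].start != samples[index - 1].stop)
            marker = Marker::TimeMismatch;  // gap (start > stop) or overlap (start < stop)
        markers.push_back(marker);
    }
    return markers;
}
```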

In addition, the window provides an option to pop up the downstream audio renderer's property sheet in order to check/debug rendering statistics, such as buffer fullness and break count (programmatically available via the IAMAudioRendererStats interface). Unfortunately, Microsoft does not provide a proper proxy/stub pair for this interface and the property page, so there is no way to access the property page when connecting to the graph remotely. With this filter, the property sheet is shown directly from the hosting process and works correctly.

Audio Renderer Properties

A partial Visual C++ .NET 2008 source code is available from SVN, release binary included.

File and Class Summary

Utility.dll

Utility.dll (download) hosts the following classes:

  • DirectShow Filters
    • Audio Oscillogram Filter, to visualize/debug audio data

Class Overview

Audio Oscillogram Filter

The filter visualizes PCM audio data going through in a form of oscillogram.

Remarks

The filter creates a tool window with the audio visualization when put into the paused/running state. The filter graph should be controlled (IMediaControl) from a GUI-aware apartment so that window messages can reach the filter's internal window.

Windows Media Codec List

The Windows Media Codec List utility uses the IWMCodecInfo interface (see also IWMCodecInfo2, IWMCodecInfo3) to list installed Windows Media codecs and their formats, and presents the findings in a convenient way. The utility gives a quick idea of what a programmer would obtain through the IWMCodecInfo2/IWMCodecInfo3 interfaces and which well-known format structures (WM_MEDIA_TYPE, AM_MEDIA_TYPE, WAVEFORMATEX, VIDEOINFOHEADER) correspond to a particular format.

For a description of Windows Media video and audio codecs, see the article Encoding Audio and Video with Windows Media Codecs.

Windows Media Codec List Screenshot

The Copy button copies the discovered information to the clipboard in comma-separated values (CSV) format (e.g. suitable for Microsoft Excel). The Submit button posts the same information to this website for… possibly further aggregation.

Some quick facts immediately visualized by the utility:

  • for a video codec there is exactly one generic codec format listed
  • video codec FOURCCs are: WM Video – WMV1, WMV2, WMV3; WM Video Screen – MSS1, MSS2; WM Video Image – WMVP, WVP2; WM Video Advanced Profile – WVC1
  • for audio codecs there are complete codec formats enumerated, with names/descriptions suitable for GUI
  • audio codecs enumerate different formats in response to enumeration settings (e.g. request for VBR formats)
  • WM Audio Lossless only lists formats for single pass VBR mode
  • audio format tags (wFormatTag) are: WM Audio including Professional and Lossless – 0x0161, 0x0162, 0x0163; WM Audio Voice – 0x000A; ACELP.net – 0x0130
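
The FOURCC codes and format tags listed above are plain integers, and the following sketch shows how they are packed and recognized. `MakeFourCC`, `FourCCToString`, and `AudioFormatTagName` are hypothetical helpers. The split of 0x0161/0x0162/0x0163 between the standard, Professional, and Lossless codecs follows the Windows SDK format tag definitions rather than the list above, which groups them together.

```cpp
#include <cstdint>
#include <string>

// Packs four characters into a FOURCC, first character in the lowest byte
// (the same layout the MAKEFOURCC macro produces)
inline std::uint32_t MakeFourCC(char a, char b, char c, char d) {
    return static_cast<std::uint32_t>(static_cast<unsigned char>(a))
         | static_cast<std::uint32_t>(static_cast<unsigned char>(b)) << 8
         | static_cast<std::uint32_t>(static_cast<unsigned char>(c)) << 16
         | static_cast<std::uint32_t>(static_cast<unsigned char>(d)) << 24;
}

// Unpacks a FOURCC back into its four characters
inline std::string FourCCToString(std::uint32_t fourcc) {
    std::string text(4, ' ');
    for (int index = 0; index < 4; ++index)
        text[index] = static_cast<char>((fourcc >> (8 * index)) & 0xFF);
    return text;
}

// Names for the audio format tags (wFormatTag) mentioned in the list
inline std::string AudioFormatTagName(std::uint16_t formatTag) {
    switch (formatTag) {
    case 0x000A: return "WM Audio Voice";
    case 0x0130: return "ACELP.net";
    case 0x0161: return "WM Audio";
    case 0x0162: return "WM Audio Professional";
    case 0x0163: return "WM Audio Lossless";
    default:     return "unknown";
    }
}
```

For example, `MakeFourCC('W', 'M', 'V', '3')` is 0x33564D57, which is also the value found in the Data1 field of the corresponding media subtype GUID.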

Partial Visual C++ .NET 2008 source code is available from SVN, release binary included.

Confusing AUDIO_STREAM_CONFIG_CAPS

I don’t have any idea who makes software nowadays, but how can it be expected to be reliable?

Intel DG33FBC motherboard, onboard Realtek ALC888 High Definition Audio. I am tracing the AUDIO_STREAM_CONFIG_CAPS capabilities reported by the onboard audio capture device. Here is one of them:

AM_MEDIA_TYPE:

majortype {73647561-0000-0010-8000-00AA00389B71}, subtype {00000001-0000-0010-8000-00AA00389B71}, pUnk 0x00000000
bFixedSizeSamples 1, bTemporalCompression 0, lSampleSize 4
formattype {05589F81-C356-11CE-BF01-00AA0055595A}, cbFormat 18, pbFormat 0x002911a8
pbFormat as WAVEFORMATEX:
  wFormatTag 1
  nChannels 2
  nSamplesPerSec 8000
  nAvgBytesPerSec 32000
  nBlockAlign 4
  wBitsPerSample 16
  cbSize 0

AUDIO_STREAM_CONFIG_CAPS:

guid {73647561-0000-0010-8000-00AA00389B71}
MinimumChannels 1, MaximumChannels 2, ChannelsGranularity 1
MinimumBitsPerSample 8, MaximumBitsPerSample 16, BitsPerSampleGranularity 8
MinimumSampleFrequency 11025, MaximumSampleFrequency 44100, SampleFrequencyGranularity 11025

The media type's sampling frequency is 8 kHz (correct), but the associated capabilities structure still reports a different sampling rate range and granularity (crap). The structure in fact reports 11025..44100 Hz for all capabilities, including those whose media type sampling frequency falls outside that range.
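
The mismatch can be detected mechanically by checking each media type against the range and granularity of the capabilities structure it is paired with. A minimal sketch follows, using portable stand-ins for the relevant WAVEFORMATEX and AUDIO_STREAM_CONFIG_CAPS fields. `AudioFormat`, `AudioCaps`, `InRange`, and `IsConsistent` are hypothetical names, and treating granularity as a step counted from the minimum is my assumption about how the fields are meant to be read.

```cpp
// Portable stand-ins for the fields of interest (hypothetical types)
struct AudioFormat {
    unsigned long channels;
    unsigned long samplesPerSec;
    unsigned long bitsPerSample;
};

struct AudioCaps {
    unsigned long minChannels, maxChannels, channelsGranularity;
    unsigned long minBits, maxBits, bitsGranularity;
    unsigned long minFrequency, maxFrequency, frequencyGranularity;
};

// True if value lies in [minimum, maximum] and on a granularity step from minimum
inline bool InRange(unsigned long value, unsigned long minimum,
                    unsigned long maximum, unsigned long granularity) {
    if (value < minimum || value > maximum)
        return false;
    return granularity == 0 || (value - minimum) % granularity == 0;
}

// True if the media type fits the capabilities reported alongside it
inline bool IsConsistent(const AudioFormat& format, const AudioCaps& caps) {
    return InRange(format.channels, caps.minChannels, caps.maxChannels, caps.channelsGranularity)
        && InRange(format.bitsPerSample, caps.minBits, caps.maxBits, caps.bitsGranularity)
        && InRange(format.samplesPerSec, caps.minFrequency, caps.maxFrequency, caps.frequencyGranularity);
}
```

Applied to the dump above (2 channels, 8000 Hz, 16 bits against 11025..44100 Hz in steps of 11025), the check fails on the sampling frequency alone, while channels and bit depth pass.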