More Speech Codecs

By the way, more speech codecs are expected to be packaged into the DMO interface soon. With some luck it will go even further, to video coding…

To appear:

  • AMR-WB G.722.2
  • G.711
  • G.722
  • G.722.1
  • G.723.1
  • G.726
  • G.728
  • G.729
  • GSM 06.10

Already:

I also need to put together a summary table of input/output formats, bitrates, etc.

Chances are that the following will also be packaged as DMOs:

  • Echo Canceller

AMR GSM 06.90 DMO

The mix of the technologies works very well.

  • MS VC++.NET 2003, ATL, WTL – for development environment
  • MS DirectShow of DirectX 9 – for multimedia infrastructure
  • Intel Integrated Performance Primitives 5.0 – for standard code implementation base

All together – AMR GSM 06.90 Speech Codec DirectX Media Object (DMO) as shown below:

Alax.Info AMR Objects in GraphEdit

Runtime requirements include:

  • MS CRT runtime msvcr71.dll redistributable
  • Intel IPP 5 redistributables, including:
    • at the very least – libguide40.dll, ippcore.dll, ipps.dll, ippsc.dll
    • one or more processor-specific sets, e.g. ippspx.dll, ippscpx.dll
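
A deployment check for the list above could be sketched roughly as follows. This is only an illustration in Python: the function name and the single-folder assumption are mine, and the real Windows DLL search order covers more than one directory. Only the file names mentioned in this post are checked.

```python
import os
import tempfile

# File names taken from the list above; nothing else is assumed to be required.
REQUIRED = ["msvcr71.dll", "libguide40.dll", "ippcore.dll", "ipps.dll", "ippsc.dll"]
# Only the processor-specific set named in the post; other sets are not listed here.
CPU_SPECIFIC_SETS = [("ippspx.dll", "ippscpx.dll")]


def missing_runtime_files(directory):
    """Return the runtime files from the list above that are absent in `directory`."""
    present = {name.lower() for name in os.listdir(directory)}
    missing = [name for name in REQUIRED if name not in present]
    # At least one complete processor-specific set has to be there.
    if not any(all(n in present for n in cpu_set) for cpu_set in CPU_SPECIFIC_SETS):
        missing.append("a processor-specific set, e.g. ippspx.dll + ippscpx.dll")
    return missing


# Demo: a temporary folder that lacks ippsc.dll and the processor-specific set.
with tempfile.TemporaryDirectory() as folder:
    for name in REQUIRED[:-1]:
        open(os.path.join(folder, name), "w").close()
    report = missing_runtime_files(folder)
    print(report)
```

An empty list means that everything named in this post is in the folder.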

AMR Speech DMO

So, DirectX Media Objects (DMO) turned out to be a nice interface for extending DirectShow functionality. I could quite easily create an encoder/decoder pair of DMOs for AMR speech coding (GSM 06.90). Development was reasonably easy and fast. I will probably even prepare an informational version and put it here on the website. It is likely to contain forced silence intervals because the codec is licensed and can’t be distributed freely.
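
To give an idea of the data sizes such an encoder/decoder pair deals with, here is a small arithmetic sketch in Python. The eight bitrates and the 20 ms frame at 8 kHz are standard for narrowband AMR. The 16-bit mono PCM input is my assumption for illustration, not a description of the actual DMO media types, and the bit counts cover speech bits only, without any storage or transport headers.

```python
FRAME_MS = 20            # AMR works on 20 ms frames
SAMPLE_RATE_HZ = 8000    # narrowband speech
BYTES_PER_SAMPLE = 2     # assumption: 16-bit mono PCM on the uncompressed side

# The eight narrowband AMR modes, in bits per second.
AMR_MODES_BPS = [4750, 5150, 5900, 6700, 7400, 7950, 10200, 12200]

samples_per_frame = SAMPLE_RATE_HZ * FRAME_MS // 1000        # 160 samples
pcm_bytes_per_frame = samples_per_frame * BYTES_PER_SAMPLE   # 320 bytes


def bits_per_frame(bitrate_bps):
    """Speech bits produced for one 20 ms frame at the given bitrate."""
    return bitrate_bps * FRAME_MS // 1000


frame_bits = {bps: bits_per_frame(bps) for bps in AMR_MODES_BPS}

print(f"PCM in: {samples_per_frame} samples ({pcm_bytes_per_frame} bytes) per frame")
for bps, bits in frame_bits.items():
    print(f"{bps / 1000:5.2f} kbit/s: {bits} bits per frame")
```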

Serious Software + Serious Hardware = ?

= Serious Problems
A customer, a very loyal one, I must admit, installed several multichannel digital video recorders. He reported a few problems, and we fixed all but one more or less quickly. A serious system failure occurs about once every 1-2 days, and no one can tell the reason. The logs don’t show any problems; it may be faulty hardware too (enough such cases are known from the past). Any suggestions for getting out of this are appreciated. I wish we could fix it, the customer is very good.

3D Audio

The idea of sound source localization is undoubtedly far from new. However, the longer it brews in my head, the more keywords it brings up, and I have ended up with a lot of theoretical and practical findings in this area. Still, the question is whether there is any practical progress: something easy to use that shows how effective it is.

3D Audio

Similar to what I wrote recently:

On the up side, if it CAN do this then with a map of the ‘known’ positions of the microphones it could probably also plot the relative location of each source of sound in the room much like a submarine sonar. If you place a few microphones at floor level as well as ceiling it would even be able to place the sources in a 3D space. You could then make your commands do different things relative to the place or height they were spoken from. By designating the radio and TV as places where commands will be ignored you could eliminate the evil ‘clapper syndrome’ where loud gunfights on TV would turn the lights on and off. In fact, you could go one step further and have things happen simply based on any sound coming from a specific location. This could be refined to commands like ‘knock three times on the ceiling’ or ‘twice on the pipe’.
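
The quote does not say how the plotting would be done. One common approach is to compare the times at which the same sound reaches each microphone (time difference of arrival). Below is a minimal sketch in Python under several assumptions of mine: a 2D room of 5 by 4 metres, four microphones in the corners, ideal noise-free delays, and a brute-force grid search instead of a proper solver. All names and positions are hypothetical.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, approximate value for air at room temperature


def arrival_delays(source, mics):
    """Arrival time at each microphone relative to the first microphone."""
    times = [math.dist(source, mic) / SPEED_OF_SOUND for mic in mics]
    return [t - times[0] for t in times]


def locate(delays, mics, room=(5.0, 4.0), step=0.05):
    """Grid search: return the point whose predicted delays best match `delays`."""
    best, best_err = None, float("inf")
    nx, ny = round(room[0] / step), round(room[1] / step)
    for ix in range(nx + 1):
        for iy in range(ny + 1):
            point = (ix * step, iy * step)
            predicted = arrival_delays(point, mics)
            err = sum((p - d) ** 2 for p, d in zip(predicted, delays))
            if err < best_err:
                best, best_err = point, err
    return best


def is_ignored(point, ignored_places, radius=0.75):
    """True if the point lies near a place whose sounds should be ignored (TV, radio)."""
    return any(math.dist(point, place) <= radius for place in ignored_places)


mics = [(0.0, 0.0), (5.0, 0.0), (0.0, 4.0), (5.0, 4.0)]  # known microphone positions
tv = (4.5, 0.5)                                           # hypothetical ignored place
true_source = (1.5, 2.5)

measured = arrival_delays(true_source, mics)  # stands in for real measurements
estimate = locate(measured, mics)
print("estimated position:", estimate, "ignored:", is_ignored(estimate, [tv]))
```

With real microphones the delays would have to be estimated from the signals first, for example by cross-correlation, and they would be noisy, so the estimate would be correspondingly less exact.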