Cross-posted from LinkedIn company profile

Resource-Efficient Speech Gating: Leveraging Dolby Dialog Intelligence

We recently came across an article https://lnkd.in/dx2ZUgZX discussing the use of Dolby Laboratories Dialog Intelligence for speech gating. This technology addresses a challenge we’ve encountered in the past, involving standards like ITU-R BS.1770 https://lnkd.in/dhVSRTRB and related methods. The article provides detailed technical information and references, allowing us to focus on the practical implications.

We used the Dolby Dialog Intelligence reference source code as a starting point and applied it to live audio streams we already handled. The primary outcome of this processing is the ability to confidently determine whether content contains speech or not. While the Dolby source code was relatively straightforward to integrate, it had some performance limitations. It worked, but its resource consumption didn’t fit alongside our other processing requirements.

Before requesting a production-ready implementation from Dolby, our customer allowed us to investigate further. We discovered that the initial part of the processing involved downsampling the audio signal to 16 kHz. By replacing this step with a proper audio resampler, and verifying that the change didn’t affect the speech detection algorithm’s output, we arrived at a production-ready speech gating solution with processing complexity reduced by an order of magnitude.
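To illustrate the kind of step involved, here is a minimal sketch of 3:1 decimation with an anti-aliasing low-pass filter. It is not Dolby’s code and not our production resampler. The 48 kHz input rate, the 63-tap Hamming-windowed filter, the 7 kHz cutoff, and all function names are assumptions chosen for the example.

```python
import math

def design_lowpass(num_taps, cutoff):
    """Windowed-sinc low-pass FIR; cutoff is a fraction of the input sample rate."""
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = n - mid
        if x == 0:
            sinc = 2 * cutoff
        else:
            sinc = math.sin(2 * math.pi * cutoff * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))  # Hamming
        taps.append(sinc * window)
    gain = sum(taps)
    return [t / gain for t in taps]  # normalize to unity gain at DC

def decimate(samples, factor, taps):
    """Low-pass filter, evaluating only the output samples that are kept."""
    out = []
    for start in range(0, len(samples) - len(taps) + 1, factor):
        acc = 0.0
        # The taps are symmetric, so this dot product equals a convolution.
        for k, t in enumerate(taps):
            acc += t * samples[start + k]
        out.append(acc)
    return out

def tone(freq, rate, count):
    return [math.sin(2 * math.pi * freq * n / rate) for n in range(count)]

def peak(samples):
    return max(abs(s) for s in samples)

# Assumed rates: 48 kHz input, 16 kHz output, so the factor is 3.
TAPS = design_lowpass(63, 7000 / 48000)
speech_band = decimate(tone(1000, 48000, 4800), 3, TAPS)
above_nyquist = decimate(tone(12000, 48000, 4800), 3, TAPS)
print(peak(speech_band), peak(above_nyquist))
```

The 1 kHz tone passes through nearly unchanged, while the 12 kHz tone, which would otherwise fold down to 4 kHz in the 16 kHz output, is strongly attenuated. Computing the filter only at the retained positions is where a decimating resampler saves work compared with filtering every input sample and discarding two out of three.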

Speech gating plays a crucial role in measuring the loudness of broadcast content. Compliance requirements now demand accurate loudness measurements, preventing manipulation of audio levels.
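As a rough illustration of what gating means for the measurement, the sketch below averages block energies only over blocks flagged as speech and converts the result to decibels with the offset used in ITU-R BS.1770. It is a simplification under stated assumptions: the K-weighting filter and channel weights are omitted, the per-block speech flags are assumed to come from a detector such as Dialog Intelligence, and the function name is hypothetical.

```python
import math

def gated_loudness(block_energies, speech_flags):
    """Loudness in dB over speech-flagged blocks only; None if no speech was found."""
    gated = [e for e, is_speech in zip(block_energies, speech_flags) if is_speech]
    if not gated:
        return None
    mean_energy = sum(gated) / len(gated)
    return -0.691 + 10 * math.log10(mean_energy)

# Mean-square energy per block (assumed already filtered), with detector flags.
energies = [0.01, 1.0, 0.01]
flags = [True, False, True]
print(gated_loudness(energies, flags))  # about -20.69
```

The loud middle block carries no speech, so it does not raise the reported value. An ungated average over the same three blocks would be dominated by it.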

Media Foundation Chronicles: Lost and Found

Between 2009 and 2011, engineers from the Microsoft Media Foundation Team shared a series of blog posts containing sample code for the Media Foundation API, a successor to the venerable DirectShow.

At that time, there was a scarcity of sample source code specifically addressing this topic. Unfortunately, the passage of time and various transformations of blog sites and the Microsoft website took their toll. The original blog posts suffered, and although they were eventually recovered and reinstated as part of the team blog archive https://lnkd.in/drKBW5tW, the source code associated with those posts vanished entirely. The links now lead to the dreaded HTTP 404 “Not Found” error.

However, our quest for historical preservation and the benefit of those who remain curious led us to a solution. We unearthed the missing source code and deposited it into a GitHub repository https://lnkd.in/dXRi9PZF. There, it resides — a testament to the past and a resource for those who still harbor interest in the intricacies of the Windows Media Foundation API.

Feel free to explore the repository and delve into the code. After all, sometimes even lost fragments of the digital realm can find their way back home.

Streaming Games to Any Device

In the past, GeekWire featured an article https://lnkd.in/d8FMf3mH on Rainway — a prominent Seattle startup with an ambitious mission: streaming games to any device.

Our role in this endeavor was to contribute essential components to Rainway’s game streaming technology. Among these, a pivotal piece involved transforming the audiovisual content generated by standard games into a format compatible with HTML5. Our primary objective was to extend the gaming experience to remote web browsers.

To achieve this, we repurposed our existing technology and developed a subsystem that efficiently converted monitor video into an H.264/AVC data stream, packaged for compatibility with HTML5 Media Source Extensions (MSE). Transmitted over WebRTC, this stream reached remote systems and integrated into web browsers.

Throughout our journey, we engaged in thoughtful experiments. Should audio be part of a joint stream with video, or should it be delivered separately? We delved into format intricacies and explored novel ideas. Notably, while some debated the idea of video remaining entirely within the GPU realm, including video encoding, we had already implemented this with production-quality results back in 2017.

The outcome was groundbreaking software that streamed Windows desktop games to HTML5 browsers, mobile devices, and even Xbox consoles.

Digital Whirlwinds: The Yahoo Supplier Chronicles

Once upon a time, our humble software development venture in Ukraine found itself rubbing shoulders with the bigwigs. Yes, you read that right — we became a Yahoo supplier. How did this unlikely match come to be? Buckle up, because it’s quite the ride.

Picture this: pre-COVID days, when online collaboration was our jam. We were a distributed team, working seamlessly across digital realms. And then, fate knocked on our virtual door. The company we served — a success story in its own right — got swept up in the whirlwind of mergers and acquisitions. Enter Oath, the grand orchestrator behind the scenes. They scooped up not just one, but a whole constellation of brands: Yahoo, AOL, Verizon Media Platform, and more.

Now, here’s where the plot thickens. Our status as a supplier shifted gears. From serving a modest-sized company, we suddenly found ourselves in the global spotlight. Yep, we leveled up — from local hero to international player. Our mission? Keep those wheels turning, operations humming, and pixels dancing across screens.

But life is a series of chapters, and ours took a turn. The acquired company? Well, it vanished into the ether, dissolved within the giant’s belly. Poof! No more product, no more legacy. Just memories and a faint echo of Yahoo yodels.

And so, we closed that chapter, turned the page, and set sail toward new horizons. But deep down, we’ll always cherish our Yahoo days: the underdog who danced with giants.

There you have it: a snippet from our software saga. Who knew lines of code could lead to such adventures?

Legacy Code and Overengineering: The MJPEG Decoder Saga

So, DirectShow virtual cameras: those elusive creatures that always turn heads. We’ve chatted about them before on LinkedIn (check out our post here https://lnkd.in/dYte5SQ5). But let’s rewind to 2011 when we decided to play mad scientist. Our mission? Whip up a batch of DirectShow filters that could snag JPEG and M-JPEG video streams from network sources (think IP cameras) and seamlessly slot them into DirectShow applications.

But wait, there’s more! We cranked it up a notch. Picture this: a secret lab, flickering monitors, and a dash of overengineering. Our filters cozied up to the stock Microsoft JPEG decoder — the one that’s been less than stellar since forever. And guess what? We wrapped it all in a nostalgic bow — a wrapper around the ancient VCM JPEG Decoder from 1992 https://lnkd.in/dYRUi84x. Yep, that’s right — the same decoder that predates most of us.

Why, you ask? Because that’s how Microsoft Windows rolls. It clings to legacy features like your favorite worn-out hoodie. The “MJPEG Decompressor” (sounds fancy, right?) is still documented https://lnkd.in/dBZzBbKK as a relic. But honestly, no one should touch it with a ten-foot pole. Not now, not 13 years ago, never.

And here’s the twist: Our Alax.Info IP Video Source DirectShow extension https://lnkd.in/diB_3vBf, born from this wild experiment, lives on. It’s like that quirky friend who insists on wearing mismatched socks. People still use it, still recommend it. Maybe it’s the retro charm or the sheer audacity. Who knows?

So next time you’re streaming video from a network source, tip your hat to those unsung heroes — the DirectShow filters that made it all happen. And raise a banana (yes, a banana) to the MJPEG Decompressor. It’s been around longer than your grandma’s favorite recipe.

There you have it: a tale of tech, tenacity, and a touch of madness.

LDS Temples and Technology: The DirectShow Journey

A while back, we were working on a media subsystem for The Church of Jesus Christ of Latter-day Saints. They needed software-controlled multimedia playback with specific requirements for their temples worldwide.

Now, the attached image isn’t an exact representation of our work, but it captures the essence: LDS and technology go hand in hand.

Back in the day, we used DirectShow as our multimedia framework, and boy, did we face some interesting challenges. One that sticks out in memory is related to audio delivery. Picture this: we had a multi-channel audio output card from AudioScience, Inc., and our task was to schedule audio delivery in perfect sync across multiple physical audio connectors. But wait, there’s more! We also had to toggle outputs on and off while others were already belting out sound. And when we turned on a fresh audio stream, it had to seamlessly match the signal already in play. Oh, and don’t forget: the video part of this signal was streaming nonstop and couldn’t be interrupted.

Now, let me tell you, this wasn’t a walk in the park. The multimedia framework was designed back in the ’90s, with the quaint notion that once you set up your playback topology, you couldn’t tweak anything while the show was running.

But guess what? Our software spread its wings and flew to over a hundred locations worldwide. Many moons have passed, but who knows — it might still be chugging along out there.

Legacy Filters, Modern Solutions: MP4 Support in DirectShow

The Microsoft DirectShow API was introduced long before the widespread adoption of MPEG-4. As MPEG-4 codecs and container formats became standard, DirectShow was, by Microsoft’s own admission, nearing the end of its life.

That’s how this once-popular media framework for Windows found itself without support for MP4 files. Fortunately, there was a handy solution: freely available filters https://gdcl.co.uk/mpeg4/ developed by Geraint Davies. Originally published in 2006, these filters gained popularity over time. Since Geraint had other commitments after the last update, we took the liberty of placing a copy of his work on GitHub https://lnkd.in/dPsZEfpE somewhere around 2015.

Despite the state of DirectShow, these filters still play a role in DirectShow applications. We’ve even made a few updates ourselves, a little bit of everything: a unit test project, some modern C++ and COM code based on Microsoft WIL https://lnkd.in/de5nxif, a COM type library with an integration interface, and various features. One particularly valuable addition is the ability to recover broken recordings.

You see, sometimes applications crash — whether due to external factors or just plain bad luck. And sometimes the cost of “re-doing things right” is too high. Oh, and the cost of data loss is high too! In such cases, we can salvage the broken recording from the crashed application and recover its content. It’s like a digital rescue mission. And in some instances, it’s even automated — like in our partner’s medical software https://lnkd.in/dCrJJRjy, where multi-hour recordings are the norm these days.
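To give a feel for the first step of such a rescue, here is a minimal sketch that walks the top-level boxes of an MP4 file and reports whether a complete `moov` index is present. It does not reproduce the recovery logic in the filters. It assumes a non-fragmented writer that would have written `moov` on a clean shutdown, and the function names `scan_boxes`, `needs_recovery`, and `make_box` are hypothetical.

```python
import struct

def scan_boxes(data):
    """Return (type, offset, declared_size, complete) for each top-level box."""
    boxes = []
    offset = 0
    while offset + 8 <= len(data):
        size, kind = struct.unpack_from(">I4s", data, offset)
        header = 8
        if size == 1:  # a 64-bit size follows the type field
            if offset + 16 > len(data):
                break
            (size,) = struct.unpack_from(">Q", data, offset + 8)
            header = 16
        elif size == 0:  # the box extends to the end of the file
            size = len(data) - offset
        if size < header:
            break  # corrupt header; stop rather than loop forever
        complete = offset + size <= len(data)
        boxes.append((kind.decode("latin-1"), offset, size, complete))
        if not complete:
            break
        offset += size
    return boxes

def needs_recovery(data):
    """True when no complete 'moov' box is found at the top level."""
    return not any(kind == "moov" and complete
                   for kind, _, _, complete in scan_boxes(data))

def make_box(kind, payload):
    """Build a box with a 32-bit size header, for the demonstration below."""
    return struct.pack(">I4s", 8 + len(payload), kind) + payload

ftyp = make_box(b"ftyp", b"isom\x00\x00\x00\x00")
finished = ftyp + make_box(b"mdat", bytes(32)) + make_box(b"moov", b"")
# Simulated crash: 'mdat' declares 1000 bytes but only 100 bytes follow the header.
crashed = ftyp + struct.pack(">I4s", 1000, b"mdat") + bytes(100)

print(needs_recovery(finished), needs_recovery(crashed))
```

A file flagged this way still holds its media data in `mdat`. Rebuilding the sample tables from that data is the harder, format-specific part that this sketch leaves out.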