Playing in-memory audio streams on Windows 8

A customer I'd been working with recently came up with a support request for a Windows 8 Store app they were building. They were using the HTML/CSS/JS stack and wanted to play audio streams entirely from memory instead of loading them from a file on the file system or from a network stream. They needed this because their service implemented a custom Digital Rights Management (DRM) system in which the audio content was encrypted and had to be decrypted before playback (duh!). They wanted, however, to perform this decryption on the fly during playback instead of writing a decrypted copy of the content to the file system. In this post I talk about a little sample I put together for them showing how you can achieve this on Windows 8. If you prefer to jump directly into the code and take a look at things on your own, here's where it's at:

https://github.com/avranju/AudioPlayerWithCustomStream

Playing media streams from memory

The primary requirement proved to be fairly straightforward to accomplish. Turns out, there already exists an SDK sample showing exactly this. The sample shows how to achieve media playback from memory streams using the Windows.Media.Core.MediaStreamSource object. Briefly, here are the steps:

  1. First you fetch some metadata from the media stream. For audio content, this is the sample rate, encoding bit rate, duration and number of channels. For file-based audio sources, the Windows.Storage.StorageFile object can extract this information directly from the file via StorageFile.Properties.GetMusicPropertiesAsync (for the duration) and StorageFile.Properties.RetrievePropertiesAsync (for the rest). Here's an example function that accepts a StorageFile object as input and returns a promise for that metadata.

    function loadProps(file) {
        var props = {
            fileName: "",
            sampleRate: 0,
            bitRate: 0,
            channelCount: 0,
            duration: 0
        };

        // save file name
        props.fileName = file.name;
        return file.properties.getMusicPropertiesAsync().then(
            function (musicProps) {
                // save duration
                props.duration = musicProps.duration;

                var encProps = [
                    "System.Audio.SampleRate",
                    "System.Audio.ChannelCount",
                    "System.Audio.EncodingBitrate"
                ];

                return file.properties.retrievePropertiesAsync(encProps);
            }).then(function (encProps) {
                // save encoding properties
                props.sampleRate = encProps["System.Audio.SampleRate"];
                props.bitRate = encProps["System.Audio.EncodingBitrate"];
                props.channelCount = encProps["System.Audio.ChannelCount"];

                return props;
            });
    }
    
  2. Wrap the metadata gathered in step 1 in a Windows.Media.MediaProperties.AudioEncodingProperties object which in turn is then wrapped in a Windows.Media.Core.AudioStreamDescriptor object.

  3. Use the AudioStreamDescriptor object to initialize a MediaStreamSource instance and set up event handlers for the MediaStreamSource's Starting, SampleRequested and Closed events. As you might imagine, the idea is to respond to these events by handing out audio data to the MediaStreamSource, which then proceeds to play that content.
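
Steps 2 and 3 might look roughly like the sketch below. It assumes MP3 content at a constant bit rate, takes a props object of the shape returned by loadProps above, and hands out fixed-size chunks. The names createAudioSource, sampleDurationMs and SAMPLE_SIZE are made up for illustration, and the Windows namespace is passed in as a parameter only so that the wiring can be exercised outside a Store app.

```javascript
// Assumption: hand out 1152 bytes per sample; the value is illustrative.
var SAMPLE_SIZE = 1152;

// Playback time (in ms) covered by byteCount bytes at a constant bit rate
// given in bits per second.
function sampleDurationMs(byteCount, bitRate) {
    return (byteCount * 8 * 1000) / bitRate;
}

// props:  { sampleRate, channelCount, bitRate, duration }
// stream: an IRandomAccessStream holding the MP3 data
// winrt:  the global Windows namespace when running in a Store app
function createAudioSource(props, stream, winrt) {
    var media = winrt.Media;
    var byteOffset = 0;
    var timeOffset = 0;
    var chunkMs = sampleDurationMs(SAMPLE_SIZE, props.bitRate);

    // step 2: metadata -> AudioEncodingProperties -> AudioStreamDescriptor
    var encoding = media.MediaProperties.AudioEncodingProperties.createMp3(
        props.sampleRate, props.channelCount, props.bitRate);
    var descriptor = new media.Core.AudioStreamDescriptor(encoding);

    // step 3: create the source and respond to its events
    var source = new media.Core.MediaStreamSource(descriptor);
    source.duration = props.duration;

    source.addEventListener("starting", function (e) {
        // no seeking in this sketch: playback resumes at the current offset
        e.request.setActualStartPosition(timeOffset);
    }, false);

    source.addEventListener("samplerequested", function (e) {
        var request = e.request;
        if (byteOffset + SAMPLE_SIZE > stream.size) {
            return; // leaving request.sample unset signals end of stream
        }
        var deferral = request.getDeferral();
        media.Core.MediaStreamSample.createFromStreamAsync(
            stream.getInputStreamAt(byteOffset), SAMPLE_SIZE, timeOffset
        ).then(function (sample) {
            sample.duration = chunkMs;
            sample.keyFrame = true;
            byteOffset += SAMPLE_SIZE;
            timeOffset += chunkMs;
            request.sample = sample;
            deferral.complete();
        });
    }, false);

    source.addEventListener("closed", function () {
        byteOffset = 0;
        timeOffset = 0;
    }, false);

    return source;
}
```

In a Store app the call would be createAudioSource(props, stream, Windows), and the returned object can be handed to an audio element with audio.src = URL.createObjectURL(source, { oneTimeOnly: true }).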

This is all fine and dandy, but how do we get this to work when the audio content is stored in memory in a Windows.Storage.Streams.InMemoryRandomAccessStream object? The challenge, of course, is in extracting the metadata we need to set up a MediaStreamSource object.

StorageFile can read from arbitrary streams?

As it happens, the StorageFile object has direct support for being backed by an arbitrary stream (or pretty much anything, really). I figured I'd hook up a StorageFile with an InMemoryRandomAccessStream object and have it extract the metadata I needed. Here's how you connect a StorageFile with data fetched from an arbitrary source - in this case, just a string constant. You create a StorageFile object by calling StorageFile.CreateStreamedFileAsync, passing it a callback routine that is expected to supply the data the StorageFile object needs when it is first accessed. Here's a brief example:

function init() {
    var reader;
    var size = 0;

    Windows.Storage.StorageFile.createStreamedFileAsync(
           "foo.txt", generateData, null).then(
       function (file) {
        // open a stream on the file and read the data;
        // this will cause the StorageFile object to
        // invoke the "generateData" function
        return file.openReadAsync();
    }).then(function (stream) {
        var inputStream = stream.getInputStreamAt(0);
        reader = new Windows.Storage.Streams.DataReader(inputStream);
        size = stream.size;
        return reader.loadAsync(size);
    }).then(function () {
        var str = reader.readString(size);
        console.log(str);
    });
}

function generateData(stream) {
    var writer = new Windows.Storage.Streams.DataWriter();
    writer.writeString("Some arbit random data.");

    var buffer = writer.detachBuffer();
    writer.close();

    stream.writeAsync(buffer).then(function () {
        return stream.flushAsync();
    }).done(function () {
        stream.close();
    });
}

The problem, however, as I ended up discovering, is that StorageFile objects backed by a stream created in this fashion do not support retrieval of file properties via StorageFile.Properties.RetrievePropertiesAsync or, for that matter, StorageFile.Properties.GetMusicPropertiesAsync. So clearly, this approach is not going to work. Having said that, it's useful to know that this technique is possible at all with StorageFile objects, as it allows you to defer the actual work of producing the data represented by the StorageFile object until it is actually needed. And since it is a bona fide Windows Runtime object, you can confidently pass it around wherever a StorageFile object is accepted - for instance, when implementing a share source contract you might hand out a StorageFile object created in this manner via Windows.ApplicationModel.DataTransfer.DataPackage.SetStorageItems.

Reading music metadata using the Microsoft Media Foundation

After a bit of research I discovered that there is another API that can be used for fetching metadata from media streams (among other things) called the Microsoft Media Foundation. In particular, the API features an object called the source reader that can be used to get the data we are after. The trouble, though, is that this is a COM-based API and therefore cannot be invoked directly from JavaScript. I decided to write a little wrapper Windows Runtime component in C++ and then use that from the JS app. After non-trivial help from my colleague Chris Guzak and others directly from the Media Foundation team at Microsoft (perks of working for Microsoft, I guess!) we managed to put together a small component that allows us to read the required metadata from an InMemoryRandomAccessStream object. Here's the relevant snippet that does the main job (I've stripped out all the error handling to de-clutter the code):

MFAttributesHelper(InMemoryRandomAccessStream^ stream, String^ mimeType)
{
    MFStartup(MF_VERSION);

    // create an IMFByteStream from "stream"
    ComPtr<IMFByteStream> byteStream;
    MFCreateMFByteStreamOnStreamEx(
           reinterpret_cast<IUnknown*>(stream),
           &byteStream);

    // assign mime type to the attributes on this byte stream
    ComPtr<IMFAttributes> attributes;
    byteStream.As(&attributes);
    attributes->SetString(
           MF_BYTESTREAM_CONTENT_TYPE,
           mimeType->Data());

    // create a source reader from the byte stream
    ComPtr<IMFSourceReader> sourceReader;
    MFCreateSourceReaderFromByteStream(
           byteStream.Get(),
           nullptr,
           &sourceReader);

    // get current media type
    ComPtr<IMFMediaType> mediaType;
    sourceReader->GetCurrentMediaType(
           MF_SOURCE_READER_FIRST_AUDIO_STREAM,
           &mediaType);

    // get all the data we're looking for
    PROPVARIANT prop;
    sourceReader->GetPresentationAttribute(
           MF_SOURCE_READER_MEDIASOURCE,
           MF_PD_DURATION,
           &prop);
    Duration = prop.uhVal.QuadPart;

    UINT32 data;
    sourceReader->GetPresentationAttribute(
           MF_SOURCE_READER_MEDIASOURCE,
           MF_PD_AUDIO_ENCODING_BITRATE,
           &prop);
    BitRate = prop.ulVal;

    mediaType->GetUINT32(
           MF_MT_AUDIO_SAMPLES_PER_SECOND,
           &data);
    SampleRate = data;

    mediaType->GetUINT32(
           MF_MT_AUDIO_NUM_CHANNELS,
           &data);
    ChannelCount = data;
}

This is the implementation of the constructor on the MFAttributesHelper ref class. As you can tell, the constructor accepts a reference to an instance of an InMemoryRandomAccessStream object and the MIME type of the content in question and proceeds to extract the duration, encoding bitrate, sample rate and channel count from it. It does this by first creating an IMFByteStream object via the convenient MFCreateMFByteStreamOnStreamEx function which basically wraps an IRandomAccessStream object (which InMemoryRandomAccessStream implements) and returns an IMFByteStream instance. The object returned by MFCreateMFByteStreamOnStreamEx also implements IMFAttributes which we then QueryInterface for (via ComPtr::As) and assign the MIME type value to it. Next we instantiate an object that implements IMFSourceReader via MFCreateSourceReaderFromByteStream and use that instance to fetch the duration and encoding bitrate values via the GetPresentationAttribute method. And finally, we retrieve an object that implements the IMFMediaType interface via IMFSourceReader::GetCurrentMediaType and use that object to fetch the sample rate and the channel count values. Once you know how to do all this, it seems quite trivial of course but getting here, believe me, took some doing!

Now that we have this component, reading the metadata from JavaScript proves to be fairly straightforward. Here's an example. In the code below, memoryStream is an InMemoryRandomAccessStream instance and mimeType is a string with the MIME type of the content:

var helper = MFUtils.MFAttributesHelper.create(memoryStream, mimeType);

// now, helper's sampleRate, bitRate, duration and channelCount
// properties contain the data we are looking for
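
One detail worth watching for: Media Foundation reports MF_PD_DURATION in 100-nanosecond units, whereas JavaScript sees Windows Runtime time spans (such as the duration returned by getMusicPropertiesAsync) as milliseconds. If the component hands the raw value back as a plain number (an assumption on my part; a property typed as a TimeSpan would be converted by the projection), a small conversion along these lines would be needed before using it as the MediaStreamSource duration. The name hnsToMilliseconds is hypothetical.

```javascript
var HNS_PER_MILLISECOND = 10000; // 1 ms = 10,000 units of 100 ns

// Convert a duration in 100-nanosecond units to milliseconds.
function hnsToMilliseconds(hns) {
    return hns / HNS_PER_MILLISECOND;
}

// Example: a 3 minute track is 1,800,000,000 units of 100 ns.
var durationMs = hnsToMilliseconds(1800000000); // 180000
```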

Now with the metadata handy, we simply follow the steps outlined earlier in this post to commence playback. As mentioned before, the sample is hosted on GitHub here:

https://github.com/avranju/AudioPlayerWithCustomStream

For the sake of the sample, I took a plain MP3 file and applied an XOR cipher to it, then loaded it up and played it back from memory, applying another XOR transform on the bits before playback. It all works rather well together and again, hat-tip to Chris Guzak for all his help in whittling the WinRT component down to its essence and really cleaning up its interface!
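
For reference, the XOR round trip can be sketched in a few lines of plain JavaScript. The function name xorTransform and the single-byte key are made up for illustration; the code in the repository may well do this differently.

```javascript
// Hypothetical helper: XOR every byte with a one-byte key. The key and the
// way the sample chunks its data are assumptions, not taken from the repo.
function xorTransform(bytes, key) {
    var out = new Uint8Array(bytes.length);
    for (var i = 0; i < bytes.length; i++) {
        out[i] = bytes[i] ^ key;
    }
    return out;
}

// XOR is its own inverse, so the same call both scrambles and restores.
var plain = new Uint8Array([0x49, 0x44, 0x33, 0x03]); // "ID3" plus a version byte
var scrambled = xorTransform(plain, 0x5a);
var restored = xorTransform(scrambled, 0x5a);
```

Because the transform works byte by byte, it can be applied to each chunk just before it is handed out for playback, which is what makes decrypting on the fly practical.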
