mindtrove Collecting ideas since 1980

19 Feb 2010

HTML5 audio caching

One of my latest coding endeavors is a text-to-speech interface for JavaScript using HTML5 <audio> elements to output synthesized speech from a server. To reduce the latency between a speech request and actual speech output, I'm using various levels of caching. One of these is the regular browser disk cache based on HTTP headers.
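The disk cache can only help if the same utterance always maps to the same URL. Here is a minimal sketch of that idea, with an in-memory layer in front of it. The `/speech` endpoint, the `text` and `voice` parameters, and the function names are all hypothetical; the post does not describe the actual interface.

```javascript
// Hypothetical endpoint: the post does not say how the speech server is addressed.
const SPEECH_ENDPOINT = '/speech';

// Build a deterministic URL. Identical text and voice always produce the same
// URL, which is what lets an HTTP cache reuse an earlier response.
function speechUrl(text, voice) {
  const params = [
    'text=' + encodeURIComponent(text.trim()),
    'voice=' + encodeURIComponent(voice || 'default'),
  ];
  return SPEECH_ENDPOINT + '?' + params.join('&');
}

// In-memory level of caching: one audio object per distinct URL.
const audioByUrl = new Map();

function getAudio(text, voice) {
  const url = speechUrl(text, voice);
  if (!audioByUrl.has(url)) {
    // Audio exists only in browsers; elsewhere a plain object stands in for it.
    const audio = typeof Audio !== 'undefined' ? new Audio(url) : { src: url };
    audioByUrl.set(url, audio);
  }
  return audioByUrl.get(url);
}
```

Calling `getAudio('hello world', 'en')` twice returns the same object, and any request that does reach the network uses the same URL each time, so whatever caching the browser implements has a chance to apply.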

It turns out that caching behavior for <audio> data varies wildly among browsers. The following table shows what I observed in each one. I tested all of them on OS X 10.6, with the standard Mac Apache server hosting all of the audio files.

Browser | <audio> Behavior
Firefox 3.6 | Respects cache headers for the sound data. Only contacts the server when the cache item expires. <audio> elements pointing to the same src reuse the cached data.
Chrome 5.0.322.2 | Contacts the server on every load(). When it receives a 304 response, does not refetch the content.*
Safari 4.0.4 | Contacts the server on every load() to fetch the first two bytes of the audio file and receives a 206 response with partial content. Fetches additional bytes from the file and receives another 206 response with partial content. Performs another fetch and receives a 304 response with no data. Continues to alternate between fetches that receive 206 (partial content) responses and 304 (not modified) responses. Nothing appears to get cached.
WebKit r54921 | Same behavior as Safari 4.0.4.

* Though not cache related, audio output in Chrome is often clipped before the end of the actual audio data. When this occurs, Chrome fires the onended event even before the audible output finishes.
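For readers unfamiliar with the 206 and 304 responses in the table, the following is a rough sketch of how a static file server might choose a status code for an audio request. It is not Apache's actual logic. The `chooseResponse` name and the shapes of `request` and `file` are assumptions made for illustration, and only single ranges of the form `bytes=start-end` are handled.

```javascript
// request: object of lowercase header names; file: { etag, size }.
// Both shapes are hypothetical, chosen to keep the sketch self-contained.
function chooseResponse(request, file) {
  // A conditional request whose validator still matches gets 304 and no body.
  // HTTP evaluates this before looking at any Range header.
  const validator = request['if-none-match'];
  if (validator !== undefined && validator === file.etag) {
    return { status: 304 };
  }

  // Single byte range such as "bytes=0-1" or "bytes=2-".
  const match = /^bytes=(\d+)-(\d*)$/.exec(request['range'] || '');
  if (match) {
    const start = Number(match[1]);
    const requestedEnd = match[2] === '' ? Infinity : Number(match[2]);
    // A range whose end precedes its start is invalid and simply ignored.
    if (requestedEnd >= start) {
      if (start >= file.size) {
        return { status: 416, contentRange: 'bytes */' + file.size };
      }
      const end = Math.min(requestedEnd, file.size - 1);
      return {
        status: 206,
        contentRange: 'bytes ' + start + '-' + end + '/' + file.size,
        length: end - start + 1,
      };
    }
  }

  // No usable Range header: send the whole file.
  return { status: 200, length: file.size };
}
```

With a 1000-byte file, a request carrying `Range: bytes=0-1` yields a 206 with `Content-Range: bytes 0-1/1000`, which corresponds to the two-byte probe Safari sends on every load().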

Except for Firefox 3.6, all of these browsers seem to exhibit pretty terrible caching behavior when it comes to audio. I've reported bugs where I thought appropriate, but maybe I'm missing something. Am I supposed to include additional headers in the server-side response? Or maybe I'm glossing over some key part of the <audio> API? If so, please let me know. If not, yikes: <audio> support has definite room for improvement.
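As a point of reference for the header question, here is a hedged sketch of the response headers a server could send to mark audio as cacheable, along with the freshness test a cache applies before deciding whether to contact the server. The function names and the one-hour lifetime are illustrative assumptions, and nothing here is claimed to change the Chrome or Safari behavior in the table above.

```javascript
// Build headers that let a cache reuse a response for maxAgeSeconds.
// The ETag gives the server a validator to answer later requests with 304.
function cachingHeaders(maxAgeSeconds, etag, now) {
  const expires = new Date(now.getTime() + maxAgeSeconds * 1000);
  return {
    'Cache-Control': 'public, max-age=' + maxAgeSeconds,
    'Expires': expires.toUTCString(), // HTTP date format, always GMT
    'ETag': etag,
  };
}

// A stored response is fresh while its age is below its lifetime.
// A cache that honors the headers skips the network in that window.
function isFresh(storedAt, maxAgeSeconds, now) {
  const ageSeconds = (now.getTime() - storedAt.getTime()) / 1000;
  return ageSeconds < maxAgeSeconds;
}

// Example: a response served at midnight UTC with a one-hour lifetime.
const servedAt = new Date(Date.UTC(2010, 1, 19, 0, 0, 0));
const headers = cachingHeaders(3600, '"v1"', servedAt);
```

A browser that behaves like Firefox 3.6 in the table would replay the cached audio while `isFresh` holds and only then go back to the server.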

Comments (7)
  1. Why JS TTS? Why not let a screen reader do this? It sounds quite interesting, but I think I’ve missed the point.

  2. I’m interested in what we could accomplish with self-voicing apps. The screen reader has access to the DOM, but has limited knowledge of the meaning behind it. A self-voicing app can provide a considerably better user experience. Plus, relying on a screen reader limits the focus of audio web apps to people that own them. The potential usefulness of such apps goes beyond access for people with visual impairments (e.g., eyes-free, background notifications, multimodal supplements).

  3. Are you using the Cache-Control and Expires headers?

  4. Ack, same problem, caches fine in FF but not in Chrome. I’m also getting clipped audio in Chrome.

  5. Did you ever figure this out? I’m running into the same thing and it’s somewhat annoying.

  6. There wasn’t much to figure out other than the browser caching is what it is until it gets better. Have a look at http://mindtrove.info/jsonic-speech-and-sound-using-html5/ for pointers to code that I managed to get working.

