Tag Archives: sound

Coaxing Sound out of the Browser

“Get someone else to blow your horn and the sound will carry twice as far.” – Will Rogers

Having the utmost respect for Will Rogers and his timeless wisdom, I must apologize to his memory for using this quote for purposes other than originally intended.  The quote struck me as somehow appropriate,  because I want to write a few words on the use of sound in browsers and I want to recommend the adoption of a particular JavaScript library to make this as painless as possible.  Rather than creating your own solution from scratch, in the final analysis I am going to suggest letting Howler.js “blow your horn” for you.

This article touches on developing pages and browser applications using sound  – past, present and future.  However, most of the article will be devoted to alternatives for using sound in today’s development environment and I’ll show you two different ways to do it.  A link to a working demo will be included at the bottom of the article and we will walk through the heart of the code that drives this demo, leaving you with two alternative modules for playing sound that you may freely incorporate into your own work.

In the beginning, there was silence.  The early days of browsers were devoid of any method to play any sound at all. Over time, some limited proprietary sound playing capability began to appear in different browsers, but this was essentially unusable.   Eventually, plug-ins appeared that made the playing of sound in the browser a reality.  One of these plug-ins, Flash, still enjoys some popularity even today, particularly as a fallback mechanism for out of date browsers that don’t support more modern methodologies.  However, most developers would rather not have to use a plug-in, which requires client-side installation and maintenance by users.  So a more modern built-in solution is typically preferable.

Enter the HTML5 Audio element.  HTML5 audio is the dominant form of sound expression in today’s browsers.  In fact, this mechanism is present in all modern browsers (see Can I use the HTML5 audio element).  More importantly, it works reasonably well for playing basic sounds in simple scenarios – although the implementation does differ significantly across different platforms and caveats abound.   Here I am tempted to begin describing the features of the HTML5 Audio element and contrasting their implementation in major browsers – however, that is not really the thrust of this article (though basics will be discussed) and that information can be found elsewhere on the web.  I’ll provide a link at the bottom of the article to detailed information on the HTML5 Audio specification.  I’ll also provide a link that discusses implementation limitations on some platforms – notably iOS with mobile Safari.  The topic of mobile Safari does figure prominently in one of the code examples that follow.

The wave of the future for playing sound in the browser appears to be an API known as Web Audio.  This is an open source, WC3 supported, Chrome-originated specification.  This specification is much more extensive and capable than the HTML5 Audio specification.  Unfortunately, though it currently enjoys support from modern web-kit browsers,  there is insufficient penetration in the browser ecosphere to be able to rely upon it as yet, unless you are targeting specific browser platforms (see Can I use the Web Audio API).  Earlier, I foreshadowed a conclusion made in this article by suggesting that Howler.js is the best way to implement browser sounds at this time.  One significant reason for this is that by default, Howler.js will use the Web Audio API, to play your sounds if support is present.  If it is not present, it will fall back to an HTML5 Audio solution.

Let’s focus on the present now and take a look at solutions that work and are broadly applicable today.  If you want to use sound in the browser today, it probably means that you will have at least some degree of reliance on HTML5 Audio – either directly using “native” JavaScript or else indirectly through some library such as Howler.js (since Howler.js will automatically fall back to using HTML5 Audio if it must).  It should be noted that there are other JavaScript sound libraries, but these tend to start with HTML5 Audio support and fall back to Flash support if HTML5 Audio is not supported.  While a Flash fallback is useful in a small number of scenarios, this has become increasingly unnecessary since HTML5 Audio support in modern browsers is nearly ubiquitous.  Personally I’d rather go with a forward-looking solution like Howler.js that tries to support that latest technologies and used HTML5 Audio as the fallback.

Here’s a quick review of the core ideas behind playing sound in the browser and using HTML5 Audio.   At the most basic level, the process of causing a browser to emit sound consists of two steps:

  1. Loading / pre-loading your sound resource(s).
  2. Playing your sound resource(s).

Regardless of which browser is involved, when employing HTML5 Audio it is the audio element that handles both of these tasks.  It will even provide a sound playing UI if you enable it.  For our purposes however, we would prefer to dynamically create an audio element with no visible UI, and use it from within JavaScript code to both load and play sounds.

Whether a static or a dynamic audio element is employed, or instead a dynamic audio element is used, the manner in which it works is configured by setting it’s attributes.  The following are the “most important” attributes and are enough to get started with – if your sound playing needs are fairly basic, it may be that you will need to set no other attributes.

  • The controls attribute – Add this to the element by itself (with no associated value) and it will enable the build-in sound playing API.  This is normally used when a static element is employed but useful to know about, even though neither code example in this article uses it.
  • The src attribute – This is used to define the sound resource (i.e., some file path or URL) that the element will act upon.
  • The preload attribute  – This is supposed to allow control of if and when the element will preload the sound (buffer it in memory) to minimize or eliminate lag between the request to play the sound and the actual playing of the sound.  If you do not specify a preload attribute, the default value of “auto” will be applied, which is pretty much what you’d always want anyway.  Unfortunately, some browsers – notably mobile Safari – ignore this setting and will refuse to preload, only loading sound after a user interaction is detected.

Aside from the above-listed attributes, there are only perhaps two more things more you must know about the audio element – how to know when a sound resource is ready to play and how to actually play it.  Some sound resources can be very large – for example, music files.  A significant amount of time can pass between initiation of preload and the ability to actually play a sound.  The canplaythrough event is useful in detecting when a sound has been loaded completely enough to be able to play it without problems.  Which leads us to the other thing you should know – the play method of the HTML5 audio element must be called from JavaScript code to actually play the sound if the built-in sound UI is not employed.

In short, using the HTML5 audio element is pretty easy - unfortunately the lack of a single standardized audio format and the uneven implementation of the specification across browsers presents difficulties to the web developer who would like to write once and run everywhere.   We’ll look at the worst of these difficulties and then examine work-arounds and approaches to deal with them.  Once this is out of the way, we’ll dive into some code, the implementation of which has been designed to avoid or minimize these difficulties.

Probably the first thing to consider and possibly the most significant, is to decide what audio format to use for your sound.  There are several formats available but here is a short-list of some of the most popular and summary comments on each:

  • The wav format -  On the plus side, this format provides good sound quality and is usable nearly everywhere – the most notable exception being Internet Explorer, which does not support wav except in the mobile version.  On the negative side, files of this format are quite large, and as mentioned already, it is not supported by desktop I.E.
  • The mp3 format – On the plus side, this format provides reasonably good sound quality (although it is lossy storage format) with small file sizes (it is a compressed).  Additionally, this format is supported by every modern browser except Opera.  On the negative side, there is typically a small lag when decoding the compression on this file - even if pre-loaded – and again as already mentioned, no Opera support.  Additionally, there are potential legal issues with using this format which although quite standard in the music industry, is proprietary in nature.  The good news for consumers is that several of the patents related to mp3 usage have already gone through one patent extension and the last of these remaining patents are due to expire sometime in 2015.  Although one or two patents related to mp3 are still enforced by two companies, this enforcement is ebbing and might not even be an issue except for high volume usages (and in these cases the financial demands for usage are not so unreasonable).  Caveat: Let it be understood that I am not a lawyer and not offering legal advice – use mp3 at your own risk.
  • The m4a aka ACC format – This is a variation on MPEG4 for audio files, a descendant of mp3.  There are several plusses compared to mp3: files are smaller, the sound is higher quality and usage is completely royalty-free.  The downsides are that support for this format is probably not quite as widespread as it is for mp3.  Also, I believe there is still a decode lag on this format.
  • The ogg format – This is a proprietary but license-free format that offers good quality and reasonable size files.  The downside is that it is only available on certain platforms: Chrome, Firefox and Opera.  Neither Internet Explorer nor Safari support it at this time.

If you have to choose one format, wav is a pretty good choice for small sounds and m4a (ACC) is a pretty good choice for large sounds and music files.  The mp3 format may also be a possibility (just an observation – follow it at your own risk).  However, be aware of the aforementioned caveats when choosing these or any formats.  The good news is, you can specify multiple formats.  This is easy and reliable if you put static tags in your html, and more difficult if using JavaScript.  The canPlayType method is used for this, but instead of returning true or false, it returns “probably”, “maybe” or empty string – go figure.  Another of the nice features of Howler.js is that you can specify an array of files in different formats for the same sound, and Howler.js will figure out which to play.  Of course, one problem in going with the multiple format approach is that you have to provide multiple resources, which is more hassle and not ideal on mobile devices with their limited storage space.

We’ll look at just one more concern – the issues with mobile Safari – and then finally move on to some actual code.  Mobile Safari has the following issues with playing sounds (this list is derived from a somewhat famous article by Remy Sharp on his blog – see link at bottom of page):

  • iPhones do not like playing too much audio at once, it gets very choppy.
  • iPads do not play more than one audio stream at once.
  • iOS will not preload the audio unless the user initiates the action.
  • There is about a 1/2 second delay before iOS is able to play the audio.  This is because the audio object (in iOS, not HTML5) is being created.

The list above can be condensed into two major problems:

  • Apple iOS / mobile Safari uses a singleton audio object to play sounds and switching between sound resources always incurs a lag while the backing audio object is being created and initialized.
  • A user must take an action in the UI before each sound can be loaded – this effectively means that there is no preload and you must suffer an additional lag to load each sound the first time it is played.

There is an approach that  helps to address these two problems: the use of a single sound file containing sound sprites (aka audio sprites).  The term “sound sprite” or if you prefer, “audio sprite”, is a derivative of the css or image sprite idea employed in game development.  Image are often packed together in single files known as sprite sheets.  These sprite sheets consist of multiple embedded images, which are in turn used as textures to give game sprites their unique individual appearance.  Placing sprites all in one sprite sheet results in better organization, better performance and better memory usage.  Similarly, a sound sprite is an embedded sound in a single larger file containing other embedded sounds.  You get some of the same benefits using this technique that are gained through the use of image sprites, but that’s not the best reason to use them.  the biggest reason that sound sprites are worth using is that they help surmount some difficulties experienced in playing sound in Apple’s mobile Safari browser.  Creating JavaScript code to use a sound sprite is no trivial matter (also see the above-mentioned Remy Sharp article for details on how to do this).  Fortunately, once again Howler.js can come to your rescue.  It provides an API for using a sound sprite in addition to providing APIs for using stand alone sound resource files – and don’t forget, it will use Web Audio if it is supported, which gives better results than HTML5 Audio.

Now, at last, we get to look at some code -  the first example is a Sounds object that relies on plain HTML5 code and the Audio element.  This Sounds object is in its own module, contained within an IIFE (Immediately Invoked Function Expression), and uses the revealing module pattern to expose functionality.  The object is stored in a single global app variable, making it ready for use anywhere within the app with no pollution of the global name space.  Take a look at the code below.

The gist of the code above is that five different sound files are loaded and methods are provided to play these sounds individually.  Each sound is assigned to its own dynamically created Audio element, with the src attribute of that element referencing the actual file containing the sound.  The preload attribute is not used here, since the default for the preload attribute is “auto”, and that is what we want.  This is the best shot at getting our sounds preloaded, but in reality, a preload of “auto” is only considered a request, and browsers may honor it or ignore it (e.g. mobile Safari ignores it).  I should note here that canplaythrough event that is used here as a means of knowing when a sound has been loaded is not perfect – at least not on some browsers.  In Chrome, if I specified a path to a file that did not exist, the canplaythrough event  still fired - though it fired almost immediately – I suppose because there was nothing to load.  To complete the explanation of this Sounds object, a reference to each audio element, and thus each sound, is stored in the sounds object as a property using the associative array syntax, with the property name being a short descriptive name for the sound.  This name is the argument value that is passed to the play method, denoting which sound to play.  I have used objects very much like this in small projects and apps and they work well for providing basic sounds.

Now we’ll look at a Howler.js version of this same object.  This implementation uses a sound sprite file, although as previously mentioned, Howler.js also supports the playing of stand alone files as the “hand-made” object above does.  However, Howler provides much more built in capability than the object above and it does so in only about 9K of code.  Take a look at the Howler version of a Sound controlling object below.

In the above example, there are four sounds embedded in one file – the loading and playing details all handled by Howler.js.  This file is borrowed from a JSFiddle example created by Aaron Gloege (see link at bottom of page). Note that for each sound, a starting point with the file and duration a duration (both in milliseconds) are specified.  When Howler.js plays one of these sounds, it seeks to the proper point in the file and plays it for the specified duration.  It is also worth noting that each embedded sound is surrounded by a small amount of silence (generally a second or two) which allows for some leeway in the precision of the audio playing mechanism.  Despite major internal differences, the API for the HSounds object is pretty much the same as the Sounds object API.  So I’ll not repeat a description of that.

This article is drawing to a close – I hope you found the explanations, the demos and the stealable code to be useful.  As promised at the start of the article, a small demo of a page that uses both of these objects can be found on Uberiquity.com, here.  In the sections below I provide other links pertinent to this article and also attribution for the remaining sound files (those other than the file I borrowed from Aaron Gloege’s JSFiddle).

Resources cited in this article:

Remy Sharp’s Audio Sprites article

Aaron Gloege’s JSFiddle Audio Sprites example

Howler.js Home Page

Using HTML5 audio and video (MDN)


Attributions (for sound files other than those used by Mr. Gloege):

  • The “1″, “2″ and “3″ sounds are being used under the  Creative Commons Attribution License (3.0) and were created by Margo Heston.  They can be found in a pack on freesound.org at http://www.freesound.org/people/margo_heston/packs/12534/
  • The shore sound and the ding sound are being used under the Creative Commons 0 License.