Category Archives: open source

Coaxing Sound out of the Browser

“Get someone else to blow your horn and the sound will carry twice as far.” – Will Rogers

Having the utmost respect for Will Rogers and his timeless wisdom, I must apologize to his memory for using this quote for purposes other than originally intended.  The quote struck me as somehow appropriate,  because I want to write a few words on the use of sound in browsers and I want to recommend the adoption of a particular JavaScript library to make this as painless as possible.  Rather than creating your own solution from scratch, in the final analysis I am going to suggest letting Howler.js “blow your horn” for you.

This article touches on developing pages and browser applications using sound  – past, present and future.  However, most of the article will be devoted to alternatives for using sound in today’s development environment and I’ll show you two different ways to do it.  A link to a working demo will be included at the bottom of the article and we will walk through the heart of the code that drives this demo, leaving you with two alternative modules for playing sound that you may freely incorporate into your own work.

In the beginning, there was silence.  The early days of browsers were devoid of any method to play any sound at all. Over time, some limited proprietary sound playing capability began to appear in different browsers, but this was essentially unusable.   Eventually, plug-ins appeared that made the playing of sound in the browser a reality.  One of these plug-ins, Flash, still enjoys some popularity even today, particularly as a fallback mechanism for out of date browsers that don’t support more modern methodologies.  However, most developers would rather not have to use a plug-in, which requires client-side installation and maintenance by users.  So a more modern built-in solution is typically preferable.

Enter the HTML5 Audio element.  HTML5 audio is the dominant form of sound expression in today’s browsers.  In fact, this mechanism is present in all modern browsers (see Can I use the HTML5 audio element).  More importantly, it works reasonably well for playing basic sounds in simple scenarios – although the implementation does differ significantly across different platforms and caveats abound.   Here I am tempted to begin describing the features of the HTML5 Audio element and contrasting their implementation in major browsers – however, that is not really the thrust of this article (though basics will be discussed) and that information can be found elsewhere on the web.  I’ll provide a link at the bottom of the article to detailed information on the HTML5 Audio specification.  I’ll also provide a link that discusses implementation limitations on some platforms – notably iOS with mobile Safari.  The topic of mobile Safari does figure prominently in one of the code examples that follow.

The wave of the future for playing sound in the browser appears to be an API known as Web Audio.  This is an open source, W3C-supported, Chrome-originated specification.  This specification is much more extensive and capable than the HTML5 Audio specification.  Unfortunately, though it currently enjoys support from modern WebKit browsers, there is insufficient penetration in the browser ecosphere to be able to rely upon it as yet, unless you are targeting specific browser platforms (see Can I use the Web Audio API).  Earlier, I foreshadowed a conclusion made in this article by suggesting that Howler.js is the best way to implement browser sounds at this time.  One significant reason for this is that by default, Howler.js will use the Web Audio API to play your sounds if support is present.  If it is not present, it will fall back to an HTML5 Audio solution.
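The fallback order described above can be pictured with a small feature-detection sketch.  This is not Howler.js's own code; the function name pickAudioBackend and its env parameter are made up for illustration, and the only browser facts assumed are that Web Audio is exposed as AudioContext (or webkitAudioContext in older WebKit builds) and HTML5 Audio as the Audio constructor.

```javascript
// Illustrative only: choose a playback backend by feature detection.
// `env` stands in for the browser's global object (window); passing it in
// keeps the function easy to exercise outside a browser.
function pickAudioBackend(env) {
  // Prefer Web Audio when the browser exposes it.
  if (typeof env.AudioContext !== "undefined" ||
      typeof env.webkitAudioContext !== "undefined") {
    return "webaudio";
  }
  // Otherwise fall back to HTML5 Audio if the Audio constructor exists.
  if (typeof env.Audio !== "undefined") {
    return "html5";
  }
  // Neither mechanism is available.
  return "none";
}

// In a page this would be called as: pickAudioBackend(window)
```

The point is only the ordering: try the newer API first and treat HTML5 Audio as the safety net.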

Let’s focus on the present now and take a look at solutions that work and are broadly applicable today.  If you want to use sound in the browser today, it probably means that you will have at least some degree of reliance on HTML5 Audio – either directly using “native” JavaScript or else indirectly through some library such as Howler.js (since Howler.js will automatically fall back to using HTML5 Audio if it must).  It should be noted that there are other JavaScript sound libraries, but these tend to start with HTML5 Audio support and fall back to Flash support if HTML5 Audio is not supported.  While a Flash fallback is useful in a small number of scenarios, this has become increasingly unnecessary since HTML5 Audio support in modern browsers is nearly ubiquitous.  Personally, I’d rather go with a forward-looking solution like Howler.js that tries to support the latest technologies and uses HTML5 Audio as the fallback.

Here’s a quick review of the core ideas behind playing sound in the browser and using HTML5 Audio.   At the most basic level, the process of causing a browser to emit sound consists of two steps:

  1. Loading / pre-loading your sound resource(s).
  2. Playing your sound resource(s).

Regardless of which browser is involved, when employing HTML5 Audio it is the audio element that handles both of these tasks.  It will even provide a sound playing UI if you enable it.  For our purposes however, we would prefer to dynamically create an audio element with no visible UI, and use it from within JavaScript code to both load and play sounds.

Whether a static or a dynamic audio element is employed, the manner in which it works is configured by setting its attributes.  The following are the “most important” attributes and are enough to get started with – if your sound playing needs are fairly basic, it may be that you will need to set no other attributes.

  • The controls attribute – Add this to the element by itself (with no associated value) and it will enable the built-in sound playing UI.  This is normally used when a static element is employed but is useful to know about, even though neither code example in this article uses it.
  • The src attribute – This is used to define the sound resource (i.e., some file path or URL) that the element will act upon.
  • The preload attribute  – This is supposed to allow control of if and when the element will preload the sound (buffer it in memory) to minimize or eliminate lag between the request to play the sound and the actual playing of the sound.  If you do not specify a preload attribute, the default value of “auto” will be applied, which is pretty much what you’d always want anyway.  Unfortunately, some browsers – notably mobile Safari – ignore this setting and will refuse to preload, only loading sound after a user interaction is detected.

Aside from the above-listed attributes, there are only perhaps two more things you must know about the audio element – how to know when a sound resource is ready to play and how to actually play it.  Some sound resources can be very large – for example, music files.  A significant amount of time can pass between initiation of preload and the ability to actually play a sound.  The canplaythrough event is useful in detecting when a sound has been loaded completely enough to be able to play it without problems.  Which leads us to the other thing you should know – the play method of the HTML5 audio element must be called from JavaScript code to actually play the sound if the built-in sound UI is not employed.
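Here is a minimal sketch of those two steps, load and then play, using a dynamically created audio element.  The function name loadSound, the makeAudio hook and the file path are assumptions made for illustration; the hook exists only so the sketch can be tried with a stand-in element.  Preload is set explicitly here rather than left to the browser.

```javascript
// Sketch: load one sound with a dynamically created audio element and
// call back once the browser reports it can play through.
// `makeAudio` is a hypothetical hook; by default a real element is created.
function loadSound(src, onReady, makeAudio) {
  var create = makeAudio || function () { return new Audio(); };
  var audio = create();

  // Register the listener before setting src so the event is not missed.
  audio.addEventListener("canplaythrough", function handleReady() {
    // canplaythrough can fire more than once, so listen only for the first.
    audio.removeEventListener("canplaythrough", handleReady);
    onReady(audio);
  });

  audio.preload = "auto";  // a request only; mobile Safari may ignore it
  audio.src = src;         // assigning src asks the browser to fetch the file
  return audio;
}

// Browser usage (the path is a placeholder):
// loadSound("sounds/ding.wav", function (audio) { audio.play(); });
```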

In short, using the HTML5 audio element is pretty easy – unfortunately the lack of a single standardized audio format and the uneven implementation of the specification across browsers present difficulties to the web developer who would like to write once and run everywhere.   We’ll look at the worst of these difficulties and then examine work-arounds and approaches to deal with them.  Once this is out of the way, we’ll dive into some code, the implementation of which has been designed to avoid or minimize these difficulties.

Probably the first thing to consider and possibly the most significant, is to decide what audio format to use for your sound.  There are several formats available but here is a short-list of some of the most popular and summary comments on each:

  • The wav format -  On the plus side, this format provides good sound quality and is usable nearly everywhere – the most notable exception being Internet Explorer, which does not support wav except in the mobile version.  On the negative side, files of this format are quite large, and as mentioned already, it is not supported by desktop I.E.
  • The mp3 format – On the plus side, this format provides reasonably good sound quality (although it is a lossy storage format) with small file sizes (it is compressed).  Additionally, this format is supported by every modern browser except Opera.  On the negative side, there is typically a small lag when decoding the compression on this file – even if pre-loaded – and again as already mentioned, no Opera support.  Additionally, there are potential legal issues with using this format, which, although quite standard in the music industry, is proprietary in nature.  The good news for consumers is that several of the patents related to mp3 usage have already gone through one patent extension and the last of these remaining patents are due to expire sometime in 2015.  Although one or two patents related to mp3 are still enforced by two companies, this enforcement is ebbing and might not even be an issue except for high volume usages (and in these cases the financial demands for usage are not so unreasonable).  Caveat: Let it be understood that I am not a lawyer and not offering legal advice – use mp3 at your own risk.
  • The m4a aka AAC format – This is a variation on MPEG4 for audio files, a descendant of mp3.  There are several plusses compared to mp3: files are smaller, the sound is higher quality and usage is completely royalty-free.  The downside is that support for this format is probably not quite as widespread as it is for mp3.  Also, I believe there is still a decode lag on this format.
  • The ogg format – This is an open, royalty-free format that offers good quality and reasonable size files.  The downside is that it is only available on certain platforms: Chrome, Firefox and Opera.  Neither Internet Explorer nor Safari supports it at this time.

If you have to choose one format, wav is a pretty good choice for small sounds and m4a (AAC) is a pretty good choice for large sounds and music files.  The mp3 format may also be a possibility (just an observation – follow it at your own risk).  However, be aware of the aforementioned caveats when choosing these or any formats.  The good news is, you can specify multiple formats.  This is easy and reliable if you put static tags in your html, and more difficult if using JavaScript.  The canPlayType method is used for this, but instead of returning true or false, it returns “probably”, “maybe” or an empty string – go figure.  Another of the nice features of Howler.js is that you can specify an array of files in different formats for the same sound, and Howler.js will figure out which to play.  Of course, one problem in going with the multiple format approach is that you have to provide multiple resources, which is more hassle and not ideal on mobile devices with their limited storage space.
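To make the three-valued answer from canPlayType a little less mysterious, here is one way the choice could be made from JavaScript.  This is only a sketch: pickPlayableSource, the source list and the file names are invented for illustration, and the canPlayType function is passed in so the logic can be tried without a browser.

```javascript
// Sketch: choose a source the browser claims it can play.
// In a page, `canPlayType` could be:
//   function (type) { return new Audio().canPlayType(type); }
function pickPlayableSource(sources, canPlayType) {
  var firstMaybe = null;
  for (var i = 0; i < sources.length; i++) {
    var answer = canPlayType(sources[i].type);
    if (answer === "probably") {
      return sources[i].url;        // strongest answer the API can give
    }
    if (answer === "maybe" && firstMaybe === null) {
      firstMaybe = sources[i].url;  // remember the first weaker match
    }
    // An empty string means the type is not supported at all.
  }
  return firstMaybe;                // null when nothing is playable
}

// Hypothetical files, listed in order of preference.
var dingSources = [
  { url: "sounds/ding.ogg", type: "audio/ogg" },
  { url: "sounds/ding.m4a", type: "audio/mp4" },
  { url: "sounds/ding.wav", type: "audio/wav" }
];
```

A “probably” anywhere in the list wins over an earlier “maybe”; whether that is the right trade-off depends on how much you trust the browser's answer.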

We’ll look at just one more concern – the issues with mobile Safari – and then finally move on to some actual code.  Mobile Safari has the following issues with playing sounds (this list is derived from a somewhat famous article by Remy Sharp on his blog – see link at bottom of page):

  • iPhones do not like playing too much audio at once, it gets very choppy.
  • iPads do not play more than one audio stream at once.
  • iOS will not preload the audio unless the user initiates the action.
  • There is about a 1/2 second delay before iOS is able to play the audio.  This is because the audio object (in iOS, not HTML5) is being created.

The list above can be condensed into two major problems:

  • Apple iOS / mobile Safari uses a singleton audio object to play sounds and switching between sound resources always incurs a lag while the backing audio object is being created and initialized.
  • A user must take an action in the UI before each sound can be loaded – this effectively means that there is no preload and you must suffer an additional lag to load each sound the first time it is played.

There is an approach that helps to address these two problems: the use of a single sound file containing sound sprites (aka audio sprites).  The term “sound sprite” or if you prefer, “audio sprite”, is a derivative of the css or image sprite idea employed in game development.  Images are often packed together in single files known as sprite sheets.  These sprite sheets consist of multiple embedded images, which are in turn used as textures to give game sprites their unique individual appearance.  Placing sprites all in one sprite sheet results in better organization, better performance and better memory usage.  Similarly, a sound sprite is an embedded sound in a single larger file containing other embedded sounds.  You get some of the same benefits using this technique that are gained through the use of image sprites, but that’s not the best reason to use them.  The biggest reason that sound sprites are worth using is that they help surmount some difficulties experienced in playing sound in Apple’s mobile Safari browser.  Creating JavaScript code to use a sound sprite is no trivial matter (also see the above-mentioned Remy Sharp article for details on how to do this).  Fortunately, once again Howler.js can come to your rescue.  It provides an API for using a sound sprite in addition to providing APIs for using standalone sound resource files – and don’t forget, it will use Web Audio if it is supported, which gives better results than HTML5 Audio.
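To give a feel for the idea, here is a deliberately simplified sketch of playing one slice of a sprite file with a single audio element.  It is not Remy Sharp's implementation and it skips the hard parts (mobile Safari's user-interaction rule, waiting for the file to load).  The names playSprite, sprites and schedule, and the offsets shown, are assumptions for illustration, and the audio element is assumed to be already loaded.

```javascript
// Sketch: play one slice of a sprite file with a single audio element.
// `sprite` holds a start offset and a duration, both in milliseconds.
// `schedule` defaults to setTimeout and is a parameter only to ease testing.
function playSprite(audio, sprite, schedule) {
  var later = schedule || setTimeout;
  audio.currentTime = sprite.start / 1000;  // currentTime is in seconds
  audio.play();
  // Stop once the slice has had time to play.  Timers are not precise,
  // so padding each embedded sound with a little silence helps.
  return later(function () { audio.pause(); }, sprite.duration);
}

// Hypothetical layout of a sprite file.
var sprites = {
  ding:  { start: 0,    duration: 900 },
  shore: { start: 2000, duration: 3500 }
};

// Browser usage: playSprite(loadedAudioElement, sprites.shore);
```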

Now, at last, we get to look at some code -  the first example is a Sounds object that relies on plain HTML5 code and the Audio element.  This Sounds object is in its own module, contained within an IIFE (Immediately Invoked Function Expression), and uses the revealing module pattern to expose functionality.  The object is stored in a single global app variable, making it ready for use anywhere within the app with no pollution of the global name space.  Take a look at the code below.

The gist of the code above is that five different sound files are loaded and methods are provided to play these sounds individually.  Each sound is assigned to its own dynamically created Audio element, with the src attribute of that element referencing the actual file containing the sound.  The preload attribute is not used here, since the default for the preload attribute is “auto”, and that is what we want.  This is the best shot at getting our sounds preloaded, but in reality, a preload of “auto” is only considered a request, and browsers may honor it or ignore it (e.g. mobile Safari ignores it).  I should note here that the canplaythrough event, which is used here as a means of knowing when a sound has been loaded, is not perfect – at least not on some browsers.  In Chrome, if I specified a path to a file that did not exist, the canplaythrough event still fired – though it fired almost immediately – I suppose because there was nothing to load.  To complete the explanation of this Sounds object, a reference to each audio element, and thus each sound, is stored in the Sounds object as a property using the associative array syntax, with the property name being a short descriptive name for the sound.  This name is the argument value that is passed to the play method, denoting which sound to play.  I have used objects very much like this in small projects and apps and they work well for providing basic sounds.
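As a rough sketch of the kind of module described above (not the listing from the demo itself), something along these lines would follow the same shape: an IIFE, the revealing module pattern, and a single global app variable.  The function names, the file paths and the optional makeAudio hook are assumptions for illustration, and unlike the description above, this version waits for an explicit load call rather than loading its files immediately.

```javascript
// Sketch only: names, file paths and the load() hook are illustrative.
var app = app || {};

app.Sounds = (function () {
  var sounds = {};  // short name -> audio element
  var ready = {};   // short name -> true once canplaythrough has fired

  // `files` maps a short name to a file path; `makeAudio` is optional and
  // exists so the module can be exercised with a stand-in element.
  function load(files, makeAudio) {
    var create = makeAudio || function (src) { return new Audio(src); };
    Object.keys(files).forEach(function (name) {
      var audio = create(files[name]);
      audio.addEventListener("canplaythrough", function () {
        ready[name] = true;
      });
      sounds[name] = audio;  // stored using associative array syntax
    });
  }

  function isReady(name) {
    return ready[name] === true;
  }

  // Returns false when no sound is registered under `name`.
  function play(name) {
    var audio = sounds[name];
    if (!audio) {
      return false;
    }
    audio.play();
    return true;
  }

  // Revealing module pattern: expose only the public functions.
  return { load: load, isReady: isReady, play: play };
}());

// Browser usage (paths are placeholders):
// app.Sounds.load({ ding: "sounds/ding.wav", shore: "sounds/shore.wav" });
// app.Sounds.play("ding");
```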

Now we’ll look at a Howler.js version of this same object.  This implementation uses a sound sprite file, although as previously mentioned, Howler.js also supports the playing of stand alone files as the “hand-made” object above does.  However, Howler provides much more built in capability than the object above and it does so in only about 9K of code.  Take a look at the Howler version of a Sound controlling object below.

In the above example, there are four sounds embedded in one file – the loading and playing details all handled by Howler.js.  This file is borrowed from a JSFiddle example created by Aaron Gloege (see link at bottom of page). Note that for each sound, a starting point within the file and a duration (both in milliseconds) are specified.  When Howler.js plays one of these sounds, it seeks to the proper point in the file and plays it for the specified duration.  It is also worth noting that each embedded sound is surrounded by a small amount of silence (generally a second or two) which allows for some leeway in the precision of the audio playing mechanism.  Despite major internal differences, the API for the HSounds object is pretty much the same as the Sounds object API.  So I’ll not repeat a description of that.
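For readers who want to see the general shape of a Howler.js sprite definition, here is a hedged sketch.  It is not the demo's HSounds listing: the file paths, sprite names and offsets are placeholders, createHSounds is an invented name, and the option name src is the one used by Howler.js 2.x (the 1.x releases called it urls).  The Howl constructor is passed in so the sketch does not assume how the library is loaded.

```javascript
// Sketch only: paths, sprite names and offsets are placeholders.
var app = app || {};

// `HowlCtor` would be the global Howl in a page that includes howler.js.
app.createHSounds = function (HowlCtor, onLoaded) {
  var howl = new HowlCtor({
    // Several formats for the same sprite file; Howler picks one it can play.
    src: ["sounds/sprite.ogg", "sounds/sprite.mp3"],
    sprite: {
      // name: [offset in ms, duration in ms]
      first:  [0, 1000],
      second: [2000, 1000],
      third:  [4000, 1000],
      fourth: [6000, 1500]
    },
    onload: onLoaded  // optional callback, fired when the file has loaded
  });

  // Same calling style as the plain HTML5 version: play a sound by name.
  function play(name) {
    return howl.play(name);
  }

  return { play: play };
};

// Browser usage:
// app.HSounds = app.createHSounds(Howl);
// app.HSounds.play("second");
```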

This article is drawing to a close – I hope you found the explanations, the demos and the stealable code to be useful.  As promised at the start of the article, a small demo of a page that uses both of these objects can be found on Uberiquity.com, here.  In the sections below I provide other links pertinent to this article and also attribution for the remaining sound files (those other than the file I borrowed from Aaron Gloege’s JSFiddle).

Resources cited in this article:

Remy Sharp’s Audio Sprites article

Aaron Gloege’s JSFiddle Audio Sprites example

Howler.js Home Page

Using HTML5 audio and video (MDN)

 

Attributions (for sound files other than those used by Mr. Gloege):

  • The “1”, “2” and “3” sounds are being used under the Creative Commons Attribution License (3.0) and were created by Margo Heston.  They can be found in a pack on freesound.org at http://www.freesound.org/people/margo_heston/packs/12534/
  • The shore sound and the ding sound are being used under the Creative Commons 0 License.

 

 

 

Understanding Open Source Licensing

“Liberty is the right to do what I like; license, the right to do what you like.” – Bertrand Russell

Russell implies that license means abrogation of rights.  In the world of open source software this quote is true – and yet in an important way, false.  The very idea of open source software licensing is intended to establish and guarantee the right to use without any real financial cost.   However, monetarily free does not mean free in every sense of the word – there are restrictions on open source licenses.  The details of these license restrictions vary depending upon which open source license is employed – some are more restrictive than others.  To complicate matters even more, it can be argued that some types of restrictions are, on the whole, beneficial.  With literally hundreds of distinct licenses in circulation, and bearing in mind the legalese in which most of these are wrapped, it is a daunting task to understand when and how a particular license can safely be used.

This article is an attempt to demystify the most common of these open source licenses.  It will also define – in layman’s terms – some of the language and concepts used in describing and discussing open source licenses.  However, take note before continuing on – the phrase “layman’s terms” is quite apropos – I am not a lawyer nor am I an expert on open source software licensing.  The information presented in this article is a distillation of information obtained by studying several different internet sources.

To simplify the task of understanding open source licensing, one can make use of three ideas:

  1. There are some basic similarities common to all open source licenses.
  2. Even the differences between common open source licenses can be reduced by grouping them into basic categories.
  3. In practice, there is only a small subset of these licenses that see frequent use.

Of the hundreds of licenses alluded to above, less than one hundred need be given any real consideration.  Even so, by applying the above ideas it is possible to reduce the number of licenses that must be considered to only seven or so, and each of these seven can be categorized as belonging to only one of two basic types.  However, before going into details I will define a small number of terms pertaining to open source software licensing.  The majority of these terms are useful in the context of this article and may also be useful on occasions when examining various open source licenses in the wild.

The following definitions of terms are useful in understanding open source software licensing:

Open Source Software – Executable or component software that is freely (both in cost and accessibility) available for private and commercial use.  Proprietary use may or may not be restricted, depending upon the type of license.

Private Use -  Any and all localized uses of the software, including modification, that do not involve redistribution for profit.

Commercial Use – Redistribution of the software as a product – in whole, part, derived or combinatorial work – in exchange for financial compensation.  An important point to bear in mind with this definition is that there is no attempt to limit or alter the original terms of the license.

Proprietary Use – In terms of open source software, “proprietary use” is a misnomer – once it becomes proprietary, it is no longer open source.  However, the concept can come into play with some types of licensing combinations.  A reasonable definition is that this is like commercial use, except that the ability to limit or eliminate the impact of the original open source license has been granted or purchased.

Open Source License – A glob of text (typically less than around one thousand words) that indemnifies the creator/distributor against legal action, describes the rights of any would be user, and also potentially details limits of usage and distribution of the pertinent software.

Copyright – A legal device that guarantees the creator of a work the right to control its usage by others.  Copyright law provides the “teeth” for open source licenses.

Fair Use – Fair use is a legal concept that provides a limit to the manner in which copyright law may be enforced.  It generally applies to actual usage or output produced from a work, instead of restricting in what manner the contents of the work itself may be used.

Copyleft – A copyleft clause in a license allows the open source use of a work by any party, but also mandates that any party who creates a derivative work must not change or disallow the rights detailed in the original open source license.  You might think of this in coding terms as a sort of “inheritance” for licenses.  The term “copyleft” has no legal meaning and derives its backing from copyright law.

Attribution Requirement – Typically this means that the license under which an open source product is distributed must be copied word for word into the source and/or documentation of any derivative product, if said product will itself be distributed.  Indemnification clauses are often bundled with the Attribution text.

License Compatibility – This term comes into play when one is considering distributing a derivative work that is based upon open source works using different kinds of licenses.  The terms of some licenses are mutually incompatible.  On the other hand, some licenses are by nature compatible with other open source licenses.

Multiple (often Dual) Licensing – A mechanism whereby a software creator allows users of the work to use it under more than one license.  This offers the user a choice – one license or the other will apply.  This technique is typically used to work around license compatibility issues in popular open source licenses.

So, what are the open source license commonalities?

  • All are created and offered with the philosophy that it should be possible to distribute useful software ideas and implementations in ways that do not result in their monopolization.  Sometimes the intent is purely altruistic and sometimes the idea is to promote mass distribution with the idea of eventual direct or indirect gain.
  • All are based upon copyright law.
  • All are intended to provide the software creator/distributor with some degree of legal indemnity against harm.
  • All require some form of attribution.

What basic differences exist between the different sorts of open software licenses, and how can similar types of licenses be grouped together for easier understanding?  Open source software licenses can be divided into two basic groups:  Those that employ “copyleft” and those that do not.  The intent of copyleft is to give all users the same rights as given to immediate or “first level” users of the software.  This generally translates into requirements for authors of derivative works to make freely available their changes to the original work – in other words, the changes to the open source software must also be open source.  While this is a laudable goal, it is more restrictive than the very permissive rules of non-copyleft open source licenses.  This tends to prevent the adoption of copyleft solutions by some companies.

The most common examples of copyleft license open source software are those in the GNU General Public License family, typically referred to as GPL.  The GPL licenses represent the first form of open source licenses that were introduced and are probably the most popular form of open source license in use today.  It is worth noting that even the wording of the license itself is copyrighted.  In the GPL family there are several similar licenses, the most common ones in use today being:

  • GPLv2 – a long time staple of the open source industry
  • GPLv3 – A revision of GPLv2 that aims at preventing tivoization and introducing greater compatibility with other licensing schemes.
  • LGPL – This family of GPL variants is similar to the GPL family, but uses what is known as a “weak copyleft” scheme.  The first “L” in LGPL stands for “Lesser” (originally “Library”) – not “Light”.  What the LGPL offers that a GPL does not is the ability to create a program that “dynamically links” to LGPL licensed products yet is not required to itself utilize the LGPL license.  The term “dynamic linking” refers to code accessing other discrete units of compiled code without embedding them.
  • AGPLv3 – Otherwise known as the Affero General Public License, this is a GPL-like license with one big difference.  It is intended to stop scenarios such as web sites or other central network resources using open source licensed code to provide a service but without abiding by the open source license used in the creation of the centralized product.

The most common representatives of non-copyleft open source software licenses are BSD, MIT/X11 and Apache 2.0.  These licenses are very permissive, requiring only attribution in a prescribed manner.  The permissiveness is similar to that provided by public domain, aside from the need for attribution and the additional protections provided by concurrent disclaimers in these open source licenses.  Open software used under this kind of license is the kind most likely to end up in proprietary products.

  • BSD – This may be the oldest and simplest form of non-restrictive open source license.  This is actually a family within itself but all are pretty similar.  The texts of the most common among these are not copyrighted.
  • MIT/X11 – Not much different than the BSD license family, but the wording more explicitly states the rights of users of the software.  The wording of this license is not copyrighted. Note that MIT has released other licenses in the past, which may differ from X11.  Nevertheless, the MIT/X11 license is often colloquially referred to as the “MIT” license.
  • Apache 2.0 – Similar in philosophy to BSD and MIT/X11, but much more verbose with more explicit wording yet, particularly including extra provisos regarding patent rights.  The Apache License requires preservation of the copyright notice and disclaimer.

We have arrived at last upon the take-away paragraph.  Here is the summary of what one needs to remember about open source licensing.  Bear in mind that I am not making recommendations here – I am only sharing my personal observations.  Use any of these in any manner at your own risk.

  • If you just want to use some free code and don’t care about or don’t want to maintain its “open-sourceness”, look for a BSD, MIT/X11 or Apache 2.0 license.
  • If you want to use open source code and make sure your changes continue to stay open source, look for a GPL license variant, but be ready to supply your source changes.
  • You will encounter other licenses that are not mentioned here.  A good first step in understanding licenses is to classify them as either “copyleft” or “non-copyleft”. However, it is very important to understand all of the restrictions and rights associated with the license of any open source software if you change it – or even simply use it – in a product that you intend to make available for others.
  • When you combine open source material from multiple sources, you must ensure that the licensing constraints are compatible.  In general, extra care must be taken when copyleft licenses are involved, though some copyleft licenses, for example LGPLv3, may be compatible with other licenses.
  • When dual or multiple licenses are offered, choose the one which best suits your own distribution and compatibility needs.
  • In virtually every case, open source software requires attribution.  Typically this consists of the requirement to redistribute the “license text”, which includes disclaimers of liability.  Licenses will explicitly state all details of the attribution requirements – and by law they must be followed if you wish to distribute software that uses the licensed product.