Understanding Open Source Licensing

“Liberty is the right to do what I like; license, the right to do what you like.” – Bertrand Russell

Russell implies that license means abrogation of rights.  In the world of open source software this quote is true – and yet in an important way, false.  The very idea of open source software licensing is to establish and guarantee the right to use software without any real financial cost.  However, monetarily free does not mean free in every sense of the word – there are restrictions in open source licenses.  The details of these restrictions vary depending upon which open source license is employed – some are more restrictive than others.  To complicate matters even more, it can be argued that some types of restrictions are, on the whole, beneficial.  With literally hundreds of distinct licenses in circulation, and bearing in mind the legalese in which most of these are wrapped, it is a daunting task to understand when and how a particular license can safely be used.

This article is an attempt to demystify the most common of these open source licenses.  It will also define – in layman’s terms – some of the language and concepts used in describing and discussing open source licenses.  However, take note before continuing on – the phrase “layman’s terms” is quite apropos – I am not a lawyer nor am I an expert on open source software licensing.  The information presented in this article is a distillation of information obtained by studying several different internet sources.

To simplify the task of understanding open source licensing, one can make use of three ideas:

  1. There are some basic similarities common to all open source licenses.
  2. Even the differences between common open source licenses can be reduced by grouping them into basic categories.
  3. In practice, there is only a small subset of these licenses that see frequent use.

Of the hundreds of licenses alluded to above, fewer than one hundred need be given any real consideration.  Even so, by applying the above ideas it is possible to reduce the number of licenses that must be considered to only seven or so, and each of these seven can be categorized as belonging to only one of two basic types.  However, before going into details I will define a small number of terms pertaining to open source software licensing.  The majority of these terms are useful in the context of this article and may also be useful on occasions when examining various open source licenses in the wild.

The following definitions of terms are useful in understanding open source software licensing:

Open Source Software – Executable or component software that is freely (both in cost and accessibility) available for private and commercial use.  Proprietary use may or may not be restricted, depending upon the type of license.

Private Use – Any and all localized uses of the software, including modification, that do not involve redistribution for profit.

Commercial Use – Redistribution of the software as a product – in whole, part, derived or combinatorial work – in exchange for financial compensation.  An important point to bear in mind with this definition is that there is no attempt to limit or alter the original terms of the license.

Proprietary Use – In terms of open source software, “proprietary use” is a misnomer – once it becomes proprietary, it is no longer open source.  However, the concept can come into play with some types of licensing combinations.  A reasonable definition is that this is like commercial use, except that the ability to limit or eliminate the impact of the original open source license has been granted or purchased.

Open Source License – A glob of text (typically less than around one thousand words) that shields the creator/distributor against legal action (usually by disclaiming warranty and liability), describes the rights of any would-be user, and also potentially details limits of usage and distribution of the pertinent software.

Copyright – A legal device that guarantees the creator of a work the right to control its usage by others.  Copyright law provides the “teeth” for open source licenses.

Fair Use – Fair use is a legal concept that provides a limit to the manner in which copyright law may be enforced.  It generally applies to actual usage or output produced from a work, instead of restricting in what manner the contents of the work itself may be used.

Copyleft – A copyleft clause in a license allows the open source use of a work by any party, but also mandates that any party who creates a derivative work must not change or disallow the rights detailed in the original open source license.  You might think of this in coding terms as a sort of “inheritance” for licenses.  The term “copyleft” has no legal meaning and derives its backing from copyright law.

Attribution Requirement – Typically this means that the license under which an open source product is distributed must be copied word for word into the source and/or documentation of any derivative product, if said product will itself be distributed.  Indemnification clauses are often bundled with the Attribution text.

License Compatibility – This term comes into play when one is considering distributing a derivative work that is based upon open source works using different kinds of licenses.  The terms of some licenses are mutually incompatible.  On the other hand, some licenses are by nature compatible with other open source licenses.

Multiple (often Dual) Licensing – A mechanism whereby a software creator allows users of the work to use it under more than one license.  This offers the user a choice – one or the other license will apply.  This technique is typically used to work around license compatibility issues in popular open source licenses.

So, what are the open source license commonalities?

  • All are created and offered with the philosophy that it should be possible to distribute useful software ideas and implementations in ways that do not result in their monopolization.  Sometimes the intent is purely altruistic and sometimes the idea is to promote mass distribution with the idea of eventual direct or indirect gain.
  • All are based upon copyright law.
  • All are intended to provide the software creator/distributor with some degree of legal indemnity against harm.
  • All require some form of attribution.

What basic differences exist between the different sorts of open source software licenses, and how can similar types of licenses be grouped together for easier understanding?  Open source software licenses can be divided into two basic groups:  Those that employ “copyleft” and those that do not.  The intent of copyleft is to give all users the same rights as given to immediate or “first level” users of the software.  This generally translates into requirements for authors of derivative works to make freely available their changes to the original work – in other words, the changes to the open source software must also be open source.  While this is a laudable goal, it is more restrictive than the very permissive rules of non-copyleft open source licenses.  This tends to prevent the adoption of copyleft solutions by some companies.

The most common examples of copyleft open source licenses are those in the GNU General Public License family, typically referred to as GPL.  The GPL licenses were among the earliest open source licenses to be introduced and are probably the most popular form of open source license in use today.  It is worth noting that even the wording of the license itself is copyrighted.  In the GPL family there are several similar licenses, the most common ones in use today being:

  • GPLv2 – a long time staple of the open source industry
  • GPLv3 – A revision of GPLv2 that aims at preventing tivoization and introducing greater compatibility with other licensing schemes.
  • LGPL – This family of GPL variants is similar to the GPL family, but uses what is known as a “weak copyleft” scheme.  The first “L” in LGPL stands for “Lesser” (originally “Library”) – not “Light”.  What the LGPL offers that the GPL does not is the ability to create a program that “dynamically links” to LGPL licensed products yet is not required to itself utilize the LGPL license.  The term “dynamic linking” refers to code accessing other discrete units of compiled code without embedding them.
  • AGPLv3 – Otherwise known as the GNU Affero General Public License, this is a GPL-like license with one big difference.  It is intended to address scenarios in which web sites or other central network resources use open source licensed code to provide a service without sharing their changes, which the GPL permits because the software is never actually distributed; the AGPL requires that the source be offered to users who interact with the software over a network.

The most common representatives of non-copyleft open source software licenses are BSD, MIT/X11 and Apache 2.0.  These licenses are very permissive, requiring only attribution in a prescribed manner.  The permissiveness is similar to that provided by public domain, aside from the need for attribution and the additional protections provided by the disclaimers in these open source licenses.  Open source software used under this kind of license is the kind most likely to end up in proprietary products.

  • BSD – This may be the oldest and simplest form of non-restrictive open source license.  This is actually a family in itself, but all of its members are pretty similar.  The texts of the most common among these are not copyrighted.
  • MIT/X11 – Not much different from the BSD license family, but the wording more explicitly states the rights of users of the software.  The wording of this license is not copyrighted.  Note that MIT has released other licenses in the past, which may differ from X11.  Nevertheless, the MIT/X11 license is often colloquially referred to as the “MIT” license.
  • Apache 2.0 – Similar in philosophy to BSD and MIT/X11, but much more verbose with more explicit wording yet, particularly including extra provisos regarding patent rights.  The Apache License requires preservation of the copyright notice and disclaimer.
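
The two-way grouping described above can be written down as a small lookup table.  What follows is only a minimal Python sketch, not any kind of legal reference: the keys, the category labels and the `classify` and `is_copyleft` helpers are names I made up for illustration, and the “weak” and “network use” labels are just shorthand for the LGPL and AGPLv3 notes in the list above.

```python
# Illustrative grouping of the licenses discussed in this article.
# Keys and category labels are shorthand invented for this sketch.
LICENSE_GROUPS = {
    "GPLv2": "copyleft",
    "GPLv3": "copyleft",
    "LGPL": "weak copyleft",
    "AGPLv3": "copyleft (network use)",
    "BSD": "non-copyleft",
    "MIT/X11": "non-copyleft",
    "Apache 2.0": "non-copyleft",
}


def classify(license_name):
    """Return the group for a license name, or 'unknown' if it is not in the table."""
    return LICENSE_GROUPS.get(license_name, "unknown")


def is_copyleft(license_name):
    """True when the license falls in any of the copyleft groups."""
    group = classify(license_name)
    # Compare whole labels; "non-copyleft" contains the word "copyleft" too.
    return group not in ("unknown", "non-copyleft")


if __name__ == "__main__":
    for name in LICENSE_GROUPS:
        print(f"{name}: {classify(name)}")
```

A license that is not in the table comes back as “unknown”, which is the cue to go and read its actual terms.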

We have arrived at last upon the take-away paragraph.  Here is the summary of what one needs to remember about open source licensing.  Bear in mind that I am not making recommendations here – I am only sharing my personal observations.  Use any of these in any manner at your own risk.

  • If you just want to use some free code and don’t care about or don’t want to maintain its “open-sourceness”, look for a BSD, MIT/X11 or Apache 2.0 license.
  • If you want to use open source code and make sure your changes continue to stay open source, look for a GPL license variant, but be ready to supply your source changes.
  • You will encounter other licenses that are not mentioned here.  A good first step in understanding licenses is to classify them as either “copyleft” or “non-copyleft”.  However, it is very important to understand all of the restrictions and rights associated with the license of any open source software if you change it – or even simply use it – in a product that you intend to make available for others.
  • When you combine open source material from multiple sources, you must ensure that the licensing constraints are compatible.  In general, extra care must be taken when copyleft licenses are involved, though some copyleft licenses, for example LGPLv3, may be compatible with other licenses.
  • When dual or multiple licenses are offered, choose the one which best suits your own distribution and compatibility needs.
  • In virtually every case, open source software requires attribution.  Typically this consists of the requirement to redistribute the “license text”, which includes disclaimers of liability.  Licenses will explicitly state all details of the attribution requirements – and by law they must be followed if you wish to distribute software that uses the licensed product.
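
The attribution point lends itself to a small illustration.  This sketch assumes a hypothetical product that bundles two open source components and has each component’s license text available as a string; the component names, the placeholder texts and the `build_notices` helper are all invented for the example.  It simply copies each license text word for word into one notices document that could ship with the product.  Whether that is enough for a given license depends on the wording of that license.

```python
def build_notices(components):
    """Combine license texts verbatim into a single notices document.

    `components` maps a component name to the full text of its license.
    Each section is the name, an underline, then the unaltered license text.
    """
    sections = []
    for name in sorted(components):
        text = components[name].strip()
        sections.append(f"{name}\n{'-' * len(name)}\n{text}\n")
    return "\n".join(sections)


if __name__ == "__main__":
    # Placeholder texts only; a real project would read the actual license files.
    bundled = {
        "example-parser": "Copyright (c) <year> <holder>\n<full license text here>",
        "example-logger": "Copyright (c) <year> <holder>\n<full license text here>",
    }
    print(build_notices(bundled))
```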

Stepping into the New Age of JavaScript using Github

“A journey of a thousand miles must begin with a single step.” - Lao Tzu

At least initially, this blog will focus primarily on new developments in JavaScript… but where to start?  “Begin at the beginning”, says the King in Carroll’s Alice in Wonderland – but that would be going much too far back for our purposes.  The reader is probably at least passingly familiar with JavaScript and its roots, so we don’t need to tread that path.  It makes more sense to begin with recent developments, such as the advent of server side JavaScript with the inception of node.js and the rise in popularity of HTML5/JavaScript as a very exciting and very real game platform.  There is a fantastic amount of open source code available for both of these topics and many others.  The best place to access that treasure trove of free frameworks, libraries and code is something called Github.

Maybe you already know all about Github – in that case, you should probably stop reading now – this article attempts to do nothing more than define what Github is and provide a basic introduction to its features, installation and use.  If you are like I was, however, you have probably heard of it and have some vague idea of what it is but are not comfortable with it.  Perhaps you’d rather not deal with it.  Maybe you think it sounds too complicated or might expose you as some sort of coding Neanderthal to the elite coding world, should you attempt to get involved.  To be honest, I actually thought those very things myself.

However, after any degree of real exposure to the things that are happening these days in the JavaScript world and in the coding world in general, you are going to discover that you cannot evade Github.  More to the point, after you become more familiar with it you won’t want to avoid it.  The resources there are just too broad, too deep, too well coded, too useful and well… too free to ignore.

Ok, so what is Github?  The least complicated answer that is even close to adequate is that Github is a web site providing a nexus for a distributed version control system called Git.  However, it is more than that and I’ll touch on what that means a bit later on in the article.  Before I do that, I would like to try to make the “least complicated answer” I gave above a little less complicated.

First let’s go over a little background on Git itself.  Git is a source code version control system that was created in its initial form in 2005 by a group of Linux devotees headed by Linus Torvalds.  Unlike other version control systems, for example SVN, Git does not store a base file and then subsequent diffed changes on each check in.  Instead it was designed to store “snapshots” of a project over time (although it can save space by referencing unchanged files in previous snapshots).
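
To make the snapshot idea a little more concrete, here is a toy model in Python.  It is not how Git is implemented internally, only a sketch of the principle: each file’s content is stored once under a hash of that content, and a snapshot is just a mapping from file names to hashes, so an unchanged file costs nothing extra in the next snapshot.  The `blob_id` function follows Git’s scheme for naming file contents (SHA-1 over a short header plus the bytes); the `store` dictionary, the `take_snapshot` helper and the file names are invented for illustration.

```python
import hashlib


def blob_id(content: bytes) -> str:
    """Name content the way Git names a blob: SHA-1 of 'blob <size>', a NUL byte, then the content."""
    header = b"blob " + str(len(content)).encode() + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def take_snapshot(files, store):
    """Record a snapshot of `files` (name -> bytes), reusing stored content.

    Content is added to `store` only if it is not already there; the
    snapshot itself is a small name -> blob id mapping.
    """
    snapshot = {}
    for name, content in files.items():
        key = blob_id(content)
        store.setdefault(key, content)
        snapshot[name] = key
    return snapshot


if __name__ == "__main__":
    store = {}
    first = take_snapshot({"a.js": b"var a = 1;\n", "b.js": b"var b = 2;\n"}, store)
    # Only a.js changes before the second snapshot is taken.
    second = take_snapshot({"a.js": b"var a = 10;\n", "b.js": b"var b = 2;\n"}, store)
    print("b.js shared between snapshots:", first["b.js"] == second["b.js"])
    print("distinct contents stored:", len(store))
```

The two snapshots list four files between them, but only three pieces of content end up in the store because `b.js` did not change.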

Coders editing files belonging to a particular project in the centralized git store have a version of that project (and access to all its history) in working directories on their local machines.  The files that they edit must go through a staging phase before being committed within the local project in which they are being modified.  This local commit is not the same thing as committing changes to the actual central store from which the local version of the project was obtained.  An additional step must be taken to sync and commit changed versions of the files to the actual project repository in which the master source controlled copies of the files reside.
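
The distinction between staging, committing locally and pushing to the central store can be modelled in a few lines.  The class below is a deliberately simplified toy with made-up names (`ToyRepo`, `edit`, `stage`, `commit`, `push`); it tracks only which files sit where and ignores file contents, history and branches.

```python
class ToyRepo:
    """Toy model of where a changed file lives at each step."""

    def __init__(self):
        self.modified = set()      # edited in the working directory
        self.staged = set()        # marked for inclusion in the next commit
        self.local_commits = []    # commits that exist only on this machine
        self.remote_commits = []   # commits known to the central repository

    def edit(self, filename):
        self.modified.add(filename)

    def stage(self, filename):
        if filename not in self.modified:
            raise ValueError(f"{filename} has no changes to stage")
        self.modified.remove(filename)
        self.staged.add(filename)

    def commit(self, message):
        """Commit staged files locally; the central repository is untouched."""
        if not self.staged:
            raise ValueError("nothing staged")
        self.local_commits.append((message, frozenset(self.staged)))
        self.staged.clear()

    def push(self):
        """Send all local commits to the central repository."""
        self.remote_commits.extend(self.local_commits)
        self.local_commits.clear()


if __name__ == "__main__":
    repo = ToyRepo()
    repo.edit("app.js")
    repo.stage("app.js")
    repo.commit("Fix typo")
    print("commits on central repository before push:", len(repo.remote_commits))
    repo.push()
    print("commits on central repository after push:", len(repo.remote_commits))
```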

The differences between the way Git works vs. other source control systems allow Git to be fast and make it convenient to use even if there is no current network connection.  I found an excellent resource on the web (there are many others as well) that explains the basic ideas behind working with Git: http://git-scm.com/book/en/Getting-Started-Git-Basics.

Once you really wrap your head around what Git is, understanding Github is not as fearsome a task as it may have originally seemed.  In passing, it should be noted that there are other web sites that host Git repositories, and Git itself can also be used as a stand-alone system on a single machine.  However, Github is far and away the most popular and nearly all of the up and coming stuff is stored there.  One of the nicest things about Github is that you don’t really have to get very involved in order to utilize the resources found there.  You can go to the site and grab loads of open source software frameworks, libraries and code without even creating a Github account.  You don’t have to jump right in, you can just stick a toe in the water.  Nobody will complain.  You will be just one more anonymous downloader to them.

So, how does one get software from Github in “ninja” mode?

  1. Go to the home page at: https://github.com/.
  2. Type something in the search box at the top of the page, for example: node.js.
  3. Hit enter and watch the results come pouring forth.
  4. Select a project repository from this result list.
  5. Click the button that allows you to download a recent stable version of the project as a compressed file (e.g. zip, tar).
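
The last step can also be scripted.  As a rough sketch, the Python helper below builds the address of a repository’s zip archive for a given branch.  It assumes the archive URL pattern that Github documents for branch downloads, which may change over time; the owner, repository and branch values are placeholders and the function names are my own, so check the result against the download button on the repository page.

```python
import urllib.request


def archive_url(owner, repo, branch):
    """Build the zip archive address for a branch of a public Github repository."""
    return f"https://github.com/{owner}/{repo}/archive/refs/heads/{branch}.zip"


def download_archive(owner, repo, branch, destination):
    """Fetch the archive to `destination`; needs a network connection, no account."""
    urllib.request.urlretrieve(archive_url(owner, repo, branch), destination)
    return destination


if __name__ == "__main__":
    # Placeholder names; nothing is downloaded here, the URL is only printed.
    print(archive_url("some-owner", "some-repo", "master"))
```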

I got 16,249 repository results from a search on “node.js” at the time I wrote this article.  That’s 16,249 projects related to Node.js alone!  Many of these are world class frameworks and code libraries that are in actual use at major corporations and other legitimate companies – and you can use them too.  The majority of these desirable code bases have permissive licenses such as the MIT license or something similar – in essence, free and with almost no restrictions as to their use.  (Note: I plan to do a blog post soon on open source licensing, as I think it is a somewhat confusing subject).

So that is one way to use Github – as a huge bundle of open source code repositories that you can access even without a Github account.  However, there is much more to Github than that:

  • It can be a personal version control and backup system for your projects.
  • It can provide source control for an entire team of people working on the same project(s).
  • You can participate in other people’s open source projects, adding to them and squashing bugs.  Some of these projects are world famous.  If you are competent, courageous and creative, you might just make a name for yourself.
  • You can monitor activity on projects and communicate with other denizens of Github.

Participating in the ways listed above requires that you create and use a Github account.  Free accounts are available and there are also various levels of paid accounts, as of this writing ranging from $7 to $200 per month.  As they rise in cost, the number of private repositories you may own increases.  That is the one thing any paid account gets you that a free account does not – repositories in a free account may not be private, but must be open to the public.  Whatever you store in a free, public account may be seen by anyone and people may use and access your code.  They may even submit suggested changes and corrections.  This is not a problem in the majority of cases and in fact this exemplifies the basic idea of open source.  Free accounts do have an unlimited number of public repositories and users, so that too is very nice.  The steps to create a free account are easy to follow and the process can be initiated from the home page of Github.

Once you have created an account, the next step in being able to use Github for source control purposes is to get some manner of a Git client onto your local machine – you need Git software to use Git, which, as previously stated, is the source control system that Github uses.  Upon creation of an account, the Github site will provide a UI that acts as a sort of “Boot Camp” for beginning to use Github.  Part of this Boot Camp is a UI piece that allows you to download a Git client.  You will have a few choices.  There is a command line Git, which is the original way to use the program, but there are alternatives.  There is a GUI client available for both Windows and Mac OS that is arguably easier to use than the command line program.  There is also an Eclipse plug-in available.  The web site itself also provides a GUI for the portions of the process that involve fetching code from the Github site down to your local machine.

Disclaimer: In the discussions that follow regarding procedure for using Git and Github, I refer to the actions and concepts behind using this system for source control, but I do not provide the actual syntax of commands if using the command line interface, nor discuss the menu choices or clicks in a GUI.  There is help within these programs and other sites on the web for specific command line syntax and GUI options necessary to conduct an action or procedure.  Any approach other than this would result in a very long article and risk muddying the explanation.  However, I may come back to this topic at some point and provide a few examples.

After you have a Git client on your machine you are able to embark on two-way transmission of code with Github.  The process of doing so involves the use of repositories.  We know that there are a large number of repositories already in existence that are owned by other people.  As you might suspect, the process of working with this type of repository is a little different and a little more complicated than working with repositories that you own or that are owned by your team.  We’ll first take a look at how to use owned repositories.

Working with owned repositories:

Let’s start with the easiest scenario to understand, the case in which there is already a repository created on Github and you have ownership access to it.  To be able to work with the code in a Github repository one must first Clone it from the Github website.  Doing so will create a local repository on the user’s local machine.  All of the source code and history of the project will be pulled into this local repository.  It is then possible to edit source files and add the modified files to a local staging area.  Any files in staging can be committed, though still only to the local repository.  The set of committed changes is then ready to be pushed to the Github server repository from which the source was originally obtained.
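
For readers who would like a rough picture of this cycle at the command line, the sketch below lists the standard Git commands involved, wrapped in a small Python helper so the sequence can be printed and inspected.  The repository URL, directory, file name, commit message and branch name are placeholders, and `owned_repo_commands` is a name made up for this example.  Nothing is executed; the commands are only assembled and printed.

```python
def owned_repo_commands(repo_url, directory, filename, message, branch):
    """Return the Git commands for clone, stage, commit and push, in order."""
    # Copy the project and its full history into a new local directory.
    clone = ["git", "clone", repo_url, directory]
    # Files are then edited by hand inside `directory`;
    # the remaining commands are run from within that directory.
    stage = ["git", "add", filename]             # put the edited file in staging
    commit = ["git", "commit", "-m", message]    # commit to the local repository
    push = ["git", "push", "origin", branch]     # send local commits to Github
    return [clone, stage, commit, push]


if __name__ == "__main__":
    commands = owned_repo_commands(
        "https://github.com/some-owner/some-repo.git",  # placeholder URL
        "some-repo",
        "app.js",
        "Describe the change here",
        "master",  # assumed branch name; many newer repositories use "main"
    )
    for command in commands:
        print(" ".join(command))
```

In Git, `origin` is the default name given to the repository that was cloned, which is why the push goes there.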

If no Github repository yet exists for a project, the creation of one begins with creation of a repository on a local machine.  Files are then created, moved to staging and committed within the local repository (similarly to the process described above for modifying cloned project files pulled down to a local repository).  After that is done, the project may be pushed to a Github repository which is created as part of this process.
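
As a sketch of the command line version of starting from a local project, the Python helper below assembles the usual Git commands in order.  It assumes that an empty repository has already been created on the Github site and that its URL is known (some GUI clients create it for you when publishing); the URL, commit message and branch name are placeholders, and `new_repo_commands` is a name invented for this example.  The commands are only printed, not run.

```python
def new_repo_commands(remote_url, message, branch):
    """Return the Git commands to turn a local folder into a pushed repository."""
    return [
        ["git", "init"],                                # create the local repository
        ["git", "add", "."],                            # stage every file in the folder
        ["git", "commit", "-m", message],               # first local commit
        ["git", "remote", "add", "origin", remote_url], # tell Git where Github is
        ["git", "push", "-u", "origin", branch],        # publish and track the branch
    ]


if __name__ == "__main__":
    # The branch must match the one `git init` created locally
    # ("master" or "main", depending on the Git version and configuration).
    for command in new_repo_commands(
        "https://github.com/some-owner/new-project.git",  # placeholder URL
        "Initial commit",
        "master",
    ):
        print(" ".join(command))
```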

Working with Repositories belonging to others:

This process has similarities to the work-flow described above in which a repository is cloned and files are edited, staged, committed and pushed to a Github repository.  However, there are two key differences:

  • A copy of the original repository on Github must first be created on Github.  This is done by Forking a project.  Once a project has been forked, it may then be cloned down to a local machine.
  • When files are eventually pushed back up to Github, they get pushed to the forked repository, not the repository from which the forked project was copied.  The pusher of said files must submit a “pull request” for the files to be “pulled” from the Github fork to the real source repository.  The owner of the actual source repository will then review the changes and merge the files into his repository or not, as he sees fit.
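
The local half of this fork-based process can be sketched the same way.  The helper below assembles the Git commands for working against a fork; forking itself and the pull request both happen on the Github web site, so they appear only as comments.  All URLs and names are placeholders, `fork_workflow_commands` is an invented name, and registering the original repository under the name `upstream` is a common convention I have added for keeping a fork up to date, not something the steps above require.

```python
def fork_workflow_commands(fork_url, original_url, directory, filename, message, branch):
    """Return the Git commands for contributing through a fork, in order."""
    return [
        # (On the web site first: fork the original project.)
        ["git", "clone", fork_url, directory],                # clone your fork, not the original
        ["git", "remote", "add", "upstream", original_url],   # optional: remember the original
        ["git", "add", filename],                             # stage the edited file
        ["git", "commit", "-m", message],                     # commit locally
        ["git", "push", "origin", branch],                    # push to the fork on Github
        # (On the web site last: open a pull request from the fork.)
    ]


if __name__ == "__main__":
    for command in fork_workflow_commands(
        "https://github.com/your-name/some-repo.git",   # placeholder fork URL
        "https://github.com/some-owner/some-repo.git",  # placeholder original URL
        "some-repo",
        "README.md",
        "Fix a typo in the README",
        "master",  # assumed branch name
    ):
        print(" ".join(command))
```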

In summary, Github functions as a repository for both private projects and world class open source software.  However, there is much more to Github.  It provides certain statistics, the ability to get notifications of project changes and the ability to communicate with repository owners and pull request submitters.  There is even more than this to Github, but the discovery of that, as they say, will be left as an exercise for the reader.