Examining Underscore.js Search and Filter Collection Functions

“Price is what you pay. Value is what you get.” – Warren Buffett

By any measure, Underscore.js is a great deal.  Underscore.js is a tiny open source JavaScript utility library that packs about 80 useful and eye-opening functions into only about 5 KB (minified and gzipped).  A significant number of these functions are applicable to collections.  In Underscore.js, the term “collections” can refer to arrays, objects, and array-like objects such as arguments and NodeList.  The Underscore.js documentation describes each of these functions.  In this article we will be looking at actual usage for some of them.  You can find a demo/test page that runs these examples here.

A number of these collection functions are ideal for search-related applications. It’s interesting how many of the Underscore collection functions are reminiscent of SQL. If one were to look at the full list, one would find that about half of them have names and/or do things that are related to SQL select functionality, e.g.:

  • each
  • find
  • filter
  • where
  • findWhere
  • reject
  • every
  • contains
  • max
  • min
  • sortBy
  • groupBy
  • countBy

It is worth noting that the popular JavaScript MV* library, Backbone.js, proxies and uses many Underscore.js functions.  It makes particularly good use of many of these SQL-evocative functions for assisting programmers in the management of collections of Backbone Models.  The search functionality that this article examines, or analogs thereof, could just as easily be implemented in Backbone.js as it could via the more low-level Underscore.js.

There are a small number of usage patterns shared by these Underscore.js collection functions.  We’ll be looking at four collection functions, chosen to represent both the functionality and the differences in usage pattern found among the functions in the above list.  We are going to begin by examining the _.where function, which has great functionality even though it follows a simple usage pattern.  Take a look at the code below.  (Note for programmers unfamiliar with Underscore.js:  As befits the name, all Underscore functions are called through the underscore (_) object, similar to how jQuery functions are called through the dollar sign ($) object.)

The code above produces the following output:

In the test function shown above, note first the signature for the _.where function.  It takes two arguments, “list” and “properties”.  The first argument, list, is one that is found in all of the Underscore.js collection functions.  Recall that collection functions work on arrays, objects, and array-like objects such as arguments and NodeList.  The list argument can represent any of these things.  It is important to understand that the nature of the reference passed via the list argument will slightly alter the usage pattern for these collection functions.  In this particular test we are passing an array of objects named “testArr” as the list.  The other argument required by _.where is the properties argument, which must be an object containing one or more properties that are used as the where criteria, i.e., what to search for in the given list.  In the test case above we are passing an object containing two search properties – we are looking for things that are green and have a tail.  In the input list we have four objects, all of them containing color properties and all of them containing tail properties.  Of the four, only one of them – an iguana – is both green and has a tail.  As is to be expected, when this test is run, only one object is returned – the iguana object.

The next function we will be looking at is the _.each function.  I chose this function as one of the four because of its flexibility and because it is a good one to use in illustrating the full range of usage patterns that may be encountered when using the Underscore.js collection functions.  Take some time to really look at the two code examples for this function – if you can get through this part the rest of the article is downhill.  Whereas the _.where function is fairly simple – taking only a list and a set of search criteria as its arguments – the _.each function is inherently more capable, but with less specified functionality at the outset.  In addition to the list argument that we have already touched upon in the previous example, the _.each function has an iterator argument and an optional context argument instead of a “properties to search for” argument.  The iterator argument represents a function to be passed and the optional context argument represents a “this pointer” context to be bound to the iterator function.  The ability to reassign the this pointer for an object or function is a commonly used JavaScript technique.  An in-depth explanation of this technique is outside the scope of this particular article; however, we will see the context argument in use.  The iterator argument concept is a different story.

In this discussion of the _.each function and in this article in general, we will be spending a fair amount of time examining different uses for an iterator function, as many of the collection functions have such an argument.  The basic concept of an iterator function is the same for all collection functions that use them – it is a bit of code that the Underscore collection function will call for each item in the given list argument.  The number and the nature of arguments that Underscore will pass to an iterator function varies with the particular collection function, as does the nature of the expected function return, if any.  Let’s go ahead and take a look at the code for the first of two _.each tests in the demo page.

The code above produces the following output:

In the test function shown above, note first the signature for the _.each function.  When calling _.each, we are passing a simple array of strings for the list argument.  The iterator function is defined on the fly, as an anonymous function.  The third argument to the _.each method – the context – is a reference to an object declared outside of the testEachInArray function.  Let’s look closely at the iterator function.  It takes three arguments – the Underscore.js documentation for the _.each method specifies this.  It has no return value.  Ok… but what is it doing?  The answer is, not very much.  It is merely printing to the console the content of the three arguments for each iteration.  The Underscore.js documentation for the _.each method further states that when the list argument is an array, the first argument will be the value of the array element being processed for the current iteration, the second argument will be the index of that array element, and the third argument will contain the content passed as list (the entire array).

A major purpose of these _.each function examples is to show an iterator function in action – to give an idea of what sorts of things can happen when an iterator is being called.  In actual use, the iterator function passed to the _.each function would be doing some kind of work with each element in the array, as the _.each function iterated.  But what of the final argument, the context?  Note the use of this.contextValue inside the iterator function.  The output of the function in the console proves that the this pointer for the iterator function has been set to the anExternalContext object, because contextValue prints out correctly.  Let’s take a look now at the second _.each function example.

The code above produces the following output:

In the test function shown above, note that we are calling upon the same _.each function that we called in the previous test.  The difference here is that we are passing an object as the list argument, instead of an array.  This example is more or less the same as the previous one except that the usage pattern is slightly different because the list is an object this time.  Here we must make allowances in the iterator function for the fact that each iteration of _.each is working with a property of the given object passed as the list argument, instead of it being an element of a passed in array.  The Underscore documentation tells us that the three arguments passed to the iterator will (naturally) be a little different for an object passed as list than when an array is passed in the list argument.  Here, the first argument will be the property value being processed for the current iteration, the second argument will be the key for that value, i.e. the property name, and the third argument will again be the content passed as list, in this case an object.  The iterator function takes these differences into account (e.g. the use of JSON.stringify to get a string representation of the list object), but otherwise works the same as the iterator function worked in the first _.each example test.

The next function we will be looking at is the _.max function.  I chose this one because it is reasonably straightforward and representative of other similar functions (e.g. _.min, _.find), in addition to being very useful in its own right.  The iterator function is optional for the _.max function, but when used, its purpose is clear and it is easy to comprehend.  Let’s look at the first of two examples, one that does not employ an iterator function.

The code above produces the following output:

In the test function shown above, note first the signature for the _.max function.  It is the same signature that _.each uses, except that the iterator function is optional.  In this first example test of the _.max function, we are not passing an iterator.  There is no need for one, as the list we are passing is a simple array of numbers and the _.max function needs no additional help to be able to iterate through and find the largest number in the array.  Let’s look now at the second _.max function example usage, which does require an iterator function to be passed.

The code above produces the following output:

In the test function shown above, note that we are calling upon the same _.max function that we called in the previous test.  The difference here is that we are passing an array of objects as the list argument, instead of an array of numbers.  This example is similar to the previous one in that _.max is trying to iterate through the input list to determine the largest value in the set.  However, the data is different here – both in content and in nature.  The _.max function cannot know on its own which property within each object it must access in order to determine the largest value.  In some cases, each object might have any of several properties that would be correct.  Even in this simple test case, each object has both a person property and an IQ property, and we want to find the person with the max IQ.  So, we create the anonymous function to help out.  The Underscore documentation for the _.max function states that the iterator function for _.max must take an argument representing the current object being iterated from the array of objects, and return the value of the property that _.max will use to cull the max value.

The last Underscore.js collection function that we will examine is the _.filter function.  It packs a lot of potential power, but requires the programmer to provide most of that power by providing an iterator function. You will find two example uses of the _.filter collection function below.  The first of these will be a simpler use case than the second.  Each of these two examples will be a little different from the previous examples in that an example is made up of two functions – one is a re-usable function and the other is the test driver for the function.  Let’s take a look now at the two functions that make up the first example use case for the _.filter function: simpleFilterLike and testSimpleFilterLike.

The code above produces the following output:

In the test function shown above, note first the signature for the _.filter function.  By now this pattern should look familiar.  In the scenario above, we are passing as the list argument a simple array of strings, containing phrases that will be searched.  We are also passing a non-optional iterator function that returns a result of true (per Underscore specification for _.filter) if a searched-for string is found within the currently iterated phrase.  We are not explicitly passing the searched-for string.  A “likeCriteria” object is created within the body of the function that invokes the _.filter function.  Within this likeCriteria object, a “searchFor” property is set, containing the value that we will search for within the phrases.  We could have passed this likeCriteria object as the context and shared its contents with the iterator function that way, but there is no need to do so as the scoping rules of JavaScript already make the likeCriteria object contents available to the iterator function.

What is happening inside this iterator function?  For each iteration, the _.filter function is giving us a phrase to search in and we are doing so using the contents of the searchFor property in the likeCriteria object.  The iterator function is “teaming up” with the Underscore _.filter function by returning boolean true if the searched-for string is in the phrase – effectively allowing the Underscore function to filter which of the array elements (i.e. which phrases) pass a truth test.  Those phrases that pass the truth test will be returned in a filtered array that contains only the right contents from the original array – where “right” means that the searched-for string was found in the phrase.  Now let’s look at the second scenario, in which two more functions are used in a similar manner to the ones just examined to call upon the Underscore _.filter function to help do a search.

The code above produces the following output:

The two functions shown above are not very different from the first two _.filter leveraging functions we just looked at; there are just a couple of twists.  The first is that instead of a simple array of strings representing phrases, we are passing here an array of objects containing phrases.  The second twist is that we are employing two searchFor criteria, which means that this version of the “likeCriteria” object contains an array property instead of a property holding a single string.  Otherwise, this pair of functions works the same as the previous pair.

The main reason for providing this second example of how to leverage the Underscore.js _.filter function is that this example is much closer to a real world use-case for a utility filter function that returns information based upon matching “like” criteria.  In fact, a more robust and capable version of the “filterLike” function forms the heart of the search implementation for the main Uberiquity site.  You can test the Uberiquity search feature out if you like, by clicking on this Uberiquity Search link. If you were to go there, you could search for this very article by using the following search:  underscore AND search
