Lambda expressions and C++11

March 11, 2015

According to Wikipedia, a lambda expression (lambda) is “a function that is not bound to an identifier.” In other words, it’s a function that you can pass around without having to name (define) it separately. These are ubiquitous in functional programming languages like Haskell and F#. Why?

Well, consider the corollary in imperative languages. Consider this snippet:

auto result = answer_to_life(6, 7, "What's the question?", true);

All of the arguments passed to the function answer_to_life are “not bound to an identifier.” If we were required to bound all of these parameters to an identifier, we would be stuck with this mess:

int number1 = 6, number2 = 7;
char *question = "What's the question?";
bool deepThought = true;
auto result = answer_to_life(number1, number2, question, deepThought);

Yuck.

“Well,” you might ask, “why should I care? I almost never pass functions as parameters.”

Maybe you should be.

Adapting functional patterns

Other top-tier languages like Java 8 and C# (.NET 3.5) have promoted functions to first-class citizens for good reason. For some problems, it is much more elegant to pass functions around rather than data.

Let’s build a helper class that works on lists of words. First, let’s take a vector of std::strings and return a vector of their corresponding lengths. Check out this unit test a la Microsoft::VisualStudio::CppUnitTestFramework:

TEST_METHOD(Counts)
{
	WordListHelper helper{};
	std::vector <std::string> input { "o", "tw", "thr" };
	auto lengths = helper.counts(input);

	Assert::AreEqual(1, lengths.at(0));
	Assert::AreEqual(2, lengths.at(1));
	Assert::AreEqual(3, lengths.at(2));
}

<aside>Aren’t tests a beautiful way of expressing functional requirements?</aside>

We’ll frame out our class:

#include <vector>
#include <string>
#include <functional>

class WordListHelper
{
public:
	WordListHelper();
	std::vector<int> counts(const std::vector<std::string> words);
}

And fill in the method just so it compiles:

std::vector<int> WordListHelper::counts(const std::vector<std::string> words)
{
	std::vector<int>{ 42 };
}

Run the test, make sure it fails, then let’s have a hack at an implementation of WordListHelper::counts:

std::vector<int> WordListHelper::counts(const std::vector<std::string> words)
{
	std::vector<int> out{};
	for each (auto word in words)
	{
		out.push_back(word.length());
	}
	return out;
}

This is probably pretty close to what you were thinking: loop through the input, get the length, pack it onto a new vector, and return when done. This is the way of the imperative programmer.

Let’s continue with some more functionality. We want to count the number of vowels in each word. Check out this test:

TEST_METHOD(Vowels)
{
	WordListHelper helper{};
	std::vector < std::string > input{ "aether", "io", "queueing" };
	auto lengths = helper.vowels(input);

	Assert::AreEqual(3, lengths.at(0));
	Assert::AreEqual(2, lengths.at(1));
	Assert::AreEqual(5, lengths.at(2));
}

The pattern needs to be a little different here: we’re going to have to loop through each string now (i.e. we have nested for-statements). Update the class, write the shell method, and make sure your old test passes (and the new one fails!):

class WordListHelper
{
public:
	WordListHelper();
	std::vector<int> counts(const std::vector<std::string> words);
	std::vector<int> vowels(const std::vector<std::string> words);
}

std::vector<int> WordListHelper::vowels(const std::vector<std::string> words)
{
	std::vector<int>{ 42 };
}

Here’s the most straightforward implementation I can think of:

std::vector<int> WordListHelper::vowels(const std::vector<std::string> words)
{
	std::vector<int> out{};
	for each (auto word in words)
	{
		int count{ 0 };
		for each (auto chr in word)
		{
			 count += chr == 'a' || chr == 'e' || chr == 'i' || chr == 'o' || chr == 'u';
		}
		out.push_back(count);
	}
	return out;
}

OK, so we’re starting to get some code smells here. Two observations come to mind. First, it’s not immediately obvious what our function is doing because we’ve nested two for-loops. We should strive to make our code short, crisp, and very simple. Second, we’re repeating ourselves here:

std::vector<int> out{};
for each (auto word in words)
{
	int some_int{ 0 };
	// ...
	out.push_back(some_int);
}
return out;

This pattern is going to keep coming up every time we come up with a new word statistic. “Yeah,” you may be thinking, “but its a for-loop, how are you going to abstract THAT out?”

As a little motivation for some soul searching, let’s add one more method that counts consonants:

TEST_METHOD(Consonants)
{
	WordListHelper helper{};
	std::vector < std::string > input{ "Hampsthwaite", "rythms", "strudel" };
	auto lengths = helper.consonants(input);

	Assert::AreEqual(8, lengths.at(0));
	Assert::AreEqual(6, lengths.at(1));
	Assert::AreEqual(5, lengths.at(2));
}

You might be tempted to copy and paste the vowels method and insert a bang (!):

std::vector<int> WordListHelper::consonants(const std::vector<std::string> words)
{
	std::vector<int> out{};
	for each (auto word in words)
	{
		int count{ 0 };
		for each (auto chr in word)
		{
			count += !(chr == 'a' || chr == 'e' || chr == 'i' || chr == 'o' || chr == 'u');
		}
		out.push_back(count);
	}
	return out;
}

Or even hack this mess together:

std::vector<int> WordListHelper::consonants(const std::vector<std::string> words)
{
	auto lengths = counts(words);
	auto vowelCounts = vowels(words);
	std::vector<int> out{};
	for (int index = 0; index < words.size(); index++) {
		out.push_back(lengths[index] - vowelCounts[index]);
	}
	return out;
}

That’s enough. Let’s spice this class up with some functional programming.

Maps, functors, and predicates, oh my!

Alright, we’ve already got our tests and implementations. In the spirit of the test driven development mantra “red, green, refactor!” Let’s do some refactoring.

We noticed that we are writing out the same loop-and-accumulate pattern over and over again. Let’s consolidate it all into a map function:

std::vector<int> WordListHelper::map_word(const std::vector<std::string> words,
		std::function<int(std::string)> functor) {
	std::vector<int> out{};
	for each (auto word in words)
	{
		out.push_back(functor(word));
	}
	return out;
}

We just passed a function into a function. This allows us to write the logic that loops and accumulates the map once, and focus on what our functor is doing.

Our WordListHelper::counts has a one-liner implementation (thanks to lambdas):

std::vector<int> WordListHelper::counts(const std::vector<std::string> words)
{
	return this->map_word(words, [](std::string word) { return word.length(); });
}

Our map_word takes a std::function<int(std::string)> functor. We could have taken the time to write this function out in the usual way:

int boring_functor(std::string word)
{
	return word.length();
}

But the lambda is extremely compact and expressive. The syntax is pretty easy once you get the hang of it. Our lambda,

[](std::string word) { return word.length(); }

contains four important components:

  • [] is the capture-list. Basically, a capture is anything that needs to be dragged along with the lambda when it eventually gets invoked. We’ll see an example of this shortly.
  • (std::string word) is the parameter list, which is just like any other function’s parameter list. It contains pairs of types and names.
  • { return word.length(); } is the body of the function.
  • Implicit in this definition is the return type. This can be determined by the return of the function body in most cases.

Mapping characters

Now let’s have a crack at building a map for the characters within each word. This will be very useful when we refactor our vowels and consonants functions:

class WordListHelper
{
public:
	WordListHelper();
	std::vector<int> counts(const std::vector<std::string> words);
	std::vector<int> vowels(const std::vector<std::string> words);
	std::vector<int> consonants(const std::vector<std::string> words);
private:
	std::vector<int> map_word(const std::vector<std::string> words,
			std::function<int(std::string)> functor);
	std::vector<int> map_char(const std::vector<std::string> words,
			std::function<bool(char)> predicate);
};

vowels and consonants functions are slight variations of each other. Consider the following map:

std::vector<int> WordListHelper::map_char(const std::vector<std::string> words,
		std::function<bool(char)> predicate) {
	auto char_count = [&](std::string word){
		int count{ 0 };
		for each (auto chr in word)
		{
			count += predicate(chr);
		}
		return count;
	};
	return this->map_word(words, char_count);
}

A predicate is a functor that returns a bool. Basically, we pass in an arbitrary predicate, count the number of hits (for a word!) and feed our map_word function. We still have no repeated code here–and it’s much clearer (after a bit of practice with functional programming) what’s going on here.

Notice that the capture-list is no longer empty. The special list [&] captures all automatic variables. We need to pull in the predicate argument here, and capturing is one easy way to make it work. There are plenty of other possibilities that offer some trade-offs (not within the scope of this introduction).

The payoff

So what do we get for abstracting away all of this iterating and accumulating? We hard-code a predicate is_vowel:

WordListHelper::WordListHelper() {
	this->is_vowel = [](char c) { return c == 'a' || c == 'e' || c == 'i' || c == 'o' || c == 'u';  };
}

Now check out our new implementations for consonant and vowels.

std::vector<int> WordListHelper::vowels(const std::vector<std::string> words)
{
	return this->map_char(words, this->is_vowel);
}

std::vector<int> WordListHelper::consonants_fn(const std::vector<std::string> words)
{
	return this->map_char(words, [&](char c) { return !is_vowel(c); });
}

The difference is huge.

Update: Thanks to Code Node (Twitter @meetingcpp) for pointing out that we can go even further with our refactor. Two std:: C++11 functions allow us to throw out both map_char and map_word:

  • std::count_if will do the dirty work of applying a predicate and counting how many times true is returned (this replaces map_char)
  • std::transform is a map function (this replaces map_word)

An example is available on github:

std::vector<int> WordListHelper::vowels_fn2(const std::vector<std::string> words)
{
	std::vector<int> result{};
	result.resize(words.size());
	std::transform(words.begin(), words.end(), result.begin(),
		[&](std::string word){
			return std::count_if(word.cbegin(), word.cend(), this->is_vowel);
		});
	return result;
}

For more information, see the cppreference on count_if and transform.

Conclusion

It is vastly easier to extend this class to many other kinds of word statistics by focusing on the functors rather than all the looping. All of the benefits of abstraction that we’re used to still apply here: since our loop-and-accumulate map functions are implemented in a single place (i.e. not copied and pasted all over the place with small variations), we could quite easily modify our class for lazy evaluation or parallelize.

The decision on whether to use the lambda syntax or the old-school function declaration/function pointer syntax is largely a style point. Lambdas are certainly some welcome syntactic sugar to C++11. The larger point is that they get us all thinking about functions as first-class citizens, and therefore as prime candidates for abstraction. The results can be stunning.

Miscellany