Jan 03, 2017

Mastering Scala: Understanding map and flatMap

Micah Jones

The Scala language excels at manipulating large, complex data structures in a multi-threaded environment. However, learning how to effectively perform such tasks requires a strong working knowledge of some of Scala’s most useful tools, including case classes, collections, pattern matching, Options, and Futures.

When working with Scala data structures, we frequently find ourselves using functional combinators, particularly `map` (not to be confused with the `Map` collection type) and its close cousin `flatMap`. The `map` method is most closely associated with collections and is a member of the `Traversable` trait that is implemented by Scala’s collection classes (as of Scala 2.13, `Traversable` is deprecated in favor of `Iterable`). It works similarly to mapping operations in other languages like JavaScript, allowing us to iterate over a collection and translate it to some new collection. Scala’s `map` method is exceptionally powerful, and types other than collections define it as well, which makes it useful in situations that aren’t immediately obvious.

In this blog post we will explore some uses of `map` and `flatMap` in three contexts: collections, Options, and Futures.

Collections

The `map` method is most commonly used with collections. We typically use it to iterate over a list, performing an operation on each element and adding the result to a new list. For example, to double each element of a list of numbers, we could write the following:

scala> List(1,2,3).map { x => x*2 }
List[Int] = List(2, 4, 6)

Or, using the underscore `_` shorthand:

scala> List(1,2,3).map(_*2)
List[Int] = List(2, 4, 6)

Collection types other than `List` behave similarly, passing each of their elements to the mapping function:

scala> Array(1,2,3).map(_*2)
Array[Int] = Array(2, 4, 6)

scala> Set(1,2,2,3).map(_*2)
Set[Int] = Set(2, 4, 6)

scala> (0 until 5).map(_*2)
IndexedSeq[Int] = Vector(0, 2, 4, 6, 8)

The `Map` collection also has a `map` method, which passes each key-value pair to the mapping function as a tuple:

scala> Map("key1" -> 1, "key2" -> 2).map { keyValue:(String,Int) =>
         keyValue match { case (key, value) => (key, value*2) }
       }
Map[String,Int] = Map(key1 -> 2, key2 -> 4)

Since our anonymous function just forwards its input to a pattern match, Scala lets us write only the `case` clause with its tuple extractor and treats the rest as implicit:

scala> Map("key1" -> 1, "key2" -> 2).map {
         case (key, value) => (key, value*2)
       }
Map[String,Int] = Map(key1 -> 2, key2 -> 4)

Although the previous example maps to another `Map`, we can also map to other collection types:

scala> Map("key1" -> 1, "key2" -> 2).map {
         case (key, value) => value * 2
       }
Iterable[Int] = List(2, 4)

scala> Map("key1" -> 1, "key2" -> 2).map {
         case (key, value) => value * 2
       }.toSet
Set[Int] = Set(2, 4)

The `String` data type can be treated as a collection of characters, allowing us to `map` it:

scala> "Hello".map { _.toUpper }
String = HELLO

Scala collections also support `flatten`, which is usually used to eliminate undesired collection nesting:

scala> List(List(1,2,3),List(4,5,6)).flatten
List[Int] = List(1, 2, 3, 4, 5, 6)

The `flatMap` method acts as a shorthand to map a collection and then immediately flatten it. This particular combination of methods is quite powerful. For example, we can use `flatMap` to generate a collection that is either larger or smaller than the original input:

scala> List(1,4,9).flatMap { x => List(x,x+1) }
List[Int] = List(1, 2, 4, 5, 9, 10)

scala> List(1,4,9).flatMap { x => if (x > 5) List() else List(x) }
List[Int] = List(1, 4)
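
To make the "map, then flatten" description concrete, here is a minimal sketch comparing the two routes side by side. The object name `FlatMapSketch` and the helper `withSuccessor` are hypothetical names chosen for this illustration; only `map`, `flatten`, and `flatMap` come from the standard library.

```scala
object FlatMapSketch {
  // Hypothetical helper for illustration: pairs each number with its successor.
  def withSuccessor(x: Int): List[Int] = List(x, x + 1)

  val input: List[Int] = List(1, 4, 9)

  // map alone leaves one inner list per input element.
  val nested: List[List[Int]] = input.map(x => withSuccessor(x))

  // Flattening that result removes one level of nesting...
  val mappedThenFlattened: List[Int] = nested.flatten

  // ...which is what flatMap does in a single call.
  val flatMapped: List[Int] = input.flatMap(x => withSuccessor(x))
}
```

Both `mappedThenFlattened` and `flatMapped` should hold the same elements in the same order; the `flatMap` version simply skips the intermediate nested list.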

The true power of `map` and `flatMap` becomes much more apparent when we look at Options and Futures.

Options

Although Scala Options are not collections, they do support `map` and `flatMap`. When mapping an Option with an inner value (e.g., `Some(1)`), the mapping function acts on that value. However, when mapping an Option without a value (`None`), the mapping function is never called and `map` simply returns `None`.

For example, suppose we have an optional variable `cost` and we want to add a value `fee` to it. In my early days as a Scala programmer, I would have solved that problem with `isDefined` checks:

scala> val fee = 1.25
scala> val cost = Some(4.50)
scala> val finalCost =
         if (cost.isDefined) Some(cost.get+fee) else None
finalCost: Option[Double] = Some(5.75)

This solution works, but it lacks elegance. We can clean it up with a `map`:

scala> val finalCost = cost.map(_+fee)
finalCost: Option[Double] = Some(5.75)

If `cost` did not have a value, the mapping function would never execute, and we would get `None` as a result instead:

scala> val cost:Option[Double] = None
scala> val finalCost = cost.map(_+fee)
finalCost: Option[Double] = None

In an Option context, `flatten` will eliminate nested Options:

scala> Some(Some(1)).flatten
Option[Int] = Some(1)

scala> Some(None).flatten
Option[Nothing] = None

scala> None.flatten
Option[Nothing] = None

A `flatMap` can be useful when we have optional results in a mapping function we’re already applying to an Option:

scala> val cost = Some(4.50)
scala> cost.flatMap { x => if (x < 1.00) None else Some(x+fee) }
Option[Double] = Some(5.75)

scala> val cost = Some(0.50)
scala> cost.flatMap { x => if (x < 1.00) None else Some(x+fee) }
Option[Double] = None
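
It may help to see why `flatMap` is the right tool here rather than `map`. The sketch below pulls the mapping function out into a hypothetical helper named `addFee` (our own name, as is the enclosing `OptionFlatMapSketch` object) and compares the result types.

```scala
object OptionFlatMapSketch {
  val fee: Double = 1.25

  // Hypothetical helper: adds the fee only when the cost is at least 1.00.
  def addFee(x: Double): Option[Double] =
    if (x < 1.00) None else Some(x + fee)

  val cost: Option[Double] = Some(4.50)

  // map wraps the helper's Option in another Option.
  val nested: Option[Option[Double]] = cost.map(x => addFee(x))

  // flatten removes the extra layer...
  val viaFlatten: Option[Double] = nested.flatten

  // ...and flatMap produces the flat result directly.
  val viaFlatMap: Option[Double] = cost.flatMap(x => addFee(x))
}
```

With `map`, every caller would have to unwrap two layers of Option; `flatMap` keeps the result a plain `Option[Double]`.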

When combining Options with collections, `flatten` and `flatMap` allow us to manipulate complex results into something much more manageable. For example, we can flatten a List of Options to eliminate `None`s and reduce `Some`s to their inner values:

scala> List(Some(1),Some(2),None,Some(4),None).flatten
List[Int] = List(1, 2, 4)

In a situation where we are optionally mapping elements in a collection, we can use `flatMap`:

scala> List(1,2,3,4,5).flatMap { x =>
         if (x <= 3) Some(x*2) else None
       }
List[Int] = List(2, 4, 6)
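
A common place this pattern shows up is cleaning input that may not parse. The following is a small sketch under our own assumptions: `ParseSketch` and `parseInt` are hypothetical names, and `Try(...).toOption` is just one way to turn a failed conversion into `None`.

```scala
import scala.util.Try

object ParseSketch {
  // Hypothetical helper: Some(n) when the text parses as an Int, None otherwise.
  def parseInt(s: String): Option[Int] = Try(s.trim.toInt).toOption

  val raw: List[String] = List("1", "two", "3", " 4 ", "")

  // map keeps one Option per input, including the failures.
  val parsed: List[Option[Int]] = raw.map(s => parseInt(s))

  // flatMap drops the Nones and unwraps the Somes in one pass.
  val numbers: List[Int] = raw.flatMap(s => parseInt(s))
}
```

Here `numbers` contains only the values that parsed, in their original order, so later code never has to deal with the failures.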

Futures

Futures are the standard way to run work asynchronously in Scala. Typically, we handle a Future’s completion by attaching a callback such as `onComplete` to process its result.

In our projects, we frequently need to chain multiple callbacks together. For example, a REST endpoint on a server might call a method that returns a Future, and that method in turn calls another Future with its own callback attached. In this situation, `map` and `flatMap` become indispensable tools.

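In this example we’ll use a simple method that just adds two to a number asynchronously, on a thread supplied by the implicit `ExecutionContext`, and returns the result as a `Future[Int]`:

scala> import scala.concurrent.Future
scala> import scala.concurrent.ExecutionContext.Implicits.global
scala> def addTwo(n:Int):Future[Int] = Future { n + 2 }
addTwo: (n: Int)scala.concurrent.Future[Int]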

Suppose we have another method `addTwoAndDouble` which returns a Future containing the result of adding two to a number and then doubling it. We will use `map` as a chaining mechanism to take the Future provided by a call to `addTwo`, double its results in a callback, and then return the result of that doubling operation as another Future.
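
The method described above might look like the following sketch. It repeats the `addTwo` definition so the block stands alone. The `FutureMapSketch` object, the `demo` helper, and the five-second timeout are our own assumptions, and blocking with `Await` is for demonstration only.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object FutureMapSketch {
  def addTwo(n: Int): Future[Int] = Future { n + 2 }

  // map registers the doubling as a callback that runs once addTwo's
  // Future completes successfully, and yields the doubled value as a new Future.
  def addTwoAndDouble(n: Int): Future[Int] = addTwo(n).map(x => x * 2)

  // Hypothetical helper: blocks for the result so we can inspect it.
  def demo(): Int = Await.result(addTwoAndDouble(4), 5.seconds) // (4 + 2) * 2 = 12
}
```

Note that `addTwoAndDouble` returns immediately with a `Future[Int]`; the doubling itself only happens later, when the inner Future has a value.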

At times we will encounter a situation in which we need to submit the results of one Future operation as input to another Future operation. In that case our result type will be a nested Future (e.g., `Future[Future[Int]]`). Fortunately, we can `flatten` Futures to eliminate that nesting.
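
As a rough illustration of that nesting, the sketch below maps with a function that itself returns a Future and then flattens the result. It assumes Scala 2.12 or later, where `Future` has a `flatten` method. The names `FutureFlattenSketch`, `nested`, `flattened`, and `demo` are hypothetical.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object FutureFlattenSketch {
  def addTwo(n: Int): Future[Int] = Future { n + 2 }

  // Mapping with a function that returns a Future nests the result.
  def nested(n: Int): Future[Future[Int]] = addTwo(n).map(x => addTwo(x))

  // flatten removes one level of nesting.
  def flattened(n: Int): Future[Int] = nested(n).flatten

  // Hypothetical helper: blocks for the result, for demonstration only.
  def demo(): Int = Await.result(flattened(1), 5.seconds) // 1 + 2 + 2 = 5
}
```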

Suppose we want to call `addTwo` and then pass its result to a second call to `addTwo`. We can use `flatMap` to chain the two Futures together and return the result as a new, non-nested Future.
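
One way to write that chain is sketched below. The method name `addTwoTwice`, the `FutureFlatMapSketch` object, and the `demo` helper are our own inventions for this illustration, and `addTwo` is repeated so the block is self-contained.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object FutureFlatMapSketch {
  def addTwo(n: Int): Future[Int] = Future { n + 2 }

  // The callback returns a Future itself, so flatMap is used to keep
  // the overall result a plain Future[Int] instead of Future[Future[Int]].
  def addTwoTwice(n: Int): Future[Int] = addTwo(n).flatMap(x => addTwo(x))

  // Hypothetical helper: blocks for the result, for demonstration only.
  def demo(): Int = Await.result(addTwoTwice(1), 5.seconds) // 1 + 2 + 2 = 5
}
```

The second call to `addTwo` does not start until the first Future has produced its value, since it needs that value as input.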

As a rule of thumb when mapping Futures, use `map` when processing the results of a single Future, and use `flatMap` when you’re nesting another Future in your callback.

Conclusion

Scala’s mapping capabilities are far more powerful and versatile than they might first appear. They are tremendously useful in writing concise, elegant code that follows the functional paradigm of immutability. Learning how to effectively use `map` and `flatMap` will take you a long way toward becoming an expert Scala developer.

Micah Jones is a consultant in Credera’s Integration & Data Services (IDS) practice. In 2011, he received his doctorate in computer science from the University of Texas at Dallas, specializing in programming language-based security.