Consider the following 'imperative' code that prints all values of a list:
val list = List(1, 2, 3, 4)
for (elem <- list) {
println(elem)
}
We can also write this using a higher-order function (or operator) that is defined on List and use the
println
function as its argument:
list.foreach(println)
foreach
will apply this function to every element of the list.
The following 'imperative' code doubles all values of a list and collects the results in a new list:
var result: List[Int] = Nil
for (elem <- list) {
val doubled = elem * 2
result = result :+ doubled
}
result
The collections API has an operator for this, called
map
.
Like foreach
this takes a function as its argument.
list.map(elem => elem * 2) // or, shorter:
list.map(_ * 2)
map
will apply its argument to every element of the list and return a list of the transformed elements.
This is different from foreach
, which does not return anything!
To only select certain elements from a list we can use
filter
,
for example to only get the even numbers from a list:
list.filter(elem => elem % 2 == 0) // or, shorter:
list.filter(_ % 2 == 0)
filter
applies the functional argument (called predicate, as it returns a Boolean
) to each element of the list and returns
a new list with only those elements that satisfy the predicate (i.e. for which it returns true
).
Using basic operators like foreach
, map
and filter
we can create more complex behavior.
For example: "square all odd elements of a list, discarding the even ones and printing each element of the result".
List(1, 2, 3, 4, 5, 6, 7, 8, 9)
.filter(_ % 2 == 1)
.map(x => x * x)
.foreach(println)
Note that we can chain operators in this way because filter
and map
return a new List
themselves.
You cannot continue after foreach
, as this operator returns Unit
.
In the previous section we wrote the function count
which returned the number of characters in a
string that satisfy a predicate. This function does not transform the characters, but rather aggregates them into a single value.
def count(string: String, predicate: Char => Boolean): Int = {
var count = 0
for (c <- string) {
if (predicate(c))
count += 1
}
count
}
For this the collections API has defined a number of operators. Most notable here are foldLeft
and foldRight
. Which one to choose
depends entirely on what you're doing. foldLeft
aggregates the list from left to right, whereas foldRight
aggregates the list from
right to left. To do so (in both cases), these operators take two parameters:
- a seed value
- an aggregator function
The aggregator function is first applied to the seed value and the first (foldLeft
) or last (foldRight
) element of the list, then
the result of that is applied to the next value, etc. The final result is the result of the last application of the aggregator.
The following pictures, taken from Wikipedia, depict this process. Note that execution in both cases proceeds from the bottom to the top.
Typical seed-function combinations are:
description | seed | aggregator |
---|---|---|
the sum of a list of numbers | 0 |
+ |
subtracting all numbers from the seed | any number | - |
the product of a list of numbers | 1 |
* |
flatten a list | empty list | ++ (concat lists) |
turning a list of characters into a String | "" |
+ (String concat) |
counting the number of elements in a list | 0 |
+ 1 |
In code:
List(1, 2, 3, 4).foldLeft(0)(_ + _) // or List(1, 2, 3, 4).foldLeft(0)((i, sum) => i + sum)
List(1, 2, 3, 4).foldRight(0)(_ + _) // or List(1, 2, 3, 4).foldRight(0)((sum, i) => sum + i)
List(1, 2, 3, 4).foldLeft(100)(_ - _) // or List(1, 2, 3, 4).foldLeft(100)((i, rest) => i - rest)
List(1, 2, 3, 4).foldRight(100)(_ - _) // or List(1, 2, 3, 4).foldRight(100)((rest, i) => rest - i)
List(1, 2, 3, 4).foldLeft(1)(_ * _) // or List(1, 2, 3, 4).foldLeft(1)((i, prod) => i * prod)
List(1, 2, 3, 4).foldRight(1)(_ * _) // or List(1, 2, 3, 4).foldRight(1)((prod, i) => prod * i)
List(List(1, 2, 3), List(4, 5, 6), List(7, 8, 9)).foldLeft(List[Int]())(_ ++ _) // or List(List(1, 2, 3), List(4, 5, 6), List(7, 8, 9)).foldLeft(List[Int]())((acc, list) => acc ++ list)
List(List(1, 2, 3), List(4, 5, 6), List(7, 8, 9)).foldRight(List[Int]())(_ ++ _) // or List(List(1, 2, 3), List(4, 5, 6), List(7, 8, 9)).foldRight(List[Int]())((list, acc) => list ++ acc)
List('a', 'b', 'c', 'd').foldLeft("")(_ + _) // or List('a', 'b', 'c', 'd').foldLeft("")((str, c) => str + c)
List('a', 'b', 'c', 'd').foldRight("")(_ + _) // or List('a', 'b', 'c', 'd').foldRight("")((c, str) => c + str)
List('a', 'b', 'c', 'd').foldLeft(0)((count, _) => count + 1)
List('a', 'b', 'c', 'd').foldRight(0)((_, count) => count + 1)
The _ + _
is just an abbriviation of the longer version of (i, sum) => i + sum
. When using multiple underscores to access
function parameters without naming them, each underscore represents the next parameter in order.
Notice that the results of these expressions are the same, no matter the use of foldLeft
or foldRight
, except for subtraction.
This is because subtraction is not associative: x - y
is not equal to y - x
in general. To see what is going on, we can add
some println
statements to the code. Note the order in which the operations are done, starting from the left versus starting
from the right.
List(1, 2, 3, 4)
.foldLeft(20)((rest, i) => { // Notice the order of the pair components!
println("foldLeft:")
println(s" i = $i")
println(s" rest = $rest")
val res = rest - i // Notice the order of subtraction!
println(s" res = rest - i = $rest - $i = $res")
res
})
List(1, 2, 3, 4)
.foldRight(20)((i, rest) => { // Notice the order of the pair components!
println("foldRight:")
println(s" rest = $rest")
println(s" i = $i")
val res = i - rest // Notice the order of subtraction!
println(s" res = rest - i = $rest - $i = $res")
res
})
Some of the more commonly used aggregations are predefined in the API:
List(1, 2, 3, 4).sum // 10
List(1, 2, 3, 4).product // 24
List(3, 4, 1, 2).max // 4
List(3, 4, 1, 2).min // 1
List('a', 'b', 'c', 'd').mkString // "abcd"
List('a', 'b', 'c', 'd').mkString(", ") // "a, b, c, d"
List('a', 'b', 'c', 'd').mkString("[", ", ", "]") // "[a, b, c, d]"
List(List(1, 2, 3), List(4, 5, 6), List(7, 8, 9)).flatten // List(1, 2, 3, 4, 5, 6, 7, 8, 9)
foldLeft
and foldRight
both take two arguments: an aggregator function that transforms a subresult and
an element from the input list into a new subresult, and seed value that initiates the aggregation, serves
as the first subresult and is returned when the input list is empty.
def foldLeft [B](z: B)(f: (B, A) => B): B
def foldRight[B](z: B)(f: (A, B) => B): B
However, in some situations you do not have a seed value available but still want to aggregate. For this
you use a variant of fold
called reduce
. This operator only takes the aggregator function and uses
the first value of the input as the seed.
def reduceLeft[B >: A](f: (B, A) => B): B
def reduceRight[B >: A](op: (A, B) => B): B
List(1, 2, 3, 4).reduceLeft((b, a) => a + b) // or, shorter
List(1, 2, 3, 4).reduceLeft(_ + _)
(Note on the shorter version: when using multiple underscores to access function parameters without naming them, each underscore represents the next parameter in order, that is why you can only use them once!)
(Note that you are now more limited in the aggregator functions you can use!)
In case of an empty input list, these operators throw an exception, as they cannot return anything else.
To fix this, Scala has a variant on this, which returns an Option[B]
instead. If the input is empty,
a None
is returned, else the result is returned, wrapped in a Some
. We will discuss Option
s in the next section.
def reduceLeftOption[B >: A](op: (B, A) => B): Option[B]
def reduceRightOption[B >: A](op: (A, B) => B): Option[B]