Reduce vs Fold in Scala

For better or worse in functional programming there exists fuzzy concrete notions on what it means to fold over some input versus what it means to reduce over some input. The concrete implementations vary from language to language and even between some libraries. Yet, the underlying fundemental concept tends to hold true, that you are taking some value of N and performing aggregation operations on it such that the end result is typically some value of <= N.

Even in haskell the fold family of functions is overloaded in the sense that it is capable of doing both what scala calls reducing and folding. For example, take the following haskell function that takes a list of integers and produces a sum value in terms of folding:

sumList :: [Int] -> Int  
sumList = foldl (+) 0  

In haskell one might also implement the filterList combinator in terms of folding. The definition of which would resemble the following example:

filterList :: (a -> bool) -> [a] -> [a]  
filterList p = foldl (\ m n -> if (p n) then m ++ [n] else m) []  

While the fold derived operations of sumList and filterList both take a value of N and perform aggregegate operations on it such that the end result is a a value of <= N, their type signatures tell a different story.

sumList takes a list of integers and produces a single integer while filterList takes a list of integers and produces a subset of that list. Putting their type signatures next to each other highlights this important difference.

sumList :: [Int] -> Int  
filterList :: (a -> bool) -> [a] -> [a]  

This is all possible because foldl's type signature is really generic in the sense that all it cares about is that the type of the final value is the same type of the initial value. It's type signature is as follows:

foldl :: (a -> b -> a) -> a -> [b] -> a  

Now this all becomes important when we take the type signatures for scala's implementations of folding and reducing into consideration. Both of which are as follows:

def foldLeft [B] (z: B)(f: (B, A) => B): B  
def reduceLeft [B >: A] (f: (B, A) => B): B  

A quick glance reveals some of the following differences:

  • foldLeft has a paramaterized type of B while reduceLeft has a paramaterized type of B >: A
  • foldLeft is a curried function where the first curried group takes the folds initial value and that reduceLeft is non curried and doesn't take an initial value at all
  • In foldLeft the accumulator and the final return type (B) must be related to the item type (A).

So if reduceLeft doesn't take an initial value as an argument, then what is it's initial value? The initial value of reduceLeft is the initial value of the collection. So if you have a List[Int] and you reduceLeft over it, you will only ever be able to produce an Int (or a supertype of Int) from that operation. However, foldLeft is capable of producing any type you can pass into the initial value.

So we would be able to easily convert our sumList implementation from haskell to scala using reduceLeft like so:

def sum(inputs: List[Int]): Int = inputs.reduceLeft (_+_)  

It's impossible for us to implement filterList in terms of reduceLeft. However, we can easily implement filterList in terms of foldLeft like so:

def filter(inputs: List[Int], p: (Int) => Boolean) = {  
  inputs.foldLeft(List[Int]()) { (m, n) =>
    p(n) match {
      case true  => m ++ List(n)
      case false => m
    }
  }
}

Some interesting things about what can be represented in terms of reduceLeft and foldLeft outside of the type system are related to wether the parameter type for the iterable is a monoid or a semi-group. If the type parameter of your iterable is a monoid, meaning it has a zero value and an append operation, you can implement reduceLeft in terms of foldLeft. But if your type parameter is a semi-group, meaning it has an append operation but no zero value then there are a subset of reduceLeft operations that can not be implemented in terms of foldLeft.