Working with data in immutable style
Remarks#
Value and variable names should be in lower camel case
Constant names should be in upper camel case. That is, if the member is final, immutable and it belongs to a package object or an object, it may be considered a constant
Method, Value and variable names should be in lower camel case
Source: https://docs.scala-lang.org/style/naming-conventions.html
This compile:
val (a,b) = (1,2)
// a: Int = 1
// b: Int = 2but this doesn’t:
val (A,B) = (1,2)
// error: not found: value A
// error: not found: value BIt is not just val vs. var
val and var
scala> val a = 123
a: Int = 123
scala> a = 456
<console>:8: error: reassignment to val
a = 456
scala> var b = 123
b: Int = 123
scala> b = 321
b: Int = 321valreferences are unchangeable: like afinalvariable inJava, once it has been initialized you cannot change itvarreferences are reassignable as a simple variable declaration in Java
Immutable and Mutable collections
val mut = scala.collection.mutable.Map.empty[String, Int]
mut += ("123" -> 123)
mut += ("456" -> 456)
mut += ("789" -> 789)
val imm = scala.collection.immutable.Map.empty[String, Int]
imm + ("123" -> 123)
imm + ("456" -> 456)
imm + ("789" -> 789)
scala> mut
Map(123 -> 123, 456 -> 456, 789 -> 789)
scala> imm
Map()
scala> imm + ("123" -> 123) + ("456" -> 456) + ("789" -> 789)
Map(123 -> 123, 456 -> 456, 789 -> 789)The Scala standard library offers both immutable and mutable data structures, not the reference to it. Each time an immutable data structure get “modified”, a new instance is produced instead of modifying the original collection in-place. Each instance of the collection may share significant structure with another instance.
Mutable and Immutable Collection (Official Scala Documentation)
But I can’t use immutability in this case!
Let’s pick as an example a function that takes 2 Map and return a Map containing every element in ma and mb:
def merge2Maps(ma: Map[String, Int], mb: Map[String, Int]): Map[String, Int]A first attempt could be iterating through the elements of one of the maps using for ((k, v) <- map) and somehow return the merged map.
def merge2Maps(ma: ..., mb: ...): Map[String, Int] = {
for ((k, v) <- mb) {
???
}
}This very first move immediately add a constrain: a mutation outside that for is now needed. This is more clear when de-sugaring the for:
// this:
for ((k, v) <- map) { ??? }
// is equivalent to:
map.foreach { case (k, v) => ??? }“Why we have to mutate?”
foreach relies on side-effects. Every time we want something to happen within a foreach we need to “side-effect something”, in this case we could mutate a variable var result or
we can use a mutable data structure.
Creating and filling the result map
Let’s assume the ma and mb are scala.collection.immutable.Map, we could create the result Map from ma:
val result = mutable.Map() ++ maThen iterate through mb adding its elements and if the key of the current element on ma already exist, let’s override it with the mb one.
mb.foreach { case (k, v) => result += (k -> v) }Mutable implementation
So far so good, we “had to use mutable collections” and a correct implementation could be:
def merge2Maps(ma: Map[String, Int], mb: Map[String, Int]): Map[String, Int] = {
val result = scala.collection.mutable.Map() ++ ma
mb.foreach { case (k, v) => result += (k -> v) }
result.toMap // to get back an immutable Map
}As expected:
scala> merge2Maps(Map("a" -> 11, "b" -> 12), Map("b" -> 22, "c" -> 23))
Map(a -> 11, b -> 22, c -> 23)Folding to the rescue
How can we get rid of foreach in this scenario? If all we what to do is basically iterate over the collection elements and apply a function while accumulating the result on option could be using .foldLeft:
def merge2Maps(ma: Map[String, Int], mb: Map[String, Int]): Map[String, Int] = {
mb.foldLeft(ma) { case (result, (k, v)) => result + (k -> v) }
// or more concisely mb.foldLeft(ma) { _ + _ }
}In this case our “result” is the accumulated value starting from ma, the zero of the .foldLeft.
Intermediate result
Obviously this immutable solution is producing and destroying many Map instances while folding, but it is worth mentioning that those instances are not a full clone of the Map accumulated but instead are sharing significant structure (data) with the existing instance.
Easier reasonability
It is easier to reason about the semantic if it is more declarative as the .foldLeft approach. Using immutable data structures could help making our implementation easier to reason on.