Working with data in immutable style
Remarks
Method, value and variable names should be in lower camel case.
Constant names should be in upper camel case. That is, if the member is final, immutable and it belongs to a package object or an object, it may be considered a constant.
Source: https://docs.scala-lang.org/style/naming-conventions.html
This compiles:
val (a,b) = (1,2)
// a: Int = 1
// b: Int = 2
but this doesn’t:
val (A,B) = (1,2)
// error: not found: value A
// error: not found: value B
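The failure comes from how Scala reads identifiers inside patterns: a name that starts with an uppercase letter is looked up as an existing stable value to compare against, while a lowercase name introduces a new binding. A minimal sketch of that rule, where `PatternNames`, `Expected` and `describe` are hypothetical names chosen for illustration:

```scala
object PatternNames {
  // Upper camel case: treated as a constant when it appears in a pattern.
  val Expected = 1

  def describe(n: Int): String = n match {
    case Expected => "matched the constant" // compares n against Expected
    case other    => s"bound to $other"     // lowercase name binds a new variable
  }
}
```

Calling `PatternNames.describe(1)` takes the first branch, while any other value is bound to `other`.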
It is not just val vs. var
val and var
scala> val a = 123
a: Int = 123
scala> a = 456
<console>:8: error: reassignment to val
a = 456
scala> var b = 123
b: Int = 123
scala> b = 321
b: Int = 321
val references are unchangeable: like a final variable in Java, once a val has been initialized you cannot change it.
var references are reassignable, like a simple variable declaration in Java.
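The reason it is "not just val vs. var" is that the keyword only controls the reference, not the object behind it. The sketch below (with the hypothetical names `ValVsVar`, `buffer`, `original` and `current`) shows a val pointing at a mutable collection and a var pointing at immutable ones:

```scala
import scala.collection.mutable.ListBuffer

object ValVsVar {
  // A val cannot be reassigned, yet the mutable buffer behind it can still change.
  val buffer: ListBuffer[Int] = ListBuffer(1, 2)
  buffer += 3

  // A var can be reassigned, yet each immutable List it points to never changes.
  val original: List[Int] = List(1, 2)
  var current: List[Int] = original
  current = current :+ 3 // builds a new List; original is untouched
}
```

After this runs, `buffer` contains three elements even though it is a val, and `original` still contains two even though `current` was reassigned.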
Immutable and Mutable collections
val mut = scala.collection.mutable.Map.empty[String, Int]
mut += ("123" -> 123)
mut += ("456" -> 456)
mut += ("789" -> 789)
val imm = scala.collection.immutable.Map.empty[String, Int]
imm + ("123" -> 123)
imm + ("456" -> 456)
imm + ("789" -> 789)
scala> mut
Map(123 -> 123, 456 -> 456, 789 -> 789)
scala> imm
Map()
scala> imm + ("123" -> 123) + ("456" -> 456) + ("789" -> 789)
Map(123 -> 123, 456 -> 456, 789 -> 789)
The Scala standard library offers both immutable and mutable data structures. Immutability here is a property of the data structure itself, not of the reference to it. Each time an immutable data structure gets “modified”, a new instance is produced instead of modifying the original collection in place. Each instance of the collection may share significant structure with another instance.
Mutable and Immutable Collection (Official Scala Documentation)
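Structural sharing is easiest to observe on an immutable List, where prepending an element reuses the existing list as the tail of the new one. This is only a small illustration of the idea, with the hypothetical names `SharingDemo`, `base` and `extended`; other collections share structure in their own ways:

```scala
object SharingDemo {
  val base: List[Int] = List(2, 3)

  // Prepending builds one new cell that points at the existing list.
  val extended: List[Int] = 1 :: base

  // eq compares references: the tail of the new list is the same object as base.
  val sharesTail: Boolean = extended.tail eq base
}
```

Here `base` is left unchanged, and `extended` did not need to copy its elements.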
But I can’t use immutability in this case!
As an example, let’s pick a function that takes two Maps and returns a Map containing every element of ma and mb:
def merge2Maps(ma: Map[String, Int], mb: Map[String, Int]): Map[String, Int]
A first attempt could be to iterate through the elements of one of the maps using for ((k, v) <- map) and somehow return the merged map.
def merge2Maps(ma: ..., mb: ...): Map[String, Int] = {
for ((k, v) <- mb) {
???
}
}
This very first move immediately adds a constraint: a mutation outside that for is now needed. This becomes clearer when de-sugaring the for:
// this:
for ((k, v) <- map) { ??? }
// is equivalent to:
map.foreach { case (k, v) => ??? }
“Why do we have to mutate?”
foreach relies on side effects. Every time we want something to happen within a foreach we need to “side-effect something”. In this case we could either mutate a variable (var result) or use a mutable data structure.
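The first option, reassigning a var that holds an immutable Map, could look like the sketch below. The names `MergeWithVar` and `merge2MapsWithVar` are hypothetical and only serve the illustration:

```scala
object MergeWithVar {
  def merge2MapsWithVar(ma: Map[String, Int], mb: Map[String, Int]): Map[String, Int] = {
    // The Map itself is immutable; only the reference is reassigned.
    var result = ma
    for ((k, v) <- mb) {
      // Side effect: replace result with a new Map that also holds (k -> v).
      result = result + (k -> v)
    }
    result
  }
}
```

The mutation is confined to a local variable, but the loop body still only works through a side effect.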
Creating and filling the result map
Let’s assume that ma and mb are scala.collection.immutable.Map. We could create the result Map from ma:
val result = scala.collection.mutable.Map[String, Int]() ++ ma
Then iterate through mb, adding its elements. If the key of the current element already exists in ma, let’s override it with the value from mb.
mb.foreach { case (k, v) => result += (k -> v) }
Mutable implementation
So far so good, we “had to use mutable collections” and a correct implementation could be:
def merge2Maps(ma: Map[String, Int], mb: Map[String, Int]): Map[String, Int] = {
  // explicit type parameters keep result typed as a mutable Map[String, Int]
  val result = scala.collection.mutable.Map[String, Int]() ++ ma
  mb.foreach { case (k, v) => result += (k -> v) }
  result.toMap // to get back an immutable Map
}
As expected:
scala> merge2Maps(Map("a" -> 11, "b" -> 12), Map("b" -> 22, "c" -> 23))
Map(a -> 11, b -> 22, c -> 23)
Folding to the rescue
How can we get rid of foreach in this scenario? If all we want to do is iterate over the collection elements and apply a function while accumulating the result, one option is to use .foldLeft:
def merge2Maps(ma: Map[String, Int], mb: Map[String, Int]): Map[String, Int] = {
  mb.foldLeft(ma) { case (result, (k, v)) => result + (k -> v) }
  // or more concisely mb.foldLeft(ma) { _ + _ }
}
In this case our “result” is the accumulated value, starting from ma, the zero of the .foldLeft.
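To see how the accumulator evolves, the fold can be traced with scanLeft, which applies the same function but keeps every intermediate value. This is a sketch for illustration only; `FoldTrace`, `steps` and `merged` are hypothetical names, and mb is converted to a List so that the trace has a plain List type:

```scala
object FoldTrace {
  val ma: Map[String, Int] = Map("a" -> 11, "b" -> 12)
  val mb: Map[String, Int] = Map("b" -> 22, "c" -> 23)

  // First element is the zero (ma); each later element adds one entry of mb.
  val steps: List[Map[String, Int]] = mb.toList.scanLeft(ma)(_ + _)

  // foldLeft keeps only the final accumulator, which equals the last step.
  val merged: Map[String, Int] = mb.foldLeft(ma)(_ + _)
}
```

The trace starts at ma and ends at the merged map, with one intermediate Map per element of mb.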
Intermediate result
Obviously this immutable solution produces and discards many Map instances while folding, but it is worth mentioning that those instances are not full clones of the accumulated Map. Instead, they share significant structure (data) with the existing instance.
Easier reasoning
It is easier to reason about the semantics of code when it is more declarative, as with the .foldLeft approach. Using immutable data structures can help make our implementation easier to reason about.
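One concrete benefit is that the inputs can be relied on after the call, because nothing could have modified them. The sketch below checks this for the fold-based version and, as an aside, compares the result with the standard library's ++ on immutable Maps, which also lets entries from the right-hand map win. `ReasoningDemo` and its value names are hypothetical:

```scala
object ReasoningDemo {
  def merge2Maps(ma: Map[String, Int], mb: Map[String, Int]): Map[String, Int] =
    mb.foldLeft(ma) { case (result, (k, v)) => result + (k -> v) }

  val ma: Map[String, Int] = Map("a" -> 11, "b" -> 12)
  val mb: Map[String, Int] = Map("b" -> 22, "c" -> 23)
  val merged: Map[String, Int] = merge2Maps(ma, mb)

  // Neither input was touched by the merge.
  val inputsUnchanged: Boolean =
    ma == Map("a" -> 11, "b" -> 12) && mb == Map("b" -> 22, "c" -> 23)

  // ++ on immutable Maps produces the same merged result.
  val sameAsConcat: Boolean = merged == (ma ++ mb)
}
```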