Functional data structures
1. Introduction Link to heading
1.1 What is a Functional Data Structure? Link to heading
First, let’s start with the definition of Data Structure. When we start learning programming in any high-level language we learn about data structures. A Data Structure is a component that allows us to store and manipulate certain data in a specific and efficient way.
In Scala, for example, Array
, List
, Map
, Set
are data structures that allow us to store and manipulate data (elements) in a certain way. What data structure choose? it depends on the problem we want to solve. Do we require: Fast access?, Fast storing?, Allow duplicate elements?, etc.
A Functional Data Structure is implemented by pure functions. A pure function is a function that must not perform any side effect. A Functional Data Structure is immutable by definition.
Note: You can copy/paste the code and run it in the Scala REPL. The code is written in Scala 3 new syntax but we are not using any new Scala 3 feature.
1.2 How does a Functional Data Structure work? Link to heading
A Functional Data Structure is immutable. Every time we add or remove elements, a new instance of the data structure is returned. This is called data sharing. This new instance has only references to the original data. No new data is created. If we apply another operation to the data structure, a new instance is returned with pointers to the original data. In Functional Data Structures, existing references are never changed by operations on the data structure.
2. Linked list as a functional data structure Link to heading
Let’s create a functional data structure FList
. FList
is a single linked list implemented as a functional data structure.
Let’s define our new data structure:
|
|
The trait FList[+A]
represents the new type of our data structure. So eventually, we will be able to do something like this:
|
|
Then, we define two constructors for our data structure: Nil
and Node[A]
. Note that the type definition of our data structure is parameterized with a +A
parameter. This +A
parameter determines that our FList
is covariant. We have to make it covariant because our list will always point to Nil
at the end, and Nil
is of type FList[Nothing]
and Nothing
is a subtype of all types in Scala. So, covariant variance allows us to use the same definition to consider FList[Nothing]
a subtype of FList[A]
.
For example, we can declare an empty FList
of String
this way:
|
|
Because Nil
is FList[Nothing]
which is a subtype of FList[String]
. And then, we can re-assign myFList
to a new FList[String]
as follows:
|
|
This is the reason why we needed to make it covariant. A linked list always points to Nil
in the last element or is Nil
if is an empty list.
2.1 Factory method Link to heading
Now, let’s create a factory method to simplify the way we create instances of our functional data structure FList
. In Scala, we have companion objects for this purpose.
|
|
Note: If you aren’t so familiar with companion objects and apply
method I suggest you take a look at the Scala documentation.
It’s time to create our first empty FList:
|
|
And now, let’s create a list of strings FList[String]
:
|
|
We also can create a list of integers FList[Int]
:
|
|
There are two fundamental operations over lists which are head and tail. Let’s implement them in our functional data structure.
2.2 Head Link to heading
Head operation returns the first element of a list:
|
|
The head
function is implemented as a member of our FList
companion object. It receives the list and returns the first element of the list using the pattern matching technique, if the list matches the case (the list has a Node
), then the head
is returned, otherwise it throws a NoSuchElementException
if the given list is empty.
Let’s test it.
|
|
As you can see, the head element "a"
is returned (line 4
) but myFList
remains immutable (line 7
).
Let’s test the case when an empty list is provided.
|
|
2.3 Tail Link to heading
Tail returns the rest of the list without the first element (head
):
|
|
The tail
function is also implemented as a member of our FList
companion object. It receives the list and returns the rest of the elements without the first element (head
) using pattern matching technique, if the list matches the case (the list has a Node
) then the tail
is returned, otherwise it throws an UnsupportedOperationException
if the given list is empty.
Let’s test it.
|
|
As you can see, the tail element FList("b", "c")
is returned (line 4
) but myFList
remains immutable (line 7
).
Let’s test the case when a single element list is provided:
|
|
Remember that a linked list always points to Nil
at the end. So, the tail of a single element list is Nil
or in other words, an empty list FList()
.
Now, let’s try the case when an empty list is provided:
|
|
It works as expected, you can’t get the tail of an empty list.
2.4 Adding elements (prepend) Link to heading
Let’s implement a function to insert an element as the first element of a list (prepend). This operation is easy to implement and it takes constant runtime complexity.
Note: If we would want to insert the element after the current last element, then we would need to implement something more complex that runs on O(n) because we would need to iterate all the elements of the list.
I will name this function setHead
just to simplify naming. setHead
will basically insert a new element at the beginning of the list (list’s head).
|
|
We are using pattern matching again. If the list is not empty, then we just return a new Node
(list) where the head is the given element and the tail is the given list, otherwise, we return the given element pointing to Nil
, in other words, a single element list.
Let’s start with an empty list case and add a first element "b"
and then add another first element "a"
.
|
|
As you can see, elements are added at the beginning of the list. Also, check that mySingleFList
remains immutable (line 10
).
2.5 Dropping elements Link to heading
Our functional data structure FList
is almost complete in terms of its basic operations. Now, we are going to implement a function to remove elements from the list.
In this case, I want to drop all the elements (starting from the beginning) while a provided given function A => Boolean
is true. All the initial elements that evaluate true
for the given function will be removed from the list (dropWhile
).
Let’s implement the dropWhile
function:
|
|
In this case, we are taking two parameters:
flist
: The list.func
: The function of typeA => Boolean
that will be applied to each element starting from the beginning and if it evaluates totrue
then the element will be removed from the list, otherwise we stop and return the remaining elements.
Again, with pattern matching, we evaluate the list. If there’s an element (case Node(head, tail)
) then we apply the function func
to the current element (head
), if the result of func
is true
then we drop the current element (head
) and call dropWhile
recursively with the tail as the new list. If the condition continues evaluating as true
, then we continue removing elements, otherwise, the list is returned.
Let’s test our dropWhile
function:
|
|
In line 1
I’m creating a list of companies FList[String]
and all the elements are defined in a way that they are sorted by their name.
In line 4
I’m calling the dropWhile
function and passing the function company => company.charAt(0) < 'm'
of type String => Boolean
which takes the first character of each company name and evaluates if the character is less than 'm'
. If it evaluates to true
, then the company will be removed. As you can see in line 5
the result of this operation is FList("microsoft", "oracle")
.
In line 7
I’m evaluating companies
in the REPL just to verify that the given list companies
remain immutable.
3. Conclusion Link to heading
In this post, we explored another powerful technique in the Functional Programming paradigm: the Functional Data Structures.
Functional Data Structures allow us to create new data structures that are immutable by design. We also implemented a linked list FList
in terms of a functional data structure. We defined its interface as a new type and implemented some basic operations with pure functions applying other techniques such as pattern matching and recursion.
A powerful aspect of Scala is that it brings Functional Programming and Object-Oriented Programming paradigms together, this allows us to provide a more object-oriented design to our FList
data structure to encapsulate the code in a better way. In my next post, I’ll talk about it and how you can use our FList
in the same way that the Scala collections are built, so you should be able to do something like this val myFList = FList("a", "b", "c")
and get the tail in this way myFList.tail
.
4. Code Link to heading
This is the full code I used. It includes other operations and tests that I didn’t explain in this post.
4.1 FList implementation Link to heading
FList.scala
|
|
4.2 FList unit test Link to heading
TestFList.scala
|
|