Tree Structures - Haskell | Learning Cardano

In Haskell, tree structures are a powerful way to organize hierarchical data, enabling efficient storage, retrieval, and manipulation of elements. Among tree structures, binary trees and especially binary search trees (BSTs) are widely used for their efficiency in operations like searching, inserting, and deleting data. This article will explain the basics of tree structures in Haskell, focusing on binary trees and binary search trees, along with practical examples.

Table of Contents

What is a Tree Structure?
Defining a Binary Tree in Haskell
Binary Search Tree (BST)
Implementing a Binary Search Tree in Haskell
Traversing a Binary Tree
Real World Example
Explain Tree Structure Like I’m Five Years Old (ELI5)
Summary

What is a Tree Structure?

A tree is a data structure where each node has a value and references to child nodes. Trees are used for data that has a hierarchical structure, such as file directories, organizational charts, or sorted data.

A binary tree is a type of tree where each node has up to two children, often referred to as the “left” and “right” child. Binary trees can be structured in various ways depending on the application.

Defining a Binary Tree in Haskell

In Haskell, we can define a binary tree using an algebraic data type. Here’s a basic definition:

data Tree a = Empty
            | Node a (Tree a) (Tree a)
            deriving (Show, Eq)

In this definition:

Tree a is a binary tree that can store elements of type a.
Empty represents an empty tree (or a leaf with no children).
Node represents a node in the tree, which contains a value of type a and has two subtrees (left and right).

For example, we can create a simple binary tree like this:

exampleTree :: Tree Int
exampleTree = Node 5
                (Node 3 (Node 1 Empty Empty) (Node 4 Empty Empty))
                (Node 8 (Node 6 Empty Empty) (Node 10 Empty Empty))

This tree structure represents the following layout:

Binary Search Tree (BST)

A binary search tree (BST) is a type of binary tree with a special property: for each node, the values in its left subtree are less than the node’s value, and the values in its right subtree are greater than the node’s value. This property allows for efficient searching, as you can decide which direction to follow at each node based on the comparison with the target value.

Implementing a Binary Search Tree in Haskell

Inserting Elements

In a binary search tree, insertion involves placing the new element in the correct position to maintain the BST property. Here’s how we can implement insertion:

insert :: (Ord a) => a -> Tree a -> Tree a
insert x Empty = Node x Empty Empty
insert x (Node a left right)
    | x == a = Node a left right  -- No duplicates
    | x < a  = Node a (insert x left) right
    | x > a  = Node a left (insert x right)

This function works as follows:

If the tree is Empty, we create a new Node with the value x.
If x is less than the current node’s value (a), we insert x into the left subtree.
If x is greater than a, we insert x into the right subtree.
If x is equal to a, we do nothing to avoid duplicates.

Using insert, we can create a BST:

treeFromList :: (Ord a) => [a] -> Tree a
treeFromList = foldr insert Empty

exampleBST :: Tree Int
exampleBST = treeFromList [5, 3, 8, 1, 4, 6, 10]

Searching for an Element

To search for an element in a BST, we use the following function:

search :: (Ord a) => a -> Tree a -> Bool
search _ Empty = False
search x (Node a left right)
    | x == a    = True
    | x < a     = search x left
    | otherwise = search x right

The search function works as follows:

If the tree is Empty, the element is not found, so it returns False.
If the target value x equals the current node’s value a, it returns True.
If x is less than a, it searches in the left subtree.
If x is greater than a, it searches in the right subtree.

Using search, we can check if an element exists in the tree:

found = search 4 exampleBST  -- True
notFound = search 7 exampleBST  -- False

Deleting an Element

Deleting an element from a BST is more complex. Here’s an implementation:

delete :: (Ord a) => a -> Tree a -> Tree a
delete _ Empty = Empty
delete x (Node a left right)
    | x < a     = Node a (delete x left) right
    | x > a     = Node a left (delete x right)
    | otherwise = case (left, right) of
        (Empty, Empty) -> Empty
        (left, Empty)  -> left
        (Empty, right) -> right
        _              -> let minRight = findMin right
                          in Node minRight left (delete minRight right)

findMin :: Tree a -> a
findMin (Node a Empty _) = a
findMin (Node _ left _) = findMin left

Here’s how delete works:

If the target value x is less than a, it deletes x from the left subtree.
If x is greater than a, it deletes x from the right subtree.
If x matches a, we handle three cases:
1. Node with no children: Replace it with Empty.
2. Node with one child: Replace it with its child.
3. Node with two children: Find the smallest value in the right subtree (using findMin), replace a with this minimum value, and delete the minimum value from the right subtree to avoid duplication.

Traversing a Binary Tree

Traversing a binary tree means visiting each node in a specific order. Here are three common traversal methods:

In-order Traversal

In-order traversal visits nodes in increasing order for a BST.

inOrder :: Tree a -> [a]
inOrder Empty = []
inOrder (Node a left right) = inOrder left ++ [a] ++ inOrder right

Pre-order Traversal

Pre-order traversal visits the root first, then the left and right subtrees.

preOrder :: Tree a -> [a]
preOrder Empty = []
preOrder (Node a left right) = [a] ++ preOrder left ++ preOrder right

Post-order Traversal

Post-order traversal visits the left and right subtrees before the root.

postOrder :: Tree a -> [a]
postOrder Empty = []
postOrder (Node a left right) = postOrder left ++ postOrder right ++ [a]

Real World Example

A real-world example of a binary search tree (BST) can be found in the implementation of a contact list in a phone book application. Imagine you want to store a list of contacts by their names in such a way that you can efficiently search, add, and delete contacts.

Contact List Example Using a Binary Search Tree

In this contact list:

Each node in the BST would represent a contact, and each contact node would store the contact’s name (which is used for sorting) and potentially other information, like phone number, email, or address.
Since names can be compared alphabetically, we can organize the tree such that:
- All contacts in the left subtree of any node have names that are alphabetically less than the current node’s name.
- All contacts in the right subtree have names that are alphabetically greater than the current node’s name.

This alphabetical ordering allows quick lookup, insertion, and deletion of contacts.

How a BST Helps in This Example

1. Searching for a Contact

When searching for a contact by name:

Start at the root of the tree.
If the name matches the root’s name, you’ve found the contact.
If the name is alphabetically less than the root’s name, move to the left child (which contains smaller names).
If the name is greater, move to the right child.

This process reduces the search space by half at each step, making the search operation efficient (with an average time complexity of O(log ⁡n) for a balanced BST).

2. Adding a New Contact

To add a new contact:

Start at the root and compare the new contact’s name with the current node’s name.
Navigate left or right based on whether the new contact’s name is smaller or larger.
When you reach an empty spot (a node’s left or right child is Empty), insert the new contact there as a new Node.

This keeps the BST structure intact while maintaining alphabetical order, enabling efficient future searches.

3. Deleting a Contact

Deleting a contact in a BST involves a few cases:

If the contact is a leaf node, simply remove it.
If the contact has one child, replace it with its child.
If the contact has two children, find the minimum value in its right subtree to replace the deleted contact (or the maximum in the left subtree), and then recursively delete that minimum or maximum node.

Example Layout of a BST for a Contact List

Suppose you have the following contacts by name:

["Alice", "Bob", "Cathy", "Eve", "David"]

Your BST might look like this:

      Cathy
     /     \
  Alice    Eve
     \      /
     Bob  David

In this structure:

To search for “Bob,” you start at “Cathy” (root), go left to “Alice,” and then right to find “Bob.”
To add a contact named “Frank,” start at “Cathy,” go right to “Eve,” then right to an empty spot where “Frank” would be inserted.

Why Use a BST for a Contact List?

Efficiency: The BST allows fast search, insertion, and deletion compared to a simple list, especially as the list grows larger.
Automatic Sorting: Contacts are stored in alphabetical order without needing a separate sorting step.
Scalability: A balanced BST handles large data sets well, making it suitable for applications like phone books with potentially thousands of contacts.

By using a binary search tree for a contact list, applications can handle complex and large datasets with ease, maintaining quick access to contacts as they’re added, searched for, or deleted.

Explain Tree Structure Like I’m Five Years Old (ELI5)

The purpose of a tree structure is to help us find things really quickly, just like if you were playing a game where you have to find the right toy in a big pile. Imagine instead of searching through a messy pile, we put each toy on its own branch of a big tree. The tree has a special order, so each branch leads you closer to the toy you want.

This way, instead of searching through every single toy, you follow the branches, and each step takes you closer to the right toy. So, tree structures help computers find, organize, and store things in a way that saves a lot of time and computing power!

Summary

In Haskell, trees are defined recursively with nodes containing values and references to subtrees. A binary search tree (BST) is a special type of binary tree that maintains a sorted order, making searching, insertion, and deletion more efficient.

Binary Tree: A tree where each node has up to two children.
Binary Search Tree: A binary tree where each node’s left children are smaller, and right children are larger.
Key Operations:
- insert adds a new element while maintaining the BST property.
- search checks if an element exists.
- delete removes an element and restructures the tree if necessary.
Traversals (inOrder, preOrder, postOrder): Different ways to visit nodes in a binary tree.

Binary trees, and especially binary search trees, are foundational data structures that allow efficient storage and retrieval of data. Understanding how to build and manipulate them in Haskell is a valuable skill for creating efficient and functional programs.

Tree Structures – Haskell