Let’s code it: Compact tree (part 1)

You need a tree structure but you don’t want to pay the cost of single node allocation every time, moreover you want a structure cache friendly as much as possible? An arena based tree could be what you need.

An arena based tree is a data structure that uses a packed chunk of memory to store all the nodes of the tree like this:

Arena based tree

Each node contains the index of the the parent, the first child and the next sibling. These information is what we need to navigate the tree effectively without losing the structure. The resulting logic structure can be seen below:

Arena based tree logical structure

Note V is just a generic value

Let’s declare the structure for each node:

pub struct TreeNode<T>
{
    pub data: T,
    pub parent: Option<usize>,
    pub first_child: Option<usize>,
    pub next_node: Option<usize>,
}

The only value that we know that is not optional is data other than that we don’t know much of a node without the position in the tree, adding that we can define all the possible configurations assumed by a node:

  • Root node
    • data
    • data and first_child
  • Intermediate node
    • data, parent and first_child
    • data, parent, first_child and next_node
  • Leaf node
    • data, parent
    • data, parent and next_node

Note that the values listened are those that are not None inside the nodes. Please refer to the above logical structure if the list is not clear enough.

These are the conditions inside the tree:

  • A root will be always placed at index 0
  • A deleted node will have parent and first_child always set to None and next_node set to itself
  • The id/index of the nodes is dictated by the length of the vec
  • The deleted node inside the tree will point always to the closer deleted node to the start of the memory chunk

Now we can declare the tree itself. The memory chunk is represented by a Vec structure.

pub struct ArenaTree<T>
{
    pub nodes: Vec<TreeNode<T>>,
    node_deleted: Option<usize>,
}

Note that the node_deleted is used to keep the index if there is a node available, already deleted, to be taken when a new node is added.

Next we can define the add child method:

pub fn add_child(&mut self, parent: Option<usize>, data: T) -> std::result::Result<usize, ()> {
    // There is a root if the nodes vec is not empty
    let has_root = !self.nodes.is_empty();
    // Before setting the root checks if there is already one
    if has_root && parent.is_none() {
        // The root is already set input are not valid
        return Err(());
    }
    // Takes the next index, it's just the length due to the fact that appends to a vec
    let idx = self.nodes.len();
    let child = if let Some(p) = parent {
        // If the parent exists
        if self.node_exists(p) {
            // Gets the parent node
            let mut parent_node = &mut self.nodes[p];
            // Creates the child node attached to the parent
            let node = TreeNode::new(data, parent, None, parent_node.first_child);
            // Connects the child node to the parent
            parent_node.first_child = Some(idx);
            node
        } else {
            // The parent does not exist
            return Err(());
        }
    } else {
        // Checks if the root doesn't exists
        if !has_root {
            assert!(idx == 0);
            // Creates the root node
            TreeNode::new(data, parent, None, None)
        } else {
            // The root already exists
            return Err(());
        }
    };
    // Pushes the node created into the memory chunk
    self.nodes.push(child);
    Ok(idx)
}

Note that the method returns a Result because if the parent passed does not exist or there is already a root we cannot insert the value.

We can now write the delete child method. What is needed is:

  1. get the node;
  2. remove the connection of the node with others nodes (siblings and children);
  3. remove the parent from the deleted node;
  4. use next node field as a self reference to mark it as deleted.

Here the implementation:

pub fn delete_child(&mut self, index: usize) -> std::result::Result<(), ()> {
    // Gets the node to delete
    let node_to_delete = &self.nodes[index];
    // Checks if the node is already deleted
    if node_to_delete.next_node != Some(index) {
        // Gets the parent node if present
        if let Some(p) = node_to_delete.parent {
            // Checks if the node in the parent is pointed as first child
            if self.nodes[p].first_child == Some(index) {
                // If it is it's possible to detach it connecting the next node.
                // Next node could be Some or None
                self.nodes[p].first_child = node_to_delete.next_node;
            } else {
                let mut node_found = None;
                // If the node is not the first child of the parent searches inside the children list
                // All the siblings are attached together using the next_node field
                let mut node = self.nodes[p].first_child;
                // While there is a node in the list
                while node.is_some() {
                    let n = node.unwrap();
                    // If the node is the one to delete
                    if self.nodes[n].next_node == Some(index) {
                        node_found = Some(n);
                        break;
                    } else {
                        // Continue inside the list the node is not the needed one
                        node = self.nodes[n].next_node;
                    }
                }
                // If the node was found it's possible to detach it
                if let Some(n) = node_found {
                    // Detaches the node in the siblings list and put the next node
                    self.nodes[n].next_node = self.nodes[index].next_node;
                } else {
                    // If not the tree is inconsistent but not changes is made in the list of siblings
                    return Err(());
                }
            }
        }
        // Detaches the current node and all the children iteratively,
        // because it's faster than recursion
        let mut queue = VecDeque::new();
        // Push the first node to delete
        queue.push_front(index);
        let mut first_node = true;
        while !queue.is_empty() {
            // Pops the front node to start the deletion process
            let n_index = queue.pop_front().unwrap();
            // Gets the node
            let mut n = &mut self.nodes[n_index];
            // If it's not the first one queue the siblings otherwise is just the node we started with
            if !first_node {
                // Queues the next sibling
                if let Some(s) = n.next_node {
                    queue.push_front(s);
                }
            } else {
                first_node = false;
            }
            // Queues the first child
            if let Some(c) = n.first_child {
                queue.push_front(c);
            }
            // Removes the node by removing the first child
            n.first_child = None;
            // The parent
            n.parent = None;
            // And using next_node to point to itself
            n.next_node = Some(n_index);
        }
        // Saves the new node deleted inside the tree
        if let Some(nd) = self.node_deleted {
            // Save the index only if it's smaller
            if index < nd {
                self.node_deleted = Some(index);
            }
        } else {
            // There is not a new node delete, saves it
            self.node_deleted = Some(index);
        }
        Ok(())
    } else {
        // The node is already deleted
        Err(())
    }
}

Note the remove child methods is creating empty spaces inside the memory chunk these can be reused if we need to insert a new node

The add child method can be changed like this to reuse the deleted nodes:

pub fn add_child(&mut self, parent: Option<usize>, data: T) -> std::result::Result<usize, ()> {
    // There is a root if the nodes vec is not empty and the first node is node deleted
    let has_root = !self.nodes.is_empty() && self.nodes[0].next_node != Some(0);
    // Before setting the root checks if there is already one
    if has_root && parent.is_none() {
        // The root is already set input are not valid
        return Err(());
    }
    // Is there a deleted node we can reuse that space without allocation
    if let Some(idx) = self.node_deleted {
        if let Some(p) = parent {
            // If the parent exists
            if self.node_exists(p) {
                // Temporary saves the current first child
                let first_child = self.nodes[p].first_child;
                // Sets the "new" node into the parent as first child
                self.nodes[p].first_child = Some(idx);
                // Sets the next value as the previous first child
                self.nodes[idx].next_node = first_child;
            } else {
                // The parent does not exist
                return Err(());
            }
        } else {
            // Checks if the root doesn't exists
            if !has_root {
                assert!(idx == 0);
                let node = &mut self.nodes[idx];
                node.parent = None;
                node.next_node = None;
            } else {
                // The root already exists
                return Err(());
            }
        };
        // Checks if there is another node that can be used in a later insert
        self.node_deleted = self.find_next_deleted(0);
        Ok(idx)
    } else {
        // Takes the next index, it's just the length due to the fact that appends to a vec
        let idx = self.nodes.len();
        let child = if let Some(p) = parent {
            // If the parent exists
            if self.node_exists(p) {
                // Gets the parent node
                let mut parent_node = &mut self.nodes[p];
                // Creates the child node attached to the parent
                let node = TreeNode::new(data, parent, None, parent_node.first_child);
                // Connects the child node to the parent
                parent_node.first_child = Some(idx);
                node
            } else {
                // The parent does not exist
                return Err(());
            }
        } else {
            // Checks if the root doesn't exists
            if !has_root {
                assert!(idx == 0);
                // Creates the root node
                TreeNode::new(data, parent, None, None)
            } else {
                // The root already exists
                return Err(());
            }
        };
        // Pushes the node created into the memory chunk
        self.nodes.push(child);
        Ok(idx)
    }
}

Note as we can reuse now the space that is left from the deletion

For now this is it, in the next part we will find a way to pack all the node that are not deleted and discard what is left.

See you next time!

Updated: