[ACE] Binary Search Tree

Binary Search Tree


Motivation

Why we need BST? We learned binary search before, if we search for an element within a sorted array is fast.
The search will be in O(logn), but arrays suffer from being slow, O(n), to insert new elements.
So we need a search trees attempt to fix the inefficiency of insertion whilst keeping good O(log n) properties for search.

A binary search tree is a binary tree storing key-value entries at its internal nodes and satisfying the following “search tree” property:

  • let u, v and w be any three nodes such that u is in the left subtree of v and w is in the right subtree of v. Then we must have
  • key(u)<= key(v) <= key(w)

Now we are going to implement the binary search tree.
Similar to the previous implementation, when we want to implement a tree-structure, what we should do first is to define Node in tree.
enter image description here

Search for a BST is relatively easy, we have learned binary search before, so we use the similar style.

  • To search for key k, trace a downward path starting at the root
  • The next node visited depends on the outcome of the comparison of k with the key of the current node
  • If we reach a leaf, the key is not found and we return null
    For search, we can use either recursive method or iterative method:
    Recursive program are easier to implement, but less efficient, because the overhead of a function call
    For best efficiency, we’d like to use the iterative program.

enter image description here

Here, we see the recursive method is to invoke the function itself so it has more overhead.

Insert

how to do insertion, insert(K key)?
We need to find(search) where to insert the key first:

  • If k is already in the tree then just replace the value
  • Otherwise, we insert it to the correct place
  • But when we write the code, we should pay attention to the first insert because you can not search anything at first.

Remove

OK! Remove is the most difficult part, because there are too many different cases when you do removing.
What if the node you are going to remove is a external node, what if it has one child, or two children?

External

If it is a external node, which means it doesn’t have any children, so just remove it and don’t need to do anything else

one child

If the node to be removed has one child, we need to remove the node, and connect the child with the node’s parent.
In code, we need to know which child the node have(left or right) and make the child’s parent be node’s parent, the parent’s left/right child be the child.

Two children

Two children is not difficult. What we need is to find the successor.
Successor means the smallest node from the nodes which are bigger than the current node.
enter image description here
Then we replace the current removing node by the successor and don’t break any structure of the current BST.
enter image description here
This is a example.

The code about BST and test case is here.

AVL tree

What if we insert to a tree and the tree is like this:
enter image description here

  • the space used is O(n)
  • methods find, insert and remove take O(h) time
  • in the worst case O(n)

What we need is to self-balancing :
enter image description here

This is AVL tree.
AVL (Adelson-Velskii & Landis) trees are binary search trees where nodes also have additional information: the difference in depth between their right and left subtrees (balance factor) should be 0, 1 or -1.
AVL trees do dynamic self-balancing in O(log n) time
Why self-balancing takes O(logn) time?
Because the method we used for self-balance is “total rebuild”, we insert all the node again, and insertion in BST takes O(log n) or O(height) time.

In the Github code, i used a method called rotation, as far as i am concerned, the insertion should be O(1) in this method.

Top down and bottom up insertion

Top down insertion algorithms make changes to the tree (necessary to keep the tree balanced) as they search for the place to insert the item. They make one pass through the tree.
e.g.: AVL Trees.
Bottom up: first insert the item, and then work back through the tree making changes. Less efficient because make two passes through the tree.
e.g. red-black trees, 2-4 trees.

评论

此博客中的热门博文

[MLE] Linear Classification

[AIM] MetaHeuristics

[CS231] Neural Networks