SE250:HTTB:WIP:Binary Search Trees

From Marks Wiki
Jump to navigation Jump to search

Binary Search Tree logo by rthu009

<HTML> <img src="http://i38.photobucket.com/albums/e141/neozmatrix/BinarySearchTree.jpg" border="0" alt="HashTable"></a> </HTML>


Back to Work in Progess

Basics of Binary Search trees

Binary search trees are important in database applications.

A tree is defined as a non-empty finite set of labeled nodes such that there is only one node called the root of the tree, and the remaining nodes are partitioned into subtrees. If the tree is either empty or each of its nodes has not more than two subtrees, it is called a binary tree. Hence each node in a binary tree has either no children, one left child, one right child, or a left child and a right child, each child being the root of a binary tree called a subtree.

A tree is drawn downwards from the root. Each node is associated with a key. The left subtree of a node contains only values less than the node's value and the right subtree of the node contains the values greater than or equal the node's value. Extreme left node of the tree will be the least and extreme right will be the highest values of the Tree. A tree can be linear, non-equal,zig-zag or balanced.

Trees are good dealing with strings. But more general are Binary Search Trees. Operations on a binary tree require comparisons between nodes.

e.g: X > Y, X < Y or X = Y (One of them is true)

Here is an example of how a Binary Search Tree looks like.

<html> <img src="http://studwww.cs.auckland.ac.nz/~apra102/Binary tree.jpg" width="670" height="437" alt="Simple Binary Search tree" /> </html>

Every node stores information implicitly or explicitly. Every node has a unique parent, the root of the tree is the only node with no parent here it is f.The least value will be the extreme left value of the tree that is a and highest value of the tree will be the extreme right value of the tree that is k. Here a,d,e,g,h,k are leafs which doesn't have any children. And there are two external nodes which says thats the end of the tree. We sometimes use these external nodes as placeholders to indicate empty subtrees.

The depth of a node is the number of edges on its path to the root. The height of a tree is the maximum depth of a node. Here the height of the tree is 3.

If all leaves are at the same depth of the tree then the tree is full which means the tree is balanced.A binary tree is balanced if the left and right subtree of each node have height difference of 0 or 1.

Recursive Method

If we want to know the size of the tree or is it empty or make a particular element as a root or to find whether the tree is a full or how many leaves are at the end of the tree or what is the depth of the tree. For most of the applications it is essential to visit almost all the nodes in the tree in that case traversal method recursively helps us to solve most of the above operations. There are three main ways of doing it pre-order, in-order and post-order. Also one non-recursive method level-order. These methods are explained in detail later on as screen casts.


Insertion and Deletion

We can insert a node or a leaf but there is a problem by inserting, it forces the tree to become rather unbalanced in many cases, depending on the order of insertion. It may become completely linear in a worst case. The search time for an element, in the worst case, is proportional to the height. Delete the nodes or a leaf from the tree, deleting a leaf is easy, but it is different if we want to delete a node with children. We must delete the whole subtree at that node. We also need to find its parent and reset the pointers to its children. This is not a quick operation in general. This representation works well for complete trees. Have a look at screen casts showing the insertion and deletion below in Screencasts.

Red Black Trees

There are many types of binary search trees. AVL trees and red-black trees are both forms of self-balancing binary search trees. These use various operations (mainly rotations of the tree) to force the tree to remain (reasonably) balanced, are fairly complicated to implement and even harder to analyse. The more the balance condition is relaxed, the easier it is to insert, and the harder to find, so there is a tradeoff. Balanced condition is nothing but left branch of the tree must be same as the right branch of the tree mat be with a difference of 1 or 0. These have a balance condition that ensures that the maximum depth of a node is no more than twice the minimum depth. They do this by colouring each node red or black, with the rules:

the root is black
all leaves (external nodes) are black, even when the parent is both children of every red node are black
every path from a node to a leaf contains the same number of  black nodes.

Here is an example of Red Black Tree.


<html> <img src="http://studwww.cs.auckland.ac.nz/~apra102/RBT.png" width="526" height="428" alt="red black tree" /> </html>

here is a good demo java applet which lets you explore the RBT binary search trees: ENJOY :D

<html> <applet codebase="http://gauss.ececs.uc.edu/RedBlack" code="RedBlack.class" width=900 height=680> </applet></html>

Notes

  • TREES

A tree data structure is similar to the road because it provides you with a series of forks in the road that lead you down a path to reach a decision. A tree mainly consists of a root node that has no parent. From the root emerges branches/edges/arcs, each of which leads to a child node. The child nodes themselves can branch out to point to further child nodes. If a node has no children and is the last node in that path from the root to node, then it is called a leaf.

-> A single root is also a tree (diagram). -> A root with many child leaves is also a tree (diagram).

  • BINARY TREE

A binary( where binary means two) tree is a tree where each node has not more than two edges (diagram). Typically, the node has two edges, but there can be situations when the node has one edge or simply terminates, resulting in no additional edges.

The branch node is the fork in the road that links the root node to two branches. Each branch terminates with a child node. These are called the left branch and the right branch.Some nodes have no children, while other nodes have one or two child nodes. If the current node doesn't have any child nodes, then the current node is referred to as the leaf node. A leaf node is located at the bottom of the tree.

  • Depth and Size

A binary tree is described by using two measurements: tree depth and tree size (diagram). Tree depth is the number of levels in the tree. A new level is created each time a current node branches to a child node. For example, one level is created when the root node branches into child nodes. The size of a tree is the number of nodes in the tree. For example, the first level in (diagram) has one node, which is the root node. The second level has up to two nodes, which are the child nodes of the root. The third level may have up to four nodes.

The size is an approximation because the tree may or may not be balanced. A balanced tree is a binary tree where each current node has two child nodes. An unbalanced tree is a binary tree where one or more current nodes have fewer than two child nodes.

  • Why Use a Binary Tree?

Programmers use a binary tree to quickly look up data stored at each node on the binary tree. For example, to find a student ID in a list of a million student IDs.



Structure of a Binary Tree

A binary tree can be either

1)empty

or

2)a node whose children are binary trees.

From Lab 6 we saw that the typical structure for a node is a binary tree is:

typedef
struct Node {
  int key;
  struct Node *left;
  struct Node *right;
  struct Node *parent;	
} Node;

<html><img src="http://studwww.cs.auckland.ac.nz/~dols008/ssfw2.jpg"/></html>

In every binary tree there is one node that doesnt have a parent. This node is called the root. In the diagram above, the node labelled 2 is the root of the tree. No node can have more than two children. If a node has no children, we refer to it as a leaf. in the diagram above, the nodes labelled 1 and 3 are both leaves. The nodes in a binary tree are all linked together by edges.

Traversing a Tree

There are several different ways of traversing through a tree so we can get the necessary value each node. The main ways of doing it recursively are: preorder, inorder and postorder. There is also one way that is not recursive, called level order.

<HTML> <IMG SRC="http://www.geocities.com/braydondunnit/SOFTENG250/6.jpg"> </IMG> </HTML>

preorder = root, left subtree, right subtree

inorder = left subtree, root, right subtree

postorder = left subtree, right subtree, root


level order = visits nodes from left to right, top to bottom.

For more examples on tree traversals, see Screencasts :)


Here is a good java applet on binary search traversals^ ^

<html> <applet code="IDSV.class" align="baseline" width="150" height="50" name="lesson1" archive="http://nova.umuc.edu/~jarc/idsv/idsv.jar"><param name="lesson" value="1"></applet> </html>

Insertion And Deletion by hbar055


Insertion

To insert, search until we find the place that it should live (an empty subtree of a node in the tree). Attach a new node in that place. This to explain in more detail the procedure of insertion. We search if the root is not equal to the value, we search the left or right subtrees as before. Eventually, we will reach an external node and add the value as its right or left child, depending on the node's value. In other words, we examine the root and recursively insert the new node to the left subtree if the new value is less than the root, or the right subtree if the new value is greater than or equal to the root.

Code:

void InsertNode(Node *&treeNode, Node *newNode)
{
    if (treeNode == NULL)
      treeNode = newNode;
    else if (newNode->key < treeNode->key)
      InsertNode(treeNode->left, newNode);
    else
      InsertNode(treeNode->right, newNode);
}


In the worst case the operation requires time proportional to the height of the tree.

Another way to explain insertion is that in order to insert a new node in the tree, its value is first compared with the value of the root. If its value is less than the root's, it is then compared with the value of the root's left child. If its value is greater, it is compared with the root's right child. This process continues, until the new node is compared with a leaf node, and then it is added as this node's right or left child, depending on its value.

There are other ways of inserting nodes into a binary tree, but this is the only way of inserting nodes at the leaves and at the same time preserving the BST structure.

Deletion

There are several cases to be considered:

  • Deleting a leaf: Deleting a node with no children is easy, as we can simply remove it from the tree.
  • Deleting a node with one child: Delete it and replace it with its child.
  • Deleting a node with two children: Suppose the node to be deleted is called N. We replace the value of N with either its in-order successor (the left-most child of the right subtree) or the in-order predecessor (the right-most child of the left subtree).

Deleting a node with two children from a binary search tree

Once we find either the in-order successor or predecessor, swap it with N, and then delete it. Since both the successor and the predecessor must have fewer than two children, either one can be deleted using the previous two cases. A good implementation avoids consistently using one of these nodes, however, because this can unbalance the tree.

Code:

void DeleteNode(Node * & node) {
    if (node->left == NULL) { 
        Node *temp = node;
        node = node->right;
        delete temp;
    } else if (node->right == NULL) {
        Node *temp = node;
        node = node->left;
        delete temp;
    } else {
        // In-order predecessor (rightmost child of left subtree) 
        // Node has two children - get max of left subtree
        Node **temp = &node->left; // get left node of the original node

        // find the rightmost child of the subtree of the left node
        while ((*temp)->right != NULL) {
            temp = &(*temp)->right;
        }

        // copy the value from the in-order predecessor to the original node
        node->value = (*temp)->value;

        // then delete the predecessor
        DeleteNode(*temp);
    }
}

Although this operation does not always traverse the tree down to a leaf, this is always a possibility; thus in the worst case it requires time proportional to the height of the tree. It does not require more even when the node has two children, since it still follows a single path and does not visit any node twice.

Rotation

Rotation changes the structure of the binary tree but leaves the order of data unchanged. This is commonly used to resize the tree, reducing the length and optimising the tree for future operations.

When a node is rotated, only nodes below it(subtree) are changed.

Example rotation screencast

Animations

Screencasts

User Description
dols008 external Explanation of the next_pre function from lab 6 and how to implement it.
sgha014 Lookup function
apra102 Insertion
mgha023 Pre-Order
mgha023 Post-Order
apra102 Deletion
vpup001 Explanation of commands used for BSTs
vpup001 Explanation of minimum and maximum functions
jpar277 - 1 Explanation of rotate functions.
apra102 next_in function
tlou006 Rotation example.
asin185_2 In-Order
asin185_3 level-Order
srag014 Some Basics!
jpar277 - 2 Explanation of the delete function.

Quality Control

  • please note that some of the (example) images are broken. please see if you can correct the links, or find a new host.