- Why Use Binary Trees?
- Tree Terminology
- An Analogy
- How Do Binary Search Trees Work?
- Finding a Node
- Inserting a Node
- Traversing the Tree
- Finding Minimum and Maximum Key Values
- Deleting a Node
- The Efficiency of Binary Search Trees
- Trees Represented as Arrays
- Printing Trees
- Duplicate Keys
- The BinarySearchTreeTester.py Program
- The Huffman Code
- Summary
- Questions
- Experiments
- Programming Projects
Inserting a Node
To insert a node, you must first find the place to insert it. This is the same process as trying to find a node that turns out not to exist, as described in the earlier “Can’t Find the Node” section. You follow the path from the root toward the appropriate node. This is either a node with the same key as the node to be inserted or None, if this is a new key. If it’s the same key, you could try to insert it in the right subtree, but doing so adds some complexity. Another option is to replace the data for that node with the new data. For now, we allow only unique keys to be inserted; we discuss duplicate keys later.
If the key to insert is not in the tree, then __find() returns None for the reference to the node along with a parent reference. The new node is connected as the parent’s left or right child, depending on whether the new node’s key is less or greater than that of the parent. If the parent reference returned by __find() is self, the BinarySearchTree itself, then the node becomes the root node.
Figure 8-10 illustrates the process of inserting a node, with key 31, into a tree. The __find(31) method starts walking the path from the root node. Because 31 is less than the root node key, 44, it follows the left child link. That child’s key is 27, so it follows that child’s right child link. There it encounters key 33, so it again follows the left child link. That is None, so __find(31) stops with the parent pointing at the leaf node with key 33. The new leaf node with key 31 is created, and the parent’s left child link is set to reference it.
FIGURE 8-10 Inserting a node in binary search tree
Using the Visualization Tool to Insert a Node
To insert a new node with the Visualization tool, enter a key value that’s not in the tree and select the Insert button. The first step for the program is to find where it should be inserted. For example, inserting 81 into the tree from an earlier example calls the __find() method of Listing 8-3, which causes the search to follow the path shown in Figure 8-11.
FIGURE 8-11 Steps for inserting a node with key 81 using the Visualization tool
The current pointer starts at the root node with key 77. Finding 81 to be larger, it goes to the right child, node 94. Now the key to insert is less than the current key, so it descends to node 85. The parent pointer follows the current pointer at each of these steps. When current reaches node 85, it goes to its left child and finds it missing. The call to __find() returns None and the parent pointer.
After locating the parent node with the empty child where the new key should go, the Visualization tool creates a new node with the key 81, some data represented by a color, and sets the left child pointer of node 85, the parent, to point at it. The node pointer returned by __find() is moved away because it still is None.
Python Code for Inserting a Node
The insert() method takes parameters for the key and data to insert, as shown in Listing 8-4. It calls the __find() method with the new node’s key to determine whether that key already exists and where its parent node should be. This implementation allows only unique keys in the tree, so if it finds a node with the same key, insert() updates the data for that key and returns False to indicate that no new node was created.
LISTING 8-4 The insert() Method of BinarySearchTree
class BinarySearchTree(object): # A binary search tree class … def insert(self, # Insert a new node in a binary key, # search tree finding where its key data): # places it and storing its data node, parent = self.__find( # Try finding the key in the tree key) # and getting its parent node if node: # If we find a node with this key, node.data = data # then update the node's data return False # and return flag for no insertion if parent is self: # For empty trees, insert new node at self.__root = self.__Node(key, data) # root of tree elif key < parent.key: # If new key is less than parent's key, parent.leftChild = self.__Node( # insert new node as left key, data, right=node) # child of parent else: # Otherwise insert new node as right parent.rightChild = self.__Node( # child of parent key, data, right=node) return True # Return flag for valid insertion
If a matching node was not found, then insertion depends on what kind of parent reference was returned from __find(). If it’s self, the BinarySearchTree must be empty, so the new node becomes the root node of the tree. Otherwise, the parent is a node, so insert() decides which child will get the new node by comparing the new node’s key with that of the parent. If the new key is lower, then the new node becomes the left child; otherwise, it becomes the right child. Finally, insert() returns True to indicate the insertion succeeded.
When insert() creates the new node, it sets the new node’s right child to the node returned from __find(). You might wonder why that’s there, especially because node can only be None at that point (if it were not None, insert() would have returned False before reaching that point). The reason goes back to what to do with duplicate keys. If you allow nodes with duplicate keys, then you must put them somewhere. The binary search tree definition says that a node’s key is less than or equal to that of its right child. So, if you allow duplicate keys, the duplicate node cannot go in the left child. By specifying something other than None as the right child of the new node, other nodes with the same key can be retained. We leave as an exercise how to insert (and delete) nodes with duplicate keys.