- Why Use Binary Trees?
- Tree Terminology
- An Analogy
- How Do Binary Search Trees Work?
- Finding a Node
- Inserting a Node
- Traversing the Tree
- Finding Minimum and Maximum Key Values
- Deleting a Node
- The Efficiency of Binary Search Trees
- Trees Represented as Arrays
- Printing Trees
- Duplicate Keys
- The BinarySearchTreeTester.py Program
- The Huffman Code
- Summary
- Questions
- Experiments
- Programming Projects
Summary
Trees consist of nodes connected by edges.
The root is the topmost node in a tree; it has no parent.
All nodes but the root in a tree have exactly one parent.
In a binary tree, a node has at most two children.
Leaf nodes in a tree have no child nodes and exactly one path to the root.
An unbalanced tree is one whose root has many more left descendants than right descendants, or vice versa.
Each node of a tree stores some data. The data typically has a key value used to identify it.
Edges are most commonly represented by references to a node’s children; less common are references from a node to its parent.
Traversing a tree means visiting all its nodes in some predefined order.
The simplest traversals are pre-order, in-order, and post-order.
Pre-order and post-order traversals are useful for parsing algebraic expressions.
Binary Search Trees
In a binary search tree, all the nodes that are left descendants of node A have key values less than that of A; all the nodes that are A’s right descendants have key values greater than (or equal to) that of A.
Binary search trees perform searches, insertions, and deletions in O(log N) time.
Searching for a node in a binary search tree involves comparing the goal key to be found with the key value of a node and going to that node’s left child if the goal key is less or to the node’s right child if the goal key is greater.
Insertion involves finding the place to insert the new node and then changing a child field in its new parent (or the root of the tree) to refer to it.
An in-order traversal visits nodes in order of ascending keys.
When a node has no children, you can delete it by clearing the child field in its parent (for example, setting it to None in Python).
When a node has one child, you can delete it by setting the child field in its parent to point to its child.
When a node has two children, you can delete it by replacing it with its successor and deleting the successor from the subtree.
You can find the successor to a node A by finding the minimum node in A’s right subtree.
Nodes with duplicate key values require extra coding because typically only one of them (the first) is found in a search, and managing their children complicates insertions and deletions.
Trees can be represented in the computer’s memory as an array, although the reference-based approach is more common and memory efficient.
A Huffman tree is a binary tree (but not a search tree) used in a data-compression algorithm called Huffman coding.
In the Huffman code, the characters that appear most frequently are coded with the fewest bits, and those that appear rarely are coded with the most bits.
The paths in the Huffman tree provide the codes for each of the leaf nodes.
The level of a leaf node indicates the number of bits used in the code for its key.
The characters appearing the least frequently in a Huffman coded message are placed in leaf nodes at the deepest levels of the Huffman tree.