TECHNICAL REPORT 99-1
TREE DATA STRUCTURE
Teaching Perspective
FELIX FRIEDMAN
SUMMER 1999
Teaching Perspective
1999 Felix Friedman
All rights reserved
I would like to thank Lisa L. Levine for her creativity in editing
this project and Matthew J. Kwiecinski for editing of several handouts,
and producing its html version. Their knowledge of the subject greatly
contributed to this work.
Finally, special thanks to Theresa M. Denshuick for the cover
picture and continuous supervision of the editing.
FF
The topic of trees is a major topic in data structures and arises in theory, in many applications and in other courses. The amount of time devoted to it is quite substantial. The topic deals with a number of issues. The attention given to each issue depends on the perceived importance of each of them. There is a number of textbooks on tree data structure. An excellent and quite complete treatment of trees can be found in [1], while an interesting presentation of special binary trees can be discovered in [2]. This project is intended as an aid for teaching or studying trees in a traditional setting or distance learning. It consists of an introduction and three parts.
The introduction describes a general tree in simple terms suited for the first encounter with this concept.
Part 1 deals with General binary trees and is the focal point of the project. Three traversals are discussed here in detail. Exercising the traversal on a tree drawn makes clear the need for a mechanism to memorize the address of the right subtree while the algorithm deals with the left subtree of the same node. The handout 1 provides the logic (pseudocode) of the traversals. It leads to handout 2, where this mechanism is constructed. A major emphasis is made on a tree representation in computer memory as a way to store not only nodes but also their parent-child relationship. Three representations are discussed and handouts are provided to aid in traversal implementation for each. Assignments address each implementation of traversals (traversals code) in two aspects. First, three traversals for each implementation with an additional store routine (since general binary trees do not define insertions) are assigned. These four routines have to be coded without use of recursion. Second, the same is assigned to be done recursively. This practice leads to mastery of issues involved in binary tree representations and traversals. Such emphasis is often not present. Study of traversals provides an opportunity to revisit previously studied topic of recursion and its underlying stack.
Part 2 addresses special binary trees. Being the most effective representation of ordered lists, binary search trees are mostly considered from this vantage point. Heap trees are well covered in most textbooks. Expression trees, their three traversals and corresponding prefix, infix, and postfix forms of an arithmetic expression are discussed in more detail since data structures textbooks usually not cover these trees enough.
Part 3 considers general trees. This material starts with a recursive definition of such trees, discusses two implementations of general trees and several applications. One of them, the union-find problem tree, is especially useful in a number of applications, while another, B-tree, is a very efficient way to maintain an index.
This project treats some issues in more detail than others. Important, but frequently not well covered issues are given additional attention. A special importance is assigned to part 1. It builds a reliable base for further studies.
A major effort went into development of handouts to assist
the study of trees. Pascal code or pseudocode is used in both handouts
and exercises.
Binary Trees
Three Tree TraversalHandout 1Tree Implementations
Handout 2
Handout 3
Handout 4Define a full binary tree
Non-Linked Implementation
Implementing traversals for a non-linked representation
Linked with Integer Representation
Linked Representation
Assignment 1
Assignment 2
Non-linked implementation of a Binary Tree (Handout 5)
Storing (Construction) a Binary Tree (Handout 6)
Example (Handout 7)
Recursive Store Routine (Handout 9)
Special Cases of Binary Trees
Heap and Expression Trees
Binary Search Tree (BST)
Threading for Inorder Traversal (Handout 10)
Insertions into an AVL Tree (Handout 11)
Analysis
General Trees
References
Description (heuristic definition) of a treeUsing the Description of a Tree, Prove the Following Statements:A tree is a data structure consisting of nodes (records) together with pointers to reference other nodes. A node, from which a pointer is emanating, is called a parent with respect to the node to which it points, which is called its child. A non-empty tree begins at a special node called the root of the tree; the root has zero or more children, but no parent. All other nodes have exactly one parent, and may have any number of children, including zero. The tree is usually drawn with the root at the top, children on the next level down, connected to the root by (undirected) line segments, their children on the next level down, etc. A subtree is a subset of nodes which also form a tree. The term-subtree of a node – refers to a child of that node and all its descendants.
- Define branching (interior) node
- Define leaf node
- Define depth (height) of a tree
- Define level of a tree
1. A tree of n nodes has exactly n-1 parent-child pointers (links).An empty tree is a formal concept, which has been proved to be useful. It designates a tree with no nodes. A leaf of a tree does not have subtrees; it is said a leaf may have only empty subtrees, while an interior node has one or more non-empty subtrees.
2. Node P is reachable from node Q by the way of the pointers if and only if Q is the root of a (sub)tree to which node P belongs.
3. No two subtrees of the same node have a common node.
4. A leaf is a one-node subtree of the leaf’s parent.
5. Even if a two directional link exists between any parent and its child, no node can be reached again once it is left for another node and there is a unique path from any node to any other node.
The description above has introduced a general tree. Applications
customarily deal with binary trees – trees, where a parent may have at
most two children, a left child and a right child. It may be explained
by the fact that a general tree can be replaced by a binary tree modeling
the same application. Most of the remaining material deals with binary
trees.
Definition of a Binary Tree. A binary tree is a tree in which each node has exactly two subtrees, designated the left subtree and right subtree, either or both of which may be empty.BINARY TREES
--To model a particular process, problem-solvers usually draw a tree and analyze it
--To use a binary tree in applications a tree has to be stored in computer memory
--The way in which it is stored must reveal the tree structure
exactly: a problem-solver ought not to have difficulty drawing the tree
from the way it is stored (implemented).
Operations on Binary Trees
Three most important operations are considered first. They are VLR traversal, LVR traversal, LRV traversal. In all these traversals the left subtree is traversed before the right subtree and the traversals differ by when the root is visited. These operations use only parent-child pointers to move from one node to another. They make it possible to produce three unique lists of nodes, corresponding to the same tree. Knowing these three lists one may produce the tree. The pseudo-code, yielding these three traversals recursively, is given in handout 1. It is worth noting that those three traversals are left-oriented; right-oriented tree traversals (when the right subtree is traversed before the left subtree) traditionally are not considered.
Notice that the back up (step 4) requires one to identify the root of a tree whose left subtree was just traversed. This may be accomplished by maintaining a stack as shown in handout 2.
- Draw a tree and produce its three traversals using the handout.
- Reconstruct the tree from those three lists without looking at the drawing.
HANDOUT ONE
Preorder Traversal Algorithm
(General Description)
1. Visit the ROOT of the tree.2. If this tree has a LEFT SUBTREE, then treat it as the tree and go to 1.
3. If this tree has a RIGHT SUBTREE, treat it as the tree and go to 1.
4. Back up to the Root whose LEFT SUBTREE was just traversed and take its RIGHT SUBTREE and go to 1.
5. If no such LEFT SUBTREE, stop.
(General Description)
1. If this tree has a LEFT SUBTREE, treat it as the tree and go to 1.2. Visit the ROOT of the tree.
3. If this tree has a RIGHT SUBTREE, treat it as the tree and go to 1.
4. Back up to the ROOT whose LEFT SUBTREE was just traversed and go to 2.
5. If no such LEFT SUBTREE, stop.
(General Description)
1. If this tree has a LEFT SUBTREE, treat it as the tree and go to 1.2. If this tree has a RIGHT SUBTREE, treat it as the tree and go to 1.
3. Visit the ROOT of this tree.
4. Back up to the ROOT whose LEFT SUBTREE was just traversed and go to 2.
5. If no such LEFT SUBTREE, stop.
PREORDER ALGORITHM(**initial tree is assumed non-empty**)
(*) VISIT the root of the current tree.
If the current tree has a non-empty left subtree, then
[put onto the stack the pointer to the root of its right subtree if
and only if it is non-empty: current tree := this left subtree;
GO TO (*)]
If the tree has a right subtree (*that will be examined, only when the
left subtree is empty*), then
[current tree := this right subtree;
GO TO (*)]
If the stack is non-empty, then
[pop off a pointer; current tree := the tree defined by the pointer (*notice,
a root specifies a tree once the tree is stored*)
GO TO (*)]
STOP (*stack is empty and ALGORITHM is executed*)
(*the stack is a stack of pointers only*)
INORDER ALGORITHM(**initial tree is assumed non-empty**)
(*) Examine the root of the current tree. If it has a non-empty left subtree,
then
[put onto the stack the pointer to the root;
define: the current
tree := this left subtree;
GO TO (*)]
(*)(*) VISIT the root of the current tree (*that is done when left subtree
is traversed(empty), if it has a non-empty right subtree*)
[define: current tree := this right subtree;
GO TO (*)]
If the stack is non-empty then0
[pop off a pointer; define: current tree := tree defined by the
pointer;
GO TO (*)(*)]
STOP (*the stack is empty and ALGORITHM is executed*)
(*the stack is a stack of pointers only*)
POSTORDER ALGORITHM(**initial tree is assumed non-empty**)
(*) Examine the root of the current tree.
If it has a non-empty left subtree then
[put onto the stack the pointer to the root multiplied by (-1); define:
current tree := left subtree;
GO TO (*)]
(*)(*) If the current tree has a non-empty right subtree (*that will be
tested only when the current tree is empty or traversed*), then
[put onto the stack the pointer to the root (*positive value onto the stack
indicates that left subtree has been traversed*);
define: current tree := this right subtree;
GO TO (*)]
(*)(*)(*)VISIT the root of the current tree.
If the stack is non-empty then
[pop off a pointer; define: current tree := the tree defined to
by abs(pointer)
If the pointer < 0 (*the left subtree is just traversed*)
GO TO (*)(*),
otherwise
GO TO (*)(*)(*)]
STOP (*stack is empty, the ALGORITHM is executed*)1
(*the stack is a stack of pointers only*)
(*Some ideas for linear non-recursive implementation of a tree*)
const
maxsize = 63;
type
element_type = char;
tree_type = array [1..maxsize] of element_type;
stack_type = record
top : integer;
pointers : array [1..maxsize] of integer;
end;
var_tree : tree_type;
stack : stack_type;
procedure
Preorder_Store (var_tree : tree_type);
var
i : integer;
answ, answ1 : char
;
stored : boolean;
begin
stored := false;
i := 1;
repeat
writeln (‘Is this (sub)tree non-empty? (Y or N)’);
readln (answ);
if (answ = ‘Y’) then
begin
writeln (‘Enter its root’);
readln (tree[i]);
writeln (‘Is there a right subtree for this root? (Y or N)’)
readln (answ1);
if (2 * i + 1 <= size) AND (answ1 = ‘Y’) then Push (stack, 2 * i + 1);
writeln (‘You are at the ‘, tree[i ]:3, ‘ looking at its left subtree’);
i := i * 2;
End;
If ((i > size) AND (top > 0) AND (answ = ‘Y’)) OR ((top >0) AND (answ =
‘N’)) then
Begin
Pop (stack, i);
Writeln (‘You are at the ‘, tree[i div 2]:3, ‘ looking at its right subtree’);
End
Else if ((answ = ‘Y’) AND (i > size)) OR (answ = ‘N’) then stored := true;
Until stored;
End;
Procedure Preorder (tree : tree_type);
Var
i : integer;
begin
i := 1;
(*below stay in the loop while not both subtree and stack empty*)
while not (((i > size) OR (tree[i] = ‘-‘)) AND (top = 0)) do
if (i > size) or (tree[i] = ‘-‘) then Pop (stack, i)
else
begin
visit (tree[i]);
push (stack, 2 * i + 1);
i := i * 2;
end;
end;
(*Inorder and Postorder procedures are similar*)
begin (*body of the main
program*)
writeln (‘Enter the height of the tree’);
readln (height);
size := 1;
for k := 1 to height + 1 do size := size * 2;
size := size – 1;
if size <= maxsize then
begin
for k := 1 to size do tree[k] := ‘-‘;
Preorder_store (tree);
Preorder (tree);
Inorder (tree);
Postorder (tree);
Else
writeln (‘Increase your maxsize in declaration’);
end;
end
(**Demonstrates how to remove goto’s**)
While the tree is not empty
OR stack is not empty
While the
tree is not empty
Visit its root
Define « root := left_child
(**may or may not be empty**)
IF left_child does not exist
Redefine « root := right_child
(**may or may not be empty**)
(**in both definitions above, root defines also the tree**)
Else
(**if a left child does exist**)
Handle the stack by pushing the pointer
to the right subtree on the stack
(**notice it may be a nil pointer**)
End_while
If the stack
is not empty then
Pop the stack, define the tree
Else
(**if the stack is empty**)
Define the tree is empty
(**to terminate**)
End_while
A tree defines partial orderings of nodes, normally lists of nodes beginning from the root and ending at a leaf. The same node may belong to several such orderings; the root of the tree belongs to all of them.
Each partial ordering corresponds to a path from the root to a leaf and it is directly based on pointers. A full binary tree of n nodes has exactly 2n such partial orderings.
The traversals establish three different ways for complete ordering of the nodes based on the pointers also, but not as directly (the order in which nodes are finally visited does not necessarily follow the pointers).
Each of the three traversals produces a unique fixed sequence of all nodes, i.e., a unique list of tree nodes. Given these three lists, the tree may be reconstructed.
This topic presents an opportunity to revisit and apply material studied in Linear Data Structures, namely, recursion and stacks. While traversing a tree, the left subtree is traversed first, leaving the right subtree for later traversal (it is repeated recursively). The moment for its traversal is defined exactly; it is done immediately after the left subtree is traversed. At this moment the address of its root has to be known for the preorder traversal, the address of the parent of this root in two other traversals. That is achieved by maintaining a stack of these addresses. Each time when leaving a node for a traversal of its left subtree, the address of its right child in preorder traversal, the address of this node in two others traversal are pushed onto the stack. When the traversal of the last left subtree is completed, the needed address for the traversal of its right counterpart is found on the top of the stack. It is an address of the desired root allowing one to traverse in a similar fashion a tree defined by it. (Handout 2)
These traversals are 3 major operations for general binary trees. Operations insertion and deletion are not defined for such trees. Therefore, storing a tree in computer memory using repeated insertions is replaced by a special operation store. To develop it, the preorder traversal may be used. The visit step in this routine may signify assigning an array cell to the next node in preorder traversal of the tree being stored. The assisting prompts have to be inserted in this routine to keep the dialog with the user, so the user provides the appropriate node. (Handout 3)
Note. A way to replace a GOTO statement is shown in Handout
4. It is interpreted as a jump statement used for loops. Thus, to replace
a GOTO, a body of an appropriate loop is defined. This usually is not covered
in students’ studies and replacement of GOTO’s in implementation of algorithms
in Handout 3 presents such opportunity.
There are three implementations of a binary tree: non-linked, linked, and linked with integer pointers. Each of these has its own use.
Define a full
binary tree.
--Find the total number of nodes and number of nodes
not having children in a full binary tree.
Solution. A full binary tree of depth d has d+1 levels. The root is on level 0, two nodes on level 1, four nodes on level 2,...,and 2d nodes on level d. The total number of nodes is:
1+2+22+…+2d =
= 2d+1-1.
--The number of nodes on any level of a full binary tree
is greater by one than the total number of nodes on all preceding levels.
Prove. There are 2k nodes on level
k,
there are 2k-1 nodes on a tree of depth k.
| k-th level with 2k
nodes |
![]() |
all nodes above k-th level constitute
a full tree of depth k - 1 |
Each node on the last level d does not have children.
This is 2d (greater by one than the number of remaining
nodes) nodes.
Note. The number of nodes in a full binary tree more
than doubles when a new level is added to the tree. This represents an
exponential growth.
An array A of size S = 2d+1-1, where d is the depth of the tree, is used to represent it. The number S reflects the number of nodes in a respective full binary tree of the depth d. The root is stored in A(1), left child of the root is stored in A(2), right child of the root is stored in A(3). For any node stored in A(k) its left child is stored in A(2k) and its right child is stored in A(2k+1). Even if achild for a non-last level node does not exist, an array cell holding some predefined value has to be designated for this child. Nodes of a tree below are stored in array cells whose indices are shown. Indices 4, 7, 8, 9, 12, 13, 14, and 15 are indices of wasted cells. A character ‘-‘ is chosen as the predefined value.
|
||||||||||||||||||||||||||||||||
![]() |
||||||||||||||||||||||||||||||||
Being a tree of depth 3, it requires 15 = 23+1 –1 cells to be represented. It is wasteful but allows one to find its children for any node A(k) and even a parent node: given a child node A(k), the parent node is A(k/2). In this representation even a non-existing node has to have two cells to represent non-existing children if the node is not on the last level of the tree. Since 15 is the maximal number of nodes of a tree of depth = 3, there is no need for cells with indices larger than 15. When looking for, say, a left child of A(k), first find if 2k is no larger than 15. If it is larger than 15, then A(k) is on the last level of the tree and no cells are allocated for its empty children. A(k) itself may represent a non-existing child. This saves 16 array cells, since the number of nodes in this full tree on the last level is 8. If A(k) represents a non-existing child not on the last level, two cells (for the left child and one for the right child) must be allocated. The topic of implementing (representing or storing) a binary tree in computer memory is important and deserves time to understand all issues involved in it, since implementations of binary tree operations (procedures in programming language) completely depend on properties of this representation.Note 1. If a tree is represented in computer memory, knowing the root of a (sub)tree means knowing the whole (sub)tree; in other words, a root of a represented (sub)tree completely defines the (sub)tree.Note 2. Nodes of the last level d are stored in the array slice
A(2d…2d + 1-1). In general, all nodes of level p are stored in
A(2p..2p + 1-1).Note 3. This representation is the best for storing nearly full or full binary trees, since very little or no memory is wasted.
Note 4. The time overhead needed to compute parent-child links is negligible.
1.1 Implementing traversals for a non-linked representation
2.Linked With Integer Pointer RepresentationHomework for this topic consists of two assignments. The first assignment develops a storing procedure and three traversals for trees drawn. These four procedures have to be implemented non-recursively. The second assignment develops each of these procedures as recursive routines. Handout 5 provides help.
This topic is covered in a sequence similar to the one above. The structure is an array of records, each having a field for a node data, its left child address (index of the array), right child address (index of the array) and (if desired) parent address. This representation uses an array cell for an existing node only. Since storing data in a data field of the node (which may be a large record of information) may require a large amount of memory (the same which would be used for a cell in non-linear representation), the amount of memory for storing left pointer, right pointer and parent pointer is not substantial in comparison. For an application where a node stores more than just an integer or character this representation saves a lot of main memory.Handout 6 provides a non-recursive algorithm for storing a binary tree using this representation. Handout 7 shows an example of execution of the algorithm. The algorithm initializes a variable avail to 1 and stores the root of the tree in cell 1 of the array. Using avail := avail + 1 it allocates the next cell for the next node in preorder traversal of the tree enforced by the algorithm. The two homework assignments are similar to those for non-linked representation. No additional handouts are provided since handout 6 covers the store algorithm. It also can be looked at as a help for the non-recursive preorder traversal. The inorder and postorder traversals are similar to the preorder traversal. The second assignment is much easier, since in recursive implementations the stacks are handled by the compiler. It has a lot in common with the second assignment for non-linked representation.
This representation is different from the previous by the way the memory is allocated and consequently referenced. The operating system provides a new node during the program execution only when it is needed. It avoids the waste of memory in previous representations forced by a need to run the same program for trees of different sizes. Logically the handling of pointers to nodes is not different from the last representation. Handout 8 shows the recursive store procedure. All eight procedures are similar to the corresponding eight procedures of the previous tree implementation. Two similar homework assignments can be given by adjusting the last two assignments.The study of a general binary tree will not be complete without the specification part of a package, which provides for a number of other operations on this tree. One of the main issues is whether such a package should include two operations: insert and delete. A general binary tree does not define a relationship "<" in the same way as a linear list; therefore specifications for these operations do not define them completely, leaving it to the users discretion.Note. Recursive programs in all representations of a tree rely on availability of a pointer type in a programming language. For the last representation, obviously, it is a must for each of eight procedures.The following consideration of trees includes special cases of general binary trees, for which insert and delete operations are very important. Having these operations (which have a different meaning for each special binary tree) placed in the list of specifications of a general binary tree underscores the idea of a general binary tree with some operations defined differently for different special cases. These specifications can be found in [1]. It should be noticed that no set of specifications fits all possible applications. This set can be viewed as a base for an extension when applications call for it.
Write a program with four non-recursive procedures for
1. Storingof a binary tree. The tree is to be implemented using non-linked representation. Your program should display the physical storage and each traversal. Test it for two following trees
2. Preorder Traversal
3. Inorder Traversal
4. Postorder Traversal
![]() |
![]() |
with the results submitted as part of your assignment.
(*A handout is also provided*)
Write a program with four recursive procedures for
5. Storingof a binary tree. The tree is to be implemented using non-linked representation. Your program should display the physical storage and each traversal. Test it for two following trees
6. Preorder Traversal
7. Inorder Traversal
8. Postorder Traversal
![]() |
with the results submitted as part of your assignment.
(*A handout is also provided*)
Non-Linked Implementation of A Binary Tree
(*Comments to homework assignment*)
When storing do not forget to initialize each A[k] := ‘-’; this tree has height = 3 and therefore size = 23 + 1 – 1 = 15 is needed to store it in A. As it was mentioned above size is an important parameter in traversal procedures. The test for a non-empty subtree has to contain this parameter.1. For a given binary tree of height h, the actual size of an array is the same as for a full binary tree of height h, namely,
size = 2h + 1 – 1. This parameter is used then to store all 2h + 1 – 1 nodes (existing or non-existing), even two children
of non-existing parents, despite the fact that an empty subtree does not have subtrees by definition.
(*This procedure is activated by a call from the main program with a statement : preorder(1);*)Procedure preorder (k : integer *index of a tree node in array);
BeginIf k <= size then if A[k] <> ‘-‘ thenend
(*Subtree with the root A[k] is not empty*)
beginvisit(A[k]); (*say, writeln (A[k]);*)end
preorder(2 * k);
preorder(2 * k + 1);
Preorder Traversal is used for storing a binary tree.
(Array implementation with integer pointers)
PreorderStore (Nonrecursive)
Identify the first available cell.
* Visit the root (By storing it in the first available cell)
Does the root have a Lchild?
If yes, put the pointer to the next available cell in the left pointer field of the root,
Define : Root := Lchild. Go to *
If no, put Nil (-1) in the left pointer field of the root (and go to **)
** Does the root have a Rchild?
If yes, put the address of the first available cell in the right pointer field of the root,Overflow case should be addressed also. A curious detail - the stack, discussed for Preorder algorithm earlier is maintained ("quietly") by blanks in the right pointer field. Addresses of nodes "marked" by blanks constitute therefore the stack. The top of the stack currently is an address of a node with the largest address so far and marked with a blank in right pointer field. The code for maintaining this stack has to be added.
Define : Root := Rchild. Go to *.
If no, put Nil (-1) in the same field and find the last occupied cell without a right pointer, define the node in this cell as the root and go to **. If there is no such cell the tree is stored.
Example of storing(constructing)
a tree using integer pointer array
implementation
in a Non-recursive algorithm
(see
Handout 6)
Given following tree, represent it in computer memory.
![]() |
First available address is 1, next address
is used for the next node in the traversal. A current stack of blanks is shown at the moment a new node is visited and decision is taken where to go next (when necessary the top of the stack is popped). A parent pointer can be used if needed. |
| Parent
Pointer |
Left Pointer | Node Info | Right Pointer | Current Stack
(Blanks in addresses) |
|
|
1
|
-1 | 2 | A | 6 | 1 Going left |
|
2
|
1 | 3 | B | 4 | 2,1 Going left |
|
3
|
2 | -1 | D | -1 | 1, going right |
|
4
|
2 | 5 | E | -1 | No change, going right |
|
5
|
4 | -1 | F | -1 | Empty, going right |
|
6
|
1 | -1 | C | 7 | No change, going right |
|
7
|
6 | 8 | G | 11 | 7, Going left |
|
8
|
7 | 9 | H | 10 | 8,7 Going left |
|
9
|
8 | -1 | I | -1 | 7, going right |
|
10
|
8 | -1 | J | -1 | Empty, going right |
|
11
|
7 | -1 | K | 12 | No change, going right |
|
12
|
11 | -1 | L | -1 | No change, going right |
Exercise this example and observe the behavior of the
"quiet" stack of blanks. In actual implementation an address of blanks
in a right pointer field is put onto the stack and popped off at the moment
in the algorithm when left pointer and right pointer both are –1
Recursive version of Preorder Store for storing a Btree
(*A call Preorder Store (Root) is expected*)
Procedure PreorderStore (tree : nodepointer);
(*new (root) has already been executed in the calling program*)
(*no initialization of its fields is done*)
(*this assumption is important: Root is a global VAR*)
VAR
answ : char;
p : nodepointer;
BeginIf tree <> nil thenEnd;
Begin(*first, existing Root I filled up with info*)PreorderStore(tree^.left);
writeln(‘Enter info.field’);
readln(tree^.info:5);(*next prepare a node for the left child*)
(*(if it exist), before calling PreorderStore*)
writeln(‘Does;, tree^.info:5,’ has left child’);
readln(answ);
if answ = ‘y’ thenBeginElsenew(p);End
tree.left := p;tree^.left ;= nil;
(*extended PreorderStore(tree^.left) ends*)(*finally prepare a node for a right child*)
writeln(‘Does’, tree.info:5, ‘has a right child’);
readln(answ);
if answ = ‘y’ thenBeginElsenew(p);End
tree^.right := p;Tree^.right := nil;PreorderStore(tree^.right);
(*extended PreorderStore(tree^.right) ends*)
Exercise
--Which of the following trees is a BST?
a.) b.) c.) d.)
You are writing a program that uses a binary tree. To make it easy to trace the structure, you implement pointers using an array of records. Consider the following binary tree:
|
|
![]() |
a. Indicate the contents of Root and Avail pointers and LeftChild and RightChild fields in the array Tree that follows:
Array Index Info Tree
LeftChildRight Child 1 C 2 R 3 G 4 F 5 X 6 Y 7 B 8 A
b. Show the contents of Root and Avail pointers and LeftChild and RightChild fields after a node containing J has been inserted as the left child of X and then after R has been deleted with G becoming the left child of C.2. How is a linked representation of a binary tree an improvement over its linear representation? In what ways could the linked representation be less efficient?
3. What is the output produced by the following procedure for the pictured tree?
(*An example of a right-oriented traversal*)
![]() |
Procedure TreeWalk({given }4. How does the output from Exercise 3 change if the statement
B : BinaryTree);Begin { TreeWalk }
If B = nil thenEnd; { TreeWalk }
Writeln(‘JUMP’)
ElseBeginEnd { else}
TreeWalk(B^.RightChild);
TreeWalk(B^.LeftChild);
Writeln(B^.Info);
Writeln(B^.Info) is moved ahead of the recursive calls to TreeWalk?
5. How does the output from Exercise 3 change if the statement
Writeln(B^.Info) is located between the recursive calls to TreeWalk?
Special Cases of Binary Trees
1. Heap Trees
An excellent and elegant treatment of this topic can be found in [2].2. Expression Trees
The structure of an arithmetic expression is similar to a structure of a binary tree. A recursive decomposition of a binary tree into a left subtree, root and right subtree corresponds to a recursive decomposition of an arithmetic expression into a left subexpression, lowest precedence operator and right subexpression. While the first decomposition leads to any of three traversals, the second leads to any of three forms of an arithmetic expression: prefix, infix and postfix. An expression tree can be constructed by applying the preorder traversal algorithm to the three elements of an expression decomposition. The lowest precedence operator is stored by the visit statement of this algorithm as the root. An operand subexpression results in a corresponding leaf subtree.
Consider an expression such as
z * (x + y) – t * s + p / (x – y).
When scanning the expression from left to right, the first lowest priority operator encountered is - .
It decomposes this expression into z * (x + y), - , and p / (x – y) and the tree under construction looks as follows
![]()
Applying the preorder traversal algorithm further to represent the sub-expressions results in
An expression with unary operators may be represented by an expression tree similarly.
What follows is an example of such expressions and construction of the corresponding tree.
a sin(x + y) – abs(cos(s * t + x))
For a unary operator’s operand representation the left subtree is chosen above. Choice of the right subtree is justified also. The similarity of the two concepts, binary trees and arithmetic expressions, is even more revealed in the following result
Preorder, inorder, postorder traversals of an expression tree yields prefix, infix, postfix form of the infix expression represented by the tree.
To show that this statement is valid for a preorder traversal, notice that the prefix form for an expression E = LE OP RE where OP is the lowest precedence operator, LE is the subexpression to the left of it, and RE is the subexpression to the right of it results in OP LE RE by the definition of the prefix form. The preorder traversal produces OP first. Since this decomposition is recursive and left-oriented (the left subexpressions brought to prefix form first), the preorder traversal has to yield the prefix form.
The decomposition E = LE OP RE (which is used to produce an expression tree) is the appropriate first step to find the last executed operator (lowest priority operator). The root of the expression tree OP is visited in an inorder traversal after the left subtree is traversed. That means that OP appears after all operators and operands in the left subtree traversal and before all operators and operands in the right subtree traversal. However, since parentheses are not used in expression trees, each subtree traversal has to be parenthesized. Since this decomposition is recursive and left-oriented, the inorder traversal has to yield the infix form.
The first step in producing a postfix expression is the expression decomposition:
E = (LE OP RE)infix = (LE RE OP)postfix .
The postorder traversal of the expression tree yields OP last, as the postfix form does. Recursively applying this decomposition to LE and then to RE yields the postfix form. The same result is achieved by recursively applying the postorder traversal to the expression tree.
Expression trees is a topic suitable for a final project.
Return to Table of Contents
BST is the most widely used special binary tree. It may be explained by the fact that BST’s are often used as means of implementation (storing in computer memory) of an ordered list. Each of three implementations of an ordered list (array, linked, linked with integer pointers) has its use but this fourth implementation is by far the most efficient.
Being used mainly as an implementation of an ordered list, it should not come as a surprise that a BST most often makes use of the inorder traversal. As with an ordered list, search, insertion and deletion are the three most important operations on a BST (notice that both insertion and deletion require search as part of their execution – the same is true for an ordered list). A BST package consists of the same operations as an ordered list package, since both packages implement a necessary set of operations for an ordered sequence of objects. However, as data structures they are distinct. The three operations are defined differently for these two structures.
- Define a BST
- Operations <, =, > have to be defined for a set of objects represented by a BST (as they would have to for an ordered list)
- Inorder traversal of a BST results in an ordered list with respect to the same operations. This follows from both the definition of a BST and the definition of this traversal. Therefore it may be used for sorting the data represented by a BST (tree sort)
- Insertion and deletion on a BST are defined by the requirement that the resulting tree has to be a BST
- No need for a special procedure STORE, since repeated insertions are used for storing a BST
- When performing insertions and deletions in a BST represented by an array with integer pointers, a variable root is maintained (along with avail), since the root may end up in any cell after repeated insertions and deletions
- BST requires more memory in two respects. First, each node has two pointers instead of the one in a list. From 2n pointers required in representation of a tree, only n - 1 pointers are used as links to existing nodes (as in an ordered list); the remaining n + 1 pointers are nil pointers. Second, the recursive nature of the algorithms requires a substantial amount of overhead to maintain the stack (even in a language which supports recursion)
- The O(log2n) efficiency of a BST implementation of an ordered list is not guaranteed (it is the best case). It is contingent upon the trees remaining nearly full. The tree’s remaining nearly full is, in turn, contingent upon the order in which data are added and deleted; in the worst case, data entering the BST in the wrong order can cause the tree to degenerate into a list (with O(n) efficiency)
- These drawbacks can be overcome. The overhead associated with recursion can be avoided by using threading, which puts to good use the n + 1 pointers that are otherwise wasted as nil. Using height balancing a BST may be maintained in a fashion that approaches fullness at all times, regardless of the order in which data arrives for insertions and deletions. Threading and height balancing add a considerable measure of complexity to the BST algorithms
- All three implementations may be used for a BST; relatively less often encountered non-linked representation is very efficient, if height balancing is implemented (for full binary trees this is the best representation)
The following pages provide examples of constructing a BST, insertions into a BST and deletions from a BST. Handout 9 illustrates threading. Each traversal requires a different threading and threading can be used with any binary tree (not just with a BST). Handout 10 demonstrates insertion into an AVL tree. The main consideration when designing this handout was to use minimal amount of detail in illustrating the exact mechanics of balancing. Deletion from an AVL tree may be used as a topic for a final project
The input order determines the shape of a BST
Tree
(a)Input : D B F A C E G
Tree
(b)Input : B A D C G F E
Tree
(c)Input : A B C D E F G
There are 7! = 720 ways to enter this set of 7 characters. The number of different BST’s is less, since different permutations of the characters may lead to the same BST (Try to find all BST’s for a 3-character input A, B, C.)
Insertions into a Binary Search Tree
a. Tree
(b) Insert 5 (c) Insert 9 (d) Insert 7 (e) Insert 3 (f) Insert 8 (g) Insert 12 (f) Insert 4 (j) Insert 20
Deleting from a BST
|
Deleting a Root Node with No Children |
![]() |
![]() ![]() |
|
| Deleting a Root Node
with One Child |
![]() |
![]() ![]() |
|
| The Logical H
Predecessors of Nodes with Two Children E S |
![]() |
![]() |
|
| Deleting a Root Node
with Two Children |
![]() |
![]() |
|
![]() |
|||
Threading for inorder
traversal. (Threading has to be changed for another traversal.) Threading
may be used for a general binary tree as well as for a special binary tree.
Inorder traversal of this tree produces an infix expression.
BST is the most efficient implementation of an ordered list.
The importance of a binary search tree structure for representing
(one can use the terms implementing or storing instead) an ordered list
follows from the analysis table below
Table : Four implementations of an ordered list
| Implementation | Search
(update) |
Insertion
Search is a part |
Deletion
Search is a part |
Traversal |
| Array | O(log2n) | O(log2n) search
and involves O(n) moves |
O(log2n) search and
involves
O(n) moves |
O(n) |
| Linked List | O(n) | O(n) | O(n) search | O(n) |
| Linked List with Integer Pointers | O(n) | O(n) | O(n) search | O(n) |
| Binary Search Tree | O(log2n) | O(log2n) | O(log2n) search | O(n) |
A list implementation is a better choice than an array
implementation, since a move of a record takes substantially more time
than a comparison in a search. A binary search tree can be considered as
a powerful enhancement of a linked implementation replacing O(n) by O(log2n)
for search, insertion and deletion. At the same time, it may be considered
as an alternative to a quick-at-search array implementation free from moves
when inserting or deleting.
General Trees
Definition (recursive). A general tree is a set of nodes
that is either empty (the recursive terminating condition) or has a designated
node, called the root, from which zero or more subtrees descend. Each subtree
itself satisfies the definition of a tree. Moreover, the collection of
subtrees of the root is ordered in that there is a firstsubtree, second
subtree, and so forth.
This is an example of a recursive definition when an object is defined by similar objects each of which is a part of it (it is similar to the definition
n! = n * (n – 1)!.) It is a more exact way to define trees than the description of a general tree in the introduction.
The applications of general trees are so varied in nature
that it is impossible to provide a set of specifications suitable for all
situations. As with binary trees the usual approach is to include only
a small set of generic operations which may serve as a base for further
extensions. The operations closely parallel those for binary trees. The
main difference occurs in the traversal strategies, where only preorder
and postorder make sense in the context of a general tree.
Implementation of a General Tree
In a general tree a node may have any number of children,
making the implementation more complex than for a binary tree. Since a
fixed maximum for the number of children for each node is a must, the memory
space taken up by nil pointers is too large. It is only effective in applications
where the realistic limit on the number of children a tree node may have
is known.
Binary Tree Representation of General Trees
This implementation requires that each node have only
two pointer fields. The first pointer points to the left most child of
the node, and the second specifies the next sibling to the right. The children
form an ordered list, and the left most child of a node is regarded as
First_child and the sibling to the right as Sibling. The preorder traversal
may be used to store this tree in an array with integer pointers, similarly
as it has been done for binary trees.
Tertiary Tree Representation of General Trees
In this representation the info field of each node is
supplemented with 3 pointers, Left_sibling, Children, and Right_sibling:
|
|
|
|
|
Left_sibling is either nil or points to a node whose data
precedes that of the given node in the ordered list of siblings at this
level. Right_sibling is either nil or points to a node whose data equals
or follows that of the given node in the ordered list of siblings at this
level. Children is a pointer to the ordered list of children of a given
node (no < or >= requirement here). It is like a binary search tree
in the way in which left and right pointers are defined there. Tertiary
tree representation of a general tree is used in data base management.
2 – 3 Trees
This tree is a further enhancement of an ordered list
representation. A Binary search tree is very effective in doing this; it
guarantees O(log2n) search efficiency. The constant of proportionality
involved in this big – O efficiency figure is approximately 1.5, that is,
the number of comparisons to search is never greater than 1.5n. A 2 – 3
tree can improve upon this efficiency of a height-balanced BST by guaranteeing
a search path that never exceeds log2n + 1. That is, it matches
the efficiency of a full binary tree.
The UNION-FIND Problem Tree
This is another important application of a general binary tree. The problem deals with the following. Given a domain of elements and the partition of the domain into disjoint subsets, for any two elements of the domain
Such a problem may arise in large, complex transportation networks. Both operations can be implemented by representing each set as a general tree.
- Find whether or not they belong to the same subset
- Form the union of the sets that contain these two elements
B-Trees
Additional areas of tree applications include security problems, artificial intelligence, parallel processing, optimization, and conceptual trees for games (such as chess). The list goes on and on.
- Potential search efficiency O(log2n)
- The ability to traverse the list indexed by the tree in key order
- Dynamic insertion and deletion are qualities that make a binary tree the ideal index structure for situations in which the entire tree can fit in main memory. However, if the data collection is so large that the tree index must itself be stored on disk, the efficiency of the structure is less than optimal. This is true because each node of the index may be in a different disk block and hence require a separate disk access. For example, with 50,000 keys a search of a binary tree index could require 16 disk accesses. To solve this problem it is desirable to cluster the nodes along a given search path into one, or at least few, disk blocks.
Return to Table of Contents[1] Naps, Thomas, Introduction to Data Structures and Algorithm Analysis. (Second Edition). West Publishing Co., 1992.[2] Dale, Nell, Susan Lilly, and John McCormick. Ada, Plus Data Structures. D.C. Health & Co., 1996.