To finish off the subject of binary search trees, I would like to return to the issue that originally motivated them - we wanted a data structure that could be searched in logN time, where N is the number of values in the data structure. Do BSTs actually meet this requirement? In other words, Is it true that I can find any value X in any BST in logN time? What do you think? Yes, no?

Well, while you think about that, here is something that *is*
certainly true. The time to search in a BST is definitely limited by
the *height* (or depth) of the tree. Each step in the search
goes down one level, so in the absolute worst case, we will have to go
all the way from the root to the deepest leaf in order to find X, or
to find out that X is not in the tree. So we can say with certainty
that search is O(Height).

But we want to know, is search O(logN)? Well, since we know search is O(Height), this is really asking is Height O(logN) ... is it always the case that the height of a BST is proportional to the logarithm of the # of nodes?

Unfortunately, the answer to all the questions is *no*.
There are BSTs in which Height and # of Nodes are *equal*! In
these trees, searching is O(N), linear time, not log time. It's
obvious once I draw one:

This is called a Degenerate tree, or linear tree. In it, no node has two subtrees, every node has 1 (except the lone leaf who has 0). Obviously this is nothing more than a sorted

So, aren't we are back where we started? Not quite. Most Binary Search Trees are not degenerate: in most of them Height is proportional to LOG(# nodes), and we get efficient search.

For example, the book cites a theorem (page 267) that if you start
with an empty BST and add to it N randomly generated values, the
expected Height of the tree is 1.4(log(N+1)). This is good news. It
basically says that you only get a degenerate BST if you insert values
in a *systematic* way. So, whenever you build a BST make sure
you do not insert the values in increasing or decreasing order. Or if
you find yourself with a degenerate tree it is worthwhile to rebuild
it by removing all its values and re-inserting them in random order.

There is another approach to keeping Height related to log(# nodes), based on the observation that in all degenerate BSTs, there is a node that has one subtree that is much deeper than the other ... in our example, the right subtree of `1' is much taller than the left subtree. What would happen if we absolutely required that the subtrees be the same height (within 1)?

**Answer:** *Height would always be O(log(# nodes)), so search
would always be O(logN).*

Trees with this property are called *Balanced Trees*, and
that's our next topic.