-------------------------------
Algorithm Complexity (continued)
-------------------------------

Omega and O: Lower bounds vs. upper bounds:
- Recall T(n) = maximum running time of an algorithm over all inputs of
  size n, so proving asymptotic bounds on T(n) requires care.
- To prove an upper bound (i.e., T(n) is O(f(n))), we must show that
  there exist specific constants n_0, c such that for all n >= n_0, the
  algorithm requires <= c f(n) steps on **every** input of size n.
- To prove a lower bound (i.e., T(n) is Omega(g(n))), we only need to
  show that there exist constants n_0, c such that for all n >= n_0, the
  algorithm takes >= c g(n) steps on **some** input of size n.
- Example: linear search.
  . T(n) is O(n) because on every input of size n, the algorithm examines
    each element at most once, performing a constant number of steps per
    element;
  . T(n) is Omega(n) because when the value searched for is not in the
    list, the algorithm examines each element at least once, performing a
    constant amount of work per element.

-------------------------------------------------
Time Complexity of Recursive Programs (Chapter 4)
-------------------------------------------------

Recurrence relations:
- Example 1: factorial

      Fact(n)
          if n <= 1 then
              return 1;
          else
              return n * Fact(n-1);
          end if
      End

  The worst-case running time of factorial satisfies the recurrence:

             { 2            if n <= 1,
      T(n) = {
             { 3 + T(n-1)   if n > 1.

  How can we solve for T(n) to get a closed form?

  1. Perform repeated substitutions to get a "guess":

         T(n) = 3 + T(n-1)
              = 3 + 3 + T(n-2)
              = 3 + 3 + 3 + T(n-3)

     There's a pattern here: it looks like after i substitutions, we get

              = 3i + T(n-i).

     For now, this is *not* a proof, just an informal guess. We reach a
     base case when n-i <= 1, i.e., i >= n-1, so substituting i = n-1 in
     our guess gives:

              = 3(n-1) + T(n-(n-1)) = 3n - 3 + 2 = 3n - 1.

  2. Prove our guess by induction. Formally, we prove that T(n) = 3n-1
     for all n >= 1.
       Base Case: n = 1. By def'n, T(n) = 2 = 3-1 = 3(1)-1.
       Ind. Hyp.: Let k >= 1 be arbitrary and assume T(j) = 3j-1 for
                  1 <= j <= k.
       Ind. Step: Since k >= 1, k+1 >= 2, so
                    T(k+1) = 3 + T(k)    (by the recurrence, since k+1 >= 2)
                           = 3 + (3k-1)  (by the ind. hyp.)
                           = 3(k+1) - 1.
     Hence, by induction, T(n) = 3n-1 for all n >= 1. This means that
     T(n) is Theta(n). (The sketch just below checks this closed form
     numerically.)
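  As a quick sanity check (not part of the formal proof above), here is a
  minimal Python sketch that charges steps exactly as the recurrence
  does: 2 steps when n <= 1, and 3 steps plus the recursive cost when
  n > 1. The function name fact_steps is illustrative, not from the
  notes.

      def fact_steps(n):
          """Return (Fact(n), steps), charging steps as in T(n)."""
          if n <= 1:
              return 1, 2                # base case: charged 2 steps
          result, steps = fact_steps(n - 1)
          return n * result, 3 + steps   # recursive case: 3 + T(n-1)

      if __name__ == "__main__":
          for n in range(1, 11):
              _, steps = fact_steps(n)
              assert steps == 3 * n - 1, (n, steps)
          print("T(n) = 3n - 1 holds for n = 1..10")

  Running it confirms the closed form T(n) = 3n - 1 on small inputs,
  which is exactly what the induction proof establishes for all n >= 1.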
- Example 2: RecBinSearch. The recurrence for the worst-case running time
  of RecBinSearch is:

             { 4                                            if n = 1,
      T(n) = {
             { 7 + max { T(ceil(n/2)), T(floor(n/2)) + 1 }  if n > 1.

  1. Repeated substitutions. Since this is used only to get a guess, we
     are allowed to make simplifying assumptions:

         T(n) = 7 + T(n/2)       (roughly, ignoring rounding, etc.)
              = 7 + 7 + T(n/4) = 7*2 + T(n/4)
              = 7*3 + T(n/8)

     After i substitutions, we expect to get T(n) = 7i + T(n/2^i). This
     is just a guess, but we can check that we guessed the pattern
     correctly by doing one more step:

         T(n) = 7i + 7 + T((n/2^i)/2) = 7(i+1) + T(n/2^{i+1}).

     Since this works, we figure out for what value of i we reach a base
     case:

         n/2^i <= 1  <==>  n <= 2^i  <==>  log_2 n <= i,

     so substituting i = log_2 n into the guess, we get:

         T(n) = 7 log_2 n + T(1) = 7 log_2 n + 4.

  2. Because of the simplifying assumptions, this is not the exact
     solution. But we can prove T(n) is Theta(log n) by proving separate
     upper and lower bounds.

     (a) We prove T(n) is O(log_2 n) by proving T(n) <= 20 log_2 n + 4
         for all n >= 1.
           Base Case: T(1) = 4 <= 0 + 4 = 20 log_2 1 + 4.
           Ind. Hyp.: For some k >= 1, assume T(j) <= 20 log_2 j + 4 for
                      all 1 <= j <= k.
           Ind. Step: Since k >= 1, k+1 >= 2, so
               T(k+1) = 7 + max { T(ceil((k+1)/2)), T(floor((k+1)/2)) + 1 }.
             Since floor((k+1)/2) <= ceil((k+1)/2) <= k, we can apply the
             ind. hyp. to get:
               T(k+1) <= 7 + max { 20 log_2(ceil((k+1)/2)) + 4,
                                   20 log_2(floor((k+1)/2)) + 4 + 1 }.
             Now, we consider two cases.
             Case 1: if k+1 is even, then
                 log_2(ceil((k+1)/2)) = log_2(floor((k+1)/2)) = log_2((k+1)/2),
               so:
                 T(k+1) <= 7 + 20 log_2((k+1)/2) + 4 + 1
                         = 20 (log_2(k+1) - log_2 2) + 8 + 4
                         = 20 log_2(k+1) - 20 + 8 + 4
                        <= 20 log_2(k+1) + 4,   as wanted.
             Case 2: if k+1 is odd, then ceil((k+1)/2) = (k+2)/2 and
               floor((k+1)/2) = k/2. Since k+1 >= 2, 1/(k+1) <= 1/2 and
               1 + 1/(k+1) <= 1.5, so (k+2)/(k+1) <= 1.5 and
               log_2((k+2)/(k+1)) <= log_2 1.5 <= 13/20. Therefore
               20 (log_2(k+2) - log_2(k+1)) <= 13, i.e.,
               20 log_2(k+2) - 13 <= 20 log_2(k+1).
               If the first term in the max is larger, then:
                 T(k+1) <= 7 + 20 log_2((k+2)/2) + 4
                         = 7 + 20 log_2(k+2) - 20 log_2 2 + 4
                         = 20 log_2(k+2) - 13 + 4
                        <= 20 log_2(k+1) + 4,   as wanted.
               If the second term in the max is larger, then:
                 T(k+1) <= 7 + 20 log_2(k/2) + 1 + 4
                         = 7 + 20 log_2 k - 20 log_2 2 + 1 + 4
                         = 20 log_2 k - 12 + 4
                        <= 20 log_2(k+1) + 4,   as wanted.
         Hence, by induction, T(n) <= 20 log_2 n + 4 for all n >= 1,
         i.e., T(n) is O(log_2 n).

     (b) We prove T(n) is Omega(log_2 n) by proving T(n) >= 7 log_2 n
         for all n >= 1.
           Base Case: T(1) = 4 >= 0 = 7 log_2 1.
           Ind. Hyp.: For some k >= 1, assume T(j) >= 7 log_2 j for
                      1 <= j <= k.
           Ind. Step: Since k >= 1, k+1 >= 2; also ceil((k+1)/2) <= k, so
             the ind. hyp. applies to it:
               T(k+1) = 7 + max { T(ceil((k+1)/2)), T(floor((k+1)/2)) + 1 }
                      >= 7 + T(ceil((k+1)/2))        (by properties of max)
                      >= 7 + 7 log_2(ceil((k+1)/2))  (by the ind. hyp.)
                      >= 7 + 7 (log_2(k+1) - log_2 2)
                       = 7 log_2(k+1)
             (where the second-last inequality holds because ceil(x) >= x
             for all real x, so log_2(ceil((k+1)/2)) >= log_2((k+1)/2) =
             log_2(k+1) - log_2 2).
         Hence, by induction, T(n) >= 7 log_2 n for all n >= 1, i.e.,
         T(n) is Omega(log_2 n).

     Therefore, T(n) is Theta(log n).

General divide-and-conquer recurrences:
- Many algorithms are written using the "divide-and-conquer" technique:
  split up the problem, solve the subproblems recursively, and combine
  the solutions. Worst-case running times of such algorithms satisfy
  recurrences of the form:

             { K                                            if n < b/(b-1),
      T(n) = {
             { a_1 T(ceil(n/b)) + a_2 T(floor(n/b)) + f(n)  if n >= b/(b-1)

  (for some constants K > 0, a_1, a_2 >= 0, and b > 1).
- Master Theorem: if f(n) = c n^d for constants c, d >= 0, then

             { Theta(n^d)          if c > 0 and a < b^d,
      T(n) = { Theta(n^d log n)    if c > 0 and a = b^d,
             { Theta(n^{log_b a})  if c = 0 or a > b^d

  (where a = a_1 + a_2).
- Example:

      MergeSort(A, b, e)
          if b < e then
              m := (b + e) div 2;
              MergeSort(A, b, m);
              MergeSort(A, m+1, e);
              Merge(A, b, m, e);
          end if
      End

  where "Merge(A, b, m, e)" performs a merge of the two sorted subarrays
  A[b..m] and A[m+1..e] back into A[b..e]. The worst-case running time of
  Merge is Theta(size of the subarray A[b..e]). To simplify the equation
  a little, assume this means the worst-case running time of Merge is cn.
  Then the worst-case running time of mergesort satisfies the recurrence:

             { 1                                       if n = 1,
      T(n) = {
             { T(ceil(n/2)) + T(floor(n/2)) + cn + 5   if n > 1.

  So by the Master Theorem (with a = 2, b = 2, c > 0, d = 1, and noting
  that the constant +5 is a lower-order term that does not change the
  answer), T(n) is Theta(n log n). (A runnable sketch of MergeSort
  appears below.)
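  Since the notes leave Merge unspecified, here is a minimal runnable
  Python sketch of the pseudocode above; the merge helper shown is one
  reasonable implementation, merging A[b..m] and A[m+1..e] back into
  A[b..e] in time proportional to e - b + 1.

      def merge(A, b, m, e):
          """Merge sorted subarrays A[b..m] and A[m+1..e] into A[b..e]."""
          left, right = A[b:m+1], A[m+1:e+1]   # copy the two sorted halves
          i = j = 0
          for k in range(b, e + 1):            # write back the smaller front element
              if j >= len(right) or (i < len(left) and left[i] <= right[j]):
                  A[k] = left[i]; i += 1
              else:
                  A[k] = right[j]; j += 1

      def merge_sort(A, b, e):
          if b < e:
              m = (b + e) // 2         # m := (b + e) div 2
              merge_sort(A, b, m)      # sort A[b..m]
              merge_sort(A, m + 1, e)  # sort A[m+1..e]
              merge(A, b, m, e)        # merge the sorted halves

      if __name__ == "__main__":
          A = [5, 2, 4, 7, 1, 3, 2, 6]
          merge_sort(A, 0, len(A) - 1)
          print(A)   # [1, 2, 2, 3, 4, 5, 6, 7]

  Using left[i] <= right[j] (rather than <) keeps the sort stable, and
  the cost of merge is linear in the size of A[b..e], matching the cn
  term in the recurrence.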