-------------------------------
Algorithm Complexity (continued)
-------------------------------

Omega and O: Lower bounds vs. upper bounds:
- Recall T(n) = maximum running time of an algorithm over all inputs of
  size n, so proving asymptotic bounds on T(n) requires care.
- To prove an upper bound (i.e., T(n) is O(f(n))), we must show that
  there exist specific constants n_0, c such that for all n >= n_0, the
  algorithm requires <= c f(n) steps on **every** input of size n.
- To prove a lower bound (i.e., T(n) is Omega(g(n))), we only need to
  show that there exist constants n_0, c such that for all n >= n_0, the
  algorithm takes >= c g(n) steps on **some** input of size n.
- Example: linear search.
  . T(n) is O(n) because on every input of size n, the algorithm examines
    each element at most once, performing a constant number of steps per
    element;
  . T(n) is Omega(n) because when the value searched for is not in the
    list, the algorithm examines each element at least once, performing a
    constant amount of work per element.

-------------------------------------------------
Time Complexity of Recursive Programs (Chapter 4)
-------------------------------------------------

Recurrence relations:
- Example 1: factorial

      Fact(n)
          if n <= 1 then
              return 1;
          else
              return n * Fact(n-1);
          end if
      End

  The worst-case running time of factorial satisfies the recurrence:

             { 2            if n <= 1,
      T(n) = {
             { 3 + T(n-1)   if n > 1.

  How can we solve for T(n) to get a closed form?

  1. Perform repeated substitutions to get a "guess":

         T(n) = 3 + T(n-1)
              = 3 + 3 + T(n-2)
              = 3 + 3 + 3 + T(n-3)

     There's a pattern here: it looks like after i substitutions, we get

              = 3i + T(n-i).

     For now, this is *not* a proof, just an informal guess. We reach a
     base case when n-i <= 1, i.e., i >= n-1, so substituting i = n-1 in
     our guess gives:

              = 3(n-1) + T(n-(n-1)) = 3n - 3 + 2 = 3n - 1.

  2. Prove our guess by induction. Formally, we prove that T(n) = 3n-1
     for all n >= 1.
       Base Case: n = 1. By def'n, T(n) = 2 = 3-1 = 3(1)-1.
       Ind. Hyp.: Let k >= 1 be arbitrary and assume T(j) = 3j-1 for
                  1 <= j <= k.
       Ind. Step: Since k >= 1, k+1 >= 2, so
                    T(k+1) = 3 + T(k)    (by the recurrence, since k+1 >= 2)
                           = 3 + (3k-1)  (by the ind. hyp.)
                           = 3(k+1) - 1.
     Hence, by induction, T(n) = 3n-1 for all n >= 1. This means that
     T(n) is Theta(n). (The sketch just below checks this closed form
     numerically.)
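  As a quick sanity check (not part of the formal proof above), here is a
  minimal Python sketch that charges steps exactly as the recurrence
  does: 2 steps when n <= 1, and 3 steps plus the recursive cost when
  n > 1. The function name fact_steps is illustrative, not from the
  notes.

      def fact_steps(n):
          """Return (Fact(n), steps), charging steps as in T(n)."""
          if n <= 1:
              return 1, 2                # base case: charged 2 steps
          result, steps = fact_steps(n - 1)
          return n * result, 3 + steps   # recursive case: 3 + T(n-1)

      if __name__ == "__main__":
          for n in range(1, 11):
              _, steps = fact_steps(n)
              assert steps == 3 * n - 1, (n, steps)
          print("T(n) = 3n - 1 holds for n = 1..10")

  Running it confirms the closed form T(n) = 3n - 1 on small inputs,
  which is exactly what the induction proof establishes for all n >= 1.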
- Example 2: RecBinSearch. The recurrence for the worst-case running time
  of RecBinSearch is:

             { 4                                            if n = 1,
      T(n) = {
             { 7 + max { T(ceil(n/2)), T(floor(n/2)) + 1 }  if n > 1.

  1. Repeated substitutions. Since this is used only to get a guess, we
     are allowed to make simplifying assumptions:

         T(n) = 7 + T(n/2)       (roughly, ignoring rounding, etc.)
              = 7 + 7 + T(n/4) = 7*2 + T(n/4)
              = 7*3 + T(n/8)

     After i substitutions, we expect to get T(n) = 7i + T(n/2^i). This
     is just a guess, but we can check that we guessed the pattern
     correctly by doing one more step:

         T(n) = 7i + 7 + T((n/2^i)/2) = 7(i+1) + T(n/2^{i+1}).

     Since this works, we figure out for what value of i we reach a base
     case:

         n/2^i <= 1  <==>  n <= 2^i  <==>  log_2 n <= i,

     so substituting i = log_2 n into the guess, we get:

         T(n) = 7 log_2 n + T(1) = 7 log_2 n + 4.

  2. Because of the simplifying assumptions, this is not the exact
     solution. But we can prove T(n) is Theta(log n) by proving separate
     upper and lower bounds.

     (a) We prove T(n) is O(log_2 n) by proving T(n) <= 20 log_2 n + 4
         for all n >= 1.
           Base Case: T(1) = 4 <= 0 + 4 = 20 log_2 1 + 4.
           Ind. Hyp.: For some k >= 1, assume T(j) <= 20 log_2 j + 4 for
                      all 1 <= j <= k.
           Ind. Step: Since k >= 1, k+1 >= 2, so
               T(k+1) = 7 + max { T(ceil((k+1)/2)), T(floor((k+1)/2)) + 1 }.
             Since floor((k+1)/2) <= ceil((k+1)/2) <= k, we can apply the
             ind. hyp. to get:
               T(k+1) <= 7 + max { 20 log_2(ceil((k+1)/2)) + 4,
                                   20 log_2(floor((k+1)/2)) + 4 + 1 }.
             Now, we consider two cases.
             Case 1: if k+1 is even, then
                 log_2(ceil((k+1)/2)) = log_2(floor((k+1)/2)) = log_2((k+1)/2),
               so:
                 T(k+1) <= 7 + 20 log_2((k+1)/2) + 4 + 1
                         = 20 (log_2(k+1) - log_2 2) + 8 + 4
                         = 20 log_2(k+1) - 20 + 8 + 4
                        <= 20 log_2(k+1) + 4,   as wanted.
             Case 2: if k+1 is odd, then ceil((k+1)/2) = (k+2)/2 and
               floor((k+1)/2) = k/2. Since k+1 >= 2, 1/(k+1) <= 1/2 and
               1 + 1/(k+1) <= 1.5, so (k+2)/(k+1) <= 1.5 and
               log_2((k+2)/(k+1)) <= log_2 1.5 <= 13/20. Therefore
               20 (log_2(k+2) - log_2(k+1)) <= 13, i.e.,
               20 log_2(k+2) - 13 <= 20 log_2(k+1).
               If the first term in the max is larger, then:
                 T(k+1) <= 7 + 20 log_2((k+2)/2) + 4
                         = 7 + 20 log_2(k+2) - 20 log_2 2 + 4
                         = 20 log_2(k+2) - 13 + 4
                        <= 20 log_2(k+1) + 4,   as wanted.
               If the second term in the max is larger, then:
                 T(k+1) <= 7 + 20 log_2(k/2) + 1 + 4
                         = 7 + 20 log_2 k - 20 log_2 2 + 1 + 4
                         = 20 log_2 k - 12 + 4
                        <= 20 log_2(k+1) + 4,   as wanted.
         Hence, by induction, T(n) <= 20 log_2 n + 4 for all n >= 1,
         i.e., T(n) is O(log_2 n).

     (b) We prove T(n) is Omega(log_2 n) by proving T(n) >= 7 log_2 n
         for all n >= 1.
           Base Case: T(1) = 4 >= 0 = 7 log_2 1.
           Ind. Hyp.: For some k >= 1, assume T(j) >= 7 log_2 j for
                      1 <= j <= k.
           Ind. Step: Since k >= 1, k+1 >= 2; also ceil((k+1)/2) <= k, so
             the ind. hyp. applies to it:
               T(k+1) = 7 + max { T(ceil((k+1)/2)), T(floor((k+1)/2)) + 1 }
                      >= 7 + T(ceil((k+1)/2))        (by properties of max)
                      >= 7 + 7 log_2(ceil((k+1)/2))  (by the ind. hyp.)
                      >= 7 + 7 (log_2(k+1) - log_2 2)
                       = 7 log_2(k+1)
             (where the second-last inequality holds because ceil(x) >= x
             for all real x, so log_2(ceil((k+1)/2)) >= log_2((k+1)/2) =
             log_2(k+1) - log_2 2).
         Hence, by induction, T(n) >= 7 log_2 n for all n >= 1, i.e.,
         T(n) is Omega(log_2 n).

     Therefore, T(n) is Theta(log n).

General divide-and-conquer recurrences:
- Many algorithms are written using the "divide-and-conquer" technique:
  split up the problem, solve the subproblems recursively, and combine
  the solutions. Worst-case running times of such algorithms satisfy
  recurrences of the form:

             { K                                            if n < b/(b-1),
      T(n) = {
             { a_1 T(ceil(n/b)) + a_2 T(floor(n/b)) + f(n)  if n >= b/(b-1)

  (for some constants K > 0, a_1, a_2 >= 0, and b > 1).
- Master Theorem: if f(n) = c n^d for constants c, d >= 0, then

             { Theta(n^d)          if c > 0 and a < b^d,
      T(n) = { Theta(n^d log n)    if c > 0 and a = b^d,
             { Theta(n^{log_b a})  if c = 0 or a > b^d

  (where a = a_1 + a_2).
- Example:

      MergeSort(A, b, e)
          if b < e then
              m := (b + e) div 2;
              MergeSort(A, b, m);
              MergeSort(A, m+1, e);
              Merge(A, b, m, e);
          end if
      End

  where "Merge(A, b, m, e)" performs a merge of the two sorted subarrays
  A[b..m] and A[m+1..e] back into A[b..e]. The worst-case running time of
  Merge is Theta(size of the subarray A[b..e]). To simplify the equation
  a little, assume this means the worst-case running time of Merge is cn.
  Then the worst-case running time of mergesort satisfies the recurrence:

             { 1                                       if n = 1,
      T(n) = {
             { T(ceil(n/2)) + T(floor(n/2)) + cn + 5   if n > 1.

  So by the Master Theorem (with a = 2, b = 2, c > 0, d = 1, and noting
  that the constant +5 is a lower-order term that does not change the
  answer), T(n) is Theta(n log n). (A runnable sketch of MergeSort
  appears below.)
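  Since the notes leave Merge unspecified, here is a minimal runnable
  Python sketch of the pseudocode above; the merge helper shown is one
  reasonable implementation, merging A[b..m] and A[m+1..e] back into
  A[b..e] in time proportional to e - b + 1.

      def merge(A, b, m, e):
          """Merge sorted subarrays A[b..m] and A[m+1..e] into A[b..e]."""
          left, right = A[b:m+1], A[m+1:e+1]   # copy the two sorted halves
          i = j = 0
          for k in range(b, e + 1):            # write back the smaller front element
              if j >= len(right) or (i < len(left) and left[i] <= right[j]):
                  A[k] = left[i]; i += 1
              else:
                  A[k] = right[j]; j += 1

      def merge_sort(A, b, e):
          if b < e:
              m = (b + e) // 2         # m := (b + e) div 2
              merge_sort(A, b, m)      # sort A[b..m]
              merge_sort(A, m + 1, e)  # sort A[m+1..e]
              merge(A, b, m, e)        # merge the sorted halves

      if __name__ == "__main__":
          A = [5, 2, 4, 7, 1, 3, 2, 6]
          merge_sort(A, 0, len(A) - 1)
          print(A)   # [1, 2, 2, 3, 4, 5, 6, 7]

  Using left[i] <= right[j] (rather than <) keeps the sort stable, and
  the cost of merge is linear in the size of A[b..e], matching the cn
  term in the recurrence.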