Algorithm Analysis


This is a very brief introduction to algorithm analysis. The topic will reappear in greater detail in Data Structures. For the purposes of this class, it is sufficient to understand how to determine the big-oh running time of an algorithm and how to compare the running times of two algorithms. We will use these techniques extensively to discuss the tradeoffs between random access arrays and sequential-access linked lists.

Running Time

The efficiency of a program is arguably as important as its correctness. In fact, it is often the case that some correctness (or completeness) is sacrificed for efficiency. As an example, consider Google or any other web service. For these services, response time is absolutely critical. Google, for example, often responds to a query in 1/10 or 1/5 of a second. Would you be willing to wait 30 seconds or 1 minute to receive 2,000,000,000 search results instead of 2,000,000 search results?

Measuring the time to complete a program is one way of analyzing its efficiency. However, this method does not enable you to reason about whether the efficiency can be improved, nor does it give you any insight into how the program will perform if run on different hardware or under different conditions. Big-oh notation provides a way to formally discuss the number of steps an algorithm will require in the worst case.

Big-Oh

Formally,

f(n) is O(g(n)) if there exist a real constant c > 0 and an integer constant n0 >= 1 such that f(n) <= c·g(n) for every integer n >= n0.

Example: 2n + 2 is O(n). (Take c = 4 and n0 = 1; then 2n + 2 <= 4n for every integer n >= 1.)

Essentially, the formal definition tells us that the running time of an algorithm is primarily dependent on the size of the problem being solved. In this case, n is the size of the problem the algorithm is solving. For example, if the problem is to determine whether a particular target value appears in a list of numbers, n is the size of the list of numbers we wish to search. Suppose we implement an algorithm that prints 2 statements, searches a list of 1 million numbers, and then prints 2 more statements. The time to complete up to 1 million comparisons will far outweigh the time to print 4 statements.
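
To make this concrete, here is a sketch in C++ (the function name contains and the print messages are ours, purely for illustration). The loop performs up to n comparisons, which dominates the four constant-time prints, so the function is O(n) overall:

    #include <iostream>
    #include <vector>

    // Prints 2 statements, scans the list, then prints 2 more statements.
    // The scan performs up to list.size() comparisons and dominates the
    // four O(1) prints, so the function is O(n) overall.
    bool contains(const std::vector<int>& list, int target) {
        std::cout << "starting search\n";              // O(1)
        std::cout << "size: " << list.size() << "\n";  // O(1)
        bool found = false;
        for (int value : list) {                       // up to n iterations
            if (value == target) {
                found = true;
                break;
            }
        }
        std::cout << "search finished\n";              // O(1)
        std::cout << "found: " << found << "\n";       // O(1)
        return found;
    }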

More examples:

87n^4 + 7n is O(n^4)

3n log n + 12 log n is O(n log n)

4n^4 + 7n^3 log n is ???

Terminology

Term           Big-Oh          Example
constant       O(1)            accessing array elements
logarithmic    O(log n)        binary search
linear         O(n)            array insertion/removal
linearithmic   O(n log n)      mergesort
quadratic      O(n^2)          insertion sort
polynomial     O(n^k), k >= 1
exponential    O(a^n), a > 1   traveling salesman

Generally, logarithms appear in the running time when a problem's size is cut by a constant fraction at each iteration, as in the binary search sketched below.
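
As a sketch (assuming a sorted std::vector<int>; the function name binarySearch is ours), binary search discards half of the remaining range at each step, so it runs in O(log n):

    #include <vector>

    // Binary search on a sorted vector: each iteration halves the
    // remaining range, so at most about log2(n) iterations execute.
    // Returns the index of target, or -1 if it is absent.
    int binarySearch(const std::vector<int>& sorted, int target) {
        int low = 0;
        int high = static_cast<int>(sorted.size()) - 1;
        while (low <= high) {
            int mid = low + (high - low) / 2;  // midpoint, avoids overflow
            if (sorted[mid] == target) return mid;
            if (sorted[mid] < target) low = mid + 1;   // discard lower half
            else                      high = mid - 1;  // discard upper half
        }
        return -1;  // not found
    }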

General Rules

Rule 1 - for loops: The running time of a for loop is at most the running time of the statements inside the for loop times the number of iterations.
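
For example, in this toy fragment of our own (n is the problem size), the O(1) body executes n times, so the loop is O(n):

    // Sums 0 + 1 + ... + (n-1): the O(1) body runs n times, so O(n).
    long sumFirstN(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += i;  // O(1) per iteration
        }
        return sum;
    }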

Rule 2 - nested loops: Analyze these inside out. The total running time of a statement inside a group of nested loops is the running time of the statement multiplied by the product of the sizes of all the loops.
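
For example, in this illustrative fragment the O(1) statement sits inside two loops of size n, so the total is n * n * O(1) = O(n^2):

    // Counts ordered pairs (i, j): the O(1) body runs n * n times.
    long countPairs(int n) {
        long count = 0;
        for (int i = 0; i < n; i++) {      // outer loop: n iterations
            for (int j = 0; j < n; j++) {  // inner loop: n iterations each
                count++;                   // O(1)
            }
        }
        return count;
    }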

Rule 3 - consecutive statements: These just add.
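
For example, an O(n) loop followed by an O(n^2) pair of nested loops gives O(n) + O(n^2) = O(n^2), since the larger term dominates (a toy fragment of our own):

    #include <vector>

    // O(n) initialization followed by O(n^2) accumulation: O(n^2) total.
    void fillAndAccumulate(std::vector<long>& a) {
        std::size_t n = a.size();
        for (std::size_t i = 0; i < n; i++) {      // first loop: O(n)
            a[i] = 0;
        }
        for (std::size_t i = 0; i < n; i++) {      // nested loops: O(n^2)
            for (std::size_t j = 0; j < n; j++) {
                a[i] += a[j] + static_cast<long>(i + j);
            }
        }
    }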

Rule 4 - if/else: For the fragment if (condition) S1 else S2, the running time is never more than the running time of the test plus the larger of the running times of S1 and S2.
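
For example, in this illustrative fragment the bound is the O(1) test plus the larger branch: O(1) + max(O(1), O(n)) = O(n):

    // if/else: the test is O(1), S1 is O(1), S2 is O(n), so O(n) overall.
    long branchExample(int x, int n) {
        long y = 0;
        if (x > 0) {                       // test: O(1)
            y = x;                         // S1: O(1)
        } else {
            for (int i = 0; i < n; i++) {  // S2: O(n)
                y += i;
            }
        }
        return y;
    }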

Source: Mark Allen Weiss, Data Structures and Algorithm Analysis in C++, 2nd edition.

Exercise: Given two arrays A1 and A2, determine whether there exists an index i such that A1[i] == A2[i]. What is the running time of your algorithm?
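
One straightforward sketch (one possible approach, not necessarily the intended one): compare corresponding positions in a single pass, which is O(n) for arrays of length n:

    #include <vector>

    // Returns true if A1[i] == A2[i] for some valid index i.
    // One pass over the overlapping positions: O(n).
    bool matchAtSameIndex(const std::vector<int>& A1,
                          const std::vector<int>& A2) {
        for (std::size_t i = 0; i < A1.size() && i < A2.size(); i++) {
            if (A1[i] == A2[i]) return true;
        }
        return false;
    }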

Exercise: Given two arrays A1 and A2, determine whether there exist indices i and j such that A1[i] == A2[j]. What is the running time of your algorithm?
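
A brute-force sketch (again, just one possible approach): check every pair of positions, which is O(n * m) for arrays of lengths n and m, or O(n^2) when the arrays have equal length:

    #include <vector>

    // Returns true if A1[i] == A2[j] for some indices i and j.
    // Checks every pair of positions: O(n * m).
    bool matchAtAnyIndex(const std::vector<int>& A1,
                         const std::vector<int>& A2) {
        for (std::size_t i = 0; i < A1.size(); i++) {
            for (std::size_t j = 0; j < A2.size(); j++) {
                if (A1[i] == A2[j]) return true;
            }
        }
        return false;
    }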


Sami Rollins

Date: 2007-10-01