Assignment 8

Due date:

Late assignments will be penalized 10% per day, and will not be accepted after .

1. Hand In

Each student is required to do this assignment individually and to hand in the following: Place these items in a 9"x11" envelope, with the following information clearly marked on the outside of the envelope:

Name, Student Number, Assignment Number, Course Number (T26)

Assignments which are not in an envelope will not be marked

Deposit your envelope in .

Marking Scheme

Programming standards: 10pt, Question 1: 30pt, Question 2: 10pt, Total: 50pt.

Your program must conform to the programming standards. These are described in a separate sheet.

2. Program Handout

All code is supplied except for the procedures below which you must write.

This assignment involves an empirical comparison of the efficiency of the retrieve operation for two different open addressing strategies - linear probing and double hashing. Efficiency is measured by the average number of probes needed for retrieval.

3. Question 1

You are given all the operations on hash tables you need for this assignment except insert and retrieve. You must implement insert and retrieve for two different open addressing strategies: linear probing and double hashing. For linear probing, call these operations lp_insert and lp_retrieve. For double hashing, call them dh_insert and dh_retrieve.

In addition to its usual parameters, the retrieve operation should return the number of probes required to do the retrieval.

Your insert procedures may assume the value to be inserted is not already in the table.

The values to be stored in the table are simply integer keys with no data parts.

4. Question 2

The main program you are given generates N different random values in the range 0..(R-1) and inserts them into one table using lp_insert and into another table using hd_insert. It then generates an additional Ntest random values in the same range and retrieves each of them from the two tables, using the appropriate retrieve procedure. It writes out N and the average number of probes to retrieve a value using Linear Probing (LP) and Double Hashing (DH).

It repeats this for N = 15, 30, 45, 60, 65, 70, 75, 80, 85, 90, 95. These values correspond to different ``load factors'' since the size of the table is fixed (101).

The range (R) and Ntest are values you should enter from the keyboard. A good choice for your final runs is R=Ntest=1000.

Your also asked to enter a `seed': this is used to initialize the random number generation process. It should be a real value with a fractional part that is not related to a power of two (e.g. 123.456 is a good choice, but 0.5 is not).

Run the program 4 times using a different seed each time. Average the results for each value of N across the four runs and graph the average values. In the graph, the x-axis is N and the Y-axis is the average number of probes to retrieve a value. Plot the points for Linear Probing and Double Hashing on the same graph.

Briefly explain and discuss the significance of your results.