1. Our Specification Of Lists - General Considerations

There is no unique way to specify an abstract data type. Last lecture we saw two `standard' specifications for lists, we will now develop our own. Specifications differ in two main ways:

The choice of primitive operations is important, because it directly determines:

Functionality:
Users must write their functions using only the primitives that are provided. Some functions may be impossible to write with a given set of primitives. We must be sure that all the functions that might be needed by the user can be written using the primitives we provide.

For example, the primitive operations on arrays provided by Pascal do not permit the user to change the size of an array dynamically. This was a design decision made when Pascal was specified.

Efficiency:
How efficient will the user's code be?

User Convenience:
How much effort it will be for the user to write the functions he really needs? This is important because, in practice, the cost to develop software, and the number of tricky bugs that are in it, are closely linked with the amount of effort required to write the software.

For example, in the classical List specification, there is just one `current' node in any given list. This is actually very inconvenient: in many applications, it is necessary to access two elements of the same list simultaneously. It is not impossible to accomplish this within the classical specification, but doing so involves extra work and tricks which obscure the function being performed.

Maintainability:
A program is easily maintainable if it can easily be understood and changed. To ensure that the code that implements an abstract data type is very easily maintained, there should be as small a set of primitive operations as possible, and each primitive operation should be very simple.
We will view a list as a collection of elements that are organized in a linear fashion. The primitive operations that we will provide is determined by the following general considerations:

Suppose we have three lists,

                
and we join L1 and L2 together using the JOIN operation. This operation captures the intuitive meaning of `join', resulting in list:
What is the relation between this list and the ones we started with? One possibility, which seems very natural, is that L1 and L2 have literally been glued together, so that L1 is the result of the operation (L2 has been glued onto the end of it), and L2 is the last 2 elements of L1:
                
Why is this a problem? Well, because some operations on L2 will affect L1, and, likewise, some operations on L1 will affect L2. For example, if we now JOIN L1 and L3, L2 will be affected:
                
As you can imagine, if this happens, things get very confusing!

There are at least four ways to avoid this confusion:

  1. return a copy. L1 and L2 are unaffected by JOIN. This solution tends to be memory hungry.

  2. keep L1 and L2 completely separate: put all the elements that would normally be shared into just one of them. In this approach, JOIN(L1,L2) transfers the elements from L2 into L1: L2 ends up with nothing in it.

  3. change the `type' of L2: it is now a `sublist' instead of a `list'. Then restrict some operations to operate only on lists, not on sublists (typically sublists can be inspected but not changed). This gives the advantages of sharing (efficient memory use) without the problems.

  4. L1 and L2 become exactly the same list - all subsequent operations on one of them have identical effects on both. They become two names for the very same list: we say that they become `unified'.
            
(2), (3), and (4) are equally acceptable. We will use (2) for lists, and (3) for trees later in the course. We will discuss (4) in more detail next lecture.

This decision is not a mere implementation detail. It is a crucial consideration in formulating our specification. It affects how we define the behaviour of our operations - we have just described 5 different ways the JOIN operation might behave. In order to write an application that uses JOIN, the applications programmer must know which of these five behaviours he will get.