1. Our Specification Of Lists - General Considerations

There is no unique way to specify an abstract data type. Last lecture we saw two `standard' specifications for lists; we will now develop our own. Specifications differ in two main ways:

which primitive parts lists are composed of.
which primitive operations are provided on lists.

The choice of primitive operations is important, because it directly determines:

Functionality:

Users must write their functions using only the primitives that are provided. Some functions may be impossible to write with a given set of primitives. We must be sure that all the functions that might be needed by the user can be written using the primitives we provide.

For example, the primitive operations on arrays provided by Pascal do not permit the user to change the size of an array dynamically. This was a design decision made when Pascal was specified.

Efficiency:

How efficient will the user's code be?

User Convenience:

How much effort will it take the user to write the functions he really needs? This is important because, in practice, the cost of developing software, and the number of tricky bugs in it, are closely linked to the amount of effort required to write it.

For example, in the classical List specification, there is just one `current' node in any given list. This is actually very inconvenient: in many applications, it is necessary to access two elements of the same list simultaneously. It is not impossible to accomplish this within the classical specification, but doing so involves extra work and tricks which obscure the function being performed.

Maintainability:

A program is easily maintainable if it can easily be understood and changed. To ensure that the code that implements an abstract data type is very easily maintained, there should be as small a set of primitive operations as possible, and each primitive operation should be very simple.

We will view a list as a collection of elements that are organized in a linear fashion. The primitive operations that we will provide are determined by the following general considerations:

A list is a special kind of collection. In the most general notion of `collection' the elements in a collection are not organized in any particular way. Lists are collections having an additional property: the order on elements. There are many very powerful operations that can be defined on collections generally - we'll see several shortly - and we would like to be able to apply these to lists too.
We would like all our primitive operations to be very efficient. More precisely, we want them to be constant time and constant space operations - which means they will use the same time and space to process lists of any size. There will be obvious exceptions - operations which, by definition, must access each element in the list one-by-one - but all other operations should require constant time and space.
There are some very commonly used operations that we would like to ensure are done as efficiently as possible. For example, one of the most common operations on a list or collection is determining its size (length). Another common pair of operations is joining two lists together and its inverse, splitting one list into two (the front and the back). In the classical specification, these operations require going through the list element by element. They are therefore not constant time operations: they are quick for short lists but very slow for very long lists. We will make them primitive operations and make them constant time.
We will generalize the classical idea of a current node: we will allow the user to access any number of list elements simultaneously. To do this, we will define an abstract type called window, which gives access to an element in a list; the user can have as many windows as he likes. We will have to define operations on windows, that permit them to be created, moved around, inspected, and so on.
We will not have any operations that result in the same element being a member of two different lists. This may sound strange, but it is a situation that can very easily arise, even with ordinary primitives like TAIL or our JOIN operation. To see this, consider the following example.

Suppose we have three lists,

and we join L1 and L2 together using the JOIN operation. This operation captures the intuitive meaning of `join', resulting in list:

What is the relation between this list and the ones we started with? One possibility, which seems very natural, is that L1 and L2 have literally been glued together, so that L1 is the result of the operation (L2 has been glued onto the end of it), and L2 is the last 2 elements of L1:

Why is this a problem? Well, because some operations on L2 will affect L1, and, likewise, some operations on L1 will affect L2. For example, if we now JOIN L1 and L3, L2 will be affected:

As you can imagine, if this happens, things get very confusing!
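
The sharing problem can be reproduced in a few lines. The following is only a sketch: the `Node` class, the helper functions, and the element values are all assumptions made for illustration, and they are not the lists used in the lecture's example. Here `naive_join` glues the second list onto the first by sharing its nodes, which is the `natural' behaviour described above.

```python
class Node:
    """One cell of a singly linked list (hypothetical helper)."""
    def __init__(self, value, next=None):
        self.value = value
        self.next = next


def build(values):
    """Build a chain of nodes and return its head (None for an empty list)."""
    head = None
    for value in reversed(values):
        head = Node(value, head)
    return head


def to_python_list(head):
    """Collect the values reachable from head, for inspection."""
    out = []
    while head is not None:
        out.append(head.value)
        head = head.next
    return out


def naive_join(a, b):
    """Glue b onto the end of a by sharing b's nodes; return the head of the result."""
    if a is None:
        return b
    last = a
    while last.next is not None:
        last = last.next
    last.next = b  # b's nodes are now members of both lists
    return a


# Illustrative contents only (assumed, not taken from the lecture).
l1 = build(["a", "b"])
l2 = build(["c", "d"])
l3 = build(["e"])

naive_join(l1, l2)
after_first = to_python_list(l2)   # L2 still looks unchanged
naive_join(l1, l3)
after_second = to_python_list(l2)  # L2 has grown, although only L1 was joined
```

Nothing in the second call mentions L2, yet its contents change, because its last node is also the last node of L1.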

There are at least four ways to avoid this confusion:

(1) Return a copy. L1 and L2 are unaffected by JOIN. This solution tends to be memory hungry.
(2) Keep L1 and L2 completely separate: put all the elements that would normally be shared into just one of them. In this approach, JOIN(L1,L2) transfers the elements from L2 into L1: L2 ends up with nothing in it.
(3) Change the `type' of L2: it is now a `sublist' instead of a `list'. Then restrict some operations to operate only on lists, not on sublists (typically sublists can be inspected but not changed). This gives the advantages of sharing (efficient memory use) without the problems.
(4) L1 and L2 become exactly the same list - all subsequent operations on one of them have identical effects on both. They become two names for the very same list: we say that they become `unified'.

(2), (3), and (4) are equally acceptable. We will use (2) for lists, and (3) for trees later in the course. We will discuss (4) in more detail next lecture.
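
To make the transfer behaviour concrete, here is a minimal sketch of a JOIN that moves the elements of the second list into the first and leaves the second empty. The class and attribute names (`List`, `head`, `tail`, `size`) are assumptions for illustration and are not the specification developed in this course. Keeping a tail reference and a stored size is one way to make both JOIN and the size query constant time.

```python
class Node:
    """One cell of a singly linked list (hypothetical helper)."""
    def __init__(self, value):
        self.value = value
        self.next = None


class List:
    """Singly linked list that tracks its head, tail and size (assumed layout)."""
    def __init__(self, values=()):
        self.head = None
        self.tail = None
        self.size = 0
        for value in values:
            self.append(value)

    def append(self, value):
        """Add one element at the end in constant time."""
        node = Node(value)
        if self.tail is None:
            self.head = node
        else:
            self.tail.next = node
        self.tail = node
        self.size += 1

    def to_python_list(self):
        """Collect the values in order, for inspection."""
        out = []
        node = self.head
        while node is not None:
            out.append(node.value)
            node = node.next
        return out


def join(a, b):
    """Move every element of b onto the end of a, leaving b empty.

    Runs in constant time: only a fixed number of references change.
    Joining a list with itself is left as a no-op here (an assumption).
    """
    if a is b or b.head is None:
        return a
    if a.tail is None:
        a.head = b.head
    else:
        a.tail.next = b.head
    a.tail = b.tail
    a.size += b.size
    b.head = b.tail = None  # b gives up all of its elements
    b.size = 0
    return a


# Illustrative contents only (assumed, not taken from the lecture).
l1, l2, l3 = List(["a", "b"]), List(["c", "d"]), List(["e"])
join(l1, l2)
join(l1, l3)
```

After both calls, L1 holds all five elements while L2 and L3 are empty, so no element is ever a member of two lists and a later operation on L1 cannot disturb L2.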

This decision is not a mere implementation detail. It is a crucial consideration in formulating our specification. It affects how we define the behaviour of our operations - we have just described five different ways the JOIN operation might behave: the naive sharing behaviour and the four alternatives to it. In order to write an application that uses JOIN, the applications programmer must know which of these five behaviours he will get.