Array Languages II

The discussion in the previous section made a very strong argument for introducing a programming language in the same manner as natural languages are often taught, i.e., by a continuing narrative which, although fictional, has some relation to the real world and may also be of interest to the student. The grammar is introduced as necessary to support the developing narrative. In this section we shall give a brief introduction to J centred on the unifying, although somewhat prosaic, theme of a bowlful of pennies.

The idea for such an introduction arose while I was rolling an accumulation of pennies into fifty-coin rolls prior to taking them to the bank. I keep my pennies in a small pottery dish, as shown in the accompanying figure, and from time to time, as the bowl becomes full, dispose of them in this way. (I might mention here, for any reader not too familiar with North American currency, that by "penny" I am referring to a coin worth 1/100 of a dollar. As almost all of the coins are Canadian pennies, this bowlful of money represents a trifling sum. The bowl, though, is of considerable sentimental value.) Rolling pennies is not a task I enjoy, probably because I am so poor at it, and I was undoubtedly trying to think of something to make the work less frustrating. The more I thought about computational problems originating with pennies (e.g., summarizing and tabulating the dates in various ways, and examining the results of repeated tosses of a single penny), the more interested I became in basing an introduction to J on such a simple example.

We shall begin by considering how we may summarize the dates on a small collection of pennies, both by arranging the dates in order and by representations of their frequency distributions. Then we shall show the same calculations performed on a larger sample of pennies obtained by random selection from a given population of pennies. Having introduced the concept of randomness, we shall then examine the simulation of the repeated tossing of a single coin and see what may be learned from such an investigation.

All of the above investigations will be carried out with the aid of J, which will be introduced as necessary as the study continues. For the reader unfamiliar with J we have prepared a two-page summary of the language, available as a PDF file, giving some examples of simple calculations and a summary of the main characteristics of the language. An indispensable reference for the language is Kenneth Iverson's J Introduction and Dictionary, which should be consulted while reading the following text just as one would consult a grammar and dictionary while working on a reading exercise in any natural language one is learning.

In the following we shall discuss very briefly each topic and introduce any terminology which may be new to the reader. The details of the calculations in J and remarks on the J language will be given in pages which are linked to these descriptions. The J script file coins.ijs gives all of the defined verbs and other parts of speech and the coin data for these calculations.

A three-statistic summary of the dates on the coins may be given simply by the number of coins and their earliest and latest dates. Various lists of the dates may be helpful, e.g., the list of unique dates or "nub", the list of all of the dates in sorted order, and the list of unique dates in sorted order or "ordered nub". These summary calculations are given in J followed by a brief discussion in Summaries, a format which will be followed in each of the remaining sections.
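As a language-neutral illustration of these summaries, here is a minimal sketch in Python rather than J (the J versions are the ones given in Summaries). The list of dates is made up for the example and is not the coin data in coins.ijs; the variable names are likewise only illustrative.

```python
# Hypothetical sample of ten coin dates (invented for illustration;
# not the data from coins.ijs).
dates = [1995, 1978, 1998, 1964, 1992, 1978, 1999, 1985, 1998, 1978]

# Three-statistic summary: number of coins, earliest date, latest date.
count = len(dates)
earliest, latest = min(dates), max(dates)

# "Nub": the unique dates, kept in order of first appearance.
nub = list(dict.fromkeys(dates))

# All of the dates in ascending order.
sorted_dates = sorted(dates)

# "Ordered nub": the unique dates in ascending order.
ordered_nub = sorted(nub)

print(count, earliest, latest)
print(nub)
print(sorted_dates)
print(ordered_nub)
```

Note that the nub preserves the order in which dates are first met, so it generally differs from the ordered nub unless the data happen to arrive already sorted.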

The frequency distribution of the dates may be presented either as a table giving the dates and the corresponding numerical frequencies or as a table giving the dates but showing the frequencies as bars proportional to their magnitudes. The plotting facilities of J may also be used to construct a conventional bar chart, as shown here. These calculations are given in Frequencies.
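The two tabular forms of the frequency distribution can be sketched as follows, again in Python rather than J and again with made-up dates. The helper name bar_lines and the choice of "*" as the bar character are assumptions for the example only.

```python
from collections import Counter

# Hypothetical sample of ten coin dates (invented for illustration).
dates = [1995, 1978, 1998, 1964, 1992, 1978, 1999, 1985, 1998, 1978]

# Count how often each date occurs, then list (date, frequency)
# pairs in ascending order of date.
freq = Counter(dates)
table = [(d, freq[d]) for d in sorted(freq)]

def bar_lines(table, mark="*"):
    """Render each frequency as a bar of `mark` characters."""
    return [f"{d}  {mark * n}" for d, n in table]

# Numerical table.
for d, n in table:
    print(d, n)

# The same table with frequencies shown as bars.
for line in bar_lines(table):
    print(line)
```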

Another method of displaying frequencies is by a stem-and-leaf diagram in which the data, which are restricted to non-negative integers, are grouped by their integer quotients when divided by 10, i.e., all items between 0 and 9 are grouped together, as are all items between 10 and 19, etc. With our coin data all dates in the 1960s would be grouped together, all dates in the 1970s would be together, and so on. For example, if a sample of coins contained the five dates 1995, 1998, 1992, 1999 and 1998 in the 1990s, we could display them as
1990   5 8 2 9 8
or in sorted order as
1990   2 5 8 8 9 .
These calculations are discussed in stem-and-leaf diagrams.
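The grouping rule just described (stem from the integer quotient on division by 10, leaf from the remainder) can be sketched in Python as below. This is an illustration only, not the J definition from the linked page; the function name stem_and_leaf and its sort_leaves flag are hypothetical. The five dates are those of the example above.

```python
def stem_and_leaf(values, sort_leaves=False):
    """Group non-negative integers by their quotient on division by 10.

    Each output line shows the decade (stem times 10) followed by the
    final digits (leaves), in original or sorted order.
    """
    groups = {}
    for v in values:
        stem, leaf = divmod(v, 10)
        groups.setdefault(stem, []).append(leaf)
    lines = []
    for stem in sorted(groups):
        leaves = sorted(groups[stem]) if sort_leaves else groups[stem]
        lines.append(f"{stem * 10}   " + " ".join(str(x) for x in leaves))
    return lines

nineties = [1995, 1998, 1992, 1999, 1998]
print("\n".join(stem_and_leaf(nineties)))                    # leaves in original order
print("\n".join(stem_and_leaf(nineties, sort_leaves=True)))  # leaves sorted
```

With dates from several decades the function returns one line per decade, in ascending order of decade.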

All of the calculations given so far have been illustrated with a sample of 10 dates which have been drawn at random from a population of size 100. As such a small sample is hardly representative of the population, however convenient it may be for introducing the statistical techniques used, we now discuss random sampling from this population of 100 dates, and illustrate the statistical techniques with a larger sample.
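Drawing samples of different sizes from a population of 100 dates might look like the following Python sketch. Several things here are assumptions made for the example: the population is generated artificially rather than taken from the coin data, the sampling is done without replacement, the larger sample size of 50 is arbitrary, and the seed is fixed only so that a run can be repeated.

```python
import random

rng = random.Random(2024)  # arbitrary fixed seed, for repeatability only

# Hypothetical population of 100 dates; the actual population is part
# of the coin data and is not reproduced here.
population = [rng.randint(1960, 1999) for _ in range(100)]

def draw_sample(population, size, rng):
    """Draw `size` items at random, without replacement (an assumption)."""
    return rng.sample(population, size)

small = draw_sample(population, 10, rng)   # a sample of 10, as used so far
large = draw_sample(population, 50, rng)   # a larger sample (size chosen arbitrarily)

print(sorted(small))
print(len(large), min(large), max(large))
```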

Having introduced some of the features of J by a study of some samples drawn from a population of the dates of coins, let us now restrict ourselves to a single coin in which we ignore the date completely. We shall consider tossing the coin, assumed to be unbiased, an arbitrary number of times and calculating both the cumulative ratios of the number of heads to the number of tosses and also the excess or deficit of heads relative to tails. Of course it is well known that as the number of tosses increases the ratio of the number of heads to the total number of tosses approaches the probability of obtaining a head on a single toss. It is not so well known, though, that at the same time the absolute excess of heads over tails may get larger and larger, so that although in the long run one neither wins nor loses if betting, one must be able to withstand the chances of longer and longer losing streaks. We shall investigate these events not by tossing our single coin but by a coin-tossing simulation.
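A coin-tossing simulation of this kind can be sketched in Python as follows; it is an illustration of the two running quantities, not the J script from the linked page. The function names, the seed and the choice of 10,000 tosses are assumptions for the example.

```python
import random
from itertools import accumulate

def toss(n, rng):
    """Simulate n tosses of a fair coin: 1 for heads, 0 for tails."""
    return [rng.randint(0, 1) for _ in range(n)]

def running_ratios(tosses):
    """Cumulative number of heads divided by the number of tosses so far."""
    return [h / k for k, h in enumerate(accumulate(tosses), start=1)]

def running_excess(tosses):
    """Cumulative heads minus tails (negative when tails are ahead)."""
    return list(accumulate(2 * t - 1 for t in tosses))

rng = random.Random(1)            # arbitrary fixed seed, for repeatability only
tosses = toss(10_000, rng)
ratios = running_ratios(tosses)
excess = running_excess(tosses)

# Final ratio of heads, final excess, and the largest lead either side held.
print(ratios[-1], excess[-1], max(abs(e) for e in excess))
```

Running this with increasing numbers of tosses lets one compare how the final ratio and the largest lead behave as the run grows.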

The above sections and the associated annotated scripts are intended to provide a very brief introduction to some of the features of the J language. The range of topics may be extended as appropriate illustrations related to collections of pennies either occur to me or are suggested by any readers. I realize such an approach to programming languages is not of universal appeal and runs counter to that taken by most introductions to programming languages including even those for J. An excellent example of one who is not enamoured of my work is shown in the accompanying figure of my cat, Torako, basking peacefully in the morning sun on the kitchen table with one of my earlier reports lying unopened beside her.