C Programming

C Control Structures

   * There are a number of control structures in C, we will briefly list them
     here:
       1. Compound statement

              {    statements   }

       2. Conditional Statement

              if ( expression )
                  statement

              if ( expression )                       if ( expression )                   if ( expression ) {
                  statement                               {   statements   }                  statements }
              else                                    else                                else {
                  statement                               {   statements   }                  statements }

       3. While statement

              while ( expression )                   while ( expression )
                  statement                               {   statements   }

       4. Do-while or Do-until statement

              do
                  statement
              while ( expression ) ;

          The iteration continues while the expression evaluates to a non-zero
          value, a zero value ends the iteration. The statement is always
          executed at least once, since the test comes after it
       5. For statement

              for ( expression1 ; expression2 ; expression3 )
                  statement

          All three expressions are optional
            1. expression1 is evaluated once at the start of the loop
            2. expression2 is evaluated at the start of each iteration, if the
               value of this expression is 0 the loop is exited, if expression2
               is omitted it is treated as always true
            3. expression3 is evaluated at the end of each iteration
          Except for the behaviour of continue (in a for loop continue still
          evaluates expression3), the for statement is equivalent to:

              expression1;
              while ( expression2 ) {
                  statement
                  expression3;
              }

       6. Switch statement

              switch ( expression ) {
                  case constant-expression : statement
                  case constant-expression : statement
                      .
                      .
                      .
                  default : statement
              }

          The expression in the switch must evaluate to an integer value, and
          all the case labels must be integer constant expressions. When the
          switch statement is entered the expression is evaluated and control
          transfers to the matching case (like a goto statement). If there is
          no matching case the default case is used, the default case is
          optional, and if it is absent the body of the switch is skipped.
          Control flows from one case to the next (fall through), the switch
          statement is not automatically exited when the next case is reached
       7. Break

              break ;

          The break statement causes the termination of the smallest enclosing
          while, do, for or switch statement. This statement is commonly used to
          separate the cases in a switch statement
       8. Continue

              continue ;

          Causes control to transfer to the loop continuation (end of the loop)
          portion of the smallest enclosing while, do or for statement. This
          statement is used to transfer control to the next iteration of the
          loop, that is, it terminates the current iteration and starts the
          next one

Example

   * In the example we use the following strategy to read from a file or
     terminal

         read first line
         while ( not end ) {
             process the line
             read next line
         }

   * The main problem with this approach is we need two read statements, a
     better schema (plan) uses the break statement

         while ( true ) {
             read next line
             if ( end )
                 break;
             process the line
         }


Make

   * The make facility is one of the most important tools that you will learn
     in this course, you will use it in all of your other computing science
     courses
   * make will automatically construct the executable form of a program from a
     description of the source files and the dependencies between the files
   * Once you have produced a Makefile for a program you don't need to remember
     how to compile it, or even the files you have changed, make handles all of
     this automatically for you
   * The main input to make is a Makefile, this is basically a description of
     how to make the program
   * The Makefile is divided into two main parts:
       1. macro definitions
       2. dependency declarations

Macro Definitions

   * Macros are used extensively in C programming, get used to them
   * A macro is a text replacement facility
   * A macro has a name, which is a character string, and a body, which is also
     a character string, wherever the name occurs in the file it is replaced by
     the body of the macro
   * In make we use macros to describe the compiler we are using, the compiler
     options, and the list of files we need. It allows us to parameterize the
     Makefile
   * A macro definition has the following format:

         macro-name = macro-body

   * A macro definition can appear anywhere in a Makefile
   * In make, a macro is invoked by mentioning its name, preceded by a $, if
     the macro name is more than one character long the name must be placed in
     parentheses
   * In our example makefile we have

         FILES = main.o phone.o

   * When we use $(FILES) in the rest of the Makefile it is replaced by the two
     file names, as we add more files to the program we only need to change the
     line where FILES is declared, the rest of the Makefile stays the same -
     this saves time and reduces mistakes
   * Similarly we have the lines

         CC=gcc
         CFLAGS=-g

   * The first line defines the C compiler that we use and the second line
     defines the compiler flags, again we can quickly change the compiler
     options without scanning through the entire Makefile

Dependency Declarations

   * The dependency declarations tell make which files depend on which other
     files
   * In the case of our example, the file phone (the executable for our
     program) depends on both main.o and phone.o, we say this in the following
     way:

         phone: main.o phone.o

   * The files on the left of the : depend on the files listed on the right
   * Similarly, both main.o and phone.o depend on phone.h, there are two ways
     that we could specify this

         main.o : phone.h

         phone.o : phone.h

     or

         main.o phone.o : phone.h

   * After each dependency declaration we can list the commands that construct
     the dependent files, these commands must be indented. WARNING: standard
     make requires each command line to begin with a tab character, leading
     spaces are not accepted by most versions of make (GNU make stops with a
     "missing separator" error).
   * So in order to create phone, we use the following specification

         phone : main.o phone.o
                      gcc -o phone main.o phone.o

   * When we need to re-create phone, make will automatically execute the gcc
     command
   * When is a dependency declaration used?
   * When make is started it looks at the time of last modification of all the
     files mentioned in the Makefile, if the last time of modification of a
     file on the right side of a dependency declaration is more recent than one
     of the files on the left side then the commands are executed
   * In our example, if either main.o or phone.o is more recent than phone, the
     gcc command will be executed, otherwise make knows that phone is
     up-to-date and does nothing
   * How does make know how to make main.o and phone.o, there are no commands
     associated with their dependencies?
   * Make has a set of default rules that know about Unix's file naming
     conventions, we use suffixes to indicate the type of file, start with a
     dot ( . ) and have one or more letters, the common ones are:
        o .c a c program
        o .h a header file
        o .o an object file
        o .s an assembler file
   * Make uses the file suffixes to determine how to make a file, if make needs
     a .o file and can find a .c file with the same prefix, it knows that it
     can use the C compiler to produce the .o file
   * For example if name.o is required, mentioned on the right side of a
     dependency rule, but there is no rule for name.o then make will look for a
     file named name.c, and if it finds one will run the c compiler on it, the
     use of default rules saves on the amount of typing you must do
   * You can construct your own default rules
   * If make is started without arguments, it will use the first dependency
     declaration as its target, that is it will make the file on the left side
     of the first dependency rule in the Makefile
   * In our example Makefile, phone is the first file mentioned, so by default
     make will bring phone up to date
   * You can specify the target when you call make, if we enter the command

          make main.o

   * then make will only run the c compiler on main.c to produce main.o, it
     will not produce a new version of phone
   * The most common file that you produce should be in the first dependency
     declaration, so you don't need to mention it each time you use make
   * Besides making the executable of a program there are other standard things
     that are placed in a Makefile, some of these operations include cleaning
     up temporary files, installing the executable in a standard place, running
     standard test cases, and printing the program
   * We could add the following rules to our Makefile:

         clean:
                    rm -f $(FILES)

         install: phone
                    cp phone $(HOME)/bin/phone

   * When we execute the command

         make clean

     All the .o files for the phone program will be deleted. Note that a file
     called clean will not be produced, therefore each time we execute this
     command the rm command is run again - make doesn't check whether the
     commands actually created the file named on the left side of the rule

Basic C Types

   * C has a small number of basic data types, which are:
        o char a character
        o int integer
        o float floating point number
        o double floating point number
   * I also consider pointers to be a basic data type - a pointer holds a
     machine address, so its size depends on the machine (commonly 4 or 8
     bytes), and arithmetic on a pointer works in units of the type pointed to
   * There are three modifiers that can be applied to most of the basic data
     types
   * The long modifier asks for a representation with at least as many bits as
     the unmodified type. On most machines a float is 32 bits and a double is
     64 bits, and in early versions of C a double could also be written as
     long float, ANSI C provides long double instead
   * The long modifier is often used with the int type, a long int has at least
     32 bits and is never shorter than an int, on some machines an int and a
     long int are the same size, but this is not always the case
   * The short modifier asks for a smaller representation, it is usually used
     with int's - a short int is 16 bits on most machines
   * The unsigned modifier can be used with character and integer data types,
     an unsigned value doesn't interpret the sign bit as a sign, it is used as
     part of the value, in other words unsigned values are never negative and
     use all the bits in the value - this is used in bit manipulation
     operations
   * A variable declaration has the following format:

         type variable_name = initial_value ;

   * The initial value part of the declaration is optional
   * Several variables can be declared at the same time, the variable names are
     separated by commas
   * A pointer is declared in the following way

         type *variable_name ;

   * The type is the type of value that is pointed to

Constants

   * Character constants are enclosed in single quotes, the \ is used as an
     escape character - for example '\037' is the character with octal value
     37, '\n' is the end of line character
   * An integer constant is a number without a decimal point, a long integer
     constant is an integer constant with L as a suffix - for example 123L is a
     long integer constant
   * An integer constant that starts with the digit 0 is a non-decimal
     constant, if the 0 is followed by octal digits (0 to 7) it is an octal
     constant - for example 037 is an octal integer constant (its decimal
     value is 3*8^1 + 7*8^0 = 31)
   * If the 0 is followed by an x or X, the constant is a hexadecimal integer
     constant - for example 0x1f is a hex constant (its decimal value is 1*16^1
     + 15*16^0 = 31)
   * Double or floating point constants are numbers that have decimal points or
     exponents

Arrays

   * Arrays and pointers are very closely related in C
   * C basically supports one dimensional arrays, the first subscript value is
     always 0
   * An array is declared in the following way:

         type array_variable [ size ] ;

   * The type is the type of the individual array elements and size is the
     length of the array
   * For example

         int a[1000];

     will produce an array that has 1000 integers in it, the first element is
     a[0] and the last element is a[999]
   * Arrays are initialized in the same way as basic data types, except a list
     of values must be specified
   * For example

         int a[10] = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };

   * A text string is a one dimensional array of char elements
   * For example, a text string can be declared in the following way:

         char string[25];

   * A character string constant is enclosed in double quotes ("), for example
     "this is a text string" - there is a major difference between single and
     double quotes
   * In C and Unix a text string is terminated by a zero byte, this is written
     as '\0', an empty string requires one byte of storage, in general a string
     with n characters requires n+1 bytes of storage - be careful to allocate
     the extra byte

Arrays and Pointers

   * In most expressions an array name is converted to a pointer to the first
     element of the array (the main exceptions are the operands of sizeof and
     &)
   * For example:

         int a[100];
         int *pa;

         pa = a;

   * Both pa and a point to the first element of the array of 100 integers,
     both of the following reference the same array element:

         a[i]   and *(pa+i)

   * The expression pa+i takes the value of the pointer pa and advances it by i
     elements (not i bytes), thus pa+i points to element i of the array a. The
     * operator treats its operand as a pointer and retrieves the value at that
     address, thus *(pa+i) first computes the address of element i, and then
     retrieves its value
   * The & operator is used to compute the address of a variable, for example

         pa = &a[5];

     stores the address of a[5] (the sixth element, since subscripts start at
     0) in pa, now the expression

         *(pa+2)

     retrieves the value of a[7]
   * There is an important distinction between array and pointer declarations,
     an array declaration allocates storage for the array while a pointer
     declaration only allocates enough storage to store the pointer itself, no
     storage for the value pointed to is allocated
   * We can of course have an array of pointers, it is declared in the
     following way

         int *pa[10];

   * This produces an array containing 10 pointers to integers

Multi-Dimensional Arrays

   * A multi-dimensional array is just an array of arrays
   * Thus we can declare a 2 dimensional array in the following way:

         int a[10][20];

   * This is in fact 10 arrays, each of which contains 20 integer elements
   * An individual element of this array can be referenced in the following way

         a[i][j]

   * Of course we can still do things like the following

         int a[10][20];
         int *pa;

         pa = &a[0][0];

         x = *(pa+i*20+j);

   * Similarly we can do things like:

         int a[10][20];
         int (*pa)[20];

         pa = &a[0];

   * Then (*pa)[5] would be the same as a[0][5]
   * An observation, a variable is declared in the way that it is used
   * The type in a variable declaration is usually a basic type, the
     declaration syntax shows how a value of that basic type can be obtained
     from the variable name
   * Also note that [] has higher precedence than *, so we need to use
     parentheses in the above declaration, otherwise we would have an array of
     20 pointers to integers instead of a pointer to an array of 20 integers

Character Arrays and Initialization

   * An array of pointers to text strings (which is not the same as a 2
     dimensional array of characters) is often used in C programming, such a
     structure can be initialized in the following way:

     char *name[] = {
         "fred",
         "george",
         "paul",
         "mark",
         0
     };

   * The final 0 (a null pointer) isn't required by the language, it serves as
     a marker for the end of the array, we can detect the end of the array in
     the following way:

     for(i=0; i<1000; i++) {
         if(name[i] == 0)
             break;
     }

Structures

   * C structures are similar to records in Pascal, they allow us to collect
     together several pieces of related data into one data structure, the
     individual pieces of data are called structure elements or structure
     members
   * A structure is declared in the following way:

         struct structure_name {
             member declaration;
             member declaration;
                     .
                     .
                     .
             member declaration;
         };

   * This declaration doesn't allocate any memory, it just provides a template
     for the structure, it declares the values that can be stored together
   * The individual structure elements can be of any C type, including other
     structures
   * For example, we could have the following for the declaration of a name
     structure

         struct name {
             char *first;
             char *last;
         };

   * The name structure has two elements, the character pointers first and
     last, variables can have the same names as structure elements and the same
     element name can be used in different structure declarations
   * There are several ways that we can declare a variable that has a structure
     type
   * One way is to do the following:

         struct structure_name variable_name;

   * So with our example name struct we could do the following:

         struct name fred, george;

   * Another way of doing this is:

         struct structure_name {
             member declarations
         } variable_name;

   * In this approach we combine the structure declaration with the variable
     declaration, the structure name is optional, but it is always a good idea
     to include it
   * A structure variable can also be initialized, this can be done in the
     following way:

         struct structure_name variable_name = { element values } ;

   * So for our example we could have:

         struct name george = { "george", "brown" };

   * The element values are assigned in the same order as the element
     declarations
   * The . (dot) operator is used to extract the individual elements from a
     structure value, in the case of our name structure we can do the
     following:

         george.first
         george.last

   * In general the syntax is

         variable_name . element_name

   * A slightly different syntax is used for pointers to structures, for
     example

         struct name *person;

   * The variable person is now a pointer to a name structure, knowing what we
     know about pointers we can use the following to get the value of the first
     element of the structure that person points to

         (*person).first

     or
         person->first

   * The -> operator takes a pointer to a struct, follows the pointer to the
     structure value and then extracts the field - this is a shorthand, but it
     makes sense
   * We can have arrays of structures, which is often quite convenient, this is
     done in the following way:

         struct structure_name variable_name [ size ];

   * In the case of our name structure, we could have an initialized array of
     names constructed in the following way:

         struct name persons[] = {
             { "george", "brown" },
             { "fred", "black" },
                      .
                      .
                      .
             { 0, 0 }
         };

   * Again we use an explicit 0 value at the end of the array to indicate the
     end, there are other ways of doing this, but this is the safest
   * We can include pointers to a structure within the declaration of the
     structure, we use this technique to build linked lists and binary trees
   * We can use the following structure declaration for a node in a binary tree

         struct node {
             int value;
             struct node *left;
             struct node *right;
         };

   * Note that left and right must be pointers to structures, they cannot be
     structure variables, otherwise we will have a structure that includes two
     copies of itself
   * The same thing can be done for a linked list:

         struct list_node {
             float value;
             struct list_node *next;
         };

Fields

   * The structure elements that we have seen so far have been standard C data
     types, these data types may not be the most efficient way of storing data
     in a structure
   * Fields allow us to pack data into a structure as densely as possible, a
     field is an integer value (either signed or unsigned) where the programmer
     specifies the number of bits occupied by the field value
   * We can define fields in the following way:

         struct example {
             int field_a : 4;
             int field_b : 6;
             unsigned field_c : 6;
         };

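   * In this structure the three fields need 16 bits in total, so a typical
     compiler packs them into a single word, although the exact layout is up
     to the compiler. The last field is unsigned, the first two are declared
     int and are signed on most compilers, but the C standard leaves the
     signedness of a plain int field up to the implementation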

Lvalues and Rvalues

   * Lvalues and Rvalues are important in understanding expressions in C
   * An Lvalue is anything that can be on the left side of an assignment
     operator, in other words it represents a memory location where a value
     can be stored - Lvalues include variables and pointer expressions
   * An Rvalue is anything that can be on the right side of an assignment
     operator, in other words it represents a value
   * All expressions are Rvalues, but only some of them can be used as Lvalues,
     in other words any place that an Rvalue is required an Lvalue can be used,
     but the opposite is not true

Assignment Expression

   * There are several forms of assignment expressions, note that assignment is
     an expression, it has a value and can be used as part of a larger
     expression
   * The general format of an assignment expression is

         Lvalue assignment_operator Rvalue

   * The standard assignment operator is =, but there are several other useful
     ones, such as:

         +=
         -=
         *=
         /=

   * These operators are interpreted in the following way:

         x op= y

     is the same as

         x = (x   op  (y))

     except that x is evaluated only once

Operators

   * C has the standard arithmetic operators:

         +
         -
         *
         /
         %   -  modulus or remainder

   * The standard comparison operators are:

         ==
         !=
         <
         <=
         >
         >=

Logical Operators

   * Recall, that in C zero is treated as false and non-zero is treated as true
   * The ! is the logical invert operator, if the operand is non-zero the
     result is zero and if the operand is zero the result is 1
   * There are two binary logical operators:

         &&  -  logical and
         ||  -  logical or

   * These operators don't always evaluate their second operand, in the case of
     && if the first operand evaluates to zero, the second operand is not
     evaluated since the result is zero
   * Similarly for ||, if the first operand evaluates to a non-zero value the
     second operand won't be evaluated since the result will be 1
   * This allows us to write expressions like:

         if(n != 0 && m/n > 5)   ...

Increment and Decrement Operators

   * The ++ and -- operators are used to increment and decrement Lvalues, they
     can be used as either a prefix or a postfix operator
   * If they are used as prefix operators the value of the expression is the
     new value of the Lvalue, for example

         m = ++n;

     is the same as

         n = n+1;
         m = n;

   * If they are used as a postfix operator, the value of the expression is the
     value of the Lvalue before the operation is performed, for example

         m = n++;

     is the same as
         m = n;
         n = n+1;

Conditional Expression

   * A conditional expression has the following syntax:

         expression1 ? expression2 : expression3

   * First expression1 is evaluated, if it is non-zero then expression2 is
     evaluated and used as the value of the expression, in this case
     expression3 is ignored
   * If expression1 is zero then expression3 is evaluated and used as the value
     of the expression, in this case expression2 is ignored
   * This operator can be used in the following way

         x = n != 0 ? m/n : 0;

   * The conditional operator essentially allows the programmer to put an if
     statement in the middle of an expression

Procedures - Part 1

   * There are two ways of declaring and defining procedures in C, the old way
     and the ANSI standard way - you will run into both so we will cover both
   * All procedures in C are really functions, that is they return a value, the
     special type void is used to indicate that a procedure produces no return
     value
   * C has both procedure declarations and procedure definitions - these are
     two different concepts
   * A procedure declaration contains all the information required to call the
     procedure, that is the name of the procedure, the types of its parameters
     (optional) and the type of the return value
   * A procedure definition includes all the information in a procedure
     declaration, plus local variables and the statements in the procedure - it
     not only describes how the procedure can be called, but also how it
     computes its value

Procedure Declarations

   * The old style of procedure declaration is:

         type procedure_name();

   * The type is the type of the return value
   * The ANSI style of procedure declaration is:

         type procedure_name(parameter_declarations);

   * The parameter declarations are separated by commas and each declaration
     has the following format:

         type parameter_name

   * This is the same format as variable declarations
   * The ANSI style procedure declarations should be used

Procedure Definitions

   * The old style of procedure definition is:

         type procedure_name(parameter_names)
         parameter declaration;
         parameter declaration;
             .
             .
             .
         parameter declaration; {
             variable declarations

             statements

         }

   * The parameter_names is a comma separated list of parameter names, the
     parameter declarations need not be in the same order as the
     parameter_names
   * I prefer this format, since there is a separate line for each parameter,
     which makes it easier to document
   * The ANSI style of procedure definition is:

         type procedure_name(parameter_declarations) {
             variable declarations

             statements
         }

   * The return statement is used to specify the value returned by the
     procedure and return control to the calling procedure
   * The two formats of the return statement are:

         return;

     and

         return(Rvalue);

   * The first form is only used with procedures of type void, the parentheses
     in the second form are optional, return Rvalue; means the same thing

Parameter Passing

   * All parameter passing in C is by value, that is, when a procedure is
     called the parameter values from the calling procedure are copied into
     temporary storage in the called procedure - all modifications to the
     parameter values occur in this temporary storage, the original values in
     the calling procedure are not changed
   * This means that you cannot return a result directly through a parameter,
     the only way to return a result through a parameter is to pass a pointer
     to the variable where the result should be stored
   * Remember that an array name is converted to a pointer to its first
     element, so if you pass an array (not an array element) to a procedure,
     you can modify the elements in the array and these modifications will be
     seen outside of the procedure

     void foo(int x) {

         /* we can do anything we like to x inside this procedure,
            the calling procedure won't see any of these changes,
            the calling procedure only provides the initial value of x */

     }

     void foo(int *x) {

         *x = 5;      /* stores 5 in the variable that x points to */

     }

         foo(&i);
         printf("%d",i);

   * This will print 5, since we have passed a pointer to i into the procedure,
     the value pointed at is changed by the procedure - note that foo(i) will
     cause all sorts of problems if i isn't a pointer

-------------------------------------------------------------------------------
Cmput 201 Course Notes / Dr. Mark Green / mark@cs.ualberta.ca