C Programming

C Control Structures

* There are a number of control structures in C; we will briefly list them here:

1. Compound statement

    { statements }

2. Conditional statement

    if ( expression )
        statement

    if ( expression )
        statement
    else
        statement

    if ( expression )
        { statements }
    else
        { statements }

    if ( expression ) {
        statements
    } else {
        statements
    }

3. While statement

    while ( expression )
        statement

    while ( expression )
        { statements }

4. Do-while or Do-until statement

    do
        statement
    while ( expression ) ;

The iteration continues while the expression evaluates to a non-zero value; a zero value ends the iteration.

5. For statement

    for ( expression1 ; expression2 ; expression3 )
        statement

All three expressions are optional:

    1. expression1 is evaluated once at the start of the loop
    2. expression2 is evaluated at the start of each iteration; if the value of this expression is 0 the loop is exited
    3. expression3 is evaluated at the end of each iteration

The for statement is equivalent to the following (except that a continue inside the body of a for loop still executes expression3):

    expression1;
    while ( expression2 ) {
        statement
        expression3;
    }

6. Switch statement

    switch ( expression ) {
        case constant-expression : statement
        case constant-expression : statement
        . . .
        default : statement
    }

The expression in the switch must evaluate to an integer value, and all the case labels must also have integer values. When the switch statement is entered the expression is evaluated and control transfers to the matching case (like a goto statement); if there is no matching case, default is used. The default case is optional. Control flows from one case to the next: the switch statement is not automatically exited when the next case is reached.

7. Break

    break ;

The break statement terminates the smallest enclosing while, do, for or switch statement. It is commonly used to separate the cases in a switch statement.

8. Continue

    continue ;

Causes control to transfer to the loop continuation portion (the end of the loop body) of the smallest enclosing while, do or for statement. It is used to terminate the current iteration and start the next one.

Example

* In the example we use the following strategy to read from a file or terminal:

    read first line
    while ( not end ) {
        process the line
        read next line
    }

* The main problem with this approach is that we need two read statements; a better schema (plan) uses the break statement:

    while ( true ) {
        read next line
        if ( end ) break;
        process the line
    }

Make

* The make facility is one of the most important tools that you will learn in this course; you will use it in all of your other computing science courses
* make will automatically construct the executable form of a program from a description of the source files and the dependencies between the files
* Once you have produced a Makefile for a program you don't need to remember how to compile it, or even which files you have changed; make handles all of this automatically for you
* The main input to make is a Makefile, which is basically a description of how to make the program
* The Makefile is divided into two main parts:

    1. macro definitions
    2. dependency declarations

Macro Definitions

* Macros are used extensively in C programming, get used to them
* A macro is a text replacement facility
* A macro has a name, which is a character string, and a body, which is also a character string; wherever the name occurs in the file it is replaced by the body of the macro
* In make we use macros to describe the compiler we are using, the compiler options, and the list of files we need.
This allows us to parameterize the Makefile
* A macro definition has the following format:

    macro-name = macro-body

* A macro definition can appear anywhere in a Makefile
* In make, a macro is invoked by writing its name preceded by a $; if the name is more than one character long it must be placed in parentheses
* In our example Makefile we have

    FILES = main.o phone.o

* When we use $(FILES) in the rest of the Makefile it is replaced by the two file names; as we add more files to the program we only need to change the line where FILES is defined, the rest of the Makefile stays the same - this saves time and reduces mistakes
* Similarly we have the lines

    CC=gcc
    CFLAGS=-g

* The first line defines the C compiler that we use and the second line defines the compiler flags; again we can quickly change the compiler options without scanning through the entire Makefile

Dependency Declarations

* The dependency declarations tell make which files depend on which other files
* In the case of our example, the file phone (the executable for our program) depends on both main.o and phone.o; we say this in the following way:

    phone: main.o phone.o

* The files on the left of the : depend on the files listed on the right
* Similarly, both main.o and phone.o depend on phone.h; there are two ways that we could specify this:

    main.o : phone.h
    phone.o : phone.h

  or

    main.o phone.o : phone.h

* After each dependency declaration we can list the commands that construct the dependent files; each command line must be indented. WARNING: start each command line with a tab character; many versions of make, including GNU make, do not accept spaces there.
* So in order to create phone, we use the following specification:

    phone : main.o phone.o
    	gcc -o phone main.o phone.o

* When we need to re-create phone, make will automatically execute the gcc command
* When is a dependency declaration used?
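Before turning to that question, it may help to see the fragments above collected into one file. The following is only a sketch assembled from the pieces in these notes, not the course's actual Makefile. Writing the link command in terms of $(CC), $(CFLAGS) and $(FILES) is an assumption made here for illustration; the notes themselves write the gcc command out literally.

```makefile
# Macro definitions (taken from the notes)
CC = gcc
CFLAGS = -g
FILES = main.o phone.o

# Dependency declaration with a command.
# The command line below begins with a tab character.
# Using the macros in the command is an assumption of this sketch.
phone: $(FILES)
	$(CC) $(CFLAGS) -o phone $(FILES)

# Dependency declaration without commands:
# both object files depend on the shared header.
main.o phone.o: phone.h
```

With this layout, adding a new object file means editing only the FILES line, and switching compiler or flags means editing only CC or CFLAGS.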
* When make is started it looks at the time of last modification of all the files mentioned in the Makefile; if a file on the right side of a dependency declaration was modified more recently than one of the files on the left side, then the commands are executed
* In our example, if either main.o or phone.o is more recent than phone, the gcc command will be executed; otherwise make knows that phone is up-to-date and does nothing
* How does make know how to make main.o and phone.o when there are no commands associated with their dependencies?
* Make has a set of default rules that know about Unix's file naming conventions: we use suffixes to indicate the type of a file; a suffix starts with a dot ( . ) and has one or more letters. The common ones are:
    o .c  a C program
    o .h  a header file
    o .o  an object file
    o .s  an assembler file
* Make uses the file suffixes to determine how to make a file; if make needs a .o file and can find a .c file with the same prefix, it knows that it can use the C compiler to produce the .o file
* For example, if name.o is required (it is mentioned on the right side of a dependency rule) but there is no rule for name.o, then make will look for a file named name.c and, if it finds one, will run the C compiler on it; the use of default rules saves on the amount of typing you must do
* You can construct your own default rules
* If make is started without arguments, it will use the first dependency declaration as its target, that is, it will make the file on the left side of the first dependency rule in the Makefile
* In our example Makefile, phone is the first file mentioned, so by default make will build phone
* You can specify the target when you call make; if we enter the command

    make main.o

  then make will only run the C compiler on main.c to produce main.o, it will not produce a new version of phone
* The most common file that you produce should be in the first dependency declaration, so you don't need to mention it each time
you use make
* Besides making the executable of a program there are other standard things that are placed in a Makefile; some of these operations include cleaning up temporary files, installing the executable in a standard place, running standard test cases, and printing the program
* We could add the following rules to our Makefile:

    clean: $(FILES)
    	rm $(FILES)

    install: phone
    	cp phone $(HOME)/bin/phone

* When we execute the command

    make clean

  all the .o files for the phone program will be deleted. Note that a file called clean will not be produced; therefore, each time we execute this command the .o files will be deleted - make doesn't check whether it has actually created the file that was on the left side of the rule

Basic C Types

* C has a small number of basic data types, which are:
    o char    a character
    o int     an integer
    o float   a floating point number
    o double  a double-precision floating point number
* I also consider pointers to be a basic data type - a pointer is always the size of a machine address and is treated like an unsigned integer
* There are three modifiers that can be applied to most of the basic data types
* The long modifier specifies that the maximum number of bits is used to represent the value; for example, on most machines a float is 32 bits and a double is 64 bits, so in effect a double is a long float
* The long modifier is often used with the int type; a long int uses the maximum number of bits for an integer. On some machines an int and a long int are the same size, but this is not always the case
* The short modifier is used to specify that the minimum number of bits should be used; it is usually used with ints - a short int is 16 bits on most machines
* The unsigned modifier can be used with character and integer data types; an unsigned value doesn't interpret the sign bit as a sign, it is used as part of the value. In other words unsigned values are never negative and use all the bits in the value - this is used in bit manipulation operations
* A
variable declaration has the following format:

    type variable_name = initial_value ;

* The initial value part of the declaration is optional
* Several variables can be declared at the same time; the variable names are separated by commas
* A pointer is declared in the following way:

    type *variable_name ;

* The type is the type of value that is pointed to

Constants

* Character constants are enclosed in single quotes; the \ is used as an escape character - for example '\037' is the character with octal value 37, and '\n' is the end of line character
* An integer constant is a number without a decimal point; a long integer constant is an integer constant with L as a suffix - for example 123L is a long integer constant
* An integer that starts with the digit 0 is a non-decimal constant; if the 0 is followed by another digit it is an octal constant - for example 037 is an octal integer constant
* If the 0 is followed by an x or X, the constant is a hexadecimal integer constant - for example 0x1f is a hex constant (its decimal value is 1*16^1 + 15*16^0 = 31)
* Double or floating point constants are numbers that have decimal points or exponents

Arrays

* Arrays and pointers are very closely related in C
* C basically supports one dimensional arrays; the first subscript value is always 0
* An array is declared in the following way:

    type array_variable [ size ] ;

* The type is the type of the individual array elements and size is the length of the array
* For example int a[1000]; will produce an array that has 1000 integers in it; the first element is a[0] and the last element is a[999]
* Arrays are initialized in the same way as basic data types, except a list of values must be specified
* For example:

    int a[10] = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };

* A text string is a one dimensional array of char elements
* For example, a text string can be declared in the following way:

    char string[25];

* A character string constant is enclosed in double quotes ("), for example "this is a text
string" - there is a major difference between single and double quotes
* In C and Unix a text string is terminated by a zero byte, which is written as '\0'. An empty string requires one byte of storage; in general a string with n characters requires n+1 bytes of storage - be careful to allocate the extra byte

Arrays and Pointers

* In most expressions an array name is treated as a pointer to the first element of the array
* For example:

    int a[100];
    int *pa;
    pa = a;

* Both pa and a point to the first element of the array of 100 integers, and both of the following reference the same array element:

    a[i] and *(pa+i)

* The expression pa+i takes the value of the pointer pa and adds i elements to it, thus pa+i points to element a[i]. The * operator treats its operand as a pointer and retrieves the value at that address, thus *(pa+i) first computes the address of element i, and then retrieves its value
* The & operator is used to compute the address of a variable; for example pa = &a[5]; stores the address of element a[5] in pa, and now the expression *(pa+2) retrieves the value of a[7]
* There is an important distinction between array and pointer declarations: an array declaration allocates storage for the array, while a pointer declaration only allocates enough storage to store the pointer itself, no storage for the value pointed to is allocated
* We can of course have an array of pointers; it is declared in the following way:

    int *pa[10];

* This produces an array containing 10 pointers to integers

Multi-Dimensional Arrays

* A multi-dimensional array is just an array of arrays
* Thus we can declare a 2 dimensional array in the following way:

    int a[10][20];

* This is in fact 10 arrays, each of which contains 20 integer elements
* An individual element of this array can be referenced in the following way:

    a[i][j]

* Of course we can still do things like the following:

    int a[10][20];
    int *pa;
    pa = &a[0][0];
    x = *(pa+i*20+j);

* Similarly we can do things like:

    int a[10][20];
    int (*pa)[20];
    pa = &a[0];

* Then (*pa)[5] would be the same as a[0][5]
* An observation: a variable is declared in the way that it is used
* The type in a variable declaration is usually a basic type; the declaration syntax shows how a value of that basic type can be obtained from the variable name
* Also note that [] has higher precedence than *, so we need to use parentheses in the above declaration, otherwise we would have an array of 20 pointers to integers instead of a pointer to an array of 20 integers

Character Arrays and Initialization

* An array of pointers to text strings (which is not the same as a 2 dimensional array of characters) is often used in C programming; such a structure can be initialized in the following way:

    char *name[] = {
        "fred",
        "george",
        "paul",
        "mark",
        0
    };

* The final 0 in the initializer isn't required; it serves as a marker for the end of the array. We can detect the end of the array in the following way:

    for(i=0; i<1000; i++) {
        if(name[i] == 0) break;
    }

Structures

* C structures are similar to records in Pascal; they allow us to collect together several pieces of related data into one data structure. The individual pieces of data are called structure elements or structure members
* A structure is declared in the following way:

    struct structure_name {
        member declaration;
        member declaration;
        . . .
        member declaration;
    };

* This declaration doesn't allocate any memory; it just provides a template for the structure, declaring the values that can be stored together
* The individual structure elements can be of any C type, including other structures
* For example, we could have the following for the declaration of a name structure:

    struct name {
        char *first;
        char *last;
    };

* The name structure has two elements, the character pointers first and last. Variables can have the same names as structure elements, and the same element name can be used in different structure declarations
* There are several ways that we can declare a variable that has a structure type
* One way is to do the following:

    struct structure_name variable_name;

* So with our example name struct we could do the following:

    struct name fred, george;

* Another way of doing this is:

    struct structure_name {
        member declarations
    } variable_name;

* In this approach we combine the structure declaration with the variable declaration; the structure name is optional, but it is always a good idea to include it
* A structure variable can also be initialized; this can be done in the following way:

    struct structure_name variable_name = { element values } ;

* So for our example we could have:

    struct name george = { "george", "brown" };

* The element values are assigned in the same order as the element declarations
* The . (dot) operator is used to extract the individual elements from a structure value; in the case of our name structure we can do the following:

    george.first
    george.last

* In general the syntax is variable_name .
element_name
* A slightly different syntax is used for pointers to structures, for example:

    struct name *person;

* The variable person is now a pointer to a name structure; knowing what we know about pointers, we can use the following to get the value of the first element of the structure that person points to:

    (*person).first

  or

    person->first

* The -> operator takes a pointer to a struct, follows the pointer to the structure value and then extracts the field - this is a shorthand, but it makes sense
* We can have arrays of structures, which is often quite convenient; this is done in the following way:

    struct structure_name variable_name [ size ];

* In the case of our name structure, we could have an initialized array of names constructed in the following way:

    struct name persons[] = {
        { "george", "brown" },
        { "fred", "black" },
        . . .
        { 0, 0 }
    };

* Again we use an explicit 0 value at the end of the array to indicate the end; there are other ways of doing this, but this is the safest
* We can include pointers to a structure within the declaration of the structure; we use this technique to build linked lists and binary trees
* We can use the following structure declaration for a node in a binary tree:

    struct node {
        int value;
        struct node *left;
        struct node *right;
    };

* Note that left and right must be pointers to structures; they cannot be structure variables, otherwise we would have a structure that includes two copies of itself
* The same thing can be done for a linked list:

    struct list_node {
        float value;
        struct list_node *next;
    };

Fields

* The structure elements that we have seen so far have been standard C data types; these data types may not be the most efficient way of storing data in a structure
* Fields allow us to pack data into a structure as densely as possible. A field is an integer value (either signed or unsigned) where the programmer specifies the number of bits occupied by the field value
* We can define fields in the following way:

    struct example {
        int field_a : 4;
        int field_b : 6;
        unsigned field_c : 6;
    };

* In this structure all three fields (4 + 6 + 6 = 16 bits) would typically be packed into a 16 bit word; the first two fields are signed and the last one is unsigned

Lvalues and Rvalues

* Lvalues and Rvalues are important in understanding expressions in C
* An Lvalue is anything that can be on the left side of an assignment operator, in other words it represents a memory location where a value can be stored - Lvalues include variables and pointer expressions
* An Rvalue is anything that can be on the right side of an assignment operator, in other words it represents a value
* All expressions are Rvalues, but only some of them can be used as Lvalues; in other words any place that an Rvalue is required an Lvalue can be used, but the opposite is not true

Assignment Expression

* There are several forms of assignment expressions. Note that assignment is an expression: it has a value and can be used as part of a larger expression
* The general format of an assignment expression is:

    Lvalue assignment_operator Rvalue

* The standard assignment operator is =, but there are several other useful ones, such as:

    +=  -=  *=  /=

* These operators are interpreted in the following way:

    x op= y

  is the same as

    x = (x op y)

Operators

* C has the standard arithmetic operators:

    +  -  *  /  %

  where % is the modulus or remainder operator
* The standard comparison operators are:

    ==  !=  <  <=  >  >=

Logical Operators

* Recall that in C zero is treated as false and non-zero is treated as true
* The !
is the logical invert operator: if the operand is non-zero the result is zero, and if the operand is zero the result is 1
* There are two binary logical operators:

    &&  - logical and
    ||  - logical or

* These operators don't always evaluate their second operand; in the case of &&, if the first operand evaluates to zero, the second operand is not evaluated since the result is zero
* Similarly for ||, if the first operand evaluates to a non-zero value the second operand won't be evaluated since the result will be 1
* This allows us to write expressions like:

    if(n != 0 && m/n > 5) ...

Increment and Decrement Operators

* The ++ and -- operators are used to increment and decrement Lvalues; they can be used as either a prefix or a postfix operator
* If they are used as prefix operators, the value of the expression is the new value of the Lvalue, for example

    m = ++n;

  is the same as

    n = n+1;
    m = n;

* If they are used as postfix operators, the value of the expression is the value of the Lvalue before the operation is performed, for example

    m = n++;

  is the same as

    m = n;
    n = n+1;

Conditional Expression

* A conditional expression has the following syntax:

    expression1 ? expression2 : expression3

* First expression1 is evaluated; if it is non-zero then expression2 is evaluated and used as the value of the expression, in this case expression3 is ignored
* If expression1 is zero then expression3 is evaluated and used as the value of the expression, in this case expression2 is ignored
* This operator can be used in the following way:

    x = n != 0 ?
        m/n : 0;

* The conditional operator essentially allows the programmer to put an if statement in the middle of an expression

Procedures - Part 1

* There are two ways of declaring and defining procedures in C, the old way and the ANSI standard way - you will run into both so we will cover both
* All procedures in C are really functions, that is, they return a value; the special type void is used to indicate that the return value is never used and therefore one is not produced
* C has both procedure declarations and procedure definitions - these are two different concepts
* A procedure declaration contains all the information required to call the procedure, that is, the name of the procedure, the types of its parameters (optional) and the type of the return value
* A procedure definition includes all the information in a procedure declaration, plus local variables and the statements in the procedure - it not only describes how the procedure can be called, but also how it computes its value

Procedure Declarations

* The old style of procedure declaration is:

    type procedure_name();

* The type is the type of the return value
* The ANSI style of procedure declaration is:

    type procedure_name(parameter_declarations);

* The parameter declarations are separated by commas and each declaration has the following format:

    type parameter_name

* This is the same format as variable declarations
* The ANSI style procedure declarations should be used

Procedure Definitions

* The old style of procedure definition is:

    type procedure_name(parameter_names)
    parameter declaration;
    parameter declaration;
    . . .
    parameter declaration;
    {
        variable declarations
        statements
    }

* The parameter_names is a comma separated list of parameter names; the parameter declarations need not be in the same order as the parameter_names
* I prefer this format, since there is a separate line for each parameter, which is easier to document
* The ANSI style of procedure definition is:

    type procedure_name(parameter_declarations)
    {
        variable declarations
        statements
    }

* The return statement is used to specify the value returned by the procedure and return control to the calling procedure
* The two formats of the return statement are:

    return;

  and

    return(Rvalue);

* The first form is only used with procedures of type void

Parameter Passing

* All parameter passing in C is by value, that is, when a procedure is called the parameter values from the calling procedure are copied into temporary storage in the called procedure - all modifications to the parameter values occur in this temporary storage, the original values in the calling procedure are not changed
* This means that you cannot return a result directly through a parameter; the only way to return a result through a parameter is to pass a pointer to the variable where the result should be stored
* Remember that an array argument is passed as a pointer to its first element, so if you pass an array (not an array element) to a procedure, you can modify the elements in the array and these modifications will be seen outside of the procedure

    void foo(int x)
    {
        /* we can do anything we like to x inside this procedure;
           the calling procedure won't see any of these changes,
           it only provides the initial value of x */
    }

    void foo(int *x)
    {
        *x = 5;
    }

    foo(&i);
    printf("%d",i);

* This will print 5, since we have passed a pointer to i into the procedure and the value pointed at is changed by the procedure - note that foo(i) will cause all sorts of problems if i isn't a pointer

-------------------------------------------------------------------------------
Cmput 201 Course Notes / Dr. Mark Green / mark@cs.ualberta.ca