Assignment Makes Pointer From Integer Without A Cast Care

This chapter explains the features, technical details and syntaxes of the C programming language. I assume that you could write some simple programs. Otherwise, read "Introduction to Programming in C for Novices and First-time Programmers".

Introduction to C

C Standards

C is standardized as ISO/IEC 9899.

  1. K&R C: Pre-standardized C, based on Brian Kernighan and Dennis Ritchie (K&R) "The C Programming Language" 1978 book.
  2. C90 (ISO/IEC 9899:1990 "Programming Languages. C"). Also known as ANSI C 89.
  3. C99 (ISO/IEC 9899:1999 "Programming Languages. C")
  4. C11 (ISO/IEC 9899:2011 "Programming Languages. C")
C Features

[TODO]

C Strength and Pitfall

[TODO]

Basic Syntaxes

Revision

Below is a simple C program that illustrates the important programming constructs (sequential flow, while-loop, and if-else) and input/output. Read "Introduction to Programming in C for Novices and First-time Programmers" if you need help in understanding this program.

    #include <stdio.h>

    int main() {
        int sumOdd = 0;
        int sumEven = 0;
        int upperbound;
        int absDiff;

        printf("Enter the upperbound: ");
        scanf("%d", &upperbound);

        int number = 1;
        while (number <= upperbound) {
            if (number % 2 == 0) {
                sumEven += number;
            } else {
                sumOdd += number;
            }
            ++number;
        }

        if (sumOdd > sumEven) {
            absDiff = sumOdd - sumEven;
        } else {
            absDiff = sumEven - sumOdd;
        }

        printf("The sum of odd numbers is %d.\n", sumOdd);
        printf("The sum of even numbers is %d.\n", sumEven);
        printf("The absolute difference is %d.\n", absDiff);

        return 0;
    }

    Enter the upperbound: 1000
    The sum of odd numbers is 250000.
    The sum of even numbers is 250500.
    The absolute difference is 500.

Comments

Comments are used to document and explain your code and program logic. Comments are not programming statements and are ignored by the compiler, but they are VERY IMPORTANT for providing documentation and explanation for others to understand your program (and also for yourself three days later).

There are two kinds of comments in C:

  1. Multi-line Comment: begins with /* and ends with */, and can span several lines.
  2. End-of-line Comment: begins with // and lasts till the end of the current line (standard since C99).

You should use comments liberally to explain and document your codes. During program development, instead of deleting a chunk of statements permanently, you could comment-out these statements so that you could get them back later, if needed.

Statements and Blocks

Statement: A programming statement is the smallest independent unit in a program, just like a sentence in the English language. It performs a piece of programming action. A programming statement must be terminated by a semi-colon (;), just like an English sentence ends with a period. (Why not end with a period like an English sentence? Because the period clashes with the decimal point: it would be hard for the compiler to differentiate between a period and a decimal point!)

For examples,

    int number1 = 10;
    int number2 = 20, number3 = 99;   // number2 is given a value so that the product below is well defined
    int product;
    product = number1 * number2 * number3;
    printf("Hello\n");

Block: A block (or a compound statement) is a group of statements surrounded by braces { }. All the statements inside the block are treated as one unit. Blocks are used as the body in constructs like function, if-else and loop, which may contain multiple statements but are treated as one unit. There is no need to put a semi-colon after the closing brace to end a compound statement. An empty block (without any statement) is permitted. For example,

    if (mark >= 50) {
        printf("PASS\n");
        printf("Well Done!\n");
        printf("Keep it Up!\n");
    }

    if (number == 88) {
        printf("Got it\n");
    } else {
        printf("Try Again\n");
    }

    i = 1;
    while (i < 8) {
        printf("%d\n", i);
        ++i;
    }

    int main() {
        ...statements...
    }

White Spaces and Formatting Source Codes

White Spaces: Blank, tab and new-line are collectively called white spaces. C ignores extra white spaces. That is, multiple contiguous white spaces are treated as a single white space.

You need to use a white space to separate two keywords or tokens, e.g.,

    int sum=0;
    double average;
    average=sum/100.0;

Additional white spaces and extra lines are, however, ignored, e.g.,

    int  sum
     =0     ;

       double  average  ;
    average = sum / 100.0;

Formatting Source Codes: As mentioned, extra white spaces are ignored and have no computational significance. However, proper indentation (with tabs and blanks) and extra empty lines greatly improves the readability of the program, which is extremely important for others (and yourself three days later) to understand your programs. For example, the following hello-world works, but can you understand the program?

    #include <stdio.h>
    int main(){printf("Hello, world!\n");return 0;}

Braces: Place the beginning brace at the end of the line, and align the ending brace with the start of the statement.

Indentation: Indent the body of a block by an extra 3 (or 4) spaces, according to its level.

For example,

    #include <stdio.h>

    int main() {
        int mark = 70;
        if (mark >= 50) {
            printf("You Pass!\n");
        } else {
            printf("You Fail!\n");
        }
        return 0;
    }

Most IDEs (such as CodeBlocks, Eclipse and NetBeans) have a command to re-format your source code automatically.

Note: Traditional C-style formatting places the beginning and ending braces on the same column. For example,

    #include <stdio.h>

    int main()
    {
        int mark = 70;
        if (mark >= 50)
        {
            printf("You Pass!\n");
        }
        else
        {
            printf("You Fail!\n");
        }
        return 0;
    }

Preprocessor Directives

C source code is preprocessed before it is compiled into object code (as illustrated).

A preprocessor directive, which begins with a # sign (such as #include, #define), tells the preprocessor to perform a certain action (such as including a header file, or performing text replacement), before compiling the source code into object code. Preprocessor directives are not programming statements, and therefore should NOT be terminated with a semi-colon. For example,

    #include <stdio.h>
    #include <math.h>
    #define PI 3.14159265
    // DO NOT terminate preprocessor directive with a semi-colon

In almost all C programs, we use #include <stdio.h> to include the standard input/output library header into our program, so as to use the I/O library functions to carry out input/output operations (such as printf() and scanf()).

More on preprocessor directives later.

Variables and Types

Variables

Computer programs manipulate (or process) data. A variable is used to store a piece of data for processing. It is called variable because you can change the value stored.

More precisely, a variable is a named storage location, that stores a value of a particular data type. In other words, a variable has a name, a type and stores a value.

  • A variable has a name (or identifier), e.g., radius, area, age, height. The name is needed to uniquely identify each variable, so as to assign a value to the variable (e.g., radius = 1.2), and retrieve the value stored (e.g., area = radius * radius * 3.1416).
  • A variable has a type. Examples of type are,
    • int: for integers (whole numbers) such as 123 and -456;
    • double: for floating-point or real numbers such as 3.1416, -55.66, having a decimal point and fractional part.
  • A variable can store a value of that particular type. It is important to take note that a variable in most programming languages is associated with a type, and can only store a value of that particular type. For example, an int variable can store an integer value such as 123, but NOT a real number such as 12.34, nor text such as "Hello".
  • The concept of type was introduced into the early programming languages to simplify interpretation of data made up of 0s and 1s. The type determines the size and layout of the data, the range of its values, and the set of operations that can be applied.

The following diagram illustrates two types of variables: int and double. An int variable stores an integer (whole number). A double variable stores a real number.

Identifiers

An identifier is needed to name a variable (or any other entity such as a function or a type). C imposes the following rules on identifiers:

  • An identifier is a sequence of characters, of up to a certain length (compiler-dependent, typically 255 characters), comprising uppercase and lowercase letters (a-z, A-Z), digits (0-9), and underscore (_).
  • White space (blank, tab, new-line) and other special characters (such as +, -, *, /, @, &, commas, etc.) are not allowed.
  • An identifier must begin with a letter or underscore. It cannot begin with a digit. Identifiers beginning with an underscore are typically reserved for system use.
  • An identifier cannot be a reserved keyword (e.g., int, double, if, else, for).
  • Identifiers are case-sensitive. A rose is NOT a Rose, and is NOT a ROSE.

Caution: Programmers don't use blank character in names. It is either not supported, or will pose you more challenges.

Variable Naming Convention

A variable name is a noun, or a noun phrase made up of several words. The first word is in lowercase, while the remaining words are initial-capitalized, with no spaces between words. For example, fontSize, roomNumber, xMax, yMin, xTopLeft and thisIsAVeryLongVariableName. This convention is also known as camel-case.

Recommendations
  1. It is important to choose a name that is self-descriptive and closely reflects the meaning of the variable, e.g., numberOfStudents or numStudents.
  2. Do not use meaningless names like a, b, c, d, i, j, k, i1, j99.
  3. Avoid single-letter names, which are easier to type but often meaningless, unless they are common names like x, y, z for coordinates, i for index.
  4. It is perfectly okay to use long names of, say, 30 characters to make sure that the name accurately reflects its meaning!
  5. Use singular and plural nouns prudently to differentiate between singular and plural variables. For example, you may use the variable row to refer to a single row number and the variable rows to refer to many rows (such as an array of rows, to be discussed later).

Variable Declaration

To use a variable in your program, you need to first "introduce" it by declaring its name and type, in one of the following syntaxes:

| Syntax | Example |
|---|---|
| type identifier; | int option; |
| type identifier-1, identifier-2, ..., identifier-n; | double sum, difference, product, quotient; |
| type identifier = value; | int magicNumber = 88; |
| type identifier-1 = value-1, ..., identifier-n = value-n; | double sum = 0.0, product = 1.0; |

Example,

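    int mark1;                         // declare an int variable called mark1
    mark1 = 76;                        // use mark1
    int mark2;                         // declare int variable mark2
    mark2 = mark1 + 10;                // use mark2 and mark1
    double average;                    // declare double variable average
    average = (mark1 + mark2) / 2.0;   // use average, mark1 and mark2
    int mark1;                         // error: mark1 is already declared in this block
    mark2 = "Hello";                   // invalid: assigns a string to an int variable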

Take note that:

  • In C, you need to declare the name of a variable before it can be used.
  • C is a statically typed language. A variable takes on a type. Once the type of a variable is declared, it can only store a value belonging to this particular type. For example, an int variable can hold only an integer such as 123, and NOT a floating-point number such as -2.17 or a text string such as "Hello". The concept of type was introduced into the early programming languages to simplify interpretation of data made up of 0s and 1s. Knowing the type of a piece of data greatly simplifies its interpretation and processing.
  • Each variable can only be declared once within the same scope.
  • In C, you can declare a variable anywhere inside the program, as long as it is declared before it is used. (In C prior to C99, all the variables must be declared at the beginning of a block.) It is recommended that you declare a variable just before it is first used.
  • The type of a variable cannot be changed inside the program.
CAUTION: Uninitialized Variables

When a variable is declared, it contains garbage until you assign an initial value. It is important to take note that a C compiler is not required to issue any warning/error if you use a variable before initializing it (many compilers warn only when warnings are enabled, e.g., with GCC's -Wall option), which certainly leads to unexpected results. For example,

    #include <stdio.h>

    int main() {
        int number;              // declared but not initialized
        printf("%d\n", number);  // prints an unpredictable (garbage) value
        return 0;
    }

Constants (const)

Constants are non-modifiable variables, declared with the keyword const. Their values cannot be changed during program execution. Also, a const must be initialized during declaration. For example:

    const double PI = 3.1415926;

Constant Naming Convention: Use uppercase words, joined with underscore. For example, MIN_VALUE, MAX_SIZE.

Expressions

An expression is a combination of operators (such as addition +, subtraction -, multiplication *, division /) and operands (variables or literal values), that can be evaluated to yield a single value of a certain type. For example,

    1 + 2 * 3                        // evaluates to int 7

    int sum, number;
    sum + number                     // evaluates to an int value

    double principal, interestRate;
    principal * (1 + interestRate)   // evaluates to a double value

Assignment (=)

An assignment statement:

  1. assigns a literal value (of the RHS) to a variable (of the LHS); or
  2. evaluates an expression (of the RHS) and assign the resultant value to a variable (of the LHS).

The RHS shall be a value; and the LHS shall be a variable (or memory address).

The syntax for assignment statement is:

| Syntax | Example |
|---|---|
| variable = literal-value; | number = 88; |
| variable = expression; | sum = sum + number; |

The assignment statement should be interpreted this way: The expression on the right-hand-side (RHS) is first evaluated to produce a resultant value (called rvalue or right-value). The rvalue is then assigned to the variable on the left-hand-side (LHS) (or lvalue, which is a location that can hold a rvalue). Take note that you have to first evaluate the RHS, before assigning the resultant value to the LHS. For examples,

    number = 8;            // assign the literal value 8 to the variable number
    number = number + 1;   // evaluate number + 1, then assign the result back to number

The symbol "=" is known as the assignment operator. The meaning of "=" in programming is different from Mathematics. It denotes assignment instead of equality. The RHS is a literal value, or an expression that evaluates to a value, while the LHS must be a variable. Note that x = x + 1 is valid (and often used) in programming. It evaluates x + 1 and assigns the resultant value to the variable x. However, x = x + 1 is illegal in Mathematics. While x + y = 1 is allowed in Mathematics, it is invalid in programming (because the LHS of an assignment statement must be a variable). Some programming languages use symbol ":=", "←", "->", or "→" as the assignment operator to avoid confusion with equality.

Fundamental Types

Integers: C supports these integer types: char, short, int, long, long long (since C99) in a non-decreasing order of size. The actual size depends on the implementation. The integers (except char) are signed numbers (which can hold zero, positive and negative numbers). You could use the keyword unsigned [char|short|int|long|long long] to declare an unsigned integer (which can hold zero and positive numbers). There are a total of 10 integer types: signed|unsigned combined with char|short|int|long|long long.

Characters: Characters (e.g., 'a', 'Z', '0', '9') are encoded in ASCII into integers, and kept in type char. For example, character '0' is 48 (decimal) or 30H (hexadecimal); character 'A' is 65 (decimal) or 41H (hexadecimal); character 'a' is 97 (decimal) or 61H (hexadecimal). Take note that the type char can be interpreted as a character in ASCII code, or as an 8-bit integer. Unlike int or long, which are signed, char could be signed or unsigned, depending on the implementation. You can use signed char or unsigned char to explicitly declare a signed or unsigned char.

Floating-point Numbers: There are 3 floating-point types: float, double and long double, for single, double and long double precision floating-point numbers. On most platforms, float and double are represented as specified by the IEEE 754 standard. A float can represent a number with a magnitude between about 1.4×10^-45 and 3.4×10^38. A double can represent a number with a magnitude between about 4.9×10^-324 and 1.8×10^308. Take note that not all real numbers can be represented by float and double, because there are infinitely many real numbers. Most of the values are approximated.

The table below shows the typical size, minimum, maximum for the primitive types. Again, take note that the sizes are implementation dependent.

| Category | Type | Description | Bytes (Typical) | Minimum (Typical) | Maximum (Typical) |
|---|---|---|---|---|---|
| Integers | int (or signed int) | Signed integer (of at least 16 bits) | 4 (2) | -2147483648 | 2147483647 |
| | unsigned int | Unsigned integer (of at least 16 bits) | 4 (2) | 0 | 4294967295 |
| | char | Character (can be either signed or unsigned, depending on the implementation) | 1 | | |
| | signed char | Character or signed tiny integer (guaranteed to be signed) | 1 | -128 | 127 |
| | unsigned char | Character or unsigned tiny integer (guaranteed to be unsigned) | 1 | 0 | 255 |
| | short (or short int, signed short, signed short int) | Short signed integer (of at least 16 bits) | 2 | -32768 | 32767 |
| | unsigned short (or unsigned short int) | Unsigned short integer (of at least 16 bits) | 2 | 0 | 65535 |
| | long (or long int, signed long, signed long int) | Long signed integer (of at least 32 bits) | 4 (8) | -2147483648 | 2147483647 |
| | unsigned long (or unsigned long int) | Unsigned long integer (of at least 32 bits) | 4 (8) | 0 | 4294967295 |
| | long long (or long long int, signed long long, signed long long int) | Very long signed integer (of at least 64 bits) | 8 | -2^63 | 2^63-1 |
| | unsigned long long (or unsigned long long int) | Unsigned very long integer (of at least 64 bits) | 8 | 0 | 2^64-1 |
| Real Numbers | float | Floating-point number, ≈7 digits (IEEE 754 single-precision floating-point format) | 4 | ≈1.2e-38 (smallest positive normalized value) | ≈3.4e38 |
| | double | Double precision floating-point number, ≈15 digits (IEEE 754 double-precision floating-point format) | 8 | ≈2.2e-308 (smallest positive normalized value) | ≈1.8e308 |
| | long double | Long double precision floating-point number, ≈19 digits (format is implementation-dependent, e.g., x87 80-bit extended precision) | 12 (8, 16) | | |
| Wide Characters | wchar_t | Wide (double-byte) character | 2 (4) | | |

In addition, many C library functions use a type called size_t, which is a typedef of an unsigned integer type (such as unsigned int or unsigned long, depending on the implementation), meant for counting, size or length, with 0 and positive integers.

*The sizeof Operator

C provides a unary operator sizeof to get the size of the operand (in bytes). The following program uses the sizeof operator to print the sizes of the fundamental types. The result of sizeof has type size_t, so it is printed with the %zu conversion code (available since C99) rather than %d.

    #include <stdio.h>

    int main() {
        printf("sizeof(char) is %zu bytes.\n", sizeof(char));
        printf("sizeof(short) is %zu bytes.\n", sizeof(short));
        printf("sizeof(int) is %zu bytes.\n", sizeof(int));
        printf("sizeof(long) is %zu bytes.\n", sizeof(long));
        printf("sizeof(long long) is %zu bytes.\n", sizeof(long long));
        printf("sizeof(float) is %zu bytes.\n", sizeof(float));
        printf("sizeof(double) is %zu bytes.\n", sizeof(double));
        printf("sizeof(long double) is %zu bytes.\n", sizeof(long double));
        return 0;
    }

    sizeof(char) is 1 bytes.
    sizeof(short) is 2 bytes.
    sizeof(int) is 4 bytes.
    sizeof(long) is 4 bytes.
    sizeof(long long) is 8 bytes.
    sizeof(float) is 4 bytes.
    sizeof(double) is 8 bytes.
    sizeof(long double) is 12 bytes.

The results may vary among different systems.

*Header <limits.h>

The limits.h header contains information about the limits of the integer types. For example,

    #include <stdio.h>
    #include <limits.h>

    int main() {
        printf("int max = %d\n", INT_MAX);
        printf("int min = %d\n", INT_MIN);
        printf("unsigned int max = %u\n", UINT_MAX);

        printf("long max = %ld\n", LONG_MAX);
        printf("long min = %ld\n", LONG_MIN);
        printf("unsigned long max = %lu\n", ULONG_MAX);

        printf("long long max = %lld\n", LLONG_MAX);
        printf("long long min = %lld\n", LLONG_MIN);
        printf("unsigned long long max = %llu\n", ULLONG_MAX);

        printf("Bits in char = %d\n", CHAR_BIT);
        printf("char max = %d\n", CHAR_MAX);
        printf("char min = %d\n", CHAR_MIN);
        printf("signed char max = %d\n", SCHAR_MAX);
        printf("signed char min = %d\n", SCHAR_MIN);
        printf("unsigned char max = %u\n", UCHAR_MAX);
        return 0;
    }

    int max = 2147483647
    int min = -2147483648
    unsigned int max = 4294967295
    long max = 2147483647
    long min = -2147483648
    unsigned long max = 4294967295
    long long max = 9223372036854775807
    long long min = -9223372036854775808
    unsigned long long max = 18446744073709551615
    Bits in char = 8
    char max = 127
    char min = -128
    signed char max = 127
    signed char min = -128
    unsigned char max = 255

Again, the outputs depend on the system.

The minimum of an unsigned integer is always 0. The other constants are SHRT_MAX, SHRT_MIN, USHRT_MAX, LONG_MIN, LONG_MAX, ULONG_MAX. Try inspecting this header (search for limits.h under your compiler).

*Header <float.h>

Similarly, the float.h header contains information on limits for floating-point numbers, such as the minimum number of significant digits (FLT_DIG, DBL_DIG, LDBL_DIG for float, double and long double), the number of bits for the mantissa (FLT_MANT_DIG, DBL_MANT_DIG, LDBL_MANT_DIG), maximum and minimum exponent values, etc. Try inspecting this header (search for float.h under your compiler).

Choosing Types

As a programmer, you need to choose variables and decide on the type of the variables to be used in your programs. Most of the time, the decision is intuitive. For example, use an integer type for counting and whole numbers; a floating-point type for numbers with a fractional part; char for a single character; and int (or C99's bool from <stdbool.h>) for a binary outcome.

Rule of Thumb
  • Use int for integers and double for floating-point numbers. Use char, short, long, long long and float only if you have a good reason to choose that specific precision.
  • Use int (or unsigned int) for counting and indexing, NOT a floating-point type (float or double). This is because integer types are precise and more efficient in operations.
  • Use an integer type if possible. Use a floating-point type only if the number contains a fractional part.

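Read my article on "Data Representation" if you wish to understand how the numbers and characters are represented inside the computer memory. In brief, it is important to take note that char '1' is different from int 1, short 1, float 1.0, double 1.0, and the string "1". They are represented differently in the computer memory, with different precision and interpretation. For example, on a typical platform, short 1 is 00000000 00000001 (16 bits), int 1 is 00000000 00000000 00000000 00000001 (32 bits), long long 1 is 0x0000000000000001 (64 bits), float 1.0 is 0x3F800000, double 1.0 is 0x3FF0000000000000, and char '1' is 00110001 (0x31).

There is a subtle difference between int 0 and double 0.0.

Furthermore, you MUST know the type of a value before you can interpret a value. For example, a bit pattern such as 00000000 00000000 00000000 00000001 cannot be interpreted unless you know the type.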

*The typedef Statement

Typing "unsigned int" many times can get annoying. The typedef statement can be used to create a new name for an existing type. For example, you can create a new type name called "uint" for "unsigned int" as follows. You should place the typedef immediately after the #include directives. Use typedef with care because it can make the program hard to read and understand.

typedef unsigned int uint;

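The standard headers define a type called size_t, which is a typedef of an unsigned integer type chosen by the implementation. On a platform where it is unsigned int, the definition looks like this: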

typedef unsigned int size_t;

Output via printf() Function

C programs use the printf() function of the stdio library to print output to the console. You need to issue a so-called preprocessor directive "#include <stdio.h>" to use printf().

To print a string literal such as "Hello, world", simply place it inside the parentheses, as follow:

printf(aStringLiteral);

For example,

    printf("Hello, world\n");

    Hello, world
    _

The \n represents the newline character. Printing a newline advances the cursor (denoted by _ in the above example) to the beginning of the next line. printf(), by default, places the cursor after the printed string, and does not advance the cursor to the next line. For example,

    printf("Hello");
    printf(", ");
    printf("world!");
    printf("\n");
    printf("Hello\nworld\nagain\n");

    Hello, world!
    Hello
    world
    again
    _

Formatted Output via printf()

The "f" in printf() stands for "formatted" printing. To do formatted printing, you need to use the following syntax:

    printf(formattingString, variable1, variable2, ...)

The formattingString is a string composed of normal text and conversion specifiers. Normal text will be printed as it is. A conversion specifier begins with a percent sign (%), followed by a code to specify the type of variable and format of the output (such as the field width and number of decimal places). For example, %d denotes an int; %3d denotes an int with a field-width of 3. The conversion specifiers are used as placeholders, which will be substituted by the variables given after the formatting string in a sequential manner. For example,

    #include <stdio.h>

    int main() {
        int number1 = 12345, number2 = 678;
        printf("Hello, number1 is %d.\n", number1);
        printf("number1=%d, number2=%d.\n", number1, number2);
        printf("number1=%8d, number2=%5d.\n", number1, number2);
        printf("number1=%08d, number2=%05d.\n", number1, number2);
        printf("number1=%-8d, number2=%-5d.\n", number1, number2);
        return 0;
    }

    Hello, number1 is 12345.
    number1=12345, number2=678.
    number1=   12345, number2=  678.
    number1=00012345, number2=00678.
    number1=12345   , number2=678  .

Type Conversion Code

The commonly-used type conversion codes are:

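| Type | Type Conversion Code | Type & Format |
|---|---|---|
| Integers | %d (or %i) | (signed) int |
| | %u | unsigned int |
| | %o | int in octal |
| | %x, %X | int in hexadecimal (%X uses uppercase A-F) |
| | %hhd, %hhu | char, unsigned char |
| | %hd, %hi, %hu | short, unsigned short |
| | %ld, %li, %lu | long, unsigned long |
| | %lld, %lli, %llu | long long, unsigned long long |
| Floating-point | %f | float in fixed notation |
| | %e, %E | float in scientific notation |
| | %g, %G | float in fixed/scientific notation depending on its value |
| | %f, %lf (printf()), %lf (scanf()) | double: use %f or %lf in printf(), but %lf in scanf() |
| | %Lf, %Le, %LE, %Lg, %LG | long double |
| Character | %c | char |
| String | %s | string |

Notes:

  • For double, you must use %lf (for long float) in scanf() (or %le, %lE, %lg, %lG), but you can use either %f or %lf in printf() (or %e, %E, %g, %G, %le, %lE, %lg, %lG).
  • Use %% to print a % in the formatting string.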

For example,

    int anInt = 12345;
    float aFloat = 55.6677;
    double aDouble = 11.2233;
    char aChar = 'a';
    char aStr[] = "Hello";

    printf("The int is %d.\n", anInt);
    printf("The float is %f.\n", aFloat);
    printf("The double is %lf.\n", aDouble);
    printf("The char is %c.\n", aChar);
    printf("The string is %s.\n", aStr);

    printf("The int (in hex) is %x.\n", anInt);
    printf("The double (in scientific) is %le.\n", aDouble);
    printf("The float (in scientific) is %E.\n", aFloat);

Using the wrong type conversion code usually produces garbage.

Field Width

You can optionally specify a field-width before the type conversion code, e.g., %3d, %6f, %20s. If the value to be formatted is shorter than the field width, it will be padded with spaces (by default). Otherwise, the field-width will be ignored. For example,

    int number = 123456;
    printf("number=%d.\n", number);    // number=123456.
    printf("number=%8d.\n", number);   // number=  123456.
    printf("number=%3d.\n", number);   // number=123456. (field-width too short, ignored)

Precision (Decimal Places) for Floating-point Numbers

For floating-point numbers, you can optionally specify the number of decimal places to be printed, e.g., %6.2f, %.2f. For example,

    double value = 123.14159265;
    printf("value=%lf;\n", value);      // value=123.141593;
    printf("value=%6.2lf;\n", value);   // value=123.14;
    printf("value=%9.4lf;\n", value);   // value= 123.1416;
    printf("value=%3.2lf;\n", value);   // value=123.14; (field-width too short, ignored)

Alignment

The output is right-aligned by default. You could include a "-" flag (before the field width) to ask for left-alignment. For example,

    int i1 = 12345, i2 = 678;
    printf("Hello, first int is %d, second int is %5d.\n", i1, i2);    // ...second int is   678.
    printf("Hello, first int is %d, second int is %-5d.\n", i1, i2);   // ...second int is 678  .

    char msg[] = "Hello";
    printf("xx%20sxx\n", msg);    // right-aligned in a field of 20 characters
    printf("xx%-20sxx\n", msg);   // left-aligned in a field of 20 characters

Others
  • + (plus sign): display a plus or minus sign preceding the number.
  • 0 (zero): pad with leading zeros instead of spaces (e.g., %05d).
C11's printf_s()/scanf_s()

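C11 introduces (in its optional Annex K) bounds-checking versions of printf()/scanf() called printf_s()/scanf_s(), which perform additional run-time checks on their arguments. Microsoft Visual C implemented its own versions of printf_s()/scanf_s() before C11, and issues a deprecation warning for using functions such as scanf().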

Input via scanf() Function

In C, you can use the scanf() function of <stdio.h> to read inputs from the keyboard. scanf() uses type-conversion codes like printf(). For example,

    #include <stdio.h>

    int main() {
        int anInt;
        float aFloat;
        double aDouble;

        printf("Enter an int: ");
        scanf("%d", &anInt);
        printf("The value entered is %d.\n", anInt);

        printf("Enter a floating-point number: ");
        scanf("%f", &aFloat);
        printf("The value entered is %f.\n", aFloat);

        printf("Enter a floating-point number: ");
        scanf("%lf", &aDouble);
        printf("The value entered is %lf.\n", aDouble);

        return 0;
    }

Notes:

  • To place the input into a variable in scanf(), you need to prefix the variable name with an ampersand sign (&). The ampersand (&) is called the address-of operator, which will be explained later. However, it is important to stress that a missing ampersand (&) is a common error.
  • For double, you must use the type conversion code %lf for scanf(). You could use %f or %lf for printf().

Return-Value for scanf()

The scanf() function returns an int indicating the number of values read.

For example,

    int number1 = 55, number2 = 66;
    int rcode = scanf("%d", &number1);
    printf("return code is %d\n", rcode);
    printf("number1 is %d\n", number1);
    printf("number2 is %d\n", number2);

The scanf() returns 1 if the user enters an integer, which is read into the variable number1. It returns 0 if the user enters a non-integer (such as "hello"), and the variable number1 is not assigned.

    int number1 = 55, number2 = 66;
    int rcode = scanf("%d%d", &number1, &number2);
    printf("return code is %d\n", rcode);
    printf("number1 is %d\n", number1);
    printf("number2 is %d\n", number2);

The scanf() returns 2 if the user enters two integers that are read into number1 and number2. It returns 1 if the user enters an integer followed by a non-integer, and number2 will not be affected. It returns 0 if the user enters a non-integer, and both number1 and number2 will not be affected.

Checking the return code of scanf() is recommended for secure coding.

Literals for Fundamental Types and String

A literal is a specific constant value, such as 123, -456, 3.14, 'a', "Hello", that can be assigned directly to a variable, or used as part of an expression. They are called literals because they literally and explicitly identify their values.

Integer Literals

A whole number, such as 123 and -456, is treated as an int, by default. For example,

    int number = -123;
    int sum = 4567;
    int bigSum = 8234567890;   // Problem: this value is outside the range of a 32-bit int

An int literal may be preceded by a plus (+) or minus (-) sign, followed by digits. No commas or special symbols (e.g., $ or space) are allowed (e.g., 1,234 and $123 are invalid). A leading 0 is not allowed in a decimal value either, because it denotes octal (e.g., 007 is read as an octal number, and 08 is invalid).

Besides the default base 10 integers, you can use a prefix 0 (zero) to denote a value in octal, prefix 0x for a value in hexadecimal, and prefix 0b for a binary value (in some compilers), e.g.,

    int number1 = 1234;         // decimal
    int number2 = 01234;        // octal 1234 (decimal 668)
    int number3 = 0x1abc;       // hexadecimal 1ABC (decimal 6844)
    int number4 = 0b10001001;   // binary (may not work in some compilers)

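A long literal is identified by a suffix L or l (avoid lowercase, which can be confused with the number one). A long long int is identified by a suffix LL. You can also use suffix U for unsigned int, UL for unsigned long, and ULL for unsigned long long int. For example,

    long number = 12345678L;             // suffix 'L' for long
    long sum = 123;                      // int 123 is automatically converted to long
    long long bigNumber = 987654321LL;   // suffix 'LL' for long long int

No suffix is needed for short literals. But you can only use integer values in the permitted range. For example,

    short smallNumber = 1234567890;   // Problem: this value is outside the range of short
    short midSizeNumber = -12345;
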
Floating-point Literals

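A number with a decimal point, such as 55.66 and -33.44, is treated as a double, by default. You can also express them in scientific notation, e.g., 1.2e3, -5.5E-6, where e or E denotes the exponent in power of 10. You could precede the fractional part or exponent with a plus (+) or minus (-) sign. The exponent shall be an integer. There should be no space or other characters in the number.

Use a suffix of f or F for float literals, e.g., -1.2345F. Without the suffix, the literal is a double, which is implicitly converted when assigned to a float. For example,

    float average1 = 55.66;    // double literal 55.66, implicitly converted to float
    float average2 = 55.66f;   // float literal, thanks to the suffix 'f'

Use suffix L (or l) for long double.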

Character Literals and Escape Sequences

A printable char literal is written by enclosing the character with a pair of single quotes, e.g., 'z', '$', and '9'. In C, characters are typically represented using ASCII codes stored in an 8-bit char, and can be treated as 8-bit integers in arithmetic operations. In other words, char and 8-bit integer are interchangeable. You can also assign an integer in the range of [-128, 127] to a signed char variable; and [0, 255] to an unsigned char.

Refer to an ASCII code table for the complete list of character codes.

For example,

    char letter = 'a';               // same as 97
    char anotherLetter = 98;         // same as the letter 'b'
    printf("%c\n", letter);          // a
    printf("%c\n", anotherLetter);   // b
    anotherLetter += 2;              // 100 or 'd'
    printf("%c\n", anotherLetter);   // d
    printf("%d\n", anotherLetter);   // 100

Non-printable and control characters can be represented by so-called escape sequences, which begin with a back-slash (\) followed by a code. The commonly-used escape sequences are:

| Escape Sequence | Description | Hex (Decimal) |
|---|---|---|
| \n | New-line (or Line-feed) | 0AH (10D) |
| \r | Carriage-return | 0DH (13D) |
| \t | Tab | 09H (9D) |
| \" | Double-quote (needed to include " in double-quoted string) | 22H (34D) |
| \' | Single-quote | 27H (39D) |
| \\ | Back-slash (to resolve ambiguity) | 5CH (92D) |

Notes:

  • New-line (0AH) and carriage return (0DH), represented by \n and \r respectively, are used as line delimiters (or end-of-line, or EOL). However, take note that UNIX/Linux/Mac use \n as EOL, while Windows uses \r\n.
  • Horizontal Tab (09H) is represented as \t.
  • To resolve ambiguity, the characters back-slash (\), single-quote (') and double-quote (") are represented using escape sequences \\, \' and \", respectively. This is because a single back-slash begins an escape sequence, while single-quotes and double-quotes are used to enclose characters and strings.
  • Other less commonly-used escape sequences are: \a for alert or bell, \b for backspace, \f for form-feed, \v for vertical tab, and \? for a question mark. These may not be supported in some consoles.

The <ctype.h> Header

The ctype.h header provides functions such as isalpha(), isdigit(), isspace(), ispunct(), isalnum(), isupper(), islower() to determine the type of a character; and toupper(), tolower() for case conversion.

String Literals

A string literal is composed of zero or more characters surrounded by a pair of double quotes, e.g., "Hello, world!", "The sum is ", "".

String literals may contain escape sequences. Inside a string, you need to use \" for a double-quote to distinguish it from the ending double-quote, e.g., "\"quoted\"". A single quote inside a string does not require an escape sequence. For example,

    printf("Use \\\" to place\n a \" within\ta\tstring\n");

    Use \" to place
     a " within     a       string

TRY: Write a program to print the following picture. Take note that you need to use escape sequences to print special characters.

              '__'
              (oo)
      +========\/
     / || %%% ||
    *  ||-----||
       ""     ""

Example (Literals)

    #include <stdio.h>

    int main() {
        char gender = 'm';
        unsigned short numChildren = 8;
        short yearOfBirth = 1945;
        unsigned int salary = 88000;
        double weight = 88.88;
        float gpa = 3.88f;

        printf("Gender is %c.\n", gender);
        printf("Number of children is %u.\n", numChildren);
        printf("Year of birth is %d.\n", yearOfBirth);
        printf("Salary is %u.\n", salary);
        printf("Weight is %.2lf.\n", weight);
        printf("GPA is %.2f.\n", gpa);
        return 0;
    }

The output is:

    Gender is m.
    Number of children is 8.
    Year of birth is 1945.
    Salary is 88000.
    Weight is 88.88.
    GPA is 3.88.

Operations

Arithmetic Operators

C supports the following arithmetic operators for the numeric types: short, int, long, long long, char (treated as a small integer), their unsigned counterparts, float, double, and long double.

| Operator | Description | Usage | Examples |
|---|---|---|---|
| * | Multiplication | expr1 * expr2 | 2 * 3 → 6; 3.3 * 1.0 → 3.3 |
| / | Division | expr1 / expr2 | 1 / 2 → 0; 1.0 / 2.0 → 0.5 |
| % | Remainder (Modulus) | expr1 % expr2 | 5 % 2 → 1; -5 % 2 → -1 |
| + | Addition | expr1 + expr2 | 1 + 2 → 3; 1.1 + 2.2 → 3.3 |
| - | Subtraction | expr1 - expr2 | 1 - 2 → -1; 1.1 - 2.2 → -1.1 |

All the above operators are binary operators, i.e., they take two operands. Multiplication, division and remainder take precedence over addition and subtraction. Within the same precedence level (e.g., addition and subtraction), the expression is evaluated from left to right. For example, 1+2-3+4 is evaluated as ((1+2)-3)+4.

It is important to take note that int/int produces an int, with the result truncated, e.g., 1/2 → 0 (instead of 0.5).

Take note that C does not have an exponent (power) operator (^ is exclusive-or, not exponent).

Arithmetic Expressions

In programming, an arithmetic expression must be written on one line with every operator spelled out. You cannot omit the multiplication symbol (as in Mathematics). For example, 2a + 3b must be written as 2*a + 3*b.

Like Mathematics, multiplication (*) and division (/) take precedence over addition (+) and subtraction (-). Parentheses () have higher precedence. The operators +, -, * and / are left-associative. That is, 1 + 2 + 3 + 4 is treated as (((1 + 2) + 3) + 4).

Mixed-Type Operations

If both the operands of an arithmetic operation belong to the same type, the operation is carried out in that type, and the result belongs to that type. For example, int/int → int; double/double → double.

However, if the two operands belong to different types, the compiler promotes the value of the smaller type to the larger type (known as implicit type-casting). The operation is then carried out in the larger type. For example, int/double → double/double → double. Hence, 1/2 → 0, 1.0/2.0 → 0.5, 1.0/2 → 0.5, 1/2.0 → 0.5.

For example,

| Type | Example | Operation |
|---|---|---|
| int | 2 + 3 | int 2 + int 3 → int 5 |
| double | 2.2 + 3.3 | double 2.2 + double 3.3 → double 5.5 |
| mix | 2 + 3.3 | int 2 + double 3.3 → double 2.0 + double 3.3 → double 5.3 |
| int | 1 / 2 | int 1 / int 2 → int 0 |
| double | 1.0 / 2.0 | double 1.0 / double 2.0 → double 0.5 |
| mix | 1 / 2.0 | int 1 / double 2.0 → double 1.0 / double 2.0 → double 0.5 |

Example

    #include <stdio.h>

    int main() {
        int i1 = 2, i2 = 4;
        double d1 = 2.5, d2 = 5.2;

        printf("%d + %d = %d\n", i1, i2, i1+i2);
        printf("%.1lf + %.1lf = %.1lf\n", d1, d2, d1+d2);
        printf("%d + %.1lf = %.1lf\n", i1, d2, i1+d2);

        printf("%d / %d = %d\n", i1, i2, i1/i2);
        printf("%.1lf / %.1lf = %.2lf\n", d1, d2, d1/d2);
        printf("%d / %.1lf = %.2lf\n", i1, d2, i1/d2);
        return 0;
    }

Overflow/UnderFlow

Study the output of the following program:

    #include <stdio.h>

    int main() {
        int i1 = 2147483647;
        printf("%d\n", i1 + 1);
        printf("%d\n", i1 + 2);
        printf("%d\n", i1 * i1);

        int i2 = -2147483648;
        printf("%d\n", i2 - 1);
        printf("%d\n", i2 - 2);
        printf("%d\n", i2 * i2);
        return 0;
    }

In arithmetic operations, the resultant value typically wraps around if it exceeds its range (i.e., overflow or underflow). The C runtime does not issue an error/warning message but produces an incorrect result. Strictly speaking, only unsigned arithmetic is guaranteed to wrap around; overflow of a signed integer is undefined behavior in the C standard, so the wrap-around seen on most platforms must not be relied upon.

It is important to take note that checking of overflow/underflow is the programmer's responsibility, i.e., your job!

This feature is a legacy design from the days when processors were slow. Checking for overflow/underflow consumes computation power and reduces performance.

Checking for arithmetic overflow (a secure-coding practice) is tedious. Google for "INT32-C. Ensure that operations on signed integers do not result in overflow" @ www.securecoding.cert.org.

Compound Assignment Operators

Besides the usual simple assignment operator (=) described earlier, C also provides the so-called compound assignment operators as listed:

| Operator | Usage | Description | Example |
|---|---|---|---|
| = | var = expr | Assign the value of the RHS to the variable at the LHS | x = 5; |
| += | var += expr | same as var = var + expr | x += 5; same as x = x + 5 |
| -= | var -= expr | same as var = var - expr | x -= 5; same as x = x - 5 |
| *= | var *= expr | same as var = var * expr | x *= 5; same as x = x * 5 |
| /= | var /= expr | same as var = var / expr | x /= 5; same as x = x / 5 |
| %= | var %= expr | same as var = var % expr | x %= 5; same as x = x % 5 |

Increment/Decrement Operators

C supports these unary arithmetic operators: increment (++) and decrement (--).

| Operator | Example | Result |
|---|---|---|
| ++ | x++; ++x | Increment by 1, same as x += 1 |
| -- | x--; --x | Decrement by 1, same as x -= 1 |

Example

    #include <stdio.h>

    int main() {
        int mark = 76;
        printf("%d\n", mark);

        mark++;
        printf("%d\n", mark);

        ++mark;
        printf("%d\n", mark);

        mark = mark + 1;
        printf("%d\n", mark);

        mark--;
        printf("%d\n", mark);

        --mark;
        printf("%d\n", mark);

        mark = mark - 1;
        printf("%d\n", mark);
        return 0;
    }

The increment/decrement unary operator can be placed before the operand (prefix operator), or after the operand (postfix operator). They take on different meanings in operations.

| Operator | Description | Example | Result |
|---|---|---|---|
| ++var | Pre-Increment: increment var, then use the new value of var | y = ++x; | same as x=x+1; y=x; |
| var++ | Post-Increment: use the old value of var, then increment var | y = x++; | same as oldX=x; x=x+1; y=oldX; |
| --var | Pre-Decrement | y = --x; | same as x=x-1; y=x; |
| var-- | Post-Decrement | y = x--; | same as oldX=x; x=x-1; y=oldX; |

If '++' or '--' is combined with another operation, then the choice of prefix or postfix determines the order of the two operations. For example,

    x = 5;
    printf("%d\n", x++);  // prints 5, then x becomes 6
    x = 5;
    printf("%d\n", ++x);  // x becomes 6, then prints 6

Prefix operator (e.g., ++i) could be more efficient than postfix operator (e.g., i++) in some situations.

Implicit Type-Conversion vs. Explicit Type-Casting

Converting a value from one type to another type is called type casting (or type conversion). There are two kinds of type casting:

  1. Implicit type-conversion performed by the compiler automatically, and
  2. Explicit type-casting via a unary type-casting operator in the form of (new-type)operand.

Implicit (Automatic) Type Conversion

When you assign a value of a fundamental (built-in) type to a variable of another fundamental type, C automatically converts the value to the receiving type, if the two types are compatible. For examples,

  • If you assign an int value to a double variable, the compiler automatically casts the int value to a double (e.g., from 1 to 1.0) and assigns it to the double variable.
  • If you assign a double value to an int variable, the compiler automatically casts the double value to an int value (e.g., from 1.2 to 1) and assigns it to the int variable. The fractional part would be truncated and lost. Some compilers issue a warning/error "possible loss in precision"; others do not.

    #include <stdio.h>

    int main() {
        int i;
        double d;

        i = 3;
        d = i;     // int 3 is implicitly converted to double 3.0
        printf("d = %lf\n", d);

        d = 5.5;
        i = d;     // double 5.5 is truncated to int 5
        printf("i = %d\n", i);

        i = 6.6;   // double 6.6 is truncated to int 6
        printf("i = %d\n", i);
    }

C will not perform automatic type conversion, if the two types are not compatible.

Explicit Type-Casting

You can explicitly perform type-casting via the so-called unary type-casting operator in the form of (new-type)operand. The type-casting operator takes one operand in the particular type, and returns an equivalent value in the new type. Take note that it is an operation that yields a resultant value, similar to an addition operation although addition involves two operands. For example,

    printf("%lf\n", (double)5);   // int 5 → double 5.0
    printf("%d\n", (int)5.5);     // double 5.5 → int 5
    double aDouble = 5.6;
    int anInt = (int)aDouble;     // anInt is 5; aDouble is unchanged

Example: Suppose that you want to find the average (in double) of the integers between 1 and 100. Study the following code:

    #include <stdio.h>

    int main() {
        int sum = 0;
        double average;
        int number = 1;
        while (number <= 100) {
            sum += number;
            ++number;
        }
        average = sum / 100;   // int / int → int 50, not 50.5
        printf("Average is %lf\n", average);
        return 0;
    }

You don't get the fractional part although average is a double. This is because both sum and 100 are int. The result of the division is an int, which is then implicitly cast to double and assigned to the double variable average. To get the correct answer, you can use any one of the first three statements below; the fourth does not work, because the int division is carried out before the cast:

    average = (double)sum / 100;     // cast sum to double before the division
    average = sum / (double)100;     // cast 100 to double before the division
    average = sum / 100.0;           // 100.0 is a double literal
    average = (double)(sum / 100);   // won't work: int division happens first

Example:

    #include <stdio.h>

    int main() {
        double celsius, fahrenheit;

        printf("Enter the temperature in celsius: ");
        scanf("%lf", &celsius);
        fahrenheit = celsius * 9 / 5 + 32;
        printf("%.2lf degree C is %.2lf degree F\n", celsius, fahrenheit);

        printf("Enter the temperature in fahrenheit: ");
        scanf("%lf", &fahrenheit);
        celsius = (fahrenheit - 32) * 5 / 9;
        printf("%.2lf degree F is %.2lf degree C\n", fahrenheit, celsius);
        return 0;
    }

Relational and Logical Operators

Very often, you need to compare two values before deciding on the action to be taken, e.g., if mark is more than or equal to 50, print "PASS".

C provides six comparison operators (or relational operators):

| Operator | Description | Usage | Example (x=5, y=8) |
|---|---|---|---|
| == | Equal to | expr1 == expr2 | (x == y) → false |
| != | Not Equal to | expr1 != expr2 | (x != y) → true |
| > | Greater than | expr1 > expr2 | (x > y) → false |
| >= | Greater than or equal to | expr1 >= expr2 | (x >= 5) → true |
| < | Less than | expr1 < expr2 | (y < 8) → false |
| <= | Less than or equal to | expr1 <= expr2 | (y <= 8) → true |

Each comparison operation involves two operands, e.g., x <= 100. You cannot write 1 < x < 100 to test a range in programming: C compiles it, but evaluates it as (1 < x) < 100, which is always true. Instead, you need to break it into the two comparison operations x > 1 and x < 100, and join them with a logical AND operator, i.e., (x > 1) && (x < 100), where && denotes the AND operator.

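C provides three logical operators. In addition, the bitwise exclusive-or operator (^) behaves as a logical XOR when both operands are exactly 0 or 1, as is the case for the results of comparisons:

  • && (Logical AND): expr1 && expr2
  • || (Logical OR): expr1 || expr2
  • ! (Logical NOT): !expr
  • ^ (XOR, a bitwise operator): expr1 ^ expr2

The truth tables for AND (&&), OR (||) and XOR (^) are as follows:

| a | b | a AND b | a OR b | a XOR b |
|---|---|---|---|---|
| true | true | true | true | false |
| true | false | false | true | true |
| false | true | false | true | true |
| false | false | false | false | false |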

Example:

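    // true if x is between 0 and 100 (inclusive)
    (x >= 0) && (x <= 100)

    // true if x is outside 0 to 100 (inclusive); the two forms are equivalent
    (x < 0) || (x > 100)
    !((x >= 0) && (x <= 100))

    // true if year is a leap year
    ((year % 4 == 0) && (year % 100 != 0)) || (year % 400 == 0)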

Exercise: Given the year, month (1-12), and day (1-31), write a boolean expression which returns true for dates before October 15, 1582 (Gregorian calendar cut over date).

Ans:

Flow Control

There are three basic flow control constructs: sequential, conditional (or decision), and loop (or iteration).

Sequential Flow Control

A program is a sequence of instructions. Sequential flow is the most common and straight-forward, where programming statements are executed in the order that they are written - from top to bottom in a sequential manner.

Conditional (Decision) Flow Control

There are a few types of conditionals: if-then, if-then-else, nested-if (if-elseif-elseif-...-else), switch-case, and conditional expression.

"switch-case" is an alternative to the "nested-if". In a switch-case statement, a break statement is needed for each of the cases. If break is missing, execution will flow through into the following case. You can use either an int or char variable as the case-selector.

Conditional Operator: A conditional operator is a ternary (3-operand) operator, in the form of booleanExpr ? trueExpr : falseExpr. Depending on the booleanExpr, it evaluates and returns the value of trueExpr or falseExpr.

Syntax:

    booleanExpr ? trueExpr : falseExpr

Examples:

    printf("%s\n", (mark >= 50) ? "PASS" : "FAIL");
    max = (a > b) ? a : b;
    abs = (a > 0) ? a : -a;

Braces: You could omit the braces { }, if there is only one statement inside the block. For example,

    if (mark >= 50)
        printf("PASS\n");
    else {
        printf("FAIL\n");
        printf("Try Harder!\n");
    }

However, I recommend that you keep the braces, even though there is only one statement in the block, to improve the readability of your program.

Exercises

[TODO]

Loop Flow Control

Again, there are a few types of loops: for-loop, while-do, and do-while.

for-loop, with syntax for (init; test; post-proc) { body; }

    int sum = 0, number;
    for (number = 1; number <= 1000; ++number) {
        sum += number;
    }

while-do, with syntax while (condition) { body; }

    int sum = 0, number = 1;
    while (number <= 1000) {
        sum += number;
        ++number;
    }

do-while, with syntax do { body; } while (condition);

    int sum = 0, number = 1;
    do {
        sum += number;
        ++number;
    } while (number <= 1000);

The difference between while-do and do-while lies in the order of the body and condition. In while-do, the condition is tested first. The body will be executed if the condition is true and the process repeats. In do-while, the body is executed and then the condition is tested. Take note that the body of do-while will be executed at least once (vs. possibly zero for while-do).

Suppose that your program prompts the user for a number between 1 and 10, and checks for valid input. A do-while with a boolean flag could be more appropriate.

    // bool, true and false require #include <stdbool.h> (C99)
    bool valid = false;
    int number;
    do {
        ......
        if (number >= 1 && number <= 10) {
            valid = true;
        }
    } while (!valid);

Below is an example of using while-do:

    bool gameOver = false;
    while (!gameOver) {
        ......
        ......
    }

Example (Counter-Controlled Loop): Prompt user for an upperbound. Sum the integers from 1 to a given upperbound and compute its average.

Although programmers often use integers and pointers interchangeably in C, pointer-to-integer and integer-to-pointer conversions are implementation-defined. 

Conversions between integers and pointers can have undesired consequences depending on the implementation. According to the C Standard, subclause 6.3.2.3 [ISO/IEC 9899:2011],

An integer may be converted to any pointer type. Except as previously specified, the result is implementation-defined, might not be correctly aligned, might not point to an entity of the referenced type, and might be a trap representation.

Any pointer type may be converted to an integer type. Except as previously specified, the result is implementation-defined. If the result cannot be represented in the integer type, the behavior is undefined. The result need not be in the range of values of any integer type.

Do not convert an integer type to a pointer type if the resulting pointer is incorrectly aligned, does not point to an entity of the referenced type, or is a trap representation.

Do not convert a pointer type to an integer type if the result cannot be represented in the integer type. (See undefined behavior 24.)

The mapping between pointers and integers must be consistent with the addressing structure of the execution environment. Issues may arise, for example, on architectures that have a segmented memory model.

Noncompliant Code Example

The size of a pointer can be greater than the size of an integer, such as in an implementation where pointers are 64 bits and unsigned integers are 32 bits. This code example is noncompliant on such implementations because the result of converting the 64-bit ptr cannot be represented in the 32-bit integer type:

Compliant Solution

Any valid pointer to void can be converted to intptr_t or uintptr_t and back with no change in value. (See INT36-EX2.) The C Standard guarantees that a pointer to void may be converted to or from a pointer to any object type and back again and that the result must compare equal to the original pointer. Consequently, converting directly from a char * pointer to a uintptr_t, as in this compliant solution, is allowed on implementations that support the uintptr_t type.

Noncompliant Code Example

In this noncompliant code example, the pointer ptr is converted to an integer value. The high-order 9 bits of the number are used to hold a flag value, and the result is converted back into a pointer. This example is noncompliant on an implementation where pointers are 64 bits and unsigned integers are 32 bits because the result of converting the 64-bit ptr cannot be represented in the 32-bit integer type.

A similar scheme was used in early versions of Emacs, limiting its portability and preventing the ability to edit files larger than 8MB.

Compliant Solution

This compliant solution uses a struct to provide storage for both the pointer and the flag value. This solution is portable to machines of different word sizes, both smaller and larger than 32 bits, working even when pointers cannot be represented in any integer type.

Noncompliant Code Example

It is sometimes necessary to access memory at a specific location, requiring a literal integer to pointer conversion. In this noncompliant code, a pointer is set directly to an integer constant, where it is unknown whether the result will be as intended:

The result of this assignment is implementation-defined, might not be correctly aligned, might not point to an entity of the referenced type, and might be a trap representation.

Compliant Solution

Adding an explicit cast may help the compiler convert the integer value into a valid pointer. A common technique is to assign the integer to a volatile-qualified object of type intptr_t or uintptr_t and then assign the integer value to the pointer:

Exceptions

INT36-C-EX1: A null pointer can be converted to an integer; it takes on the value 0. Likewise, the integer value 0 can be converted to a pointer; it becomes the null pointer.

INT36-C-EX2: Any valid pointer to void can be converted to intptr_t or uintptr_t or their underlying types and back again with no change in value. Use of underlying types instead of intptr_t or uintptr_t is discouraged, however, because it limits portability.

Risk Assessment

Converting from pointer to integer or vice versa results in code that is not portable and may create unexpected pointers to invalid memory locations.

| Rule | Severity | Likelihood | Remediation Cost | Priority | Level |
|---|---|---|---|---|---|
| INT36-C | Low | Probable | High | P2 | L3 |

Automated Detection

| Tool | Version | Checker | Description |
|---|---|---|---|
| Astrée | 17.04i | pointer-integral-cast, pointer-integral-cast-implicit, function-pointer-integer-cast, function-pointer-integer-cast-implicit | Fully checked |
| Clang | 3.9 | | Can detect some instances of this rule, but does not detect all |
| CodeSonar | 4.5p1 | LANG.CAST.PC.CONST2PTR, LANG.CAST.PC.INT | Conversion: integer constant to pointer; Conversion: pointer/integer |
| Compass/ROSE | | | |
| Coverity | 2017.07 | PW.POINTER_CONVERSION_LOSES_BITS | Fully implemented |
| Klocwork | 2017 | MISRA.CAST.OBJ_PTR_TO_INT.2012 | |
| LDRA tool suite | 9.7.1 | 439 S, 440 S | Fully implemented |
| Parasoft C/C++test | 10.3 | MISRA2008-5_2_8, CODSTA-127_b | Fully implemented |
| Polyspace Bug Finder | R2016b | Unsafe conversion between pointer and integer | Misaligned or invalid results from conversions between pointer and integer types |
| PRQA QA-C | 9.3 | 305, 306, 309, 429, 432, 557, 563, 671, 674 | Partially implemented |
| RuleChecker | 17.04i | pointer-integral-cast, pointer-integral-cast-implicit, function-pointer-integer-cast, function-pointer-integer-cast-implicit | Fully checked |
| SonarQube C/C++ Plugin | 3.11 | S1767 | Partially implemented |

Related Vulnerabilities

Search for vulnerabilities resulting from the violation of this rule on the CERT website.

CERT-CWE Mapping Notes

CWE-758 and INT36-C

Independent( INT34-C, INT36-C, MEM30-C, MSC37-C, FLP32-C, EXP33-C, EXP30-C, ERR34-C, ARR32-C)

CWE-758 = Union( INT36-C, list) where list =

  • Undefined behavior that results from anything other than integer <-> pointer conversion

CWE-704 and INT36-C

CWE-704 = Union( INT36-C, list) where list =

  • Incorrect (?) typecast that is not between integers and pointers

CWE-587 and INT36-C

Intersection( CWE-587, INT36-C) =

  • Setting a pointer to an integer value that is ill-defined (trap representation, improperly aligned, mis-typed, etc)

CWE-587 – INT36-C =

  • Setting a pointer to a valid integer value (e.g., one that points to an object of the correct type)

INT36-C – CWE-587 =

  • Illegal pointer-to-integer conversion

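Code listings for the examples in this rule, in the order in which they are discussed:

Noncompliant code example (pointer converted to unsigned int):

    void f(void) {
      char *ptr;
      /* ... */
      unsigned int number = (unsigned int)ptr;
      /* ... */
    }

Compliant solution (pointer converted to uintptr_t):

    #include <stdint.h>

    void f(void) {
      char *ptr;
      /* ... */
      uintptr_t number = (uintptr_t)ptr;
      /* ... */
    }

Noncompliant code example (flag stored in the bits of a pointer):

    void func(unsigned int flag) {
      char *ptr;
      /* ... */
      unsigned int number = (unsigned int)ptr;
      number = (number & 0x7fffff) | (flag << 23);
      ptr = (char *)number;
    }

Compliant solution (pointer and flag stored in a struct):

    struct ptrflag {
      char *pointer;
      unsigned int flag : 9;
    } ptrflag;

    void func(unsigned int flag) {
      char *ptr;
      /* ... */
      ptrflag.pointer = ptr;
      ptrflag.flag = flag;
    }

Noncompliant code example (pointer set directly to an integer constant):

    unsigned int *g(void) {
      unsigned int *ptr = 0xdeadbeef;
      /* ... */
      return ptr;
    }

Compliant solution (integer assigned through a volatile uintptr_t):

    #include <stdint.h>

    unsigned int *g(void) {
      volatile uintptr_t iptr = 0xdeadbeef;
      unsigned int *ptr = (unsigned int *)iptr;
      /* ... */
      return ptr;
    }

Exception INT36-C-EX2 (round trip through intptr_t and uintptr_t):

    #include <assert.h>
    #include <stdint.h>

    void h(void) {
      intptr_t i = (intptr_t)(void *)&i;
      uintptr_t j = (uintptr_t)(void *)&j;

      void *ip = (void *)i;
      void *jp = (void *)j;

      assert(ip == &i);
      assert(jp == &j);
    }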
