C Language

Topics related to C Language:

Getting started with C Language

C is a general-purpose, imperative computer programming language, supporting structured programming, lexical variable scope and recursion, while a static type system prevents many unintended operations. By design, C provides constructs that map efficiently to typical machine instructions, and therefore it has found lasting use in applications that had formerly been coded in assembly language, including operating systems, as well as various application software for computers ranging from supercomputers to embedded systems.

Despite its low-level capabilities, the language was designed to encourage cross-platform programming. A standards-compliant and portably written C program can be compiled for a very wide variety of computer platforms and operating systems with few changes to its source code. The language has become available on a very wide range of platforms, from embedded microcontrollers to supercomputers.

C was originally developed by Dennis Ritchie between 1969 and 1973 at Bell Labs and used to re-implement the Unix operating system. It has since become one of the most widely used programming languages of all time, with C compilers from various vendors available for the majority of existing computer architectures and operating systems.

Common Compilers

The process to compile a C program differs between compilers and operating systems. Most operating systems ship without a compiler, so you will have to install one. Some common compiler choices are:

The following documents should give you a good overview on how to get started using a few of the most common compilers:

Compiler C version Support

Note that compilers have varying levels of support for standard C, with many still not completely supporting C99. For example, as of the 2015 release, MSVC supports much of C99 yet still has some important exceptions, both in the language itself (e.g. the preprocessing seems non-conformant) and in the C library (e.g. <tgmath.h>); nor do compiler vendors necessarily document their "implementation dependent choices". Wikipedia has a table showing support offered by some popular compilers.

Some compilers (notably GCC) have offered, or continue to offer, compiler extensions that implement additional features that the compiler producers deem necessary, helpful or believe may become part of a future C version, but that are not currently part of any C standard. As these extensions are compiler specific they can be considered to not be cross-compatible and compiler developers may remove or alter them in later compiler versions. The use of such extensions can generally be controlled by compiler flags.

Additionally, many developers have compilers that support only specific versions of C imposed by the environment or platform they are targeting.

When selecting a compiler, it is recommended to choose one that has the best support for the latest version of C allowed for the target environment.

Code style (off-topic here):

Because white space is insignificant in C (that is, it does not affect the operation of the code), programmers often use white space to make the code easier to read and comprehend. This is called the code style: a set of rules and guidelines used when writing the source code. It covers concerns such as how lines should be indented, whether spaces or tabs should be used, how braces should be placed, how spaces should be used around operators and brackets, how variables should be named and so forth.

Code style is not covered by the standard and is primarily opinion based (different people find different styles easier to read). As such, it is generally considered off-topic on SO. The overriding advice on style in one's own code is that consistency is paramount: pick, or make, a style and stick to it. Suffice it to say that there are various named styles in common usage that programmers often choose rather than creating their own.

Some common indent styles are: K & R style, Allman style, GNU style and so on. Some of these styles have different variants. Allman, for example, is used as either regular Allman or the popular variant, Allman-8. Information on some of the popular styles may be found on Wikipedia. Such style names are taken from the standards the authors or organizations often publish for use by the many people contributing to their code, so that everyone can easily read the code when they know the style, such as the GNU formatting guide that makes up part of the GNU coding standards document.

Some common naming conventions are: UpperCamelCase, lowerCamelCase, lower_case_with_underscore, ALL_CAPS, etc. These styles are combined in various ways for use with different objects and types (e.g., macros often use ALL_CAPS style)

K & R style is generally recommended for use within SO documentation, whereas the more esoteric styles, such as Pico, are discouraged.

Libraries and APIs not covered by the C Standard (and therefore being off-topic here):

Function Pointers

Operators

Operators have an arity, a precedence and an associativity.

  • Arity indicates the number of operands. In C, three different operator arities exist:

    • Unary (1 operand)
    • Binary (2 operands)
    • Ternary (3 operands)
  • Precedence indicates which operators "bind" first to their operands. That is, which operator has priority to operate on its operands. For instance, the C language obeys the convention that multiplication and division have precedence over addition and subtraction:

    a * b + c
    

    Gives the same result as

    (a * b) + c
    

    If this is not what was wanted, precedence can be forced using parentheses, because they have the highest precedence of all operators.

    a * (b + c)
    

    This new expression will produce a result that differs from the previous two expressions.

    The C language has many precedence levels; a table is given below of all operators, in descending order of precedence.

    Precedence Table

    Operators                                     Associativity
    () [] -> .                                    left to right
    ! ~ ++ -- + - * (dereference) (type) sizeof   right to left
    * (multiplication) / %                        left to right
    + -                                           left to right
    << >>                                         left to right
    < <= > >=                                     left to right
    == !=                                         left to right
    &                                             left to right
    ^                                             left to right
    |                                             left to right
    &&                                            left to right
    ||                                            left to right
    ?:                                            right to left
    = += -= *= /= %= &= ^= |= <<= >>=             right to left
    ,                                             left to right
  • Associativity indicates how equal-precedence operators bind by default, and there are two kinds: Left-to-Right and Right-to-Left. An example of Left-to-Right binding is the subtraction operator (-). The expression

    a - b - c - d
    

    has three identical-precedence subtractions, but gives the same result as

    ((a - b) - c) - d
    

    because the left-most - binds first to its two operands.

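    An example of Right-to-Left associativity involves the dereference * and post-increment ++ operators. The table above places them at the same precedence level, so if they are used in an expression such as

    * ptr ++

    the expression is equivalent to

    * (ptr ++)

    because the rightmost, unary operator (++) binds first to its single operand. (Strictly speaking, the C grammar gives postfix ++ a higher precedence than unary *, which leads to the same grouping.)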
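The grouping rules above can be checked with a few small functions. This is only an illustrative sketch; the function names are invented for the example and are not part of any library.

```c
/* Precedence: * binds more tightly than +, so this is (a * b) + c. */
int mul_then_add(int a, int b, int c)
{
    return a * b + c;
}

/* Parentheses override the default precedence. */
int add_then_mul(int a, int b, int c)
{
    return a * (b + c);
}

/* Left-to-right associativity: evaluated as ((a - b) - c) - d. */
int chained_sub(int a, int b, int c, int d)
{
    return a - b - c - d;
}

/* *(*cursor)++ groups as *((*cursor)++): it yields the element the
   caller's pointer currently refers to, then advances that pointer. */
int read_and_advance(const int **cursor)
{
    return *(*cursor)++;
}
```

With a = 2, b = 3, c = 4, the first function yields 10 and the second 14, which shows that the parentheses change the result.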

Data Types

Arrays

Why do we need arrays?

Arrays provide a way to organize objects into an aggregate with its own significance. For example, C strings are arrays of characters (chars), and a string such as "Hello, World!" has meaning as an aggregate that is not inherent in the characters individually. Similarly, arrays are commonly used to represent mathematical vectors and matrices, as well as lists of many kinds. Moreover, without some way to group the elements, one would need to address each individually, such as via separate variables. Not only is that unwieldy, it does not easily accommodate collections of different lengths.

Arrays are implicitly converted to pointers in most contexts.

Except when appearing as the operand of the sizeof operator, the _Alignof operator (C2011), or the unary & (address-of) operator, or as a string literal used to initialize an(other) array, an array is implicitly converted into ("decays to") a pointer to its first element. This implicit conversion is tightly coupled to the definition of the array subscripting operator ([]): the expression arr[idx] is defined to be equivalent to *(arr + idx). Furthermore, since pointer addition is commutative, *(arr + idx) is also equivalent to *(idx + arr), which in turn is equivalent to idx[arr]. All of those expressions are valid and evaluate to the same value, provided that either idx or arr is a pointer (or an array, which decays to a pointer), the other is an integer, and the integer is a valid index into the array to which the pointer points.

As a special case, observe that &(arr[0]) is equivalent to &*(arr + 0), which simplifies to arr. All of those expressions are interchangeable wherever the last decays to a pointer. This simply expresses again that an array decays to a pointer to its first element.

In contrast, if the address-of operator is applied to an array of type T[N] (i.e. &arr) then the result has type T (*)[N] and points to the whole array. This is distinct from a pointer to the first array element at least with respect to pointer arithmetic, which is defined in terms of the size of the pointed-to type.
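As a rough illustration of the points above, the following sketch compares the equivalent subscript spellings and measures how far pointer arithmetic moves for a pointer to the first element versus a pointer to the whole array. The function names are made up for this example.

```c
#include <stddef.h>

/* Returns 1 if all equivalent spellings of element access agree. */
int subscript_forms_agree(void)
{
    int arr[4] = {10, 20, 30, 40};
    size_t idx = 2;

    return arr[idx] == *(arr + idx)
        && *(arr + idx) == *(idx + arr)
        && *(idx + arr) == idx[arr];
}

/* Byte distance covered by adding 1 to a pointer to the whole array. */
size_t whole_array_step(void)
{
    int arr[4] = {0};
    int (*whole)[4] = &arr;            /* type int (*)[4] */

    return (size_t)((char *)(whole + 1) - (char *)whole);
}

/* Byte distance covered by adding 1 to a pointer to the first element. */
size_t element_step(void)
{
    int arr[4] = {0};
    int *first = arr;                  /* arr decays to &arr[0] */

    return (size_t)((char *)(first + 1) - (char *)first);
}
```

Here whole_array_step() returns 4 * sizeof(int) while element_step() returns sizeof(int), even though both pointers start at the same address.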

Function parameters are not arrays.

void foo(int a[], int n);
void foo(int *a, int n);

Although the first declaration of foo uses array-like syntax for parameter a, such syntax, when used to declare a function parameter, declares that parameter as a pointer to the array's element type. Thus, the second signature for foo() is semantically identical to the first. This corresponds to the decay of array values to pointers where they appear as arguments to a function call, such that if a variable and a function parameter are declared with the same array type then that variable's value is suitable for use in a function call as the argument associated with the parameter.
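A short sketch of what this means in practice: the two spellings declare the same function, and writes made through the parameter land in the caller's array. The names fill and sum are hypothetical helpers chosen for the example.

```c
/* Two compatible declarations of the same function. */
void fill(int a[], int n, int value);
void fill(int *a, int n, int value);

/* Declared with array syntax, but a is really an int *. */
void fill(int a[], int n, int value)
{
    for (int i = 0; i < n; i++) {
        a[i] = value;   /* writes through the pointer into the caller's array */
    }
}

/* The same kind of parameter, spelled as a pointer. */
int sum(const int *a, int n)
{
    int total = 0;

    for (int i = 0; i < n; i++) {
        total += a[i];
    }
    return total;
}
```

Because the parameter is only a pointer, the element count has to be passed separately; sizeof applied to the parameter would give the size of a pointer, not of the caller's array.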

Undefined behavior

What is Undefined Behavior (UB)?

Undefined behavior is a term used in the C standard. The C11 standard (ISO/IEC 9899:2011) defines the term undefined behavior as

behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements

What happens if there is UB in my code?

These are the results which can happen due to undefined behavior according to standard:

NOTE Possible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message).

The following quote is often used to describe (less formally though) results happening from undefined behavior:

“When the compiler encounters [a given undefined construct] it is legal for it to make demons fly out of your nose” (the implication is that the compiler may choose any arbitrarily bizarre way to interpret the code without violating the ANSI C standard)

Why does UB exist?

If it's so bad, why didn't they just define it or make it implementation-defined?

Undefined behavior allows more opportunities for optimization: the compiler can justifiably assume that code does not contain undefined behavior, which allows it to avoid run-time checks and to perform optimizations whose validity would be costly or impossible to prove otherwise.

Why is UB hard to track down?

There are at least two reasons why undefined behavior creates bugs that are difficult to detect:

  • The compiler is not required to, and generally can't reliably, warn you about undefined behavior. In fact, requiring it to do so would go directly against the reason for the existence of undefined behavior.
  • The unpredictable results might not start unfolding at the exact point of the operation where the construct whose behavior is undefined occurs. Undefined behavior taints the whole execution, and its effects may happen at any time: during, after, or even before the undefined construct.

Consider null-pointer dereference: the compiler is not required to diagnose null-pointer dereference, and even could not, as at run-time any pointer passed into a function, or in a global variable might be null. And when the null-pointer dereference occurs, the standard does not mandate that the program needs to crash. Rather, the program might crash earlier, later, or not crash at all; it could even behave as if the null pointer pointed to a valid object, and behave completely normally, only to crash under other circumstances.

In the case of null-pointer dereference, C language differs from managed languages such as Java or C#, where the behavior of null-pointer dereference is defined: an exception is thrown, at the exact time (NullPointerException in Java, NullReferenceException in C#), thus those coming from Java or C# might incorrectly believe that in such a case, a C program must crash, with or without the issuance of a diagnostic message.

Additional information

There are several such situations that should be clearly distinguished:

  • Explicitly undefined behavior, that is where the C standard explicitly tells you that you are off limits.
  • Implicitly undefined behavior, where there is simply no text in the standard that foresees a behavior for the situation you brought your program in.

Also have in mind that in many places the behavior of certain constructs is deliberately undefined by the C standard to leave room for compiler and library implementors to come up with their own definitions. A good example are signals and signal handlers, where extensions to C, such as the POSIX operating system standard, define much more elaborated rules. In such cases you just have to check the documentation of your platform; the C standard can't tell you anything.

Also note that if undefined behavior occurs in a program, it is not just the point where the undefined behavior occurred that is problematic; rather, the entire program becomes meaningless.

Because of such concerns, it is important (especially since compilers don't always warn us about UB) for anyone programming in C to be at least familiar with the kinds of things that trigger undefined behavior.

It should be noted there are some tools (e.g. static analysis tools such as PC-Lint) which aid in detecting undefined behavior, but again, they can't detect all occurrences of undefined behavior.
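Since neither compilers nor tools catch every case, one common defensive habit is to test for the problematic condition before performing the operation. The sketch below does this for two well-known sources of undefined behavior, signed integer overflow and null-pointer dereference. The helper names and the 0 / -1 return convention are assumptions made for the example.

```c
#include <limits.h>
#include <stddef.h>

/* Adds a and b only when the result is representable in an int.
   Returns 0 on success, -1 if the addition would overflow. */
int checked_add(int a, int b, int *result)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b)) {
        return -1;              /* a + b would be undefined behavior */
    }
    *result = a + b;
    return 0;
}

/* Returns the pointed-to value, or fallback when p is a null pointer. */
int value_or(const int *p, int fallback)
{
    return p != NULL ? *p : fallback;
}
```

The overflow test is written so that the check itself never overflows: it compares against INT_MAX - b or INT_MIN - b instead of computing a + b first.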

Random Number Generation

Preprocessor and Macros

When the preprocessor encounters a macro in the code, it performs simple textual replacement of tokens; no additional operations are performed. Because of this, changes made by the preprocessor do not respect the scope rules of C programs: for example, a macro definition is not confined to the block in which it appears, so it is unaffected by a '}' that ends a block statement.

The preprocessor is, by design, not Turing complete: there are several types of computation that cannot be done by the preprocessor alone.

Usually compilers have a command line flag (or configuration setting) that allows us to stop compilation after the preprocessing phase and to inspect the result. On POSIX platforms this flag is -E. So, running gcc with this flag prints the expanded source to stdout:

$ gcc -E cprog.c

Often the preprocessor is implemented as a separate program that is invoked by the compiler; a common name for that program is cpp. A number of preprocessors emit supporting information, such as line numbers, which subsequent phases of compilation use to generate debugging information. If the preprocessor is based on gcc, the -P option suppresses such information.

$ cpp -P cprog.c
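The consequences of plain textual replacement, and of macros ignoring block scope, can be seen in a small sketch. The macro and function names below are invented for illustration.

```c
/* Unparenthesized body: SQUARE_NAIVE(1 + 2) expands to 1 + 2 * 1 + 2. */
#define SQUARE_NAIVE(x) x * x

/* Parenthesizing each use of the parameter and the whole body
   keeps the intended grouping after replacement. */
#define SQUARE(x) ((x) * (x))

int naive_result(void)
{
    return SQUARE_NAIVE(1 + 2);   /* 1 + (2 * 1) + 2 */
}

int safe_result(void)
{
    return SQUARE(1 + 2);         /* (1 + 2) * (1 + 2) */
}

int scope_demo(void)
{
    {
        #define INNER_VALUE 42    /* defined "inside" a block... */
    }
    return INNER_VALUE;           /* ...but still visible after the '}' */
}
```

Running such a file through the preprocessor alone is a convenient way to see the replaced text before the compiler proper interprets it.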

Signal handling

The usage of signal handlers with only the guarantees from the C standard imposes various limitations on what can and cannot be done in the user-defined signal handler.

  • If the user-defined function returns while handling SIGSEGV, SIGFPE, SIGILL or any other implementation-defined hardware interrupt, the behavior is undefined by the C standard. This is because C's interface gives no means to change the faulty state (e.g. after a division by 0), so on return the program is in exactly the same erroneous state as before the hardware interrupt occurred.

  • If the user-defined function was called as the result of a call to abort or raise, the signal handler is not allowed to call raise again.

  • Signals can arrive in the middle of any operation, so the indivisibility of operations cannot in general be guaranteed, nor does signal handling work well with optimization. Therefore all modifications to data in a signal handler must be to variables

    • of type sig_atomic_t (all versions) or a lock-free atomic type (since C11, optional)
    • that are volatile qualified.
  • Other functions from the C standard library will usually not respect these restrictions, because they may change variables in the global state of the program. The C standard only makes guarantees for abort, _Exit (since C99), quick_exit (since C11), signal (for the same signal number), and some atomic operations (since C11).

Behavior is undefined by the C standard if any of the rules above are violated. Platforms may have specific extensions, but these are generally not portable beyond that platform.

  • Usually systems have their own list of functions that are asynchronous signal safe, that is, C library functions that can be used from a signal handler. On POSIX systems, for example, write is among these functions, whereas printf is not.

  • In particular the C standard doesn't define much about the interaction with its threads interface (since C11) or any platform specific thread libraries such as POSIX threads. Such platforms have to specify the interaction of such thread libraries with signals by themselves.
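A minimal handler that stays within the restrictions listed above might look like the following sketch. It uses only standard C (signal, raise, sig_atomic_t); the names got_signal, on_sigint and signal_flag_demo are hypothetical.

```c
#include <signal.h>

/* The only object the handler touches: volatile and of type sig_atomic_t. */
static volatile sig_atomic_t got_signal = 0;

static void on_sigint(int signum)
{
    (void)signum;
    got_signal = 1;    /* no library calls, no other shared state */
}

/* Installs the handler, raises SIGINT synchronously, and reports
   whether the flag was set. Returns -1 if setup fails. */
int signal_flag_demo(void)
{
    got_signal = 0;
    if (signal(SIGINT, on_sigint) == SIG_ERR) {
        return -1;
    }
    if (raise(SIGINT) != 0) {
        return -1;
    }
    signal(SIGINT, SIG_DFL);    /* restore default handling */
    return got_signal;
}
```

The handler only records that the signal happened; any real work (printing, cleanup, and so on) is left to ordinary code that polls the flag later.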

Variable arguments

The va_start, va_arg, va_end, and va_copy functions are actually macros.

Be sure to always call va_start first, and only once, and to call va_end last, and only once, and on every exit point of the function. Not doing so may work on your system but surely is not portable and thus invites bugs.

Take care to declare your function correctly, i.e. with a prototype, and mind the restrictions on the last non-variadic argument (not register, not a function or array type). It is not possible to declare a function that takes only variadic arguments, as at least one non-variadic argument is needed to be able to start argument processing.

When calling va_arg, you must request the promoted argument type, that is:

  • short is promoted to int (and unsigned short is also promoted to int unless sizeof(unsigned short) == sizeof(int), in which case it is promoted to unsigned int).
  • float is promoted to double.
  • signed char is promoted to int; unsigned char is also promoted to int unless sizeof(unsigned char) == sizeof(int), which is seldom the case.
  • char is usually promoted to int.
  • C99 types like uint8_t or int16_t are similarly promoted.

Historic (i.e. K&R) variadic argument processing is declared in <varargs.h> but should not be used as it’s obsolete. Standard variadic argument processing (the one described here and declared in <stdarg.h>) was introduced in C89; the va_copy macro was introduced in C99 but provided by many compilers prior to that.
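The rules above can be sketched with two small variadic functions. Both take a count as the required non-variadic parameter; the function names are made up for this example, and the count-based convention is just one possible way to tell the callee how many arguments follow.

```c
#include <stdarg.h>

/* Sums count int arguments. */
int sum_ints(int count, ...)
{
    va_list args;
    int total = 0;

    va_start(args, count);
    for (int i = 0; i < count; i++) {
        total += va_arg(args, int);   /* char and short arguments arrive as int */
    }
    va_end(args);
    return total;
}

/* Averages count floating-point arguments. */
double average(int count, ...)
{
    va_list args;
    double total = 0.0;

    if (count <= 0) {
        return 0.0;                   /* va_start was not called, so no va_end */
    }
    va_start(args, count);
    for (int i = 0; i < count; i++) {
        total += va_arg(args, double);  /* float arguments arrive as double */
    }
    va_end(args);
    return total / count;
}
```

A call such as sum_ints(3, 1, 2, 3) yields 6. Requesting short or float from va_arg instead of the promoted types would be an error, because the caller never passes values of those types through the variadic part.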

Files and I/O streams

Mode strings:

Mode strings in fopen() and freopen() can be one of these values:

  • "r": Open the file in read-only mode, with the cursor set to the beginning of the file.
  • "r+": Open the file in read-write mode, with the cursor set to the beginning of the file.
  • "w": Open or create the file in write-only mode, with its content truncated to 0 bytes. The cursor is set to the beginning of the file.
  • "w+": Open or create the file in read-write mode, with its content truncated to 0 bytes. The cursor is set to the beginning of the file.
  • "a": Open or create the file in write-only mode, with the cursor set to the end of the file.
  • "a+": Open or create the file in read-write mode, with the read-cursor set to the beginning of the file. The output, however, will always be appended to the end of the file.

Each of these file modes may have a b added after the initial letter (e.g. "rb" or "a+b" or "ab+"). The b means that the file should be treated as a binary file instead of a text file on those systems where there is a difference. It doesn't make a difference on Unix-like systems; it is important on Windows systems. (Additionally, Windows fopen allows an explicit t instead of b to indicate 'text file' — and numerous other platform-specific options.)

C11
  • "wx": Create a text file in write-only mode. The file must not already exist.
  • "wbx": Create a binary file in write-only mode. The file must not already exist.

The x, if present, must be the last character in the mode string.
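To see three of the basic modes working together, here is a small sketch that creates a file with "w", extends it with "a", and reads it back with "r". The function name and the idea of passing in a scratch file path are assumptions made for the example.

```c
#include <stdio.h>
#include <string.h>

/* Writes "one\n" with "w", appends "two\n" with "a", then reads the
   file back with "r" into out (at most out_size - 1 bytes, which
   requires out_size >= 1). Returns 0 on success, -1 on failure. */
int write_append_read(const char *path, char *out, size_t out_size)
{
    FILE *fp = fopen(path, "w");          /* create or truncate */
    if (fp == NULL) return -1;
    fputs("one\n", fp);
    if (fclose(fp) != 0) return -1;

    fp = fopen(path, "a");                /* writes go to the end */
    if (fp == NULL) return -1;
    fputs("two\n", fp);
    if (fclose(fp) != 0) return -1;

    fp = fopen(path, "r");                /* read from the beginning */
    if (fp == NULL) return -1;
    size_t n = fread(out, 1, out_size - 1, fp);
    out[n] = '\0';
    fclose(fp);
    return 0;
}
```

Had the second fopen used "w" instead of "a", the first line would have been truncated away before the second was written.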

Assertion

Both assert and static_assert are macros defined in assert.h.

The definition of assert depends on the macro NDEBUG which is not defined by the standard library. If NDEBUG is defined, assert is a no-op:

#ifdef NDEBUG
#  define assert(condition) ((void) 0)
#else
#  define assert(condition) /* implementation defined */
#endif

Opinion varies about whether NDEBUG should always be used for production compilations.

  • The pro-camp argues that assert calls abort and that assertion messages are not helpful to end users. If you have fatal conditions to check in production code you should use ordinary if/else conditions and exit or quick_exit to end the program. In contrast to abort, these allow the program to do some cleanup (via functions registered with atexit or at_quick_exit).
  • The con-camp argues that assert calls should never fire in production code, but if they do, the condition that is checked means there is something dramatically wrong and the program will misbehave worse if execution continues. Therefore, it is better to have the assertions active in production code because if they fire, hell has already broken loose.
  • Another option is to use a home-brew system of assertions which always perform the check but handle errors differently between development (where abort is appropriate) and production (where an 'unexpected internal error - please contact Technical Support' may be more appropriate).

static_assert expands to _Static_assert which is a keyword. The condition is checked at compile time, thus condition must be a constant expression. There is no need for this to be handled differently between development and production.
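The difference between the two kinds of assertion can be sketched as follows, assuming a C11 (or later) compiler. The function average_of is a hypothetical example, not a library routine.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Checked at compile time; unaffected by NDEBUG. */
static_assert(CHAR_BIT >= 8, "C requires at least 8 bits per char");

/* Integer average of a non-empty array. The assert documents the
   precondition and disappears when NDEBUG is defined. */
int average_of(const int *values, int count)
{
    assert(values != NULL && count > 0);

    long total = 0;
    for (int i = 0; i < count; i++) {
        total += values[i];
    }
    return (int)(total / count);
}
```

If the static_assert condition were false, compilation would stop with the given message; if the run-time assert fails in a build without NDEBUG, the program is aborted with a diagnostic.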

Linked lists

The C language does not define a linked list data structure. If you are using C and need a linked list, you either need to use a linked list from an existing library (such as GLib) or write your own linked list interface. This topic shows examples for singly linked lists and doubly linked lists that can be used as a starting point for writing your own linked lists.

Singly linked list

The list contains nodes which are composed of one link called next.

Data structure

struct singly_node
{
  struct singly_node * next;
};

Doubly linked list

The list contains nodes which are composed of two links called previous and next. The links normally reference a node with the same structure.

Data structure

struct doubly_node
{
  struct doubly_node * prev;
  struct doubly_node * next;
};

Topologies

Linear or open


Circular or ring


Procedures

Bind

Bind two nodes together.

void doubly_node_bind (struct doubly_node * prev, struct doubly_node * next)
{
  prev->next = next;
  next->prev = prev;
}

Making circularly linked list


void doubly_node_make_empty_circularly_list (struct doubly_node * head)
{
  doubly_node_bind (head, head);
}

Making linearly linked list


void doubly_node_make_empty_linear_list (struct doubly_node * head, struct doubly_node * tail)
{
  head->prev = NULL;
  tail->next = NULL;
  doubly_node_bind (head, tail);
}

Insertion

Let's assume that an empty list always contains one node instead of being NULL. Then insertion procedures do not have to take NULL into consideration.

void doubly_node_insert_between
(struct doubly_node * prev, struct doubly_node * next, struct doubly_node * insertion)
{
  doubly_node_bind (prev, insertion);
  doubly_node_bind (insertion, next);
}

void doubly_node_insert_before
(struct doubly_node * tail, struct doubly_node * insertion)
{
  doubly_node_insert_between (tail->prev, tail, insertion);
}

void doubly_node_insert_after
(struct doubly_node * head, struct doubly_node * insertion)
{
  doubly_node_insert_between (head, head->next, insertion);
}
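One way these procedures might be used is sketched below: a sentinel node forms an empty ring, and items that embed a doubly_node are inserted after or before it. The struct and the two list procedures are repeated from the text so that the block compiles on its own; int_item, digits_forward and ring_demo are hypothetical names introduced only for this illustration.

```c
struct doubly_node
{
  struct doubly_node * prev;
  struct doubly_node * next;
};

/* Repeated from the text so that this sketch compiles on its own. */
static void doubly_node_bind (struct doubly_node * prev, struct doubly_node * next)
{
  prev->next = next;
  next->prev = prev;
}

static void doubly_node_insert_between
(struct doubly_node * prev, struct doubly_node * next, struct doubly_node * insertion)
{
  doubly_node_bind (prev, insertion);
  doubly_node_bind (insertion, next);
}

/* Hypothetical payload type. The link is the first member, so a pointer
   to the link can be converted back to a pointer to the item. */
struct int_item
{
  struct doubly_node node;
  int value;
};

/* Walks forward from the sentinel and packs the values into decimal digits. */
static int digits_forward (const struct doubly_node * head)
{
  int digits = 0;
  for (const struct doubly_node * n = head->next; n != head; n = n->next)
  {
    digits = digits * 10 + ((const struct int_item *) n)->value;
  }
  return digits;
}

/* Builds the ring head <-> 3 <-> 1 <-> 2 <-> head and returns 312. */
int ring_demo (void)
{
  struct doubly_node head;
  struct int_item a = { .value = 1 }, b = { .value = 2 }, c = { .value = 3 };

  doubly_node_bind (&head, &head);                          /* empty ring */
  doubly_node_insert_between (&head, head.next, &a.node);   /* insert after head */
  doubly_node_insert_between (head.prev, &head, &b.node);   /* insert before head */
  doubly_node_insert_between (&head, head.next, &c.node);   /* insert after head */
  return digits_forward (&head);
}
```

Because the sentinel is always present, none of the insertions needs a special case for an empty list.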

Generic selection

X-macros

Function Parameters

In C, it is common to use return values to denote errors that occur, and to return data through passed-in pointers. This can be done for multiple reasons, including not having to allocate memory on the heap, or being able to use static allocation at the point where the function is called.
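A sketch of this convention: the function below reports success or failure through its return value and delivers the parsed number through a pointer supplied by the caller. The name parse_int and the 0 / -1 convention are assumptions for the example; strtol and errno are standard.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Parses a base-10 int from text. On success stores the value in *out
   and returns 0; on failure returns -1 and leaves *out untouched. */
int parse_int(const char *text, int *out)
{
    char *end = NULL;

    errno = 0;
    long value = strtol(text, &end, 10);

    if (end == text || *end != '\0') {
        return -1;                      /* no digits, or trailing characters */
    }
    if (errno == ERANGE || value < INT_MIN || value > INT_MAX) {
        return -1;                      /* does not fit in an int */
    }
    *out = (int)value;
    return 0;
}
```

The caller owns the storage for the result, typically a local variable, so no heap allocation is involved.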

Pointers

The position of the asterisk does not affect the meaning of the definition:

/* The * binds to the name on its right, not to the type, so these are all equivalent. */
int *i;
int * i;
int* i;

However, when defining multiple pointers at once, each requires its own asterisk:

int *i, *j; /* i and j are both pointers */
int* i, j;  /* i is a pointer, but j is an int not a pointer variable */

An array of pointers is also possible, where an asterisk is given before the array variable's name:

int *foo[2]; /* foo is an array of pointers, can be accessed as *foo[0] and *foo[1] */
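The declarations above can be exercised in a short sketch; the function names are invented for the example.

```c
/* In "int* p = ..., n = ...;" only p is a pointer; n is a plain int.
   The initializer n = 7 would not compile if n were a pointer. */
int mixed_declaration_demo(void)
{
    int value = 5;
    int* p = &value, n = 7;

    *p += n;                      /* modifies value through p */
    return value;
}

/* An array of two pointers to int. */
int sum_through_pointers(void)
{
    int a = 1, b = 2;
    int *foo[2] = { &a, &b };

    return *foo[0] + *foo[1];     /* [] binds before *, so this is *(foo[0]) */
}
```

Writing the asterisk next to the name (int *p, n;) makes it easier to see that it applies to a single declarator only.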

Structs

Sequence points

International Standard ISO/IEC 9899:201x Programming languages — C

Accessing a volatile object, modifying an object, modifying a file, or calling a function that does any of those operations are all side effects, which are changes in the state of the execution environment.

The presence of a sequence point between the evaluation of expressions A and B implies that every value computation and side effect associated with A is sequenced before every value computation and side effect associated with B.

Here is the complete list of sequence points from Annex C of the online 2011 pre-publication draft of the C language standard:

Sequence points

1     The following are the sequence points described in 5.1.2.3:

  • Between the evaluations of the function designator and actual arguments in a function call and the actual call. (6.5.2.2).
  • Between the evaluations of the first and second operands of the following operators: logical AND && (6.5.13); logical OR || (6.5.14); comma , (6.5.17).
  • Between the evaluations of the first operand of the conditional ? : operator and whichever of the second and third operands is evaluated (6.5.15).
  • The end of a full declarator: declarators (6.7.6);
  • Between the evaluation of a full expression and the next full expression to be evaluated. The following are full expressions: an initializer that is not part of a compound literal (6.7.9); the expression in an expression statement (6.8.3); the controlling expression of a selection statement (if or switch) (6.8.4); the controlling expression of a while or do statement (6.8.5); each of the (optional) expressions of a for statement (6.8.5.3); the (optional) expression in a return statement (6.8.6.4).
  • Immediately before a library function returns (7.1.4).
  • After the actions associated with each formatted input/output function conversion specifier (7.21.6, 7.29.2).
  • Immediately before and immediately after each call to a comparison function, and also between any call to a comparison function and any movement of the objects passed as arguments to that call (7.22.5).
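A few of the sequence points listed above can be illustrated with a short sketch; the function names are hypothetical.

```c
#include <stddef.h>

/* && has a sequence point after its first operand: p is tested before
   *p is evaluated, and *p is not evaluated at all when p is null. */
int is_positive(const int *p)
{
    return p != NULL && *p > 0;
}

/* The comma operator has a sequence point between its operands. */
int comma_demo(void)
{
    int i = 0;
    int r = (i++, i + 10);    /* i becomes 1 before i + 10 is computed */

    return r;
}

/* Each expression statement is a full expression, so there is a
   sequence point between the two increments. */
int increment_twice(int i)
{
    i++;
    i++;
    return i;
}
```

By contrast, an expression that modifies the same object twice with no intervening sequence point, such as i = i++, has undefined behavior.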

Command-line arguments

A C program running in a 'hosted environment' (the normal type — as opposed to a 'freestanding environment') must have a main function. It is traditionally defined as:

int main(int argc, char *argv[])

Note that argv can also be, and very often is, defined as char **argv; the behavior is the same. Also, the parameter names can be changed because they're just local variables within the function, but argc and argv are conventional and you should use those names.

For main functions where the code does not use any arguments, use int main(void).

Both parameters are initialized when the program starts:

  • argc is initialized to the number of space-separated arguments given to the program on the command line, plus one for the program name itself.
  • argv is an array of char-pointers (strings) containing the arguments (and the program name) that were given on the command line.
  • Some systems expand command-line arguments "in the shell", others do not. On Unix, if the user types myprogram *.txt, the program will receive a list of text files; on Windows it will receive the string "*.txt".

Note: Before using argv, you might need to check the value of argc. In theory, argc could be 0, and if argc is zero, then there are no arguments and argv[0] (equivalent to argv[argc]) is a null pointer. It would be an unusual system with a hosted environment if you ran into this problem. Similarly, it is possible, though very unusual, for there to be no information about the program name. In that case, argv[0][0] == '\0' — the program name may be empty.

Suppose we start the program like this:

./some_program abba banana mamajam

Then argc is equal to 4, and the command-line arguments:

  • argv[0] points to "./some_program" (the program name) if the program name is available from the host environment. Otherwise an empty string "".
  • argv[1] points to "abba",
  • argv[2] points to "banana",
  • argv[3] points to "mamajam",
  • argv[4] contains the value NULL.

See also What should main() return in C and C++ for complete quotes from the standard.
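Argument processing is often delegated from main to helper functions that receive argc and argv unchanged. The sketch below shows two such helpers, one driven by argc and one by the terminating null pointer; both names are hypothetical.

```c
#include <string.h>

/* Counts how many arguments, excluding the program name, equal word.
   The loop also behaves correctly when argc is 0. */
int count_matching_args(int argc, char *argv[], const char *word)
{
    int matches = 0;

    for (int i = 1; i < argc; i++) {
        if (strcmp(argv[i], word) == 0) {
            matches++;
        }
    }
    return matches;
}

/* Walks argv using the terminating null pointer instead of argc. */
int count_args_by_sentinel(char *argv[])
{
    int n = 0;

    while (argv[n] != NULL) {
        n++;
    }
    return n;
}
```

For the invocation ./some_program abba banana mamajam, count_args_by_sentinel(argv) gives 4, the same value as argc.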

Aliasing and effective type

Violating the aliasing rules and violating the effective type of an object are two different things and should not be confused.

  • Aliasing is the property of two pointers a and b that refer to the same object, that is that a == b.

  • The effective type of a data object is used by C to determine which operations can be done on that object. In particular the effective type is used to determine if two pointers can alias each other.

Aliasing can be a problem for optimization, because changing the object through one pointer, a say, can change the object that is visible through the other pointer, b. If your C compiler would have to assume that pointers could always alias each other, regardless of their type and provenance, many optimization opportunities would be lost, and many programs would run slower.

C's strict aliasing rules refer to cases in which the compiler may assume which objects do (or do not) alias each other. There are two rules of thumb that you should always keep in mind for data pointers.

Unless said otherwise, two pointers with the same base type may alias.

Two pointers with different base type cannot alias, unless at least one of the two types is a character type.

Here base type means that we put aside type qualifications such as const. For example, if a is double* and b is const double*, the compiler must generally assume that a change of *a may change *b.

Violating the second rule can have catastrophic results. Here violating the strict aliasing rule means that you present two pointers a and b of different type to the compiler which in reality point to the same object. The compiler then may always assume that the two point to different objects, and will not update its idea of *b if you changed the object through *a.

If you do so, the behavior of your program becomes undefined. Therefore, C puts quite severe restrictions on pointer conversions to help you avoid such situations occurring accidentally.

Unless the source or target type is void, all pointer conversions between pointers with different base type must be explicit.

Or in other words, they need a cast, unless you do a conversion that just adds a qualifier such as const to the target type.

Avoiding pointer conversions in general and casts in particular protects you from aliasing problems. Unless you really need them, and these cases are very special, you should avoid them as much as you can.

Compilation

Filename extension    Description
.c                    Source file. Usually contains definitions and code.
.h                    Header file. Usually contains declarations.
.o                    Object file. Compiled code in machine language.
.obj                  Alternative extension for object files.
.a                    Library file. Package of object files.
.dll                  Dynamic-Link Library on Windows.
.so                   Shared object (library) on many Unix-like systems.
.dylib                Dynamic-Link Library on OS X (Unix variant).
.exe, .com            Windows executable file. Formed by linking object files and library files. On Unix-like systems, there is no special file name extension for executable files.

POSIX c99 compiler flags    Description
-o filename                 Output file name (e.g. bin/program.exe, program).
-I directory                Search for headers in directory.
-D name                     Define macro name.
-L directory                Search for libraries in directory.
-l name                     Link library libname.

Compilers on POSIX platforms (Linux, mainframes, Mac) usually accept these options, even if they are not called c99.

GCC (GNU Compiler Collection) Flags    Description
-Wall                                  Enables all warning messages that are commonly accepted to be useful.
-Wextra                                Enables more warning messages, can be too noisy.
-pedantic                              Force warnings where code violates the chosen standard.
-Wconversion                           Enable warnings on implicit conversion, use with care.
-c                                     Compiles source files without linking.
-v                                     Prints compilation info.
  • gcc accepts the POSIX flags plus a lot of others.
  • Many other compilers on POSIX platforms (clang, vendor specific compilers) also use the flags that are listed above.
  • See also Invoking GCC for many more options.
TCC (Tiny C Compiler) Flags        Description
-Wimplicit-function-declaration    Warn about implicit function declaration.
-Wunsupported                      Warn about unsupported GCC features that are ignored by TCC.
-Wwrite-strings                    Make string constants be of type const char * instead of char *.
-Werror                            Abort compilation if warnings are issued.
-Wall                              Activate all warnings, except -Werror, -Wunsupported and -Wwrite-strings.

Identifier Scope

Bit-fields

The only portable types for bit-fields are signed, unsigned or _Bool. The plain int type can be used, but the standard says (§6.7.2¶5) … for bit-fields, it is implementation-defined whether the specifier int designates the same type as signed int or the same type as unsigned int.

Other integer types may be allowed by a specific implementation, but using them is not portable.

Strings

Common pitfalls

Error handling

Keep in mind that errno is not necessarily a variable; the syntax is only an indication of how it might have been declared. On many modern systems with thread interfaces, errno is a macro that resolves to an object that is local to the current thread.

Implicit and Explicit Conversions

"Explicit conversion" is also commonly referred to as "casting".

Type Qualifiers

Type qualifiers are the keywords which describe additional semantics about a type. They are an integral part of type signatures. They can appear either at the topmost level of a declaration (directly affecting the identifier) or at sub-levels (relevant to pointers only, affecting the pointed-to values):

Keyword     Remarks
const       Prevents the mutation of the declared object (by appearing at the topmost level) or prevents the mutation of the pointed-to value (by appearing next to a pointer subtype).
volatile    Informs the compiler that the declared object (at topmost level) or the pointed-to value (in pointer subtypes) may change its value as a result of external conditions, not only as a result of program control flow.
restrict    An optimization hint, relevant to pointers only. Declares intent that for the lifetime of the pointer, no other pointers will be used to access the same pointed-to object.

The ordering of type qualifiers with respect to storage class specifiers (static, extern, auto, register), type modifiers (signed, unsigned, short, long) and type specifiers (int, char, double, etc.) is not enforced, but the good practice is to put them in the aforementioned order:

static const volatile unsigned long int a = 5; /* good practice */
unsigned volatile long static int const b = 5; /* bad practice */

Top-level qualifications

/* "a" cannot be mutated by the program but can change as a result of external conditions */
const volatile int a = 5;

/* the const applies to array elements, i.e. "a[0]" cannot be mutated */    
const int arr[] = { 1, 2, 3 };

/* for the lifetime of "ptr", no other pointer could point to the same "int" object */
int *restrict ptr;

Pointer subtype qualifications

/* "s1" can be mutated, but "*s1" cannot */
const char *s1 = "Hello";

/* neither "s2" (because of top-level const) nor "*s2" can be mutated */
const char *const s2 = "World";

/* "*p" may change its value as a result of external conditions; "**p" and "p" are not volatile-qualified */
char *volatile *p;

/* "q", "*q" and "**q" may change their values as a result of external conditions */
volatile char *volatile *volatile q;

Valgrind

Typedef


Disadvantages of Typedef

typedef could lead to the pollution of namespace in large C programs.

Disadvantages of Typedef Structs

Also, typedef'd structs without a tag name are a major cause of needless imposition of ordering relationships among header files.

Consider:

#ifndef FOO_H
#define FOO_H 1

#define FOO_DEF (0xDEADBABE)

struct bar; /* forward declaration, defined in bar.h */

struct foo {
    struct bar *bar;
};

#endif

With such a definition, not using typedefs, it is possible for a compilation unit to include foo.h to get at the FOO_DEF definition. If it doesn't attempt to dereference the bar member of the foo struct then there will be no need to include the bar.h file.

Typedef vs #define

#define is a C pre-processor directive which can also be used to define aliases for data types, similar to typedef, but with the following differences:

  • typedef is limited to giving symbolic names to types only, whereas #define can be used to define aliases for values as well.

  • typedef interpretation is performed by the compiler whereas #define statements are processed by the pre-processor.

  • Note that #define cptr char * followed by cptr a, b; does not do the same as typedef char *cptr; followed by cptr a, b;. With the #define, b is a plain char variable, but it is also a pointer with the typedef.

Selection Statements

Declaration vs Definition

Standard Math

  1. To link with the math library, pass -lm to gcc.
  2. A portable program that needs to check for an error from a mathematical function should set errno to zero and call feclearexcept(FE_ALL_EXCEPT); before calling the mathematical function. Upon return from the mathematical function, if errno is nonzero, or if fetestexcept(FE_INVALID | FE_DIVBYZERO | FE_OVERFLOW | FE_UNDERFLOW); returns nonzero, then an error occurred in the mathematical function. See the math_error man page for more information.

Boolean

To use the predefined type _Bool and the header <stdbool.h>, you must be using the C99/C11 versions of C.

To avoid compiler warnings and possibly errors, you should only use the typedef/define example if you're using C89 and previous versions of the language.

Literals for numbers, characters and strings

The term literal is commonly used to describe a sequence of characters in a C code that designates a constant value such as a number (e.g. 0) or a string (e.g. "C"). Strictly speaking, the standard uses the term constant for integer constants, floating constants, enumeration constants and character constants, reserving the term 'literal' for string literals, but this is not common usage.

Literals can have prefixes or suffixes, which are extra characters that start or end a literal to change its default type or its representation. A hexadecimal integer constant such as 0x10UL carries both.

Storage Classes

Storage class specifiers are the keywords which can appear next to the top-level type of a declaration. The use of these keywords affects the storage duration and linkage of the declared object, depending on whether it is declared at file scope or at block scope:

Keyword          Storage Duration    Linkage              Remarks
static           Static              Internal             Sets internal linkage for objects at file scope; sets static storage duration for objects at block scope.
extern           Static              External             Implied and therefore redundant for objects defined at file scope which also have an initializer. When used in a declaration at file scope without an initializer, hints that the definition is to be found in another translation unit and will be resolved at link-time.
auto             Automatic           Irrelevant           Implied and therefore redundant for objects declared at block scope.
register         Automatic           Irrelevant           Relevant only to objects with automatic storage duration. Provides a hint that the variable should be stored in a register. An imposed constraint is that one cannot use the unary & "address of" operator on such an object, and therefore the object cannot be aliased.
typedef          Irrelevant          Irrelevant           Not a storage class specifier in practice, but works like one from a syntactic point of view. The only difference is that the declared identifier is a type, rather than an object.
_Thread_local    Thread              Internal/external    Introduced in C11, to represent thread storage duration. If used at block scope, it shall also include extern or static.

Every object has an associated storage duration (regardless of scope) and linkage (relevant to declarations at file scope only), even when these keywords are omitted.

The ordering of storage class specifiers with respect to top-level type specifiers (int, unsigned, short, etc.) and top-level type qualifiers (const, volatile) is not enforced, so both of these declarations are valid:

int static const unsigned a = 5; /* bad practice */
static const unsigned int b = 5; /* good practice */

It is, however, considered a good practice to put storage class specifiers first, then any type qualifiers, then the type specifier (void, char, int, signed long, unsigned long long, long double...).

Not all storage class specifiers are legal at a certain scope:

register int x; /* legal at block scope, illegal at file scope */
auto int y; /* same */

static int z; /* legal at both file and block scope */
extern int a; /* same */

extern int b = 5; /* legal and redundant at file scope, illegal at block scope */

/* legal because typedef is treated like a storage class specifier syntactically */
int typedef new_type_name;

Storage Duration

For a declared object, the storage duration can be static, automatic or, since C11, thread. It is determined by the object's scope and its storage class specifiers.

Static Storage Duration

Variables with static storage duration live throughout the whole execution of the program and can be declared both at file scope (with or without static) and at block scope (by putting static explicitly). They are usually allocated and initialized by the operating system at program startup and reclaimed when the process terminates. In practice, executable formats have dedicated sections for such variables (data, bss and rodata) and these whole sections from the file are mapped into memory at certain ranges.

Thread Storage Duration

C11

This storage duration was introduced in C11. This wasn't available in earlier C standards. Some compilers provide a non-standard extension with similar semantics. For example, gcc supports __thread specifier which can be used in earlier C standards which didn't have _Thread_local.

Variables with thread storage duration can be declared at both file scope and block scope. If declared at block scope, the declaration shall also use the static or extern storage specifier. The lifetime of such a variable is the entire execution of the thread in which it is created. _Thread_local is the only storage specifier that can appear alongside another storage specifier.

Automatic Storage Duration

Variables with automatic storage duration can only be declared at block scope (directly within a function or within a block in that function). They are usable only in the period between entering and leaving the function or block. Once the variable goes out of scope (either by returning from the function or by leaving the block), its storage is automatically deallocated. Any further references to the same variable from pointers are invalid and lead to undefined behaviour.

In typical implementations, automatic variables are located at certain offsets in the stack frame of a function or in registers.

External and Internal Linkage

Linkage is only relevant to identifiers of functions and variables declared at file scope and affects their visibility across different translation units. Identifiers with external linkage are visible in every other translation unit (provided that the appropriate declaration is included). Identifiers with internal linkage are not exposed to other translation units and can only be used in the translation unit where they are defined.

Declarations

A declaration of an identifier that refers to an object or function is often called, for short, simply a declaration of that object or function.

Formatted Input/Output

Compound Literals

The C standard says in C11-§6.5.2.5/3:

A postfix expression that consists of a parenthesized type name followed by a brace enclosed list of initializers is a compound literal. It provides an unnamed object whose value is given by the initializer list.99)

and footnote 99 says:

Note that this differs from a cast expression. For example, a cast specifies a conversion to scalar types or void only, and the result of a cast expression is not an lvalue.

Note that:

String literals, and compound literals with const-qualified types, need not designate distinct objects.101)

101) This allows implementations to share storage for string literals and constant compound literals with the same or overlapping representations.

An example is given in the standard:
C11-§6.5.2.5/13:

Like string literals, const-qualified compound literals can be placed into read-only memory and can even be shared. For example,

(const char []){"abc"} == "abc"

might yield 1 if the literals’ storage is shared.

Inline assembly

Inline assembly is the practice of adding assembly instructions in the middle of C source code. No ISO C standard requires support of inline assembly. Since it is not required, the syntax for inline assembly varies from compiler to compiler. Even though it is typically supported there are very few reasons to use inline assembly and many reasons not to.

Pros

  1. Performance: By writing the specific assembly instructions for an operation, you can achieve better performance than the assembly code generated by the compiler. Note that these performance gains are rare. In most cases you can achieve better performance gains just by rearranging your C code so the optimizer can do its job.
  2. Hardware interface: Device driver or processor startup code may need some assembly code to access the correct registers and to guarantee certain operations occur in a specific order with a specific delay between operations.

Cons

  1. Compiler Portability: Syntax for inline assembly is not guaranteed to be the same from one compiler to another. If you are writing code with inline assembly that should be supported by different compilers, use preprocessor macros (#ifdef) to check which compiler is being used. Then, write a separate inline assembly section for each supported compiler.
  2. Processor Portability: You can't write inline assembly for an x86 processor and expect it to work on an ARM processor. Inline assembly is intended to be written for a specific processor or processor family. If you have inline assembly that you want supported on different processors, use preprocessor macros to check which processor the code is being compiled for and to select the appropriate assembly code section.
  3. Future Performance Changes: Inline assembly may be written expecting delays based upon a certain processor clock speed. If the program is compiled for a processor with a faster clock, the assembly code may not perform as expected.

Threads (native)

Initialization

Structure Padding and Packing

Memory management

C11

Note that aligned_alloc() is only defined for C11 or later.

Systems such as those based on POSIX provide other ways of allocating aligned memory (e.g. posix_memalign()), and also have other memory management options (e.g. mmap()).

Implementation-defined behaviour

Overview

The C standard describes the language syntax, the functions provided by the standard library, and the behavior of conforming C processors (roughly speaking, compilers) and conforming C programs. With respect to behavior, the standard for the most part specifies particular behaviors for programs and processors. On the other hand, some operations have explicit or implicit undefined behavior -- such operations are always to be avoided, as you cannot rely on anything about them. In between, there are a variety of implementation defined behaviors. These behaviors may vary between C processors, runtimes, and standard libraries (collectively, implementations), but they are consistent and reliable for any given implementation, and conforming implementations document their behavior in each of these areas.

It is sometimes reasonable for a program to rely on implementation-defined behavior. For example, if the program is anyway specific to a particular operating environment then relying on implementation-defined behaviors general to the common processors for that environment is unlikely to be a problem. Alternatively, one can use conditional compilation directives to select implementation-defined behaviors appropriate for the implementation in use. In any case, it is essential to know which operations have implementation defined behavior, so as to either avoid them or to make an informed decision about whether and how to use them.

The balance of these remarks constitutes a list of all the implementation-defined behaviors and characteristics specified in the C2011 standard, with references to the standard. Many of them use the terminology of the standard. Some others rely more generally on the context of the standard, such as the eight stages of translating source code into a program, or the difference between hosted and freestanding implementations. Some that may be particularly surprising or notable are presented in bold typeface. Not all the behaviors described are supported by earlier C standards, but generally speaking, they have implementation-defined behavior in all versions of the standard that support them.

Programs and Processors

General

  • The number of bits in one byte (3.6/3). At least 8, the actual value can be queried with the macro CHAR_BIT.

  • Which output messages are considered "diagnostic messages" (3.10/1)

Source translation

  • The manner in which physical source file multibyte characters are mapped to the source character set (5.1.1.2/1).

  • Whether non-empty sequences of non-newline whitespace are replaced by single spaces during translation phase 3 (5.1.1.2/1)

  • The execution-set character(s) to which character literals and characters in string constants are converted (during translation phase 5) when there is otherwise no corresponding character (5.1.1.2/1).

Operating environment

  • The manner in which the diagnostic messages to be emitted are identified (5.1.1.3/1).

  • The name and type of the function called at startup in a freestanding implementation (5.1.2.1/1).

  • Which library facilities are available in a freestanding implementation, beyond a specified minimal set (5.1.2.1/1).

  • The effect of program termination in a freestanding environment (5.1.2.1/2).

  • In a hosted environment, any allowed signatures for the main() function other than int main(int argc, char *argv[]) and int main(void) (5.1.2.2.1/1).

  • The manner in which a hosted implementation defines the strings pointed to by the second argument to main() (5.1.2.2.1/2).

  • What constitutes an "interactive device" for the purpose of sections 5.1.2.3 (Program Execution) and 7.21.3 (Files) (5.1.2.3/7).

  • Any restrictions on objects referred to by interrupt-handler routines in an optimizing implementation (5.1.2.3/10).

  • In a freestanding implementation, whether multiple threads of execution are supported (5.1.2.4/1).

  • The values of the members of the execution character set (5.2.1/1).

  • The char values corresponding to the defined alphabetic escape sequences (5.2.2/3).

  • The integer and floating-point numeric limits and characteristics (5.2.4.2/1).

  • The accuracy of floating-point arithmetic operations and of the standard library's conversions from internal floating point representations to string representations (5.2.4.2.2/6).

  • The value of macro FLT_ROUNDS, which encodes the default floating-point rounding mode (5.2.4.2.2/8).

  • The rounding behaviors characterized by supported values of FLT_ROUNDS greater than 3 or less than -1 (5.2.4.2.2/8).

  • The value of macro FLT_EVAL_METHOD, which characterizes floating-point evaluation behavior (5.2.4.2.2/9).

  • The behavior characterized by any supported values of FLT_EVAL_METHOD less than -1 (5.2.4.2.2/9).

  • The values of macros FLT_HAS_SUBNORM, DBL_HAS_SUBNORM, and LDBL_HAS_SUBNORM, characterizing whether the standard floating-point formats support subnormal numbers (5.2.4.2.2/10)

Types

  • The result of attempting to (indirectly) access an object with thread storage duration from a thread other than the one with which the object is associated (6.2.4/4)

  • The value of a char to which a character outside the basic execution set has been assigned (6.2.5/3).

  • The supported extended signed integer types, if any, (6.2.5/4), and any extension keywords used to identify them.

  • Whether char has the same representation and behavior as signed char or as unsigned char (6.2.5/15). Can be queried with CHAR_MIN, which is either 0 or SCHAR_MIN if char is unsigned or signed, respectively.

  • The number, order, and encoding of bytes in the representations of objects, except where explicitly specified by the standard (6.2.6.1/2).

  • Which of the three recognized forms of integer representation applies in any given situation, and whether certain bit patterns of integer objects are trap representations (6.2.6.2/2).

  • The alignment requirement of each type (6.2.8/1).

  • Whether and in what contexts any extended alignments are supported (6.2.8/3).

  • The set of supported extended alignments (6.2.8/4).

  • The integer conversion ranks of any extended signed integer types relative to each other (6.3.1.1/1).

  • The effect of assigning an out-of-range value to a signed integer (6.3.1.3/3).

  • When an in-range but unrepresentable value is assigned to a floating-point object, how the representable value stored in the object is chosen from between the two nearest representable values (6.3.1.4/2; 6.3.1.5/1; 6.4.4.2/3).

  • The result of converting an integer to a pointer type, except for integer constant expressions with value 0 (6.3.2.3/5).

Source form

  • The locations within #pragma directives where header name tokens are recognized (6.4/4).

  • The characters, including multibyte characters, other than underscore, unaccented Latin letters, universal character names, and decimal digits that may appear in identifiers (6.4.2.1/1).

  • The number of significant characters in an identifier (6.4.2.1/5).

  • With some exceptions, the manner in which the source characters in an integer character constant are mapped to execution-set characters (6.4.4.4/2; 6.4.4.4/10).

  • The current locale used for computing the value of a wide character constant, and most other aspects of the conversion for many such constants (6.4.4.4/11).

  • Whether differently-prefixed wide string literal tokens can be concatenated and, if so, the treatment of the resulting multibyte character sequence (6.4.5/5)

  • The locale used during translation phase 7 to convert wide string literals to multibyte character sequences, and their value when the result is not representable in the execution character set (6.4.5/6).

  • The manner in which header names are mapped to file names (6.4.7/2).

Evaluation

  • Whether and how floating-point expressions are contracted when FP_CONTRACT is not used (6.5/8).

  • The values of the results of the sizeof and _Alignof operators (6.5.3.4/5).

  • The size of the result type of pointer subtraction (6.5.6/9).

  • The result of right-shifting a signed integer with a negative value (6.5.7/5).

Runtime behavior

  • The extent to which the register keyword is effective (6.7.1/6).

  • Whether the type of a bitfield declared as int is the same type as unsigned int or as signed int (6.7.2/5).

  • What types bitfields may take, other than optionally-qualified _Bool, signed int, and unsigned int; whether bitfields may have atomic types (6.7.2.1/5).

  • Aspects of how implementations lay out the storage for bitfields (6.7.2.1/11).

  • The alignment of non-bitfield members of structures and unions (6.7.2.1/14).

  • The underlying type for each enumerated type (6.7.2.2/4).

  • What constitutes an "access" to an object of volatile-qualified type (6.7.3/7).

  • The effectiveness of inline function declarations (6.7.4/6).

Preprocessor

  • Whether character constants are converted to integer values the same way in preprocessor conditionals as in ordinary expressions, and whether a single-character constant may have a negative value (6.10.1/4).

  • The locations searched for files designated in an #include directive (6.10.2/2-3).

  • The manner in which a header name is formed from the tokens of a multi-token #include directive (6.10.2/4).

  • The limit for #include nesting (6.10.2/6).

  • Whether a \ character is inserted before the \ introducing a universal character name in the result of the preprocessor's # operator (6.10.3.2/2).

  • The behavior of the #pragma preprocessing directive for pragmas other than STDC (6.10.6/1).

  • The value of the __DATE__ and __TIME__ macros if no translation date or time, respectively, is available (6.10.8.1/1).

  • The internal character encoding used for wchar_t if macro __STDC_ISO_10646__ is not defined (6.10.8.2/1).

  • The internal character encoding used for char32_t if macro __STDC_UTF_32__ is not defined (6.10.8.2/1).

Standard Library

General

  • The format of the messages emitted when assertions fail (7.2.1.1/2).

Floating-point environment functions

  • Any additional floating-point exceptions beyond those defined by the standard (7.6/6).

  • Any additional floating-point rounding modes beyond those defined by the standard (7.6/8).

  • Any additional floating-point environments beyond those defined by the standard (7.6/10).

  • The default value of the floating-point environment access switch (7.6.1/2).

  • The representation of the floating-point status flags recorded by fegetexceptflag() (7.6.2.2/1).

  • Whether the feraiseexcept() function additionally raises the "inexact" floating-point exception whenever it raises the "overflow" or "underflow" floating-point exception (7.6.2.3/2).

Locale-related functions

  • The locale strings other than "C" supported by setlocale() (7.11.1.1/3).

Math functions

  • The types represented by float_t and double_t when the FLT_EVAL_METHOD macro has a value different from 0, 1, and 2 (7.12/2).

  • Any supported floating-point classifications beyond those defined by the standard (7.12/6).

  • The value returned by the math.h functions in the event of a domain error (7.12.1/2).

  • The value returned by the math.h functions in the event of a pole error (7.12.1/3).

  • The value returned by the math.h functions when the result underflows, and aspects of whether errno is set to ERANGE and whether a floating-point exception is raised under those circumstances (7.12.1/6).

  • The default value of the FP-contraction switch (7.12.2/2).

  • Whether the fmod() functions return 0 or raise a domain error when their second argument is 0 (7.12.10.1/3).

  • Whether the remainder() functions return 0 or raise a domain error when their second argument is 0 (7.12.10.2/3).

  • The number of significant bits in the quotient moduli computed by the remquo() functions (7.12.10.3/2).

  • Whether the remquo() functions return 0 or raise a domain error when their second argument is 0 (7.12.10.3/3).

Signals

  • The complete set of supported signals, their semantics, and their default handling (7.14/4).

  • When a signal is raised and there is a custom handler associated with that signal, which signals, if any, are blocked for the duration of the execution of the handler (7.14.1.1/3).

  • Which signals other than SIGFPE, SIGILL, and SIGSEGV cause the behavior upon returning from a custom signal handler to be undefined (7.14.1.1/3).

  • Which signals are initially configured to be ignored (regardless of their default handling; 7.14.1.1/6).

Miscellaneous

  • The specific null pointer constant to which macro NULL expands (7.19/3).

File-handling functions

  • Whether the last line of a text stream requires a terminating newline (7.21.2/2).

  • The number of null characters automatically appended to a binary stream (7.21.2/3).

  • The initial position of a file opened in append mode (7.21.3/1).

  • Whether a write on a text stream causes the stream to be truncated (7.21.3/2).

  • Support for stream buffering (7.21.3/3).

  • Whether zero-length files actually exist (7.21.3/4).

  • The rules for composing valid file names (7.21.3/8).

  • Whether the same file can simultaneously be open multiple times (7.21.3/8).

  • The nature and choice of encoding for multibyte characters (7.21.3/10).

  • The behavior of the remove() function when the target file is open (7.21.4.1/2).

  • The behavior of the rename() function when the target file already exists (7.21.4.2/2).

  • Whether files created via the tmpfile() function are removed in the event that the program terminates abnormally (7.21.4.3/2).

  • Which mode changes under which circumstances are permitted via freopen() (7.21.5.4/3).

I/O functions

  • Which of the permitted representations of infinite and not-a-number FP values are produced by the printf()-family functions (7.21.6.1/8).

  • The manner in which pointers are formatted by the printf()-family functions (7.21.6.1/8).

  • The behavior of scanf()-family functions when the - character appears in an internal position of the scanlist of a [ field (7.21.6.2/12).

  • Most aspects of the scanf()-family functions' handling of %p fields (7.21.6.2/12).

  • The errno value set by fgetpos() on failure (7.21.9.1/2).

  • The errno value set by fsetpos() on failure (7.21.9.3/2).

  • The errno value set by ftell() on failure (7.21.9.4/3).

  • The meaning to the strtod()-family functions of some supported aspects of NaN formatting (7.22.1.3/4).

  • Whether the strtod()-family functions set errno to ERANGE when the result underflows (7.22.1.3/10).

Memory allocation functions

  • The behavior of the memory-allocation functions when the number of bytes requested is 0 (7.22.3/1).

System environment functions

  • What cleanups, if any, are performed and what status is returned to the host OS when the abort() function is called (7.22.4.1/2).

  • What status is returned to the host environment when exit() is called (7.22.4.4/5).

  • The handling of open streams and what status is returned to the host environment when _Exit() is called (7.22.4.5/2).

  • The set of environment names accessible via getenv() and the method for altering the environment (7.22.4.6/2).

  • The return value of the system() function (7.22.4.8/3).

Date and time functions

  • The local time zone and Daylight Saving time (7.27.1/1).

  • The range and precision of times representable via types clock_t and time_t (7.27.1/4).

  • The beginning of the era that serves as the reference for the times returned by the clock() function (7.27.2.1/3).

  • The beginning of the epoch that serves as the reference for the times returned by the timespec_get() function (when the time base is TIME_UTC; 7.27.2.5/3).

  • The strftime() replacement for the %Z conversion specifier in the "C" locale (7.27.3.5/7).
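Because the range and precision of time_t are implementation-defined, subtracting two time_t values directly is not portable. The sketch below uses difftime() instead; seconds_between() is a hypothetical helper name.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical helper: seconds elapsed between two calendar times.
 * difftime() hides the implementation-defined representation of time_t. */
double seconds_between(time_t start, time_t end)
{
    assert(start != (time_t)-1 && end != (time_t)-1);
    return difftime(end, start);
}
```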

Wide-character I/O functions

  • Which of the permitted representations of infinite and not-a-number FP values are produced by the wprintf()-family functions (7.29.2.1/8).

  • The manner in which pointers are formatted by the wprintf()-family functions (7.29.2.1/8).

  • The behavior of wscanf()-family functions when the - character appears in an internal position of the scanlist of a [ field (7.29.2.2/12).

  • Most aspects of the wscanf()-family functions' handling of %p fields (7.29.2.2/12).

  • The meaning to the wcstod()-family functions of some supported aspects of NaN formatting (7.29.4.1.1/4).

  • Whether the wcstod()-family functions set errno to ERANGE when the result underflows (7.29.4.1.1/10).

Atomics

Atomics are an optional feature of the C language, available since C11.

Their purpose is to ensure race-free access to variables that are shared between different threads. Without atomic qualification, concurrent access to a shared variable by two threads, at least one of which modifies it, results in undefined behavior. For example, an increment operation (++) could be split into several assembler instructions: a read, the addition itself, and a store. If another thread performs the same operation at the same time, the two instruction sequences could be interleaved and lead to an inconsistent result.

  • Types: All object types with the exception of array types can be qualified with _Atomic.

  • Operators: All read-modify-write operators (e.g. ++ or *=) on these are guaranteed to be atomic.

  • Operations: There are some other operations that are specified as type-generic functions, e.g. atomic_compare_exchange_strong.

  • Threads: Access to them is guaranteed not to produce a data race when they are accessed by different threads.

  • Signal handlers: Atomic types are called lock-free if all operations on them are stateless. In that case they can also be used to communicate state changes between normal control flow and a signal handler.

  • There is only one data type that is guaranteed to be lock-free: atomic_flag. This is a minimal type whose operations are intended to map to efficient test-and-set hardware instructions.

Other means to avoid race conditions are available in C11's thread interface, in particular a mutex type mtx_t to mutually exclude threads from accessing critical data or critical sections of code. If atomics are not available, these must be used to prevent races.
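The points above can be sketched as follows. This is an illustration only: the names hits, busy, record_hit(), update_max(), try_enter() and leave() are hypothetical, and the code assumes an implementation that provides <stdatomic.h> (that is, one that does not define __STDC_NO_ATOMICS__).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical shared state. */
static atomic_int hits;                       /* static storage: starts at 0 */
static atomic_flag busy = ATOMIC_FLAG_INIT;

/* Safe to call from several threads: ++ on an _Atomic object is one
 * indivisible read-modify-write. Returns the value before the increment. */
int record_hit(void)
{
    return hits++;
}

/* Raise *max to candidate unless it is already at least that large. */
void update_max(atomic_int *max, int candidate)
{
    assert(max != NULL);

    int seen = atomic_load(max);
    while (seen < candidate &&
           !atomic_compare_exchange_weak(max, &seen, candidate)) {
        /* On failure, seen holds the current value; the loop re-checks it. */
    }
}

/* atomic_flag is the one type guaranteed to be lock-free. */
bool try_enter(void) { return !atomic_flag_test_and_set(&busy); }
void leave(void)     { atomic_flag_clear(&busy); }
```

The compare-exchange loop is the usual pattern for updates that cannot be expressed with a single read-modify-write operator.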

Iteration Statements/Loops: for, while, do-while

Iteration statements (loops) fall into two categories:

  • head-controlled iteration statement/loops
  • foot-controlled iteration statement/loops

Head-Controlled Iteration Statement/Loops

for ([<expression>]; [<expression>]; [<expression>]) <statement>
while (<expression>) <statement>
C99
for (<declaration> [<expression>]; [<expression>]) <statement>

Foot-Controlled Iteration Statement/Loops

do <statement> while (<expression>);
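The difference between the two categories shows up when the condition is false from the start. The functions below are hypothetical examples written for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Head-controlled: the condition is tested before each pass, so the body
 * may run zero times. The C99 form declares i inside the for statement. */
int sum_for(const int *values, size_t count)
{
    assert(values != NULL || count == 0);

    int total = 0;
    for (size_t i = 0; i < count; i++)
        total += values[i];
    return total;
}

/* Head-controlled while loop: counts decimal digits, giving 0 for n == 0. */
int digits_while(unsigned n)
{
    int digits = 0;
    while (n != 0) {
        n /= 10;
        digits++;
    }
    return digits;
}

/* Foot-controlled: the body runs once before the first test,
 * so n == 0 is reported as having 1 digit. */
int digits_do_while(unsigned n)
{
    int digits = 0;
    do {
        n /= 10;
        digits++;
    } while (n != 0);
    return digits;
}
```

For any nonzero n both digit counters agree; they differ only for 0, where the do-while body still executes once.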

Enumerations

Enumerations consist of the enum keyword and an optional identifier followed by an enumerator-list enclosed by braces.

Each enumerator (enumeration constant) is an identifier of type int.

The enumerator-list has at least one enumerator element.

An enumerator may optionally be "assigned" a constant expression of type int.

An enumerator is a constant. The enumerated type itself is compatible with char, a signed integer type, or an unsigned integer type; which one is used is implementation-defined. In any case, the type used must be able to represent all values defined for the enumeration in question.

If no constant expression is "assigned" to an enumerator and it is the first entry in an enumerator-list, it takes the value 0; otherwise it takes the value of the previous entry in the enumerator-list plus 1.

Using multiple "assignments" can lead to different enumerators of the same enumeration carrying the same value.
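These rules can be seen in a small example. The enumeration color and the helper color_name() are hypothetical names chosen for this sketch.

```c
#include <assert.h>

enum color {
    RED,            /* 0: first enumerator, no value given */
    GREEN = 5,      /* explicit constant expression */
    BLUE,           /* 6: previous value plus 1 */
    CRIMSON = RED   /* 0: repeats an existing value, which is allowed */
};

/* Hypothetical helper: map an enumerator to a printable name. */
const char *color_name(enum color c)
{
    switch (c) {
    case RED:   return "red";   /* a case for CRIMSON would duplicate this */
    case GREEN: return "green";
    case BLUE:  return "blue";
    }
    assert(!"unknown color");
    return "?";
}
```

Because CRIMSON and RED share a value, they cannot both appear as case labels in the same switch statement.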

Jump Statements

Create and include header files

Testing frameworks

<ctype.h> — character classification & conversion

Pass 2D-arrays to functions

Side Effects

Multi-Character Character Sequence

Not all preprocessors support trigraph sequence processing. Some compilers give an extra option or switch for processing them. Others use a separate program to convert trigraphs.

The GCC compiler does not recognize them unless you explicitly request it to do so (use -trigraphs to enable them; use -Wtrigraphs, part of -Wall, to get warnings about trigraphs).

As most platforms in use today support the full range of single characters used in C, digraphs are preferred over trigraphs but the use of any multi-character character sequences is generally discouraged.

Also, beware of accidental trigraph use (puts("What happened??!!");, for example).
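One way to guard a string literal against accidental trigraph replacement is the \? escape, as in this sketch. The second function is spelled with digraphs purely for illustration; question() and first() are hypothetical names.

```c
#include <assert.h>
#include <string.h>

/* The \? escape separates the two question marks, so the literal is the
 * same text whether or not the compiler performs trigraph replacement. */
const char *question(void)
{
    return "What happened?\?!!";
}

/* Digraph spelling of: int first(const int a[]) { return a[0]; } */
int first(const int a<::>)
<%
    assert(a != NULL);
    return a<:0:>;
%>
```

Unlike trigraphs, digraphs are ordinary tokens and are never replaced inside string literals, which is part of why they are the less surprising of the two.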

Constraints

Constraint is a term used in all of the existing C specifications (most recently ISO/IEC 9899:2011). Constraints are one of the three parts of the language described in clause 6 of the standard (alongside syntax and semantics).

ISO/IEC 9899:2011 defines a constraint as a:

restriction, either syntactic or semantic, by which the exposition of language elements is to be interpreted

(Please also note that, in terms of the C standard, a "runtime-constraint" is not a kind of constraint and has substantially different rules.)

In other words, a constraint describes a rule of the language that makes an otherwise syntactically valid program invalid. In this respect constraints resemble undefined behavior: a program that violates them is not defined in terms of the C language.

Constraints differ from undefined behavior in one significant way, however: an implementation is required to issue a diagnostic message during translation (part of compilation) when a constraint is violated. This message may be a warning, or it may halt the compilation.
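The contrast can be illustrated with a short sketch. It relies on two rules listed under Constraints in C11: a _Static_assert expression must compare unequal to 0 (6.7.10/2), and an assignment needs a modifiable lvalue as its left operand (6.5.16/2). The helper checked_add() is a hypothetical name.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* A _Static_assert whose expression is 0 violates a constraint, so the
 * compiler must diagnose it. This one holds on every conforming system. */
_Static_assert(CHAR_BIT >= 8, "C requires at least 8 bits per byte");

/* Hypothetical helper: add without signed overflow. Overflow would be
 * undefined behavior, for which no diagnostic is required, so the check
 * has to be written by hand. Returns 0 on success, -1 if out of range. */
int checked_add(int a, int b, int *sum)
{
    assert(sum != NULL);

    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return -1;
    *sum = a + b;
    return 0;
}

/* Constraint violation, left commented out so the file still compiles:
 *     const int limit = 10;
 *     limit = 20;    // left operand is not a modifiable lvalue
 */
```

Uncommenting the last two lines inside a function makes a conforming compiler emit a diagnostic, whereas an unchecked a + b that overflows may compile silently.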

Inlining

Unions

Multithreading

Using threads can introduce additional undefined behavior, such as a data race (https://stackoverflow.com/documentation/c/364/undefined-behavior/2622/data-race#t=201706130820201457052). For race-free access to variables that are shared between different threads, C11 provides mutexes (mtx_lock() and related functions in <threads.h>) and the optional atomic (https://stackoverflow.com/documentation/c/4924/atomics#t=201706150835215525448) data types and associated functions in <stdatomic.h>.

Common C programming idioms and developer practices

Interprocess Communication (IPC)

Comments