The C standard describes the language syntax, the functions provided by the standard library, and the behavior of conforming C processors (roughly speaking, compilers) and conforming C programs. With respect to behavior, the standard for the most part specifies particular behaviors for programs and processors. On the other hand, some operations have explicit or implicit undefined behavior; such operations are always to be avoided, as you cannot rely on anything about them. In between, there is a variety of implementation-defined behaviors. These behaviors may vary between C processors, runtimes, and standard libraries (collectively, implementations), but they are consistent and reliable for any given implementation, and conforming implementations document their behavior in each of these areas.
It is sometimes reasonable for a program to rely on implementation-defined behavior. For example, if the program is in any case specific to a particular operating environment, then relying on implementation-defined behaviors shared by the common processors for that environment is unlikely to be a problem. Alternatively, one can use conditional compilation directives to select implementation-defined behaviors appropriate for the implementation in use. In any case, it is essential to know which operations have implementation-defined behavior, so as to either avoid them or to make an informed decision about whether and how to use them.
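As a minimal sketch of the conditional-compilation approach, the following uses CHAR_BIT and CHAR_MIN from <limits.h> to reject implementations the code does not support and to choose between two code paths. The macro PLAIN_CHAR_IS_SIGNED and the function char_as_signed() are hypothetical names introduced only for this illustration.

```c
#include <limits.h>

/* This sketch assumes 8-bit bytes; refuse to compile anywhere else. */
#if CHAR_BIT != 8
#error "this code assumes CHAR_BIT == 8"
#endif

/* Hypothetical macro: records whether plain char behaves as signed char. */
#if CHAR_MIN < 0
#define PLAIN_CHAR_IS_SIGNED 1
#else
#define PLAIN_CHAR_IS_SIGNED 0
#endif

/* Hypothetical helper: returns the value of c as if plain char were signed,
 * whichever way the implementation actually treats plain char. */
static int char_as_signed(char c)
{
#if PLAIN_CHAR_IS_SIGNED
    return c;                               /* already in SCHAR_MIN..SCHAR_MAX */
#else
    /* Plain char is unsigned: fold the upper half onto negative values using
     * ordinary arithmetic, avoiding any out-of-range conversion. */
    return (c > SCHAR_MAX) ? (int)c - (UCHAR_MAX + 1) : (int)c;
#endif
}
```

The #error directive turns an unsupported assumption into a compile-time failure, while the #if branches keep both variants of the helper in one source file.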
The balance of these remarks constitutes a list of all the implementation-defined behaviors and characteristics specified in the C2011 standard, with references to the standard. Many of them use the terminology of the standard. Some others rely more generally on the context of the standard, such as the eight stages of translating source code into a program, or the difference between hosted and freestanding implementations. Some that may be particularly surprising or notable are presented in bold typeface. Not all the behaviors described are supported by earlier C standards, but generally speaking, they have implementation-defined behavior in all versions of the standard that support them.
The number of bits in one byte (3.6/3). At least 8; the actual value can be queried with the macro CHAR_BIT.
Which output messages are considered "diagnostic messages" (3.10/1).
The manner in which physical source file multibyte characters are mapped to the source character set (5.1.1.2/1).
Whether non-empty sequences of non-newline whitespace are replaced by single spaces during translation phase 3 (5.1.1.2/1)
The execution-set character(s) to which character literals and characters in string constants are converted (during translation phase 5) when there is otherwise no corresponding character (5.1.1.2/1).
The manner in which the diagnostic messages to be emitted are identified (5.1.1.3/1).
The name and type of the function called at startup in a freestanding implementation (5.1.2.1/1).
Which library facilities are available in a freestanding implementation, beyond a specified minimal set (5.1.2.1/1).
The effect of program termination in a freestanding environment (5.1.2.1/2).
In a hosted environment, any allowed signatures for the main() function other than int main(int argc, char *argv[]) and int main(void) (5.1.2.2.1/1).
The manner in which a hosted implementation defines the strings pointed to by the second argument to main() (5.1.2.2.1/2).
What constitutes an "interactive device" for the purpose of sections 5.1.2.3 (Program Execution) and 7.21.3 (Files) (5.1.2.3/7).
Any restrictions on objects referred to by interrupt-handler routines in an optimizing implementation (5.1.2.3/10).
In a freestanding implementation, whether multiple threads of execution are supported (5.1.2.4/1).
The values of the members of the execution character set (5.2.1/1).
The char values corresponding to the defined alphabetic escape sequences (5.2.2/3).
The integer and floating-point numeric limits and characteristics (5.2.4.2/1).
The accuracy of floating-point arithmetic operations and of the standard library's conversions from internal floating point representations to string representations (5.2.4.2.2/6).
The value of macro FLT_ROUNDS, which encodes the default floating-point rounding mode (5.2.4.2.2/8).
The rounding behaviors characterized by supported values of FLT_ROUNDS greater than 3 or less than -1 (5.2.4.2.2/8).
The value of macro FLT_EVAL_METHOD, which characterizes floating-point evaluation behavior (5.2.4.2.2/9).
The behavior characterized by any supported values of FLT_EVAL_METHOD less than -1 (5.2.4.2.2/9).
The values of macros FLT_HAS_SUBNORM, DBL_HAS_SUBNORM, and LDBL_HAS_SUBNORM, characterizing whether the standard floating-point formats support subnormal numbers (5.2.4.2.2/10).
The result of attempting to (indirectly) access an object with thread storage duration from a thread other than the one with which the object is associated (6.2.4/4).
The value of a char to which a character outside the basic execution set has been assigned (6.2.5/3).
The supported extended signed integer types, if any, (6.2.5/4), and any extension keywords used to identify them.
Whether char has the same representation and behavior as signed char or as unsigned char (6.2.5/15). This can be queried with CHAR_MIN, which is 0 if char is unsigned and SCHAR_MIN if char is signed.
The number, order, and encoding of bytes in the representations of objects, except where explicitly specified by the standard (6.2.6.1/2).
Which of the three recognized forms of integer representation applies in any given situation, and whether certain bit patterns of integer objects are trap representations (6.2.6.2/2).
The alignment requirement of each type (6.2.8/1).
Whether and in what contexts any extended alignments are supported (6.2.8/3).
The set of supported extended alignments (6.2.8/4).
The integer conversion ranks of any extended signed integer types relative to each other (6.3.1.1/1).
The effect of assigning an out-of-range value to a signed integer (6.3.1.3/3).
When an in-range but unrepresentable value is assigned to a floating-point object, how the representable value stored in the object is chosen from between the two nearest representable values (6.3.1.4/2; 6.3.1.5/1; 6.4.4.2/3).
The result of converting an integer to a pointer type, except for integer constant expressions with value 0 (6.3.2.3/5).
The locations within #pragma directives where header name tokens are recognized (6.4/4).
The characters, including multibyte characters, other than underscore, unaccented Latin letters, universal character names, and decimal digits that may appear in identifiers (6.4.2.1/1).
The number of significant characters in an identifier (6.4.2.1/5).
With some exceptions, the manner in which the source characters in an integer character constant are mapped to execution-set characters (6.4.4.4/2; 6.4.4.4/10).
The current locale used for computing the value of a wide character constant, and most other aspects of the conversion for many such constants (6.4.4.4/11).
Whether differently-prefixed wide string literal tokens can be concatenated and, if so, the treatment of the resulting multibyte character sequence (6.4.5/5)
The locale used during translation phase 7 to convert wide string literals to multibyte character sequences, and their value when the result is not representable in the execution character set (6.4.5/6).
The manner in which header names are mapped to file names (6.4.7/2).
Whether and how floating-point expressions are contracted when FP_CONTRACT is not used (6.5/8).
The values of the results of the sizeof and _Alignof operators (6.5.3.4/5).
The size of the result type of pointer subtraction (6.5.6/9).
The result of right-shifting a signed integer with a negative value (6.5.7/5).
The extent to which the register keyword is effective (6.7.1/6).
Whether the type of a bitfield declared as int is the same type as unsigned int or as signed int (6.7.2/5).
What types bitfields may take, other than optionally-qualified _Bool, signed int, and unsigned int; whether bitfields may have atomic types (6.7.2.1/5).
Aspects of how implementations lay out the storage for bitfields (6.7.2.1/11).
The alignment of non-bitfield members of structures and unions (6.7.2.1/14).
The underlying type for each enumerated type (6.7.2.2/4).
What constitutes an "access" to an object of volatile-qualified type (6.7.3/7).
The effectiveness of inline function declarations (6.7.4/6).
Whether character constants are converted to integer values the same way in preprocessor conditionals as in ordinary expressions, and whether a single-character constant may have a negative value (6.10.1/4).
The locations searched for files designated in an #include directive (6.10.2/2-3).
The manner in which a header name is formed from the tokens of a multi-token #include directive (6.10.2/4).
The limit for #include nesting (6.10.2/6).
Whether a \ character is inserted before the \ introducing a universal character name in the result of the preprocessor's # operator (6.10.3.2/2).
The behavior of the #pragma preprocessing directive for pragmas other than STDC (6.10.6/1).
The value of the __DATE__ and __TIME__ macros if no translation date or time, respectively, is available (6.10.8.1/1).
The internal character encoding used for wchar_t if macro __STDC_ISO_10646__ is not defined (6.10.8.2/1).
The internal character encoding used for char32_t if macro __STDC_UTF_32__ is not defined (6.10.8.2/1).
Any additional floating-point exceptions beyond those defined by the standard (7.6/6).
Any additional floating-point rounding modes beyond those defined by the standard (7.6/8).
Any additional floating-point environments beyond those defined by the standard (7.6/10).
The default value of the floating-point environment access switch (7.6.1/2).
The representation of the floating-point status flags recorded by fegetexceptflag() (7.6.2.2/1).
Whether the feraiseexcept() function additionally raises the "inexact" floating-point exception whenever it raises the "overflow" or "underflow" floating-point exception (7.6.2.3/2).
The locales other than "C" supported by setlocale() (7.11.1.1/3).
The types represented by float_t and double_t when the FLT_EVAL_METHOD macro has a value different from 0, 1, and 2 (7.12/2).
Any supported floating-point classifications beyond those defined by the standard (7.12/6).
The value returned by the math.h functions in the event of a domain error (7.12.1/2).
The value returned by the math.h functions in the event of a pole error (7.12.1/3).
The value returned by the math.h functions when the result underflows, and aspects of whether errno is set to ERANGE and whether a floating-point exception is raised under those circumstances (7.12.1/6).
The default value of the FP-contraction switch (7.12.2/2).
Whether the fmod() functions return 0 or raise a domain error when their second argument is 0 (7.12.10.1/3).
Whether the remainder() functions return 0 or raise a domain error when their second argument is 0 (7.12.10.2/3).
The number of significant bits in the quotient moduli computed by the remquo() functions (7.12.10.3/2).
Whether the remquo() functions return 0 or raise a domain error when their second argument is 0 (7.12.10.3/3).
The complete set of supported signals, their semantics, and their default handling (7.14/4).
When a signal is raised and there is a custom handler associated with that signal, which signals, if any, are blocked for the duration of the execution of the handler (7.14.1.1/3).
Which signals other than SIGFPE, SIGILL, and SIGSEGV cause the behavior upon returning from a custom signal handler to be undefined (7.14.1.1/3).
Which signals are initially configured to be ignored (regardless of their default handling; 7.14.1.1/6).
The null pointer constant to which macro NULL expands (7.19/3).
Whether the last line of a text stream requires a terminating newline (7.21.2/2).
The number of null characters automatically appended to a binary stream (7.21.2/3).
The initial position of a file opened in append mode (7.21.3/1).
Whether a write on a text stream causes the stream to be truncated (7.21.3/2).
Support for stream buffering (7.21.3/3).
Whether zero-length files actually exist (7.21.3/4).
The rules for composing valid file names (7.21.3/8).
Whether the same file can simultaneously be open multiple times (7.21.3/8).
The nature and choice of encoding for multibyte characters (7.21.3/10).
The behavior of the remove() function when the target file is open (7.21.4.1/2).
The behavior of the rename() function when the target file already exists (7.21.4.2/2).
Whether files created via the tmpfile() function are removed in the event that the program terminates abnormally (7.21.4.3/2).
Which mode changes under which circumstances are permitted via freopen() (7.21.5.4/3).
Which of the permitted representations of infinite and not-a-number FP values are produced by the printf()-family functions (7.21.6.1/8).
The manner in which pointers are formatted by the printf()-family functions (7.21.6.1/8).
The behavior of scanf()-family functions when the - character appears in an internal position of the scanlist of a [ field (7.21.6.2/12).
Most aspects of the scanf()-family functions' handling of p fields (7.21.6.2/12).
The errno value set by fgetpos() on failure (7.21.9.1/2).
The errno value set by fsetpos() on failure (7.21.9.3/2).
The errno value set by ftell() on failure (7.21.9.4/3).
The meaning to the strtod()-family functions of some supported aspects of NaN formatting (7.22.1.3/4).
Whether the strtod()-family functions set errno to ERANGE when the result underflows (7.22.1.3/10).
What cleanups, if any, are performed and what status is returned to the host OS when the abort() function is called (7.22.4.1/2).
What status is returned to the host environment when exit() is called (7.22.4.4/5).
The handling of open streams and what status is returned to the host environment when _Exit() is called (7.22.4.5/2).
The set of environment names accessible via getenv() and the method for altering the environment (7.22.4.6/2).
The return value of the system() function (7.22.4.8/3).
The local time zone and Daylight Saving Time (7.27.1/1).
The range and precision of times representable via types clock_t and time_t (7.27.1/4).
The beginning of the era that serves as the reference for the times returned by the clock() function (7.27.2.1/3).
The beginning of the epoch that serves as the reference for the times returned by the timespec_get() function (when the time base is TIME_UTC; 7.27.2.5/3).
The strftime() replacement for the %Z conversion specifier in the "C" locale (7.27.3.5/7).
Which of the permitted representations of infinite and not-a-number FP values are produced by the wprintf()-family functions (7.29.2.1/8).
The manner in which pointers are formatted by the wprintf()-family functions (7.29.2.1/8).
The behavior of wscanf()-family functions when the - character appears in an internal position of the scanlist of a [ field (7.29.2.2/12).
Most aspects of the wscanf()-family functions' handling of p fields (7.29.2.2/12).
The meaning to the wcstod()-family functions of some supported aspects of NaN formatting (7.29.4.1.1/4).
Whether the wcstod()-family functions set errno to ERANGE when the result underflows (7.29.4.1.1/10).
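Several of the characteristics listed above can be inspected directly from a program. The following is a rough sketch rather than an exhaustive probe; report_implementation() is a hypothetical helper introduced for illustration, and it uses only standard macros and operators from <limits.h>, <float.h>, and <stddef.h>.

```c
#include <float.h>
#include <limits.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: prints a few implementation-defined characteristics
 * to 'out' and returns the number of lines written successfully. */
static int report_implementation(FILE *out)
{
    int lines = 0;

    lines += (fprintf(out, "bits per byte (CHAR_BIT): %d\n", CHAR_BIT) > 0);
    lines += (fprintf(out, "plain char is %s\n",
                      CHAR_MIN < 0 ? "signed" : "unsigned") > 0);
    lines += (fprintf(out, "FLT_ROUNDS: %d\n", (int)FLT_ROUNDS) > 0);
    lines += (fprintf(out, "FLT_EVAL_METHOD: %d\n", (int)FLT_EVAL_METHOD) > 0);
    lines += (fprintf(out, "sizeof(int): %zu\n", sizeof(int)) > 0);
    lines += (fprintf(out, "sizeof(ptrdiff_t): %zu\n", sizeof(ptrdiff_t)) > 0);
    lines += (fprintf(out, "_Alignof(long double): %zu\n",
                      _Alignof(long double)) > 0);
    /* Right-shifting a negative value is itself implementation-defined. */
    lines += (fprintf(out, "-1 >> 1: %d\n", -1 >> 1) > 0);

    return lines;
}
```

Calling report_implementation(stdout) on two different implementations is a quick way to see where they diverge; the values printed are whatever the implementation in use documents.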
int signed_integer = -1;
// The right shift operation exhibits implementation-defined behavior:
int result = signed_integer >> 1;
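When sign-propagating behavior is needed regardless of the implementation, one option is to avoid shifting negative values at all. The following is a minimal sketch under the assumption that the shift count is smaller than the width of int; the helper name asr() is hypothetical.

```c
/* Hypothetical helper: right shift that rounds toward negative infinity
 * (i.e. behaves like an arithmetic shift) on every implementation.
 * Assumes count is less than the width of int. */
static int asr(int value, unsigned count)
{
    if (value >= 0)
        return value >> count;      /* fully specified for nonnegative values */

    /* -1 - value is nonnegative whenever value is negative, so shifting it
     * is well defined; mapping back gives floor(value / 2^count). */
    return -1 - ((-1 - value) >> count);
}
```

For example, asr(-8, 1) yields -4 and asr(-7, 1) also yields -4, matching what an arithmetic shift produces on implementations that propagate the sign bit.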
// Supposing SCHAR_MAX, the maximum value that can be represented by a signed char, is
// 127, the behavior of this assignment is implementation-defined:
signed char integer;
integer = 128;
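One way to sidestep the implementation-defined conversion is to check the range before assigning. This is only a sketch; to_schar() is a hypothetical helper, and the choice of long as the source type is an assumption made for the example.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: stores value in *out only if it is representable as
 * a signed char. Returns true on success and leaves *out untouched on failure. */
static bool to_schar(long value, signed char *out)
{
    if (value < SCHAR_MIN || value > SCHAR_MAX)
        return false;               /* would be an out-of-range conversion */

    *out = (signed char)value;      /* in range, so the value is preserved */
    return true;
}
```

The caller then decides what an out-of-range input means for the program instead of inheriting whatever the implementation happens to do.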
// The allocation functions have implementation-defined behavior when the requested size
// of the allocation is zero.
void *p = malloc(0);
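For a zero-size request the standard allows malloc() either to return a null pointer or to behave as if the size were some nonzero value, with the caveat that the returned pointer must not be used to access an object. Code that wants a null return to mean only "allocation failed" can avoid zero-size requests. A minimal sketch, using the hypothetical wrapper name xmalloc_nonzero():

```c
#include <stdlib.h>

/* Hypothetical wrapper: never passes a size of zero to malloc(), so a null
 * return value always indicates allocation failure. The result is released
 * with free() as usual. */
static void *xmalloc_nonzero(size_t size)
{
    return malloc(size ? size : 1);
}
```

The cost is at most one wasted byte per empty allocation, in exchange for a return value that has a single interpretation.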
Each signed integer type may be represented in any one of three formats; it is implementation-defined which one is used. The representation in use for any given signed integer type at least as wide as int can be determined from the two lowest-order bits of the representation of value -1 in that type, like so:
enum { sign_magnitude = 1, ones_compl = 2, twos_compl = 3, };
#define SIGN_REP(T) ((T)-1 & (T)3)
switch (SIGN_REP(long)) {
case sign_magnitude: { /* do something */ break; }
case ones_compl: { /* do otherwise */ break; }
case twos_compl: { /* do yet else */ break; }
case 0: { _Static_assert(SIGN_REP(long), "bogus sign representation"); }
}
The same pattern applies to the representation of narrower types, but they cannot be tested by this technique because the operands of & are subject to "the usual arithmetic conversions" before the result is computed.
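For a narrower type, one possible workaround is to inspect the object representation rather than the value, since copying bytes involves no arithmetic conversion. The sketch below does this for signed char, which occupies a single byte and has no padding bits; schar_sign_rep() is a hypothetical helper, and the enumeration constants are repeated from the example above so that the block stands alone.

```c
#include <string.h>

enum { sign_magnitude = 1, ones_compl = 2, twos_compl = 3 };

/* Hypothetical helper: returns the two lowest-order bits of the object
 * representation of (signed char)-1, using the same encoding as above. */
static int schar_sign_rep(void)
{
    signed char value = -1;
    unsigned char bits;

    /* Copy the single byte of the object representation. */
    memcpy(&bits, &value, 1);
    return bits & 3;
}
```

Multi-byte types such as short would additionally require knowing which byte holds the lowest-order bits, which is itself implementation-defined, so this sketch does not attempt them.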