  • Use -Q -Ox --help=optimizers to determine which optimization flags are enabled for a given -Ox setting (or read the info pages).
  • Dataflow analysis is only performed when optimizing, so some level of optimization (-O1 or above) is necessary for thorough use-of-uninitialized-variable warnings and other warnings that depend on it.

Extensions to C

Extensions to the C language are documented at the online gcc docs, and in the Info pages distributed with gcc.

__builtin_ functions

  • __builtin_expect(expr,expectedp) - Tell gcc that expr is expected to evaluate to expectedp (1 for likely true, 0 for unlikely), affecting the layout and generation of conditional code. Absent a hint, gcc predicts branches using its own heuristics (or profile data, if supplied).
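
A minimal sketch of how __builtin_expect is commonly wrapped. The macro names likely/unlikely and the function sum_nonnegative are illustrative assumptions, not part of gcc; the !!(x) idiom normalizes any truthy value to 1 so it can be compared against the expected value.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative wrapper macros (hypothetical names, not provided by gcc). */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Sum an array, treating a negative entry as a rare error case.
 * Returns -1 on the first negative entry, otherwise the sum. */
long sum_nonnegative(const int *values, size_t n)
{
    assert(values != NULL || n == 0);
    long total = 0;
    for (size_t i = 0; i < n; ++i) {
        if (unlikely(values[i] < 0)) {
            return -1;          /* hinted to the compiler as the cold path */
        }
        total += values[i];
    }
    return total;
}
```

The hint changes only code layout and branch prediction assumptions; the function returns the same results with or without it.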


Attributes are introduced by the keyword __attribute__, followed by double parentheses enclosing the attribute name and, for some attributes, a parenthesized argument list, e.g. __attribute__((aligned(16))). All are non-standard extensions.

Function Attributes

  • See the gcc documentation at http://gcc.gnu.org/onlinedocs/gcc/Function-Attributes.html
  • aligned - takes a single parameter, the minimum number of bytes at which to align the function. -falign-functions will override this, if larger.
  • malloc - indicates that any value returned does not alias any other currently-valid pointers
  • alloc_size - takes one or two parameters, each the position of one of the function's arguments, and indicates that the function will return a pointer to an allocated chunk of memory having either the size given by the single argument, or the product of the two arguments. This is necessary for __builtin_object_size's correct functioning.
  • pure (2.96) - indicates that the function has no side-effects save its return value, which is based only on calls to other pure functions, the function's own parameters and/or non-volatile global memory.
  • const (2.5) - stronger than pure, a pure function which does not dereference any pointer parameters, use global memory or call non-const functions.
  • warn_unused_result - warn if the return value is not used, for instance in a wrapper to open(2) or malloc(3)
  • cold (4.3) - indicates the code does not lie on any hot paths, resulting in optimization for size, placement in a separate part of the .text section, and automatic application of __builtin_expect((x),0) to conditionals on a calling path. Disabled by -fprofile-use.
  • hot (4.3) - opposite of cold.
  • nothrow (3.3) - marks the function as never throwing an exception, for optimization purposes.
  • noreturn (2.5) - marks the function as never normally returning (longjmp(3) and exceptions may still be used).
  • unused - a function is (possibly) unused. Calls may still be made to it, but -Wunused-function warnings will not be generated.
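
The following sketch shows several of these attributes applied together. All function names (xcalloc, die, square, count_zeros, debug_hook) are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical allocation wrapper: the result aliases nothing (malloc),
 * its size is the product of arguments 1 and 2 (alloc_size), and
 * discarding the returned pointer draws a warning (warn_unused_result). */
__attribute__((malloc, alloc_size(1, 2), warn_unused_result))
void *xcalloc(size_t count, size_t size);

/* Fatal error path: never returns, and is not expected to be hot. */
__attribute__((noreturn, cold))
void die(const char *msg);

/* const: the result depends only on the argument value. */
__attribute__((const))
int square(int x) { return x * x; }

/* pure: reads memory through its pointer parameter, no side effects. */
__attribute__((pure))
size_t count_zeros(const int *v, size_t n)
{
    size_t zeros = 0;
    for (size_t i = 0; i < n; ++i) {
        zeros += (v[i] == 0);
    }
    return zeros;
}

/* unused: suppresses -Wunused-function for this static helper. */
__attribute__((unused))
static void debug_hook(void) {}

void die(const char *msg)
{
    assert(msg != NULL);
    fprintf(stderr, "%s\n", msg);
    exit(EXIT_FAILURE);
}

void *xcalloc(size_t count, size_t size)
{
    void *p = calloc(count, size);
    /* calloc may legitimately return NULL for a zero-sized request. */
    if (p == NULL && count != 0 && size != 0) {
        die("out of memory");
    }
    return p;
}
```

Because die is marked noreturn, gcc knows control never continues past the call in xcalloc, so the function needs no second return on that branch.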

Inline Assembly

  • There's no need to use inline assembly for SIMD; use Target-Specific Builtins and Vector Extensions, or autovectorization if applicable.
  • Functions only referenced by inline assembly might not have code generated for them; use of the used function attribute will force generation.
  • The GNU assembler (gas) is used for assembly and syntax.
  • Statements can be arbitrarily reordered (or removed, if their outputs are unused) by default, or anchored with the volatile qualifier.
  • outputs, inputs and clobbers are expressed in a colon-delimited list of comma-delimited lists of the form [asmsymbol] "constraints" (c symbol)
    • operand constraints are properties of the assembly code, not the values
    • without "=" or "+" constraint modifier, operands are assumed to be read-only
    • the compiler verifies that all outputs are lvalues; types of operands are not checked
    • the "cc" logical register ought to be listed as a clobber if the condition code register is changed
    • "memory" must be listed as clobbered if memory is touched in an unpredictable fashion
  • constraint modifiers:
    • "=": operand is write-only (previous value needn't be preserved until write)
    • "+": operand is read-write (can't be arbitrarily used)
    • "&": operand is clobbered early (prior to use of all inputs), and thus can't be placed atop an input operand's register
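
A minimal sketch of the operand syntax and constraint modifiers described above, assuming an x86 target and AT&T (gas) syntax; other targets fall back to plain C. The function names add_u32 and double_plus_u32 are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Returns a + b, using a read-write ("+") register operand. */
uint32_t add_u32(uint32_t a, uint32_t b)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __asm__ ("addl %[src], %[dst]"   /* AT&T order: dst += src */
             : [dst] "+r" (a)        /* read-write output */
             : [src] "r" (b)         /* read-only input */
             : "cc");                /* addl modifies the flags */
    return a;
#else
    return a + b;                    /* portable fallback */
#endif
}

/* Returns 2*x + y. The output is written before y is read, so it is
 * marked early-clobber ("&") to keep it out of the inputs' registers. */
uint32_t double_plus_u32(uint32_t x, uint32_t y)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    uint32_t out;
    __asm__ ("movl %[x], %[out]\n\t"
             "addl %[x], %[out]\n\t"
             "addl %[y], %[out]"
             : [out] "=&r" (out)     /* write-only, clobbered early */
             : [x] "r" (x), [y] "r" (y)
             : "cc");
    return out;
#else
    return 2u * x + y;               /* portable fallback */
#endif
}

/* Small usage check; unsigned arithmetic wraps modulo 2^32. */
void asm_self_check(void)
{
    assert(add_u32(2u, 3u) == 5u);
    assert(double_plus_u32(10u, 1u) == 21u);
}
```

Neither statement is marked volatile, since each has no effect beyond its declared outputs and gcc may safely move or drop it.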

Intermediate Representations

  • RTL: The Register Transfer Language, GCC's older IR (still used for late optimization passes)
  • GENERIC: A loose IR to which frontends must now compile
  • GIMPLE: A restricted subset of GENERIC, on which most optimizations are performed
  • Graphite: GIMPLE as Polyhedra, an optimization framework making use of polyhedral methods (especially for autovectorization)

See also