
From dankwiki

→See Also: fix syntax


(15 intermediate revisions by the same user not shown)

Line 3: | Line 3: | ||

* either b immediately follows a, and a does not end in an unconditional branch,

* or a ends in a branch, of which b is a potential target.

Note that an indirect branch, without context information, trivializes all blocks (every instruction becomes a leader) and flows to them all from at least that point (an arborescence is induced)!

The directed multigraph defined by interpreting basic blocks as vertices, and flow relationships as edges, yields the program's control flow graph (CFG). A start node exists for each CFG, corresponding to the basic block whose leader is the first instruction of the program.
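As a rough illustration of the flow rules above, the following sketch finds leaders and builds a CFG for a made-up instruction format. The tuple encoding <tt>(op, target)</tt>, the opcode names <tt>op</tt>, <tt>jmp</tt>, <tt>br</tt> and <tt>ret</tt>, and the function names are all assumptions for this example; indirect branches are not modeled.

```python
# Hypothetical mini-IR: each instruction is (op, target).
#   "op"  : ordinary instruction, falls through
#   "jmp" : unconditional branch to index `target`
#   "br"  : conditional branch to index `target`, else falls through
#   "ret" : leaves the program, no successors

def find_leaders(code):
    """Return the sorted indices of instructions that start a basic block."""
    leaders = {0}  # the first instruction of the program is always a leader
    for i, (op, target) in enumerate(code):
        if op in ("jmp", "br"):
            leaders.add(target)        # a branch target starts a block
        if op in ("jmp", "br", "ret") and i + 1 < len(code):
            leaders.add(i + 1)         # so does whatever follows a branch
    return sorted(leaders)

def build_cfg(code):
    """Map each block (named by its leader index) to its successor blocks."""
    leaders = find_leaders(code)
    cfg = {}
    for n, start in enumerate(leaders):
        end = leaders[n + 1] if n + 1 < len(leaders) else len(code)
        op, target = code[end - 1]     # last instruction of this block
        succs = []
        if op in ("jmp", "br"):
            succs.append(target)       # b is a potential branch target of a
        if op not in ("jmp", "ret") and end < len(code):
            succs.append(end)          # b immediately follows a
        cfg[start] = succs
    return cfg

# A counted loop followed by a return.
code = [
    ("op", None),   # 0: i = 0
    ("op", None),   # 1: loop body
    ("br", 1),      # 2: if i < n goto 1
    ("ret", None),  # 3: return
]
cfg = build_cfg(code)
```

Successors are kept in a list rather than a set, so a conditional branch whose target is also its fall-through contributes two parallel edges, in keeping with the multigraph definition.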

Line 13: | Line 13: | ||

Dataflow analysis is most usefully performed into and out of ''regions'', subsets of the nodes such that a ''header'' exists which dominates all nodes in the region, and all edges between nodes in the region are themselves in the region. A loop is a strongly connected region: every node of the region can reach every other node along edges of the region, so every back-edge to the header originates within the region.

Loops can be discovered via domination analysis. Note that this refers to loops in the generated code, not loop constructs of the source language, and that all possible loops will be found (including unstructured loops constructed from C <tt>gotos</tt>). Discover all strongly-connected subgraphs (SCCs) of the CFG (subgraphs where, for each vertex, a path ('''not''' necessarily an edge) exists from that vertex to all other nodes of the subgraph); if a subgraph contains a node dominating all that subgraph's nodes, the subgraph is a loop. The trivial case is, of course, a statement which jumps to itself, à la the BASIC program <tt>10 GOTO 10</tt>. Implementation via [[Kosaraju's Algorithm]] is simple, with O(|V|+|E|) time complexity using adjacency lists and O(|V|<sup>2</sup>) time complexity using adjacency matrices:

* Perform a recursive depth-first traversal of the graph starting from s. Each time you return, add that node onto an auxiliary vector. Upon the traversal's completion, this vector lists the nodes in order of completion (postorder); were the graph acyclic, its reverse would be a topological sort.

* Until the vector is empty, use the last node of the vector to begin traversing the transpose graph, skipping nodes already removed. Remove every node this traversal reaches from the vector; each such set of nodes is one SCC, and together they partition the graph.

Kosaraju's algorithm is improved upon by [[Tarjan's Algorithm]] and [[Gabow's Algorithm]], each of which needs only a single depth-first traversal and no transpose graph.
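The two Kosaraju steps described above can be sketched as follows. This is a minimal version using adjacency lists, assuming the graph is a dictionary in which every node, including those without successors, appears as a key; the function and variable names are chosen for this example only.

```python
def kosaraju_scc(graph):
    """Return the strongly connected components of `graph` as a list of sets.

    `graph` maps every node to a list of its successors.
    """
    visited, order = set(), []

    def visit(node):
        visited.add(node)
        for succ in graph[node]:
            if succ not in visited:
                visit(succ)
        order.append(node)           # appended on return: postorder

    for node in graph:               # also covers nodes unreachable from the first
        if node not in visited:
            visit(node)

    # Transpose graph: every edge reversed.
    transpose = {node: [] for node in graph}
    for node, succs in graph.items():
        for succ in succs:
            transpose[succ].append(node)

    assigned, sccs = set(), []
    while order:
        root = order.pop()           # last node of the vector
        if root in assigned:
            continue                 # already removed as part of an earlier SCC
        component, stack = set(), [root]
        assigned.add(root)
        while stack:                 # iterative traversal of the transpose
            node = stack.pop()
            component.add(node)
            for pred in transpose[node]:
                if pred not in assigned:
                    assigned.add(pred)
                    stack.append(pred)
        sccs.append(component)
    return sccs

# Two cycles joined by a one-way edge c -> d.
graph = {"a": ["b"], "b": ["c"], "c": ["a", "d"], "d": ["e"], "e": ["d"]}
sccs = kosaraju_scc(graph)
```

The first pass is recursive, as in the description, so very deep graphs may exceed Python's default recursion limit; an explicit stack would avoid that.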

''Natural loop'' identification proceeds via identification of ''back edges'' (edges from a node b to a node a, where a dominates b). A loop is associated with every such back edge; if a back edge exists from b to a, the associated loop is entered at a, and consists additionally of all nodes which can reach b without going through a. Similarly, a loop is associated with the target of every back edge, this being the union of the natural loops of all back edges sharing that target.
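A sketch of natural loop identification under simple assumptions: the CFG is a dictionary from each node to its list of successors, every node is reachable from the start node, and dominators are computed with the straightforward iterative set-intersection method (one of several ways to do it). All names here are illustrative.

```python
def predecessors(cfg):
    """Invert the successor map of the CFG."""
    preds = {n: set() for n in cfg}
    for n, succs in cfg.items():
        for s in succs:
            preds[s].add(n)
    return preds

def dominators(cfg, start):
    """Map each node to the set of nodes dominating it (itself included)."""
    nodes, preds = set(cfg), predecessors(cfg)
    dom = {n: set(nodes) for n in nodes}   # start from "everything dominates"
    dom[start] = {start}
    changed = True
    while changed:                          # shrink the sets to a fixed point
        changed = False
        for n in nodes - {start}:
            incoming = [dom[p] for p in preds[n]]
            new = {n} | (set.intersection(*incoming) if incoming else set())
            if new != dom[n]:
                dom[n], changed = new, True
    return dom

def natural_loop(cfg, tail, head):
    """Nodes of the natural loop of the back edge tail -> head."""
    preds = predecessors(cfg)
    loop, stack = {head, tail}, [tail]
    while stack:
        node = stack.pop()
        if node == head:
            continue                        # never walk backwards past the header
        for p in preds[node]:
            if p not in loop:
                loop.add(p)
                stack.append(p)
    return loop

def natural_loops(cfg, start):
    """Map each loop header to the union of its back edges' natural loops."""
    dom, loops = dominators(cfg, start), {}
    for tail, succs in cfg.items():
        for head in succs:
            if head in dom[tail]:           # back edge: head dominates tail
                loops.setdefault(head, set()).update(natural_loop(cfg, tail, head))
    return loops

cfg = {
    "entry": ["head"],
    "head": ["body", "exit"],
    "body": ["latch"],
    "latch": ["head"],
    "exit": [],
}
loops = natural_loops(cfg, "entry")
```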

===Dead code elimination===

Control flow analysis by itself is sufficient to remove some ''unreachable code'' (cf. ''dead code'', below).
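For instance, any basic block that cannot be reached from the start node can be dropped. A minimal sketch, assuming the CFG is a dictionary from block name to successor list (the names are hypothetical):

```python
def reachable_blocks(cfg, start):
    """Return the set of blocks reachable from `start`."""
    seen, stack = {start}, [start]
    while stack:
        for succ in cfg[stack.pop()]:
            if succ not in seen:
                seen.add(succ)
                stack.append(succ)
    return seen

def remove_unreachable(cfg, start):
    """Return a copy of the CFG without blocks unreachable from `start`."""
    keep = reachable_blocks(cfg, start)
    # Successors of a reachable block are reachable, so edge lists stay valid.
    return {block: succs for block, succs in cfg.items() if block in keep}

cfg = {"entry": ["exit"], "orphan": ["exit"], "exit": []}
pruned = remove_unreachable(cfg, "entry")
```

This only catches blocks with no path from the start node; code that is reachable in the graph but guarded by a condition that is always false needs value information as well.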

==Dataflow/Dependency Analysis==

Line 28: | Line 31: | ||

** Def-Use chain -- a definition D of a variable, and all uses of that variable it can reach (those reached from D)

** Common subexpression elimination lexically matches subexpressions, and can preserve them in registers

** Liveness analysis can detect some instances of ''dead code'' (code whose result is never used, cf. ''unreachable code'' above)
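To make the liveness item concrete, here is a sketch of a backward scan over a single basic block. It assumes a made-up instruction form <tt>(dest, uses)</tt>, that instructions have no side effects, and that the set of variables live on exit from the block is already known; a full analysis would iterate this over the whole CFG.

```python
def dead_assignments(block, live_out):
    """Return the indices of instructions whose result is never used.

    `block` is a list of (dest, uses) pairs; `live_out` is the set of
    variables that are live after the last instruction.
    """
    live = set(live_out)
    dead = []
    for i in range(len(block) - 1, -1, -1):   # walk the block backwards
        dest, uses = block[i]
        if dest not in live:
            dead.append(i)      # result never read; its operands do not count
            continue
        live.discard(dest)      # this definition ends the live range above it
        live.update(uses)       # operands are live before this instruction
    return sorted(dead)

block = [
    ("a", ["x"]),   # 0: a = f(x)   overwritten at 2 before any live use
    ("b", ["a"]),   # 1: b = g(a)   b is never read
    ("a", ["y"]),   # 2: a = h(y)
    ("c", ["a"]),   # 3: c = k(a)
]
dead = dead_assignments(block, live_out={"c"})
```

Because the operands of a dead instruction are not added to the live set, instruction 0 is found dead in the same pass even though instruction 1 nominally reads <tt>a</tt>.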

===Intermediate Representations===

Line 37: | Line 41: | ||

* It's much easier to perform global value numbering optimizations in this representation (see Muchnick, pp. 378-396)

** Global value numbering is much more complete than basic common subexpression elimination

** Partial redundancy elimination can be unified with global value numbering

** Lazy code motion is described in Muchnick, pp. 407-415

===Loops===

Line 77: | Line 83: | ||

==See Also==

* [[gcc]]

* [http://scienceblogs.com/goodmath/2007/10/computing_strongly_connected_c.php Computing Strongly Connected Subgraphs] from [http://scienceblogs.com/goodmath Good Math, Bad Math]

* The [http://llvm.org/ LLVM] Compiler Infrastructure Project at [http://www.cs.uiuc.edu/ UIUC]

* GCC's [http://gcc.gnu.org/projects/tree-ssa/ SSA for Trees] project page

* "[http://www.cs.lth.se/home/Jonas_Skeppstedt/kongstad.pdf An Implementation of Global Value Numbering in the GNU Compiler Collection, with Performance Measurements]" (Kongstad 2004)

* "[http://portal.acm.org/citation.cfm?id=255129.255158 On the perfect accuracy of an approximate subscript analysis test]" (Klappholz, Psarris, Kong, 1990) analyzes the GCD and [[Banerjee Inequality|Banerjee inequalities]], explaining the crappiness of the former and general robustness of the latter.

** "[http://portal.acm.org/citation.cfm?id=110518.110525&coll=&dl=ACM&CFID=15151515&CFTOKEN=6184618 On the Accuracy of the Banerjee Test]" (same authors, 1991) suggests improvements on the [[Banerjee Inequality|Banerjee test]].

* "[http://portal.acm.org/citation.cfm?id=143129&dl=GUIDE&coll=GUIDE&CFID=31575025&CFTOKEN=24090323 Eliminating False Data Dependencies using the Omega Test]" (Pugh, Wonnacott, 1992) moves from integer programming-based ([http://mathworld.wolfram.com/DiophantineEquation.html Diophantine]) solutions to a subclass of the [http://en.wikipedia.org/wiki/Presburger_arithmetic Presburger formulae].

* "[http://coyotegulch.com/products/acovea/ Acovea: Using Natural Selection to Investigate Software Complexities]" explores the gcc optimization flag space for a given chunk of code

* "[http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.36.5532 Combining Register Allocation and Instruction Scheduling]", a seminal 1995 paper by Motwani et al, explores the complexity of scheduling