gcc/doc/cfg.texi

   1 @c -*-texinfo-*-
   2 @c Copyright (C) 2001, 2003, 2004, 2005, 2006, 2007, 2008 Free Software
   3 @c Foundation, Inc.
   4 @c This is part of the GCC manual.
   5 @c For copying conditions, see the file gcc.texi.
   6
   7 @c ---------------------------------------------------------------------
   8 @c Control Flow Graph
   9 @c ---------------------------------------------------------------------
  10
  11 @node Control Flow
  12 @chapter Control Flow Graph
  13 @cindex CFG, Control Flow Graph
  14 @findex basic-block.h
  15
  16 A control flow graph (CFG) is a data structure built on top of the
  17 intermediate code representation (the RTL or @code{tree} instruction
  18 stream) abstracting the control flow behavior of a function that is
  19 being compiled.  The CFG is a directed graph where the vertices
  20 represent basic blocks and edges represent possible transfer of
  21 control flow from one basic block to another.  The data structures
  22 used to represent the control flow graph are defined in
  23 @file{basic-block.h}.
  24
  25 @menu
  26 * Basic Blocks::           The definition and representation of basic blocks.
  27 * Edges::                  Types of edges and their representation.
  28 * Profile information::    Representation of frequencies and probabilities.
  29 * Maintaining the CFG::    Keeping the control flow graph and up to date.
  30 * Liveness information::   Using and maintaining liveness information.
  31 @end menu
  32
  33
  34 @node Basic Blocks
  35 @section Basic Blocks
  36
  37 @cindex basic block
  38 @findex basic_block
  39 A basic block is a straight-line sequence of code with only one entry
  40 point and only one exit.  In GCC, basic blocks are represented using
  41 the @code{basic_block} data type.
  42
  43 @findex next_bb, prev_bb, FOR_EACH_BB
  44 Two pointer members of the @code{basic_block} structure are the
  45 pointers @code{next_bb} and @code{prev_bb}.  These are used to keep
  46 doubly linked chain of basic blocks in the same order as the
  47 underlying instruction stream.  The chain of basic blocks is updated
  48 transparently by the provided API for manipulating the CFG@.  The macro
  49 @code{FOR_EACH_BB} can be used to visit all the basic blocks in
  50 lexicographical order.  Dominator traversals are also possible using
  51 @code{walk_dominator_tree}.  Given two basic blocks A and B, block A
  52 dominates block B if A is @emph{always} executed before B@.
  53
  54 @findex BASIC_BLOCK
  55 The @code{BASIC_BLOCK} array contains all basic blocks in an
  56 unspecified order.  Each @code{basic_block} structure has a field
  57 that holds a unique integer identifier @code{index} that is the
  58 index of the block in the @code{BASIC_BLOCK} array.
  59 The total number of basic blocks in the function is
  60 @code{n_basic_blocks}.  Both the basic block indices and
  61 the total number of basic blocks may vary during the compilation
  62 process, as passes reorder, create, duplicate, and destroy basic
  63 blocks.  The index for any block should never be greater than
  64 @code{last_basic_block}.
  65
  66 @findex ENTRY_BLOCK_PTR, EXIT_BLOCK_PTR
  67 Special basic blocks represent possible entry and exit points of a
  68 function.  These blocks are called @code{ENTRY_BLOCK_PTR} and
  69 @code{EXIT_BLOCK_PTR}.  These blocks do not contain any code, and are
  70 not elements of the @code{BASIC_BLOCK} array.  Therefore they have
  71 been assigned unique, negative index numbers.
  72
  73 Each @code{basic_block} also contains pointers to the first
  74 instruction (the @dfn{head}) and the last instruction (the @dfn{tail})
  75 or @dfn{end} of the instruction stream contained in a basic block.  In
  76 fact, since the @code{basic_block} data type is used to represent
  77 blocks in both major intermediate representations of GCC (@code{tree}
  78 and RTL), there are pointers to the head and end of a basic block for
  79 both representations.
  80
  81 @findex NOTE_INSN_BASIC_BLOCK, CODE_LABEL, notes
  82 For RTL, these pointers are @code{rtx head, end}.  In the RTL function
  83 representation, the head pointer always points either to a
  84 @code{NOTE_INSN_BASIC_BLOCK} or to a @code{CODE_LABEL}, if present.
  85 In the RTL representation of a function, the instruction stream
  86 contains not only the ``real'' instructions, but also @dfn{notes}.
  87 Any function that moves or duplicates the basic blocks needs
  88 to take care of updating of these notes.  Many of these notes expect
  89 that the instruction stream consists of linear regions, making such
  90 updates difficult.   The @code{NOTE_INSN_BASIC_BLOCK} note is the only
  91 kind of note that may appear in the instruction stream contained in a
  92 basic block.  The instruction stream of a basic block always follows a
  93 @code{NOTE_INSN_BASIC_BLOCK},  but zero or more @code{CODE_LABEL}
  94 nodes can precede the block note.   A basic block ends by control flow
  95 instruction or last instruction before following @code{CODE_LABEL} or
  96 @code{NOTE_INSN_BASIC_BLOCK}.  A @code{CODE_LABEL} cannot appear in
  97 the instruction stream of a basic block.
  98
  99 @findex can_fallthru
 100 @cindex table jump
 101 In addition to notes, the jump table vectors are also represented as
 102 ``pseudo-instructions'' inside the insn stream.  These vectors never
 103 appear in the basic block and should always be placed just after the
 104 table jump instructions referencing them.  After removing the
 105 table-jump it is often difficult to eliminate the code computing the
 106 address and referencing the vector, so cleaning up these vectors is
 107 postponed until after liveness analysis.   Thus the jump table vectors
 108 may appear in the insn stream unreferenced and without any purpose.
 109 Before any edge is made @dfn{fall-thru}, the existence of such
 110 construct in the way needs to be checked by calling
 111 @code{can_fallthru} function.
 112
 113 @cindex block statement iterators
 114 For the @code{tree} representation, the head and end of the basic block
 115 are being pointed to by the @code{stmt_list} field, but this special
 116 @code{tree} should never be referenced directly.  Instead, at the tree
 117 level abstract containers and iterators are used to access statements
 118 and expressions in basic blocks.  These iterators are called
 119 @dfn{block statement iterators} (BSIs).  Grep for @code{^bsi}
 120 in the various @file{tree-*} files.
 121 The following snippet will pretty-print all the statements of the
 122 program in the GIMPLE representation.
 123
 124 @smallexample
 125 FOR_EACH_BB (bb)
 126   @{
 127      block_stmt_iterator si;
 128
 129      for (si = bsi_start (bb); !bsi_end_p (si); bsi_next (&si))
 130        @{
 131           tree stmt = bsi_stmt (si);
 132           print_generic_stmt (stderr, stmt, 0);
 133        @}
 134   @}
 135 @end smallexample
 136
 137
 138 @node Edges
 139 @section Edges
 140
 141 @cindex edge in the flow graph
 142 @findex edge
 143 Edges represent possible control flow transfers from the end of some
 144 basic block A to the head of another basic block B@.  We say that A is
 145 a predecessor of B, and B is a successor of A@.  Edges are represented
 146 in GCC with the @code{edge} data type.  Each @code{edge} acts as a
 147 link between two basic blocks: the @code{src} member of an edge
 148 points to the predecessor basic block of the @code{dest} basic block.
 149 The members @code{preds} and @code{succs} of the @code{basic_block} data
 150 type point to type-safe vectors of edges to the predecessors and
 151 successors of the block.
 152
 153 @cindex edge iterators
 154 When walking the edges in an edge vector, @dfn{edge iterators} should
 155 be used.  Edge iterators are constructed using the
 156 @code{edge_iterator} data structure and several methods are available
 157 to operate on them:
 158
 159 @ftable @code
 160 @item ei_start
 161 This function initializes an @code{edge_iterator} that points to the
 162 first edge in a vector of edges.
 163
 164 @item ei_last
 165 This function initializes an @code{edge_iterator} that points to the
 166 last edge in a vector of edges.
 167
 168 @item ei_end_p
 169 This predicate is @code{true} if an @code{edge_iterator} represents
 170 the last edge in an edge vector.
 171
 172 @item ei_one_before_end_p
 173 This predicate is @code{true} if an @code{edge_iterator} represents
 174 the second last edge in an edge vector.
 175
 176 @item ei_next
 177 This function takes a pointer to an @code{edge_iterator} and makes it
 178 point to the next edge in the sequence.
 179
 180 @item ei_prev
 181 This function takes a pointer to an @code{edge_iterator} and makes it
 182 point to the previous edge in the sequence.
 183
 184 @item ei_edge
 185 This function returns the @code{edge} currently pointed to by an
 186 @code{edge_iterator}.
 187
 188 @item ei_safe_safe
 189 This function returns the @code{edge} currently pointed to by an
 190 @code{edge_iterator}, but returns @code{NULL} if the iterator is
 191 pointing at the end of the sequence.  This function has been provided
 192 for existing code makes the assumption that a @code{NULL} edge
 193 indicates the end of the sequence.
 194
 195 @end ftable
 196
 197 The convenience macro @code{FOR_EACH_EDGE} can be used to visit all of
 198 the edges in a sequence of predecessor or successor edges.  It must
 199 not be used when an element might be removed during the traversal,
 200 otherwise elements will be missed.  Here is an example of how to use
 201 the macro:
 202
 203 @smallexample
 204 edge e;
 205 edge_iterator ei;
 206
 207 FOR_EACH_EDGE (e, ei, bb->succs)
 208   @{
 209      if (e->flags & EDGE_FALLTHRU)
 210        break;
 211   @}
 212 @end smallexample
 213
 214 @findex fall-thru
 215 There are various reasons why control flow may transfer from one block
 216 to another.  One possibility is that some instruction, for example a
 217 @code{CODE_LABEL}, in a linearized instruction stream just always
 218 starts a new basic block.  In this case a @dfn{fall-thru} edge links
 219 the basic block to the first following basic block.  But there are
 220 several other reasons why edges may be created.  The @code{flags}
 221 field of the @code{edge} data type is used to store information
 222 about the type of edge we are dealing with.  Each edge is of one of
 223 the following types:
 224
 225 @table @emph
 226 @item jump
 227 No type flags are set for edges corresponding to jump instructions.
 228 These edges are used for unconditional or conditional jumps and in
 229 RTL also for table jumps.  They are the easiest to manipulate as they
 230 may be freely redirected when the flow graph is not in SSA form.
 231
 232 @item fall-thru
 233 @findex EDGE_FALLTHRU, force_nonfallthru
 234 Fall-thru edges are present in case where the basic block may continue
 235 execution to the following one without branching.  These edges have
 236 the @code{EDGE_FALLTHRU} flag set.  Unlike other types of edges, these
 237 edges must come into the basic block immediately following in the
 238 instruction stream.  The function @code{force_nonfallthru} is
 239 available to insert an unconditional jump in the case that redirection
 240 is needed.  Note that this may require creation of a new basic block.
 241
 242 @item exception handling
 243 @cindex exception handling
 244 @findex EDGE_ABNORMAL, EDGE_EH
 245 Exception handling edges represent possible control transfers from a
 246 trapping instruction to an exception handler.  The definition of
 247 ``trapping'' varies.  In C++, only function calls can throw, but for
 248 Java, exceptions like division by zero or segmentation fault are
 249 defined and thus each instruction possibly throwing this kind of
 250 exception needs to be handled as control flow instruction.  Exception
 251 edges have the @code{EDGE_ABNORMAL} and @code{EDGE_EH} flags set.
 252
 253 @findex purge_dead_edges
 254 When updating the instruction stream it is easy to change possibly
 255 trapping instruction to non-trapping, by simply removing the exception
 256 edge.  The opposite conversion is difficult, but should not happen
 257 anyway.  The edges can be eliminated via @code{purge_dead_edges} call.
 258
 259 @findex REG_EH_REGION, EDGE_ABNORMAL_CALL
 260 In the RTL representation, the destination of an exception edge is
 261 specified by @code{REG_EH_REGION} note attached to the insn.
 262 In case of a trapping call the @code{EDGE_ABNORMAL_CALL} flag is set
 263 too.  In the @code{tree} representation, this extra flag is not set.
 264
 265 @findex may_trap_p, tree_could_trap_p
 266 In the RTL representation, the predicate @code{may_trap_p} may be used
 267 to check whether instruction still may trap or not.  For the tree
 268 representation, the @code{tree_could_trap_p} predicate is available,
 269 but this predicate only checks for possible memory traps, as in
 270 dereferencing an invalid pointer location.
 271
 272
 273 @item sibling calls
 274 @cindex sibling call
 275 @findex EDGE_ABNORMAL, EDGE_SIBCALL
 276 Sibling calls or tail calls terminate the function in a non-standard
 277 way and thus an edge to the exit must be present.
 278 @code{EDGE_SIBCALL} and @code{EDGE_ABNORMAL} are set in such case.
 279 These edges only exist in the RTL representation.
 280
 281 @item computed jumps
 282 @cindex computed jump
 283 @findex EDGE_ABNORMAL
 284 Computed jumps contain edges to all labels in the function referenced
 285 from the code.  All those edges have @code{EDGE_ABNORMAL} flag set.
 286 The edges used to represent computed jumps often cause compile time
 287 performance problems, since functions consisting of many taken labels
 288 and many computed jumps may have @emph{very} dense flow graphs, so
 289 these edges need to be handled with special care.  During the earlier
 290 stages of the compilation process, GCC tries to avoid such dense flow
 291 graphs by factoring computed jumps.  For example, given the following
 292 series of jumps,
 293
 294 @smallexample
 295   goto *x;
 296   [ @dots{} ]
 297
 298   goto *x;
 299   [ @dots{} ]
 300
 301   goto *x;
 302   [ @dots{} ]
 303 @end smallexample
 304
 305 @noindent
 306 factoring the computed jumps results in the following code sequence
 307 which has a much simpler flow graph:
 308
 309 @smallexample
 310   goto y;
 311   [ @dots{} ]
 312
 313   goto y;
 314   [ @dots{} ]
 315
 316   goto y;
 317   [ @dots{} ]
 318
 319 y:
 320   goto *x;
 321 @end smallexample
 322
 323 However, the classic problem with this transformation is that it has a
 324 runtime cost in there resulting code: An extra jump.  Therefore, the
 325 computed jumps are un-factored in the later passes of the compiler.
 326 Be aware of that when you work on passes in that area.  There have
 327 been numerous examples already where the compile time for code with
 328 unfactored computed jumps caused some serious headaches.
 329
 330 @item nonlocal goto handlers
 331 @cindex nonlocal goto handler
 332 @findex EDGE_ABNORMAL, EDGE_ABNORMAL_CALL
 333 GCC allows nested functions to return into caller using a @code{goto}
 334 to a label passed to as an argument to the callee.  The labels passed
 335 to nested functions contain special code to cleanup after function
 336 call.  Such sections of code are referred to as ``nonlocal goto
 337 receivers''.  If a function contains such nonlocal goto receivers, an
 338 edge from the call to the label is created with the
 339 @code{EDGE_ABNORMAL} and @code{EDGE_ABNORMAL_CALL} flags set.
 340
 341 @item function entry points
 342 @cindex function entry point, alternate function entry point
 343 @findex LABEL_ALTERNATE_NAME
 344 By definition, execution of function starts at basic block 0, so there
 345 is always an edge from the @code{ENTRY_BLOCK_PTR} to basic block 0.
 346 There is no @code{tree} representation for alternate entry points at
 347 this moment.  In RTL, alternate entry points are specified by
 348 @code{CODE_LABEL} with @code{LABEL_ALTERNATE_NAME} defined.  This
 349 feature is currently used for multiple entry point prologues and is
 350 limited to post-reload passes only.  This can be used by back-ends to
 351 emit alternate prologues for functions called from different contexts.
 352 In future full support for multiple entry functions defined by Fortran
 353 90 needs to be implemented.
 354
 355 @item function exits
 356 In the pre-reload representation a function terminates after the last
 357 instruction in the insn chain and no explicit return instructions are
 358 used.  This corresponds to the fall-thru edge into exit block.  After
 359 reload, optimal RTL epilogues are used that use explicit (conditional)
 360 return instructions that are represented by edges with no flags set.
 361
 362 @end table
 363
 364
 365 @node Profile information
 366 @section Profile information
 367
 368 @cindex profile representation
 369 In many cases a compiler must make a choice whether to trade speed in
 370 one part of code for speed in another, or to trade code size for code
 371 speed.  In such cases it is useful to know information about how often
 372 some given block will be executed.  That is the purpose for
 373 maintaining profile within the flow graph.
 374 GCC can handle profile information obtained through @dfn{profile
 375 feedback}, but it can also  estimate branch probabilities based on
 376 statics and heuristics.
 377
 378 @cindex profile feedback
 379 The feedback based profile is produced by compiling the program with
 380 instrumentation, executing it on a train run and reading the numbers
 381 of executions of basic blocks and edges back to the compiler while
 382 re-compiling the program to produce the final executable.  This method
 383 provides very accurate information about where a program spends most
 384 of its time on the train run.  Whether it matches the average run of
 385 course depends on the choice of train data set, but several studies
 386 have shown that the behavior of a program usually changes just
 387 marginally over different data sets.
 388
 389 @cindex Static profile estimation
 390 @cindex branch prediction
 391 @findex predict.def
 392 When profile feedback is not available, the compiler may be asked to
 393 attempt to predict the behavior of each branch in the program using a
 394 set of heuristics (see @file{predict.def} for details) and compute
 395 estimated frequencies of each basic block by propagating the
 396 probabilities over the graph.
 397
 398 @findex frequency, count, BB_FREQ_BASE
 399 Each @code{basic_block} contains two integer fields to represent
 400 profile information: @code{frequency} and @code{count}.  The
 401 @code{frequency} is an estimation how often is basic block executed
 402 within a function.  It is represented as an integer scaled in the
 403 range from 0 to @code{BB_FREQ_BASE}.  The most frequently executed
 404 basic block in function is initially set to @code{BB_FREQ_BASE} and
 405 the rest of frequencies are scaled accordingly.  During optimization,
 406 the frequency of the most frequent basic block can both decrease (for
 407 instance by loop unrolling) or grow (for instance by cross-jumping
 408 optimization), so scaling sometimes has to be performed multiple
 409 times.
 410
 411 @findex gcov_type
 412 The @code{count} contains hard-counted numbers of execution measured
 413 during training runs and is nonzero only when profile feedback is
 414 available.  This value is represented as the host's widest integer
 415 (typically a 64 bit integer) of the special type @code{gcov_type}.
 416
 417 Most optimization passes can use only the frequency information of a
 418 basic block, but a few passes may want to know hard execution counts.
 419 The frequencies should always match the counts after scaling, however
 420 during updating of the profile information numerical error may
 421 accumulate into quite large errors.
 422
 423 @findex REG_BR_PROB_BASE, EDGE_FREQUENCY
 424 Each edge also contains a branch probability field: an integer in the
 425 range from 0 to @code{REG_BR_PROB_BASE}.  It represents probability of
 426 passing control from the end of the @code{src} basic block to the
 427 @code{dest} basic block, i.e.@: the probability that control will flow
 428 along this edge.   The @code{EDGE_FREQUENCY} macro is available to
 429 compute how frequently a given edge is taken.  There is a @code{count}
 430 field for each edge as well, representing same information as for a
 431 basic block.
 432
 433 The basic block frequencies are not represented in the instruction
 434 stream, but in the RTL representation the edge frequencies are
 435 represented for conditional jumps (via the @code{REG_BR_PROB}
 436 macro) since they are used when instructions are output to the
 437 assembly file and the flow graph is no longer maintained.
 438
 439 @cindex reverse probability
 440 The probability that control flow arrives via a given edge to its
 441 destination basic block is called @dfn{reverse probability} and is not
 442 directly represented, but it may be easily computed from frequencies
 443 of basic blocks.
 444
 445 @findex redirect_edge_and_branch
 446 Updating profile information is a delicate task that can unfortunately
 447 not be easily integrated with the CFG manipulation API@.  Many of the
 448 functions and hooks to modify the CFG, such as
 449 @code{redirect_edge_and_branch}, do not have enough information to
 450 easily update the profile, so updating it is in the majority of cases
 451 left up to the caller.  It is difficult to uncover bugs in the profile
 452 updating code, because they manifest themselves only by producing
 453 worse code, and checking profile consistency is not possible because
 454 of numeric error accumulation.  Hence special attention needs to be
 455 given to this issue in each pass that modifies the CFG@.
 456
 457 @findex REG_BR_PROB_BASE, BB_FREQ_BASE, count
 458 It is important to point out that @code{REG_BR_PROB_BASE} and
 459 @code{BB_FREQ_BASE} are both set low enough to be possible to compute
 460 second power of any frequency or probability in the flow graph, it is
 461 not possible to even square the @code{count} field, as modern CPUs are
 462 fast enough to execute $2^32$ operations quickly.
 463
 464
 465 @node Maintaining the CFG
 466 @section Maintaining the CFG
 467 @findex cfghooks.h
 468
 469 An important task of each compiler pass is to keep both the control
 470 flow graph and all profile information up-to-date.  Reconstruction of
 471 the control flow graph after each pass is not an option, since it may be
 472 very expensive and lost profile information cannot be reconstructed at
 473 all.
 474
 475 GCC has two major intermediate representations, and both use the
 476 @code{basic_block} and @code{edge} data types to represent control
 477 flow.  Both representations share as much of the CFG maintenance code
 478 as possible.  For each representation, a set of @dfn{hooks} is defined
 479 so that each representation can provide its own implementation of CFG
 480 manipulation routines when necessary.  These hooks are defined in
 481 @file{cfghooks.h}.  There are hooks for almost all common CFG
 482 manipulations, including block splitting and merging, edge redirection
 483 and creating and deleting basic blocks.  These hooks should provide
 484 everything you need to maintain and manipulate the CFG in both the RTL
 485 and @code{tree} representation.
 486
 487 At the moment, the basic block boundaries are maintained transparently
 488 when modifying instructions, so there rarely is a need to move them
 489 manually (such as in case someone wants to output instruction outside
 490 basic block explicitly).
 491 Often the CFG may be better viewed as integral part of instruction
 492 chain, than structure built on the top of it.  However, in principle
 493 the control flow graph for the @code{tree} representation is
 494 @emph{not} an integral part of the representation, in that a function
 495 tree may be expanded without first building a  flow graph for the
 496 @code{tree} representation at all.  This happens when compiling
 497 without any @code{tree} optimization enabled.  When the @code{tree}
 498 optimizations are enabled and the instruction stream is rewritten in
 499 SSA form, the CFG is very tightly coupled with the instruction stream.
 500 In particular, statement insertion and removal has to be done with
 501 care.  In fact, the whole @code{tree} representation can not be easily
 502 used or maintained without proper maintenance of the CFG
 503 simultaneously.
 504
 505 @findex BLOCK_FOR_INSN, bb_for_stmt
 506 In the RTL representation, each instruction has a
 507 @code{BLOCK_FOR_INSN} value that represents pointer to the basic block
 508 that contains the instruction.  In the @code{tree} representation, the
 509 function @code{bb_for_stmt} returns a pointer to the basic block
 510 containing the queried statement.
 511
 512 @cindex block statement iterators
 513 When changes need to be applied to a function in its @code{tree}
 514 representation, @dfn{block statement iterators} should be used.  These
 515 iterators provide an integrated abstraction of the flow graph and the
 516 instruction stream.  Block statement iterators iterators are
 517 constructed using the @code{block_stmt_iterator} data structure and
 518 several modifier are available, including the following:
 519
 520 @ftable @code
 521 @item bsi_start
 522 This function initializes a @code{block_stmt_iterator} that points to
 523 the first non-empty statement in a basic block.
 524
 525 @item bsi_last
 526 This function initializes a @code{block_stmt_iterator} that points to
 527 the last statement in a basic block.
 528
 529 @item bsi_end_p
 530 This predicate is @code{true} if a @code{block_stmt_iterator}
 531 represents the end of a basic block.
 532
 533 @item bsi_next
 534 This function takes a @code{block_stmt_iterator} and makes it point to
 535 its successor.
 536
 537 @item bsi_prev
 538 This function takes a @code{block_stmt_iterator} and makes it point to
 539 its predecessor.
 540
 541 @item bsi_insert_after
 542 This function inserts a statement after the @code{block_stmt_iterator}
 543 passed in.  The final parameter determines whether the statement
 544 iterator is updated to point to the newly inserted statement, or left
 545 pointing to the original statement.
 546
 547 @item bsi_insert_before
 548 This function inserts a statement before the @code{block_stmt_iterator}
 549 passed in.  The final parameter determines whether the statement
 550 iterator is updated to point to the newly inserted statement, or left
 551 pointing to the original  statement.
 552
 553 @item bsi_remove
 554 This function removes the @code{block_stmt_iterator} passed in and
 555 rechains the remaining statements in a basic block, if any.
 556 @end ftable
 557
 558 @findex BB_HEAD, BB_END
 559 In the RTL representation, the macros @code{BB_HEAD} and @code{BB_END}
 560 may be used to get the head and end @code{rtx} of a basic block.  No
 561 abstract iterators are defined for traversing the insn chain, but you
 562 can just use @code{NEXT_INSN} and @code{PREV_INSN} instead.  See
 563 @xref{Insns}.
 564
 565 @findex purge_dead_edges
 566 Usually a code manipulating pass simplifies the instruction stream and
 567 the flow of control, possibly eliminating some edges.  This may for
 568 example happen when a conditional jump is replaced with an
 569 unconditional jump, but also when simplifying possibly trapping
 570 instruction to non-trapping while compiling Java.  Updating of edges
 571 is not transparent and each optimization pass is required to do so
 572 manually.  However only few cases occur in practice.  The pass may
 573 call @code{purge_dead_edges} on a given basic block to remove
 574 superfluous edges, if any.
 575
 576 @findex redirect_edge_and_branch, redirect_jump
 577 Another common scenario is redirection of branch instructions, but
 578 this is best modeled as redirection of edges in the control flow graph
 579 and thus use of @code{redirect_edge_and_branch} is preferred over more
 580 low level functions, such as @code{redirect_jump} that operate on RTL
 581 chain only.  The CFG hooks defined in @file{cfghooks.h} should provide
 582 the complete API required for manipulating and maintaining the CFG@.
 583
 584 @findex split_block
 585 It is also possible that a pass has to insert control flow instruction
 586 into the middle of a basic block, thus creating an entry point in the
 587 middle of the basic block, which is impossible by definition: The
 588 block must be split to make sure it only has one entry point, i.e.@: the
 589 head of the basic block.  The CFG hook @code{split_block} may be used
 590 when an instruction in the middle of a basic block has to become the
 591 target of a jump or branch instruction.
 592
 593 @findex insert_insn_on_edge
 594 @findex commit_edge_insertions
 595 @findex bsi_insert_on_edge
 596 @findex bsi_commit_edge_inserts
 597 @cindex edge splitting
 598 For a global optimizer, a common operation is to split edges in the
 599 flow graph and insert instructions on them.  In the RTL
 600 representation, this can be easily done using the
 601 @code{insert_insn_on_edge} function that emits an instruction
 602 ``on the edge'', caching it for a later @code{commit_edge_insertions}
 603 call that will take care of moving the inserted instructions off the
 604 edge into the instruction stream contained in a basic block.  This
 605 includes the creation of new basic blocks where needed.  In the
 606 @code{tree} representation, the equivalent functions are
 607 @code{bsi_insert_on_edge} which inserts a block statement
 608 iterator on an edge, and @code{bsi_commit_edge_inserts} which flushes
 609 the instruction to actual instruction stream.
 610
 611 While debugging the optimization pass, an @code{verify_flow_info}
 612 function may be useful to find bugs in the control flow graph updating
 613 code.
 614
 615 Note that at present, the representation of control flow in the
 616 @code{tree} representation is discarded before expanding to RTL@.
 617 Long term the CFG should be maintained and ``expanded'' to the
 618 RTL representation along with the function @code{tree} itself.
 619
 620
 621 @node Liveness information
 622 @section Liveness information
 623 @cindex Liveness representation
 624 Liveness information is useful to determine whether some register is
 625 ``live'' at given point of program, i.e.@: that it contains a value that
 626 may be used at a later point in the program.  This information is
 627 used, for instance, during register allocation, as the pseudo
 628 registers only need to be assigned to a unique hard register or to a
 629 stack slot if they are live.  The hard registers and stack slots may
 630 be freely reused for other values when a register is dead.
 631
 632 Liveness information is available in the back end starting with
 633 @code{pass_df_initialize} and ending with @code{pass_df_finish}.  Three
 634 flavors of live analysis are available: With @code{LR}, it is possible
 635 to determine at any point @code{P} in the function if the register may be
 636 used on some path from @code{P} to the end of the function.  With
 637 @code{UR}, it is possible to determine if there is a path from the
 638 beginning of the function to @code{P} that defines the variable.
 639 @code{LIVE} is the intersection of the @code{LR} and @code{UR} and a
 640 variable is live at @code{P} if there is both an assignment that reaches
 641 it from the beginning of the function and a uses that can be reached on
 642 some path from @code{P} to the end of the function.
 643
 644 In general @code{LIVE} is the most useful of the three.  The macros
 645 @code{DF_[LR,UR,LIVE]_[IN,OUT]} can be used to access this information.
 646 The macros take a basic block number and return a bitmap that is indexed
 647 by the register number.  This information is only guaranteed to be up to
 648 date after calls are made to @code{df_analyze}.  See the file
 649 @code{df-core.c} for details on using the dataflow.
 650
 651
 652 @findex REG_DEAD, REG_UNUSED
 653 The liveness information is stored partly in the RTL instruction stream
 654 and partly in the flow graph.  Local information is stored in the
 655 instruction stream: Each instruction may contain @code{REG_DEAD} notes
 656 representing that the value of a given register is no longer needed, or
 657 @code{REG_UNUSED} notes representing that the value computed by the
 658 instruction is never used.  The second is useful for instructions
 659 computing multiple values at once.
 660