1 @c Copyright (C) 1988, 1989, 1992, 1993, 1994, 1995, 1996, 1997, 1998, 1999,
2 @c 2000, 2001, 2002, 2003 Free Software Foundation, Inc.
3 @c This is part of the GCC manual.
4 @c For copying conditions, see the file gcc.texi.
7 @chapter Passes and Files of the Compiler
8 @cindex passes and files of the compiler
9 @cindex files and passes of the compiler
10 @cindex compiler passes and files
12 @cindex top level of compiler
13 The overall control structure of the compiler is in @file{toplev.c}. This
14 file is responsible for initialization, decoding arguments, opening and
15 closing files, and sequencing the passes.
18 The parsing pass is invoked only once, to parse the entire input. A
19 high level tree representation is then generated from the input,
20 one function at a time. This tree code is then transformed into RTL
21 intermediate code, and processed. The files involved in transforming
22 the trees into RTL are @file{expr.c}, @file{expmed.c}, and
24 @c Note, the above files aren't strictly the only files involved. It's
25 @c all over the place (function.c, final.c,etc). However, those are
26 @c the files that are supposed to be directly involved, and have
27 @c their purpose listed as such, so i've only listed them.
The order in which trees are processed is not necessarily the order
in which they are generated from the input, because of deferred
inlining and other considerations.
32 @findex rest_of_compilation
33 @findex rest_of_decl_compilation
34 Each time the parsing pass reads a complete function definition or
35 top-level declaration, it calls either the function
36 @code{rest_of_compilation}, or the function
37 @code{rest_of_decl_compilation} in @file{toplev.c}, which are
38 responsible for all further processing necessary, ending with output of
39 the assembler language. All other compiler passes run, in sequence,
40 within @code{rest_of_compilation}. When that function returns from
41 compiling a function definition, the storage used for that function
definition's compilation is entirely freed, unless it is an inline
function (@pxref{Inline,,An Inline Function is As Fast As a Macro,gcc,Using the
GNU Compiler Collection (GCC)}) or was deferred for some reason (this
can occur in templates, for example).
48 Here is a list of all the passes of the compiler and their source files.
49 Also included is a description of where debugging dumps can be requested
50 with @option{-d} options.
54 Parsing. This pass reads the entire text of a function definition,
55 constructing a high level tree representation. (Because of the semantic
56 analysis that takes place during this pass, it does more than is
57 formally considered to be parsing.)
59 The tree representation does not entirely follow C syntax, because it is
60 intended to support other languages as well.
62 Language-specific data type analysis is also done in this pass, and every
63 tree node that represents an expression has a data type attached.
64 Variables are represented as declaration nodes.
66 The language-independent source files for parsing are
67 @file{tree.c}, @file{fold-const.c}, and @file{stor-layout.c}.
68 There are also header files @file{tree.h} and @file{tree.def}
69 which define the format of the tree representation.
C preprocessing, for language front ends that want or require it, is
performed by cpplib, which is covered in separate documentation. In
particular, the internals are covered in @ref{Top, ,Cpplib internals,
cppinternals, Cpplib Internals}.
76 The source files to parse C are found in the toplevel directory, and
77 by convention are named @file{c-*}. Some of these are also used by
78 the other C-like languages: @file{c-common.c},
90 Files specific to each language are in subdirectories named after the
91 language in question, like @file{ada}, @file{objc}, @file{cp} (for C++).
93 @cindex Tree optimization
95 Tree optimization. This is the optimization of the tree
96 representation, before converting into RTL code.
98 @cindex inline on trees, automatic
99 Currently, the main optimization performed here is tree-based
101 This is implemented in @file{tree-inline.c} and used by both C and C++.
102 Note that tree based inlining turns off rtx based inlining (since it's more
103 powerful, it would be a waste of time to do rtx based inlining in
106 @cindex constant folding
107 @cindex arithmetic simplifications
108 @cindex simplifications, arithmetic
109 Constant folding and some arithmetic simplifications are also done
110 during this pass, on the tree representation.
111 The routines that perform these tasks are located in @file{fold-const.c}.
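
As an illustration of constant folding and arithmetic simplification on
a tree, here is a minimal sketch. It is not the code in
@file{fold-const.c}: the node layout and the names @code{toy_node} and
@code{toy_fold} are hypothetical, only two operators are handled, and
overflow is ignored.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature expression tree.  Real tree nodes carry a
   data type and much more; this sketch keeps only what folding needs.  */
enum toy_code { TOY_CONST, TOY_VAR, TOY_PLUS, TOY_MULT };

struct toy_node
{
  enum toy_code code;
  long value;                   /* Meaningful only for TOY_CONST.  */
  struct toy_node *op0, *op1;   /* Operands of TOY_PLUS and TOY_MULT.  */
};

/* Fold N bottom-up and return the simplified tree, which may be N
   itself (rewritten in place) or one of its operands.  Arithmetic
   overflow is not checked in this sketch.  */
struct toy_node *
toy_fold (struct toy_node *n)
{
  assert (n != NULL);
  if (n->code != TOY_PLUS && n->code != TOY_MULT)
    return n;

  n->op0 = toy_fold (n->op0);
  n->op1 = toy_fold (n->op1);

  /* Constant folding: both operands are known.  */
  if (n->op0->code == TOY_CONST && n->op1->code == TOY_CONST)
    {
      long a = n->op0->value, b = n->op1->value;
      n->value = (n->code == TOY_PLUS) ? a + b : a * b;
      n->code = TOY_CONST;
      n->op0 = n->op1 = NULL;
      return n;
    }

  /* Arithmetic simplification: x + 0 and x * 1 are both just x.  */
  if (n->op1->code == TOY_CONST
      && ((n->code == TOY_PLUS && n->op1->value == 0)
          || (n->code == TOY_MULT && n->op1->value == 1)))
    return n->op0;

  return n;
}
```

Calling @code{toy_fold} on the tree for @code{2 + 3 * 4} rewrites the
root into a single constant node, while the tree for @code{x * 1}
collapses to the node for @code{x}.
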
113 @cindex RTL generation
RTL generation. This is the conversion of the syntax tree into RTL code.
117 @cindex target-parameter-dependent code
118 This is where the bulk of target-parameter-dependent code is found,
119 since often it is necessary for strategies to apply only when certain
120 standard kinds of instructions are available. The purpose of named
121 instruction patterns is to provide this information to the RTL
124 @cindex tail recursion optimization
125 Optimization is done in this pass for @code{if}-conditions that are
126 comparisons, boolean operations or conditional expressions. Tail
127 recursion is detected at this time also. Decisions are made about how
128 best to arrange loops and how to output @code{switch} statements.
130 @c Avoiding overfull is tricky here.
131 The source files for RTL generation include
139 and @file{emit-rtl.c}.
141 @file{insn-emit.c}, generated from the machine description by the
142 program @code{genemit}, is used in this pass. The header file
143 @file{expr.h} is used for communication within this pass.
147 The header files @file{insn-flags.h} and @file{insn-codes.h},
148 generated from the machine description by the programs @code{genflags}
149 and @code{gencodes}, tell this pass which standard names are available
150 for use and which patterns correspond to them.
152 Aside from debugging information output, none of the following passes
153 refers to the tree structure representation of the function (only
154 part of which is saved).
156 @cindex inline on rtx, automatic
The decision of whether the function can and should be expanded inline
in its subsequent callers is made at the end of RTL generation. The
function must meet certain criteria, currently related to the size of
the function and the types and number of parameters it has. Note that
this function may contain loops, recursive calls to itself
(tail-recursive functions can be inlined!), gotos; in short, all
constructs supported by GCC@. The file @file{integrate.c} contains
the code to save a function's RTL for later inlining and to inline that
RTL when the function is called. The header file @file{integrate.h}
is also used for this purpose.
169 The option @option{-dr} causes a debugging dump of the RTL code after
170 this pass. This dump file's name is made by appending @samp{.rtl} to
173 @c Should the exception handling pass be talked about here?
175 @cindex sibling call optimization
177 Sibling call optimization. This pass performs tail recursion
178 elimination, and tail and sibling call optimizations. The purpose of
179 these optimizations is to reduce the overhead of function calls,
The source file of this pass is @file{sibcall.c}.
185 The option @option{-di} causes a debugging dump of the RTL code after
186 this pass is run. This dump file's name is made by appending
187 @samp{.sibling} to the input file name.
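
The effect of tail recursion elimination can be seen at the source
level. The following sketch is hand-written C, not output of
@file{sibcall.c}; the function names are hypothetical. The first
function calls itself as its last action, and the second shows the
equivalent loop, in which the parameters are reassigned and control
returns to the top instead of making a call.

```c
#include <assert.h>
#include <stddef.h>

/* Tail-recursive form: the recursive call is the last thing done, so
   nothing in the current frame is needed once the call is made.  */
long
toy_sum_recursive (const int *v, size_t n, long acc)
{
  assert (v != NULL || n == 0);
  if (n == 0)
    return acc;
  return toy_sum_recursive (v + 1, n - 1, acc + v[0]);
}

/* Loop form: each "call" becomes an update of the parameters followed
   by a jump back to the start of the function body.  */
long
toy_sum_loop (const int *v, size_t n, long acc)
{
  assert (v != NULL || n == 0);
  while (n != 0)
    {
      acc += v[0];
      v++;
      n--;
    }
  return acc;
}
```

Both functions return the same value for the same arguments; the loop
form uses a constant amount of stack regardless of @code{n}.
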
189 @cindex jump optimization
190 @cindex unreachable code
193 Jump optimization. This pass simplifies jumps to the following
194 instruction, jumps across jumps, and jumps to jumps. It deletes
195 unreferenced labels and unreachable code, except that unreachable code
196 that contains a loop is not recognized as unreachable in this pass.
197 (Such loops are deleted later in the basic block analysis.) It also
198 converts some code originally written with jumps into sequences of
199 instructions that directly set values from the results of comparisons,
200 if the machine has such instructions.
202 Jump optimization is performed two or three times. The first time is
203 immediately following RTL generation. The second time is after CSE,
204 but only if CSE says repeated jump optimization is needed. The
205 last time is right before the final pass. That time, cross-jumping
206 and deletion of no-op move instructions are done together with the
207 optimizations described above.
209 The source file of this pass is @file{jump.c}.
212 The option @option{-dj} causes a debugging dump of the RTL code after
213 this pass is run for the first time. This dump file's name is made by
214 appending @samp{.jump} to the input file name.
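
One of the simplifications described above, redirecting jumps to jumps,
can be sketched on a toy instruction array. This is only an
illustration under simplifying assumptions, not the code in
@file{jump.c}: the types and the name @code{toy_forward_jumps} are
hypothetical, and every jump is treated as unconditional.

```c
#include <assert.h>
#include <stddef.h>

enum toy_op { TOY_OTHER, TOY_JUMP };

struct toy_insn
{
  enum toy_op op;
  size_t target;   /* Index of the destination; only for TOY_JUMP.  */
};

/* Redirect each jump whose destination is itself a jump so that it
   goes straight to the final destination.  At most N links are
   followed, so a cycle of jumps is left alone rather than looping
   forever.  Return the number of jumps that were changed.  */
size_t
toy_forward_jumps (struct toy_insn *insns, size_t n)
{
  size_t changed = 0;

  assert (insns != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    {
      size_t dest, steps = 0;

      if (insns[i].op != TOY_JUMP)
        continue;

      dest = insns[i].target;
      while (dest < n && insns[dest].op == TOY_JUMP && steps < n)
        {
          dest = insns[dest].target;
          steps++;
        }

      if (steps < n && dest != insns[i].target)
        {
          insns[i].target = dest;
          changed++;
        }
    }
  return changed;
}
```

After this runs, a jump that is no longer the target of any other jump
may have become unreachable, which is why deleting unreferenced labels
and unreachable code belongs to the same pass.
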
217 @cindex register use analysis
219 Register scan. This pass finds the first and last use of each
220 register, as a guide for common subexpression elimination. Its source
221 is in @file{regclass.c}.
223 @cindex jump threading
225 @opindex fthread-jumps
Jump threading. This pass detects a conditional jump that branches to an
227 identical or inverse test. Such jumps can be @samp{threaded} through
228 the second conditional test. The source code for this pass is in
229 @file{jump.c}. This optimization is only performed if
230 @option{-fthread-jumps} is enabled.
232 @cindex SSA optimizations
@cindex Static Single Assignment optimizations
Static Single Assignment (SSA) based optimization passes. The
conversion passes to and from SSA form are turned on by the @option{-fssa}
option (conversion is also done automatically if you enable an SSA
optimization pass). In SSA form, each variable (pseudo register) is set
exactly once, which gives def-use and use-def chains for free and lets
many more optimization passes run in linear time.
243 Conversion to and from SSA form is handled by functions in
247 The option @option{-de} causes a debugging dump of the RTL code after
248 this pass. This dump file's name is made by appending @samp{.ssa} to
251 @cindex SSA Conditional Constant Propagation
252 @cindex Conditional Constant Propagation, SSA based
253 @cindex conditional constant propagation
256 SSA Conditional Constant Propagation. Turned on by the @option{-fssa-ccp}
257 option. This pass performs conditional constant propagation to simplify
258 instructions including conditional branches. This pass is more aggressive
259 than the constant propagation done by the CSE and GCSE passes, but operates
263 The option @option{-dW} causes a debugging dump of the RTL code after
264 this pass. This dump file's name is made by appending @samp{.ssaccp} to
268 @cindex DCE, SSA based
269 @cindex dead code elimination
272 SSA Aggressive Dead Code Elimination. Turned on by the @option{-fssa-dce}
273 option. This pass performs elimination of code considered unnecessary because
274 it has no externally visible effects on the program. It operates in
278 The option @option{-dX} causes a debugging dump of the RTL code after
279 this pass. This dump file's name is made by appending @samp{.ssadce} to
283 @cindex common subexpression elimination
284 @cindex constant propagation
Common subexpression elimination. This pass also does constant
propagation. Its source files are @file{cse.c} and @file{cselib.c}.
288 If constant propagation causes conditional jumps to become
289 unconditional or to become no-ops, jump optimization is run again when
293 The option @option{-ds} causes a debugging dump of the RTL code after
294 this pass. This dump file's name is made by appending @samp{.cse} to
297 @cindex global common subexpression elimination
298 @cindex constant propagation
299 @cindex copy propagation
301 Global common subexpression elimination. This pass performs two
302 different types of GCSE depending on whether you are optimizing for
303 size or not (LCM based GCSE tends to increase code size for a gain in
304 speed, while Morel-Renvoise based GCSE does not).
305 When optimizing for size, GCSE is done using Morel-Renvoise Partial
306 Redundancy Elimination, with the exception that it does not try to move
307 invariants out of loops---that is left to the loop optimization pass.
308 If MR PRE GCSE is done, code hoisting (aka unification) is also done, as
310 If you are optimizing for speed, LCM (lazy code motion) based GCSE is
done. LCM is based on the work of Knoop, R@"uthing, and Steffen. LCM
312 based GCSE also does loop invariant code motion. We also perform load
313 and store motion when optimizing for speed.
314 Regardless of which type of GCSE is used, the GCSE pass also performs
315 global constant and copy propagation.
317 The source file for this pass is @file{gcse.c}, and the LCM routines
321 The option @option{-dG} causes a debugging dump of the RTL code after
322 this pass. This dump file's name is made by appending @samp{.gcse} to
325 @cindex loop optimization
327 @cindex strength-reduction
329 Loop optimization. This pass moves constant expressions out of loops,
330 and optionally does strength-reduction and loop unrolling as well.
331 Its source files are @file{loop.c} and @file{unroll.c}, plus the header
332 @file{loop.h} used for communication between them. Loop unrolling uses
333 some functions in @file{integrate.c} and the header @file{integrate.h}.
334 Loop dependency analysis routines are contained in @file{dependence.c}.
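
The two main transformations named above, moving an invariant
expression out of a loop and strength reduction, can be shown at the
source level. The following sketch is hand-written C with hypothetical
function names; it shows the kind of rewrite involved, not what
@file{loop.c} actually emits.

```c
#include <assert.h>
#include <stddef.h>

/* Before: K * SCALE is recomputed on every iteration although it
   never changes, and I * STRIDE multiplies by the loop counter.  */
void
toy_fill_naive (long *out, size_t n, long k, long scale, long stride)
{
  assert (out != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    out[i] = k * scale + (long) i * stride;
}

/* After: the invariant product is computed once before the loop, and
   the multiplication by the counter becomes a running addition.  */
void
toy_fill_optimized (long *out, size_t n, long k, long scale, long stride)
{
  long base, step;

  assert (out != NULL || n == 0);
  base = k * scale;   /* Loop invariant, hoisted.  */
  step = 0;           /* Equals I * STRIDE at the top of each iteration.  */
  for (size_t i = 0; i < n; i++)
    {
      out[i] = base + step;
      step += stride; /* Strength reduction: add instead of multiply.  */
    }
}
```

Both functions store the same values for inputs that do not overflow;
the second does one multiplication in total instead of two per
iteration.
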
The second loop optimization pass performs basic-block-level
optimizations: unrolling, peeling, and unswitching loops. The source
files are @file{cfgloopanal.c} and @file{cfgloopmanip.c}, which contain
generic loop analysis and manipulation code; @file{loop-init.c}, with
initialization and finalization code; @file{loop-unswitch.c}, for loop
unswitching; and @file{loop-unroll.c}, for loop unrolling and peeling.
344 The option @option{-dL} causes a debugging dump of the RTL code after
345 these passes. The dump file names are made by appending @samp{.loop} and
346 @samp{.loop2} to the input file name.
348 @cindex jump bypassing
350 Jump bypassing. This pass is an aggressive form of GCSE that transforms
351 the control flow graph of a function by propagating constants into
352 conditional branch instructions.
354 The source file for this pass is @file{gcse.c}.
357 The option @option{-dG} causes a debugging dump of the RTL code after
358 this pass. This dump file's name is made by appending @samp{.bypass}
359 to the input file name.
362 @opindex frerun-cse-after-loop
363 If @option{-frerun-cse-after-loop} was enabled, a second common
364 subexpression elimination pass is performed after the loop optimization
365 pass. Jump threading is also done again at this time if it was specified.
368 The option @option{-dt} causes a debugging dump of the RTL code after
369 this pass. This dump file's name is made by appending @samp{.cse2} to
372 @cindex data flow analysis
373 @cindex analysis, data flow
376 Data flow analysis (@file{flow.c}). This pass divides the program
377 into basic blocks (and in the process deletes unreachable loops); then
378 it computes which pseudo-registers are live at each point in the
379 program, and makes the first instruction that uses a value point at
380 the instruction that computed the value.
382 @cindex autoincrement/decrement analysis
383 This pass also deletes computations whose results are never used, and
384 combines memory references with add or subtract instructions to make
385 autoincrement or autodecrement addressing.
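
The liveness computation and the detection of unused results can be
sketched for a single basic block with a backward scan. This is a
simplified illustration, not the algorithm in @file{flow.c}: the
structure and function names are hypothetical, at most 32 pseudo
registers are supported, and an instruction that sets no register is
assumed to have a side effect.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_insn
{
  int def;              /* Pseudo register set here (0..31), or -1.  */
  unsigned long uses;   /* Bit R is set if pseudo register R is read.  */
  bool dead;            /* Output: the result is never used.  */
};

/* Scan the block backward.  LIVE_OUT has a bit set for each register
   that is live at the end of the block.  Mark each insn whose result
   is never used, and return the set of registers live on entry.  */
unsigned long
toy_block_liveness (struct toy_insn *insns, size_t n,
                    unsigned long live_out)
{
  unsigned long live = live_out;

  assert (insns != NULL || n == 0);
  for (size_t i = n; i-- > 0;)
    {
      struct toy_insn *insn = &insns[i];

      assert (insn->def >= -1 && insn->def < 32);
      if (insn->def >= 0)
        {
          unsigned long bit = 1UL << insn->def;

          insn->dead = (live & bit) == 0;
          if (insn->dead)
            continue;      /* A dead insn keeps nothing alive.  */
          live &= ~bit;    /* The value does not exist before it is set.  */
        }
      else
        insn->dead = false;  /* No result: assumed to have a side effect.  */

      live |= insn->uses;
    }
  return live;
}
```

Because a dead instruction does not add its inputs to the live set,
an instruction whose only consumer is dead is itself found dead in the
same scan.
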
388 The option @option{-df} causes a debugging dump of the RTL code after
389 this pass. This dump file's name is made by appending @samp{.flow} to
390 the input file name. If stupid register allocation is in use, this
391 dump file reflects the full results of such allocation.
393 @cindex instruction combination
395 Instruction combination (@file{combine.c}). This pass attempts to
396 combine groups of two or three instructions that are related by data
397 flow into single instructions. It combines the RTL expressions for
398 the instructions by substitution, simplifies the result using algebra,
399 and then attempts to match the result against the machine description.
402 The option @option{-dc} causes a debugging dump of the RTL code after
403 this pass. This dump file's name is made by appending @samp{.combine}
404 to the input file name.
406 @cindex if conversion
If-conversion transforms control dependencies into data dependencies
(i.e.@: it transforms conditional code into a single control stream).
It is implemented in the file @file{ifcvt.c}.
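
A source-level sketch of the idea follows. It is hand-written C with
hypothetical function names, not output of @file{ifcvt.c}. In the
first function the increment is control dependent on a branch; in the
second the result of the comparison is used as data, so the loop body
is a single control stream.

```c
#include <assert.h>
#include <stddef.h>

/* Control dependence: whether COUNT is incremented depends on which
   way the branch goes.  */
size_t
toy_count_branch (const int *v, size_t n, int limit)
{
  size_t count = 0;

  assert (v != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    if (v[i] > limit)
      count++;
  return count;
}

/* Data dependence: the comparison yields 0 or 1, and that value is
   added unconditionally.  */
size_t
toy_count_converted (const int *v, size_t n, int limit)
{
  size_t count = 0;

  assert (v != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    count += (size_t) (v[i] > limit);
  return count;
}
```

Whether the second form is actually faster depends on the target
machine, for example on whether it has instructions that set a register
from the result of a comparison.
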
414 The option @option{-dE} causes a debugging dump of the RTL code after
415 this pass. This dump file's name is made by appending @samp{.ce} to
418 @cindex register movement
420 Register movement (@file{regmove.c}). This pass looks for cases where
421 matching constraints would force an instruction to need a reload, and
422 this reload would be a register-to-register move. It then attempts
423 to change the registers used by the instruction to avoid the move
427 The option @option{-dN} causes a debugging dump of the RTL code after
428 this pass. This dump file's name is made by appending @samp{.regmove}
429 to the input file name.
431 @cindex instruction scheduling
432 @cindex scheduling, instruction
Instruction scheduling (@file{sched.c}). This pass looks for
instructions whose output will not be available by the time that it is
used in subsequent instructions. (Memory loads and floating-point
instructions often have this behavior on RISC machines.) It re-orders
instructions within a basic block to try to separate the definition and
use of items that otherwise would cause pipeline stalls.
441 Instruction scheduling is performed twice. The first time is immediately
442 after instruction combination and the second is immediately after reload.
445 The option @option{-dS} causes a debugging dump of the RTL code after this
446 pass is run for the first time. The dump file's name is made by
447 appending @samp{.sched} to the input file name.
449 @cindex register allocation
451 Register allocation. These passes make sure that all occurrences of pseudo
452 registers are eliminated, either by allocating them to a hard register,
453 replacing them by an equivalent expression (e.g.@: a constant) or by placing
454 them on the stack. This is done in several subpasses:
457 @cindex register class preference pass
459 Register class preferencing. The RTL code is scanned to find out
460 which register class is best for each pseudo register. The source
461 file is @file{regclass.c}.
463 @cindex local register allocation
465 Local register allocation (@file{local-alloc.c}). This pass allocates
466 hard registers to pseudo registers that are used only within one basic
467 block. Because the basic block is linear, it can use fast and
468 powerful techniques to do a very good job.
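
To show why linearity helps, here is a minimal sketch of allocation
within one basic block. It is a simple first-fit scan over live ranges
and is not the algorithm used in @file{local-alloc.c}; the names
@code{toy_range} and @code{toy_local_alloc} and the limit of 32 hard
registers are assumptions made for the example. In a linear block each
pseudo register's life is a single interval of instruction indices, so
one pass in order of first use is enough.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_HARD_REGS 32

struct toy_range
{
  size_t first;   /* Index of the insn that sets the pseudo register.  */
  size_t last;    /* Index of the last insn that uses it.  */
  int hard_reg;   /* Output: assigned hard register, or -1.  */
};

/* Assign hard registers to RANGES, which must be sorted by FIRST.
   A hard register is free if it was never used or if its occupant's
   last use is no later than the insn that sets the new pseudo (an insn
   may read an input and write its output in the same register).
   Return the number of pseudos left without a hard register.  */
size_t
toy_local_alloc (struct toy_range *ranges, size_t n, int num_hard_regs)
{
  size_t busy_until[TOY_MAX_HARD_REGS];
  int in_use[TOY_MAX_HARD_REGS] = { 0 };
  size_t unallocated = 0;

  assert (num_hard_regs >= 0 && num_hard_regs <= TOY_MAX_HARD_REGS);
  assert (ranges != NULL || n == 0);

  for (size_t i = 0; i < n; i++)
    {
      assert (i == 0 || ranges[i - 1].first <= ranges[i].first);
      ranges[i].hard_reg = -1;

      for (int r = 0; r < num_hard_regs; r++)
        if (!in_use[r] || busy_until[r] <= ranges[i].first)
          {
            in_use[r] = 1;
            busy_until[r] = ranges[i].last;
            ranges[i].hard_reg = r;
            break;
          }

      if (ranges[i].hard_reg < 0)
        unallocated++;
    }
  return unallocated;
}
```

In this sketch a pseudo register that gets no hard register is simply
reported with @code{hard_reg} set to @minus{}1.
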
471 The option @option{-dl} causes a debugging dump of the RTL code after
472 this pass. This dump file's name is made by appending @samp{.lreg} to
475 @cindex global register allocation
477 Global register allocation (@file{global.c}). This pass
478 allocates hard registers for the remaining pseudo registers (those
479 whose life spans are not contained in one basic block).
481 @cindex graph coloring register allocation
485 Graph coloring register allocator. The files @file{ra.c}, @file{ra-build.c},
486 @file{ra-colorize.c}, @file{ra-debug.c}, @file{ra-rewrite.c} together with
487 the header @file{ra.h} contain another register allocator, which is used
488 when the option @option{-fnew-ra} is given. In that case it is run instead
489 of the above mentioned local and global register allocation passes, and the
490 option @option{-dl} causes a debugging dump of its work.
Reloading. This pass renumbers pseudo registers with the hardware
register numbers they were allocated. Pseudo registers that did not
496 get hard registers are replaced with stack slots. Then it finds
497 instructions that are invalid because a value has failed to end up in
498 a register, or has ended up in a register of the wrong kind. It fixes
499 up these instructions by reloading the problematical values
500 temporarily into registers. Additional instructions are generated to
503 The reload pass also optionally eliminates the frame pointer and inserts
504 instructions to save and restore call-clobbered registers around calls.
506 Source files are @file{reload.c} and @file{reload1.c}, plus the header
507 @file{reload.h} used for communication between them.
510 The option @option{-dg} causes a debugging dump of the RTL code after
511 this pass. This dump file's name is made by appending @samp{.greg} to
515 @cindex instruction scheduling
516 @cindex scheduling, instruction
518 Instruction scheduling is repeated here to try to avoid pipeline stalls
519 due to memory loads generated for spilled pseudo registers.
522 The option @option{-dR} causes a debugging dump of the RTL code after
523 this pass. This dump file's name is made by appending @samp{.sched2}
524 to the input file name.
526 @cindex basic block reordering
527 @cindex reordering, block
Basic block reordering. This pass implements profile-guided code
positioning. If profile information is not available, various types of
static analysis are performed to make the predictions normally coming
from the profile feedback (i.e.@: execution frequency, branch probability,
etc.). It is implemented in the file @file{bb-reorder.c}, and the
various prediction routines are in @file{predict.c}.
537 The option @option{-dB} causes a debugging dump of the RTL code after
538 this pass. This dump file's name is made by appending @samp{.bbro} to
541 @cindex cross-jumping
542 @cindex no-op move instructions
544 Jump optimization is repeated, this time including cross-jumping
545 and deletion of no-op move instructions.
548 The option @option{-dJ} causes a debugging dump of the RTL code after
549 this pass. This dump file's name is made by appending @samp{.jump2}
550 to the input file name.
552 @cindex delayed branch scheduling
553 @cindex scheduling, delayed branch
555 Delayed branch scheduling. This optional pass attempts to find
556 instructions that can go into the delay slots of other instructions,
557 usually jumps and calls. The source file name is @file{reorg.c}.
560 The option @option{-dd} causes a debugging dump of the RTL code after
561 this pass. This dump file's name is made by appending @samp{.dbr}
562 to the input file name.
564 @cindex branch shortening
566 Branch shortening. On many RISC machines, branch instructions have a
567 limited range. Thus, longer sequences of instructions must be used for
long branches. In this pass, the compiler figures out how far each
instruction will be from every other instruction, and therefore whether
570 the usual instructions, or the longer sequences, must be used for each
573 @cindex register-to-stack conversion
575 Conversion from usage of some hard registers to usage of a register
576 stack may be done at this point. Currently, this is supported only
577 for the floating-point registers of the Intel 80387 coprocessor. The
578 source file name is @file{reg-stack.c}.
The option @option{-dk} causes a debugging dump of the RTL code after
582 this pass. This dump file's name is made by appending @samp{.stack}
583 to the input file name.
586 @cindex peephole optimization
588 Final. This pass outputs the assembler code for the function. It is
589 also responsible for identifying spurious test and compare
590 instructions. Machine-specific peephole optimizations are performed
591 at the same time. The function entry and exit sequences are generated
592 directly as assembler code in this pass; they never exist as RTL@.
594 The source files are @file{final.c} plus @file{insn-output.c}; the
595 latter is generated automatically from the machine description by the
596 tool @file{genoutput}. The header file @file{conditions.h} is used
597 for communication between these files.
599 @cindex debugging information generation
601 Debugging information output. This is run after final because it must
602 output the stack slot offsets for pseudo registers that did not get
603 hard registers. Source files are @file{dbxout.c} for DBX symbol table
604 format, @file{sdbout.c} for SDB symbol table format, @file{dwarfout.c}
605 for DWARF symbol table format, files @file{dwarf2out.c} and
606 @file{dwarf2asm.c} for DWARF2 symbol table format, and @file{vmsdbgout.c}
607 for VMS debug symbol table format.
610 Some additional files are used by all or many passes:
614 Every pass uses @file{machmode.def} and @file{machmode.h} which define
618 Several passes use @file{real.h}, which defines the default
619 representation of floating point constants and how to operate on them.
622 All the passes that work with RTL use the header files @file{rtl.h}
623 and @file{rtl.def}, and subroutines in file @file{rtl.c}. The tools
624 @code{gen*} also use these files to read and work with the machine
628 All the tools that read the machine description use support routines
629 found in @file{gensupport.c}, @file{errors.c}, and @file{read-rtl.c}.
633 Several passes refer to the header file @file{insn-config.h} which
634 contains a few parameters (C macro definitions) generated
635 automatically from the machine description RTL by the tool
638 @cindex instruction recognizer
640 Several passes use the instruction recognizer, which consists of
641 @file{recog.c} and @file{recog.h}, plus the files @file{insn-recog.c}
642 and @file{insn-extract.c} that are generated automatically from the
643 machine description by the tools @file{genrecog} and
Several passes use the header files @file{regs.h}, which defines the
information recorded about pseudo register usage, and @file{basic-block.h},
which defines the information recorded about basic blocks.
652 @file{hard-reg-set.h} defines the type @code{HARD_REG_SET}, a bit-vector
653 with a bit for each hard register, and some macros to manipulate it.
654 This type is just @code{int} if the machine has few enough hard registers;
655 otherwise it is an array of @code{int} and some of the macros expand
659 Several passes use instruction attributes. A definition of the
660 attributes defined for a particular machine is in file
661 @file{insn-attr.h}, which is generated from the machine description by
662 the program @file{genattr}. The file @file{insn-attrtab.c} contains
663 subroutines to obtain the attribute values for insns and information
664 about processor pipeline characteristics for the instruction
665 scheduler. It is generated from the machine description by the
666 program @file{genattrtab}.