/* Instruction scheduling pass.
   Copyright (C) 1992, 93-98, 1999 Free Software Foundation, Inc.
   Contributed by Michael Tiemann (tiemann@cygnus.com) Enhanced by,
   and currently maintained by, Jim Wilson (wilson@cygnus.com)

This file is part of GNU CC.

GNU CC is free software; you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2, or (at your option)
any later version.

GNU CC is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
General Public License for more details.

You should have received a copy of the GNU General Public License
along with GNU CC; see the file COPYING.  If not, write to the Free
Software Foundation, 59 Temple Place - Suite 330, Boston, MA
02111-1307, USA.  */
/* Instruction scheduling pass.

   This pass implements list scheduling within basic blocks.  It is
   run twice: (1) after flow analysis, but before register allocation,
   and (2) after register allocation.

   The first run performs interblock scheduling, moving insns between
   different blocks in the same "region", and the second runs only
   basic block scheduling.

   Interblock motions performed are useful motions and speculative
   motions, including speculative loads.  Motions requiring code
   duplication are not supported.  The identification of motion type
   and the check for validity of speculative motions requires
   construction and analysis of the function's control flow graph.

   The scheduler works as follows:
   We compute insn priorities based on data dependencies.  Flow
   analysis only creates a fraction of the data dependencies we must
   observe: namely, only those dependencies which the combiner can be
   expected to use.  For this pass, we must therefore create the
   remaining dependencies we need to observe: register dependencies,
   memory dependencies, dependencies to keep function calls in order,
   and the dependence between a conditional branch and the setting of
   condition codes are all dealt with here.

   The scheduler first traverses the data flow graph, starting with
   the last instruction, and proceeding to the first, assigning values
   to insn_priority as it goes.  This sorts the instructions
   topologically by data dependence.
   Once priorities have been established, we order the insns using
   list scheduling.  This works as follows: starting with a list of
   all the ready insns, and sorted according to priority number, we
   schedule the insn from the end of the list by placing its
   predecessors in the list according to their priority order.  We
   consider this insn scheduled by setting the pointer to the "end" of
   the list to point to the previous insn.  When an insn has no
   predecessors, we either queue it until sufficient time has elapsed
   or add it to the ready list.  As the instructions are scheduled or
   when stalls are introduced, the queue advances and dumps insns into
   the ready list.  When all insns down to the lowest priority have
   been scheduled, the critical path of the basic block has been made
   as short as possible.  The remaining insns are then scheduled in
   the remaining slots as well as possible.
   Function unit conflicts are resolved during forward list scheduling
   by tracking the time when each insn is committed to the schedule
   and from that, the time the function units it uses must be free.
   As insns on the ready list are considered for scheduling, those
   that would result in a blockage of the already committed insns are
   queued until no blockage will result.
   The following list shows the order in which we want to break ties
   among insns in the ready list:

   1.  choose the insn with the longest path to the end of the bb,
       ties broken by the next criterion;
   2.  choose the insn with the least contribution to register
       pressure, ties broken by the next criterion;
   3.  prefer in-block over interblock motion, ties broken by the
       next criterion;
   4.  prefer useful over speculative motion, ties broken by the
       next criterion;
   5.  choose the insn with the largest control flow probability,
       ties broken by the next criterion;
   6.  choose the insn with the least dependences upon the previously
       scheduled insn; or finally
   7.  choose the insn which has the most insns dependent on it;
   8.  choose the insn with the lowest UID.
   Memory references complicate matters.  Only if we can be certain
   that memory references are not part of the data dependency graph
   (via true, anti, or output dependence) can we move operations past
   memory references.  To a first approximation, reads can be done
   independently, while writes introduce dependencies.  Better
   approximations will yield fewer dependencies.
   Before reload, an extended analysis of interblock data dependences
   is required for interblock scheduling.  This is performed in
   compute_block_backward_dependences ().

   Dependencies set up by memory references are treated in exactly the
   same way as other dependencies, by using LOG_LINKS backward
   dependences.  LOG_LINKS are translated into INSN_DEPEND forward
   dependences for the purpose of forward list scheduling.
   Having optimized the critical path, we may have also unduly
   extended the lifetimes of some registers.  If an operation requires
   that constants be loaded into registers, it is certainly desirable
   to load those constants as early as necessary, but no earlier.
   I.e., it will not do to load up a bunch of registers at the
   beginning of a basic block only to use them at the end, if they
   could be loaded later, since this may result in excessive register
   lifetimes.
   Note that since branches are never in basic blocks, but only end
   basic blocks, this pass will not move branches.  But that is ok,
   since we can use GNU's delayed branch scheduling pass to take care
   of this case.
   Also note that no further optimizations based on algebraic
   identities are performed, so this pass would be a good one to
   perform instruction splitting, such as breaking up a multiply
   instruction into shifts and adds where that is profitable.
   Given the memory aliasing analysis that this pass should perform,
   it should be possible to remove redundant stores to memory, and to
   load values from registers instead of hitting memory.
   Before reload, speculative insns are moved only if a 'proof' exists
   that no exception will be caused by this, and if no live registers
   exist that inhibit the motion (live register constraints are not
   represented by data dependence edges).
   This pass must update information that subsequent passes expect to
   be correct.  Namely: reg_n_refs, reg_n_sets, reg_n_deaths,
   reg_n_calls_crossed, and reg_live_length.  Also, BLOCK_HEAD,
   BLOCK_END.
   The information in the line number notes is carefully retained by
   this pass.  Notes that refer to the starting and ending of
   exception regions are also carefully retained by this pass.  All
   other NOTE insns are grouped in their same relative order at the
   beginning of basic blocks and regions that have been scheduled.
   The main entry point for this pass is schedule_insns(), called for
   each function.  The work of the scheduler is organized in three
   levels: (1) function level: insns are subject to splitting,
   control-flow-graph is constructed, regions are computed (after
   reload, each region is of one block), (2) region level: control
   flow graph attributes required for interblock scheduling are
   computed (dominators, reachability, etc.), data dependences and
   priorities are computed, and (3) block level: insns in the block
   are actually scheduled.  */
#include "basic-block.h"
#include "function.h"
#include "hard-reg-set.h"
#include "insn-config.h"
#include "insn-attr.h"

extern char *reg_known_equiv_p;
extern rtx *reg_known_value;
#ifdef INSN_SCHEDULING

/* target_units bitmask has 1 for each unit in the cpu.  It should be
   possible to compute this variable from the machine description.
   But currently it is computed by examining the insn list.  Since
   this is only needed for visualization, it seems an acceptable
   solution.  (For understanding the mapping of bits to units, see
   definition of function_units[] in "insn-attrtab.c".)  */

static int target_units = 0;

/* issue_rate is the number of insns that can be scheduled in the same
   machine cycle.  It can be defined in the config/mach/mach.h file,
   otherwise we set it to 1.  */

static int issue_rate;
/* sched-verbose controls the amount of debugging output the
   scheduler prints.  It is controlled by -fsched-verbose-N:
   N>0 and no -DSR : the output is directed to stderr.
   N>=10 will direct the printouts to stderr (regardless of -dSR).
   N=1: same as -dSR.
   N=2: bb's probabilities, detailed ready list info, unit/insn info.
   N=3: rtl at abort point, control-flow, regions info.
   N=5: dependences info.  */

#define MAX_RGN_BLOCKS 10
#define MAX_RGN_INSNS 100

static int sched_verbose_param = 0;
static int sched_verbose = 0;
/* nr_inter/spec counts interblock/speculative motion for the function.  */
static int nr_inter, nr_spec;

/* Debugging file.  All printouts are sent to dump, which is always set,
   either to stderr, or to the dump listing file (-dRS).  */
static FILE *dump = 0;
/* fix_sched_param() is called from toplev.c upon detection
   of the -fsched-***-N options.  */

void
fix_sched_param (param, val)
     const char *param, *val;
{
  if (!strcmp (param, "verbose"))
    sched_verbose_param = atoi (val);
  else
    warning ("fix_sched_param: unknown param: %s", param);
}
/* Element N is the next insn that sets (hard or pseudo) register
   N within the current basic block; or zero, if there is no
   such insn.  Needed for new registers which may be introduced
   by splitting insns.  */
static rtx *reg_last_uses;
static rtx *reg_last_sets;
static rtx *reg_last_clobbers;
static regset reg_pending_sets;
static regset reg_pending_clobbers;
static int reg_pending_sets_all;
/* Vector indexed by INSN_UID giving the original ordering of the insns.  */
static int *insn_luid;
#define INSN_LUID(INSN) (insn_luid[INSN_UID (INSN)])

/* Vector indexed by INSN_UID giving each instruction a priority.  */
static int *insn_priority;
#define INSN_PRIORITY(INSN) (insn_priority[INSN_UID (INSN)])

static short *insn_costs;
#define INSN_COST(INSN) insn_costs[INSN_UID (INSN)]

/* Vector indexed by INSN_UID giving an encoding of the function units
   used.  */
static short *insn_units;
#define INSN_UNIT(INSN) insn_units[INSN_UID (INSN)]

/* Vector indexed by INSN_UID giving each instruction a
   register-weight.  This weight is an estimation of the insn's
   contribution to register pressure.  */
static int *insn_reg_weight;
#define INSN_REG_WEIGHT(INSN) (insn_reg_weight[INSN_UID (INSN)])

/* Vector indexed by INSN_UID giving a list of insns which
   depend upon INSN.  Unlike LOG_LINKS, it represents forward dependences.  */
static rtx *insn_depend;
#define INSN_DEPEND(INSN) insn_depend[INSN_UID (INSN)]

/* Vector indexed by INSN_UID.  Initialized to the number of incoming
   edges in the forward dependence graph (= number of LOG_LINKS).  As
   scheduling proceeds, dependence counts are decreased.  An
   instruction moves to the ready list when its counter is zero.  */
static int *insn_dep_count;
#define INSN_DEP_COUNT(INSN) (insn_dep_count[INSN_UID (INSN)])
/* Vector indexed by INSN_UID giving an encoding of the blockage range
   function.  The unit and the range are encoded.  */
static unsigned int *insn_blockage;
#define INSN_BLOCKAGE(INSN) insn_blockage[INSN_UID (INSN)]

#define BLOCKAGE_MASK ((1 << BLOCKAGE_BITS) - 1)
#define ENCODE_BLOCKAGE(U, R) \
  (((U) << BLOCKAGE_BITS \
    | MIN_BLOCKAGE_COST (R)) << BLOCKAGE_BITS \
   | MAX_BLOCKAGE_COST (R))
#define UNIT_BLOCKED(B) ((B) >> (2 * BLOCKAGE_BITS))
#define BLOCKAGE_RANGE(B) \
  (((((B) >> BLOCKAGE_BITS) & BLOCKAGE_MASK) << (HOST_BITS_PER_INT / 2)) \
   | ((B) & BLOCKAGE_MASK))

/* Encodings of the `<name>_unit_blockage_range' function.  */
#define MIN_BLOCKAGE_COST(R) ((R) >> (HOST_BITS_PER_INT / 2))
#define MAX_BLOCKAGE_COST(R) ((R) & ((1 << (HOST_BITS_PER_INT / 2)) - 1))
#define DONE_PRIORITY -1
#define MAX_PRIORITY 0x7fffffff
#define TAIL_PRIORITY 0x7ffffffe
#define LAUNCH_PRIORITY 0x7f000001
#define DONE_PRIORITY_P(INSN) (INSN_PRIORITY (INSN) < 0)
#define LOW_PRIORITY_P(INSN) ((INSN_PRIORITY (INSN) & 0x7f000000) == 0)

/* Vector indexed by INSN_UID giving the number of insns referring to
   this insn.  */
static int *insn_ref_count;
#define INSN_REF_COUNT(INSN) (insn_ref_count[INSN_UID (INSN)])
/* Vector indexed by INSN_UID giving the line-number note in effect for
   each insn.  For line-number notes, this indicates whether the note
   may be reused.  */
static rtx *line_note;
#define LINE_NOTE(INSN) (line_note[INSN_UID (INSN)])

/* Vector indexed by basic block number giving the starting line-number
   for each basic block.  */
static rtx *line_note_head;

/* List of important notes we must keep around.  This is a pointer to the
   last element in the list.  */
static rtx note_list;
/* An instruction is ready to be scheduled when all insns preceding it
   have already been scheduled.  It is important to ensure that all
   insns which use its result will not be executed until its result
   has been computed.  An insn is maintained in one of four structures:

   (P) the "Pending" set of insns which cannot be scheduled until
   their dependencies have been satisfied.
   (Q) the "Queued" set of insns that can be scheduled when sufficient
   time has elapsed.
   (R) the "Ready" list of unscheduled, uncommitted insns.
   (S) the "Scheduled" list of insns.

   Initially, all insns are either "Pending" or "Ready" depending on
   whether their dependencies are satisfied.

   Insns move from the "Ready" list to the "Scheduled" list as they
   are committed to the schedule.  As this occurs, the insns in the
   "Pending" list have their dependencies satisfied and move to either
   the "Ready" list or the "Queued" set depending on whether
   sufficient time has passed to make them ready.  As time passes,
   insns move from the "Queued" set to the "Ready" list.  Insns may
   move from the "Ready" list to the "Queued" set if they are blocked
   due to a function unit conflict.
   The "Pending" list (P) consists of the insns in the INSN_DEPEND of
   the unscheduled insns, i.e., those that are ready, queued, and pending.
   The "Queued" set (Q) is implemented by the variable `insn_queue'.
   The "Ready" list (R) is implemented by the variables `ready' and
   `n_ready'.
   The "Scheduled" list (S) is the new insn chain built by this pass.

   The transition (R->S) is implemented in the scheduling loop in
   `schedule_block' when the best insn to schedule is chosen.
   The transition (R->Q) is implemented in `queue_insn' when an
   insn is found to have a function unit conflict with the already
   committed insns.
   The transitions (P->R and P->Q) are implemented in `schedule_insn' as
   insns move from the ready list to the scheduled list.
   The transition (Q->R) is implemented in `queue_to_ready' as time
   passes or stalls are introduced.  */
/* Implement a circular buffer to delay instructions until sufficient
   time has passed.  INSN_QUEUE_SIZE is a power of two larger than
   MAX_BLOCKAGE and MAX_READY_COST computed by genattr.c.  This is the
   longest time an insn may be queued.  */
static rtx insn_queue[INSN_QUEUE_SIZE];
static int q_ptr = 0;
static int q_size = 0;
#define NEXT_Q(X) (((X)+1) & (INSN_QUEUE_SIZE-1))
#define NEXT_Q_AFTER(X, C) (((X)+C) & (INSN_QUEUE_SIZE-1))
/* Vector indexed by INSN_UID giving the minimum clock tick at which
   the insn becomes ready.  This is used to note timing constraints for
   insns in the pending list.  */
static int *insn_tick;
#define INSN_TICK(INSN) (insn_tick[INSN_UID (INSN)])
/* Forward declarations.  */
static void add_dependence PROTO ((rtx, rtx, enum reg_note));
static void remove_dependence PROTO ((rtx, rtx));
static rtx find_insn_list PROTO ((rtx, rtx));
static int insn_unit PROTO ((rtx));
static unsigned int blockage_range PROTO ((int, rtx));
static void clear_units PROTO ((void));
static int actual_hazard_this_instance PROTO ((int, int, rtx, int, int));
static void schedule_unit PROTO ((int, rtx, int));
static int actual_hazard PROTO ((int, rtx, int, int));
static int potential_hazard PROTO ((int, rtx, int));
static int insn_cost PROTO ((rtx, rtx, rtx));
static int priority PROTO ((rtx));
static void free_pending_lists PROTO ((void));
static void add_insn_mem_dependence PROTO ((rtx *, rtx *, rtx, rtx));
static void flush_pending_lists PROTO ((rtx, int));
static void sched_analyze_1 PROTO ((rtx, rtx));
static void sched_analyze_2 PROTO ((rtx, rtx));
static void sched_analyze_insn PROTO ((rtx, rtx, rtx));
static void sched_analyze PROTO ((rtx, rtx));
static int rank_for_schedule PROTO ((const PTR, const PTR));
static void swap_sort PROTO ((rtx *, int));
static void queue_insn PROTO ((rtx, int));
static int schedule_insn PROTO ((rtx, rtx *, int, int));
static void find_insn_reg_weight PROTO ((int));
static int schedule_block PROTO ((int, int));
static char *safe_concat PROTO ((char *, char *, const char *));
static int insn_issue_delay PROTO ((rtx));
static void adjust_priority PROTO ((rtx));
/* Mapping of insns to their original block prior to scheduling.  */
static int *insn_orig_block;
#define INSN_BLOCK(insn) (insn_orig_block[INSN_UID (insn)])

/* Some insns (e.g. call) are not allowed to move across blocks.  */
static char *cant_move;
#define CANT_MOVE(insn) (cant_move[INSN_UID (insn)])
/* Control flow graph edges are kept in circular lists.  */
typedef struct
  {
    int from_block;
    int to_block;
    int next_in;
    int next_out;
  }
haifa_edge;
static haifa_edge *edge_table;

#define NEXT_IN(edge) (edge_table[edge].next_in)
#define NEXT_OUT(edge) (edge_table[edge].next_out)
#define FROM_BLOCK(edge) (edge_table[edge].from_block)
#define TO_BLOCK(edge) (edge_table[edge].to_block)

/* Number of edges in the control flow graph.  (In fact, larger than
   that by 1, since edge 0 is unused.)  */
static int nr_edges;

/* Circular list of incoming/outgoing edges of a block.  */
static int *in_edges;
static int *out_edges;

#define IN_EDGES(block) (in_edges[block])
#define OUT_EDGES(block) (out_edges[block])
static int is_cfg_nonregular PROTO ((void));
static int build_control_flow PROTO ((int_list_ptr *, int_list_ptr *,
				      int *, int *));
static void new_edge PROTO ((int, int));
/* A region is the main entity for interblock scheduling: insns
   are allowed to move between blocks in the same region, along
   control flow graph edges, in the 'up' direction.  */
typedef struct
  {
    int rgn_nr_blocks;  /* Number of blocks in region.  */
    int rgn_blocks;     /* cblocks in the region (actually index in rgn_bb_table).  */
  }
region;

/* Number of regions in the procedure.  */
static int nr_regions;

/* Table of region descriptions.  */
static region *rgn_table;

/* Array of lists of regions' blocks.  */
static int *rgn_bb_table;
/* Topological order of blocks in the region (if b2 is reachable from
   b1, block_to_bb[b2] > block_to_bb[b1]).  Note: A basic block is
   always referred to by either block or b, while its topological
   order name (in the region) is referred to by bb.  */
static int *block_to_bb;

/* The number of the region containing a block.  */
static int *containing_rgn;

#define RGN_NR_BLOCKS(rgn) (rgn_table[rgn].rgn_nr_blocks)
#define RGN_BLOCKS(rgn) (rgn_table[rgn].rgn_blocks)
#define BLOCK_TO_BB(block) (block_to_bb[block])
#define CONTAINING_RGN(block) (containing_rgn[block])
void debug_regions PROTO ((void));
static void find_single_block_region PROTO ((void));
static void find_rgns PROTO ((int_list_ptr *, int_list_ptr *,
			      int *, int *, sbitmap *));
static int too_large PROTO ((int, int *, int *));

extern void debug_live PROTO ((int, int));

/* Blocks of the current region being scheduled.  */
static int current_nr_blocks;
static int current_blocks;

/* The mapping from bb to block.  */
#define BB_TO_BLOCK(bb) (rgn_bb_table[current_blocks + (bb)])
/* Bit vectors and bitset operations are needed for computations on
   the control flow graph.  */

typedef unsigned HOST_WIDE_INT *bitset;
typedef struct
  {
    int *first_member;  /* Pointer to the list start in bitlst_table.  */
    int nr_members;     /* The number of members of the bit list.  */
  }
bitlst;

static int bitlst_table_last;
static int bitlst_table_size;
static int *bitlst_table;
static char bitset_member PROTO ((bitset, int, int));
static void extract_bitlst PROTO ((bitset, int, bitlst *));

/* Target info declarations.

   The block currently being scheduled is referred to as the "target" block,
   while other blocks in the region from which insns can be moved to the
   target are called "source" blocks.  The candidate structure holds info
   about such sources: are they valid?  Speculative?  Etc.  */
typedef bitlst bblst;
typedef struct
  {
    char is_valid;
    char is_speculative;
    int src_prob;
    bblst split_bbs;
    bblst update_bbs;
  }
candidate;

static candidate *candidate_table;
/* A speculative motion requires checking live information on the path
   from 'source' to 'target'.  The split blocks are those to be checked.
   After a speculative motion, live information should be modified in
   the update blocks.

   Lists of split and update blocks for each candidate of the current
   target are in array bblst_table.  */
static int *bblst_table, bblst_size, bblst_last;

#define IS_VALID(src) ( candidate_table[src].is_valid )
#define IS_SPECULATIVE(src) ( candidate_table[src].is_speculative )
#define SRC_PROB(src) ( candidate_table[src].src_prob )

/* The bb being currently scheduled.  */
static int target_bb;
typedef bitlst edgelst;

/* Target info functions.  */
static void split_edges PROTO ((int, int, edgelst *));
static void compute_trg_info PROTO ((int));
void debug_candidate PROTO ((int));
void debug_candidates PROTO ((int));
/* Bit-set of bbs, where bit 'i' stands for bb 'i'.  */
typedef bitset bbset;

/* Number of words of the bbset.  */
static int bbset_size;

/* Dominators array: dom[i] contains the bbset of dominators of
   bb i in the region.  */
static bbset *dom;

/* bb 0 is the only region entry.  */
#define IS_RGN_ENTRY(bb) (!bb)

/* Is bb_src dominated by bb_trg?  */
#define IS_DOMINATED(bb_src, bb_trg) \
  ( bitset_member (dom[bb_src], bb_trg, bbset_size) )
/* Probability: Prob[i] is a float in [0, 1] which is the probability
   of bb i relative to the region entry.  */
static float *prob;

/* The probability of bb_src, relative to bb_trg.  Note that while
   'prob[bb]' is a float in [0, 1], this macro returns an integer
   in [0, 100].  */
#define GET_SRC_PROB(bb_src, bb_trg) ((int) (100.0 * (prob[bb_src] / \
						      prob[bb_trg])))
/* Bit-set of edges, where bit i stands for edge i.  */
typedef bitset edgeset;

/* Number of edges in the region.  */
static int rgn_nr_edges;

/* Array of size rgn_nr_edges.  */
static int *rgn_edges;

/* Number of words in an edgeset.  */
static int edgeset_size;

/* Mapping from each edge in the graph to its number in the rgn.  */
static int *edge_to_bit;
#define EDGE_TO_BIT(edge) (edge_to_bit[edge])
/* The split edges of a source bb are different for each target
   bb.  In order to compute this efficiently, the 'potential-split edges'
   are computed for each bb prior to scheduling a region.  This is actually
   the split edges of each bb relative to the region entry.

   pot_split[bb] is the set of potential split edges of bb.  */
static edgeset *pot_split;

/* For every bb, a set of its ancestor edges.  */
static edgeset *ancestor_edges;
static void compute_dom_prob_ps PROTO ((int));

#define ABS_VALUE(x) (((x)<0)?(-(x)):(x))
#define INSN_PROBABILITY(INSN) (SRC_PROB (BLOCK_TO_BB (INSN_BLOCK (INSN))))
#define IS_SPECULATIVE_INSN(INSN) (IS_SPECULATIVE (BLOCK_TO_BB (INSN_BLOCK (INSN))))
#define INSN_BB(INSN) (BLOCK_TO_BB (INSN_BLOCK (INSN)))

/* Parameters affecting the decision of rank_for_schedule().  */
#define MIN_DIFF_PRIORITY 2
#define MIN_PROBABILITY 40
#define MIN_PROB_DIFF 10
/* Speculative scheduling functions.  */
static int check_live_1 PROTO ((int, rtx));
static void update_live_1 PROTO ((int, rtx));
static int check_live PROTO ((rtx, int));
static void update_live PROTO ((rtx, int));
static void set_spec_fed PROTO ((rtx));
static int is_pfree PROTO ((rtx, int, int));
static int find_conditional_protection PROTO ((rtx, int));
static int is_conditionally_protected PROTO ((rtx, int, int));
static int may_trap_exp PROTO ((rtx, int));
static int haifa_classify_insn PROTO ((rtx));
static int is_prisky PROTO ((rtx, int, int));
static int is_exception_free PROTO ((rtx, int, int));

static char find_insn_mem_list PROTO ((rtx, rtx, rtx, rtx));
static void compute_block_forward_dependences PROTO ((int));
static void init_rgn_data_dependences PROTO ((int));
static void add_branch_dependences PROTO ((rtx, rtx));
static void compute_block_backward_dependences PROTO ((int));
void debug_dependencies PROTO ((void));
/* Notes handling mechanism:
   =========================
   Generally, NOTES are saved before scheduling and restored after scheduling.
   The scheduler distinguishes between three types of notes:

   (1) LINE_NUMBER notes, generated and used for debugging.  Here,
   before scheduling a region, a pointer to the LINE_NUMBER note is
   added to the insn following it (in save_line_notes()), and the note
   is removed (in rm_line_notes() and unlink_line_notes()).  After
   scheduling the region, this pointer is used for regeneration of
   the LINE_NUMBER note (in restore_line_notes()).

   (2) LOOP_BEGIN, LOOP_END, SETJMP, EHREGION_BEG, EHREGION_END notes:
   Before scheduling a region, a pointer to the note is added to the insn
   that follows or precedes it.  (This happens as part of the data dependence
   computation.)  After scheduling an insn, the pointer contained in it is
   used for regenerating the corresponding note (in reemit_notes).

   (3) All other notes (e.g. INSN_DELETED):  Before scheduling a block,
   these notes are put in a list (in rm_other_notes() and
   unlink_other_notes ()).  After scheduling the block, these notes are
   inserted at the beginning of the block (in schedule_block()).  */
static rtx unlink_other_notes PROTO ((rtx, rtx));
static rtx unlink_line_notes PROTO ((rtx, rtx));
static void rm_line_notes PROTO ((int));
static void save_line_notes PROTO ((int));
static void restore_line_notes PROTO ((int));
static void rm_redundant_line_notes PROTO ((void));
static void rm_other_notes PROTO ((rtx, rtx));
static rtx reemit_notes PROTO ((rtx, rtx));

static void get_block_head_tail PROTO ((int, rtx *, rtx *));

static int queue_to_ready PROTO ((rtx [], int));

static void debug_ready_list PROTO ((rtx[], int));
static void init_target_units PROTO ((void));
static void insn_print_units PROTO ((rtx));
static int get_visual_tbl_length PROTO ((void));
static void init_block_visualization PROTO ((void));
static void print_block_visualization PROTO ((int, const char *));
static void visualize_scheduled_insns PROTO ((int, int));
static void visualize_no_unit PROTO ((rtx));
static void visualize_stall_cycles PROTO ((int, int));
static void print_exp PROTO ((char *, rtx, int));
static void print_value PROTO ((char *, rtx, int));
static void print_pattern PROTO ((char *, rtx, int));
static void print_insn PROTO ((char *, rtx, int));
void debug_reg_vector PROTO ((regset));

static rtx move_insn1 PROTO ((rtx, rtx));
static rtx move_insn PROTO ((rtx, rtx));
static rtx group_leader PROTO ((rtx));
static int set_priorities PROTO ((int));
static void init_rtx_vector PROTO ((rtx **, rtx *, int, int));
static void schedule_region PROTO ((int));
#endif /* INSN_SCHEDULING */

#define SIZE_FOR_MODE(X) (GET_MODE_SIZE (GET_MODE (X)))
/* Add ELEM wrapped in an INSN_LIST with reg note kind DEP_TYPE to the
   LOG_LINKS of INSN, if not already there.  DEP_TYPE indicates the type
   of dependence that this link represents.  */

static void
add_dependence (insn, elem, dep_type)
     rtx insn;
     rtx elem;
     enum reg_note dep_type;
{
  rtx link, next;

  /* Don't depend an insn on itself.  */
  if (insn == elem)
    return;

  /* We can get a dependency on deleted insns due to optimizations in
     the register allocation and reloading or due to splitting.  Any
     such dependency is useless and can be ignored.  */
  if (GET_CODE (elem) == NOTE)
    return;

  /* If elem is part of a sequence that must be scheduled together, then
     make the dependence point to the last insn of the sequence.
     When HAVE_cc0, it is possible for NOTEs to exist between users and
     setters of the condition codes, so we must skip past notes here.
     Otherwise, NOTEs are impossible here.  */

  next = NEXT_INSN (elem);

#ifdef HAVE_cc0
  while (next && GET_CODE (next) == NOTE)
    next = NEXT_INSN (next);
#endif

  if (next && SCHED_GROUP_P (next)
      && GET_CODE (next) != CODE_LABEL)
    {
      /* Notes will never intervene here though, so don't bother checking
         for them.  */
      /* We must reject CODE_LABELs, so that we don't get confused by one
         that has LABEL_PRESERVE_P set, which is represented by the same
         bit in the rtl as SCHED_GROUP_P.  A CODE_LABEL can never be
         SCHED_GROUP_P.  */
      while (NEXT_INSN (next) && SCHED_GROUP_P (NEXT_INSN (next))
	     && GET_CODE (NEXT_INSN (next)) != CODE_LABEL)
	next = NEXT_INSN (next);

      /* Again, don't depend an insn on itself.  */
      if (insn == next)
	return;

      /* Make the dependence to NEXT, the last insn of the group, instead
         of the original ELEM.  */
      elem = next;
    }

#ifdef INSN_SCHEDULING
  /* (This code is guarded by INSN_SCHEDULING, otherwise INSN_BB is undefined.)
     No need for interblock dependences with calls, since
     calls are not moved between blocks.  Note: the edge where
     elem is a CALL is still required.  */
  if (GET_CODE (insn) == CALL_INSN
      && (INSN_BB (elem) != INSN_BB (insn)))
    return;
#endif

  /* Check that we don't already have this dependence.  */
  for (link = LOG_LINKS (insn); link; link = XEXP (link, 1))
    if (XEXP (link, 0) == elem)
      {
	/* If this is a more restrictive type of dependence than the existing
	   one, then change the existing dependence to this type.  */
	if ((int) dep_type < (int) REG_NOTE_KIND (link))
	  PUT_REG_NOTE_KIND (link, dep_type);
	return;
      }
  /* Might want to check one level of transitivity to save conses.  */

  link = alloc_INSN_LIST (elem, LOG_LINKS (insn));
  LOG_LINKS (insn) = link;

  /* Insn dependency, not data dependency.  */
  PUT_REG_NOTE_KIND (link, dep_type);
}
/* Remove ELEM wrapped in an INSN_LIST from the LOG_LINKS
   of INSN.  Abort if not found.  */

static void
remove_dependence (insn, elem)
     rtx insn;
     rtx elem;
{
  rtx prev, link, next;
  int found = 0;

  for (prev = 0, link = LOG_LINKS (insn); link; link = next)
    {
      next = XEXP (link, 1);
      if (XEXP (link, 0) == elem)
	{
	  if (prev)
	    XEXP (prev, 1) = next;
	  else
	    LOG_LINKS (insn) = next;
	  free_INSN_LIST_node (link);
	  found = 1;
	}
      else
	prev = link;
    }

  if (!found)
    abort ();
}
#ifndef INSN_SCHEDULING
void
schedule_insns (dump_file)
     FILE *dump_file;
{
}
#else

#define HAIFA_INLINE __inline
/* Computation of memory dependencies.  */

/* The *_insns and *_mems are paired lists.  Each pending memory operation
   will have a pointer to the MEM rtx on one list and a pointer to the
   containing insn on the other list in the same place in the list.  */

/* We can't use add_dependence like the old code did, because a single insn
   may have multiple memory accesses, and hence needs to be on the list
   once for each memory access.  Add_dependence won't let you add an insn
   to a list more than once.  */

/* An INSN_LIST containing all insns with pending read operations.  */
static rtx pending_read_insns;

/* An EXPR_LIST containing all MEM rtx's which are pending reads.  */
static rtx pending_read_mems;

/* An INSN_LIST containing all insns with pending write operations.  */
static rtx pending_write_insns;

/* An EXPR_LIST containing all MEM rtx's which are pending writes.  */
static rtx pending_write_mems;

/* Indicates the combined length of the two pending lists.  We must prevent
   these lists from ever growing too large since the number of dependencies
   produced is at least O(N*N), and execution time is at least O(4*N*N), as
   a function of the length of these pending lists.  */

static int pending_lists_length;
883 /* The last insn upon which all memory references must depend.
884 This is an insn which flushed the pending lists, creating a dependency
885 between it and all previously pending memory references. This creates
886 a barrier (or a checkpoint) which no memory reference is allowed to cross.
888 This includes all non-constant CALL_INSNs. When we do interprocedural
889 alias analysis, this restriction can be relaxed.
890 This may also be an INSN that writes memory if the pending lists grow
893 static rtx last_pending_memory_flush;
895 /* The last function call we have seen. All hard regs, and, of course,
896 the last function call, must depend on this. */
898 static rtx last_function_call;
900 /* The LOG_LINKS field of this is a list of insns which use a pseudo register
901 that does not already cross a call. We create dependencies between each
902 of those insns and the next call insn, to ensure that they won't cross a call
903 after scheduling is done. */
905 static rtx sched_before_next_call;
907 /* Pointer to the last instruction scheduled. Used by rank_for_schedule,
908 so that insns independent of the last scheduled insn will be preferred
909 over dependent instructions. */
911 static rtx last_scheduled_insn;
913 /* Data structures for the computation of data dependences in a region. We
914 keep one copy of each of the declared above variables for each bb in the
915 region. Before analyzing the data dependences for a bb, its variables
916 are initialized as a function of the variables of its predecessors. When
917 the analysis for a bb completes, we save the contents of each variable X
918 to a corresponding bb_X[bb] variable. For example, pending_read_insns is
919 copied to bb_pending_read_insns[bb]. Another change is that a few
920 variables are now lists of insns rather than a single insn:
921 last_pending_memory_flush, last_function_call, reg_last_sets. The
922 manipulation of these variables was changed appropriately. */
924 static rtx **bb_reg_last_uses;
925 static rtx **bb_reg_last_sets;
926 static rtx **bb_reg_last_clobbers;
928 static rtx *bb_pending_read_insns;
929 static rtx *bb_pending_read_mems;
930 static rtx *bb_pending_write_insns;
931 static rtx *bb_pending_write_mems;
932 static int *bb_pending_lists_length;
934 static rtx *bb_last_pending_memory_flush;
935 static rtx *bb_last_function_call;
936 static rtx *bb_sched_before_next_call;
938 /* Functions for construction of the control flow graph. */
940 /* Return 1 if control flow graph should not be constructed, 0 otherwise.
942 We decide not to build the control flow graph if there is possibly more
943 than one entry to the function, if computed branches exist, or if we
944 have nonlocal gotos. */
953 /* If we have a label that could be the target of a nonlocal goto, then
954 the cfg is not well structured. */
955 if (nonlocal_goto_handler_labels)
958 /* If we have any forced labels, then the cfg is not well structured. */
962 /* If this function has a computed jump, then we consider the cfg
963 not well structured. */
964 if (current_function_has_computed_jump)
967 /* If we have exception handlers, then we consider the cfg not well
968 structured. ?!? We should be able to handle this now that flow.c
969 computes an accurate cfg for EH. */
970 if (exception_handler_labels)
973 /* If we have non-jumping insns which refer to labels, then we consider
974 the cfg not well structured. */
975 /* Check for labels referred to other than by jumps. */
976 for (b = 0; b < n_basic_blocks; b++)
977 for (insn = BLOCK_HEAD (b);; insn = NEXT_INSN (insn))
979 code = GET_CODE (insn);
980 if (GET_RTX_CLASS (code) == 'i')
984 for (note = REG_NOTES (insn); note; note = XEXP (note, 1))
985 if (REG_NOTE_KIND (note) == REG_LABEL)
989 if (insn == BLOCK_END (b))
993 /* All the tests passed. Consider the cfg well structured. */
997 /* Build the control flow graph and set nr_edges.
999 Instead of trying to build a cfg ourselves, we rely on flow to
1000 do it for us. Stamp out useless code (and bug) duplication.
1002 Return nonzero if an irregularity in the cfg is found which would
1003 prevent cross block scheduling. */
1006 build_control_flow (s_preds, s_succs, num_preds, num_succs)
1007 int_list_ptr *s_preds;
1008 int_list_ptr *s_succs;
1016 /* Count the number of edges in the cfg. */
1019 for (i = 0; i < n_basic_blocks; i++)
1021 nr_edges += num_succs[i];
1023 /* Unreachable loops with more than one basic block are detected
1024 during the DFS traversal in find_rgns.
1026 Unreachable loops with a single block are detected here. This
1027 test is redundant with the one in find_rgns, but it's much
1028 cheaper to go ahead and catch the trivial case here. */
1029 if (num_preds[i] == 0
1030 || (num_preds[i] == 1 && INT_LIST_VAL (s_preds[i]) == i))
1034 /* Account for entry/exit edges. */
1037 in_edges = (int *) xcalloc (n_basic_blocks, sizeof (int));
1038 out_edges = (int *) xcalloc (n_basic_blocks, sizeof (int));
1039 edge_table = (haifa_edge *) xcalloc (nr_edges, sizeof (haifa_edge));
1042 for (i = 0; i < n_basic_blocks; i++)
1043 for (succ = s_succs[i]; succ; succ = succ->next)
1045 if (INT_LIST_VAL (succ) != EXIT_BLOCK)
1046 new_edge (i, INT_LIST_VAL (succ));
1049 /* Increment by 1, since edge 0 is unused. */
1056 /* Record an edge in the control flow graph from SOURCE to TARGET.
1058 In theory, this is redundant with the s_succs computed above, but
1059 we have not converted all of haifa to use information from the
1063 new_edge (source, target)
1067 int curr_edge, fst_edge;
1069 /* Check for duplicates. */
1070 fst_edge = curr_edge = OUT_EDGES (source);
1073 if (FROM_BLOCK (curr_edge) == source
1074 && TO_BLOCK (curr_edge) == target)
1079 curr_edge = NEXT_OUT (curr_edge);
1081 if (fst_edge == curr_edge)
1087 FROM_BLOCK (e) = source;
1088 TO_BLOCK (e) = target;
1090 if (OUT_EDGES (source))
1092 next_edge = NEXT_OUT (OUT_EDGES (source));
1093 NEXT_OUT (OUT_EDGES (source)) = e;
1094 NEXT_OUT (e) = next_edge;
1098 OUT_EDGES (source) = e;
1102 if (IN_EDGES (target))
1104 next_edge = NEXT_IN (IN_EDGES (target));
1105 NEXT_IN (IN_EDGES (target)) = e;
1106 NEXT_IN (e) = next_edge;
1110 IN_EDGES (target) = e;
1116 /* BITSET macros for operations on the control flow graph. */
1118 /* Compute bitwise union of two bitsets. */
1119 #define BITSET_UNION(set1, set2, len) \
1120 do { register bitset tp = set1, sp = set2; \
1122 for (i = 0; i < len; i++) \
1123 *(tp++) |= *(sp++); } while (0)
1125 /* Compute bitwise intersection of two bitsets. */
1126 #define BITSET_INTER(set1, set2, len) \
1127 do { register bitset tp = set1, sp = set2; \
1129 for (i = 0; i < len; i++) \
1130 *(tp++) &= *(sp++); } while (0)
1132 /* Compute bitwise difference of two bitsets. */
1133 #define BITSET_DIFFER(set1, set2, len) \
1134 do { register bitset tp = set1, sp = set2; \
1136 for (i = 0; i < len; i++) \
1137 *(tp++) &= ~*(sp++); } while (0)
1139 /* Inverts every bit of bitset 'set'. */
1140 #define BITSET_INVERT(set, len) \
1141 do { register bitset tmpset = set; \
1143 for (i = 0; i < len; i++, tmpset++) \
1144 *tmpset = ~*tmpset; } while (0)
1146 /* Turn on the index'th bit in bitset set. */
1147 #define BITSET_ADD(set, index, len) \
1149 if (index >= HOST_BITS_PER_WIDE_INT * len) \
1152 set[index/HOST_BITS_PER_WIDE_INT] |= \
1153 1 << (index % HOST_BITS_PER_WIDE_INT); \
1156 /* Turn off the index'th bit in set. */
1157 #define BITSET_REMOVE(set, index, len) \
1159 if (index >= HOST_BITS_PER_WIDE_INT * len) \
1162 set[index/HOST_BITS_PER_WIDE_INT] &= \
1163 ~(1 << (index%HOST_BITS_PER_WIDE_INT)); \
1167 /* Check if the index'th bit in bitset set is on. */
1170 bitset_member (set, index, len)
1174 if (index >= HOST_BITS_PER_WIDE_INT * len)
1176 return (set[index / HOST_BITS_PER_WIDE_INT] &
1177 1 << (index % HOST_BITS_PER_WIDE_INT)) ? 1 : 0;
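The BITSET_* macros and bitset_member amount to ordinary bit operations over a word array. A standalone sketch, with unsigned long standing in for HOST_WIDE_INT and LEN counting words rather than bits:

```c
#include <assert.h>

#define WORD_BITS (sizeof (unsigned long) * 8)

/* Turn on bit INDEX.  */
static void
bitset_add (unsigned long *set, int index)
{
  set[index / WORD_BITS] |= 1UL << (index % WORD_BITS);
}

/* Return nonzero if bit INDEX is on.  */
static int
bitset_test (const unsigned long *set, int index)
{
  return (set[index / WORD_BITS] >> (index % WORD_BITS)) & 1;
}

/* DST |= SRC, word by word.  */
static void
bitset_union (unsigned long *dst, const unsigned long *src, int len)
{
  int i;
  for (i = 0; i < len; i++)
    dst[i] |= src[i];
}

/* DST &= ~SRC, word by word.  */
static void
bitset_differ (unsigned long *dst, const unsigned long *src, int len)
{
  int i;
  for (i = 0; i < len; i++)
    dst[i] &= ~src[i];
}
```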
1181 /* Translate a bit-set SET to a list BL of the bit-set members. */
1184 extract_bitlst (set, len, bl)
1190 unsigned HOST_WIDE_INT word;
1192 /* bitlst table space is reused in each call to extract_bitlst. */
1193 bitlst_table_last = 0;
1195 bl->first_member = &bitlst_table[bitlst_table_last];
1198 for (i = 0; i < len; i++)
1201 offset = i * HOST_BITS_PER_WIDE_INT;
1202 for (j = 0; word; j++)
1206 bitlst_table[bitlst_table_last++] = offset;
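The inner loop of extract_bitlst shifts each word right, recording the bit index whenever the low bit is set. A self-contained version that returns the members through a plain array (the names here are illustrative, not the pass's own):

```c
#include <assert.h>

/* Collect the indices of all set bits of SET (LEN words) into OUT;
   return the member count.  */
static int
extract_bits (const unsigned long *set, int len, int *out)
{
  int i, n = 0;
  for (i = 0; i < len; i++)
    {
      unsigned long word = set[i];
      int offset = i * (int) (sizeof (unsigned long) * 8);
      for (; word; offset++, word >>= 1)
        if (word & 1)
          out[n++] = offset;
    }
  return n;
}
```

The loop over WORD terminates as soon as no higher bits remain, so sparse sets are cheap to scan.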
1217 /* Functions for the construction of regions. */
1219 /* Print the regions, for debugging purposes. Callable from debugger. */
1226 fprintf (dump, "\n;; ------------ REGIONS ----------\n\n");
1227 for (rgn = 0; rgn < nr_regions; rgn++)
1229 fprintf (dump, ";;\trgn %d nr_blocks %d:\n", rgn,
1230 rgn_table[rgn].rgn_nr_blocks);
1231 fprintf (dump, ";;\tbb/block: ");
1233 for (bb = 0; bb < rgn_table[rgn].rgn_nr_blocks; bb++)
1235 current_blocks = RGN_BLOCKS (rgn);
1237 if (bb != BLOCK_TO_BB (BB_TO_BLOCK (bb)))
1240 fprintf (dump, " %d/%d ", bb, BB_TO_BLOCK (bb));
1243 fprintf (dump, "\n\n");
1248 /* Build a single block region for each basic block in the function.
1249 This allows for using the same code for interblock and basic block
1253 find_single_block_region ()
1257 for (i = 0; i < n_basic_blocks; i++)
1259 rgn_bb_table[i] = i;
1260 RGN_NR_BLOCKS (i) = 1;
1262 CONTAINING_RGN (i) = i;
1263 BLOCK_TO_BB (i) = 0;
1265 nr_regions = n_basic_blocks;
1269 /* Update number of blocks and the estimate for number of insns
1270 in the region. Return 1 if the region is "too large" for interblock
1271 scheduling (compile time considerations), otherwise return 0. */
1274 too_large (block, num_bbs, num_insns)
1275 int block, *num_bbs, *num_insns;
1278 (*num_insns) += (INSN_LUID (BLOCK_END (block)) -
1279 INSN_LUID (BLOCK_HEAD (block)));
1280 if ((*num_bbs > MAX_RGN_BLOCKS) || (*num_insns > MAX_RGN_INSNS))
1287 /* Update_loop_relations(blk, hdr): Check if the loop headed by max_hdr[blk]
1288 is still an inner loop. Put in max_hdr[blk] the header of the
1289 innermost loop containing blk. */
1290 #define UPDATE_LOOP_RELATIONS(blk, hdr) \
1292 if (max_hdr[blk] == -1) \
1293 max_hdr[blk] = hdr; \
1294 else if (dfs_nr[max_hdr[blk]] > dfs_nr[hdr]) \
1295 RESET_BIT (inner, hdr); \
1296 else if (dfs_nr[max_hdr[blk]] < dfs_nr[hdr]) \
1298 RESET_BIT (inner,max_hdr[blk]); \
1299 max_hdr[blk] = hdr; \
1304 /* Find regions for interblock scheduling.
1306 A region for scheduling can be:
1308 * A loop-free procedure, or
1310 * A reducible inner loop, or
1312 * A basic block not contained in any other region.
1315 ?!? In theory we could build other regions based on extended basic
1316 blocks or reverse extended basic blocks. Is it worth the trouble?
1318 Loop blocks that form a region are put into the region's block list
1319 in topological order.
1321 This procedure stores its results into the following global (ick) variables
1330 We use dominator relationships to avoid making regions out of non-reducible
1333 This procedure needs to be converted to work on pred/succ lists instead
1334 of edge tables. That would simplify it somewhat. */
1337 find_rgns (s_preds, s_succs, num_preds, num_succs, dom)
1338 int_list_ptr *s_preds;
1339 int_list_ptr *s_succs;
1344 int *max_hdr, *dfs_nr, *stack, *queue, *degree;
1346 int node, child, loop_head, i, head, tail;
1347 int count = 0, sp, idx = 0, current_edge = out_edges[0];
1348 int num_bbs, num_insns, unreachable;
1349 int too_large_failure;
1351 /* Note if an edge has been passed. */
1354 /* Note if a block is a natural loop header. */
1357 /* Note if a block is a natural inner loop header. */
1360 /* Note if a block is in the block queue. */
1363 /* Note if a block is in the block stack. */
1366 /* Perform a DFS traversal of the cfg. Identify loop headers, inner loops
1367 and a mapping from block to its loop header (if the block is contained
1368 in a loop, else -1).
1370 Store results in HEADER, INNER, and MAX_HDR respectively; these will
1371 be used as inputs to the second traversal.
1373 STACK, SP and DFS_NR are only used during the first traversal. */
1375 /* Allocate and initialize variables for the first traversal. */
1376 max_hdr = (int *) alloca (n_basic_blocks * sizeof (int));
1377 dfs_nr = (int *) alloca (n_basic_blocks * sizeof (int));
1378 bzero ((char *) dfs_nr, n_basic_blocks * sizeof (int));
1379 stack = (int *) alloca (nr_edges * sizeof (int));
1381 inner = sbitmap_alloc (n_basic_blocks);
1382 sbitmap_ones (inner);
1384 header = sbitmap_alloc (n_basic_blocks);
1385 sbitmap_zero (header);
1387 passed = sbitmap_alloc (nr_edges);
1388 sbitmap_zero (passed);
1390 in_queue = sbitmap_alloc (n_basic_blocks);
1391 sbitmap_zero (in_queue);
1393 in_stack = sbitmap_alloc (n_basic_blocks);
1394 sbitmap_zero (in_stack);
1396 for (i = 0; i < n_basic_blocks; i++)
1399 /* DFS traversal to find inner loops in the cfg. */
1404 if (current_edge == 0 || TEST_BIT (passed, current_edge))
1406 /* We have reached a leaf node or a node that was already
1407 processed. Pop edges off the stack until we find
1408 an edge that has not yet been processed. */
1410 && (current_edge == 0 || TEST_BIT (passed, current_edge)))
1412 /* Pop entry off the stack. */
1413 current_edge = stack[sp--];
1414 node = FROM_BLOCK (current_edge);
1415 child = TO_BLOCK (current_edge);
1416 RESET_BIT (in_stack, child);
1417 if (max_hdr[child] >= 0 && TEST_BIT (in_stack, max_hdr[child]))
1418 UPDATE_LOOP_RELATIONS (node, max_hdr[child]);
1419 current_edge = NEXT_OUT (current_edge);
1422 /* See if we have finished the DFS tree traversal. */
1423 if (sp < 0 && TEST_BIT (passed, current_edge))
1426 /* Nope, continue the traversal with the popped node. */
1430 /* Process a node. */
1431 node = FROM_BLOCK (current_edge);
1432 child = TO_BLOCK (current_edge);
1433 SET_BIT (in_stack, node);
1434 dfs_nr[node] = ++count;
1436 /* If the successor is in the stack, then we've found a loop.
1437 Mark the loop, if it is not a natural loop, then it will
1438 be rejected during the second traversal. */
1439 if (TEST_BIT (in_stack, child))
1442 SET_BIT (header, child);
1443 UPDATE_LOOP_RELATIONS (node, child);
1444 SET_BIT (passed, current_edge);
1445 current_edge = NEXT_OUT (current_edge);
1449 /* If the child was already visited, then there is no need to visit
1450 it again. Just update the loop relationships and restart
1454 if (max_hdr[child] >= 0 && TEST_BIT (in_stack, max_hdr[child]))
1455 UPDATE_LOOP_RELATIONS (node, max_hdr[child]);
1456 SET_BIT (passed, current_edge);
1457 current_edge = NEXT_OUT (current_edge);
1461 /* Push an entry on the stack and continue DFS traversal. */
1462 stack[++sp] = current_edge;
1463 SET_BIT (passed, current_edge);
1464 current_edge = OUT_EDGES (child);
1466 /* This is temporary until haifa is converted to use rth's new
1467 cfg routines which have true entry/exit blocks and the
1468 appropriate edges from/to those blocks.
1470 Generally we update dfs_nr for a node when we process its
1471 out edge. However, if the node has no out edge then we will
1472 not set dfs_nr for that node. This can confuse the scheduler
1473 into thinking that we have unreachable blocks, which in turn
1474 disables cross block scheduling.
1476 So, if we have a node with no out edges, go ahead and mark it
1477 as reachable now. */
1478 if (current_edge == 0)
1479 dfs_nr[child] = ++count;
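The back-edge test driving the traversal above — an edge whose target is currently on the DFS stack identifies that target as a loop header — can be shown on a toy cfg. This sketch uses recursion and an adjacency matrix for brevity, where the pass uses an explicit edge stack:

```c
#include <assert.h>

#define MAXBB 16

/* Toy cfg: edge[i][j] nonzero means an edge from block i to block j.  */
static int edge[MAXBB][MAXBB];
static int in_stack[MAXBB], visited[MAXBB], header[MAXBB];

/* DFS from NODE; an edge to a block still on the DFS stack is a back
   edge, and its target is marked as a loop header.  */
static void
dfs (int node, int nblocks)
{
  int child;
  visited[node] = in_stack[node] = 1;
  for (child = 0; child < nblocks; child++)
    if (edge[node][child])
      {
        if (in_stack[child])
          header[child] = 1;
        else if (!visited[child])
          dfs (child, nblocks);
      }
  in_stack[node] = 0;
}
```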
1482 /* Another check for unreachable blocks. The earlier test in
1483 is_cfg_nonregular only finds unreachable blocks that do not
1486 The DFS traversal will mark every block that is reachable from
1487 the entry node by placing a nonzero value in dfs_nr. Thus if
1488 dfs_nr is zero for any block, then it must be unreachable. */
1490 for (i = 0; i < n_basic_blocks; i++)
1497 /* Gross. To avoid wasting memory, the second pass uses the dfs_nr array
1498 to hold degree counts. */
1501 /* Compute the in-degree of every block in the graph. */
1502 for (i = 0; i < n_basic_blocks; i++)
1503 degree[i] = num_preds[i];
1505 /* Do not perform region scheduling if there are any unreachable
1510 SET_BIT (header, 0);
1512 /* Second traversal: find reducible inner loops and topologically sort
1513 blocks of each region. */
1515 queue = (int *) alloca (n_basic_blocks * sizeof (int));
1517 /* Find blocks which are inner loop headers. We still have non-reducible
1518 loops to consider at this point. */
1519 for (i = 0; i < n_basic_blocks; i++)
1521 if (TEST_BIT (header, i) && TEST_BIT (inner, i))
1526 /* Now check that the loop is reducible. We do this separately
1527 from finding inner loops so that we do not find a reducible
1528 loop which contains an inner non-reducible loop.
1530 A simple way to find reducible/natural loops is to verify
1531 that each block in the loop is dominated by the loop
1534 If there exists a block that is not dominated by the loop
1535 header, then the block is reachable from outside the loop
1536 and thus the loop is not a natural loop. */
1537 for (j = 0; j < n_basic_blocks; j++)
1539 /* First identify blocks in the loop, except for the loop
1541 if (i == max_hdr[j] && i != j)
1543 /* Now verify that the block is dominated by the loop
1545 if (!TEST_BIT (dom[j], i))
1550 /* If we exited the loop early, then I is the header of
1551 a non-reducible loop and we should quit processing it
1553 if (j != n_basic_blocks)
1556 /* I is a header of an inner loop, or block 0 in a subroutine
1557 with no loops at all. */
1559 too_large_failure = 0;
1560 loop_head = max_hdr[i];
1562 /* Decrease degree of all I's successors for topological
1564 for (ps = s_succs[i]; ps; ps = ps->next)
1565 if (INT_LIST_VAL (ps) != EXIT_BLOCK
1566 && INT_LIST_VAL (ps) != ENTRY_BLOCK)
1567 --degree[INT_LIST_VAL(ps)];
1569 /* Estimate # insns, and count # blocks in the region. */
1571 num_insns = (INSN_LUID (BLOCK_END (i))
1572 - INSN_LUID (BLOCK_HEAD (i)));
1575 /* Find all loop latches (blocks with back edges to the loop
1576 header), or all the leaf blocks if the cfg has no loops.
1578 Place those blocks into the queue. */
1581 for (j = 0; j < n_basic_blocks; j++)
1582 /* Leaf nodes have only a single successor which must
1584 if (num_succs[j] == 1
1585 && INT_LIST_VAL (s_succs[j]) == EXIT_BLOCK)
1588 SET_BIT (in_queue, j);
1590 if (too_large (j, &num_bbs, &num_insns))
1592 too_large_failure = 1;
1601 for (ps = s_preds[i]; ps; ps = ps->next)
1603 node = INT_LIST_VAL (ps);
1605 if (node == ENTRY_BLOCK || node == EXIT_BLOCK)
1608 if (max_hdr[node] == loop_head && node != i)
1610 /* This is a loop latch. */
1611 queue[++tail] = node;
1612 SET_BIT (in_queue, node);
1614 if (too_large (node, &num_bbs, &num_insns))
1616 too_large_failure = 1;
1624 /* Now add all the blocks in the loop to the queue.
1626 We know the loop is a natural loop; however the algorithm
1627 above will not always mark certain blocks as being in the
1636 The algorithm in the DFS traversal may not mark B & D as part
1637 of the loop (i.e., they will not have max_hdr set to A).
1639 We know they cannot be loop latches (else they would have
1640 had max_hdr set since they'd have a backedge to a dominator
1641 block). So we don't need them on the initial queue.
1643 We know they are part of the loop because they are dominated
1644 by the loop header and can be reached by a backwards walk of
1645 the edges starting with nodes on the initial queue.
1647 It is safe and desirable to include those nodes in the
1648 loop/scheduling region. To do so we would need to decrease
1649 the degree of a node if it is the target of a backedge
1650 within the loop itself as the node is placed in the queue.
1652 We do not do this because I'm not sure that the actual
1653 scheduling code will properly handle this case. ?!? */
1655 while (head < tail && !too_large_failure)
1658 child = queue[++head];
1660 for (ps = s_preds[child]; ps; ps = ps->next)
1662 node = INT_LIST_VAL (ps);
1664 /* See discussion above about nodes not marked as in
1665 this loop during the initial DFS traversal. */
1666 if (node == ENTRY_BLOCK || node == EXIT_BLOCK
1667 || max_hdr[node] != loop_head)
1672 else if (!TEST_BIT (in_queue, node) && node != i)
1674 queue[++tail] = node;
1675 SET_BIT (in_queue, node);
1677 if (too_large (node, &num_bbs, &num_insns))
1679 too_large_failure = 1;
1686 if (tail >= 0 && !too_large_failure)
1688 /* Place the loop header into list of region blocks. */
1690 rgn_bb_table[idx] = i;
1691 RGN_NR_BLOCKS (nr_regions) = num_bbs;
1692 RGN_BLOCKS (nr_regions) = idx++;
1693 CONTAINING_RGN (i) = nr_regions;
1694 BLOCK_TO_BB (i) = count = 0;
1696 /* Remove blocks from queue[] when their in degree
1697 becomes zero. Repeat until no blocks are left on the
1698 list. This produces a topological list of blocks in
1706 child = queue[head];
1707 if (degree[child] == 0)
1710 rgn_bb_table[idx++] = child;
1711 BLOCK_TO_BB (child) = ++count;
1712 CONTAINING_RGN (child) = nr_regions;
1713 queue[head] = queue[tail--];
1715 for (ps = s_succs[child]; ps; ps = ps->next)
1716 if (INT_LIST_VAL (ps) != ENTRY_BLOCK
1717 && INT_LIST_VAL (ps) != EXIT_BLOCK)
1718 --degree[INT_LIST_VAL (ps)];
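The zero-in-degree removal loop above is a Kahn-style topological sort. A minimal standalone version over a small adjacency matrix (the representation is illustrative; the pass works on its pred/succ lists):

```c
#include <assert.h>

#define MAXN 16

/* Topologically order the N nodes of EDGE into ORDER by repeatedly
   removing zero-in-degree nodes.  Returns the number of nodes
   ordered; fewer than N means the graph has a cycle.  */
static int
topo_sort (int n, int edge[MAXN][MAXN], int *order)
{
  int degree[MAXN], queue[MAXN];
  int head = 0, tail = 0, count = 0, i, j;

  /* Compute the in-degree of every node.  */
  for (j = 0; j < n; j++)
    {
      degree[j] = 0;
      for (i = 0; i < n; i++)
        degree[j] += edge[i][j];
    }

  /* Seed the queue with nodes that have no predecessors.  */
  for (i = 0; i < n; i++)
    if (degree[i] == 0)
      queue[tail++] = i;

  while (head < tail)
    {
      int node = queue[head++];
      order[count++] = node;
      for (j = 0; j < n; j++)
        if (edge[node][j] && --degree[j] == 0)
          queue[tail++] = j;
    }
  return count;
}
```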
1729 /* Any block that did not end up in a region is placed into a region
1731 for (i = 0; i < n_basic_blocks; i++)
1734 rgn_bb_table[idx] = i;
1735 RGN_NR_BLOCKS (nr_regions) = 1;
1736 RGN_BLOCKS (nr_regions) = idx++;
1737 CONTAINING_RGN (i) = nr_regions++;
1738 BLOCK_TO_BB (i) = 0;
1749 /* Functions for region scheduling information. */
1751 /* Compute dominators, probability, and potential-split-edges of bb.
1752 Assume that these values were already computed for bb's predecessors. */
1755 compute_dom_prob_ps (bb)
1758 int nxt_in_edge, fst_in_edge, pred;
1759 int fst_out_edge, nxt_out_edge, nr_out_edges, nr_rgn_out_edges;
1762 if (IS_RGN_ENTRY (bb))
1764 BITSET_ADD (dom[bb], 0, bbset_size);
1769 fst_in_edge = nxt_in_edge = IN_EDGES (BB_TO_BLOCK (bb));
1771 /* Initialize dom[bb] to '111..1'. */
1772 BITSET_INVERT (dom[bb], bbset_size);
1776 pred = FROM_BLOCK (nxt_in_edge);
1777 BITSET_INTER (dom[bb], dom[BLOCK_TO_BB (pred)], bbset_size);
1779 BITSET_UNION (ancestor_edges[bb], ancestor_edges[BLOCK_TO_BB (pred)],
1782 BITSET_ADD (ancestor_edges[bb], EDGE_TO_BIT (nxt_in_edge), edgeset_size);
1785 nr_rgn_out_edges = 0;
1786 fst_out_edge = OUT_EDGES (pred);
1787 nxt_out_edge = NEXT_OUT (fst_out_edge);
1788 BITSET_UNION (pot_split[bb], pot_split[BLOCK_TO_BB (pred)],
1791 BITSET_ADD (pot_split[bb], EDGE_TO_BIT (fst_out_edge), edgeset_size);
1793 /* The successor doesn't belong in the region? */
1794 if (CONTAINING_RGN (TO_BLOCK (fst_out_edge)) !=
1795 CONTAINING_RGN (BB_TO_BLOCK (bb)))
1798 while (fst_out_edge != nxt_out_edge)
1801 /* The successor doesn't belong in the region? */
1802 if (CONTAINING_RGN (TO_BLOCK (nxt_out_edge)) !=
1803 CONTAINING_RGN (BB_TO_BLOCK (bb)))
1805 BITSET_ADD (pot_split[bb], EDGE_TO_BIT (nxt_out_edge), edgeset_size);
1806 nxt_out_edge = NEXT_OUT (nxt_out_edge);
1810 /* Now nr_rgn_out_edges is the number of region-exit edges from
1811 pred, and nr_out_edges will be the number of pred out edges
1812 not leaving the region. */
1813 nr_out_edges -= nr_rgn_out_edges;
1814 if (nr_rgn_out_edges > 0)
1815 prob[bb] += 0.9 * prob[BLOCK_TO_BB (pred)] / nr_out_edges;
1817 prob[bb] += prob[BLOCK_TO_BB (pred)] / nr_out_edges;
1818 nxt_in_edge = NEXT_IN (nxt_in_edge);
1820 while (fst_in_edge != nxt_in_edge);
1822 BITSET_ADD (dom[bb], bb, bbset_size);
1823 BITSET_DIFFER (pot_split[bb], ancestor_edges[bb], edgeset_size);
1825 if (sched_verbose >= 2)
1826 fprintf (dump, ";; bb_prob(%d, %d) = %3d\n", bb, BB_TO_BLOCK (bb), (int) (100.0 * prob[bb]));
1827 } /* compute_dom_prob_ps */
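The probability update in compute_dom_prob_ps assumes equal branch probabilities: each predecessor contributes its own probability divided among its out edges that stay inside the region, damped by 0.9 when some of its edges exit the region. A sketch of one predecessor's contribution, with hypothetical parameter names:

```c
#include <assert.h>

/* One predecessor's contribution to prob[bb], mirroring the update in
   compute_dom_prob_ps under its equal-branch-probability assumption.
   NR_RGN_OUT_EDGES counts the predecessor's edges leaving the region.  */
static float
pred_contribution (float pred_prob, int nr_out_edges, int nr_rgn_out_edges)
{
  /* Out edges of pred that do not leave the region.  */
  int in_region = nr_out_edges - nr_rgn_out_edges;

  if (nr_rgn_out_edges > 0)
    return 0.9f * pred_prob / in_region;
  return pred_prob / in_region;
}
```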
1829 /* Functions for target info. */
1831 /* Compute in BL the list of split-edges of bb_src relative to bb_trg.
1832 Note that bb_trg dominates bb_src. */
1835 split_edges (bb_src, bb_trg, bl)
1840 int es = edgeset_size;
1841 edgeset src = (edgeset) alloca (es * sizeof (HOST_WIDE_INT));
1844 src[es] = (pot_split[bb_src])[es];
1845 BITSET_DIFFER (src, pot_split[bb_trg], edgeset_size);
1846 extract_bitlst (src, edgeset_size, bl);
1850 /* Find the valid candidate-source-blocks for the target block TRG, compute
1851 their probability, and check if they are speculative or not.
1852 For speculative sources, compute their update-blocks and split-blocks. */
1855 compute_trg_info (trg)
1858 register candidate *sp;
1860 int check_block, update_idx;
1861 int i, j, k, fst_edge, nxt_edge;
1863 /* Define some of the fields for the target bb as well. */
1864 sp = candidate_table + trg;
1866 sp->is_speculative = 0;
1869 for (i = trg + 1; i < current_nr_blocks; i++)
1871 sp = candidate_table + i;
1873 sp->is_valid = IS_DOMINATED (i, trg);
1876 sp->src_prob = GET_SRC_PROB (i, trg);
1877 sp->is_valid = (sp->src_prob >= MIN_PROBABILITY);
1882 split_edges (i, trg, &el);
1883 sp->is_speculative = (el.nr_members) ? 1 : 0;
1884 if (sp->is_speculative && !flag_schedule_speculative)
1890 sp->split_bbs.first_member = &bblst_table[bblst_last];
1891 sp->split_bbs.nr_members = el.nr_members;
1892 for (j = 0; j < el.nr_members; bblst_last++, j++)
1893 bblst_table[bblst_last] =
1894 TO_BLOCK (rgn_edges[el.first_member[j]]);
1895 sp->update_bbs.first_member = &bblst_table[bblst_last];
1897 for (j = 0; j < el.nr_members; j++)
1899 check_block = FROM_BLOCK (rgn_edges[el.first_member[j]]);
1900 fst_edge = nxt_edge = OUT_EDGES (check_block);
1903 for (k = 0; k < el.nr_members; k++)
1904 if (EDGE_TO_BIT (nxt_edge) == el.first_member[k])
1907 if (k >= el.nr_members)
1909 bblst_table[bblst_last++] = TO_BLOCK (nxt_edge);
1913 nxt_edge = NEXT_OUT (nxt_edge);
1915 while (fst_edge != nxt_edge);
1917 sp->update_bbs.nr_members = update_idx;
1922 sp->split_bbs.nr_members = sp->update_bbs.nr_members = 0;
1924 sp->is_speculative = 0;
1928 } /* compute_trg_info */
1931 /* Print candidates info, for debugging purposes. Callable from debugger. */
1937 if (!candidate_table[i].is_valid)
1940 if (candidate_table[i].is_speculative)
1943 fprintf (dump, "src b %d bb %d speculative \n", BB_TO_BLOCK (i), i);
1945 fprintf (dump, "split path: ");
1946 for (j = 0; j < candidate_table[i].split_bbs.nr_members; j++)
1948 int b = candidate_table[i].split_bbs.first_member[j];
1950 fprintf (dump, " %d ", b);
1952 fprintf (dump, "\n");
1954 fprintf (dump, "update path: ");
1955 for (j = 0; j < candidate_table[i].update_bbs.nr_members; j++)
1957 int b = candidate_table[i].update_bbs.first_member[j];
1959 fprintf (dump, " %d ", b);
1961 fprintf (dump, "\n");
1965 fprintf (dump, " src %d equivalent\n", BB_TO_BLOCK (i));
1970 /* Print candidates info, for debugging purposes. Callable from debugger. */
1973 debug_candidates (trg)
1978 fprintf (dump, "----------- candidate table: target: b=%d bb=%d ---\n",
1979 BB_TO_BLOCK (trg), trg);
1980 for (i = trg + 1; i < current_nr_blocks; i++)
1981 debug_candidate (i);
1985 /* Functions for speculative scheduling. */
1987 /* Return 0 if x is a set of a register alive in the beginning of one
1988 of the split-blocks of src, otherwise return 1. */
1991 check_live_1 (src, x)
1997 register rtx reg = SET_DEST (x);
2002 while (GET_CODE (reg) == SUBREG || GET_CODE (reg) == ZERO_EXTRACT
2003 || GET_CODE (reg) == SIGN_EXTRACT
2004 || GET_CODE (reg) == STRICT_LOW_PART)
2005 reg = XEXP (reg, 0);
2007 if (GET_CODE (reg) == PARALLEL
2008 && GET_MODE (reg) == BLKmode)
2011 for (i = XVECLEN (reg, 0) - 1; i >= 0; i--)
2012 if (check_live_1 (src, XVECEXP (reg, 0, i)))
2017 if (GET_CODE (reg) != REG)
2020 regno = REGNO (reg);
2022 if (regno < FIRST_PSEUDO_REGISTER && global_regs[regno])
2024 /* Global registers are assumed live. */
2029 if (regno < FIRST_PSEUDO_REGISTER)
2031 /* Check for hard registers. */
2032 int j = HARD_REGNO_NREGS (regno, GET_MODE (reg));
2035 for (i = 0; i < candidate_table[src].split_bbs.nr_members; i++)
2037 int b = candidate_table[src].split_bbs.first_member[i];
2039 if (REGNO_REG_SET_P (BASIC_BLOCK (b)->global_live_at_start,
2049 /* Check for pseudo registers. */
2050 for (i = 0; i < candidate_table[src].split_bbs.nr_members; i++)
2052 int b = candidate_table[src].split_bbs.first_member[i];
2054 if (REGNO_REG_SET_P (BASIC_BLOCK (b)->global_live_at_start, regno))
2066 /* If x is a set of a register R, mark that R is alive in the beginning
2067 of every update-block of src. */
2070 update_live_1 (src, x)
2076 register rtx reg = SET_DEST (x);
2081 while (GET_CODE (reg) == SUBREG || GET_CODE (reg) == ZERO_EXTRACT
2082 || GET_CODE (reg) == SIGN_EXTRACT
2083 || GET_CODE (reg) == STRICT_LOW_PART)
2084 reg = XEXP (reg, 0);
2086 if (GET_CODE (reg) == PARALLEL
2087 && GET_MODE (reg) == BLKmode)
2090 for (i = XVECLEN (reg, 0) - 1; i >= 0; i--)
2091 update_live_1 (src, XVECEXP (reg, 0, i));
2095 if (GET_CODE (reg) != REG)
2098 /* Global registers are always live, so the code below does not apply
2101 regno = REGNO (reg);
2103 if (regno >= FIRST_PSEUDO_REGISTER || !global_regs[regno])
2105 if (regno < FIRST_PSEUDO_REGISTER)
2107 int j = HARD_REGNO_NREGS (regno, GET_MODE (reg));
2110 for (i = 0; i < candidate_table[src].update_bbs.nr_members; i++)
2112 int b = candidate_table[src].update_bbs.first_member[i];
2114 SET_REGNO_REG_SET (BASIC_BLOCK (b)->global_live_at_start,
2121 for (i = 0; i < candidate_table[src].update_bbs.nr_members; i++)
2123 int b = candidate_table[src].update_bbs.first_member[i];
2125 SET_REGNO_REG_SET (BASIC_BLOCK (b)->global_live_at_start, regno);
2132 /* Return 1 if insn can be speculatively moved from block src to trg,
2133 otherwise return 0. Called before first insertion of insn to
2134 ready-list or before the scheduling. */
2137 check_live (insn, src)
2141 /* Find the registers set by instruction. */
2142 if (GET_CODE (PATTERN (insn)) == SET
2143 || GET_CODE (PATTERN (insn)) == CLOBBER)
2144 return check_live_1 (src, PATTERN (insn));
2145 else if (GET_CODE (PATTERN (insn)) == PARALLEL)
2148 for (j = XVECLEN (PATTERN (insn), 0) - 1; j >= 0; j--)
2149 if ((GET_CODE (XVECEXP (PATTERN (insn), 0, j)) == SET
2150 || GET_CODE (XVECEXP (PATTERN (insn), 0, j)) == CLOBBER)
2151 && !check_live_1 (src, XVECEXP (PATTERN (insn), 0, j)))
2161 /* Update the live registers info after insn was moved speculatively from
2162 block src to trg. */
2165 update_live (insn, src)
2169 /* Find the registers set by instruction. */
2170 if (GET_CODE (PATTERN (insn)) == SET
2171 || GET_CODE (PATTERN (insn)) == CLOBBER)
2172 update_live_1 (src, PATTERN (insn));
2173 else if (GET_CODE (PATTERN (insn)) == PARALLEL)
2176 for (j = XVECLEN (PATTERN (insn), 0) - 1; j >= 0; j--)
2177 if (GET_CODE (XVECEXP (PATTERN (insn), 0, j)) == SET
2178 || GET_CODE (XVECEXP (PATTERN (insn), 0, j)) == CLOBBER)
2179 update_live_1 (src, XVECEXP (PATTERN (insn), 0, j));
2183 /* Exception Free Loads:
2185 We define five classes of speculative loads: IFREE, IRISKY,
2186 PFREE, PRISKY, and MFREE.
2188 IFREE loads are loads that are proved to be exception-free, just
2189 by examining the load insn. Examples of such loads are loads
2190 from TOC and loads of global data.
2192 IRISKY loads are loads that are proved to be exception-risky,
2193 just by examining the load insn. Examples of such loads are
2194 volatile loads and loads from shared memory.
2196 PFREE loads are loads for which we can prove, by examining other
2197 insns, that they are exception-free. Currently, this class consists
2198 of loads for which we are able to find a "similar load", either in
2199 the target block, or, if only one split-block exists, in that split
2200 block. Load2 is similar to load1 if both have the same single base
2201 register. We identify only part of the similar loads, by finding
2202 an insn upon which both load1 and load2 have a DEF-USE dependence.
2204 PRISKY loads are loads for which we can prove, by examining other
2205 insns, that they are exception-risky. Currently we have two proofs for
2206 such loads. The first proof detects loads that are probably guarded by a
2207 test on the memory address. This proof is based on the
2208 backward and forward data dependence information for the region.
2209 Let load-insn be the examined load.
2210 Load-insn is PRISKY iff ALL the following hold:
2212 - insn1 is not in the same block as load-insn
2213 - there is a DEF-USE dependence chain (insn1, ..., load-insn)
2214 - test-insn is either a compare or a branch, not in the same block
2216 - load-insn is reachable from test-insn
2217 - there is a DEF-USE dependence chain (insn1, ..., test-insn)
2219 This proof might fail when the compare and the load are fed
2220 by an insn not in the region. To solve this, we will add to this
2221 group all loads that have no input DEF-USE dependence.
2223 The second proof detects loads that are directly or indirectly
2224 fed by a speculative load. This proof is affected by the
2225 scheduling process. We will use the flag fed_by_spec_load.
2226 Initially, all insns have this flag reset. After a speculative
2227 motion of an insn, if insn is either a load, or marked as
2228 fed_by_spec_load, we will also mark as fed_by_spec_load every
2229 insn1 for which a DEF-USE dependence (insn, insn1) exists. A
2230 load which is fed_by_spec_load is also PRISKY.
2232 MFREE (maybe-free) loads are all the remaining loads. They may be
2233 exception-free, but we cannot prove it.
2235 Now, all loads in IFREE and PFREE classes are considered
2236 exception-free, while all loads in IRISKY and PRISKY classes are
2237 considered exception-risky. As for loads in the MFREE class,
2238 these are considered either exception-free or exception-risky,
2239 depending on whether we are pessimistic or optimistic. We have
2240 to take the pessimistic approach to assure the safety of
2241 speculative scheduling, but we can take the optimistic approach
2242 by invoking the -fsched-spec-load-dangerous option. */
2244 enum INSN_TRAP_CLASS
2246 TRAP_FREE = 0, IFREE = 1, PFREE_CANDIDATE = 2,
2247 PRISKY_CANDIDATE = 3, IRISKY = 4, TRAP_RISKY = 5
2250 #define WORST_CLASS(class1, class2) \
2251 ((class1 > class2) ? class1 : class2)
2253 /* Indexed by INSN_UID, and set if there's DEF-USE dependence between
2254 some speculatively moved load insn and this one. */
2255 char *fed_by_spec_load;
2258 /* Non-zero if block bb_to is equal to, or reachable from, block bb_from. */
2259 #define IS_REACHABLE(bb_from, bb_to) \
2261 || IS_RGN_ENTRY (bb_from) \
2262 || (bitset_member (ancestor_edges[bb_to], \
2263 EDGE_TO_BIT (IN_EDGES (BB_TO_BLOCK (bb_from))), \
2265 #define FED_BY_SPEC_LOAD(insn) (fed_by_spec_load[INSN_UID (insn)])
2266 #define IS_LOAD_INSN(insn) (is_load_insn[INSN_UID (insn)])
2268 /* Non-zero iff the address is composed of at most one register. */
2269 #define CONST_BASED_ADDRESS_P(x) \
2270 (GET_CODE (x) == REG \
2271 || ((GET_CODE (x) == PLUS || GET_CODE (x) == MINUS \
2272 || (GET_CODE (x) == LO_SUM)) \
2273 && (GET_CODE (XEXP (x, 0)) == CONST_INT \
2274 || GET_CODE (XEXP (x, 1)) == CONST_INT)))
2276 /* Turns on the fed_by_spec_load flag for insns fed by load_insn. */
2279 set_spec_fed (load_insn)
2284 for (link = INSN_DEPEND (load_insn); link; link = XEXP (link, 1))
2285 if (GET_MODE (link) == VOIDmode)
2286 FED_BY_SPEC_LOAD (XEXP (link, 0)) = 1;
2287 } /* set_spec_fed */
2289 /* On the path from the insn to load_insn_bb, find a conditional
2290 branch depending on insn, that guards the speculative load. */
2293 find_conditional_protection (insn, load_insn_bb)
2299 /* Iterate through DEF-USE forward dependences. */
2300 for (link = INSN_DEPEND (insn); link; link = XEXP (link, 1))
2302 rtx next = XEXP (link, 0);
2303 if ((CONTAINING_RGN (INSN_BLOCK (next)) ==
2304 CONTAINING_RGN (BB_TO_BLOCK (load_insn_bb)))
2305 && IS_REACHABLE (INSN_BB (next), load_insn_bb)
2306 && load_insn_bb != INSN_BB (next)
2307 && GET_MODE (link) == VOIDmode
2308 && (GET_CODE (next) == JUMP_INSN
2309 || find_conditional_protection (next, load_insn_bb)))
2313 } /* find_conditional_protection */
2315 /* Returns 1 if the same insn1 that participates in the computation
2316 of load_insn's address is feeding a conditional branch that is
2317 guarding load_insn. This is true if we find the two DEF-USE
2319 insn1 -> ... -> conditional-branch
2320 insn1 -> ... -> load_insn,
2321 and if a flow path exists:
2322 insn1 -> ... -> conditional-branch -> ... -> load_insn,
2323 and if insn1 is on the path
2324 region-entry -> ... -> bb_trg -> ... load_insn.
2326 Locate insn1 by climbing on LOG_LINKS from load_insn.
2327 Locate the branch by following INSN_DEPEND from insn1. */
2330 is_conditionally_protected (load_insn, bb_src, bb_trg)
2336 for (link = LOG_LINKS (load_insn); link; link = XEXP (link, 1))
2338 rtx insn1 = XEXP (link, 0);
2340 /* Must be a DEF-USE dependence upon non-branch. */
2341 if (GET_MODE (link) != VOIDmode
2342 || GET_CODE (insn1) == JUMP_INSN)
2345 /* Must exist a path: region-entry -> ... -> bb_trg -> ... load_insn. */
2346 if (INSN_BB (insn1) == bb_src
2347 || (CONTAINING_RGN (INSN_BLOCK (insn1))
2348 != CONTAINING_RGN (BB_TO_BLOCK (bb_src)))
2349 || (!IS_REACHABLE (bb_trg, INSN_BB (insn1))
2350 && !IS_REACHABLE (INSN_BB (insn1), bb_trg)))
2353 /* Now search for the conditional-branch. */
2354 if (find_conditional_protection (insn1, bb_src))
2357 /* Recursive step: search another insn1, "above" current insn1. */
2358 return is_conditionally_protected (insn1, bb_src, bb_trg);
2361 /* The chain does not exist. */
2363 } /* is_conditionally_protected */
2365 /* Returns 1 if a clue for "similar load" 'insn2' is found, and hence
2366 load_insn can move speculatively from bb_src to bb_trg. All the
2367 following must hold:
2369 (1) both loads have 1 base register (PFREE_CANDIDATEs).
2370 (2) load_insn and load2 have a def-use dependence upon
2371 the same insn 'insn1'.
2372 (3) either load2 is in bb_trg, or:
2373 - there's only one split-block, and
2374 - load2 is on the escape path, and
2376 From all these we can conclude that the two loads access memory
2377 addresses that differ at most by a constant, and hence if moving
2378 load_insn would cause an exception, it would have been caused by
2379 load2 anyhow. */
2382 is_pfree (load_insn, bb_src, bb_trg)
2387 register candidate *candp = candidate_table + bb_src;
2389 if (candp->split_bbs.nr_members != 1)
2390 /* Must have exactly one escape block. */
2393 for (back_link = LOG_LINKS (load_insn);
2394 back_link; back_link = XEXP (back_link, 1))
2396 rtx insn1 = XEXP (back_link, 0);
2398 if (GET_MODE (back_link) == VOIDmode)
2400 /* Found a DEF-USE dependence (insn1, load_insn). */
2403 for (fore_link = INSN_DEPEND (insn1);
2404 fore_link; fore_link = XEXP (fore_link, 1))
2406 rtx insn2 = XEXP (fore_link, 0);
2407 if (GET_MODE (fore_link) == VOIDmode)
2409 /* Found a DEF-USE dependence (insn1, insn2). */
2410 if (haifa_classify_insn (insn2) != PFREE_CANDIDATE)
2411 /* insn2 not guaranteed to be a 1 base reg load. */
2414 if (INSN_BB (insn2) == bb_trg)
2415 /* insn2 is the similar load, in the target block. */
2418 if (*(candp->split_bbs.first_member) == INSN_BLOCK (insn2))
2419 /* insn2 is a similar load, in a split-block. */
2426 /* Couldn't find a similar load. */
2430 /* Returns the class that an insn with GET_DEST (insn) = x may belong
2431 to, as found by analyzing the insn's expression. */
2434 may_trap_exp (x, is_store)
2442 code = GET_CODE (x);
2452 /* The insn uses memory: a volatile load. */
2453 if (MEM_VOLATILE_P (x))
2455 /* An exception-free load. */
2456 if (!may_trap_p (x))
2458 /* A load with 1 base register, to be further checked. */
2459 if (CONST_BASED_ADDRESS_P (XEXP (x, 0)))
2460 return PFREE_CANDIDATE;
2461 /* No info on the load, to be further checked. */
2462 return PRISKY_CANDIDATE;
2467 int i, insn_class = TRAP_FREE;
2469 /* Neither store nor load, check if it may cause a trap. */
2472 /* Recursive step: walk the insn... */
2473 fmt = GET_RTX_FORMAT (code);
2474 for (i = GET_RTX_LENGTH (code) - 1; i >= 0; i--)
2478 int tmp_class = may_trap_exp (XEXP (x, i), is_store);
2479 insn_class = WORST_CLASS (insn_class, tmp_class);
2481 else if (fmt[i] == 'E')
2484 for (j = 0; j < XVECLEN (x, i); j++)
2486 int tmp_class = may_trap_exp (XVECEXP (x, i, j), is_store);
2487 insn_class = WORST_CLASS (insn_class, tmp_class);
2488 if (insn_class == TRAP_RISKY || insn_class == IRISKY)
2492 if (insn_class == TRAP_RISKY || insn_class == IRISKY)
2497 } /* may_trap_exp */
2500 /* Classifies insn for the purpose of verifying that it can be
2501 moved speculatively, by examining its pattern, returning:
2502 TRAP_RISKY: store, or risky non-load insn (e.g. division by variable).
2503 TRAP_FREE: non-load insn.
2504 IFREE: load from a globally safe location.
2505 IRISKY: volatile load.
2506 PFREE_CANDIDATE, PRISKY_CANDIDATE: loads that need to be checked for
2507 being either PFREE or PRISKY. */
2510 haifa_classify_insn (insn)
2513 rtx pat = PATTERN (insn);
2514 int tmp_class = TRAP_FREE;
2515 int insn_class = TRAP_FREE;
2518 if (GET_CODE (pat) == PARALLEL)
2520 int i, len = XVECLEN (pat, 0);
2522 for (i = len - 1; i >= 0; i--)
2524 code = GET_CODE (XVECEXP (pat, 0, i));
2528 /* Test if it is a 'store'. */
2529 tmp_class = may_trap_exp (XEXP (XVECEXP (pat, 0, i), 0), 1);
2532 /* Test if it is a store. */
2533 tmp_class = may_trap_exp (SET_DEST (XVECEXP (pat, 0, i)), 1);
2534 if (tmp_class == TRAP_RISKY)
2536 /* Test if it is a load. */
2538 WORST_CLASS (tmp_class,
2539 may_trap_exp (SET_SRC (XVECEXP (pat, 0, i)), 0));
2542 tmp_class = TRAP_RISKY;
2546 insn_class = WORST_CLASS (insn_class, tmp_class);
2547 if (insn_class == TRAP_RISKY || insn_class == IRISKY)
2553 code = GET_CODE (pat);
2557 /* Test if it is a 'store'. */
2558 tmp_class = may_trap_exp (XEXP (pat, 0), 1);
2561 /* Test if it is a store. */
2562 tmp_class = may_trap_exp (SET_DEST (pat), 1);
2563 if (tmp_class == TRAP_RISKY)
2565 /* Test if it is a load. */
2567 WORST_CLASS (tmp_class,
2568 may_trap_exp (SET_SRC (pat), 0));
2571 tmp_class = TRAP_RISKY;
2575 insn_class = tmp_class;
2580 } /* haifa_classify_insn */
2582 /* Return 1 if load_insn is prisky (i.e. if load_insn is fed by
2583 a load moved speculatively, or if load_insn is protected by
2584 a compare on load_insn's address). */
2587 is_prisky (load_insn, bb_src, bb_trg)
2591 if (FED_BY_SPEC_LOAD (load_insn))
2594 if (LOG_LINKS (load_insn) == NULL)
2595 /* Dependence may 'hide' out of the region. */
2598 if (is_conditionally_protected (load_insn, bb_src, bb_trg))
2604 /* Insn is a candidate to be moved speculatively from bb_src to bb_trg.
2605 Return 1 if insn is exception-free (and the motion is valid)
2606 and 0 otherwise.
2609 is_exception_free (insn, bb_src, bb_trg)
2613 int insn_class = haifa_classify_insn (insn);
2615 /* Handle non-load insns. */
2626 if (!flag_schedule_speculative_load)
2628 IS_LOAD_INSN (insn) = 1;
2635 case PFREE_CANDIDATE:
2636 if (is_pfree (insn, bb_src, bb_trg))
2638 /* Don't 'break' here: PFREE-candidate is also PRISKY-candidate. */
2639 case PRISKY_CANDIDATE:
2640 if (!flag_schedule_speculative_load_dangerous
2641 || is_prisky (insn, bb_src, bb_trg))
2647 return flag_schedule_speculative_load_dangerous;
2648 } /* is_exception_free */
2651 /* Process an insn's memory dependencies. There are four kinds of
2654 (0) read dependence: read follows read
2655 (1) true dependence: read follows write
2656 (2) anti dependence: write follows read
2657 (3) output dependence: write follows write
2659 We are careful to build only dependencies which actually exist, and
2660 use transitivity to avoid building too many links. */
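The four dependence kinds listed above are determined entirely by whether the earlier and the later memory reference are reads or writes. A standalone sketch of that classification (mem_dep_kind is an invented helper for illustration, not part of this file):

```c
#include <string.h>

/* Classify a memory-dependence kind from the access directions:
   the earlier access first, the later access second, mirroring
   the (0)-(3) list above.  */
static const char *
mem_dep_kind (int earlier_is_write, int later_is_write)
{
  if (!earlier_is_write && !later_is_write)
    return "read";    /* (0) read dependence: read follows read */
  if (earlier_is_write && !later_is_write)
    return "true";    /* (1) true dependence: read follows write */
  if (!earlier_is_write && later_is_write)
    return "anti";    /* (2) anti dependence: write follows read */
  return "output";    /* (3) output dependence: write follows write */
}
```

Only true dependences constrain the values computed; anti and output dependences exist because the two references share a location, which is why the pass tracks reads and writes in separate pending lists.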
2662 /* Return the INSN_LIST containing INSN in LIST, or NULL
2663 if LIST does not contain INSN. */
2665 HAIFA_INLINE static rtx
2666 find_insn_list (insn, list)
2672 if (XEXP (list, 0) == insn)
2674 list = XEXP (list, 1);
2680 /* Return 1 if the pair (insn, x) is found in (LIST, LIST1), or 0
2683 HAIFA_INLINE static char
2684 find_insn_mem_list (insn, x, list, list1)
2690 if (XEXP (list, 0) == insn
2691 && XEXP (list1, 0) == x)
2693 list = XEXP (list, 1);
2694 list1 = XEXP (list1, 1);
2700 /* Compute the function units used by INSN. This caches the value
2701 returned by function_units_used. A function unit is encoded as the
2702 unit number if the value is non-negative and the complement of a
2703 mask if the value is negative. A function unit index is the
2704 non-negative encoding. */
2706 HAIFA_INLINE static int
2710 register int unit = INSN_UNIT (insn);
2714 recog_memoized (insn);
2716 /* A USE insn, or something else we don't need to understand.
2717 We can't pass these directly to function_units_used because it will
2718 trigger a fatal error for unrecognizable insns. */
2719 if (INSN_CODE (insn) < 0)
2723 unit = function_units_used (insn);
2724 /* Increment non-negative values so we can cache zero. */
2728 /* We only cache 16 bits of the result, so if the value is out of
2729 range, don't cache it. */
2730 if (FUNCTION_UNITS_SIZE < HOST_BITS_PER_SHORT
2732 || (unit & ~((1 << (HOST_BITS_PER_SHORT - 1)) - 1)) == 0)
2733 INSN_UNIT (insn) = unit;
2735 return (unit > 0 ? unit - 1 : unit);
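The encoding insn_unit returns can be decoded with the same `unit = ~unit' idiom the hazard routines below use: a non-negative value names a single unit, while a negative value is the one's complement of a bit mask of units. A standalone sketch (count_units is an invented helper):

```c
/* Count the function units in an encoded unit value: either a single
   non-negative unit number, or ~mask when several units are involved,
   as described in the comment above insn_unit.  */
static int
count_units (int unit)
{
  int n = 0, i;
  if (unit >= 0)
    return 1;                   /* A single unit number.  */
  for (i = 0, unit = ~unit; unit; i++, unit >>= 1)
    if (unit & 1)
      n++;                      /* Unit I is present in the mask.  */
  return n;
}
```

For instance, `~0x5` (a negative int) encodes the mask of units 0 and 2, so count_units yields 2, while a plain value of 3 names unit 3 alone.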
2738 /* Compute the blockage range for executing INSN on UNIT. This caches
2739 the value returned by the blockage_range_function for the unit.
2740 These values are encoded in an int where the upper half gives the
2741 minimum value and the lower half gives the maximum value. */
2743 HAIFA_INLINE static unsigned int
2744 blockage_range (unit, insn)
2748 unsigned int blockage = INSN_BLOCKAGE (insn);
2751 if ((int) UNIT_BLOCKED (blockage) != unit + 1)
2753 range = function_units[unit].blockage_range_function (insn);
2754 /* We only cache the blockage range for one unit and then only if
2755 the values fit. */
2756 if (HOST_BITS_PER_INT >= UNIT_BITS + 2 * BLOCKAGE_BITS)
2757 INSN_BLOCKAGE (insn) = ENCODE_BLOCKAGE (unit + 1, range);
2760 range = BLOCKAGE_RANGE (blockage);
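The packing described above (minimum in the upper half of an int, maximum in the lower half) can be sketched in isolation. BLOCKAGE_BITS here is an illustrative constant (in GCC it is derived from the md file), and ENCODE_RANGE is an invented helper, while the MIN_/MAX_BLOCKAGE_COST accessors mirror the macros this file relies on:

```c
/* Illustrative width; GCC computes the real value from the md file.  */
#define BLOCKAGE_BITS 14
#define BLOCKAGE_MASK ((1u << BLOCKAGE_BITS) - 1)

/* Pack a (min, max) blockage pair into one word: min in the upper
   half, max in the lower half.  */
#define ENCODE_RANGE(min, max)  (((min) << BLOCKAGE_BITS) | (max))
#define MIN_BLOCKAGE_COST(r)    ((r) >> BLOCKAGE_BITS)
#define MAX_BLOCKAGE_COST(r)    ((r) & BLOCKAGE_MASK)
```

This is why the cache guard above checks that HOST_BITS_PER_INT can hold the unit tag plus two BLOCKAGE_BITS-wide fields before storing the encoded value in INSN_BLOCKAGE.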
2765 /* A vector indexed by function unit instance giving the last insn to use
2766 the unit. The value of the function unit instance index for unit U
2767 instance I is (U + I * FUNCTION_UNITS_SIZE). */
2768 static rtx unit_last_insn[FUNCTION_UNITS_SIZE * MAX_MULTIPLICITY];
2770 /* A vector indexed by function unit instance giving the minimum time when
2771 the unit will unblock based on the maximum blockage cost. */
2772 static int unit_tick[FUNCTION_UNITS_SIZE * MAX_MULTIPLICITY];
2774 /* A vector indexed by function unit number giving the number of insns
2775 that remain to use the unit. */
2776 static int unit_n_insns[FUNCTION_UNITS_SIZE];
2778 /* Reset the function unit state to the null state. */
2783 bzero ((char *) unit_last_insn, sizeof (unit_last_insn));
2784 bzero ((char *) unit_tick, sizeof (unit_tick));
2785 bzero ((char *) unit_n_insns, sizeof (unit_n_insns));
2788 /* Return the issue-delay of an insn. */
2790 HAIFA_INLINE static int
2791 insn_issue_delay (insn)
2795 int unit = insn_unit (insn);
2797 /* Efficiency note: in fact, we are working 'hard' to compute a
2798 value that was available in the md file, and is not available in
2799 function_units[] structure. It would be nice to have this
2800 value there, too. */
2803 if (function_units[unit].blockage_range_function &&
2804 function_units[unit].blockage_function)
2805 delay = function_units[unit].blockage_function (insn, insn);
2808 for (i = 0, unit = ~unit; unit; i++, unit >>= 1)
2809 if ((unit & 1) != 0 && function_units[i].blockage_range_function
2810 && function_units[i].blockage_function)
2811 delay = MAX (delay, function_units[i].blockage_function (insn, insn));
2816 /* Return the actual hazard cost of executing INSN on the unit UNIT,
2817 instance INSTANCE at time CLOCK if the previous actual hazard cost
2820 HAIFA_INLINE static int
2821 actual_hazard_this_instance (unit, instance, insn, clock, cost)
2822 int unit, instance, clock, cost;
2825 int tick = unit_tick[instance]; /* Issue time of the last issued insn. */
2827 if (tick - clock > cost)
2829 /* The scheduler is operating forward, so unit's last insn is the
2830 executing insn and INSN is the candidate insn. We want a
2831 more exact measure of the blockage if we execute INSN at CLOCK
2832 given when we committed the execution of the unit's last insn.
2834 The blockage value is given by either the unit's max blockage
2835 constant, blockage range function, or blockage function. Use
2836 the most exact form for the given unit. */
2838 if (function_units[unit].blockage_range_function)
2840 if (function_units[unit].blockage_function)
2841 tick += (function_units[unit].blockage_function
2842 (unit_last_insn[instance], insn)
2843 - function_units[unit].max_blockage);
2845 tick += ((int) MAX_BLOCKAGE_COST (blockage_range (unit, insn))
2846 - function_units[unit].max_blockage);
2848 if (tick - clock > cost)
2849 cost = tick - clock;
2854 /* Record INSN as having begun execution on the units encoded by UNIT at
2857 HAIFA_INLINE static void
2858 schedule_unit (unit, insn, clock)
2866 int instance = unit;
2867 #if MAX_MULTIPLICITY > 1
2868 /* Find the first free instance of the function unit and use that
2869 one. We assume that one is free. */
2870 for (i = function_units[unit].multiplicity - 1; i > 0; i--)
2872 if (!actual_hazard_this_instance (unit, instance, insn, clock, 0))
2874 instance += FUNCTION_UNITS_SIZE;
2877 unit_last_insn[instance] = insn;
2878 unit_tick[instance] = (clock + function_units[unit].max_blockage);
2881 for (i = 0, unit = ~unit; unit; i++, unit >>= 1)
2882 if ((unit & 1) != 0)
2883 schedule_unit (i, insn, clock);
2886 /* Return the actual hazard cost of executing INSN on the units encoded by
2887 UNIT at time CLOCK if the previous actual hazard cost was COST. */
2889 HAIFA_INLINE static int
2890 actual_hazard (unit, insn, clock, cost)
2891 int unit, clock, cost;
2898 /* Find the instance of the function unit with the minimum hazard. */
2899 int instance = unit;
2900 int best_cost = actual_hazard_this_instance (unit, instance, insn,
2902 #if MAX_MULTIPLICITY > 1
2905 if (best_cost > cost)
2907 for (i = function_units[unit].multiplicity - 1; i > 0; i--)
2909 instance += FUNCTION_UNITS_SIZE;
2910 this_cost = actual_hazard_this_instance (unit, instance, insn,
2912 if (this_cost < best_cost)
2914 best_cost = this_cost;
2915 if (this_cost <= cost)
2921 cost = MAX (cost, best_cost);
2924 for (i = 0, unit = ~unit; unit; i++, unit >>= 1)
2925 if ((unit & 1) != 0)
2926 cost = actual_hazard (i, insn, clock, cost);
2931 /* Return the potential hazard cost of executing an instruction on the
2932 units encoded by UNIT if the previous potential hazard cost was COST.
2933 An insn with a large blockage time is chosen in preference to one
2934 with a smaller time; an insn that uses a unit that is more likely
2935 to be used is chosen in preference to one with a unit that is less
2936 used. We are trying to minimize a subsequent actual hazard. */
2938 HAIFA_INLINE static int
2939 potential_hazard (unit, insn, cost)
2944 unsigned int minb, maxb;
2948 minb = maxb = function_units[unit].max_blockage;
2951 if (function_units[unit].blockage_range_function)
2953 maxb = minb = blockage_range (unit, insn);
2954 maxb = MAX_BLOCKAGE_COST (maxb);
2955 minb = MIN_BLOCKAGE_COST (minb);
2960 /* Make the number of instructions left dominate. Make the
2961 minimum delay dominate the maximum delay. If all these
2962 are the same, use the unit number to add an arbitrary
2963 ordering. Other terms can be added. */
2964 ncost = minb * 0x40 + maxb;
2965 ncost *= (unit_n_insns[unit] - 1) * 0x1000 + unit;
2972 for (i = 0, unit = ~unit; unit; i++, unit >>= 1)
2973 if ((unit & 1) != 0)
2974 cost = potential_hazard (i, insn, cost);
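The ordering key that potential_hazard builds from minb, maxb and unit_n_insns can be checked in isolation. hazard_key is an invented stand-in for the inline computation above, fed made-up blockage values, assuming as the code does that maxb stays below 0x40:

```c
/* Potential-hazard ordering key, as computed above: the count of insns
   still needing the unit dominates, then the minimum blockage dominates
   the maximum, with the unit number as a final arbitrary tie-break.  */
static int
hazard_key (int minb, int maxb, int n_insns, int unit)
{
  int ncost = minb * 0x40 + maxb;
  ncost *= (n_insns - 1) * 0x1000 + unit;
  return ncost;
}
```

With the scale factors 0x40 and 0x1000, a larger minimum delay outweighs any in-range maximum delay, and a unit with more pending insns outweighs both.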
2979 /* Compute cost of executing INSN given the dependence LINK on the insn USED.
2980 This is the number of cycles between instruction issue and
2981 instruction results. */
2983 HAIFA_INLINE static int
2984 insn_cost (insn, link, used)
2985 rtx insn, link, used;
2987 register int cost = INSN_COST (insn);
2991 recog_memoized (insn);
2993 /* A USE insn, or something else we don't need to understand.
2994 We can't pass these directly to result_ready_cost because it will
2995 trigger a fatal error for unrecognizable insns. */
2996 if (INSN_CODE (insn) < 0)
2998 INSN_COST (insn) = 1;
3003 cost = result_ready_cost (insn);
3008 INSN_COST (insn) = cost;
3012 /* In this case estimate cost without caring how insn is used. */
3013 if (link == 0 && used == 0)
3016 /* A USE insn should never require the value used to be computed. This
3017 allows the computation of a function's result and parameter values to
3018 overlap the return and call. */
3019 recog_memoized (used);
3020 if (INSN_CODE (used) < 0)
3021 LINK_COST_FREE (link) = 1;
3023 /* If some dependencies vary the cost, compute the adjustment. Most
3024 commonly, the adjustment is complete: either the cost is ignored
3025 (in the case of an output- or anti-dependence), or the cost is
3026 unchanged. These values are cached in the link as LINK_COST_FREE
3027 and LINK_COST_ZERO. */
3029 if (LINK_COST_FREE (link))
3032 else if (!LINK_COST_ZERO (link))
3036 ADJUST_COST (used, link, insn, ncost);
3039 LINK_COST_FREE (link) = 1;
3043 LINK_COST_ZERO (link) = 1;
3050 /* Compute the priority number for INSN. */
3059 if (GET_RTX_CLASS (GET_CODE (insn)) != 'i')
3062 if ((this_priority = INSN_PRIORITY (insn)) == 0)
3064 if (INSN_DEPEND (insn) == 0)
3065 this_priority = insn_cost (insn, 0, 0);
3067 for (link = INSN_DEPEND (insn); link; link = XEXP (link, 1))
3072 if (RTX_INTEGRATED_P (link))
3075 next = XEXP (link, 0);
3077 /* Critical path is meaningful in block boundaries only. */
3078 if (INSN_BLOCK (next) != INSN_BLOCK (insn))
3081 next_priority = insn_cost (insn, link, next) + priority (next);
3082 if (next_priority > this_priority)
3083 this_priority = next_priority;
3085 INSN_PRIORITY (insn) = this_priority;
3087 return this_priority;
3091 /* Remove all INSN_LISTs and EXPR_LISTs from the pending lists and add
3092 them to the unused_*_list variables, so that they can be reused. */
3095 free_pending_lists ()
3097 if (current_nr_blocks <= 1)
3099 free_INSN_LIST_list (&pending_read_insns);
3100 free_INSN_LIST_list (&pending_write_insns);
3101 free_EXPR_LIST_list (&pending_read_mems);
3102 free_EXPR_LIST_list (&pending_write_mems);
3106 /* Interblock scheduling. */
3109 for (bb = 0; bb < current_nr_blocks; bb++)
3111 free_INSN_LIST_list (&bb_pending_read_insns[bb]);
3112 free_INSN_LIST_list (&bb_pending_write_insns[bb]);
3113 free_EXPR_LIST_list (&bb_pending_read_mems[bb]);
3114 free_EXPR_LIST_list (&bb_pending_write_mems[bb]);
3119 /* Add an INSN and MEM reference pair to a pending INSN_LIST and MEM_LIST.
3120 The MEM is a memory reference contained within INSN, which we are saving
3121 so that we can do memory aliasing on it. */
3124 add_insn_mem_dependence (insn_list, mem_list, insn, mem)
3125 rtx *insn_list, *mem_list, insn, mem;
3129 link = alloc_INSN_LIST (insn, *insn_list);
3132 link = alloc_EXPR_LIST (VOIDmode, mem, *mem_list);
3135 pending_lists_length++;
3139 /* Make a dependency between every memory reference on the pending lists
3140 and INSN, thus flushing the pending lists. If ONLY_WRITE, don't flush
3144 flush_pending_lists (insn, only_write)
3151 while (pending_read_insns && ! only_write)
3153 add_dependence (insn, XEXP (pending_read_insns, 0), REG_DEP_ANTI);
3155 link = pending_read_insns;
3156 pending_read_insns = XEXP (pending_read_insns, 1);
3157 free_INSN_LIST_node (link);
3159 link = pending_read_mems;
3160 pending_read_mems = XEXP (pending_read_mems, 1);
3161 free_EXPR_LIST_node (link);
3163 while (pending_write_insns)
3165 add_dependence (insn, XEXP (pending_write_insns, 0), REG_DEP_ANTI);
3167 link = pending_write_insns;
3168 pending_write_insns = XEXP (pending_write_insns, 1);
3169 free_INSN_LIST_node (link);
3171 link = pending_write_mems;
3172 pending_write_mems = XEXP (pending_write_mems, 1);
3173 free_EXPR_LIST_node (link);
3175 pending_lists_length = 0;
3177 /* last_pending_memory_flush is now a list of insns. */
3178 for (u = last_pending_memory_flush; u; u = XEXP (u, 1))
3179 add_dependence (insn, XEXP (u, 0), REG_DEP_ANTI);
3181 free_INSN_LIST_list (&last_pending_memory_flush);
3182 last_pending_memory_flush = alloc_INSN_LIST (insn, NULL_RTX);
3185 /* Analyze a single SET, CLOBBER, PRE_DEC, POST_DEC, PRE_INC or POST_INC
3186 rtx, X, creating all dependencies generated by the write to the
3187 destination of X, and reads of everything mentioned. */
3190 sched_analyze_1 (x, insn)
3195 register rtx dest = XEXP (x, 0);
3196 enum rtx_code code = GET_CODE (x);
3201 if (GET_CODE (dest) == PARALLEL
3202 && GET_MODE (dest) == BLKmode)
3205 for (i = XVECLEN (dest, 0) - 1; i >= 0; i--)
3206 sched_analyze_1 (XVECEXP (dest, 0, i), insn);
3207 if (GET_CODE (x) == SET)
3208 sched_analyze_2 (SET_SRC (x), insn);
3212 while (GET_CODE (dest) == STRICT_LOW_PART || GET_CODE (dest) == SUBREG
3213 || GET_CODE (dest) == ZERO_EXTRACT || GET_CODE (dest) == SIGN_EXTRACT)
3215 if (GET_CODE (dest) == ZERO_EXTRACT || GET_CODE (dest) == SIGN_EXTRACT)
3217 /* The second and third arguments are values read by this insn. */
3218 sched_analyze_2 (XEXP (dest, 1), insn);
3219 sched_analyze_2 (XEXP (dest, 2), insn);
3221 dest = XEXP (dest, 0);
3224 if (GET_CODE (dest) == REG)
3228 regno = REGNO (dest);
3230 /* A hard reg in a wide mode may really be multiple registers.
3231 If so, mark all of them just like the first. */
3232 if (regno < FIRST_PSEUDO_REGISTER)
3234 i = HARD_REGNO_NREGS (regno, GET_MODE (dest));
3239 for (u = reg_last_uses[regno + i]; u; u = XEXP (u, 1))
3240 add_dependence (insn, XEXP (u, 0), REG_DEP_ANTI);
3242 for (u = reg_last_sets[regno + i]; u; u = XEXP (u, 1))
3243 add_dependence (insn, XEXP (u, 0), REG_DEP_OUTPUT);
3245 /* Clobbers need not be ordered with respect to one
3246 another, but sets must be ordered with respect to a
3247 pending clobber.
3250 free_INSN_LIST_list (®_last_uses[regno + i]);
3251 for (u = reg_last_clobbers[regno + i]; u; u = XEXP (u, 1))
3252 add_dependence (insn, XEXP (u, 0), REG_DEP_OUTPUT);
3253 SET_REGNO_REG_SET (reg_pending_sets, regno + i);
3256 SET_REGNO_REG_SET (reg_pending_clobbers, regno + i);
3258 /* Function calls clobber all call_used regs. */
3259 if (global_regs[regno + i]
3260 || (code == SET && call_used_regs[regno + i]))
3261 for (u = last_function_call; u; u = XEXP (u, 1))
3262 add_dependence (insn, XEXP (u, 0), REG_DEP_ANTI);
3269 for (u = reg_last_uses[regno]; u; u = XEXP (u, 1))
3270 add_dependence (insn, XEXP (u, 0), REG_DEP_ANTI);
3272 for (u = reg_last_sets[regno]; u; u = XEXP (u, 1))
3273 add_dependence (insn, XEXP (u, 0), REG_DEP_OUTPUT);
3277 free_INSN_LIST_list (®_last_uses[regno]);
3278 for (u = reg_last_clobbers[regno]; u; u = XEXP (u, 1))
3279 add_dependence (insn, XEXP (u, 0), REG_DEP_OUTPUT);
3280 SET_REGNO_REG_SET (reg_pending_sets, regno);
3283 SET_REGNO_REG_SET (reg_pending_clobbers, regno);
3285 /* Pseudos that are REG_EQUIV to something may be replaced
3286 by that during reloading. We need only add dependencies for
3287 the address in the REG_EQUIV note. */
3288 if (!reload_completed
3289 && reg_known_equiv_p[regno]
3290 && GET_CODE (reg_known_value[regno]) == MEM)
3291 sched_analyze_2 (XEXP (reg_known_value[regno], 0), insn);
3293 /* Don't let it cross a call after scheduling if it doesn't
3294 already cross one. */
3296 if (REG_N_CALLS_CROSSED (regno) == 0)
3297 for (u = last_function_call; u; u = XEXP (u, 1))
3298 add_dependence (insn, XEXP (u, 0), REG_DEP_ANTI);
3301 else if (GET_CODE (dest) == MEM)
3303 /* Writing memory. */
3305 if (pending_lists_length > 32)
3307 /* Flush all pending reads and writes to prevent the pending lists
3308 from getting any larger. Insn scheduling runs too slowly when
3309 these lists get long. The number 32 was chosen because it
3310 seems like a reasonable number. When compiling GCC with itself,
3311 this flush occurs 8 times for sparc, and 10 times for m88k using
3313 flush_pending_lists (insn, 0);
3318 rtx pending, pending_mem;
3320 pending = pending_read_insns;
3321 pending_mem = pending_read_mems;
3324 if (anti_dependence (XEXP (pending_mem, 0), dest))
3325 add_dependence (insn, XEXP (pending, 0), REG_DEP_ANTI);
3327 pending = XEXP (pending, 1);
3328 pending_mem = XEXP (pending_mem, 1);
3331 pending = pending_write_insns;
3332 pending_mem = pending_write_mems;
3335 if (output_dependence (XEXP (pending_mem, 0), dest))
3336 add_dependence (insn, XEXP (pending, 0), REG_DEP_OUTPUT);
3338 pending = XEXP (pending, 1);
3339 pending_mem = XEXP (pending_mem, 1);
3342 for (u = last_pending_memory_flush; u; u = XEXP (u, 1))
3343 add_dependence (insn, XEXP (u, 0), REG_DEP_ANTI);
3345 add_insn_mem_dependence (&pending_write_insns, &pending_write_mems,
3348 sched_analyze_2 (XEXP (dest, 0), insn);
3351 /* Analyze reads. */
3352 if (GET_CODE (x) == SET)
3353 sched_analyze_2 (SET_SRC (x), insn);
3356 /* Analyze the uses of memory and registers in rtx X in INSN. */
3359 sched_analyze_2 (x, insn)
3365 register enum rtx_code code;
3366 register const char *fmt;
3371 code = GET_CODE (x);
3380 /* Ignore constants. Note that we must handle CONST_DOUBLE here
3381 because it may have a cc0_rtx in its CONST_DOUBLE_CHAIN field, but
3382 this does not mean that this insn is using cc0. */
3390 /* User of CC0 depends on immediately preceding insn. */
3391 SCHED_GROUP_P (insn) = 1;
3393 /* There may be a note before this insn now, but all notes will
3394 be removed before we actually try to schedule the insns, so
3395 it won't cause a problem later. We must avoid it here though. */
3396 prev = prev_nonnote_insn (insn);
3398 /* Make a copy of all dependencies on the immediately previous insn,
3399 and add to this insn. This is so that all the dependencies will
3400 apply to the group. Remove an explicit dependence on this insn
3401 as SCHED_GROUP_P now represents it. */
3403 if (find_insn_list (prev, LOG_LINKS (insn)))
3404 remove_dependence (insn, prev);
3406 for (link = LOG_LINKS (prev); link; link = XEXP (link, 1))
3407 add_dependence (insn, XEXP (link, 0), REG_NOTE_KIND (link));
3416 int regno = REGNO (x);
3417 if (regno < FIRST_PSEUDO_REGISTER)
3421 i = HARD_REGNO_NREGS (regno, GET_MODE (x));
3424 reg_last_uses[regno + i]
3425 = alloc_INSN_LIST (insn, reg_last_uses[regno + i]);
3427 for (u = reg_last_sets[regno + i]; u; u = XEXP (u, 1))
3428 add_dependence (insn, XEXP (u, 0), 0);
3430 /* ??? This should never happen. */
3431 for (u = reg_last_clobbers[regno + i]; u; u = XEXP (u, 1))
3432 add_dependence (insn, XEXP (u, 0), 0);
3434 if ((call_used_regs[regno + i] || global_regs[regno + i]))
3435 /* Function calls clobber all call_used regs. */
3436 for (u = last_function_call; u; u = XEXP (u, 1))
3437 add_dependence (insn, XEXP (u, 0), REG_DEP_ANTI);
3442 reg_last_uses[regno] = alloc_INSN_LIST (insn,
3443 reg_last_uses[regno]);
3445 for (u = reg_last_sets[regno]; u; u = XEXP (u, 1))
3446 add_dependence (insn, XEXP (u, 0), 0);
3448 /* ??? This should never happen. */
3449 for (u = reg_last_clobbers[regno]; u; u = XEXP (u, 1))
3450 add_dependence (insn, XEXP (u, 0), 0);
3452 /* Pseudos that are REG_EQUIV to something may be replaced
3453 by that during reloading. We need only add dependencies for
3454 the address in the REG_EQUIV note. */
3455 if (!reload_completed
3456 && reg_known_equiv_p[regno]
3457 && GET_CODE (reg_known_value[regno]) == MEM)
3458 sched_analyze_2 (XEXP (reg_known_value[regno], 0), insn);
3460 /* If the register does not already cross any calls, then add this
3461 insn to the sched_before_next_call list so that it will still
3462 not cross calls after scheduling. */
3463 if (REG_N_CALLS_CROSSED (regno) == 0)
3464 add_dependence (sched_before_next_call, insn, REG_DEP_ANTI);
3471 /* Reading memory. */
3473 rtx pending, pending_mem;
3475 pending = pending_read_insns;
3476 pending_mem = pending_read_mems;
3479 if (read_dependence (XEXP (pending_mem, 0), x))
3480 add_dependence (insn, XEXP (pending, 0), REG_DEP_ANTI);
3482 pending = XEXP (pending, 1);
3483 pending_mem = XEXP (pending_mem, 1);
3486 pending = pending_write_insns;
3487 pending_mem = pending_write_mems;
3490 if (true_dependence (XEXP (pending_mem, 0), VOIDmode,
3492 add_dependence (insn, XEXP (pending, 0), 0);
3494 pending = XEXP (pending, 1);
3495 pending_mem = XEXP (pending_mem, 1);
3498 for (u = last_pending_memory_flush; u; u = XEXP (u, 1))
3499 add_dependence (insn, XEXP (u, 0), REG_DEP_ANTI);
3501 /* Always add these dependencies to pending_reads, since
3502 this insn may be followed by a write. */
3503 add_insn_mem_dependence (&pending_read_insns, &pending_read_mems,
3506 /* Take advantage of tail recursion here. */
3507 sched_analyze_2 (XEXP (x, 0), insn);
3511 /* Force pending stores to memory in case a trap handler needs them. */
3513 flush_pending_lists (insn, 1);
3518 case UNSPEC_VOLATILE:
3522 /* Traditional and volatile asm instructions must be considered to use
3523 and clobber all hard registers, all pseudo-registers and all of
3524 memory. So must TRAP_IF and UNSPEC_VOLATILE operations.
3526 Consider for instance a volatile asm that changes the fpu rounding
3527 mode. An insn should not be moved across this even if it only uses
3528 pseudo-regs because it might give an incorrectly rounded result. */
3529 if (code != ASM_OPERANDS || MEM_VOLATILE_P (x))
3531 int max_reg = max_reg_num ();
3532 for (i = 0; i < max_reg; i++)
3534 for (u = reg_last_uses[i]; u; u = XEXP (u, 1))
3535 add_dependence (insn, XEXP (u, 0), REG_DEP_ANTI);
3536 free_INSN_LIST_list (&reg_last_uses[i]);
3538 for (u = reg_last_sets[i]; u; u = XEXP (u, 1))
3539 add_dependence (insn, XEXP (u, 0), 0);
3541 for (u = reg_last_clobbers[i]; u; u = XEXP (u, 1))
3542 add_dependence (insn, XEXP (u, 0), 0);
3544 reg_pending_sets_all = 1;
3546 flush_pending_lists (insn, 0);
3549 /* For all ASM_OPERANDS, we must traverse the vector of input operands.
3550 We cannot just fall through here, since then we would be confused
3551 by the ASM_INPUT rtx inside ASM_OPERANDS, which, unlike its normal
3552 usage, does not indicate a traditional asm. */
3554 if (code == ASM_OPERANDS)
3556 for (j = 0; j < ASM_OPERANDS_INPUT_LENGTH (x); j++)
3557 sched_analyze_2 (ASM_OPERANDS_INPUT (x, j), insn);
3567 /* These both read and modify the result. We must handle them as writes
3568 to get proper dependencies for following instructions. We must handle
3569 them as reads to get proper dependencies from this to previous
3570 instructions. Thus we need to pass them to both sched_analyze_1
3571 and sched_analyze_2. We must call sched_analyze_2 first in order
3572 to get the proper antecedent for the read. */
3573 sched_analyze_2 (XEXP (x, 0), insn);
3574 sched_analyze_1 (x, insn);
3581 /* Other cases: walk the insn. */
3582 fmt = GET_RTX_FORMAT (code);
3583 for (i = GET_RTX_LENGTH (code) - 1; i >= 0; i--)
3586 sched_analyze_2 (XEXP (x, i), insn);
3587 else if (fmt[i] == 'E')
3588 for (j = 0; j < XVECLEN (x, i); j++)
3589 sched_analyze_2 (XVECEXP (x, i, j), insn);
3593 /* Analyze an INSN with pattern X to find all dependencies. */
3596 sched_analyze_insn (x, insn, loop_notes)
3600 register RTX_CODE code = GET_CODE (x);
3602 int maxreg = max_reg_num ();
3605 if (code == SET || code == CLOBBER)
3606 sched_analyze_1 (x, insn);
3607 else if (code == PARALLEL)
3610 for (i = XVECLEN (x, 0) - 1; i >= 0; i--)
3612 code = GET_CODE (XVECEXP (x, 0, i));
3613 if (code == SET || code == CLOBBER)
3614 sched_analyze_1 (XVECEXP (x, 0, i), insn);
3616 sched_analyze_2 (XVECEXP (x, 0, i), insn);
3620 sched_analyze_2 (x, insn);
3622 /* Mark registers clobbered or used by the called function. */
3623 if (GET_CODE (insn) == CALL_INSN)
3624 for (link = CALL_INSN_FUNCTION_USAGE (insn); link; link = XEXP (link, 1))
3626 if (GET_CODE (XEXP (link, 0)) == CLOBBER)
3627 sched_analyze_1 (XEXP (link, 0), insn);
3629 sched_analyze_2 (XEXP (link, 0), insn);
3632 /* If there is a {LOOP,EHREGION}_{BEG,END} note in the middle of a basic
3633 block, then we must be sure that no instructions are scheduled across it.
3634 Otherwise, the reg_n_refs info (which depends on loop_depth) would
3635 become incorrect. */
3639 int max_reg = max_reg_num ();
3640 int schedule_barrier_found = 0;
3643 /* Update loop_notes with any notes from this insn. Also determine
3644 if any of the notes on the list correspond to instruction scheduling
3645 barriers (loop, eh & setjmp notes, but not range notes). */
3647 while (XEXP (link, 1))
3649 if (INTVAL (XEXP (link, 0)) == NOTE_INSN_LOOP_BEG
3650 || INTVAL (XEXP (link, 0)) == NOTE_INSN_LOOP_END
3651 || INTVAL (XEXP (link, 0)) == NOTE_INSN_EH_REGION_BEG
3652 || INTVAL (XEXP (link, 0)) == NOTE_INSN_EH_REGION_END
3653 || INTVAL (XEXP (link, 0)) == NOTE_INSN_SETJMP)
3654 schedule_barrier_found = 1;
3656 link = XEXP (link, 1);
3658 XEXP (link, 1) = REG_NOTES (insn);
3659 REG_NOTES (insn) = loop_notes;
3661 /* Add dependencies if a scheduling barrier was found. */
3662 if (schedule_barrier_found)
3664 for (i = 0; i < max_reg; i++)
3667 for (u = reg_last_uses[i]; u; u = XEXP (u, 1))
3668 add_dependence (insn, XEXP (u, 0), REG_DEP_ANTI);
3669 free_INSN_LIST_list (&reg_last_uses[i]);
3671 for (u = reg_last_sets[i]; u; u = XEXP (u, 1))
3672 add_dependence (insn, XEXP (u, 0), 0);
3674 for (u = reg_last_clobbers[i]; u; u = XEXP (u, 1))
3675 add_dependence (insn, XEXP (u, 0), 0);
3677 reg_pending_sets_all = 1;
3679 flush_pending_lists (insn, 0);
3684 /* Accumulate clobbers until the next set so that it will be output dependent
3685 on all of them. At the next set we can clear the clobber list, since
3686 subsequent sets will be output dependent on it. */
3687 EXECUTE_IF_SET_IN_REG_SET (reg_pending_sets, 0, i,
3689 free_INSN_LIST_list (&reg_last_sets[i]);
3690 free_INSN_LIST_list (&reg_last_clobbers[i]);
3692 = alloc_INSN_LIST (insn, NULL_RTX);
3694 EXECUTE_IF_SET_IN_REG_SET (reg_pending_clobbers, 0, i,
3696 reg_last_clobbers[i]
3697 = alloc_INSN_LIST (insn,
3698 reg_last_clobbers[i]);
3700 CLEAR_REG_SET (reg_pending_sets);
3701 CLEAR_REG_SET (reg_pending_clobbers);
3703 if (reg_pending_sets_all)
3705 for (i = 0; i < maxreg; i++)
3707 free_INSN_LIST_list (&reg_last_sets[i]);
3708 reg_last_sets[i] = alloc_INSN_LIST (insn, NULL_RTX);
3711 reg_pending_sets_all = 0;
3714 /* Handle function calls and function returns created by the epilogue
3716 if (GET_CODE (insn) == CALL_INSN || GET_CODE (insn) == JUMP_INSN)
3721 /* When scheduling instructions, we make sure calls don't lose their
3722 accompanying USE insns by making each depend on the next, in order.
3724 Also, we must do the same thing for returns created by the epilogue
3725 threading code. Note this code works only in this special case,
3726 because other passes make no guarantee that they will never emit
3727 an instruction between a USE and a RETURN. There is such a guarantee
3728 for USE instructions immediately before a call. */
3730 prev_dep_insn = insn;
3731 dep_insn = PREV_INSN (insn);
3732 while (GET_CODE (dep_insn) == INSN
3733 && GET_CODE (PATTERN (dep_insn)) == USE
3734 && GET_CODE (XEXP (PATTERN (dep_insn), 0)) == REG)
3736 SCHED_GROUP_P (prev_dep_insn) = 1;
3738 /* Make a copy of all dependencies on dep_insn, and add to insn.
3739 This is so that all of the dependencies will apply to the
3742 for (link = LOG_LINKS (dep_insn); link; link = XEXP (link, 1))
3743 add_dependence (insn, XEXP (link, 0), REG_NOTE_KIND (link));
3745 prev_dep_insn = dep_insn;
3746 dep_insn = PREV_INSN (dep_insn);
3751 /* Analyze every insn between HEAD and TAIL inclusive, creating LOG_LINKS
3752 for every dependency. */
3755 sched_analyze (head, tail)
3762 for (insn = head;; insn = NEXT_INSN (insn))
3764 if (GET_CODE (insn) == INSN || GET_CODE (insn) == JUMP_INSN)
3766 /* Clear out the stale LOG_LINKS from flow. */
3767 free_INSN_LIST_list (&LOG_LINKS (insn));
3769 /* Make each JUMP_INSN a scheduling barrier for memory
3771 if (GET_CODE (insn) == JUMP_INSN)
3772 last_pending_memory_flush
3773 = alloc_INSN_LIST (insn, last_pending_memory_flush);
3774 sched_analyze_insn (PATTERN (insn), insn, loop_notes);
3777 else if (GET_CODE (insn) == CALL_INSN)
3782 CANT_MOVE (insn) = 1;
3784 /* Clear out the stale LOG_LINKS from flow. */
3785 free_INSN_LIST_list (&LOG_LINKS (insn));
3787 /* Any instruction using a hard register which may get clobbered
3788 by a call needs to be marked as dependent on this call.
3789 This prevents a use of a hard return reg from being moved
3790 past a void call (i.e. it does not explicitly set the hard
3793 /* If this call is followed by a NOTE_INSN_SETJMP, then assume that
3794 all registers, not just hard registers, may be clobbered by this
3797 /* Insn, being a CALL_INSN, magically depends on
3798 `last_function_call' already. */
3800 if (NEXT_INSN (insn) && GET_CODE (NEXT_INSN (insn)) == NOTE
3801 && NOTE_LINE_NUMBER (NEXT_INSN (insn)) == NOTE_INSN_SETJMP)
3803 int max_reg = max_reg_num ();
3804 for (i = 0; i < max_reg; i++)
3806 for (u = reg_last_uses[i]; u; u = XEXP (u, 1))
3807 add_dependence (insn, XEXP (u, 0), REG_DEP_ANTI);
3808 free_INSN_LIST_list (&reg_last_uses[i]);
3810 for (u = reg_last_sets[i]; u; u = XEXP (u, 1))
3811 add_dependence (insn, XEXP (u, 0), 0);
3813 for (u = reg_last_clobbers[i]; u; u = XEXP (u, 1))
3814 add_dependence (insn, XEXP (u, 0), 0);
3816 reg_pending_sets_all = 1;
3818 /* Add a pair of REG_SAVE_NOTEs which we will later
3819 convert back into a NOTE_INSN_SETJMP note. See
3820 reemit_notes for why we use a pair of NOTEs. */
3821 REG_NOTES (insn) = alloc_EXPR_LIST (REG_SAVE_NOTE,
3824 REG_NOTES (insn) = alloc_EXPR_LIST (REG_SAVE_NOTE,
3825 GEN_INT (NOTE_INSN_SETJMP),
3830 for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
3831 if (call_used_regs[i] || global_regs[i])
3833 for (u = reg_last_uses[i]; u; u = XEXP (u, 1))
3834 add_dependence (insn, XEXP (u, 0), REG_DEP_ANTI);
3836 for (u = reg_last_sets[i]; u; u = XEXP (u, 1))
3837 add_dependence (insn, XEXP (u, 0), REG_DEP_ANTI);
3839 SET_REGNO_REG_SET (reg_pending_clobbers, i);
3843 /* For each insn which shouldn't cross a call, add a dependence
3844 between that insn and this call insn. */
3845 x = LOG_LINKS (sched_before_next_call);
3848 add_dependence (insn, XEXP (x, 0), REG_DEP_ANTI);
3851 free_INSN_LIST_list (&LOG_LINKS (sched_before_next_call));
3853 sched_analyze_insn (PATTERN (insn), insn, loop_notes);
3856 /* In the absence of interprocedural alias analysis, we must flush
3857 all pending reads and writes, and start new dependencies starting
3858 from here. But only flush writes for constant calls (which may
3859 be passed a pointer to something we haven't written yet). */
3860 flush_pending_lists (insn, CONST_CALL_P (insn));
3862 /* Depend this function call (actually, the user of this
3863 function call) on all hard register clobberage. */
3865 /* last_function_call is now a list of insns. */
3866 free_INSN_LIST_list (&last_function_call);
3867 last_function_call = alloc_INSN_LIST (insn, NULL_RTX);
3870 /* See comments on reemit_notes as to why we do this.
3871 ??? Actually, the reemit_notes just say what is done, not why. */
3873 else if (GET_CODE (insn) == NOTE
3874 && (NOTE_LINE_NUMBER (insn) == NOTE_INSN_RANGE_START
3875 || NOTE_LINE_NUMBER (insn) == NOTE_INSN_RANGE_END))
3877 loop_notes = alloc_EXPR_LIST (REG_SAVE_NOTE, NOTE_RANGE_INFO (insn),
3879 loop_notes = alloc_EXPR_LIST (REG_SAVE_NOTE,
3880 GEN_INT (NOTE_LINE_NUMBER (insn)),
3883 else if (GET_CODE (insn) == NOTE
3884 && (NOTE_LINE_NUMBER (insn) == NOTE_INSN_LOOP_BEG
3885 || NOTE_LINE_NUMBER (insn) == NOTE_INSN_LOOP_END
3886 || NOTE_LINE_NUMBER (insn) == NOTE_INSN_EH_REGION_BEG
3887 || NOTE_LINE_NUMBER (insn) == NOTE_INSN_EH_REGION_END
3888 || (NOTE_LINE_NUMBER (insn) == NOTE_INSN_SETJMP
3889 && GET_CODE (PREV_INSN (insn)) != CALL_INSN)))
3893 if (NOTE_LINE_NUMBER (insn) == NOTE_INSN_EH_REGION_BEG
3894 || NOTE_LINE_NUMBER (insn) == NOTE_INSN_EH_REGION_END)
3895 rtx_region = GEN_INT (NOTE_EH_HANDLER (insn));
3897 rtx_region = GEN_INT (0);
3899 loop_notes = alloc_EXPR_LIST (REG_SAVE_NOTE,
3902 loop_notes = alloc_EXPR_LIST (REG_SAVE_NOTE,
3903 GEN_INT (NOTE_LINE_NUMBER (insn)),
3905 CONST_CALL_P (loop_notes) = CONST_CALL_P (insn);
3914 /* Macros and functions for keeping the priority queue sorted, and
3915 dealing with queueing and dequeueing of instructions. */
3917 #define SCHED_SORT(READY, N_READY) \
3918 do { if ((N_READY) == 2) \
3919 swap_sort (READY, N_READY); \
3920 else if ((N_READY) > 2) \
3921 qsort (READY, N_READY, sizeof (rtx), rank_for_schedule); } \
3924 /* Returns a positive value if x is preferred; returns a negative value if
3925 y is preferred. Should never return 0, since that will make the sort
3929 rank_for_schedule (x, y)
3933 rtx tmp = *(rtx *)y;
3934 rtx tmp2 = *(rtx *)x;
3936 int tmp_class, tmp2_class, depend_count1, depend_count2;
3937 int val, priority_val, spec_val, prob_val, weight_val;
3940 /* Prefer insn with higher priority. */
3941 priority_val = INSN_PRIORITY (tmp2) - INSN_PRIORITY (tmp);
3943 return priority_val;
3945 /* Prefer an insn with smaller contribution to register pressure. */
3946 if (!reload_completed &&
3947 (weight_val = INSN_REG_WEIGHT (tmp) - INSN_REG_WEIGHT (tmp2)))
3948 return (weight_val);
3950 /* Some comparisons make sense only in interblock scheduling. */
3951 if (INSN_BB (tmp) != INSN_BB (tmp2))
3953 /* Prefer an inblock motion over an interblock motion. */
3954 if ((INSN_BB (tmp2) == target_bb) && (INSN_BB (tmp) != target_bb))
3956 if ((INSN_BB (tmp) == target_bb) && (INSN_BB (tmp2) != target_bb))
3959 /* Prefer a useful motion over a speculative one. */
3960 if ((spec_val = IS_SPECULATIVE_INSN (tmp) - IS_SPECULATIVE_INSN (tmp2)))
3963 /* Prefer a more probable (speculative) insn. */
3964 prob_val = INSN_PROBABILITY (tmp2) - INSN_PROBABILITY (tmp);
3969 /* Compare insns based on their relation to the last-scheduled-insn. */
3970 if (last_scheduled_insn)
3972 /* Classify the instructions into three classes:
3973 1) Data dependent on last scheduled insn.
3974 2) Anti/Output dependent on last scheduled insn.
3975 3) Independent of last scheduled insn, or has latency of one.
3976 Choose the insn from the highest numbered class if different. */
3977 link = find_insn_list (tmp, INSN_DEPEND (last_scheduled_insn));
3978 if (link == 0 || insn_cost (last_scheduled_insn, link, tmp) == 1)
3980 else if (REG_NOTE_KIND (link) == 0) /* Data dependence. */
3985 link = find_insn_list (tmp2, INSN_DEPEND (last_scheduled_insn));
3986 if (link == 0 || insn_cost (last_scheduled_insn, link, tmp2) == 1)
3988 else if (REG_NOTE_KIND (link) == 0) /* Data dependence. */
3993 if ((val = tmp2_class - tmp_class))
3997 /* Prefer the insn which has more later insns that depend on it.
3998 This gives the scheduler more freedom when scheduling later
3999 instructions at the expense of added register pressure. */
4001 for (link = INSN_DEPEND (tmp); link; link = XEXP (link, 1))
4005 for (link = INSN_DEPEND (tmp2); link; link = XEXP (link, 1))
4008 val = depend_count2 - depend_count1;
4012 /* If insns are equally good, sort by INSN_LUID (original insn order),
4013 so that we make the sort stable. This minimizes instruction movement,
4014 thus minimizing sched's effect on debugging and cross-jumping. */
4015 return INSN_LUID (tmp) - INSN_LUID (tmp2);
4018 /* Resort the array A in which only element at index N may be out of order. */
4020 HAIFA_INLINE static void
4025 rtx insn = a[n - 1];
4028 while (i >= 0 && rank_for_schedule (a + i, &insn) >= 0)
4036 static int max_priority;
4038 /* Add INSN to the insn queue so that it can be executed at least
4039 N_CYCLES after the currently executing insn. Preserve insns
4040 chain for debugging purposes. */
4042 HAIFA_INLINE static void
4043 queue_insn (insn, n_cycles)
4047 int next_q = NEXT_Q_AFTER (q_ptr, n_cycles);
4048 rtx link = alloc_INSN_LIST (insn, insn_queue[next_q]);
4049 insn_queue[next_q] = link;
4052 if (sched_verbose >= 2)
4054 fprintf (dump, ";;\t\tReady-->Q: insn %d: ", INSN_UID (insn));
4056 if (INSN_BB (insn) != target_bb)
4057 fprintf (dump, "(b%d) ", INSN_BLOCK (insn));
4059 fprintf (dump, "queued for %d cycles.\n", n_cycles);
4064 /* PREV is an insn that is ready to execute. Adjust its priority if that
4065 will help shorten or lengthen register lifetimes as appropriate. Also
4066 provide a hook for the target to tweak itself. */
4068 HAIFA_INLINE static void
4069 adjust_priority (prev)
4070 rtx prev ATTRIBUTE_UNUSED;
4072 /* ??? There used to be code here to try and estimate how an insn
4073 affected register lifetimes, but it did it by looking at REG_DEAD
4074 notes, which we removed in schedule_region. Nor did it try to
4075 take into account register pressure or anything useful like that.
4077 Revisit when we have a machine model to work with and not before. */
4079 #ifdef ADJUST_PRIORITY
4080 ADJUST_PRIORITY (prev);
4084 /* Clock at which the previous instruction was issued. */
4085 static int last_clock_var;
4087 /* INSN is the "currently executing insn". Launch each insn which was
4088 waiting on INSN. READY is a vector of insns which are ready to fire.
4089 N_READY is the number of elements in READY. CLOCK is the current
4093 schedule_insn (insn, ready, n_ready, clock)
4102 unit = insn_unit (insn);
4104 if (sched_verbose >= 2)
4106 fprintf (dump, ";;\t\t--> scheduling insn <<<%d>>> on unit ",
4108 insn_print_units (insn);
4109 fprintf (dump, "\n");
4112 if (sched_verbose && unit == -1)
4113 visualize_no_unit (insn);
4115 if (MAX_BLOCKAGE > 1 || issue_rate > 1 || sched_verbose)
4116 schedule_unit (unit, insn, clock);
4118 if (INSN_DEPEND (insn) == 0)
4121 /* This is used by the function adjust_priority above. */
4123 max_priority = MAX (INSN_PRIORITY (ready[0]), INSN_PRIORITY (insn));
4125 max_priority = INSN_PRIORITY (insn);
4127 for (link = INSN_DEPEND (insn); link != 0; link = XEXP (link, 1))
4129 rtx next = XEXP (link, 0);
4130 int cost = insn_cost (insn, link, next);
4132 INSN_TICK (next) = MAX (INSN_TICK (next), clock + cost);
4134 if ((INSN_DEP_COUNT (next) -= 1) == 0)
4136 int effective_cost = INSN_TICK (next) - clock;
4138 /* For speculative insns, before inserting to ready/queue,
4139 check live, exception-free, and issue-delay. */
4140 if (INSN_BB (next) != target_bb
4141 && (!IS_VALID (INSN_BB (next))
4143 || (IS_SPECULATIVE_INSN (next)
4144 && (insn_issue_delay (next) > 3
4145 || !check_live (next, INSN_BB (next))
4146 || !is_exception_free (next, INSN_BB (next), target_bb)))))
4149 if (sched_verbose >= 2)
4151 fprintf (dump, ";;\t\tdependences resolved: insn %d ",
4154 if (current_nr_blocks > 1 && INSN_BB (next) != target_bb)
4155 fprintf (dump, "/b%d ", INSN_BLOCK (next));
4157 if (effective_cost < 1)
4158 fprintf (dump, "into ready\n");
4160 fprintf (dump, "into queue with cost=%d\n", effective_cost);
4163 /* Adjust the priority of NEXT and either put it on the ready
4164 list or queue it. */
4165 adjust_priority (next);
4166 if (effective_cost < 1)
4167 ready[n_ready++] = next;
4169 queue_insn (next, effective_cost);
4173 /* Annotate the instruction with issue information -- TImode
4174 indicates that the instruction is expected not to be able
4175 to issue on the same cycle as the previous insn. A machine
4176 may use this information to decide how the instruction should
4178 if (reload_completed && issue_rate > 1)
4180 PUT_MODE (insn, clock > last_clock_var ? TImode : VOIDmode);
4181 last_clock_var = clock;
4187 /* Functions for handling of notes. */
4189 /* Delete notes beginning with INSN and put them in the chain
4190 of notes ended by NOTE_LIST.
4191 Returns the insn following the notes. */
4194 unlink_other_notes (insn, tail)
4197 rtx prev = PREV_INSN (insn);
4199 while (insn != tail && GET_CODE (insn) == NOTE)
4201 rtx next = NEXT_INSN (insn);
4202 /* Delete the note from its current position. */
4204 NEXT_INSN (prev) = next;
4206 PREV_INSN (next) = prev;
4208 /* See sched_analyze to see how these are handled. */
4209 if (NOTE_LINE_NUMBER (insn) != NOTE_INSN_SETJMP
4210 && NOTE_LINE_NUMBER (insn) != NOTE_INSN_LOOP_BEG
4211 && NOTE_LINE_NUMBER (insn) != NOTE_INSN_LOOP_END
4212 && NOTE_LINE_NUMBER (insn) != NOTE_INSN_RANGE_START
4213 && NOTE_LINE_NUMBER (insn) != NOTE_INSN_RANGE_END
4214 && NOTE_LINE_NUMBER (insn) != NOTE_INSN_EH_REGION_BEG
4215 && NOTE_LINE_NUMBER (insn) != NOTE_INSN_EH_REGION_END)
4217 /* Insert the note at the end of the notes list. */
4218 PREV_INSN (insn) = note_list;
4220 NEXT_INSN (note_list) = insn;
4229 /* Delete line notes beginning with INSN. Record line-number notes so
4230 they can be reused. Returns the insn following the notes. */
4233 unlink_line_notes (insn, tail)
4236 rtx prev = PREV_INSN (insn);
4238 while (insn != tail && GET_CODE (insn) == NOTE)
4240 rtx next = NEXT_INSN (insn);
4242 if (write_symbols != NO_DEBUG && NOTE_LINE_NUMBER (insn) > 0)
4244 /* Delete the note from its current position. */
4246 NEXT_INSN (prev) = next;
4248 PREV_INSN (next) = prev;
4250 /* Record line-number notes so they can be reused. */
4251 LINE_NOTE (insn) = insn;
4261 /* Return the head and tail pointers of BB. */
4263 HAIFA_INLINE static void
4264 get_block_head_tail (bb, headp, tailp)
4274 b = BB_TO_BLOCK (bb);
4276 /* HEAD and TAIL delimit the basic block being scheduled. */
4277 head = BLOCK_HEAD (b);
4278 tail = BLOCK_END (b);
4280 /* Don't include any notes or labels at the beginning of the
4281 basic block, or notes at the ends of basic blocks. */
4282 while (head != tail)
4284 if (GET_CODE (head) == NOTE)
4285 head = NEXT_INSN (head);
4286 else if (GET_CODE (tail) == NOTE)
4287 tail = PREV_INSN (tail);
4288 else if (GET_CODE (head) == CODE_LABEL)
4289 head = NEXT_INSN (head);
4298 /* Delete line notes from bb. Save them so they can be later restored
4299 (in restore_line_notes ()). */
4310 get_block_head_tail (bb, &head, &tail);
4313 && (GET_RTX_CLASS (GET_CODE (head)) != 'i'))
4316 next_tail = NEXT_INSN (tail);
4317 for (insn = head; insn != next_tail; insn = NEXT_INSN (insn))
4321 /* Farm out notes, and maybe save them in NOTE_LIST.
4322 This is needed to keep the debugger from
4323 getting completely deranged. */
4324 if (GET_CODE (insn) == NOTE)
4327 insn = unlink_line_notes (insn, next_tail);
4333 if (insn == next_tail)
4339 /* Save line number notes for each insn in bb. */
4342 save_line_notes (bb)
4348 /* We must use the true line number for the first insn in the block
4349 that was computed and saved at the start of this pass. We can't
4350 use the current line number, because scheduling of the previous
4351 block may have changed the current line number. */
4353 rtx line = line_note_head[BB_TO_BLOCK (bb)];
4356 get_block_head_tail (bb, &head, &tail);
4357 next_tail = NEXT_INSN (tail);
4359 for (insn = BLOCK_HEAD (BB_TO_BLOCK (bb));
4361 insn = NEXT_INSN (insn))
4362 if (GET_CODE (insn) == NOTE && NOTE_LINE_NUMBER (insn) > 0)
4365 LINE_NOTE (insn) = line;
4369 /* After bb was scheduled, insert line notes into the insns list. */
4372 restore_line_notes (bb)
4375 rtx line, note, prev, new;
4376 int added_notes = 0;
4378 rtx head, next_tail, insn;
4380 b = BB_TO_BLOCK (bb);
4382 head = BLOCK_HEAD (b);
4383 next_tail = NEXT_INSN (BLOCK_END (b));
4385 /* Determine the current line-number. We want to know the current
4386 line number of the first insn of the block here, in case it is
4387 different from the true line number that was saved earlier. If
4388 different, then we need a line number note before the first insn
4389 of this block. If it happens to be the same, then we don't want to
4390 emit another line number note here. */
4391 for (line = head; line; line = PREV_INSN (line))
4392 if (GET_CODE (line) == NOTE && NOTE_LINE_NUMBER (line) > 0)
4395 /* Walk the insns keeping track of the current line-number and inserting
4396 the line-number notes as needed. */
4397 for (insn = head; insn != next_tail; insn = NEXT_INSN (insn))
4398 if (GET_CODE (insn) == NOTE && NOTE_LINE_NUMBER (insn) > 0)
4400 /* This used to emit line number notes before every non-deleted note.
4401 However, this confuses a debugger, because line notes not separated
4402 by real instructions all end up at the same address. I can find no
4403 use for line number notes before other notes, so none are emitted. */
4404 else if (GET_CODE (insn) != NOTE
4405 && (note = LINE_NOTE (insn)) != 0
4408 || NOTE_LINE_NUMBER (note) != NOTE_LINE_NUMBER (line)
4409 || NOTE_SOURCE_FILE (note) != NOTE_SOURCE_FILE (line)))
4412 prev = PREV_INSN (insn);
4413 if (LINE_NOTE (note))
4415 /* Re-use the original line-number note. */
4416 LINE_NOTE (note) = 0;
4417 PREV_INSN (note) = prev;
4418 NEXT_INSN (prev) = note;
4419 PREV_INSN (insn) = note;
4420 NEXT_INSN (note) = insn;
4425 new = emit_note_after (NOTE_LINE_NUMBER (note), prev);
4426 NOTE_SOURCE_FILE (new) = NOTE_SOURCE_FILE (note);
4427 RTX_INTEGRATED_P (new) = RTX_INTEGRATED_P (note);
4430 if (sched_verbose && added_notes)
4431 fprintf (dump, ";; added %d line-number notes\n", added_notes);
4434 /* After scheduling the function, delete redundant line notes from the
4438 rm_redundant_line_notes ()
4441 rtx insn = get_insns ();
4442 int active_insn = 0;
4445 /* Walk the insns deleting redundant line-number notes. Many of these
4446 are already present. The remainder tend to occur at basic
4447 block boundaries. */
4448 for (insn = get_last_insn (); insn; insn = PREV_INSN (insn))
4449 if (GET_CODE (insn) == NOTE && NOTE_LINE_NUMBER (insn) > 0)
4451 /* If there are no active insns following, INSN is redundant. */
4452 if (active_insn == 0)
4455 NOTE_SOURCE_FILE (insn) = 0;
4456 NOTE_LINE_NUMBER (insn) = NOTE_INSN_DELETED;
4458 /* If the line number is unchanged, LINE is redundant. */
4460 && NOTE_LINE_NUMBER (line) == NOTE_LINE_NUMBER (insn)
4461 && NOTE_SOURCE_FILE (line) == NOTE_SOURCE_FILE (insn))
4464 NOTE_SOURCE_FILE (line) = 0;
4465 NOTE_LINE_NUMBER (line) = NOTE_INSN_DELETED;
4472 else if (!((GET_CODE (insn) == NOTE
4473 && NOTE_LINE_NUMBER (insn) == NOTE_INSN_DELETED)
4474 || (GET_CODE (insn) == INSN
4475 && (GET_CODE (PATTERN (insn)) == USE
4476 || GET_CODE (PATTERN (insn)) == CLOBBER))))
4479 if (sched_verbose && notes)
4480 fprintf (dump, ";; deleted %d line-number notes\n", notes);
4483 /* Delete notes between head and tail and put them in the chain
4484 of notes ended by NOTE_LIST. */
4487 rm_other_notes (head, tail)
4495 && (GET_RTX_CLASS (GET_CODE (head)) != 'i'))
4498 next_tail = NEXT_INSN (tail);
4499 for (insn = head; insn != next_tail; insn = NEXT_INSN (insn))
4503 /* Farm out notes, and maybe save them in NOTE_LIST.
4504 This is needed to keep the debugger from
4505 getting completely deranged. */
4506 if (GET_CODE (insn) == NOTE)
4510 insn = unlink_other_notes (insn, next_tail);
4516 if (insn == next_tail)
4522 /* Functions for computation of registers live/usage info. */
4524 /* Calculate INSN_REG_WEIGHT for all insns of a block. */
4527 find_insn_reg_weight (bb)
4530 rtx insn, next_tail, head, tail;
4532 get_block_head_tail (bb, &head, &tail);
4533 next_tail = NEXT_INSN (tail);
4535 for (insn = head; insn != next_tail; insn = NEXT_INSN (insn))
4540 /* Handle register life information. */
4541 if (GET_RTX_CLASS (GET_CODE (insn)) != 'i')
4544 /* Increment weight for each register born here. */
4546 if ((GET_CODE (x) == SET || GET_CODE (x) == CLOBBER)
4547 && register_operand (SET_DEST (x), VOIDmode))
4549 else if (GET_CODE (x) == PARALLEL)
4552 for (j = XVECLEN (x, 0) - 1; j >= 0; j--)
4554 x = XVECEXP (PATTERN (insn), 0, j);
4555 if ((GET_CODE (x) == SET || GET_CODE (x) == CLOBBER)
4556 && register_operand (SET_DEST (x), VOIDmode))
4561 /* Decrement weight for each register that dies here. */
4562 for (x = REG_NOTES (insn); x; x = XEXP (x, 1))
4564 if (REG_NOTE_KIND (x) == REG_DEAD
4565 || REG_NOTE_KIND (x) == REG_UNUSED)
4569 INSN_REG_WEIGHT (insn) = reg_weight;
4573 /* Scheduling clock, modified in schedule_block() and queue_to_ready (). */
4574 static int clock_var;
4576 /* Move insns that became ready to fire from queue to ready list. */
4579 queue_to_ready (ready, n_ready)
4586 q_ptr = NEXT_Q (q_ptr);
4588 /* Add all pending insns that can be scheduled without stalls to the
4590 for (link = insn_queue[q_ptr]; link; link = XEXP (link, 1))
4593 insn = XEXP (link, 0);
4596 if (sched_verbose >= 2)
4597 fprintf (dump, ";;\t\tQ-->Ready: insn %d: ", INSN_UID (insn));
4599 if (sched_verbose >= 2 && INSN_BB (insn) != target_bb)
4600 fprintf (dump, "(b%d) ", INSN_BLOCK (insn));
4602 ready[n_ready++] = insn;
4603 if (sched_verbose >= 2)
4604 fprintf (dump, "moving to ready without stalls\n");
4606 insn_queue[q_ptr] = 0;
4608 /* If there are no ready insns, stall until one is ready and add all
4609 of the pending insns at that point to the ready list. */
4612 register int stalls;
4614 for (stalls = 1; stalls < INSN_QUEUE_SIZE; stalls++)
4616 if ((link = insn_queue[NEXT_Q_AFTER (q_ptr, stalls)]))
4618 for (; link; link = XEXP (link, 1))
4620 insn = XEXP (link, 0);
4623 if (sched_verbose >= 2)
4624 fprintf (dump, ";;\t\tQ-->Ready: insn %d: ", INSN_UID (insn));
4626 if (sched_verbose >= 2 && INSN_BB (insn) != target_bb)
4627 fprintf (dump, "(b%d) ", INSN_BLOCK (insn));
4629 ready[n_ready++] = insn;
4630 if (sched_verbose >= 2)
4631 fprintf (dump, "moving to ready with %d stalls\n", stalls);
4633 insn_queue[NEXT_Q_AFTER (q_ptr, stalls)] = 0;
4640 if (sched_verbose && stalls)
4641 visualize_stall_cycles (BB_TO_BLOCK (target_bb), stalls);
4642 q_ptr = NEXT_Q_AFTER (q_ptr, stalls);
4643 clock_var += stalls;
4648 /* Print the ready list for debugging purposes. Callable from debugger. */
4651 debug_ready_list (ready, n_ready)
4657 for (i = 0; i < n_ready; i++)
4659 fprintf (dump, " %d", INSN_UID (ready[i]));
4660 if (current_nr_blocks > 1 && INSN_BB (ready[i]) != target_bb)
4661 fprintf (dump, "/b%d", INSN_BLOCK (ready[i]));
4663 fprintf (dump, "\n");
4666 /* Print names of units on which insn can/should execute, for debugging. */
4669 insn_print_units (insn)
4673 int unit = insn_unit (insn);
4676 fprintf (dump, "none");
4678 fprintf (dump, "%s", function_units[unit].name);
4681 fprintf (dump, "[");
4682 for (i = 0, unit = ~unit; unit; i++, unit >>= 1)
4685 fprintf (dump, "%s", function_units[i].name);
4687 fprintf (dump, " ");
4689 fprintf (dump, "]");
4693 /* MAX_VISUAL_LINES is the maximum number of lines in visualization table
4694 of a basic block. If more lines are needed, the table is split in two.
4695 n_visual_lines is the number of lines printed so far for a block.
4696 visual_tbl contains the block visualization info.
4697 vis_no_unit holds insns in a cycle that are not mapped to any unit. */
4698 #define MAX_VISUAL_LINES 100
4703 rtx vis_no_unit[10];
4705 /* Finds units that are in use in this function. Required only
4706 for visualization. */
4709 init_target_units ()
4714 for (insn = get_last_insn (); insn; insn = PREV_INSN (insn))
4716 if (GET_RTX_CLASS (GET_CODE (insn)) != 'i')
4719 unit = insn_unit (insn);
4722 target_units |= ~unit;
4724 target_units |= (1 << unit);
4728 /* Return the length of the visualization table. */
4731 get_visual_tbl_length ()
4737 /* Compute length of one field in line. */
4738 s = (char *) alloca (INSN_LEN + 6);
4739 sprintf (s, " %33s", "uname");
4742 /* Compute length of one line. */
4745 for (unit = 0; unit < FUNCTION_UNITS_SIZE; unit++)
4746 if (function_units[unit].bitmask & target_units)
4747 for (i = 0; i < function_units[unit].multiplicity; i++)
4750 n += strlen ("\n") + 2;
4752 /* Compute length of visualization string. */
4753 return (MAX_VISUAL_LINES * n);
4756 /* Init block visualization debugging info. */
4759 init_block_visualization ()
4761 strcpy (visual_tbl, "");
4769 safe_concat (buf, cur, str)
4774 char *end = buf + BUF_LEN - 2; /* Leave room for null. */
4783 while (cur < end && (c = *str++) != '\0')
4790 /* This recognizes rtx's that I classified as expressions. These always
4791 represent some action on values or results of other expressions, which
4792 may be stored in objects representing values. */
4795 print_exp (buf, x, verbose)
const char *fun = (char *) 0;
for (i = 0; i < 4; i++)
switch (GET_CODE (x))
  op[0] = XEXP (x, 0);
  if (GET_CODE (XEXP (x, 1)) == CONST_INT
      && INTVAL (XEXP (x, 1)) < 0)
    op[1] = GEN_INT (-INTVAL (XEXP (x, 1)));
  op[1] = XEXP (x, 1);
  op[0] = XEXP (x, 0);
  op[1] = XEXP (x, 1);
  op[0] = XEXP (x, 0);
  op[1] = XEXP (x, 1);
  op[0] = XEXP (x, 0);
  op[1] = XEXP (x, 1);
  op[0] = XEXP (x, 0);
  op[0] = XEXP (x, 0);
  op[1] = XEXP (x, 1);
  op[0] = XEXP (x, 0);
  op[1] = XEXP (x, 1);
  op[0] = XEXP (x, 0);
  op[1] = XEXP (x, 1);
  op[0] = XEXP (x, 0);
  op[1] = XEXP (x, 1);
  op[0] = XEXP (x, 0);
  op[1] = XEXP (x, 1);
  op[0] = XEXP (x, 0);
  op[1] = XEXP (x, 1);
  op[0] = XEXP (x, 0);
  op[1] = XEXP (x, 1);
  op[0] = XEXP (x, 0);
  op[1] = XEXP (x, 1);
  op[0] = XEXP (x, 0);
  op[1] = XEXP (x, 1);
  op[0] = XEXP (x, 0);
  op[0] = XEXP (x, 0);
  op[1] = XEXP (x, 1);
  op[0] = XEXP (x, 0);
  op[1] = XEXP (x, 1);
  op[0] = XEXP (x, 0);
  op[1] = XEXP (x, 1);
  op[0] = XEXP (x, 0);
  op[1] = XEXP (x, 1);
  op[0] = XEXP (x, 0);
  op[1] = XEXP (x, 1);
  op[0] = XEXP (x, 0);
  op[1] = XEXP (x, 1);
  op[0] = XEXP (x, 0);
  op[1] = XEXP (x, 1);
  op[0] = XEXP (x, 0);
  op[1] = XEXP (x, 1);
  op[0] = XEXP (x, 0);
  op[0] = XEXP (x, 0);
  op[0] = XEXP (x, 0);
  op[0] = XEXP (x, 0);
  op[1] = XEXP (x, 1);
  op[0] = XEXP (x, 0);
  op[1] = XEXP (x, 1);
  op[0] = XEXP (x, 0);
  op[1] = XEXP (x, 1);
  op[0] = XEXP (x, 0);
  op[1] = XEXP (x, 1);
  op[0] = XEXP (x, 0);
  op[1] = XEXP (x, 1);
  op[0] = XEXP (x, 0);
  op[1] = XEXP (x, 1);
  op[0] = XEXP (x, 0);
  op[1] = XEXP (x, 1);
  op[0] = XEXP (x, 0);
  op[1] = XEXP (x, 1);
  op[0] = XEXP (x, 0);
  op[1] = XEXP (x, 1);
  op[0] = XEXP (x, 0);
  op[1] = XEXP (x, 1);
  fun = (verbose) ? "sign_extract" : "sxt";
  op[0] = XEXP (x, 0);
  op[1] = XEXP (x, 1);
  op[2] = XEXP (x, 2);
  fun = (verbose) ? "zero_extract" : "zxt";
  op[0] = XEXP (x, 0);
  op[1] = XEXP (x, 1);
  op[2] = XEXP (x, 2);
  fun = (verbose) ? "sign_extend" : "sxn";
  op[0] = XEXP (x, 0);
  fun = (verbose) ? "zero_extend" : "zxn";
  op[0] = XEXP (x, 0);
  fun = (verbose) ? "float_extend" : "fxn";
  op[0] = XEXP (x, 0);
  fun = (verbose) ? "trunc" : "trn";
  op[0] = XEXP (x, 0);
case FLOAT_TRUNCATE:
  fun = (verbose) ? "float_trunc" : "ftr";
  op[0] = XEXP (x, 0);
  fun = (verbose) ? "float" : "flt";
  op[0] = XEXP (x, 0);
case UNSIGNED_FLOAT:
  fun = (verbose) ? "uns_float" : "ufl";
  op[0] = XEXP (x, 0);
  op[0] = XEXP (x, 0);
  fun = (verbose) ? "uns_fix" : "ufx";
  op[0] = XEXP (x, 0);
  op[0] = XEXP (x, 0);
  op[0] = XEXP (x, 0);
  op[0] = XEXP (x, 0);
  op[0] = XEXP (x, 0);
  op[0] = XEXP (x, 0);
  op[1] = XEXP (x, 1);
  op[0] = XEXP (x, 0);
  op[1] = XEXP (x, 1);
  op[2] = XEXP (x, 2);
  op[0] = TRAP_CONDITION (x);
case UNSPEC_VOLATILE:
  cur = safe_concat (buf, cur, "unspec");
  if (GET_CODE (x) == UNSPEC_VOLATILE)
    cur = safe_concat (buf, cur, "/v");
  cur = safe_concat (buf, cur, "[");
  for (i = 0; i < XVECLEN (x, 0); i++)
    print_pattern (tmp, XVECEXP (x, 0, i), verbose);
    cur = safe_concat (buf, cur, sep);
    cur = safe_concat (buf, cur, tmp);
  cur = safe_concat (buf, cur, "] ");
  sprintf (tmp, "%d", XINT (x, 1));
  cur = safe_concat (buf, cur, tmp);
  /* If (verbose) debug_rtx (x); */
  st[0] = GET_RTX_NAME (GET_CODE (x));
/* Print this as a function?  */
cur = safe_concat (buf, cur, fun);
cur = safe_concat (buf, cur, "(");
for (i = 0; i < 4; i++)
  cur = safe_concat (buf, cur, st[i]);
  cur = safe_concat (buf, cur, ",");
  print_value (tmp, op[i], verbose);
  cur = safe_concat (buf, cur, tmp);
cur = safe_concat (buf, cur, ")");
/* Print rtxs which I have classified as values.  They are constants,
   registers, labels, symbols and memory accesses.  */
print_value (buf, x, verbose)
switch (GET_CODE (x))
  sprintf (t, HOST_WIDE_INT_PRINT_HEX, INTVAL (x));
  cur = safe_concat (buf, cur, t);
  sprintf (t, "<0x%lx,0x%lx>", (long) XWINT (x, 2), (long) XWINT (x, 3));
  cur = safe_concat (buf, cur, t);
  cur = safe_concat (buf, cur, "\"");
  cur = safe_concat (buf, cur, XSTR (x, 0));
  cur = safe_concat (buf, cur, "\"");
  cur = safe_concat (buf, cur, "`");
  cur = safe_concat (buf, cur, XSTR (x, 0));
  cur = safe_concat (buf, cur, "'");
  sprintf (t, "L%d", INSN_UID (XEXP (x, 0)));
  cur = safe_concat (buf, cur, t);
  print_value (t, XEXP (x, 0), verbose);
  cur = safe_concat (buf, cur, "const(");
  cur = safe_concat (buf, cur, t);
  cur = safe_concat (buf, cur, ")");
  print_value (t, XEXP (x, 0), verbose);
  cur = safe_concat (buf, cur, "high(");
  cur = safe_concat (buf, cur, t);
  cur = safe_concat (buf, cur, ")");
  if (REGNO (x) < FIRST_PSEUDO_REGISTER)
    int c = reg_names[REGNO (x)][0];
    if (c >= '0' && c <= '9')
      cur = safe_concat (buf, cur, "%");
    cur = safe_concat (buf, cur, reg_names[REGNO (x)]);
    sprintf (t, "r%d", REGNO (x));
    cur = safe_concat (buf, cur, t);
  print_value (t, SUBREG_REG (x), verbose);
  cur = safe_concat (buf, cur, t);
  sprintf (t, "#%d", SUBREG_WORD (x));
  cur = safe_concat (buf, cur, t);
  cur = safe_concat (buf, cur, "scratch");
  cur = safe_concat (buf, cur, "cc0");
  cur = safe_concat (buf, cur, "pc");
  print_value (t, XEXP (x, 0), verbose);
  cur = safe_concat (buf, cur, "[");
  cur = safe_concat (buf, cur, t);
  cur = safe_concat (buf, cur, "]");
  print_exp (t, x, verbose);
  cur = safe_concat (buf, cur, t);
/* The next level of insn detail: print the insn's pattern.  */
print_pattern (buf, x, verbose)
char t1[BUF_LEN], t2[BUF_LEN], t3[BUF_LEN];
switch (GET_CODE (x))
  print_value (t1, SET_DEST (x), verbose);
  print_value (t2, SET_SRC (x), verbose);
  sprintf (buf, "%s=%s", t1, t2);
  sprintf (buf, "return");
  print_exp (buf, x, verbose);
  print_value (t1, XEXP (x, 0), verbose);
  sprintf (buf, "clobber %s", t1);
  print_value (t1, XEXP (x, 0), verbose);
  sprintf (buf, "use %s", t1);
  for (i = 0; i < XVECLEN (x, 0); i++)
    print_pattern (t2, XVECEXP (x, 0, i), verbose);
    sprintf (t3, "%s%s;", t1, t2);
  sprintf (buf, "%s}", t1);
  sprintf (t1, "%%{");
  for (i = 0; i < XVECLEN (x, 0); i++)
    print_insn (t2, XVECEXP (x, 0, i), verbose);
    sprintf (t3, "%s%s;", t1, t2);
  sprintf (buf, "%s%%}", t1);
  sprintf (buf, "asm {%s}", XSTR (x, 0));
  print_value (buf, XEXP (x, 0), verbose);
  print_value (t1, TRAP_CONDITION (x), verbose);
  sprintf (buf, "trap_if %s", t1);
  sprintf (t1, "unspec{");
  for (i = 0; i < XVECLEN (x, 0); i++)
    print_pattern (t2, XVECEXP (x, 0, i), verbose);
    sprintf (t3, "%s%s;", t1, t2);
  sprintf (buf, "%s}", t1);
case UNSPEC_VOLATILE:
  sprintf (t1, "unspec/v{");
  for (i = 0; i < XVECLEN (x, 0); i++)
    print_pattern (t2, XVECEXP (x, 0, i), verbose);
    sprintf (t3, "%s%s;", t1, t2);
  sprintf (buf, "%s%%}", t1);
  print_value (buf, x, verbose);
} /* print_pattern */
/* This is the main function of the rtl visualization mechanism.  It
   accepts an rtx and tries to recognize it as an insn, then prints it
   properly in human readable form, resembling assembler mnemonics.
   For every insn it prints its UID and the BB the insn belongs to.
   (Probably the last "option" should be extended somehow, since it
   depends now on sched.c inner variables ...)  */
print_insn (buf, x, verbose)
switch (GET_CODE (x))
  print_pattern (t, PATTERN (x), verbose);
  sprintf (buf, "b%d: i% 4d: %s", INSN_BB (x),
  sprintf (buf, "%-4d %s", INSN_UID (x), t);
  print_pattern (t, PATTERN (x), verbose);
  sprintf (buf, "b%d: i% 4d: jump %s", INSN_BB (x),
  sprintf (buf, "%-4d %s", INSN_UID (x), t);
  if (GET_CODE (x) == PARALLEL)
    x = XVECEXP (x, 0, 0);
    print_pattern (t, x, verbose);
    strcpy (t, "call <...>");
  sprintf (buf, "b%d: i% 4d: %s", INSN_BB (insn),
	   INSN_UID (insn), t);
  sprintf (buf, "%-4d %s", INSN_UID (insn), t);
  sprintf (buf, "L%d:", INSN_UID (x));
  sprintf (buf, "i% 4d: barrier", INSN_UID (x));
  if (NOTE_LINE_NUMBER (x) > 0)
    sprintf (buf, "%4d note \"%s\" %d", INSN_UID (x),
	     NOTE_SOURCE_FILE (x), NOTE_LINE_NUMBER (x));
    sprintf (buf, "%4d %s", INSN_UID (x),
	     GET_NOTE_INSN_NAME (NOTE_LINE_NUMBER (x)));
  sprintf (buf, "Not an INSN at all\n");
  sprintf (buf, "i%-4d <What?>", INSN_UID (x));
/* Print visualization debugging info.  */
print_block_visualization (b, s)
fprintf (dump, "\n;;   ==================== scheduling visualization for block %d %s \n", b, s);
/* Print names of units.  */
fprintf (dump, ";;   %-8s", "clock");
for (unit = 0; unit < FUNCTION_UNITS_SIZE; unit++)
  if (function_units[unit].bitmask & target_units)
    for (i = 0; i < function_units[unit].multiplicity; i++)
      fprintf (dump, "  %-33s", function_units[unit].name);
fprintf (dump, "  %-8s\n", "no-unit");
fprintf (dump, ";;   %-8s", "=====");
for (unit = 0; unit < FUNCTION_UNITS_SIZE; unit++)
  if (function_units[unit].bitmask & target_units)
    for (i = 0; i < function_units[unit].multiplicity; i++)
      fprintf (dump, "  %-33s", "==============================");
fprintf (dump, "  %-8s\n", "=======");
/* Print insns in each cycle.  */
fprintf (dump, "%s\n", visual_tbl);
/* Print insns in the 'no_unit' column of visualization.  */
visualize_no_unit (insn)
vis_no_unit[n_vis_no_unit] = insn;
/* Print insns scheduled in clock, for visualization.  */
visualize_scheduled_insns (b, clock)
/* If no more room, split table into two.  */
if (n_visual_lines >= MAX_VISUAL_LINES)
  print_block_visualization (b, "(incomplete)");
  init_block_visualization ();
sprintf (visual_tbl + strlen (visual_tbl), ";;   %-8d", clock);
for (unit = 0; unit < FUNCTION_UNITS_SIZE; unit++)
  if (function_units[unit].bitmask & target_units)
    for (i = 0; i < function_units[unit].multiplicity; i++)
      int instance = unit + i * FUNCTION_UNITS_SIZE;
      rtx insn = unit_last_insn[instance];
      /* Print insns that still keep the unit busy.  */
      actual_hazard_this_instance (unit, instance, insn, clock, 0))
	print_insn (str, insn, 0);
	str[INSN_LEN] = '\0';
	sprintf (visual_tbl + strlen (visual_tbl), "  %-33s", str);
	sprintf (visual_tbl + strlen (visual_tbl), "  %-33s", "------------------------------");
/* Print insns that are not assigned to any unit.  */
for (i = 0; i < n_vis_no_unit; i++)
  sprintf (visual_tbl + strlen (visual_tbl), "  %-8d",
	   INSN_UID (vis_no_unit[i]));
sprintf (visual_tbl + strlen (visual_tbl), "\n");
/* Print stalled cycles.  */
visualize_stall_cycles (b, stalls)
/* If no more room, split table into two.  */
if (n_visual_lines >= MAX_VISUAL_LINES)
  print_block_visualization (b, "(incomplete)");
  init_block_visualization ();
sprintf (visual_tbl + strlen (visual_tbl), ";;   ");
for (i = 0; i < stalls; i++)
  sprintf (visual_tbl + strlen (visual_tbl), ".");
sprintf (visual_tbl + strlen (visual_tbl), "\n");
/* move_insn1: Remove INSN from insn chain, and link it after LAST insn.  */
move_insn1 (insn, last)
NEXT_INSN (PREV_INSN (insn)) = NEXT_INSN (insn);
PREV_INSN (NEXT_INSN (insn)) = PREV_INSN (insn);
NEXT_INSN (insn) = NEXT_INSN (last);
PREV_INSN (NEXT_INSN (last)) = insn;
NEXT_INSN (last) = insn;
PREV_INSN (insn) = last;
/* Search INSN for REG_SAVE_NOTE note pairs for NOTE_INSN_SETJMP,
   NOTE_INSN_{LOOP,EHREGION}_{BEG,END}; and convert them back into
   NOTEs.  The REG_SAVE_NOTE note following the first one contains the
   saved value for NOTE_BLOCK_NUMBER, which is useful for
   NOTE_INSN_EH_REGION_{BEG,END} NOTEs.  LAST is the last instruction
   output by the instruction scheduler.  Return the new value of LAST.  */
reemit_notes (insn, last)
for (note = REG_NOTES (insn); note; note = XEXP (note, 1))
  if (REG_NOTE_KIND (note) == REG_SAVE_NOTE)
    int note_type = INTVAL (XEXP (note, 0));
    if (note_type == NOTE_INSN_SETJMP)
      retval = emit_note_after (NOTE_INSN_SETJMP, insn);
      CONST_CALL_P (retval) = CONST_CALL_P (note);
      remove_note (insn, note);
      note = XEXP (note, 1);
    else if (note_type == NOTE_INSN_RANGE_START
	     || note_type == NOTE_INSN_RANGE_END)
      last = emit_note_before (note_type, last);
      remove_note (insn, note);
      note = XEXP (note, 1);
      NOTE_RANGE_INFO (last) = XEXP (note, 0);
      last = emit_note_before (note_type, last);
      remove_note (insn, note);
      note = XEXP (note, 1);
      if (note_type == NOTE_INSN_EH_REGION_BEG
	  || note_type == NOTE_INSN_EH_REGION_END)
	NOTE_EH_HANDLER (last) = INTVAL (XEXP (note, 0));
      remove_note (insn, note);
/* Move INSN, and all insns which should be issued before it,
   due to SCHED_GROUP_P flag.  Reemit notes if needed.

   Return the last insn emitted by the scheduler, which is the
   return value from the first call to reemit_notes.  */
move_insn (insn, last)
/* If INSN has SCHED_GROUP_P set, then issue it and any other
   insns with SCHED_GROUP_P set first.  */
while (SCHED_GROUP_P (insn))
  rtx prev = PREV_INSN (insn);
  /* Move a SCHED_GROUP_P insn.  */
  move_insn1 (insn, last);
  /* If this is the first call to reemit_notes, then record
     its return value.  */
  if (retval == NULL_RTX)
    retval = reemit_notes (insn, insn);
    reemit_notes (insn, insn);
/* Now move the first non SCHED_GROUP_P insn.  */
move_insn1 (insn, last);
/* If this is the first call to reemit_notes, then record
   its return value.  */
if (retval == NULL_RTX)
  retval = reemit_notes (insn, insn);
  reemit_notes (insn, insn);
/* Return an insn which represents a SCHED_GROUP, which is
   the last insn in the group.  */
insn = next_nonnote_insn (insn);
while (insn && SCHED_GROUP_P (insn) && (GET_CODE (insn) != CODE_LABEL));
/* Use forward list scheduling to rearrange insns of block BB in region RGN,
   possibly bringing insns from subsequent blocks in the same region.
   Return number of insns scheduled.  */
schedule_block (bb, rgn_n_insns)
/* Local variables.  */
/* Flow block of this bb.  */
int b = BB_TO_BLOCK (bb);
/* target_n_insns == number of insns in b before scheduling starts.
   sched_target_n_insns == how many of b's insns were scheduled.
   sched_n_insns == how many insns were scheduled in b.  */
int target_n_insns = 0;
int sched_target_n_insns = 0;
int sched_n_insns = 0;
#define NEED_NOTHING	0
/* Head/tail info for this block.  */
/* We used to have code to avoid getting parameters moved from hard
   argument registers into pseudos.

   However, it was removed when it proved to be of marginal benefit
   and caused problems because schedule_block and compute_forward_dependences
   had different notions of what the "head" insn was.  */
get_block_head_tail (bb, &head, &tail);
/* Interblock scheduling could have moved the original head insn from this
   block into a preceding block.  This may also cause schedule_block and
   compute_forward_dependences to have different notions of what the
   head insn is.

   If the interblock movement happened to make this block start with
   some notes (LOOP, EH or SETJMP) before the first real insn, then
   HEAD will have various special notes attached to it which must be
   removed so that we don't end up with extra copies of the notes.  */
if (GET_RTX_CLASS (GET_CODE (head)) == 'i')
  for (note = REG_NOTES (head); note; note = XEXP (note, 1))
    if (REG_NOTE_KIND (note) == REG_SAVE_NOTE)
      remove_note (head, note);
next_tail = NEXT_INSN (tail);
prev_head = PREV_INSN (head);
/* If the only insn left is a NOTE or a CODE_LABEL, then there is no need
   to schedule this block.  */
    && (GET_RTX_CLASS (GET_CODE (head)) != 'i'))
  return (sched_n_insns);
fprintf (dump, ";;   ======================================================\n");
	 ";;   -- basic block %d from %d to %d -- %s reload\n",
	 b, INSN_UID (BLOCK_HEAD (b)), INSN_UID (BLOCK_END (b)),
	 (reload_completed ? "after" : "before"));
fprintf (dump, ";;   ======================================================\n");
fprintf (dump, "\n");
visual_tbl = (char *) alloca (get_visual_tbl_length ());
init_block_visualization ();
/* Remove remaining note insns from the block, save them in
   note_list.  These notes are restored at the end of
   schedule_block ().  */
rm_other_notes (head, tail);
/* Prepare current target block info.  */
if (current_nr_blocks > 1)
  candidate_table = (candidate *) alloca (current_nr_blocks
					  * sizeof (candidate));
  /* ??? It is not clear why bblst_size is computed this way.  The original
     number was clearly too small as it resulted in compiler failures.
     Multiplying the original number by 2 (to account for update_bbs
     members) seems to be a reasonable solution.  */
  /* ??? Or perhaps there is a bug somewhere else in this file?  */
  bblst_size = (current_nr_blocks - bb) * rgn_nr_edges * 2;
  bblst_table = (int *) alloca (bblst_size * sizeof (int));
  bitlst_table_last = 0;
  bitlst_table_size = rgn_nr_edges;
  bitlst_table = (int *) alloca (rgn_nr_edges * sizeof (int));
  compute_trg_info (bb);
/* Allocate the ready list.  */
ready = (rtx *) alloca ((rgn_n_insns + 1) * sizeof (rtx));
/* Print debugging information.  */
if (sched_verbose >= 5)
  debug_dependencies ();
/* Initialize ready list with all 'ready' insns in target block.
   Count number of insns in the target block being scheduled.  */
for (insn = head; insn != next_tail; insn = NEXT_INSN (insn))
  if (GET_RTX_CLASS (GET_CODE (insn)) != 'i')
  next = NEXT_INSN (insn);
  if (INSN_DEP_COUNT (insn) == 0
      && (SCHED_GROUP_P (next) == 0 || GET_RTX_CLASS (GET_CODE (next)) != 'i'))
    ready[n_ready++] = insn;
  if (!(SCHED_GROUP_P (insn)))
/* Add to ready list all 'ready' insns in valid source blocks.
   For speculative insns, check-live, exception-free, and
   issue-delay.  */
for (bb_src = bb + 1; bb_src < current_nr_blocks; bb_src++)
  if (IS_VALID (bb_src))
    get_block_head_tail (bb_src, &head, &tail);
    src_next_tail = NEXT_INSN (tail);
	&& (GET_RTX_CLASS (GET_CODE (head)) != 'i'))
    for (insn = src_head; insn != src_next_tail; insn = NEXT_INSN (insn))
      if (GET_RTX_CLASS (GET_CODE (insn)) != 'i')
      if (!CANT_MOVE (insn)
	  && (!IS_SPECULATIVE_INSN (insn)
	      || (insn_issue_delay (insn) <= 3
		  && check_live (insn, bb_src)
		  && is_exception_free (insn, bb_src, target_bb))))
	/* Note that we haven't squirreled away the notes for
	   blocks other than the current.  So if this is a
	   speculative insn, NEXT might otherwise be a note.  */
	next = next_nonnote_insn (insn);
	if (INSN_DEP_COUNT (insn) == 0
	    && (SCHED_GROUP_P (next) == 0
		|| GET_RTX_CLASS (GET_CODE (next)) != 'i'))
	  ready[n_ready++] = insn;
#ifdef MD_SCHED_INIT
MD_SCHED_INIT (dump, sched_verbose);
/* No insns scheduled in this block yet.  */
last_scheduled_insn = 0;
/* Q_SIZE is the total number of insns in the queue.  */
bzero ((char *) insn_queue, sizeof (insn_queue));
/* Start just before the beginning of time.  */
/* We start inserting insns after PREV_HEAD.  */
/* Initialize INSN_QUEUE, LIST and NEW_NEEDS.  */
new_needs = (NEXT_INSN (prev_head) == BLOCK_HEAD (b)
	     ? NEED_HEAD : NEED_NOTHING);
if (PREV_INSN (next_tail) == BLOCK_END (b))
  new_needs |= NEED_TAIL;
/* Loop until all the insns in BB are scheduled.  */
while (sched_target_n_insns < target_n_insns)
  /* Add to the ready list all pending insns that can be issued now.
     If there are no ready insns, increment clock until one
     is ready and add all pending insns at that point to the ready
     list.  */
  n_ready = queue_to_ready (ready, n_ready);
  if (sched_verbose >= 2)
    fprintf (dump, ";;\t\tReady list after queue_to_ready:  ");
    debug_ready_list (ready, n_ready);
  /* Sort the ready list based on priority.  */
  SCHED_SORT (ready, n_ready);
  /* Allow the target to reorder the list, typically for
     better instruction bundling.  */
#ifdef MD_SCHED_REORDER
  MD_SCHED_REORDER (dump, sched_verbose, ready, n_ready, clock_var,
  can_issue_more = issue_rate;
  fprintf (dump, "\n;;\tReady list (t =%3d):  ", clock_var);
  debug_ready_list (ready, n_ready);
  /* Issue insns from ready list.  */
  while (n_ready != 0 && can_issue_more)
    /* Select and remove the insn from the ready list.  */
    rtx insn = ready[--n_ready];
    int cost = actual_hazard (insn_unit (insn), insn, clock_var, 0);
    queue_insn (insn, cost);
/* An interblock motion?  */
if (INSN_BB (insn) != target_bb)
  if (IS_SPECULATIVE_INSN (insn))
    if (!check_live (insn, INSN_BB (insn)))
    update_live (insn, INSN_BB (insn));
    /* For speculative load, mark insns fed by it.  */
    if (IS_LOAD_INSN (insn) || FED_BY_SPEC_LOAD (insn))
      set_spec_fed (insn);
  while (SCHED_GROUP_P (temp))
    temp = PREV_INSN (temp);
  /* Update source block boundaries.  */
  b1 = INSN_BLOCK (temp);
  if (temp == BLOCK_HEAD (b1)
      && insn == BLOCK_END (b1))
    /* We moved all the insns in the basic block.
       Emit a note after the last insn and update the
       begin/end boundaries to point to the note.  */
    emit_note_after (NOTE_INSN_DELETED, insn);
    BLOCK_END (b1) = NEXT_INSN (insn);
    BLOCK_HEAD (b1) = NEXT_INSN (insn);
  else if (insn == BLOCK_END (b1))
    /* We took insns from the end of the basic block,
       so update the end of block boundary so that it
       points to the first insn we did not move.  */
    BLOCK_END (b1) = PREV_INSN (temp);
  else if (temp == BLOCK_HEAD (b1))
    /* We took insns from the start of the basic block,
       so update the start of block boundary so that
       it points to the first insn we did not move.  */
    BLOCK_HEAD (b1) = NEXT_INSN (insn);
/* In-block motion.  */
sched_target_n_insns++;
last_scheduled_insn = insn;
last = move_insn (insn, last);
#ifdef MD_SCHED_VARIABLE_ISSUE
MD_SCHED_VARIABLE_ISSUE (dump, sched_verbose, insn,
n_ready = schedule_insn (insn, ready, n_ready, clock_var);
/* Close this block after scheduling its jump.  */
if (GET_CODE (last_scheduled_insn) == JUMP_INSN)
visualize_scheduled_insns (b, clock_var);
fprintf (dump, ";;\tReady list (final):  ");
debug_ready_list (ready, n_ready);
print_block_visualization (b, "");
/* Sanity check -- queue must be empty now.  Meaningless if region has
   multiple bbs.  */
if (current_nr_blocks > 1)
  if (!flag_schedule_interblock && q_size != 0)
/* Update head/tail boundaries.  */
head = NEXT_INSN (prev_head);
/* Restore-other-notes: NOTE_LIST is the end of a chain of notes
   previously found among the insns.  Insert them at the beginning
   of the insns.  */
rtx note_head = note_list;
while (PREV_INSN (note_head))
  note_head = PREV_INSN (note_head);
PREV_INSN (note_head) = PREV_INSN (head);
NEXT_INSN (PREV_INSN (head)) = note_head;
PREV_INSN (head) = note_list;
NEXT_INSN (note_list) = head;
/* Update target block boundaries.  */
if (new_needs & NEED_HEAD)
  BLOCK_HEAD (b) = head;
if (new_needs & NEED_TAIL)
  BLOCK_END (b) = tail;
fprintf (dump, ";;   total time = %d\n;;   new basic block head = %d\n",
	 clock_var, INSN_UID (BLOCK_HEAD (b)));
fprintf (dump, ";;   new basic block end = %d\n\n",
	 INSN_UID (BLOCK_END (b)));
return (sched_n_insns);
} /* schedule_block () */
/* Print the bit-set of registers, S, callable from debugger.  */
debug_reg_vector (s)
EXECUTE_IF_SET_IN_REG_SET (s, 0, regno,
			   fprintf (dump, " %d", regno);
fprintf (dump, "\n");
/* Use the backward dependences from LOG_LINKS to build
   forward dependences in INSN_DEPEND.  */
compute_block_forward_dependences (bb)
enum reg_note dep_type;
get_block_head_tail (bb, &head, &tail);
next_tail = NEXT_INSN (tail);
for (insn = head; insn != next_tail; insn = NEXT_INSN (insn))
  if (GET_RTX_CLASS (GET_CODE (insn)) != 'i')
  insn = group_leader (insn);
  for (link = LOG_LINKS (insn); link; link = XEXP (link, 1))
    rtx x = group_leader (XEXP (link, 0));
    if (x != XEXP (link, 0))
#ifdef ENABLE_CHECKING
    /* If add_dependence is working properly there should never
       be notes, deleted insns or duplicates in the backward
       links.  Thus we need not check for them here.

       However, if we have enabled checking we might as well go
       ahead and verify that add_dependence worked properly.  */
    if (GET_CODE (x) == NOTE
	|| INSN_DELETED_P (x)
	|| find_insn_list (insn, INSN_DEPEND (x)))
    new_link = alloc_INSN_LIST (insn, INSN_DEPEND (x));
    dep_type = REG_NOTE_KIND (link);
    PUT_REG_NOTE_KIND (new_link, dep_type);
    INSN_DEPEND (x) = new_link;
    INSN_DEP_COUNT (insn) += 1;
/* Initialize variables for region data dependence analysis.
   n_bbs is the number of region blocks.  */
__inline static void
init_rgn_data_dependences (n_bbs)
/* Variables for which one copy exists for each block.  */
bzero ((char *) bb_pending_read_insns, n_bbs * sizeof (rtx));
bzero ((char *) bb_pending_read_mems, n_bbs * sizeof (rtx));
bzero ((char *) bb_pending_write_insns, n_bbs * sizeof (rtx));
bzero ((char *) bb_pending_write_mems, n_bbs * sizeof (rtx));
bzero ((char *) bb_pending_lists_length, n_bbs * sizeof (rtx));
bzero ((char *) bb_last_pending_memory_flush, n_bbs * sizeof (rtx));
bzero ((char *) bb_last_function_call, n_bbs * sizeof (rtx));
bzero ((char *) bb_sched_before_next_call, n_bbs * sizeof (rtx));
/* Create an insn here so that we can hang dependencies off of it later.  */
for (bb = 0; bb < n_bbs; bb++)
  bb_sched_before_next_call[bb] =
    gen_rtx_INSN (VOIDmode, 0, NULL_RTX, NULL_RTX,
		  NULL_RTX, 0, NULL_RTX, NULL_RTX);
  LOG_LINKS (bb_sched_before_next_call[bb]) = 0;
/* Add dependences so that branches are scheduled to run last in their
   block.  */
add_branch_dependences (head, tail)
/* For all branches, calls, uses, and cc0 setters, force them to remain
   in order at the end of the block by adding dependencies and giving
   the last a high priority.  There may be notes present, and prev_head
   may also be a note.

   Branches must obviously remain at the end.  Calls should remain at the
   end since moving them results in worse register allocation.  Uses remain
   at the end to ensure proper register allocation.  cc0 setters remain
   at the end because they can't be moved away from their cc0 user.  */
while (GET_CODE (insn) == CALL_INSN || GET_CODE (insn) == JUMP_INSN
       || (GET_CODE (insn) == INSN
	   && (GET_CODE (PATTERN (insn)) == USE
	       || sets_cc0_p (PATTERN (insn))
       || GET_CODE (insn) == NOTE)
  if (GET_CODE (insn) != NOTE)
      && !find_insn_list (insn, LOG_LINKS (last)))
    add_dependence (last, insn, REG_DEP_ANTI);
    INSN_REF_COUNT (insn)++;
    CANT_MOVE (insn) = 1;
  /* Skip over insns that are part of a group.
     Make each insn explicitly depend on the previous insn.
     This ensures that only the group header will ever enter
     the ready queue (and, when scheduled, will automatically
     schedule the SCHED_GROUP_P block).  */
  while (SCHED_GROUP_P (insn))
    rtx temp = prev_nonnote_insn (insn);
    add_dependence (insn, temp, REG_DEP_ANTI);
  /* Don't overrun the bounds of the basic block.  */
  insn = PREV_INSN (insn);
/* Make sure these insns are scheduled last in their block.  */
while (insn != head)
  insn = prev_nonnote_insn (insn);
  if (INSN_REF_COUNT (insn) != 0)
  add_dependence (last, insn, REG_DEP_ANTI);
  INSN_REF_COUNT (insn) = 1;
  /* Skip over insns that are part of a group.  */
  while (SCHED_GROUP_P (insn))
    insn = prev_nonnote_insn (insn);
6230 /* Compute backward dependences inside bb. In a multiple blocks region:
6231 (1) a bb is analyzed after its predecessors, and (2) the lists in
6232 effect at the end of bb (after analyzing for bb) are inherited by
6235 Specifically for reg-reg data dependences, the block insns are
6236 scanned by sched_analyze () top-to-bottom. Two lists are
6237 maintained by sched_analyze (): reg_last_sets[] for register DEFs,
6238 and reg_last_uses[] for register USEs.
6240 When analysis is completed for bb, we update for its successors:
6241 ; - DEFS[succ] = Union (DEFS [succ], DEFS [bb])
6242 ; - USES[succ] = Union (USES [succ], DEFS [bb])
6244 The mechanism for computing mem-mem data dependence is very
6245 similar, and the result is interblock dependences in the region. */
6248 compute_block_backward_dependences (bb)
6254 int max_reg = max_reg_num ();
6256 b = BB_TO_BLOCK (bb);
6258 if (current_nr_blocks == 1)
6260 reg_last_uses = (rtx *) alloca (max_reg * sizeof (rtx));
6261 reg_last_sets = (rtx *) alloca (max_reg * sizeof (rtx));
6262 reg_last_clobbers = (rtx *) alloca (max_reg * sizeof (rtx));
6264 bzero ((char *) reg_last_uses, max_reg * sizeof (rtx));
6265 bzero ((char *) reg_last_sets, max_reg * sizeof (rtx));
6266 bzero ((char *) reg_last_clobbers, max_reg * sizeof (rtx));
6268 pending_read_insns = 0;
6269 pending_read_mems = 0;
6270 pending_write_insns = 0;
6271 pending_write_mems = 0;
6272 pending_lists_length = 0;
6273 last_function_call = 0;
6274 last_pending_memory_flush = 0;
6275 sched_before_next_call
6276 = gen_rtx_INSN (VOIDmode, 0, NULL_RTX, NULL_RTX,
6277 NULL_RTX, 0, NULL_RTX, NULL_RTX);
6278 LOG_LINKS (sched_before_next_call) = 0;
6282 reg_last_uses = bb_reg_last_uses[bb];
6283 reg_last_sets = bb_reg_last_sets[bb];
6284 reg_last_clobbers = bb_reg_last_clobbers[bb];
6286 pending_read_insns = bb_pending_read_insns[bb];
6287 pending_read_mems = bb_pending_read_mems[bb];
6288 pending_write_insns = bb_pending_write_insns[bb];
6289 pending_write_mems = bb_pending_write_mems[bb];
6290 pending_lists_length = bb_pending_lists_length[bb];
6291 last_function_call = bb_last_function_call[bb];
6292 last_pending_memory_flush = bb_last_pending_memory_flush[bb];
6294 sched_before_next_call = bb_sched_before_next_call[bb];
6297 /* Do the analysis for this block. */
6298 get_block_head_tail (bb, &head, &tail);
6299 sched_analyze (head, tail);
6300 add_branch_dependences (head, tail);
  if (current_nr_blocks > 1)
    {
      int e, first_edge;
      int b_succ, bb_succ, reg;
      rtx link_insn, link_mem;
      rtx u;

      /* These lists should point to the right place, for correct
	 freeing later.  */
      bb_pending_read_insns[bb] = pending_read_insns;
      bb_pending_read_mems[bb] = pending_read_mems;
      bb_pending_write_insns[bb] = pending_write_insns;
      bb_pending_write_mems[bb] = pending_write_mems;

      /* bb's structures are inherited by its successors.  */
      first_edge = e = OUT_EDGES (b);
      if (e > 0)
	do
	  {
	    b_succ = TO_BLOCK (e);
	    bb_succ = BLOCK_TO_BB (b_succ);

	    /* Only bbs "below" bb, in the same region, are interesting.  */
	    if (CONTAINING_RGN (b) != CONTAINING_RGN (b_succ)
		|| bb_succ <= bb)
	      {
		e = NEXT_OUT (e);
		continue;
	      }

	    for (reg = 0; reg < max_reg; reg++)
	      {
		/* reg-last-uses lists are inherited by bb_succ.  */
		for (u = reg_last_uses[reg]; u; u = XEXP (u, 1))
		  {
		    if (find_insn_list (XEXP (u, 0),
					(bb_reg_last_uses[bb_succ])[reg]))
		      continue;

		    (bb_reg_last_uses[bb_succ])[reg]
		      = alloc_INSN_LIST (XEXP (u, 0),
					 (bb_reg_last_uses[bb_succ])[reg]);
		  }

		/* reg-last-defs lists are inherited by bb_succ.  */
		for (u = reg_last_sets[reg]; u; u = XEXP (u, 1))
		  {
		    if (find_insn_list (XEXP (u, 0),
					(bb_reg_last_sets[bb_succ])[reg]))
		      continue;

		    (bb_reg_last_sets[bb_succ])[reg]
		      = alloc_INSN_LIST (XEXP (u, 0),
					 (bb_reg_last_sets[bb_succ])[reg]);
		  }

		/* reg-last-clobbers lists are inherited by bb_succ.  */
		for (u = reg_last_clobbers[reg]; u; u = XEXP (u, 1))
		  {
		    if (find_insn_list (XEXP (u, 0),
					(bb_reg_last_clobbers[bb_succ])[reg]))
		      continue;

		    (bb_reg_last_clobbers[bb_succ])[reg]
		      = alloc_INSN_LIST (XEXP (u, 0),
					 (bb_reg_last_clobbers[bb_succ])[reg]);
		  }
	      }

	    /* Mem read/write lists are inherited by bb_succ.  */
	    link_insn = pending_read_insns;
	    link_mem = pending_read_mems;
	    while (link_insn)
	      {
		if (!(find_insn_mem_list (XEXP (link_insn, 0),
					  XEXP (link_mem, 0),
					  bb_pending_read_insns[bb_succ],
					  bb_pending_read_mems[bb_succ])))
		  add_insn_mem_dependence (&bb_pending_read_insns[bb_succ],
					   &bb_pending_read_mems[bb_succ],
					   XEXP (link_insn, 0),
					   XEXP (link_mem, 0));
		link_insn = XEXP (link_insn, 1);
		link_mem = XEXP (link_mem, 1);
	      }

	    link_insn = pending_write_insns;
	    link_mem = pending_write_mems;
	    while (link_insn)
	      {
		if (!(find_insn_mem_list (XEXP (link_insn, 0),
					  XEXP (link_mem, 0),
					  bb_pending_write_insns[bb_succ],
					  bb_pending_write_mems[bb_succ])))
		  add_insn_mem_dependence (&bb_pending_write_insns[bb_succ],
					   &bb_pending_write_mems[bb_succ],
					   XEXP (link_insn, 0),
					   XEXP (link_mem, 0));

		link_insn = XEXP (link_insn, 1);
		link_mem = XEXP (link_mem, 1);
	      }

	    /* last_function_call is inherited by bb_succ.  */
	    for (u = last_function_call; u; u = XEXP (u, 1))
	      {
		if (find_insn_list (XEXP (u, 0),
				    bb_last_function_call[bb_succ]))
		  continue;

		bb_last_function_call[bb_succ]
		  = alloc_INSN_LIST (XEXP (u, 0),
				     bb_last_function_call[bb_succ]);
	      }

	    /* last_pending_memory_flush is inherited by bb_succ.  */
	    for (u = last_pending_memory_flush; u; u = XEXP (u, 1))
	      {
		if (find_insn_list (XEXP (u, 0),
				    bb_last_pending_memory_flush[bb_succ]))
		  continue;

		bb_last_pending_memory_flush[bb_succ]
		  = alloc_INSN_LIST (XEXP (u, 0),
				     bb_last_pending_memory_flush[bb_succ]);
	      }

	    /* sched_before_next_call is inherited by bb_succ.  */
	    x = LOG_LINKS (sched_before_next_call);
	    for (; x; x = XEXP (x, 1))
	      add_dependence (bb_sched_before_next_call[bb_succ],
			      XEXP (x, 0), REG_DEP_ANTI);

	    e = NEXT_OUT (e);
	  }
	while (e != first_edge);
    }
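/* The inheritance above follows one pattern throughout: scan the
   successor's list for the element (find_insn_list), and cons a new
   node only when it is absent, so no successor list ever holds
   duplicates.  The pattern can be sketched in isolation with a plain
   cons-list standing in for INSN_LISTs (illustrative names only; the
   real pass works on rtx lists with alloc_INSN_LIST):  */

```c
#include <stdlib.h>

/* Minimal cons-list of opaque items, standing in for an INSN_LIST.  */
struct node { void *item; struct node *next; };

/* Linear membership test, as find_insn_list does for insns.  */
static int
list_member (struct node *list, void *item)
{
  for (; list; list = list->next)
    if (list->item == item)
      return 1;
  return 0;
}

/* Prepend ITEM to *DEST unless already present, mirroring how a
   block's reg_last_* lists are merged into each successor's lists.  */
static void
inherit_item (struct node **dest, void *item)
{
  if (!list_member (*dest, item))
    {
      struct node *n = malloc (sizeof *n);
      n->item = item;
      n->next = *dest;
      *dest = n;
    }
}
```

The membership test keeps the merge quadratic in list length, which is
acceptable here because these per-register lists are typically short.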
  /* Free up the INSN_LISTs.

     Note this loop is executed max_reg * nr_regions times.  Its first
     implementation accounted for over 90% of the calls to
     free_INSN_LIST_list.  The list was empty for the vast majority of
     those calls.  On the PA, not calling free_INSN_LIST_list in those
     cases improves -O2 compile times by 3-5% on average.  */
  for (b = 0; b < max_reg; ++b)
    {
      if (reg_last_clobbers[b])
	free_INSN_LIST_list (&reg_last_clobbers[b]);
      if (reg_last_sets[b])
	free_INSN_LIST_list (&reg_last_sets[b]);
      if (reg_last_uses[b])
	free_INSN_LIST_list (&reg_last_uses[b]);
    }
  /* Assert that we won't need bb_reg_last_* for this block anymore.  */
  if (current_nr_blocks > 1)
    {
      bb_reg_last_uses[bb] = (rtx *) NULL_RTX;
      bb_reg_last_sets[bb] = (rtx *) NULL_RTX;
      bb_reg_last_clobbers[bb] = (rtx *) NULL_RTX;
    }
}
/* Print dependences for debugging, callable from debugger.  */

void
debug_dependencies ()
{
  int bb;

  fprintf (dump, ";; --------------- forward dependences: ------------ \n");
  for (bb = 0; bb < current_nr_blocks; bb++)
    {
      rtx head, tail;
      rtx next_tail;
      rtx insn;

      get_block_head_tail (bb, &head, &tail);
      next_tail = NEXT_INSN (tail);
      fprintf (dump, "\n;; --- Region Dependences --- b %d bb %d \n",
	       BB_TO_BLOCK (bb), bb);

      fprintf (dump, ";; %7s%6s%6s%6s%6s%6s%11s%6s\n",
	       "insn", "code", "bb", "dep", "prio", "cost", "blockage", "units");
      fprintf (dump, ";; %7s%6s%6s%6s%6s%6s%11s%6s\n",
	       "----", "----", "--", "---", "----", "----", "--------", "-----");
      for (insn = head; insn != next_tail; insn = NEXT_INSN (insn))
	{
	  rtx link;
	  int unit, range;

	  if (GET_RTX_CLASS (GET_CODE (insn)) != 'i')
	    {
	      int n;

	      fprintf (dump, ";; %6d ", INSN_UID (insn));
	      if (GET_CODE (insn) == NOTE)
		{
		  n = NOTE_LINE_NUMBER (insn);
		  if (n < 0)
		    fprintf (dump, "%s\n", GET_NOTE_INSN_NAME (n));
		  else
		    fprintf (dump, "line %d, file %s\n", n,
			     NOTE_SOURCE_FILE (insn));
		}
	      else
		fprintf (dump, " {%s}\n", GET_RTX_NAME (GET_CODE (insn)));
	      continue;
	    }

	  unit = insn_unit (insn);
	  range = (unit < 0
		   || function_units[unit].blockage_range_function == 0) ? 0 :
	    function_units[unit].blockage_range_function (insn);
	  fprintf (dump,
		   ";; %s%5d%6d%6d%6d%6d%6d %3d -%3d ",
		   (SCHED_GROUP_P (insn) ? "+" : " "),
		   INSN_UID (insn),
		   INSN_CODE (insn),
		   INSN_BB (insn),
		   INSN_DEP_COUNT (insn),
		   INSN_PRIORITY (insn),
		   insn_cost (insn, 0, 0),
		   (int) MIN_BLOCKAGE_COST (range),
		   (int) MAX_BLOCKAGE_COST (range));
	  insn_print_units (insn);
	  fprintf (dump, "\t: ");
	  for (link = INSN_DEPEND (insn); link; link = XEXP (link, 1))
	    fprintf (dump, "%d ", INSN_UID (XEXP (link, 0)));
	  fprintf (dump, "\n");
	}
    }
  fprintf (dump, "\n");
}
/* Set_priorities: compute priority of each insn in the block.  */

static int
set_priorities (bb)
     int bb;
{
  rtx insn;
  int n_insn;
  rtx tail;
  rtx prev_head;
  rtx head;

  get_block_head_tail (bb, &head, &tail);
  prev_head = PREV_INSN (head);

  if (head == tail
      && (GET_RTX_CLASS (GET_CODE (head)) != 'i'))
    return 0;

  n_insn = 0;
  for (insn = tail; insn != prev_head; insn = PREV_INSN (insn))
    {
      if (GET_CODE (insn) == NOTE)
	continue;

      if (!(SCHED_GROUP_P (insn)))
	n_insn++;
      (void) priority (insn);
    }

  return n_insn;
}
/* Make each element of VECTOR point at an rtx-vector,
   taking the space for all those rtx-vectors from SPACE.
   SPACE is of type (rtx *), but it is really as long as NELTS rtx-vectors.
   BYTES_PER_ELT is the number of bytes in one rtx-vector.
   (this is the same as init_regset_vector () in flow.c)  */

static void
init_rtx_vector (vector, space, nelts, bytes_per_elt)
     rtx **vector;
     rtx *space;
     int nelts;
     int bytes_per_elt;
{
  register int i;
  register rtx *p = space;

  for (i = 0; i < nelts; i++)
    {
      vector[i] = p;
      p += bytes_per_elt / sizeof (*p);
    }
}
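/* The partitioning trick used by init_rtx_vector -- one flat
   allocation carved into NELTS equal sub-vectors -- can be sketched in
   isolation.  This is generic C with malloc/calloc standing in for the
   scheduler's alloca, and plain void pointers in place of rtx; all
   names below are illustrative, not part of the pass:  */

```c
#include <stdlib.h>

/* Point each element of VECTOR at a slice of SPACE: SPACE holds NELTS
   sub-vectors back to back, each BYTES_PER_ELT bytes long.  */
static void
init_ptr_vector (void ***vector, void **space, int nelts, int bytes_per_elt)
{
  void **p = space;
  int i;

  for (i = 0; i < nelts; i++)
    {
      vector[i] = p;
      p += bytes_per_elt / sizeof (*p);
    }
}

/* Allocate NELTS sub-vectors of WIDTH pointers each with just two
   allocations, as schedule_region does per bb_reg_last_* table.  */
static void ***
alloc_partitioned (int nelts, int width)
{
  void ***vector = malloc (nelts * sizeof (void **));
  void **space = calloc ((size_t) nelts * width, sizeof (void *));

  init_ptr_vector (vector, space, nelts, width * sizeof (void *));
  return vector;
}
```

After this, each vector[i] behaves like an independent zero-filled
array of WIDTH pointers, yet only two heap blocks were requested; the
pass relies on the same layout to bzero and free all sub-vectors at
once.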
/* Schedule a region.  A region is either an inner loop, a loop-free
   subroutine, or a single basic block.  Each bb in the region is
   scheduled after its flow predecessors.  */

static void
schedule_region (rgn)
     int rgn;
{
  int bb;
  int rgn_n_insns = 0;
  int sched_rgn_n_insns = 0;
  int initial_deaths;
  sbitmap blocks;

  /* Set variables for the current region.  */
  current_nr_blocks = RGN_NR_BLOCKS (rgn);
  current_blocks = RGN_BLOCKS (rgn);

  reg_pending_sets = ALLOCA_REG_SET ();
  reg_pending_clobbers = ALLOCA_REG_SET ();
  reg_pending_sets_all = 0;

  /* Create a bitmap of the blocks in this region.  */
  blocks = sbitmap_alloc (n_basic_blocks);
  sbitmap_zero (blocks);

  for (bb = current_nr_blocks - 1; bb >= 0; --bb)
    SET_BIT (blocks, BB_TO_BLOCK (bb));
  /* Initializations for region data dependence analysis.  */
  if (current_nr_blocks > 1)
    {
      rtx *space;
      int maxreg = max_reg_num ();

      bb_reg_last_uses = (rtx **) alloca (current_nr_blocks * sizeof (rtx *));
      space = (rtx *) alloca (current_nr_blocks * maxreg * sizeof (rtx));
      bzero ((char *) space, current_nr_blocks * maxreg * sizeof (rtx));
      init_rtx_vector (bb_reg_last_uses, space, current_nr_blocks,
		       maxreg * sizeof (rtx *));

      bb_reg_last_sets = (rtx **) alloca (current_nr_blocks * sizeof (rtx *));
      space = (rtx *) alloca (current_nr_blocks * maxreg * sizeof (rtx));
      bzero ((char *) space, current_nr_blocks * maxreg * sizeof (rtx));
      init_rtx_vector (bb_reg_last_sets, space, current_nr_blocks,
		       maxreg * sizeof (rtx *));

      bb_reg_last_clobbers =
	(rtx **) alloca (current_nr_blocks * sizeof (rtx *));
      space = (rtx *) alloca (current_nr_blocks * maxreg * sizeof (rtx));
      bzero ((char *) space, current_nr_blocks * maxreg * sizeof (rtx));
      init_rtx_vector (bb_reg_last_clobbers, space, current_nr_blocks,
		       maxreg * sizeof (rtx *));

      bb_pending_read_insns = (rtx *) alloca (current_nr_blocks * sizeof (rtx));
      bb_pending_read_mems = (rtx *) alloca (current_nr_blocks * sizeof (rtx));
      bb_pending_write_insns =
	(rtx *) alloca (current_nr_blocks * sizeof (rtx));
      bb_pending_write_mems = (rtx *) alloca (current_nr_blocks * sizeof (rtx));
      bb_pending_lists_length =
	(int *) alloca (current_nr_blocks * sizeof (int));
      bb_last_pending_memory_flush =
	(rtx *) alloca (current_nr_blocks * sizeof (rtx));
      bb_last_function_call = (rtx *) alloca (current_nr_blocks * sizeof (rtx));
      bb_sched_before_next_call =
	(rtx *) alloca (current_nr_blocks * sizeof (rtx));

      init_rgn_data_dependences (current_nr_blocks);
    }
  /* Compute LOG_LINKS.  */
  for (bb = 0; bb < current_nr_blocks; bb++)
    compute_block_backward_dependences (bb);

  /* Compute INSN_DEPEND.  */
  for (bb = current_nr_blocks - 1; bb >= 0; bb--)
    compute_block_forward_dependences (bb);

  /* Compute INSN_REG_WEIGHT.  */
  for (bb = current_nr_blocks - 1; bb >= 0; bb--)
    find_insn_reg_weight (bb);

  /* Remove death notes.  */
  initial_deaths = count_or_remove_death_notes (blocks, 1);

  /* Delete line notes and set priorities.  */
  for (bb = 0; bb < current_nr_blocks; bb++)
    {
      if (write_symbols != NO_DEBUG)
	{
	  save_line_notes (bb);
	  rm_line_notes (bb);
	}

      rgn_n_insns += set_priorities (bb);
    }
  /* Compute interblock info: probabilities, split-edges, dominators, etc.  */
  if (current_nr_blocks > 1)
    {
      int i;

      prob = (float *) alloca ((current_nr_blocks) * sizeof (float));

      bbset_size = current_nr_blocks / HOST_BITS_PER_WIDE_INT + 1;
      dom = (bbset *) alloca (current_nr_blocks * sizeof (bbset));
      for (i = 0; i < current_nr_blocks; i++)
	{
	  dom[i] = (bbset) alloca (bbset_size * sizeof (HOST_WIDE_INT));
	  bzero ((char *) dom[i], bbset_size * sizeof (HOST_WIDE_INT));
	}

      /* Number the edges belonging to this region.  */
      rgn_nr_edges = 0;
      edge_to_bit = (int *) alloca (nr_edges * sizeof (int));
      for (i = 1; i < nr_edges; i++)
	if (CONTAINING_RGN (FROM_BLOCK (i)) == rgn)
	  EDGE_TO_BIT (i) = rgn_nr_edges++;
      rgn_edges = (int *) alloca (rgn_nr_edges * sizeof (int));

      rgn_nr_edges = 0;
      for (i = 1; i < nr_edges; i++)
	if (CONTAINING_RGN (FROM_BLOCK (i)) == (rgn))
	  rgn_edges[rgn_nr_edges++] = i;

      /* Split edges.  */
      edgeset_size = rgn_nr_edges / HOST_BITS_PER_WIDE_INT + 1;
      pot_split = (edgeset *) alloca (current_nr_blocks * sizeof (edgeset));
      ancestor_edges = (edgeset *) alloca (current_nr_blocks
					   * sizeof (edgeset));
      for (i = 0; i < current_nr_blocks; i++)
	{
	  pot_split[i] =
	    (edgeset) alloca (edgeset_size * sizeof (HOST_WIDE_INT));
	  bzero ((char *) pot_split[i],
		 edgeset_size * sizeof (HOST_WIDE_INT));
	  ancestor_edges[i] =
	    (edgeset) alloca (edgeset_size * sizeof (HOST_WIDE_INT));
	  bzero ((char *) ancestor_edges[i],
		 edgeset_size * sizeof (HOST_WIDE_INT));
	}

      /* Compute probabilities, dominators, split_edges.  */
      for (bb = 0; bb < current_nr_blocks; bb++)
	compute_dom_prob_ps (bb);
    }
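/* The bbset/edgeset structures above are word-packed bitsets: N bits
   stored in N / HOST_BITS_PER_WIDE_INT + 1 machine words, with bit I
   living in word I / wordsize at position I % wordsize.  A standalone
   sketch of the same layout (illustrative names, unsigned long in
   place of HOST_WIDE_INT, heap allocation in place of alloca):  */

```c
#include <limits.h>
#include <stdlib.h>

/* Bits per word of the set, analogous to HOST_BITS_PER_WIDE_INT.  */
#define SET_BITS_PER_WORD (CHAR_BIT * sizeof (unsigned long))

/* Allocate a zeroed bitset able to hold NBITS bits, using the same
   "nbits / wordsize + 1" sizing rule as bbset_size/edgeset_size.  */
static unsigned long *
bitset_alloc (int nbits)
{
  size_t nwords = nbits / SET_BITS_PER_WORD + 1;
  return calloc (nwords, sizeof (unsigned long));
}

/* Set bit BIT, as the dominator/split-edge code does for blocks.  */
static void
bitset_set (unsigned long *set, int bit)
{
  set[bit / SET_BITS_PER_WORD] |= 1UL << (bit % SET_BITS_PER_WORD);
}

/* Test bit BIT.  */
static int
bitset_test (const unsigned long *set, int bit)
{
  return (set[bit / SET_BITS_PER_WORD] >> (bit % SET_BITS_PER_WORD)) & 1;
}
```

The "+ 1" over-allocates by up to one word when nbits is an exact
multiple of the word size, which keeps the sizing expression simple at
a negligible space cost.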
  /* Now we can schedule all blocks.  */
  for (bb = 0; bb < current_nr_blocks; bb++)
    sched_rgn_n_insns += schedule_block (bb, rgn_n_insns);

  /* Sanity check: verify that all region insns were scheduled.  */
  if (sched_rgn_n_insns != rgn_n_insns)
    abort ();

  /* Update register life and usage information.  Scheduling a multi-block
     region requires a global update.  */
  if (current_nr_blocks > 1)
    update_life_info (blocks, UPDATE_LIFE_GLOBAL);
  else
    {
      update_life_info (blocks, UPDATE_LIFE_LOCAL);

      /* In the single block case, the count of registers that died should
	 not have changed during the schedule.  */
      if (count_or_remove_death_notes (blocks, 0) != initial_deaths)
	abort ();
    }

  /* Restore line notes.  */
  if (write_symbols != NO_DEBUG)
    {
      for (bb = 0; bb < current_nr_blocks; bb++)
	restore_line_notes (bb);
    }

  /* Done with this region.  */
  free_pending_lists ();

  FREE_REG_SET (reg_pending_sets);
  FREE_REG_SET (reg_pending_clobbers);
  sbitmap_free (blocks);
}
/* The one entry point in this file.  DUMP_FILE is the dump file for
   this pass.  */

void
schedule_insns (dump_file)
     FILE *dump_file;
{
  int max_uid;
  int b;
  rtx insn;
  int rgn;
  int luid;

#ifdef HAVE_cc0
  /* If cc0 is defined, disable speculative loads in its presence.  */
  flag_schedule_speculative_load = 0;
#endif

  /* Taking care of this degenerate case makes the rest of
     this code simpler.  */
  if (n_basic_blocks == 0)
    return;

  /* Set dump and sched_verbose for the desired debugging output.  If no
     dump-file was specified, but -fsched-verbose-N (any N), print to stderr.
     For -fsched-verbose-N, N>=10, print everything to stderr.  */
  sched_verbose = sched_verbose_param;
  if (sched_verbose_param == 0 && dump_file)
    sched_verbose = 1;
  dump = ((sched_verbose_param >= 10 || !dump_file) ? stderr : dump_file);

  nr_inter = 0;
  nr_spec = 0;

  /* Initialize issue_rate.  */
  issue_rate = ISSUE_RATE;
  split_all_insns (1);

  max_uid = (get_max_uid () + 1);

  cant_move = xcalloc (max_uid, sizeof (char));
  fed_by_spec_load = xcalloc (max_uid, sizeof (char));
  is_load_insn = xcalloc (max_uid, sizeof (char));

  insn_orig_block = (int *) xmalloc (max_uid * sizeof (int));
  insn_luid = (int *) xmalloc (max_uid * sizeof (int));

  luid = 0;
  for (b = 0; b < n_basic_blocks; b++)
    for (insn = BLOCK_HEAD (b);; insn = NEXT_INSN (insn))
      {
	INSN_BLOCK (insn) = b;
	INSN_LUID (insn) = luid++;

	if (insn == BLOCK_END (b))
	  break;
      }

  rgn_table = (region *) alloca ((n_basic_blocks) * sizeof (region));
  rgn_bb_table = (int *) alloca ((n_basic_blocks) * sizeof (int));
  block_to_bb = (int *) alloca ((n_basic_blocks) * sizeof (int));
  containing_rgn = (int *) alloca ((n_basic_blocks) * sizeof (int));
  /* Compute regions for scheduling.  */
  if (reload_completed
      || n_basic_blocks == 1
      || !flag_schedule_interblock)
    {
      find_single_block_region ();
    }
  else
    {
      /* Verify that a 'good' control flow graph can be built.  */
      if (is_cfg_nonregular ())
	{
	  find_single_block_region ();
	}
      else
	{
	  int_list_ptr *s_preds, *s_succs;
	  int *num_preds, *num_succs;
	  sbitmap *dom, *pdom;

	  s_preds = (int_list_ptr *) alloca (n_basic_blocks
					     * sizeof (int_list_ptr));
	  s_succs = (int_list_ptr *) alloca (n_basic_blocks
					     * sizeof (int_list_ptr));
	  num_preds = (int *) alloca (n_basic_blocks * sizeof (int));
	  num_succs = (int *) alloca (n_basic_blocks * sizeof (int));
	  dom = sbitmap_vector_alloc (n_basic_blocks, n_basic_blocks);
	  pdom = sbitmap_vector_alloc (n_basic_blocks, n_basic_blocks);

	  /* The scheduler runs after flow; therefore, we can't blindly call
	     back into find_basic_blocks since doing so could invalidate the
	     info in global_live_at_start.

	     Consider a block consisting entirely of dead stores; after life
	     analysis it would be a block of NOTE_INSN_DELETED notes.  If
	     we call find_basic_blocks again, then the block would be removed
	     entirely and invalidate the register live information.

	     We could (should?) recompute register live information.  Doing
	     so may even be beneficial.  */

	  compute_preds_succs (s_preds, s_succs, num_preds, num_succs);

	  /* Compute the dominators and post dominators.  We don't
	     currently use post dominators, but we should for
	     speculative motion analysis.  */
	  compute_dominators (dom, pdom, s_preds, s_succs);

	  /* build_control_flow will return nonzero if it detects unreachable
	     blocks or any other irregularity with the cfg which prevents
	     cross block scheduling.  */
	  if (build_control_flow (s_preds, s_succs, num_preds, num_succs) != 0)
	    find_single_block_region ();
	  else
	    find_rgns (s_preds, s_succs, num_preds, num_succs, dom);

	  if (sched_verbose >= 3)
	    debug_regions ();

	  /* For now.  This will move as more and more of haifa is converted
	     to using the cfg code in flow.c.  */
	  free (dom);
	  free (pdom);
	}
    }
  /* Allocate data for this pass.  See comments, above,
     for what these vectors do.

     We use xmalloc instead of alloca, because max_uid can be very large
     when there is a lot of function inlining.  If we used alloca, we could
     exceed stack limits on some hosts for some inputs.  */
  insn_priority = (int *) xcalloc (max_uid, sizeof (int));
  insn_reg_weight = (int *) xcalloc (max_uid, sizeof (int));
  insn_tick = (int *) xcalloc (max_uid, sizeof (int));
  insn_costs = (short *) xcalloc (max_uid, sizeof (short));
  insn_units = (short *) xcalloc (max_uid, sizeof (short));
  insn_blockage = (unsigned int *) xcalloc (max_uid, sizeof (unsigned int));
  insn_ref_count = (int *) xcalloc (max_uid, sizeof (int));

  /* Allocate for forward dependencies.  */
  insn_dep_count = (int *) xcalloc (max_uid, sizeof (int));
  insn_depend = (rtx *) xcalloc (max_uid, sizeof (rtx));
  init_alias_analysis ();

  if (write_symbols != NO_DEBUG)
    {
      rtx line;

      line_note = (rtx *) xcalloc (max_uid, sizeof (rtx));
      line_note_head = (rtx *) alloca (n_basic_blocks * sizeof (rtx));
      bzero ((char *) line_note_head, n_basic_blocks * sizeof (rtx));

      /* Save-line-note-head:
	 Determine the line-number at the start of each basic block.
	 This must be computed and saved now, because after a basic block's
	 predecessor has been scheduled, it is impossible to accurately
	 determine the correct line number for the first insn of the block.  */

      for (b = 0; b < n_basic_blocks; b++)
	for (line = BLOCK_HEAD (b); line; line = PREV_INSN (line))
	  if (GET_CODE (line) == NOTE && NOTE_LINE_NUMBER (line) > 0)
	    {
	      line_note_head[b] = line;
	      break;
	    }
    }
  /* Find units used in this function, for visualization.  */
  if (sched_verbose)
    init_target_units ();

  /* ??? Add a NOTE after the last insn of the last basic block.  It is not
     known why this is done.  */

  insn = BLOCK_END (n_basic_blocks - 1);
  if (NEXT_INSN (insn) == 0
      || (GET_CODE (insn) != NOTE
	  && GET_CODE (insn) != CODE_LABEL
	  /* Don't emit a NOTE if it would end up between an unconditional
	     jump and a BARRIER.  */
	  && !(GET_CODE (insn) == JUMP_INSN
	       && GET_CODE (NEXT_INSN (insn)) == BARRIER)))
    emit_note_after (NOTE_INSN_DELETED, BLOCK_END (n_basic_blocks - 1));
  /* Schedule every region in the subroutine.  */
  for (rgn = 0; rgn < nr_regions; rgn++)
    schedule_region (rgn);

  /* Reposition the prologue and epilogue notes in case we moved the
     prologue/epilogue insns.  */
  if (reload_completed)
    reposition_prologue_and_epilogue_notes (get_insns ());

  /* Delete redundant line notes.  */
  if (write_symbols != NO_DEBUG)
    rm_redundant_line_notes ();

  if (sched_verbose)
    {
      if (reload_completed == 0 && flag_schedule_interblock)
	{
	  fprintf (dump, "\n;; Procedure interblock/speculative motions == %d/%d \n",
		   nr_inter, nr_spec);
	}
      fprintf (dump, "\n\n");
    }
  /* Clean up.  */
  free (cant_move);
  free (fed_by_spec_load);
  free (is_load_insn);
  free (insn_orig_block);
  free (insn_luid);

  free (insn_priority);
  free (insn_reg_weight);
  free (insn_tick);
  free (insn_costs);
  free (insn_units);
  free (insn_blockage);
  free (insn_ref_count);

  free (insn_dep_count);
  free (insn_depend);

  if (write_symbols != NO_DEBUG)
    free (line_note);
}
#endif /* INSN_SCHEDULING */