2001-05-22 Toon Moene <toon@moene.indiv.nluug.nl>

diff --git a/gcc/md.texi b/gcc/md.texi

index c019312..aacb608 100644
--- a/gcc/md.texi
+++ b/gcc/md.texi
@@ -1,4 +1,4 @@
-@c Copyright (C) 1988, 89, 92, 93, 94, 96, 1998, 2000 Free Software Foundation, Inc.
+@c Copyright (C) 1988, 89, 92, 93, 94, 96, 1998, 2000, 2001 Free Software Foundation, Inc.
  @c This is part of the GCC manual.
  @c For copying conditions, see the file gcc.texi.
  
@@ -19,6 +19,7 @@ is inside a quoted string.
  See the next chapter for information on the C header file.
  
  @menu
+* Overview::            How the machine description is used.
  * Patterns::            How to write instruction patterns.
  * Example::             An explained example of a @code{define_insn} pattern.
  * RTL Template::        The RTL template defines what insns match a pattern.
@@ -31,14 +32,67 @@ See the next chapter for information on the C header file.
  * Pattern Ordering::    When the order of patterns makes a difference.
  * Dependent Patterns::  Having one pattern may make you need another.
  * Jump Patterns::       Special considerations for patterns for jump insns.
+* Looping Patterns::    How to define patterns for special looping insns.
  * Insn Canonicalizations::Canonicalization of Instructions
  * Expander Definitions::Generating a sequence of several RTL insns
                            for a standard operation.
  * Insn Splitting::      Splitting Instructions into Multiple Instructions.
  * Peephole Definitions::Defining machine-specific peephole optimizations.
  * Insn Attributes::     Specifying the value of attributes for generated insns.
+* Conditional Execution::Generating @code{define_insn} patterns for
+                           predication.
+* Constant Definitions::Defining symbolic constants that can be used in the
+                        md file.
  @end menu
  
+@node Overview
+@section Overview of How the Machine Description is Used
+
+There are three main conversions that happen in the compiler:
+
+@enumerate
+
+@item
+The front end reads the source code and builds a parse tree.
+
+@item
+The parse tree is used to generate an RTL insn list based on named
+instruction patterns.
+
+@item
+The insn list is matched against the RTL templates to produce assembler
+code.
+
+@end enumerate
+
+For the generate pass, only the names of the insns matter, from either a
+named @code{define_insn} or a @code{define_expand}.  The compiler will
+choose the pattern with the right name and apply the operands according
+to the documentation later in this chapter, without regard for the RTL
+template or operand constraints.  Note that the names the compiler looks
+for are hard-coded in the compiler; it will ignore unnamed patterns and
+patterns with names it doesn't know about, but if you don't provide a
+named pattern it needs, it will abort.
+
+If a @code{define_insn} is used, the template given is inserted into the
+insn list.  If a @code{define_expand} is used, one of three things
+happens, based on the condition logic.  The condition logic may manually
+create new insns for the insn list, say via @code{emit_insn()}, and
+invoke @code{DONE}.  For certain named patterns, it may invoke @code{FAIL} to tell the
+compiler to use an alternate way of performing that task.  If it invokes
+neither @code{DONE} nor @code{FAIL}, the template given in the pattern
+is inserted, as if the @code{define_expand} were a @code{define_insn}.
+
+Once the insn list is generated, various optimization passes convert,
+replace, and rearrange the insns in the insn list.  This is where the
+@code{define_split} and @code{define_peephole} patterns get used, for
+example.
+
+Finally, the insn list's RTL is matched up with the RTL templates in the
+@code{define_insn} patterns, and those patterns are used to emit the
+final assembly code.  For this purpose, each named @code{define_insn}
+acts like it's unnamed, since the names are ignored.
+
  @node Patterns
  @section Everything about Instruction Patterns
  @cindex patterns
@@ -263,6 +317,16 @@ number @var{n} has already been determined by a @code{match_operand}
  appearing earlier in the recognition template, and it matches only an
  identical-looking expression.
  
+Note that @code{match_dup} should not be used to tell the compiler that
+a particular register is being used for two operands (example:
+@code{add} that adds one register to another; the second register is
+both an input operand and the output operand).  Use a matching
+constraint (@pxref{Simple Constraints}) for those.  @code{match_dup} is for the cases where one
+operand is used in two places in the template, such as an instruction
+that computes both a quotient and a remainder, where the opcode takes
+two input operands but the RTL template has to refer to each of those
+twice; once for the quotient pattern and once for the remainder pattern.
+
  @findex match_operator
  @item (match_operator:@var{m} @var{n} @var{predicate} [@var{operands}@dots{}])
  This pattern is a kind of placeholder for a variable RTL expression
@@ -726,13 +790,6 @@ postincrement) is allowed.
  A register operand is allowed provided that it is in a general
  register.
  
-@cindex @samp{d} in constraint
-@item @samp{d}, @samp{a}, @samp{f}, @dots{}
-Other letters can be defined in machine-dependent fashion to stand for
-particular classes of registers.  @samp{d}, @samp{a} and @samp{f} are
-defined on the 68000/68020 to stand for data, address and floating
-point registers.
-
  @cindex constants in constraints
  @cindex @samp{i} in constraint
  @item @samp{i}
@@ -864,22 +921,21 @@ as the predicate in the @code{match_operand}.  This predicate interprets
  the mode specified in the @code{match_operand} as the mode of the memory
  reference for which the address would be valid.
  
+@cindex other register constraints
  @cindex extensible constraints
-@cindex @samp{Q}, in constraint
-@item @samp{Q}, @samp{R}, @samp{S}, @dots{} @samp{U}
-Letters in the range @samp{Q} through @samp{U} may be defined in a
-machine-dependent fashion to stand for arbitrary operand types.
-@ifset INTERNALS
-The machine description macro @code{EXTRA_CONSTRAINT} is passed the
-operand as its first argument and the constraint letter as its
-second operand.
+@item @var{other letters}
+Other letters can be defined in machine-dependent fashion to stand for
+particular classes of registers or other arbitrary operand types.
+@samp{d}, @samp{a} and @samp{f} are defined on the 68000/68020 to stand
+for data, address and floating point registers.
  
-A typical use for this would be to distinguish certain types of
-memory references that affect other insn operands.
+@ifset INTERNALS
+The machine description macro @code{REG_CLASS_FROM_LETTER} has first
+cut at the otherwise unused letters.  If it evaluates to @code{NO_REGS},
+then @code{EXTRA_CONSTRAINT} is evaluated.
  
-Do not define these constraint letters to accept register references
-(@code{reg}); the reload pass does not expect this and would not handle
-it properly.
+A typical use for @code{EXTRA_CONSTRAINT} would be to distinguish certain
+types of memory references that affect other insn operands.
  @end ifset
  @end table
  
@@ -1320,20 +1376,20 @@ Constant greater than 0, less than 0x10000
  Constant whose high 24 bits are on (1)
  
  @item L
-16 bit constant whose high 8 bits are on (1)
+16-bit constant whose high 8 bits are on (1)
  
  @item M
-32 bit constant whose high 16 bits are on (1)
+32-bit constant whose high 16 bits are on (1)
  
  @item N
-32 bit negative constant that fits in 8 bits
+32-bit negative constant that fits in 8 bits
  
  @item O
-The constant 0x80000000 or, on the 29050, any 32 bit constant
+The constant 0x80000000 or, on the 29050, any 32-bit constant
  whose low 16 bits are 0.
  
  @item P
-16 bit negative constant that fits in 8 bits
+16-bit negative constant that fits in 8 bits
  
  @item G
  @itemx H
@@ -1353,7 +1409,7 @@ Registers from r16 to r23
  Registers from r16 to r31
  
  @item w
-Register from r24 to r31. This registers can be used in @samp{addw} command
+Registers from r24 to r31.  These registers can be used in the @samp{adiw} command
  
  @item e
  Pointer register (r26 - r31)
@@ -1361,6 +1417,9 @@ Pointer register (r26 - r31)
  @item b
  Base pointer register (r28 - r31)
  
+@item q
+Stack pointer register (SPH:SPL)
+
  @item t
  Temporary register r0
  
@@ -1392,7 +1451,7 @@ Constant that fits in 8 bits
  Constant integer -1
  
  @item O
-Constant integer 8
+Constant integer 8, 16, or 24
  
  @item P
  Constant integer 1
@@ -1431,17 +1490,17 @@ Floating point register
  @samp{FPMEM} stack memory for FPR-GPR transfers
  
  @item I
-Signed 16 bit constant
+Signed 16-bit constant
  
  @item J
-Unsigned 16 bit constant shifted left 16 bits (use @samp{L} instead for 
+Unsigned 16-bit constant shifted left 16 bits (use @samp{L} instead for 
  @code{SImode} constants)
  
  @item K
-Unsigned 16 bit constant
+Unsigned 16-bit constant
  
  @item L
-Signed 16 bit constant shifted left 16 bits
+Signed 16-bit constant shifted left 16 bits
  
  @item M
  Constant larger than 31
@@ -1453,7 +1512,7 @@ Exact power of 2
  Zero
  
  @item P
-Constant whose negation is a signed 16 bit constant
+Constant whose negation is a signed 16-bit constant
  
  @item G
  Floating point constant that can be loaded into a register with one
@@ -1479,10 +1538,24 @@ System V Release 4 small data area reference
  @item Intel 386---@file{i386.h}
  @table @code
  @item q
-@samp{a}, @code{b}, @code{c}, or @code{d} register
+@samp{a}, @code{b}, @code{c}, or @code{d} register for the i386.
+For x86-64 it is equivalent to the @samp{r} class (for 8-bit instructions
+that do not use upper halves).
+
+@item Q
+@samp{a}, @code{b}, @code{c}, or @code{d} register (for 8-bit instructions
+that do use upper halves).
+
+@item R
+Legacy register, equivalent to the @code{r} class in i386 mode
+(for non-8-bit registers used together with 8-bit upper halves in a single
+instruction).
  
  @item A
-@samp{a}, or @code{d} register (for 64-bit ints)
+Specifies the @samp{a} or @samp{d} registers.  This is primarily useful
+for 64-bit integer values (when in 32-bit mode) intended to be returned
+with the @samp{d} register holding the most significant bits and the
+@samp{a} register holding the least significant bits.
  
  @item f
  Floating point register
@@ -1511,11 +1584,17 @@ Second floating point register
  @item S
  @samp{si} register
  
+@item x
+@samp{xmm} SSE register
+
+@item y
+MMX register
+
  @item I
-Constant in range 0 to 31 (for 32 bit shifts)
+Constant in range 0 to 31 (for 32-bit shifts)
  
  @item J
-Constant in range 0 to 63 (for 64 bit shifts)
+Constant in range 0 to 63 (for 64-bit shifts)
  
  @item K
  @samp{0xff}
@@ -1529,6 +1608,14 @@ Constant in range 0 to 63 (for 64 bit shifts)
  @item N
  Constant in range 0 to 255 (for @code{out} instruction)
  
+@item Z
+Constant in range 0 to 0xffffffff, or symbolic reference known to fit that range
+(for immediates in x86-64 instructions that zero-extend 32-bit values to 64 bits)
+
+@item e
+Constant in range -2147483648 to 2147483647, or symbolic reference known to fit that range
+(for immediates in 64-bit x86-64 instructions)
+
  @item G
  Standard 80387 floating point constant
  @end table
@@ -1587,7 +1674,7 @@ General-purpose integer register
  Floating-point status register
  
  @item I
-Signed 16 bit constant (for arithmetic instructions)
+Signed 16-bit constant (for arithmetic instructions)
  
  @item J
  Zero
@@ -1599,17 +1686,17 @@ Zero-extended 16-bit constant (for logic instructions)
  Constant with low 16 bits zero (can be loaded with @code{lui})
  
  @item M
-32 bit constant which requires two instructions to load (a constant
+32-bit constant which requires two instructions to load (a constant
  which is not @samp{I}, @samp{K}, or @samp{L})
  
  @item N
-Negative 16 bit constant
+Negative 16-bit constant
  
  @item O
  Exact power of two
  
  @item P
-Positive 16 bit constant
+Positive 16-bit constant
  
  @item G
  Floating point zero
@@ -1648,7 +1735,7 @@ First 16 Sun FPA registers, if available
  Integer in the range 1 to 8
  
  @item J
-16 bit signed number
+16-bit signed number
  
  @item K
  Signed number whose magnitude is greater than 0x80
@@ -1666,30 +1753,88 @@ Floating point constant that is not a 68881 constant
  Floating point constant that can be used by Sun FPA
  @end table
  
+@item Motorola 68HC11 & 68HC12 families---@file{m68hc11.h}
+@table @code
+@item a
+Register 'a'
+
+@item b
+Register 'b'
+
+@item d
+Register 'd'
+
+@item q
+An 8-bit register
+
+@item t
+Temporary soft register _.tmp
+
+@item u
+A soft register _.d1 to _.d31
+
+@item w
+Stack pointer register
+
+@item x
+Register 'x'
+
+@item y
+Register 'y'
+
+@item z
+Pseudo register 'z' (replaced by 'x' or 'y' at the end)
+
+@item A
+An address register: x, y or z
+
+@item B
+An address register: x or y
+
+@item D
+Register pair (x:d) to form a 32-bit value
+
+@item L
+Constants in the range -65536 to 65535
+
+@item M
+Constants whose 16-bit low part is zero
+
+@item N
+Constant integer 1 or -1
+
+@item O
+Constant integer 16
+
+@item P
+Constants in the range -8 to 2
+
+@end table
+
  @need 1000
  @item SPARC---@file{sparc.h}
  @table @code
  @item f
-Floating-point register that can hold 32 or 64 bit values.
+Floating-point register that can hold 32- or 64-bit values.
  
  @item e
-Floating-point register that can hold 64 or 128 bit values.
+Floating-point register that can hold 64- or 128-bit values.
  
  @item I
-Signed 13 bit constant
+Signed 13-bit constant
  
  @item J
  Zero
  
  @item K
-32 bit constant with the low 12 bits clear (a constant that can be
+32-bit constant with the low 12 bits clear (a constant that can be
  loaded with the @code{sethi} instruction)
  
  @item G
  Floating-point zero
  
  @item H
-Signed 13 bit constant, sign-extended to 32 or 64 bits
+Signed 13-bit constant, sign-extended to 32 or 64 bits
  
  @item Q
  Floating-point constant whose integral representation can
@@ -1723,22 +1868,22 @@ Auxiliary (address) register (ar0-ar7)
  Stack pointer register (sp)
  
  @item c
-Standard (32 bit) precision integer register
+Standard (32-bit) precision integer register
  
  @item f
-Extended (40 bit) precision register (r0-r11)
+Extended (40-bit) precision register (r0-r11)
  
  @item k
  Block count register (bk)
  
  @item q
-Extended (40 bit) precision low register (r0-r7)
+Extended (40-bit) precision low register (r0-r7)
  
  @item t
-Extended (40 bit) precision register (r0-r1)
+Extended (40-bit) precision register (r0-r1)
  
  @item u
-Extended (40 bit) precision register (r2-r3)
+Extended (40-bit) precision register (r2-r3)
  
  @item v
  Repeat count register (rc)
@@ -1756,34 +1901,34 @@ Data page register (dp)
  Floating-point zero
  
  @item H
-Immediate 16 bit floating-point constant
+Immediate 16-bit floating-point constant
  
  @item I
-Signed 16 bit constant
+Signed 16-bit constant
  
  @item J
-Signed 8 bit constant
+Signed 8-bit constant
  
  @item K
-Signed 5 bit constant
+Signed 5-bit constant
  
  @item L
-Unsigned 16 bit constant
+Unsigned 16-bit constant
  
  @item M
-Unsigned 8 bit constant
+Unsigned 8-bit constant
  
  @item N
-Ones complement of unsigned 16 bit constant
+Ones complement of unsigned 16-bit constant
  
  @item O
-High 16 bit constant (32 bit constant with 16 LSBs zero)
+High 16-bit constant (32-bit constant with 16 LSBs zero)
  
  @item Q
-Indirect memory reference with signed 8 bit or index register displacement 
+Indirect memory reference with signed 8-bit or index register displacement 
  
  @item R
-Indirect memory reference with unsigned 5 bit displacement
+Indirect memory reference with unsigned 5-bit displacement
  
  @item S
  Indirect memory reference with 1 bit or index register displacement 
@@ -1822,8 +1967,10 @@ to store the specified value in the part of the register that corresponds
  to mode @var{m}.  The effect on the rest of the register is undefined.
  
  This class of patterns is special in several ways.  First of all, each
-of these names @emph{must} be defined, because there is no other way
-to copy a datum from one place to another.
+of these names up to and including full word size @emph{must} be defined,
+because there is no other way to copy a datum from one place to another.
+If there are patterns accepting operands in larger modes,
+@samp{mov@var{m}} must be defined for integer modes of those sizes.
  
  Second, these patterns are not used solely in the RTL generation pass.
  Even the reload pass can generate move insns to copy values from stack
@@ -1841,8 +1988,7 @@ function which might generate new pseudo registers.
  
  This requirement exists even for subword modes on a RISC machine where
  fetching those modes from memory normally requires several insns and
-some temporary registers.  Look in @file{spur.md} to see how the
-requirement can be satisfied.
+some temporary registers.
  
  @findex change_address
  During reload a memory reference with an invalid address may be passed
@@ -1969,6 +2115,13 @@ means of constraints requiring operands 1 and 0 to be the same location.
  @itemx @samp{smin@var{m}3}, @samp{smax@var{m}3}, @samp{umin@var{m}3}, @samp{umax@var{m}3}
  @itemx @samp{and@var{m}3}, @samp{ior@var{m}3}, @samp{xor@var{m}3}
  Similar, for other arithmetic operations.
+@cindex @code{min@var{m}3} instruction pattern
+@cindex @code{max@var{m}3} instruction pattern
+@itemx @samp{min@var{m}3}, @samp{max@var{m}3}
+Floating point min and max operations.  If both operands are zeros,
+or if either operand is NaN, then it is unspecified which of the two
+operands is returned as the result.
+
  
  @cindex @code{mulhisi3} instruction pattern
  @item @samp{mulhisi3}
@@ -2479,6 +2632,47 @@ table it uses.  Its assembler code normally has no need to use the
  second operand, but you should incorporate it in the RTL pattern so
  that the jump optimizer will not delete the table as unreachable code.
  
+
+@cindex @code{decrement_and_branch_until_zero} instruction pattern
+@item @samp{decrement_and_branch_until_zero}
+Conditional branch instruction that decrements a register and
+jumps if the register is non-zero.  Operand 0 is the register to
+decrement and test; operand 1 is the label to jump to if the
+register is non-zero.  @xref{Looping Patterns}.
+
+This optional instruction pattern is only used by the combiner,
+typically for loops reversed by the loop optimizer when strength
+reduction is enabled.
+
+@cindex @code{doloop_end} instruction pattern
+@item @samp{doloop_end}
+Conditional branch instruction that decrements a register and jumps if
+the register is non-zero.  This instruction takes five operands: Operand
+0 is the register to decrement and test; operand 1 is the number of loop
+iterations as a @code{const_int} or @code{const0_rtx} if this cannot be
+determined until run-time; operand 2 is the actual or estimated maximum
+number of iterations as a @code{const_int}; operand 3 is the number of
+enclosed loops as a @code{const_int} (an innermost loop has a value of
+1); operand 4 is the label to jump to if the register is non-zero.
+@xref{Looping Patterns}.
+
+This optional instruction pattern should be defined for machines with
+low-overhead looping instructions as the loop optimizer will try to
+modify suitable loops to utilize it.  If nested low-overhead looping is
+not supported, use a @code{define_expand} (@pxref{Expander Definitions})
+and make the pattern fail if operand 3 is not @code{const1_rtx}.
+Similarly, if the actual or estimated maximum number of iterations is
+too large for this instruction, make it fail.
+
+@cindex @code{doloop_begin} instruction pattern
+@item @samp{doloop_begin}
+Companion instruction to @code{doloop_end} required for machines that
+need to perform some initialisation, such as loading special registers
+used by a low-overhead looping instruction.  If initialisation insns do
+not always need to be emitted, use a @code{define_expand}
+(@pxref{Expander Definitions}) and make it fail.
+
+
  @cindex @code{canonicalize_funcptr_for_compare} instruction pattern
  @item @samp{canonicalize_funcptr_for_compare}
  Canonicalize the function pointer in operand 1 and store the result
@@ -2663,31 +2857,29 @@ You will not normally need to define this pattern unless you also define
  @code{builtin_setjmp_setup}.  The single argument is a pointer to the
  @code{jmp_buf}.
  
-@cindex @code{eh_epilogue} instruction pattern
-@item @samp{eh_epilogue}
+@cindex @code{eh_return} instruction pattern
+@item @samp{eh_return}
  This pattern, if defined, affects the way @code{__builtin_eh_return},
-and thence @code{__throw} are built.  It is intended to allow communication
-between the exception handling machinery and the normal epilogue code
-for the target.
-
-The pattern takes three arguments.  The first is the exception context
-pointer.  This will have already been copied to the function return
-register appropriate for a pointer; normally this can be ignored.  The
-second argument is an offset to be added to the stack pointer.  It will 
-have been copied to some arbitrary call-clobbered hard reg so that it
-will survive until after reload to when the normal epilogue is generated. 
-The final argument is the address of the exception handler to which
+and thence the call frame exception handling library routines, are
+built.  It is intended to handle non-trivial actions needed along
+the abnormal return path.
+
+The pattern takes two arguments.  The first is an offset to be applied
+to the stack pointer.  It will have been copied to some appropriate
+location (typically @code{EH_RETURN_STACKADJ_RTX}) which will survive
+until after reload to when the normal epilogue is generated. 
+The second argument is the address of the exception handler to which
  the function should return.  This will normally need to copied by the
-pattern to some special register.
+pattern to some special register or memory location.
  
-This pattern must be defined if @code{RETURN_ADDR_RTX} does not yield
-something that can be reliably and permanently modified, i.e. a fixed
-hard register or a stack memory reference.
+This pattern only needs to be defined if call frame exception handling
+is to be used, and simple moves to @code{EH_RETURN_STACKADJ_RTX} and
+@code{EH_RETURN_HANDLER_RTX} are not sufficient.
  
  @cindex @code{prologue} instruction pattern
  @item @samp{prologue}
  This pattern, if defined, emits RTL for entry to a function.  The function
-entry is resposible for setting up the stack frame, initializing the frame
+entry is responsible for setting up the stack frame, initializing the frame
  pointer register, saving callee saved registers, etc.
  
  Using a prologue pattern is generally preferred over defining
@@ -2699,7 +2891,7 @@ instruction scheduling.
  @cindex @code{epilogue} instruction pattern
  @item @samp{epilogue}
  This pattern, if defined, emits RTL for exit from a function.  The function
-exit is resposible for deallocating the stack frame, restoring callee saved
+exit is responsible for deallocating the stack frame, restoring callee saved
  registers and emitting the return instruction.
  
  Using an epilogue pattern is generally preferred over defining
@@ -2740,6 +2932,14 @@ A typical @code{conditional_trap} pattern looks like
    "@dots{}")
  @end smallexample
  
+@cindex @code{cycle_display} instruction pattern
+@item @samp{cycle_display}
+
+This pattern, if present, will be emitted by the instruction scheduler at
+the beginning of each new clock cycle.  This can be used for annotating the
+assembler output with cycle counts.  Operand 0 is a @code{const_int} that
+holds the clock cycle.
+
  @end table
  
  @node Pattern Ordering
@@ -2958,6 +3158,111 @@ discussed above, we have the pattern
  The @code{SELECT_CC_MODE} macro on the Sparc returns @code{CC_NOOVmode}
  for comparisons whose argument is a @code{plus}.
  
+@node Looping Patterns
+@section Defining Looping Instruction Patterns
+@cindex looping instruction patterns
+@cindex defining looping instruction patterns
+
+Some machines have special jump instructions that can be utilised to
+make loops more efficient.  A common example is the 68000 @samp{dbra}
+instruction which performs a decrement of a register and a branch if the
+result was greater than zero.  Other machines, in particular digital
+signal processors (DSPs), have special block repeat instructions to
+provide low-overhead loop support.  For example, the TI TMS320C3x/C4x
+DSPs have a block repeat instruction that loads special registers to
+mark the top and end of a loop and to count the number of loop
+iterations.  This avoids the need for fetching and executing a
+@samp{dbra}-like instruction and avoids pipeline stalls associated with
+the jump.
+
+GNU CC has three special named patterns to support low overhead looping,
+@samp{decrement_and_branch_until_zero}, @samp{doloop_begin}, and
+@samp{doloop_end}.  The first pattern,
+@samp{decrement_and_branch_until_zero}, is not emitted during RTL
+generation but may be emitted during the instruction combination phase.
+This requires the assistance of the loop optimizer, using information
+collected during strength reduction, to reverse a loop to count down to
+zero.  Some targets also require the loop optimizer to add a
+@code{REG_NONNEG} note to indicate that the iteration count is always
+positive.  This is needed if the target performs a signed loop
+termination test.  For example, the 68000 uses a pattern similar to the
+following for its @code{dbra} instruction:
+
+@smallexample
+@group
+(define_insn "decrement_and_branch_until_zero"
+  [(set (pc)
+       (if_then_else
+         (ge (plus:SI (match_operand:SI 0 "general_operand" "+d*am")
+                      (const_int -1))
+             (const_int 0))
+         (label_ref (match_operand 1 "" ""))
+         (pc)))
+   (set (match_dup 0)
+       (plus:SI (match_dup 0)
+                (const_int -1)))]
+  "find_reg_note (insn, REG_NONNEG, 0)"
+  "...")
+@end group
+@end smallexample
+
+Note that since the insn is both a jump insn and has an output, it must
+deal with its own reloads, hence the `m' constraints.  Also note that
+since this insn is generated by the instruction combination phase
+combining two sequential insns together into an implicit parallel insn,
+the iteration counter needs to be biased by the same amount as the
+decrement operation, in this case -1.  Note that the following similar
+pattern will not be matched by the combiner.
+
+@smallexample
+@group
+(define_insn "decrement_and_branch_until_zero"
+  [(set (pc)
+       (if_then_else
+         (ge (match_operand:SI 0 "general_operand" "+d*am")
+             (const_int 1))
+         (label_ref (match_operand 1 "" ""))
+         (pc)))
+   (set (match_dup 0)
+       (plus:SI (match_dup 0)
+                (const_int -1)))]
+  "find_reg_note (insn, REG_NONNEG, 0)"
+  "...")
+@end group
+@end smallexample
+
+The other two special looping patterns, @samp{doloop_begin} and
+@samp{doloop_end}, are emitted by the loop optimizer for certain
+well-behaved loops with a finite number of loop iterations using
+information collected during strength reduction.  
+
+The @samp{doloop_end} pattern describes the actual looping instruction
+(or the implicit looping operation) and the @samp{doloop_begin} pattern
+is an optional companion pattern that can be used for initialisation
+needed for some low-overhead looping instructions.
+
+Note that some machines require the actual looping instruction to be
+emitted at the top of the loop (e.g., the TMS320C3x/C4x DSPs).  Emitting
+the true RTL for a looping instruction at the top of the loop can cause
+problems with flow analysis.  So instead, a dummy @code{doloop} insn is
+emitted at the end of the loop.  The machine dependent reorg pass checks
+for the presence of this @code{doloop} insn and then searches back to
+the top of the loop, where it inserts the true looping insn (provided
+there are no instructions in the loop which would cause problems).  Any
+additional labels can be emitted at this point.  In addition, if the
+desired special iteration counter register was not allocated, this
+machine dependent reorg pass could emit a traditional compare and jump
+instruction pair.
+
+The essential difference between the
+@samp{decrement_and_branch_until_zero} and the @samp{doloop_end}
+patterns is that the loop optimizer allocates an additional pseudo
+register for the latter as an iteration counter.  This pseudo register
+cannot be used within the loop (i.e., general induction variables cannot
+be derived from it); however, in many cases the loop induction variable
+may become redundant and be removed by the flow pass.
+
+
  @node Insn Canonicalizations
  @section Canonicalization of Instructions
  @cindex canonicalization of instructions
@@ -3176,6 +3481,33 @@ shifting, etc.) and bitfield (@code{extv}, @code{extzv}, and @code{insv})
  operations.
  @end table
  
+If the preparation falls through (invokes neither @code{DONE} nor
+@code{FAIL}), then the @code{define_expand} acts like a
+@code{define_insn} in that the RTL template is used to generate the
+insn.
+
+The RTL template is not used for matching, only for generating the
+initial insn list.  If the preparation statement always invokes
+@code{DONE} or @code{FAIL}, the RTL template may be reduced to a simple
+list of operands, such as this example:
+
+@smallexample
+@group
+(define_expand "addsi3"
+  [(match_operand:SI 0 "register_operand" "")
+   (match_operand:SI 1 "register_operand" "")
+   (match_operand:SI 2 "register_operand" "")]
+@end group
+@group
+  ""
+  "
+@{
+  handle_add (operands[0], operands[1], operands[2]);
+  DONE;
+@}")
+@end group
+@end smallexample
+
  Here is an example, the definition of left-shift for the SPUR chip:
  
  @smallexample
@@ -3429,6 +3761,56 @@ insns that don't.  Instead, write two separate @code{define_split}
  definitions, one for the insns that are valid and one for the insns that
  are not valid.
  
+For the common case where the pattern of a @code{define_split} exactly matches the
+pattern of a @code{define_insn}, use @code{define_insn_and_split}.  It looks like
+this:
+
+@smallexample
+(define_insn_and_split
+  [@var{insn-pattern}]
+  "@var{condition}"
+  "@var{output-template}"
+  "@var{split-condition}"
+  [@var{new-insn-pattern-1}
+   @var{new-insn-pattern-2}
+   @dots{}]
+  "@var{preparation-statements}"
+  [@var{insn-attributes}])
+
+@end smallexample
+
+@var{insn-pattern}, @var{condition}, @var{output-template}, and
+@var{insn-attributes} are used as in @code{define_insn}.  The
+@var{new-insn-pattern} vector and the @var{preparation-statements} are used as
+in a @code{define_split}.  The @var{split-condition} is also used as in
+@code{define_split}, with the additional behavior that if the condition starts
+with @samp{&&}, the condition used for the split will be constructed as a
+logical "and" of the split condition with the insn condition.  For example,
+from i386.md:
+
+@smallexample
+(define_insn_and_split "zero_extendhisi2_and"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+     (zero_extend:SI (match_operand:HI 1 "register_operand" "0")))
+   (clobber (reg:CC 17))]
+  "TARGET_ZERO_EXTEND_WITH_AND && !optimize_size"
+  "#"
+  "&& reload_completed"
+  [(parallel [(set (match_dup 0) (and:SI (match_dup 0) (const_int 65535)))
+             (clobber (reg:CC 17))])]
+  ""
+  [(set_attr "type" "alu1")])
+
+@end smallexample
+
+In this case, the actual split condition will be
+@samp{TARGET_ZERO_EXTEND_WITH_AND && !optimize_size && reload_completed}.
+
+The @code{define_insn_and_split} construction provides exactly the same
+functionality as two separate @code{define_insn} and @code{define_split}
+patterns.  It exists for compactness, and as a maintenance tool to prevent
+having to ensure the two patterns' templates match.
+
  @node Peephole Definitions
  @section Machine-Specific Peephole Optimizers
  @cindex peephole optimizer definitions
@@ -3714,7 +4096,7 @@ able to schedule around the memory load latency.  It allocates a single
  @code{SImode} register of class @code{GENERAL_REGS} (@code{"r"}) that needs
  to be live only at the point just before the arithmetic.
  
-A real example requring extended scratch lifetimes is harder to come by,
+A real example requiring extended scratch lifetimes is harder to come by,
  so here's a silly made-up example:
  
  @smallexample
@@ -4506,3 +4888,135 @@ used during their execution and there is no way of representing that
  conflict.  We welcome any examples of how function unit conflicts work
  in such processors and suggestions for their representation.
  @end ifset
+
+@node Conditional Execution
+@section Conditional Execution
+@cindex conditional execution
+@cindex predication
+
+A number of architectures provide for some form of conditional
+execution, or predication.  The hallmark of this feature is the
+ability to nullify most of the instructions in the instruction set.
+When the instruction set is large and not entirely symmetric, it
+can be quite tedious to describe these forms directly in the
+@file{.md} file.  An alternative is the @code{define_cond_exec} template.
+
+@findex define_cond_exec
+@smallexample
+(define_cond_exec
+  [@var{predicate-pattern}]
+  "@var{condition}"
+  "@var{output template}")
+@end smallexample
+
+@var{predicate-pattern} is the condition that must be true for the
+insn to be executed at runtime and should match a relational operator.
+One can use @code{match_operator} to match several relational operators
+at once.  Any @code{match_operand} operands must have no more than one
+alternative.
+
+@var{condition} is a C expression that must be true for the generated
+pattern to match.
+
+@findex current_insn_predicate
+@var{output template} is a string similar to the @code{define_insn}
+output template (@pxref{Output Template}), except that the @samp{*}
+and @samp{@@} special cases do not apply.  This is only useful if the
+assembly text for the predicate is a simple prefix to the main insn.
+In order to handle the general case, there is a global variable
+@code{current_insn_predicate} that will contain the entire predicate
+if the current insn is predicated, and will otherwise be @code{NULL}.
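+
+As a minimal sketch of these two points, the hypothetical pattern below
+accepts any comparison through @code{match_operator} and leaves the
+output template empty, on the assumption that the target's output code
+consults @code{current_insn_predicate} to print the predicate itself.
+The predicate name @code{comparison_operator}, the @samp{c} constraint
+and @var{test} are placeholders chosen for illustration.
+
+@smallexample
+(define_cond_exec
+  [(match_operator 0 "comparison_operator"
+     [(match_operand:CC 1 "register_operand" "c")
+      (const_int 0)])]
+  "@var{test}"
+  "")
+@end smallexample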
+
+When @code{define_cond_exec} is used, an implicit reference to
+the @code{predicable} instruction attribute is made.
+@xref{Insn Attributes}.  This attribute must be boolean (i.e.@: have
+exactly two elements in its @var{list-of-values}).  Further, it must
+not be used with complex expressions.  That is, the default and all
+uses in the insns must be a simple constant, not dependent on the
+alternative or anything else.
+
+For each @code{define_insn} for which the @code{predicable} 
+attribute is true, a new @code{define_insn} pattern will be
+generated that matches a predicated version of the instruction.
+For example,
+
+@smallexample
+(define_insn "addsi"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+        (plus:SI (match_operand:SI 1 "register_operand" "r")
+                 (match_operand:SI 2 "register_operand" "r")))]
+  "@var{test1}"
+  "add %2,%1,%0")
+
+(define_cond_exec
+  [(ne (match_operand:CC 0 "register_operand" "c")
+       (const_int 0))]
+  "@var{test2}"
+  "(%0)")
+@end smallexample
+
+@noindent
+generates a new pattern
+
+@smallexample
+(define_insn ""
+  [(cond_exec
+     (ne (match_operand:CC 3 "register_operand" "c") (const_int 0))
+     (set (match_operand:SI 0 "register_operand" "=r")
+          (plus:SI (match_operand:SI 1 "register_operand" "r")
+                   (match_operand:SI 2 "register_operand" "r"))))]
+  "(@var{test2}) && (@var{test1})"
+  "(%3) add %2,%1,%0")
+@end smallexample
+
+@node Constant Definitions
+@section Constant Definitions
+@cindex constant definitions
+@findex define_constants
+
+Using literal constants inside instruction patterns reduces legibility and
+can be a maintenance problem.
+
+To overcome this problem, you may use the @code{define_constants}
+expression.  It contains a vector of name-value pairs.  From that
+point on, wherever any of the names appears in the MD file, it is as
+if the corresponding value had been written instead.  You may use
+@code{define_constants} multiple times; each appearance adds more
+constants to the table.  It is an error to redefine a constant with
+a different value.
+
+To come back to the a29k load multiple example, instead of
+
+@smallexample
+(define_insn ""
+  [(match_parallel 0 "load_multiple_operation"
+     [(set (match_operand:SI 1 "gpc_reg_operand" "=r")
+           (match_operand:SI 2 "memory_operand" "m"))
+      (use (reg:SI 179))
+      (clobber (reg:SI 179))])]
+  ""
+  "loadm 0,0,%1,%2")
+@end smallexample
+
+You could write:
+
+@smallexample
+(define_constants [
+    (R_BP 177)
+    (R_FC 178)
+    (R_CR 179)
+    (R_Q  180)
+])
+
+(define_insn ""
+  [(match_parallel 0 "load_multiple_operation"
+     [(set (match_operand:SI 1 "gpc_reg_operand" "=r")
+           (match_operand:SI 2 "memory_operand" "m"))
+      (use (reg:SI R_CR))
+      (clobber (reg:SI R_CR))])]
+  ""
+  "loadm 0,0,%1,%2")
+@end smallexample
+
+The constants that are defined with a @code{define_constants} are also
+output in the @file{insn-codes.h} header file as @code{#define}s.