+2007-05-17 Uros Bizjak <ubizjak@gmail.com>
+
+ PR tree-optimization/24659
+ * optabs.h (enum optab_index): Add OTI_vec_unpacks_float_hi,
+ OTI_vec_unpacks_float_lo, OTI_vec_unpacku_float_hi,
+ OTI_vec_unpacku_float_lo, OTI_vec_pack_sfix_trunc and
+ OTI_vec_pack_ufix_trunc.
+ (vec_unpacks_float_hi_optab): Define new macro.
+ (vec_unpacks_float_lo_optab): Ditto.
+ (vec_unpacku_float_hi_optab): Ditto.
+ (vec_unpacku_float_lo_optab): Ditto.
+ (vec_pack_sfix_trunc_optab): Ditto.
+ (vec_pack_ufix_trunc_optab): Ditto.
+ * genopinit.c (optabs): Implement vec_unpack[s|u]_float_[hi|lo]_optab
+ and vec_pack_[s|u]fix_trunc_optab using
+ vec_unpack[s|u]_float_[hi|lo]_* and vec_pack_[u|s]fix_trunc_* patterns.
+ * tree-vectorizer.c (supportable_widening_operation): Handle
+ FLOAT_EXPR and CONVERT_EXPR. Update comment.
+ (supportable_narrowing_operation): New function.
+ * tree-vectorizer.h (supportable_narrowing_operation): Prototype.
+ * tree-vect-transform.c (vectorizable_conversion): Handle
+ (nunits_in == nunits_out / 2) and (nunits_out == nunits_in / 2) cases.
+ (vect_gen_widened_results_half): Move before vectorizable_conversion.
+ (vectorizable_type_demotion): Call supportable_narrowing_operation()
+ to check for target support.
+ * optabs.c (optab_for_tree_code): Return vec_unpack[s|u]_float_hi_optab
+ for VEC_UNPACK_FLOAT_HI_EXPR, vec_unpack[s|u]_float_lo_optab
+ for VEC_UNPACK_FLOAT_LO_EXPR and vec_pack_[u|s]fix_trunc_optab
+ for VEC_PACK_FIX_TRUNC_EXPR.
+ (expand_binop): Special case mode of the result for
+ vec_pack_[u|s]fix_trunc_optab.
+ (init_optabs): Initialize vec_unpack[s|u]_float_[hi|lo]_optab and
+ vec_pack_[u|s]fix_trunc_optab.
+
+ * tree.def (VEC_UNPACK_FLOAT_HI_EXPR, VEC_UNPACK_FLOAT_LO_EXPR,
+ VEC_PACK_FIX_TRUNC_EXPR): New tree codes.
+ * tree-pretty-print.c (dump_generic_node): Handle
+ VEC_UNPACK_FLOAT_HI_EXPR, VEC_UNPACK_FLOAT_LO_EXPR and
+ VEC_PACK_FIX_TRUNC_EXPR.
+ (op_prio): Ditto.
+ * expr.c (expand_expr_real_1): Ditto.
+ * tree-inline.c (estimate_num_insns_1): Ditto.
+ * tree-vect-generic.c (expand_vector_operations_1): Ditto.
+
+ * config/i386/sse.md (vec_unpacks_float_hi_v8hi): New expander.
+ (vec_unpacks_float_lo_v8hi): Ditto.
+ (vec_unpacku_float_hi_v8hi): Ditto.
+ (vec_unpacku_float_lo_v8hi): Ditto.
+ (vec_unpacks_float_hi_v4si): Ditto.
+ (vec_unpacks_float_lo_v4si): Ditto.
+ (vec_pack_sfix_trunc_v2df): Ditto.
+
+ * doc/c-tree.texi (Expression trees) [VEC_UNPACK_FLOAT_HI_EXPR]:
+ Document.
+ [VEC_UNPACK_FLOAT_LO_EXPR]: Ditto.
+ [VEC_PACK_FIX_TRUNC_EXPR]: Ditto.
+ * doc/md.texi (Standard Names) [vec_pack_sfix_trunc]: Document.
+ [vec_pack_ufix_trunc]: Ditto.
+ [vec_unpacks_float_hi]: Ditto.
+ [vec_unpacks_float_lo]: Ditto.
+ [vec_unpacku_float_hi]: Ditto.
+ [vec_unpacku_float_lo]: Ditto.
+
2007-05-16 Uros Bizjak <ubizjak@gmail.com>
* soft-fp/README: Update for new files.
2007-05-16 Paolo Bonzini <bonzini@gnu.org>
- * config/i386/i386.c (legitimize_tls_address): Mark __tls_get_addr
- calls as pure.
+ * config/i386/i386.c (legitimize_tls_address): Mark __tls_get_addr
+ calls as pure.
2007-05-16 Eric Christopher <echristo@apple.com>
* config/rs6000/rs6000.c (rs6000_emit_prologue): Move altivec register
- saving after stack push. Set sp_offset whenever we push.
- (rs6000_emit_epilogue): Move altivec register restore before stack push.
+ saving after stack push. Set sp_offset whenever we push.
+ (rs6000_emit_epilogue): Move altivec register restore before
+ stack push.
2007-05-16 Richard Sandiford <richard@codesourcery.com>
dumps.
2007-05-08 Sandra Loosemore <sandra@codesourcery.com>
- Nigel Stephens <nigel@mips.com>
+ Nigel Stephens <nigel@mips.com>
* config/mips/mips.h (MAX_FPRS_PER_FMT): Renamed from FP_INC.
Update comments and all uses.
* configure: Regenerate.
* config.in: Regenerate.
-2007-05-07 Naveen.H.S <naveen.hs@kpitcummins.com>
+2007-05-07 Naveen.H.S <naveen.hs@kpitcummins.com>
* config/m32c/muldiv.md (mulhisi3_c): Limit the mode of the 2nd
operand to HI mode.
PR middle-end/22156
Temporarily revert:
2007-04-06 Andreas Tobler <a.tobler@schweiz.org>
- * tree-sra.c (sra_build_elt_assignment): Initialize min/maxshift.
+ * tree-sra.c (sra_build_elt_assignment): Initialize min/maxshift.
2007-04-05 Alexandre Oliva <aoliva@redhat.com>
* tree-sra.c (try_instantiate_multiple_fields): Needlessly
initialize align to silence bogus warning.
PR tree-optimization/30965
PR tree-optimization/30978
* Makefile.in (tree-ssa-forwprop.o): Depend on $(FLAGS_H).
- * tree-ssa-forwprop.c (forward_propagate_into_cond_1): Remove.
- (find_equivalent_equality_comparison): Likewise.
- (simplify_cond): Likewise.
- (get_prop_source_stmt): New helper.
- (get_prop_dest_stmt): Likewise.
+ * tree-ssa-forwprop.c (forward_propagate_into_cond_1): Remove.
+ (find_equivalent_equality_comparison): Likewise.
+ (simplify_cond): Likewise.
+ (get_prop_source_stmt): New helper.
+ (get_prop_dest_stmt): Likewise.
(can_propagate_from): Likewise.
(remove_prop_source_from_use): Likewise.
- (combine_cond_expr_cond): Likewise.
- (forward_propagate_comparison): New function.
- (forward_propagate_into_cond): Rewrite to use fold for
- tree combining.
+ (combine_cond_expr_cond): Likewise.
+ (forward_propagate_comparison): New function.
+ (forward_propagate_into_cond): Rewrite to use fold for
+ tree combining.
(tree_ssa_forward_propagate_single_use_vars): Call
forward_propagate_comparison to propagate comparisons.
(parallel [(const_int 0) (const_int 1)]))))]
"TARGET_SSE2")
+(define_expand "vec_unpacks_float_hi_v8hi"
+ [(match_operand:V4SF 0 "register_operand" "")
+ (match_operand:V8HI 1 "register_operand" "")]
+ "TARGET_SSE2"
+{
+ rtx tmp = gen_reg_rtx (V4SImode);
+
+ emit_insn (gen_vec_unpacks_hi_v8hi (tmp, operands[1]));
+ emit_insn (gen_sse2_cvtdq2ps (operands[0], tmp));
+ DONE;
+})
+
+(define_expand "vec_unpacks_float_lo_v8hi"
+ [(match_operand:V4SF 0 "register_operand" "")
+ (match_operand:V8HI 1 "register_operand" "")]
+ "TARGET_SSE2"
+{
+ rtx tmp = gen_reg_rtx (V4SImode);
+
+ emit_insn (gen_vec_unpacks_lo_v8hi (tmp, operands[1]));
+ emit_insn (gen_sse2_cvtdq2ps (operands[0], tmp));
+ DONE;
+})
+
+(define_expand "vec_unpacku_float_hi_v8hi"
+ [(match_operand:V4SF 0 "register_operand" "")
+ (match_operand:V8HI 1 "register_operand" "")]
+ "TARGET_SSE2"
+{
+ rtx tmp = gen_reg_rtx (V4SImode);
+
+ emit_insn (gen_vec_unpacku_hi_v8hi (tmp, operands[1]));
+ emit_insn (gen_sse2_cvtdq2ps (operands[0], tmp));
+ DONE;
+})
+
+(define_expand "vec_unpacku_float_lo_v8hi"
+ [(match_operand:V4SF 0 "register_operand" "")
+ (match_operand:V8HI 1 "register_operand" "")]
+ "TARGET_SSE2"
+{
+ rtx tmp = gen_reg_rtx (V4SImode);
+
+ emit_insn (gen_vec_unpacku_lo_v8hi (tmp, operands[1]));
+ emit_insn (gen_sse2_cvtdq2ps (operands[0], tmp));
+ DONE;
+})
+
+(define_expand "vec_unpacks_float_hi_v4si"
+ [(set (match_dup 2)
+ (vec_select:V4SI
+ (match_operand:V4SI 1 "nonimmediate_operand" "")
+ (parallel [(const_int 2)
+ (const_int 3)
+ (const_int 2)
+ (const_int 3)])))
+ (set (match_operand:V2DF 0 "register_operand" "")
+ (float:V2DF
+ (vec_select:V2SI
+ (match_dup 2)
+ (parallel [(const_int 0) (const_int 1)]))))]
+ "TARGET_SSE2"
+{
+ operands[2] = gen_reg_rtx (V4SImode);
+})
+
+(define_expand "vec_unpacks_float_lo_v4si"
+ [(set (match_operand:V2DF 0 "register_operand" "")
+ (float:V2DF
+ (vec_select:V2SI
+ (match_operand:V4SI 1 "nonimmediate_operand" "")
+ (parallel [(const_int 0) (const_int 1)]))))]
+ "TARGET_SSE2")
+
(define_expand "vec_pack_trunc_v2df"
[(match_operand:V4SF 0 "register_operand" "")
(match_operand:V2DF 1 "nonimmediate_operand" "")
DONE;
})
+(define_expand "vec_pack_sfix_trunc_v2df"
+ [(match_operand:V4SI 0 "register_operand" "")
+ (match_operand:V2DF 1 "nonimmediate_operand" "")
+ (match_operand:V2DF 2 "nonimmediate_operand" "")]
+ "TARGET_SSE2"
+{
+ rtx r1, r2;
+
+ r1 = gen_reg_rtx (V4SImode);
+ r2 = gen_reg_rtx (V4SImode);
+
+ emit_insn (gen_sse2_cvttpd2dq (r1, operands[1]));
+ emit_insn (gen_sse2_cvttpd2dq (r2, operands[2]));
+ emit_insn (gen_sse2_punpcklqdq (gen_lowpart (V2DImode, operands[0]),
+ gen_lowpart (V2DImode, r1),
+ gen_lowpart (V2DImode, r2)));
+ DONE;
+})
+
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;
;; Parallel double-precision floating point element swizzling
"TARGET_SSE2"
{
rtx op1, op2, h1, l1, h2, l2, h3, l3;
-
+
op1 = gen_lowpart (V16QImode, operands[1]);
op2 = gen_lowpart (V16QImode, operands[2]);
h1 = gen_reg_rtx (V16QImode);
l2 = gen_reg_rtx (V16QImode);
h3 = gen_reg_rtx (V16QImode);
l3 = gen_reg_rtx (V16QImode);
-
+
emit_insn (gen_vec_interleave_highv16qi (h1, op1, op2));
emit_insn (gen_vec_interleave_lowv16qi (l1, op1, op2));
emit_insn (gen_vec_interleave_highv16qi (h2, l1, h1));
emit_insn (gen_vec_interleave_lowv16qi (operands[0], l3, h3));
DONE;
})
-
+
;; Reduce:
;; op1 = abcdefgh
;; op2 = ijklmnop
"TARGET_SSE2"
{
rtx op1, op2, h1, l1, h2, l2;
-
+
op1 = gen_lowpart (V8HImode, operands[1]);
op2 = gen_lowpart (V8HImode, operands[2]);
h1 = gen_reg_rtx (V8HImode);
l1 = gen_reg_rtx (V8HImode);
h2 = gen_reg_rtx (V8HImode);
l2 = gen_reg_rtx (V8HImode);
-
+
emit_insn (gen_vec_interleave_highv8hi (h1, op1, op2));
emit_insn (gen_vec_interleave_lowv8hi (l1, op1, op2));
emit_insn (gen_vec_interleave_highv8hi (h2, l1, h1));
emit_insn (gen_vec_interleave_lowv8hi (operands[0], l2, h2));
DONE;
})
-
+
;; Reduce:
;; op1 = abcd
;; op2 = efgh
"TARGET_SSE2"
{
rtx op1, op2, h1, l1;
-
+
op1 = gen_lowpart (V4SImode, operands[1]);
op2 = gen_lowpart (V4SImode, operands[2]);
h1 = gen_reg_rtx (V4SImode);
l1 = gen_reg_rtx (V4SImode);
-
+
emit_insn (gen_vec_interleave_highv4si (h1, op1, op2));
emit_insn (gen_vec_interleave_lowv4si (l1, op1, op2));
emit_insn (gen_vec_interleave_lowv4si (operands[0], l1, h1));
@tindex VEC_WIDEN_MULT_LO_EXPR
@tindex VEC_UNPACK_HI_EXPR
@tindex VEC_UNPACK_LO_EXPR
+@tindex VEC_UNPACK_FLOAT_HI_EXPR
+@tindex VEC_UNPACK_FLOAT_LO_EXPR
@tindex VEC_PACK_TRUNC_EXPR
@tindex VEC_PACK_SAT_EXPR
+@tindex VEC_PACK_FIX_TRUNC_EXPR
@tindex VEC_EXTRACT_EVEN_EXPR
@tindex VEC_EXTRACT_ODD_EXPR
@tindex VEC_INTERLEAVE_HIGH_EXPR
In the case of @code{VEC_UNPACK_LO_EXPR} the low @code{N/2} elements of the
vector are extracted and widened (promoted).
+@item VEC_UNPACK_FLOAT_HI_EXPR
+@item VEC_UNPACK_FLOAT_LO_EXPR
+These nodes represent unpacking of the high and low parts of the input vector,
+where the values are converted from fixed point to floating point. The
+single operand is a vector that contains @code{N} elements of the same
+integral type. The result is a vector that contains half as many elements
+of a floating point type whose size is twice as wide. In the case of
+@code{VEC_UNPACK_FLOAT_HI_EXPR} the high @code{N/2} elements of the vector
+are extracted, converted and widened.  In the case of
+@code{VEC_UNPACK_FLOAT_LO_EXPR} the low @code{N/2} elements of the vector
+are extracted, converted and widened.
+
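As a scalar sketch (not part of the patch, names are illustrative), the semantics of these two tree codes for an 8-element signed integer input can be modeled as:

```c
#include <stdint.h>

/* Scalar model of VEC_UNPACK_FLOAT_HI_EXPR / VEC_UNPACK_FLOAT_LO_EXPR
   for N = 8 signed integer elements in, N/2 = 4 wider floating point
   elements out.  Which lanes form the "high" half is target-defined;
   this sketch assumes the upper indices.  */
static void
unpack_float_hi (const int16_t in[8], float out[4])
{
  for (int i = 0; i < 4; i++)
    out[i] = (float) in[i + 4];	/* high N/2 elements, converted */
}

static void
unpack_float_lo (const int16_t in[8], float out[4])
{
  for (int i = 0; i < 4; i++)
    out[i] = (float) in[i];	/* low N/2 elements, converted */
}
```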
@item VEC_PACK_TRUNC_EXPR
This node represents packing of truncated elements of the two input vectors
into the output vector. Input operands are vectors that contain the same
is half as wide. The elements of the two vectors are demoted and merged
(concatenated) to form the output vector.
+@item VEC_PACK_FIX_TRUNC_EXPR
+This node represents packing of elements of the two input vectors into the
+output vector, where the values are converted from floating point
+to fixed point. Input operands are vectors that contain the same number
+of elements of a floating point type. The result is a vector that contains
+twice as many elements of an integral type whose size is half as wide. The
+elements of the two vectors are merged (concatenated) to form the output
+vector.
+
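A scalar sketch of this tree code (not part of the patch; the lane order is illustrative) for two 2-element double vectors packed into one 4-element integer vector:

```c
#include <stdint.h>

/* Scalar model of VEC_PACK_FIX_TRUNC_EXPR: each floating point value is
   converted to an integer with truncation toward zero, and the two
   converted halves are concatenated into the output vector.  */
static void
pack_fix_trunc (const double a[2], const double b[2], int32_t out[4])
{
  for (int i = 0; i < 2; i++)
    {
      out[i] = (int32_t) a[i];		/* truncating conversion */
      out[i + 2] = (int32_t) b[i];
    }
}
```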
@item VEC_EXTRACT_EVEN_EXPR
@item VEC_EXTRACT_ODD_EXPR
These nodes represent extracting of the even/odd elements of the two input
vectors are concatenated after narrowing them down using signed/unsigned
saturating arithmetic.
+@cindex @code{vec_pack_sfix_trunc_@var{m}} instruction pattern
+@cindex @code{vec_pack_ufix_trunc_@var{m}} instruction pattern
+@item @samp{vec_pack_sfix_trunc_@var{m}}, @samp{vec_pack_ufix_trunc_@var{m}}
+Narrow, convert to signed/unsigned integral type and merge the elements
+of two vectors. Operands 1 and 2 are vectors of the same mode having N
+floating point elements of size S. Operand 0 is the resulting vector
+in which 2*N elements of size S/2 are concatenated.
+
@cindex @code{vec_unpacks_hi_@var{m}} instruction pattern
@cindex @code{vec_unpacks_lo_@var{m}} instruction pattern
@item @samp{vec_unpacks_hi_@var{m}}, @samp{vec_unpacks_lo_@var{m}}
Widen (promote) the high/low elements of the vector using zero extension and
place the resulting N/2 values of size 2*S in the output vector (operand 0).
+@cindex @code{vec_unpacks_float_hi_@var{m}} instruction pattern
+@cindex @code{vec_unpacks_float_lo_@var{m}} instruction pattern
+@cindex @code{vec_unpacku_float_hi_@var{m}} instruction pattern
+@cindex @code{vec_unpacku_float_lo_@var{m}} instruction pattern
+@item @samp{vec_unpacks_float_hi_@var{m}}, @samp{vec_unpacks_float_lo_@var{m}}
+@itemx @samp{vec_unpacku_float_hi_@var{m}}, @samp{vec_unpacku_float_lo_@var{m}}
+Extract, convert to floating point type and widen the high/low part of a
+vector of signed/unsigned integral elements. The input vector (operand 1)
+has N elements of size S. Convert the high/low elements of the vector using
+floating point conversion and place the resulting N/2 values of size 2*S in
+the output vector (operand 0).
+
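The signed/unsigned split between these patterns matters because the same bit pattern converts to different floating point values. A minimal per-element sketch (not part of the patch):

```c
#include <stdint.h>

/* Per-element operation behind vec_unpacks_float_* (signed) versus
   vec_unpacku_float_* (unsigned): the bit pattern 0xFFFD is -3 as a
   signed 16-bit value but 65533 as an unsigned one, so the two
   conversions cannot share a pattern.  */
static float
elem_s_to_float (int16_t x)	/* signed variant */
{
  return (float) x;
}

static float
elem_u_to_float (uint16_t x)	/* unsigned variant */
{
  return (float) x;
}
```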
@cindex @code{vec_widen_umult_hi_@var{m}} instruction pattern
@cindex @code{vec_widen_umult_lo_@var{m}} instruction pattern
@cindex @code{vec_widen_smult_hi_@var{m}} instruction pattern
@cindex @code{vec_widen_smult_lo_@var{m}} instruction pattern
-@item @samp{vec_widen_umult_hi_@var{m}}, @samp{vec_widen_umult_lo_@var{m}}, @samp{vec_widen_smult_hi_@var{m}}, @samp{vec_widen_smult_lo_@var{m}}
+@item @samp{vec_widen_umult_hi_@var{m}}, @samp{vec_widen_umult_lo_@var{m}}
+@itemx @samp{vec_widen_smult_hi_@var{m}}, @samp{vec_widen_smult_lo_@var{m}}
Signed/Unsigned widening multiplication. The two inputs (operands 1 and 2)
are vectors with N signed/unsigned elements of size S. Multiply the high/low
elements of the two vectors, and put the N/2 products of size 2*S in the
return temp;
}
+ case VEC_UNPACK_FLOAT_HI_EXPR:
+ case VEC_UNPACK_FLOAT_LO_EXPR:
+ {
+ op0 = expand_normal (TREE_OPERAND (exp, 0));
+ /* The signedness is determined from the input operand. */
+ this_optab = optab_for_tree_code (code,
+ TREE_TYPE (TREE_OPERAND (exp, 0)));
+ temp = expand_widen_pattern_expr
+ (exp, op0, NULL_RTX, NULL_RTX,
+ target, TYPE_UNSIGNED (TREE_TYPE (TREE_OPERAND (exp, 0))));
+
+ gcc_assert (temp);
+ return temp;
+ }
+
case VEC_WIDEN_MULT_HI_EXPR:
case VEC_WIDEN_MULT_LO_EXPR:
{
case VEC_PACK_TRUNC_EXPR:
case VEC_PACK_SAT_EXPR:
+ case VEC_PACK_FIX_TRUNC_EXPR:
{
mode = TYPE_MODE (TREE_TYPE (TREE_OPERAND (exp, 0)));
goto binop;
"vec_unpacks_lo_optab->handlers[$A].insn_code = CODE_FOR_$(vec_unpacks_lo_$a$)",
"vec_unpacku_hi_optab->handlers[$A].insn_code = CODE_FOR_$(vec_unpacku_hi_$a$)",
"vec_unpacku_lo_optab->handlers[$A].insn_code = CODE_FOR_$(vec_unpacku_lo_$a$)",
+ "vec_unpacks_float_hi_optab->handlers[$A].insn_code = CODE_FOR_$(vec_unpacks_float_hi_$a$)",
+ "vec_unpacks_float_lo_optab->handlers[$A].insn_code = CODE_FOR_$(vec_unpacks_float_lo_$a$)",
+ "vec_unpacku_float_hi_optab->handlers[$A].insn_code = CODE_FOR_$(vec_unpacku_float_hi_$a$)",
+ "vec_unpacku_float_lo_optab->handlers[$A].insn_code = CODE_FOR_$(vec_unpacku_float_lo_$a$)",
"vec_pack_trunc_optab->handlers[$A].insn_code = CODE_FOR_$(vec_pack_trunc_$a$)",
"vec_pack_ssat_optab->handlers[$A].insn_code = CODE_FOR_$(vec_pack_ssat_$a$)",
- "vec_pack_usat_optab->handlers[$A].insn_code = CODE_FOR_$(vec_pack_usat_$a$)"
+ "vec_pack_usat_optab->handlers[$A].insn_code = CODE_FOR_$(vec_pack_usat_$a$)",
+ "vec_pack_sfix_trunc_optab->handlers[$A].insn_code = CODE_FOR_$(vec_pack_sfix_trunc_$a$)",
+ "vec_pack_ufix_trunc_optab->handlers[$A].insn_code = CODE_FOR_$(vec_pack_ufix_trunc_$a$)"
};
static void gen_insn (rtx);
return TYPE_UNSIGNED (type) ?
vec_unpacku_lo_optab : vec_unpacks_lo_optab;
+ case VEC_UNPACK_FLOAT_HI_EXPR:
+ /* The signedness is determined from the input operand. */
+ return TYPE_UNSIGNED (type) ?
+ vec_unpacku_float_hi_optab : vec_unpacks_float_hi_optab;
+
+ case VEC_UNPACK_FLOAT_LO_EXPR:
+ /* The signedness is determined from the input operand. */
+ return TYPE_UNSIGNED (type) ?
+ vec_unpacku_float_lo_optab : vec_unpacks_float_lo_optab;
+
case VEC_PACK_TRUNC_EXPR:
return vec_pack_trunc_optab;
case VEC_PACK_SAT_EXPR:
return TYPE_UNSIGNED (type) ? vec_pack_usat_optab : vec_pack_ssat_optab;
+ case VEC_PACK_FIX_TRUNC_EXPR:
+ return TYPE_UNSIGNED (type) ?
+ vec_pack_ufix_trunc_optab : vec_pack_sfix_trunc_optab;
+
default:
break;
}
if (binoptab == vec_pack_trunc_optab
|| binoptab == vec_pack_usat_optab
- || binoptab == vec_pack_ssat_optab)
+ || binoptab == vec_pack_ssat_optab
+ || binoptab == vec_pack_ufix_trunc_optab
+ || binoptab == vec_pack_sfix_trunc_optab)
{
/* The mode of the result is different from the mode of the
arguments. */
vec_unpacks_lo_optab = init_optab (UNKNOWN);
vec_unpacku_hi_optab = init_optab (UNKNOWN);
vec_unpacku_lo_optab = init_optab (UNKNOWN);
+ vec_unpacks_float_hi_optab = init_optab (UNKNOWN);
+ vec_unpacks_float_lo_optab = init_optab (UNKNOWN);
+ vec_unpacku_float_hi_optab = init_optab (UNKNOWN);
+ vec_unpacku_float_lo_optab = init_optab (UNKNOWN);
vec_pack_trunc_optab = init_optab (UNKNOWN);
vec_pack_usat_optab = init_optab (UNKNOWN);
vec_pack_ssat_optab = init_optab (UNKNOWN);
+ vec_pack_ufix_trunc_optab = init_optab (UNKNOWN);
+ vec_pack_sfix_trunc_optab = init_optab (UNKNOWN);
powi_optab = init_optab (UNKNOWN);
elements. */
OTI_vec_unpacku_hi,
OTI_vec_unpacku_lo,
+
+ /* Extract, convert to floating point and widen the high/low part of
+ a vector of signed or unsigned integer elements. */
+ OTI_vec_unpacks_float_hi,
+ OTI_vec_unpacks_float_lo,
+ OTI_vec_unpacku_float_hi,
+ OTI_vec_unpacku_float_lo,
+
/* Narrow (demote) and merge the elements of two vectors. */
OTI_vec_pack_trunc,
OTI_vec_pack_usat,
OTI_vec_pack_ssat,
+ /* Convert to signed/unsigned integer, narrow and merge elements
+ of two vectors of floating point elements. */
+ OTI_vec_pack_sfix_trunc,
+ OTI_vec_pack_ufix_trunc,
+
/* Perform a raise to the power of integer. */
OTI_powi,
#define vec_unpacks_lo_optab (optab_table[OTI_vec_unpacks_lo])
#define vec_unpacku_hi_optab (optab_table[OTI_vec_unpacku_hi])
#define vec_unpacku_lo_optab (optab_table[OTI_vec_unpacku_lo])
+#define vec_unpacks_float_hi_optab (optab_table[OTI_vec_unpacks_float_hi])
+#define vec_unpacks_float_lo_optab (optab_table[OTI_vec_unpacks_float_lo])
+#define vec_unpacku_float_hi_optab (optab_table[OTI_vec_unpacku_float_hi])
+#define vec_unpacku_float_lo_optab (optab_table[OTI_vec_unpacku_float_lo])
#define vec_pack_trunc_optab (optab_table[OTI_vec_pack_trunc])
#define vec_pack_ssat_optab (optab_table[OTI_vec_pack_ssat])
#define vec_pack_usat_optab (optab_table[OTI_vec_pack_usat])
+#define vec_pack_sfix_trunc_optab (optab_table[OTI_vec_pack_sfix_trunc])
+#define vec_pack_ufix_trunc_optab (optab_table[OTI_vec_pack_ufix_trunc])
#define powi_optab (optab_table[OTI_powi])
+2007-05-17 Uros Bizjak <ubizjak@gmail.com>
+
+ PR tree-optimization/24659
+ * gcc.dg/vect/vect-floatint-conversion-2.c: New test.
+ * gcc.dg/vect/vect-intfloat-conversion-1.c: Require vect_float,
+ not vect_int target.
+ * gcc.dg/vect/vect-intfloat-conversion-2.c: Require vect_float,
+ not vect_int target. Loop is vectorized for vect_intfloat_cvt
+ targets.
+ * gcc.dg/vect/vect-intfloat-conversion-3.c: New test.
+ * gcc.dg/vect/vect-intfloat-conversion-4a.c: New test.
+ * gcc.dg/vect/vect-intfloat-conversion-4b.c: New test.
+
2007-05-16 Uros Bizjak <ubizjak@gmail.com>
* gcc.dg/torture/fp-int-convert-float128.c: Do not xfail for i?86-*-*
* g++.dg/expr/bitfield8.C: New test.
2007-04-17 Joseph Myers <joseph@codesourcery.com>
- Richard Sandiford <richard@codesourcery.com>
+ Richard Sandiford <richard@codesourcery.com>
* lib/target-supports.exp (check_profiling_available): Return 0
for uClibc with -p or -pg.
--- /dev/null
+/* { dg-require-effective-target vect_double } */
+
+#include <stdarg.h>
+#include "tree-vect.h"
+
+#define N 32
+
+int
+main1 ()
+{
+ int i;
+ double db[N] = {0.4,3.5,6.6,9.4,12.5,15.6,18.4,21.5,24.6,27.4,30.5,33.6,36.4,39.5,42.6,45.4,0.5,3.6,6.4,9.5,12.6,15.4,18.5,21.6,24.4,27.5,30.6,33.4,36.5,39.6,42.4,45.5};
+ int ia[N];
+
+ /* double -> int */
+ for (i = 0; i < N; i++)
+ {
+ ia[i] = (int) db[i];
+ }
+
+ /* check results: */
+ for (i = 0; i < N; i++)
+ {
+ if (ia[i] != (int) db[i])
+ abort ();
+ }
+
+ return 0;
+}
+
+int
+main (void)
+{
+ check_vect ();
+
+ return main1 ();
+}
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target vect_floatint_cvt } } } */
+/* { dg-final { cleanup-tree-dump "vect" } } */
-/* { dg-require-effective-target vect_int } */
+/* { dg-require-effective-target vect_float } */
#include <stdarg.h>
#include "tree-vect.h"
-/* { dg-require-effective-target vect_int } */
+/* { dg-require-effective-target vect_float } */
#include <stdarg.h>
#include "tree-vect.h"
return main1 ();
}
-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target powerpc*-*-* i?86-*-* x86_64-*-* } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target vect_intfloat_cvt } } } */
/* { dg-final { cleanup-tree-dump "vect" } } */
--- /dev/null
+/* { dg-require-effective-target vect_double } */
+
+#include <stdarg.h>
+#include "tree-vect.h"
+
+#define N 32
+
+int main1 ()
+{
+ int i;
+ int ib[N] = {0,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45,0,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45};
+ double da[N];
+
+ /* int -> double */
+ for (i = 0; i < N; i++)
+ {
+ da[i] = (double) ib[i];
+ }
+
+ /* check results: */
+ for (i = 0; i < N; i++)
+ {
+ if (da[i] != (double) ib[i])
+ abort ();
+ }
+
+ return 0;
+}
+
+int main (void)
+{
+ check_vect ();
+
+ return main1 ();
+}
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target vect_intfloat_cvt } } } */
+/* { dg-final { cleanup-tree-dump "vect" } } */
--- /dev/null
+/* { dg-require-effective-target vect_float } */
+
+#include <stdarg.h>
+#include "tree-vect.h"
+
+#define N 32
+
+int main1 ()
+{
+ int i;
+ short sb[N] = {0,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45,0,-3,-6,-9,-12,-15,-18,-21,-24,-27,-30,-33,-36,-39,-42,-45};
+ float fa[N];
+
+ /* short -> float */
+ for (i = 0; i < N; i++)
+ {
+ fa[i] = (float) sb[i];
+ }
+
+ /* check results: */
+ for (i = 0; i < N; i++)
+ {
+ if (fa[i] != (float) sb[i])
+ abort ();
+ }
+
+ return 0;
+}
+
+int main (void)
+{
+ check_vect ();
+
+ return main1 ();
+}
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target vect_intfloat_cvt } } } */
+/* { dg-final { cleanup-tree-dump "vect" } } */
--- /dev/null
+/* { dg-require-effective-target vect_float } */
+
+#include <stdarg.h>
+#include "tree-vect.h"
+
+#define N 32
+
+int main1 ()
+{
+ int i;
+ unsigned short usb[N] = {0,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45,0,65533,65530,65527,65524,65521,65518,65515,65512,65509,65506,65503,65500,65497,65494,65491};
+ float fa[N];
+
+ /* unsigned short -> float */
+ for (i = 0; i < N; i++)
+ {
+ fa[i] = (float) usb[i];
+ }
+
+ /* check results: */
+ for (i = 0; i < N; i++)
+ {
+ if (fa[i] != (float) usb[i])
+ abort ();
+ }
+
+ return 0;
+}
+
+int main (void)
+{
+ check_vect ();
+
+ return main1 ();
+}
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target vect_intfloat_cvt } } } */
+/* { dg-final { cleanup-tree-dump "vect" } } */
case VEC_WIDEN_MULT_LO_EXPR:
case VEC_UNPACK_HI_EXPR:
case VEC_UNPACK_LO_EXPR:
+ case VEC_UNPACK_FLOAT_HI_EXPR:
+ case VEC_UNPACK_FLOAT_LO_EXPR:
case VEC_PACK_TRUNC_EXPR:
case VEC_PACK_SAT_EXPR:
+ case VEC_PACK_FIX_TRUNC_EXPR:
case WIDEN_MULT_EXPR:
pp_string (buffer, " > ");
break;
+ case VEC_UNPACK_FLOAT_HI_EXPR:
+ pp_string (buffer, " VEC_UNPACK_FLOAT_HI_EXPR < ");
+ dump_generic_node (buffer, TREE_OPERAND (node, 0), spc, flags, false);
+ pp_string (buffer, " > ");
+ break;
+
+ case VEC_UNPACK_FLOAT_LO_EXPR:
+ pp_string (buffer, " VEC_UNPACK_FLOAT_LO_EXPR < ");
+ dump_generic_node (buffer, TREE_OPERAND (node, 0), spc, flags, false);
+ pp_string (buffer, " > ");
+ break;
+
case VEC_PACK_TRUNC_EXPR:
pp_string (buffer, " VEC_PACK_TRUNC_EXPR < ");
dump_generic_node (buffer, TREE_OPERAND (node, 0), spc, flags, false);
dump_generic_node (buffer, TREE_OPERAND (node, 1), spc, flags, false);
pp_string (buffer, " > ");
break;
-
+
case VEC_PACK_SAT_EXPR:
pp_string (buffer, " VEC_PACK_SAT_EXPR < ");
dump_generic_node (buffer, TREE_OPERAND (node, 0), spc, flags, false);
dump_generic_node (buffer, TREE_OPERAND (node, 1), spc, flags, false);
pp_string (buffer, " > ");
break;
-
+
+ case VEC_PACK_FIX_TRUNC_EXPR:
+ pp_string (buffer, " VEC_PACK_FIX_TRUNC_EXPR < ");
+ dump_generic_node (buffer, TREE_OPERAND (node, 0), spc, flags, false);
+ pp_string (buffer, ", ");
+ dump_generic_node (buffer, TREE_OPERAND (node, 1), spc, flags, false);
+ pp_string (buffer, " > ");
+ break;
+
case BLOCK:
{
tree t;
case VEC_RSHIFT_EXPR:
case VEC_UNPACK_HI_EXPR:
case VEC_UNPACK_LO_EXPR:
+ case VEC_UNPACK_FLOAT_HI_EXPR:
+ case VEC_UNPACK_FLOAT_LO_EXPR:
case VEC_PACK_TRUNC_EXPR:
case VEC_PACK_SAT_EXPR:
return 16;
|| code == VEC_WIDEN_MULT_LO_EXPR
|| code == VEC_UNPACK_HI_EXPR
|| code == VEC_UNPACK_LO_EXPR
+ || code == VEC_UNPACK_FLOAT_HI_EXPR
+ || code == VEC_UNPACK_FLOAT_LO_EXPR
|| code == VEC_PACK_TRUNC_EXPR
- || code == VEC_PACK_SAT_EXPR)
+ || code == VEC_PACK_SAT_EXPR
+ || code == VEC_PACK_FIX_TRUNC_EXPR)
type = TREE_TYPE (TREE_OPERAND (rhs, 0));
/* Optabs will try converting a negation into a subtraction, so
accessed in the loop by STMT, along with the def-use update chain to
appropriately advance the pointer through the loop iterations. Also set
aliasing information for the pointer. This vector pointer is used by the
- callers to this function to create a memory reference expression for vector
+ callers to this function to create a memory reference expression for vector
load/store access.
Input:
}
+/* Function vect_gen_widened_results_half
+
+ Create a vector stmt whose code, type, number of arguments, and result
+ variable are CODE, VECTYPE, OP_TYPE, and VEC_DEST, and its arguments are
+ VEC_OPRND0 and VEC_OPRND1. The new vector stmt is to be inserted at BSI.
+ In the case that CODE is a CALL_EXPR, this means that a call to DECL
+ needs to be created (DECL is a function-decl of a target-builtin).
+ STMT is the original scalar stmt that we are vectorizing. */
+
+static tree
+vect_gen_widened_results_half (enum tree_code code, tree vectype, tree decl,
+ tree vec_oprnd0, tree vec_oprnd1, int op_type,
+ tree vec_dest, block_stmt_iterator *bsi,
+ tree stmt)
+{
+ tree expr;
+ tree new_stmt;
+ tree new_temp;
+ tree sym;
+ ssa_op_iter iter;
+
+ /* Generate half of the widened result: */
+ if (code == CALL_EXPR)
+ {
+ /* Target specific support */
+ if (op_type == binary_op)
+ expr = build_call_expr (decl, 2, vec_oprnd0, vec_oprnd1);
+ else
+ expr = build_call_expr (decl, 1, vec_oprnd0);
+ }
+ else
+ {
+ /* Generic support */
+ gcc_assert (op_type == TREE_CODE_LENGTH (code));
+ if (op_type == binary_op)
+ expr = build2 (code, vectype, vec_oprnd0, vec_oprnd1);
+ else
+ expr = build1 (code, vectype, vec_oprnd0);
+ }
+ new_stmt = build_gimple_modify_stmt (vec_dest, expr);
+ new_temp = make_ssa_name (vec_dest, new_stmt);
+ GIMPLE_STMT_OPERAND (new_stmt, 0) = new_temp;
+ vect_finish_stmt_generation (stmt, new_stmt, bsi);
+
+ if (code == CALL_EXPR)
+ {
+ FOR_EACH_SSA_TREE_OPERAND (sym, new_stmt, iter, SSA_OP_ALL_VIRTUALS)
+ {
+ if (TREE_CODE (sym) == SSA_NAME)
+ sym = SSA_NAME_VAR (sym);
+ mark_sym_for_renaming (sym);
+ }
+ }
+
+ return new_stmt;
+}
+
+
/* Function vectorizable_conversion.
Check if STMT performs a conversion operation, that can be vectorized.
tree scalar_dest;
tree operation;
tree op0;
- tree vec_oprnd0 = NULL_TREE;
+ tree vec_oprnd0 = NULL_TREE, vec_oprnd1 = NULL_TREE;
stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
- enum tree_code code;
+ enum tree_code code, code1 = CODE_FOR_nothing, code2 = CODE_FOR_nothing;
+ tree decl1 = NULL_TREE, decl2 = NULL_TREE;
tree new_temp;
tree def, def_stmt;
enum vect_def_type dt0;
tree new_stmt;
+ stmt_vec_info prev_stmt_info;
int nunits_in;
int nunits_out;
- int ncopies, j;
tree vectype_out, vectype_in;
+ int ncopies, j;
+ tree expr;
tree rhs_type, lhs_type;
tree builtin_decl;
- stmt_vec_info prev_stmt_info;
+ enum { NARROW, NONE, WIDEN } modifier;
/* Is STMT a vectorizable conversion? */
scalar_dest = GIMPLE_STMT_OPERAND (stmt, 0);
lhs_type = TREE_TYPE (scalar_dest);
vectype_out = get_vectype_for_scalar_type (lhs_type);
- gcc_assert (STMT_VINFO_VECTYPE (stmt_info) == vectype_out);
nunits_out = TYPE_VECTOR_SUBPARTS (vectype_out);
- /* FORNOW: need to extend to support short<->float conversions as well. */
- if (nunits_out != nunits_in)
+ /* FORNOW */
+ if (nunits_in == nunits_out / 2)
+ modifier = NARROW;
+ else if (nunits_out == nunits_in)
+ modifier = NONE;
+ else if (nunits_out == nunits_in / 2)
+ modifier = WIDEN;
+ else
return false;
+ if (modifier == NONE)
+ gcc_assert (STMT_VINFO_VECTYPE (stmt_info) == vectype_out);
+
/* Bail out if the types are both integral or non-integral */
if ((INTEGRAL_TYPE_P (rhs_type) && INTEGRAL_TYPE_P (lhs_type))
|| (!INTEGRAL_TYPE_P (rhs_type) && !INTEGRAL_TYPE_P (lhs_type)))
return false;
+ if (modifier == NARROW)
+ ncopies = LOOP_VINFO_VECT_FACTOR (loop_vinfo) / nunits_out;
+ else
+ ncopies = LOOP_VINFO_VECT_FACTOR (loop_vinfo) / nunits_in;
+
/* Sanity check: make sure that at least one copy of the vectorized stmt
needs to be generated. */
- ncopies = LOOP_VINFO_VECT_FACTOR (loop_vinfo) / nunits_in;
gcc_assert (ncopies >= 1);
+ /* Check the operands of the operation. */
if (!vect_is_simple_use (op0, loop_vinfo, &def_stmt, &def, &dt0))
{
if (vect_print_dump_info (REPORT_DETAILS))
}
/* Supportable by target? */
- if (!targetm.vectorize.builtin_conversion (code, vectype_in))
+ if ((modifier == NONE
+ && !targetm.vectorize.builtin_conversion (code, vectype_in))
+ || (modifier == WIDEN
+ && !supportable_widening_operation (code, stmt, vectype_in,
+ &decl1, &decl2,
+ &code1, &code2))
+ || (modifier == NARROW
+ && !supportable_narrowing_operation (code, stmt, vectype_in,
+ &code1)))
{
if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "op not supported by target.");
return false;
}
+ if (modifier != NONE)
+ STMT_VINFO_VECTYPE (stmt_info) = vectype_in;
+
if (!vec_stmt) /* transformation not required. */
{
STMT_VINFO_TYPE (stmt_info) = type_conversion_vec_info_type;
return true;
}
- /** Transform. **/
-
+ /** Transform. **/
if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "transform conversion.");
vec_dest = vect_create_destination_var (scalar_dest, vectype_out);
prev_stmt_info = NULL;
- for (j = 0; j < ncopies; j++)
+ switch (modifier)
{
- tree sym;
- ssa_op_iter iter;
+ case NONE:
+ for (j = 0; j < ncopies; j++)
+ {
+ tree sym;
+ ssa_op_iter iter;
- if (j == 0)
- vec_oprnd0 = vect_get_vec_def_for_operand (op0, stmt, NULL);
- else
- vec_oprnd0 = vect_get_vec_def_for_stmt_copy (dt0, vec_oprnd0);
+ if (j == 0)
+ vec_oprnd0 = vect_get_vec_def_for_operand (op0, stmt, NULL);
+ else
+ vec_oprnd0 = vect_get_vec_def_for_stmt_copy (dt0, vec_oprnd0);
- builtin_decl =
- targetm.vectorize.builtin_conversion (code, vectype_in);
- new_stmt = build_call_expr (builtin_decl, 1, vec_oprnd0);
+ builtin_decl =
+ targetm.vectorize.builtin_conversion (code, vectype_in);
+ new_stmt = build_call_expr (builtin_decl, 1, vec_oprnd0);
- /* Arguments are ready. create the new vector stmt. */
- new_stmt = build_gimple_modify_stmt (vec_dest, new_stmt);
- new_temp = make_ssa_name (vec_dest, new_stmt);
- GIMPLE_STMT_OPERAND (new_stmt, 0) = new_temp;
- vect_finish_stmt_generation (stmt, new_stmt, bsi);
- FOR_EACH_SSA_TREE_OPERAND (sym, new_stmt, iter, SSA_OP_ALL_VIRTUALS)
- {
- if (TREE_CODE (sym) == SSA_NAME)
- sym = SSA_NAME_VAR (sym);
- mark_sym_for_renaming (sym);
- }
+ /* Arguments are ready. Create the new vector stmt. */
+ new_stmt = build_gimple_modify_stmt (vec_dest, new_stmt);
+ new_temp = make_ssa_name (vec_dest, new_stmt);
+ GIMPLE_STMT_OPERAND (new_stmt, 0) = new_temp;
+ vect_finish_stmt_generation (stmt, new_stmt, bsi);
+ FOR_EACH_SSA_TREE_OPERAND (sym, new_stmt, iter, SSA_OP_ALL_VIRTUALS)
+ {
+ if (TREE_CODE (sym) == SSA_NAME)
+ sym = SSA_NAME_VAR (sym);
+ mark_sym_for_renaming (sym);
+ }
- if (j == 0)
- STMT_VINFO_VEC_STMT (stmt_info) = *vec_stmt = new_stmt;
- else
- STMT_VINFO_RELATED_STMT (prev_stmt_info) = new_stmt;
- prev_stmt_info = vinfo_for_stmt (new_stmt);
+ if (j == 0)
+ STMT_VINFO_VEC_STMT (stmt_info) = *vec_stmt = new_stmt;
+ else
+ STMT_VINFO_RELATED_STMT (prev_stmt_info) = new_stmt;
+ prev_stmt_info = vinfo_for_stmt (new_stmt);
+ }
+ break;
+
+ case WIDEN:
+ /* In case the vectorization factor (VF) is bigger than the number
+ of elements that we can fit in a vectype (nunits), we have to
+ generate more than one vector stmt - i.e - we need to "unroll"
+ the vector stmt by a factor VF/nunits. */
+ for (j = 0; j < ncopies; j++)
+ {
+ if (j == 0)
+ vec_oprnd0 = vect_get_vec_def_for_operand (op0, stmt, NULL);
+ else
+ vec_oprnd0 = vect_get_vec_def_for_stmt_copy (dt0, vec_oprnd0);
+
+ STMT_VINFO_VECTYPE (stmt_info) = vectype_in;
+
+ /* Generate first half of the widened result: */
+ new_stmt
+ = vect_gen_widened_results_half (code1, vectype_out, decl1,
+ vec_oprnd0, vec_oprnd1,
+ unary_op, vec_dest, bsi, stmt);
+ if (j == 0)
+ STMT_VINFO_VEC_STMT (stmt_info) = new_stmt;
+ else
+ STMT_VINFO_RELATED_STMT (prev_stmt_info) = new_stmt;
+ prev_stmt_info = vinfo_for_stmt (new_stmt);
+
+ /* Generate second half of the widened result: */
+ new_stmt
+ = vect_gen_widened_results_half (code2, vectype_out, decl2,
+ vec_oprnd0, vec_oprnd1,
+ unary_op, vec_dest, bsi, stmt);
+ STMT_VINFO_RELATED_STMT (prev_stmt_info) = new_stmt;
+ prev_stmt_info = vinfo_for_stmt (new_stmt);
+ }
+ break;
+
+ case NARROW:
+ /* In case the vectorization factor (VF) is bigger than the number
+ of elements that we can fit in a vectype (nunits), we have to
+ generate more than one vector stmt - i.e - we need to "unroll"
+ the vector stmt by a factor VF/nunits. */
+ for (j = 0; j < ncopies; j++)
+ {
+ /* Handle uses. */
+ if (j == 0)
+ {
+ vec_oprnd0 = vect_get_vec_def_for_operand (op0, stmt, NULL);
+ vec_oprnd1 = vect_get_vec_def_for_stmt_copy (dt0, vec_oprnd0);
+ }
+ else
+ {
+ vec_oprnd0 = vect_get_vec_def_for_stmt_copy (dt0, vec_oprnd1);
+ vec_oprnd1 = vect_get_vec_def_for_stmt_copy (dt0, vec_oprnd0);
+ }
+
+ /* Arguments are ready. Create the new vector stmt. */
+ expr = build2 (code1, vectype_out, vec_oprnd0, vec_oprnd1);
+ new_stmt = build_gimple_modify_stmt (vec_dest, expr);
+ new_temp = make_ssa_name (vec_dest, new_stmt);
+ GIMPLE_STMT_OPERAND (new_stmt, 0) = new_temp;
+ vect_finish_stmt_generation (stmt, new_stmt, bsi);
+
+ if (j == 0)
+ STMT_VINFO_VEC_STMT (stmt_info) = new_stmt;
+ else
+ STMT_VINFO_RELATED_STMT (prev_stmt_info) = new_stmt;
+
+ prev_stmt_info = vinfo_for_stmt (new_stmt);
+ }
+
+ *vec_stmt = STMT_VINFO_VEC_STMT (stmt_info);
}
return true;
}
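For reference (this sketch is not part of the patch), the three modifier cases of vectorizable_conversion correspond to scalar source loops like the ones below. The element counts in the comments assume 128-bit vectors (e.g. SSE2) and are illustrative only.

```c
/* Scalar loops illustrating the NONE / WIDEN / NARROW conversion cases.
   Vector widths mentioned in comments are assumptions for a 128-bit target.  */

/* NONE: int -> float, same element count in and out (4 -> 4); handled
   via targetm.vectorize.builtin_conversion.  */
void conv_none (float *r, const int *a, int n)
{
  int i;
  for (i = 0; i < n; i++)
    r[i] = (float) a[i];
}

/* WIDEN: int -> double; each input vector (4 ints) yields two result
   vectors (2 doubles each), generated as hi/lo halves.  */
void conv_widen (double *r, const int *a, int n)
{
  int i;
  for (i = 0; i < n; i++)
    r[i] = (double) a[i];
}

/* NARROW: double -> int; two input vectors (2 doubles each) are packed
   into one result vector (4 ints), truncating toward zero.  */
void conv_narrow (int *r, const double *a, int n)
{
  int i;
  for (i = 0; i < n; i++)
    r[i] = (int) a[i];
}
```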
bool
vectorizable_type_demotion (tree stmt, block_stmt_iterator *bsi,
- tree *vec_stmt)
+ tree *vec_stmt)
{
tree vec_dest;
tree scalar_dest;
tree vec_oprnd0=NULL, vec_oprnd1=NULL;
stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
- enum tree_code code;
+ enum tree_code code, code1 = ERROR_MARK;
tree new_temp;
tree def, def_stmt;
enum vect_def_type dt0;
tree expr;
tree vectype_in;
tree scalar_type;
- optab optab;
- enum machine_mode vec_mode;
if (!STMT_VINFO_RELEVANT_P (stmt_info))
return false;
}
/* Supportable by target? */
- code = VEC_PACK_TRUNC_EXPR;
- optab = optab_for_tree_code (code, vectype_in);
- if (!optab)
- return false;
-
- vec_mode = TYPE_MODE (vectype_in);
- if (optab->handlers[(int) vec_mode].insn_code == CODE_FOR_nothing)
+ if (!supportable_narrowing_operation (code, stmt, vectype_in, &code1))
return false;
STMT_VINFO_VECTYPE (stmt_info) = vectype_in;
}
/* Arguments are ready. Create the new vector stmt. */
- expr = build2 (code, vectype_out, vec_oprnd0, vec_oprnd1);
+ expr = build2 (code1, vectype_out, vec_oprnd0, vec_oprnd1);
new_stmt = build_gimple_modify_stmt (vec_dest, expr);
new_temp = make_ssa_name (vec_dest, new_stmt);
GIMPLE_STMT_OPERAND (new_stmt, 0) = new_temp;
}
-/* Function vect_gen_widened_results_half
-
- Create a vector stmt whose code, type, number of arguments, and result
- variable are CODE, VECTYPE, OP_TYPE, and VEC_DEST, and its arguments are
- VEC_OPRND0 and VEC_OPRND1. The new vector stmt is to be inserted at BSI.
- In the case that CODE is a CALL_EXPR, this means that a call to DECL
- needs to be created (DECL is a function-decl of a target-builtin).
- STMT is the original scalar stmt that we are vectorizing. */
-
-static tree
-vect_gen_widened_results_half (enum tree_code code, tree vectype, tree decl,
- tree vec_oprnd0, tree vec_oprnd1, int op_type,
- tree vec_dest, block_stmt_iterator *bsi,
- tree stmt)
-{
- tree expr;
- tree new_stmt;
- tree new_temp;
- tree sym;
- ssa_op_iter iter;
-
- /* Generate half of the widened result: */
- if (code == CALL_EXPR)
- {
- /* Target specific support */
- if (op_type == binary_op)
- expr = build_call_expr (decl, 2, vec_oprnd0, vec_oprnd1);
- else
- expr = build_call_expr (decl, 1, vec_oprnd0);
- }
- else
- {
- /* Generic support */
- gcc_assert (op_type == TREE_CODE_LENGTH (code));
- if (op_type == binary_op)
- expr = build2 (code, vectype, vec_oprnd0, vec_oprnd1);
- else
- expr = build1 (code, vectype, vec_oprnd0);
- }
- new_stmt = build_gimple_modify_stmt (vec_dest, expr);
- new_temp = make_ssa_name (vec_dest, new_stmt);
- GIMPLE_STMT_OPERAND (new_stmt, 0) = new_temp;
- vect_finish_stmt_generation (stmt, new_stmt, bsi);
-
- if (code == CALL_EXPR)
- {
- FOR_EACH_SSA_TREE_OPERAND (sym, new_stmt, iter, SSA_OP_ALL_VIRTUALS)
- {
- if (TREE_CODE (sym) == SSA_NAME)
- sym = SSA_NAME_VAR (sym);
- mark_sym_for_renaming (sym);
- }
- }
-
- return new_stmt;
-}
-
-
/* Function vectorizable_type_promotion
Check if STMT performs a binary or unary operation that involves
operation = GIMPLE_STMT_OPERAND (stmt, 1);
code = TREE_CODE (operation);
- if (code != NOP_EXPR && code != WIDEN_MULT_EXPR)
+ if (code != NOP_EXPR && code != CONVERT_EXPR
+ && code != WIDEN_MULT_EXPR)
return false;
op0 = TREE_OPERAND (operation, 0);
widening operation that is supported by the target platform in
vector form (i.e., when operating on arguments of type VECTYPE).
- The two kinds of widening operations we currently support are
- NOP and WIDEN_MULT. This function checks if these operations
- are supported by the target platform either directly (via vector
- tree-codes), or via target builtins.
+ Widening operations we currently support are NOP (CONVERT), FLOAT
+ and WIDEN_MULT. This function checks if these operations are supported
+ by the target platform either directly (via vector tree-codes), or via
+ target builtins.
Output:
- CODE1 and CODE2 are codes of vector operations to be used when
break;
case NOP_EXPR:
+ case CONVERT_EXPR:
if (BYTES_BIG_ENDIAN)
{
c1 = VEC_UNPACK_HI_EXPR;
}
break;
+ case FLOAT_EXPR:
+ if (BYTES_BIG_ENDIAN)
+ {
+ c1 = VEC_UNPACK_FLOAT_HI_EXPR;
+ c2 = VEC_UNPACK_FLOAT_LO_EXPR;
+ }
+ else
+ {
+ c2 = VEC_UNPACK_FLOAT_HI_EXPR;
+ c1 = VEC_UNPACK_FLOAT_LO_EXPR;
+ }
+ break;
+
default:
gcc_unreachable ();
}
}
+/* Function supportable_narrowing_operation
+
+ Check whether an operation represented by the code CODE is a
+ narrowing operation that is supported by the target platform in
+ vector form (i.e., when operating on arguments of type VECTYPE).
+
+ Narrowing operations we currently support are NOP (CONVERT) and
+ FIX_TRUNC. This function checks if these operations are supported by
+ the target platform directly via vector tree-codes.
+
+ Output:
+ - CODE1 is the code of a vector operation to be used when
+ vectorizing the operation, if available. */
+
+bool
+supportable_narrowing_operation (enum tree_code code,
+ tree stmt, tree vectype,
+ enum tree_code *code1)
+{
+ enum machine_mode vec_mode;
+ enum insn_code icode1;
+ optab optab1;
+ tree expr = GIMPLE_STMT_OPERAND (stmt, 1);
+ tree type = TREE_TYPE (expr);
+ tree narrow_vectype = get_vectype_for_scalar_type (type);
+ enum tree_code c1;
+
+ switch (code)
+ {
+ case NOP_EXPR:
+ case CONVERT_EXPR:
+ c1 = VEC_PACK_TRUNC_EXPR;
+ break;
+
+ case FIX_TRUNC_EXPR:
+ c1 = VEC_PACK_FIX_TRUNC_EXPR;
+ break;
+
+ default:
+ gcc_unreachable ();
+ }
+
+ *code1 = c1;
+ optab1 = optab_for_tree_code (c1, vectype);
+
+ if (!optab1)
+ return false;
+
+ vec_mode = TYPE_MODE (vectype);
+ if ((icode1 = optab1->handlers[(int) vec_mode].insn_code) == CODE_FOR_nothing
+ || insn_data[icode1].operand[0].mode != TYPE_MODE (narrow_vectype))
+ return false;
+
+ return true;
+}
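As a semantic sketch (helper names and vector widths are hypothetical, not from the patch), the two pack tree codes selected by supportable_narrowing_operation behave like these scalar models: VEC_PACK_TRUNC_EXPR demotes integers by keeping the low-order bits, while the new VEC_PACK_FIX_TRUNC_EXPR converts floating point to integer with truncation; both merge two input vectors into one output vector whose elements are half the size.

```c
#include <stdint.h>

/* Scalar model of VEC_PACK_TRUNC_EXPR: two vectors of 4 x int32 packed
   into one vector of 8 x int16, keeping the low 16 bits of each element.  */
static void pack_trunc (int16_t r[8], const int32_t a[4], const int32_t b[4])
{
  int i;
  for (i = 0; i < 4; i++)
    {
      r[i]     = (int16_t) a[i];
      r[i + 4] = (int16_t) b[i];
    }
}

/* Scalar model of VEC_PACK_FIX_TRUNC_EXPR: two vectors of 2 x double
   converted (truncating toward zero) and packed into one vector of
   4 x int32.  */
static void pack_fix_trunc (int32_t r[4], const double a[2], const double b[2])
{
  int i;
  for (i = 0; i < 2; i++)
    {
      r[i]     = (int32_t) a[i];
      r[i + 2] = (int32_t) b[i];
    }
}
```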
+
+
/* Function reduction_code_for_scalar_code
Input:
extern bool reduction_code_for_scalar_code (enum tree_code, enum tree_code *);
extern bool supportable_widening_operation (enum tree_code, tree, tree,
tree *, tree *, enum tree_code *, enum tree_code *);
+extern bool supportable_narrowing_operation (enum tree_code, tree, tree,
+ enum tree_code *);
+
/* Creation and deletion of loop and stmt info structs. */
extern loop_vec_info new_loop_vec_info (struct loop *loop);
extern void destroy_loop_vec_info (loop_vec_info);
DEFTREECODE (VEC_WIDEN_MULT_HI_EXPR, "widen_mult_hi_expr", tcc_binary, 2)
DEFTREECODE (VEC_WIDEN_MULT_LO_EXPR, "widen_mult_lo_expr", tcc_binary, 2)
-/* Unpack (extract and promote/widen) the high/low elements of the input vector
- into the output vector. The input vector has twice as many elements
- as the output vector, that are half the size of the elements
+/* Unpack (extract and promote/widen) the high/low elements of the input
+ vector into the output vector. The input vector has twice as many
+ elements as the output vector, that are half the size of the elements
of the output vector. This is used to support type promotion. */
DEFTREECODE (VEC_UNPACK_HI_EXPR, "vec_unpack_hi_expr", tcc_unary, 1)
DEFTREECODE (VEC_UNPACK_LO_EXPR, "vec_unpack_lo_expr", tcc_unary, 1)
+/* Unpack (extract) the high/low elements of the input vector, convert
+ fixed point values to floating point and widen elements into the
+ output vector. The input vector has twice as many elements as the output
+ vector, that are half the size of the elements of the output vector. */
+DEFTREECODE (VEC_UNPACK_FLOAT_HI_EXPR, "vec_unpack_float_hi_expr", tcc_unary, 1)
+DEFTREECODE (VEC_UNPACK_FLOAT_LO_EXPR, "vec_unpack_float_lo_expr", tcc_unary, 1)
+
/* Pack (demote/narrow and merge) the elements of the two input vectors
into the output vector using truncation/saturation.
The elements of the input vectors are twice the size of the elements of the
DEFTREECODE (VEC_PACK_TRUNC_EXPR, "vec_pack_trunc_expr", tcc_binary, 2)
DEFTREECODE (VEC_PACK_SAT_EXPR, "vec_pack_sat_expr", tcc_binary, 2)
+/* Convert floating point values of the two input vectors to integer
+ and pack (narrow and merge) the elements into the output vector. The
+ elements of the input vector are twice the size of the elements of
+ the output vector. */
+DEFTREECODE (VEC_PACK_FIX_TRUNC_EXPR, "vec_pack_fix_trunc_expr", tcc_binary, 2)
+
/* Extract even/odd fields from vectors. */
DEFTREECODE (VEC_EXTRACT_EVEN_EXPR, "vec_extracteven_expr", tcc_binary, 2)
DEFTREECODE (VEC_EXTRACT_ODD_EXPR, "vec_extractodd_expr", tcc_binary, 2)
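The new unpack-float tree codes can be modeled in scalar C as follows (a sketch, not from the patch; it assumes a 4 x int32 input producing 2 x double outputs, and little-endian element order, i.e. the "lo" half holds the lowest-numbered elements; as supportable_widening_operation shows, the hi/lo roles swap on BYTES_BIG_ENDIAN targets).

```c
#include <stdint.h>

/* Scalar model of VEC_UNPACK_FLOAT_LO_EXPR: take the low half of a
   4 x int32 vector and widen each element to double.  */
static void unpack_float_lo (double r[2], const int32_t a[4])
{
  r[0] = (double) a[0];
  r[1] = (double) a[1];
}

/* Scalar model of VEC_UNPACK_FLOAT_HI_EXPR: same conversion applied to
   the high half of the input vector.  */
static void unpack_float_hi (double r[2], const int32_t a[4])
{
  r[0] = (double) a[2];
  r[1] = (double) a[3];
}
```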