[~2024-4-15] Pull from upstream #3285

Open
wants to merge 552 commits into
base: master

Conversation

powerboat9
Contributor

Merging this will be a bit weird, since I'm guessing the merge queue will try to rebase this instead of allowing the merge commit.

GCC Administrator and others added 30 commits March 29, 2024 00:17
Recently I fixed two wrong FP vector negate implementations which
caused wrong sign bits in zeros on some targets (r14-8786 and r14-8801).
To prevent a similar issue from happening again, add a test case.

Tested on x86_64 (with SSE2, AVX, AVX2, and AVX512F), AArch64, MIPS
(with MSA), LoongArch (with LSX and LASX).

gcc/testsuite:

	* gcc.dg/vect/vect-neg-zero.c: New test.
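
The property such a test guards can be sketched as follows. This is not the
contents of gcc.dg/vect/vect-neg-zero.c; the names negate_all and
zero_signs_ok are hypothetical and only illustrate that negation must flip
the sign bit of a zero.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Negate every element in place.  A vectorizer may turn this loop into a
// sign-bit flip; an implementation that computed 0.0 - x instead would
// yield +0.0 for a +0.0 input under round-to-nearest, which is wrong.
void negate_all(double *v, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i)
    v[i] = -v[i];
}

// Returns true when negation flipped the sign bit of both zeros and
// produced the expected values for the non-zero elements.
bool zero_signs_ok() {
  double v[4] = {0.0, -0.0, 1.5, -2.0};
  negate_all(v, 4);
  return std::signbit(v[0]) && !std::signbit(v[1])
         && v[2] == -1.5 && v[3] == 2.0;
}
```

Comparing with == would not catch the bug, since +0.0 == -0.0; the sketch
uses std::signbit for that reason.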
This changes an internal error into a fatal error for the case where ZSTD
is not enabled but the section was compressed with ZSTD.

Committed as approved after bootstrap/test on x86_64-linux-gnu.

gcc/ChangeLog:

	* lto-compress.cc (lto_end_uncompression): Use
	fatal_error instead of internal_error when ZSTD
	is not enabled.

Signed-off-by: Andrew Pinski <[email protected]>
2024-03-29  Paul Thomas  <[email protected]>

gcc/fortran
	PR fortran/36337
	PR fortran/110987
	PR fortran/113885
	* trans-expr.cc (gfc_trans_assignment_1): Place finalization
	block before rhs post block for elemental rhs.
	* trans.cc (gfc_finalize_tree_expr): Check directly if a type
	has no components, rather than the zero components attribute.
	Treat elemental zero component expressions in the same way as
	scalars.

gcc/testsuite/
	PR fortran/113885
	* gfortran.dg/finalize_54.f90: New test.
	* gfortran.dg/finalize_55.f90: New test.

gcc/testsuite/
	PR fortran/110987
	* gfortran.dg/finalize_56.f90: New test.
…PR50410]

gcc/fortran/ChangeLog:

	PR fortran/50410
	* trans-expr.cc (gfc_conv_structure): Check for NULL pointer.

gcc/testsuite/ChangeLog:

	PR fortran/50410
	* gfortran.dg/data_initialized_4.f90: New test.
Via XPASSing test cases after commit a657c7e
"testsuite: un-xfail TSVC loops that check for exit control flow vectorization":

    PASS: gcc.dg/vect/tsvc/vect-tsvc-s332.c (test for excess errors)
    PASS: gcc.dg/vect/tsvc/vect-tsvc-s332.c execution test
    [-XFAIL:-]{+XPASS:+} gcc.dg/vect/tsvc/vect-tsvc-s332.c scan-tree-dump vect "vectorized 1 loops"

    PASS: gcc.dg/vect/tsvc/vect-tsvc-s481.c (test for excess errors)
    PASS: gcc.dg/vect/tsvc/vect-tsvc-s481.c execution test
    [-XFAIL:-]{+XPASS:+} gcc.dg/vect/tsvc/vect-tsvc-s481.c scan-tree-dump vect "vectorized 1 loops"

    PASS: gcc.dg/vect/tsvc/vect-tsvc-s482.c (test for excess errors)
    PASS: gcc.dg/vect/tsvc/vect-tsvc-s482.c execution test
    [-XFAIL:-]{+XPASS:+} gcc.dg/vect/tsvc/vect-tsvc-s482.c scan-tree-dump vect "vectorized 1 loops"

..., it became apparent that GCN, too, does support vectorization of loops with
early breaks.  The relevant test cases are all-PASS with just the following
exceptions, to be looked into individually, later on:

    PASS: gcc.dg/vect/vect-early-break_25.c (test for excess errors)
    PASS: gcc.dg/vect/vect-early-break_25.c scan-tree-dump-times vect "vectorized 1 loops" 1
    FAIL: gcc.dg/vect/vect-early-break_25.c scan-tree-dump-times vect "Alignment of access forced using peeling" 1

    PASS: gcc.dg/vect/vect-early-break_56.c (test for excess errors)
    PASS: gcc.dg/vect/vect-early-break_56.c execution test
    XPASS: gcc.dg/vect/vect-early-break_56.c scan-tree-dump-times vect "vectorized 2 loops" 2

	gcc/testsuite/
	* lib/target-supports.exp
	(check_effective_target_vect_early_break)
	(check_effective_target_vect_early_break_hw): Enable for GCN.
... as made apparent by commit 4e1fcf4
"testsuite: vect: Require vect_hw_misalign in gcc.dg/vect/vect-cost-model-1.c etc. [PR98238]"
causing:

     PASS: gcc.dg/vect/vect-cost-model-1.c (test for excess errors)
    -PASS: gcc.dg/vect/vect-cost-model-1.c scan-tree-dump vect "LOOP VECTORIZED"

     PASS: gcc.dg/vect/vect-cost-model-3.c (test for excess errors)
    -PASS: gcc.dg/vect/vect-cost-model-3.c scan-tree-dump vect "LOOP VECTORIZED"

     PASS: gcc.dg/vect/vect-cost-model-5.c (test for excess errors)
    -PASS: gcc.dg/vect/vect-cost-model-5.c scan-tree-dump vect "LOOP VECTORIZED"

..., and similarly commit ffd47fb
"testsuite: Fix pr113431.c FAIL on sparc* [PR113431]" causing:

     PASS: gcc.dg/vect/pr113431.c (test for excess errors)
     PASS: gcc.dg/vect/pr113431.c execution test
    -PASS: gcc.dg/vect/pr113431.c scan-tree-dump-times slp1 "optimized: basic block part vectorized" 2

..., which this commit all restores, and also enables a good number of further
FAIL -> PASS, UNSUPPORTED -> PASS, etc. progressions.  There are also a small
number of regressions, mostly in the SLP area apparently:

     PASS: gcc.dg/vect/bb-slp-layout-12.c (test for excess errors)
    +XPASS: gcc.dg/vect/bb-slp-layout-12.c scan-tree-dump-not slp1 "duplicating permutation node"
    +XFAIL: gcc.dg/vect/bb-slp-layout-12.c scan-tree-dump-times slp1 "add new stmt: [^\\n\\r]* = VEC_PERM_EXPR" 3

     PASS: gcc.dg/vect/bb-slp-layout-6.c (test for excess errors)
    +FAIL: gcc.dg/vect/bb-slp-layout-6.c scan-tree-dump slp2 "absorbing input layouts"

     PASS: gcc.dg/vect/pr97428.c (test for excess errors)
     PASS: gcc.dg/vect/pr97428.c scan-tree-dump vect "Detected interleaving load of size 8"
     PASS: gcc.dg/vect/pr97428.c scan-tree-dump vect "Detected interleaving store of size 16"
     PASS: gcc.dg/vect/pr97428.c scan-tree-dump-not vect "gap of 6 elements"
    -XFAIL: gcc.dg/vect/pr97428.c scan-tree-dump-times vect "vectorizing stmts using SLP" 2
    +FAIL: gcc.dg/vect/pr97428.c scan-tree-dump-times vect "vectorizing stmts using SLP" 2

     PASS: gcc.dg/vect/vect-33.c (test for excess errors)
    +FAIL: gcc.dg/vect/vect-33.c scan-tree-dump vect "Vectorizing an unaligned access"
     PASS: gcc.dg/vect/vect-33.c scan-tree-dump-not optimized "Invalid sum"
     PASS: gcc.dg/vect/vect-33.c scan-tree-dump-times vect "vectorized 1 loops" 1

..., so some further conditionalizing etc. seems necessary.  These seem to
mostly appear next to pre-existing similar FAILs in related test cases.
(Overall, way more PASS than FAIL.)

	gcc/testsuite/
	* lib/target-supports.exp
	(check_effective_target_vect_hw_misalign): Enable for GCN.
	(check_effective_target_vect_element_align): Adjust.
... as made apparent by commit bfd6b36
"testsuite/vect: Fix pr25413a.c expectations [PR109705]" causing:

     PASS: gcc.dg/vect/pr25413a.c (test for excess errors)
     PASS: gcc.dg/vect/pr25413a.c execution test
    -PASS: gcc.dg/vect/pr25413a.c scan-tree-dump-times vect "vectorized 2 loops" 1
    +FAIL: gcc.dg/vect/pr25413a.c scan-tree-dump-times vect "vectorized 1 loops" 1

..., which this commit resolves.

	gcc/testsuite/
	* lib/target-supports.exp (check_effective_target_vect_long_mult):
	Enable for GCN.
It was mistakenly added to these files.

libstdc++-v3/ChangeLog:

	* testsuite/24_iterators/range_generators/01.cc: Drop GCC
	Runtime Library Exception.
	* testsuite/24_iterators/range_generators/02.cc: Drop GCC
	Runtime Library Exception.
	* testsuite/24_iterators/range_generators/copy.cc: Drop GCC
	Runtime Library Exception.
	* testsuite/24_iterators/range_generators/except.cc: Drop GCC
	Runtime Library Exception.
	* testsuite/24_iterators/range_generators/subrange.cc: Drop GCC
	Runtime Library Exception.
	* testsuite/24_iterators/range_generators/synopsis.cc: Drop GCC
	Runtime Library Exception.
	* testsuite/24_iterators/range_generators/iter_deref_return.cc:
	Drop GCC Runtime Library Exception from the "You should have
	received a copy" paragraph.
There was a typo in the testcase, with GCC_CPUINFO pointing to the
wrong file.

2024-03-29  Christophe Lyon  <[email protected]>

	gcc/testsuite/
	* gcc.target/aarch64/cpunative/native_cpu_24.c: Fix GCC_CPUINFO.
gcc/jit/ChangeLog:

	* libgccjit.cc (gcc_jit_type_get_size): Add pointer support
We were assuming that TYPE_NO_NAMED_ARGS_STDARG_P functions don't have any
named arguments and that there is nothing to advance, but that is not the
case for (...) functions returning by hidden reference, which have one such
artificial argument.  This is causing gcc.dg/c23-stdarg-{6,8,9}.c to
fail.

Fix the issue by checking if arg.type is NULL, as r14-9503 explains.

gcc/ChangeLog:

	PR target/114175
	* config/mips/mips.cc (mips_setup_incoming_varargs): Only skip
	mips_function_arg_advance for TYPE_NO_NAMED_ARGS_STDARG_P
	functions if arg.type is NULL.
This patch fixes the unused variable reported below:

../../gcc/common/config/riscv/riscv-common.cc: In static member function
'static riscv_subset_list* riscv_subset_list::parse(const char*, location_t)':
../../gcc/common/config/riscv/riscv-common.cc:1501:19: error: unused variable 'itr'
  [-Werror=unused-variable]
 1501 |   riscv_subset_t *itr;

The code that consumed the variable was removed previously, but the
declaration itself was left behind.  Thus, we have an unused variable here.

gcc/ChangeLog:

	* common/config/riscv/riscv-common.cc (riscv_subset_list::parse):
	Remove unused var decl.

Signed-off-by: Pan Li <[email protected]>
This patch fixes the misspelled term below in an error message.

../../gcc/config/riscv/riscv-vector-builtins.cc:4592:16: error:
misspelled term 'builtin function' in format; use 'built-in function' instead [-Werror=format-diag]
 4592 |               "builtin function %qE requires the V ISA extension", exp);

The tests below pass with this patch.
* The riscv regression test on rvv.exp and riscv.exp.

gcc/ChangeLog:

	* config/riscv/riscv-vector-builtins.cc (expand_builtin): Take
	the term built-in over builtin.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-7.c:
	Adjust test dg-error.
	* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-8.c:
	Ditto.

Signed-off-by: Pan Li <[email protected]>
…er model

The test for the recently added XiangShan Nanhu microarchitecture is failing
because the scheduler description does not have entries for certain insn types.

I'm adding branch, jalr, ret and sfb_alu to the scheduler description, that's
enough to get the trivial test to pass.  However, I strongly suspect running
any significant code through the compiler when scheduling for this
microarchitecture will trigger faults.

Basically we have checking now that will fault if we have an insn in the IL
without an associated type or if we have an insn in the IL that does not map to
an insn reservation in the scheduler model.  We were tripping the latter
assertion for one of those branch types.  My suspicion is many insn types
aren't handled by that DFA.

The branch insns were pretty obvious and easy to fix.  But someone with more
experience with the uarch needs to do an audit to ensure that all insn types
map to an insn reservation.

gcc/
	* config/riscv/xiangshan.md (xiangshan_jump): Add branch, jalr, ret
	and sfb_alu.
This rule was missing, and 'make install-html' was failing.
It is copied from the corresponding one in fortran.

2024-03-29  Christophe Lyon  <[email protected]>

	gcc/m2/
	* Make-lang.in (install-html): New rule.
Fix a few typos: the generated filename is m2.info (not gm2.info), and
gm2$(exeext) is a file, not a directory (so test -d would always fail).

2024-03-29  Christophe Lyon  <[email protected]>

	gcc/m2/
	* Make-lang.in (m2.install-info): Fix rule.
Add descriptions for the compilation options '-mfrecipe', '-mdiv32',
'-mlam-bh', '-mlamcas' and '-mld-seq-sa'.

gcc/ChangeLog:

	* doc/invoke.texi: Add descriptions for the compilation
	options.
…edures

gcc/ChangeLog:

	* config/loongarch/genopts/loongarch.opt.in: Mark -m[no-]recip as
	aliases to -mrecip={all,none}, respectively.
	* config/loongarch/loongarch.opt: Regenerate.
	* config/loongarch/loongarch-def.h (ABI_FPU_64): Rename to...
	(ABI_FPU64_P): ...this.
	(ABI_FPU_32): Rename to...
	(ABI_FPU32_P): ...this.
	(ABI_FPU_NONE): Rename to...
	(ABI_NOFPU_P): ...this.
	(ABI_LP64_P): Define.
	* config/loongarch/loongarch.cc (loongarch_init_print_operand_punct):
	Merged into loongarch_global_init.
	(loongarch_cpu_option_override): Renamed to
	loongarch_target_option_override.
	(loongarch_option_override_internal): Move the work after
	loongarch_config_target into loongarch_target_option_override.
	(loongarch_global_init): Define.
	(INIT_TARGET_FLAG): Move to loongarch-opts.cc.
	(loongarch_option_override): Call loongarch_global_init
	separately.
	* config/loongarch/loongarch-opts.cc (loongarch_parse_mrecip_scheme):
	Split the parsing of -mrecip=<string> from
	loongarch_option_override_internal.
	(loongarch_generate_mrecip_scheme): Define. Split from
	loongarch_option_override_internal.
	(loongarch_target_option_override): Define. Renamed from
	loongarch_cpu_option_override.
	(loongarch_init_misc_options): Define. Split from
	loongarch_option_override_internal.
	(INIT_TARGET_FLAG): Move from loongarch.cc.
	* config/loongarch/loongarch-opts.h (loongarch_target_option_override):
	New prototype.
	(loongarch_parse_mrecip_scheme): New prototype.
	(loongarch_init_misc_options): New prototype.
	(TARGET_ABI_LP64): Simplify with ABI_LP64_P.
	* config/loongarch/loongarch.h (TARGET_RECIP_DIV): Simplify.
	Do not reference specific CPU architecture (LA664).
	(TARGET_RECIP_SQRT): Same.
	(TARGET_RECIP_RSQRT): Same.
	(TARGET_RECIP_VEC_DIV): Same.
	(TARGET_RECIP_VEC_SQRT): Same.
	(TARGET_RECIP_VEC_RSQRT): Same.
P2748R5 makes it ill-formed to return a reference to temporary in C++26;
implementing this is a simple matter of changing the existing warning to a
permerror.

For most of the tests I just changed dg-warning to dg-message to accept
both; I test the specific diagnostic type in Wreturn-local-addr-5.C.

gcc/cp/ChangeLog:

	* typeck.cc (maybe_warn_about_returning_address_of_local):
	Permerror in C++26.

gcc/testsuite/ChangeLog:

	* g++.dg/conversion/pr16333.C: Change dg-warning to dg-message.
	* g++.dg/cpp0x/constexpr-48324.C
	* g++.dg/other/pr94326.C
	* g++.dg/warn/Wreturn-local-addr-2.C
	* g++.old-deja/g++.jason/warning8.C: Likewise.
	* g++.dg/cpp1y/auto-fn6.C: Check that others don't complain.
	* g++.dg/warn/Wreturn-local-addr-5.C: Expect error in C++26.
This patch introduces stricter checking within standard procedure
functions which detect whether parameters are variable when used
in a const expression.

gcc/m2/ChangeLog:

	PR modula2/114548
	* gm2-compiler/M2Quads.mod (ConvertToAddress): Pass
	procedure, false parameters to BuildConvertFunction.
	(PushOne): Pass procedure, true parameters to
	BuildConvertFunction.
	Remove unused parameter internal.
	(BuildPseudoBy): Remove parameter to PushOne.
	(BuildIncProcedure): Ditto.
	(BuildDecProcedure): Ditto.
	(BuildFunctionCall): Add ConstExpr parameter to
	BuildPseudoFunctionCall.
	(BuildConstFunctionCall): Add procedure and true to
	BuildConvertFunction.
	(BuildPseudoFunctionCall): Add ConstExpr parameter.
	Pass ProcSym and ConstExpr to BuildLengthFunction,
	BuildConvertFunction, BuildOddFunction, BuildAbsFunction,
	BuildCapFunction, BuildValFunction, BuildChrFunction,
	BuildOrdFunction, BuildIntFunction, BuildTruncFunction,
	BuildFloatFunction, BuildAddAdrFunction, BuildSubAdrFunction,
	BuildDifAdrFunction, BuildCastFunction, BuildReFunction,
	BuildImFunction and BuildCmplxFunction.
	(BuildAddAdrFunction): Add ProcSym, ConstExpr parameters and
	check for constant parameters.
	(BuildSubAdrFunction): Ditto.
	(BuildDifAdrFunction): Ditto.
	(ConstExprError): Ditto.
	(BuildLengthFunction): Ditto.
	(BuildOddFunction): Ditto.
	(BuildAbsFunction): Ditto.
	(BuildCapFunction): Ditto.
	(BuildChrFunction): Ditto.
	(BuildOrdFunction): Ditto.
	(BuildIntFunction): Ditto.
	(BuildValFunction): Ditto.
	(BuildCastFunction): Ditto.
	(BuildConvertFunction): Ditto.
	(BuildTruncFunction): Ditto.
	(BuildFloatFunction): Ditto.
	(BuildReFunction): Ditto.
	(BuildImFunction): Ditto.
	(BuildCmplxFunction): Ditto.

gcc/testsuite/ChangeLog:

	PR modula2/114548
	* gm2/iso/const/fail/expression.mod: New test.
	* gm2/iso/const/fail/iso-const-fail.exp: New test.
	* gm2/iso/const/fail/testabs.mod: New test.
	* gm2/iso/const/fail/testaddadr.mod: New test.
	* gm2/iso/const/fail/testcap.mod: New test.
	* gm2/iso/const/fail/testcap2.mod: New test.
	* gm2/iso/const/fail/testchr.mod: New test.
	* gm2/iso/const/fail/testchr2.mod: New test.
	* gm2/iso/const/fail/testcmplx.mod: New test.
	* gm2/iso/const/fail/testfloat.mod: New test.
	* gm2/iso/const/fail/testim.mod: New test.
	* gm2/iso/const/fail/testint.mod: New test.
	* gm2/iso/const/fail/testlength.mod: New test.
	* gm2/iso/const/fail/testodd.mod: New test.
	* gm2/iso/const/fail/testord.mod: New test.
	* gm2/iso/const/fail/testre.mod: New test.
	* gm2/iso/const/fail/testtrunc.mod: New test.
	* gm2/iso/const/fail/testval.mod: New test.
	* gm2/iso/const/pass/constbool.mod: New test.
	* gm2/iso/const/pass/constbool2.mod: New test.
	* gm2/iso/const/pass/constbool3.mod: New test.

Signed-off-by: Gaius Mulley <[email protected]>
Fixes: d28ea8e ("LoongArch: Split loongarch_option_override_internal
		      into smaller procedures")

gcc/ChangeLog:

	* config/loongarch/loongarch.opt.urls: Regenerate.
Add support for TLS descriptors on normal code model and extreme
code model.

Normal code model instruction sequence:
  -mno-explicit-relocs:
    la.tls.desc	$r4, s
    add.d	$r12, $r4, $r2
  -mexplicit-relocs:
    pcalau12i	$r4,%desc_pc_hi20(s)
    addi.d	$r4,$r4,%desc_pc_lo12(s)
    ld.d	$r1,$r4,%desc_ld(s)
    jirl	$r1,$r1,%desc_call(s)
    add.d	$r12, $r4, $r2

Extreme code model instruction sequence:
  -mno-explicit-relocs:
    la.tls.desc	$r4, $r12, s
    add.d	$r12, $r4, $r2
  -mexplicit-relocs:
    pcalau12i	$r4,%desc_pc_hi20(s)
    addi.d	$r12,$r0,%desc_pc_lo12(s)
    lu32i.d	$r12,%desc64_pc_lo20(s)
    lu52i.d	$r12,$r12,%desc64_pc_hi12(s)
    add.d	$r4,$r4,$r12
    ld.d	$r1,$r4,%desc_ld(s)
    jirl	$r1,$r1,%desc_call(s)
    add.d	$r12, $r4, $r2

The default is still the traditional TLS model, but this can be configured
with --with-tls={trad,desc}. The default can change to TLS descriptors once
libc and LLVM support this.
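
For orientation, the symbol s in the sequences above stands for a
thread-local variable.  A minimal sketch of source code that needs such an
address computation is shown here; the function names are hypothetical, and
whether the descriptor sequence is actually emitted depends on the target,
on the TLS access model chosen by the compiler, and on options such as
-fPIC and -mtls-dialect=desc.

```cpp
#include <cassert>

// Hypothetical thread-local symbol, standing in for `s` above.
thread_local int s = 42;

// Taking the address makes the compiler materialize the TLS address of
// `s` for the calling thread.
int *address_of_s() { return &s; }

// Store through the computed address, then read the variable directly.
int read_after_write(int value) {
  *address_of_s() = value;
  return s;
}
```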

gcc/ChangeLog:

	* config.gcc: Add --with-tls option to change TLS flavor.
	* config/loongarch/genopts/loongarch.opt.in: Add -mtls-dialect to
	configure TLS flavor.
	* config/loongarch/loongarch-def.h (struct loongarch_target): Add
	tls_dialect.
	* config/loongarch/loongarch-driver.cc (la_driver_init): Add tls
	flavor.
	* config/loongarch/loongarch-opts.cc (loongarch_init_target): Add
	tls_dialect.
	(loongarch_config_target): Ditto.
	(loongarch_update_gcc_opt_status): Ditto.
	* config/loongarch/loongarch-opts.h (loongarch_init_target): Ditto.
	(TARGET_TLS_DESC): New define.
	* config/loongarch/loongarch.cc (loongarch_symbol_insns): Add TLS
	DESC instructions sequence length.
	(loongarch_legitimize_tls_address): New TLS DESC instruction sequence.
	(loongarch_option_override_internal): Add la_opt_tls_dialect.
	(loongarch_option_restore): Add la_target.tls_dialect.
	* config/loongarch/loongarch.md (@got_load_tls_desc<mode>): Normal
	code model for TLS DESC.
	(got_load_tls_desc_off64): Extreme code model for TLS DESC.
	* config/loongarch/loongarch.opt: Regenerate.
	* config/loongarch/loongarch.opt.urls: Ditto.
	* doc/invoke.texi: Add a description of the compilation option
	'-mtls-dialect={trad,desc}'.

gcc/testsuite/ChangeLog:

	* gcc.target/loongarch/cmodel-extreme-1.c: Add -mtls-dialect=trad.
	* gcc.target/loongarch/cmodel-extreme-2.c: Ditto.
	* gcc.target/loongarch/explicit-relocs-auto-tls-ld-gd.c: Ditto.
	* gcc.target/loongarch/explicit-relocs-medium-call36-auto-tls-ld-gd.c:
	Ditto.
	* gcc.target/loongarch/func-call-medium-1.c: Ditto.
	* gcc.target/loongarch/func-call-medium-2.c: Ditto.
	* gcc.target/loongarch/func-call-medium-3.c: Ditto.
	* gcc.target/loongarch/func-call-medium-4.c: Ditto.
	* gcc.target/loongarch/tls-extreme-macro.c: Ditto.
	* gcc.target/loongarch/tls-gd-noplt.c: Ditto.
	* gcc.target/loongarch/explicit-relocs-auto-extreme-tls-desc.c: New test.
	* gcc.target/loongarch/explicit-relocs-auto-tls-desc.c: New test.
	* gcc.target/loongarch/explicit-relocs-extreme-tls-desc.c: New test.
	* gcc.target/loongarch/explicit-relocs-tls-desc.c: New test.

Co-authored-by: Lulu Cheng <[email protected]>
Co-authored-by: Xi Ruoyao <[email protected]>
gcc/ChangeLog:

	* config/loongarch/t-loongarch: Add loongarch-def-arrays.h
	to OPTION_H_EXTRA.
A recent change to libiberty has improved the process spawning on
older Darwin platforms.  This patch updates the expected test output
after the changes.

gcc/testsuite/ChangeLog:

	* g++.dg/modules/bad-mapper-1.C: Update expected test output
	for earlier Darwin.

Signed-off-by: Iain Sandoe <[email protected]>
Some versions of dsymutil do not ignore .macinfo sections, but instead
ignore all of the debug info in the file.

To avoid this total loss of debug, when we detect that the debug level
is g3 and the dsymutil version cannot support it, we reduce the level
to g2 and issue a note.

This behaviour can be overridden by -gstrict-dwarf (although the objects
will contain macinfo; dsymutil will not produce a .dSYM with it).

gcc/ChangeLog:

	* config/darwin.cc (darwin_override_options): Reduce the debug
	level to 2 if dsymutil cannot handle .macinfo sections.

Signed-off-by: Iain Sandoe <[email protected]>
Patrick Palka and others added 23 commits April 12, 2024 14:52
The original PR114393 testcase is unfortunately still not accepted after
r14-9938-g081c1e93d56d35 due to return type deduction confusion when a
lambda-expr is used as a default template argument.

The below reduced testcase demonstrates the bug.  Here when forming the
dependent specialization b_v<U> we substitute the default argument of F,
a lambda-expr, with _Descriptor=U.  (In this case in_template_context is
true since we're in the context of the template c_v, so we don't defer.)
This substitution in turn lowers the level of the lambda's auto return
type from 2 to 1 and so later, when instantiating c_v<int, char> we wrongly
substitute this auto with the template argument at level=0,index=0, i.e.
int, instead of going through do_auto_deduction which would yield char.

One way to fix this would be to use a level-less auto to represent a
deduced return type of a lambda, but that might be too invasive of a
change at this stage, and it might be better to do this across the board
for all deduced return types.

Another way would be to pass tf_partial from coerce_template_parms during
dependent substitution into a default template argument so that the
substitution doesn't do any level-lowering, but that wouldn't do the right
thing in this case due to the tf_partial early exit in the LAMBDA_EXPR
case of tsubst_expr.

Yet another way, and the approach that this patch takes, is to just
defer all dependent substitution into a lambda-expr, building upon the
logic added in r14-9938-g081c1e93d56d35.  This also helps ensure
LAMBDA_EXPR_REGEN_INFO consists only of the concrete template arguments
that were ultimately substituted into the most general lambda.

	PR c++/114393

gcc/cp/ChangeLog:

	* pt.cc (tsubst_lambda_expr): Also defer all dependent
	substitution.

gcc/testsuite/ChangeLog:

	* g++.dg/cpp2a/lambda-targ2a.C: New test.

Reviewed-by: Jason Merrill <[email protected]>
The middle-end warns about the ANNOTATE_EXPR added for while/for loops
if they declare a var inside of the loop condition.
This is because the assumption is that ANNOTATE_EXPR argument is used
immediately in a COND_EXPR (later GIMPLE_COND), but simplify_loop_decl_cond
wraps the ANNOTATE_EXPR inside of a TRUTH_NOT_EXPR, so it no longer
holds.

The following patch fixes that by adding the TRUTH_NOT_EXPR inside of the
ANNOTATE_EXPR argument if any.
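
The kind of loop involved can be sketched as below.  This is not the
g++.dg/ext/pr114691.C testcase; it assumes the annotation comes from a loop
pragma such as #pragma GCC unroll, and the names next_value and
sum_down_from are hypothetical.

```cpp
#include <cassert>

// Hypothetical helper: yields state, state-1, ..., 1 and then 0, which
// ends the loop below.
int next_value(int &state) { return state > 0 ? state-- : 0; }

int sum_down_from(int start) {
  int state = start;
  int sum = 0;
  // GCC-specific loop pragma; other compilers may ignore it.
#pragma GCC unroll 2
  while (int v = next_value(state))  // variable declared in the condition
    sum += v;
  return sum;
}
```

Because v is declared in the condition, the front end has to rewrite the
loop so that the declaration lives inside the body, which is where the
negated condition described above comes from.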

2024-04-12  Jakub Jelinek  <[email protected]>

	PR c++/114691
	* semantics.cc (simplify_loop_decl_cond): Use cp_build_unary_op with
	TRUTH_NOT_EXPR on ANNOTATE_EXPR argument (if any) rather than
	ANNOTATE_EXPR itself.

	* g++.dg/ext/pr114691.C: New test.
Fixes: 9706965 ("RISC-V: Implement TLS Descriptors.")

gcc/ChangeLog:
	* config/riscv/riscv.opt.urls: Regenerated.

Reviewed-by: Palmer Dabbelt <[email protected]>
Acked-by: Palmer Dabbelt <[email protected]>
The original testcase in PR113141 is an instance of CWG1996; the standard
fails to consider conversion functions when initializing a reference
directly from an initializer-list of one element, but then does consider
them when initializing a temporary.  I have a proposed fix for this defect,
which is implemented here.

	DR 1996
	PR c++/113141

gcc/cp/ChangeLog:

	* call.cc (reference_binding): Check direct binding from
	a single-element list.

gcc/testsuite/ChangeLog:

	* g++.dg/cpp0x/initlist-ref1.C: New test.
	* g++.dg/cpp0x/initlist-ref2.C: New test.
	* g++.dg/cpp0x/initlist-ref3.C: New test.

Co-authored-by: Patrick Palka <[email protected]>
The second testcase in 113141 is a separate issue: we first decide that the
conversion is ill-formed, but then, when recalculating, the special c_cast_p
handling makes us think it's OK.  We don't want that; it should continue to
fall back to the reinterpret_cast interpretation.  And while we're here,
let's warn that we're not using the conversion function.

Note that the standard seems to say that in this case we should
treat (Matrix &) as const_cast<Matrix &>(static_cast<const Matrix &>(X)),
which would use the conversion operator, but that doesn't match existing
practice, so let's resolve that another day.  I've raised this issue with
CWG; at the moment I lean toward never binding a temporary in a C-style cast
to reference type, which would also be a change from existing practice.

	PR c++/113141

gcc/c-family/ChangeLog:

	* c.opt: Add -Wcast-user-defined.

gcc/ChangeLog:

	* doc/invoke.texi: Document -Wcast-user-defined.

gcc/cp/ChangeLog:

	* call.cc (reference_binding): For an invalid cast, warn and don't
	recalculate.

gcc/testsuite/ChangeLog:

	* g++.dg/conversion/ref12.C: New test.

Co-authored-by: Patrick Palka <[email protected]>
One known missing piece in the modules implementation is merging of a
streamed-in local type (class or enum) with the corresponding in-TU
version of the local type.  This missing piece turns out to cause a
hard-to-reduce use-after-free GC issue due to the entity_ary not being
marked as a GC root (deliberately), and manifests as a serialization
error on stream-in as in PR99426 (see comment #6 for a reduction).  It's
also reproducible on trunk when running the xtreme-header tests without
-fno-module-lazy.

This patch implements this missing piece, making us merge such local
types according to their position within the containing function's
definition, analogous to how we merge FIELD_DECLs of a class according
to their index in the TYPE_FIELDS list.

	PR c++/99426

gcc/cp/ChangeLog:

	* module.cc (merge_kind::MK_local_type): New enumerator.
	(merge_kind_name): Update.
	(trees_out::chained_decls): Move BLOCK-specific handling
	of DECL_LOCAL_DECL_P decls to ...
	(trees_out::core_vals) <case BLOCK>: ... here.  Stream
	BLOCK_VARS manually.
	(trees_in::core_vals) <case BLOCK>: Stream BLOCK_VARS
	manually.  Handle deduplicated local types.
	(trees_out::key_local_type): Define.
	(trees_in::key_local_type): Define.
	(trees_out::get_merge_kind) <case FUNCTION_DECL>: Return
	MK_local_type for a local type.
	(trees_out::key_mergeable) <case FUNCTION_DECL>: Use
	key_local_type.
	(trees_in::key_mergeable) <case FUNCTION_DECL>: Likewise.
	(trees_in::is_matching_decl): Be flexible with type mismatches
	for local entities.
	(trees_in::register_duplicate): Also register the
	DECL_TEMPLATE_RESULT of a TEMPLATE_DECL as a duplicate.
	(depset_cmp): Return 0 for equal IDENTIFIER_HASH_VALUEs.

gcc/testsuite/ChangeLog:

	* g++.dg/modules/merge-17.h: New test.
	* g++.dg/modules/merge-17_a.H: New test.
	* g++.dg/modules/merge-17_b.C: New test.
	* g++.dg/modules/xtreme-header-7_a.H: New test.
	* g++.dg/modules/xtreme-header-7_b.C: New test.

Reviewed-by: Jason Merrill <[email protected]>
The bug in PR101865 is that the _ARCH_PWR8 predefined macro is conditional upon
TARGET_DIRECT_MOVE, which can be false for some -mcpu=power8 compiles if the
-mno-altivec or -mno-vsx options are used.  The solution here is to create
a new OPTION_MASK_POWER8 mask that is true for -mcpu=power8, regardless of
Altivec or VSX enablement.
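
User code typically consumes the macro along these lines.  A minimal
sketch; the function name built_for_power8 is hypothetical.  With the fix,
a -mcpu=power8 -mno-vsx build is expected to take the first branch.

```cpp
#include <cassert>

// Reports whether the compiler predefined _ARCH_PWR8 for this
// translation unit.
constexpr bool built_for_power8() {
#ifdef _ARCH_PWR8
  return true;
#else
  return false;
#endif
}
```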

Unfortunately, the only way to create an OPTION_MASK_* mask is to create
a new option, which we have done here, but marked it as WarnRemoved since
we do not want users using it.  For stage1, we will look into how we can
create ISA mask flags for use in the compiler without the need for explicit
options.

2024-04-12  Will Schmidt  <[email protected]>
	    Peter Bergner  <[email protected]>

gcc/
	PR target/101865
	* config/rs6000/rs6000-builtin.cc (rs6000_builtin_is_supported): Use
	TARGET_POWER8.
	* config/rs6000/rs6000-c.cc (rs6000_target_modify_macros): Use
	OPTION_MASK_POWER8.
	* config/rs6000/rs6000-cpus.def (POWERPC_MASKS): Add OPTION_MASK_POWER8.
	(ISA_2_7_MASKS_SERVER): Likewise.
	* config/rs6000/rs6000.cc (rs6000_option_override_internal): Update
	comment.  Use OPTION_MASK_POWER8 and TARGET_POWER8.
	* config/rs6000/rs6000.h (TARGET_SYNC_HI_QI): Use TARGET_POWER8.
	* config/rs6000/rs6000.md (define_attr "isa"): Add p8.
	(define_attr "enabled"): Handle it.
	(define_insn "prefetch"): Use TARGET_POWER8.
	* config/rs6000/rs6000.opt (mpower8-internal): New.

gcc/testsuite/
	PR target/101865
	* gcc.target/powerpc/predefine-p7-novsx.c: New test.
	* gcc.target/powerpc/predefine-p8-noaltivec-novsx.c: New test.
	* gcc.target/powerpc/predefine-p8-noaltivec.c: New test.
	* gcc.target/powerpc/predefine-p8-novsx.c: New test.
	* gcc.target/powerpc/predefine-p8-pragma-vsx.c: New test.
	* gcc.target/powerpc/predefine-p9-novsx.c: New test.
This ICE started with the fairly complicated r13-765.  We crash in
gimplify_var_or_parm_decl because a stray VAR_DECL leaked there.
The problem is ultimately that potential_prvalue_result_of wasn't
correctly handling arrays and replace_placeholders_for_class_temp_r
replaced a PLACEHOLDER_EXPR in a TARGET_EXPR which is used in the
context of copy elision.  If I have

  M m[2] = { M{""}, M{""} };

then we don't invoke the M(const M&) copy-ctor.
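
The elision itself can be observed with a small sketch.  The struct below
is a hypothetical stand-in for M, not the type from the testcase, and the
count of zero assumes C++17 or later, or a compiler that performs the
elision by default in earlier modes.

```cpp
#include <cassert>

// Hypothetical stand-in for `M`; counts copy-constructor calls.
struct M {
  static int copies;
  const char *text;
  M(const char *t) : text(t) {}
  M(const M &other) : text(other.text) { ++copies; }
};
int M::copies = 0;

// Initializes an array from prvalues and reports how many copies ran.
int copies_made_initializing_array() {
  M::copies = 0;
  M m[2] = { M{""}, M{""} };
  (void)m;
  return M::copies;
}
```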

One part of the fix is to use TARGET_EXPR_ELIDING_P rather than
potential_prvalue_result_of.  That unfortunately doesn't handle the
case like

  struct N { N(M); };
  N arr[2] = { M{""}, M{""} };

because TARGET_EXPRs that initialize a function argument are not
marked TARGET_EXPR_ELIDING_P even though gimplify_arg drops such
TARGET_EXPRs on the floor.  We can use a pset to avoid replacing
placeholders in them.

I made an attempt to use set_target_expr_eliding in
convert_for_arg_passing but that regressed constexpr-diag1.C, and does
not seem like a prudent change in stage 4 anyway.

	PR c++/109966

gcc/cp/ChangeLog:

	* typeck2.cc (potential_prvalue_result_of): Remove.
	(replace_placeholders_for_class_temp_r): Check TARGET_EXPR_ELIDING_P.
	Use a pset.  Don't replace_placeholders in TARGET_EXPRs that initialize
	a function argument.

gcc/testsuite/ChangeLog:

	* g++.dg/cpp1y/nsdmi-aggr20.C: New test.
	* g++.dg/cpp1y/nsdmi-aggr21.C: New test.
FEAT_CSSC is mandatory in the architecture from Armv8.9.

gcc/ChangeLog:

	* config/aarch64/aarch64-arches.def: Add CSSC to V8_9A
	dependencies.
We don't yet have a separate feature flag for FEAT_LRCPC2 (and adding
one will require extending the feature bitmask).  Instead, make the
FEAT_LRCPC2 patterns available when either armv8.4-a or +rcpc3 is
specified.  We already have a +rcpc flag, so this dependency can be
specified directly.

Also add an explicit dependency on +rcpc to the FEAT_LRCPC2 patterns, so
that they are disabled with armv8.4-a+norcpc.

The cpunative test needed updating because it used an invalid Features
list, since lrcpc3 requires both ilrcpc and lrcpc to be present.
Without this change, host_detect_local_cpu would return the architecture
string 'armv8-a+dotprod+crc+crypto+rcpc3+norcpc'.
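
The resulting condition can be sketched with plain bit flags.  This is an illustration only: the flag names, the helpers and the bit layout below are hypothetical and are not the definitions used in aarch64.h or aarch64-option-extensions.def.  The sketch assumes that armv8.4-a enables rcpc by default, that +rcpc3 pulls in +rcpc, and that +norcpc also drops features depending on rcpc.

```cpp
#include <cstdint>

// Hypothetical feature bits, for illustration only.
constexpr std::uint32_t FEAT_V8_4A = 1u << 0;
constexpr std::uint32_t FEAT_RCPC  = 1u << 1;
constexpr std::uint32_t FEAT_RCPC3 = 1u << 2;

// Assumption: the armv8.4-a architecture level includes rcpc.
constexpr std::uint32_t ARMV8_4A = FEAT_V8_4A | FEAT_RCPC;

// +rcpc3 enables its dependency +rcpc as well.
constexpr std::uint32_t with_rcpc3(std::uint32_t flags) {
  return flags | FEAT_RCPC3 | FEAT_RCPC;
}

// +norcpc removes rcpc and everything that depends on it.
constexpr std::uint32_t without_rcpc(std::uint32_t flags) {
  return flags & ~(FEAT_RCPC | FEAT_RCPC3);
}

// LRCPC2 patterns: need rcpc, plus either armv8.4-a or rcpc3.
constexpr bool has_lrcpc2_patterns(std::uint32_t flags) {
  return (flags & FEAT_RCPC) != 0
         && (flags & (FEAT_V8_4A | FEAT_RCPC3)) != 0;
}
```

Under these assumptions, has_lrcpc2_patterns(ARMV8_4A) holds, while has_lrcpc2_patterns(without_rcpc(ARMV8_4A)) does not, which matches the armv8.4-a+norcpc case described above.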

gcc/ChangeLog:

	* config/aarch64/aarch64-option-extensions.def: Add RCPC to
	RCPC3 dependencies.
	* config/aarch64/aarch64.h (AARCH64_ISA_RCPC8_4): Add test for
	RCPC3 bit.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/cpunative/info_24: Include lrcpc and ilrcpc.
	* config/abi/post/x86_64-linux-gnu/x32/baseline_symbols.txt:
	Updated.
One would expect consecutive calls to bytes_in/out::b for streaming
adjacent bits, as is done for tree flag streaming, to at least be
optimized by the compiler into individual bit operations using
statically known bit positions (and ideally combined into larger sized
reads/writes).

Unfortunately this doesn't happen because the compiler has trouble
tracking the values of this->bit_pos and this->bit_val across the
calls, likely because the compiler doesn't know the value of 'this'.
Thus for each consecutive bit stream operation, bit_pos and bit_val are
loaded from 'this', checked if buffering is needed, and finally the bit
is extracted from bit_val according to the (unknown) bit_pos, even
though relative to the previous operation (if we didn't need to buffer)
bit_val is unchanged and bit_pos is just 1 larger.  This ends up being
quite slow, with tree_node_bools taking 10% of time when streaming in
the std module.

This patch improves this by making tracking of bit_pos and bit_val
easier for the compiler.  Rather than bit_pos and bit_val being members
of the (effectively global) bytes_in/out objects, this patch factors out
the bit streaming code/state into separate classes bits_in/out that get
constructed locally as needed for bit streaming.  Since these objects
are now clearly local, the compiler can more easily track their values
and optimize away redundant buffering checks.

And since bit streaming is intended to be batched it's natural for these
new classes to be RAII-enabled such that the bit stream is flushed upon
destruction.
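
The general shape of such a locally constructed, RAII-flushed bit writer can be sketched as follows.  This is not the code in module.cc: byte_sink, bit_writer and write_flags are hypothetical names, and the 32-bit buffer and byte-wise flush are assumptions made for the example.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical byte sink, standing in for a long-lived output stream.
struct byte_sink {
  std::vector<std::uint8_t> data;
};

// Short-lived bit writer.  Its state lives in a local object, so the
// optimizer can track the bit position across consecutive b() calls.
class bit_writer {
  byte_sink &out;
  std::uint32_t val = 0;  // buffered bits
  unsigned pos = 0;       // number of bits buffered so far

public:
  explicit bit_writer(byte_sink &o) : out(o) {}
  bit_writer(const bit_writer &) = delete;
  bit_writer &operator=(const bit_writer &) = delete;
  ~bit_writer() { flush(); }  // RAII: pending bits are written on scope exit

  // Append one bit; flush once the 32-bit buffer is full.
  void b(bool x) {
    val |= std::uint32_t(x) << pos;
    if (++pos == 32)
      flush();
  }

  // Emit only the bytes that actually hold buffered bits.
  void flush() {
    for (unsigned i = 0; i < pos; i += 8)
      out.data.push_back(std::uint8_t(val >> i));
    val = 0;
    pos = 0;
  }
};

// Stream three flags as one batch and return the bytes produced.
std::vector<std::uint8_t> write_flags(bool a, bool b, bool c) {
  byte_sink sink;
  {
    bit_writer bits(sink);
    bits.b(a);
    bits.b(b);
    bits.b(c);
  }  // destructor flushes here
  return sink.data;
}
```

Because the writer is a local object whose address does not escape, a compiler has a much better chance of folding the three b() calls into a single constant-position update than when the state lives behind an unknown 'this' pointer.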

In order to make the most of this improved tracking of bit position,
this patch changes parts where we conditionally stream a tree flag
to unconditionally stream (the flag or a dummy value).  That way
the number of bits streamed and the respective bit positions are as
statically known as reasonably possible.  In lang_decl_bools and
lang_type_bools this patch makes us flush the current bit buffer at the
start so that subsequent bit positions are in turn statically known.
And in core_bools, we can add explicit early exits utilizing invariants
that the compiler can't figure out itself (e.g. a tree code can't have
both TS_TYPE_COMMON and TS_DECL_COMMON, and if a tree code doesn't have
TS_DECL_COMMON then it doesn't have TS_DECL_WITH_VIS).

This patch also moves the definitions of the relevant streaming classes
into anonymous namespaces so that the compiler can make more informed
decisions about inlining their member functions.

After this patch, compile time for a simple Hello World using the std
module is reduced by 7% with a release compiler.  The on-disk size of
the std module increases by 0.4% (presumably due to the extra flushing
done in lang_decl_bools and lang_type_bools).

The bit stream out performance isn't improved as much as the stream in
due to the spans/lengths instrumentation performed on stream out (which
maybe should be disabled for release builds?)

gcc/cp/ChangeLog:

	* module.cc: Update comment about classes defined within.
	(class data): Enclose in an anonymous namespace.
	(data::calc_crc): Moved from bytes::calc_crc.
	(class bytes): Remove.  Move bit_flush to namespace scope.
	(class bytes_in): Enclose in an anonymous namespace.  Inherit
	directly from data and adjust accordingly.  Move b and bflush
	members to bits_in.
	(class bytes_out): As above.  Remove is_set static data member.
	(bit_flush): Moved from class bytes.
	(struct bytes_in::bits_in): Define.
	(struct bytes_out::bits_out): Define.
	(bytes_in::stream_bits): Define.
	(bytes_out::stream_bits): Define.
	(bytes_out::bflush): Moved to bits_out/in.
	(bytes_in::bflush): Likewise.
	(bytes_in::bfill): Removed.
	(bytes_out::b): Moved to bits_out/in.
	(bytes_in::b): Likewise.
	(class trees_in): Enclose in an anonymous namespace.
	(class trees_out): Enclose in an anonymous namespace.
	(trees_out::core_bools): Add bits_out/in parameter and use it.
	Unconditionally stream a bit for public_flag.  Add early exits
	as appropriate.
	(trees_in::core_bools): Likewise.
	(trees_out::lang_decl_bools): Add bits_out/in parameter and use
	it.  Flush the current bit buffer at the start.  Unconditionally
	stream a bit for module_keyed_decls_p.
	(trees_in::lang_decl_bools): Likewise.
	(trees_out::lang_type_bools): Add bits_out/in parameter and use
	it.  Flush the current bit buffer at the start.
	(trees_in::lang_type_bools): Likewise.
	(trees_out::tree_node_bools): Construct a bits_out object and
	use/pass it.
	(trees_in::tree_node_bools): Likewise.
	(trees_out::decl_value): Likewise.
	(trees_in::decl_value): Likewise.
	(module_state::write_define): Likewise.
	(module_state::read_define): Likewise.

Reviewed-by: Jason Merrill <[email protected]>
gcc/cp/ChangeLog:

	* module.cc (struct bytes_in::bits_in): Define defaulted
	move ctor.
	(struct bytes_out::bits_out): Likewise.
Fixes: df7bfdb ("c++: reference cast, conversion fn [PR113141]")

A new warning option -Wcast-user-defined was added to c.opt and
documented in doc/invoke.texi.  But c.opt.urls wasn't regenerated.

gcc/c-family/ChangeLog:

	* c.opt.urls: Regenerate.
I wonder if more generally we need to be doing more work when importing
definitions from header units, especially to handle all the work that
'make_rtl_for_nonlocal_decl' and 'rest_of_decl_compilation' would have
been performing.  But this patch fixes at least one missing step.

	PR c++/106820

gcc/cp/ChangeLog:

	* module.cc (trees_in::decl_value): Assemble alias when needed.

gcc/testsuite/ChangeLog:

	* g++.dg/modules/pr106820_a.H: New test.
	* g++.dg/modules/pr106820_b.C: New test.

Signed-off-by: Nathaniel Shead <[email protected]>
A typo in r14-6978 made us emit too many things. This ensures that we
don't emit using-declarations from the GMF that we don't need to.

	PR c++/114600

gcc/cp/ChangeLog:

	* module.cc (depset::hash::add_binding_entity): Require both
	WMB_Using and WMB_Export for GMF entities.

gcc/testsuite/ChangeLog:

	* g++.dg/modules/using-14.C: New test.

Signed-off-by: Nathaniel Shead <[email protected]>
Co-authored-by: Patrick Palka <[email protected]>
…634]

The enumerator still doesn't have TREE_TYPE set but diag_attr_exclusions
assumes that all decls must have types.
I think it is better for something as unimportant as diag_attr_exclusions
to be more robust: if there is no type, it can just diagnose exclusions
on the DECL_ATTRIBUTES, just as for types it only diagnoses them on
TYPE_ATTRIBUTES.

2024-04-15  Jakub Jelinek  <[email protected]>

	PR c++/114634
	* attribs.cc (diag_attr_exclusions): Set attrs[1] to NULL_TREE for
	decls with NULL TREE_TYPE.

	* g++.dg/ext/attrib68.C: New test.
…/GNU

The new gcc.target/i386/fhardened-1.c etc. tests FAIL on Solaris/x86 and
Darwin/x86:

FAIL: gcc.target/i386/fhardened-1.c (test for excess errors)
FAIL: gcc.target/i386/fhardened-2.c (test for excess errors)

Excess errors:
cc1: warning: '-fhardened' not supported for this target

Support for -fhardened is restricted to HAVE_FHARDENED_SUPPORT in
toplev.cc (process_options), which in turn is only defined for linux*|gnu*
targets in gcc/configure.ac.

Accordingly, this patch restricts the tests to those two, as is already
done in gcc.target/i386/cf_check-6.c.

Tested on i386-pc-solaris2.11 and x86_64-pc-linux-gnu.

2024-04-15  Rainer Orth  <[email protected]>

	gcc/testsuite:
	* gcc.target/i386/fhardened-1.c: Restrict to Linux/GNU.
	* gcc.target/i386/fhardened-2.c: Likewise.
ChangeLog:

	* .github/alpine_32bit_log_warnings: Adjust line numbers.
	* .github/log_expected_warnings: Likewise.

Signed-off-by: Owen Avery <[email protected]>
@powerboat9
Contributor Author

To merge this, someone would have to manually merge 7f4ba5480e0ee5c03317d24d3fa858c0966f3464 into master, and then cherry pick 950ec4a2616c89f422da8ef34d709c7733402315

@dkm
Member

dkm commented Dec 3, 2024

I thought we already had a PR using the merge strategy opened: #3218
... and we're probably going to test the rebase strategy...
I guess your approach is something else? Maybe it's related to the few patches you've submitted directly to gcc?

@powerboat9
Contributor Author

Kind of. I'm trying a merge strategy where we fix conflicts before the merges -- my theory is that most of the merge conflicts we're having could be prevented with some targeted PRs/upstreaming attempts. This PR should just be a subset of the changes from #3218 along with some warning adjustments. Meanwhile, the patches I submitted upstream are the couple that occur before things get weird with cargo-based compilation differences between us and upstream.

@powerboat9
Contributor Author

powerboat9 commented Dec 3, 2024

It looks like carefully upstreaming ab8b4cc38806e1a7190a7426ce073951752d1a60 and 5f0db57567e147846cd0b7aa4cb2fc8bba9208a0 (the latter with a bit of conflict resolution) would fix some of the early merge conflicts.

(those commits being child and grandchild of 509c286cb0665720550cb88a2628a98d35f1b37e)

@powerboat9
Contributor Author

If I make small modifications to a patch, as part of sending it upstream, I add myself as co-author, right?

@dkm
Member

dkm commented Dec 3, 2024

Oh ok. You're doing more than "trying" as you're sending patches to the gcc mailing list 😅 . I think it would be good to discuss with @tschwinge and @CohenArthur before committing anything... If we apply some commits in upstream gcc (some with small delta), it can be quite a headache in a few weeks/months.

@powerboat9
Contributor Author

Yeah, I suppose we should probably talk about it a bit before merging the patches
