GNU bug report logs - #64058
[PATCH] wc: Fix crashes due to incomplete AVX2 enumeration

Previous Next

Package: coreutils;

Reported by: Dave Hansen <dave.hansen <at> linux.intel.com>

Date: Wed, 14 Jun 2023 02:40:02 UTC

Severity: normal

Tags: patch

Merged with 65164, 65165, 65166, 65167, 65168, 65169, 65170

Done: Pádraig Brady <P <at> draigBrady.com>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 64058 in the body.
You can then email your comments to 64058 AT debbugs.gnu.org in the normal way.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox


Report forwarded to bug-coreutils <at> gnu.org:
bug#64058; Package coreutils. (Wed, 14 Jun 2023 02:40:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to Dave Hansen <dave.hansen <at> linux.intel.com>:
New bug report received and forwarded. Copy sent to bug-coreutils <at> gnu.org. (Wed, 14 Jun 2023 02:40:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Dave Hansen <dave.hansen <at> linux.intel.com>
To: bug-coreutils <at> gnu.org
Cc: Kristoffer Brånemyr <ztion1 <at> yahoo.se>,
 Dave Hansen <dave.hansen <at> linux.intel.com>
Subject: [PATCH] wc: Fix crashes due to incomplete AVX2 enumeration
Date: Tue, 13 Jun 2023 10:26:10 -0700
The AVX2 enumeration for 'wc -l' is incomplete which may cause wc to
crash.

The Intel SDM documents the whole AVX2 enumeration sequence in its
"Detection of Intel AVX2" section.  There are three pieces:
 1. Ensuring the CPU supports AVX2 instructions
 2. Ensuring the OS has enabled XSAVE
 3. Ensuring XSAVE is managing the YMM registers (XGETBV)

The existing code does #1 and #2 but misses #3.  The kernel will
enable #3 in almost all situations where AVX2 is supported, but there
are situations where the kernel might disable XSAVE YMM support but
leave FP and SSE enabled.  This is _unusual_ today, but can occur when
Linux notices XSAVE enumeration issues or in virtual machines with
odd/buggy XSAVE enumeration.

If wc is used on one of those systems in '-l' mode, it will crash:

	# cat foo | ./wc.orig -l
	Illegal instruction (core dumped)

Fortunately, gcc and llvm provide a builtin to check for CPU support
and AVX2 is supported.  Use the builtin instead of using CPUID
directly.  This fixes the problem and is also substantially simpler.

The result is that wc no longer crashes on systems where XSAVE YMM
support is disabled:

	# cat foo | ./wc.fixed -l
	30232

I've reproduced this issue in the current upstream git repo which git
describe says is v9.3-53-g378902407.

I needed to run a patched version of Linux to expose this issue.  We
(the upstream Linux x86 maintainers) are considering merging a patch
to do this in the upstream kernel.  It would be great if a wc fix was
available.
---
 src/wc.c | 36 +-----------------------------------
 1 file changed, 1 insertion(+), 35 deletions(-)

diff --git a/src/wc.c b/src/wc.c
index becceda98..944cfc426 100644
--- a/src/wc.c
+++ b/src/wc.c
@@ -157,41 +157,7 @@ static enum total_type total_mode = total_auto;
 static bool
 avx2_supported (void)
 {
-  unsigned int eax = 0;
-  unsigned int ebx = 0;
-  unsigned int ecx = 0;
-  unsigned int edx = 0;
-  bool getcpuid_ok = false;
-  bool avx_enabled = false;
-
-  if (__get_cpuid (1, &eax, &ebx, &ecx, &edx))
-    {
-      getcpuid_ok = true;
-      if (ecx & bit_OSXSAVE)
-        avx_enabled = true;  /* Support is not disabled.  */
-    }
-
-
-  if (avx_enabled)
-    {
-      eax = ebx = ecx = edx = 0;
-      if (! __get_cpuid_count (7, 0, &eax, &ebx, &ecx, &edx))
-        getcpuid_ok = false;
-      else
-        {
-          if (! (ebx & bit_AVX2))
-            avx_enabled = false;  /* Hardware doesn't support it.  */
-        }
-    }
-
-
-  if (! getcpuid_ok)
-    {
-      if (debug)
-        error (0, 0, "%s", _("failed to get cpuid"));
-      return false;
-    }
-  else if (! avx_enabled)
+  if (!__builtin_cpu_supports("avx2"))
     {
       if (debug)
         error (0, 0, "%s", _("avx2 support not detected"));
-- 
2.34.1





Information forwarded to bug-coreutils <at> gnu.org:
bug#64058; Package coreutils. (Wed, 14 Jun 2023 04:16:02 GMT) Full text and rfc822 format available.

Message #8 received at 64058 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Dave Hansen <dave.hansen <at> linux.intel.com>
Cc: Kristoffer Brånemyr <ztion1 <at> yahoo.se>,
 64058 <at> debbugs.gnu.org
Subject: Re: bug#64058: [PATCH] wc: Fix crashes due to incomplete AVX2
 enumeration
Date: Tue, 13 Jun 2023 21:14:55 -0700
[Message part 1 (text/plain, inline)]
Thanks for the bug report. I installed the attached patch into coreutils 
on Savannah. It builds on your idea with several other changes:

* There's a similar issue with cksum.c and pclmul.

* configure.ac can be simplified, since it seems there's no point 
compiling these instructions if __builtin_cpu_supports doesn't work.

* This lets us simplify the source code a bit more.

Please let me know if the attached patch works for you.

PS. Does the attached cksum.c / pclmul change fix any user-visible 
misbehavior? If so, what should we put into the NEWS file?
[0001-wc-port-to-kernels-that-disable-XSAVE-YMM.patch (text/x-patch, attachment)]

Information forwarded to bug-coreutils <at> gnu.org:
bug#64058; Package coreutils. (Wed, 14 Jun 2023 10:48:01 GMT) Full text and rfc822 format available.

Message #11 received at 64058 <at> debbugs.gnu.org (full text, mbox):

From: Pádraig Brady <P <at> draigBrady.com>
To: Paul Eggert <eggert <at> cs.ucla.edu>,
 Dave Hansen <dave.hansen <at> linux.intel.com>, Axel Beckert <abe <at> debian.org>
Cc: Kristoffer Brånemyr <ztion1 <at> yahoo.se>,
 64058 <at> debbugs.gnu.org
Subject: Re: bug#64058: [PATCH] wc: Fix crashes due to incomplete AVX2
 enumeration
Date: Wed, 14 Jun 2023 11:46:58 +0100
[Message part 1 (text/plain, inline)]
On 14/06/2023 05:14, Paul Eggert wrote:
> Thanks for the bug report. I installed the attached patch into coreutils
> on Savannah. It builds on your idea with several other changes:
> 
> * There's a similar issue with cksum.c and pclmul.
> 
> * configure.ac can be simplified, since it seems there's no point
> compiling these instructions if __builtin_cpu_supports doesn't work.
> 
> * This lets us simplify the source code a bit more.
> 
> Please let me know if the attached patch works for you.

__builtin_cpu_supports() looks to have sufficient support in
Arch + compiler versions for our needs, so that's good.

Paul you removed the "avx" check from cksum.c. Was that intended?

> PS. Does the attached cksum.c / pclmul change fix any user-visible
> misbehavior? If so, what should we put into the NEWS file?

We have an illegal instruction issue with cksum under Xen DomU
which may be related, as discussed at: https://bugs.debian.org/1037264

Axel does the attached patch change anything for you?

thanks,
Pádraig
[cksum-xsave.patch (text/x-patch, attachment)]

Information forwarded to bug-coreutils <at> gnu.org:
bug#64058; Package coreutils. (Wed, 14 Jun 2023 16:45:02 GMT) Full text and rfc822 format available.

Message #14 received at 64058 <at> debbugs.gnu.org (full text, mbox):

From: Axel Beckert <abe <at> debian.org>
To: Pádraig Brady <P <at> draigBrady.com>
Cc: Kristoffer Brånemyr <ztion1 <at> yahoo.se>,
 Paul Eggert <eggert <at> cs.ucla.edu>, Dave Hansen <dave.hansen <at> linux.intel.com>,
 1037264 <at> bugs.debian.org, 64058 <at> debbugs.gnu.org
Subject: Re: bug#64058: [PATCH] wc: Fix crashes due to incomplete AVX2
 enumeration
Date: Wed, 14 Jun 2023 14:55:43 +0200
Control: tag 1037264 + patch

Hi Pádraig,

On Wed, Jun 14, 2023 at 11:46:58AM +0100, Pádraig Brady wrote:
> On 14/06/2023 05:14, Paul Eggert wrote:
> > Thanks for the bug report. I installed the attached patch into coreutils
> > on Savannah. It builds on your idea with several other changes:
> > 
> > * There's a similar issue with cksum.c and pclmul.
> > 
> > * configure.ac can be simplified, since it seems there's no point
> > compiling these instructions if __builtin_cpu_supports doesn't work.
> > 
> > * This lets us simplify the source code a bit more.
> > 
> > Please let me know if the attached patch works for you.
> 
> __builtin_cpu_supports() looks to have sufficient support in
> Arch + compiler versions for our needs, so that's good.
> 
> Paul you removed the "avx" check from cksum.c. Was that intended?
> 
> > PS. Does the attached cksum.c / pclmul change fix any user-visible
> > misbehavior? If so, what should we put into the NEWS file?
> 
> We have an illegal instruction issue with cksum under Xen DomU
> which may be related, as discussed at: https://bugs.debian.org/1037264
> 
> Axel does the attached patch change anything for you?

Yes! This patch helps! Thanks a lot! :-)

Cc'ing the Debian bug report again (and doing a nearly fullquote for
that) and tagging Debian's bug report as containing a patch. (Given
it's high bug report number it should do any harm in GNU's BTS even if
it uses the same control syntax…)

> diff --git a/src/cksum.c b/src/cksum.c
> index 85afab0ac..881d90413 100644
> --- a/src/cksum.c
> +++ b/src/cksum.c
> @@ -160,29 +160,16 @@ static bool
>  pclmul_supported (void)
>  {
>  # if USE_PCLMUL_CRC32
> -  unsigned int eax = 0;
> -  unsigned int ebx = 0;
> -  unsigned int ecx = 0;
> -  unsigned int edx = 0;
> -
> -  if (! __get_cpuid (1, &eax, &ebx, &ecx, &edx))
> -    {
> -      if (cksum_debug)
> -        error (0, 0, "%s", _("failed to get cpuid"));
> -      return false;
> -    }
> -
> -  if (! (ecx & bit_PCLMUL) || ! (ecx & bit_AVX))
> -    {
> -      if (cksum_debug)
> -        error (0, 0, "%s", _("pclmul support not detected"));
> -      return false;
> -    }
> +  bool pclmul_enabled = 0 < __builtin_cpu_supports ("pclmul")
> +                        && 0 < __builtin_cpu_supports ("avx");
>  
>    if (cksum_debug)
> -    error (0, 0, "%s", _("using pclmul hardware support"));
> +    error (0, 0, "%s",
> +           (pclmul_enabled
> +            ? _("using pclmul hardware support")
> +            : _("pclmul support not detected")));
>  
> -  return true;
> +  return pclmul_enabled;
>  # else
>    if (cksum_debug)
>      error (0, 0, "%s", _("using generic hardware support"));

		Regards, Axel
-- 
 ,''`.  |  Axel Beckert <abe <at> debian.org>, https://people.debian.org/~abe/
: :' :  |  Debian Developer, ftp.ch.debian.org Admin
`. `'   |  4096R: 2517 B724 C5F6 CA99 5329  6E61 2FF9 CD59 6126 16B5
  `-    |  1024D: F067 EA27 26B9 C3FC 1486  202E C09E 1D89 9593 0EDE




Information forwarded to bug-coreutils <at> gnu.org:
bug#64058; Package coreutils. (Wed, 14 Jun 2023 22:06:01 GMT) Full text and rfc822 format available.

Message #17 received at 64058 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Pádraig Brady <P <at> draigBrady.com>,
 Dave Hansen <dave.hansen <at> linux.intel.com>, Axel Beckert <abe <at> debian.org>
Cc: Kristoffer Brånemyr <ztion1 <at> yahoo.se>,
 64058 <at> debbugs.gnu.org
Subject: Re: bug#64058: [PATCH] wc: Fix crashes due to incomplete AVX2
 enumeration
Date: Wed, 14 Jun 2023 15:04:57 -0700
[Message part 1 (text/plain, inline)]
On 6/14/23 03:46, Pádraig Brady wrote:

> Paul you removed the "avx" check from cksum.c. Was that intended?

No, it's a typo I introduced. Thanks for catching that. Fixed in the 
first attached patch.

While looking into this I noticed a couple of other cleanups, fixed in 
the other attached patches.

I installed these into coreutils on Savannah.
[0001-cksum-fix-bug-in-check-for-cksum_pclmul.patch (text/x-patch, attachment)]
[0002-cksum-wc-don-t-include-cpuid.h.patch (text/x-patch, attachment)]
[0003-cksum-wc-clean-up-hw-capability-checking.patch (text/x-patch, attachment)]

Reply sent to Pádraig Brady <P <at> draigBrady.com>:
You have taken responsibility. (Thu, 15 Jun 2023 00:23:02 GMT) Full text and rfc822 format available.

Notification sent to Dave Hansen <dave.hansen <at> linux.intel.com>:
bug acknowledged by developer. (Thu, 15 Jun 2023 00:23:02 GMT) Full text and rfc822 format available.

Message #22 received at 64058-done <at> debbugs.gnu.org (full text, mbox):

From: Pádraig Brady <P <at> draigBrady.com>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: 64058-done <at> debbugs.gnu.org
Subject: Re: bug#64058: [PATCH] wc: Fix crashes due to incomplete AVX2
 enumeration
Date: Thu, 15 Jun 2023 01:22:07 +0100
[Message part 1 (text/plain, inline)]
On 14/06/2023 23:04, Paul Eggert wrote:
> On 6/14/23 03:46, Pádraig Brady wrote:
> 
>> Paul you removed the "avx" check from cksum.c. Was that intended?
> 
> No, it's a typo I introduced. Thanks for catching that. Fixed in the
> first attached patch.
> 
> While looking into this I noticed a couple of other cleanups, fixed in
> the other attached patches.
> 
> I installed these into coreutils on Savannah.

Changes look good.
I pushed the attached to explicitly document the cksum fix,
and cleanup a syntax-check failure.

Marking this as done.

thanks,
Pádraig
[cksum-sigill-news.patch (text/x-patch, attachment)]

Information forwarded to bug-coreutils <at> gnu.org:
bug#64058; Package coreutils. (Fri, 16 Jun 2023 05:57:05 GMT) Full text and rfc822 format available.

Message #25 received at 64058 <at> debbugs.gnu.org (full text, mbox):

From: Dave Hansen <dave.hansen <at> intel.com>
To: Paul Eggert <eggert <at> cs.ucla.edu>, Dave Hansen <dave.hansen <at> linux.intel.com>
Cc: Kristoffer Brånemyr <ztion1 <at> yahoo.se>,
 64058 <at> debbugs.gnu.org
Subject: Re: bug#64058: [PATCH] wc: Fix crashes due to incomplete AVX2
 enumeration
Date: Thu, 15 Jun 2023 09:05:07 -0700
On 6/13/23 21:14, Paul Eggert wrote:
> PS. Does the attached cksum.c / pclmul change fix any user-visible
> misbehavior? If so, what should we put into the NEWS file?

Yes, this patch also works for me.  Thanks!

Your news blurb from that patch looks fine to me.  The only tweak I
would suggest is adding some text that a user might actually see if they
hit this issue, like:

 'wc -l' no longer crashes with "Illegal instruction" messages on x86
 Linux kernels that disable XSAVE YMM. [bug introduced in coreutils-9.0]





bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Fri, 14 Jul 2023 11:24:07 GMT) Full text and rfc822 format available.

bug unarchived. Request was from Paul Eggert <eggert <at> cs.ucla.edu> to control <at> debbugs.gnu.org. (Wed, 09 Aug 2023 22:47:01 GMT) Full text and rfc822 format available.

Merged 64058 65164 65165 65166 65167 65168 65169 65170. Request was from Paul Eggert <eggert <at> cs.ucla.edu> to control <at> debbugs.gnu.org. (Wed, 09 Aug 2023 22:47:03 GMT) Full text and rfc822 format available.

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Thu, 07 Sep 2023 11:24:07 GMT) Full text and rfc822 format available.

This bug report was last modified 1 year and 287 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.