From unknown Fri Sep 05 19:41:18 2025
Subject: bug#62267: grep-3.9 bug: \d matches multibyte digits
From: Jim Meyering
To: 62267@debbugs.gnu.org
X-Debbugs-Original-To: bug-grep@gnu.org
Date: Sat, 18 Mar 2023 17:06:37 -0700
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="

--=-=-=
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit

I was not happy to discover that with grep-3.9 and -P, \d can match
multibyte digits like the Arabic ones:

  $ LC_ALL=en_US.UTF-8 grep -Po '\d+' <<< '٠١٢٣٤٥٦٧٨٩'
  ٠١٢٣٤٥٦٧٨٩

grep -P has never before done that.
Of course, in the C/POSIX locale, there is no such match:

  $ LC_ALL=C grep -Po '\d+' <<< '٠١٢٣٤٥٦٧٨٩'
  [1]

TL;DR, with the attached fix, grep preprocesses each affected regexp,
changing each eligible "\d" to "[0-9]".
Consider this a short-term fix.
Longer term (subject to pcre2 releases), we may instead simply add
a "(?aD)" prefix.

If you really want to match non-ASCII digits, use \p{Nd}.

For background, see the PCRE2 documentation:
  https://www.pcre.org/current/doc/html/pcre2pattern.html
  https://www.pcre.org/current/doc/html/pcre2syntax.html
which say this:

    By default, \d, \s, and \w match only ASCII characters, even in
    UTF-8 mode or in the 16-bit and 32-bit libraries. However, if
    locale-specific matching is happening, \s and \w may also match
    characters with code points in the range 128-255. If the PCRE2_UCP
    option is set, the behaviour of these escape sequences is changed to
    use Unicode properties and they match many more characters.

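The facts about the sample string can be checked without a PCRE2 build. The following is a minimal sketch in Python, offered only as an illustration: Python's `re` is a different engine from PCRE2, so it shows the analogous split between Unicode-aware and ASCII-only \d, not grep's behavior. The variable names are made up for this example.

```python
import re
import unicodedata

# The ten ARABIC-INDIC digits U+0660..U+0669 used in the examples above.
arabic_digits = "".join(chr(0x0660 + i) for i in range(10))

# Ten characters, each encoded as two bytes in UTF-8.
n_chars = len(arabic_digits)
n_bytes = len(arabic_digits.encode("utf-8"))

# Every one of them has Unicode general category Nd (decimal number),
# the property that \p{Nd} selects.
categories = {unicodedata.category(c) for c in arabic_digits}

# Python's default (Unicode) \d accepts them; ASCII-only \d does not.
unicode_match = re.fullmatch(r"\d+", arabic_digits)
ascii_match = re.fullmatch(r"\d+", arabic_digits, flags=re.ASCII)

print(n_chars, n_bytes, categories, bool(unicode_match), bool(ascii_match))
# prints: 10 20 {'Nd'} True False
```
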
Per upstream pcre2-10.40-112-g6277357, (?aD) does what we want:

    PCRE2_EXTRA_ASCII_BSD: This option forces \d to match only ASCII
    digits, even when PCRE2_UCP is set. It can be changed within a
    pattern by means of the (?aD) option setting.

I used pcre2grep (built from master) to demonstrate how we may
eventually use "(?aD)" under the covers:

  $ LC_ALL=en_US.UTF-8 ./pcre2grep --color -u '(?aD)\d' <<< '٠١٢٣٤٥٦٧٨٩'
  [Exit 1]
  $ LC_ALL=en_US.UTF-8 ./pcre2grep --color -u '(?aD)^\d+$' <<< '٠١٢٣٤٥٦٧٨٩'
  ٠١٢٣٤٥٦٧٨٩

For the record, https://github.com/PCRE2Project/pcre2 currently
declares 10.42 to be the latest, while there's a commit suggesting
it's 10.43. The difference is important: 10.43 has support for
(?aD), while 10.42 does not.

Incidentally, you can demonstrate this in python3, too:

  $ LC_ALL=en_US.UTF-8 python3 \
      -c "import re; print(re.match(r'\d+', '٠١٢٣٤٥٦٧٨٩'))"
  <re.Match object; span=(0, 10), match='٠١٢٣٤٥٦٧٨٩'>

Use flags=re.ASCII to get the often-desired behavior:

  $ LC_ALL=en_US.UTF-8 python3 \
      -c "import re; print(re.match(r'\d+', '٠١٢٣٤٥٦٧٨٩', flags=re.ASCII))"
  None

This is cause for a new snapshot today and soon thereafter,
the release of grep-3.10.

--=-=-=
Content-Type: text/x-patch; charset=utf-8
Content-Disposition: inline; filename=grep-multibyte-digits.patch
Content-Transfer-Encoding: 8bit

>From 0daefc8c5659e79149a650d97ca12b49ad5e6548 Mon Sep 17 00:00:00 2001
From: Jim Meyering
Date: Sat, 18 Mar 2023 08:28:36 -0700
Subject: [PATCH] grep: -P (--perl-regexp) \d: match only ASCII digits

Prior to grep-3.9, the PCRE matcher had always treated \d just
like [0-9]. grep-3.9's fix for \w and \b mistakenly relaxed \d
to also match multibyte digits.
* src/grep.c (P_MATCHER_INDEX): Define enum.
(pcre_pattern_expand_backslash_d): New function.
(main): Call it for -P.
* NEWS (Bug fixes): Mention it.
* doc/grep.texi: Document it: with -P, \d matches only ASCII digits.
Provide a PCRE documentation URL and an example of how
to use (?s) with -z.
* tests/pcre-ascii-digits: New test.
* tests/Makefile.am (TESTS): Add that file name.
---
 NEWS                    | 10 +++++
 doc/grep.texi           | 31 ++++++++++++++++
 src/grep.c              | 82 ++++++++++++++++++++++++++++++++++++++++-
 tests/Makefile.am       |  1 +
 tests/pcre-ascii-digits | 31 ++++++++++++++++
 5 files changed, 154 insertions(+), 1 deletion(-)
 create mode 100755 tests/pcre-ascii-digits

diff --git a/NEWS b/NEWS
index 803e14b..a24cebd 100644
--- a/NEWS
+++ b/NEWS
@@ -2,6 +2,16 @@ GNU grep NEWS -*- outline -*-
 
 * Noteworthy changes in release ?.? (????-??-??) [?]
 
+** Bug fixes
+
+  With -P, \d now matches only ASCII digits, regardless of PCRE
+  options/modes. The changes in grep-3.9 to make \b and \w work
+  properly had the undesirable side effect of making \d also match
+  e.g., the Arabic digits: ٠١٢٣٤٥٦٧٨٩.  With grep-3.9, -P '\d+'
+  would match that ten-digit (20-byte) string. Now, to match such
+  a digit, you would use \p{Nd}.
+  [bug introduced in grep 3.9]
+
 
 * Noteworthy changes in release 3.9 (2023-03-05) [stable]
 
diff --git a/doc/grep.texi b/doc/grep.texi
index 621beaf..eaad6e1 100644
--- a/doc/grep.texi
+++ b/doc/grep.texi
@@ -1141,6 +1141,37 @@ combined with the @option{-z} (@option{--null-data}) option, and note that
 @samp{grep@ -P} may warn of unimplemented features.
 @xref{Other Options}.
 
+For documentation, refer to @url{https://www.pcre.org/}, with these caveats:
+@itemize
+@item
+@samp{\d} always matches only the ten ASCII digits, regardless of locale or
+in-regexp directives like @samp{(?aD)}.
+Use @samp{\p@{Nd@}} if you require to match non-ASCII digits.
+Once pcre2 support for @samp{(?aD)} is widespread enough,
+we expect to make that the default, so it will be overridable.
+@c Using pcre2 git commit pcre2-10.40-112-g6277357, this demonstrates how
+@c we'll prefix with (?aD) to make \d's ASCII-only behavior the default:
+@c $ LC_ALL=en_US.UTF-8 ./pcre2grep -u '(?aD)^\d+' <<< '٠١٢٣٤٥٦٧٨٩'
+@c [Exit 1]
+@c $ LC_ALL=en_US.UTF-8 ./pcre2grep -u '^\d+' <<< '٠١٢٣٤٥٦٧٨٩'
+@c ٠١٢٣٤٥٦٧٨٩
+
+@item
+By default, @command{grep} applies each regexp to a line at a time,
+so the @samp{(?s)} directive (making @samp{.} match line breaks)
+is generally ineffective.
+However, with @option{-z} (@option{--null-data}) it can work:
+@example
+$ printf 'a\nb\n' |grep -zP '(?s)a.b'
+a
+b
+@end example
+But beware: with the @option{-z} (@option{--null-data}) and a file
+containing no NUL byte, grep must read the entire file into memory
+before processing any of it.
+Thus, it will exhaust memory and fail for some large files.
+@end itemize
+
 @end table
 
 
diff --git a/src/grep.c b/src/grep.c
index 7547b64..6ba881e 100644
--- a/src/grep.c
+++ b/src/grep.c
@@ -2089,7 +2089,8 @@ static struct
 #endif
 };
 /* Keep these in sync with the 'matchers' table.  */
-enum { E_MATCHER_INDEX = 1, F_MATCHER_INDEX = 2, G_MATCHER_INDEX = 0 };
+enum { E_MATCHER_INDEX = 1, F_MATCHER_INDEX = 2, G_MATCHER_INDEX = 0,
+       P_MATCHER_INDEX = 6 };
 
 /* Return the index of the matcher corresponding to M if available.
    MATCHER is the index of the previous matcher, or -1 if none.
@@ -2378,6 +2379,80 @@ fgrep_to_grep_pattern (char **keys_p, idx_t *len_p)
   *len_p = p - new_keys;
 }
 
+/* Replace each \d in *KEYS_P with [0-9], to ensure that \d matches only ASCII
+   digits.  Now that we enable PCRE2_UCP for pcre regexps, \d would otherwise
+   match non-ASCII digits in some locales.  Use \p{Nd} if you require to match
+   those.  */
+static void
+pcre_pattern_expand_backslash_d (char **keys_p, idx_t *len_p)
+{
+  idx_t len = *len_p;
+  char *keys = *keys_p;
+  mbstate_t mb_state = { 0 };
+  char *new_keys = xnmalloc (len / 2 + 1, 5);
+  char *p = new_keys;
+  bool prev_backslash = false;
+
+  for (ptrdiff_t n; len; keys += n, len -= n)
+    {
+      n = mb_clen (keys, len, &mb_state);
+      switch (n)
+        {
+        case -2:
+          n = len;
+          FALLTHROUGH;
+        default:
+          if (prev_backslash)
+            {
+              prev_backslash = false;
+              *p++ = '\\';
+            }
+          p = mempcpy (p, keys, n);
+          break;
+
+        case -1:
+          if (prev_backslash)
+            {
+              prev_backslash = false;
+              *p++ = '\\';
+            }
+          memset (&mb_state, 0, sizeof mb_state);
+          n = 1;
+          FALLTHROUGH;
+        case 1:
+          if (prev_backslash)
+            {
+              prev_backslash = false;
+              switch (*keys)
+                {
+                case 'd':
+                  p = mempcpy (p, "[0-9]", 5);
+                  break;
+                default:
+                  *p++ = '\\';
+                  *p++ = *keys;
+                  break;
+                }
+            }
+          else
+            {
+              if (*keys == '\\')
+                prev_backslash = true;
+              else
+                *p++ = *keys;
+            }
+          break;
+        }
+    }
+
+  if (prev_backslash)
+    *p++ = '\\';
+  *p = '\n';
+  free (*keys_p);
+  *keys_p = new_keys;
+  *len_p = p - new_keys;
+}
+
 /* If it is easy, convert the MATCHER-style patterns KEYS (of size
    *LEN_P) to -F style, update *LEN_P to a possibly-smaller value, and
    return F_MATCHER_INDEX.  If not, leave KEYS and *LEN_P alone and
@@ -2970,6 +3045,11 @@ main (int argc, char **argv)
       matcher = try_fgrep_pattern (matcher, keys, &keycc);
     }
 
+  /* If -P, replace each \d with [0-9].
+     Those who want to match non-ASCII digits must use \p{Nd}.  */
+  if (matcher == P_MATCHER_INDEX)
+    pcre_pattern_expand_backslash_d (&keys, &keycc);
+
   execute = matchers[matcher].execute;
   compiled_pattern =
     matchers[matcher].compile (keys, keycc, matchers[matcher].syntax,
diff --git a/tests/Makefile.am b/tests/Makefile.am
index a47cf5c..f195c8d 100644
--- a/tests/Makefile.am
+++ b/tests/Makefile.am
@@ -139,6 +139,7 @@ TESTS = \
   options \
   pcre \
   pcre-abort \
+  pcre-ascii-digits \
   pcre-context \
   pcre-count \
   pcre-infloop \
diff --git a/tests/pcre-ascii-digits b/tests/pcre-ascii-digits
new file mode 100755
index 0000000..ae713f7
--- /dev/null
+++ b/tests/pcre-ascii-digits
@@ -0,0 +1,31 @@
+#!/bin/sh
+# Ensure that grep -P's \d matches only the 10 ASCII digits.
+# With, grep-3.9, \d would match e.g., the multibyte Arabic digits.
+#
+# Copyright (C) 2023 Free Software Foundation, Inc.
+#
+# Copying and distribution of this file, with or without modification,
+# are permitted in any medium without royalty provided the copyright
+# notice and this notice are preserved.
+
+. "${srcdir=.}/init.sh"; path_prepend_ ../src
+require_en_utf8_locale_
+LC_ALL=en_US.UTF-8
+export LC_ALL
+require_pcre_
+
+echo . | grep -qP '(*UTF).' 2>/dev/null \
+  || skip_ 'PCRE unicode support is compiled out'
+
+fail=0
+
+# $ printf %s ٠١٢٣٤٥٦٧٨٩|od -An -to1 -w10 |sed 's/ /\\/g'; : arabic digits
+# \331\240\331\241\331\242\331\243\331\244
+# \331\245\331\246\331\247\331\250\331\251
+printf '\331\240\331\241\331\242\331\243\331\244' > in || framework_failure_
+printf '\331\245\331\246\331\247\331\250\331\251' >> in || framework_failure_
+
+grep -P '\d+' in > out && fail=1
+compare /dev/null out || fail=1
+
+Exit $fail
-- 
2.40.0.rc2

--=-=-=--

From unknown Fri Sep 05 19:41:18 2025
Subject: bug#62267: grep-3.9 bug: \d matches multibyte digits
From: Paul Eggert
Organization: UCLA Computer Science Department
To: Jim Meyering
Cc: 62267@debbugs.gnu.org
Date: Sat, 18 Mar 2023 17:39:11 -0700
Message-ID: <870af32d-5cdf-65d1-4cbc-9988c57b8e37@cs.ucla.edu>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit

Thanks for looking into this. A couple of questions.

First, some documentation issues. Why is PCRE2 incompatible with Perl on
this issue? Are there other areas where the two are incompatible? Are
these incompatibilities documented anywhere?
Is the goal for 'grep -P' to be compatible with Perl, not with PCRE2?

Second, although that patch focuses on \d, doesn't \D have a similar
problem and shouldn't it be fixed too?

(OK, that was more than two questions. :-)

From unknown Fri Sep 05 19:41:18 2025
Subject: bug#62267: grep-3.9 bug: \d matches multibyte digits
From: Jim Meyering
To: Paul Eggert
Cc: 62267@debbugs.gnu.org
Date: Sat, 18 Mar 2023 22:54:42 -0700
References: <870af32d-5cdf-65d1-4cbc-9988c57b8e37@cs.ucla.edu>
In-Reply-To: <870af32d-5cdf-65d1-4cbc-9988c57b8e37@cs.ucla.edu>
MIME-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 8bit

On Sat, Mar 18, 2023 at 5:39 PM Paul Eggert wrote:
> Thanks for looking into this. A couple of questions.
>
> First, some documentation issues. Why is PCRE2 incompatible with Perl on
> this issue? Are there other areas where the two are incompatible?

To be honest, I was not too concerned about keeping up with Perl
and am not worried about divergence, but admit I do not like the
implication, given the name of the option: --perl-regexp. It's always
been "pcre-regexp" in spirit. I suppose we'll want to document that,
eventually.

> Are
> these incompatibilities documented anywhere? Is the goal for 'grep -P'
> to be compatible with Perl, not with PCRE2?

Doesn't Perl have the same issue?
That's why the /a and /aa match modifiers were added.

> Second, although that patch focuses on \d, doesn't \D have a similar
> problem and shouldn't it be fixed too?

Good point about \D. Will adjust.

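To make the idea under discussion concrete, here is a minimal sketch in Python of a rewriting pass that maps \d to [0-9] and \D to [^0-9]. It is not the patch's C code: the function name is borrowed for readability, the sketch works byte by byte (safe for UTF-8, where the backslash byte never occurs inside a multibyte sequence, but not for every encoding, which is why the C version steps with mb_clen), and it makes no attempt to handle \d inside a bracket expression or other PCRE quoting constructs.

```python
def expand_backslash_d(pattern: bytes) -> bytes:
    """Rewrite \\d to [0-9] and \\D to [^0-9]; copy everything else."""
    out = bytearray()
    i = 0
    while i < len(pattern):
        ch = pattern[i:i + 1]
        if ch == b"\\" and i + 1 < len(pattern):
            nxt = pattern[i + 1:i + 2]
            if nxt == b"d":
                out += b"[0-9]"
            elif nxt == b"D":
                out += b"[^0-9]"
            else:
                # Any other escape, including an escaped backslash, is copied
                # as a pair so its second byte never starts a new escape.
                out += ch + nxt
            i += 2
        else:
            # Ordinary byte, or a lone trailing backslash: copy unchanged.
            out += ch
            i += 1
    return bytes(out)


print(expand_backslash_d(rb"\d+\D"))
# prints: b'[0-9]+[^0-9]'
```

Because the longest replacement turns two input bytes into six, the output is never more than three times the length of the input in this sketch.
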
From unknown Fri Sep 05 19:41:18 2025
Subject: bug#62267: grep-3.9 bug: \d matches multibyte digits
From: Jim Meyering
To: Paul Eggert
Cc: 62267@debbugs.gnu.org
Date: Sat, 18 Mar 2023 23:33:33 -0700
References: <870af32d-5cdf-65d1-4cbc-9988c57b8e37@cs.ucla.edu>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="000000000000078b6d05f73afea6"

--000000000000078b6d05f73afea6
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 8bit

On Sat, Mar 18, 2023 at 10:54 PM Jim Meyering wrote:
> On Sat, Mar 18, 2023 at 5:39 PM Paul Eggert wrote:
> > Thanks for looking into this. A couple of questions.
> >
> > First, some documentation issues. Why is PCRE2 incompatible with Perl on
> > this issue? Are there other areas where the two are incompatible?
>
> To be honest, I was not too concerned about keeping up with Perl
> and am not worried about divergence, but admit I do not like the
> implication, given the name of the option: --perl-regexp. It's always
> been "pcre-regexp" in spirit. I suppose we'll want to document that,
> eventually.
>
> > Are
> > these incompatibilities documented anywhere? Is the goal for 'grep -P'
> > to be compatible with Perl, not with PCRE2?
>
> Doesn't Perl have the same issue?
> That's why the /a and /aa match modifiers were added.
>
> > Second, although that patch focuses on \d, doesn't \D have a similar
> > problem and shouldn't it be fixed too?
>
> Good point about \D. Will adjust.

Here's an additional patch to handle \D.
I've only just written it, so it's probably wrong or incomplete somewhere. I'll review it properly and probably improve it (could certainly add more tests in this area) tomorrow. By the way, have you ever used \D? I think I have not. --000000000000078b6d05f73afea6 Content-Type: application/octet-stream; name="grep-multibyte-D.patch" Content-Disposition: attachment; filename="grep-multibyte-D.patch" Content-Transfer-Encoding: base64 Content-ID: X-Attachment-Id: f_lff0tvk00 RnJvbSAxYjA5MTE2NWQxZWQyZDFlZDllNTc1YmZhYjRmMWIxODA4YTg1ZjA0IE1vbiBTZXAgMTcg MDA6MDA6MDAgMjAwMQpGcm9tOiBKaW0gTWV5ZXJpbmcgPG1leWVyaW5nQGZiLmNvbT4KRGF0ZTog U2F0LCAxOCBNYXIgMjAyMyAyMzoyNTowMyAtMDcwMApTdWJqZWN0OiBbUEFUQ0hdIGdyZXA6IC1Q ICgtLXBlcmwtcmVnZXhwKSBcRCBvbmNlIGFnYWluIHdvcmtzIGxpZSBbXjAtOV0KCiogTkVXUzog TWVudGlvbiBcRCwgdG9vLgoqIGRvYy9ncmVwLnRleGk6IExpa2V3aXNlCiogc3JjL2dyZXAuYyAo cGNyZV9wYXR0ZXJuX2V4cGFuZF9iYWNrc2xhc2hfZCk6IEhhbmRsZSBcRC4KKiB0ZXN0cy9wY3Jl LWFzY2lpLWRpZ2l0czogVGVzdCBcRCwgdG9vLiBBZGQgY29tbWVudHMuClRpZ2h0ZW4gb25lIHRl c3QgYnkgdXNpbmcgcmV0dXJuc18gMS4KUmVwb3J0ZWQgYnkgUGF1bCBFZ2dlcnQgaW4gaHR0cHM6 Ly9idWdzLmdudS5vcmcvNjIyNjcjOAotLS0KIE5FV1MgICAgICAgICAgICAgICAgICAgIHwgMiAr LQogZG9jL2dyZXAudGV4aSAgICAgICAgICAgfCAxICsKIHNyYy9ncmVwLmMgICAgICAgICAgICAg IHwgNyArKysrKy0tCiB0ZXN0cy9wY3JlLWFzY2lpLWRpZ2l0cyB8IDkgKysrKysrKystCiA0IGZp bGVzIGNoYW5nZWQsIDE1IGluc2VydGlvbnMoKyksIDQgZGVsZXRpb25zKC0pCgpkaWZmIC0tZ2l0 IGEvTkVXUyBiL05FV1MKaW5kZXggYTI0Y2ViZC4uNmY3N2QxNiAxMDA2NDQKLS0tIGEvTkVXUwor KysgYi9ORVdTCkBAIC05LDcgKzksNyBAQCBHTlUgZ3JlcCBORVdTICAgICAgICAgICAgICAgICAg ICAgICAgICAgICAgICAgICAgLSotIG91dGxpbmUgLSotCiAgIHByb3Blcmx5IGhhZCB0aGUgdW5k ZXNpcmFibGUgc2lkZSBlZmZlY3Qgb2YgbWFraW5nIFxkIGFsc28gbWF0Y2gKICAgZS5nLiwgdGhl IEFyYWJpYyBkaWdpdHM6INmg2aHZotmj2aTZpdmm2afZqNmpLiAgV2l0aCBncmVwLTMuOSwgLVAg J1xkKycKICAgd291bGQgbWF0Y2ggdGhhdCB0ZW4tZGlnaXQgKDIwLWJ5dGUpIHN0cmluZy4gTm93 LCB0byBtYXRjaCBzdWNoCi0gIGEgZGlnaXQsIHlvdSB3b3VsZCB1c2UgXHB7TmR9LgorICBhIGRp 
Z2l0LCB5b3Ugd291bGQgdXNlIFxwe05kfS4gU2ltaWxhcmx5LCBcRCBpcyBub3cgbWFwcGVkIHRv IFteMC05XS4KICAgW2J1ZyBpbnRyb2R1Y2VkIGluIGdyZXAgMy45XQoKCmRpZmYgLS1naXQgYS9k b2MvZ3JlcC50ZXhpIGIvZG9jL2dyZXAudGV4aQppbmRleCBlYWFkNmUxLi5hZDAzNGYxIDEwMDY0 NAotLS0gYS9kb2MvZ3JlcC50ZXhpCisrKyBiL2RvYy9ncmVwLnRleGkKQEAgLTExNDksNiArMTE0 OSw3IEBAIGluLXJlZ2V4cCBkaXJlY3RpdmVzIGxpa2UgQHNhbXB7KD9hRCl9LgogVXNlIEBzYW1w e1xwQHtOZEB9fSBpZiB5b3UgcmVxdWlyZSB0byBtYXRjaCBub24tQVNDSUkgZGlnaXRzLgogT25j ZSBwY3JlMiBzdXBwb3J0IGZvciBAc2FtcHsoP2FEKX0gaXMgd2lkZXNwcmVhZCBlbm91Z2gsCiB3 ZSBleHBlY3QgdG8gbWFrZSB0aGF0IHRoZSBkZWZhdWx0LCBzbyBpdCB3aWxsIGJlIG92ZXJyaWRh YmxlLgorU2ltaWxhcmx5LCBAc2FtcHtcRH0gbWF0Y2hlcyBhbnl0aGluZyBidXQgdGhvc2UgdGVu IEFTQ0lJIGRpZ2l0cy4KIEBjIFVzaW5nIHBjcmUyIGdpdCBjb21taXQgcGNyZTItMTAuNDAtMTEy LWc2Mjc3MzU3LCB0aGlzIGRlbW9uc3RyYXRlcyBob3cKIEBjIHdlJ2xsIHByZWZpeCB3aXRoICg/ YUQpIHRvIG1ha2UgXGQncyBBU0NJSS1vbmx5IGJlaGF2aW9yIHRoZSBkZWZhdWx0OgogQGMgJCBM Q19BTEw9ZW5fVVMuVVRGLTggLi9wY3JlMmdyZXAgLXUgJyg/YUQpXlxkKycgPDw8ICfZoNmh2aLZ o9mk2aXZptmn2ajZqScKZGlmZiAtLWdpdCBhL3NyYy9ncmVwLmMgYi9zcmMvZ3JlcC5jCmluZGV4 IDZiYTg4MWUuLjc5NDU5ZjMgMTAwNjQ0Ci0tLSBhL3NyYy9ncmVwLmMKKysrIGIvc3JjL2dyZXAu YwpAQCAtMjM4MiwxNCArMjM4MiwxNCBAQCBmZ3JlcF90b19ncmVwX3BhdHRlcm4gKGNoYXIgKipr ZXlzX3AsIGlkeF90ICpsZW5fcCkKIC8qIFJlcGxhY2UgZWFjaCBcZCBpbiAqS0VZU19QIHdpdGgg WzAtOV0sIHRvIGVuc3VyZSB0aGF0IFxkIG1hdGNoZXMgb25seSBBU0NJSQogICAgZGlnaXRzLiAg Tm93IHRoYXQgd2UgZW5hYmxlIFBDUkUyX1VDUCBmb3IgcGNyZSByZWdleHBzLCBcZCB3b3VsZCBv dGhlcndpc2UKICAgIG1hdGNoIG5vbi1BU0NJSSBkaWdpdHMgaW4gc29tZSBsb2NhbGVzLiAgVXNl IFxwe05kfSBpZiB5b3UgcmVxdWlyZSB0byBtYXRjaAotICAgdGhvc2UuICAqLworICAgdGhvc2Uu ICBTaW1pbGFybHksIHJlcGxhY2UgZWFjaCBcRCB3aXRoIFteMC05XS4gICovCiBzdGF0aWMgdm9p ZAogcGNyZV9wYXR0ZXJuX2V4cGFuZF9iYWNrc2xhc2hfZCAoY2hhciAqKmtleXNfcCwgaWR4X3Qg Kmxlbl9wKQogewogICBpZHhfdCBsZW4gPSAqbGVuX3A7CiAgIGNoYXIgKmtleXMgPSAqa2V5c19w OwogICBtYnN0YXRlX3QgbWJfc3RhdGUgPSB7IDAgfTsKLSAgY2hhciAqbmV3X2tleXMgPSB4bm1h 
bGxvYyAobGVuIC8gMiArIDEsIDUpOworICBjaGFyICpuZXdfa2V5cyA9IHhubWFsbG9jIChsZW4g LyAyICsgMSwgNik7CiAgIGNoYXIgKnAgPSBuZXdfa2V5czsKICAgYm9vbCBwcmV2X2JhY2tzbGFz aCA9IGZhbHNlOwoKQEAgLTI0MjgsNiArMjQyOCw5IEBAIHBjcmVfcGF0dGVybl9leHBhbmRfYmFj a3NsYXNoX2QgKGNoYXIgKiprZXlzX3AsIGlkeF90ICpsZW5fcCkKICAgICAgICAgICAgICAgICBj YXNlICdkJzoKICAgICAgICAgICAgICAgICAgIHAgPSBtZW1wY3B5IChwLCAiWzAtOV0iLCA1KTsK ICAgICAgICAgICAgICAgICAgIGJyZWFrOworICAgICAgICAgICAgICAgIGNhc2UgJ0QnOgorICAg ICAgICAgICAgICAgICAgcCA9IG1lbXBjcHkgKHAsICJbXjAtOV0iLCA2KTsKKyAgICAgICAgICAg ICAgICAgIGJyZWFrOwogICAgICAgICAgICAgICAgIGRlZmF1bHQ6CiAgICAgICAgICAgICAgICAg ICAqcCsrID0gJ1xcJzsKICAgICAgICAgICAgICAgICAgICpwKysgPSAqa2V5czsKZGlmZiAtLWdp dCBhL3Rlc3RzL3BjcmUtYXNjaWktZGlnaXRzIGIvdGVzdHMvcGNyZS1hc2NpaS1kaWdpdHMKaW5k ZXggYWU3MTNmNy4uMDE1OTI4NiAxMDA3NTUKLS0tIGEvdGVzdHMvcGNyZS1hc2NpaS1kaWdpdHMK KysrIGIvdGVzdHMvcGNyZS1hc2NpaS1kaWdpdHMKQEAgLTEsNiArMSw3IEBACiAjIS9iaW4vc2gK ICMgRW5zdXJlIHRoYXQgZ3JlcCAtUCdzIFxkIG1hdGNoZXMgb25seSB0aGUgMTAgQVNDSUkgZGln aXRzLgogIyBXaXRoLCBncmVwLTMuOSwgXGQgd291bGQgbWF0Y2ggZS5nLiwgdGhlIG11bHRpYnl0 ZSBBcmFiaWMgZGlnaXRzLgorIyBUaGUgc2FtZSBhcHBsaWVkIHRvIFxELgogIwogIyBDb3B5cmln aHQgKEMpIDIwMjMgRnJlZSBTb2Z0d2FyZSBGb3VuZGF0aW9uLCBJbmMuCiAjCkBAIC0yNCw4ICsy NSwxNCBAQCBmYWlsPTAKICMgXDMzMVwyNDVcMzMxXDI0NlwzMzFcMjQ3XDMzMVwyNTBcMzMxXDI1 MQogcHJpbnRmICdcMzMxXDI0MFwzMzFcMjQxXDMzMVwyNDJcMzMxXDI0M1wzMzFcMjQ0JyA+IGlu IHx8IGZyYW1ld29ya19mYWlsdXJlXwogcHJpbnRmICdcMzMxXDI0NVwzMzFcMjQ2XDMzMVwyNDdc MzMxXDI1MFwzMzFcMjUxJyA+PiBpbiB8fCBmcmFtZXdvcmtfZmFpbHVyZV8KK3ByaW50ZiAnXG4n ID4+IGluIHx8IGZyYW1ld29ya19mYWlsdXJlXwoKLWdyZXAgLVAgJ1xkKycgaW4gPiBvdXQgJiYg ZmFpbD0xCisjIEVuc3VyZSB0aGF0IFxkIG1hdGNoZXMgbm8gY2hhcmFjdGVyLgorcmV0dXJuc18g MSBncmVwIC1QICdcZCcgaW4gPiBvdXQgfHwgZmFpbD0xCiBjb21wYXJlIC9kZXYvbnVsbCBvdXQg fHwgZmFpbD0xCgorIyBFbnN1cmUgdGhhdCBeXEQrJCBtYXRjaGVzIHRoZSBlbnRpcmUgbGluZS4K K2dyZXAgLVAgJ15cRCskJyBpbiA+IG91dCB8fCBmYWlsPTEKK2NvbXBhcmUgaW4gb3V0IHx8IGZh aWw9MQorCiBFeGl0ICRmYWlsCi0tIAoyLjQwLjAucmMyCgo= 
--000000000000078b6d05f73afea6-- From unknown Fri Sep 05 19:41:18 2025 X-Loop: help-debbugs@gnu.org Subject: bug#62267: grep-3.9 bug: \d matches multibyte digits Resent-From: Paul Eggert Original-Sender: "Debbugs-submit" Resent-CC: bug-grep@gnu.org Resent-Date: Sun, 19 Mar 2023 08:29:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 62267 X-GNU-PR-Package: grep X-GNU-PR-Keywords: To: Jim Meyering Cc: 62267@debbugs.gnu.org Received: via spool by 62267-submit@debbugs.gnu.org id=B62267.16792145313898 (code B ref 62267); Sun, 19 Mar 2023 08:29:01 +0000 Received: (at 62267) by debbugs.gnu.org; 19 Mar 2023 08:28:51 +0000 Received: from localhost ([127.0.0.1]:49411 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1pdoPK-00010n-Gx for submit@debbugs.gnu.org; Sun, 19 Mar 2023 04:28:51 -0400 Received: from zimbra.cs.ucla.edu ([131.179.128.68]:60036) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1pdoPH-00010Z-GQ for 62267@debbugs.gnu.org; Sun, 19 Mar 2023 04:28:49 -0400 Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id 2DA6B160045; Sun, 19 Mar 2023 01:28:41 -0700 (PDT) Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10032) with ESMTP id g_57o6p9oJp0; Sun, 19 Mar 2023 01:28:39 -0700 (PDT) Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id 4978E160054; Sun, 19 Mar 2023 01:28:39 -0700 (PDT) DKIM-Filter: OpenDKIM Filter v2.9.2 zimbra.cs.ucla.edu 4978E160054 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cs.ucla.edu; s=78364E5A-2AF3-11ED-87FA-8298ECA2D365; t=1679214519; bh=/ssb75AVyIilVU+7a2+94ONDePrcpvSwLbHx2pxHSJQ=; h=Content-Type:Message-ID:Date:MIME-Version:To:From:Subject; b=bXu7sZ+8SZ+SNoQlmXlbrA0XRlHsJVInxT/TKVSPdIUhln2AqKtuEpattClXbgIyv x/hdgtf3B1UEBGGi3rnOnDtbG77KBz3D5QBcnI2c8ZWTw5mlgRBK6de8keVyQlhxjp 
i3iPLAgG4Imlo2GTfy4CsGGp6XvZyz5TmsA/iZjg= X-Virus-Scanned: amavisd-new at zimbra.cs.ucla.edu Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id ZQqA2-BXxC3Z; Sun, 19 Mar 2023 01:28:39 -0700 (PDT) Received: from [192.168.1.9] (cpe-172-91-119-151.socal.res.rr.com [172.91.119.151]) by zimbra.cs.ucla.edu (Postfix) with ESMTPSA id E5B4F160045; Sun, 19 Mar 2023 01:28:38 -0700 (PDT) Content-Type: multipart/mixed; boundary="------------WFSXL759eBRytamQBMw3TUVH" Message-ID: Date: Sun, 19 Mar 2023 01:28:38 -0700 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.8.0 Content-Language: en-US References: <870af32d-5cdf-65d1-4cbc-9988c57b8e37@cs.ucla.edu> From: Paul Eggert Organization: UCLA Computer Science Department In-Reply-To: X-Spam-Score: -3.4 (---) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -4.4 (----) This is a multi-part message in MIME format. --------------WFSXL759eBRytamQBMw3TUVH Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: quoted-printable On 2023-03-18 23:33, Jim Meyering wrote: > By the way, have you ever used \D? I think I have not. No, I'm not much of a Perl user these days (last seriously used it in=20 the 1990s...). > - char *new_keys =3D xnmalloc (len / 2 + 1, 5); > + char *new_keys =3D xnmalloc (len / 2 + 1, 6); This could be xnmalloc (len + 1, 3). Or if you want to show the work, you can replace it with something like: int origlen =3D sizeof "\\D" - 1; int repllen =3D sizeof "[^0-9]" - 1; int expansion =3D repllen / origlen + (repllen % origlen !=3D 0); char *new_keys =3D xnmalloc (len + 1, expansion); (Isn't memory allocation fun? :-) > Doesn't Perl have the same issue? 
Oh, you're right. Not being a Perl expert, all I did was run this: echo '=D9=A0=D9=A1=D9=A2=D9=A3=D9=A4=D9=A5=D9=A6=D9=A7=D9=A8=D9=A9' | = perl -ne 'print if /\d/' and I observed no output. However, I now see that I need to use perl's=20 -C option too, to get the kind of regular-expression behavior that plain=20 grep has. Looking at the source code again, how about if we move the PCRE-specific=20 changes from src/grep.c to src/pcresearch.c which is where it really=20 belongs, and more importantly use the bleeding-edge=20 PCRE2_EXTRA_ASCII_BSD macro if available? Something like the attached patch, say. This patch doesn't take your \D=20 fixes (or the above suggestions) into account. --------------WFSXL759eBRytamQBMw3TUVH Content-Type: text/x-patch; charset=UTF-8; name="0001-grep-forward-port-to-PCRE2-10.43.patch" Content-Disposition: attachment; filename="0001-grep-forward-port-to-PCRE2-10.43.patch" Content-Transfer-Encoding: base64 RnJvbSBlZDdmYTgwMTk2M2FhZjUyNmY3NzI1NzQxZDA5NWM4MGFkOTQ0NzMxIE1vbiBTZXAg MTcgMDA6MDA6MDAgMjAwMQpGcm9tOiBQYXVsIEVnZ2VydCA8ZWdnZXJ0QGNzLnVjbGEuZWR1 PgpEYXRlOiBTdW4sIDE5IE1hciAyMDIzIDAxOjIzOjUxIC0wNzAwClN1YmplY3Q6IFtQQVRD SF0gZ3JlcDogZm9yd2FyZCBwb3J0IHRvIFBDUkUyIDEwLjQzCgoqIGRvYy9ncmVwLnRleGk6 IERvY3VtZW50IHRoaXMsIGFuZCB2ZXJzaW9uIGhhc3NsZXMuCiogc3JjL2dyZXAuYzogTW92 ZSByZWNlbnQgY2hhbmdlcyBpbnRvIHBjcmVzZWFyY2guYy4KKFBfTUFUQ0hFUl9JTkRFWCk6 IFJlbW92ZS4KKHBjcmVfcGF0dGVybl9leHBhbmRfYmFja3NsYXNoX2QpOiBNb3ZlIGZyb20g aGVyZSAuLi4KKiBzcmMvcGNyZXNlYXJjaC5jOiAuLi4gdG8gaGVyZS4KKFBDUkUyX0VYVFJB X0FTQ0lJX0JTRCk6IERlZmF1bHQgdG8gMC4KKFBjb21waWxlKTogVXNlIFBDUkUyX0VYVFJB X0FTQ0lJX0JTRCBpZiBhdmFpbGFibGUsCmFuZCBleHBhbmQgXGQgdG8gWzAtOV0gb3RoZXJ3 aXNlLgotLS0KIGRvYy9ncmVwLnRleGkgICAgfCAyNCArKysrKysrKysrLS0tLQogc3JjL2dy ZXAuYyAgICAgICB8IDgyICstLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0t LS0tLS0tLS0KIHNyYy9wY3Jlc2VhcmNoLmMgfCA4NCArKysrKysrKysrKysrKysrKysrKysr KysrKysrKysrKysrKysrKysrKysrKysrKy0KIDMgZmlsZXMgY2hhbmdlZCwgMTAxIGluc2Vy 
dGlvbnMoKyksIDg5IGRlbGV0aW9ucygtKQoKZGlmZiAtLWdpdCBhL2RvYy9ncmVwLnRleGkg Yi9kb2MvZ3JlcC50ZXhpCmluZGV4IGVhYWQ2ZTEuLjhjOGJhYTkgMTAwNjQ0Ci0tLSBhL2Rv Yy9ncmVwLnRleGkKKysrIGIvZG9jL2dyZXAudGV4aQpAQCAtMTE0NCwxOCArMTE0NCwyOCBA QCBjb21iaW5lZCB3aXRoIHRoZSBAb3B0aW9uey16fSAoQG9wdGlvbnstLW51bGwtZGF0YX0p IG9wdGlvbiwgYW5kIG5vdGUgdGhhdAogRm9yIGRvY3VtZW50YXRpb24sIHJlZmVyIHRvIEB1 cmx7aHR0cHM6Ly93d3cucGNyZS5vcmcvfSwgd2l0aCB0aGVzZSBjYXZlYXRzOgogQGl0ZW1p emUKIEBpdGVtCi1Ac2FtcHtcZH0gYWx3YXlzIG1hdGNoZXMgb25seSB0aGUgdGVuIEFTQ0lJ IGRpZ2l0cywgcmVnYXJkbGVzcyBvZiBsb2NhbGUgb3IKLWluLXJlZ2V4cCBkaXJlY3RpdmVz IGxpa2UgQHNhbXB7KD9hRCl9LgotVXNlIEBzYW1we1xwQHtOZEB9fSBpZiB5b3UgcmVxdWly ZSB0byBtYXRjaCBub24tQVNDSUkgZGlnaXRzLgotT25jZSBwY3JlMiBzdXBwb3J0IGZvciBA c2FtcHsoP2FEKX0gaXMgd2lkZXNwcmVhZCBlbm91Z2gsCi13ZSBleHBlY3QgdG8gbWFrZSB0 aGF0IHRoZSBkZWZhdWx0LCBzbyBpdCB3aWxsIGJlIG92ZXJyaWRhYmxlLgotQGMgVXNpbmcg cGNyZTIgZ2l0IGNvbW1pdCBwY3JlMi0xMC40MC0xMTItZzYyNzczNTcsIHRoaXMgZGVtb25z dHJhdGVzIGhvdwotQGMgd2UnbGwgcHJlZml4IHdpdGggKD9hRCkgdG8gbWFrZSBcZCdzIEFT Q0lJLW9ubHkgYmVoYXZpb3IgdGhlIGRlZmF1bHQ6CitAc2FtcHtcZH0gbWF0Y2hlcyBvbmx5 IHRoZSB0ZW4gQVNDSUkgZGlnaXRzLCByZWdhcmRsZXNzIG9mIGxvY2FsZS4KK1VzZSBAc2Ft cHtccEB7TmRAfX0gdG8gYWxzbyBtYXRjaCBub24tQVNDSUkgZGlnaXRzLgorCitXaGVuIEBj b21tYW5ke2dyZXB9IGlzIGJ1aWx0IHdpdGggUENSRTIgMTAuNDIgYW5kIGVhcmxpZXIsIEBz YW1we1xkfQoraWdub3JlcyBpbi1yZWdleHAgZGlyZWN0aXZlcyBsaWtlIEBzYW1weyg/YUQp fSBhbmQgbWF0Y2hlcyBvbmx5IEFTQ0lJCitkaWdpdHMgcmVnYXJkbGVzcyBvZiB0aGVzZSBk aXJlY3RpdmVzLiAgSG93ZXZlciwgbGF0ZXIgdmVyc2lvbnMgb2YKK1BDUkUyIGxpa2VseSB3 aWxsIGZpeCB0aGlzLCBhbmQgdGhlIHBsYW4gaXMgZm9yIEBjb21tYW5ke2dyZXB9IHRvCity ZXNwZWN0IHRob3NlIGRpcmVjdGl2ZXMgaWYgcG9zc2libGUuCitAYyBVc2luZyBQQ1JFMiBn aXQgY29tbWl0IHBjcmUyLTEwLjQwLTExMi1nNjI3NzM1NywgdGhpcyBkZW1vbnN0cmF0ZXMK K0BjIHRoZSBlcXVpdmFsZW50IG9mIGhvdyBncmVwIGNvdWxkIHVzZSBQQ1JFMl9FWFRSQV9B U0NJSV9CU0QgdG8gbWFrZSBcZCdzCitAYyBBU0NJSS1vbmx5IGJlaGF2aW9yIHRoZSBkZWZh dWx0OgogQGMgJCBMQ19BTEw9ZW5fVVMuVVRGLTggLi9wY3JlMmdyZXAgLXUgJyg/YUQpXlxk 
KycgPDw8ICfZoNmh2aLZo9mk2aXZptmn2ajZqScKIEBjIFtFeGl0IDFdCiBAYyAkIExDX0FM TD1lbl9VUy5VVEYtOCAuL3BjcmUyZ3JlcCAtdSAnXlxkKycgPDw8ICfZoNmh2aLZo9mk2aXZ ptmn2ajZqScKIEBjINmg2aHZotmj2aTZpdmm2afZqNmpCiAKK0BpdGVtCitBbHRob3VnaCBQ Q1JFMiB0cmFja3MgdGhlIHN5bnRheCBhbmQgc2VtYW50aWNzIG9mIFBlcmwncyByZWd1bGFy CitleHByZXNzaW9ucywgdGhlIG1hdGNoIGlzIG5vdCBhbHdheXMgZXhhY3QsIHBhcnRseSBi ZWNhdXNlIFBlcmwKK2V2b2x2ZXMgYW5kIGEgUGVybCBpbnN0YWxsYXRpb24gbWF5IHByZWRh dGUgb3IgcG9zdGRhdGUgdGhlIFBDUkUyCitpbnN0YWxsYXRpb24gb24gdGhlIHNhbWUgaG9z dC4KKwogQGl0ZW0KIEJ5IGRlZmF1bHQsIEBjb21tYW5ke2dyZXB9IGFwcGxpZXMgZWFjaCBy ZWdleHAgdG8gYSBsaW5lIGF0IGEgdGltZSwKIHNvIHRoZSBAc2FtcHsoP3MpfSBkaXJlY3Rp dmUgKG1ha2luZyBAc2FtcHsufSBtYXRjaCBsaW5lIGJyZWFrcykKZGlmZiAtLWdpdCBhL3Ny Yy9ncmVwLmMgYi9zcmMvZ3JlcC5jCmluZGV4IDZiYTg4MWUuLjc1NDdiNjQgMTAwNjQ0Ci0t LSBhL3NyYy9ncmVwLmMKKysrIGIvc3JjL2dyZXAuYwpAQCAtMjA4OSw4ICsyMDg5LDcgQEAg c3RhdGljIHN0cnVjdAogI2VuZGlmCiB9OwogLyogS2VlcCB0aGVzZSBpbiBzeW5jIHdpdGgg dGhlICdtYXRjaGVycycgdGFibGUuICAqLwotZW51bSB7IEVfTUFUQ0hFUl9JTkRFWCA9IDEs IEZfTUFUQ0hFUl9JTkRFWCA9IDIsIEdfTUFUQ0hFUl9JTkRFWCA9IDAsCi0gICAgICAgUF9N QVRDSEVSX0lOREVYID0gNiB9OworZW51bSB7IEVfTUFUQ0hFUl9JTkRFWCA9IDEsIEZfTUFU Q0hFUl9JTkRFWCA9IDIsIEdfTUFUQ0hFUl9JTkRFWCA9IDAgfTsKIAogLyogUmV0dXJuIHRo ZSBpbmRleCBvZiB0aGUgbWF0Y2hlciBjb3JyZXNwb25kaW5nIHRvIE0gaWYgYXZhaWxhYmxl LgogICAgTUFUQ0hFUiBpcyB0aGUgaW5kZXggb2YgdGhlIHByZXZpb3VzIG1hdGNoZXIsIG9y IC0xIGlmIG5vbmUuCkBAIC0yMzc5LDgwICsyMzc4LDYgQEAgZmdyZXBfdG9fZ3JlcF9wYXR0 ZXJuIChjaGFyICoqa2V5c19wLCBpZHhfdCAqbGVuX3ApCiAgICpsZW5fcCA9IHAgLSBuZXdf a2V5czsKIH0KIAotLyogUmVwbGFjZSBlYWNoIFxkIGluICpLRVlTX1Agd2l0aCBbMC05XSwg dG8gZW5zdXJlIHRoYXQgXGQgbWF0Y2hlcyBvbmx5IEFTQ0lJCi0gICBkaWdpdHMuICBOb3cg dGhhdCB3ZSBlbmFibGUgUENSRTJfVUNQIGZvciBwY3JlIHJlZ2V4cHMsIFxkIHdvdWxkIG90 aGVyd2lzZQotICAgbWF0Y2ggbm9uLUFTQ0lJIGRpZ2l0cyBpbiBzb21lIGxvY2FsZXMuICBV c2UgXHB7TmR9IGlmIHlvdSByZXF1aXJlIHRvIG1hdGNoCi0gICB0aG9zZS4gICovCi1zdGF0 aWMgdm9pZAotcGNyZV9wYXR0ZXJuX2V4cGFuZF9iYWNrc2xhc2hfZCAoY2hhciAqKmtleXNf 
cCwgaWR4X3QgKmxlbl9wKQotewotICBpZHhfdCBsZW4gPSAqbGVuX3A7Ci0gIGNoYXIgKmtl eXMgPSAqa2V5c19wOwotICBtYnN0YXRlX3QgbWJfc3RhdGUgPSB7IDAgfTsKLSAgY2hhciAq bmV3X2tleXMgPSB4bm1hbGxvYyAobGVuIC8gMiArIDEsIDUpOwotICBjaGFyICpwID0gbmV3 X2tleXM7Ci0gIGJvb2wgcHJldl9iYWNrc2xhc2ggPSBmYWxzZTsKLQotICBmb3IgKHB0cmRp ZmZfdCBuOyBsZW47IGtleXMgKz0gbiwgbGVuIC09IG4pCi0gICAgewotICAgICAgbiA9IG1i X2NsZW4gKGtleXMsIGxlbiwgJm1iX3N0YXRlKTsKLSAgICAgIHN3aXRjaCAobikKLSAgICAg ICAgewotICAgICAgICBjYXNlIC0yOgotICAgICAgICAgIG4gPSBsZW47Ci0gICAgICAgICAg RkFMTFRIUk9VR0g7Ci0gICAgICAgIGRlZmF1bHQ6Ci0gICAgICAgICAgaWYgKHByZXZfYmFj a3NsYXNoKQotICAgICAgICAgICAgewotICAgICAgICAgICAgICBwcmV2X2JhY2tzbGFzaCA9 IGZhbHNlOwotICAgICAgICAgICAgICAqcCsrID0gJ1xcJzsKLSAgICAgICAgICAgIH0KLSAg ICAgICAgICBwID0gbWVtcGNweSAocCwga2V5cywgbik7Ci0gICAgICAgICAgYnJlYWs7Ci0K LSAgICAgICAgY2FzZSAtMToKLSAgICAgICAgICBpZiAocHJldl9iYWNrc2xhc2gpCi0gICAg ICAgICAgICB7Ci0gICAgICAgICAgICAgIHByZXZfYmFja3NsYXNoID0gZmFsc2U7Ci0gICAg ICAgICAgICAgICpwKysgPSAnXFwnOwotICAgICAgICAgICAgfQotICAgICAgICAgIG1lbXNl dCAoJm1iX3N0YXRlLCAwLCBzaXplb2YgbWJfc3RhdGUpOwotICAgICAgICAgIG4gPSAxOwot ICAgICAgICAgIEZBTExUSFJPVUdIOwotICAgICAgICBjYXNlIDE6Ci0gICAgICAgICAgaWYg KHByZXZfYmFja3NsYXNoKQotICAgICAgICAgICAgewotICAgICAgICAgICAgICBwcmV2X2Jh Y2tzbGFzaCA9IGZhbHNlOwotICAgICAgICAgICAgICBzd2l0Y2ggKCprZXlzKQotICAgICAg ICAgICAgICAgIHsKLSAgICAgICAgICAgICAgICBjYXNlICdkJzoKLSAgICAgICAgICAgICAg ICAgIHAgPSBtZW1wY3B5IChwLCAiWzAtOV0iLCA1KTsKLSAgICAgICAgICAgICAgICAgIGJy ZWFrOwotICAgICAgICAgICAgICAgIGRlZmF1bHQ6Ci0gICAgICAgICAgICAgICAgICAqcCsr ID0gJ1xcJzsKLSAgICAgICAgICAgICAgICAgICpwKysgPSAqa2V5czsKLSAgICAgICAgICAg ICAgICAgIGJyZWFrOwotICAgICAgICAgICAgICAgIH0KLSAgICAgICAgICAgIH0KLSAgICAg ICAgICBlbHNlCi0gICAgICAgICAgICB7Ci0gICAgICAgICAgICAgIGlmICgqa2V5cyA9PSAn XFwnKQotICAgICAgICAgICAgICAgIHByZXZfYmFja3NsYXNoID0gdHJ1ZTsKLSAgICAgICAg ICAgICAgZWxzZQotICAgICAgICAgICAgICAgICpwKysgPSAqa2V5czsKLSAgICAgICAgICAg IH0KLSAgICAgICAgICBicmVhazsKLSAgICAgICAgfQotICAgIH0KLQotICBpZiAocHJldl9i 
YWNrc2xhc2gpCi0gICAgKnArKyA9ICdcXCc7Ci0gICpwID0gJ1xuJzsKLSAgZnJlZSAoKmtl eXNfcCk7Ci0gICprZXlzX3AgPSBuZXdfa2V5czsKLSAgKmxlbl9wID0gcCAtIG5ld19rZXlz OwotfQotCiAvKiBJZiBpdCBpcyBlYXN5LCBjb252ZXJ0IHRoZSBNQVRDSEVSLXN0eWxlIHBh dHRlcm5zIEtFWVMgKG9mIHNpemUKICAgICpMRU5fUCkgdG8gLUYgc3R5bGUsIHVwZGF0ZSAq TEVOX1AgdG8gYSBwb3NzaWJseS1zbWFsbGVyIHZhbHVlLCBhbmQKICAgIHJldHVybiBGX01B VENIRVJfSU5ERVguICBJZiBub3QsIGxlYXZlIEtFWVMgYW5kICpMRU5fUCBhbG9uZSBhbmQK QEAgLTMwNDUsMTEgKzI5NzAsNiBAQCBtYWluIChpbnQgYXJnYywgY2hhciAqKmFyZ3YpCiAg ICAgICAgIG1hdGNoZXIgPSB0cnlfZmdyZXBfcGF0dGVybiAobWF0Y2hlciwga2V5cywgJmtl eWNjKTsKICAgICB9CiAKLSAgLyogSWYgLVAsIHJlcGxhY2UgZWFjaCBcZCB3aXRoIFswLTld LgotICAgICBUaG9zZSB3aG8gd2FudCB0byBtYXRjaCBub24tQVNDSUkgZGlnaXRzIG11c3Qg dXNlIFxwe05kfS4gICovCi0gIGlmIChtYXRjaGVyID09IFBfTUFUQ0hFUl9JTkRFWCkKLSAg ICBwY3JlX3BhdHRlcm5fZXhwYW5kX2JhY2tzbGFzaF9kICgma2V5cywgJmtleWNjKTsKLQog ICBleGVjdXRlID0gbWF0Y2hlcnNbbWF0Y2hlcl0uZXhlY3V0ZTsKICAgY29tcGlsZWRfcGF0 dGVybiA9CiAgICAgbWF0Y2hlcnNbbWF0Y2hlcl0uY29tcGlsZSAoa2V5cywga2V5Y2MsIG1h dGNoZXJzW21hdGNoZXJdLnN5bnRheCwKZGlmZiAtLWdpdCBhL3NyYy9wY3Jlc2VhcmNoLmMg Yi9zcmMvcGNyZXNlYXJjaC5jCmluZGV4IDViMTExYmUuLjNhMGZhNjAgMTAwNjQ0Ci0tLSBh L3NyYy9wY3Jlc2VhcmNoLmMKKysrIGIvc3JjL3BjcmVzZWFyY2guYwpAQCAtMzUsNiArMzUs OSBAQAogIyBkZWZpbmUgUENSRTJfRVJST1JfREVQVEhMSU1JVCBQQ1JFMl9FUlJPUl9SRUNV UlNJT05MSU1JVAogIyBkZWZpbmUgcGNyZTJfc2V0X2RlcHRoX2xpbWl0IHBjcmUyX3NldF9y ZWN1cnNpb25fbGltaXQKICNlbmRpZgorI2lmbmRlZiBQQ1JFMl9FWFRSQV9BU0NJSV9CU0QK KyMgZGVmaW5lIFBDUkUyX0VYVFJBX0FTQ0lJX0JTRCAwCisjZW5kaWYKIAogc3RydWN0IHBj cmVfY29tcAogewpAQCAtMTMwLDEyICsxMzMsODkgQEAgYmFkX3V0ZjhfZnJvbV9wY3JlMiAo aW50IGUpCiAjZW5kaWYKIH0KIAorLyogUmVwbGFjZSBlYWNoIFxkIGluICpLRVlTX1Agd2l0 aCBbMC05XSwgdG8gZW5zdXJlIHRoYXQgXGQgbWF0Y2hlcyBvbmx5IEFTQ0lJCisgICBkaWdp dHMuICBOb3cgdGhhdCB3ZSBlbmFibGUgUENSRTJfVUNQIGZvciBwY3JlIHJlZ2V4cHMsIFxk IHdvdWxkIG90aGVyd2lzZQorICAgbWF0Y2ggbm9uLUFTQ0lJIGRpZ2l0cyBpbiBzb21lIGxv Y2FsZXMuICBVc2UgXHB7TmR9IGlmIHlvdSByZXF1aXJlIHRvIG1hdGNoCisgICB0aG9zZS4g 
ICovCitzdGF0aWMgdm9pZAorcGNyZV9wYXR0ZXJuX2V4cGFuZF9iYWNrc2xhc2hfZCAoY2hh ciAqKmtleXNfcCwgaWR4X3QgKmxlbl9wKQoreworICBpZHhfdCBsZW4gPSAqbGVuX3A7Cisg IGNoYXIgKmtleXMgPSAqa2V5c19wOworICBtYnN0YXRlX3QgbWJfc3RhdGUgPSB7IDAgfTsK KyAgY2hhciAqbmV3X2tleXMgPSB4bm1hbGxvYyAobGVuIC8gMiArIDEsIDUpOworICBjaGFy ICpwID0gbmV3X2tleXM7CisgIGJvb2wgcHJldl9iYWNrc2xhc2ggPSBmYWxzZTsKKworICBm b3IgKHB0cmRpZmZfdCBuOyBsZW47IGtleXMgKz0gbiwgbGVuIC09IG4pCisgICAgeworICAg ICAgbiA9IG1iX2NsZW4gKGtleXMsIGxlbiwgJm1iX3N0YXRlKTsKKyAgICAgIHN3aXRjaCAo bikKKyAgICAgICAgeworICAgICAgICBjYXNlIC0yOgorICAgICAgICAgIG4gPSBsZW47Cisg ICAgICAgICAgRkFMTFRIUk9VR0g7CisgICAgICAgIGRlZmF1bHQ6CisgICAgICAgICAgaWYg KHByZXZfYmFja3NsYXNoKQorICAgICAgICAgICAgeworICAgICAgICAgICAgICBwcmV2X2Jh Y2tzbGFzaCA9IGZhbHNlOworICAgICAgICAgICAgICAqcCsrID0gJ1xcJzsKKyAgICAgICAg ICAgIH0KKyAgICAgICAgICBwID0gbWVtcGNweSAocCwga2V5cywgbik7CisgICAgICAgICAg YnJlYWs7CisKKyAgICAgICAgY2FzZSAtMToKKyAgICAgICAgICBpZiAocHJldl9iYWNrc2xh c2gpCisgICAgICAgICAgICB7CisgICAgICAgICAgICAgIHByZXZfYmFja3NsYXNoID0gZmFs c2U7CisgICAgICAgICAgICAgICpwKysgPSAnXFwnOworICAgICAgICAgICAgfQorICAgICAg ICAgIG1lbXNldCAoJm1iX3N0YXRlLCAwLCBzaXplb2YgbWJfc3RhdGUpOworICAgICAgICAg IG4gPSAxOworICAgICAgICAgIEZBTExUSFJPVUdIOworICAgICAgICBjYXNlIDE6CisgICAg ICAgICAgaWYgKHByZXZfYmFja3NsYXNoKQorICAgICAgICAgICAgeworICAgICAgICAgICAg ICBwcmV2X2JhY2tzbGFzaCA9IGZhbHNlOworICAgICAgICAgICAgICBzd2l0Y2ggKCprZXlz KQorICAgICAgICAgICAgICAgIHsKKyAgICAgICAgICAgICAgICBjYXNlICdkJzoKKyAgICAg ICAgICAgICAgICAgIHAgPSBtZW1wY3B5IChwLCAiWzAtOV0iLCA1KTsKKyAgICAgICAgICAg ICAgICAgIGJyZWFrOworICAgICAgICAgICAgICAgIGRlZmF1bHQ6CisgICAgICAgICAgICAg ICAgICAqcCsrID0gJ1xcJzsKKyAgICAgICAgICAgICAgICAgICpwKysgPSAqa2V5czsKKyAg ICAgICAgICAgICAgICAgIGJyZWFrOworICAgICAgICAgICAgICAgIH0KKyAgICAgICAgICAg IH0KKyAgICAgICAgICBlbHNlCisgICAgICAgICAgICB7CisgICAgICAgICAgICAgIGlmICgq a2V5cyA9PSAnXFwnKQorICAgICAgICAgICAgICAgIHByZXZfYmFja3NsYXNoID0gdHJ1ZTsK KyAgICAgICAgICAgICAgZWxzZQorICAgICAgICAgICAgICAgICpwKysgPSAqa2V5czsKKyAg 
ICAgICAgICAgIH0KKyAgICAgICAgICBicmVhazsKKyAgICAgICAgfQorICAgIH0KKworICBp ZiAocHJldl9iYWNrc2xhc2gpCisgICAgKnArKyA9ICdcXCc7CisgICpwID0gJ1xuJzsKKyAg ZnJlZSAoKmtleXNfcCk7CisgICprZXlzX3AgPSBuZXdfa2V5czsKKyAgKmxlbl9wID0gcCAt IG5ld19rZXlzOworfQorCiAvKiBDb21waWxlIHRoZSAtUCBzdHlsZSBQQVRURVJOLCBjb250 YWluaW5nIFNJWkUgYnl0ZXMgdGhhdCBhcmUKICAgIGZvbGxvd2VkIGJ5ICdcbicuICBSZXR1 cm4gYSBkZXNjcmlwdGlvbiBvZiB0aGUgY29tcGlsZWQgcGF0dGVybi4gICovCiAKIHZvaWQg KgogUGNvbXBpbGUgKGNoYXIgKnBhdHRlcm4sIGlkeF90IHNpemUsIHJlZ19zeW50YXhfdCBp Z25vcmVkLCBib29sIGV4YWN0KQogeworICBpZiAoISBQQ1JFMl9FWFRSQV9BU0NJSV9CU0Qp CisgICAgcGNyZV9wYXR0ZXJuX2V4cGFuZF9iYWNrc2xhc2hfZCAoJnBhdHRlcm4sICZzaXpl KTsKKwogICBQQ1JFMl9TSVpFIGU7CiAgIGludCBlYzsKICAgaW50IGZsYWdzID0gUENSRTJf RE9MTEFSX0VORE9OTFkgfCAobWF0Y2hfaWNhc2UgPyBQQ1JFMl9DQVNFTEVTUyA6IDApOwpA QCAtMTcyLDcgKzI1Miw5IEBAIFBjb21waWxlIChjaGFyICpwYXR0ZXJuLCBpZHhfdCBzaXpl LCByZWdfc3ludGF4X3QgaWdub3JlZCwgYm9vbCBleGFjdCkKICAgaWYgKG1hdGNoX2xpbmVz KQogICAgIHsKICNpZmRlZiBQQ1JFMl9FWFRSQV9NQVRDSF9MSU5FCi0gICAgICBwY3JlMl9z ZXRfY29tcGlsZV9leHRyYV9vcHRpb25zIChjY29udGV4dCwgUENSRTJfRVhUUkFfTUFUQ0hf TElORSk7CisgICAgICBwY3JlMl9zZXRfY29tcGlsZV9leHRyYV9vcHRpb25zIChjY29udGV4 dCwKKyAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgIChQQ1JFMl9FWFRS QV9NQVRDSF9MSU5FCisgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAg fCBQQ1JFMl9FWFRSQV9BU0NJSV9CU0QpKTsKICNlbHNlCiAgICAgICBzdGF0aWMgY2hhciBj b25zdCAvKiBUaGVzZSBzaXplcyBvbWl0IHRyYWlsaW5nIE5VTC4gICovCiAgICAgICAgIHhw cmVmaXhbNF0gPSAiXig/OiIsIHhzdWZmaXhbMl0gPSAiKSQiOwotLSAKMi4zOS4yCgo= --------------WFSXL759eBRytamQBMw3TUVH-- From unknown Fri Sep 05 19:41:18 2025 X-Loop: help-debbugs@gnu.org Subject: bug#62267: grep-3.9 bug: \d matches multibyte digits Resent-From: Paul Eggert Original-Sender: "Debbugs-submit" Resent-CC: bug-grep@gnu.org Resent-Date: Sun, 19 Mar 2023 08:56:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 62267 X-GNU-PR-Package: grep X-GNU-PR-Keywords: To: Jim Meyering Cc: 
62267@debbugs.gnu.org Received: via spool by 62267-submit@debbugs.gnu.org id=B62267.16792161066636 (code B ref 62267); Sun, 19 Mar 2023 08:56:02 +0000 Received: (at 62267) by debbugs.gnu.org; 19 Mar 2023 08:55:06 +0000 Received: from localhost ([127.0.0.1]:49443 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1pdooj-0001iv-CU for submit@debbugs.gnu.org; Sun, 19 Mar 2023 04:55:06 -0400 Received: from zimbra.cs.ucla.edu ([131.179.128.68]:33034) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1pdoog-0001iK-3n for 62267@debbugs.gnu.org; Sun, 19 Mar 2023 04:55:04 -0400 Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id DCB88160045; Sun, 19 Mar 2023 01:54:55 -0700 (PDT) Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10032) with ESMTP id lK3t3E51zvfq; Sun, 19 Mar 2023 01:54:54 -0700 (PDT) Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id 1DB1E160054; Sun, 19 Mar 2023 01:54:54 -0700 (PDT) DKIM-Filter: OpenDKIM Filter v2.9.2 zimbra.cs.ucla.edu 1DB1E160054 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cs.ucla.edu; s=78364E5A-2AF3-11ED-87FA-8298ECA2D365; t=1679216094; bh=VSyLGItMHJwx2KgjZNBU1tf4M9yTgl+DG+CYTACcnLQ=; h=Content-Type:Message-ID:Date:MIME-Version:Subject:From:To; b=d8dG1WewS2PBIx8oFREuxwURTDmYIw63pVUbxKOufhLTxjf8yjJrNV2fU0Pl+TDnp gjJ6pCYNZuJo8iG4FEfsfu6thuWjllGVZg5iFJs+LFupX8GljpUru8s5XHDwdjovyr zDFO5qQE3J4Frt8ZcYjCol3B6QbnheRc4L2KalNw= X-Virus-Scanned: amavisd-new at zimbra.cs.ucla.edu Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id INAR8hz93tWU; Sun, 19 Mar 2023 01:54:53 -0700 (PDT) Received: from [192.168.1.9] (cpe-172-91-119-151.socal.res.rr.com [172.91.119.151]) by zimbra.cs.ucla.edu (Postfix) with ESMTPSA id BFAB6160045; Sun, 19 Mar 2023 01:54:53 -0700 (PDT) 
Content-Type: multipart/mixed; boundary="------------CcaR8q003v6GQbok8dUjl5wR" Message-ID: <8a9eab6b-f87e-ce1b-1706-a14ee799bb7c@cs.ucla.edu> Date: Sun, 19 Mar 2023 01:54:53 -0700 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.8.0 Content-Language: en-US From: Paul Eggert References: <870af32d-5cdf-65d1-4cbc-9988c57b8e37@cs.ucla.edu> Organization: UCLA Computer Science Department In-Reply-To: X-Spam-Score: -3.4 (---) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -4.4 (----) This is a multi-part message in MIME format. --------------CcaR8q003v6GQbok8dUjl5wR Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit On 2023-03-19 01:28, Paul Eggert wrote: > Looking at the source code again, how about if we move the PCRE-specific > changes from src/grep.c to src/pcresearch.c which is where it really > belongs, and more importantly use the bleeding-edge > PCRE2_EXTRA_ASCII_BSD macro if available? > > Something like the attached patch, say. This patch doesn't take your \D > fixes (or the above suggestions) into account. Oops, that patch assumed match_lines. Also, it covered two topics in the doc fix. I installed the obvious topic in the doc change, and removed the match_lines assumption. Revised patch attached; please ignore the patch of a half-hour ago. 
--------------CcaR8q003v6GQbok8dUjl5wR Content-Type: text/x-patch; charset=UTF-8; name="0001-grep-forward-port-to-PCRE2-10.43.patch" Content-Disposition: attachment; filename="0001-grep-forward-port-to-PCRE2-10.43.patch" Content-Transfer-Encoding: base64 RnJvbSAwZGRjNmJhZTZiMDljNTVlMzlhYTQ3MjNiOTRiMTNiYjU3MjJiZjQ3IE1vbiBTZXAg MTcgMDA6MDA6MDAgMjAwMQpGcm9tOiBQYXVsIEVnZ2VydCA8ZWdnZXJ0QGNzLnVjbGEuZWR1 PgpEYXRlOiBTdW4sIDE5IE1hciAyMDIzIDAxOjUwOjAwIC0wNzAwClN1YmplY3Q6IFtQQVRD SF0gZ3JlcDogZm9yd2FyZCBwb3J0IHRvIFBDUkUyIDEwLjQzCgoqIGRvYy9ncmVwLnRleGk6 IERvY3VtZW50IHRoaXMuCiogc3JjL2dyZXAuYzogTW92ZSByZWNlbnQgY2hhbmdlcyBpbnRv IHBjcmVzZWFyY2guYy4KKFBfTUFUQ0hFUl9JTkRFWCk6IFJlbW92ZS4KKHBjcmVfcGF0dGVy bl9leHBhbmRfYmFja3NsYXNoX2QpOiBNb3ZlIGZyb20gaGVyZSAuLi4KKiBzcmMvcGNyZXNl YXJjaC5jOiAuLi4gdG8gaGVyZS4KKFBDUkUyX0VYVFJBX0FTQ0lJX0JTRCk6IERlZmF1bHQg dG8gMC4KKFBjb21waWxlKTogVXNlIFBDUkUyX0VYVFJBX0FTQ0lJX0JTRCBpZiBhdmFpbGFi bGUsCmFuZCBleHBhbmQgXGQgdG8gWzAtOV0gb3RoZXJ3aXNlLgotLS0KIGRvYy9ncmVwLnRl eGkgICAgfCAxOCArKysrKystLS0tCiBzcmMvZ3JlcC5jICAgICAgIHwgODIgKy0tLS0tLS0t LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLQogc3JjL3BjcmVzZWFyY2guYyB8 IDkwICsrKysrKysrKysrKysrKysrKysrKysrKysrKysrKysrKysrKysrKysrKysrKystLQog MyBmaWxlcyBjaGFuZ2VkLCA5OSBpbnNlcnRpb25zKCspLCA5MSBkZWxldGlvbnMoLSkKCmRp ZmYgLS1naXQgYS9kb2MvZ3JlcC50ZXhpIGIvZG9jL2dyZXAudGV4aQppbmRleCBiMTdjNGRh Li44YTBhZWY1IDEwMDY0NAotLS0gYS9kb2MvZ3JlcC50ZXhpCisrKyBiL2RvYy9ncmVwLnRl eGkKQEAgLTExNDQsMTMgKzExNDQsMTcgQEAgY29tYmluZWQgd2l0aCB0aGUgQG9wdGlvbnst en0gKEBvcHRpb257LS1udWxsLWRhdGF9KSBvcHRpb24sIGFuZCBub3RlIHRoYXQKIEZvciBk b2N1bWVudGF0aW9uLCByZWZlciB0byBAdXJse2h0dHBzOi8vd3d3LnBjcmUub3JnL30sIHdp dGggdGhlc2UgY2F2ZWF0czoKIEBpdGVtaXplCiBAaXRlbQotQHNhbXB7XGR9IGFsd2F5cyBt YXRjaGVzIG9ubHkgdGhlIHRlbiBBU0NJSSBkaWdpdHMsIHJlZ2FyZGxlc3Mgb2YgbG9jYWxl IG9yCi1pbi1yZWdleHAgZGlyZWN0aXZlcyBsaWtlIEBzYW1weyg/YUQpfS4KLVVzZSBAc2Ft cHtccEB7TmRAfX0gaWYgeW91IHJlcXVpcmUgdG8gbWF0Y2ggbm9uLUFTQ0lJIGRpZ2l0cy4K 
LU9uY2UgcGNyZTIgc3VwcG9ydCBmb3IgQHNhbXB7KD9hRCl9IGlzIHdpZGVzcHJlYWQgZW5v dWdoLAotd2UgZXhwZWN0IHRvIG1ha2UgdGhhdCB0aGUgZGVmYXVsdCwgc28gaXQgd2lsbCBi ZSBvdmVycmlkYWJsZS4KLUBjIFVzaW5nIHBjcmUyIGdpdCBjb21taXQgcGNyZTItMTAuNDAt MTEyLWc2Mjc3MzU3LCB0aGlzIGRlbW9uc3RyYXRlcyBob3cKLUBjIHdlJ2xsIHByZWZpeCB3 aXRoICg/YUQpIHRvIG1ha2UgXGQncyBBU0NJSS1vbmx5IGJlaGF2aW9yIHRoZSBkZWZhdWx0 OgorQHNhbXB7XGR9IG1hdGNoZXMgb25seSB0aGUgdGVuIEFTQ0lJIGRpZ2l0cywgcmVnYXJk bGVzcyBvZiBsb2NhbGUuCitVc2UgQHNhbXB7XHBAe05kQH19IHRvIGFsc28gbWF0Y2ggbm9u LUFTQ0lJIGRpZ2l0cy4KKworV2hlbiBAY29tbWFuZHtncmVwfSBpcyBidWlsdCB3aXRoIFBD UkUyIDEwLjQyIGFuZCBlYXJsaWVyLCBAc2FtcHtcZH0KK2lnbm9yZXMgaW4tcmVnZXhwIGRp cmVjdGl2ZXMgbGlrZSBAc2FtcHsoP2FEKX0gYW5kIG1hdGNoZXMgb25seSBBU0NJSQorZGln aXRzIHJlZ2FyZGxlc3Mgb2YgdGhlc2UgZGlyZWN0aXZlcy4gIEhvd2V2ZXIsIGxhdGVyIHZl cnNpb25zIG9mCitQQ1JFMiBsaWtlbHkgd2lsbCBmaXggdGhpcywgYW5kIHRoZSBwbGFuIGlz IGZvciBAY29tbWFuZHtncmVwfSB0bworcmVzcGVjdCB0aG9zZSBkaXJlY3RpdmVzIGlmIHBv c3NpYmxlLgorQGMgVXNpbmcgUENSRTIgZ2l0IGNvbW1pdCBwY3JlMi0xMC40MC0xMTItZzYy NzczNTcsIHRoaXMgZGVtb25zdHJhdGVzCitAYyB0aGUgZXF1aXZhbGVudCBvZiBob3cgZ3Jl cCBjb3VsZCB1c2UgUENSRTJfRVhUUkFfQVNDSUlfQlNEIHRvIG1ha2UgXGQncworQGMgQVND SUktb25seSBiZWhhdmlvciB0aGUgZGVmYXVsdDoKIEBjICQgTENfQUxMPWVuX1VTLlVURi04 IC4vcGNyZTJncmVwIC11ICcoP2FEKV5cZCsnIDw8PCAn2aDZodmi2aPZpNml2abZp9mo2akn CiBAYyBbRXhpdCAxXQogQGMgJCBMQ19BTEw9ZW5fVVMuVVRGLTggLi9wY3JlMmdyZXAgLXUg J15cZCsnIDw8PCAn2aDZodmi2aPZpNml2abZp9mo2aknCmRpZmYgLS1naXQgYS9zcmMvZ3Jl cC5jIGIvc3JjL2dyZXAuYwppbmRleCA2YmE4ODFlLi43NTQ3YjY0IDEwMDY0NAotLS0gYS9z cmMvZ3JlcC5jCisrKyBiL3NyYy9ncmVwLmMKQEAgLTIwODksOCArMjA4OSw3IEBAIHN0YXRp YyBzdHJ1Y3QKICNlbmRpZgogfTsKIC8qIEtlZXAgdGhlc2UgaW4gc3luYyB3aXRoIHRoZSAn bWF0Y2hlcnMnIHRhYmxlLiAgKi8KLWVudW0geyBFX01BVENIRVJfSU5ERVggPSAxLCBGX01B VENIRVJfSU5ERVggPSAyLCBHX01BVENIRVJfSU5ERVggPSAwLAotICAgICAgIFBfTUFUQ0hF Ul9JTkRFWCA9IDYgfTsKK2VudW0geyBFX01BVENIRVJfSU5ERVggPSAxLCBGX01BVENIRVJf SU5ERVggPSAyLCBHX01BVENIRVJfSU5ERVggPSAwIH07CiAKIC8qIFJldHVybiB0aGUgaW5k 
ZXggb2YgdGhlIG1hdGNoZXIgY29ycmVzcG9uZGluZyB0byBNIGlmIGF2YWlsYWJsZS4KICAg IE1BVENIRVIgaXMgdGhlIGluZGV4IG9mIHRoZSBwcmV2aW91cyBtYXRjaGVyLCBvciAtMSBp ZiBub25lLgpAQCAtMjM3OSw4MCArMjM3OCw2IEBAIGZncmVwX3RvX2dyZXBfcGF0dGVybiAo Y2hhciAqKmtleXNfcCwgaWR4X3QgKmxlbl9wKQogICAqbGVuX3AgPSBwIC0gbmV3X2tleXM7 CiB9CiAKLS8qIFJlcGxhY2UgZWFjaCBcZCBpbiAqS0VZU19QIHdpdGggWzAtOV0sIHRvIGVu c3VyZSB0aGF0IFxkIG1hdGNoZXMgb25seSBBU0NJSQotICAgZGlnaXRzLiAgTm93IHRoYXQg d2UgZW5hYmxlIFBDUkUyX1VDUCBmb3IgcGNyZSByZWdleHBzLCBcZCB3b3VsZCBvdGhlcndp c2UKLSAgIG1hdGNoIG5vbi1BU0NJSSBkaWdpdHMgaW4gc29tZSBsb2NhbGVzLiAgVXNlIFxw e05kfSBpZiB5b3UgcmVxdWlyZSB0byBtYXRjaAotICAgdGhvc2UuICAqLwotc3RhdGljIHZv aWQKLXBjcmVfcGF0dGVybl9leHBhbmRfYmFja3NsYXNoX2QgKGNoYXIgKiprZXlzX3AsIGlk eF90ICpsZW5fcCkKLXsKLSAgaWR4X3QgbGVuID0gKmxlbl9wOwotICBjaGFyICprZXlzID0g KmtleXNfcDsKLSAgbWJzdGF0ZV90IG1iX3N0YXRlID0geyAwIH07Ci0gIGNoYXIgKm5ld19r ZXlzID0geG5tYWxsb2MgKGxlbiAvIDIgKyAxLCA1KTsKLSAgY2hhciAqcCA9IG5ld19rZXlz OwotICBib29sIHByZXZfYmFja3NsYXNoID0gZmFsc2U7Ci0KLSAgZm9yIChwdHJkaWZmX3Qg bjsgbGVuOyBrZXlzICs9IG4sIGxlbiAtPSBuKQotICAgIHsKLSAgICAgIG4gPSBtYl9jbGVu IChrZXlzLCBsZW4sICZtYl9zdGF0ZSk7Ci0gICAgICBzd2l0Y2ggKG4pCi0gICAgICAgIHsK LSAgICAgICAgY2FzZSAtMjoKLSAgICAgICAgICBuID0gbGVuOwotICAgICAgICAgIEZBTExU SFJPVUdIOwotICAgICAgICBkZWZhdWx0OgotICAgICAgICAgIGlmIChwcmV2X2JhY2tzbGFz aCkKLSAgICAgICAgICAgIHsKLSAgICAgICAgICAgICAgcHJldl9iYWNrc2xhc2ggPSBmYWxz ZTsKLSAgICAgICAgICAgICAgKnArKyA9ICdcXCc7Ci0gICAgICAgICAgICB9Ci0gICAgICAg ICAgcCA9IG1lbXBjcHkgKHAsIGtleXMsIG4pOwotICAgICAgICAgIGJyZWFrOwotCi0gICAg ICAgIGNhc2UgLTE6Ci0gICAgICAgICAgaWYgKHByZXZfYmFja3NsYXNoKQotICAgICAgICAg ICAgewotICAgICAgICAgICAgICBwcmV2X2JhY2tzbGFzaCA9IGZhbHNlOwotICAgICAgICAg ICAgICAqcCsrID0gJ1xcJzsKLSAgICAgICAgICAgIH0KLSAgICAgICAgICBtZW1zZXQgKCZt Yl9zdGF0ZSwgMCwgc2l6ZW9mIG1iX3N0YXRlKTsKLSAgICAgICAgICBuID0gMTsKLSAgICAg ICAgICBGQUxMVEhST1VHSDsKLSAgICAgICAgY2FzZSAxOgotICAgICAgICAgIGlmIChwcmV2 X2JhY2tzbGFzaCkKLSAgICAgICAgICAgIHsKLSAgICAgICAgICAgICAgcHJldl9iYWNrc2xh 
c2ggPSBmYWxzZTsKLSAgICAgICAgICAgICAgc3dpdGNoICgqa2V5cykKLSAgICAgICAgICAg ICAgICB7Ci0gICAgICAgICAgICAgICAgY2FzZSAnZCc6Ci0gICAgICAgICAgICAgICAgICBw ID0gbWVtcGNweSAocCwgIlswLTldIiwgNSk7Ci0gICAgICAgICAgICAgICAgICBicmVhazsK LSAgICAgICAgICAgICAgICBkZWZhdWx0OgotICAgICAgICAgICAgICAgICAgKnArKyA9ICdc XCc7Ci0gICAgICAgICAgICAgICAgICAqcCsrID0gKmtleXM7Ci0gICAgICAgICAgICAgICAg ICBicmVhazsKLSAgICAgICAgICAgICAgICB9Ci0gICAgICAgICAgICB9Ci0gICAgICAgICAg ZWxzZQotICAgICAgICAgICAgewotICAgICAgICAgICAgICBpZiAoKmtleXMgPT0gJ1xcJykK LSAgICAgICAgICAgICAgICBwcmV2X2JhY2tzbGFzaCA9IHRydWU7Ci0gICAgICAgICAgICAg IGVsc2UKLSAgICAgICAgICAgICAgICAqcCsrID0gKmtleXM7Ci0gICAgICAgICAgICB9Ci0g ICAgICAgICAgYnJlYWs7Ci0gICAgICAgIH0KLSAgICB9Ci0KLSAgaWYgKHByZXZfYmFja3Ns YXNoKQotICAgICpwKysgPSAnXFwnOwotICAqcCA9ICdcbic7Ci0gIGZyZWUgKCprZXlzX3Ap OwotICAqa2V5c19wID0gbmV3X2tleXM7Ci0gICpsZW5fcCA9IHAgLSBuZXdfa2V5czsKLX0K LQogLyogSWYgaXQgaXMgZWFzeSwgY29udmVydCB0aGUgTUFUQ0hFUi1zdHlsZSBwYXR0ZXJu cyBLRVlTIChvZiBzaXplCiAgICAqTEVOX1ApIHRvIC1GIHN0eWxlLCB1cGRhdGUgKkxFTl9Q IHRvIGEgcG9zc2libHktc21hbGxlciB2YWx1ZSwgYW5kCiAgICByZXR1cm4gRl9NQVRDSEVS X0lOREVYLiAgSWYgbm90LCBsZWF2ZSBLRVlTIGFuZCAqTEVOX1AgYWxvbmUgYW5kCkBAIC0z MDQ1LDExICsyOTcwLDYgQEAgbWFpbiAoaW50IGFyZ2MsIGNoYXIgKiphcmd2KQogICAgICAg ICBtYXRjaGVyID0gdHJ5X2ZncmVwX3BhdHRlcm4gKG1hdGNoZXIsIGtleXMsICZrZXljYyk7 CiAgICAgfQogCi0gIC8qIElmIC1QLCByZXBsYWNlIGVhY2ggXGQgd2l0aCBbMC05XS4KLSAg ICAgVGhvc2Ugd2hvIHdhbnQgdG8gbWF0Y2ggbm9uLUFTQ0lJIGRpZ2l0cyBtdXN0IHVzZSBc cHtOZH0uICAqLwotICBpZiAobWF0Y2hlciA9PSBQX01BVENIRVJfSU5ERVgpCi0gICAgcGNy ZV9wYXR0ZXJuX2V4cGFuZF9iYWNrc2xhc2hfZCAoJmtleXMsICZrZXljYyk7Ci0KICAgZXhl Y3V0ZSA9IG1hdGNoZXJzW21hdGNoZXJdLmV4ZWN1dGU7CiAgIGNvbXBpbGVkX3BhdHRlcm4g PQogICAgIG1hdGNoZXJzW21hdGNoZXJdLmNvbXBpbGUgKGtleXMsIGtleWNjLCBtYXRjaGVy c1ttYXRjaGVyXS5zeW50YXgsCmRpZmYgLS1naXQgYS9zcmMvcGNyZXNlYXJjaC5jIGIvc3Jj L3BjcmVzZWFyY2guYwppbmRleCA1YjExMWJlLi5kMzcwMTgxIDEwMDY0NAotLS0gYS9zcmMv cGNyZXNlYXJjaC5jCisrKyBiL3NyYy9wY3Jlc2VhcmNoLmMKQEAgLTM1LDYgKzM1LDkgQEAK 
ICMgZGVmaW5lIFBDUkUyX0VSUk9SX0RFUFRITElNSVQgUENSRTJfRVJST1JfUkVDVVJTSU9O TElNSVQKICMgZGVmaW5lIHBjcmUyX3NldF9kZXB0aF9saW1pdCBwY3JlMl9zZXRfcmVjdXJz aW9uX2xpbWl0CiAjZW5kaWYKKyNpZm5kZWYgUENSRTJfRVhUUkFfQVNDSUlfQlNECisjIGRl ZmluZSBQQ1JFMl9FWFRSQV9BU0NJSV9CU0QgMAorI2VuZGlmCiAKIHN0cnVjdCBwY3JlX2Nv bXAKIHsKQEAgLTEzMCwxMiArMTMzLDg5IEBAIGJhZF91dGY4X2Zyb21fcGNyZTIgKGludCBl KQogI2VuZGlmCiB9CiAKKy8qIFJlcGxhY2UgZWFjaCBcZCBpbiAqS0VZU19QIHdpdGggWzAt OV0sIHRvIGVuc3VyZSB0aGF0IFxkIG1hdGNoZXMgb25seSBBU0NJSQorICAgZGlnaXRzLiAg Tm93IHRoYXQgd2UgZW5hYmxlIFBDUkUyX1VDUCBmb3IgcGNyZSByZWdleHBzLCBcZCB3b3Vs ZCBvdGhlcndpc2UKKyAgIG1hdGNoIG5vbi1BU0NJSSBkaWdpdHMgaW4gc29tZSBsb2NhbGVz LiAgVXNlIFxwe05kfSBpZiB5b3UgcmVxdWlyZSB0byBtYXRjaAorICAgdGhvc2UuICAqLwor c3RhdGljIHZvaWQKK3BjcmVfcGF0dGVybl9leHBhbmRfYmFja3NsYXNoX2QgKGNoYXIgKipr ZXlzX3AsIGlkeF90ICpsZW5fcCkKK3sKKyAgaWR4X3QgbGVuID0gKmxlbl9wOworICBjaGFy ICprZXlzID0gKmtleXNfcDsKKyAgbWJzdGF0ZV90IG1iX3N0YXRlID0geyAwIH07CisgIGNo YXIgKm5ld19rZXlzID0geG5tYWxsb2MgKGxlbiAvIDIgKyAxLCA1KTsKKyAgY2hhciAqcCA9 IG5ld19rZXlzOworICBib29sIHByZXZfYmFja3NsYXNoID0gZmFsc2U7CisKKyAgZm9yIChw dHJkaWZmX3QgbjsgbGVuOyBrZXlzICs9IG4sIGxlbiAtPSBuKQorICAgIHsKKyAgICAgIG4g PSBtYl9jbGVuIChrZXlzLCBsZW4sICZtYl9zdGF0ZSk7CisgICAgICBzd2l0Y2ggKG4pCisg ICAgICAgIHsKKyAgICAgICAgY2FzZSAtMjoKKyAgICAgICAgICBuID0gbGVuOworICAgICAg ICAgIEZBTExUSFJPVUdIOworICAgICAgICBkZWZhdWx0OgorICAgICAgICAgIGlmIChwcmV2 X2JhY2tzbGFzaCkKKyAgICAgICAgICAgIHsKKyAgICAgICAgICAgICAgcHJldl9iYWNrc2xh c2ggPSBmYWxzZTsKKyAgICAgICAgICAgICAgKnArKyA9ICdcXCc7CisgICAgICAgICAgICB9 CisgICAgICAgICAgcCA9IG1lbXBjcHkgKHAsIGtleXMsIG4pOworICAgICAgICAgIGJyZWFr OworCisgICAgICAgIGNhc2UgLTE6CisgICAgICAgICAgaWYgKHByZXZfYmFja3NsYXNoKQor ICAgICAgICAgICAgeworICAgICAgICAgICAgICBwcmV2X2JhY2tzbGFzaCA9IGZhbHNlOwor ICAgICAgICAgICAgICAqcCsrID0gJ1xcJzsKKyAgICAgICAgICAgIH0KKyAgICAgICAgICBt ZW1zZXQgKCZtYl9zdGF0ZSwgMCwgc2l6ZW9mIG1iX3N0YXRlKTsKKyAgICAgICAgICBuID0g MTsKKyAgICAgICAgICBGQUxMVEhST1VHSDsKKyAgICAgICAgY2FzZSAxOgorICAgICAgICAg 
IGlmIChwcmV2X2JhY2tzbGFzaCkKKyAgICAgICAgICAgIHsKKyAgICAgICAgICAgICAgcHJl dl9iYWNrc2xhc2ggPSBmYWxzZTsKKyAgICAgICAgICAgICAgc3dpdGNoICgqa2V5cykKKyAg ICAgICAgICAgICAgICB7CisgICAgICAgICAgICAgICAgY2FzZSAnZCc6CisgICAgICAgICAg ICAgICAgICBwID0gbWVtcGNweSAocCwgIlswLTldIiwgNSk7CisgICAgICAgICAgICAgICAg ICBicmVhazsKKyAgICAgICAgICAgICAgICBkZWZhdWx0OgorICAgICAgICAgICAgICAgICAg KnArKyA9ICdcXCc7CisgICAgICAgICAgICAgICAgICAqcCsrID0gKmtleXM7CisgICAgICAg ICAgICAgICAgICBicmVhazsKKyAgICAgICAgICAgICAgICB9CisgICAgICAgICAgICB9Cisg ICAgICAgICAgZWxzZQorICAgICAgICAgICAgeworICAgICAgICAgICAgICBpZiAoKmtleXMg PT0gJ1xcJykKKyAgICAgICAgICAgICAgICBwcmV2X2JhY2tzbGFzaCA9IHRydWU7CisgICAg ICAgICAgICAgIGVsc2UKKyAgICAgICAgICAgICAgICAqcCsrID0gKmtleXM7CisgICAgICAg ICAgICB9CisgICAgICAgICAgYnJlYWs7CisgICAgICAgIH0KKyAgICB9CisKKyAgaWYgKHBy ZXZfYmFja3NsYXNoKQorICAgICpwKysgPSAnXFwnOworICAqcCA9ICdcbic7CisgIGZyZWUg KCprZXlzX3ApOworICAqa2V5c19wID0gbmV3X2tleXM7CisgICpsZW5fcCA9IHAgLSBuZXdf a2V5czsKK30KKwogLyogQ29tcGlsZSB0aGUgLVAgc3R5bGUgUEFUVEVSTiwgY29udGFpbmlu ZyBTSVpFIGJ5dGVzIHRoYXQgYXJlCiAgICBmb2xsb3dlZCBieSAnXG4nLiAgUmV0dXJuIGEg ZGVzY3JpcHRpb24gb2YgdGhlIGNvbXBpbGVkIHBhdHRlcm4uICAqLwogCiB2b2lkICoKIFBj b21waWxlIChjaGFyICpwYXR0ZXJuLCBpZHhfdCBzaXplLCByZWdfc3ludGF4X3QgaWdub3Jl ZCwgYm9vbCBleGFjdCkKIHsKKyAgaWYgKCEgUENSRTJfRVhUUkFfQVNDSUlfQlNEKQorICAg IHBjcmVfcGF0dGVybl9leHBhbmRfYmFja3NsYXNoX2QgKCZwYXR0ZXJuLCAmc2l6ZSk7CisK ICAgUENSRTJfU0laRSBlOwogICBpbnQgZWM7CiAgIGludCBmbGFncyA9IFBDUkUyX0RPTExB Ul9FTkRPTkxZIHwgKG1hdGNoX2ljYXNlID8gUENSRTJfQ0FTRUxFU1MgOiAwKTsKQEAgLTE2 OCwxMiArMjQ4LDE2IEBAIFBjb21waWxlIChjaGFyICpwYXR0ZXJuLCBpZHhfdCBzaXplLCBy ZWdfc3ludGF4X3QgaWdub3JlZCwgYm9vbCBleGFjdCkKICAgaWYgKHJhd21lbWNociAocGF0 dGVybiwgJ1xuJykgIT0gcGF0bGltKQogICAgIGRpZSAoRVhJVF9UUk9VQkxFLCAwLCBfKCJ0 aGUgLVAgb3B0aW9uIG9ubHkgc3VwcG9ydHMgYSBzaW5nbGUgcGF0dGVybiIpKTsKIAorI2lm ZGVmIFBDUkUyX0VYVFJBX01BVENIX0xJTkUKKyAgdWludDMyX3QgZXh0cmFfb3B0aW9ucyA9 IChQQ1JFMl9FWFRSQV9BU0NJSV9CU0QKKyAgICAgICAgICAgICAgICAgICAgICAgICAgICB8 
IChtYXRjaF9saW5lcyA/IFBDUkUyX0VYVFJBX01BVENIX0xJTkUgOiAwKSk7CisgIHBjcmUy X3NldF9jb21waWxlX2V4dHJhX29wdGlvbnMgKGNjb250ZXh0LCBleHRyYV9vcHRpb25zKTsK KyNlbmRpZgorCiAgIHZvaWQgKnJlX3N0b3JhZ2UgPSBOVUxMOwogICBpZiAobWF0Y2hfbGlu ZXMpCiAgICAgewotI2lmZGVmIFBDUkUyX0VYVFJBX01BVENIX0xJTkUKLSAgICAgIHBjcmUy X3NldF9jb21waWxlX2V4dHJhX29wdGlvbnMgKGNjb250ZXh0LCBQQ1JFMl9FWFRSQV9NQVRD SF9MSU5FKTsKLSNlbHNlCisjaWZuZGVmIFBDUkUyX0VYVFJBX01BVENIX0xJTkUKICAgICAg IHN0YXRpYyBjaGFyIGNvbnN0IC8qIFRoZXNlIHNpemVzIG9taXQgdHJhaWxpbmcgTlVMLiAg Ki8KICAgICAgICAgeHByZWZpeFs0XSA9ICJeKD86IiwgeHN1ZmZpeFsyXSA9ICIpJCI7CiAg ICAgICBpZHhfdCByZV9zaXplID0gc2l6ZSArIHNpemVvZiB4cHJlZml4ICsgc2l6ZW9mIHhz dWZmaXg7Ci0tIAoyLjM5LjIKCg== --------------CcaR8q003v6GQbok8dUjl5wR-- From unknown Fri Sep 05 19:41:18 2025 X-Loop: help-debbugs@gnu.org Subject: bug#62267: grep-3.9 bug: \d matches multibyte digits Resent-From: Jim Meyering Original-Sender: "Debbugs-submit" Resent-CC: bug-grep@gnu.org Resent-Date: Sun, 19 Mar 2023 16:56:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 62267 X-GNU-PR-Package: grep X-GNU-PR-Keywords: To: Paul Eggert Cc: 62267@debbugs.gnu.org Received: via spool by 62267-submit@debbugs.gnu.org id=B62267.16792449113170 (code B ref 62267); Sun, 19 Mar 2023 16:56:02 +0000 Received: (at 62267) by debbugs.gnu.org; 19 Mar 2023 16:55:11 +0000 Received: from localhost ([127.0.0.1]:52613 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1pdwJK-0000p4-UJ for submit@debbugs.gnu.org; Sun, 19 Mar 2023 12:55:11 -0400 Received: from mail-lj1-f173.google.com ([209.85.208.173]:44732) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1pdwJJ-0000ol-0R for 62267@debbugs.gnu.org; Sun, 19 Mar 2023 12:55:09 -0400 Received: by mail-lj1-f173.google.com with SMTP id l22so9830363ljc.11 for <62267@debbugs.gnu.org>; Sun, 19 Mar 2023 09:55:08 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; 
t=1679244902; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=gA4bV88d4CESGQnrER52Fyy4KKs9VdwyxumKFWCjmw4=; b=HTIiWeesZS2VRxseatIlNy26A2PyUT9wdfdNoibJte+tenHgyufpy2g73pQd2NVghj CyGPPXVLtNXJMvq7iylnPdwGJ8MoMY8xOcfNJhu8ZkVZkKcjR+nLHCVVIOYAID3434YN 9U+N+IpvsSJ6xd1aMLuRtuw/9lcUQmvC7GwKhpNoIOgJfZzvBcKO8YI+UAJi68CMD3Hf Z1rKkKjXhYILoFs+5/ror43AWd0Jew5FEn2HCLayyuPwEaLh4uGE5Yw0zOzGraDqeOvH SoBVur7LoQ0hWcATwvh6GovxAAWhFaGfTwu/Df29jz6E1QSv4wFCIlml3F0JciVotg/N v32g== X-Gm-Message-State: AO0yUKVXtMXwshuh/nqXdjJj7RMiC8wJioBVkEufUC2su66m3VG7R/Gl vkTj3GLNWCGruLf5XCpZvIei/cQMePkiaoGoPY0= X-Google-Smtp-Source: AK7set//13hBLcHlmpb+omyboOIbtgV2QO23E0F6R6tCXVmZKYv9cJhJLllGWkd09evigJc1z02i5SFj7eP0yg94oTA= X-Received: by 2002:a2e:b521:0:b0:294:6de5:e642 with SMTP id z1-20020a2eb521000000b002946de5e642mr5148517ljm.3.1679244902541; Sun, 19 Mar 2023 09:55:02 -0700 (PDT) MIME-Version: 1.0 References: <870af32d-5cdf-65d1-4cbc-9988c57b8e37@cs.ucla.edu> <8a9eab6b-f87e-ce1b-1706-a14ee799bb7c@cs.ucla.edu> In-Reply-To: <8a9eab6b-f87e-ce1b-1706-a14ee799bb7c@cs.ucla.edu> From: Jim Meyering Date: Sun, 19 Mar 2023 09:54:49 -0700 Message-ID: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Spam-Score: 0.2 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.8 (/) On Sun, Mar 19, 2023 at 1:55 AM Paul Eggert wrote: > > On 2023-03-19 01:28, Paul Eggert wrote: > > Looking at the source code again, how about if we move the PCRE-specific > > changes from src/grep.c to src/pcresearch.c which is where it really > > belongs, and more importantly use the bleeding-edge > > PCRE2_EXTRA_ASCII_BSD macro if available?
> > > > Something like the attached patch, say. This patch doesn't take your \D > > fixes (or the above suggestions) into account. > > Oops, that patch assumed match_lines. Also, it covered two topics in the > doc fix. I installed the obvious topic in the doc change, and removed > the match_lines assumption. Revised patch attached; please ignore the > patch of a half-hour ago. Thanks. It definitely belongs in pcresearch.c. You're welcome to push that (or I will soon). I've rebased my changes on top of it and am adding tests. From unknown Fri Sep 05 19:41:18 2025 X-Loop: help-debbugs@gnu.org Subject: bug#62267: grep-3.9 bug: \d matches multibyte digits Resent-From: Jim Meyering Original-Sender: "Debbugs-submit" Resent-CC: bug-grep@gnu.org Resent-Date: Sun, 19 Mar 2023 20:45:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 62267 X-GNU-PR-Package: grep X-GNU-PR-Keywords: To: Paul Eggert Cc: 62267@debbugs.gnu.org Received: via spool by 62267-submit@debbugs.gnu.org id=B62267.16792586984988 (code B ref 62267); Sun, 19 Mar 2023 20:45:01 +0000 Received: (at 62267) by debbugs.gnu.org; 19 Mar 2023 20:44:58 +0000 Received: from localhost ([127.0.0.1]:52855 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1pdzth-0001IN-S2 for submit@debbugs.gnu.org; Sun, 19 Mar 2023 16:44:58 -0400 Received: from mail-lf1-f44.google.com ([209.85.167.44]:33702) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1pdztf-0001I9-UV for 62267@debbugs.gnu.org; Sun, 19 Mar 2023 16:44:56 -0400 Received: by mail-lf1-f44.google.com with SMTP id o8so12620948lfo.0 for <62267@debbugs.gnu.org>; Sun, 19 Mar 2023 13:44:55 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1679258689; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=lSBszNafqbhyWsN/xzndQQSMLcaMv7hoobN1Fp+dBcE=; 
b=Yzz0djkRbH3/agC4Hqqp5nLOmxNhXgShMmNnKzlay4Sz05p1/5cBn4nhpwiUK8ibP4 vB++VXmKFI8x79WyZRcuskRSUX1TwmCFSExSGeG4a7VuQU7IRyx8CuQdQN5OOz4kheKh e8iGelxj0W4iChhVZ2GsjeB6HctefwQO5X5UxKm0Vql7lSPAFTdpVi6rFx/4Ci/GjexT SH9mf+hr/LydCxfvNgKPtiN9qZgHAvHH8yMUXtS9PEI0fGvl0nJoEr4sGjBqIeuLNafU dRtiMhfwJnUinrgODuKo/B+dWft43/BjBgbSGMEGtmF0+/h735xKYftfRTfZjfXkCazr ij7g== X-Gm-Message-State: AO0yUKV67bf1CyxqYcvhrYiJOsH+CSwcK84nR+TlmKtkn+WzXlNTqQM1 PrO5b2UmMF+F+nHPpHIONfNOd6WPOZYiTJQFPH4= X-Google-Smtp-Source: AK7set/Vq0DDBJ8cPbSu1cdQ7WTP09dNH9Z4AtZ8cb242osRoOgUVJL7lSFvpIXhSW6+Xxy9FrblAD0geAAknw+3AiQ= X-Received: by 2002:a05:6512:31c7:b0:4d8:86c2:75ea with SMTP id j7-20020a05651231c700b004d886c275eamr5096208lfe.3.1679258689440; Sun, 19 Mar 2023 13:44:49 -0700 (PDT) MIME-Version: 1.0 References: <870af32d-5cdf-65d1-4cbc-9988c57b8e37@cs.ucla.edu> <8a9eab6b-f87e-ce1b-1706-a14ee799bb7c@cs.ucla.edu> In-Reply-To: From: Jim Meyering Date: Sun, 19 Mar 2023 13:44:37 -0700 Message-ID: Content-Type: multipart/mixed; boundary="000000000000abf77405f746e184" X-Spam-Score: 0.2 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.8 (/) --000000000000abf77405f746e184 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Sun, Mar 19, 2023 at 9:54 AM Jim Meyering wrote: > On Sun, Mar 19, 2023 at 1:55 AM Paul Eggert wrote: > > > > On 2023-03-19 01:28, Paul Eggert wrote: > > > Looking at the source code again, how about if we move the PCRE-specific > > > changes from src/grep.c to src/pcresearch.c which is where it really > > > belongs, and more importantly use the bleeding-edge > > > PCRE2_EXTRA_ASCII_BSD macro if available? > > > > > > Something like the attached patch, say.
This patch doesn't take your \D > > > fixes (or the above suggestions) into account. > > > > Oops, that patch assumed match_lines. Also, it covered two topics in the > > doc fix. I installed the obvious topic in the doc change, and removed > > the match_lines assumption. Revised patch attached; please ignore the > > patch of a half-hour ago. > > Thanks. It definitely belongs in pcresearch.c. > You're welcome to push that (or I will soon). > I've rebased my changes on top of it and am adding tests. I've pushed your change along with the attached. I'll probably create another snapshot today. --000000000000abf77405f746e184 Content-Type: application/octet-stream; name="grep-backslash-D.patch" Content-Disposition: attachment; filename="grep-backslash-D.patch" Content-Transfer-Encoding: base64 Content-ID: X-Attachment-Id: f_lffv8h840 RnJvbSA5OGVlMDViNGRkZmVlNWMxZGIyMjQ4YmRiMDYwYTJjZDY0YmY3NWZhIE1vbiBTZXAgMTcg MDA6MDA6MDAgMjAwMQpGcm9tOiBKaW0gTWV5ZXJpbmcgPG1leWVyaW5nQGZiLmNvbT4KRGF0ZTog U2F0LCAxOCBNYXIgMjAyMyAyMzoyNTowMyAtMDcwMApTdWJqZWN0OiBbUEFUQ0hdIGdyZXA6IC1Q ICgtLXBlcmwtcmVnZXhwKSBcRCBvbmNlIGFnYWluIHdvcmtzIGxpa2UgW14wLTldCgoqIE5FV1M6 IE1lbnRpb24gXEQsIHRvby4KKiBkb2MvZ3JlcC50ZXhpOiBMaWtld2lzZQoqIHNyYy9wY3Jlc2Vh cmNoLmMgKHBjcmVfcGF0dGVybl9leHBhbmRfYmFja3NsYXNoX2QpOiBIYW5kbGUgXEQuCkFsc28s IGlmZGVmLW91dCB0aGlzIG5ldyBmdW5jdGlvbiBhbmQgaXRzIGNhbGwgc2l0ZSB3aGVuIG5vdCBu ZWVkZWQuCiogdGVzdHMvcGNyZS1hc2NpaS1kaWdpdHM6IFRlc3QgXEQsIHRvby4KVGlnaHRlbiBv bmUgdGVzdCBieSB1c2luZyByZXR1cm5zXyAxLgpBZGQgY29tbWVudHMgYW5kIHRlc3RzIHRoYXQg d29yayBvbmx5IHdpdGggMTAuNDMgYW5kIG5ld2VyLgpQYXVsIEVnZ2VydCByYWlzZWQgdGhlIGlz c3VlIG9mIFxEIGluIGh0dHBzOi8vYnVncy5nbnUub3JnLzYyMjY3IzgKLS0tCiBORVdTICAgICAg ICAgICAgICAgICAgICB8ICAyICstCiBkb2MvZ3JlcC50ZXhpICAgICAgICAgICB8IDIwICsrKysr KystLS0tLS0tLS0tLS0tCiBzcmMvcGNyZXNlYXJjaC5jICAgICAgICB8IDE0ICsrKysrKysrKysr LS0tCiB0ZXN0cy9wY3JlLWFzY2lpLWRpZ2l0cyB8IDMzICsrKysrKysrKysrKysrKysrKysrKysr KysrKysrKysrLQogNCBmaWxlcyBjaGFuZ2VkLCA1MSBpbnNlcnRpb25zKCspLCAxOCBkZWxldGlv
bnMoLSkKCmRpZmYgLS1naXQgYS9ORVdTIGIvTkVXUwppbmRleCBhMjRjZWJkLi42Zjc3ZDE2IDEw MDY0NAotLS0gYS9ORVdTCisrKyBiL05FV1MKQEAgLTksNyArOSw3IEBAIEdOVSBncmVwIE5FV1Mg ICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAtKi0gb3V0bGluZSAtKi0KICAgcHJv cGVybHkgaGFkIHRoZSB1bmRlc2lyYWJsZSBzaWRlIGVmZmVjdCBvZiBtYWtpbmcgXGQgYWxzbyBt YXRjaAogICBlLmcuLCB0aGUgQXJhYmljIGRpZ2l0czog2aDZodmi2aPZpNml2abZp9mo2akuICBX aXRoIGdyZXAtMy45LCAtUCAnXGQrJwogICB3b3VsZCBtYXRjaCB0aGF0IHRlbi1kaWdpdCAoMjAt Ynl0ZSkgc3RyaW5nLiBOb3csIHRvIG1hdGNoIHN1Y2gKLSAgYSBkaWdpdCwgeW91IHdvdWxkIHVz ZSBccHtOZH0uCisgIGEgZGlnaXQsIHlvdSB3b3VsZCB1c2UgXHB7TmR9LiBTaW1pbGFybHksIFxE IGlzIG5vdyBtYXBwZWQgdG8gW14wLTldLgogICBbYnVnIGludHJvZHVjZWQgaW4gZ3JlcCAzLjld CgoKZGlmZiAtLWdpdCBhL2RvYy9ncmVwLnRleGkgYi9kb2MvZ3JlcC50ZXhpCmluZGV4IDhhMGFl ZjUuLjdhMDBhZGQgMTAwNjQ0Ci0tLSBhL2RvYy9ncmVwLnRleGkKKysrIGIvZG9jL2dyZXAudGV4 aQpAQCAtMTE0NCwyMSArMTE0NCwxNSBAQCBjb21iaW5lZCB3aXRoIHRoZSBAb3B0aW9uey16fSAo QG9wdGlvbnstLW51bGwtZGF0YX0pIG9wdGlvbiwgYW5kIG5vdGUgdGhhdAogRm9yIGRvY3VtZW50 YXRpb24sIHJlZmVyIHRvIEB1cmx7aHR0cHM6Ly93d3cucGNyZS5vcmcvfSwgd2l0aCB0aGVzZSBj YXZlYXRzOgogQGl0ZW1pemUKIEBpdGVtCi1Ac2FtcHtcZH0gbWF0Y2hlcyBvbmx5IHRoZSB0ZW4g QVNDSUkgZGlnaXRzLCByZWdhcmRsZXNzIG9mIGxvY2FsZS4KK0BzYW1we1xkfSBtYXRjaGVzIG9u bHkgdGhlIHRlbiBBU0NJSSBkaWdpdHMKKyhhbmQgQHNhbXB7XER9IG1hdGNoZXMgdGhlIGNvbXBs ZW1lbnQpLCByZWdhcmRsZXNzIG9mIGxvY2FsZS4KIFVzZSBAc2FtcHtccEB7TmRAfX0gdG8gYWxz byBtYXRjaCBub24tQVNDSUkgZGlnaXRzLgoKLVdoZW4gQGNvbW1hbmR7Z3JlcH0gaXMgYnVpbHQg d2l0aCBQQ1JFMiAxMC40MiBhbmQgZWFybGllciwgQHNhbXB7XGR9Ci1pZ25vcmVzIGluLXJlZ2V4 cCBkaXJlY3RpdmVzIGxpa2UgQHNhbXB7KD9hRCl9IGFuZCBtYXRjaGVzIG9ubHkgQVNDSUkKLWRp Z2l0cyByZWdhcmRsZXNzIG9mIHRoZXNlIGRpcmVjdGl2ZXMuICBIb3dldmVyLCBsYXRlciB2ZXJz aW9ucyBvZgotUENSRTIgbGlrZWx5IHdpbGwgZml4IHRoaXMsIGFuZCB0aGUgcGxhbiBpcyBmb3Ig QGNvbW1hbmR7Z3JlcH0gdG8KLXJlc3BlY3QgdGhvc2UgZGlyZWN0aXZlcyBpZiBwb3NzaWJsZS4K LUBjIFVzaW5nIFBDUkUyIGdpdCBjb21taXQgcGNyZTItMTAuNDAtMTEyLWc2Mjc3MzU3LCB0aGlz 
IGRlbW9uc3RyYXRlcwotQGMgdGhlIGVxdWl2YWxlbnQgb2YgaG93IGdyZXAgY291bGQgdXNlIFBD UkUyX0VYVFJBX0FTQ0lJX0JTRCB0byBtYWtlIFxkJ3MKLUBjIEFTQ0lJLW9ubHkgYmVoYXZpb3Ig dGhlIGRlZmF1bHQ6Ci1AYyAkIExDX0FMTD1lbl9VUy5VVEYtOCAuL3BjcmUyZ3JlcCAtdSAnKD9h RCleXGQrJyA8PDwgJ9mg2aHZotmj2aTZpdmm2afZqNmpJwotQGMgW0V4aXQgMV0KLUBjICQgTENf QUxMPWVuX1VTLlVURi04IC4vcGNyZTJncmVwIC11ICdeXGQrJyA8PDwgJ9mg2aHZotmj2aTZpdmm 2afZqNmpJwotQGMg2aDZodmi2aPZpNml2abZp9mo2akKK1doZW4gQGNvbW1hbmR7Z3JlcH0gaXMg YnVpbHQgd2l0aCBQQ1JFMiAxMC40MiBhbmQgZWFybGllciwKK0BzYW1we1xkfSBhbmQgQHNhbXB7 XER9IGlnbm9yZSBpbi1yZWdleHAgZGlyZWN0aXZlcyBsaWtlIEBzYW1weyg/YUQpfQorYW5kIHdv cmsgbGlrZSBAc2FtcHtbMC05XX0gYW5kIEBzYW1we1teMC05XX0gcmVzcGVjdGl2ZWx5LgorSG93 ZXZlciwgbGF0ZXIgdmVyc2lvbnMgb2YgUENSRTIgbGlrZWx5IHdpbGwgZml4IHRoaXMsCithbmQg dGhlIHBsYW4gaXMgZm9yIEBjb21tYW5ke2dyZXB9IHRvIHJlc3BlY3QgdGhvc2UgZGlyZWN0aXZl cyBpZiBwb3NzaWJsZS4KCiBAaXRlbQogQWx0aG91Z2ggUENSRSB0cmFja3MgdGhlIHN5bnRheCBh bmQgc2VtYW50aWNzIG9mIFBlcmwncyByZWd1bGFyCmRpZmYgLS1naXQgYS9zcmMvcGNyZXNlYXJj aC5jIGIvc3JjL3BjcmVzZWFyY2guYwppbmRleCBkMzcwMTgxLi4zNGIyYWViIDEwMDY0NAotLS0g YS9zcmMvcGNyZXNlYXJjaC5jCisrKyBiL3NyYy9wY3Jlc2VhcmNoLmMKQEAgLTEzMywxMCArMTMz LDEzIEBAIGJhZF91dGY4X2Zyb21fcGNyZTIgKGludCBlKQogI2VuZGlmCiB9CgorI2lmICEgUENS RTJfRVhUUkFfQVNDSUlfQlNECiAvKiBSZXBsYWNlIGVhY2ggXGQgaW4gKktFWVNfUCB3aXRoIFsw LTldLCB0byBlbnN1cmUgdGhhdCBcZCBtYXRjaGVzIG9ubHkgQVNDSUkKICAgIGRpZ2l0cy4gIE5v dyB0aGF0IHdlIGVuYWJsZSBQQ1JFMl9VQ1AgZm9yIHBjcmUgcmVnZXhwcywgXGQgd291bGQgb3Ro ZXJ3aXNlCiAgICBtYXRjaCBub24tQVNDSUkgZGlnaXRzIGluIHNvbWUgbG9jYWxlcy4gIFVzZSBc cHtOZH0gaWYgeW91IHJlcXVpcmUgdG8gbWF0Y2gKLSAgIHRob3NlLiAgKi8KKyAgIHRob3NlLiAg U2ltaWxhcmx5LCByZXBsYWNlIGVhY2ggXEQgd2l0aCBbXjAtOV0uCisgICBGSVhNRTogcmVtb3Zl IGluIDIwMjUsIG9yIHdoZW5ldmVyIHdlIG5vIGxvbmdlciBhY2NvbW1vZGF0ZSBwY3JlMi0xMC40 MgorICAgYW5kIHByaW9yLiAgKi8KIHN0YXRpYyB2b2lkCiBwY3JlX3BhdHRlcm5fZXhwYW5kX2Jh Y2tzbGFzaF9kIChjaGFyICoqa2V5c19wLCBpZHhfdCAqbGVuX3ApCiB7CkBAIC0xODIsNiArMTg1 
LDkgQEAgcGNyZV9wYXR0ZXJuX2V4cGFuZF9iYWNrc2xhc2hfZCAoY2hhciAqKmtleXNfcCwgaWR4 X3QgKmxlbl9wKQogICAgICAgICAgICAgICAgIGNhc2UgJ2QnOgogICAgICAgICAgICAgICAgICAg cCA9IG1lbXBjcHkgKHAsICJbMC05XSIsIDUpOwogICAgICAgICAgICAgICAgICAgYnJlYWs7Cisg ICAgICAgICAgICAgICAgY2FzZSAnRCc6CisgICAgICAgICAgICAgICAgICBwID0gbWVtcGNweSAo cCwgIlteMC05XSIsIDYpOworICAgICAgICAgICAgICAgICAgYnJlYWs7CiAgICAgICAgICAgICAg ICAgZGVmYXVsdDoKICAgICAgICAgICAgICAgICAgICpwKysgPSAnXFwnOwogICAgICAgICAgICAg ICAgICAgKnArKyA9ICprZXlzOwpAQCAtMjA2LDYgKzIxMiw3IEBAIHBjcmVfcGF0dGVybl9leHBh bmRfYmFja3NsYXNoX2QgKGNoYXIgKiprZXlzX3AsIGlkeF90ICpsZW5fcCkKICAgKmtleXNfcCA9 IG5ld19rZXlzOwogICAqbGVuX3AgPSBwIC0gbmV3X2tleXM7CiB9CisjZW5kaWYKCiAvKiBDb21w aWxlIHRoZSAtUCBzdHlsZSBQQVRURVJOLCBjb250YWluaW5nIFNJWkUgYnl0ZXMgdGhhdCBhcmUK ICAgIGZvbGxvd2VkIGJ5ICdcbicuICBSZXR1cm4gYSBkZXNjcmlwdGlvbiBvZiB0aGUgY29tcGls ZWQgcGF0dGVybi4gICovCkBAIC0yMTMsOCArMjIwLDkgQEAgcGNyZV9wYXR0ZXJuX2V4cGFuZF9i YWNrc2xhc2hfZCAoY2hhciAqKmtleXNfcCwgaWR4X3QgKmxlbl9wKQogdm9pZCAqCiBQY29tcGls ZSAoY2hhciAqcGF0dGVybiwgaWR4X3Qgc2l6ZSwgcmVnX3N5bnRheF90IGlnbm9yZWQsIGJvb2wg ZXhhY3QpCiB7Ci0gIGlmICghIFBDUkUyX0VYVFJBX0FTQ0lJX0JTRCkKLSAgICBwY3JlX3BhdHRl cm5fZXhwYW5kX2JhY2tzbGFzaF9kICgmcGF0dGVybiwgJnNpemUpOworI2lmICEgUENSRTJfRVhU UkFfQVNDSUlfQlNECisgIHBjcmVfcGF0dGVybl9leHBhbmRfYmFja3NsYXNoX2QgKCZwYXR0ZXJu LCAmc2l6ZSk7CisjZW5kaWYKCiAgIFBDUkUyX1NJWkUgZTsKICAgaW50IGVjOwpkaWZmIC0tZ2l0 IGEvdGVzdHMvcGNyZS1hc2NpaS1kaWdpdHMgYi90ZXN0cy9wY3JlLWFzY2lpLWRpZ2l0cwppbmRl eCBhZTcxM2Y3Li5kZTlmZTM4IDEwMDc1NQotLS0gYS90ZXN0cy9wY3JlLWFzY2lpLWRpZ2l0cwor KysgYi90ZXN0cy9wY3JlLWFzY2lpLWRpZ2l0cwpAQCAtMSw2ICsxLDcgQEAKICMhL2Jpbi9zaAog IyBFbnN1cmUgdGhhdCBncmVwIC1QJ3MgXGQgbWF0Y2hlcyBvbmx5IHRoZSAxMCBBU0NJSSBkaWdp dHMuCiAjIFdpdGgsIGdyZXAtMy45LCBcZCB3b3VsZCBtYXRjaCBlLmcuLCB0aGUgbXVsdGlieXRl IEFyYWJpYyBkaWdpdHMuCisjIFRoZSBzYW1lIGFwcGxpZWQgdG8gXEQuCiAjCiAjIENvcHlyaWdo dCAoQykgMjAyMyBGcmVlIFNvZnR3YXJlIEZvdW5kYXRpb24sIEluYy4KICMKQEAgLTI0LDggKzI1 
LDM4IEBAIGZhaWw9MAogIyBcMzMxXDI0NVwzMzFcMjQ2XDMzMVwyNDdcMzMxXDI1MFwzMzFcMjUx CiBwcmludGYgJ1wzMzFcMjQwXDMzMVwyNDFcMzMxXDI0MlwzMzFcMjQzXDMzMVwyNDQnID4gaW4g fHwgZnJhbWV3b3JrX2ZhaWx1cmVfCiBwcmludGYgJ1wzMzFcMjQ1XDMzMVwyNDZcMzMxXDI0N1wz MzFcMjUwXDMzMVwyNTEnID4+IGluIHx8IGZyYW1ld29ya19mYWlsdXJlXworcHJpbnRmICdcbicg Pj4gaW4gfHwgZnJhbWV3b3JrX2ZhaWx1cmVfCgotZ3JlcCAtUCAnXGQrJyBpbiA+IG91dCAmJiBm YWlsPTEKKyMgRW5zdXJlIHRoYXQgXGQgbWF0Y2hlcyBubyBjaGFyYWN0ZXIuCityZXR1cm5zXyAx IGdyZXAgLVAgJ1xkJyBpbiA+IG91dCB8fCBmYWlsPTEKIGNvbXBhcmUgL2Rldi9udWxsIG91dCB8 fCBmYWlsPTEKCisjIEVuc3VyZSB0aGF0IF5cRCskIG1hdGNoZXMgdGhlIGVudGlyZSBsaW5lLgor Z3JlcCAtUCAnXlxEKyQnIGluID4gb3V0IHx8IGZhaWw9MQorY29tcGFyZSBpbiBvdXQgfHwgZmFp bD0xCisKKyMgV2hlbiBidWlsdCB3aXRoIFBDUkUyIDEwLjQzIGFuZCBuZXdlciwgb25lIG1heSB1 c2UgKD9hRCkgYW5kICg/LWFEKQorIyB0byB0b2dnbGUgYmV0d2VlbiBtb2Rlcy4gICg/YUQpIGlz IHRoZSBkZWZhdWx0IChtYWtpbmcgXGQgPT0gWzAtOV0pLgorIyAoPy1hRCkgcmVsYXhlcyBcZCwg bWFraW5nIGl0IG1hdGNoICJhbGwiIGRpZ2l0cy4KKyMgVXNlIG1peGVkIGRpZ2l0cyBhcyBpbnB1 dDogQXJhYmljIDAgYW5kIEFTQ0lJIDQ6INmgNAorcHJpbnRmICdcMzMxXDI0MDRcbicgPiBpbjIg fHwgZnJhbWV3b3JrX2ZhaWx1cmVfCisKK3JldHVybnNfIDEgZ3JlcCAtUCAnXGRcZCcgaW4yID4g b3V0IHx8IGZhaWw9MQorY29tcGFyZSAvZGV2L251bGwgb3V0IHx8IGZhaWw9MQorCisjIFRoZSBm b2xsb3dpbmcgdGVzdHMgd29yayBvbmx5IHdoZW4gYnVpbHQgd2l0aCAxMC40MyBvciBuZXdlciwK KyMgd2l0aCB3aGljaCwgZ3JlcCBhY2NlcHRzIHRoZSBtb2RlLXNldHRpbmcgJyg/YUQpJzoKK2lm IGVjaG8gMCB8IGdyZXAgLXFQICcoP2FEKVxkJzsgdGhlbgorCisgIGdyZXAgLVAgJyg/LWFEKVxk KD9hRClcZCcgaW4yID4gb3V0IHx8IGZhaWw9MQorICBjb21wYXJlIGluMiBvdXQgfHwgZmFpbD0x CisKKyAgcmV0dXJuc18gMSBncmVwIC1QICdcZCg/LWFEKVxkJyBpbjIgPiBvdXQgfHwgZmFpbD0x CisgIGNvbXBhcmUgL2Rldi9udWxsIG91dCB8fCBmYWlsPTEKKworZWxzZQorICB3YXJuXyAnc2tp cHBlZCBzb21lIHRlc3RzOiB1c2UgUENSRTIgMTAuNDMgb3IgbmV3ZXIgdG8gZW5hYmxlJyBcCisg ICAgJ3N1cHBvcnQgZm9yIGUuZy4sICg/YUQpIGFuZCAoPy1hRCknCitmaQorCiBFeGl0ICRmYWls Ci0tIAoyLjQwLjAucmMyCgo= --000000000000abf77405f746e184-- From unknown Fri Sep 05 19:41:18 2025 X-Loop: help-debbugs@gnu.org Subject: bug#62267: 
grep-3.9 bug: \d matches multibyte digits Resent-From: Paul Eggert Original-Sender: "Debbugs-submit" Resent-CC: bug-grep@gnu.org Resent-Date: Sun, 19 Mar 2023 23:13:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 62267 X-GNU-PR-Package: grep X-GNU-PR-Keywords: To: Jim Meyering Cc: 62267@debbugs.gnu.org, Gnulib bugs Received: via spool by 62267-submit@debbugs.gnu.org id=B62267.167926752817303 (code B ref 62267); Sun, 19 Mar 2023 23:13:02 +0000 Received: (at 62267) by debbugs.gnu.org; 19 Mar 2023 23:12:08 +0000 Received: from localhost ([127.0.0.1]:52960 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1pe2C7-0004V1-F1 for submit@debbugs.gnu.org; Sun, 19 Mar 2023 19:12:07 -0400 Received: from zimbra.cs.ucla.edu ([131.179.128.68]:39578) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1pe2C5-0004UW-Fg for 62267@debbugs.gnu.org; Sun, 19 Mar 2023 19:12:06 -0400 Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id 8EC6D160054; Sun, 19 Mar 2023 16:11:58 -0700 (PDT) Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10032) with ESMTP id 9XLex48WSfSb; Sun, 19 Mar 2023 16:11:57 -0700 (PDT) Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id F36B616005E; Sun, 19 Mar 2023 16:11:56 -0700 (PDT) DKIM-Filter: OpenDKIM Filter v2.9.2 zimbra.cs.ucla.edu F36B616005E DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cs.ucla.edu; s=78364E5A-2AF3-11ED-87FA-8298ECA2D365; t=1679267517; bh=+M1L8aIbamqM9aqsuZCcWinwr+39I+eVpQpjxu0cyQU=; h=Content-Type:Message-ID:Date:MIME-Version:To:From:Subject; b=B1XTBz9fs4khvt9Atp+1Hkvx2IPE3fmuuIYk87GhlV4gzm7BffpszPkFW4CDH7iHB S72aJwiGqUKcoVFioNB4HU1ODhXa6fm2Z7IFuA9jQxtSMCGMQ2nw4U5es6V0j0PBnF CgMgcf8w5tNTVMEiwhUJBJi0s4H52RlkAtpSbSp8= X-Virus-Scanned: amavisd-new at zimbra.cs.ucla.edu Received: from 
zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id 3644EubrZh3Y; Sun, 19 Mar 2023 16:11:56 -0700 (PDT) Received: from [192.168.1.9] (cpe-172-91-119-151.socal.res.rr.com [172.91.119.151]) by zimbra.cs.ucla.edu (Postfix) with ESMTPSA id C4EC1160054; Sun, 19 Mar 2023 16:11:56 -0700 (PDT) Content-Type: multipart/mixed; boundary="------------JZZHkKbhqT6ySfjf6fbmzolJ" Message-ID: <3ec20ebe-5b01-0601-fad7-5252cf2afb9b@cs.ucla.edu> Date: Sun, 19 Mar 2023 16:11:56 -0700 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.8.0 Content-Language: en-US References: <870af32d-5cdf-65d1-4cbc-9988c57b8e37@cs.ucla.edu> <8a9eab6b-f87e-ce1b-1706-a14ee799bb7c@cs.ucla.edu> From: Paul Eggert Organization: UCLA Computer Science Department In-Reply-To: X-Spam-Score: -3.4 (---) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -4.4 (----) This is a multi-part message in MIME format. --------------JZZHkKbhqT6ySfjf6fbmzolJ Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit On 2023-03-19 13:44, Jim Meyering wrote: > I've pushed your change along with the attached. > I'll probably create another snapshot today. Thanks. I also installed a minor dfa.c change in Gnulib yesterday to pacify Oracle Solaris Studio. No big deal since 'grep' builds OK anyway. I also ran into a weird issue with test-select on Fedora 37 x86-64. It appears to be timing dependent and usually doesn't happen. I can't reproduce under strace. This is another Gnulib thing and not relevant to grep (other than people might report test failures to bug-grep). I installed into Gnulib the attached patch which shouldn't hurt but which I don't know fixes the bug. 
--------------JZZHkKbhqT6ySfjf6fbmzolJ Content-Type: text/x-patch; charset=UTF-8; name="gnulib.patch" Content-Disposition: attachment; filename="gnulib.patch" Content-Transfer-Encoding: base64 ZGlmZiAtLWdpdCBhL0NoYW5nZUxvZyBiL0NoYW5nZUxvZwppbmRleCBkZmNlMWE4ZGY3Li5m NDJlMmVkZTMwIDEwMDY0NAotLS0gYS9DaGFuZ2VMb2cKKysrIGIvQ2hhbmdlTG9nCkBAIC0x LDMgKzEsMTcgQEAKKzIwMjMtMDMtMTkgIFBhdWwgRWdnZXJ0ICA8ZWdnZXJ0QGNzLnVjbGEu ZWR1PgorCisJdGVzdC1wc2VsZWN0LCB0ZXN0LXNlbGVjdDogdXNlIGRpZmZlcmVudCBwb3J0 cworCUkgaGF2ZSBvYnNlcnZlZCByYXJlIGFuZCBoYXJkLXRvLXJlcHJvZHVjZSBwcm9ibGVt cyB3aXRoIHRoZSBHTlUKKwlncmVwIHJlbGVhc2UgY2FuZGlkYXRlIHdpdGgg4oCYbWFrZSAt ajUgY2hlY2vigJkgb24gRmVkb3JhIDM3IHg4Ni02NC4KKwlPbmUgcG9zc2liaWxpdHkgaXMg dGhhdCB0ZXN0LXBzZWxlY3QgYW5kIHRlc3Qtc2VsZWN0IGludGVyZmVyZQorCXdpdGggZWFj aCBvdGhlciBzb21laG93IHdoZW4gcnVuIHNpbXVsdGFuZW91c2x5LCBhcyB0aGV5IHVzZSB0 aGUKKwlzYW1lIHBvcnQuICBXb3JrIGFyb3VuZCB0aGlzIHBvc3NpYmlsaXR5IGJ5IHVzaW5n IGRpZmZlcmVudCBwb3J0cworCWZyb20gZWFjaCBvdGhlciwgYW5kIGZyb20gdGVzdC1wb2xs ICh3aGljaCBhbHNvIHVzZXMgMTIzNDUpLgorCU9mIGNvdXJzZSBpdOKAmWQgYmUgYmV0dGVy IGlmIGFsbCB0aGVzZSB0ZXN0cyB1c2VkIHN5c3RlbS1hc3NpZ25lZAorCXBvcnRzLCBidXQg SSBhc3N1bWUgdGhhdOKAmWQgdGFrZSBtb3JlIHdvcmsuCisJKiB0ZXN0cy90ZXN0LXBzZWxl Y3QuYywgdGVzdHMvdGVzdC1zZWxlY3QuYyAoVEVTVF9QT1JUKTogTmV3IG1hY3JvLgorCSog dGVzdHMvdGVzdC1zZWxlY3QuaCAoVEVTVF9QT1JUKTogUmVtb3ZlLgorCiAyMDIzLTAzLTE5 ICBCcnVubyBIYWlibGUgIDxicnVub0BjbGlzcC5vcmc+CiAKIAlVcGRhdGUgTU9EVUxFUy5o dG1sLnNoLgpkaWZmIC0tZ2l0IGEvdGVzdHMvdGVzdC1wc2VsZWN0LmMgYi90ZXN0cy90ZXN0 LXBzZWxlY3QuYwppbmRleCA0MTQ2ODY4NGM1Li5hMzgzZjFkMWIyIDEwMDY0NAotLS0gYS90 ZXN0cy90ZXN0LXBzZWxlY3QuYworKysgYi90ZXN0cy90ZXN0LXBzZWxlY3QuYwpAQCAtMjQs NiArMjQsNyBAQCBTSUdOQVRVUkVfQ0hFQ0sgKHBzZWxlY3QsIGludCwKICAgICAgICAgICAg ICAgICAgKGludCwgZmRfc2V0ICpyZXN0cmljdCwgZmRfc2V0ICpyZXN0cmljdCwgZmRfc2V0 ICpyZXN0cmljdCwKICAgICAgICAgICAgICAgICAgIHN0cnVjdCB0aW1lc3BlYyBjb25zdCAq cmVzdHJpY3QsIGNvbnN0IHNpZ3NldF90ICpyZXN0cmljdCkpOwogCisjZGVmaW5lIFRFU1Rf 
UE9SVCAxMjM0NwogI2luY2x1ZGUgInRlc3Qtc2VsZWN0LmgiCiAKIHN0YXRpYyBpbnQKZGlm ZiAtLWdpdCBhL3Rlc3RzL3Rlc3Qtc2VsZWN0LmMgYi90ZXN0cy90ZXN0LXNlbGVjdC5jCmlu ZGV4IGQwNGJlNTg0MTguLmI0NjAwNjAzNTEgMTAwNjQ0Ci0tLSBhL3Rlc3RzL3Rlc3Qtc2Vs ZWN0LmMKKysrIGIvdGVzdHMvdGVzdC1zZWxlY3QuYwpAQCAtMjUsNiArMjUsNyBAQAogU0lH TkFUVVJFX0NIRUNLIChzZWxlY3QsIGludCwgKGludCwgZmRfc2V0ICosIGZkX3NldCAqLCBm ZF9zZXQgKiwKICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICBzdHJ1Y3QgdGltZXZh bCAqKSk7CiAKKyNkZWZpbmUgVEVTVF9QT1JUIDEyMzQ2CiAjaW5jbHVkZSAidGVzdC1zZWxl Y3QuaCIKIAogaW50CmRpZmYgLS1naXQgYS90ZXN0cy90ZXN0LXNlbGVjdC5oIGIvdGVzdHMv dGVzdC1zZWxlY3QuaAppbmRleCBhZmE5OTZmNDBjLi5jZWViNDg1NDcxIDEwMDY0NAotLS0g YS90ZXN0cy90ZXN0LXNlbGVjdC5oCisrKyBiL3Rlc3RzL3Rlc3Qtc2VsZWN0LmgKQEAgLTM3 LDggKzM3LDYgQEAKICMgaW5jbHVkZSA8c3lzL3dhaXQuaD4KICNlbmRpZgogCi0jZGVmaW5l IFRFU1RfUE9SVCAgICAgICAxMjM0NQotCiAKIHR5cGVkZWYgaW50ICgqc2VsZWN0X2ZuKSAo aW50LCBmZF9zZXQgKiwgZmRfc2V0ICosIGZkX3NldCAqLCBzdHJ1Y3QgdGltZXZhbCAqKTsK IAo= --------------JZZHkKbhqT6ySfjf6fbmzolJ-- From unknown Fri Sep 05 19:41:18 2025 X-Loop: help-debbugs@gnu.org Subject: bug#62267: grep-3.9 bug: \d matches multibyte digits Resent-From: Jim Meyering Original-Sender: "Debbugs-submit" Resent-CC: bug-grep@gnu.org Resent-Date: Sun, 19 Mar 2023 23:19:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 62267 X-GNU-PR-Package: grep X-GNU-PR-Keywords: To: Paul Eggert Cc: 62267@debbugs.gnu.org, Gnulib bugs Received: via spool by 62267-submit@debbugs.gnu.org id=B62267.167926789517874 (code B ref 62267); Sun, 19 Mar 2023 23:19:01 +0000 Received: (at 62267) by debbugs.gnu.org; 19 Mar 2023 23:18:15 +0000 Received: from localhost ([127.0.0.1]:52965 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1pe2I3-0004eD-FI for submit@debbugs.gnu.org; Sun, 19 Mar 2023 19:18:15 -0400 Received: from mail-lf1-f53.google.com ([209.85.167.53]:43861) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1pe2I1-0004dz-7q for 
62267@debbugs.gnu.org; Sun, 19 Mar 2023 19:18:13 -0400 Received: by mail-lf1-f53.google.com with SMTP id q16so1955494lfe.10 for <62267@debbugs.gnu.org>; Sun, 19 Mar 2023 16:18:13 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1679267887; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=YewV5GIUqSrhrmSWzLdeI9dy1G1Wcx5yXn/w/t/QTAw=; b=tu+8NDAbdVOgksAnq0EAbm78QyIbwPOsHK/iLPamQZKCsTNnhPILXWfXsoZyPW0fLW dLMU0593zpggihi4+D7bdIubxXpU1yU9b7Cezi6yXizYZSgw+QSZwCogPHuiCIDc3Yh4 XrUb8a0GifxsU3rsFnSLXrGTqDoyK+upwJ1umT151DBmH8im8+6BpZJWgj41iqDJKycG 1FmN+fWUu0ziTHpRh7bq+D+/xk9O5OsyChvXCQO8WUBwppbwYpwVUtoPxySNlLKpvg++ rXQR0+J9kVxIw5wlh5cofeshh/v/x7gRBanV03PbLTjtJKsT6VmxyMlCSjqAr+qxCOYx KzGQ== X-Gm-Message-State: AO0yUKUeWb074owxnfVA0fwW9U3g+RmMD/YuNvojgoAXfprBDvBOL0LF JCTGDD5yT77pjSRy0ktLYeLKnB/EbdaRYQgeZKU= X-Google-Smtp-Source: AK7set+VTCKDT5rJd3qF24kavPb+z42NJvTh7s4mGgw7gmGIc6POCzRaHjlAe/fFbT80gBtpL7/Fi4fC7bebnB4Q8Sk= X-Received: by 2002:ac2:5962:0:b0:4ea:2dce:fa0a with SMTP id h2-20020ac25962000000b004ea2dcefa0amr164171lfp.10.1679267887143; Sun, 19 Mar 2023 16:18:07 -0700 (PDT) MIME-Version: 1.0 References: <870af32d-5cdf-65d1-4cbc-9988c57b8e37@cs.ucla.edu> <8a9eab6b-f87e-ce1b-1706-a14ee799bb7c@cs.ucla.edu> <3ec20ebe-5b01-0601-fad7-5252cf2afb9b@cs.ucla.edu> In-Reply-To: <3ec20ebe-5b01-0601-fad7-5252cf2afb9b@cs.ucla.edu> From: Jim Meyering Date: Sun, 19 Mar 2023 16:17:54 -0700 Message-ID: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Spam-Score: 0.2 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.8 (/) On Sun, Mar 19, 2023 at 4:12 PM Paul Eggert
wrote: > On 2023-03-19 13:44, Jim Meyering wrote: > > I've pushed your change along with the attached. > > I'll probably create another snapshot today. > > Thanks. I also installed a minor dfa.c change in Gnulib yesterday to > pacify Oracle Solaris Studio. No big deal since 'grep' builds OK anyway. > > I also ran into a weird issue with test-select on Fedora 37 x86-64. It > appears to be timing dependent and usually doesn't happen. I can't > reproduce under strace. This is another Gnulib thing and not relevant to > grep (other than people might report test failures to bug-grep). > > I installed into Gnulib the attached patch which shouldn't hurt but > which I don't know fixes the bug. Oh! I must have missed getting the latter by bare minutes. I've just published another snapshot (which does include the dfa.c change) but not the select one. We'll get it for the release of 3.10. From debbugs-submit-bounces@debbugs.gnu.org Mon Mar 20 01:30:50 2023 Received: (at control) by debbugs.gnu.org; 20 Mar 2023 05:30:50 +0000 Received: from localhost ([127.0.0.1]:53447 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1pe86b-0000qO-Ts for submit@debbugs.gnu.org; Mon, 20 Mar 2023 01:30:50 -0400 Received: from mail-lf1-f42.google.com ([209.85.167.42]:35783) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1pe86a-0000qA-ES for control@debbugs.gnu.org; Mon, 20 Mar 2023 01:30:48 -0400 Received: by mail-lf1-f42.google.com with SMTP id y20so13444161lfj.2 for ; Sun, 19 Mar 2023 22:30:48 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1679290241; h=to:subject:message-id:date:from:mime-version:x-gm-message-state :from:to:cc:subject:date:message-id:reply-to; bh=nIOBsvmbb9D1QkqDxBogEnVWO3mULjrezxmAOB0X1Y4=; b=UXaj9RETChjfYbyzu82A5z8+xRPi3rwuxtdoA417acCNs/xx9X3LgUn/+6ewyIBu9F fEorfMSD9RR62/PDwEGWiAKqDJ+IB5IEE2BPE+yNrEjbdk4ZDcbCIHwRH/D2Vc2cOaRc
6nGmVlQ8mH6UzcyLFqEJB5/9SQUF15yFyI+GVrtf9mPGIehsAznPUWMENvpdMQALU2ms uqa73d4bxEhvbeMWnRGoogtj0976xgbKu4OeDQs7A2+5COPHwK7CCOPzsh7HtPNpPrSV 85Do+ayCZymSfzb06Q6/iTqpeTpjs1v3kjlBUDadPToeSOyxaxNvFGIrRHJT4qULMyUm 7AtA== X-Gm-Message-State: AO0yUKW/3X/7gfwkDC/dUj08SVjI0vzNTpJrYF9uTv1JWHKFSu4ypYaL 2EnRIDRnypdpNxteSNtlUp23cehSzoGeUW+ZUxRXRhM530k= X-Google-Smtp-Source: AK7set+O09S6FqwT1zyyDvyDg9BEBXY6py0bGrz1sh2IqiVp4g6CT8TQqv0Krei/xU+bTatwsBWWBE8Z99sBSyXggC4= X-Received: by 2002:ac2:4897:0:b0:4e9:609f:2572 with SMTP id x23-20020ac24897000000b004e9609f2572mr3055064lfc.10.1679290241324; Sun, 19 Mar 2023 22:30:41 -0700 (PDT) MIME-Version: 1.0 From: Jim Meyering Date: Sun, 19 Mar 2023 22:30:29 -0700 Message-ID: Subject: To: GNU bug tracker automated control server Content-Type: text/plain; charset="UTF-8" X-Spam-Score: 2.2 (++) X-Spam-Report: Spam detection software, running on the system "debbugs.gnu.org", has NOT identified this incoming email as spam. The original message has been attached to this so you can view it or label similar future email. If you have any questions, see the administrator of that system for details. 
Content preview: close 62267 stop Content analysis details: (2.2 points, 10.0 required) pts rule name description ---- ---------------------- -------------------------------------------------- 0.0 FREEMAIL_FROM Sender email is commonly abused enduser mail provider (meyering[at]gmail.com) 0.0 SPF_HELO_NONE SPF: HELO does not publish an SPF Record -0.0 SPF_PASS SPF: sender matches SPF record 0.2 HEADER_FROM_DIFFERENT_DOMAINS From and EnvelopeFrom 2nd level mail domains are different -0.0 RCVD_IN_MSPIKE_H2 RBL: Average reputation (+2) [209.85.167.42 listed in wl.mailspike.net] -0.0 RCVD_IN_DNSWL_NONE RBL: Sender listed at https://www.dnswl.org/, no trust [209.85.167.42 listed in list.dnswl.org] 0.0 FREEMAIL_FORGED_FROMDOMAIN 2nd level domains in From and EnvelopeFrom freemail headers are different 2.0 BLANK_SUBJECT Subject is present but empty X-Debbugs-Envelope-To: control X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: 1.2 (+) X-Spam-Report: Spam detection software, running on the system "debbugs.gnu.org", has NOT identified this incoming email as spam. The original message has been attached to this so you can view it or label similar future email. If you have any questions, see the administrator of that system for details. 
Content preview: close 62267 stop Content analysis details: (1.2 points, 10.0 required) pts rule name description ---- ---------------------- -------------------------------------------------- -0.0 RCVD_IN_MSPIKE_H2 RBL: Average reputation (+2) [209.85.167.42 listed in wl.mailspike.net] -0.0 RCVD_IN_DNSWL_NONE RBL: Sender listed at https://www.dnswl.org/, no trust [209.85.167.42 listed in list.dnswl.org] 0.0 FREEMAIL_FROM Sender email is commonly abused enduser mail provider (meyering[at]gmail.com) 0.0 SPF_HELO_NONE SPF: HELO does not publish an SPF Record -0.0 SPF_PASS SPF: sender matches SPF record 0.2 HEADER_FROM_DIFFERENT_DOMAINS From and EnvelopeFrom 2nd level mail domains are different 0.0 FREEMAIL_FORGED_FROMDOMAIN 2nd level domains in From and EnvelopeFrom freemail headers are different 2.0 BLANK_SUBJECT Subject is present but empty -1.0 MAILING_LIST_MULTI Multiple indicators imply a widely-seen list manager close 62267 stop