GNU bug report logs -
#47264
RFE: pcre2 support
[Message part 1 (text/plain, inline)]
Your message dated Sun, 14 Nov 2021 12:45:29 -0800
with message-id <2fc13013-90f8-e63e-c833-6636dd28bdf6 <at> cs.ucla.edu>
and subject line Re: bug#47264: [PATCH v2] pcre: migrate to pcre2
has caused the debbugs.gnu.org bug report #47264,
regarding RFE: pcre2 support
to be marked as done.
(If you believe you have received this mail in error, please contact
help-debbugs <at> gnu.org.)
--
47264: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=47264
GNU Bug Tracking System
Contact help-debbugs <at> gnu.org with problems
[Message part 2 (message/rfc822, inline)]
Hi,
PCRE upstream superseded the PCRE library with the PCRE2 project
in 2015. Upstream now considers PCRE obsolete and devotes no
resources to it except for critical bug fixes. Please consider
adding PCRE2 support.
Downstream Fedora bug:
https://bugzilla.redhat.com/show_bug.cgi?id=1755491
thanks & regards
Jaroslav
[Message part 3 (message/rfc822, inline)]
[Message part 4 (text/plain, inline)]
On 11/9/21 02:58, Carlo Marcelo Arenas Belón wrote:
> Sadly, hadn't been able to generate a release,
Does this mean you're having trouble running 'make dist'? If so, what's
the trouble?
> it seems to be ready for some broader testing, specially if the
> attached patch is applied on top of a 10.37 release (tested that way
> in OpenBSD i386)
OK, thanks, I installed it into the Savannah master copy of GNU grep,
except that I didn't rename m4/pcre.m4 to m4/pcre2.m4, or rename the
macros to use PCRE2. This made the change easier to audit. Revised patch
0001 attached.
Also, I followed up with several related patches (also attached as
0002-0012). Please take a look at them and let us know of any problems.
In the attached patch "grep: prefer signed integers" I followed the
usual grep approach of preferring signed to unsigned integers (e.g.,
idx_t to size_t) when either will do; this lets us debug better with
-fsanitize=undefined to catch integer overflow.
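
To illustrate the signed-versus-unsigned point, here is a minimal sketch and not code from the patch. The names my_idx_t, grow_unsigned and checked_add are hypothetical, and my_idx_t is only a stand-in for Gnulib's idx_t. Unsigned arithmetic wraps silently and the wrap is well defined, so -fsanitize=undefined has nothing to report. Signed overflow is undefined behavior, so an unchecked signed addition would be flagged at run time. The checked helper shows how the same overflow can be rejected explicitly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for Gnulib's idx_t (a signed index type).  */
typedef ptrdiff_t my_idx_t;

/* Unsigned addition: wraps modulo SIZE_MAX + 1 without any diagnostic.  */
static size_t
grow_unsigned (size_t n, size_t inc)
{
  return n + inc;
}

/* Signed addition with an explicit range check, so the sum itself
   never overflows.  Returns false if N + INC is not representable.  */
static bool
checked_add (my_idx_t n, my_idx_t inc, my_idx_t *out)
{
  assert (out != NULL);
  if (inc > 0 && n > PTRDIFF_MAX - inc)
    return false;
  if (inc < 0 && n < PTRDIFF_MIN - inc)
    return false;
  *out = n + inc;
  return true;
}
```

With the unsigned version, growing a size of SIZE_MAX by 1 quietly yields 0. The signed version reports the failure to the caller instead.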
One issue I discovered: PCRE2_EXTRA_MATCH_WORD (which is used by
pcre2grep -w) is incompatible with 'grep -w'. For example, 'echo a%%a |
grep -Pw %%' outputs nothing, whereas 'echo a%%a | pcre2grep -w %%'
outputs 'a%%a'. I think the GNU grep behavior (which is the same as with
'grep -w', either on Linux or OpenBSD) is more intuitive here: do you
happen to know why PCRE behaves the way it does? Is that worth a PCRE2
bug report? Anyway, the attached patches avoid using
PCRE2_EXTRA_MATCH_WORD for that reason.
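
The difference can be modeled without PCRE2 at all. The sketch below uses hypothetical helper names and assumes ASCII word characters (letters, digits, underscore). It compares two acceptance tests for one candidate match: a boundary test in the style of \b, which is how the PCRE2 documentation describes PCRE2_EXTRA_MATCH_WORD, and the documented 'grep -w' rule that the match must not be adjacent to a word constituent. It checks a single candidate position only and is not how either program is implemented.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Word constituent: letter, digit, or underscore (ASCII assumption).  */
static bool
is_word (unsigned char c)
{
  return isalnum (c) || c == '_';
}

/* True if position POS in S (length N) is a \b-style word boundary:
   exactly one of the neighboring characters is a word constituent.  */
static bool
at_word_boundary (const char *s, size_t n, size_t pos)
{
  bool before = pos > 0 && is_word ((unsigned char) s[pos - 1]);
  bool after = pos < n && is_word ((unsigned char) s[pos]);
  return before != after;
}

/* Boundary-style test: a word boundary at both ends of the match.  */
static bool
boundary_style_ok (const char *s, size_t start, size_t len)
{
  assert (s != NULL);
  size_t n = strlen (s);
  return at_word_boundary (s, n, start)
         && at_word_boundary (s, n, start + len);
}

/* 'grep -w'-style test: no word constituent immediately before or
   after the match.  */
static bool
grep_style_ok (const char *s, size_t start, size_t len)
{
  assert (s != NULL);
  size_t n = strlen (s);
  bool before = start > 0 && is_word ((unsigned char) s[start - 1]);
  bool after = start + len < n && is_word ((unsigned char) s[start + len]);
  return !before && !after;
}
```

For the line 'a%%a' with the match '%%' at offset 1, the boundary-style test accepts it because each end lies between a word and a non-word character. The grep-style test rejects it because 'a' is adjacent on both sides.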
> * no more version restrictions (should work with >~10.20)
I tested with 10.20 and found one more glitch (it doesn't have
PCRE2_SIZE_MAX), which is fixed by the attached patch "grep: port to
PCRE2 10.20".
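
A common way to cope with a macro that an older library header lacks is a guarded fallback definition. The sketch below shows that general technique and is not necessarily the content of the attached patch. It assumes that PCRE2_SIZE is size_t, so SIZE_MAX is a suitable default. The helpers fits_pcre2_size and to_pcre2_size are hypothetical. The pcre2.h include is left as a comment so the fragment compiles without the library installed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* In real code these two lines would come first:
     #define PCRE2_CODE_UNIT_WIDTH 8
     #include <pcre2.h>
   They are omitted here so the sketch builds without PCRE2.  */

/* Older headers may lack PCRE2_SIZE_MAX; fall back to SIZE_MAX
   (assumption: PCRE2_SIZE is size_t).  */
#ifndef PCRE2_SIZE_MAX
# define PCRE2_SIZE_MAX SIZE_MAX
#endif

/* Hypothetical helper: does N fit in a PCRE2 length argument?  */
static int
fits_pcre2_size (uintmax_t n)
{
  return n <= PCRE2_SIZE_MAX;
}

/* Hypothetical helper: convert N, insisting that it fits.  */
static size_t
to_pcre2_size (uintmax_t n)
{
  assert (fits_pcre2_size (n));
  return (size_t) n;
}
```

Because the definition is guarded by #ifndef, a newer header that already provides the macro takes precedence and the fallback is skipped.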
> Pending:
> * what to do with the current support of \C (enabled for now)
Let's open another bug report for that; I'm still a bit fuzzy about what
the pros and cons are.
> * merge of non critical bugfix in #51710[1]
I plan to follow up in that bug report.
Marking this bug as done. Thanks again for working on this.
[0001-grep-migrate-to-pcre2.patch (text/x-patch, attachment)]
[0002-maint-minor-rewording-and-reindenting.patch (text/x-patch, attachment)]
[0003-grep-Don-t-limit-jitstack_max-to-INT_MAX.patch (text/x-patch, attachment)]
[0004-grep-improve-pcre2_get_error_message-comments.patch (text/x-patch, attachment)]
[0005-grep-speed-up-fix-bad-UTF8-check-with-P.patch (text/x-patch, attachment)]
[0006-grep-prefer-signed-integers.patch (text/x-patch, attachment)]
[0007-grep-use-PCRE2_EXTRA_MATCH_LINE.patch (text/x-patch, attachment)]
[0008-grep-simplify-JIT-setup.patch (text/x-patch, attachment)]
[0009-grep-improve-memory-exhaustion-checking-with-P.patch (text/x-patch, attachment)]
[0010-grep-use-ximalloc-not-xcalloc.patch (text/x-patch, attachment)]
[0011-grep-fix-minor-P-memory-leak.patch (text/x-patch, attachment)]
[0012-grep-port-to-PCRE2-10.20.patch (text/x-patch, attachment)]