GNU bug report logs - #60708
pcre: improve support for linking with a library without unicode

Package: grep

Reported by: Carlo Arenas <carenas <at> gmail.com>

Date: Tue, 10 Jan 2023 11:14:02 UTC

Severity: normal

From: Carlo Arenas <carenas <at> gmail.com>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: Jim Meyering <jim <at> meyering.net>, 60708 <at> debbugs.gnu.org
Subject: bug#60708: pcre: improve support for linking with a library without unicode
Date: Wed, 11 Jan 2023 20:03:43 -0800
On Wed, Jan 11, 2023 at 6:29 PM Paul Eggert <eggert <at> cs.ucla.edu> wrote:
>
> Oh, I think I see your point, but doesn't this mean that even my code was
> too trusting? It should be something like this:
>
>    if (localeinfo.multibyte)
>      {
>        uint32_t unicode;
>        if (! (localeinfo.using_utf8
>               && 0 <= pcre2_config (PCRE2_CONFIG_UNICODE, &unicode)
>               && unicode))
>          die (EXIT_TROUBLE, 0,
>               _("-P supports only unibyte and UTF-8 locales"));
>        ...
>
> That is, we're better off diagnosing the problem and not attempting to
> use pcre2 if the result will be wrong (or even result in undefined
> behavior). The problem is unlikely to occur so it's good to be
> conservative here.

Maybe we are not clear on what the "problem" is.

The original code was trying to avoid setting PCRE2_UTF when the
library lacks Unicode support, since that makes grep fail with a PCRE
error (as shown in the commit message).  It also caused some tests to
be skipped, since that error could not be distinguished from grep
failing because -P wasn't supported.  Your suggested code doesn't
address that; it merely replaces the error message with one that would
IMHO be even less clear, and makes the problem worse.
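
As a rough sketch of that intent (not the actual grep code), the
decision can be written as a pure function of the locale and of what
pcre2_config (PCRE2_CONFIG_UNICODE, &unicode) reported.  The names
sketch_compile_flags and SKETCH_UTF are hypothetical; SKETCH_UTF only
stands in for PCRE2_UTF so that the snippet builds without <pcre2.h>.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the PCRE2_UTF compile option, so that this
   sketch builds without <pcre2.h>.  Real code would use PCRE2_UTF.  */
enum { SKETCH_UTF = 1 };

/* Return the compile options to request (hypothetical helper).
   USING_UTF8 mirrors localeinfo.using_utf8.  CONFIG_RC and UNICODE are
   assumed to be the return value and the output of
     pcre2_config (PCRE2_CONFIG_UNICODE, &unicode).  */
static uint32_t
sketch_compile_flags (bool using_utf8, int config_rc, uint32_t unicode)
{
  bool library_has_unicode = 0 <= config_rc && unicode;

  /* Request UTF mode only when both the locale and the library can
     honor it; otherwise match byte-wise instead of failing.  */
  return using_utf8 && library_has_unicode ? SKETCH_UTF : 0;
}
```

With a UTF-8 locale and a library built without Unicode, this returns 0,
so the pattern would be compiled without the UTF flag rather than
rejected.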

Using a non-Unicode PCRE library is perfectly fine and carries no
"undefined behavior" risk.  Indeed, `grep -P` without the UTF flag is
exactly what the alternate path uses, and what is recommended for
speed, so?

Carlo




This bug report was last modified 2 years and 154 days ago.
