GNU bug report logs - #72524
how does grep determine locale if no LC environment variables are set


Package: grep

Reported by: <mark.yagnatinsky <at> barclays.com>

Date: Thu, 8 Aug 2024 12:55:02 UTC

Severity: normal



Message #14 received at 72524 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: mark.yagnatinsky <at> barclays.com
Cc: 72524 <at> debbugs.gnu.org
Subject: Re: bug#72524: how does grep determine locale if no LC environment
 variables are set
Date: Fri, 9 Aug 2024 14:57:15 -0700
On 2024-08-08 12:24, mark.yagnatinsky <at> barclays.com wrote:
> Re: how am I doing that ... via bash, just like the way you suggested I run "locale" the second time:
> LC_CTYPE=C.UTF-8 grep -P needle haystack.txt  # just CTYPE seems to be enough, no need for ALL

As an aside, I wouldn't mess with LC_CTYPE independently. One can get 
into trouble if the LC_CTYPE locale disagrees with the others. However, 
I don't think that's your problem.


> Re: is_using_utf8 ... It relies on mbrtowc, which in turn relies on the current locale.
> It seems that this function should NEVER return false in a UTF-8 locale.

Correct.

> But how does grep decide what the locale even is?
> Presumably it must call setlocale at some point, or else it would be using the C locale, which is surely a unibyte locale.

Correct, it calls 'setlocale (LC_ALL, "")' first thing.
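For reference, when the locale argument is "", POSIX has setlocale choose each category's locale from the environment in the order LC_ALL, then the category's own variable (LC_CTYPE here), then LANG, skipping variables that are unset or empty. The following is only a rough sketch of that lookup for LC_CTYPE, not grep's or the C library's actual code. The function name ctype_locale_from_env is made up for illustration, and the final fallback to "C" is an assumption that matches glibc; POSIX leaves that default implementation-defined.

```c
#include <stdlib.h>

/* Hypothetical helper: return the locale name that the environment
   requests for LC_CTYPE, using the POSIX precedence
   LC_ALL > LC_CTYPE > LANG.  Unset or empty variables are skipped.
   The "C" fallback is an assumption (it is what glibc does); POSIX
   says only that the default is implementation-defined.  */
static const char *
ctype_locale_from_env (void)
{
  static const char *const vars[] = { "LC_ALL", "LC_CTYPE", "LANG" };

  for (size_t i = 0; i < sizeof vars / sizeof vars[0]; i++)
    {
      const char *val = getenv (vars[i]);
      if (val && *val)
        return val;
    }
  return "C";
}
```

This reports only what the environment asks for. Whether the C library can actually load a locale of that name is a separate question, which is why checking the result of setlocale itself is more reliable.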


> is there any good way to find out what locale it actually got "resolved" to?

You could modify the source code to add a call like this:

   fprintf (stderr, "grep: locale is %s\n", setlocale (LC_ALL, 0));

after the earlier call to setlocale. Or you could run 'setlocale 
(LC_ALL, 0)' in a debugger.
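
If rebuilding grep is inconvenient, a small standalone function run under the same environment should show much the same thing. This is only a sketch: report_locale is a hypothetical name, and the extra nl_langinfo (CODESET) and MB_CUR_MAX lines are my additions, included because they show directly whether the resolved locale is multibyte.

```c
#include <langinfo.h>
#include <locale.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: adopt the environment's locale the same way
   grep does, then write what it resolved to on OUT.  Return 0 on
   success, or -1 if setlocale could not honor the environment, in
   which case the locale is left unchanged.  */
static int
report_locale (FILE *out)
{
  int status = setlocale (LC_ALL, "") ? 0 : -1;

  /* A null second argument queries the current locale without
     changing it.  */
  const char *name = setlocale (LC_ALL, NULL);

  fprintf (out, "locale is %s\n", name ? name : "(unknown)");
  fprintf (out, "codeset is %s\n", nl_langinfo (CODESET));
  fprintf (out, "MB_CUR_MAX is %zu\n", (size_t) MB_CUR_MAX);
  return status;
}
```

Calling report_locale (stderr) from a trivial main, with the same environment that grep sees, should tell you whether the C library ended up in a UTF-8 locale. A return value of -1 would suggest that the environment names a locale that is not installed on that machine.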




This bug report was last modified 309 days ago.
