GNU bug report logs -
#72524
how does grep determine locale if no LC environment variables are set
Message #14 received at 72524 <at> debbugs.gnu.org:
On 2024-08-08 12:24, mark.yagnatinsky <at> barclays.com wrote:
> Re: how am I doing that ... via bash, just like the way you suggested I run "locale" the second time:
> LC_CTYPE=C.UTF-8 grep -P needle haystack.txt # just CTYPE seems to be enough, no need for ALL
As an aside, I wouldn't mess with LC_CTYPE independently. One can get
into trouble if the LC_CTYPE locale disagrees with the others. However,
I don't think that's your problem.
> Re: is_using_utf8 ... It relies on mbrtowc, which in turn relies on the current locale.
> It seems that this function should NEVER return false in a UTF-8 locale.
Correct.
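
To make the check under discussion concrete, here is a minimal sketch of a UTF-8 probe built on mbrtowc. The helper name locale_decodes_utf8 is hypothetical and this is not copied from grep's source. It also assumes wchar_t holds Unicode code points, as it does on glibc.

```c
#include <locale.h>
#include <stdbool.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper, not grep's actual code: report whether the
   current LC_CTYPE locale decodes the two-byte UTF-8 sequence C4 80
   as the single character U+0100.  Assumes wchar_t values are
   Unicode code points.  */
static bool
locale_decodes_utf8 (void)
{
  wchar_t wc;
  mbstate_t mbs;
  memset (&mbs, 0, sizeof mbs);   /* start in the initial shift state */
  return mbrtowc (&wc, "\xc4\x80", 2, &mbs) == 2 && wc == 0x100;
}
```

Because the result depends entirely on the locale in effect at the time of the call, a probe like this is only meaningful after the program has called setlocale. In the default C locale it should report false.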
> But how does grep decide what the locale even is?
> Presumably it must call setlocale at some point, or else it would be using the C locale, which is surely a unibyte locale.
Correct, it calls 'setlocale (LC_ALL, "")' first thing.
> is there any good way to find out what locale it actually got "resolved" to?
You could modify the source code to add a call like this:
    fprintf (stderr, "grep: locale is %s\n", setlocale (LC_ALL, 0));
after the earlier call to setlocale. Or you could run 'setlocale
(LC_ALL, 0)' in a debugger.
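
If rebuilding grep is inconvenient, the same query can be tried in a small standalone program. The following is only a sketch: report_locale is a hypothetical helper name, and it shows what the C library resolves for a process started with the same environment, not what grep itself does internally.

```c
#include <locale.h>
#include <stdio.h>

/* Hypothetical helper: adopt the locale named by the environment,
   then print and return the name the C library resolved it to.  */
static const char *
report_locale (FILE *out)
{
  /* An empty string means "take the locale from the environment".  */
  if (!setlocale (LC_ALL, ""))
    fprintf (out, "setlocale failed; the locale is left unchanged\n");

  /* A null locale argument queries the current setting without
     changing it.  */
  const char *name = setlocale (LC_ALL, NULL);
  fprintf (out, "locale is %s\n", name ? name : "(unknown)");
  return name;
}
```

Calling report_locale (stderr) from main and running the program once with no LC_* or LANG variables set, and once with LC_CTYPE=C.UTF-8, should show the difference. When categories disagree, the returned string is implementation-defined; on glibc it is typically a composite listing each category.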
This bug report was last modified 309 days ago.