GNU bug report logs - #23269: new snapshot available: grep-2.24.13-bed6
Reported by: Jim Meyering <jim <at> meyering.net>
Date: Mon, 11 Apr 2016 15:54:02 UTC
Severity: normal
Done: Paul Eggert <eggert <at> cs.ucla.edu>
Bug is archived. No further changes may be made.
[The message below was originally sent only to Arnold, but I intended
it to go to 23269 <at> debbugs.gnu.org as well. Seeing as the conversation
regarding multi-threaded grep operation is continuing, I've decided to
forward it to the bug list. Apologies to Arnold (and others as
appropriate) if this is a duplicate. -- sur-behoffski]
-------- Forwarded Message --------
Subject: Re: bug#23269: Multi-threaded operation, mbrtowc, and "untangle" script [was Re: bug#23269...]
Date: Thu, 21 Apr 2016 21:32:15 +0930
From: sur-behoffski <sur_behoffski <at> grouse.com.au>
To: arnold <at> skeeve.com
On 04/21/16 19:25, arnold <at> skeeve.com wrote:
> sur-behoffski <sur_behoffski <at> grouse.com.au> wrote:
>
>> So, I'm not sure if a thread-safe (i.e. locale-safe) version of mbrtowc
>> exists; if not, this needs to be addressed before a split-locale,
>> multi-threaded version is feasible. (LC_CTYPE race conditions?)
>
> By definition, mbrtowc is thread safe. The question relates better
> to setlocale(), or rather to the underlying internal locale data. I don't
> think the current POSIX model lends itself to multiple locales within
> the same process.
>
Thanks for the response. As noted in the man pages, the thread safety
does not extend to multi-locale settings, and this is explicitly what Paul
was hoping for in the message that I replied to:
On 04/21/16 02:10, Paul Eggert wrote:
> [...]
> One thing that bugged me about dfa.c (when I was looking at this
> yesterday) is that it maintains some state in static variables, which
> means it can't be used in multiple threads using different locales.
> That's not an issue with grep or gawk now, but might be for other
> apps and might conceivably be a problem even in grep, which has a
> multithreaded patch pending and might conceivably want to use per-file
> encodings. [...]
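For the C library side of the multi-locale question, POSIX.1-2008 does provide per-thread locales through newlocale(), uselocale() and freelocale(). The following is a minimal sketch only, assuming a glibc-style system where these are declared in <locale.h>; the helper name first_char_len_in is hypothetical, and nothing here addresses the static state in dfa.c itself.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <locale.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: decode the first character of s under the named
   LC_CTYPE locale, changing the locale of the calling thread only.
   Returns the value from mbrtowc, or (size_t) -1 if the locale object
   cannot be created (which is indistinguishable here from an invalid
   sequence; a real caller would want separate error reporting). */
size_t
first_char_len_in (const char *locale_name, const char *s, size_t len)
{
  locale_t loc, old;
  mbstate_t st;
  wchar_t wc;
  size_t n;

  assert (locale_name != NULL && s != NULL);

  loc = newlocale (LC_CTYPE_MASK, locale_name, (locale_t) 0);
  if (loc == (locale_t) 0)
    return (size_t) -1;

  old = uselocale (loc);        /* per-thread; the global locale is untouched */
  memset (&st, 0, sizeof st);   /* caller-owned initial conversion state */
  n = mbrtowc (&wc, s, len, &st);
  uselocale (old);              /* restore before freeing loc */
  freelocale (loc);

  return n;
}
```

Two threads could call this with different locale names without touching setlocale(), since uselocale() affects only the calling thread. Whether that is enough for a split-locale grep is a separate question.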
"man 3 mbrtowc" on my Gentoo system has the following text in the ATTRIBUTES,
CONFORMING TO, NOTES and COLOPHON sections:
------ (Start of excerpt) ------
ATTRIBUTES
For an explanation of the terms used in this section, see attributes(7).
+----------+---------------+----------------------------+
|Interface | Attribute     | Value                      |
+----------+---------------+----------------------------+
|mbrtowc() | Thread safety | MT-Unsafe race:mbrtowc/!ps |
+----------+---------------+----------------------------+
CONFORMING TO
POSIX.1-2001, POSIX.1-2008, C99.
NOTES
The behavior of mbrtowc() depends on the LC_CTYPE category of the current locale.
[...]
COLOPHON
This page is part of release 4.04 of the Linux man-pages project. A description of the
project, information about reporting bugs, and the latest version of this page, can be
found at http://www.kernel.org/doc/man-pages/.
GNU 2015-08-08 MBRTOWC(3)
------ (End of excerpt) ------
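As I read attributes(7), the "/!ps" qualifier in the table above means the race applies when the ps argument is NULL, in which case mbrtowc() falls back on a hidden static conversion state. The sketch below passes a caller-owned mbstate_t instead; the function name count_chars is hypothetical. This only sidesteps that one race: the result still depends on the LC_CTYPE category of whichever locale is current.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: count the characters in the first len bytes of s,
   using a local conversion state rather than mbrtowc's internal static
   one.  Returns (size_t) -1 on an invalid or truncated sequence. */
size_t
count_chars (const char *s, size_t len)
{
  mbstate_t st;
  size_t count = 0;

  assert (s != NULL);
  memset (&st, 0, sizeof st);   /* initial conversion state */

  while (len > 0)
    {
      wchar_t wc;
      size_t n = mbrtowc (&wc, s, len, &st);

      if (n == (size_t) -1 || n == (size_t) -2)
        return (size_t) -1;     /* invalid or incomplete sequence */
      if (n == 0)
        n = 1;                  /* an embedded NUL occupies one byte */

      s += n;
      len -= n;
      count++;
    }

  return count;
}
```

Because st lives on the caller's stack, concurrent calls from different threads do not share conversion state.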
cheers,
sur-behoffski (Brenton Hoff)
Programmer, Grouse Software