GNU bug report logs - #16232
[PATCH] grep: make --ignore-case (-i) faster (sometimes 10x) in multibyte locales
Reported by: Jim Meyering <jim <at> meyering.net>
Date: Mon, 23 Dec 2013 22:40:02 UTC
Severity: normal
Tags: patch
Done: Paul Eggert <eggert <at> cs.ucla.edu>
Bug is archived. No further changes may be made.
Message #11 received at 16232 <at> debbugs.gnu.org (full text, mbox):
On Mon, Dec 23, 2013 at 2:52 PM, Eric Blake <eblake <at> redhat.com> wrote:
> On 12/23/2013 03:39 PM, Jim Meyering wrote:
>> FYI, here is a quick and clean/safe performance improvement for grep -i.
>> I expect to push this commit right after the upcoming bug-fix release.
>> Currently, this optimization is enabled when the search string is
>> ASCII and contains neither '\' (backslash) nor '['. I expect to
>> eliminate the latter two constraints in a follow-on commit including
>> tests to exercise all of the corner cases.
>>
>
>> +
>> + /* Worst case is that every byte of keys will be alpha,
>> + so every byte B will map to the sequence of 4 bytes [Bb]. */
>
> Umm, is this always true? Consider the UTF-8 Turkish locale, where
Hi Eric,
Thanks for the review.
Did you miss the "isascii" check in the new trivial_case_convert function?
If you can describe circumstances in which the new patch malfunctions,
please do, but everything you wrote seems to rely on a false assumption.
E.g., your Turkish-I example works fine with my patch.
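
For readers following the thread, here is a minimal sketch of the kind of
rewrite the quoted comment describes: each ASCII letter B in the search
string expands to the 4-byte bracket expression [Bb], and the rewrite is
skipped unless the string is pure ASCII and contains neither '\' nor '['.
This is not the code from the patch. The function name
sketch_case_fold_pattern and its return-NULL convention are hypothetical,
and the letter test is done with explicit ASCII ranges so the result does
not depend on the current locale.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical illustration only; not the trivial_case_convert
   function from the patch.  Return a malloc'd, NUL-terminated copy
   of KEYS (LEN bytes) in which every ASCII letter B is replaced by
   the bracket expression [Bb].  Return NULL if KEYS is not eligible
   (contains a non-ASCII byte, a backslash, or '[') or if allocation
   fails.  The caller frees the result.  */
static char *
sketch_case_fold_pattern (const char *keys, size_t len)
{
  /* Eligibility check: ASCII only, no '\' and no '['.  */
  for (size_t i = 0; i < len; i++)
    {
      unsigned char c = (unsigned char) keys[i];
      if (c >= 0x80 || c == '\\' || c == '[')
        return NULL;
    }

  /* Worst case: every byte is a letter and grows to 4 bytes.  */
  char *out = malloc (4 * len + 1);
  if (!out)
    return NULL;

  char *p = out;
  for (size_t i = 0; i < len; i++)
    {
      unsigned char c = (unsigned char) keys[i];
      int is_letter = ('A' <= c && c <= 'Z') || ('a' <= c && c <= 'z');
      if (is_letter)
        {
          *p++ = '[';
          *p++ = (char) (c & ~0x20);   /* ASCII upper-case form */
          *p++ = (char) (c | 0x20);    /* ASCII lower-case form */
          *p++ = ']';
        }
      else
        *p++ = (char) c;
    }
  *p = '\0';
  return out;
}
```

With this sketch, the 3-byte input "ab1" becomes "[Aa][Bb]1", while any
input containing a non-ASCII byte (for example the UTF-8 encoding of a
dotted capital I) is rejected and would be left to the existing,
slower matching path.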
This bug report was last modified 11 years and 82 days ago.