GNU bug report logs - #16232
[PATCH] grep: make --ignore-case (-i) faster (sometimes 10x) in multibyte locales


Package: grep;

Reported by: Jim Meyering <jim <at> meyering.net>

Date: Mon, 23 Dec 2013 22:40:02 UTC

Severity: normal

Tags: patch

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.



Message #11 received at 16232 <at> debbugs.gnu.org:

From: Jim Meyering <jim <at> meyering.net>
To: Eric Blake <eblake <at> redhat.com>
Cc: 16232 <at> debbugs.gnu.org
Subject: Re: bug#16232: [PATCH] grep: make --ignore-case (-i) faster
 (sometimes 10x) in multibyte locales
Date: Mon, 23 Dec 2013 15:12:26 -0800

On Mon, Dec 23, 2013 at 2:52 PM, Eric Blake <eblake <at> redhat.com> wrote:
> On 12/23/2013 03:39 PM, Jim Meyering wrote:
>> FYI, here is a quick and clean/safe performance improvement for grep -i.
>> I expect to push this commit right after the upcoming bug-fix release.
>> Currently, this optimization is enabled only when the search string is
>> ASCII and contains neither '\' (backslash) nor '['.  I expect to
>> eliminate the latter two constraints in a follow-on commit that includes
>> tests to exercise all of the corner cases.
>>
>
>> +
>> +  /* Worst case is that every byte of keys will be alpha,
>> +     so every byte B will map to the sequence of 4 bytes [Bb].  */
>
> Umm, is this always true?  Consider the UTF-8 Turkish locale, where

Hi Eric,

Thanks for the review.
Did you miss the "isascii" check in the new trivial_case_convert function?
If you can describe circumstances in which the new patch malfunctions,
please do, but everything you wrote seems to rely on a false assumption.
E.g., your Turkish-I example works fine with my patch.
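
For readers following the thread, the transformation under discussion can be sketched roughly as below. This is not the code from the patch: the names pattern_is_eligible and expand_case_insensitive are hypothetical, and the sketch only assumes what the messages above state, namely that the pattern must be pure ASCII with no '\' and no '[', and that each alphabetic byte B becomes the 4-byte sequence [Bb]. The case folding uses plain bit arithmetic on ASCII letters rather than toupper/tolower, which is an assumption made here to keep the result independent of the locale.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: true if every byte is ASCII and the pattern
   contains neither a backslash nor '['.  */
static bool
pattern_is_eligible (const char *keys, size_t len)
{
  for (size_t i = 0; i < len; i++)
    {
      unsigned char c = keys[i];
      if (c >= 0x80 || c == '\\' || c == '[')
        return false;
    }
  return true;
}

/* Hypothetical helper: return a malloc'd, NUL-terminated copy of KEYS
   in which each ASCII letter B is replaced by the bracket expression
   [Bb] (upper case first).  Return NULL if the pattern is not eligible
   or if allocation fails.  The caller frees the result.  */
static char *
expand_case_insensitive (const char *keys, size_t len)
{
  assert (keys != NULL);
  if (!pattern_is_eligible (keys, len))
    return NULL;

  /* Worst case: every byte is a letter and grows to 4 bytes.  */
  char *out = malloc (4 * len + 1);
  if (!out)
    return NULL;

  char *p = out;
  for (size_t i = 0; i < len; i++)
    {
      unsigned char c = keys[i];
      bool is_letter = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
      if (is_letter)
        {
          *p++ = '[';
          *p++ = (char) (c & ~0x20);  /* ASCII upper-case form */
          *p++ = (char) (c | 0x20);   /* ASCII lower-case form */
          *p++ = ']';
        }
      else
        *p++ = (char) c;
    }
  *p = '\0';
  return out;
}
```

Under these assumptions, expand_case_insensitive ("ab1", 3) yields the string "[Aa][Bb]1", while a pattern containing a non-ASCII byte (such as the UTF-8 encoding of a Turkish dotless i) is rejected and would be left to the slower, general code path.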




This bug report was last modified 11 years and 82 days ago.


