On 12/23/2013 04:12 PM, Jim Meyering wrote: > Did you miss the "isascii" check in the new trivial_case_convert function? No. But even with that check in place: > If you can describe circumstances in which the new patch malfunctions, > please do, > but everything you wrote seems to rely on a false assumption. No, it's a quite real complaint - your patch is broken for tr_TR. > E.g., your turkish-I example works fine with my patch. isascii('i') is true, but converting 'i' to '[iI]' is incorrect in the tr_TR locale. Rather, the conversion must be to '[iİ]'; similarly, 'I' would be translated to '[Iı]'. Neither of those conversions fit in 4 bytes (since dotted-capital-I and dotless-lower-i are both multi-byte characters). Need help easily finding those characters on a non-Turkish keyboard? I used: $ echo iI | LC_ALL=tr_TR.UTF-8 sed 's/\(.\)\(.\)/\U\1\L\2/' At any rate, prior to your patch, lower dotless i in the buffer gives an insensitive match to upper dotless I in the pattern: $ echo ı | LC_ALL=tr_TR.UTF-8 grep -i I || echo no match ı After your patch: $ echo ı | LC_ALL=tr_TR.UTF-8 src/grep -i I || echo no match no match Oops, you failed to match lower dotless i insensitively against upper dotless I, because upper dotless I is ascii, but you incorrectly converted it into the wrong pattern. -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org