GNU bug report logs - #15192
UTF-16 surrogate pair handling in grep -i option


Package: grep;

Reported by: Corinna Vinschen <vinschen <at> redhat.com>

Date: Mon, 26 Aug 2013 08:56:02 UTC

Severity: normal

Tags: moreinfo

Merged with 15199

Done: Jim Meyering <jim <at> meyering.net>

Bug is archived. No further changes may be made.



Message #8 received at 15192 <at> debbugs.gnu.org (full text, mbox):

From: Jim Meyering <jim <at> meyering.net>
To: Corinna Vinschen <vinschen <at> redhat.com>
Cc: 15192 <at> debbugs.gnu.org
Subject: Re: bug#15192: UTF-16 surrogate pair handling in grep -i option
Date: Mon, 26 Aug 2013 21:58:53 -0700
I guess it is a different point of view.  Maybe I'm just too
forward-thinking? :-)
I.e., if the remaining cygwin-specific bug is fixed soon, there will
be little reason for separate tests.
Are you planning to work on the cygwin/regexp bug?

On Mon, Aug 26, 2013 at 1:54 AM, Corinna Vinschen <vinschen <at> redhat.com> wrote:
> On Aug 25 12:49, Jim Meyering wrote:
>> On Mon, Aug 19, 2013 at 5:43 AM, Corinna Vinschen <...> wrote:
>> > But, here's a question:  If the surrogate-pair test fails without the
>> > patch due to the SEGV, and it also fails with the patch, just in a
>> > different way, what's the idea of the testcase?  In theory, shouldn't
>> > there be two tests, one of them testing only for this very SEGV, and
>> > another test testing how grep handles 4 byte UTF-8 values, since that's
>> > another problem entirely?
>>
>> It's a trade-off.  Split surrogate-pair testing into two very similar
>> test scripts?
>> Factor the similar parts into cfg.sh and use them from two test scripts?
>> It didn't feel like it was justified in this case, since it's a
>> cygwin-specific bug.
>>
>> If there's a short/reliable shell-level test for "is-cygwin", I suppose we
>
>   case $(uname -s) in
>   CYGWIN*)
>     ...;;
>   *)
>     ...;;
>   esac
>
>> could make the loop that iterates over grep options skip the currently-
>> known-to-fail cases on Cygwin systems.
>
> No, that's not right, IMHO.  It's a matter of how you define the test.
>
> Only one part of the test is actually testing for the SEGV bug, is all
> I'm saying.  If you want to have a PASS in the testsuite if this works,
> it should be a standalone test.
>
> The second part of the test tests if grep handles 4 byte UTF-8 sequences
> in regex'es correctly.  It's a different test.  If you define this one
> as a target-agnostic test, it requires another test script.
>
> If you define the whole script as *the* test for UTF-16 surrogates,
> I suppose it should stay as is and the testcase should FAIL on Cygwin
> as long as not all parts of grep grok UTF-16 surrogates.
>
> It's probably just a different point of view, so, never mind.
>
>
> Thanks,
> Corinna
>
> --
> Corinna Vinschen
> Cygwin Maintainer
> Red Hat
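
For readers following the thread, the `uname -s` check quoted above could be wrapped in a small helper and used to skip cases inside an option loop, as Jim suggests. This is only a sketch under assumptions: the names `is_cygwin` and `plan_cases`, the option list, and the choice of `-i` as the skipped case are all hypothetical and are not taken from the actual grep test script.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when the kernel name starts with CYGWIN.
# An optional argument overrides `uname -s`, so the check can be tried
# on a non-Cygwin machine.
is_cygwin() {
  case ${1-$(uname -s)} in
    CYGWIN*) return 0 ;;
    *)       return 1 ;;
  esac
}

# Hypothetical loop over grep options.  The option list and the choice of
# -i as the "known to fail on Cygwin" case are assumptions for illustration.
plan_cases() {
  for opt in -E -F -i; do
    if is_cygwin "$@" && [ "$opt" = -i ]; then
      echo "skip grep $opt"
    else
      echo "run grep $opt"
    fi
  done
}

plan_cases
```

Calling `plan_cases CYGWIN_NT-10.0` shows what the loop would do on a Cygwin host. Note that this is the skip-on-Cygwin approach Corinna argues against in her reply; her alternative is a separate standalone test for the SEGV.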






