GNU bug report logs -
#15192
UTF-16 surrogate pair handling in grep -i option
To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 15192 in the body.
You can then email your comments to 15192 AT debbugs.gnu.org in the normal way.
Report forwarded to bug-grep <at> gnu.org: bug#15192; Package grep. (Mon, 26 Aug 2013 08:56:02 GMT)
Full text and rfc822 format available.
Acknowledgement sent to Corinna Vinschen <vinschen <at> redhat.com>: New bug report received and forwarded. Copy sent to bug-grep <at> gnu.org. (Mon, 26 Aug 2013 08:56:03 GMT)
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
On Aug 25 12:49, Jim Meyering wrote:
> On Mon, Aug 19, 2013 at 5:43 AM, Corinna Vinschen <...> wrote:
> > But, here's a question: If the surrogate-pair test fails without the
> > patch due to the SEGV, and it also fails with the patch, just in a
> > different way, what's the idea of the testcase? In theory, shouldn't
> > there be two tests, one of them testing only for this very SEGV, and
> > another test testing how grep handles 4 byte UTF-8 values, since that's
> > another problem entirely?
>
> It's a trade-off. Split surrogate-pair testing into two very similar
> test scripts?
> Factor the similar parts into cfg.sh and use them from two test scripts?
> It didn't feel like it was justified in this case, since it's a
> cygwin-specific bug.
>
> If there's a short/reliable shell-level test for "is-cygwin", I suppose we
case $(uname -s) in
CYGWIN*)
...;;
*)
...;;
esac
> could make the loop that iterates over grep options skip the currently-
> known-to-fail cases on Cygwin systems.
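A minimal sketch of the kind of shell-level check being discussed, assuming `uname -s` reports a name starting with `CYGWIN` on Cygwin. The helper name `is_cygwin` and the messages are hypothetical and not part of grep's test suite:

```shell
# Hypothetical helper: succeed if the kernel name looks like Cygwin.
# An optional argument overrides `uname -s`, which makes the check testable.
is_cygwin() {
  case "${1-$(uname -s)}" in
    CYGWIN*) return 0 ;;
    *)       return 1 ;;
  esac
}

# Illustration only: a test loop could branch on the result like this.
if is_cygwin; then
  echo "Cygwin detected: would skip the known-to-fail cases"
else
  echo "not Cygwin: would run every case"
fi
```

Whether skipping is the right policy is exactly what the rest of this message questions; the sketch only shows the detection step.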
No, that's not right, IMHO. It's a matter of how you define the test.
Only one part of the test is actually testing for the SEGV bug, is all
I'm saying. If you want to have a PASS in the testsuite if this works,
it should be a standalone test.
The second part of the test tests if grep handles 4 byte UTF-8 sequences
in regex'es correctly. It's a different test. If you define this one
as a target-agnostic test, it requires another test script.
If you define the whole script as *the* test for UTF-16 surrogates,
I suppose it should stay as is and the testcase should FAIL on Cygwin
as long as not all parts of grep grok UTF-16 surrogates.
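As background for the distinction drawn above: a code point above U+FFFF takes 4 bytes in UTF-8 and, on a platform whose wide characters are 16-bit UTF-16 units (as on Cygwin), two code units forming a surrogate pair. A minimal sketch of that arithmetic follows; the function name `surrogate_pair` and the code point U+10400 are arbitrary illustrations, not taken from the test script:

```shell
# Hypothetical helper: print the UTF-16 surrogate pair for a code point
# above U+FFFF. The argument is a number; hex literals such as 0x10400 work.
surrogate_pair() {
  cp=$(($1))
  v=$((cp - 0x10000))            # 20-bit offset into the supplementary planes
  hi=$((0xD800 + (v >> 10)))     # high surrogate carries the top 10 bits
  lo=$((0xDC00 + (v & 0x3FF)))   # low surrogate carries the bottom 10 bits
  printf 'U+%X -> %04X %04X\n' "$cp" "$hi" "$lo"
}

surrogate_pair 0x10400   # prints "U+10400 -> D801 DC00"
```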
It's probably just a different point of view, so, never mind.
Thanks,
Corinna
--
Corinna Vinschen
Cygwin Maintainer
Red Hat
Information forwarded to bug-grep <at> gnu.org: bug#15192; Package grep. (Tue, 27 Aug 2013 05:00:03 GMT)
Message #8 received at 15192 <at> debbugs.gnu.org (full text, mbox):
I guess it is a different point of view. Maybe I'm just too
forward-thinking? :-)
I.e., if the remaining cygwin-specific bug is fixed soon, there will
be little reason for separate tests.
Are you planning to work on the cygwin/regexp bug?
On Mon, Aug 26, 2013 at 1:54 AM, Corinna Vinschen <vinschen <at> redhat.com> wrote:
> On Aug 25 12:49, Jim Meyering wrote:
>> On Mon, Aug 19, 2013 at 5:43 AM, Corinna Vinschen <...> wrote:
>> > But, here's a question: If the surrogate-pair test fails without the
>> > patch due to the SEGV, and it also fails with the patch, just in a
>> > different way, what's the idea of the testcase? In theory, shouldn't
>> > there be two tests, one of them testing only for this very SEGV, and
>> > another test testing how grep handles 4 byte UTF-8 values, since that's
>> > another problem entirely?
>>
>> It's a trade-off. Split surrogate-pair testing into two very similar
>> test scripts?
>> Factor the similar parts into cfg.sh and use them from two test scripts?
>> It didn't feel like it was justified in this case, since it's a
>> cygwin-specific bug.
>>
>> If there's a short/reliable shell-level test for "is-cygwin", I suppose we
>
> case $(uname -s) in
> CYGWIN*)
> ...;;
> *)
> ...;;
> esac
>
>> could make the loop that iterates over grep options skip the currently-
>> known-to-fail cases on Cygwin systems.
>
> No, that's not right, IMHO. It's a matter of how you define the test.
>
> Only one part of the test is actually testing for the SEGV bug, is all
> I'm saying. If you want to have a PASS in the testsuite if this works,
> it should be a standalone test.
>
> The second part of the test tests if grep handles 4 byte UTF-8 sequences
> in regex'es correctly. It's a different test. If you define this one
> as a target-agnostic test, it requires another test script.
>
> If you define the whole script as *the* test for UTF-16 surrogates,
> I suppose it should stay as is and the testcase should FAIL on Cygwin
> as long as not all parts of grep grok UTF-16 surrogates.
>
> It's probably just a different point of view, so, never mind.
>
>
> Thanks,
> Corinna
>
> --
> Corinna Vinschen
> Cygwin Maintainer
> Red Hat
Information forwarded to bug-grep <at> gnu.org: bug#15192; Package grep. (Tue, 27 Aug 2013 09:36:02 GMT)
Message #11 received at 15192 <at> debbugs.gnu.org (full text, mbox):
On Aug 26 21:58, Jim Meyering wrote:
> I guess it is a different point of view. Maybe I'm just too
> forward-thinking? :-)
> I.e., if the remaining cygwin-specific bug is fixed soon, there will
> be little reason for separate tests.
> Are you planning to work on the cygwin/regexp bug?
Not in the next couple of weeks. I'll be abroad for a while. Maybe in
October or November. I put it on my TODO list. The biggest problem is
that I know the BSD regex code pretty well, but the gnulib regex code is
very different. I already had a look two weeks ago, but I have not
found the right place to mount surrogate pairs into it :(
Corinna
--
Corinna Vinschen
Cygwin Maintainer
Red Hat
Information forwarded to bug-grep <at> gnu.org: bug#15192; Package grep. (Sat, 31 Aug 2013 20:26:02 GMT)
Message #14 received at 15192 <at> debbugs.gnu.org (full text, mbox):
Hi Corinna,
Sorry about the delay.
I'm prepared to push the following. Are you ok with that?
On Tue, Aug 27, 2013 at 2:35 AM, Corinna Vinschen <vinschen <at> redhat.com> wrote:
> On Aug 26 21:58, Jim Meyering wrote:
>> I guess it is a different point of view. Maybe I'm just too
>> forward-thinking? :-)
>> I.e., if the remaining cygwin-specific bug is fixed soon, there will
>> be little reason for separate tests.
>> Are you planning to work on the cygwin/regexp bug?
>
> Not in the next couple of weeks. I'll be abroad for a while. Maybe in
> October or November. I put it on my TODO list. The biggest problem is
> that I know the BSD regex code pretty well, but the gnulib regex code is
> very different. I already had a look two weeks ago, but I have not
> found the right place to mount surrogate pairs into it :(
>
>
> Corinna
>
> --
> Corinna Vinschen
> Cygwin Maintainer
> Red Hat
[k.txt (text/plain, attachment)]
Information forwarded to bug-grep <at> gnu.org: bug#15192; Package grep. (Sun, 01 Sep 2013 08:58:02 GMT)
Message #17 received at 15192 <at> debbugs.gnu.org (full text, mbox):
Hi Jim,
On Aug 31 13:24, Jim Meyering wrote:
> Hi Corinna,
> Sorry about the delay.
> I'm prepared to push the following. Are you ok with that?
Looks good, thank you.
Corinna
Information forwarded to bug-grep <at> gnu.org: bug#15192; Package grep. (Sun, 01 Sep 2013 15:27:02 GMT)
Message #20 received at 15192 <at> debbugs.gnu.org (full text, mbox):
pushed
On Sun, Sep 1, 2013 at 1:57 AM, Corinna Vinschen <vinschen <at> redhat.com> wrote:
> Hi Jim,
>
> On Aug 31 13:24, Jim Meyering wrote:
>> Hi Corinna,
>> Sorry about the delay.
>> I'm prepared to push the following. Are you ok with that?
>
> Looks good, thank you.
>
>
> Corinna
bug closed, send any further explanations to 15192 <at> debbugs.gnu.org and Corinna Vinschen <vinschen <at> redhat.com>
Request was from Jim Meyering <jim <at> meyering.net> to control <at> debbugs.gnu.org. (Sun, 01 Sep 2013 15:29:02 GMT)
bug archived.
Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Mon, 30 Sep 2013 11:24:04 GMT)
bug unarchived.
Request was from Eric Blake <eblake <at> redhat.com> to control <at> debbugs.gnu.org. (Mon, 28 Apr 2014 14:06:01 GMT)
Forcibly Merged 15192 15199.
Request was from Eric Blake <eblake <at> redhat.com> to control <at> debbugs.gnu.org. (Mon, 28 Apr 2014 14:06:01 GMT)
bug archived.
Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Tue, 27 May 2014 11:24:04 GMT)
This bug report was last modified 11 years and 30 days ago.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.