GNU bug report logs - #27555
cygwin surrogate-pair test now fails (since grep-2.22, and no one noticed)
[Message part 1 (text/plain, inline)]
Your message dated Tue, 23 Nov 2021 18:34:51 -0800
with message-id <32d817a3-4dd3-113a-addd-69f56174ef2f <at> cs.ucla.edu>
and subject line Re: [PATCH 1/1] tests: make surrogate-pair pass under Cygwin
has caused the debbugs.gnu.org bug report #27555,
regarding surrogate-pair test fails under Cygwin
to be marked as done.
(If you believe you have received this mail in error, please contact
help-debbugs <at> gnu.org.)
--
27555: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=27555
GNU Bug Tracking System
Contact help-debbugs <at> gnu.org with problems
[Message part 2 (message/rfc822, inline)]
[Message part 3 (text/plain, inline)]
The 3rd surrogate-pair test fails under Cygwin:
> # Also test whether a surrogate-pair in the search string works.
This fails with grep-3.7 and with the latest commit.
Reproduces easily enough from the command line:
> printf '%s\n' "$(printf '\360\220\220\205')" >in
> LANG=en_US.utf8
> locale
> src/grep --file=in in
This reports a match under Linux but not under Cygwin. Tested with Cygwin64 on
Windows 7 Home and Windows 10.
Comparing gdb sessions between the platforms, I noticed:
> linux: sbclen = '\001' <repeats 128 times>, '\377' <repeats 66 times>, '\376' <repeats 60 times>, "\377\377"
> cygwin: sbclen = '\001' <repeats 128 times>, '\377' <repeats 64 times>, '\376' <repeats 53 times>, '\377' <repeats 11 times>
in `dfa` (i.e. dfa.localeinfo.sbclen).
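To see exactly which byte values the two tables disagree on, the run-length descriptions printed by gdb can be expanded and compared. The Python sketch below does only that. The helper name `expand` and the variable names are illustrative, and the values are copied from the gdb output above; reading \377 and \376 as "invalid" and "incomplete" results for a single byte is an assumption about how `sbclen` is filled, not something shown in this report.

```python
def expand(runs):
    """Expand (value, count) run-length pairs into a flat list."""
    table = []
    for value, count in runs:
        table.extend([value] * count)
    return table

# Run lengths copied from the gdb output for each platform.
linux = expand([(0o001, 128), (0o377, 66), (0o376, 60), (0o377, 2)])
cygwin = expand([(0o001, 128), (0o377, 64), (0o376, 53), (0o377, 11)])

# Both tables should cover every possible byte value.
assert len(linux) == len(cygwin) == 256

# Byte values whose sbclen entry differs between the platforms.
differing = [b for b in range(256) if linux[b] != cygwin[b]]

for b in differing:
    print("byte %d (0x%02X): linux=\\%03o cygwin=\\%03o"
          % (b, b, linux[b], cygwin[b]))
```

Taking the gdb output at face value, the tables differ only at bytes 0xC0-0xC1 and 0xF5-0xFD (which includes index 250); the lead byte \360 (0xF0) of the test string has the same entry on both platforms.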
Also this:
> linux: enlistnew (cpp=0x, new=0x "\360\220\220\205") at dfa.c:3928
> cygwin: enlistnew (cpp=0x, new=0x "\360\355\260\205") at dfa.c:3928
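The two `new` strings can be decoded byte by byte to see how they relate. The Python sketch below is only an illustration: it assumes the octal escapes printed by gdb are the exact bytes, and the variable names are made up for the example. That the Cygwin string contains a lone UTF-16 low surrogate is an observation about those bytes, not a confirmed diagnosis.

```python
# Bytes as printed by gdb for the enlistnew() argument on each platform.
linux_bytes = b"\360\220\220\205"
cygwin_bytes = b"\360\355\260\205"

# The Linux bytes are valid UTF-8 for a single code point outside the BMP.
ch = linux_bytes.decode("utf-8")
print("linux code point: U+%04X" % ord(ch))

# UTF-16 needs a surrogate pair to represent that code point.
utf16 = ch.encode("utf-16-be")
high = int.from_bytes(utf16[:2], "big")
low = int.from_bytes(utf16[2:], "big")
print("UTF-16 surrogates: %04X %04X" % (high, low))

# The last three Cygwin bytes decode (when surrogates are let through)
# to a lone surrogate code point; strict UTF-8 decoding would reject them.
tail = cygwin_bytes[1:].decode("utf-8", "surrogatepass")
print("cygwin tail: U+%04X" % ord(tail))
```

The Cygwin string keeps the first byte \360 of the original and follows it with the UTF-8 form of the low surrogate of the same character. That would be consistent with the pattern having passed through a 16-bit `wchar_t` on the way, though the gdb output alone does not prove it.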
Locale data differs for the same locale on the two systems. I investigated
this further by setting a breakpoint where the code starts to compute
sbclen[250], which is \376 under Linux but \377 under Cygwin. I captured the
gdb sessions using `script` and have attached them in the hope that they are
of some help.
If your system rejects the tar.gz attachment, I'll send them as plain text in
separate emails. They compare best in a side-by-side diff highlighting changed
characters. I find `tkdiff` good for this: from View choose "Show inline
comparison (recursive)".
Uninteresting differences between the sessions have been removed:
Automatic
- strip hex numbers (addresses usually) to plain 0x
- remove escape sequences (colouring &c.)
- probably other stuff
Specifics
- force matching locale names
- insert blank lines at linux:72 to line up return stmt
- split linux:100 to more easily see later args
Cheers ... Duncan.
[gdb_sessions.tar.gz (application/x-tar-gz, attachment)]
[Message part 5 (message/rfc822, inline)]
[Message part 6 (text/plain, inline)]
I installed the attached fancier patch instead; please give it a try.
I'm boldly marking the bug as fixed; we can unmark it later if I'm wrong.
[0001-tests-skip-surrogate-search-test-on-Cygwin.patch (text/x-patch, attachment)]
This bug report was last modified 3 years and 180 days ago.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.