GNU bug report logs - #31526
Range [a-z] does not follow collate order from locale.
Reported by: Bize Ma <binaryzebra <at> gmail.com>
Date: Sat, 19 May 2018 07:39:02 UTC
Severity: important
Tags: notabug
Found in version 4.4-2
Done: Assaf Gordon <assafgordon <at> gmail.com>
Bug is archived. No further changes may be made.
Message #5 received at submit <at> debbugs.gnu.org:
Package: sed
Version: 4.4-2
Severity: important
Dear Maintainer,
With the locale set to en_US.utf8, the expected collating order is this:
$ printf '%b' $(printf '\\U%x\\n' {32..127}) | sort | tr -d '\n'
`^~<=>| _-,;:!?/.'"()[]{}@$*\&#%+0123456789aAbBcCdDeEfFgGhHiIjJ
kKlLmMnNoOpPqQrRsStTuUvVwWxXyYzZ
A range [a-z] is therefore expected to match 'aAbBcCdD…', that is, all
lowercase and uppercase letters.
But it does not:
$ printf '%b' $(printf '\\U%x' {32..127}) | sed 's/[^a-z]//g'
abcdefghijklmnopqrstuvwxyz
However, the range [a-Z] does match all letters, lowercase and uppercase:
$ printf '%b' $(printf '\\U%x' {32..127}) | sed 's/[^a-Z]//g'
ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz
If this is how sed is supposed to work, then please explain:
- What is the rationale behind this decision?
- Where is it documented?
- Where is it implemented in the code?
- Why does the manual say otherwise?
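For comparison, here is a minimal sketch of matching that does not depend on how a locale orders ranges. It assumes a POSIX shell and a POSIX-conforming sed. POSIX defines range expressions such as [a-z] only for the C (POSIX) locale and leaves them unspecified elsewhere, whereas character classes such as [[:lower:]] and [[:alpha:]] name the intended set directly. The sample string and variable names are illustrative only.

```shell
# Illustrative sample: mixed-case letters, digits, and punctuation.
input='aAbBzZ09_-'

# In the C locale a range is defined by byte values,
# so [a-z] covers lowercase ASCII letters only.
lower_c=$(printf '%s\n' "$input" | LC_ALL=C sed 's/[^a-z]//g')

# POSIX character classes state the intended set explicitly.
lower_class=$(printf '%s\n' "$input" | LC_ALL=C sed 's/[^[:lower:]]//g')
alpha_class=$(printf '%s\n' "$input" | LC_ALL=C sed 's/[^[:alpha:]]//g')

printf '%s\n' "$lower_c" "$lower_class" "$alpha_class"
```

With LC_ALL=C forced, the first two commands keep only "abz" and the third keeps "aAbBzZ".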
This bug report was last modified 7 years and 92 days ago.