Package: sed
Version: 4.4-2
Severity: important
Dear Maintainer,
With the locale set to en_US.utf8, the expected collating order is:
$ printf '%b' $(printf '\\U%x\\n' {32..127}) | sort | tr -d '\n'
`^~<=>| _-,;:!?/.'"()[]{}@$*\&#%+0123456789aAbBcCdDeEfFgGhHiIjJkKlLmMnNoOpPqQrRsStTuUvVwWxXyYzZ
Given that order, a range [a-z] is expected to match 'aAbBcCdD…', that is, all lowercase and uppercase letters.
But it does not:
$ printf '%b' $(printf '\\U%x' {32..127}) | sed 's/[^a-z]//g'
abcdefghijklmnopqrstuvwxyz
However, the range [a-Z] does match all letters, both lowercase and uppercase:
$ printf '%b' $(printf '\\U%x' {32..127}) | sed 's/[^a-Z]//g'
ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz
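For comparison, here is a minimal sketch of two invocations whose result should not depend on the locale's collating order, assuming a POSIX shell and a POSIX-conforming sed. The function names keep_lower_c and keep_alpha_c and the sample string are hypothetical, chosen only for illustration:

```shell
# Hypothetical sample input: a mix of ASCII letters, a space, digits and punctuation.
sample='aAbBzZ 09_-'

# Force the C locale, where a range is evaluated by byte value,
# so [a-z] covers only the lowercase ASCII letters.
keep_lower_c() {
    printf '%s\n' "$1" | LC_ALL=C sed 's/[^a-z]//g'
}

# Use the POSIX character class [:alpha:], which names the letters
# directly instead of relying on the endpoints of a range.
keep_alpha_c() {
    printf '%s\n' "$1" | LC_ALL=C sed 's/[^[:alpha:]]//g'
}

keep_lower_c "$sample"
keep_alpha_c "$sample"
```

With the sample string above, the first call should print abz and the second aAbBzZ.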
If this is how sed is intended to work, then could you please explain:
- What is the rationale behind this decision?
- Where is it documented?
- Where is it implemented in the code?
- Why does the manual say otherwise?