GNU bug report logs -
#24924
multibyte: pr has no concept of wide characters
Previous Next
Full log
Message #20 received at 24924 <at> debbugs.gnu.org (full text, mbox):
2016-11-30 18:37:05 -0800, Paul Eggert:
> On 11/30/2016 03:30 AM, Stephane Chazelas wrote:
> >That can also be seen as a POSIX conformance bug
>
> Not really, as POSIX does not require support for UTF-8 (except in
> the pax utility, which is not part of coreutils).
[...]
POSIX does not require support for any charset. It only
specifies one locale (C/POSIX), doesn't specify the charset in
that locale other than it should be a single byte charset that
covers the portable character set. Examples of such charsets are
ASCII, iso8859-x or EBCDIC. In practice, that tends to be ASCII
(except for some rare EBCDIC based IBM systems) as tha
But it does support a localisation API and allows system to
support other locales with other charsets. That API does support
multi-byte encodings, including stateful ones (though how they
are /defined/ is implementation defined for lock-shift ones and
in practice those are unworkable so I'd expect those would
eventually be removed from the standard). It doesn't require
compliant systems to have locales with multi-byte character sets,
but if they have (if they show up in the output of locale -a),
then they have to be supported throughout (as specified, for all
the utilities for instance).
Basically, on systems that have locales with multi-byte
encodings --UTF-8 or other-- (most Unix-like ones including GNU
systems like Debian), GNU pr (and many other GNU utilities) is
not POSIX compliant.
See
http://pubs.opengroup.org/onlinepubs/9699919799.2016edition/basedefs/V1_chap06.html
for details.
--
Stephane
This bug report was last modified 6 years and 231 days ago.
Previous Next
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.