GNU bug report logs -
#24924
multibyte: pr has no concept of wide characters
Message #14 received at 24924 <at> debbugs.gnu.org (full text, mbox):
I'm only arguing about the classification of this bug here.
Let's call a cat a cat. When something doesn't work as
documented, it's a bug, not a wishlist entry.
AFAICT, there's nothing in the GNU coreutils documentation that
states that pr only works on ASCII-only input, or more generally
on input that consists exclusively of single-byte characters
that are neither zero-width (though it copes OK with ASCII BS
and TAB) nor double-width.
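
To make the three width classes mentioned above concrete, here is a rough sketch in Python. It only approximates what a wcwidth()-style function reports; the helper name display_width is made up for illustration, and nothing here is taken from pr itself.

```python
import unicodedata

def display_width(ch):
    """Approximate number of terminal columns for one character.

    Hypothetical helper for illustration only: a crude stand-in for
    wcwidth(), not what pr or any libc actually implements.
    """
    # Combining marks and format characters occupy no column of their own.
    if unicodedata.combining(ch) or unicodedata.category(ch) in ("Mn", "Me", "Cf"):
        return 0
    # East Asian Wide and Fullwidth characters occupy two columns.
    if unicodedata.east_asian_width(ch) in ("W", "F"):
        return 2
    return 1

for ch in ["a", "\u0301", "漢"]:
    print(repr(ch), display_width(ch))
```

A column-aware tool would have to sum such widths per line rather than count bytes or characters.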
Today, UTF-8 is the most commonly used character encoding, so
this even affects English text (£, the British currency symbol,
is encoded as two bytes in UTF-8, for instance), and even
US-English text, for example the ‘quoting characters’ (3 bytes
each in UTF-8), now that ASCII ' has been demoted to just an
apostrophe.
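
The byte counts quoted above are easy to check. A minimal sketch in Python (the helper name utf8_len is made up for illustration):

```python
def utf8_len(ch):
    """Number of bytes the string occupies once encoded as UTF-8."""
    return len(ch.encode("utf-8"))

# Each sample is one character, but only the ASCII apostrophe is one byte.
for ch in ["£", "‘", "’", "'"]:
    print(repr(ch), "characters:", len(ch), "bytes:", utf8_len(ch))
```

Any tool that equates one byte with one column miscounts every line containing the first three characters.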
That can also be seen as a POSIX conformance bug (though GNU
coreutils doesn't claim POSIX conformance, only "The GNU
utilities documented here are /mostly/ compatible with the
POSIX standard").
$ pr -tm --sep-string='|' <(du --version) <(truncate --version)
du (GNU coreutils) 8.25 |truncate (GNU coreutils) 8.25
Copyright (C) 2016 Free Software Fo|Copyright (C) 2016 Free Software Fo
License GPLv3+: GNU GPL version 3 o|License GPLv3+: GNU GPL version 3 o
This is free software: you are free|This is free software: you are free
There is NO WARRANTY, to the extent|There is NO WARRANTY, to the extent
|
Written by Torbjörn Granlund, David |Written by Pádraig Brady.
and Jim Meyering. |
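
One model that is consistent with the output above (an assumption on my part, not something checked against pr's source) is that each ASCII byte counts as one column while the bytes of a multibyte character count as none. Under that model the "Torbjörn" line ends up one display column wider than its neighbours. The function name and the "x" filler standing in for the cut-off remainder of each line are made up for illustration.

```python
def truncate_columns(text, width):
    """Truncate text to `width` columns under an assumed counting rule.

    Assumption for illustration: bytes >= 0x80 (all bytes of a
    multibyte UTF-8 sequence) contribute zero columns, every other
    byte contributes one.
    """
    out = bytearray()
    cols = 0
    for b in text.encode("utf-8"):
        w = 0 if b >= 0x80 else 1
        if cols + w > width:
            break
        out.append(b)
        cols += w
    return out.decode("utf-8", errors="replace")

# "x" * 10 is filler for the part of each line that the output cuts off.
ascii_line = "Copyright (C) 2016 Free Software Fo" + "x" * 10
utf8_line = "Written by Torbjörn Granlund, David " + "x" * 10

for line in (ascii_line, utf8_line):
    cell = truncate_columns(line, 35)
    print(len(cell), repr(cell))
```

The ASCII line is cut to 35 characters and the line with ö to 36, which would push the separator one column to the right on that line.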
--
Stephane
This bug report was last modified 6 years and 231 days ago.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.