GNU bug report logs - #33775
multibyte: fold: multi-byte sequences as separate columns
To reply to this bug, email your comments to 33775 AT debbugs.gnu.org.
Report forwarded to bug-coreutils <at> gnu.org: bug#33775; Package coreutils. (Mon, 17 Dec 2018 02:15:01 GMT)
Acknowledgement sent to Michael Siegel <msi <at> malbolge.net>: New bug report received and forwarded. Copy sent to bug-coreutils <at> gnu.org. (Mon, 17 Dec 2018 02:15:01 GMT)
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
Hello,
I've just discovered an odd behavior of `fold' while trying to wrap a
piece of text containing phonetic characters.
Take the following line, for example:
Tcl (pronounced "tickle" or tee cee ell /ˈtiː siː ɛl/) is a high-level,
It is 71 characters long. Still, running
echo "Tcl (pronounced "tickle" or tee cee ell /ˈtiː siː ɛl/) is a
high-level," | fold -w 72 -s
produces
Tcl (pronounced tickle or tee cee ell /ˈtiː siː ɛl/) is a
high-level,
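The gap between characters and bytes in this example can be checked with a short Python sketch. It assumes the command was entered on a single line, so the shell consumes the inner double quotes and `fold' receives the text without them (as the output above shows). The names `line`, `chars` and `nbytes` are illustrative only.

```python
# The text as fold receives it: the shell has already removed the inner
# double quotes around "tickle" (assumption: the command was typed on one line).
line = "Tcl (pronounced tickle or tee cee ell /ˈtiː siː ɛl/) is a high-level,"

chars = len(line)                    # number of Unicode code points
nbytes = len(line.encode("utf-8"))   # number of bytes in UTF-8

# The phonetic characters (ˈ, ː twice, ɛ) take two bytes each in UTF-8.
print(chars, nbytes)                 # 69 73
```

With a width of 72, 69 characters would fit on one line, while 73 bytes would not, which is consistent with the break shown above.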
I've had someone test this with FreeBSD's `fold', which didn't behave
that way. Instead, it filled out the line as expected.
Further investigation by developers of Adélie Linux revealed that GNU's
`fold' is counting each byte of a multi-byte UTF-8 sequence (in this
case, the phonetic characters) as a separate column:
awilcox on gwyn [pts/11 Sun 16 19:01] ~: cat testing.txt
1234567890 234567890 234567890 234567890 234567890 234567890 234567890
/ˈtiː siː ɛl/ Adélie en français español ¿que? ¡ay! here is 70 chars ^
yep.
awilcox on gwyn [pts/11 Sun 16 19:01] ~: fold -w 72 -s testing.txt
1234567890 234567890 234567890 234567890 234567890 234567890 234567890
/ˈtiː siː ɛl/ Adélie en français español ¿que? ¡ay! here is 70
chars ^
yep.
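The difference seen in this transcript can be reproduced with a simplified model of `fold -s'. The sketch below is not GNU's implementation: `fold_s` and its `measure` parameter are hypothetical names, and counting code points is only a stand-in for real display width (wide and combining characters would need something like wcwidth).

```python
def fold_s(line, width, measure):
    """Greedy sketch of `fold -s`: break after the last blank that fits.

    `measure` maps a string to the number of columns it is assumed to occupy.
    """
    pieces = []
    while measure(line) > width:
        # Find the longest prefix that still fits in `width` columns.
        cut = len(line)
        while cut > 1 and measure(line[:cut]) > width:
            cut -= 1
        # Prefer to break just after the last blank inside that prefix.
        blank = line.rfind(" ", 0, cut)
        if blank != -1:
            cut = blank + 1
        pieces.append(line[:cut])
        line = line[cut:]
    pieces.append(line)
    return pieces


line = "/ˈtiː siː ɛl/ Adélie en français español ¿que? ¡ay! here is 70 chars ^"

# Counting bytes mimics the reported behaviour; counting code points
# mimics what the reporter expected.
by_bytes = fold_s(line, 72, lambda s: len(s.encode("utf-8")))
by_chars = fold_s(line, 72, len)

print(by_bytes)
print(by_chars)
```

Measured in bytes, the line is split before "chars ^", as in the transcript; measured in characters, it stays on one line.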
msi
Severity set to 'wishlist' from 'normal'. Request was from Assaf Gordon <assafgordon <at> gmail.com> to control <at> debbugs.gnu.org. (Sun, 23 Dec 2018 06:04:01 GMT)
Changed bug title to 'multibyte: fold: multi-byte sequences as separate columns' from 'fold: counting multi-byte utf-8 sequences as separate columns'. Request was from Assaf Gordon <assafgordon <at> gmail.com> to control <at> debbugs.gnu.org. (Sun, 23 Dec 2018 06:04:02 GMT)
Information forwarded to bug-coreutils <at> gnu.org: bug#33775; Package coreutils. (Sun, 23 Dec 2018 06:05:01 GMT)
Message #12 received at 33775 <at> debbugs.gnu.org (full text, mbox):
severity 33775 wishlist
retitle 33775 multibyte: fold: multi-byte sequences as separate columns
stop
Hello,
On 2018-12-16 6:32 p.m., Michael Siegel wrote:
> I've just discovered an odd behavior of `fold' while trying to wrap a
> piece of text containing phonetic characters.
>
> Take the following line, for example:
Thank you for reporting this issue and
providing clear, reproducible examples.
Adding complete multibyte/UTF-8 support to all coreutils
programs is an ongoing effort.
I'm marking this as a "wishlist" item, which will remain
open until we complete the implementation.
Related multibyte items are listed here (with "multibyte" prefix):
https://debbugs.gnu.org/cgi/pkgreport.cgi?which=pkg&data=coreutils
regards,
- assaf
This bug report was last modified 6 years and 178 days ago.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.