GNU bug report logs - #33775
multibyte: fold: multi-byte sequences as separate columns
Message #5 received at submit <at> debbugs.gnu.org:
Hello,
I've just discovered an odd behavior of `fold' while trying to wrap a
piece of text containing phonetic characters.
Take the following line, for example:
Tcl (pronounced "tickle" or tee cee ell /ˈtiː siː ɛl/) is a high-level,
It is 71 characters long. Still, running
echo "Tcl (pronounced "tickle" or tee cee ell /ˈtiː siː ɛl/) is a
high-level," | fold -w 72 -s
produces
Tcl (pronounced tickle or tee cee ell /ˈtiː siː ɛl/) is a
high-level,
I've had someone test this with FreeBSD's `fold', which didn't behave
that way. Instead, it filled out the line as expected.
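For reference, the character count above can be reproduced with a short Python sketch (not part of the original report). It also prints the UTF-8 byte length of the same line, which becomes relevant below. The variable names are only illustrative.

```python
# The example line from the report, including the double quotes.
LINE = 'Tcl (pronounced "tickle" or tee cee ell /ˈtiː siː ɛl/) is a high-level,'

n_chars = len(LINE)                  # number of Unicode code points
n_bytes = len(LINE.encode("utf-8"))  # number of bytes once encoded as UTF-8

print(n_chars, n_bytes)  # 71 75
```

Note that in the `echo` command above the shell consumes the inner double quotes, so `fold' receives two fewer characters and two fewer bytes than this sketch counts.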
Further investigation by developers of Adélie Linux revealed that GNU's
`fold' counts each byte of a multi-byte UTF-8 sequence (in this case, the
phonetic characters) as a separate column:
awilcox on gwyn [pts/11 Sun 16 19:01] ~: cat testing.txt
1234567890 234567890 234567890 234567890 234567890 234567890 234567890
/ˈtiː siː ɛl/ Adélie en français español ¿que? ¡ay! here is 70 chars ^
yep.
awilcox on gwyn [pts/11 Sun 16 19:01] ~: fold -w 72 -s testing.txt
1234567890 234567890 234567890 234567890 234567890 234567890 234567890
/ˈtiː siː ɛl/ Adélie en français español ¿que? ¡ay! here is 70
chars ^
yep.
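The difference between counting bytes and counting characters can be illustrated with a toy model of `fold -w 72 -s`. This Python sketch is a simplification under stated assumptions, not GNU or FreeBSD code: `fold_line`, `byte_cost`, and `char_cost` are hypothetical names, and the character-based variant treats every code point as one column, ignoring wide and combining characters.

```python
def byte_cost(ch):
    """Columns charged for ch when every UTF-8 byte counts as a column."""
    return len(ch.encode("utf-8"))


def char_cost(ch):
    """Columns charged for ch when every code point counts as one column."""
    return 1


def fold_line(line, width, cost):
    """Greedily fold one line, breaking after the last blank when possible."""
    out, cur, used = [], "", 0
    for ch in line:
        c = cost(ch)
        if used + c > width:
            cut = cur.rfind(" ")
            if cut >= 0:
                # Keep the blank at the end of the emitted line.
                out.append(cur[:cut + 1])
                cur = cur[cut + 1:]
            else:
                # No blank available: hard break at the width limit.
                out.append(cur)
                cur = ""
            used = sum(cost(x) for x in cur)
        cur += ch
        used += c
    out.append(cur)
    return out


LINE = 'Tcl (pronounced "tickle" or tee cee ell /ˈtiː siː ɛl/) is a high-level,'

by_bytes = fold_line(LINE, 72, byte_cost)
by_chars = fold_line(LINE, 72, char_cost)

for piece in by_bytes:
    print(repr(piece))
print(len(by_chars))
```

With `byte_cost`, the model moves "high-level," onto a second line, as in the report; with `char_cost`, the 71-character line fits within 72 columns and stays whole.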
This bug report was last modified 6 years and 178 days ago.