GNU bug report logs -
#29606
Command 'fold' dangerous with utf-8 input
Message #8 received at 29606 <at> debbugs.gnu.org (full text, mbox):
Hello,
On 2017-12-07 03:10 AM, Mark Roberts wrote:
> I am using fold version 8.13 on a Debian 3.2.93-1
Do you mean Debian 7 (Wheezy) with Linux kernel 3.2.93-1?
>> cat filename | fold
>
> If 'filename' contains utf8 characters consisting of more than one byte,
> fold will consider breaking the line inside such a character. There is
> no option to stop it doing that.
That is correct. "fold" currently (as of coreutils version 8.28) does
not support UTF-8 characters.
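
To illustrate the behaviour described above, here is a minimal Python sketch (not the coreutils implementation). It contrasts cutting a line at fixed byte offsets with cutting it at character boundaries. The function names fold_bytes, fold_chars and is_valid_utf8 are hypothetical, and the sketch deliberately ignores what the real "fold" also handles (tabs, backspaces, display width of wide or combining characters).

```python
def fold_bytes(data: bytes, width: int) -> list[bytes]:
    """Cut at fixed byte offsets, ignoring character boundaries."""
    return [data[i:i + width] for i in range(0, len(data), width)]


def fold_chars(text: str, width: int) -> list[str]:
    """Cut at code point boundaries, so no character is ever split."""
    return [text[i:i + width] for i in range(0, len(text), width)]


def is_valid_utf8(chunk: bytes) -> bool:
    """Return True if the chunk decodes cleanly as UTF-8."""
    try:
        chunk.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Five two-byte characters: 5 code points, 10 bytes in UTF-8.
line = "\u00e9" * 5
encoded = line.encode("utf-8")

byte_chunks = fold_bytes(encoded, 3)   # odd width lands inside a character
char_chunks = fold_chars(line, 3)

for chunk in byte_chunks:
    print(chunk, "valid" if is_valid_utf8(chunk) else "BROKEN")
print(char_chunks)
```

With a width of 3, every byte-based chunk ends or starts in the middle of a two-byte sequence, while the character-based chunks remain valid text. Counting code points is still only an approximation of column width; it is used here just to show where the split points differ.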
> or (3) that 'fold' fails to read my "LANG" environment variable which
> clearly states a UTF-8 locale. This, in 2017, is an error.
Considering you are using Debian 7 from 2013 and coreutils 8.13 from 2011,
the fact that it is 2017 is not very relevant.
There is an ongoing effort to add multibyte/UTF-8 support to all
coreutils programs. You can read more about it here:
https://crashcourse.housegordon.org/coreutils-multibyte-support.html
The current development patches do have UTF-8 support in fold.
> Please write back [...] if you need example data or clarifications.
If you'd like to help us test these patches, please try
an unofficial development snapshot here:
https://files.housegordon.org/src/coreutils-multibyte-experimental-8.28.39-79242.tar.xz
regards,
- assaf
This bug report was last modified 7 years and 169 days ago.