GNU bug report logs - #29606
Command 'fold' dangerous with utf-8 input

Package: coreutils

Reported by: Mark Roberts <mroberts <at> rapid-arts-movement.de>

Date: Thu, 7 Dec 2017 16:27:02 UTC

Severity: normal

Done: Assaf Gordon <assafgordon <at> gmail.com>

Bug is archived. No further changes may be made.


Message #8 received at 29606 <at> debbugs.gnu.org (full text, mbox):

From: Assaf Gordon <assafgordon <at> gmail.com>
To: Mark Roberts <mroberts <at> rapid-arts-movement.de>, 29606 <at> debbugs.gnu.org
Subject: Re: bug#29606: Command 'fold' dangerous with utf-8 input
Date: Thu, 7 Dec 2017 09:46:38 -0700
Hello,

On 2017-12-07 03:10 AM, Mark Roberts wrote:

> I am using fold version 8.13 on a Debian 3.2.93-1

Do you mean Debian 7 (Wheezy) with Linux kernel 3.2.93-1?

>> cat filename | fold
> 
> If 'filename' contains utf8 characters consisting of more than one byte, 
> fold will consider breaking the line inside such a character. There is 
> no option to stop it doing that.

That is correct. "fold" currently (as of coreutils version 8.28) does 
not support UTF-8 characters.
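
To make the failure mode concrete, here is a minimal Python sketch of byte-oriented folding. It is not the coreutils source; `fold_bytes` is a hypothetical helper that only imitates splitting a line every N bytes, and the sample string and the width of 3 are chosen for illustration.

```python
def fold_bytes(line: bytes, width: int = 80) -> list[bytes]:
    """Split a line into chunks of at most `width` bytes.

    Hypothetical stand-in for byte-oriented folding: character
    boundaries are ignored on purpose.
    """
    return [line[i:i + width] for i in range(0, len(line), width)]


# "naive cafe" with two accented letters; each accented letter
# occupies two bytes in UTF-8, so 10 characters become 12 bytes.
text = "na\u00efve caf\u00e9"
data = text.encode("utf-8")

chunks = fold_bytes(data, width=3)
for chunk in chunks:
    # errors="replace" substitutes U+FFFD for any partial character.
    print(chunk, "->", chunk.decode("utf-8", errors="replace"))
```

The first chunk ends with the lead byte of a two-byte character and the second chunk starts with its continuation byte, so neither line is valid UTF-8 on its own, although concatenating the chunks restores the original bytes.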

> or (3) that 'fold' fails to read my "LANG" environment variable which 
> clearly states a UTF-8 locale. This, in 2017, is an error.

Considering that you are using Debian 7 from 2013 and
coreutils 8.13 from 2011, the fact that it is 2017 is not very relevant.

There is an ongoing effort to add multibyte/UTF-8 support to all
coreutils programs. You can read more about it here:
https://crashcourse.housegordon.org/coreutils-multibyte-support.html

The current development patches do have UTF-8 support in fold.
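
For contrast, a character-aware fold can be sketched in Python as follows. This is an illustration only and not the code in the development patches; `fold_chars` is a hypothetical helper, and it counts Unicode code points, which is a simplification (display width, wide characters, and combining marks are not handled).

```python
def fold_chars(line: str, width: int = 80) -> list[str]:
    """Split a decoded line into pieces of at most `width` code points.

    Hypothetical helper: works on characters rather than bytes, so a
    multibyte character is never cut in half.
    """
    return [line[i:i + width] for i in range(0, len(line), width)]


# Same sample as a byte-oriented fold would mangle: two of the ten
# characters take two bytes each in UTF-8.
text = "na\u00efve caf\u00e9"

pieces = fold_chars(text, width=3)
for piece in pieces:
    # Every piece encodes cleanly because breaks fall between characters.
    print(piece.encode("utf-8"))
```

Decoding the input before folding is the key design choice here: the break positions are computed on characters, and each piece is encoded again only on output.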

> Please write back [...] if you need example data or clarifications.

If you'd like to help us test these patches, please try
an unofficial development snapshot here:

https://files.housegordon.org/src/coreutils-multibyte-experimental-8.28.39-79242.tar.xz



regards,
 - assaf




This bug report was last modified 7 years and 169 days ago.
