GNU bug report logs - #29606
Command 'fold' dangerous with utf-8 input

Package: coreutils

Reported by: Mark Roberts <mroberts <at> rapid-arts-movement.de>

Date: Thu, 7 Dec 2017 16:27:02 UTC

Severity: normal

Done: Assaf Gordon <assafgordon <at> gmail.com>

Bug is archived. No further changes may be made.


Message #17 received at 29606 <at> debbugs.gnu.org (full text, mbox):

From: Mark Roberts <mroberts <at> rapid-arts-movement.de>
To: Assaf Gordon <assafgordon <at> gmail.com>
Cc: 29606 <at> debbugs.gnu.org
Subject: Re: bug#29606: Command 'fold' dangerous with utf-8 input
Date: Fri, 8 Dec 2017 13:04:20 +0100 (CET)

Dear Assaf,

the reason for the unexpected behavior of 'fold', namely that specifying 
--bytes doesn't make it count bytes, is evident after a look at the source 
code.

When --bytes is not specified, the program treats '\b', '\r' and '\t' 
specially. It assumes a tab width of eight (compile-time #define) and 
attempts to keep track of what the output will look like.
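The column tracking described above can be sketched roughly as follows. This 
is an illustrative model based only on the description in this message, not 
the actual fold source: the names adjust_column and column_after are chosen 
here for illustration, and the handling of a backspace at column zero is an 
assumption.

```python
TAB_WIDTH = 8  # per the message: a fixed tab width of eight


def adjust_column(column: int, byte: int, count_bytes: bool = False) -> int:
    """Return the column after emitting one byte (illustrative model)."""
    if count_bytes:
        # With --bytes, every byte advances the count by one.
        return column + 1
    if byte == 0x08:  # '\b' moves back one column
        # Assumption: a backspace at column 0 leaves the column at 0.
        return max(column - 1, 0)
    if byte == 0x0D:  # '\r' returns to the start of the line
        return 0
    if byte == 0x09:  # '\t' advances to the next multiple of TAB_WIDTH
        return column + TAB_WIDTH - column % TAB_WIDTH
    # Every other byte counts as one column, including each byte
    # of a multi-byte UTF-8 sequence.
    return column + 1


def column_after(data: bytes, count_bytes: bool = False) -> int:
    """Column reached after emitting all of data, starting from column 0."""
    column = 0
    for byte in data:
        column = adjust_column(column, byte, count_bytes)
    return column
```

In this model, column_after(b"a\tb") is 9 without byte counting and 3 with 
it, while column_after("ä".encode("utf-8")) is 2 in both modes, because the 
character is encoded as two bytes and each byte is counted separately.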

This is absolutely not what I expected. But of course, when the program 
was first written, the words "byte" and "character" meant the same thing 
for printable characters: each printable character was a single byte.

I will attempt to suggest an improved text for the man-page so that 
others will not be surprised.

Mark




This bug report was last modified 7 years and 169 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.