GNU bug report logs - #29606
Command 'fold' dangerous with utf-8 input
Dear maintainers,
I am using fold version 8.13 on a Debian 3.2.93-1 system:
> cat filename | fold
If 'filename' contains UTF-8 characters consisting of more than one byte,
fold may break the line in the middle of such a character. There is no
option to stop it from doing that.
Except, of course, "-s" (break at spaces). But that may not be what the
user wants.
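To illustrate the effect described above, here is a minimal Python sketch.
It is not fold's code, only a model of what happens when a line is cut at a
fixed byte count versus a fixed character count. The function names
(fold_bytes, fold_chars, is_valid_utf8) and the sample line are made up for
this illustration.

```python
def fold_bytes(data: bytes, width: int) -> list[bytes]:
    """Cut a line every `width` bytes, ignoring character boundaries."""
    return [data[i:i + width] for i in range(0, len(data), width)]


def fold_chars(text: str, width: int) -> list[str]:
    """Cut a line every `width` characters (Unicode code points)."""
    return [text[i:i + width] for i in range(0, len(text), width)]


def is_valid_utf8(chunk: bytes) -> bool:
    """Return True if `chunk` decodes cleanly as UTF-8."""
    try:
        chunk.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Hypothetical sample line: U+00E9 (e with acute) is two bytes in UTF-8.
line = "ab\u00e9cd"

byte_pieces = fold_bytes(line.encode("utf-8"), 3)
char_pieces = fold_chars(line, 3)

print(byte_pieces)                              # pieces cut by byte count
print([is_valid_utf8(p) for p in byte_pieces])  # are the pieces still valid UTF-8?
print(char_pieces)                              # pieces cut by character count
```

With a width of 3, the byte-wise cut lands between the two bytes of the
accented character, so neither piece is valid UTF-8 on its own. The
character-wise cut keeps the character intact.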
According to the man page, fold counts columns by default, not bytes. This
seems not to be true: the switch "-b" (count bytes) has no influence on the
output in my test case.
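Bytes, characters and columns are three different measures. The following
Python sketch shows how they can differ for the same string. The function
name `measure` is made up for this illustration, and the column count is
only an approximation: it assumes wide and fullwidth East Asian characters
take two columns, combining marks take none, and everything else takes one.
It says nothing about how fold itself counts.

```python
import unicodedata


def measure(text: str) -> tuple[int, int, int]:
    """Return (UTF-8 bytes, characters, approximate display columns)."""
    n_bytes = len(text.encode("utf-8"))
    n_chars = len(text)
    n_cols = 0
    for ch in text:
        if unicodedata.combining(ch):
            # Assumption: combining marks occupy no column of their own.
            continue
        # Assumption: wide/fullwidth characters occupy two columns.
        n_cols += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return n_bytes, n_chars, n_cols


for sample in ["abc", "\u00e9", "e\u0301", "\u65e5\u672c\u8a9e"]:
    print(repr(sample), measure(sample))
```

Only for pure ASCII text do the three numbers agree, so a tool that counts
bytes will produce the same output as one that counts columns only as long
as the input contains no multi-byte characters.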
How to fix this?
I presume that either (1) the default behavior (counting columns) does not
mean what I expect, namely counting characters instead of bytes. This would
have to be clarified in the man page.
Or (2) the default isn't what the man page says it is: possibly the
default set in the code is to count bytes. This would be an error.
Or (3) 'fold' fails to read my "LANG" environment variable, which
clearly states a UTF-8 locale. This, in 2017, is an error.
Please write back to mroberts <at> rapid-arts-movement.de if you need example
data or clarifications.
Thank you,
Mark Roberts
This bug report was last modified 7 years and 169 days ago.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.