GNU bug report logs - #32236
df header corrupted with LANG=zh_TW.UTF-8 on macOS


Package: coreutils

Reported by: Chih-Hsuan Yen <yan12125 <at> gmail.com>

Date: Sat, 21 Jul 2018 16:10:02 UTC

Severity: normal

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.




From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Pádraig Brady <P <at> draigBrady.com>, Chih-Hsuan Yen <yan12125 <at> gmail.com>, 32236 <at> debbugs.gnu.org, bug-gnulib <bug-gnulib <at> gnu.org>
Subject: bug#32236: df header corrupted with LANG=zh_TW.UTF-8 on macOS
Date: Sun, 22 Jul 2018 10:01:09 -0700

Pádraig Brady wrote:
> I did want to only avoid \n etc. that might cause issues for
> programs that parsed output from df on a line by line basis.
> This subset of control characters is safe to identify.
> It seems problematic to start eliding improperly encoded
> mount points for example, rather than just outputting
> what's there.

Yes, I suppose you're right, it's not df's job to police encodings.
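
As an illustration of the narrower approach described in the quoted text, here is a minimal sketch in Python (not df's actual C code). It replaces only ASCII control bytes, so that line-by-line parsers are not confused, and passes every other byte through untouched, including bytes that are not valid in the current encoding. The helper name replace_control_bytes and the '?' replacement are assumptions made for this example.

```python
def replace_control_bytes(cell: bytes, replacement: bytes = b"?") -> bytes:
    """Replace ASCII control bytes (0x00-0x1F and 0x7F); keep all other bytes.

    Hypothetical helper for illustration only. No decoding is attempted,
    so improperly encoded names are output as they are.
    """
    out = bytearray()
    for b in cell:
        if b < 0x20 or b == 0x7F:
            out += replacement
        else:
            out.append(b)
    return bytes(out)


# A newline embedded in a mount point name is neutralized.
print(replace_control_bytes(b"/mnt/a\nb"))

# A stray 0xFF byte (never valid in UTF-8) is left alone.
print(replace_control_bytes(b"/mnt/\xff"))
```

In UTF-8 every byte of a multibyte sequence is 0x80 or above, so a byte-level check like this cannot mistake part of a multibyte character for an ASCII control character.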

> Also just incrementing width++ per each wide character
> doesn't seem right, though again I've not tested it.

True as well. OK, please ignore my patch.
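
To illustrate the width point: a wide character such as a CJK ideograph normally occupies two terminal columns, so adding one per character undercounts. The following is a rough Python sketch, not coreutils code. It uses unicodedata.east_asian_width as an approximate stand-in for what wcwidth() reports in C; display_width is a hypothetical helper and the sample string is arbitrary.

```python
import unicodedata


def display_width(text: str) -> int:
    """Approximate the number of terminal columns needed for text.

    Hypothetical helper: wide and fullwidth characters count as 2 columns,
    combining marks as 0, everything else as 1. Control characters and
    other special cases are not handled.
    """
    width = 0
    for ch in text:
        if unicodedata.combining(ch):
            continue  # combining marks add no columns
        if unicodedata.east_asian_width(ch) in ("W", "F"):
            width += 2
        else:
            width += 1
    return width


sample = "檔案系統"  # arbitrary CJK sample string
print(len(sample))            # 4 characters
print(display_width(sample))  # 8 columns
```

Column alignment that pads by character count would therefore come out four columns short for this string.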

I was prompted by worries about multibyte encodings that use bytes that could be 
misinterpreted as ASCII control characters, such as a locale that uses EBCDIC 
encoding. However, that's probably just a theoretical concern; no coreutils 
users use EBCDIC any more, right? Plus there are doubtless lots of other places 
in coreutils that assume '\n' is a newline in encoded text.
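
A small sketch of the concern, using Python's cp037 codec as a stand-in for an EBCDIC locale (cp037 is a single-byte EBCDIC code page; the choice of codec and the helper name are assumptions for this example). A scan for the ASCII newline byte finds the newline in UTF-8 text but misses it in the EBCDIC encoding of the same text.

```python
def has_ascii_newline_byte(data: bytes) -> bool:
    """Return True if the byte 0x0A occurs in data (hypothetical helper)."""
    return b"\n" in data


name = "/mnt/a\nb"  # sample name with an embedded newline

utf8 = name.encode("utf-8")
ebcdic = name.encode("cp037")

print(has_ascii_newline_byte(utf8))    # True
print(has_ascii_newline_byte(ebcdic))  # False: newline has a different byte value

# The newline is still there once the bytes are decoded properly.
print("\n" in ebcdic.decode("cp037"))  # True
```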




This bug report was last modified 6 years and 160 days ago.


