GNU bug report logs - #32236
df header corrupted with LANG=zh_TW.UTF-8 on macOS

Package: coreutils

Reported by: Chih-Hsuan Yen <yan12125 <at> gmail.com>

Date: Sat, 21 Jul 2018 16:10:02 UTC

Severity: normal

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.

From: Pádraig Brady <P <at> draigBrady.com>
To: Paul Eggert <eggert <at> cs.ucla.edu>, Chih-Hsuan Yen <yan12125 <at> gmail.com>, 32236 <at> debbugs.gnu.org, bug-gnulib <bug-gnulib <at> gnu.org>
Subject: bug#32236: df header corrupted with LANG=zh_TW.UTF-8 on macOS
Date: Sun, 22 Jul 2018 09:17:07 -0700

On 22/07/18 08:12, Paul Eggert wrote:
> Pádraig Brady wrote:
>> I've also attached an alternative patch for df (in your name).
> 
> That still has problems, since it can generate improperly-encoded strings in 
> UTF-8 locales (if the inputs are improperly encoded), and can replace parts of 
> multibyte characters with '?' in non-UTF-8 locales. Please try the attached 
> patch instead, which attempts to address these issues. This is more along the 
> lines that Bruno suggested, except it doesn't use mbsiter as I figured it was 
> simpler overall just to use mbrtowc directly for this one thing.

I don't have time to review this now,
but my intent was only to avoid \n etc., which might cause issues for
programs that parse output from df on a line-by-line basis.
This subset of control characters is safe to identify.
It seems problematic to start eliding improperly encoded
mount points, for example, rather than just outputting
what's there.
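
To illustrate the narrower approach described above, here is a minimal sketch, not the code from either patch. The helper name replace_control_bytes is hypothetical. It assumes a UTF-8 (or plain ASCII) encoding, where bytes 0x00-0x1F and 0x7F never occur inside a multibyte sequence, so they can be identified byte by byte without decoding. Every other byte, including improperly encoded input, is passed through as-is.

```c
#include <stddef.h>

/* Hypothetical helper, not taken from the coreutils patches.
   Overwrite ASCII control bytes (0x00-0x1F and 0x7F) with '?' in place.
   All other bytes, valid or not, are left exactly as they were.
   Assumes UTF-8 or ASCII, where these byte values never appear as
   part of a multibyte character.  */
static void
replace_control_bytes (char *s)
{
  for (unsigned char *p = (unsigned char *) s; *p != '\0'; p++)
    if (*p < 0x20 || *p == 0x7F)
      *p = '?';
}
```

With this sketch, a mount point such as "a\nb" would be shown as "a?b", so line-oriented parsers still see one record per line, while a stray 0xFF byte would be output unchanged.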

Also, just incrementing width by one for each wide character
doesn't seem right, though again I've not tested it.
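
For context on the width concern: a wide character does not always occupy one terminal column, and CJK ideographs, for instance, typically occupy two. The following is a rough sketch of counting columns with wcwidth rather than adding one per character. The helper name display_width and its fallback choices (one column for undecodable bytes and for nonprintable characters) are assumptions for illustration, not what the patch does.

```c
#define _XOPEN_SOURCE 700   /* for wcwidth on glibc */
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: return the number of display columns that the
   multibyte string S occupies in the current locale.  */
static int
display_width (const char *s)
{
  mbstate_t st;
  size_t left = strlen (s);
  int width = 0;

  memset (&st, 0, sizeof st);
  while (left > 0)
    {
      wchar_t wc;
      size_t n = mbrtowc (&wc, s, left, &st);

      if (n == (size_t) -1 || n == (size_t) -2)
        {
          /* Invalid or incomplete sequence: assume the byte is output
             as-is in one column, then resynchronize.  */
          memset (&st, 0, sizeof st);
          n = 1;
          width++;
        }
      else
        {
          /* wcwidth returns -1 for nonprintable characters; assume
             such a character ends up occupying one column.  */
          int w = wcwidth (wc);
          width += w < 0 ? 1 : w;
        }
      s += n;
      left -= n;
    }
  return width;
}
```

A caller would invoke setlocale (LC_ALL, "") first so that mbrtowc decodes according to the user's locale; without that call, the "C" locale applies and only ASCII input is decoded reliably.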

cheers,
Pádraig




This bug report was last modified 6 years and 160 days ago.


