GNU bug report logs -
#32236
df header corrupted with LANG=zh_TW.UTF-8 on macOS
[Message part 1 (text/plain, inline)]
Your message dated Thu, 26 Jul 2018 18:23:02 -0700
with message-id <61bb0915-497b-b32c-9252-73e1406e0154 <at> cs.ucla.edu>
and subject line Re: bug#32236: df header corrupted with LANG=zh_TW.UTF-8 on macOS
has caused the debbugs.gnu.org bug report #32236,
regarding df header corrupted with LANG=zh_TW.UTF-8 on macOS
to be marked as done.
(If you believe you have received this mail in error, please contact
help-debbugs <at> gnu.org.)
--
32236: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=32236
GNU Bug Tracking System
Contact help-debbugs <at> gnu.org with problems
[Message part 2 (message/rfc822, inline)]
Hi coreutils developers,
I'm using coreutils on macOS High Sierra (10.13). I noticed that with
`LANG=zh_TW.UTF-8`, `df` output is corrupted.
�?�?系統 容�?? 已�?� �?��?� 已�?�% �??�?�?
/dev/disk1s1 234G 151G 81G 65% /
/dev/disk1s4 234G 2.1G 81G 3% /private/var/vm
(I'm not sure if other mail agents can display those characters
correctly or not. See my blog post [1] for the exact output.)
This seems similar to bug#25630 [2], which is not resolved. I guess
the cause of my issue is that iscntrl() is broken on macOS High
Sierra, so in hide_problematic_chars(), some bytes in the Chinese
header are replaced with a question mark. I managed to patch coreutils
[3] to make `df` work. Could you have a look? Thanks!
Best,
Chih-Hsuan Yen
[1] https://blog.chyen.cc/posts/2018/06/23/mac-df-chinese.html
[2] http://lists.gnu.org/archive/html/bug-coreutils/2017-02/msg00008.html
[3] https://github.com/yan12125/macports-ports/blob/fix-coreutils-df-chinese/sysutils/coreutils/files/patch-df.diff
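The suspected failure mode in the report above can be illustrated with a minimal sketch. It is not the coreutils source: hide_bytes(), ascii_iscntrl() and c1_iscntrl() are hypothetical names. c1_iscntrl() models the assumption that the platform's iscntrl() also classifies bytes 0x80..0x9f as control characters in a UTF-8 locale, and ascii_iscntrl() stands in for a locale-independent check in the spirit of gnulib's c_iscntrl.

```c
#include <stddef.h>

/* Hypothetical stand-in for a locale-independent check such as
   gnulib's c_iscntrl: only ASCII control characters match.  */
static int
ascii_iscntrl (int c)
{
  return (0 <= c && c < 0x20) || c == 0x7f;
}

/* Hypothetical model of the suspected behaviour (an assumption,
   not a documented fact): bytes 0x80..0x9f also count as control.  */
static int
c1_iscntrl (int c)
{
  return ascii_iscntrl (c) || (0x80 <= c && c <= 0x9f);
}

/* Replace every byte that IS_CNTRL flags with '?', in place.
   Working byte by byte is what can break multibyte characters.  */
static char *
hide_bytes (char *cell, int (*is_cntrl) (int))
{
  for (char *p = cell; *p; p++)
    if (is_cntrl ((unsigned char) *p))
      *p = '?';
  return cell;
}
```

With the UTF-8 bytes of the four-character header cell (E6 AA 94, E6 A1 88, E7 B3 BB, E7 B5 B1), c1_iscntrl flags the trailing bytes 0x94 and 0x88 of the first two characters, which matches the pattern in the output quoted above: the first two characters are damaged and the last two survive. With ascii_iscntrl the cell is left untouched.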
[Message part 3 (message/rfc822, inline)]
[Message part 4 (text/plain, inline)]
Pádraig Brady wrote:
> I've pushed the c_iscntrl patch since it's simplest
> and probably most appropriate patch for an existing release.
Yes, that makes sense for a quick patch. However, for the next release I think
it'd be better to catch encoding errors and multibyte control characters, given
the problems noted. I installed the attached further patch to try to do this.
This fixes the problem that Bruno noted, along with two others; my earlier patch
neglected the possibility that mbrtowc can return 0, and it incorrectly assumed
wide control characters always have a single-byte representation.
Either way the original bug appears to be fixed so I'm boldly closing the bug report.
[0001-df-avoid-multibyte-character-corruption-on-macOS.patch (text/x-patch, attachment)]
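For readers without the attachment, the approach described in the message above (decode with mbrtowc, replace encoding errors and control characters of any byte length) might look roughly like the following. This is a sketch under stated assumptions, not the attached patch: replace_problematic() is a hypothetical name, and the caller is assumed to have selected the locale with setlocale() beforehand.

```c
#include <string.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical sketch: rewrite CELL in place, turning each invalid
   byte and each control character (single-byte or multibyte) into
   one '?'.  The result is never longer than the input.  */
static char *
replace_problematic (char *cell)
{
  char *srcend = cell + strlen (cell);
  char *dst = cell;
  mbstate_t state;
  memset (&state, 0, sizeof state);

  for (char *src = cell; src < srcend; )
    {
      wchar_t wc;
      size_t avail = (size_t) (srcend - src);
      size_t n = mbrtowc (&wc, src, avail, &state);

      if (n == (size_t) -1 || n == (size_t) -2 || n == 0)
        {
          /* Encoding error, incomplete sequence, or an unexpected
             null wide character: consume one byte and start over
             with a fresh conversion state.  */
          *dst++ = '?';
          src++;
          memset (&state, 0, sizeof state);
        }
      else if (iswcntrl ((wint_t) wc))
        {
          /* A control character may occupy several bytes; all of
             them collapse into a single '?'.  */
          *dst++ = '?';
          src += n;
        }
      else
        {
          /* Printable character: keep its bytes unchanged.  */
          memmove (dst, src, n);
          dst += n;
          src += n;
        }
    }

  *dst = '\0';
  return cell;
}
```

Because the output can be shorter than the input, the sketch copies through a separate dst pointer rather than overwriting bytes in place. Treating a return value of 0 as a single problematic byte reflects the message's remark that mbrtowc can return 0. How a given multibyte control character is classified depends on the active locale's iswcntrl.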
This bug report was last modified 6 years and 160 days ago.