GNU bug report logs -
#32236
df header corrupted with LANG=zh_TW.UTF-8 on macOS
Paul Eggert wrote:
> my earlier patch
> neglected the possibility that mbrtowc can return 0
I wouldn't see this as a bug: You can assume that mbrtowc returns
0 if and only if the multibyte sequence is a NUL byte - but you had
chosen srcend in such a way that this would not happen in the loop.
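A minimal sketch of the point about return values, not the code from the patch: the helper name count_chars is hypothetical, and the loop assumes the caller passes an explicit end pointer like the srcend mentioned above. If the range can contain a NUL byte, mbrtowc returns 0 for it and the loop has to advance by one byte itself.

```c
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: decode the bytes in [src, srcend) with mbrtowc and
   return the number of characters decoded, or (size_t) -1 if an invalid
   or incomplete multibyte sequence is found.  */
size_t
count_chars (const char *src, const char *srcend)
{
  mbstate_t state;
  size_t count = 0;

  memset (&state, 0, sizeof state);   /* start in the initial shift state */
  while (src < srcend)
    {
      wchar_t wc;
      size_t n = mbrtowc (&wc, src, srcend - src, &state);

      if (n == (size_t) -1 || n == (size_t) -2)
        return (size_t) -1;           /* invalid or incomplete sequence */
      if (n == 0)
        n = 1;                        /* a NUL byte was decoded; skip it */
      src += n;
      count++;
    }
  return count;
}
```

If srcend is chosen so that the range never includes a NUL byte, the n == 0 branch is never taken, which matches the observation above.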
> and it incorrectly assumed
> wide control characters always have a single-byte representation.
Oops, you're right. My mistake as well.
The new patch looks good.
This will catch (and replace with '?') U+2028 and U+2029 on glibc systems.
On macOS, it will not do this, because iswcntrl(0x2028) and iswcntrl(0x2029)
both return 0 on this system; this is consistent with the fact that the
'Terminal' program displays these characters as simple spaces. So, no need to
override iswcntrl on macOS.
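As an illustration only (this is not the code in the patch, and the helper name replace_controls is hypothetical), replacing wide control characters with '?' can be sketched as follows. Whether U+2028 and U+2029 get replaced depends on what the platform's iswcntrl returns in the current locale, as described above.

```c
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper: overwrite every wide character in S that the
   current locale classifies as a control character with L'?'.  */
void
replace_controls (wchar_t *s)
{
  for (; *s != L'\0'; s++)
    if (iswcntrl ((wint_t) *s))
      *s = L'?';
}
```

Calling iswcntrl (0x2028) directly after setlocale (LC_ALL, "") in a UTF-8 locale is a quick way to check how a given system classifies these two characters.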
Bruno
2018-07-27 Bruno Haible <bruno <at> clisp.org>
iswcntrl: Mention minor problem on macOS.
* doc/posix-functions/iswcntrl.texi: Mention oddity on macOS.
diff --git a/doc/posix-functions/iswcntrl.texi b/doc/posix-functions/iswcntrl.texi
index 99eaa0e..44dd034 100644
--- a/doc/posix-functions/iswcntrl.texi
+++ b/doc/posix-functions/iswcntrl.texi
@@ -25,4 +25,8 @@ Portability problems not fixed by Gnulib:
@item
On AIX and Windows platforms, @code{wchar_t} is a 16-bit type and therefore cannot
accommodate all Unicode characters.
+@item
+This function returns 0 for U+2028 (LINE SEPARATOR) and
+U+2029 (PARAGRAPH SEPARATOR) on some platforms:
+Mac OS X 10.13.
@end itemize
This bug report was last modified 6 years and 160 days ago.