On 26/07/18 18:23, Paul Eggert wrote:
> Pádraig Brady wrote:
>> I've pushed the c_iscntrl patch since it's simplest
>> and probably most appropriate patch for an existing release.
>
> Yes, that makes sense for a quick patch. However, for the next release I think
> it'd be better to catch encoding errors and multibyte control characters, given
> the problems noted. I installed the attached further patch to try to do this.
> This fixes the problem that Bruno noted, along with two others; my earlier patch
> neglected the possibility that mbrtowc can return 0, and it incorrectly assumed
> wide control characters always have a single-byte representation.
>
> Either way the original bug appears to be fixed so I'm boldly closing the bug report.

Reviewing this, I dislike the way that we're now enforcing that
the file system locale needs to match the current user's locale,
or otherwise df will not output all original characters.
That has the potential to break scripts, as mismatched encodings are a common issue.

In the attached I've taken the original, less aggressive replacement policy
when not outputting to a tty, leaving more sanitizing to the tty case.

cheers,
Pádraig
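
For readers without the attachment, here is a rough sketch of the two policies discussed above. It is not the actual patch: the function names are hypothetical, ascii_iscntrl merely stands in for gnulib's c_iscntrl, and '?' as the replacement character is an assumption. The caller is assumed to decide to_tty, for example with isatty (STDOUT_FILENO).

```c
#include <stdbool.h>
#include <string.h>
#include <wchar.h>
#include <wctype.h>

/* Locale-independent control test; a stand-in for gnulib's c_iscntrl.  */
static bool
ascii_iscntrl (unsigned char c)
{
  return c < 0x20 || c == 0x7f;
}

/* Less aggressive policy (assumed for the non-tty case): replace only
   ASCII control bytes and pass every other byte through unchanged, so
   names in a mismatched encoding survive intact.  */
static void
replace_ascii_cntrl (char *s)
{
  for (; *s; s++)
    if (ascii_iscntrl ((unsigned char) *s))
      *s = '?';
}

/* Stricter policy (assumed for the tty case): decode in the current
   locale and replace each invalid byte and each control character,
   single-byte or multibyte, with one '?'.  Works in place; the result
   is never longer than the input.  */
static void
replace_invalid_and_cntrl (char *s)
{
  char *dst = s;
  size_t left = strlen (s);
  mbstate_t st;
  memset (&st, 0, sizeof st);

  while (left > 0)
    {
      wchar_t wc;
      size_t n = mbrtowc (&wc, s, left, &st);

      if (n == (size_t) -1 || n == (size_t) -2)
        {
          /* Encoding error or incomplete sequence: drop one byte
             and restart from the initial shift state.  */
          *dst++ = '?';
          s++;
          left--;
          memset (&st, 0, sizeof st);
          continue;
        }

      if (n == 0)
        n = 1;                  /* mbrtowc returns 0 for a null character.  */

      if (iswcntrl ((wint_t) wc))
        *dst++ = '?';           /* May collapse several bytes into one.  */
      else
        {
          memmove (dst, s, n);
          dst += n;
        }
      s += n;
      left -= n;
    }
  *dst = '\0';
}

/* Pick the policy based on where the output is going.  */
static void
sanitize_name (char *s, bool to_tty)
{
  if (to_tty)
    replace_invalid_and_cntrl (s);
  else
    replace_ascii_cntrl (s);
}
```

With to_tty false, a byte such as 0xFF that is invalid in the user's locale is written out as is, which is what scripts parsing the output would expect; only the tty path depends on the locale.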