GNU bug report logs - #54124
fmt inserts garbage in certain cases?


Package: coreutils

Reported by: "JD" <john1doe <at> ya.ru>

Date: Wed, 23 Feb 2022 11:28:01 UTC

Severity: normal


From: Pádraig Brady <P <at> draigBrady.com>
To: JD <john1doe <at> ya.ru>, 54124 <at> debbugs.gnu.org
Subject: bug#54124: fmt inserts garbage in certain cases?
Date: Wed, 23 Feb 2022 17:55:49 +0000
On 23/02/2022 10:58, JD wrote:
> Hi!
> 
> I have fmt from coreutils 8.32.1 installed via MacPorts.
> 
> If I run the following command: `echo х х х х х х х х х х х х х х х х х х х х х х х х х х | gfmt -sw 10` (which is just echoing 26 Cyrillic 'х' ('kha') letters), I get the following results:
> 
> https://i.imgur.com/yRx7uuz.png (iTerm2)
> https://i.imgur.com/7oQ0UPz.png (iTerm2 if passed via `more`)
> https://i.imgur.com/UlLrEMy.png (Alacritty)
> 
> And if I delete just two 'х' letters, like this: `echo х х х х х х х х х х х х х х х х х х х х х х х х | gfmt -sw 10`, everything shows just fine: https://i.imgur.com/DwuWxyx.png
> 
> Would be grateful for any advice :)

The issue here is that (on macOS 10.15.7 at least),
isspace(0x85) returns true for UTF-8 locales
(but not for "C" or "iso8859-1" locales).
BTW iscntrl() returns true for 0x85 in all non-C locales
on both Linux and macOS.

Now gnulib says wrt isspace() that:

"This function's behaviour depends on the locale, but does not support
the multibyte characters that occur in strings in locales with
@code{MB_CUR_MAX > 1} (this includes all the common UTF-8 locales)."

I think isspace(0x85) returning true on macOS is a bug,
but we should probably avoid isspace() in fmt altogether
given its inconsistency with multibyte locales.
The attached uses c_isspace() instead.
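
To illustrate why 0x85 matters here: the UTF-8 encoding of Cyrillic 'х' (U+0445)
is the two bytes 0xD1 0x85, so a byte-wise whitespace test that accepts 0x85 sees
a break in the middle of every such letter. The following is only a minimal sketch
of that effect, not the attached patch. ascii_isspace(), macos_like_isspace() and
count_space_bytes() are hypothetical names made up for this example; the first
approximates what a locale-independent test such as c_isspace() is assumed to
accept, and the second merely stands in for the isspace() behaviour described above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: locale-independent test that accepts only the
   six ASCII whitespace characters (assumed to match c_isspace()). */
static bool ascii_isspace(unsigned char c)
{
    return c == ' ' || c == '\t' || c == '\n'
        || c == '\v' || c == '\f' || c == '\r';
}

/* Hypothetical stand-in for the isspace() behaviour reported above for
   UTF-8 locales on macOS: it additionally treats byte 0x85 as a space. */
static bool macos_like_isspace(unsigned char c)
{
    return ascii_isspace(c) || c == 0x85;
}

/* Count how many bytes in buf the given classifier reports as whitespace. */
static size_t count_space_bytes(const unsigned char *buf, size_t len,
                                bool (*is_space)(unsigned char))
{
    size_t n = 0;
    for (size_t i = 0; i < len; i++)
        if (is_space(buf[i]))
            n++;
    return n;
}
```

For the five bytes D1 85 20 D1 85 (two 'х' letters separated by one space), the
ASCII-only classifier finds a single whitespace byte, while the 0x85-accepting one
finds three, two of them inside multibyte characters.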

cheers,
Pádraig
[fmt-utf8-macOS.patch (text/x-patch, attachment)]



