On 23/02/2022 10:58, JD wrote:
> Hi!
>
> I have fmt from coreutils 8.32.1 installed via MacPorts.
>
> If I run the following command: `echo х х х х х х х х х х х х х х х х х х х х х х х х х х | gfmt -sw 10` (which is just echoing 26 Cyrillic 'х' ('kha') letters), I get the following results:
>
> https://i.imgur.com/yRx7uuz.png (iTerm2)
> https://i.imgur.com/7oQ0UPz.png (iTerm2 if passed via `more`)
> https://i.imgur.com/UlLrEMy.png (Alacritty)
>
> And if I delete just two 'х' letters, like this: `echo х х х х х х х х х х х х х х х х х х х х х х х х | gfmt -sw 10`, everything shows just fine: https://i.imgur.com/DwuWxyx.png
>
> Would be grateful for any advice :)

The issue here is that (on macOS 10.15.7 at least) isspace(0x85) returns true for UTF-8 locales (but not for the "C" or "iso8859-1" locales). 0x85 is the second byte of the UTF-8 encoding of 'х' (U+0445, bytes 0xD1 0x85), so fmt can end up treating half of the character as whitespace. BTW, iscntrl() returns true for 0x85 in all non-C locales on both Linux and macOS.

Now gnulib says wrt isspace() that:

  "This function's behaviour depends on the locale, but does not support the multibyte characters that occur in strings in locales with @code{MB_CUR_MAX > 1} (this includes all the common UTF-8 locales)."

I think isspace(0x85) returning true on macOS is a bug, but we should probably avoid isspace() in fmt altogether, given its inconsistency in multibyte locales.

The attached patch uses c_isspace() instead.

cheers,
Pádraig