[re-adding the list]

On 02/02/2011 02:11 PM, Kostya Stopani wrote:
> On Wed, Feb 02, 2011 at 10:15:53AM -0700, Eric Blake wrote:
> 
>> Thanks for the patch.  However, it's not trivial, so it would need
>> copyright assignment.
> 
> Oh boy... Anyway I don't mind signing papers, if you (or whoever)
> don't mind bothering with it.

OK, I'll send you those details off-list.

> 
>> Furthermore, there are already known issues where upstream coreutils
>> is lacking multibyte character support, but a solution has to be
>> both maintainable and no-impact to the single-byte locale case.
> 
> I believe this patch doesn't break single-byte behavior because no
> conversion takes place. mbsnrtowcs() is used only to count
> characters. I've tested various cases (8-bit encoding was KOI8-R):
> 
> |--------+---------------+--------------------------|
> | Locale | Text encoding | Result                   |
> |--------+---------------+--------------------------|
> | UTF-8  | UTF-8         | old fmt: text too narrow |
> |        |               | new fmt: ok              |
> |--------+---------------+--------------------------|
> | UTF-8  | 8-bit         | same                     |
> |--------+---------------+--------------------------|
> | 8-bit  | UTF-8         | same                     |
> |--------+---------------+--------------------------|
> | 8-bit  | 8-bit         | same                     |
> |--------+---------------+--------------------------|
> 
> From my point of view the alternative is to convert everything to
> wchar_t, which imposes the need to keep track of conversion errors and
> gracefully fall back to single-byte.

Keeping things in multibyte form rather than converting to wchar_t is
the way to go. That is especially true given the ongoing discussion of
how to handle cygwin, where wchar_t is UTF-16, so a single character
may still need multiple units (an extension to POSIX), with all sorts
of ramifications for programs that expect POSIX semantics.
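
For the archive, here is a minimal sketch of the counting-only idea as I
understand it. It is not taken from the patch: the helper name
count_chars and the fall-back-to-bytes policy are assumptions made for
illustration. Passing a NULL destination to mbsnrtowcs() asks only for
the number of wide characters, so nothing is actually converted or
stored.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper (not from the patch): return the number of
   characters in the first NBYTES bytes of BUF under the current
   locale.  If the bytes are not valid in that locale, fall back to
   the byte count, which matches the single-byte behavior.  */
static size_t
count_chars (const char *buf, size_t nbytes)
{
  mbstate_t state;
  const char *src = buf;
  size_t n;

  /* Start from the initial shift state.  */
  memset (&state, 0, sizeof state);

  /* With a NULL destination the length argument is ignored and no
     wide characters are written; only the count is returned.  */
  n = mbsnrtowcs (NULL, &src, nbytes, 0, &state);

  /* (size_t) -1 signals an invalid multibyte sequence.  */
  if (n == (size_t) -1)
    return nbytes;

  return n;
}
```

In a single-byte locale every byte is one character, so the result
equals nbytes and existing behavior is unchanged. In a UTF-8 locale
(after setlocale (LC_ALL, "") has been called), a six-letter Cyrillic
word occupying twelve bytes would be counted as six characters.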

-- 
Eric Blake   eblake@redhat.com    +1-801-349-2682
Libvirt virtualization library http://libvirt.org