GNU bug report logs - #7372
multibyte: fmt and multi-byte encodings

Package: coreutils

Reported by: Ineiev <ineiev <at> gmail.com>

Date: Thu, 11 Nov 2010 09:42:03 UTC

Severity: wishlist


From: Pádraig Brady <P <at> draigBrady.com>
To: Ineiev <ineiev <at> gmail.com>
Cc: 7372 <at> debbugs.gnu.org
Subject: bug#7372: fmt and multi-byte encodings
Date: Thu, 11 Nov 2010 16:01:02 +0000

On 11/11/10 09:32, Ineiev wrote:
> Hello;
> 
> Today I fed a text in Russian in UTF-8 to fmt
> and discovered that the utility counts the line width
> in bytes rather than in characters (the lines written in
> Cyrillic were roughly half as long as the lines
> written in Latin script), which was not what I wanted.
> I checked fmt from coreutils-8.6.
> 
> As a workaround, I could iconv the text into a single-byte
> encoding like KOI8-R, but I would limit the character
> set then.
> 
> I've never used fmt before personally, so actually I'm not
> sure whether it was a bug or I did something wrong.
> 
> Any hints?

We're starting to apply multi-byte support,
so hopefully this will be fixed soon.

$ echo "1 2 æ 4 5 6" | fmt -w6
1 2
æ 4
5 6

That is with the official Fedora version of `fmt`.

cheers,
Pádraig
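
The effect described in the report can be illustrated with a small sketch. It is not the algorithm `fmt` uses (which optimizes line breaks across a whole paragraph); it is a hypothetical greedy wrapper, written here only to show how measuring width in UTF-8 bytes rather than in characters shortens Cyrillic lines. The names `wrap`, `byte_width`, and `char_width` are made up for this example, and counting code points is itself only an approximation of display width (combining and double-width characters are ignored).

```python
def wrap(text, width, measure):
    """Greedy word wrap. `measure` returns the width of a string.

    Illustrative only: not the paragraph-filling algorithm used by fmt.
    """
    lines, current = [], ""
    for word in text.split():
        candidate = word if not current else current + " " + word
        # Start a new line once adding the word would exceed the width.
        if current and measure(candidate) > width:
            lines.append(current)
            current = word
        else:
            current = candidate
    if current:
        lines.append(current)
    return lines


def byte_width(s):
    """Width measured in UTF-8 bytes (the behaviour the report describes)."""
    return len(s.encode("utf-8"))


def char_width(s):
    """Width measured in Unicode code points."""
    return len(s)


latin = "one two three four"
cyrillic = "раз два три четыре"

if __name__ == "__main__":
    for label, measure in (("bytes", byte_width), ("chars", char_width)):
        print(label, wrap(latin, 10, measure))
        print(label, wrap(cyrillic, 10, measure))
```

Each Cyrillic letter here takes two bytes in UTF-8, so the byte-based measure reaches the width limit after roughly half as many letters as the character-based one, while the Latin text wraps identically under both measures.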





This bug report was last modified 6 years and 254 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.