GNU bug report logs - #28038
multibyte: expand: expand(1) lacks MBC support

Package: coreutils;

Reported by: Tilman Schmidt <tschmidt <at> cardtech.de>

Date: Thu, 10 Aug 2017 16:11:01 UTC

Severity: wishlist

From: Assaf Gordon <assafgordon <at> gmail.com>
To: Tilman Schmidt <tschmidt <at> cardtech.de>, 28038 <at> debbugs.gnu.org
Subject: bug#28038: expand(1) lacks MBC support
Date: Fri, 11 Aug 2017 17:58:48 -0600
Hello Tilman,

On 10/08/17 10:10 AM, Tilman Schmidt wrote:
> it seems the expand(1) command does not properly support multi-byte
> characters.

That is correct.

> tschmidt <at> sl-vm-redmine01:~$ cat test.txt
> Text	ohne	Umlaute
> Täxt	müt	Umläuten
> tschmidt <at> sl-vm-redmine01:~$ expand test.txt
> Text    ohne    Umlaute
> Täxt   müt    Umläuten
> 
> Using Ubuntu 14.04.5 LTS with coreutils 8.21-1ubuntu.

Multibyte support is not available yet, neither in version 8.21 (which
is 4 years old) nor in the current version, 8.27.

However, there is an ongoing effort to add multibyte support
to all coreutils programs, including 'expand'.

You can read more technical details about it here:
  http://crashcourse.housegordon.org/coreutils-multibyte-support.html

In the current (work-in-progress) internationalization patch,
the 'expand' program does support multibyte locales, and expands
your input correctly:

With a multibyte locale:

   $ ./src/expand bug28038.txt
   Text    ohne    Umlaute
   Täxt    müt     Umläuten

versus forcing a single-byte locale:

   $ LC_ALL=C ./src/expand bug28038.txt
   Text    ohne    Umlaute
   Täxt   müt    Umläuten
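
To illustrate where the misalignment comes from: in a single-byte locale
every byte advances the column by one, so each UTF-8 encoded umlaut
(two bytes) is counted as two columns and the following tab is padded
one space short. The Python sketch below is only an illustration of that
counting difference, not the code from the patch; the function name
expand_line and its parameters are made up for this example, and it
assumes one column per code point (no wide or combining characters).

```python
def expand_line(line: str, tabstop: int = 8, count_bytes: bool = False) -> str:
    """Expand tabs to spaces (hypothetical helper, for illustration only).

    count_bytes=True mimics single-byte behaviour: every UTF-8 byte
    advances the column by one. count_bytes=False advances the column
    by one per character (code point), which is an approximation of
    what a multibyte-aware implementation has to do.
    """
    out = []
    col = 0
    for ch in line:
        if ch == "\t":
            # Pad up to the next tab stop.
            pad = tabstop - (col % tabstop)
            out.append(" " * pad)
            col += pad
        else:
            out.append(ch)
            col += len(ch.encode("utf-8")) if count_bytes else 1
    return "".join(out)


if __name__ == "__main__":
    sample = "T\u00e4xt\tm\u00fct\tUml\u00e4uten"
    print(expand_line(sample))                    # per-character columns
    print(expand_line(sample, count_bytes=True))  # per-byte columns
```

A real implementation would also need to account for display width
(for example, double-width CJK characters), which this sketch ignores.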


The latest version of the patch is available for download and
experimentation here:
  http://lists.gnu.org/archive/html/coreutils/2017-04/msg00009.html
However, it should not be considered stable.

regards,
 - assaf





