GNU bug report logs -
#28038
multibyte: expand: expand(1) lacks MBC support
Hello Tilman,
On 10/08/17 10:10 AM, Tilman Schmidt wrote:
> it seems the expand(1) command does not properly support multi-byte
> characters.
That is correct.
> tschmidt@sl-vm-redmine01:~$ cat test.txt
> Text ohne Umlaute
> Täxt müt Umläuten
> tschmidt@sl-vm-redmine01:~$ expand test.txt
> Text ohne Umlaute
> Täxt müt Umläuten
>
> Using Ubuntu 14.04.5 LTS with coreutils 8.21-1ubuntu.
Multibyte support is not available yet (neither in version 8.21, which is
4 years old, nor in the current version, 8.27).
However, there is an ongoing effort to add multibyte support
to all coreutils programs, including 'expand'.
You can read more technical details about it here:
http://crashcourse.housegordon.org/coreutils-multibyte-support.html
In the current (work-in-progress) internationalization patch,
the 'expand' program does support multibyte locales, and expands
your input correctly:
multibyte locale:
$ ./src/expand bug28038.txt
Text ohne Umlaute
Täxt müt Umläuten
versus forcing single-byte locale:
$ LC_ALL=C ./src/expand bug28038.txt
Text ohne Umlaute
Täxt müt Umläuten
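
To illustrate why the two locales can give different alignment, here is a
minimal Python sketch (not the coreutils code, and not the patch) that
expands tabs in a single line in two ways: counting one column per byte,
and counting one column per character. The function names, the tab stop
of 8, and the sample string "Täxt\tmüt" are assumptions made for
illustration; the sketch also assumes UTF-8 input and that every
character occupies exactly one column, which is not true for wide or
combining characters.

```python
def expand_by_bytes(text: str, tabstop: int = 8) -> str:
    """Expand tabs while advancing the column once per UTF-8 byte.

    Illustrative only: models a byte-oriented expansion of one line.
    """
    out = bytearray()
    column = 0
    for byte in text.encode("utf-8"):
        if byte == 0x09:  # tab
            pad = tabstop - column % tabstop
            out += b" " * pad
            column += pad
        else:
            out.append(byte)
            column += 1  # a 2-byte character advances the column twice
    return out.decode("utf-8")


def expand_by_chars(text: str, tabstop: int = 8) -> str:
    """Expand tabs while advancing the column once per character.

    Illustrative only: assumes each character is one column wide.
    """
    out = []
    column = 0
    for ch in text:
        if ch == "\t":
            pad = tabstop - column % tabstop
            out.append(" " * pad)
            column += pad
        else:
            out.append(ch)
            column += 1
    return "".join(out)
```

With the hypothetical input "Täxt\tmüt", the byte-oriented version sees
"Täxt" as five columns (the "ä" is two bytes in UTF-8) and emits three
spaces, while the character-oriented version sees four columns and emits
four spaces, so only the latter lines up with a pure-ASCII line such as
"Text\tohne".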
The latest version of the patch is available for download and
experimentation here:
http://lists.gnu.org/archive/html/coreutils/2017-04/msg00009.html
However, it should not be considered stable.
regards,
- assaf
This bug report was last modified 6 years and 288 days ago.