GNU bug report logs - #31033
multibyte: sort,uniq,join,tr,cut,paste,expand,unexpand patch


Package: coreutils

Reported by: Eric Fischer <enf <at> pobox.com>

Date: Mon, 2 Apr 2018 23:17:01 UTC

Severity: wishlist

Tags: patch



Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Eric Fischer <enf <at> pobox.com>
To: bug-coreutils <at> gnu.org
Subject: [PATCH] Multibyte support for sort, uniq, join, tr, cut, paste,
 expand, unexpand
Date: Mon, 2 Apr 2018 15:18:14 -0700
As previously discussed on the coreutils mailing list, beginning with

  http://lists.gnu.org/archive/html/coreutils/2017-12/msg00074.html

most of the coreutils text processing commands process bytes instead of
characters, regardless of the user's locale, so they do not properly
handle UTF-8 text or UTF-8 option arguments.

I propose the changes in

  https://github.com/ericfischer/coreutils/compare/multibyte-squash

to convert sort, uniq, join, tr, cut, paste, expand, and unexpand to
process characters instead of bytes, allowing them to work correctly on
non-ASCII text, as specified by POSIX.
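
To illustrate the byte-versus-character distinction described above, here
is a minimal sketch in Python. It is not taken from the patch; the function
names first_n_bytes and first_n_chars are hypothetical, and the sketch
assumes the input is valid UTF-8. It only shows how a byte-oriented
truncation can split a multibyte character, while a character-oriented
one does not.

```python
# Illustrative sketch only: contrasts byte-oriented and character-oriented
# truncation of UTF-8 text. Not code from the proposed patch.

def first_n_bytes(data: bytes, n: int) -> bytes:
    """Keep the first n bytes, ignoring character boundaries."""
    return data[:n]


def first_n_chars(data: bytes, n: int) -> bytes:
    """Keep the first n characters, assuming the input is valid UTF-8."""
    return data.decode("utf-8")[:n].encode("utf-8")


# "naïve": the third character, U+00EF, occupies two bytes in UTF-8.
data = "na\u00efve".encode("utf-8")

byte_cut = first_n_bytes(data, 3)  # ends in the middle of U+00EF
char_cut = first_n_chars(data, 3)  # ends after the whole character

try:
    byte_cut.decode("utf-8")
    byte_cut_is_valid = True
except UnicodeDecodeError:
    # The byte-oriented result is no longer valid UTF-8.
    byte_cut_is_valid = False
```

Here the byte-oriented result holds only the first byte of the third
character, so it is no longer valid UTF-8, whereas the character-oriented
result remains valid.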

Eric Fischer

This bug report was last modified 6 years and 233 days ago.


