GNU bug report logs - #26779
expr length

Package: coreutils;

Reported by: Андрей Воронов <voronov <at> factor-ts.ru>

Date: Thu, 4 May 2017 15:24:02 UTC

Severity: normal

Done: Assaf Gordon <assafgordon <at> gmail.com>

Bug is archived. No further changes may be made.

From: Pádraig Brady <P <at> draigBrady.com>
To: Assaf Gordon <assafgordon <at> gmail.com>
Cc: 26779 <at> debbugs.gnu.org, Андрей Воронов <voronov <at> factor-ts.ru>
Subject: bug#26779: expr length
Date: Tue, 9 May 2017 05:31:03 -0700
On 09/05/17 00:04, Assaf Gordon wrote:
> Hello,
> 
>> On May 4, 2017, at 11:43, Pádraig Brady <P <at> draigBrady.com> wrote:
>>
>> On 04/05/17 02:59, Андрей Воронов wrote:
>>> I found a bug in the expr utility when it calculates the length
>>> of a string in my multi-byte locale, ru_RU.UTF-8.
>>
>> expr is listed in the plan here:
>> http://www.pixelbeat.org/docs/coreutils_i18n/
> 
> Attached a draft patch implementing multibyte support for 'expr'
> (it doesn't need any code from my previous multibyte stuff, so I'm sending it separately).
> 
> Specifically, the length/index/substr operators are adjusted.
> The regex engine for the 'match' operator already supported multibyte characters (only minor adjustment needed to return matched character count instead of matched byte count).
> The string comparisons already used 'strcoll', so I assumed they work with multibyte strings.
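
For readers following along, the difference between a byte count and a character count can be sketched in standard C. This is not the code from the attached patch; the function name char_length and the policy of counting each invalid byte as one character are assumptions made only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Count characters (not bytes) in S according to the current locale.
   Hypothetical helper, not taken from the coreutils patch.
   Assumption: an invalid or truncated sequence counts as one
   character per byte.  */
size_t
char_length (const char *s)
{
  assert (s != NULL);

  mbstate_t state;
  memset (&state, 0, sizeof state);

  size_t remaining = strlen (s);
  size_t count = 0;

  while (remaining > 0)
    {
      size_t n = mbrlen (s, remaining, &state);
      if (n == (size_t) -1 || n == (size_t) -2)
        {
          /* Invalid or truncated sequence: consume one byte and
             reset the conversion state.  */
          memset (&state, 0, sizeof state);
          n = 1;
        }
      else if (n == 0)
        break;  /* Defensive: a NUL cannot occur within REMAINING.  */

      s += n;
      remaining -= n;
      count++;
    }

  return count;
}
```

The caller is expected to run setlocale (LC_ALL, "") first. In a UTF-8 locale such as ru_RU.UTF-8, each Cyrillic letter occupies two bytes, so strlen reports twice the number of characters that a loop like this one counts.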

Definitely needs a NEWS entry
and mention of this bug in the commit message.

Performance isn't a huge concern for expr use cases,
so I'd rather not address it in this patch, if at all.
For future reference, though, it might be nice to tell
mbschr() etc. whether the current locale is UTF-8,
maybe via some global similar to MB_CUR_MAX.  Then mbschr()
could be optimized in the UTF-8 case as per the pseudo code at:
http://www.pixelbeat.org/docs/utf8_programming.html
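
A minimal sketch of the kind of UTF-8 shortcut meant here, assuming the input is valid UTF-8 and that the caller has already established that the locale is UTF-8 (for example with nl_langinfo (CODESET)). It is not the pseudo code from the page above, and the names utf8_length and utf8_find_char are hypothetical. It relies on two properties of UTF-8: continuation bytes always have the form 10xxxxxx, and the encoding of one character never appears inside the encoding of another.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Count characters in valid UTF-8 by counting every byte that is
   not a continuation byte (10xxxxxx).  No mbstate_t is needed.  */
size_t
utf8_length (const char *s)
{
  assert (s != NULL);

  size_t count = 0;
  for (; *s != '\0'; s++)
    if (((unsigned char) *s & 0xC0) != 0x80)
      count++;
  return count;
}

/* Find the first occurrence in S of the single character whose
   UTF-8 encoding is NEEDLE.  Because UTF-8 is self-synchronizing,
   a plain byte search cannot match in the middle of a character.  */
const char *
utf8_find_char (const char *s, const char *needle)
{
  assert (s != NULL && needle != NULL);
  return strstr (s, needle);
}
```

Both functions touch each byte once and never call into the locale machinery, which is where the saving over a generic multibyte loop would come from.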

s/sequnce/sequence/

I notice expr isn't handled in the Red Hat/SUSE i18n patch,
so there is nothing to consider there.

Excellent work on the tests.

I've updated the status at http://www.pixelbeat.org/docs/coreutils_i18n/
as I think this is ready to land.

thanks!
Pádraig




This bug report was last modified 8 years and 56 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.