GNU bug report logs - #34524
wc: word count incorrect when words separated only by no-break space


Package: coreutils

Reported by: vampyrebat <at> gmail.com

Date: Mon, 18 Feb 2019 08:13:02 UTC

Severity: normal

Done: Pádraig Brady <P <at> draigBrady.com>

Bug is archived. No further changes may be made.



Message #28 received at 34524-done <at> debbugs.gnu.org (full text, mbox):

From: Pádraig Brady <P <at> draigBrady.com>
To: Bruno Haible <bruno <at> clisp.org>
Cc: vampyrebat <at> gmail.com, 34524-done <at> debbugs.gnu.org,
 Paul Eggert <eggert <at> CS.UCLA.EDU>
Subject: Re: bug#34524: wc: word count incorrect when words separated only by
 no-break space
Date: Mon, 25 Feb 2019 20:26:55 -0800
On 24/02/19 19:55, Pádraig Brady wrote:
> On 24/02/19 17:07, Pádraig Brady wrote:
>> So no-break space is generally considered a word delimiter,
>> though there are complications you detail from Unicode.
>>
>> In regard to options for enabling various behaviors for wc(1),
>> I'm thinking we might keep the strict POSIX isspace() behavior
>> with LC_CTYPE=C and/or POSIXLY_CORRECT=1, and use iswnbspace()
>> by default, since that's the most common operation one would want,
>> and is consistent with LibreOffice for example.
>> I'll adjust the patch along those lines.
> 
> Full patch attached.

Updated patch attached. I'll push in a few hours.
Marking this bug as done.

cheers,
Pádraig.

[wc-nbsp.patch (text/x-patch, attachment)]
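
The behavior discussed in the message above can be illustrated with a
minimal sketch. This is not the attached patch. The helper names
is_nbspace() and count_words(), the posixly_correct parameter, and the
list of no-break code points are all assumptions made for illustration;
the real wc(1) also decodes multibyte input and handles non-printable
characters, which this sketch skips by working on an already decoded
wide string.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper: true for a few Unicode no-break characters.
   The code point list is an assumption, not copied from the patch. */
static bool is_nbspace(wint_t wc)
{
  return wc == 0x00A0    /* NO-BREAK SPACE */
      || wc == 0x2007    /* FIGURE SPACE */
      || wc == 0x202F    /* NARROW NO-BREAK SPACE */
      || wc == 0x2060;   /* WORD JOINER */
}

/* Count words in a NUL-terminated wide string.
   When posixly_correct is true, only iswspace() characters delimit
   words; otherwise no-break characters delimit words as well. */
static size_t count_words(const wchar_t *s, bool posixly_correct)
{
  assert(s != NULL);

  size_t words = 0;
  bool in_word = false;

  for (; *s != L'\0'; s++) {
    wint_t wc = (wint_t) *s;
    bool is_delim = iswspace(wc)
                    || (!posixly_correct && is_nbspace(wc));

    if (is_delim) {
      in_word = false;
    } else if (!in_word) {
      /* First character of a new word. */
      in_word = true;
      words++;
    }
  }
  return words;
}
```

A caller might derive the flag from the environment, for example with
getenv("POSIXLY_CORRECT") != NULL, and pass it through. Keeping the flag
as a parameter here just makes the two modes easy to compare on the same
input.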

This bug report was last modified 6 years and 78 days ago.


