GNU bug report logs - #30681
What characters are in [[:space:]]?


Package: grep

Reported by: Peng Yu <pengyu.ut <at> gmail.com>

Date: Fri, 2 Mar 2018 17:24:02 UTC

Severity: normal

Done: Jim Meyering <jim <at> meyering.net>

Bug is archived. No further changes may be made.





Report forwarded to bug-grep <at> gnu.org:
bug#30681; Package grep. (Fri, 02 Mar 2018 17:24:02 GMT)

Acknowledgement sent to Peng Yu <pengyu.ut <at> gmail.com>:
New bug report received and forwarded. Copy sent to bug-grep <at> gnu.org. (Fri, 02 Mar 2018 17:24:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Peng Yu <pengyu.ut <at> gmail.com>
To: bug-grep <at> gnu.org
Subject: What characters are in [[:space:]]?
Date: Fri, 2 Mar 2018 11:23:22 -0600
Hi,

[[:space:]] matches the following Unicode character (U+00A0, NO-BREAK SPACE):
http://www.fileformat.info/info/unicode/char/00a0/index.htm

$ echo 'a b' | grep 'a[[:space:]]b'
a b
$ echo 'a b'|xxd
00000000: 61c2 a062 0a                             a..b.

Where is this info documented for grep?

Are these all the possible white space characters?

http://jkorpela.fi/chars/spaces.html

-- 
Regards,
Peng




Information forwarded to bug-grep <at> gnu.org:
bug#30681; Package grep. (Fri, 02 Mar 2018 22:33:02 GMT)

Message #8 received at 30681 <at> debbugs.gnu.org:

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Peng Yu <pengyu.ut <at> gmail.com>, 30681 <at> debbugs.gnu.org
Subject: Re: bug#30681: What characters are in [[:space:]]?
Date: Fri, 2 Mar 2018 14:32:06 -0800
On 03/02/2018 09:23 AM, Peng Yu wrote:
> Where is this info documented for grep?

It's not documented for grep because it's not part of grep. It's part of 
your locale.
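
To see the locale dependence directly, the same pattern can be run under different locales. The following is only a sketch, assuming GNU grep and a POSIX shell; the helper name nbsp_is_space is hypothetical and not part of grep. It feeds "a", U+00A0 (UTF-8 bytes C2 A0), "b" to grep under the locale given as the first argument.

```shell
# Hypothetical helper: report whether U+00A0 (UTF-8 bytes C2 A0, written
# here as octal escapes) between "a" and "b" is matched by [[:space:]]
# under the locale named in $1 (e.g. C, or any name from `locale -a`).
nbsp_is_space() {
    if printf 'a\302\240b\n' | LC_ALL="$1" grep -q 'a[[:space:]]b'; then
        echo yes
    else
        echo no
    fi
}

# In the C locale every byte is its own character, so the two-byte
# sequence cannot match a single [[:space:]].
nbsp_is_space C

# The answer for the caller's own locale depends on that locale's
# character classification and may differ between systems.
nbsp_is_space "${LANG:-C}"
```

If the two calls print different answers, the difference comes from the locale data, not from grep itself.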





Reply sent to Jim Meyering <jim <at> meyering.net>:
You have taken responsibility. (Fri, 02 Mar 2018 23:20:02 GMT)

Notification sent to Peng Yu <pengyu.ut <at> gmail.com>:
bug acknowledged by developer. (Fri, 02 Mar 2018 23:20:02 GMT)

Message #13 received at 30681-done <at> debbugs.gnu.org:

From: Jim Meyering <jim <at> meyering.net>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: 30681-done <at> debbugs.gnu.org, Peng Yu <pengyu.ut <at> gmail.com>
Subject: Re: bug#30681: What characters are in [[:space:]]?
Date: Fri, 2 Mar 2018 15:19:08 -0800
tags 30681 notabug
stop

On Fri, Mar 2, 2018 at 2:32 PM, Paul Eggert <eggert <at> cs.ucla.edu> wrote:
> On 03/02/2018 09:23 AM, Peng Yu wrote:
>>
>> Where is this info documented for grep?
>
> It's not documented for grep because it's not part of grep. It's part of
> your locale.

You can check for yourself.
In every one of the 818 locales installed on a Fedora 27 system, I see
the same five bytes:

$ perl -e 'print pack ("C*", 0..255);' | grep -ao '[[:space:]]' | tr -d '\n' | od -ac -An
  ht  vt  ff  cr  sp
  \t  \v  \f  \r
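
The check above can be repeated per locale. The following is a rough sketch, assuming GNU grep (for -a and -o) and a POSIX shell; the function names are hypothetical, and the printf loop is a stand-in for the perl one-liner so that perl is not required. Newline never shows up in the result because grep treats it as the line terminator.

```shell
# Emit all 256 byte values, 0 through 255 (stand-in for the perl one-liner).
all_bytes() {
    i=0
    while [ "$i" -le 255 ]; do
        # %b interprets \0NNN as the byte with octal value NNN.
        printf '%b' "\\0$(printf '%o' "$i")"
        i=$((i + 1))
    done
}

# Hypothetical helper: print, as one hex string, the single bytes that
# [[:space:]] matches under the locale named in $1.
space_bytes() {
    all_bytes | LC_ALL="$1" grep -ao '[[:space:]]' | tr -d '\n' |
        od -An -tx1 | tr -d ' \n'
    echo
}

# Hypothetical helper: tally the results over every installed locale.
# One distinct output line means all locales agree on the single-byte set.
space_bytes_all_locales() {
    for loc in $(locale -a); do
        space_bytes "$loc"
    done | sort | uniq -c
}

# In the C locale this prints 090b0c0d20 (ht, vt, ff, cr, sp).
space_bytes C
```

Calling space_bytes_all_locales can take a while on a system with hundreds of locales, since the byte stream is regenerated for each one.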




Information forwarded to bug-grep <at> gnu.org:
bug#30681; Package grep. (Sun, 04 Mar 2018 06:45:01 GMT)

Message #16 received at 30681-done <at> debbugs.gnu.org:

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Jim Meyering <jim <at> meyering.net>
Cc: 30681-done <at> debbugs.gnu.org, Peng Yu <pengyu.ut <at> gmail.com>
Subject: Re: bug#30681: What characters are in [[:space:]]?
Date: Sat, 3 Mar 2018 22:44:29 -0800
Jim Meyering wrote:
> In every one of the 818 locales installed on a Fedora 27 system, I see
> the same five bytes:

Aren't some non-ASCII characters also spaces, in some locales?




Information forwarded to bug-grep <at> gnu.org:
bug#30681; Package grep. (Sun, 04 Mar 2018 10:24:01 GMT)

Message #19 received at 30681-done <at> debbugs.gnu.org:

From: Jim Meyering <jim <at> meyering.net>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: 30681-done <at> debbugs.gnu.org, Peng Yu <pengyu.ut <at> gmail.com>
Subject: Re: bug#30681: What characters are in [[:space:]]?
Date: Sun, 4 Mar 2018 02:22:39 -0800
On Sat, Mar 3, 2018 at 10:44 PM, Paul Eggert <eggert <at> cs.ucla.edu> wrote:
> Jim Meyering wrote:
>>
>> In every one of the 818 locales installed on a Fedora 27 system, I see
>> the same five bytes:
>
> Aren't some non-ASCII characters also spaces, in some locales?

Sorry I was unclear.
Those are the single-byte ones.
In multi-byte locales, there are also some others (length two bytes or
more), indeed.
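
Multi-byte candidates can be probed one at a time. This is a sketch only, assuming GNU grep and a POSIX shell; the helper name one_space_char and the choice of candidate characters are illustrative assumptions, and which of them a given locale classifies as space is left for the reader to observe.

```shell
# Hypothetical helper: does the byte sequence $2 (given as printf octal
# escapes) form exactly one [[:space:]] character under the locale $1?
one_space_char() {
    if printf "$2\n" | LC_ALL="$1" grep -aq '^[[:space:]]$'; then
        echo yes
    else
        echo no
    fi
}

# Locale to probe: the caller's own, falling back to C.
loc=${LC_ALL:-${LANG:-C}}

# Candidates as UTF-8 bytes: U+0020 SPACE, U+00A0 NO-BREAK SPACE,
# U+2003 EM SPACE, U+3000 IDEOGRAPHIC SPACE.
for seq in '\040' '\302\240' '\342\200\203' '\343\200\200'; do
    printf '%s\t%s\n' "$seq" "$(one_space_char "$loc" "$seq")"
done
```

Under the C locale only the first candidate can answer yes, because each of the others is more than one byte and the pattern requires the whole line to be a single character.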




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sun, 01 Apr 2018 11:24:05 GMT)

This bug report was last modified 7 years and 174 days ago.


