GNU bug report logs - #25366
26.0.50; [:blank:] character class should match all Unicode horizontal whitespace

Package: emacs;

Reported by: Philipp Stephani <p.stephani2 <at> gmail.com>

Date: Thu, 5 Jan 2017 13:47:02 UTC

Severity: wishlist

Tags: confirmed

Found in version 26.0.50

Done: Philipp Stephani <p.stephani2 <at> gmail.com>

Bug is archived. No further changes may be made.

From: help-debbugs <at> gnu.org (GNU bug Tracking System)
To: Philipp Stephani <p.stephani2 <at> gmail.com>
Cc: tracker <at> debbugs.gnu.org
Subject: bug#25366: closed (26.0.50; [:blank:] character class should
 match all Unicode horizontal whitespace)
Date: Fri, 06 Jan 2017 19:22:02 +0000
[Message part 1 (text/plain, inline)]
Your message dated Fri, 06 Jan 2017 19:21:05 +0000
with message-id <CAArVCkTbYOPOp1RB=66F96pq07_z5wwV8PQ=2cCyLS8Uk1dj-g <at> mail.gmail.com>
and subject line Re: bug#25366: 26.0.50; [:blank:] character class should match all Unicode horizontal whitespace
has caused the debbugs.gnu.org bug report #25366,
regarding 26.0.50; [:blank:] character class should match all Unicode horizontal whitespace
to be marked as done.

(If you believe you have received this mail in error, please contact
help-debbugs <at> gnu.org.)


-- 
25366: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=25366
GNU Bug Tracking System
Contact help-debbugs <at> gnu.org with problems
[Message part 2 (message/rfc822, inline)]
From: Philipp Stephani <p.stephani2 <at> gmail.com>
To: bug-gnu-emacs <at> gnu.org
Subject: 26.0.50; [:blank:] character class should match all Unicode horizontal
 whitespace
Date: Thu, 05 Jan 2017 14:46:01 +0100
(string-match-p "[[:blank:]]" "\N{HAIR SPACE}")
=> nil, expected 0

[[:blank:]] should be the same as \h in PCRE.


In GNU Emacs 26.0.50.26 (x86_64-unknown-linux-gnu, GTK+ Version 3.10.8)
 of 2017-01-05 built on unknown
Repository revision: d88cdad2847726438c7d1de9fd2651c4be9243aa
Windowing system distributor 'The X.Org Foundation', version 11.0.11501000
System Description:	Ubuntu 14.04 LTS

Recent messages:
For information about GNU Emacs and the GNU system, type C-h C-a.
Entering debugger...
Back to top level

Configured using:
 'configure --with-modules --enable-checking
 --enable-check-lisp-object-type 'CFLAGS=-ggdb3 -O0''

Configured features:
XPM JPEG TIFF GIF PNG SOUND GSETTINGS NOTIFY GNUTLS FREETYPE XFT ZLIB
TOOLKIT_SCROLL_BARS GTK3 X11 MODULES

Important settings:
  value of $LANG: en_US.UTF-8
  locale-coding-system: utf-8-unix

Major mode: Lisp Interaction

Minor modes in effect:
  tooltip-mode: t
  global-eldoc-mode: t
  electric-indent-mode: t
  mouse-wheel-mode: t
  tool-bar-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  blink-cursor-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  line-number-mode: t
  transient-mark-mode: t

Load-path shadows:
None found.

Features:
(shadow sort mail-extr emacsbug message subr-x puny seq byte-opt gv
bytecomp byte-compile cl-extra cconv dired dired-loaddefs format-spec
rfc822 mml mml-sec password-cache epa derived epg epg-config gnus-util
rmail rmail-loaddefs mm-decode mm-bodies mm-encode mail-parse rfc2231
mailabbrev gmm-utils mailheader sendmail rfc2047 rfc2045 ietf-drums
mm-util mail-prsvr mail-utils help-mode easymenu cl-loaddefs pcase
cl-lib debug time-date mule-util tooltip eldoc electric uniquify
ediff-hook vc-hooks lisp-float-type mwheel term/x-win x-win
term/common-win x-dnd tool-bar dnd fontset image regexp-opt fringe
tabulated-list replace newcomment text-mode elisp-mode lisp-mode
prog-mode register page menu-bar rfn-eshadow isearch timer select
scroll-bar mouse jit-lock font-lock syntax facemenu font-core
term/tty-colors frame cl-generic cham georgian utf-8-lang misc-lang
vietnamese tibetan thai tai-viet lao korean japanese eucjp-ms cp51932
hebrew greek romanian slovak czech european ethiopic indian cyrillic
chinese composite charscript case-table epa-hook jka-cmpr-hook help
simple abbrev obarray minibuffer cl-preloaded nadvice loaddefs button
faces cus-face macroexp files text-properties overlay sha1 md5 base64
format env code-pages mule custom widget hashtable-print-readable
backquote inotify dynamic-setting system-font-setting
font-render-setting move-toolbar gtk x-toolkit x multi-tty
make-network-process emacs)

Memory information:
((conses 16 182571 10570)
 (symbols 48 31257 1)
 (miscs 40 340 231)
 (strings 32 71112 6419)
 (string-bytes 1 1678721)
 (vectors 16 14561)
 (vector-slots 8 529555 10250)
 (floats 8 183 150)
 (intervals 56 250 6)
 (buffers 976 13)
 (heap 1024 36602 1391))

-- 
Google Germany GmbH
Erika-Mann-Straße 33
80636 München

Registration court and number: Hamburg, HRB 86891
Registered office: Hamburg
Managing directors: Matthew Scott Sucherman, Paul Terence Manicle

This e-mail is confidential.  If you are not the right addressee please do not
forward it, please inform the sender, and please erase this e-mail including
any attachments.  Thanks.


[Message part 3 (message/rfc822, inline)]
From: Philipp Stephani <p.stephani2 <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 25366-done <at> debbugs.gnu.org
Subject: Re: bug#25366: 26.0.50; [:blank:] character class should match all
 Unicode horizontal whitespace
Date: Fri, 06 Jan 2017 19:21:05 +0000
[Message part 4 (text/plain, inline)]
Philipp Stephani <p.stephani2 <at> gmail.com> wrote on Fri, 6 Jan 2017 at
20:10:

> Eli Zaretskii <eliz <at> gnu.org> wrote on Fri, 6 Jan 2017 at 16:11:
>
> > From: Philipp Stephani <p.stephani2 <at> gmail.com>
> > Date: Fri, 06 Jan 2017 15:00:22 +0000
> > Cc: 25366 <at> debbugs.gnu.org
> >
> > http://www.unicode.org/reports/tr18/tr18-19.html#Compatibility_Properties
> >
> >  Patches to that effect are welcome.
> >
> > Here's a patch.
>
> Thanks.  A few minor comments below.
>
> > +/* Return true if C is a horizontal whitespace character, as defined
> > +   by http://www.unicode.org/reports/tr18/tr18-19.html#blank.  */
> > +bool
> > +blankp (int c)
> > +{
> > +  if (c == '\t')
> > +    return true;
>
> Why does this test explicitly only for a TAB?  What about SPC, for
> example?
>
>
> Because TAB is the only character that is blank, but doesn't have the
> general category Zs.
> I've now also included space and added a comment. The risk that the
> general category of space will ever be changed seems very small.
>
>
>
> > --- a/doc/lispref/searching.texi
> > +++ b/doc/lispref/searching.texi
> > @@ -553,7 +553,10 @@ Char Classes
> >  (@pxref{Character Properties}) indicates they are alphabetic
> >  characters.
> >  @item [:blank:]
> > -This matches space and tab only.
> > +This matches horizontal whitespace, as defined by Unicode Technical
> > +Standard #18.  In particular, it matches tabs and characters whose
> > +Unicode @samp{general-category} property (@pxref{Character
> > +Properties}) indicates they are spacing separators.
>
> Similarly here: I find the lack of reference to a space potentially
> confusing.
>
>
> Added.
>
>
>
> > +** The regular expression character class [:blank:] now matches
> > +Unicode horizontal whitespace as defined in
> > +http://www.unicode.org/reports/tr18/tr18-19.html#blank.
>
> The reference to a particular version of UTS#18 might become obsolete
> when a new version is released.  So I suggest to provide a general
> reference to the report and its section, not an exact URL.
>
>
> Done.
>


Pushed to master as 512e9886be.
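
For readers following the thread, the check discussed in the review can be
sketched in standalone C.  This is only an illustration under stated
assumptions, not the committed code: the name blankp_sketch and the
zs_ranges table are hypothetical, and the table hardcodes the code points
whose Unicode general category is Zs instead of looking the category up
in Emacs's character property tables.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a general-category lookup: the code point
   ranges whose Unicode general category is Zs (space separator).
   Hardcoded here for illustration only.  */
static const struct { int first, last; } zs_ranges[] = {
  { 0x0020, 0x0020 },  /* SPACE */
  { 0x00A0, 0x00A0 },  /* NO-BREAK SPACE */
  { 0x1680, 0x1680 },  /* OGHAM SPACE MARK */
  { 0x2000, 0x200A },  /* EN QUAD .. HAIR SPACE */
  { 0x202F, 0x202F },  /* NARROW NO-BREAK SPACE */
  { 0x205F, 0x205F },  /* MEDIUM MATHEMATICAL SPACE */
  { 0x3000, 0x3000 },  /* IDEOGRAPHIC SPACE */
};

/* Return true if C is horizontal whitespace: TAB, or a character
   whose general category is Zs.  */
bool
blankp_sketch (int c)
{
  /* TAB is blank but is not in category Zs, so it needs its own test.
     Space is in Zs, but is tested explicitly as well.  */
  if (c == ' ' || c == '\t')
    return true;

  /* Otherwise accept exactly the characters listed in the Zs table.  */
  for (size_t i = 0; i < sizeof zs_ranges / sizeof zs_ranges[0]; i++)
    if (c >= zs_ranges[i].first && c <= zs_ranges[i].last)
      return true;

  return false;
}
```

With this sketch, blankp_sketch (0x200A) (HAIR SPACE, the character from
the original report) returns true, while '\n' and U+200B ZERO WIDTH SPACE
(general category Cf) return false.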

This bug report was last modified 8 years and 194 days ago.
