GNU bug report logs - #25366
26.0.50; [:blank:] character class should match all Unicode horizontal whitespace


Package: emacs

Reported by: Philipp Stephani <p.stephani2 <at> gmail.com>

Date: Thu, 5 Jan 2017 13:47:02 UTC

Severity: wishlist

Tags: confirmed

Found in version 26.0.50

Done: Philipp Stephani <p.stephani2 <at> gmail.com>

Bug is archived. No further changes may be made.


From: help-debbugs <at> gnu.org (GNU bug Tracking System)
To: Philipp Stephani <p.stephani2 <at> gmail.com>
Subject: bug#25366: closed (Re: bug#25366: 26.0.50; [:blank:] character
 class should match all Unicode horizontal whitespace)
Date: Fri, 06 Jan 2017 19:22:02 +0000
[Message part 1 (text/plain, inline)]
Your bug report

#25366: 26.0.50; [:blank:] character class should match all Unicode horizontal whitespace

which was filed against the emacs package, has been closed.

The explanation is attached below, along with your original report.
If you require more details, please reply to 25366 <at> debbugs.gnu.org.

-- 
25366: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=25366
GNU Bug Tracking System
Contact help-debbugs <at> gnu.org with problems
[Message part 2 (message/rfc822, inline)]
From: Philipp Stephani <p.stephani2 <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 25366-done <at> debbugs.gnu.org
Subject: Re: bug#25366: 26.0.50; [:blank:] character class should match all
 Unicode horizontal whitespace
Date: Fri, 06 Jan 2017 19:21:05 +0000
[Message part 3 (text/plain, inline)]
Philipp Stephani <p.stephani2 <at> gmail.com> wrote on Fri, 6 Jan 2017 at
20:10:

> Eli Zaretskii <eliz <at> gnu.org> wrote on Fri, 6 Jan 2017 at 16:11:
>
> > From: Philipp Stephani <p.stephani2 <at> gmail.com>
> > Date: Fri, 06 Jan 2017 15:00:22 +0000
> > Cc: 25366 <at> debbugs.gnu.org
> >
> > http://www.unicode.org/reports/tr18/tr18-19.html#Compatibility_Properties
> >
> >  Patches to that effect are welcome.
> >
> > Here's a patch.
>
> Thanks.  A few minor comments below.
>
> > +/* Return true if C is a horizontal whitespace character, as defined
> > +   by http://www.unicode.org/reports/tr18/tr18-19.html#blank.  */
> > +bool
> > +blankp (int c)
> > +{
> > +  if (c == '\t')
> > +    return true;
>
> Why does this test explicitly only for a TAB?  What about SPC, for
> example?
>
>
> Because TAB is the only character that is blank, but doesn't have the
> general category Zs.
> I've now also included space and added a comment. The risk that the
> general category of space will ever be changed seems very small.
>
>
>
> > --- a/doc/lispref/searching.texi
> > +++ b/doc/lispref/searching.texi
> > @@ -553,7 +553,10 @@ Char Classes
> >  (@pxref{Character Properties}) indicates they are alphabetic
> >  characters.
> >  @item [:blank:]
> > -This matches space and tab only.
> > +This matches horizontal whitespace, as defined by Unicode Technical
> > +Standard #18.  In particular, it matches tabs and characters whose
> > +Unicode @samp{general-category} property (@pxref{Character
> > +Properties}) indicates they are spacing separators.
>
> Similarly here: I find the lack of reference to a space potentially
> confusing.
>
>
> Added.
>
>
>
> > +** The regular expression character class [:blank:] now matches
> > +Unicode horizontal whitespace as defined in
> > +http://www.unicode.org/reports/tr18/tr18-19.html#blank.
>
> The reference to a particular version of UTS#18 might become obsolete
> when a new version is released.  So I suggest providing a general
> reference to the report and its section, not an exact URL.
>
>
> Done.
>


Pushed to master as 512e9886be.
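
The blankp logic discussed above (TAB and SPC tested explicitly, every
other character decided by general category Zs) can be sketched in
standalone C as follows.  This is only an illustration, not the
committed code: the names blankp_sketch and check_examples are
hypothetical, and the hardcoded list of Zs code points is an assumption
standing in for the lookup of the Unicode general-category property
that the patch performs inside Emacs.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the patch's blankp.  The patch decides by
   the Unicode general category; this sketch hardcodes the Zs (space
   separator) code points instead, which is an assumption that holds
   only for Unicode versions where U+180E is no longer in Zs.  */
bool
blankp_sketch (int c)
{
  /* TAB is blank per UTS #18, but its general category is Cc, not Zs,
     so it needs an explicit test.  */
  if (c == '\t')
    return true;

  /* SPC is in Zs already; the explicit test is redundant but makes
     the intent obvious to the reader.  */
  if (c == ' ')
    return true;

  /* Remaining Zs code points: NO-BREAK SPACE, OGHAM SPACE MARK,
     EN QUAD through HAIR SPACE, NARROW NO-BREAK SPACE,
     MEDIUM MATHEMATICAL SPACE, IDEOGRAPHIC SPACE.  */
  return (c == 0x00A0 || c == 0x1680
          || (c >= 0x2000 && c <= 0x200A)
          || c == 0x202F || c == 0x205F || c == 0x3000);
}

/* Restates the expectation from the original report against the
   sketch: HAIR SPACE must count as blank, line breaks must not.  */
void
check_examples (void)
{
  assert (blankp_sketch (0x200A));   /* HAIR SPACE */
  assert (blankp_sketch ('\t'));
  assert (blankp_sketch (' '));
  assert (!blankp_sketch ('\n'));    /* vertical, not horizontal */
  assert (!blankp_sketch ('a'));
}
```

Calling check_examples () from a test driver aborts on the first
violated expectation.  A table-driven lookup, as in the patch, avoids
having to update such a list by hand when the Unicode data changes.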
[Message part 5 (message/rfc822, inline)]
From: Philipp Stephani <p.stephani2 <at> gmail.com>
To: bug-gnu-emacs <at> gnu.org
Subject: 26.0.50; [:blank:] character class should match all Unicode horizontal
 whitespace
Date: Thu, 05 Jan 2017 14:46:01 +0100
(string-match-p "[[:blank:]]" "\N{HAIR SPACE}")
=> nil, expected 0

[[:blank:]] should be the same as \h in PCRE.


In GNU Emacs 26.0.50.26 (x86_64-unknown-linux-gnu, GTK+ Version 3.10.8)
 of 2017-01-05 built on unknown
Repository revision: d88cdad2847726438c7d1de9fd2651c4be9243aa
Windowing system distributor 'The X.Org Foundation', version 11.0.11501000
System Description:	Ubuntu 14.04 LTS

Recent messages:
For information about GNU Emacs and the GNU system, type C-h C-a.
Entering debugger...
Back to top level

Configured using:
 'configure --with-modules --enable-checking
 --enable-check-lisp-object-type 'CFLAGS=-ggdb3 -O0''

Configured features:
XPM JPEG TIFF GIF PNG SOUND GSETTINGS NOTIFY GNUTLS FREETYPE XFT ZLIB
TOOLKIT_SCROLL_BARS GTK3 X11 MODULES

Important settings:
  value of $LANG: en_US.UTF-8
  locale-coding-system: utf-8-unix

Major mode: Lisp Interaction

Minor modes in effect:
  tooltip-mode: t
  global-eldoc-mode: t
  electric-indent-mode: t
  mouse-wheel-mode: t
  tool-bar-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  blink-cursor-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  line-number-mode: t
  transient-mark-mode: t

Load-path shadows:
None found.

Features:
(shadow sort mail-extr emacsbug message subr-x puny seq byte-opt gv
bytecomp byte-compile cl-extra cconv dired dired-loaddefs format-spec
rfc822 mml mml-sec password-cache epa derived epg epg-config gnus-util
rmail rmail-loaddefs mm-decode mm-bodies mm-encode mail-parse rfc2231
mailabbrev gmm-utils mailheader sendmail rfc2047 rfc2045 ietf-drums
mm-util mail-prsvr mail-utils help-mode easymenu cl-loaddefs pcase
cl-lib debug time-date mule-util tooltip eldoc electric uniquify
ediff-hook vc-hooks lisp-float-type mwheel term/x-win x-win
term/common-win x-dnd tool-bar dnd fontset image regexp-opt fringe
tabulated-list replace newcomment text-mode elisp-mode lisp-mode
prog-mode register page menu-bar rfn-eshadow isearch timer select
scroll-bar mouse jit-lock font-lock syntax facemenu font-core
term/tty-colors frame cl-generic cham georgian utf-8-lang misc-lang
vietnamese tibetan thai tai-viet lao korean japanese eucjp-ms cp51932
hebrew greek romanian slovak czech european ethiopic indian cyrillic
chinese composite charscript case-table epa-hook jka-cmpr-hook help
simple abbrev obarray minibuffer cl-preloaded nadvice loaddefs button
faces cus-face macroexp files text-properties overlay sha1 md5 base64
format env code-pages mule custom widget hashtable-print-readable
backquote inotify dynamic-setting system-font-setting
font-render-setting move-toolbar gtk x-toolkit x multi-tty
make-network-process emacs)

Memory information:
((conses 16 182571 10570)
 (symbols 48 31257 1)
 (miscs 40 340 231)
 (strings 32 71112 6419)
 (string-bytes 1 1678721)
 (vectors 16 14561)
 (vector-slots 8 529555 10250)
 (floats 8 183 150)
 (intervals 56 250 6)
 (buffers 976 13)
 (heap 1024 36602 1391))

-- 
Google Germany GmbH
Erika-Mann-Straße 33
80636 München

Registration court and number: Hamburg, HRB 86891
Registered office: Hamburg
Managing directors: Matthew Scott Sucherman, Paul Terence Manicle

This e-mail is confidential.  If you are not the right addressee please do not
forward it, please inform the sender, and please erase this e-mail including
any attachments.  Thanks.



This bug report was last modified 8 years and 194 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.