GNU bug report logs - #58992
28.2; "lax space matching" no longer works


Package: emacs

Reported by: Vincent Lefevre <vincent <at> vinc17.net>

Date: Thu, 3 Nov 2022 16:54:02 UTC

Severity: normal

Found in version 28.2

Done: Eli Zaretskii <eliz <at> gnu.org>

Bug is archived. No further changes may be made.



Message #145 received at 58992 <at> debbugs.gnu.org (full text, mbox):

From: Vincent Lefevre <vincent <at> vinc17.net>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: Robert Pluim <rpluim <at> gmail.com>, 58992 <at> debbugs.gnu.org
Subject: Re: bug#58992: 28.2; "lax space matching" no longer works
Date: Fri, 4 Nov 2022 16:00:02 +0100

On 2022-11-04 16:04:07 +0200, Eli Zaretskii wrote:
> > From: Robert Pluim <rpluim <at> gmail.com>
> > Cc: Vincent Lefevre <vincent <at> vinc17.net>,  58992 <at> debbugs.gnu.org
> > Date: Fri, 04 Nov 2022 14:56:36 +0100
> > 
> > >>>>> On Fri, 04 Nov 2022 13:45:35 +0200, Eli Zaretskii <eliz <at> gnu.org> said:
> >     >> A character alternative can also specify named character classes
> >     >> (*note Char Classes::).  This is a POSIX feature.  [...]
> >     >> 
> >     >> You must not change its behavior! Making it depend on the major mode
> >     >> is even worse.
> > 
> >     Eli> Too late for such changes, sorry.  Emacs interprets [:space:] like
> >     Eli> that since at least Emacs 22, if not before.
> > 
> > Would it help if it said "This is based on a POSIX feature, but not
> > 100% identical"?
> 
> No.  But we can remove that sentence, since it doesn't add anything to
> the text.  Done.

IMHO, the section "Syntax of Regular Expressions" (in both the Emacs
and Elisp manuals) should warn that the meaning of a regular expression
may depend on the major mode, since [:space:] matches whatever the
current syntax table classifies as whitespace.

Moreover, the [[:space:]\n]+ suggestion for search-whitespace-regexp
should be changed to something that does not use [:space:], as there
is no guarantee that the usual whitespace characters (e.g. space and
tab characters) have whitespace syntax. The manual says:

  @item Whitespace characters: @samp{@ } or @samp{-}
  Characters that separate symbols and words from each other.
  Typically, whitespace characters have no other syntactic significance,
  and multiple whitespace characters are syntactically equivalent to a
  single one.  Space, tab, and formfeed are classified as whitespace in
  almost all major modes.

But for Python, multiple whitespace characters are not syntactically
equivalent to a single one. So in a Python major mode, the user would
not want to use [:space:] for searching.
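
As a minimal sketch of what such a replacement could look like (an
illustration only, not a change adopted in this thread), the regexp can
list the characters explicitly so that it never consults the syntax
table. The exact character set below is an assumption; extend it if
other characters should count as whitespace when searching.

```elisp
;; Assumed alternative to "[[:space:]\n]+": an explicit character list
;; (space, tab, newline) that matches the same way in every major mode,
;; regardless of which characters the mode gives whitespace syntax.
(setq search-whitespace-regexp "[ \t\n]+")
```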

-- 
Vincent Lefèvre <vincent <at> vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)




This bug report was last modified 2 years and 203 days ago.


