GNU bug report logs - #22147
Obsolete search-forward-lax-whitespace


Package: emacs;

Reported by: Juri Linkov <juri <at> linkov.net>

Date: Fri, 11 Dec 2015 23:54:02 UTC

Severity: normal

Tags: fixed

Fixed in version 28.0.50

Done: Juri Linkov <juri <at> linkov.net>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 22147 in the body.
You can then email your comments to 22147 AT debbugs.gnu.org in the normal way.



Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#22147; Package emacs. (Fri, 11 Dec 2015 23:54:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to Juri Linkov <juri <at> linkov.net>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Fri, 11 Dec 2015 23:54:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Juri Linkov <juri <at> linkov.net>
To: bug-gnu-emacs <at> gnu.org
Subject: Obsolete search-forward-lax-whitespace
Date: Sat, 12 Dec 2015 01:52:03 +0200
After commit e5ece322 that removed a layer of indirection for lax-whitespace
it's not possible anymore to override search-forward-lax-whitespace
with own implementation to ignore all possible whitespace instead of
just spaces in the search string.

For example, such customization as this one:

(setq  search-whitespace-regexp "\\(\\s-\\|\n\\)+")
(defun search-whitespace-regexp (string)
  "Return a regexp which ignores all possible whitespace in search string.
Uses the value of the variable `search-whitespace-regexp'."
  (if (or (not (stringp search-whitespace-regexp))
          (null (if isearch-regexp
                    isearch-regexp-lax-whitespace
                  isearch-lax-whitespace)))
      string
    (replace-regexp-in-string
     search-whitespace-regexp
     search-whitespace-regexp
     string nil t)))
(defun search-forward-lax-whitespace (string &optional bound noerror count)
  (re-search-forward (search-whitespace-regexp (regexp-quote string)) bound noerror count))
(defun search-backward-lax-whitespace (string &optional bound noerror count)
  (re-search-backward (search-whitespace-regexp (regexp-quote string)) bound noerror count))
(defun re-search-forward-lax-whitespace (regexp &optional bound noerror count)
  (re-search-forward (search-whitespace-regexp regexp) bound noerror count))
(defun re-search-backward-lax-whitespace (regexp &optional bound noerror count)
  (re-search-backward (search-whitespace-regexp regexp) bound noerror count))

allowed to search for a string with a newline like ‘C-s abc C-q C-j def’
and match the text “abc def”.

It's not clear what to do with this customization now using
a replacement recommended in (make-obsolete old "instead, use (let
((search-spaces-regexp search-whitespace-regexp)) (re-search-... ...))"
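
For illustration, the replacement named in that obsoletion message could be used roughly as follows. This is a minimal sketch, not code from the commit: the helper name my-search-forward-lax is hypothetical and the whitespace regexp is an assumed value.

```elisp
;; Hypothetical helper (not part of Emacs) following the obsoletion advice.
(defun my-search-forward-lax (string &optional bound noerror count)
  "Search forward for STRING, letting each run of spaces match any whitespace."
  ;; While `search-spaces-regexp' is bound, every run of literal spaces
  ;; in the pattern is matched against this regexp instead.
  (let ((search-spaces-regexp "[ \t\r\n]+"))
    (re-search-forward (regexp-quote string) bound noerror count)))
```

This loosens only the space characters in STRING. A newline in STRING still has to match a literal newline, which is the gap the customization above was meant to close.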




Added indication that bug 22147 blocks 19759 Request was from Glenn Morris <rgm <at> gnu.org> to control <at> debbugs.gnu.org. (Sat, 12 Dec 2015 00:01:02 GMT) Full text and rfc822 format available.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#22147; Package emacs. (Sat, 12 Dec 2015 00:45:02 GMT) Full text and rfc822 format available.

Message #10 received at 22147 <at> debbugs.gnu.org (full text, mbox):

From: Artur Malabarba <bruce.connor.am <at> gmail.com>
To: Juri Linkov <juri <at> linkov.net>
Cc: 22147 <at> debbugs.gnu.org
Subject: Re: bug#22147: Obsolete search-forward-lax-whitespace
Date: Sat, 12 Dec 2015 00:44:19 +0000
On 11 Dec 2015 11:52 pm, "Juri Linkov" <juri <at> linkov.net> wrote:
>
> It's not clear what to do with this customization now using
> a replacement recommended in (make-obsolete old "instead, use (let
> ((search-spaces-regexp search-whitespace-regexp)) (re-search-... ...))"

The obsoletion message tells you what to use instead of
search-forward-lax-whitespace, but that doesn't help you because you
weren't using this function, you were overriding it (IIUC).

Fortunately, I think you don't need to override anything at all. You can
just set search-default-regexp-function to your #'search-whitespace-regexp.
IIUC, that should have the same effect.
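
A minimal sketch of that setup, under some assumptions: the function name my-lax-whitespace-regexp is hypothetical, and the option is written as search-default-mode, the name used in released Emacs 25.1 and later (this thread refers to it by its earlier development names).

```elisp
;; Hypothetical function: turn a literal search string into a regexp in
;; which any run of whitespace matches any other run of whitespace.
(defun my-lax-whitespace-regexp (string &optional _lax)
  "Return a regexp for STRING that ignores differences in whitespace."
  (replace-regexp-in-string "[ \t\r\n]+" "[ \t\r\n]+"
                            (regexp-quote string) t t))

;; Assumption: a function value here is used as the default Isearch
;; regexp function when a search starts.
(setq search-default-mode #'my-lax-whitespace-regexp)
```

The optional second argument is accepted and ignored because Isearch may pass a LAX flag to the function.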

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#22147; Package emacs. (Sat, 12 Dec 2015 23:49:02 GMT) Full text and rfc822 format available.

Message #13 received at 22147 <at> debbugs.gnu.org (full text, mbox):

From: Juri Linkov <juri <at> linkov.net>
To: Artur Malabarba <bruce.connor.am <at> gmail.com>
Cc: 22147 <at> debbugs.gnu.org
Subject: Re: bug#22147: Obsolete search-forward-lax-whitespace
Date: Sun, 13 Dec 2015 01:31:10 +0200
> Fortunately, I think you don't need to override anything at all. You can
> just set search-default-regexp-function to your #'search-whitespace-regexp.
> IIUC, that should have the same effect.

Thanks, setting search-default-regexp-mode to #'search-whitespace-regexp
gives the same effect.

One drawback is that then it removes char-fold search.
Do you have a plan to combine lax-whitespace search
with char-fold search?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#22147; Package emacs. (Sun, 13 Dec 2015 00:30:02 GMT) Full text and rfc822 format available.

Message #16 received at 22147 <at> debbugs.gnu.org (full text, mbox):

From: Artur Malabarba <bruce.connor.am <at> gmail.com>
To: Juri Linkov <juri <at> linkov.net>
Cc: 22147 <at> debbugs.gnu.org
Subject: Re: bug#22147: Obsolete search-forward-lax-whitespace
Date: Sun, 13 Dec 2015 00:29:23 +0000
On 12 Dec 2015 11:31 pm, "Juri Linkov" <juri <at> linkov.net> wrote:
>
> Thanks, setting search-default-regexp-mode to #'search-whitespace-regexp
> gives the same effect.
>
> One drawback is that then it removes char-fold search.

True. I think it might also be possible to get what you want by just
setting the search-whitespace-regexp variable to "[ \t\r\n]+". That would
have the advantage of not removing char folding (and would reduce
everything to one line).
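
As an init-file fragment, that one-line setting would look roughly like this. It is a sketch of the suggestion only: with lax whitespace matching enabled in Isearch, a run of spaces typed in the search string is matched against this regexp.

```elisp
;; Let spaces typed in the search string match any run of spaces, tabs,
;; carriage returns, or newlines in the buffer.
(setq search-whitespace-regexp "[ \t\r\n]+")
```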

> Do you have a plan to combine lax-whitespace search
> with char-fold search?

Char-folding is perfectly compatible with the regular lax-whitespace.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#22147; Package emacs. (Mon, 14 Dec 2015 00:44:02 GMT) Full text and rfc822 format available.

Message #19 received at 22147 <at> debbugs.gnu.org (full text, mbox):

From: Juri Linkov <juri <at> linkov.net>
To: Artur Malabarba <bruce.connor.am <at> gmail.com>
Cc: 22147 <at> debbugs.gnu.org
Subject: Re: bug#22147: Obsolete search-forward-lax-whitespace
Date: Mon, 14 Dec 2015 02:23:34 +0200
>> Thanks, setting search-default-regexp-mode to #'search-whitespace-regexp
>> gives the same effect.
>>
>> One drawback is that then it removes char-fold search.
>
> True. I think it might also be possible to get what you want by just
> setting the search-whitespace-regexp variable to "[ \t\r\n]+". That would
> have the advantage of not removing char folding (and would reduce
> everything to one line).

This still doesn't allow ^J in the search string to match a newline.
I often paste multi-line texts into the search string and need to
ignore differences in newlines usually caused by text re-filling.

What the mentioned regexp function does is replace all whitespace
in the search string with the regexp that matches whitespace (it's
also possible to replace whitespace with a space character and then
use search-spaces-regexp to match this space character using the regexp
in search-whitespace-regexp).

By analogy with char-folding, this means symmetric whitespace folding.
Where char-fold-symmetric causes all members of a folding equivalence
class to be treated equivalently, lax-whitespace-symmetric could
treat only whitespace characters equivalently.

>> Do you have a plan to combine lax-whitespace search with char-fold search?
>
> Char-folding is perfectly compatible with the regular lax-whitespace.

Could char-folding already do what is described above (maybe simpler
would be to normalize the search string by turning all whitespace
into space characters), or better first implement char-fold-symmetric
and then use it for whitespace characters?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#22147; Package emacs. (Mon, 14 Dec 2015 01:13:02 GMT) Full text and rfc822 format available.

Message #22 received at 22147 <at> debbugs.gnu.org (full text, mbox):

From: Artur Malabarba <bruce.connor.am <at> gmail.com>
To: Juri Linkov <juri <at> linkov.net>
Cc: 22147 <at> debbugs.gnu.org
Subject: Re: bug#22147: Obsolete search-forward-lax-whitespace
Date: Mon, 14 Dec 2015 01:11:59 +0000
On 14 Dec 2015 12:23 am, "Juri Linkov" <juri <at> linkov.net> wrote:
> >
> > True. I think it might also be possible to get what you want by just
> > setting the search-whitespace-regexp variable to "[ \t\r\n]+". That would
> > have the advantage of not removing char folding (and would reduce
> > everything to one line).
>
> This still doesn't allow ^J in the search string to match a newline.

Right. I always get confused about that variable.

> (maybe simpler
> would be to normalize the search string by turning all whitespace
> into space characters),

Yes, I think this should give you the behaviour you're looking for.
Try setting search-default-regexp-function to #'my-lax-with-char-fold,
where

(defun my-lax-with-char-fold (s &optional l)
  (character-fold-to-regexp (replace-regexp-in-string "\t\n\r\s+" " " s) l))

And then also set search-whitespace-regexp like above.
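
A hedged variant of the same idea is sketched below. As written in the message, the regexp "\t\n\r\s+" matches a tab, a newline, and a carriage return followed by one or more spaces, in that order. The sketch assumes a character class was intended. It also assumes the later names char-fold-to-regexp (Emacs 27 onward) and search-default-mode, and falls back to character-fold-to-regexp where the newer function is not defined.

```elisp
;; Sketch only: normalize every run of whitespace in the search string
;; to a single space, then build a char-folding regexp from the result.
(defun my-lax-with-char-fold (string &optional lax)
  "Return a char-folding regexp for STRING with whitespace normalized."
  (let ((normalized
         (replace-regexp-in-string "[ \t\n\r]+" " " string t t)))
    (if (fboundp 'char-fold-to-regexp)
        (char-fold-to-regexp normalized lax)
      ;; Older name, used by Emacs 25 and 26.
      (character-fold-to-regexp normalized lax))))

(setq search-default-mode #'my-lax-with-char-fold)
;; The remaining spaces are then matched laxly through this regexp.
(setq search-whitespace-regexp "[ \t\r\n]+")
```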

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#22147; Package emacs. (Tue, 15 Dec 2015 00:34:01 GMT) Full text and rfc822 format available.

Message #25 received at 22147 <at> debbugs.gnu.org (full text, mbox):

From: Juri Linkov <juri <at> linkov.net>
To: Artur Malabarba <bruce.connor.am <at> gmail.com>
Cc: 22147 <at> debbugs.gnu.org
Subject: Re: bug#22147: Obsolete search-forward-lax-whitespace
Date: Tue, 15 Dec 2015 01:58:11 +0200
>> (maybe simpler
>> would be to normalize the search string by turning all whitespace
>> into space characters),
>
> Yes, I think this should give you the behaviour you're looking for.
> Try setting search-default-regexp-function to #'my-lax-with-char-fold,

search-default-regexp-mode definitely needs to be renamed to
search-default-regexp-function ;-)

> where
>
> (defun my-lax-with-char-fold (s &optional l)
>   (character-fold-to-regexp (replace-regexp-in-string "\t\n\r\s+" " " s) l))
>
> And then also set search-whitespace-regexp like above.

Thanks for the suggestion.  Since it works, then maybe better
generalize it to allow a mode that supports normalization of
the search string, that also will do symmetric char-folding,
where e.g. searching for “ä” will match “a”, etc.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#22147; Package emacs. (Tue, 15 Dec 2015 10:17:01 GMT) Full text and rfc822 format available.

Message #28 received at 22147 <at> debbugs.gnu.org (full text, mbox):

From: Artur Malabarba <bruce.connor.am <at> gmail.com>
To: Juri Linkov <juri <at> linkov.net>
Cc: 22147 <at> debbugs.gnu.org
Subject: Re: bug#22147: Obsolete search-forward-lax-whitespace
Date: Tue, 15 Dec 2015 10:15:56 +0000
2015-12-14 23:58 GMT+00:00 Juri Linkov <juri <at> linkov.net>:
>>> (maybe simpler
>>> would be to normalize the search string by turning all whitespace
>>> into space characters),
>>
>> Yes, I think this should give you the behaviour you're looking for.
>> Try setting search-default-regexp-function to #'my-lax-with-char-fold,
>
> search-default-regexp-mode definitely needs to be renamed to
> search-default-regexp-function ;-)

Like I said on the devel thread, not sure about this. The only reason
I got this wrong on all the above messages is that I wrote every
single one of them from my phone. :-)

>> where
>>
>> (defun my-lax-with-char-fold (s &optional l)
>>   (character-fold-to-regexp (replace-regexp-in-string "\t\n\r\s+" " " s) l))
>>
>> And then also set search-whitespace-regexp like above.
>
> Thanks for the suggestion.  Since it works, then maybe better
> generalize it to allow a mode that supports normalization of
> the search string, that also will do symmetric char-folding,
> where e.g. searching for “ä” will match “a”, etc.

I don't know what you mean. IIUC, the current framework already
supports a "normalizing mode", which is what we just did here, isn't
it?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#22147; Package emacs. (Wed, 16 Dec 2015 01:29:02 GMT) Full text and rfc822 format available.

Message #31 received at 22147 <at> debbugs.gnu.org (full text, mbox):

From: Juri Linkov <juri <at> linkov.net>
To: Artur Malabarba <bruce.connor.am <at> gmail.com>
Cc: 22147 <at> debbugs.gnu.org
Subject: Re: bug#22147: Obsolete search-forward-lax-whitespace
Date: Wed, 16 Dec 2015 02:57:42 +0200
>>> (defun my-lax-with-char-fold (s &optional l)
>>>   (character-fold-to-regexp (replace-regexp-in-string "\t\n\r\s+" " " s) l))
>>>
>>> And then also set search-whitespace-regexp like above.
>>
>> Thanks for the suggestion.  Since it works, then maybe better
>> generalize it to allow a mode that supports normalization of
>> the search string, that also will do symmetric char-folding,
>> where e.g. searching for “ä” will match “a”, etc.
>
> I don't know what you mean. IIUC, the current framework already
> supports a "normalizing mode", which is what we just did here, isn't
> it?

I mean a char-folding customization that allows a search
for “ä” match “a”.  Is this already possible?  If yes,
then it should be easy to customize it in such a way that
“\n” will match space “\s” to avoid the need to write own
functions that define an intersection of the existing functions
char-folding and lax-whitespace. IOW, to customize a char-folding
option instead of search-default-regexp-mode?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#22147; Package emacs. (Wed, 16 Dec 2015 01:48:02 GMT) Full text and rfc822 format available.

Message #34 received at 22147 <at> debbugs.gnu.org (full text, mbox):

From: Drew Adams <drew.adams <at> oracle.com>
To: Juri Linkov <juri <at> linkov.net>, Artur Malabarba <bruce.connor.am <at> gmail.com>
Cc: 22147 <at> debbugs.gnu.org
Subject: RE: bug#22147: Obsolete search-forward-lax-whitespace
Date: Tue, 15 Dec 2015 17:47:05 -0800 (PST)
> I mean a char-folding customization that allows a search
> for “ä” match “a”.  Is this already possible?

It sounds like you are asking for symmetric char folding: being
able to use any of the various A's that make up the A-characters
equivalence class as a search pattern and find any of those
characters.

If so, I implemented that (one way, at least), and in emacs-devel
I proposed such behavior as a togglable option.

It is trivial to try it, if you like: character-fold+.el.
http://www.emacswiki.org/emacs/download/character-fold%2b.el

(A toggle command for it, `isearchp-toggle-symmetric-char-fold',
is defined in isearch+.el:
http://www.emacswiki.org/emacs/download/isearch%2b.el.)

> If yes, then it should be easy to customize it in such a way that
> “\n” will match space “\s” to avoid the need to write own
> functions that define an intersection of the existing functions
> char-folding and lax-whitespace. IOW, to customize a char-folding
> option instead of search-default-regexp-mode?

Not sure if it answers the need you just described, but the same
library has an option, `char-fold-ad-hoc', that lets users add
their own equivalence classes.

(Caveat: I think that Artur made some changes to character-fold.el
recently.  It's possible that character-fold+.el is not up-to-date
wrt those changes, in which case it might not work with the most
recent versions of character-fold.el.  Maybe check the dates, if
you are interested.)




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#22147; Package emacs. (Wed, 16 Dec 2015 11:00:03 GMT) Full text and rfc822 format available.

Message #37 received at 22147 <at> debbugs.gnu.org (full text, mbox):

From: Artur Malabarba <bruce.connor.am <at> gmail.com>
To: Juri Linkov <juri <at> linkov.net>
Cc: 22147 <at> debbugs.gnu.org
Subject: Re: bug#22147: Obsolete search-forward-lax-whitespace
Date: Wed, 16 Dec 2015 10:59:32 +0000
On 16 Dec 2015 12:57 am, "Juri Linkov" <juri <at> linkov.net> wrote:
>
> I mean a char-folding customization that allows a search
> for “ä” match “a”.  Is this already possible?

Not yet.
I do want to expose more char folding options, but I want to wait for
emacs-25 to come out first, to see if and how people will use this feature.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#22147; Package emacs. (Thu, 17 Dec 2015 01:12:02 GMT) Full text and rfc822 format available.

Message #40 received at 22147 <at> debbugs.gnu.org (full text, mbox):

From: Juri Linkov <juri <at> linkov.net>
To: Artur Malabarba <bruce.connor.am <at> gmail.com>
Cc: 22147 <at> debbugs.gnu.org
Subject: Re: bug#22147: Obsolete search-forward-lax-whitespace
Date: Thu, 17 Dec 2015 02:57:46 +0200
>> I mean a char-folding customization that allows a search
>> for “ä” match “a”.  Is this already possible?
>
> Not yet.
> I do want to expose more char folding options, but I want to wait for
> emacs-25 to come out first, to see if and how people will use this feature.

char-fold-symmetric could wait for later, but we definitely need
char-fold-ad-hoc now before the release because the users should be
able to customize the default rules.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#22147; Package emacs. (Thu, 17 Dec 2015 16:34:02 GMT) Full text and rfc822 format available.

Message #43 received at 22147 <at> debbugs.gnu.org (full text, mbox):

From: Artur Malabarba <bruce.connor.am <at> gmail.com>
To: Juri Linkov <juri <at> linkov.net>
Cc: 22147 <at> debbugs.gnu.org
Subject: Re: bug#22147: Obsolete search-forward-lax-whitespace
Date: Thu, 17 Dec 2015 16:33:47 +0000
On 17 Dec 2015 12:57 am, "Juri Linkov" <juri <at> linkov.net> wrote:
>
> >> I mean a char-folding customization that allows a search
> >> for “ä” match “a”.  Is this already possible?
> >
> > Not yet.
> > I do want to expose more char folding options, but I want to wait for
> > emacs-25 to come out first, to see if and how people will use this feature.
>
> char-fold-symmetric could wait for later, but we definitely need
> char-fold-ad-hoc now before the release because the users should be
> able to customize the default rules.

Indeed. 👍
Once we do that, we also need a variable to determine whether we should
derive the default table from the unicode standard (like we currently do)
or just use an empty default with the ad-hoc rules slapped on top.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#22147; Package emacs. (Thu, 17 Dec 2015 17:22:01 GMT) Full text and rfc822 format available.

Message #46 received at 22147 <at> debbugs.gnu.org (full text, mbox):

From: Drew Adams <drew.adams <at> oracle.com>
To: bruce.connor.am <at> gmail.com, Juri Linkov <juri <at> linkov.net>
Cc: 22147 <at> debbugs.gnu.org
Subject: RE: bug#22147: Obsolete search-forward-lax-whitespace
Date: Thu, 17 Dec 2015 09:21:05 -0800 (PST)
>> char-fold-symmetric could wait for later, but we definitely need
>> char-fold-ad-hoc now before the release because the users should be
>> able to customize the default rules.
>
> Indeed.  Once we do that, we also need a variable to determine
> whether we should derive the default table from the unicode
> standard (like we currently do) or just use an empty default with
> the ad-hoc rules slapped on top. 

Users should be able to define their own equivalence classes (groups),
not just one class.  Each class should be the value of a user option.

Here is one simple and flexible way to do this:

1. Define a user option, `char-folding-classes', which is a list of
   any number of (OPTION-NAME DOC-STRING) pairs, where OPTION-NAME
   is a symbol that will name a user option and DOC-STRING is its doc
   string.

   Each symbol would automatically be used to define an option (a
   defcustom) that the user can then use to define a given equivalence
   class.

2. The generated defcustom for each user option specified in option
   `char-folding-classes' would allow for any number of entries, each
   of which could be a `choice' of either of these defcustom types:

   a. An alist, such as used currently in my `char-fold-ad-hoc' option:
      Each entry is a list of a char and the strings that fold into it.

   b. A function that populates such an alist.

The default value of `char-folding-classes' would be something like
this:

((char-fold-diacriticals
  "Classes of chars equivalent because they have the same base char.")
 (char-fold-quotations
  "Classes of equivalent quotation-mark characters."))

Option `char-fold-diacriticals' would have as its default value a
function that returns the alist of diacritical-equivalent classes
that we provide today.  Its code would be derived from what we use
today.

(If needed, a user can replace the function with another that
defines some of the classes differently or that provides only a
subset of the classes we provide today.  But most users would
probably not customize this option.)

Option `char-fold-quotations' would have as its default value what I
use as the default value of my `char-fold-ad-hoc', which is an alist
of the quotation-mark equivalences provided today by character-fold.el:

((?\" """ "“" "”" "”" "„" "⹂" "〞" "‟" "‟" "❞" "❝"
      "❠" "“" "„" "〝" "〟" "🙷" "🙶" "🙸" "«" "»")
 (?' "❟" "❛" "❜" "‘" "’" "‚" "‛" "‚" "󠀢" "❮" "❯" "‹" "›")
 (?` "❛" "‘" "‛" "󠀢" "❮" "‹"))

Having an option that lets users define any number of classes, and
having each class be defined by a user option, is flexible.

Having multiple classes, each associated with a variable (option),
lets users and libraries easily enable/disable different equivalence
classes in different contexts.
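
The two steps above could be sketched roughly as follows. Every name here is hypothetical and nothing like it exists in Emacs. For brevity the list of classes is a plain variable rather than a user option, and only one class is declared.

```elisp
;; Hypothetical list of (OPTION-NAME DOC-STRING) pairs, one per class.
(defvar my-char-folding-classes
  '((my-char-fold-quotations
     "Classes of equivalent quotation-mark characters."))
  "List of (OPTION-NAME DOC-STRING) pairs naming equivalence-class options.")

;; Generate one defcustom per entry.  Each option's value is either an
;; alist of (CHAR STRING...) entries or a function returning such an alist.
(dolist (entry my-char-folding-classes)
  (eval `(defcustom ,(car entry) nil
           ,(cadr entry)
           :type '(choice (alist :key-type character
                                 :value-type (repeat string))
                          function)
           :group 'isearch)
        t))
```

A mode or library could then bind or set an individual class option without touching the others, which is the flexibility argued for above.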




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#22147; Package emacs. (Thu, 17 Dec 2015 18:48:02 GMT) Full text and rfc822 format available.

Message #49 received at 22147 <at> debbugs.gnu.org (full text, mbox):

From: Artur Malabarba <bruce.connor.am <at> gmail.com>
To: Drew Adams <drew.adams <at> oracle.com>
Cc: 22147 <at> debbugs.gnu.org, Juri Linkov <juri <at> linkov.net>
Subject: Re: bug#22147: Obsolete search-forward-lax-whitespace
Date: Thu, 17 Dec 2015 18:47:37 +0000
2015-12-17 17:21 GMT+00:00 Drew Adams <drew.adams <at> oracle.com>:
>>> char-fold-symmetric could wait for later, but we definitely need
>>> char-fold-ad-hoc now before the release because the users should be
>>> able to customize the default rules.
>>
>> Indeed.  Once we do that, we also need a variable to determine
>> whether we should derive the default table from the unicode
>> standard (like we currently do) or just use an empty default with
>> the ad-hoc rules slapped on top.
>
> Users should be able to define their own equivalence classes (groups),
> not just one class.  Each class should be the value of a user option.
>
> Here is one simple and flexible way to do this:
>
> 1. Define a user option, `char-folding-classes', which is a list of
>    any number of (OPTION-NAME DOC-STRING) pairs, where OPTION-NAME
>    is a symbol that will name a user option and DOC-STRING is its doc
>    string.
>
>    Each symbol would automatically be used to define an option (a
>    defcustom) that the user can then use to define a given equivalence
>    class.
>
> 2. The generated defcustom for each user option specified in option
>    `char-folding-classes' would allow for any number of entries, each
>    of which could be a `choice' of either of these defcustom types:
>
>    a. An alist, such as used currently in my `char-fold-ad-hoc' option:
>       Each entry is a list of a char and the strings that fold into it.
>
>    b. A function that populates such an alist.

I appreciate you probably put quite a bit of thought into this, but
IMO this would be over-engineering.

I think we should define two simple defcustoms that determine how the
character-fold-table is generated: character-fold-ad-hoc (an alist)
and character-fold-derive-from-unicode-decomposition (a boolean).
This should be immediately configurable by anyone, without requiring a
big initial investment.

Then we also make character-fold-table into a defvar, and document it
as a proper exposed API, so advanced users can change it however they
want with hooks and local vars to however many different
values/equiv-classes they want.

This would offer a dead-simple defcustom that covers most cases, while
still allowing the versatility of having multiple options for those
who need it.
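
As a rough sketch of the two options proposed in this message: the names are prefixed with my- to mark them as hypothetical, and the default alist is a small made-up example, not the table Emacs uses.

```elisp
;; Hypothetical option: extra folding rules as (CHAR STRING...) entries.
(defcustom my-character-fold-ad-hoc
  '((?\" "“" "”")
    (?' "‘" "’"))
  "Alist of ad-hoc folding rules; each entry is (CHAR STRING...)."
  :type '(alist :key-type character :value-type (repeat string))
  :group 'isearch)

;; Hypothetical option: whether the default table is derived from
;; Unicode decompositions or starts out empty.
(defcustom my-character-fold-derive-from-unicode-decomposition t
  "Non-nil means derive the default folding table from Unicode data."
  :type 'boolean
  :group 'isearch)
```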




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#22147; Package emacs. (Thu, 17 Dec 2015 22:17:02 GMT) Full text and rfc822 format available.

Message #52 received at 22147 <at> debbugs.gnu.org (full text, mbox):

From: Drew Adams <drew.adams <at> oracle.com>
To: bruce.connor.am <at> gmail.com
Cc: 22147 <at> debbugs.gnu.org, Juri Linkov <juri <at> linkov.net>
Subject: RE: bug#22147: Obsolete search-forward-lax-whitespace
Date: Thu, 17 Dec 2015 14:16:26 -0800 (PST)
> > Users should be able to define their own equivalence classes (groups),
> > not just one class.  Each class should be the value of a user option.
> >
> > Here is one simple and flexible way to do this:
> >
> > 1. Define a user option, `char-folding-classes', which is a list of
> >    any number of (OPTION-NAME DOC-STRING) pairs, where OPTION-NAME
> >    is a symbol that will name a user option and DOC-STRING is its doc
> >    string.
> >
> >    Each symbol would automatically be used to define an option (a
> >    defcustom) that the user can then use to define a given equivalence
> >    class.
> >
> > 2. The generated defcustom for each user option specified in option
> >    `char-folding-classes' would allow for any number of entries, each
> >    of which could be a `choice' of either of these defcustom types:
> >
> >    a. An alist, such as used currently in my `char-fold-ad-hoc' option:
> >       Each entry is a list of a char and the strings that fold into it.
> >
> >    b. A function that populates such an alist.
> 
> I appreciate you probably put quite a bit of thought into this,

Only a few minutes of thought, as I imagine you can guess.  It just
extends what I already have in character-fold.el.

> but IMO this would be over-engineering.

How so?  I've done zero "engineering" on it.  And I don't really care
how it gets done, as long as it does.

My point, as I said, is only this:

  Users should be able to define their own equivalence classes (groups),
  not just one class.  Each class should be the value of a user option.

Anything less than that is not serving users as well as they deserve, IMO.

As to how that is done, I really don't care.  I offered one simple approach,
but you are welcome to over-, under- or just-right-engineer it your own way.

> I think we should define two simple defcustoms that determine how the
> character-fold-table is generated: character-fold-ad-hoc (an alist)
> and character-fold-derive-from-unicode-decomposition (a boolean).
> This should be immediately configurable by anyone,

That's far too restrictive, IMO.  It does not let users or libraries
easily apply different equivalence classes for different uses (e.g.
modes).  And there is no reason for such restriction - nothing is
gained by it.

> without requiring a big initial investment.

There is no "big initial investment" to what I described.  I can code
it up quickly, as I'm sure you can too.

And what it provides out of the box is exactly the same.  It is just as
"immediately configurable by anyone" - and immediately configurable in
exactly the same way.  Your Boolean with a default value of t is
equivalent to the default presence of the function that does what your
Boolean t does: "derive-from-unicode-decomposition".

You can do more with what I described, and more easily.  But you can
also do just as little with it.
 
> Then we also make character-fold-table into a defvar, and document it
> as a proper exposed API, so advanced users

Anything that can be a defvar, for "advanced users", can be a defcustom,
for all users.

If you are inviting users to fiddle with a char-fold table, it is far
better to give them the ability to do so in a modular way, and to make
your default derive-from-unicode-decomposition into a default function
instead of just hard-coding the behavior.  Nothing lost, modularity
and flexibility gained.

> can change it however they want with hooks and local vars to however
> many different values/equiv-classes they want.

Ugly, and complicated.  And unnecessary.  No need to be an "advanced
user" and fiddle with such stuff.

> This would offer a dead-simple defcustom that covers most cases, while
> still allowing the versatility of having multiple options for those
> who need it.

What I proposed is just as "dead-simple", but cleaner (IMO) and open
to all users.  Just as importantly, it lets users (easily) define
multiple classes that they can (easily) use in different contexts.

Again, I don't care about the implementation, but I would like users
to be able to define their own equivalence classes (groups), and to
enable/disable them easily au choix.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#22147; Package emacs. (Fri, 18 Dec 2015 00:56:02 GMT) Full text and rfc822 format available.

Message #55 received at 22147 <at> debbugs.gnu.org (full text, mbox):

From: Artur Malabarba <bruce.connor.am <at> gmail.com>
To: Drew Adams <drew.adams <at> oracle.com>
Cc: 22147 <at> debbugs.gnu.org, Juri Linkov <juri <at> linkov.net>
Subject: Re: bug#22147: Obsolete search-forward-lax-whitespace
Date: Fri, 18 Dec 2015 00:55:06 +0000
[out of order quotes below]

2015-12-17 22:16 GMT+00:00 Drew Adams <drew.adams <at> oracle.com>:
>> This would offer a dead-simple defcustom that covers most cases, while
>> still allowing the versatility of having multiple options for those
>> who need it.
>
> What I proposed is just as "dead-simple", but cleaner (IMO) and open
> to all users.  Just as importantly, it lets users (easily) define
> multiple classes that they can (easily) use in different contexts.

And this is the source of our impasse. IMO (and I say this with all
due respect) your proposal is not as simple as the two defcustoms I
suggested, and it is not cleaner than just using hooks/local-vars to
set the value of character-fold-table to whatever is relevant for the
current situation.
Since we're both just stating opinions, it's unlikely this discussion
will go anywhere.

> My point, as I said, is only this:
>
>   Users should be able to define their own equivalence classes (groups),
>   not just one class.  Each class should be the value of a user option.
>
> Anything less than that is not serving users as well as they deserve, IMO.

And my point is that this is too complex for user options.
Most people won't need this much generality, and the amount of time
these people will waste trying to understand this multi-option
configuration will be significant. The few who want this behavior will
be glad that we offered it, but the time it will save them (compared
to if they wrote something in elisp) will be (IMO) small compared to
the total accumulated wasted time for everyone else.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#22147; Package emacs. (Sat, 14 May 2016 20:53:01 GMT) Full text and rfc822 format available.

Message #58 received at 22147 <at> debbugs.gnu.org (full text, mbox):

From: Juri Linkov <juri <at> linkov.net>
To: Drew Adams <drew.adams <at> oracle.com>
Cc: 22147 <at> debbugs.gnu.org, Artur Malabarba <bruce.connor.am <at> gmail.com>
Subject: Re: bug#22147: Obsolete search-forward-lax-whitespace
Date: Sat, 14 May 2016 23:45:00 +0300
>> I mean a char-folding customization that allows a search
>> for “ä” match “a”.  Is this already possible?
>
> It sounds like you are asking for symmetric char folding: being
> able to use any of the various A's that make up the A-characters
> equivalence class as a search pattern and find any of those
> characters.
>
> If so, I implemented that (one way, at least), and in emacs-devel
> I proposed such behavior as a togglable option.
>
> It is trivial to try it, if you like: character-fold+.el.
> http://www.emacswiki.org/emacs/download/character-fold%2b.el
>
> (A toggle command for it, `isearchp-toggle-symmetric-char-fold',
> is defined in isearch+.el:
> http://www.emacswiki.org/emacs/download/isearch%2b.el.)

I'm starting to recollect all the remaining pieces to finish this
release blocking issue, but I can't download this library,
because the link is broken and it seems the whole site is down.

Drew, could you please send the latest version as an attachment?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#22147; Package emacs. (Sat, 14 May 2016 22:21:02 GMT) Full text and rfc822 format available.

Message #61 received at 22147 <at> debbugs.gnu.org (full text, mbox):

From: Artur Malabarba <bruce.connor.am <at> gmail.com>
To: Juri Linkov <juri <at> linkov.net>, Drew Adams <drew.adams <at> oracle.com>
Cc: 22147 <at> debbugs.gnu.org
Subject: Re: bug#22147: Obsolete search-forward-lax-whitespace
Date: Sat, 14 May 2016 22:20:37 +0000
[Message part 1 (text/plain, inline)]
IIUC, Drew was offering an implementation of symmetric char folding,
whereas the release blocking aspect of this bug is to add a
char-folding-ad-hoc variable.

On Sat, 14 May 2016 5:45 pm Juri Linkov, <juri <at> linkov.net> wrote:

> >> I mean a char-folding customization that allows a search
> >> for “ä” match “a”.  Is this already possible?
> >
> > It sounds like you are asking for symmetric char folding: being
> > able to use any of the various A's that make up the A-characters
> > equivalence class as a search pattern and find any of those
> > characters.
> >
> > If so, I implemented that (one way, at least), and in emacs-devel
> > I proposed such behavior as a togglable option.
> >
> > It is trivial to try it, if you like: character-fold+.el.
> > http://www.emacswiki.org/emacs/download/character-fold%2b.el
> >
> > (A toggle command for it, `isearchp-toggle-symmetric-char-fold',
> > is defined in isearch+.el:
> > http://www.emacswiki.org/emacs/download/isearch%2b.el.)
>
> I'm starting to recollect all the remaining pieces to finish this
> release blocking issue, but I can't download this library,
> because the link is broken and it seems the whole site is down.
>
> Drew, could you please send the latest version as an attachment?
>
[Message part 2 (text/html, inline)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#22147; Package emacs. (Sat, 14 May 2016 22:23:01 GMT) Full text and rfc822 format available.

Message #64 received at 22147 <at> debbugs.gnu.org (full text, mbox):

From: Drew Adams <drew.adams <at> oracle.com>
To: Juri Linkov <juri <at> linkov.net>
Cc: 22147 <at> debbugs.gnu.org, Artur Malabarba <bruce.connor.am <at> gmail.com>
Subject: RE: bug#22147: Obsolete search-forward-lax-whitespace
Date: Sat, 14 May 2016 15:22:27 -0700 (PDT)
[Message part 1 (text/plain, inline)]
> >> I mean a char-folding customization that allows a search
> >> for “ä” match “a”.  Is this already possible?
> >
> > It sounds like you are asking for symmetric char folding: being
> > able to use any of the various A's that make up the A-characters
> > equivalence class as a search pattern and find any of those
> > characters.
> >
> > If so, I implemented that (one way, at least), and in emacs-devel
> > I proposed such behavior as a togglable option.
> >
> > It is trivial to try it, if you like: character-fold+.el.
> > http://www.emacswiki.org/emacs/download/character-fold%2b.el
> >
> > (A toggle command for it, `isearchp-toggle-symmetric-char-fold',
> > is defined in isearch+.el:
> > http://www.emacswiki.org/emacs/download/isearch%2b.el.)
> 
> I'm starting to recollect all the remaining pieces to finish this
> release blocking issue, but I can't download this library,
> because the link is broken and it seems the whole site is down.
> 
> Drew, could you please send the latest version as an attachment?

1. EmacsWiki seems to be up now.  Also, you should be able to get to
what is on EmacsWiki at the EmacsMirror: https://github.com/emacsmirror.
And you should also be able to get my libraries from MELPA.  I've
attached `character-fold+.el' anyway.  Let me know if you also want
to look at `isearch+.el' and you cannot get to it for some reason.

2. More importantly, what I wrote in `character-fold+.el' worked
only at the time I wrote it and for a while thereafter, unfortunately.
Not too long after that, Artur Malabarba rewrote `character-fold.el',
so the code I wrote is no longer appropriate.

I have not had time to look at the (fairly deep) changes he made,
or to imagine what I might do with it to obtain the symmetric
behavior I implemented for the earlier version.

3. Dunno whether what I wrote is needed or helpful for dealing
with this bug.  Perhaps you or Artur can tell.  IIUC, the part
of this bug report that I replied to seemed to be a request for
an extension of what `character-fold.el' does: symmetric folding.
But perhaps I was misunderstanding, because I don't see how that
could be a blocking bug - it was never Artur's intention to
provide symmetric folding, AFAIK.
[character-fold+.el (application/octet-stream, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#22147; Package emacs. (Sat, 14 May 2016 22:28:01 GMT) Full text and rfc822 format available.

Message #67 received at 22147 <at> debbugs.gnu.org (full text, mbox):

From: Drew Adams <drew.adams <at> oracle.com>
To: Artur Malabarba <bruce.connor.am <at> gmail.com>, Juri Linkov <juri <at> linkov.net>
Cc: 22147 <at> debbugs.gnu.org
Subject: RE: bug#22147: Obsolete search-forward-lax-whitespace
Date: Sat, 14 May 2016 15:27:05 -0700 (PDT)
> IIUC, Drew was offering an implementation of symmetric char folding,
> whereas the release blocking aspect of this bug is to add a
> char-folding-ad-hoc variable. 

That makes sense.

That too is in `character-fold+.el', which I attached to my previous message.
Dunno whether what I have there is exactly what you want/need.  This is it:

(defcustom char-fold-ad-hoc '((?\" "＂" "“" "”" "”" "„" "⹂" "〞" "‟" "‟" "❞" "❝"
                               "❠" "“" "„" "〝" "〟" "🙷" "🙶" "🙸" "«" "»")
                              (?' "❟" "❛" "❜" "‘" "’" "‚" "‛" "‚" "󠀢" "❮" "❯" "‹" "›")
                              (?` "❛" "‘" "‛" "󠀢" "❮" "‹"))
  "Ad hoc character foldings.
Each entry is a list of a character and the strings that fold into it.

The default value includes those ad hoc foldings provided by vanilla
Emacs."
  :set (lambda (sym defs)
         (custom-set-default sym defs)
         (update-char-fold-table))
  :type '(repeat (cons
                  (character :tag "Fold to character")
                  (repeat (string :tag "Fold from string"))))
  :group 'isearch)

And this is where it is used:

;; Add some manual entries.
(dolist (it  char-fold-ad-hoc)
  (let ((idx        (car it))
        (chr-strgs  (cdr it)))
    (aset equiv idx (append chr-strgs (aref equiv idx)))))




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#22147; Package emacs. (Sun, 15 May 2016 20:59:02 GMT) Full text and rfc822 format available.

Message #70 received at 22147 <at> debbugs.gnu.org (full text, mbox):

From: Juri Linkov <juri <at> linkov.net>
To: Artur Malabarba <bruce.connor.am <at> gmail.com>
Cc: 22147 <at> debbugs.gnu.org, Drew Adams <drew.adams <at> oracle.com>
Subject: Re: bug#22147: Obsolete search-forward-lax-whitespace
Date: Sun, 15 May 2016 23:45:48 +0300
> IIUC, Drew was offering an implementation of symmetric char folding,
> whereas the release blocking aspect of this bug is to add a
> char-folding-ad-hoc variable.

My initial request was to restore the ability to fold whitespace.
One way to do this is to implement symmetric char folding.
However, I believe that the same could be achieved with just
char-fold-ad-hoc providing a suitable set of mappings.  I'll confirm
whether this is achievable after adding char-fold-ad-hoc.
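
As a rough illustration of the idea above (not code from this thread), a whitespace entry in such a list might look like the following.  The names `my-char-fold-ad-hoc' and `my-char-fold-regexp' are hypothetical, and the sketch builds a plain regexp instead of updating the real char-fold table:

```elisp
;; Hypothetical stand-in for the proposed `char-fold-ad-hoc' option:
;; each entry is (CHAR STRING...), meaning the STRINGs fold into CHAR.
(defvar my-char-fold-ad-hoc
  '((?\s "\t" "\n" "\u00a0"))
  "Illustrative ad hoc foldings: a space also matches tab, newline and NBSP.")

(defun my-char-fold-regexp (char)
  "Return a regexp matching CHAR or any string folded into it."
  (regexp-opt (cons (string char)
                    (cdr (assq char my-char-fold-ad-hoc)))))

;; A search for a space would then also find a tab:
(string-match-p (my-char-fold-regexp ?\s) "foo\tbar")
```

This only folds a single whitespace character; matching runs of whitespace, as `search-whitespace-regexp' does, would need a repetition operator on top of it.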




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#22147; Package emacs. (Sun, 15 May 2016 20:59:02 GMT) Full text and rfc822 format available.

Message #73 received at 22147 <at> debbugs.gnu.org (full text, mbox):

From: Juri Linkov <juri <at> linkov.net>
To: Drew Adams <drew.adams <at> oracle.com>
Cc: 22147 <at> debbugs.gnu.org, Artur Malabarba <bruce.connor.am <at> gmail.com>
Subject: Re: bug#22147: Obsolete search-forward-lax-whitespace
Date: Sun, 15 May 2016 23:56:30 +0300
>> I'm starting to recollect all the remaining pieces to finish this
>> release blocking issue, but I can't download this library,
>> because the link is broken and it seems the whole site is down.
>>
>> Drew, could you please send the latest version as an attachment?
>
> 1. EmacsWiki seems to be up now.  Also, you should be able to get to
> what is on EmacsWiki at the EmacsMirror: https://github.com/emacsmirror.
> And you should also be able to get my libraries from MELPA.  I've
> attached `character-fold+.el' anyway.  Let me know if you also want
> to look at `isearch+.el' and you cannot get to it for some reason.

EmacsWiki is inaccessible to me due to its invalid server certificate.
But thanks for pointing to EmacsMirror - I found your code at
https://github.com/emacsmirror/character-fold-plus
https://github.com/emacsmirror/isearch-plus
which I hope is at the latest version.

> 2. More importantly, what I wrote in `character-fold+.el' worked
> only at the time I wrote it and for a while thereafter, unfortunately.
> Not too long after that, Artur Malabarba rewrote `character-fold.el',
> so the code I wrote is no longer appropriate.

I see that you just moved the hard-coded alist to the defcustom
char-fold-ad-hoc.  I think the name char-fold-ad-hoc is itself too ad hoc.
Following the more widespread naming convention of a data type suffix
-alist (as in display-buffer-alist, etc.) would give the defcustom the
name char-fold-alist.

Another thing we need to do is to allow customization to remove
default mappings.  Maybe this is possible by using the same
defcustom with a rule like: remove default mappings when a char
is mapped to an empty list, e.g.

- adding more mappings for ‘`’:

  (defcustom char-fold-ad-hoc '((?` "❛" "‘" "‛" "󠀢" "❮" "‹"))

- removing default mappings for ‘`’:

  (defcustom char-fold-ad-hoc '((?`))
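
A minimal sketch of the rule proposed above, assuming entries of the form (CHAR STRING...).  The function and table names are hypothetical.  Note that a loop that only appends, like the one shown earlier in this thread, would leave existing foldings untouched for an empty entry, so the empty case has to be handled explicitly:

```elisp
(defun my-apply-fold-entries (table entries)
  "Apply ENTRIES, a list of (CHAR STRING...), to the char-table TABLE.
An entry with no strings clears the existing foldings for CHAR;
otherwise its strings are added in front of the existing ones.
Return TABLE."
  (dolist (entry entries table)
    (let ((char (car entry))
          (strings (cdr entry)))
      (aset table char
            (and strings
                 (append strings (aref table char)))))))

;; Example: start from a table with one default folding for ?`.
(defvar my-fold-table (make-char-table 'my-fold-table))
(aset my-fold-table ?` '("‘"))
(my-apply-fold-entries my-fold-table '((?` "❛")))  ; add a mapping
(my-apply-fold-entries my-fold-table '((?`)))      ; remove all mappings
```

Using one option for both adding and removing keeps the interface small, at the cost of not being able to remove a single default string while keeping the others.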




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#22147; Package emacs. (Sun, 15 May 2016 21:52:02 GMT) Full text and rfc822 format available.

Message #76 received at 22147 <at> debbugs.gnu.org (full text, mbox):

From: Drew Adams <drew.adams <at> oracle.com>
To: Juri Linkov <juri <at> linkov.net>
Cc: 22147 <at> debbugs.gnu.org, Artur Malabarba <bruce.connor.am <at> gmail.com>
Subject: RE: bug#22147: Obsolete search-forward-lax-whitespace
Date: Sun, 15 May 2016 14:51:35 -0700 (PDT)
> EmacsWiki is inaccessible to me due to its invalid server certificate.

I see.  I don't know anything about that.

> But thanks for pointing to EmacsMirror - I found your code at
> https://github.com/emacsmirror/character-fold-plus
> https://github.com/emacsmirror/isearch-plus
> which I hope is at the latest version.

Yes, I just checked, and those are the latest versions.

I don't know how often EmacsMirror is updated.  For a while (a year
or two ago, I think) it was not mirroring.  You can always
get my code from MELPA, which refreshes from EmacsWiki daily.

> > 2. More importantly, what I wrote in `character-fold+.el' worked
> > only at the time I wrote it and for a while thereafter, unfortunately.
> > Not too long after that, Artur Malabarba rewrote `character-fold.el',
> > so the code I wrote is no longer appropriate.
> 
> I see that you just moved the hard-coded alist to defcustom
> char-fold-ad-hoc.

Correct.  You can see how I use it.  I broke up some of the
character-fold.el code (at the time), in order to use parts of
it (a bit more modular).  Mainly, I broke out `update-char-fold-table'
so that it could be called in the :set functions of the two defcustoms.
So as soon as a user made changes, they were reflected in the behavior.

> I think that char-fold-ad-hoc is too ad-hoc naming.
> Using more wide-spread naming convention with a data type suffix -alist
> (like in display-buffer-alist, etc.) would provide a defcustom name
> char-fold-alist.

OK.  FWIW, I'm not a fan of putting the type ("alist") in the option
name, but I don't speak for what vanilla Emacs does.  If all we can
say about some value is that it takes the _form_ of an alist, that's
too bad.  Normally, we should be able to describe that value (content,
not just form).  It's better, IMO, if the name talks about what the
value is (content, purpose - something specific about it), rather than
just saying what form it takes.

Another consideration (for me, at least): I think (and hope) that
eventually users will be able to have multiple such lists (sets)
of char mappings that they can choose (and mix and match - sets of
such sets, for different purposes/contexts).  IOW, I don't see just
a single set of ad-hoc char mappings.  But this is anyway for the
future.

> Another thing we need to do is to allow customization to remove
> default mappings.  Maybe this is possible by using the same
> defcustom with a rule like: remove default mappings when a char
> is mapped to an empty list, e.g.
> 
> - adding more mappings for ‘`’:
>   (defcustom char-fold-ad-hoc '((?` "❛" "‘" "‛" "󠀢" "❮" "‹"))
> 
> - removing default mappings for ‘`’:
>   (defcustom char-fold-ad-hoc '((?`))

Yes, I would think that would work (already).  But I could be wrong.

Thanks for taking a look at this.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#22147; Package emacs. (Tue, 17 May 2016 20:57:02 GMT) Full text and rfc822 format available.

Message #79 received at 22147 <at> debbugs.gnu.org (full text, mbox):

From: Juri Linkov <juri <at> linkov.net>
To: Drew Adams <drew.adams <at> oracle.com>
Cc: 22147 <at> debbugs.gnu.org, Artur Malabarba <bruce.connor.am <at> gmail.com>
Subject: Re: bug#22147: Obsolete search-forward-lax-whitespace
Date: Tue, 17 May 2016 23:55:53 +0300
> Another consideration (for me, at least): I think (and hope) that
> eventually users will be able to have multiple such lists (sets)
> of char mappings that they can choose (and mix and match - sets of
> such sets, for different purposes/contexts).  IOW, I don't see just
> a single set of ad-hoc char mappings.  But this is anyway for the
> future.

Yes, we have to take into consideration that in addition to the
plain customizable list we are adding to the next release,
in later versions we might also add more customizable lists,
e.g. with categories and other character groups.

>> Another thing we need to do is to allow customization to remove
>> default mappings.  Maybe this is possible by using the same
>> defcustom with a rule like: remove default mappings when a char
>> is mapped to an empty list, e.g.
>>
>> - adding more mappings for ‘`’:
>>   (defcustom char-fold-ad-hoc '((?` "❛" "‘" "‛" "󠀢" "❮" "‹"))
>>
>> - removing default mappings for ‘`’:
>>   (defcustom char-fold-ad-hoc '((?`))
>
> Yes, I would think that would work (already).  But I could be wrong.
>
> Thanks for taking a look at this.

After long-planned terminology improvements, I'd wait for sync between
branches to avoid merge conflicts, and then I'll submit a patch taking
into account all opinions about the default value for users who will
enable this feature in the next release.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#22147; Package emacs. (Tue, 17 May 2016 21:57:02 GMT) Full text and rfc822 format available.

Message #82 received at 22147 <at> debbugs.gnu.org (full text, mbox):

From: Drew Adams <drew.adams <at> oracle.com>
To: Juri Linkov <juri <at> linkov.net>
Cc: 22147 <at> debbugs.gnu.org, Artur Malabarba <bruce.connor.am <at> gmail.com>
Subject: RE: bug#22147: Obsolete search-forward-lax-whitespace
Date: Tue, 17 May 2016 14:55:49 -0700 (PDT)
> > Another consideration (for me, at least): I think (and hope) that
> > eventually users will be able to have multiple such lists (sets)
> > of char mappings that they can choose (and mix and match - sets of
> > such sets, for different purposes/contexts).  IOW, I don't see just
> > a single set of ad-hoc char mappings.  But this is anyway for the
> > future.
> 
> Yes, we have to take into consideration that in addition to the
> plain customizable list we are adding to the next release,
> in later versions we might also add more customizable lists,
> e.g. with categories and other character groups.

One possibility is to have, now, instead of an option with a single
list of ad-hoc mappings as its value, an option whose value is an alist
of such lists, where the car of each alist entry names the particular
ad-hoc mapping.

See my suggestion in an earlier mail in this thread.
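
To make the shape of such a value concrete, here is a small sketch.  The variable, the set names and the accessor are all hypothetical and are not taken from any patch in this thread:

```elisp
;; Hypothetical option value: an alist whose keys name mapping sets and
;; whose values are lists of (CHAR STRING...) foldings.
(defvar my-char-fold-sets
  '((quotes (?\" "“" "”") (?' "‘" "’"))
    (arrows (?→ "->") (?⇒ "=>")))
  "Illustrative named sets of ad hoc character foldings.")

(defun my-char-fold-set (name)
  "Return the list of foldings in the set called NAME, or nil."
  (cdr (assq name my-char-fold-sets)))

;; Pick one set for the current context:
(my-char-fold-set 'arrows)
```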
 
> >> Another thing we need to do is to allow customization to remove
> >> default mappings.  Maybe this is possible by using the same
> >> defcustom with a rule like: remove default mappings when a char
> >> is mapped to an empty list, e.g.
> >>
> >> - adding more mappings for ‘`’:
> >>   (defcustom char-fold-ad-hoc '((?` "❛" "‘" "‛" "󠀢" "❮" "‹"))
> >>
> >> - removing default mappings for ‘`’:
> >>   (defcustom char-fold-ad-hoc '((?`))
> >
> > Yes, I would think that would work (already).  But I could be wrong.
> >
> > Thanks for taking a look at this.
> 
> After long-planned terminology improvements, I'd wait for sync between
> branches to avoid merge conflicts, and then I'll submit a patch taking
> into account all opinions about the default value for users who will
> enable this feature in the next release.

Sounds good. Thx.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#22147; Package emacs. (Wed, 18 May 2016 03:01:01 GMT) Full text and rfc822 format available.

Message #85 received at 22147 <at> debbugs.gnu.org (full text, mbox):

From: Artur Malabarba <bruce.connor.am <at> gmail.com>
To: Drew Adams <drew.adams <at> oracle.com>, Juri Linkov <juri <at> linkov.net>
Cc: 22147 <at> debbugs.gnu.org
Subject: Re: bug#22147: Obsolete search-forward-lax-whitespace
Date: Wed, 18 May 2016 03:00:35 +0000
[Message part 1 (text/plain, inline)]
First of all, thanks for taking the time to help with this, Juri.

I'm of the opinion that we should avoid overthinking this feature for the
first release. And I'm also of the opinion that complicated custom-vars
(like alists of alists) are less helpful than simple custom vars. So I'd
strongly prefer if we don't turn this variable into something more
complicated.

Also (although I'm perfectly in favor of a variable for ad-hoc foldings), I
wouldn't mind it if this bug was removed from the release blocking list.
It's a feature request, not a proper bug, so the release wouldn't be flawed
without it.

On Tue, 17 May 2016 6:55 pm Drew Adams, <drew.adams <at> oracle.com> wrote:

> > > Another consideration (for me, at least): I think (and hope) that
> > > eventually users will be able to have multiple such lists (sets)
> > > of char mappings that they can choose (and mix and match - sets of
> > > such sets, for different purposes/contexts).  IOW, I don't see just
> > > a single set of ad-hoc char mappings.  But this is anyway for the
> > > future.
> >
> > Yes, we have to take into consideration that in addition to the
> > plain customizable list we are adding to the next release,
> > in later versions we might also add more customizable lists,
> > e.g. with categories and other character groups.
>
> One possibility is to (now), instead of having an option with a single
> list of ad-hoc mappings as value, have an option with an alist of such
> lists as its value, where the car of an alist entry names the particular
> ad-hoc mapping.
>
> See my suggestion in an earlier mail in this thread.
>
> > >> Another thing we need to do is to allow customization to remove
> > >> default mappings.  Maybe this is possible by using the same
> > >> defcustom with a rule like: remove default mappings when a char
> > >> is mapped to an empty list, e.g.
> > >>
> > >> - adding more mappings for ‘`’:
> > >>   (defcustom char-fold-ad-hoc '((?` "❛" "‘" "‛" "󠀢" "❮" "‹"))
> > >>
> > >> - removing default mappings for ‘`’:
> > >>   (defcustom char-fold-ad-hoc '((?`))
> > >
> > > Yes, I would think that would work (already).  But I could be wrong.
> > >
> > > Thanks for taking a look at this.
> >
> > After long-planned terminology improvements, I'd wait for sync between
> > branches to avoid merge conflicts, and then I'll submit a patch taking
> > into account all opinions about the default value for users who will
> > enable this feature in the next release.
>
> Sounds good. Thx.
>
[Message part 2 (text/html, inline)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#22147; Package emacs. (Wed, 18 May 2016 19:40:01 GMT) Full text and rfc822 format available.

Message #88 received at 22147 <at> debbugs.gnu.org (full text, mbox):

From: Juri Linkov <juri <at> linkov.net>
To: Artur Malabarba <bruce.connor.am <at> gmail.com>
Cc: 22147 <at> debbugs.gnu.org, Drew Adams <drew.adams <at> oracle.com>
Subject: Re: bug#22147: Obsolete search-forward-lax-whitespace
Date: Wed, 18 May 2016 22:34:05 +0300
> I'm of the opinion that we should avoid overthinking this feature for the
> first release. And I'm also of the opinion that complicated custom-vars
> (like alists of alists) are less helpful than simple custom vars. So I'd
> strongly prefer if we don't turn this variable into something more
> complicated.

I agree that we are better off starting with simpler customization,
and then gradually adding more layers later when needed.

I wonder why you removed defvar mappings from your initial patches
with ‘isearch-groups-alist’ and ‘isearch--character-fold-extras’
(a similar variable is also present in Drew's ‘char-fold-ad-hoc’).

Now I tried to reintroduce these lists with different names:
‘char-fold-include-alist’ with a list to add to default mappings and
‘char-fold-exclude-alist’ with a list to remove from default mappings
taking into account all opinions expressed on emacs-devel for the
default values:

diff --git a/lisp/char-fold.el b/lisp/char-fold.el
index 68bea29..68d1eb0 100644
--- a/lisp/char-fold.el
+++ b/lisp/char-fold.el
@@ -22,10 +22,68 @@
 
 ;;; Code:
 
-(eval-and-compile (put 'char-fold-table 'char-table-extra-slots 1))
+(put 'char-fold-table 'char-table-extra-slots 1)
 
-(defconst char-fold-table
-  (eval-when-compile
+(defcustom char-fold-include-alist
+  (append
+   '((?\" "＂" "“" "”" "”" "„" "⹂" "〞" "‟" "‟" "❞" "❝" "❠" "“" "„" "〝" "〟" "🙷" "🙶" "🙸" "«" "»")
+     (?' "❟" "❛" "❜" "‘" "’" "‚" "‛" "‚" "󠀢" "❮" "❯" "‹" "›")
+     (?` "❛" "‘" "‛" "󠀢" "❮" "‹")
+     (?→ "->") (?⇒ "=>")
+     (?1 "⒈") (?2 "⒉") (?3 "⒊") (?4 "⒋") (?5 "⒌") (?6 "⒍") (?7 "⒎") (?8 "⒏") (?9 "⒐") (?0 "🄀")
+     )
+   (unless (string-match-p "^\\(?:da\\|n[obn]\\)" (getenv "LANG"))
+     '((?o "ø")
+       (?O "Ø")))
+   (unless (string-match-p "^pl" (getenv "LANG"))
+     '((?l "ł")
+       (?L "Ł")))
+   (unless (string-match-p "^de" (getenv "LANG"))
+     '((?ß "ss")))
+   )
+  "Ad hoc character foldings.
+Each entry is a list of a character and the strings that fold into it."
+  :set (lambda (symbol value)
+         (custom-set-default symbol value)
+         (with-no-warnings
+           (setq char-fold-table (make-char-fold-table))))
+  :initialize 'custom-initialize-default
+  :type '(repeat (cons
+                  (character :tag "Fold to character")
+                  (repeat (string :tag "Fold from string"))))
+  :version "25.1"
+  :group 'isearch)
+
+(defcustom char-fold-exclude-alist
+  (append
+   (when (string-match-p "^es" (getenv "LANG"))
+     '((?n "ñ")
+       (?N "Ñ")))
+   (when (string-match-p "^\\(?:sv\\|fi\\|et\\)" (getenv "LANG"))
+     '((?a "ä")
+       (?A "Ä")
+       (?o "ö")
+       (?O "Ö")))
+   (when (string-match-p "^\\(?:sv\\|da\\|n[obn]\\)" (getenv "LANG"))
+     '((?a "å")
+       (?A "Å")))
+   (when (string-match-p "^ru" (getenv "LANG"))
+     '((?и "й")
+       (?И "Й"))))
+  "Character foldings to remove from default mappings.
+Each entry is a list of a character and the strings that unfold from it."
+  :set (lambda (symbol value)
+         (custom-set-default symbol value)
+         (with-no-warnings
+           (setq char-fold-table (make-char-fold-table))))
+  :initialize 'custom-initialize-default
+  :type '(repeat (cons
+                  (character :tag "Unfold to character")
+                  (repeat (string :tag "Unfold from string"))))
+  :version "25.1"
+  :group 'isearch)
+
+(defun make-char-fold-table ()
     (let ((equiv (make-char-table 'char-fold-table))
           (equiv-multi (make-char-table 'char-fold-table))
           (table (unicode-property-table-internal 'decomposition)))
@@ -58,9 +116,11 @@ char-fold-table
                ;; If there's no formatting tag, ensure that char matches
                ;; its decomp exactly.  This is because we want 'ä' to
                ;; match 'ä', but we don't want '¹' to match '1'.
+             (unless (and (assq char char-fold-exclude-alist)
+                          (member (apply #'string decomp) (assq char char-fold-exclude-alist)))
                (aset equiv char
                      (cons (apply #'string decomp)
-                           (aref equiv char))))
+                           (aref equiv char)))))
 
              ;; Allow the entire decomp to match char.  If decomp has
              ;; multiple characters, this is done by adding an entry
@@ -74,9 +134,11 @@ char-fold-table
                                 (cons (cons (apply #'string (cdr decomp))
                                             (regexp-quote (string char)))
                                       (aref equiv-multi (car decomp))))
+                      (unless (and (assq (car decomp) char-fold-exclude-alist)
+                                   (member (char-to-string char) (assq (car decomp) char-fold-exclude-alist)))
                         (aset equiv (car decomp)
                               (cons (char-to-string char)
-                                    (aref equiv (car decomp))))))))
+                                    (aref equiv (car decomp)))))))))
                (funcall make-decomp-match-char decomp char)
                ;; Do it again, without the non-spacing characters.
                ;; This allows 'a' to match 'ä'.
@@ -98,9 +160,7 @@ char-fold-table
        table)
 
       ;; Add some manual entries.
-      (dolist (it '((?\" "＂" "“" "”" "”" "„" "⹂" "〞" "‟" "‟" "❞" "❝" "❠" "“" "„" "〝" "〟" "🙷" "🙶" "🙸" "«" "»")
-                    (?' "❟" "❛" "❜" "‘" "’" "‚" "‛" "‚" "󠀢" "❮" "❯" "‹" "›")
-                    (?` "❛" "‘" "‛" "󠀢" "❮" "‹")))
+    (dolist (it char-fold-include-alist)
         (let ((idx (car it))
               (chars (cdr it)))
           (aset equiv idx (append chars (aref equiv idx)))))
@@ -114,6 +174,9 @@ char-fold-table
              (aset equiv char re))))
        equiv)
       equiv))
+
+(defvar char-fold-table
+  (make-char-fold-table)
   "Used for folding characters of the same group during search.
 This is a char-table with the `char-fold-table' subtype.
 




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#22147; Package emacs. (Wed, 18 May 2016 20:41:01 GMT) Full text and rfc822 format available.

Message #91 received at 22147 <at> debbugs.gnu.org (full text, mbox):

From: Artur Malabarba <bruce.connor.am <at> gmail.com>
To: Juri Linkov <juri <at> linkov.net>
Cc: 22147 <at> debbugs.gnu.org, Drew Adams <drew.adams <at> oracle.com>
Subject: Re: bug#22147: Obsolete search-forward-lax-whitespace
Date: Wed, 18 May 2016 17:40:22 -0300
Juri Linkov <juri <at> linkov.net> writes:

> Now I tried to reintroduce these lists with different names:
> ‘char-fold-include-alist’ with a list to add to default mappings and
> ‘char-fold-exclude-alist’ with a list to remove from default mappings
> taking into account all opinions expressed on emacs-devel for the
> default values:

Sounds good! Some minor comments:

> +(defun make-char-fold-table ()

Call this `char-fold--make-table'

> +             (unless (and (assq char char-fold-exclude-alist)
> +                          (member (apply #'string decomp) (assq char char-fold-exclude-alist)))

This call to `member' will run dozens of times for each entry in
`char-fold-exclude-alist'. Maybe we should optimize those two repeated
forms: `(apply #'string decomp)' and `(assq char char-fold-exclude-alist)'.
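
One way to avoid the repeated work, sketched under the assumption that the exclusion list keeps the (CHAR STRING...) shape from the patch.  The names below are hypothetical stand-ins, not the names used in char-fold.el:

```elisp
(defvar my-fold-exclude-alist '((?a "ä" "å") (?n "ñ"))
  "Hypothetical stand-in for `char-fold-exclude-alist'.")

(defun my-fold-excluded-p (char string)
  "Return non-nil if STRING is excluded from folding into CHAR.
The alist entry for CHAR is looked up only once."
  (let ((entry (assq char my-fold-exclude-alist)))
    (and entry (member string (cdr entry)))))

;; The decomposition is likewise converted to a string once and reused.
(let* ((decomp (list ?a #x308))                  ; a + COMBINING DIAERESIS
       (decomp-string (apply #'string decomp)))  ; computed once
  (unless (my-fold-excluded-p (car decomp) decomp-string)
    decomp-string))
```

If the table is rebuilt often, converting the exclusion list to a char-table or hash table up front would make each lookup cheaper still; that is a further assumption, not something proposed in the thread.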

> -      (dolist (it '((?\" "＂" "“" "”" "”" "„" "⹂" "〞" "‟" "‟" "❞" "❝" "❠" "“" "„" "〝" "〟" "🙷" "🙶" "🙸" "«" "»")
> -                    (?' "❟" "❛" "❜" "‘" "’" "‚" "‛" "‚" "󠀢" "❮" "❯" "‹" "›")
> -                    (?` "❛" "‘" "‛" "󠀢" "❮" "‹")))
> +    (dolist (it char-fold-include-alist)
>          (let ((idx (car it))

The indentation looks wrong here.
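
One further point about the patch under review, offered as a sketch and not as something raised in the discussion: the default values call `string-match-p' directly on (getenv "LANG").  `getenv' returns nil when the variable is unset, and `string-match-p' signals an error when given a non-string.  A small helper (the name is hypothetical) could guard against that:

```elisp
(defun my-lang-matches-p (regexp)
  "Return non-nil if the LANG environment variable matches REGEXP.
Return nil, instead of signaling an error, when LANG is unset."
  (let ((lang (getenv "LANG")))
    (and lang (string-match-p regexp lang))))

;; For example, only offer the ß -> "ss" folding outside German locales:
(unless (my-lang-matches-p "^de")
  '((?ß "ss")))
```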




Removed indication that bug 22147 blocks Request was from Eli Zaretskii <eliz <at> gnu.org> to control <at> debbugs.gnu.org. (Sun, 22 May 2016 16:36:02 GMT) Full text and rfc822 format available.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#22147; Package emacs. (Mon, 30 May 2016 21:22:02 GMT) Full text and rfc822 format available.

Message #96 received at 22147 <at> debbugs.gnu.org (full text, mbox):

From: Juri Linkov <juri <at> linkov.net>
To: Artur Malabarba <bruce.connor.am <at> gmail.com>
Cc: 22147 <at> debbugs.gnu.org, Drew Adams <drew.adams <at> oracle.com>
Subject: Re: bug#22147: Obsolete search-forward-lax-whitespace
Date: Mon, 30 May 2016 23:57:12 +0300
>> Now I tried to reintroduce these lists with different names:
>> ‘char-fold-include-alist’ with a list to add to default mappings and
>> ‘char-fold-exclude-alist’ with a list to remove from default mappings
>> taking into account all opinions expressed on emacs-devel for the
>> default values:
>
> Sounds good! Some minor comments:
>
>> +(defun make-char-fold-table ()
>
> Call this `char-fold--make-table'
>
>> +             (unless (and (assq char char-fold-exclude-alist)
>> +                          (member (apply #'string decomp) (assq char char-fold-exclude-alist)))
>
> This call to `member' will run dozens of times for each entry in
> `char-fold-exclude-alist'. Maybe we should optimize those two repeated
> forms: `(apply #'string decomp)' and `(assq char char-fold-exclude-alist)'.

This definitely needs to be optimized, but now it's clear there is no hurry
since this is not going to be released in 25.  Moreover, I get occasional crashes
in char-tables with the latest patch, so it was a good thing not to push
it to the release branch at the last minute.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#22147; Package emacs. (Wed, 01 Jun 2016 15:04:02 GMT) Full text and rfc822 format available.

Message #99 received at 22147 <at> debbugs.gnu.org (full text, mbox):

From: Artur Malabarba <bruce.connor.am <at> gmail.com>
To: Juri Linkov <juri <at> linkov.net>
Cc: 22147 <at> debbugs.gnu.org, Drew Adams <drew.adams <at> oracle.com>
Subject: Re: bug#22147: Obsolete search-forward-lax-whitespace
Date: Wed, 01 Jun 2016 12:03:27 -0300
Juri Linkov <juri <at> linkov.net> writes:

> This definitely needs to be optimized, but now it's clear there is no hurry
> since this is not going to be released in 25.  Moreover, I get occasional crashes
> in char-tables with the latest patch, so it was a good thing not to push
> it to the release branch at the last minute.

Indeed. It wasn't a must-have anyway. At least now we have more time to
play around with this.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#22147; Package emacs. (Sat, 05 Sep 2020 14:55:02 GMT) Full text and rfc822 format available.

Message #102 received at 22147 <at> debbugs.gnu.org (full text, mbox):

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Artur Malabarba <bruce.connor.am <at> gmail.com>
Cc: 22147 <at> debbugs.gnu.org, Drew Adams <drew.adams <at> oracle.com>,
 Juri Linkov <juri <at> linkov.net>
Subject: Re: bug#22147: Obsolete search-forward-lax-whitespace
Date: Sat, 05 Sep 2020 16:54:48 +0200
Artur Malabarba <bruce.connor.am <at> gmail.com> writes:

> Juri Linkov <juri <at> linkov.net> writes:
>
>> This definitely needs to be optimized, but now it's clear there is no hurry
>> since this is not going to be released in 25.  Moreover, I get
>> occasional crashes
>> in char-tables with the latest patch, so it was a good thing not to push
>> it to the release branch at the last minute.
>
> Indeed. It wasn't a must-have anyway. At least now we have more time to
> play around with this.

I've just skimmed this bug report, but it seems like a different version
of the proposed patch was applied three years later:

commit 376f5df3cca0dbf186823e5b329d76b52019473d
Author:     Juri Linkov <juri <at> linkov.net>
AuthorDate: Tue Jul 23 23:27:28 2019 +0300

    Customizable char-fold with char-fold-symmetric, char-fold-include (bug#35689)
    
search-forward-lax-whitespace still isn't obsolete, but I'm unsure
whether there's anything more to do in this bug report?

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#22147; Package emacs. (Mon, 07 Sep 2020 18:38:02 GMT) Full text and rfc822 format available.

Message #105 received at 22147 <at> debbugs.gnu.org (full text, mbox):

From: Juri Linkov <juri <at> linkov.net>
To: Lars Ingebrigtsen <larsi <at> gnus.org>
Cc: 22147 <at> debbugs.gnu.org
Subject: Re: bug#22147: Obsolete search-forward-lax-whitespace
Date: Mon, 07 Sep 2020 21:34:56 +0300
tags 22147 fixed
close 22147 28.0.50
quit

> I've just skimmed this bug report, but it seems like a different version
> of the proposed patch was applied three years later:
>
> commit 376f5df3cca0dbf186823e5b329d76b52019473d
> Author:     Juri Linkov <juri <at> linkov.net>
> AuthorDate: Tue Jul 23 23:27:28 2019 +0300
>
>     Customizable char-fold with char-fold-symmetric, char-fold-include (bug#35689)
>
> search-forward-lax-whitespace still isn't obsolete, but I'm unsure
> whether there's anything more to do in this bug report?

Let's see if the requested feature works now:

0. emacs -Q

1. eval

(setq search-whitespace-regexp "\\(?:\\s-\\|\n\\)+")
(require 'char-fold)
(setq-default search-default-mode 'char-fold-to-regexp)
(setq char-fold-symmetric t)

2. then ‘C-s 1 C-q C-j 2 C-s’ finds both occurrences:

1 2
1
2

Oh, wait!  This works only because I have a local, not yet installed patch
from bug#38539.  I have now pushed it to master and closed both reports.
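
For reference, a small sketch of what the `search-whitespace-regexp' value from
the recipe matches, independent of isearch and char-fold.  It only checks the
regexp itself against the two sample texts; the variable name `ws' is local to
this example.

```elisp
;; The whitespace regexp from step 1: one or more whitespace-syntax
;; characters or newlines.
(let* ((ws "\\(?:\\s-\\|\n\\)+")
       ;; Search string "1<whitespace>2" with the whitespace made lax.
       (re (concat "1" ws "2")))
  (list (string-match-p re "1 2")     ; digits separated by a space
        (string-match-p re "1\n2")))  ; digits separated by a newline
```

Both calls return a match position, so the same pattern covers the one-line
and the two-line occurrence.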




Added tag(s) fixed. Request was from Juri Linkov <juri <at> linkov.net> to control <at> debbugs.gnu.org. (Mon, 07 Sep 2020 18:38:03 GMT) Full text and rfc822 format available.

bug marked as fixed in version 28.0.50, send any further explanations to 22147 <at> debbugs.gnu.org and Juri Linkov <juri <at> linkov.net> Request was from Juri Linkov <juri <at> linkov.net> to control <at> debbugs.gnu.org. (Mon, 07 Sep 2020 18:38:03 GMT) Full text and rfc822 format available.

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Tue, 06 Oct 2020 11:24:06 GMT) Full text and rfc822 format available.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.