GNU bug report logs - #35802
Broken data loaded from uni-decomposition

Package: emacs;

Reported by: Juri Linkov <juri <at> linkov.net>

Date: Sun, 19 May 2019 20:21:02 UTC

Severity: normal

Tags: fixed, patch

Fixed in version 27.1

Done: Noam Postavsky <npostavs <at> gmail.com>

Bug is archived. No further changes may be made.

From: Juri Linkov <juri <at> linkov.net>
To: npostavs <at> gmail.com
Cc: 35802 <at> debbugs.gnu.org
Subject: bug#35802: Broken data loaded from uni-decomposition
Date: Thu, 06 Jun 2019 23:41:35 +0300
>> But should return `t'.  I customized `search-whitespace-regexp'
>> (whose value isearch sets to `search-spaces-regexp') to a legitimate
>> value, but `unicode-property-table-internal' used in char-fold.el fails
>> to correctly load "uni-decomposition.el", thus breaking the char-fold search.
>
> The problem is that this messes up a search in find-auto-coding:

Thanks for finding this.

>       (if (re-search-forward
>            "[\r\n]\\([^\r\n]*\\)[ \t]*Local Variables:[ \t]*\\([^\r\n]*\\)[\r\n]"
>            tail-end t)
>           ...
>           (let* ((prefix (regexp-quote (match-string 1)))
>                  (suffix (regexp-quote (match-string 2)))
>
> The space between "Local Variables" becomes "\\(\\s-\\|\n\\)+" which is
> a problem because it adds a new capturing group, which means suffix gets
> the wrong value.  Then we fail to find the ";; End:" line, and don't
> apply the "coding: utf-8" setting.
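The group shift described above can be reproduced in isolation. The following is a minimal sketch, not the actual code of find-auto-coding: the helper name `my-local-variables-suffix' and the sample buffer text are assumptions made up for illustration, and only the regexp is taken from the quoted snippet.

```elisp
(defun my-local-variables-suffix (spaces-regexp)
  "Return what (match-string 2) yields for the Local Variables regexp.
SPACES-REGEXP is bound to `search-spaces-regexp' during the search.
Hypothetical helper for illustration only."
  (with-temp-buffer
    ;; Sample text (an assumption): prefix \";; \", suffix \"x\".
    (insert "\n;; Local Variables: x\n")
    (goto-char (point-min))
    (let ((search-spaces-regexp spaces-regexp))
      (when (re-search-forward
             "[\r\n]\\([^\r\n]*\\)[ \t]*Local Variables:[ \t]*\\([^\r\n]*\\)[\r\n]"
             nil t)
        (match-string 2)))))

;; No substitution: group 2 is the suffix.
(my-local-variables-suffix nil)
;; Capturing group: the substituted whitespace becomes group 2,
;; and the real suffix moves to group 3.
(my-local-variables-suffix "\\(\\s-\\|\n\\)+")
;; Shy group: no new group number, so group 2 is the suffix again.
(my-local-variables-suffix "\\(?:\\s-\\|\n\\)+")
```

With the capturing value, the second call returns the matched whitespace rather than the suffix, which is the mismatch that prevents the ";; End:" line from being found.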

When this feature is used in Isearch, the documented way to avoid this problem
is to replace the space with ‘[ ]’, i.e. to use

  "Local[ ]Variables:"
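
As a rough illustration of why the bracketed form helps: a space inside a character alternative is not replaced by `search-spaces-regexp', so it keeps matching exactly one literal space. The helper name `my-spaces-demo' and the whitespace value "[ \t\n]+" are assumptions chosen for this sketch.

```elisp
(defun my-spaces-demo (pattern string)
  "Return the position where PATTERN matches STRING, or nil.
Hypothetical helper: `search-spaces-regexp' is bound to a
whitespace-run regexp for the duration of the match."
  (let ((search-spaces-regexp "[ \t\n]+"))
    (string-match pattern string)))

;; A bare space is substituted, so it also matches a tab.
(my-spaces-demo "Local Variables:" "Local\tVariables:")
;; A bracketed space is left alone, so a tab no longer matches ...
(my-spaces-demo "Local[ ]Variables:" "Local\tVariables:")
;; ... while a literal space still does.
(my-spaces-demo "Local[ ]Variables:" "Local Variables:")
```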

> So the value you chose isn't entirely legitimate, you should use a shy
> group instead:
>
> (equal (progn (load "international/uni-decomposition.el" t t t t)
>               (aref (cdr (assq 'decomposition char-code-property-alist)) 1024))
>        (progn (let ((search-spaces-regexp "\\(?:\\s-\\|\n\\)+"))
>                 (load "international/uni-decomposition.el" t t t t))
>               (aref (cdr (assq 'decomposition char-code-property-alist)) 1024)))
> ;=> t

Maybe this gotcha should be mentioned in the documentation of
search-spaces-regexp and search-whitespace-regexp?

> And possibly let-binding search-spaces-regexp in find-auto-coding would
> make sense (although, there's probably more places like this that might
> break, not sure if we can ever hope to find them all).

This is almost the same class of problem as forgetting to wrap
re-search-forward in save-match-data, so finding every place where
such a setting affects matching elsewhere will take time.
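
For code that depends on group numbers, the let-binding idea could look roughly like the sketch below. The macro name `my-with-literal-spaces' is hypothetical and this is not necessarily how the bug was fixed; it only shows the defensive pattern, analogous to save-match-data.

```elisp
(defmacro my-with-literal-spaces (&rest body)
  "Evaluate BODY with `search-spaces-regexp' bound to nil.
Hypothetical wrapper: spaces in regexps used by BODY match literally,
whatever the caller or the user has bound."
  (declare (indent 0) (debug t))
  `(let ((search-spaces-regexp nil))
     ,@body))

(defun my-suffix-after-header (text)
  "Return group 2 of a \"Local Variables:\" match in TEXT, or nil.
Hypothetical example of a search that relies on group numbering."
  (my-with-literal-spaces
    (when (string-match
           "\\([^\r\n]*\\)Local Variables:[ \t]*\\([^\r\n]*\\)" text)
      (match-string 2 text))))

;; Even under a capturing value, the wrapped search sees group 2
;; as the suffix.
(let ((search-spaces-regexp "\\(\\s-\\|\n\\)+"))
  (my-suffix-after-header ";; Local Variables: x"))
```

The cost of this approach is the one noted above: every search that relies on group numbers has to be found and wrapped individually.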




This bug report was last modified 5 years and 328 days ago.
