GNU bug report logs - #20871
25.0.50; fill-single-char-nobreak-p does not recognize a single-letter word when it is preceded by an open paren

Package: emacs;

Reported by: Marcin Borkowski <mbork <at> mbork.pl>

Date: Mon, 22 Jun 2015 10:21:02 UTC

Severity: normal

Found in version 25.0.50

Fixed in version 27.1

Done: Glenn Morris <rgm <at> gnu.org>

Bug is archived. No further changes may be made.

From: Marcin Borkowski <mbork <at> mbork.pl>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 20871 <at> debbugs.gnu.org
Subject: bug#20871: 25.0.50; fill-single-char-nobreak-p does not recognize a single-letter word when it is preceded by an open paren
Date: Sat, 30 Apr 2016 14:26:28 +0200

On 2016-04-30, at 13:21, Eli Zaretskii <eliz <at> gnu.org> wrote:

>> From: Marcin Borkowski <mbork <at> mbork.pl>
>> Cc: 20871 <at> debbugs.gnu.org
>> Date: Fri, 29 Apr 2016 14:18:34 +0200
>> 
>> >> +    (looking-at "[^[:alpha:]][[:alpha:]]")))
>> >
>> > You should be aware that starting with Emacs 25.1 [:alpha:] matches a
>> > very large class of characters, some of them having nothing in common
>> > with those used in Polish.  So perhaps it is better to use '\cl'
>> > instead, which will only capture Latin characters?  Just a thought --
>> > your call.
>> 
>> I guess you are right, Eli - in fact, all one-letter words in Polish are
>> matched by [aiouwz].  I decided to go with \cl, as you suggested,
>> though - this way, the function could be (probably) useful also for
>> Slovaks, for instance.  I attach the corrected patch.
>
> LGTM, thanks.

Thanks!

>> Just to be sure: in my Emacs, \cl matches also ą, ę, ż, ź, á, ö etc.  Is
>> it intentional?
>
> Yes.  \cl matches any character that belongs to any of the Latin
> blocks.
>
>> Is it documented somewhere?
>
> Not sure what needs to be documented, please elaborate.

Well, at first I thought that "Latin" means "matching [a-z]".  Finding
out that accented letters qualify, too, was a (pleasant) surprise.
Finding that out using `describe-categories' is a bit tricky, since its
output contains ranges, and I don't know which of them e.g. "ą"
belongs to.  The output of `describe-categories' says "Legend of category
mnemonics (see the tail for the longer description)"; I guess the
"longer" description might say something more.  For instance, this line:

(define-category ?l "Latin")

in characters.el

could be replaced by

(define-category ?l "Latin
Latin letters (including those with diacritics)")

This way, there would be at least a hint at the bottom of the *Help*
buffer displayed by `describe-categories'.
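
As a side note, a character's categories can also be queried directly
rather than read off the ranges printed by `describe-categories'.  The
snippet below is only a rough sketch using standard category functions;
the exact mnemonic string returned depends on the Emacs version, so no
expected values are shown.

```elisp
;; Sketch only: evaluate each form with C-x C-e or in M-x ielm.

;; List the category mnemonics of one character as a string.
;; For "ą" the result should contain the letter "l" (Latin).
(category-set-mnemonics (char-category-set ?ą))

;; Check that the regexp category class \cl matches the character.
;; Returns non-nil when the character belongs to category l.
(string-match-p "\\cl" "ą")

;; Show the docstring currently attached to the category mnemonic,
;; i.e. the text that `define-category' stored for ?l.
(category-docstring ?l)
```

Querying one character this way avoids having to work out which of the
printed ranges it falls into.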

WDYT?  Would you like me to prepare a patch?

Best,

-- 
Marcin Borkowski
http://octd.wmi.amu.edu.pl/en/Marcin_Borkowski
Faculty of Mathematics and Computer Science
Adam Mickiewicz University




This bug report was last modified 5 years and 280 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.