GNU bug report logs - #39483
27.0.60; ispell ignores syntax/category tables word boundaries


Package: emacs;

Reported by: "Paul W. Rankin" <hello <at> paulwrankin.com>

Date: Fri, 7 Feb 2020 15:46:01 UTC

Severity: normal

Found in version 27.0.60

Done: Eli Zaretskii <eliz <at> gnu.org>

Bug is archived. No further changes may be made.



Message #8 received at 39483 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: "Paul W. Rankin" <hello <at> paulwrankin.com>
Cc: 39483 <at> debbugs.gnu.org
Subject: Re: bug#39483: 27.0.60; ispell ignores syntax/category tables word boundaries
Date: Fri, 07 Feb 2020 20:23:33 +0200
> From: "Paul W. Rankin" <hello <at> paulwrankin.com>
> Date: Sat, 08 Feb 2020 01:44:52 +1000
> 
> It appears that the function `ispell-get-word' makes its own judgements
> on word boundaries, ignoring the buffer's syntax tables and character
> categories.

That is true.  And I don't really see how it can be any different,
since ispell.el must have the same notion of a word as the underlying
dictionary, otherwise you will have false positives and/or false
negatives, right?

ispell.el looks up the word characters and non-word characters in
its database, and the doc string of ispell-dictionary-base-alist
explains how.
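
As a rough illustration of that lookup, the accessor functions in ispell.el
can be evaluated to see which character sets are in effect for the current
dictionary.  This is only a sketch: it assumes a spell-checker program is
installed and that ispell.el can initialize itself in your session, and the
values returned depend entirely on the dictionary in use.

```elisp
(require 'ispell)

;; Initialize the dictionary alists before querying them.
(ispell-set-spellchecker-params)

;; Collect the word-boundary settings ispell.el uses for the
;; current dictionary (the keyword labels are only for readability).
(list :casechars         (ispell-get-casechars)
      :not-casechars     (ispell-get-not-casechars)
      :otherchars        (ispell-get-otherchars)
      :many-otherchars-p (ispell-get-many-otherchars-p)
      :coding-system     (ispell-get-coding-system))
```

Evaluating this with M-: or in *scratch* shows whether the curly apostrophe
is actually part of OTHERCHARS for the dictionary being used.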

> This becomes a problem when using `electric-quote-mode' with
> ispell, because contractions are parsed as separate words.  For example,
> calling `ispell-word' on "doesn’t" returns:
> 
>     T is correct
> 
> To reproduce:
> 
> 1. emacs -Q
> 2. (in *scratch*) M-x text-mode RET
> 3. enter text "doesn’t" (i.e. "doesn" C-x 8 ] "t")
> 4. M-: (modify-syntax-entry ?’ "w")
> 5. M-: (modify-category-entry ?’ ?^)
> 6. M-$ | ispell-word

The buffer syntax table has no effect on ispell.el, and shouldn't have
any effect on it.

> Attempts at workarounds:
> 
> I've tried altering slot 3 of the corresponding `ispell-dictionary-base-alist'
> entries from "[']" to "['’]" to no avail.

That's the right direction, but you didn't follow it far enough.
First, ispell-dictionary-base-alist is the default value, and is used
to produce ispell-dictionary-alist, which is the one you should change
(alternatively, customize ispell-local-dictionary-alist).  More
importantly, the definitions of each dictionary include more than just
one character set: there are 3 character sets there and one parameter
for encoding the string passed to the spell-checker, and you should be
sure to set them all as appropriate for the dictionary you use.
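
A minimal sketch of such an entry follows, with all fields filled in.  The
dictionary name "en_US", the "-d" argument, and the utf-8 coding system are
assumptions made for illustration; they are not taken from the report and
must be adapted to the spell-checker and dictionary actually installed.

```elisp
(require 'ispell)

;; Hypothetical entry: the name, arguments, and encoding are assumptions.
(setq ispell-local-dictionary-alist
      '(("en_US"            ; DICTIONARY-NAME (assumed)
         "[[:alpha:]]"      ; CASECHARS: characters that make up words
         "[^[:alpha:]]"     ; NOT-CASECHARS: the opposite of CASECHARS
         "['’]"             ; OTHERCHARS: allowed inside a word
         nil                ; MANY-OTHERCHARS-P: one such char per word
         ("-d" "en_US")     ; ISPELL-ARGS passed to the program (assumed)
         nil                ; EXTENDED-CHARACTER-MODE
         utf-8)))           ; CHARACTER-SET used to encode the text sent
```

The point of the sketch is that the three character sets and the encoding
are set together; changing OTHERCHARS alone may not be enough if the
encoding cannot represent the curly apostrophe or the dictionary itself
does not accept it.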

My suggestion is to step with Edebug through ispell-get-word and see
why it doesn't consider "doesn’t" as a single word in your case.
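
Before instrumenting the function for Edebug (C-u C-M-x on its defun), a
quick check is to call it directly on the problem text and look at what it
returns.  This is a sketch that assumes a working spell-checker setup; no
particular result is implied, since it depends on the dictionary settings.

```elisp
(require 'ispell)

;; Put the problem word in a scratch buffer and ask ispell.el
;; what it considers the word at point to be.
(with-temp-buffer
  (insert "doesn’t")
  (goto-char (point-min))
  ;; The result is a list of the word and its start and end positions.
  (ispell-get-word nil))
```

If the word returned stops at the apostrophe, the OTHERCHARS setting in
effect does not include that character.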

> Setup:
> 
> GNU Emacs 27.0.60 (build 2, x86_64-apple-darwin19.3.0, NS appkit-1894.30
> Version 10.15.3 (Build 19D76)) of 2020-02-05

This omits crucial information, like the dictionary in use and the
locale-dependent settings that affect encoding.  (In any case, I don't
think this list is the right place for discussing this issue.)




This bug report was last modified 5 years and 185 days ago.


