GNU bug report logs - #42602
Wrong (not-)casechars value for "polish" in ispell-dictionary-base-alist


Package: emacs

Reported by: Sebastian Urban <mrsebastianurban <at> gmail.com>

Date: Wed, 29 Jul 2020 16:13:01 UTC

Severity: normal

Done: Stefan Kangas <stefan <at> marxist.se>

Bug is archived. No further changes may be made.



Message #5 received at submit <at> debbugs.gnu.org:

From: Sebastian Urban <mrsebastianurban <at> gmail.com>
To: Bug GNU Emacs <bug-gnu-emacs <at> gnu.org>
Subject: Wrong (not-)casechars value for "polish" in
 ispell-dictionary-base-alist
Date: Wed, 29 Jul 2020 18:12:02 +0200
Hello,

for words like:
   męski
   miód
   klątwa
   ślad
   łuk
   żaba
   źrebak
   grzać
   bańka
ispell.el sends only part of the word to Aspell, e.g. "lad" instead of
"ślad", or "kl"/"twa" (depending on the cursor position) instead of
"klątwa".

I think this is caused by the wrong value of (NOT-)CASECHARS, which
consists of the ASCII letters (A-z) plus a few other characters, of
which only ó/Ó is valid for Polish.
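
The truncation can be illustrated without Aspell. The sketch below is
only an approximation of what happens: `my/ascii-casechars` is a
hypothetical ASCII-only stand-in (not the actual "polish" entry from
ispell-dictionary-base-alist), and `my/leading-word` is a hypothetical
helper that returns the run of characters a CASECHARS-style class
accepts at the start of a word.

```elisp
;; Hypothetical stand-in for a CASECHARS value that lacks Polish letters;
;; this is NOT the actual entry shipped in ispell.el.
(defvar my/ascii-casechars "[A-Za-z]")

(defun my/leading-word (text casechars)
  "Return the leading run of TEXT whose characters match CASECHARS.
Return nil if the first character of TEXT does not match."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (when (looking-at (concat casechars "+"))
      (match-string 0))))

(my/leading-word "klątwa" my/ascii-casechars) ; => "kl"
(my/leading-word "ślad" my/ascii-casechars)   ; => nil ("ś" is not in the class)
```

Any character missing from the class acts as a word boundary, which
would explain why only a fragment such as "kl" reaches the spell
checker.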

Oddly, though, it does not recognize "ó" in the word "miód" either and
sends "mi" or "d". The character is in the CASECHARS list as \363, so
it should work.  Moreover, if I type "[\363\323]" in regexp-builder, it
does not match ó/Ó, although it has no problem with other Polish
characters such as "ł" ("[\502]") or "ż" ("[\574]").

If I put the following in my init.el:
--8<---------------cut here---------------start------------->8---
(setq ispell-program-name "C:/cygwin64/bin/aspell")
(add-hook 'ispell-initialize-spellchecker-hook
          (lambda ()
            (add-to-list 'ispell-local-dictionary-alist
                         '("pl"
                           ;; "[[:alpha:]]"
                           ;; "[^[:alpha:]]"
                           ;; ęóąśłżźćńĘÓĄŚŁŻŹĆŃ
                           "[A-Za-z\431\363\405\533\502\574\572\407\504\430\323\404\532\501\573\571\406\503]"
                           "[^A-Za-z\431\363\405\533\502\574\572\407\504\430\323\404\532\501\573\571\406\503]"
                           "[.]" nil nil nil iso-8859-2))))
(setq ispell-dictionary "pl")
--8<---------------cut here---------------end--------------->8---

everything seems to work; even ó/Ó are recognised. "[[:alpha:]]" works
as well, so I left it in as an alternative. Changing from iso-8859-2 to
utf-8 doesn't break anything.
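
As a quick sanity check of the "[[:alpha:]]" alternative, the sketch
below tests the sample words from this report against that class using
plain string matching. It does not involve ispell.el or Aspell, and
`my/whole-word-p` is a hypothetical helper introduced only for this
illustration.

```elisp
(require 'cl-lib)

(defun my/whole-word-p (word casechars)
  "Return non-nil if every character of WORD matches CASECHARS."
  (string-match-p (concat "\\`\\(?:" casechars "\\)+\\'") word))

;; Sample words taken from the report above.
(defvar my/polish-samples
  '("męski" "miód" "klątwa" "ślad" "łuk" "żaba" "źrebak" "grzać" "bańka"))

;; Non-nil when the class covers every sample word in full.
(cl-every (lambda (w) (my/whole-word-p w "[[:alpha:]]")) my/polish-samples)

;; For comparison, an ASCII-only class rejects a word with a Polish letter.
(my/whole-word-p "ślad" "[A-Za-z]") ; => nil
```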

Tested on:
- GNU Emacs 26.3 (build 1, x86_64-w64-mingw32) of 2019-08-29,
- GNU Emacs 28.0.50 (build 1, x86_64-w64-mingw32) of 2020-07-05,
with Aspell from Cygwin installation.


S. U.




This bug report was last modified 4 years and 282 days ago.
