GNU bug report logs - #13639
ispell.el: hunspell dicts autodetection under Emacs.

Package: emacs

Reported by: Agustin Martin <agustin.martin <at> hispalinux.es>

Date: Wed, 16 Jan 2013 16:37:02 UTC

Owned by: Agustin Martin <agustin.martin <at> hispalinux.es>

Severity: normal

Done: Agustin Martin <agustin.martin <at> hispalinux.es>

Bug is archived. No further changes may be made.

From: Eli Zaretskii <eliz <at> gnu.org>
To: Agustin Martin <agustin.martin <at> hispalinux.es>
Cc: 13639 <at> debbugs.gnu.org
Subject: bug#13639: [emacs] ispell.el: hunspell dicts autodetection under Emacs.
Date: Wed, 20 Feb 2013 21:00:41 +0200

> Date: Wed, 20 Feb 2013 18:50:45 +0100
> From: Agustin Martin <agustin.martin <at> hispalinux.es>
> 
> > > > > Sorry, I should have written WORDCHARS.
> > > > 
> > > > Why do we need that?
> > > 
> > > This is what ispell.el calls otherchars. Parsing WORDCHARS ensures that
> > > both hunspell and ispell.el think about the same characters in that
> > > category.
> > 
> > I think you are mistaken, that's not my reading of hunspell(4).
> 
> Sorry for the late reply,
> 
> (Opening a new thread specifically about hunspell dicts autodetection,
> using the newly cloned bug report #13639, which is specific to this.)
> 
> Although the WORDCHARS description in hunspell(4)
> 
> WORDCHARS characters
>    WORDCHARS extends tokenizer of Hunspell command line interface
>    with additional word character. For example, dot, dash, n-dash, numbers,
>    percent sign are word character in Hungarian.
> 
> is too Hungarian-biased and does not mention the usual apostrophe, AFAIK it
> mostly refers to the same thing as 'otherchars', although hunspell may
> accept those characters in locations other than the middle of a word.

I didn't just read the man page, I also looked into several *.aff
files that install with Hunspell dictionaries.  It is clear to me that
WORDCHARS is at least unreliable, even if your interpretation is
correct (of which I'm still unconvinced): some *.aff files don't have
that entry at all (e.g., en_GB.aff, whose OTHERCHARS should include
the ' character, and also ru_RU.aff); others, like he_IL.aff, have
that entry mention all the CASECHARS, in addition to OTHERCHARS.  I
wouldn't bet my money on what that entry gives us.
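
To make the concern concrete, here is a minimal Python sketch of a
defensive reading of WORDCHARS.  It is not what ispell.el or Hunspell
do; the function names, the "'" fallback, and the letter-stripping
heuristic are all assumptions made for illustration.  It only relies
on the WORDCHARS keyword being followed by the characters on the same
line of the *.aff file.

```python
from typing import Optional


def wordchars_from_aff(aff_text: str) -> Optional[str]:
    """Return the WORDCHARS value found in .aff contents, or None if absent."""
    for line in aff_text.splitlines():
        fields = line.split()
        # A WORDCHARS entry is the keyword followed by the characters.
        if len(fields) >= 2 and fields[0] == "WORDCHARS":
            return fields[1]
    return None


def otherchars_guess(aff_text: str, fallback: str = "'") -> str:
    """Guess an otherchars-like set from WORDCHARS (hypothetical heuristic).

    Falls back to a default when the entry is missing, and drops
    letters, since some files list ordinary word letters in WORDCHARS.
    """
    wordchars = wordchars_from_aff(aff_text)
    if wordchars is None:
        return fallback
    return "".join(ch for ch in wordchars if not ch.isalpha())


if __name__ == "__main__":
    # Made-up .aff fragments, not copied from any real dictionary.
    print(otherchars_guess("SET UTF-8\nTRY esian\n"))
    print(otherchars_guess("SET UTF-8\nWORDCHARS abc'-\n"))
```

Even with such guards the result is only as good as the entry itself,
which is why treating WORDCHARS as authoritative looks risky.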

> The good news is that I started working on hunspell dicts autodetection.

Good news, indeed!  Thanks!

