GNU bug report logs - #20741
24.4; flyspell doesn't work with abbreviations ending in a period
To reply to this bug, email your comments to 20741 AT debbugs.gnu.org.
Report forwarded to bug-gnu-emacs <at> gnu.org: bug#20741; Package emacs. (Fri, 05 Jun 2015 14:08:02 GMT)
Acknowledgement sent to Reuben Thomas <rrt <at> sc3d.org>: New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Fri, 05 Jun 2015 14:08:02 GMT)
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
flyspell marks as incorrect “etc.”, “i.e.”, “e.g.” &c.
flyspell is of course behaving as expected: “.” is in OTHERCHARS, and as
it comes after the word, it is not included.
ispell sets my default dictionary to en_GB (from my locale, I presume),
and I’m using hunspell.
If I run ispell-buffer on a buffer containing the above words, they
pass, which is surprising in that it seems that the OTHERCHARS
specification has not been applied in this case. It is not surprising in
the sense that these definitions are in my dictionary.
The somewhat nonsensical result is that if I run ispell-word on such a
word marked incorrect by flyspell, the first correction offered is the
word I already have, plus a period. If I select it, the net effect is
that an extra period is inserted, and flyspell complains again.
I tried to move “.” to CASECHARS and NOT-CASECHARS in a custom
dictionary definition:
("en_GB" "[[:alpha:].]" "[^[:alpha:].]" "['0-9’-]" t
("-d" "en_GB")
nil utf-8)
but this causes flyspell to give an error saying it got nil where it
expected a stringp in its post-command-hook. In any case, I guess this
would not do what I wanted without adding an inflexion rule to the
dictionary that allowed any word to add “.” (except, ideally, a word
that already ends in a period).
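The word-splitting behaviour described above can be modelled outside Emacs. The following Python sketch is a simplification under stated assumptions, not the code in ispell.el or flyspell.el: it treats a word as a run of CASECHARS that may be continued by one OTHERCHARS character only when more CASECHARS follow. The function name `split_words` and both default regexps are illustrative.

```python
import re


def split_words(text, casechars=r"[^\W\d_]", otherchars=r"['.0-9’-]"):
    """Return the chunks a CASECHARS/OTHERCHARS tokenizer would call words.

    An OTHERCHARS character is kept only when it sits between two
    CASECHARS runs; at the end of a word it acts as a word break.
    """
    pattern = re.compile(rf"{casechars}+(?:{otherchars}{casechars}+)*")
    return pattern.findall(text)


# The inner period of "i.e." survives, the trailing one does not.
print(split_words("etc. i.e. e.g."))
```

In this model the trailing period never joins the word, because nothing from CASECHARS follows it. That is consistent with the checker being asked about "etc", "i.e" and "e.g" rather than the dotted forms stored in the dictionary.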
In GNU Emacs 24.4.1 (x86_64-pc-linux-gnu, GTK+ Version 3.10.8)
of 2014-11-21 on skwd, modified by Debian
Windowing system distributor `The X.Org Foundation', version 11.0.11501000
System Description: Ubuntu 14.04.2 LTS
Configured using:
`configure --build x86_64-linux-gnu --prefix=/usr
--sharedstatedir=/var/lib --libexecdir=/usr/lib
--localstatedir=/var/lib --infodir=/usr/share/info
--mandir=/usr/share/man --with-pop=yes
--enable-locallisppath=/etc/emacs24:/etc/emacs:/usr/local/share/emacs/24.4/site-lisp:/usr/local/share/emacs/site-lisp:/usr/share/emacs/24.4/site-lisp:/usr/share/emacs/site-lisp
--build x86_64-linux-gnu --prefix=/usr --sharedstatedir=/var/lib
--libexecdir=/usr/lib --localstatedir=/var/lib
--infodir=/usr/share/info --mandir=/usr/share/man --with-pop=yes
--enable-locallisppath=/etc/emacs24:/etc/emacs:/usr/local/share/emacs/24.4/site-lisp:/usr/local/share/emacs/site-lisp:/usr/share/emacs/24.4/site-lisp:/usr/share/emacs/site-lisp
--with-x=yes --with-x-toolkit=gtk3 --with-toolkit-scroll-bars
'CFLAGS=-g -O2 -fstack-protector --param=ssp-buffer-size=4 -Wformat
-Werror=format-security -Wall' CPPFLAGS=-D_FORTIFY_SOURCE=2
'LDFLAGS=-Wl,-Bsymbolic-functions -Wl,-z,relro''
Important settings:
value of $LC_MONETARY: en_GB.UTF-8
value of $LC_NUMERIC: en_GB.UTF-8
value of $LC_TIME: en_GB.UTF-8
value of $LANG: en_GB.UTF-8
value of $XMODIFIERS: @im=local
locale-coding-system: utf-8-unix
Major mode: Emacs-Lisp
Minor modes in effect:
TeX-PDF-mode: t
TeX-source-correlate-mode: t
shell-dirtrack-mode: t
paredit-mode: t
show-paren-mode: t
savehist-mode: t
minibuffer-electric-default-mode: t
icomplete-mode: t
global-auto-revert-mode: t
desktop-save-mode: t
bug-reference-prog-mode: t
global-undo-tree-mode: t
undo-tree-mode: t
global-whitespace-mode: t
ido-everywhere: t
dtrt-indent-mode: t
global-auto-complete-mode: t
auto-complete-mode: t
eldoc-mode: t
tooltip-mode: t
electric-indent-mode: t
mouse-wheel-mode: t
file-name-shadow-mode: t
global-font-lock-mode: t
font-lock-mode: t
blink-cursor-mode: t
auto-composition-mode: t
auto-encryption-mode: t
auto-compression-mode: t
column-number-mode: t
line-number-mode: t
transient-mark-mode: (only . t)
Recent input:
<switch-frame> C-x b A g <tab> <return> C-n C-n C-b
C-b C-b C-b C-b C-b C-b C-b C-b C-b C-b C-b C-b C-b
C-b C-b C-b C-b C-b C-b C-b C-b C-b C-b C-b C-b C-b
C-b C-b C-b C-b C-b C-b C-f C-f C-f C-f C-f C-f C-f
C-f C-f C-f C-f C-f C-f C-f C-f C-f C-f C-f C-f C-f
C-f C-f C-f C-f C-f C-f C-f C-f C-f C-f C-f C-f C-f
C-f C-f C-f C-f C-f C-f C-f C-f C-f C-f C-f C-f C-f
C-f C-f C-f C-f C-f C-f C-b C-b C-b C-b C-b C-b C-b
C-b C-b C-b C-b C-b C-b C-b C-b C-b C-b C-b C-b C-b
C-b C-b C-b C-b C-b C-b C-b C-b C-b C-b C-b C-b C-b
C-b C-b C-b C-b C-b C-b C-p C-p C-p C-p C-p C-p C-p
C-p C-b C-b C-b C-b C-b C-b C-b C-b C-b C-b C-b C-b
C-b C-b C-b C-b C-b C-b C-b C-b C-b C-b C-b C-b C-b
C-b C-b C-b C-b C-b C-b C-b C-b C-b C-b C-b C-b C-b
C-b C-b C-b C-b C-b C-b C-b C-b C-b C-b C-b C-b C-b
C-b C-b C-b C-b C-b C-b C-b C-b C-b C-b C-b C-b C-b
C-b C-b C-b C-b C-b C-b C-b C-b C-b C-b C-b C-b C-b
C-b C-b C-b C-b C-b C-b C-b C-b C-b C-b C-b C-b C-p
C-b C-f C-f C-f C-f C-f C-f C-f C-f C-f C-f C-f C-f
C-x b * N <backspace> M e s s <tab> <return> C-r f
g g l C-a C-x b <return> C-x b c u s t <tab> <return>
C-a <switch-frame> <switch-frame> <switch-frame> <help-echo>
<switch-frame> <down-mouse-1> <mouse-movement> <mouse-movement>
<drag-mouse-1> M-x r e p o r t - e m a c s - b u g
<return>
Recent messages:
Applying style hooks... done
Applying style hooks... done
Applying style hooks... done
Applying style hooks... done
Wrote /home/rrt/.emacs.desktop.lock
Desktop: 4 frames, 16 buffers restored.
For information about GNU Emacs and the GNU system, type C-h C-a.
call-interactively: End of buffer [18 times]
Mark saved where search started
Mark set
Load-path shadows:
/home/rrt/.emacs.d/el-get/xrdb-mode/xrdb-mode hides /usr/share/emacs24/site-lisp/emacs-goodies-el/xrdb-mode
/home/rrt/.emacs.d/el-get/csv-mode/csv-mode hides /usr/share/emacs24/site-lisp/emacs-goodies-el/csv-mode
/home/rrt/.emacs.d/el-get/quack/quack hides /usr/share/emacs24/site-lisp/emacs-goodies-el/quack
/home/rrt/.emacs.d/el-get/markdown-mode/markdown-mode hides /usr/share/emacs24/site-lisp/emacs-goodies-el/markdown-mode
/home/rrt/.emacs.d/el-get/filladapt/filladapt hides /usr/share/emacs24/site-lisp/emacs-goodies-el/filladapt
/home/rrt/.emacs.d/el-get/graphviz-dot-mode/graphviz-dot-mode hides /usr/share/emacs24/site-lisp/emacs-goodies-el/graphviz-dot-mode
/home/rrt/.emacs.d/el-get/browse-kill-ring/browse-kill-ring hides /usr/share/emacs24/site-lisp/emacs-goodies-el/browse-kill-ring
/home/rrt/.emacs.d/el-get/apache-mode/apache-mode hides /usr/share/emacs24/site-lisp/emacs-goodies-el/apache-mode
/usr/share/emacs/24.4/site-lisp/debian-startup hides /usr/share/emacs/site-lisp/debian-startup
/home/rrt/.local/share/emacs/site-lisp/lilypond-mode hides /usr/share/emacs/site-lisp/lilypond-mode
/home/rrt/.local/share/emacs/site-lisp/lilypond-what-beat hides /usr/share/emacs/site-lisp/lilypond-what-beat
/usr/share/emacs/24.4/site-lisp/cdargs hides /usr/share/emacs/site-lisp/cdargs
/home/rrt/.emacs.d/el-get/cmake-mode/cmake-mode hides /usr/share/emacs/site-lisp/cmake-mode
/home/rrt/.local/share/emacs/site-lisp/lilypond-init hides /usr/share/emacs/site-lisp/lilypond-init
/home/rrt/.local/share/emacs/site-lisp/lilypond-song hides /usr/share/emacs/site-lisp/lilypond-song
/home/rrt/.local/share/emacs/site-lisp/lilypond-indent hides /usr/share/emacs/site-lisp/lilypond-indent
/home/rrt/.local/share/emacs/site-lisp/lilypond-font-lock hides /usr/share/emacs/site-lisp/lilypond-font-lock
/home/rrt/.local/share/emacs/site-lisp/whitespace hides /usr/share/emacs/24.4/lisp/whitespace
/usr/share/emacs24/site-lisp/dictionaries-common/ispell hides /usr/share/emacs/24.4/lisp/textmodes/ispell
/usr/share/emacs/site-lisp/rst hides /usr/share/emacs/24.4/lisp/textmodes/rst
/usr/share/emacs24/site-lisp/dictionaries-common/flyspell hides /usr/share/emacs/24.4/lisp/textmodes/flyspell
/home/rrt/.emacs.d/el-get/flymake/flymake hides /usr/share/emacs/24.4/lisp/progmodes/flymake
/home/rrt/.emacs.d/el-get/cperl-mode/cperl-mode hides /usr/share/emacs/24.4/lisp/progmodes/cperl-mode
Features:
(shadow sort mail-extr emacsbug message rfc822 mml mml-sec mm-decode
mm-bodies mm-encode mailabbrev gmm-utils mailheader sendmail mail-utils
misearch multi-isearch mule-util plain-tex gitignore-mode conf-mode
latexenc preview prv-emacs tex-buf font-latex latex tex-style tex dbus
xml crm tex-mode shell yaml-mode tern url-http tls url-auth mail-parse
rfc2231 rfc2047 rfc2045 ietf-drums url-gw json js3-mode imenu js3-parse
js3-browse js3-highlight js3-ast js3-messages js3-scan js3-util js3-vars
cc-langs js3-externs adaptive-wrap window-margin face-remap flyspell
ispell goto-addr smart-quotes org-element org-indent org-rmail org-mhe
org-irc org-info org-gnus org-docview doc-view jka-compr image-mode
org-bibtex bibtex org-bbdb org-w3m flymake compile paredit info tex-site
sws-mode-autoloads server paren savehist minibuf-eldef icomplete
autorevert filenotify desktop frameset cus-start cus-load iimage org
org-macro org-footnote org-pcomplete pcomplete org-list org-faces
org-entities noutline outline org-version ob-emacs-lisp ob ob-tangle
ob-ref ob-lob ob-table ob-exp org-src ob-keys ob-comint comint
ansi-color ob-core ob-eval org-compat org-macs org-loaddefs format-spec
find-func cal-menu calendar cal-loaddefs go-mode url url-proxy
url-privacy url-expand url-methods url-history url-cookie url-domsuf
url-util mailcap ffap thingatpt url-parse auth-source gnus-util mm-util
mail-prsvr password-cache url-vars dired-x bug-reference-github
bug-reference vc-git undo-tree diff whitespace locate yasnippet derived
po-mode php-mode etags ring cc-mode cc-fonts cc-guess cc-menus cc-cmds
cc-styles cc-align cc-engine cc-vars cc-defs speedbar sb-image ezimage
dframe init-paredit ido-hacks ido magit-autoloads geiser-load geiser
flymake-point filladapt dtrt-indent csv auto-complete-config
auto-complete edmacro kmacro popup init-eldoc eldoc-extension cl-macs
advice eldoc .loaddefs eieio byte-opt eieio-core el-get
el-get-autoloading el-get-list-packages el-get-dependencies el-get-build
el-get-status pp el-get-methods el-get-fossil el-get-svn el-get-pacman
el-get-github-zip el-get-github-tar el-get-http-zip el-get-http-tar
el-get-hg el-get-go el-get-git-svn el-get-fink el-get-emacswiki
el-get-http el-get-notify help-mode easymenu el-get-emacsmirror
el-get-github el-get-git el-get-elpa package epg-config el-get-darcs
el-get-cvs el-get-bzr el-get-brew el-get-builtin el-get-apt-get
el-get-recipes el-get-byte-compile el-get-custom el-get-core autoload
help-fns lisp-mnt bytecomp byte-compile cconv cl gv cl-loaddefs cl-lib
dired user-site-loaddefs debian-el debian-el-loaddefs emacs-goodies-el
emacs-goodies-custom emacs-goodies-loaddefs easy-mmode dpkg-dev-el
dpkg-dev-el-loaddefs devhelp time-date tooltip electric uniquify
ediff-hook vc-hooks lisp-float-type mwheel x-win x-dnd tool-bar dnd
fontset image regexp-opt fringe tabulated-list newcomment lisp-mode
prog-mode register page menu-bar rfn-eshadow timer select scroll-bar
mouse jit-lock font-lock syntax facemenu font-core frame cham georgian
utf-8-lang misc-lang vietnamese tibetan thai tai-viet lao korean
japanese hebrew greek romanian slovak czech european ethiopic indian
cyrillic chinese case-table epa-hook jka-cmpr-hook help simple abbrev
minibuffer nadvice loaddefs button faces cus-face macroexp files
text-properties overlay sha1 md5 base64 format env code-pages mule
custom widget hashtable-print-readable backquote make-network-process
dbusbind gfilenotify dynamic-setting system-font-setting
font-render-setting move-toolbar gtk x-toolkit x multi-tty emacs)
Memory information:
((conses 16 603598 45229)
(symbols 48 50761 0)
(miscs 40 267 443)
(strings 32 171824 23445)
(string-bytes 1 5482137)
(vectors 16 49781)
(vector-slots 8 1578059 70586)
(floats 8 291 310)
(intervals 56 3314 78)
(buffers 960 28)
(heap 1024 62706 2723))
--
http://rrt.sc3d.org/
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#20741; Package emacs. (Fri, 05 Jun 2015 14:09:02 GMT)
Message #8 received at 20741 <at> debbugs.gnu.org (full text, mbox):
As a workaround, I've added "i.e", "e.g" and "etc" to my personal word list.
--
http://rrt.sc3d.org
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#20741; Package emacs. (Fri, 05 Jun 2015 19:25:02 GMT)
Message #11 received at 20741 <at> debbugs.gnu.org (full text, mbox):
> From: Reuben Thomas <rrt <at> sc3d.org>
> Date: Fri, 05 Jun 2015 15:06:40 +0100
>
> flyspell marks as incorrect “etc.”, “i.e.”, “e.g.” &c.
I can reproduce part of this with en_GB, but not with en_US. So I
think it's an issue with the dictionary, not with flyspell or ispell.
> flyspell is of course behaving as expected: “.” is in OTHERCHARS, and as
> it comes after the word, it is not included.
What OTHERCHARS are you looking at? In Emacs 24.4 and later,
ispell.el takes that value from the dictionary's .aff file, not from
the internal database. So if you customized ispell-dictionary-alist,
try without those customizations, you shouldn't need them in v24.4.
> ispell sets my default dictionary to en_GB (from my locale, I presume),
Yes. But you can override that, if you want.
> I tried to move “.” to CASECHARS and NOT-CASECHARS in a custom
> dictionary definition:
>
> ("en_GB" "[[:alpha:].]" "[^[:alpha:].]" "['0-9’-]" t
> ("-d" "en_GB")
> nil utf-8)
You shouldn't need all that in Emacs 24.4. Try not to customize the
dictionary at all.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#20741; Package emacs. (Fri, 05 Jun 2015 21:43:01 GMT)
Message #14 received at 20741 <at> debbugs.gnu.org (full text, mbox):
On 5 June 2015 at 20:23, Eli Zaretskii <eliz <at> gnu.org> wrote:
> > From: Reuben Thomas <rrt <at> sc3d.org>
> > Date: Fri, 05 Jun 2015 15:06:40 +0100
> >
> > flyspell marks as incorrect “etc.”, “i.e.”, “e.g.” &c.
>
> I can reproduce part of this with en_GB, but not with en_US. So I
> think it's an issue with the dictionary, not with flyspell or ispell.
>
The en_US dictionary contains "etc", which is incorrect.
> What OTHERCHARS are you looking at? In Emacs 24.4 and later,
> ispell.el takes that value from the dictionary's .aff file, not from
> the internal database. So if you customized ispell-dictionary-alist,
> try without those customizations, you shouldn't need them in v24.4.
>
Oh dear, after further investigation this turns out to be because Debian
overrides ispell.el and flyspell.el with its own patched versions, which
predate Emacs 24.4 (they are from 2013).
In what follows, I have moved these patched files aside, and am definitely
working with just Emacs 24.4's versions!
Now, still using hunspell, and having removed "i.e", "e.g" and "etc" from
my en_GB spelling list, I get exactly the same highlighting.
> > ispell sets my default dictionary to en_GB (from my locale, I presume),
>
> Yes. But you can override that, if you want.
>
I don't want to override it, it's fine.
When I mention OTHERCHARS, I am looking at the documentation for
ispell-dictionary-alist. Indeed, when I change language, and I am using
hunspell, the language definitions seem to be auto-generated. With
hunspell, OTHERCHARS is set to include ".". But indeed, removing it or
moving it into CASECHARS and NOT-CASECHARS still seems not to help, so I'm
back to my original workaround.
But indeed, apart from when I specifically mentioned customising the
dictionary, I am working with Emacs's default values, not customised at all.
Thanks very much for your help with this.
--
http://rrt.sc3d.org
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#20741; Package emacs. (Sat, 06 Jun 2015 06:50:03 GMT)
Message #17 received at 20741 <at> debbugs.gnu.org (full text, mbox):
> Date: Fri, 5 Jun 2015 22:42:39 +0100
> From: Reuben Thomas <rrt <at> sc3d.org>
> Cc: 20741 <at> debbugs.gnu.org
>
> > flyspell marks as incorrect “etc.”, “i.e.”, “e.g.” &c.
>
> I can reproduce part of this with en_GB, but not with en_US. So I
> think it's an issue with the dictionary, not with flyspell or ispell.
>
> The en_US dictionary contains "etc", which is incorrect.
Not that I'm maintaining those dictionaries, but why is it incorrect,
in your opinion? It clearly produces the desirable effect, doesn't it?
> When I mention OTHERCHARS, I am looking at the documentation for
> ispell-dictionary-alist. Indeed, when I change language, and I am using
> hunspell, the language definitions seem to be auto-generated. With hunspell,
> OTHERCHARS is set to include ".". But indeed, removing it or moving it into
> CASECHARS and NOT-CASECHARS still seems not to help, so I'm back to my original
> workaround.
>
> But indeed, apart from when I specifically mentioned customising the
> dictionary, I am working with Emacs's default values, not customised at all.
>
> Thanks very much for your help with this.
You are welcome.
Does this mean that your problem is solved, and we can close this bug?
Or does something still need to be fixed in Emacs?
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#20741; Package emacs. (Sat, 06 Jun 2015 09:36:02 GMT)
Message #20 received at 20741 <at> debbugs.gnu.org (full text, mbox):
On 6 June 2015 at 07:49, Eli Zaretskii <eliz <at> gnu.org> wrote:
> > Date: Fri, 5 Jun 2015 22:42:39 +0100
> > From: Reuben Thomas <rrt <at> sc3d.org>
> > Cc: 20741 <at> debbugs.gnu.org
> >
> > > flyspell marks as incorrect “etc.”, “i.e.”, “e.g.” &c.
> >
> > I can reproduce part of this with en_GB, but not with en_US. So I
> > think it's an issue with the dictionary, not with flyspell or ispell.
> >
> > The en_US dictionary contains "etc", which is incorrect.
>
> Not that I'm maintaining those dictionaries, but why is it incorrect,
> in your opinion? It clearly produces the desirable effect, doesn't it?
>
No, it produces the undesirable effect of treating "etc" as a correct
spelling.
> Does this mean that your problem is solved, and we can close this bug?
> Or does something still need to be fixed in Emacs?
>
Sorry, I must have been unclear. I still have the original problem:
without the workaround of adding incorrect spellings to my personal
wordlist, "i.e." and "e.g." are marked as wrong in en_GB. I just double
checked this with the following recipe:
1. Rename my ~/.hunspell_en_GB.
2. Start "emacs -Q"
3. M-x flyspell-mode RET
4. M-x customize-variable RET ispell-program-name RET; set to
"/usr/bin/hunspell" (doing this after step 3 because the variable is not
available for customization before loading ispell)
5. Type "etc. i.e. e.g."
All of the above is now red-underwiggled.
--
http://rrt.sc3d.org
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#20741; Package emacs. (Sat, 06 Jun 2015 09:39:02 GMT)
Message #23 received at 20741 <at> debbugs.gnu.org (full text, mbox):
On 6 June 2015 at 10:35, Reuben Thomas <rrt <at> sc3d.org> wrote:
>
>
> All of the above is now red-underwiggled.
>
I should add, if I M-x ispell-change-dictionary
RET american RET, then allow the underlining to refresh, "etc." is no
longer marked as wrong (as we've seen, it's incorrectly in the "american"
word list), but "i.e." and "e.g." are still so marked.
--
http://rrt.sc3d.org
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#20741; Package emacs. (Sat, 06 Jun 2015 09:59:02 GMT)
Message #26 received at 20741 <at> debbugs.gnu.org (full text, mbox):
> Date: Sat, 6 Jun 2015 10:35:46 +0100
> From: Reuben Thomas <rrt <at> sc3d.org>
> Cc: 20741 <at> debbugs.gnu.org
>
> > The en_US dictionary contains "etc", which is incorrect.
>
> Not that I'm maintaining those dictionaries, but why is it incorrect,
> in your opinion? It clearly produces the desirable effect, doesn't it?
>
>
> No, it produces the undesirable effect of treating "etc" as a correct spelling.
I think it's a correct spelling, but that's me.
> Does this mean that your problem is solved, and we can close this bug?
> Or does something still need to be fixed in Emacs?
>
>
> Sorry, I must have been unclear. I still have the original problem: without the
> workaround of adding incorrect spellings to my personal wordlist, "i.e." and
> "e.g." are marked as wrong in en_GB. I just double checked this with the
> following recipe:
>
> 1. Rename my ~/.hunspell_en_GB.
>
> 2. Start "emacs -Q"
>
> 3. M-x flyspell-mode RET
>
> 4. M-x customize-variable RET ispell-program-name RET; set to
> "/usr/bin/hunspell" (doing this after step 3 because the variable is not
> available for customization before loading ispell)
>
> 5. Type "etc. i.e. e.g."
>
> All of the above is now red-underwiggled.
But do you agree that it's not an Emacs problem?
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#20741; Package emacs. (Sat, 06 Jun 2015 10:04:02 GMT)
Message #29 received at 20741 <at> debbugs.gnu.org (full text, mbox):
> Date: Sat, 6 Jun 2015 10:38:17 +0100
> From: Reuben Thomas <rrt <at> sc3d.org>
> Cc: 20741 <at> debbugs.gnu.org
>
> I should add, if I M-x ispell-change-dictionary
> RET american RET, then allow the underlining to refresh, "etc." is no longer
> marked as wrong (as we've seen, it's incorrectly in the "american" word list),
> but "i.e." and "e.g." are still so marked.
That's not what I see here. When en_US is used, neither of these is
flagged as a mis-spelling. When I switch to en_GB, only "etc" and the
"i" in "i.e." are flagged, the rest (including all of "e.g.") are not.
I guess the reason is the different versions of Hunspell dictionaries
we have installed.
Once again, I don't think this is an Emacs problem. Don't you see the
same when you invoke Hunspell as a stand-alone program?
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#20741; Package emacs. (Sat, 06 Jun 2015 10:09:02 GMT)
Message #32 received at 20741 <at> debbugs.gnu.org (full text, mbox):
On 6 June 2015 at 11:03, Eli Zaretskii <eliz <at> gnu.org> wrote:
> > Date: Sat, 6 Jun 2015 10:38:17 +0100
> > From: Reuben Thomas <rrt <at> sc3d.org>
> > Cc: 20741 <at> debbugs.gnu.org
> >
> > I should add, if I M-x ispell-change-dictionary
> > RET american RET, then allow the underlining to refresh, "etc." is no
> > longer marked as wrong (as we've seen, it's incorrectly in the "american"
> > word list), but "i.e." and "e.g." are still so marked.
>
> That's not what I see here. When en_US is used, neither of these is
> flagged as a mis-spelling. When I switch to en_GB, only "etc" and the
> "i" in "i.e." are flagged, the rest (including all of "e.g.") are not.
>
> I guess the reason is the different versions of Hunspell dictionaries
> we have installed.
>
> Once again, I don't think this is an Emacs problem. Don't you see the
> same when you invoke Hunspell as a stand-alone program?
>
$ cat Downloads/foo.txt
i.e. this is not e.g. etc. help!
foxb
$ hunspell -a -d en_GB -i UTF-8 ~/Downloads/foo.txt
@(#) International Ispell Version 3.2.06 (but really Hunspell 1.3.3)
*
*
*
*
*
*
*
& foxb 6 33: fox, fob, foxy, fox b, fixable, faux
Here we see that hunspell doesn't like "foxb" (great!) but is otherwise
happy.
So the problem does not appear to be with hunspell.
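For readers unfamiliar with the `-a` output above, here is a minimal Python sketch of how such a transcript can be read. It assumes the usual ispell pipe-mode conventions (`*` for an accepted word, `& word count offset: suggestions` for a rejected word with suggestions, `#` for a rejected word without any); the function name `parse_ispell_output` is made up for illustration.

```python
def parse_ispell_output(lines):
    """Return (word, suggestions) pairs for every rejected word.

    Assumes ispell-style pipe output: "*" accepts a word,
    "& word n offset: a, b" (or "?" for guesses) rejects it with
    suggestions, and "# word offset" rejects it with none.
    """
    misses = []
    for line in lines:
        line = line.strip()
        if line.startswith(("&", "?")):
            head, _, tail = line.partition(":")
            word = head.split()[1]
            suggestions = [s.strip() for s in tail.split(",") if s.strip()]
            misses.append((word, suggestions))
        elif line.startswith("#"):
            misses.append((line.split()[1], []))
    return misses


# Output copied from the session above.
transcript = [
    "@(#) International Ispell Version 3.2.06 (but really Hunspell 1.3.3)",
    "*", "*", "*", "*", "*", "*", "*",
    "& foxb 6 33: fox, fob, foxy, fox b, fixable, faux",
]
misses = parse_ispell_output(transcript)
accepted = sum(1 for line in transcript if line == "*")
print(accepted, misses)
```

Read this way, the seven `*` lines line up with the seven words on the first line of foo.txt, including "i.e.", "e.g." and "etc.", and only "foxb" is rejected.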
--
http://rrt.sc3d.org
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#20741; Package emacs. (Sat, 06 Jun 2015 10:12:02 GMT)
Message #35 received at 20741 <at> debbugs.gnu.org (full text, mbox):
On 6 June 2015 at 11:08, Reuben Thomas <rrt <at> sc3d.org> wrote:
>
>
> $ cat Downloads/foo.txt
> i.e. this is not e.g. etc. help!
> foxb
>
> $ hunspell -a -d en_GB -i UTF-8 ~/Downloads/foo.txt
> @(#) International Ispell Version 3.2.06 (but really Hunspell 1.3.3)
> *
> *
> *
> *
> *
> *
> *
> & foxb 6 33: fox, fob, foxy, fox b, fixable, faux
>
> Here we see that hunspell doesn't like "foxb" (great!) but is otherwise
> happy.
>
> So the problem does not appear to be with hunspell.
>
I should add, ispell-buffer too only complains about "foxb". It is just
flyspell that complains about the other (correct) spellings.
--
http://rrt.sc3d.org
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#20741; Package emacs. (Sat, 06 Jun 2015 10:38:03 GMT)
Message #38 received at 20741 <at> debbugs.gnu.org (full text, mbox):
> Date: Sat, 6 Jun 2015 11:11:22 +0100
> From: Reuben Thomas <rrt <at> sc3d.org>
> Cc: 20741 <at> debbugs.gnu.org
>
> I should add, ispell-buffer too only complains about "foxb". It is just
> flyspell that complains about the other (correct) spellings.
Then I guess flyspell-word-search-forward/backward is the culprit.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#20741; Package emacs. (Sun, 13 Feb 2022 09:05:02 GMT)
Message #41 received at 20741 <at> debbugs.gnu.org (full text, mbox):
Reuben Thomas <rrt <at> sc3d.org> writes:
> 1. Rename my ~/.hunspell_en_GB.
>
> 2. Start "emacs -Q"
>
> 3. M-x flyspell-mode RET
>
> 4. M-x customize-variable RET ispell-program-name RET; set to "/usr/bin/hunspell"
> (doing this after step 3 because the variable is not available for customization
> before loading ispell)
>
> 5. Type "etc. i.e. e.g."
>
> All of the above is now red-underwiggled.
Easier reproduction case:
----
(setq ispell-program-name "/usr/bin/hunspell")
(ispell-change-dictionary "en_GB")
(flyspell-mode)
etc.
----
[Message part 2 (image/png, inline)]
Note red wiggle under etc. `M-$' on the etc gives me:
[Message part 4 (image/png, inline)]
So flyspell doesn't really understand that a full stop can be part of a
word, apparently? (This is with Emacs 29.)
--
(domestic pets only, the antidote for overdose, milk.)
bloggy blog: http://lars.ingebrigtsen.no
Added tag(s) confirmed. Request was from Lars Ingebrigtsen <larsi <at> gnus.org> to control <at> debbugs.gnu.org. (Sun, 13 Feb 2022 09:05:03 GMT)
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#20741; Package emacs. (Sun, 13 Feb 2022 12:38:01 GMT)
Message #46 received at 20741 <at> debbugs.gnu.org (full text, mbox):
> From: Lars Ingebrigtsen <larsi <at> gnus.org>
> Cc: Eli Zaretskii <eliz <at> gnu.org>, 20741 <at> debbugs.gnu.org
> Date: Sun, 13 Feb 2022 10:04:38 +0100
>
> So flyspell doesn't really understand that a full stop can be part of a
> word, apparently?
Yes; and it normally isn't. Maybe we should have a list of
exceptions?
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#20741; Package emacs. (Sun, 13 Feb 2022 21:34:02 GMT)
Message #49 received at 20741 <at> debbugs.gnu.org (full text, mbox):
On Sun, 13 Feb 2022 at 12:37, Eli Zaretskii <eliz <at> gnu.org> wrote:
>
> Maybe we should have a list of exceptions?
>
As an upstream spellchecker maintainer, I don't think that's a good idea.
Emacs should just be using the spellchecker. If it's not working, the
problem should be fixed in the spellchecker.
As far as I can see, the problem is not specific to flyspell (mea culpa for
the bug title!).
For now, with current hunspell dictionaries, and using either hunspell, or
enchant with hunspell backend, I have used the workaround of adding a few
words like "etc" to my personal word list.
To be honest, I'm not sure Emacs can do much here. As far as I can tell,
hunspell doesn't cope well with characters like "." that normally are
non-word characters, but *can* occur in a word.
Relatedly, see: https://github.com/hunspell/hunspell/issues/361
--
https://rrt.sc3d.org
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#20741; Package emacs. (Mon, 14 Feb 2022 10:46:01 GMT)
Message #52 received at 20741 <at> debbugs.gnu.org (full text, mbox):
Reuben Thomas <rrt <at> sc3d.org> writes:
> To be honest, I'm not sure Emacs can do much here. As far as I can
> tell, hunspell doesn't cope well with characters like "." that
> normally are non-word characters, but *can* occur in a word.
>
> Relatedly, see: https://github.com/hunspell/hunspell/issues/361
So it's a problem on the hunspell side, and not because Emacs is
considering the "." to be a non-word character? (I haven't tried to
debug what's going on.)
There's also a problem in common abbreviations like "i.e.", which is
considered as the words "i" and "e", apparently...
I was wondering whether Emacs could query the backend speller whether it
had the word "foo." in the dictionary before squiggly-lining "foo", but
I'm very unfamiliar with how these functions work.
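The idea in the paragraph above can be sketched as a small decision function. This is only an illustration in Python, not flyspell code: `should_flag` and the `is_known` callback are hypothetical names, and `is_known` stands in for whatever query the backend speller would answer.

```python
def should_flag(word, next_char, is_known):
    """Decide whether `word` should be underlined.

    `next_char` is the character right after the word in the buffer
    (empty string at end of text). `is_known` is a hypothetical callback
    that asks the backend speller whether it accepts a string.
    """
    if is_known(word):
        return False
    # Retry with the trailing period attached before giving up.
    if next_char == "." and is_known(word + "."):
        return False
    return True


# Toy word list standing in for the backend's dictionary.
known = {"etc.", "i.e.", "e.g.", "help"}
print(should_flag("etc", ".", known.__contains__))
print(should_flag("etc", " ", known.__contains__))
```

With this rule "etc" is accepted only when a period actually follows it, so a bare "etc" in running text would still be flagged.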
--
(domestic pets only, the antidote for overdose, milk.)
bloggy blog: http://lars.ingebrigtsen.no
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#20741; Package emacs. (Mon, 14 Feb 2022 13:10:02 GMT)
Message #55 received at 20741 <at> debbugs.gnu.org (full text, mbox):
> I was wondering whether Emacs could query the backend speller whether it
> had the word "foo." in the dictionary before squiggly-lining "foo", but
> I'm very unfamiliar with how these functions work.
Query and/or set WORDCHARS in the respective .aff file.
martin
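As background to the suggestion above: WORDCHARS is a directive in hunspell affix files that lists extra characters the tokenizer may treat as part of a word. A minimal Python sketch of reading it follows; the function name `read_wordchars` and the sample affix text are invented for illustration and are not taken from a real en_GB.aff.

```python
def read_wordchars(aff_text):
    """Return the argument of the first WORDCHARS line, or None if absent."""
    for line in aff_text.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "WORDCHARS":
            return parts[1]
    return None


# Invented sample; a real .aff file contains many more directives.
sample_aff = "SET UTF-8\nTRY esianrtolcdugmphbyfvkwz\nWORDCHARS 0123456789’.\n"
wordchars = read_wordchars(sample_aff)
print("." in (wordchars or ""))
```

Whether "." appears in that list is what decides, in this sketch, whether a period may be considered part of a word for the dictionary in question.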
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#20741; Package emacs. (Mon, 14 Feb 2022 13:36:02 GMT)
Message #58 received at 20741 <at> debbugs.gnu.org (full text, mbox):
> From: Reuben Thomas <rrt <at> sc3d.org>
> Date: Sun, 13 Feb 2022 21:33:32 +0000
> Cc: Lars Ingebrigtsen <larsi <at> gnus.org>, 20741 <at> debbugs.gnu.org
>
> Maybe we should have a list of exceptions?
>
> As an upstream spellchecker maintainer, I don't think that's a good idea. Emacs should just be using the
> spellchecker. If it's not working, the problem should be fixed in the spellchecker.
I don't think I understand what this means in practice. "Use the
spell-checker" how? Do you mean we should not break words on
punctuation characters, or do you mean not to break them only on '.',
or do you mean something else?
Emacs is widely used to edit program sources, where stuff like
"file.attribute" and "list-my-packages" happens quite frequently.
Right now, these are not marked as misspellings, but if we pass them
to the speller with the punctuation, we are likely to get back
indications of misspelled words, which is not what we want. Thus my
questions above: if we want to handle punctuation characters smarter
than just considering them part of the NOT-CASECHARS class, we need to
come up with a specification that will improve the situation, not make
it worse. Can we do that?
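The concern above can be made concrete with a toy comparison. The Python sketch below is illustrative only: the word list `KNOWN` and the helper `misspelled` are hypothetical, and a real speller would of course use its own dictionary.

```python
import re

# Hypothetical word list standing in for the speller's dictionary.
KNOWN = {"file", "attribute", "list", "my", "packages"}


def misspelled(tokens):
    """Return the tokens that are not in the toy word list."""
    return [t for t in tokens if t.lower() not in KNOWN]


text = "file.attribute list-my-packages"

# Current behaviour in spirit: break on punctuation, check the pieces.
split_on_punct = re.findall(r"[A-Za-z]+", text)
# Alternative: hand whole whitespace-delimited tokens to the speller.
whole_tokens = text.split()

print(misspelled(split_on_punct))
print(misspelled(whole_tokens))
```

Breaking on punctuation leaves nothing to flag here, whereas passing the tokens whole reports both of them, which is the regression any smarter punctuation rule would need to avoid.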
> To be honest, I'm not sure Emacs can do much here.
I tend to agree, but maybe we can come up with some minor
improvements, even if they don't solve the problem in its entirety.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#20741; Package emacs. (Mon, 14 Feb 2022 13:44:01 GMT)
Message #61 received at 20741 <at> debbugs.gnu.org (full text, mbox):
On Mon, 14 Feb 2022 at 13:35, Eli Zaretskii <eliz <at> gnu.org> wrote:
>
> I don't think I understand what this means in practice. "Use the
> spell-checker" how? Do you mean we should not break words on
> punctuation characters, or do you mean not to break them only on '.',
> or do you mean something else?
>
> Emacs is widely used to edit program sources, where stuff like
> "file.attribute" and "list-my-packages" happens quite frequently.
>
I wasn't considering this case, and this issue is about checking text (or
comments or strings) where you can just feed the entire thing to the
spellchecker, and not have to isolate words "manually", as in program
source.
In program source (i.e. not strings or comments), the issue currently under
discussion won't arise, as "." cannot be part of an identifier.
--
https://rrt.sc3d.org
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#20741
; Package
emacs
.
(Mon, 14 Feb 2022 15:08:01 GMT)
Full text and
rfc822 format available.
Message #64 received at 20741 <at> debbugs.gnu.org (full text, mbox):
On Mon, 14 Feb 2022 13:43:12 +0000 Reuben Thomas via "Bug reports for GNU Emacs, the Swiss army knife of text editors" <bug-gnu-emacs <at> gnu.org> wrote:
> On Mon, 14 Feb 2022 at 13:35, Eli Zaretskii <eliz <at> gnu.org> wrote:
>
> I don't think I understand what this means in practice. "Use the
> spell-checker" how? Do you mean we should not break words on
> punctuation characters, or do you mean not to break them only on '.',
> or do you mean something else?
>
> Emacs is widely used to edit program sources, where stuff like
> "file.attribute" and "list-my-packages" happens quite frequently.
>
> I wasn't considering this case, and this issue is about checking text
> (or comments or strings) where you can just feed the entire thing to
> the spellchecker, and not have to isolate words "manually", as in
> program source.
>
> In program source (i.e. not strings or comments), the issue currently
> under discussion won't arise, as "." cannot be part of an identifier.
In some languages it can, e.g. R: "Identifiers consist of a sequence of
letters, digits, the period (‘.’) and the underscore."
(https://cran.r-project.org/doc/manuals/r-release/R-lang.html#Identifiers)
Steve Berman
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#20741
; Package
emacs
.
(Mon, 14 Feb 2022 15:22:01 GMT)
Full text and
rfc822 format available.
Message #70 received at 20741 <at> debbugs.gnu.org (full text, mbox):
> From: Reuben Thomas <rrt <at> sc3d.org>
> Date: Mon, 14 Feb 2022 13:43:12 +0000
> Cc: larsi <at> gnus.org, 20741 <at> debbugs.gnu.org
>
> Emacs is widely used to edit program sources, where stuff like
> "file.attribute" and "list-my-packages" happens quite frequently.
>
> I wasn't considering this case, and this issue is about checking text (or comments or strings) where you can
> just feed the entire thing to the spellchecker, and not have to isolate words "manually", as in program
> source.
It's the same case: references to variables and other symbols in
comments and strings of a program are very frequent. They are also
very frequent in email messages which discuss programming, such as
this discussion (I have Flyspell turned on in all my email buffers).
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#20741
; Package
emacs
.
(Mon, 14 Feb 2022 15:29:02 GMT)
Full text and
rfc822 format available.
Message #73 received at 20741 <at> debbugs.gnu.org (full text, mbox):
On Mon, 14 Feb 2022 at 15:21, Eli Zaretskii <eliz <at> gnu.org> wrote:
>
> It's the same case: references to variables and other symbols in
> comments and strings of a program are very frequent. They are also
> very frequent in email messages which discuss programming, such as
> this discussion (I have Flyspell turned on in all my email buffers).
>
I think we can distinguish 3 different problems here:
1. Natural language spellchecking. That's what this issue is about.
2. Spell-checking code. (Essentially, identifiers.)
3. Finding code inside natural language, and checking it as if it were
code. (That's what you're talking about here.) This is not a spellchecking
problem, it's a problem of identifying which spell-checking apparatus to
use, rather like font-lock for multi-language buffers. It's hard to see how
to do it without some syntactic clue (e.g. the use of backticks in
markdown), as used in multi-language buffers for font-locking.
--
https://rrt.sc3d.org
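One way the "syntactic clue" idea in case 3 might look, as a rough sketch only: Flyspell consults `flyspell-generic-check-word-predicate` (an existing variable) to decide whether a word should be checked. The predicate below is hypothetical, handles only single-line backtick spans, and assumes it is called with point at or just after the word in question.

```elisp
;; Hypothetical predicate: skip words that sit inside a backtick span
;; on the current line.  An odd number of backticks between the start
;; of the line and point is taken to mean "inside code".
(defun my-flyspell-outside-backticks-p ()
  "Return non-nil if point is not inside a backtick span on this line."
  (let ((count 0)
        (limit (point)))
    (save-excursion
      (beginning-of-line)
      (while (search-forward "`" limit t)
        (setq count (1+ count))))
    (zerop (% count 2))))

;; Possible use in a Markdown-like buffer (replaces any predicate the
;; major mode has already installed):
;; (setq-local flyspell-generic-check-word-predicate
;;             #'my-flyspell-outside-backticks-p)
```

This does nothing for code that appears without any markup, which is the harder part of case 3.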
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#20741
; Package
emacs
.
(Mon, 14 Feb 2022 15:29:02 GMT)
Full text and
rfc822 format available.
Message #76 received at 20741 <at> debbugs.gnu.org (full text, mbox):
On Mon, 14 Feb 2022 at 10:45, Lars Ingebrigtsen <larsi <at> gnus.org> wrote:
> Reuben Thomas <rrt <at> sc3d.org> writes:
>
> > To be honest, I'm not sure Emacs can do much here. As far as I can
> > tell, hunspell doesn't cope well with characters like "." that
> > normally are non-word characters, but *can* occur in a word.
> >
> > Relatedly, see: https://github.com/hunspell/hunspell/issues/361
>
> So it's a problem on the hunspell side, and not because Emacs is
> considering the "." to be a non-word character? (I haven't tried to
> debug what's going on.)
For natural language, yes.
> There's also a problem in common abbreviations like "i.e.", which is
> considered as the words "i" and "e", apparently...
>
This is indeed the case, and it's not normally a problem because Emacs does
not spellcheck words so short.
> I was wondering whether Emacs could query the backend speller whether it
> had the word "foo." in the dictionary before squiggly-lining "foo", but
> I'm very unfamiliar with how these functions work.
>
It could!
--
https://rrt.sc3d.org
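The lookup suggested above (ask the speller about "foo." before flagging "foo") could be sketched roughly as follows. This is not code from ispell.el or flyspell.el: the function name is hypothetical, and KNOWN-P is a stand-in for whatever query to the backend speller would actually be used.

```elisp
;; Hypothetical helper; KNOWN-P stands in for a real dictionary query.
(defun my-abbrev-aware-misspelled-p (word next-char known-p)
  "Return non-nil if WORD should be flagged as misspelled.
NEXT-CHAR is the character following WORD in the buffer, or nil.
KNOWN-P is a function returning non-nil for words the speller accepts."
  (not (or (funcall known-p word)
           ;; Only retry with a period if one really follows the word.
           (and (eq next-char ?.)
                (funcall known-p (concat word "."))))))

;; Toy dictionary for demonstration purposes.
(let ((known-p (lambda (w) (member w '("etc." "and" "so")))))
  (list (my-abbrev-aware-misspelled-p "etc" ?. known-p)
        (my-abbrev-aware-misspelled-p "etc" ?\s known-p)
        (my-abbrev-aware-misspelled-p "and" ?. known-p)))
;; => (nil t nil)
```

The second lookup costs an extra round trip to the speller only for words that would otherwise be flagged and that are followed by a period.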
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#20741
; Package
emacs
.
(Mon, 14 Feb 2022 16:43:02 GMT)
Full text and
rfc822 format available.
Message #79 received at 20741 <at> debbugs.gnu.org (full text, mbox):
> From: Reuben Thomas <rrt <at> sc3d.org>
> Date: Mon, 14 Feb 2022 15:27:54 +0000
> Cc: larsi <at> gnus.org, 20741 <at> debbugs.gnu.org
>
> It's the same case: references to variables and other symbols in
> comments and strings of a program are very frequent. They are also
> very frequent in email messages which discuss programming, such as
> this discussion (I have Flyspell turned on in all my email buffers).
>
> I think we can distinguish 3 different problems here:
>
> 1. Natural language spellchecking. That's what this issue is about.
> 2. Spell-checking code. (Essentially, identifiers.)
> 3. Finding code inside natural language, and checking it as if it were code. (That's what you're talking about
> here.) This is not a spellchecking problem, it's a problem of identifying which spell-checking apparatus to
> use, rather like font-lock for multi-language buffers. It's hard to see how to do it without some syntactic clue
> (e.g. the use of backticks in markdown), as used in multi-language buffers for font-locking.
Like I said: when we talk about this stuff in email, it's both case 1
and case 3.
Anyway: what are the practical proposals for improving this? Are we
going to handle only periods, or does anyone have a more general
solution?
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#20741
; Package
emacs
.
(Mon, 14 Feb 2022 17:02:02 GMT)
Full text and
rfc822 format available.
Message #82 received at 20741 <at> debbugs.gnu.org (full text, mbox):
On Mon, 14 Feb 2022 at 16:42, Eli Zaretskii <eliz <at> gnu.org> wrote:
> > From: Reuben Thomas <rrt <at> sc3d.org>
> >
> > I think we can distinguish 3 different problems here:
> >
> > 1. Natural language spellchecking. That's what this issue is about.
> > 2. Spell-checking code. (Essentially, identifiers.)
> > 3. Finding code inside natural language, and checking it as if it were
> > code. (That's what you're talking about here.) This is not a
> > spellchecking problem, it's a problem of identifying which
> > spell-checking apparatus to use, rather like font-lock for
> > multi-language buffers. It's hard to see how to do it without some
> > syntactic clue (e.g. the use of backticks in markdown), as used in
> > multi-language buffers for font-locking.
>
> Anyway: what are the practical proposals for improving this? Are we
> going to handle only periods, or does anyone have a more general
> solution?
>
Emacs does not currently try to handle case 3, as far as I know. That would
be a medium-sized project in its own right.
Case 2 depends on per-programming-language syntax tables, not on the
spellchecker. I don't know what the current arrangements for this are.
Case 1, on which everything else depends, is a matter of Emacs sending
stretches of text to the spell-checker, which currently (at least with
Hunspell) does not deal with punctuation very well, though there are plans
to improve Hunspell, according to the issue I linked to earlier.
None of these would benefit from special-casing treatment of the period or
any other character in ispell.el, I think.
--
https://rrt.sc3d.org
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#20741
; Package
emacs
.
(Mon, 14 Feb 2022 18:10:01 GMT)
Full text and
rfc822 format available.
Message #85 received at 20741 <at> debbugs.gnu.org (full text, mbox):
> Case 1, on which everything else depends, is a matter of Emacs sending
> stretches of text to the spell-checker, which currently (at least with
> Hunspell) does not deal with punctuation very well, though there are plans
> to improve Hunspell, according to the issue I linked to earlier.
>
> None of these would benefit from special-casing treatment of the period or
> any other character in ispell.el, I think.
Attached find my code which works with Hunspell and utf-8 coded buffers
only. Add to your .emacs something like
(load "~/speck.el" nil t)
(custom-set-variables
 '(speck-dictionaries-alist
   '((0 "en_US" nil nil)
     (1 "de_AT" nil nil)
     (2 "en_US" ("de_AT") nil)))
 '(speck-wordchars-alist
   '(("en_US" "'´`" nil)
     ("fr_FR" "’'" nil)
     ("en_US,fr_FR" "'´`’'" nil))))
(global-set-key [(f7)] 'speck-mode)
Customize 'speck-wordchars-alist' to your liking (the above are the
values I use; you will probably want to add entries for en_GB), hit F7
in some window you want to check, and tell me whether special-casing
characters with that option works.
martin
[speck.el (text/x-emacs-lisp, attachment)]
This bug report was last modified 3 years and 119 days ago.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.