GNU bug report logs - #30815
26.0.91; unicode right single quote mark with syntax entry of w not respected by forward-word


Package: emacs;

Reported by: Aaron Jensen <aaronjensen <at> gmail.com>

Date: Wed, 14 Mar 2018 00:16:01 UTC

Severity: wishlist

Tags: confirmed, notabug

Merged with 10494, 13129

Found in versions 24.0.92, 24.1, 25.1, 26.0.91

Done: Noam Postavsky <npostavs <at> gmail.com>

Bug is archived. No further changes may be made.



Message #29 received at 30815 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Aaron Jensen <aaronjensen <at> gmail.com>
Cc: 30815 <at> debbugs.gnu.org, npostavs <at> gmail.com
Subject: Re: bug#30815: 26.0.91;
 unicode right single quote mark with syntax entry of w not respected
 by forward-word
Date: Wed, 14 Mar 2018 18:09:11 +0200
> From: Aaron Jensen <aaronjensen <at> gmail.com>
> Date: Tue, 13 Mar 2018 19:43:03 -0700
> Cc: 30815 <at> debbugs.gnu.org
> 
> Is the case for this when two words of different scripts are next to
> each other? I suppose that'd make sense if they also didn't have a
> space between them for some reason (probably possible with some
> languages)
> 
> > Maybe there is some other way to get the wanted behaviour though.  I
> > also found Bug#13129 asking about this.
> 
> Since the right quote is part of the General Punctuation script
> (afaict), what if some scripts, like that one, were treated neutrally?
> That is, they derived their behavior solely from the syntax entry.
> This is probably a bad idea for a number of reasons I don't even know
> I don't know.

We could introduce a new script, say, 'punctuation', and treat that
specially.  Doing that with all the characters of the 'symbol' script
is probably not what users will expect; I invite you to look in
charscript.el, where you will see which Unicode blocks belong to
'symbol'.  So we will have to decide which characters to assign to
that new script.
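
To make the idea concrete, here is a toy sketch in Python (not Emacs
code) of word motion that stops where the script changes, with an
optional set of characters assigned to a neutral 'punctuation' script.
Everything in it is an assumption made for illustration: the names
script_of and forward_word, the coarse script labels, and the rule
itself are hypothetical and do not reproduce Emacs internals.

```python
RIGHT_QUOTE = "\u2019"  # RIGHT SINGLE QUOTATION MARK


def script_of(ch, neutral_chars):
    """Return a coarse, made-up script label for ch (illustration only)."""
    if ch in neutral_chars:
        return "punctuation"  # the hypothetical neutral script
    if ch == RIGHT_QUOTE:
        return "symbol"       # assumed default, as the message implies
    if ch.isalpha():
        return "latin" if ch.isascii() else "other"
    return None


def forward_word(text, pos, word_chars, neutral_chars=frozenset()):
    """Return the index just past the next word at or after pos.

    word_chars plays the role of extra characters given word syntax.
    A change of script between word characters ends the word, unless
    the character belongs to the neutral script.
    """
    def is_word(c):
        return c.isalpha() or c in word_chars

    n = len(text)
    # Skip anything that is not a word constituent.
    while pos < n and not is_word(text[pos]):
        pos += 1
    current = None
    while pos < n and is_word(text[pos]):
        script = script_of(text[pos], neutral_chars)
        if script != "punctuation":
            if current is not None and script != current:
                break  # script changed: treat this as a word boundary
            current = script
        pos += 1
    return pos


text = "don" + RIGHT_QUOTE + "t stop"
print(forward_word(text, 0, {RIGHT_QUOTE}))                 # stops before the quote
print(forward_word(text, 0, {RIGHT_QUOTE}, {RIGHT_QUOTE}))  # stops after "t"
```

In this model, word syntax alone does not carry the motion past the
quote; only moving the character into the neutral script does, which
is why the choice of characters for that script matters.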

But before we discuss this issue more, I think we need to talk about
the goals.  E.g., is this only for text-derived modes, or also for
programming modes?  More generally, why did you want to change the
syntax entry of ’ ?

The next question is what other characters need this special handling,
and how many of them are there?

Armed with answers to these questions, we could then decide how to
implement the requested feature, if at all.

Btw, you should know that in some quarters using ’ as an apostrophe is
anathema: they maintain one should use U+02BC MODIFIER LETTER
APOSTROPHE instead, in particular because it doesn't have the script
disparity issue in this context.  See, for example, this URL:

  https://tedclancy.wordpress.com/2015/06/03/which-unicode-character-should-represent-the-english-apostrophe-and-why-the-unicode-committee-is-very-wrong/
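
The difference between the two characters can be checked against the
Unicode Character Database.  A minimal sketch using Python's standard
unicodedata module (the helper name char_info is made up for this
example): the quotation mark is classified as punctuation, while the
modifier letter is classified as a letter.

```python
import unicodedata


def char_info(ch):
    """Return (code point, Unicode name, general category) for ch."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))


for ch in ("\u2019", "\u02bc"):
    print(*char_info(ch))
# U+2019 RIGHT SINGLE QUOTATION MARK Pf
# U+02BC MODIFIER LETTER APOSTROPHE Lm
```

Pf is "Punctuation, final quote" and Lm is "Letter, modifier".  This
shows only the general category; the script assignment Emacs uses
comes from its own tables and is not queried here.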




This bug report was last modified 7 years and 66 days ago.



GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.