GNU bug report logs - #62898
29.0.90; X can’t be input by the current input method [chinese-ctlaub]
Reported by: Van Ly <van.ly <at> sdf.org>
Date: Mon, 17 Apr 2023 12:19:02 UTC
Severity: normal
Found in version 29.0.90
Done: Eli Zaretskii <eliz <at> gnu.org>
Bug is archived. No further changes may be made.
To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 62898 in the body.
You can then email your comments to 62898 AT debbugs.gnu.org in the normal way.
Report forwarded to bug-gnu-emacs <at> gnu.org: bug#62898; Package emacs. (Mon, 17 Apr 2023 12:19:02 GMT)
Acknowledgement sent to Van Ly <van.ly <at> sdf.org>: New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Mon, 17 Apr 2023 12:19:02 GMT)
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
Where X is the character at codepoint
#x6a58
#x6a59
with chinese-ctlaub set as the current input method, quail-show-key reports
#x6a58 => X can’t be input by the current input method
#x6a59 => To input ‘X’, type "chaang"
Steps to reproduce
- emacs -Q
- switch to buffer in plain Fundamental mode, C-x b bbb
- M-x set-input-method RET chinese-ctlaub
- mouse copy, paste the two symbols from chart at row 1, column 2 and 3
- put cursor over symbol and apply M-x quail-show-key
chart from Shuowen's tree section
- https://humanum.arts.cuhk.edu.hk/Lexis/lexi-mf/shuowenRadical.php?rad=%E6%9C%A8
unexpected result
- the current input method won't learn new input key sequence for symbol
expected result
- the current input method learns new input key sequence for symbol
The current input method being chinese-ctlaub learns a new input key
sequence for symbol flowing the way a word that does not occur in the
wordlist can be added for future personal spell checking.
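For readers who cannot see the glyphs behind the placeholder X, the two codepoints can be turned back into characters with a short sketch. Python is used here purely for illustration. It is not part of Emacs or of the original report, and the helper name `describe` is made up for this example.

```python
import unicodedata

# The two codepoints from the report, written #x6a58 and #x6a59 in Emacs notation.
CODEPOINTS = [0x6A58, 0x6A59]


def describe(cp: int) -> str:
    """Return 'U+XXXX <character> <Unicode name>' for one codepoint."""
    ch = chr(cp)
    return f"U+{cp:04X} {ch} {unicodedata.name(ch)}"


for cp in CODEPOINTS:
    print(describe(cp))
```

The printed characters can then be pasted into an Emacs buffer in place of the mouse copy from the chart.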
[bug-gnu-emacs-report.text (application/octet-stream, attachment)]
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#62898; Package emacs. (Thu, 20 Apr 2023 08:12:01 GMT)
Message #8 received at 62898 <at> debbugs.gnu.org (full text, mbox):
> Date: Mon, 17 Apr 2023 12:11:33 GMT
> From: Van Ly <van.ly <at> sdf.org>
>
> Where X is character codepoint
>
> #x6a58
> #x6a59
>
> with chinese-ctlaub set the current input method, quail-show-key has
>
> #x6a58 => X can’t be input by the current input method
> #x6a59 => To input ‘X’, type "chaang"
>
> Steps to reproduce
> - emacs -Q
> - switch to buffer in plain Fundamental mode, C-x b bbb
> - M-x set-input-method RET chinese-ctlaub
> - mouse copy, paste the two symbols from chart at row 1, column 2 and 3
> - put cursor over symbol and apply M-x quail-show-key
The chinese-ctlaub input method is produced from the file
CTLau-b5.html, and that file doesn't include #x6a58.
I cannot find a newer version of CTLau-b5.html on the Internet, if
there is a newer version. I also don't know why #x6a58 is missing
from the file we have: whether it's a mistake, omission, or there's
some real reason for that.
Are there any newer sources for this input method which we could use?
> chart from Shuowen's tree section
> - https://humanum.arts.cuhk.edu.hk/Lexis/lexi-mf/shuowenRadical.php?rad=%E6%9C%A8
That page is in Chinese, and I cannot read nor understand it. What
does it say that is relevant to this issue?
> unexpected result
> - the current input method won't learn new input key sequence for symbol
>
> expected result
> - the current input method learns new input key sequence for symbol
>
> The current input method being chinese-ctlaub learns a new input key
> sequence for symbol flowing the way a word that does not occur in the
> wordlist can be added for future personal spell checking.
I don't understand what you are saying here, sorry. What do you mean
by "current input method learns new input key sequence"? AFAIK, input
methods don't learn any key sequences, they just support key sequences
that are part of the IM's definition.
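The point about a fixed definition can be pictured with a toy model. This is a sketch only, not how Quail is implemented. The table below is a hypothetical miniature in which only the "chaang" entry is taken from the report, and `show_key` merely imitates the two messages quoted above.

```python
# Hypothetical miniature of an input-method definition: key sequence -> candidates.
# Only the "chaang" -> U+6A59 entry comes from the report; the real table is far larger.
TABLE = {"chaang": ["\u6a59"]}


def show_key(char: str, table: dict) -> str:
    """Imitate the two quail-show-key messages by a reverse lookup in a static table."""
    keys = [key for key, candidates in table.items() if char in candidates]
    if not keys:
        return f"{char} can’t be input by the current input method"
    return f"To input ‘{char}’, type " + ", ".join(f'"{key}"' for key in keys)


print(show_key("\u6a58", TABLE))  # U+6A58 is absent from the table
print(show_key("\u6a59", TABLE))  # U+6A59 is listed under "chaang"
```

Nothing in this model changes at run time: a character becomes typeable only when someone edits the table itself.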
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#62898; Package emacs. (Thu, 20 Apr 2023 19:40:02 GMT)
Message #11 received at 62898 <at> debbugs.gnu.org (full text, mbox):
> Date: Thu, 20 Apr 2023 11:11:07 +0300
> From: Eli Zaretskii <eliz <at> gnu.org>
> Cc: 62898 <at> debbugs.gnu.org
> Content-type: text/plain; charset=utf-8
>
> The chinese-ctlaub input method is produced from the file
> CTLau-b5.html, and that file doesn't include #x6a58.
>
> I cannot find a newer version of CTLau-b5.html on the Internet, if
> there is a newer version. I also don't know why #x6a58 is missing
> from the file we have: whether it's a mistake, omission, or there's
> some real reason for that.
>
Perhaps at the time when this CTLau-b5.html was composed that was the
extent of what was known or the authors hadn't encountered a use for
it to be included. Those two codepoints are documented at page 185 of
U4E00.pdf . They represent two forms of citrus fruit.
- https://www.unicode.org/charts/index.html
- https://www.unicode.org/charts/PDF/U4E00.pdf
Looking at the below for #x6a58
- https://humanum.arts.cuhk.edu.hk/Lexis/lexi-mf/search.php?word=%E6%A9%98
The left margin section, at the bottom, has a drop down menu and the
CTLau phonology is obtained by selecting the bottom option.
There the phonological reading for #x6a58 is given by gat and gwat.
Looking at the CTLau-b input sequence \foh for
- https://humanum.arts.cuhk.edu.hk/Lexis/lexi-mf/search.php?word=%E7%85%92
the humanum site gives the reading wai. Does anyone remember why some
CTLau-b input sequences have a backslash prefix? My guess is that the
sound foh refers to the fire character used as an index, which might
have been a mnemonic for the authors of CTLau-b5.html.
> Are there any newer sources for this input method which we could use?
- http://sdf.org/~van.ly/img/x6a58--gat--gwat--add-to-CTLau-b5.jpg
I would like to use the phonological readings given at the humanum website.
I've tried to contact them but got no reply. It would be neat if they
updated the CTLau-b5.html file.
>
> > chart from Shuowen's tree section
> > - https://humanum.arts.cuhk.edu.hk/Lexis/lexi-mf/shuowenRadical.php?rad=%E6%9C%A8
>
> That page is in Chinese, and I cannot read nor understand it. What
> does it say that is relevant to this issue?
>
That page is a random page in the Shuowen dictionary with two
codepoints for orange and tangerine citrus fruit which I thought would
be in common use enough to be covered by the CTLaub input method. One
of them is not available. I am hoping for an accommodation to allow
updating the CTLaub input method. Perhaps the user can compose a
CTLau-b5-extend.html for use.
> > unexpected result
> > - the current input method won't learn new input key sequence for symbol
> >
> > expected result
> > - the current input method learns new input key sequence for symbol
> >
> > The current input method being chinese-ctlaub learns a new input key
> > sequence for symbol flowing the way a word that does not occur in the
> > wordlist can be added for future personal spell checking.
>
> I don't understand what you are saying here, sorry. What do you mean
> by "current input method learns new input key sequence"? AFAIK, input
> methods don't learn any key sequences, they just support key sequences
> that are part of the IM's definition.
>
I was likening the input method function to spellchecking. The input
method function does not accommodate updates but the spellchecker will
let you add new word spellings.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#62898; Package emacs. (Sat, 22 Apr 2023 09:43:02 GMT)
Message #14 received at 62898 <at> debbugs.gnu.org (full text, mbox):
> Date: Thu, 20 Apr 2023 19:39:26 GMT
> From: Van Ly <van.ly <at> sdf.org>
> Cc: 62898 <at> debbugs.gnu.org
>
>
> > Date: Thu, 20 Apr 2023 11:11:07 +0300
> > From: Eli Zaretskii <eliz <at> gnu.org>
> > Cc: 62898 <at> debbugs.gnu.org
> > Content-type: text/plain; charset=utf-8
> >
> > The chinese-ctlaub input method is produced from the file
> > CTLau-b5.html, and that file doesn't include #x6a58.
> >
> > I cannot find a newer version of CTLau-b5.html on the Internet, if
> > there is a newer version. I also don't know why #x6a58 is missing
> > from the file we have: whether it's a mistake, omission, or there's
> > some real reason for that.
> >
>
> Perhaps at the time when this CTLau-b5.html was composed that was the
> extent of what was known or the authors hadn't encountered a use for
> it to be included. Those two codepoints are documented at page 185 of
> U4E00.pdf . They represent two forms of citrus fruit.
>
> - https://www.unicode.org/charts/index.html
> - https://www.unicode.org/charts/PDF/U4E00.pdf
>
> Looking at the below for #x6a58
>
> - https://humanum.arts.cuhk.edu.hk/Lexis/lexi-mf/search.php?word=%E6%A9%98
>
> The left margin section, at the bottom, has a drop down menu and the
> CTLau phonology is obtained by selecting the bottom option.
>
> There the phonological reading for #x6a58 is given by gat and gwat.
We already have GAT and GWAT in CTLau-b5.html. Are you saying we
should add #x6a58 to the list of characters in those 2 lines?
> Looking at the CTLau-b input sequence for \foh
>
> - https://humanum.arts.cuhk.edu.hk/Lexis/lexi-mf/search.php?word=%E7%85%92
>
> the humanum gives the reading wai, is there any memory for some
> CTLau-b input sequences having the backslash prefix? The sounding foh
> refers to the fire character indexical is my guess which might have
> been a mnemonic for the authors of CTLau-b5.html .
Sorry, I don't understand: the above is for a different Unicode
codepoint, U+7152. How is that relevant to the issue at hand?
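Which codepoint each humanum link refers to can be checked by percent-decoding its query string. The following is a small sketch in Python, used for illustration only. The URLs are the ones quoted in this thread, and the helper name `query_codepoints` is made up for this example.

```python
from urllib.parse import parse_qs, urlsplit

# The three humanum URLs quoted in this thread.
URLS = [
    "https://humanum.arts.cuhk.edu.hk/Lexis/lexi-mf/shuowenRadical.php?rad=%E6%9C%A8",
    "https://humanum.arts.cuhk.edu.hk/Lexis/lexi-mf/search.php?word=%E6%A9%98",
    "https://humanum.arts.cuhk.edu.hk/Lexis/lexi-mf/search.php?word=%E7%85%92",
]


def query_codepoints(url: str) -> list:
    """Return 'U+XXXX' for every character found in the URL's query values."""
    params = parse_qs(urlsplit(url).query)  # percent-decodes as UTF-8 by default
    return [
        f"U+{ord(ch):04X}"
        for values in params.values()
        for value in values
        for ch in value
    ]


for url in URLS:
    print(query_codepoints(url), url)
```

The second link decodes to U+6A58, the character under discussion, while the third decodes to U+7152.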
> > Are there any newer sources for this input method which we could use?
>
> - http://sdf.org/~van.ly/img/x6a58--gat--gwat--add-to-CTLau-b5.jpg
I don't understand how to interpret that image, sorry.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#62898; Package emacs. (Sat, 22 Apr 2023 17:11:01 GMT)
Message #17 received at 62898 <at> debbugs.gnu.org (full text, mbox):
> Date: Sat, 22 Apr 2023 12:42:32 +0300
> From: Eli Zaretskii <eliz <at> gnu.org>
> Cc: 62898 <at> debbugs.gnu.org
> Content-type: text/plain; charset=utf-8
>
> > Date: Thu, 20 Apr 2023 19:39:26 GMT
> > From: Van Ly <van.ly <at> sdf.org>
> > Cc: 62898 <at> debbugs.gnu.org
> >
> >
> > Looking at the below for #x6a58
> >
> > - https://humanum.arts.cuhk.edu.hk/Lexis/lexi-mf/search.php?word=%E6%A9%98
> >
> > The left margin section, at the bottom, has a drop down menu and the
> > CTLau phonology is obtained by selecting the bottom option.
> >
> > There the phonological reading for #x6a58 is given by gat and gwat.
>
> We already have GAT and GWAT in CTLau-b5.html. Are you saying we
> should add #x6a58 to the list of characters in those 2 lines?
>
Yes, homonym GAT expands to four graphs. So, add one. Do the same
for GWAT. It would be like adding a new word or variant spelling to
the spellchecker interactive flow.
You will see the homonym CHO expands to 17 graphs. Maybe there are
more to add there.
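The requested change amounts to appending one character to two existing candidate lists. Below is a sketch of that idea in Python, for illustration only. The entries "g1" to "g4" and "w1" are placeholders, not the real contents of CTLau-b5.html, the number of GWAT entries is arbitrary, and `add_candidate` is a made-up helper.

```python
def add_candidate(table: dict, key: str, char: str) -> dict:
    """Append char to the candidate list for key unless it is already there."""
    candidates = table.setdefault(key, [])
    if char not in candidates:
        candidates.append(char)
    return table


# Placeholder data: GAT is said above to expand to four graphs; the GWAT count is arbitrary.
table = {"gat": ["g1", "g2", "g3", "g4"], "gwat": ["w1"]}

# Add U+6A58 under both readings given by the humanum site.
for key in ("gat", "gwat"):
    add_candidate(table, key, "\u6a58")

print(table)
```

Appending at the end leaves the order of the existing candidates unchanged, so keys that users have already memorised keep selecting the same characters.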
> > Looking at the CTLau-b input sequence for \foh
> >
> > - https://humanum.arts.cuhk.edu.hk/Lexis/lexi-mf/search.php?word=%E7%85%92
> >
> > the humanum gives the reading wai, is there any memory for some
> > CTLau-b input sequences having the backslash prefix? The sounding foh
> > refers to the fire character indexical is my guess which might have
> > been a mnemonic for the authors of CTLau-b5.html .
>
> Sorry, I don't understand: the above is for a different Unicode
> codepoint, U+7152. How is that relevant to the issue at hand?
The homonym \FOH has a backslash prefix; the other input sequences
don't have a prefix symbol. I don't know why that convention exists. At
a guess, the prefix adds a different style of input sequence. Maybe
that is explained in documentation I haven't reached.
>
> > > Are there any newer sources for this input method which we could use?
> >
> > - http://sdf.org/~van.ly/img/x6a58--gat--gwat--add-to-CTLau-b5.jpg
>
> I don't understand how to interpret that image, sorry.
>
I included the picture to accompany the above wording on navigating to
the dropdown menu and to show the GAT and GWAT homonyms as displayed,
as a help, and to suggest how the phonological reading, once looked up,
can be used to augment entries not found in the received CTLau-b5.html.
You will also see there are TOFU graphs, so the situation is an
improving but incomplete work in progress.
Reply sent to Eli Zaretskii <eliz <at> gnu.org>: You have taken responsibility. (Tue, 25 Apr 2023 14:42:02 GMT)
Notification sent to Van Ly <van.ly <at> sdf.org>: bug acknowledged by developer. (Tue, 25 Apr 2023 14:42:02 GMT)
Message #22 received at 62898-done <at> debbugs.gnu.org (full text, mbox):
> Date: Sat, 22 Apr 2023 17:10:24 GMT
> From: Van Ly <van.ly <at> sdf.org>
> Cc: 62898 <at> debbugs.gnu.org
>
> > We already have GAT and GWAT in CTLau-b5.html. Are you saying we
> > should add #x6a58 to the list of characters in those 2 lines?
> >
>
> Yes, homonym GAT expands to four graphs. So, add one. Do the same
> for GWAT. It would be like adding a new word or variant spelling to
> the spellchecker interactive flow.
OK, I've now done that on the master branch, and I'm closing this bug.
Thanks.
bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Wed, 24 May 2023 11:24:10 GMT)
This bug report was last modified 2 years and 28 days ago.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.