GNU bug report logs - #52918: 29.0.50; to make use of ucd/Unihan_Readings.txt for kDefinition entry
To reply to this bug, email your comments to 52918 AT debbugs.gnu.org.
Report forwarded to bug-gnu-emacs <at> gnu.org: bug#52918; Package emacs. (Fri, 31 Dec 2021 17:56:01 GMT)
Acknowledgement sent to Van Ly <van.ly <at> sdf.org>: New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Fri, 31 Dec 2021 17:56:01 GMT)
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
Hello,
I was looking in the master's emacs/admin/notes subdirectory and
found the unicode file. It has a list of files from the ucd and has
left out:
. Unihan_Readings.txt
Like how quail-show-key helps by showing in the minibuffer the input
sequence needed to type a character for a specific input method, can
there be a function called quail-show-unihan that exposes in the
minibuffer the kDefinition entry associated with the East Asian
character from ucd/Unihan_Readings.txt?
--
vl
[bug-gnu-emacs-29.text (text/plain, attachment)]
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#52918; Package emacs. (Mon, 03 Jan 2022 13:55:01 GMT)
Message #8 received at 52918 <at> debbugs.gnu.org (full text, mbox):
> Date: Fri, 31 Dec 2021 17:55:01 +0000 (UTC)
> From: Van Ly <van.ly <at> sdf.org>
>
> I was looking in the master's emacs/admin/notes subdirectory and
> found the unicode file. It has a list of files from the ucd and has
> left out:
>
> . Unihan_Readings.txt
>
> Like how quail-show-key helps by showing in the minibuffer the input
> sequence needed to type a character for a specific input method, can
> there be a function called quail-show-unihan that exposes in the
> minibuffer the kDefinition entry associated with the East Asian
> character from ucd/Unihan_Readings.txt?
Yes, this could be added to Emacs, and IMO would be a useful feature.
Suggested implementation:
. import the Unihan_Readings.txt file into Emacs
. add Makefile rules to produce a uni-unihan-readings.el file from
Unihan_Readings.txt, which defines a char-table where each
character has its kDefinition property value
. code a minor mode which will show in the echo area the value of
the kDefinition property, if any, of the character at point
Patches welcome.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#52918; Package emacs. (Tue, 04 Jan 2022 15:14:02 GMT)
Message #11 received at 52918 <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
On Mon, 3 Jan 2022, Eli Zaretskii wrote:
>
> Suggested implementation:
>
> . import the Unihan_Readings.txt file into Emacs
>
> Patches welcome.
>
Attached is the diff listing for admin/unidata/README to source
Unihan_Readings.txt from
=> https://www.unicode.org/Public/UCD/latest/ucd/Unihan.zip
The version-specific path alternatives are
=> https://www.unicode.org/Public/14.0.0/ucd/Unihan.zip
=> https://www.unicode.org/Public/15.0.0/ucd/Unihan-15.0.0d1.zip
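The download step could be scripted roughly as follows. This is a minimal sketch, not part of the attached diff: `unihan_url` and `fetch_unihan_readings` are hypothetical helper names, the URL pattern is taken from the released-version path above (the 15.0.0 draft archive uses a different file name and is not covered), and it assumes `curl` and `unzip` are installed and that the archive stores Unihan_Readings.txt at its top level.

```shell
#!/bin/sh
# Sketch only: unihan_url and fetch_unihan_readings are hypothetical helper
# names, not part of Emacs or of the patch discussed in this thread.

# Print the download URL for Unihan.zip.  "latest" maps to the UCD/latest
# path; anything else is treated as a released version such as 14.0.0.
unihan_url () {
    case "$1" in
        latest) printf '%s\n' 'https://www.unicode.org/Public/UCD/latest/ucd/Unihan.zip' ;;
        *)      printf 'https://www.unicode.org/Public/%s/ucd/Unihan.zip\n' "$1" ;;
    esac
}

# Download the archive for version $1 and write Unihan_Readings.txt into
# directory $2.  Needs curl and unzip; nothing runs until this is called.
fetch_unihan_readings () {
    zip=$(mktemp) || return 1
    curl -fsSL -o "$zip" "$(unihan_url "$1")" &&
        unzip -p "$zip" Unihan_Readings.txt > "$2/Unihan_Readings.txt"
    status=$?
    rm -f "$zip"
    return "$status"
}

# Example (not run here): fetch_unihan_readings 14.0.0 admin/unidata
```

Pinning a released version rather than `latest` keeps the imported file reproducible between builds.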
--
vl
[admin-unidata-README-diff.text (text/plain, attachment)]
Severity set to 'wishlist' from 'normal'. Request was from Stefan Kangas <stefan <at> marxist.se> to control <at> debbugs.gnu.org. (Sun, 09 Jan 2022 15:46:02 GMT)
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#52918; Package emacs. (Mon, 17 Jan 2022 18:26:02 GMT)
Message #16 received at 52918 <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
On Mon, 3 Jan 2022, Eli Zaretskii wrote:
>
> Suggested implementation:
>
> . add Makefile rules to produce a uni-unihan-readings.el file from
> Unihan_Readings.txt, which defines a char-table where each
> character has its kDefinition property value
>
A candidate for the Makefile rule to produce uni-unihan-readings.el
is
'''
#!/bin/sh
X='/usr/X/Projects/emacs-28.0.91/admin/unidata/Unihan_Readings.txt'
# Unihan data lines are tab-separated, so split fields on tabs.
grep -F 'kDefinition' "$X" | sed -e '/^#/d' -e 's/^../#x/' | head -n 3 |
awk -F '\t' 'BEGIN {printf("(defvar readings-table\n\t(make-char-table '\''readings-table nil)\n\t\"Char table of definitions for East Asian characters.\")\n")}
{printf("(aset readings-table %s \"%s\")\n", $1, $3)}'
'''
The result is
'''
(defvar readings-table
	(make-char-table 'readings-table nil)
	"Char table of definitions for East Asian characters.")
(aset readings-table #x3400 "(same as U+4E18 丘) hillock or mound")
(aset readings-table #x3401 "to lick; to taste, a mat, bamboo bark")
(aset readings-table #x3402 "(J) non-standard form of U+559C 喜, to like, love, enjoy; a joyful thing")
'''
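The forms above embed each definition in a Lisp string verbatim. Should a definition contain a double quote or a backslash, the generated `aset` form would no longer read correctly. The following is a minimal sketch of one way to guard against that, not the submitted rule: `readings_to_aset` and `lisp_escape` are hypothetical names, and the input is assumed to be the tab-separated Unihan format.

```shell
#!/bin/sh
# Sketch only: readings_to_aset and lisp_escape are hypothetical names.
# Reads tab-separated Unihan lines on stdin and prints one (aset ...) form
# per kDefinition line, with backslashes and double quotes escaped so the
# value is a valid Lisp string.
readings_to_aset () {
    awk -F '\t' '
        function lisp_escape(s) {
            # Put a backslash in front of every backslash or double quote.
            gsub(/[\\"]/, "\\\\&", s)
            return s
        }
        /^#/ { next }                       # skip comment lines
        $2 == "kDefinition" {
            sub(/^U\+/, "#x", $1)           # U+3400 becomes #x3400
            printf("(aset readings-table %s \"%s\")\n", $1, lisp_escape($3))
        }'
}

# Example: readings_to_aset < Unihan_Readings.txt > uni-unihan-readings.el
```

Matching on the second field rather than anywhere in the line also avoids picking up entries of other properties whose value happens to mention kDefinition.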
--
vl
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#52918; Package emacs. (Tue, 18 Jan 2022 11:31:01 GMT)
Message #19 received at 52918 <at> debbugs.gnu.org (full text, mbox):
Here is a diff that adds an entry to the etc/TODO file so that the
suggested implementation gets done.
'''
diff -u --label /usr/X/Projects/emacs-28.0.91/etc/TODO --label \#\<buffer\ TODO\> /usr/X/Projects/emacs-28.0.91/etc/TODO /dev/shm/buffer-content-Q1ArDD
--- /usr/X/Projects/emacs-28.0.91/etc/TODO
+++ #<buffer TODO>
@@ -747,6 +747,9 @@
 ** Add definitions for symbol properties, for documentation purposes
+** Make use of char-table for reading definitions from ucd/Unihan_Readings.txt
+bug#52918 see.
+
 ** Temporarily remove scroll bars when they are not needed
 Typically when a buffer can be fully displayed in its window.
Diff finished. Tue Jan 18 22:22:52 2022
'''
--
vl
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#52918; Package emacs. (Sun, 23 Jan 2022 03:51:02 GMT)
Message #22 received at 52918 <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
On Mon, 3 Jan 2022, Eli Zaretskii wrote:
>> . Unihan_Readings.txt
>>
>> Like how quail-show-key helps by showing in the minibuffer the input
>> sequence needed to type a character for a specific input method, can
>> there be a function called quail-show-unihan that exposes in the
>> minibuffer the kDefinition entry associated with the East Asian
>> character from ucd/Unihan_Readings.txt?
>
> Yes, this could be added to Emacs, and IMO would be a useful feature.
>
> Suggested implementation:
>
> . import the Unihan_Readings.txt file into Emacs
> . add Makefile rules to produce a uni-unihan-readings.el file from
> Unihan_Readings.txt, which defines a char-table where each
> character has its kDefinition property value
> . code a minor mode which will show in the echo area the value of
> the kDefinition property, if any, of the character at point
>
> Patches welcome.
>
See patch attached.
Two of the three implementation steps suggested are done.
--
vl
[0029-bug-52918-generate-East-Asian-readings-char-table.patch (text/plain, attachment)]
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#52918; Package emacs. (Sun, 23 Jan 2022 06:01:01 GMT)
Message #25 received at 52918 <at> debbugs.gnu.org (full text, mbox):
> Date: Sun, 23 Jan 2022 02:15:06 +0000 (UTC)
> From: Van Ly <van.ly <at> sdf.org>
> cc: 52918 <at> debbugs.gnu.org
>
> On Mon, 3 Jan 2022, Eli Zaretskii wrote:
>
> >> . Unihan_Readings.txt
> >>
> >> Like how quail-show-key helps by showing in the minibuffer the input
> >> sequence needed to type a character for a specific input method, can
> >> there be a function called quail-show-unihan that exposes in the
> >> minibuffer the kDefinition entry associated with the East Asian
> >> character from ucd/Unihan_Readings.txt?
> >
> > Yes, this could be added to Emacs, and IMO would be a useful feature.
> >
> > Suggested implementation:
> >
> > . import the Unihan_Readings.txt file into Emacs
> > . add Makefile rules to produce a uni-unihan-readings.el file from
> > Unihan_Readings.txt, which defines a char-table where each
> > character has its kDefinition property value
> > . code a minor mode which will show in the echo area the value of
> > the kDefinition property, if any, of the character at point
> >
> > Patches welcome.
> >
>
> See patch attached.
>
> Two of the three implementation steps suggested are done.
Thanks.
You don't seem to have copyright assignment on file, and without that
we cannot accept such large contributions. Would you like to start
your legal paperwork now? If so, I will send you the form and the
instructions.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#52918; Package emacs. (Sun, 23 Jan 2022 11:23:01 GMT)
Message #28 received at 52918 <at> debbugs.gnu.org (full text, mbox):
On Sun, 23 Jan 2022, Eli Zaretskii wrote:
>>>
>>> Patches welcome.
>>>
>>
>> See patch attached.
>>
>> Two of the three implementation steps suggested are done.
>
> Thanks.
>
> You don't seem to have copyright assignment on file, and without that
> we cannot accept such large contributions. Would you like to start
> your legal paperwork now? If so, I will send you the form and the
> instructions.
>
I sent an email to assign <at> gnu.org in the 24hr before this patch was
submitted. I was hoping this patch would fall below the 15 line
limit and not need the formality of the legal paperwork. The minor
mode contribution would climb above the limit, which was why I sent
the request to assign copyright. Best case is a 2 week wait.
That generated uni-unihan-readings.el will need a line as follows:
diff --git a/admin/unidata/Unihan_Readings.awk b/admin/unidata/Unihan_Readings.awk
index cf319449e59..f01c75b88f9 100644
--- a/admin/unidata/Unihan_Readings.awk
+++ b/admin/unidata/Unihan_Readings.awk
@@ -1,5 +1,6 @@
 BEGIN {
 	FS="\t"
+	printf(";; -*-no-byte-compile: t; -*-\n")
 	printf("(defvar readings-table\n\
 (make-char-table 'readings-table nil)\n\
 \"Char table of definitions for East Asian characters.\")\n")
--
vl
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#52918; Package emacs. (Sun, 23 Jan 2022 11:41:02 GMT)
Message #31 received at 52918 <at> debbugs.gnu.org (full text, mbox):
> Date: Sun, 23 Jan 2022 11:22:03 +0000 (UTC)
> From: Van Ly <van.ly <at> sdf.org>
> cc: 52918 <at> debbugs.gnu.org
>
> > You don't seem to have copyright assignment on file, and without that
> > we cannot accept such large contributions. Would you like to start
> > your legal paperwork now? If so, I will send you the form and the
> > instructions.
> >
>
> I sent an email to assign <at> gnu.org in the 24hr before this patch was
> submitted. I was hoping this patch would fall below the 15 line
> limit and not need the formality of the legal paperwork. The minor
> mode contribution would climb above the limit, which was why I sent
> the request to assign copyright. Best case is a 2 week wait.
>
> That generated uni-unihan-readings.el will need a line as follows:
Thanks, I prefer to wait until your assignment is in place, and you
can then submit the final pieces to make this feature complete.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#52918; Package emacs. (Tue, 25 Jul 2023 15:44:01 GMT)
Message #34 received at 52918 <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
> Date: Sun, 23 Jan 2022 13:40:23 +0200
> From: Eli Zaretskii <eliz <at> gnu.org>
> Cc: 52918 <at> debbugs.gnu.org
>
> > Date: Sun, 23 Jan 2022 11:22:03 +0000 (UTC)
> > From: Van Ly <van.ly <at> sdf.org>
> > cc: 52918 <at> debbugs.gnu.org
> >
> > > You don't seem to have copyright assignment on file, and without that
> > > we cannot accept such large contributions. Would you like to start
> > > your legal paperwork now? If so, I will send you the form and the
> > > instructions.
> > >
> >
> > I sent an email to assign <at> gnu.org in the 24hr before this patch was
> > submitted. I was hoping this patch would fall below the 15 line
> > limit and not need the formality of the legal paperwork. The minor
> > mode contribution would climb above the limit, which was why I sent
> > the request to assign copyright. Best case is a 2 week wait.
> >
> > That generated uni-unihan-readings.el will need a line as follows:
>
> Thanks, I prefer to wait until your assignment is in place, and you
> can then submit the final pieces to make this feature complete.
<Time passes...>
> Date: Tue, 25 Jul 2023 14:47:52 GMT
> From: Van Ly <van.ly <at> sdf.org>
>
> More than 18 months ago I left hanging in one of the bug report
> threads the suggestion to include a readings table for CJKV characters
> from Unicode.
>
> At the time I hadn't done the paperwork and posted the awk transformer
> script, which was fewer than 16 lines and generated the 21346-line
> readings table. See attached.
>
> I have since done the paperwork and was prompted to get this done or
> close the bug report seeing the configure script for 29.1 on line
> 2761 has the option to generate a smaller sized Japanese dictionary.
>
> The awk script I have since misplaced but it should be somewhere in
> the bug report if details have not been purged beyond 12 months.
Details were not purged, but please look at the past discussions of
this bug and tell where in it should we look for the Awk script.
I forward below the attachments you sent to me in private email;
please continue discussing this issue in this thread, not separately
and not in private email to me.
Thanks.
[Unihan_Readings.el (application/emacs-lisp, attachment)]
[create-readings-table.sh (application/x-sh, attachment)]
[example-configuration.el (application/emacs-lisp, attachment)]
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#52918; Package emacs. (Tue, 25 Jul 2023 18:19:01 GMT)
Message #37 received at 52918 <at> debbugs.gnu.org (full text, mbox):
> Date: Tue, 25 Jul 2023 18:44:01 +0300
> From: Eli Zaretskii <eliz <at> gnu.org>
>
> > Date: Sun, 23 Jan 2022 13:40:23 +0200
> > From: Eli Zaretskii <eliz <at> gnu.org>
>
<Time passes...>
>
> > Date: Tue, 25 Jul 2023 14:47:52 GMT
> > From: Van Ly <van.ly <at> sdf.org>
> >
> > I have since done the paperwork and was prompted to get this done or
> > close the bug report seeing the configure script for 29.1 on line
> > 2761 has the option to generate a smaller sized Japanese dictionary.
> >
> > The awk script I have since misplaced but it should be somewhere in
> > the bug report if details have not been purged beyond 12 months.
>
> Details were not purged, but please look at the past discussions of
> this bug and tell where in it should we look for the Awk script.
>
The patch is located at X and the Awk script in there looks as follows
BEGIN {
	FS="\t"
	printf("(defvar readings-table\n\
(make-char-table 'readings-table nil)\n\
\"Char table of definitions for East Asian characters.\")\n")
}
/^#/ { next }
/kDefinition/ {
	sub(/^../, "#x", $1)
	printf("(aset readings-table %s \"%s\")\n", $1, $3)
}

# Local Variables:
# indent-tabs-mode: t
# tab-width: 8
# End:
X
https://lists.gnu.org/archive/html/bug-gnu-emacs/2022-01/msg01393.html
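One way to sanity-check what a script like the one above produces is to compare the number of kDefinition data lines in the input with the number of `aset` forms in the output. This is a minimal sketch under stated assumptions: `check_readings_table` is a hypothetical helper name, and it assumes every generated form starts a line with `(aset readings-table `.

```shell
#!/bin/sh
# Sketch only: check_readings_table is a hypothetical helper name.
# $1 is Unihan_Readings.txt, $2 is the generated Lisp file.  Succeeds when
# the number of (aset ...) forms equals the number of kDefinition data lines.
check_readings_table () {
    # Count non-comment lines whose second tab-separated field is kDefinition.
    want=$(awk -F '\t' '!/^#/ && $2 == "kDefinition" { n++ } END { print n + 0 }' "$1")
    # Count generated forms; grep -c prints 0 when nothing matches.
    got=$(grep -c '^(aset readings-table ' "$2")
    echo "kDefinition lines: $want, aset forms: $got"
    [ "$want" -eq "$got" ]
}

# Example: check_readings_table Unihan_Readings.txt uni-unihan-readings.el
```

A mismatch would suggest that the field separator or the line filter dropped or duplicated entries.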
This bug report was last modified 2 years and 17 days ago.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.