GNU bug report logs - #58159
[PATCH] Add support for the Wancho script

Package: emacs;

Reported by: समीर सिंह Sameer Singh <lumarzeli30 <at> gmail.com>

Date: Thu, 29 Sep 2022 11:08:01 UTC

Severity: normal

Tags: patch

Done: Eli Zaretskii <eliz <at> gnu.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 58159 in the body.
You can then email your comments to 58159 AT debbugs.gnu.org in the normal way.

Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#58159; Package emacs. (Thu, 29 Sep 2022 11:08:02 GMT)

Acknowledgement sent to समीर सिंह Sameer Singh <lumarzeli30 <at> gmail.com>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Thu, 29 Sep 2022 11:08:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: समीर सिंह Sameer Singh
 <lumarzeli30 <at> gmail.com>
To: bug-gnu-emacs <at> gnu.org
Subject: [PATCH] Add support for the Wancho script
Date: Thu, 29 Sep 2022 16:37:13 +0530
[Message part 1 (text/plain, inline)]
The Wancho script is added to Emacs this time.

Also can we add something like this to etc/HELLO:
"Some of these greetings or the script name may be wrong or misspelled so
if you know the script, please help by correcting them."?

For many of these languages and scripts it is difficult to find greetings.
When greetings are available online they are usually written in the Roman
script, so I often have to convert them into the native script myself. These
greetings therefore have a high chance of being misspelled or wrong
altogether, so adding something like the line above may help.

Thanks

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#58159; Package emacs. (Thu, 29 Sep 2022 11:10:01 GMT)

Message #8 received at 58159 <at> debbugs.gnu.org:

From: समीर सिंह Sameer Singh
 <lumarzeli30 <at> gmail.com>
To: 58159 <at> debbugs.gnu.org
Subject: Re: [PATCH] Add support for the Wancho script
Date: Thu, 29 Sep 2022 16:39:27 +0530
[Message part 1 (text/plain, inline)]
[0001-Add-support-for-the-Wancho-script-bug-58159.patch (text/x-patch, attachment)]

Reply sent to Eli Zaretskii <eliz <at> gnu.org>:
You have taken responsibility. (Thu, 29 Sep 2022 13:17:01 GMT)

Notification sent to समीर सिंह Sameer Singh <lumarzeli30 <at> gmail.com>:
bug acknowledged by developer. (Thu, 29 Sep 2022 13:17:02 GMT)

Message #13 received at 58159-done <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: समीर सिंह Sameer Singh
 <lumarzeli30 <at> gmail.com>
Cc: 58159-done <at> debbugs.gnu.org
Subject: Re: bug#58159: [PATCH] Add support for the Wancho script
Date: Thu, 29 Sep 2022 16:15:58 +0300
> From: समीर सिंह Sameer Singh
>  <lumarzeli30 <at> gmail.com>
> Date: Thu, 29 Sep 2022 16:39:27 +0530
> 
>  Also can we add something like this to etc/HELLO:
>  "Some of these greetings or the script name may be wrong or misspelled so if you know the script,
>  please help by correcting them."?

We don't need to have a greeting for every language environment we
support.  So if we aren't sure how a greeting is written, we could
just omit it.

>  For many of these languages/scripts it is difficult to find their greetings and most of the time if their
>  greetings are available online they are in the roman script so often I have to convert them into their
>  native script therefore these greetings may have a high chance of misspelling or they may be wrong
>  altogether so adding something like the above mentioned line may help.

I installed the patch, thanks.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#58159; Package emacs. (Thu, 29 Sep 2022 13:22:02 GMT)

Message #16 received at 58159-done <at> debbugs.gnu.org:

From: समीर सिंह Sameer Singh
 <lumarzeli30 <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 58159-done <at> debbugs.gnu.org
Subject: Re: bug#58159: [PATCH] Add support for the Wancho script
Date: Thu, 29 Sep 2022 18:51:01 +0530
[Message part 1 (text/plain, inline)]
>
> We don't need to have a greeting for every language environment we
> support.  So if we aren't sure how are greetings written, we could
> just omit it


Oh, Ok

> I installed the patch, thanks.

Great!


Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#58159; Package emacs. (Thu, 29 Sep 2022 14:28:02 GMT)

Message #19 received at 58159 <at> debbugs.gnu.org:

From: Robert Pluim <rpluim <at> gmail.com>
To: समीर सिंह Sameer Singh
 <lumarzeli30 <at> gmail.com>
Cc: 58159 <at> debbugs.gnu.org
Subject: Re: bug#58159: [PATCH] Add support for the Wancho script
Date: Thu, 29 Sep 2022 16:27:02 +0200
>>>>> On Thu, 29 Sep 2022 16:39:27 +0530, समीर सिंह Sameer Singh <lumarzeli30 <at> gmail.com> said:
    समीर> @@ -116,6 +116,7 @@ Turkish (Türkçe)	Merhaba
    समीर>  Ukrainian (українська)	Вітаю
    समीर>  Vietnamese (tiếng Việt)	Chào bạn
 
    समीर> +Wancho (𞋒𞋀𞋉𞋃𞋕)    	𞋂𞋈𞋛

Any reason for the newline between Vietnamese and Wancho?

    समीर>  	(toto #x1E290)

TIL thereʼs a script called 'toto', which is the French equivalent of
'foo' :-)

Thanks

Robert
-- 

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#58159; Package emacs. (Thu, 29 Sep 2022 15:20:02 GMT)

Message #22 received at 58159 <at> debbugs.gnu.org:

From: समीर सिंह Sameer Singh
 <lumarzeli30 <at> gmail.com>
To: Robert Pluim <rpluim <at> gmail.com>
Cc: 58159 <at> debbugs.gnu.org
Subject: Re: bug#58159: [PATCH] Add support for the Wancho script
Date: Thu, 29 Sep 2022 20:49:10 +0530
[Message part 1 (text/plain, inline)]
>
> Any reason for the newline between Vietnamese and Wancho?


This was not intentional. enriched-mode automatically adds a newline; most
of the time I remove it, but this time it slipped past me.
See: https://mail.gnu.org/archive/html/bug-gnu-emacs/2022-05/msg00581.html

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#58159; Package emacs. (Thu, 29 Sep 2022 15:43:01 GMT)

Message #25 received at 58159 <at> debbugs.gnu.org:

From: Robert Pluim <rpluim <at> gmail.com>
To: समीर सिंह Sameer Singh
 <lumarzeli30 <at> gmail.com>
Cc: 58159 <at> debbugs.gnu.org
Subject: Re: bug#58159: [PATCH] Add support for the Wancho script
Date: Thu, 29 Sep 2022 17:41:49 +0200
>>>>> On Thu, 29 Sep 2022 20:49:10 +0530, समीर सिंह Sameer Singh <lumarzeli30 <at> gmail.com> said:

    >> 
    >> Any reason for the newline between Vietnamese and Wancho?


    समीर> This was not intentional, enriched-mode automatically adds a newline, most
    समीर> of the time I remove it, this time
    समीर> it may have skipped past my eyes.
    समीर> see: https://mail.gnu.org/archive/html/bug-gnu-emacs/2022-05/msg00581.html

OK. I avoid enriched-mode for HELLO :-)

Robert
-- 

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#58159; Package emacs. (Sat, 01 Oct 2022 01:59:01 GMT)

Message #28 received at 58159 <at> debbugs.gnu.org:

From: Richard Stallman <rms <at> gnu.org>
To: समीर सिंह Sameer Singh <lumarzeli30 <at> gmail.com>
Cc: 58159 <at> debbugs.gnu.org
Subject: Re: bug#58159: [PATCH] Add support for the Wancho script
Date: Fri, 30 Sep 2022 21:58:01 -0400
[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

Do we really want to complicate Emacs to support the Wancho script?
According to Wikipedia, the Wancho script was invented 10 years ago;
Wancho is normally written using the Latin alphabet or Devanagari.
Some schools are starting to teach writing Wancho using that alphabet
instead of the well-known alphabet.  I suppose there is a campaign
for Wancho speakers to switch to it.

Is that really a good idea?  I suspect it comes from a sort of
boosterism/ethnic nationalism, as if having your own script were a
mark of importance.  But I think it is counterproductive to introduce
more incompatibility of scripts.

Do we really want to spend time on Emacs supporting scripts
which were created recently and have little user base?

English does not have an alphabet of its own; it uses an alphabet
borrowed from Latin.  Maybe English needs more prestige to compete
with Chinese and Hindi.  Should we invent a new English alphabet?

-- 
Dr Richard Stallman (https://stallman.org)
Chief GNUisance of the GNU Project (https://gnu.org)
Founder, Free Software Foundation (https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#58159; Package emacs. (Sat, 01 Oct 2022 04:55:02 GMT)

Message #31 received at 58159 <at> debbugs.gnu.org:

From: समीर सिंह Sameer Singh
 <lumarzeli30 <at> gmail.com>
To: rms <at> gnu.org
Cc: 58159 <at> debbugs.gnu.org
Subject: Re: bug#58159: [PATCH] Add support for the Wancho script
Date: Sat, 1 Oct 2022 10:23:43 +0530
[Message part 1 (text/plain, inline)]
>
> Do we really want to complicate Emacs to support the Wancho script?


I don't see how adding support for the Wancho script complicates Emacs. This
was a relatively straightforward patch; even composition rules were not
needed here. Wancho is included in Unicode, therefore Emacs support is added.

> Wancho is normally written using the Latin alphabet or Devanagari.
> Some schools are starting to teach writing Wancho using that alphabet
> instead of the well-known alphabet.  I suppose there is a campaign
> for Wancho speakers to switch to it.

Have you considered that, Wancho being a Sino-Tibetan language, the
Devanagari and Latin scripts may be inadequate to serve it?

> Is that really a good idea?  I suspect it comes from a sort of
> boosterism/ethnic nationalism, as if having your own script were a
> mark of importance.

It is, though: having a separate script also gives the language a unique
identity. Take the Bhojpuri language, for example. It used to have its own
script, Kaithi, but later switched to Devanagari. This, I feel, is one of the
major reasons it is still not recognised as a language by the government and
is instead treated as a dialect of Hindi. Many people regard it as a "less
polished" version of Hindi. Urdu, despite being virtually the same as Hindi,
enjoys the status of a separate language.
(Of course there are many other reasons for this, but having a different
script is one of them.)

Having a different script has aesthetic reasons as well. For example, how
could Latin replicate the beauty of Devanagari conjuncts!
Also look at the abomination that is the Vietnamese script.

> But I think it is counterproductive to introduce
> more incompatibility of scripts.

Emacs should at least support all of the Unicode scripts. I don't see how
moving towards that goal is "increasing incompatibility of scripts".

> Do we really want to spend time on Emacs supporting scripts
> which were created recently and have little user base?

I do not ask anyone else to spend their time adding scripts to Emacs. Since
this is my wish, I do it myself, and the Emacs maintainers graciously accept
it and include it in Emacs, providing corrections and guidance along the way.

> English does not have an alphabet of its own; it uses an alphabet
> borrowed from Latin.  Maybe English needs more prestige to compete
> with Chinese and Hindi.  Should we invent a new English alphabet?

I propose an abugida 😉
Maybe this time they could work on the orthography 🤞

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#58159; Package emacs. (Sat, 01 Oct 2022 06:05:01 GMT)

Message #34 received at 58159 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: rms <at> gnu.org
Cc: lumarzeli30 <at> gmail.com, 58159 <at> debbugs.gnu.org
Subject: Re: bug#58159: [PATCH] Add support for the Wancho script
Date: Sat, 01 Oct 2022 09:03:37 +0300
> Cc: 58159 <at> debbugs.gnu.org
> From: Richard Stallman <rms <at> gnu.org>
> Date: Fri, 30 Sep 2022 21:58:01 -0400
> 
> Do we really want to complicate Emacs to support the Wancho script?

The script was already supported: we automatically add support for all
the scripts defined by the Unicode Standard when we import each new
version of Unicode, simply by virtue of supporting the character
codepoints of that script.

The change in question just defined a new language-environment (a
small addition to an existing data structure) and a new (and very
simple) input method.  So my conclusion (and I asked myself the same
questions when reviewing the patch) was that adding this doesn't
complicate Emacs in any significant way.
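
The point about code-point-level support can be checked from Lisp. The
following is a small sketch, not part of the patch, that asks a running Emacs
what it already knows about U+1E2C0, the first code point of the Wancho block.
The values returned depend on the Unicode version that the Emacs in question
was built with, so no expected output is shown.

```elisp
;; Sketch: query the data Emacs imports from the Unicode Character Database.
;; #x1E2C0 is the first code point of the Wancho block.
(let ((ch #x1E2C0))
  (list
   ;; Character name as recorded in the imported Unicode data.
   (get-char-code-property ch 'name)
   ;; General category (a symbol such as Lo or Nd).
   (get-char-code-property ch 'general-category)
   ;; Script symbol that Emacs associates with this code point, if any.
   (aref char-script-table ch)))
```

Evaluating this in the *scratch* buffer with C-x C-e is enough; it needs no
language environment or input method to be defined.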




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#58159; Package emacs. (Mon, 03 Oct 2022 01:07:01 GMT)

Message #37 received at 58159 <at> debbugs.gnu.org:

From: Richard Stallman <rms <at> gnu.org>
To: समीर सिंह Sameer Singh <lumarzeli30 <at> gmail.com>
Cc: 58159 <at> debbugs.gnu.org
Subject: Re: bug#58159: [PATCH] Add support for the Wancho script
Date: Sun, 02 Oct 2022 21:06:10 -0400
[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > I don't get how adding support for the Wancho script is complicating Emacs,
  > this

Normally a feature like this requires documentation in a manual as
well as code to implement it.

  > Have you considered that Wancho being a Sino-Tibetan language, Devanagari
  > and Latin script
  > may be inadequate to serve it?

It could be so, but there's no point in our speculating about it.  The
Wancho speakers can judge this.  If some decades from now they mostly
use the new alphabet, that will give it a real case for support.

  > It is though, having a separate script also provides a unique identity to
  > the language.

This tends to support my speculation, that the development of this
alphabet was part of a political influence campaign.

  > Urdu despite being virtually same with Hindi enjoys the status of a
  > separate language.

I don't speak either Urdu or Hindi, but I've read that Urdu has a lot
of vocabulary derived from Persian or Arabic.  With such a difference,
they are not "virtually the same."

But that is a tangent.  Each of those scripts is used by millions and
has been used for centuries.  It is clear that Emacs should support
them both.

  > Having a different script has aesthetic reasons as well for example how
  > could latin replicate the beauty
  > of devanagari conjuncts!

I found that a difficult complexity, for this human, and for software
too I expect.  But that too is a tangent.

My point is that when Unicode incorporates scripts that aren't and
never were used very much, and were developed for PR motives,
incorporation into Unicode is not by itself a reason to add support
into Emacs.

You're right that supporting _one_ barely-used script is not a
significant complexity.  If this is the only barely-used script that
Unicode incorporates, I won't keep arguing against it.

But if Unicode is inclined to do things like this, how many more
barely-used scripts will it adopt?  How many more has it already
adopted?

-- 
Dr Richard Stallman (https://stallman.org)
Chief GNUisance of the GNU Project (https://gnu.org)
Founder, Free Software Foundation (https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#58159; Package emacs. (Mon, 03 Oct 2022 02:39:02 GMT)

Message #40 received at 58159 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: rms <at> gnu.org
Cc: lumarzeli30 <at> gmail.com, 58159 <at> debbugs.gnu.org
Subject: Re: bug#58159: [PATCH] Add support for the Wancho script
Date: Mon, 03 Oct 2022 05:38:02 +0300
> Cc: 58159 <at> debbugs.gnu.org
> From: Richard Stallman <rms <at> gnu.org>
> Date: Sun, 02 Oct 2022 21:06:10 -0400
> 
> But if Unicode is inclined to do things like this, how many more
> barely-used scripts will it adopt?  How many more has it already
> adopted?

That is not our question to answer.  The Unicode Consortium makes
these decisions based on their criteria.  We just support the
characters they add.

These additions are usually so minor that I believe they don't warrant
any discussion.  E.g., part of the support for new characters is the
ability to up-case and down-case them; we lift the data from the
Unicode Character Database, which we import.  It would be unthinkable
for Emacs not to be able to do these simple text operations on any
character.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#58159; Package emacs. (Sat, 08 Oct 2022 22:36:02 GMT)

Message #43 received at 58159 <at> debbugs.gnu.org:

From: Richard Stallman <rms <at> gnu.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: lumarzeli30 <at> gmail.com, 58159 <at> debbugs.gnu.org
Subject: Re: bug#58159: [PATCH] Add support for the Wancho script
Date: Sat, 08 Oct 2022 18:35:36 -0400
[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > > But if Unicode is inclined to do things like this, how many more
  > > barely-used scripts will it adopt?  How many more has it already
  > > adopted?

  > That is not our question to answer.

They are questions about the future, so we cannot look for answers
today.  But they do affect what our attitude towards Unicode should
be.


-- 
Dr Richard Stallman (https://stallman.org)
Chief GNUisance of the GNU Project (https://gnu.org)
Founder, Free Software Foundation (https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#58159; Package emacs. (Sun, 09 Oct 2022 01:10:01 GMT)

Message #46 received at 58159 <at> debbugs.gnu.org:

From: समीर सिंह Sameer Singh
 <lumarzeli30 <at> gmail.com>
To: rms <at> gnu.org
Cc: Eli Zaretskii <eliz <at> gnu.org>, 58159 <at> debbugs.gnu.org
Subject: Re: bug#58159: [PATCH] Add support for the Wancho script
Date: Sun, 9 Oct 2022 06:38:53 +0530
[Message part 1 (text/plain, inline)]
>
> Normally a feature like this requires documentation in a manual as
> well as code to implement it.
>

Can you elaborate on what changes are needed in which manual?

The code is already implemented, i.e. the foundations to support these
scripts are already there. Someone just needs to take the time to extend this
support to a specific script, and I am doing exactly that. This is nothing
more than some grunt work.

This is what a typical patch for adding a script in Emacs looks like:
1. A one-line entry in etc/NEWS announcing the support of the script and its
   language environment.
2. A one-line greeting in the language/script, added to etc/HELLO (optional).
3. A one-line entry in script-representative-chars in
   lisp/international/fontset.el, so that Emacs can select an appropriate
   font for it.
4. Adding the script name in setup-default-fontset in
   lisp/international/fontset.el.
5. Defining a language environment for the script in the lisp/language/*.el
   files, which includes the following entries: its charset (usually
   unicode), its coding-system (usually utf-8), its coding-priority (usually
   utf-8), its input-method, its sample text (the same text that is added to
   etc/HELLO), and one line of documentation, usually following the template
   "foo language and its script bar are supported in this language
   environment."
6. Adding composition rules for the script (optional, only needed for
   complex scripts).
7. Adding an input method for the script in the lisp/leim/quail/*.el files.
Adding one of these patches does not introduce any significant or breaking
changes. All the heavy-lifting functions and programs were implemented
earlier. We already parse all of the information from Unicode, so Emacs knows
about these characters; composite.el and HarfBuzz take care of composition,
and Quail takes care of input methods.

My patches average around 126 lines with the input method and 36 lines
without it, which is to be expected, since the input method has to define a
mapping for nearly every key on the keyboard.
I have added around 27 scripts since May of this year.

> My point is that when Unicode incorporates scripts that aren't and
> never were used very much, and were developed for PR motives,
> incorporation into Unicode is not by itself a reason to add support
> into Emacs

These scripts were not developed for "PR motives"; they were developed to
serve the needs of the community.
For example, this is what the inventor of the Wancho script said [1]:

> "I found out that it was not possible to translate the language as it did
> not capture all of its sounds. So I started researching on phonetics of the
> language," Losu said.

It is necessary for Unicode to support them because this is not the age of
pen and paper, where the only thing limiting you from writing any script for
communication is... you.
On computers this is not possible, therefore efforts should be made to
rectify this at both the Unicode level and the application level.

> I don't speak either Urdu or Hindi, but I've read that Urdu has a lot
> of vocabulary derived from Persian or Arabic.  With such a difference,
> they are not "virtually the same."

Urdu and Hindi have virtually the same grammar; having some different
vocabulary does not make Urdu a different language. Hindi and Urdu are
regarded as two different registers of the same language.
See: https://en.wikipedia.org/wiki/Hindustani_language

[1]
https://www.indiatoday.in/education-today/news/story/this-arunachal-student-worked-for-over-12-years-to-create-a-new-alphabet-for-a-dying-ancient-tribal-language-1597122-2019-09-09

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#58159; Package emacs. (Sun, 09 Oct 2022 04:23:02 GMT)

Message #49 received at 58159 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: rms <at> gnu.org
Cc: lumarzeli30 <at> gmail.com, 58159 <at> debbugs.gnu.org
Subject: Re: bug#58159: [PATCH] Add support for the Wancho script
Date: Sun, 09 Oct 2022 07:22:22 +0300
> From: Richard Stallman <rms <at> gnu.org>
> Cc: lumarzeli30 <at> gmail.com, 58159 <at> debbugs.gnu.org
> Date: Sat, 08 Oct 2022 18:35:36 -0400
> 
>   > > But if Unicode is inclined to do things like this, how many more
>   > > barely-used scripts will it adopt?  How many more has it already
>   > > adopted?
> 
>   > That is not our question to answer.
> 
> They are questions about the future, so we cannot look for answers
> today.  But they do affect what our attitude towards Unicode should
> be.

Emacs supports all the characters defined by Unicode.  This design is
from Emacs 23 onwards.  So any characters Unicode adds will be
supported by Emacs as soon as we import the latest character database.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#58159; Package emacs. (Fri, 14 Oct 2022 21:25:02 GMT)

Message #52 received at 58159 <at> debbugs.gnu.org:

From: Richard Stallman <rms <at> gnu.org>
To: समीर सिंह Sameer Singh <lumarzeli30 <at> gmail.com>
Cc: eliz <at> gnu.org, 58159 <at> debbugs.gnu.org
Subject: Re: bug#58159: [PATCH] Add support for the Wancho script
Date: Fri, 14 Oct 2022 17:24:48 -0400
[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > Can you elaborate on what changes are needed in which manual?

I don't know, but normally every new addition calls for documentation
somewhere.

  > This is what a typical patch for adding a script in Emacs looks like:
  > 1. A one line entry in etc/NEWS announcing the support of the script and
  > its language environment.
  > 2. A one line greeting in the language/script which is added in etc/HELLO
  > (optional)
  > 3. A one line entry in script-representative-chars in
  > lisp/international/fontset.el so that Emacs can select an appropriate font
  > for it.
  > 4. Adding the script name in setup-default-fontset in
  > lisp/international/fontset.el
  > 5. Defining a language environment for the script in the lisp/language/*.el
  > files which includes the following entries:
  > its charset (usually unicode), its coding-system (usually utf-8), its
  > coding-priority (usually utf-8), its input-method, its sample text (the
  > same text which is added in etc/HELLO),
  > a one line documentation usually in the following template: "foo language
  > and its script bar are supported in this language environment."
  > 6. Adding composition rules for the script (optional, only needed for
  > complex scripts)
  > 7. Adding an input-method for the script in lisp/leim/quail/*.el files

That looks like nontrivial work to add each script.
Not a big job, but not minimal either.

For a script that users actually want, it is work worth doing.
For a script that we support only because some bureaucrats
decided to include it in Unicode, is it worth that much?

  > These scripts were not developed for "PR motives", they were developed to
  > serve the needs of the community.

What I've read suggests the opposite.  I am not convinced that the
community experienced or experiences such linguistic "needs".  It
looks like some activists in that community decided that using their
own script would help them get political benefits, so they push for
its adoption.

What we know about this is sketchy.  (I could see only fragments of
the article you pointed at -- I suspect nonfree JS blocks the rest.)

If the speakers of a language are really using a script, I am in favor
of supporting it.

  > It is necessary for Unicode to support them because this is not the age of
  > pen and paper where the only thing limiting you to write any script for
  > communication is... you.

I don't subscribe to the idea that we Emacs developers _must_ support
every script that a minority of some speech community campaigns to
switch to.  That is dogmatic, and it could impose an unlimited burden
on us.  If every endangered language gets its own script, that could
be almost 200 more scripts coming from India alone.

I am in favor of preserving endangered languages, but that doesn't
usually require inventing a new script for each one.  For instance,
speakers of 22 Maya languages got together and established a rather
natural convention for writing them in the Latin alphabet.  The
convention states how to express each sound used in any of those
languages.  You can find it in Maya Languages in Wikipedia.

-- 
Dr Richard Stallman (https://stallman.org)
Chief GNUisance of the GNU Project (https://gnu.org)
Founder, Free Software Foundation (https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#58159; Package emacs. (Sat, 15 Oct 2022 06:36:01 GMT)

Message #55 received at 58159 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: rms <at> gnu.org
Cc: lumarzeli30 <at> gmail.com, 58159 <at> debbugs.gnu.org
Subject: Re: bug#58159: [PATCH] Add support for the Wancho script
Date: Sat, 15 Oct 2022 09:35:30 +0300
> From: Richard Stallman <rms <at> gnu.org>
> Cc: eliz <at> gnu.org, 58159 <at> debbugs.gnu.org
> Date: Fri, 14 Oct 2022 17:24:48 -0400
> 
>   > This is what a typical patch for adding a script in Emacs looks like:
>   > 1. A one line entry in etc/NEWS announcing the support of the script and
>   > its language environment.
>   > 2. A one line greeting in the language/script which is added in etc/HELLO
>   > (optional)
>   > 3. A one line entry in script-representative-chars in
>   > lisp/international/fontset.el so that Emacs can select an appropriate font
>   > for it.
>   > 4. Adding the script name in setup-default-fontset in
>   > lisp/international/fontset.el
>   > 5. Defining a language environment for the script in the lisp/language/*.el
>   > files which includes the following entries:
>   > its charset (usually unicode), its coding-system (usually utf-8), its
>   > coding-priority (usually utf-8), its input-method, its sample text (the
>   > same text which is added in etc/HELLO),
>   > a one line documentation usually in the following template: "foo language
>   > and its script bar are supported in this language environment."
>   > 6. Adding composition rules for the script (optional, only needed for
>   > complex scripts)
>   > 7. Adding an input-method for the script in lisp/leim/quail/*.el files
> 
> That looks like nontrivial work to add each script.
> Not a big job, but not minimal either.

Only the two last items are nontrivial.  And item 6 is only necessary
for some scripts.  All the rest is basically trivial boilerplate.

> For a script that users actually want, it is work worth doing.
> For a script that we support only because some bureaucrats
> decided to include it in Unicode, is it worth that much?

We cannot control which itches our contributors want to scratch.
Letting them scratch their itches is an important aspect of being able
to keep them contributing to Emacs in all other areas.  This
particular itch is useful to Emacs, so I see no reason to object to
their scratching it.




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sat, 12 Nov 2022 12:24:09 GMT)

This bug report was last modified 2 years and 302 days ago.

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.