GNU bug report logs - #19503
most translations of proper names aren't being used


Package: coreutils

Reported by: Benno Schulenberg <bensberg <at> justemail.net>

Date: Sun, 4 Jan 2015 11:15:03 UTC

Severity: wishlist

Tags: wontfix

Done: Assaf Gordon <assafgordon <at> gmail.com>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 19503 in the body.
You can then email your comments to 19503 AT debbugs.gnu.org in the normal way.



Report forwarded to bug-coreutils <at> gnu.org:
bug#19503; Package coreutils. (Sun, 04 Jan 2015 11:15:03 GMT)

Acknowledgement sent to Benno Schulenberg <bensberg <at> justemail.net>:
New bug report received and forwarded. Copy sent to bug-coreutils <at> gnu.org. (Sun, 04 Jan 2015 11:15:03 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Benno Schulenberg <bensberg <at> justemail.net>
To: Coreutils <bug-coreutils <at> gnu.org>
Subject: most translations of proper names aren't being used
Date: Sun, 04 Jan 2015 12:14:26 +0100
Hi,

Just now I noticed that 'du --version | tail -2' produces here this:

Verkita de Torbjern GRANLUND (Torbjörn Granlund), David MacKenzie, Paul Eggert
kaj Jim Meyering.

whereas I expected it to produce this:

Verkita de Torbjern GRANLUND (Torbjörn Granlund), David MEKENZI (David MacKenzie), Paŭl EGERT (Paul Eggert)
kaj Ĝim MEJERING (Jim Meyering).

Quite a few languages transcribe proper names to their own
writing system and phonology.  But coreutils, when a proper
name does not contain any non-ASCII characters, defines away the
propername() function, thus ignoring any transcription that
translators made for this name.  I think this is wrong and
that propername() should always be used, or at least that
disabling it should be optional.

If however propername() is never going to be used, then please
don't waste translator time by marking proper names as
translatable/transcribable.

The propername() function is defined to a noop in the system.h
file with the argument that it saves 17K on each executable.
But... shouldn't propername() be a library function that isn't
actually included into each and every util?

Benno

-- 
http://www.fastmail.com - Same, same, but different...
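
[Editorial illustration] The behaviour described in the message above can be sketched in C. This is a minimal model, not the coreutils or gnulib source: lookup_transcription(), proper_name_full() and proper_name_noop() are hypothetical names invented for this sketch, and the one-entry table stands in for a real translation catalog. It assumes the "TRANSCRIPTION (ORIGINAL)" output format shown in the expected `du --version` text, and contrasts it with a macro that defines the call away.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a translation catalog lookup.  Returns the
   argument itself when no transcription is available.  */
static const char *
lookup_transcription (const char *name)
{
  if (strcmp (name, "Jim Meyering") == 0)
    return "Ĝim MEJERING";
  return name;
}

/* Simplified model of the lookup-and-format step: emit
   "TRANSCRIPTION (ORIGINAL)" unless the transcription is missing or
   already contains the original spelling.  Uses a static buffer, so
   it is not reentrant.  */
static const char *
proper_name_full (const char *name)
{
  static char buf[256];
  const char *t = lookup_transcription (name);

  if (strcmp (t, name) == 0 || strstr (t, name) != NULL)
    return t;
  snprintf (buf, sizeof buf, "%s (%s)", t, name);
  return buf;
}

/* The "defined away" variant: the call site still compiles, but the
   argument passes through untouched and no lookup is performed.  */
#define proper_name_noop(x) (x)
```

With these definitions, proper_name_full ("Jim Meyering") yields the transcribed form followed by the original in parentheses, while proper_name_noop ("Jim Meyering") yields the plain string, which matches the difference between the expected and the observed output reported above.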





Information forwarded to bug-coreutils <at> gnu.org:
bug#19503; Package coreutils. (Sun, 04 Jan 2015 15:54:02 GMT)

Message #8 received at 19503 <at> debbugs.gnu.org:

From: Pádraig Brady <P <at> draigBrady.com>
To: Benno Schulenberg <bensberg <at> justemail.net>, 19503 <at> debbugs.gnu.org
Cc: Daiki Ueno <ueno <at> gnu.org>
Subject: Re: bug#19503: most translations of proper names aren't being used
Date: Sun, 04 Jan 2015 15:53:38 +0000
On 04/01/15 11:14, Benno Schulenberg wrote:
> 
> Hi,
> 
> Just now I noticed that 'du --version | tail -2' produces here this:
> 
> Verkita de Torbjern GRANLUND (Torbjörn Granlund), David MacKenzie, Paul Eggert
> kaj Jim Meyering.
> 
> whereas I expected it to produce this:
> 
> Verkita de Torbjern GRANLUND (Torbjörn Granlund), David MEKENZI (David MacKenzie), Paŭl EGERT (Paul Eggert)
> kaj Ĝim MEJERING (Jim Meyering).
> 
> Quite a few languages transcribe proper names to their own
> writing system and phonology.  But coreutils, when a proper
> name does not contain any non-ASCII characters, defines away the
> propername() function, thus ignoring any transcription that
> translators made for this name.  I think this is wrong and
> that propername() should always be used, or at least that
> disabling it should be optional.
> 
> If however propername() is never going to be used, then please
> don't waste translator time by marking proper names as
> translatable/transcribable.
> 
> The propername() function is defined to a noop in the system.h
> file with the argument that it saves 17K on each executable.
> But... shouldn't propername() be a library function that isn't
> actually included into each and every util?

Yes this isn't ideal.

tl;dr I'm thinking of just adding ASCII author names,
and removing use of proper_name.

For reference, details on proper_name() are at:
http://www.gnu.org/software/gettext/manual/html_node/Names.html

I'm not sure why proper_name() hasn't been made available in
the gettext shared libs. If that was the case then we would
probably enable unconditionally.

However the 17K extra per util is too much for the feature IMHO.
As a side note, that wouldn't be an issue when building as a multi-call binary
which is an optional feature new in v8.23.

proper_name_utf8() is primarily used so that we're not
outputting non-ASCII chars in the "C" locale for example.

Getting into the feature itself, it's handy to have the
transliteration done inline, though it can be done independently.
Given the transliteration can be done independently, and since
only a few locales are providing transliterations,
I'm inclined to not use proper_name() at all and remove
that task from translators.

Now we could add some dependence and be more correct, by
adding pronunciation hints to the comments like:
  Pronunciation is like "Pawrig Brady" (Pɒrɪg Bredi)
However personally I don't care how my name is pronounced,
and think people tend to pattern match here anyway, rather
than care about pronunciation.

Also there is the more general point about how correct
it is to attribute a program to author(s) in any case,
as that is tracked at a much more accurate level of detail
by git blame etc.  Should we be removing output of
author names at runtime completely?

thanks,
Pádraig.
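
[Editorial illustration] The point above about not emitting non-ASCII characters in the "C" locale can be sketched as a choice between two spellings of the same name. This is a simplified assumption-laden model, not gnulib's proper_name_utf8(): author_name_for_codeset() is a hypothetical helper, and it treats only a codeset named "UTF-8" as able to display the accented form, whereas a real implementation would attempt a charset conversion.

```c
#include <string.h>

/* Hypothetical helper: pick the spelling that the output charset can
   represent.  Only the codeset "UTF-8" is assumed to handle the
   accented form; any other codeset gets the ASCII spelling.  */
static const char *
author_name_for_codeset (const char *name_ascii, const char *name_utf8,
                         const char *codeset)
{
  if (codeset != NULL && strcmp (codeset, "UTF-8") == 0)
    return name_utf8;
  return name_ascii;
}
```

In a real program the codeset argument could come from nl_langinfo (CODESET) after calling setlocale (LC_ALL, ""). In the "C" locale that is not UTF-8 on typical glibc systems, so the ASCII spelling would be chosen.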




Information forwarded to bug-coreutils <at> gnu.org:
bug#19503; Package coreutils. (Sun, 04 Jan 2015 16:52:02 GMT)

Message #11 received at 19503 <at> debbugs.gnu.org:

From: Jim Meyering <jim <at> meyering.net>
To: Pádraig Brady <P <at> draigbrady.com>
Cc: Daiki Ueno <ueno <at> gnu.org>, Benno Schulenberg <bensberg <at> justemail.net>,
 19503 <at> debbugs.gnu.org
Subject: Re: bug#19503: most translations of proper names aren't being used
Date: Sun, 4 Jan 2015 08:50:43 -0800
On Sun, Jan 4, 2015 at 7:53 AM, Pádraig Brady <P <at> draigbrady.com> wrote:
...
> Also there is the more general point about how correct
> it is to attribute a program to author(s) in any case,
> as that is tracked at a much more accurate level of detail
> by git blame etc.  Should we be removing output of
> author names at runtime completely?

We cannot do that blindly, since we lack version control history
from before 1992, which would make it appear that David J. MacKenzie
(who wrote many of these tools from scratch) contributed nothing.




Information forwarded to bug-coreutils <at> gnu.org:
bug#19503; Package coreutils. (Sun, 04 Jan 2015 17:44:02 GMT)

Message #14 received at 19503 <at> debbugs.gnu.org:

From: Pádraig Brady <P <at> draigBrady.com>
To: Jim Meyering <jim <at> meyering.net>
Cc: Benno Schulenberg <bensberg <at> justemail.net>, 19503 <at> debbugs.gnu.org,
 Daiki Ueno <ueno <at> gnu.org>
Subject: Re: bug#19503: most translations of proper names aren't being used
Date: Sun, 04 Jan 2015 17:43:20 +0000
On 04/01/15 16:50, Jim Meyering wrote:
> On Sun, Jan 4, 2015 at 7:53 AM, Pádraig Brady <P <at> draigbrady.com> wrote:
> ...
>> Also there is the more general point about how correct
>> it is to attribute a program to author(s) in any case,
>> as that is tracked at a much more accurate level of detail
>> by git blame etc.  Should we be removing output of
>> author names at runtime completely?
> 
> We cannot do that blindly, since we lack version control history
> from before 1992, which would make it appear that David J. MacKenzie
> (who wrote many of these tools from scratch) contributed nothing.

Well, we'd still leave the /* Written by ... */ comments at the start.
I'm just not convinced of the need for attribution at runtime,
given that it's inaccurate and awkward to represent.

BTW, it might have been nice to have the initial git commits
for these tools attributed to the original author. Hindsight and all that :)
Also I was wondering recently about the origins of some of this code,
and thought it might be useful to have a repo with commits per
release, which could be obtained from various old tarballs.
I did notice a few pre-1992 tarballs. I wonder what the best
source of these would be.

cheers,
Pádraig.




Information forwarded to bug-coreutils <at> gnu.org:
bug#19503; Package coreutils. (Sun, 04 Jan 2015 21:19:02 GMT)

Message #17 received at 19503 <at> debbugs.gnu.org:

From: Jim Meyering <jim <at> meyering.net>
To: Pádraig Brady <P <at> draigbrady.com>
Cc: Benno Schulenberg <bensberg <at> justemail.net>, 19503 <at> debbugs.gnu.org,
 Daiki Ueno <ueno <at> gnu.org>
Subject: Re: bug#19503: most translations of proper names aren't being used
Date: Sun, 4 Jan 2015 13:18:11 -0800
On Sun, Jan 4, 2015 at 9:43 AM, Pádraig Brady <P <at> draigbrady.com> wrote:
...
> BTW, it might have been nice to have the initial git commits
> for these tools attributed to the original author. Hindsight and all that :)

It would have been nice, indeed.

When I agreed to do the job of maintaining the fileutils, textutils and
shellutils packages, I tried hard to find a CVS repository (no
dVCS existed back then), but as far as I could tell, if there had been
one, it was gone. It was only reluctantly that I resorted to putting
old tarballs on versioned release branches.  While djm was the primary
author for many tools, there were invariably commits by others, too, as
seen in old/*/ChangeLog*.  However, without some VCS files, it was not
feasible to attribute.  Even with ChangeLog+CVS, automated attribution
as I did for the glibc cvs-to-git conversion was nontrivial: most commits
were done by Ulrich, but ChangeLog usually gave the name of the "Author",
and reliably mapping the cvs user-name or ChangeLog-attributed name to
a git Real Name/email pair took some work.




Information forwarded to bug-coreutils <at> gnu.org:
bug#19503; Package coreutils. (Sun, 21 Oct 2018 22:00:02 GMT)

Message #20 received at 19503 <at> debbugs.gnu.org:

From: Assaf Gordon <assafgordon <at> gmail.com>
To: 19503 <at> debbugs.gnu.org
Subject: Re: bug#19503: most translations of proper names aren't being used
Date: Sun, 21 Oct 2018 15:59:20 -0600
severity 19503 wishlist
tags 19503 wontfix
close 19503
stop

(triaging old bugs)

Hello,

On 04/01/15 08:53 AM, Pádraig Brady wrote:
> On 04/01/15 11:14, Benno Schulenberg wrote:
>>
>> Just now I noticed that 'du --version | tail -2' produces here this:
>>
>> Verkita de Torbjern GRANLUND (Torbjörn Granlund), David MacKenzie, Paul Eggert
>> kaj Jim Meyering.
>>
>> whereas I expected it to produce this:
>>
>> Verkita de Torbjern GRANLUND (Torbjörn Granlund), David MEKENZI (David MacKenzie), Paŭl EGERT (Paul Eggert)
>> kaj Ĝim MEJERING (Jim Meyering).
>>
> 
> Yes this isn't ideal.
> 
> tl;dr I'm thinking of just adding ASCII author names,
> and removing use of proper_name.

In 2016 commit eba871cd [1] converted all authors' names
to ASCII characters (following agreement from authors in [2]).

[1] https://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=eba871cd3237e8b7dcd9552f544b365934767849
[2] https://lists.gnu.org/archive/html/coreutils/2016-11/msg00039.html

As such, closing this request, but discussion can continue by replying
to this thread.

-assaf





Severity set to 'wishlist' from 'normal' Request was from Assaf Gordon <assafgordon <at> gmail.com> to control <at> debbugs.gnu.org. (Sun, 21 Oct 2018 22:00:03 GMT)

Added tag(s) wontfix. Request was from Assaf Gordon <assafgordon <at> gmail.com> to control <at> debbugs.gnu.org. (Sun, 21 Oct 2018 22:00:03 GMT)

bug closed, send any further explanations to 19503 <at> debbugs.gnu.org and Benno Schulenberg <bensberg <at> justemail.net> Request was from Assaf Gordon <assafgordon <at> gmail.com> to control <at> debbugs.gnu.org. (Sun, 21 Oct 2018 22:00:03 GMT)

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Mon, 19 Nov 2018 12:24:04 GMT)

This bug report was last modified 6 years and 215 days ago.



GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.