GNU bug report logs - #21574
Emacs mishandles ASCII .po files

Package: emacs

Reported by: Paul Eggert <eggert <at> cs.ucla.edu>

Date: Sun, 27 Sep 2015 20:02:02 UTC

Severity: normal

To reply to this bug, email your comments to 21574 AT debbugs.gnu.org.


Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#21574; Package emacs. (Sun, 27 Sep 2015 20:02:02 GMT)

Acknowledgement sent to Paul Eggert <eggert <at> cs.ucla.edu>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Sun, 27 Sep 2015 20:02:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Emacs bug reports and feature requests <bug-gnu-emacs <at> gnu.org>
Subject: po-mode mishandles ASCII files
Date: Sun, 27 Sep 2015 13:01:22 -0700
Emacs's po-mode mishandles .po files that specify charset=us-ascii.  To 
reproduce the problem on Fedora, run 'LC_ALL=cs_CZ.iso88592 emacs -Q fr.po' with 
fr.po being the attached file (taken from Texinfo 6.0), and type '# C-x 8 RET 
161 RET RET C-x C-s'.  The file will be saved with the line '#š' prepended, in 
Latin-2 encoding, even though the file declares its encoding to be 
charset=us-ascii.  If I visit the modified file again with 
'LC_ALL=fr_FR.iso88591 emacs -Q fr.po' I will see a first line of '#¹'.  This 
example is merely of a bad comment, but I suppose this could lead to a bad 
translation.

It's only a minor problem, as .po files should be using UTF-8 nowadays instead 
of US-ASCII.  I'm reporting it only because I suppose the bug could be more 
general than just po-mode.
[fr.po (text/x-gettext-translation, attachment)]
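The byte-level effect described in the report can be checked outside Emacs. The following is a minimal sketch in Python, not part of the original report; it assumes that 'C-x 8 RET 161 RET' inserts the character with hexadecimal code point 161 (U+0161), and the variable names are illustrative only.

```python
# Character typed in the report, assuming "161" is read as a hex code point.
typed = chr(0x161)  # LATIN SMALL LETTER S WITH CARON

# Byte written when the buffer is saved in Latin-2 (ISO-8859-2).
saved_bytes = typed.encode("iso8859_2")

# The same byte read back under a Latin-1 (ISO-8859-1) locale.
reread = saved_bytes.decode("iso8859_1")

# The charset declared in the file header cannot represent the character.
try:
    typed.encode("ascii")
    fits_ascii = True
except UnicodeEncodeError:
    fits_ascii = False

print(typed, saved_bytes, reread, fits_ascii)
```

The single byte 0xB9 is 'š' in Latin-2 and '¹' in Latin-1, which matches the two first lines seen in the report, and it is not valid US-ASCII at all.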

Message #8 received at 21574 <at> debbugs.gnu.org:

From: Glenn Morris <rgm <at> gnu.org>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: 21574 <at> debbugs.gnu.org
Subject: Re: bug#21574: po-mode mishandles ASCII files
Date: Sun, 27 Sep 2015 19:55:17 -0400
Paul Eggert wrote:

> Emacs's po-mode 

Not part of Emacs?





Message #11 received at 21574 <at> debbugs.gnu.org:

From: Glenn Morris <rgm <at> gnu.org>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: 21574 <at> debbugs.gnu.org
Subject: Re: bug#21574: po-mode mishandles ASCII files
Date: Sun, 27 Sep 2015 19:56:21 -0400
Glenn Morris wrote:

>> Emacs's po-mode 
>
> Not part of Emacs?

Blah, ignore me.




Message #14 received at 21574 <at> debbugs.gnu.org:

From: Glenn Morris <rgm <at> gnu.org>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: 21574 <at> debbugs.gnu.org
Subject: Re: bug#21574: po-mode mishandles ASCII files
Date: Sun, 27 Sep 2015 19:58:40 -0400
Glenn Morris wrote:

>>> Emacs's po-mode 
>>
>> Not part of Emacs?
>
> Blah, ignore me.

Or not.
Are you talking about po.el or po-mode.el?
.po files open in fundamental-mode for me.




Message #17 received at 21574 <at> debbugs.gnu.org:

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Glenn Morris <rgm <at> gnu.org>
Cc: 21574 <at> debbugs.gnu.org
Subject: Re: bug#21574: po-mode mishandles ASCII files
Date: Sun, 27 Sep 2015 19:20:35 -0700
Glenn Morris wrote:
> Or not.
> Are you talking about po.el or po-mode.el?
> .po files open in fundamental-mode for me.

Sorry, I meant whatever happens when you visit a .po file without doing anything 
special.  I thought Emacs dumped you into po-mode, but now I see that it's 
fundamental mode.  I'll retitle the bug report accordingly.




Changed bug title to 'Emacs mishandles ASCII .po files' from 'po-mode mishandles ASCII files'. Request was from Paul Eggert <eggert <at> cs.ucla.edu> to control <at> debbugs.gnu.org. (Mon, 28 Sep 2015 02:23:02 GMT)

Message #22 received at 21574 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: rgm <at> gnu.org, 21574 <at> debbugs.gnu.org
Subject: Re: bug#21574: po-mode mishandles ASCII files
Date: Mon, 28 Sep 2015 09:43:55 +0300
> From: Paul Eggert <eggert <at> cs.ucla.edu>
> Date: Sun, 27 Sep 2015 19:20:35 -0700
> Cc: 21574 <at> debbugs.gnu.org
> 
> Glenn Morris wrote:
> > Or not.
> > Are you talking about po.el or po-mode.el?
> > .po files open in fundamental-mode for me.
> 
> Sorry, I meant whatever happens when you visit a .po file without doing anything 
> special.  I thought Emacs dumped you into po-mode

In the old discussion I mentioned yesterday, we decided that the full
po-mode was too large and not really necessary to include in core, so
we only included the part that helps decoding the file's contents
correctly.

Perhaps we should revisit that old decision and include the full
po-mode now.




Message #25 received at 21574 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: 21574 <at> debbugs.gnu.org
Subject: Re: bug#21574: po-mode mishandles ASCII files
Date: Mon, 28 Sep 2015 13:18:49 +0300
> From: Paul Eggert <eggert <at> cs.ucla.edu>
> Date: Sun, 27 Sep 2015 13:01:22 -0700
> 
> Emacs's po-mode mishandles .po files that specify charset=us-ascii.  To 
> reproduce the problem on Fedora, run 'LC_ALL=cs_CZ.iso88592 emacs -Q fr.po' with 
> fr.po being the attached file (taken from Texinfo 6.0), and type '# C-x 8 RET 
> 161 RET RET C-x C-s'.  The file will be saved with the line '#š' prepended, in 
> Latin-2 encoding, even though the file declares its encoding to be 
> charset=us-ascii.

I think what you see is a side effect of a feature: when a character
is added that can be safely encoded by the default value of
buffer-file-coding-system, Emacs silently saves the file in that
encoding.  This feature was added in response to user requests not to
bother them with annoying requests to select an encoding when all they
did was add some non-ASCII text native to their locale to a file that
was previously ASCII-only.

What we need in this particular case, I think, is some code in po.el
that would function similarly to what we already do when the file has
a coding cookie, and the user adds characters that cannot be saved
with the encoding stated by the cookie.

Btw, do we have a similar problem in other files that have entries in
file-coding-system-alist, like XML files or LaTeX files?
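The kind of check proposed above (refuse or ask when the buffer contains characters that the declared encoding cannot represent) can be sketched as follows. This is an illustrative Python model, not Emacs Lisp and not the code in po.el; the helper names `declared_charset` and `unencodable_chars` and the sample header are assumptions made for the example.

```python
import codecs
import re

# Hypothetical pattern for the charset named in a PO header's Content-Type.
CHARSET_RE = re.compile(r"charset=([A-Za-z0-9_.:-]+)", re.IGNORECASE)


def declared_charset(text):
    """Return the normalized codec name declared in the text, or None."""
    match = CHARSET_RE.search(text)
    if match is None:
        return None
    try:
        return codecs.lookup(match.group(1)).name
    except LookupError:
        return None


def unencodable_chars(text, charset):
    """List the distinct characters that the given charset cannot encode."""
    bad = []
    for ch in text:
        try:
            ch.encode(charset)
        except UnicodeEncodeError:
            if ch not in bad:
                bad.append(ch)
    return bad


# Sample buffer contents: an ASCII-declared PO header with one added comment.
po_text = (
    "#\u0161\n"
    'msgid ""\n'
    'msgstr ""\n'
    '"Content-Type: text/plain; charset=us-ascii\\n"\n'
)

charset = declared_charset(po_text)
offending = unencodable_chars(po_text, charset)
print(charset, offending)
```

A non-empty `offending` list is the condition under which an editor would prompt the user instead of silently switching to the locale's encoding.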




Message #28 received at 21574 <at> debbugs.gnu.org:

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 21574 <at> debbugs.gnu.org
Subject: Re: bug#21574: po-mode mishandles ASCII files
Date: Mon, 28 Sep 2015 07:55:12 -0700
On 09/28/2015 03:18 AM, Eli Zaretskii wrote:
> do we have a similar problem in other files that have entries in
> file-coding-system-alist, like XML files or LaTeX files?

Not with XML, because xml-find-file-coding-system only tries 
detect-coding-region.

But there is a problem with LaTeX.  For example, if I visit a file like 
this:

  \usepackage[ascii]{inputenc}
  hello

The default coding system for saving the buffer is us-ascii-unix, which 
is correct (and is better than what we get with .po files). But if I 
then insert a non-ASCII character, say á, and type C-x C-s, Emacs 
changes the coding system to utf-8-unix without asking and saves the 
file.  If I then revisit the file I get garbage: the á is decoded as two 
bytes and I see \303\241 on the screen.
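The two bytes mentioned above can be reproduced with a short Python sketch (an illustration added here, not part of the original message): encoding á as UTF-8 yields two bytes, whose octal values are the \303\241 shown on revisiting the file.

```python
text = "\u00e1"  # á, LATIN SMALL LETTER A WITH ACUTE

# Bytes written when the buffer is saved as UTF-8.
utf8_bytes = text.encode("utf-8")

# Render each byte as a backslash-octal escape, the notation quoted above.
octal_escapes = "".join("\\%o" % b for b in utf8_bytes)

# Neither byte is valid US-ASCII, the encoding the file declares.
all_ascii = all(b < 0x80 for b in utf8_bytes)

print(utf8_bytes, octal_escapes, all_ascii)
```

Because both bytes are above 0x7F, a reader that trusts the `ascii` option of `inputenc` has no character to map them to, which is consistent with the garbage seen after revisiting.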




This bug report was last modified 9 years and 318 days ago.
