GNU bug report logs - #21574
Emacs mishandles ASCII .po files
To reply to this bug, email your comments to 21574 AT debbugs.gnu.org.
Report forwarded to bug-gnu-emacs <at> gnu.org: bug#21574; Package emacs. (Sun, 27 Sep 2015 20:02:02 GMT)
Acknowledgement sent to Paul Eggert <eggert <at> cs.ucla.edu>: New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Sun, 27 Sep 2015 20:02:02 GMT)
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
Emacs's po-mode mishandles .po files that specify charset=us-ascii. To
reproduce the problem on Fedora, run 'LC_ALL=cs_CZ.iso88592 emacs -Q fr.po' with
fr.po being the attached file (taken from Texinfo 6.0), and type '# C-x 8 RET
161 RET RET C-x C-s'. The file will be saved with the line '#š' prepended, in
Latin-2 encoding, even though the file declares its encoding to be
charset=us-ascii. If I visit the modified file again with
'LC_ALL=fr_FR.iso88591 emacs -Q fr.po' I will see a first line of '#¹'. This
example is merely of a bad comment, but I suppose this could lead to a bad
translation.
It's only a minor problem, as .po files should be using UTF-8 nowadays instead
of US-ASCII. I'm reporting it only because I suppose the bug could be more
general than just po-mode.
[fr.po (text/x-gettext-translation, attachment)]
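The byte-level effect described in the report can be reproduced outside Emacs. The following is a minimal Python sketch, not Emacs's own code: it assumes that the character inserted by 'C-x 8 RET 161 RET' is U+0161 (š), which is what the report's output implies, and it uses Python's `iso8859_2` and `iso8859_1` codecs to stand in for the Latin-2 and Latin-1 locales. The helper name `fits_ascii` is hypothetical.

```python
# Assumption: `C-x 8 RET 161 RET` inserts U+0161, as the report's output implies.
ch = "\u0161"  # š, LATIN SMALL LETTER S WITH CARON

# What gets written when the new first line is saved in Latin-2.
latin2_bytes = ("#" + ch).encode("iso8859_2")

# The same bytes, read back by a session that decodes them as Latin-1.
reread_latin1 = latin2_bytes.decode("iso8859_1")


def fits_ascii(text):
    """Return True if text can be encoded in US-ASCII (hypothetical helper)."""
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False


print(latin2_bytes)           # the single byte 0xB9 follows '#'
print(reread_latin1)          # the byte 0xB9 is a superscript one in Latin-1
print(fits_ascii("#" + ch))   # the inserted line does not fit the declared charset
```

The sketch shows why the two sessions disagree: the byte 0xB9 means š in Latin-2 and ¹ in Latin-1, and neither fits a file that declares charset=us-ascii.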
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#21574; Package emacs. (Sun, 27 Sep 2015 23:56:02 GMT)
Message #8 received at 21574 <at> debbugs.gnu.org (full text, mbox):
Paul Eggert wrote:
> Emacs's po-mode
Not part of Emacs?
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#21574; Package emacs. (Sun, 27 Sep 2015 23:57:02 GMT)
Message #11 received at 21574 <at> debbugs.gnu.org (full text, mbox):
Glenn Morris wrote:
>> Emacs's po-mode
>
> Not part of Emacs?
Blah, ignore me.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#21574; Package emacs. (Sun, 27 Sep 2015 23:59:02 GMT)
Message #14 received at 21574 <at> debbugs.gnu.org (full text, mbox):
Glenn Morris wrote:
>>> Emacs's po-mode
>>
>> Not part of Emacs?
>
> Blah, ignore me.
Or not.
Are you talking about po.el or po-mode.el?
.po files open in fundamental-mode for me.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#21574; Package emacs. (Mon, 28 Sep 2015 02:21:01 GMT)
Message #17 received at 21574 <at> debbugs.gnu.org (full text, mbox):
Glenn Morris wrote:
> Or not.
> Are you talking about po.el or po-mode.el?
> .po files open in fundamental-mode for me.
Sorry, I meant whatever happens when you visit a .po file without doing anything
special. I thought Emacs dumped you into po-mode, but now I see that it's
fundamental mode. I'll retitle the bug report accordingly.
Changed bug title to 'Emacs mishandles ASCII .po files' from 'po-mode mishandles ASCII files'
Request was from Paul Eggert <eggert <at> cs.ucla.edu> to control <at> debbugs.gnu.org. (Mon, 28 Sep 2015 02:23:02 GMT)
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#21574; Package emacs. (Mon, 28 Sep 2015 06:44:03 GMT)
Message #22 received at 21574 <at> debbugs.gnu.org (full text, mbox):
> From: Paul Eggert <eggert <at> cs.ucla.edu>
> Date: Sun, 27 Sep 2015 19:20:35 -0700
> Cc: 21574 <at> debbugs.gnu.org
>
> Glenn Morris wrote:
> > Or not.
> > Are you talking about po.el or po-mode.el?
> > .po files open in fundamental-mode for me.
>
> Sorry, I meant whatever happens when you visit a .po file without doing anything
> special. I thought Emacs dumped you into po-mode
In the old discussion I mentioned yesterday, we decided that the full
po-mode was too large and not really necessary to include in core, so
we only included the part that helps decode the file's contents
correctly.
Perhaps we should revisit that old decision and include the full
po-mode now.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#21574; Package emacs. (Mon, 28 Sep 2015 10:19:01 GMT)
Message #25 received at 21574 <at> debbugs.gnu.org (full text, mbox):
> From: Paul Eggert <eggert <at> cs.ucla.edu>
> Date: Sun, 27 Sep 2015 13:01:22 -0700
>
> Emacs's po-mode mishandles .po files that specify charset=us-ascii. To
> reproduce the problem on Fedora, run 'LC_ALL=cs_CZ.iso88592 emacs -Q fr.po' with
> fr.po being the attached file (taken from Texinfo 6.0), and type '# C-x 8 RET
> 161 RET RET C-x C-s'. The file will be saved with the line '#š' prepended, in
> Latin-2 encoding, even though the file declares its encoding to be
> charset=us-ascii.
I think what you see is a side effect of a feature: when a character
is added that can be safely encoded by the default value of
buffer-file-coding-system, Emacs silently saves the file in that
encoding. This feature was added in response to user requests not to
bother them with annoying requests to select an encoding when all they
did was add some non-ASCII text native to their locale to a file that
was previously ASCII-only.
What we need in this particular case, I think, is some code in po.el
that would function similarly to what we already do when the file has
a coding cookie, and the user adds characters that cannot be saved
with the encoding stated by the cookie.
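The kind of check suggested here can be sketched in a few lines. This is an illustration in Python rather than Emacs Lisp, and it is not what po.el does: the function names `declared_charset` and `unencodable_chars` are hypothetical, and the sketch assumes the PO header names its encoding in a `charset=` field of the Content-Type line, as the attached fr.po does.

```python
import codecs
import re


def declared_charset(po_text):
    """Return the charset named in the PO header, or None (hypothetical helper)."""
    match = re.search(r"charset=([A-Za-z0-9_.:-]+)", po_text)
    return match.group(1) if match else None


def unencodable_chars(po_text):
    """List the characters that the declared charset cannot encode.

    Returns an empty list when no charset is declared or when Python
    does not know the declared name, since nothing can be checked then.
    """
    charset = declared_charset(po_text)
    if charset is None:
        return []
    try:
        codecs.lookup(charset)
    except LookupError:
        return []
    bad = []
    for c in po_text:
        try:
            c.encode(charset)
        except UnicodeEncodeError:
            bad.append(c)
    return bad


header = 'msgid ""\nmsgstr ""\n"Content-Type: text/plain; charset=us-ascii\\n"\n'
print(unencodable_chars(header))               # nothing to report
print(unencodable_chars("#\u0161\n" + header))  # the added comment character
```

An editor applying such a check at save time could ask the user what to do whenever the list is non-empty, instead of silently switching to the locale's encoding.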
Btw, do we have a similar problem in other files that have entries in
file-coding-system-alist, like XML files or LaTeX files?
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#21574; Package emacs. (Mon, 28 Sep 2015 14:56:02 GMT)
Message #28 received at 21574 <at> debbugs.gnu.org (full text, mbox):
On 09/28/2015 03:18 AM, Eli Zaretskii wrote:
> do we have a similar problem in other files that have entries in
> file-coding-system-alist, like XML files or LaTeX files?
Not with XML, because xml-find-file-coding-system only tries
detect-coding-region.
But there is a problem with LaTeX. For example, if I visit a file like
this:
\usepackage[ascii]{inputenc}
hello
The default coding system for saving the buffer is us-ascii-unix, which
is correct (and is better than what we get with .po files). But if I
then insert a non-ASCII character, say á, and type C-x C-s, Emacs
changes the coding system to utf-8-unix without asking and saves the
file. If I then revisit the file I get garbage: the á is decoded as two
bytes and I see \303\241 on the screen.
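The \303\241 display follows directly from the UTF-8 encoding of á. A small Python sketch, offered only to illustrate the bytes involved and not as a model of Emacs's decoding logic:

```python
ch = "\u00e1"  # á, LATIN SMALL LETTER A WITH ACUTE

# UTF-8 encodes this character as two bytes.
utf8_bytes = ch.encode("utf-8")

# Render each byte as a backslash followed by its octal value,
# matching the way raw bytes are shown in the report.
octal_display = "".join("\\%o" % b for b in utf8_bytes)

# Neither byte is valid US-ASCII, so an ASCII decode cannot succeed.
try:
    utf8_bytes.decode("ascii")
    decodes_as_ascii = True
except UnicodeDecodeError:
    decodes_as_ascii = False

print(len(utf8_bytes))
print(octal_display)
print(decodes_as_ascii)
```

Both bytes have the high bit set, so a reader that honors the file's declared ASCII encoding has no character to map them to and can only show them as raw bytes.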
This bug report was last modified 9 years and 318 days ago.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.