GNU bug report logs - #71080
30.0.50; UTF-8 used unconditionally when saving GPG file

Package: emacs;

Reported by: Stefan Monnier <monnier <at> iro.umontreal.ca>

Date: Mon, 20 May 2024 15:44:02 UTC

Severity: normal

Found in version 30.0.50

Done: Stefan Monnier <monnier <at> iro.umontreal.ca>

Bug is archived. No further changes may be made.


From: Eli Zaretskii <eliz <at> gnu.org>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 71080 <at> debbugs.gnu.org
Subject: bug#71080: 30.0.50; UTF-8 used unconditionally when saving GPG file
Date: Mon, 20 May 2024 19:20:55 +0300
> Cc: monnier <at> iro.umontreal.ca
> Date: Mon, 20 May 2024 11:43:31 -0400
> From:  Stefan Monnier via "Bug reports for GNU Emacs,
>  the Swiss army knife of text editors" <bug-gnu-emacs <at> gnu.org>
> 
> Then
> 
>     emacs -Q ~/tmp/foo.txt
>     C-x C-w foo.gpg RET        # To save the file into an encrypted `foo.gpg`.
>     TAB TAB RET                # To select symmetric encryption.
>     .. type the password you'd like to use ...
>     M-x revert-buffer RET
> 
> and then you should see that the `λ` turned into its UTF-8 sequence `\316\273`.
> The same happens if you encrypt with public keys and if you use any
> other encoding that's different from UTF-8.
> 
> AFAICT, the problem is partly due to
> 
>     (find-coding-systems-region (point-min) (point-max))
> 
> returning a list which includes `no-conversion` because in the end the
> buffer is saved with "no conversion" (i.e. it uses Emacs's internal
> encoding).  Another part of the problem is that `find-auto-coding`
> returns `no-conversion` for `.gpg` files because those files are binary.
> 
> IOW it can be considered as the result of "no conversion" being
> ambiguous, meaning either "binary" or "Emacs's internal encoding"
> depending on the circumstances.  But it's also due to the confusion
> between the encoding to use before encryption (resp. after decryption)
> and the encoding to use after encryption (resp. before decryption).
> 
> I don't understand enough of how the "no conversion" ambiguity is
> expected to be resolved, nor how the different layers of encoding
> are supposed to be handled in file-name-handlers to dig much deeper.

How can this work reliably, unless the *.gpg files can have some
meta-data that tells Emacs how to decode them?  When encoding, we
could perhaps use buffer-file-coding-system (AFAICT, we do that
indirectly now, via select-safe-coding-system), but what to do when
decoding?

If _you_ know the correct encoding, you could use "C-x RET c" before
the commands (as in "C-x RET c iso-2022-7bit RET C-x C-w").  Did you
try that?

IOW, I don't think the problem is with 'no-conversion', the problem is
that when decoding, we don't really have any useful info for how to
decode, and the locale-dependent ad-hoc'ery doesn't help because
GPG-encrypted stuff is likely to come from a different locale.  You
deliberately used iso-2022-7bit, which simulates such an "alien" file.
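
To make the byte-level effect concrete, here is a minimal sketch in Python (not Emacs Lisp, and not what epa does internally). It assumes only that the text was written out as UTF-8, and shows that the resulting bytes carry no indication of how they should be decoded; the variable names are purely illustrative.

```python
# Illustration only: the same bytes read back differently depending on
# which decoder the reader assumes.
text = "λ"
raw = text.encode("utf-8")                  # the bytes that end up on disk

# Render each byte as a backslash-octal escape, the way Emacs displays
# raw bytes in a buffer that was not decoded.
octal = "".join(f"\\{b:o}" for b in raw)

as_utf8 = raw.decode("utf-8")               # correct guess
as_latin1 = raw.decode("latin-1")           # wrong guess, no error raised

print(octal, as_utf8, as_latin1)
```

Nothing in `raw` distinguishes the two decodings, which is why the reader needs either out-of-band knowledge or an explicit choice such as "C-x RET c".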

Am I missing something (I know very little about epa, so apologies if
what I say makes no sense)?




This bug report was last modified 1 year and 50 days ago.


