GNU bug report logs - #15535
24.3.50; basic-save-buffer should update buffer-file-coding-system value if the contents were written using different coding system
Reported by: Dmitry Gutov <dgutov <at> yandex.ru>
Date: Sat, 5 Oct 2013 22:45:02 UTC
Severity: normal
Found in version 24.3.50
Fixed in version 24.4
Done: Dmitry Gutov <dgutov <at> yandex.ru>
Bug is archived. No further changes may be made.
Message #11 received at 15535 <at> debbugs.gnu.org (full text, mbox):
(I've added Handa-san to this discussion, as I'm not sure I didn't miss
anything in looking into this.)
> Date: Sun, 06 Oct 2013 02:09:01 +0300
> From: Dmitry Gutov <dgutov <at> yandex.ru>
>
> Sorry, here's a better test:
>
> (ert-deftest save-buffer-updates-buffer-file-coding-system ()
>   (let ((file (expand-file-name "foo" temporary-file-directory))
>         (default-buffer-file-coding-system 'utf-8-unix))
>     (find-file file)
>     (insert "abcdef\n")
>     (save-buffer)
>     (kill-buffer)
>     (unwind-protect
>         (with-current-buffer (find-file-noselect file)
>           (should (eq 'undecided (coding-system-change-eol-conversion
>                                   buffer-file-coding-system nil)))
>           (insert "водка матрёшка селёдка")
>           (save-buffer)
>           (let ((coding-system buffer-file-coding-system))
>             (kill-buffer)
>             (should (eq 'utf-8-unix coding-system))))
>       (delete-file file))))
>
> Likewise, succeeds on 24.3, fails on trunk.

Thanks.  For the record, a simpler test case is this:

  emacs -Q
  C-x C-f foo RET

Insert some ASCII text, then save the buffer, kill it, and visit the
file again:

  C-x C-s
  C-x k RET
  C-x C-f foo RET

You now have foo with `undecided' as its buffer-file-coding-system.
Then:

  C-u C-\ cyrillic-translit RET
  abvgde
  C-\
  C-x C-s

The file is saved (as UTF-8, as can be seen by examining it on disk),
but without asking for encoding, and without changing
buffer-file-coding-system to reflect the actual encoding.

What happens is that `undecided' silently encodes the buffer in UTF-8,
but never communicates that fact back to its callers.  So write-region
thinks it used `undecided', as does select-safe-coding-system.  The
latter is actually equipped to DTRT when the `prefer-utf-8' variant of
`undecided' is used, but that is not the case here.

Is this what was supposed to happen, or is something misbehaving here?
If the former, we could perhaps add some flag to struct undecided_spec
and set it whenever the encoder used by `undecided' sees a non-ASCII
character, and then use that flag to set last-coding-system-used to
UTF-8.  Does this make sense?
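
As a rough illustration of the flag idea above, here is a standalone toy
model in C.  It is not code from coding.c: the struct name, the
saw_non_ascii member, and both functions are made up for this sketch,
and the real struct undecided_spec has other members that are left out.
The toy encoder assumes its input is already UTF-8 bytes and just
copies them through.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for Emacs's struct undecided_spec.  Only the
   proposed flag is modelled; the name saw_non_ascii is an assumption.  */
struct toy_undecided_spec
{
  bool saw_non_ascii;
};

/* Toy "undecided" encoder: copy LEN bytes from SRC to DST and record in
   SPEC whether any byte was outside the ASCII range.  Returns the
   number of bytes written.  DST must have room for LEN bytes.  */
size_t
toy_encode_undecided (struct toy_undecided_spec *spec,
                      const unsigned char *src, size_t len,
                      unsigned char *dst)
{
  assert (spec != NULL);
  for (size_t i = 0; i < len; i++)
    {
      if (src[i] >= 0x80)
        spec->saw_non_ascii = true;   /* sticky: never reset here */
      dst[i] = src[i];
    }
  return len;
}

/* What a caller would then report as the coding system actually used:
   "utf-8" once a non-ASCII byte has been seen, "undecided" otherwise.  */
const char *
toy_last_coding_system_used (const struct toy_undecided_spec *spec)
{
  return spec->saw_non_ascii ? "utf-8" : "undecided";
}
```

With this shape, a caller playing the role of write-region would consult
toy_last_coding_system_used after encoding instead of assuming that the
coding system it asked for is the one that was applied.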
This bug report was last modified 11 years and 227 days ago.