GNU bug report logs - #15535
24.3.50; basic-save-buffer should update buffer-file-coding-system value if the contents were written using different coding system

Package: emacs

Reported by: Dmitry Gutov <dgutov <at> yandex.ru>

Date: Sat, 5 Oct 2013 22:45:02 UTC

Severity: normal

Found in version 24.3.50

Fixed in version 24.4

Done: Dmitry Gutov <dgutov <at> yandex.ru>

Bug is archived. No further changes may be made.


Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#15535; Package emacs. (Sat, 05 Oct 2013 22:45:02 GMT)

Acknowledgement sent to Dmitry Gutov <dgutov <at> yandex.ru>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Sat, 05 Oct 2013 22:45:03 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: bug-gnu-emacs <at> gnu.org
Subject: 24.3.50;
 basic-save-buffer should update buffer-file-coding-system value if
 the contents were written using different coding system
Date: Sun, 06 Oct 2013 01:44:09 +0300

Otherwise it's hard to find out which coding system was used, after all.

See here why it's useful:
http://lists.gnu.org/archive/html/emacs-devel/2013-10/msg00129.html

The following test passes in Emacs 24.3 but fails on trunk:

(ert-deftest save-buffer-updates-buffer-file-coding-system ()
  (let ((file (expand-file-name "foo" temporary-file-directory))
        (default-buffer-file-coding-system 'utf-8-unix))
    (unwind-protect
        (with-temp-buffer
          (insert "abcdef\n")
          (write-file file))
      (with-current-buffer (find-file-noselect file)
        (should (eq 'undecided (coding-system-change-eol-conversion
                                buffer-file-coding-system nil)))
        (insert "водка матрёшка селёдка")
        (save-buffer)
        ;; Fails here:
        (should (eq 'utf-8-unix buffer-file-coding-system)))
      (delete-file file))))

In GNU Emacs 24.3.50.1 (x86_64-unknown-linux-gnu, GTK+ Version 3.6.4)
 of 2013-10-04 on axl
Bzr revision: 114513 eggert <at> cs.ucla.edu-20131003161631-vox3mdtalfjg13ed
Windowing system distributor `The X.Org Foundation', version 11.0.11303000
System Description:	Ubuntu 13.04




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#15535; Package emacs. (Sat, 05 Oct 2013 23:10:02 GMT)

Message #8 received at 15535 <at> debbugs.gnu.org:

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: 15535 <at> debbugs.gnu.org
Subject: Re: bug#15535: Acknowledgement (24.3.50; basic-save-buffer should
 update buffer-file-coding-system value if the contents were written using
 different coding system)
Date: Sun, 06 Oct 2013 02:09:01 +0300

Sorry, here's a better test:

(ert-deftest save-buffer-updates-buffer-file-coding-system ()
  (let ((file (expand-file-name "foo" temporary-file-directory))
        (default-buffer-file-coding-system 'utf-8-unix))
    (find-file file)
    (insert "abcdef\n")
    (save-buffer)
    (kill-buffer)
    (unwind-protect
        (with-current-buffer (find-file-noselect file)
          (should (eq 'undecided (coding-system-change-eol-conversion
                                  buffer-file-coding-system nil)))
          (insert "водка матрёшка селёдка")
          (save-buffer)
          (let ((coding-system buffer-file-coding-system))
            (kill-buffer)
            (should (eq 'utf-8-unix coding-system))))
      (delete-file file))))

Likewise, succeeds on 24.3, fails on trunk.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#15535; Package emacs. (Sun, 06 Oct 2013 16:52:02 GMT)

Message #11 received at 15535 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: Dmitry Gutov <dgutov <at> yandex.ru>, Kenichi Handa <handa <at> gnu.org>
Cc: 15535 <at> debbugs.gnu.org
Subject: Re: bug#15535: Acknowledgement (24.3.50;
 basic-save-buffer should update buffer-file-coding-system value
 if the contents were written using different coding system)
Date: Sun, 06 Oct 2013 19:51:34 +0300

(I've added Handa-san to this discussion, as I'm not sure I didn't miss
anything in looking into this.)

> Date: Sun, 06 Oct 2013 02:09:01 +0300
> From: Dmitry Gutov <dgutov <at> yandex.ru>
> 
> Sorry, here's a better test:
> 
> (ert-deftest save-buffer-updates-buffer-file-coding-system ()
>    (let ((file (expand-file-name "foo" temporary-file-directory))
>          (default-buffer-file-coding-system 'utf-8-unix))
>      (find-file file)
>      (insert "abcdef\n")
>      (save-buffer)
>      (kill-buffer)
>      (unwind-protect
>          (with-current-buffer (find-file-noselect file)
>            (should (eq 'undecided (coding-system-change-eol-conversion
>                                    buffer-file-coding-system nil)))
>            (insert "водка матрёшка селёдка")
>            (save-buffer)
>            (let ((coding-system buffer-file-coding-system))
>              (kill-buffer)
>              (should (eq 'utf-8-unix coding-system))))
>        (delete-file file))))
> 
> Likewise, succeeds on 24.3, fails on trunk.

Thanks.  For the record, a simpler test case is this:

 emacs -Q
 C-x C-f foo RET
 
Insert some ASCII text, then save the buffer, kill it, and visit the
file again:

 C-x C-s
 C-x k RET
 C-x C-f foo RET

You now have foo with `undecided' as its buffer-file-coding-system.
Then:

 C-u C-\ cyrillic-translit RET
 abvgde
 C-\
 C-x C-s

The file is saved (as UTF-8, as can be seen by examining it on disk),
but without asking for encoding, and without changing
buffer-file-coding-system to reflect the actual encoding.

What happens is that `undecided' silently encodes the buffer in UTF-8,
but never communicates that fact back to its callers.  So write-region
thinks it used `undecided', as does select-safe-coding-system.  The
latter is actually equipped to DTRT when the `prefer-utf-8' variant of
`undecided' is used, but that is not the case here.

Is this what was supposed to happen, or is something misbehaving here?

If the former, we could perhaps add some flag to struct undecided_spec
and set it whenever the encoder used by `undecided' sees a non-ASCII
character, and then use that flag to set last-coding-system-used to
UTF-8.  Does this make sense?
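
For reference, the mismatch described above can be observed from Lisp.
The following is only a sketch: `my-save-and-report-coding' is a
hypothetical helper name, not something in Emacs, and it assumes the
current buffer is visiting a file.  It saves the buffer and returns the
coding-system values recorded before and after the save, so they can be
compared with the bytes actually written to disk.

```elisp
;; Hypothetical helper, for illustration only; not part of Emacs.
(defun my-save-and-report-coding ()
  "Save the current buffer and return the coding systems Emacs recorded.
The result is a plist with the keys :before, :after and :last-used."
  (let ((before buffer-file-coding-system))
    (save-buffer)
    ;; Note: functions on `after-save-hook' may already have changed
    ;; `last-coding-system-used' by the time it is read here.
    (list :before before
          :after buffer-file-coding-system
          :last-used last-coding-system-used)))
```

Evaluating (my-save-and-report-coding) in place of the final C-x C-s in
the recipe above should, going by that description, report `undecided'
variants even though the file on disk is UTF-8.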




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#15535; Package emacs. (Sun, 06 Oct 2013 21:00:04 GMT)

Message #14 received at 15535 <at> debbugs.gnu.org:

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Eli Zaretskii <eliz <at> gnu.org>, Kenichi Handa <handa <at> gnu.org>
Cc: 15535 <at> debbugs.gnu.org
Subject: Re: bug#15535: Acknowledgement (24.3.50; basic-save-buffer should
 update buffer-file-coding-system value if the contents were written using
 different coding system)
Date: Sun, 06 Oct 2013 23:58:57 +0300

On 06.10.2013 19:51, Eli Zaretskii wrote:
> If the former, we could perhaps add some flag to struct undecided_spec
> and set it whenever the encoder used by `undecided' sees a non-ASCII
> character, and then use that flag to set last-coding-system-used to
> UTF-8.

That already happens (last-coding-system-used has the right value right 
after the file is written), but I don't think I can use it: even if 
`ruby-mode-set-encoding' is moved to after-save-hook, as long as it's 
not the first function in this hook (and I can't ensure that it is), the 
previous functions can also do some I/O and thus change 
last-coding-system-used's value.

And that's the reason I reverted 114527 in 114533, which in turn sparked
the discussion in emacs-devel.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#15535; Package emacs. (Mon, 07 Oct 2013 02:53:02 GMT)

Message #17 received at 15535 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: 15535 <at> debbugs.gnu.org, handa <at> gnu.org
Subject: Re: bug#15535: Acknowledgement (24.3.50;
 basic-save-buffer should update buffer-file-coding-system value
 if the contents were written using different coding system)
Date: Mon, 07 Oct 2013 05:52:38 +0300

> Date: Sun, 06 Oct 2013 23:58:57 +0300
> From: Dmitry Gutov <dgutov <at> yandex.ru>
> CC: 15535 <at> debbugs.gnu.org
> 
> On 06.10.2013 19:51, Eli Zaretskii wrote:
> > If the former, we could perhaps add some flag to struct undecided_spec
> > and set it whenever the encoder used by `undecided' sees a non-ASCII
> > character, and then use that flag to set last-coding-system-used to
> > UTF-8.
> 
> That already happens (last-coding-system-used has the right value right 
> after the file is written)

Not here, it doesn't.  I see 'undecided'.  And that is part of the
problem.

> but I don't think I can use it: even if 
> `ruby-mode-set-encoding' is moved to after-save-hook, as long as it's 
> not the first function in this hook (and I can't ensure that it is), the 
> previous functions can also do some I/O and thus change 
> last-coding-system-used's value.

You can always take the value of last-coding-system-used as the first
thing you do.  The problem is that the value is wrong, at least in the
scenario I used to reproduce the problem.
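
A minimal sketch of "take the value as the first thing you do", with a
hypothetical function name (`my-snapshot-coding-then-work') that is not
part of Emacs.  The call to `encode-coding-string' merely stands in for
any later work that overwrites `last-coding-system-used'.  The snapshot
only protects against I/O done later by the same function; it cannot
undo changes made by code that ran earlier.

```elisp
;; Hypothetical example, for illustration only; not part of Emacs.
(defun my-snapshot-coding-then-work ()
  "Return the value `last-coding-system-used' had on entry."
  ;; Snapshot the global before doing anything else.
  (let ((coding last-coding-system-used))
    ;; Unrelated encoding work; this sets `last-coding-system-used'.
    (encode-coding-string "x" 'latin-1)
    ;; The snapshot is unaffected by the call above.
    coding))
```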




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#15535; Package emacs. (Mon, 07 Oct 2013 04:01:01 GMT)

Message #20 received at 15535 <at> debbugs.gnu.org:

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 15535 <at> debbugs.gnu.org, handa <at> gnu.org
Subject: Re: bug#15535: Acknowledgement (24.3.50; basic-save-buffer should
 update buffer-file-coding-system value if the contents were written using
 different coding system)
Date: Mon, 07 Oct 2013 07:00:42 +0300

On 07.10.2013 05:52, Eli Zaretskii wrote:
> Not here, it doesn't.  I see 'undecided'.  And that is part of the
> problem.

True, sorry. It worked for me previously, but I guess the value was 
similarly spoiled by some other function in after-save-hook.

>> `ruby-mode-set-encoding' is moved to after-save-hook, as long as it's
>> not the first function in this hook (and I can't ensure that it is), the
>> previous functions can also do some I/O and thus change
>> last-coding-system-used's value.
>
> You can always take the value of last-coding-system-used as the first
> thing you do.

If "I" am a function inside after-save-hook, I don't control the "first 
thing".

But now I see that `basic-save-buffer' does save the value of 
`last-coding-system-used' to either `save-buffer-coding-system' or 
`buffer-file-coding-system', depending on whether the former is non-nil.

So I can use those, and the problem is reduced to having the right 
`last-coding-system-used' value set.
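
As a rough sketch of that approach (the names `my-saved-coding-system'
and `my-record-saved-coding' are hypothetical, not part of Emacs or
ruby-mode), a function on `after-save-hook' can read the value that
`basic-save-buffer' stored instead of `last-coding-system-used', so its
position in the hook no longer matters:

```elisp
;; Hypothetical names, for illustration only; not part of Emacs.
(defvar-local my-saved-coding-system nil
  "Coding system recorded for this buffer by `my-record-saved-coding'.")

(defun my-record-saved-coding ()
  "Record the coding system that `basic-save-buffer' stored after saving.
`basic-save-buffer' copies `last-coding-system-used' into
`save-buffer-coding-system' when that variable is non-nil, and into
`buffer-file-coding-system' otherwise, so check them in that order."
  (setq my-saved-coding-system
        (or save-buffer-coding-system buffer-file-coding-system)))

(add-hook 'after-save-hook #'my-record-saved-coding)
```

This is only as accurate as the value `basic-save-buffer' stored, which
is exactly what this report is about.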




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#15535; Package emacs. (Mon, 07 Oct 2013 15:00:04 GMT)

Message #23 received at 15535 <at> debbugs.gnu.org:

From: Kenichi Handa <handa <at> gnu.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 15535 <at> debbugs.gnu.org, dgutov <at> yandex.ru
Subject: Re: bug#15535: Acknowledgement (24.3.50;
 basic-save-buffer should update buffer-file-coding-system value
 if the contents were written using different coding system)
Date: Mon, 07 Oct 2013 23:59:25 +0900

In article <83vc1a5omh.fsf <at> gnu.org>, Eli Zaretskii <eliz <at> gnu.org> writes:
[...]
> You now have foo with `undecided' as its buffer-file-coding-system.
> Then:

>  C-u C-\ cyrillic-translit RET
>  abvgde
>  C-\
>  C-x C-s

> The file is saved (as UTF-8, as can be seen by examining it on disk),
> but without asking for encoding, and without changing
> buffer-file-coding-system to reflect the actual encoding.

> What happens is that `undecided' silently encodes the buffer in UTF-8,
> but never communicates that fact back to its callers.  So write-region
> thinks it used `undecided', as does select-safe-coding-system.  The
> latter is actually equipped to DTRT when the `prefer-utf-8' variant of
> `undecided' is used, but that is not the case here.

> Is this what was supposed to happen

No.  I think the behavior of 24.3 is correct.  So, some
change in trunk has a problem.  But, as far as I remember I
have not touched any codes that relate to this misbehavior.
I'm now investigating what has been changed from 24.3.

---
Kenichi Handa
handa <at> gnu.org




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#15535; Package emacs. (Sun, 13 Oct 2013 12:08:01 GMT)

Message #26 received at 15535 <at> debbugs.gnu.org:

From: Kenichi Handa <handa <at> gnu.org>
To: Kenichi Handa <handa <at> gnu.org>
Cc: 15535 <at> debbugs.gnu.org, eliz <at> gnu.org, dgutov <at> yandex.ru
Subject: Re: bug#15535: Acknowledgement (24.3.50;
 basic-save-buffer should update buffer-file-coding-system value
 if the contents were written using different coding system)
Date: Sun, 13 Oct 2013 21:06:59 +0900

In article <87siwd9lf6.fsf <at> gnu.org>, Kenichi Handa <handa <at> gnu.org> writes:

> No.  I think the behavior of 24.3 is correct.  So, some
> change in trunk has a problem.  But, as far as I remember I
> have not touched any codes that relate to this misbehavior.
> I'm now investigating what has been changed from 24.3.

Oops, it was me who enbugged...I had fixed the bug of 24.3
in a wrong way.  I've just committed the fix to trunk.

---
Kenichi Handa
handa <at> gnu.org




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#15535; Package emacs. (Sun, 13 Oct 2013 16:49:02 GMT)

Message #29 received at 15535 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: Kenichi Handa <handa <at> gnu.org>
Cc: 15535 <at> debbugs.gnu.org, dgutov <at> yandex.ru
Subject: Re: bug#15535: Acknowledgement (24.3.50;
 basic-save-buffer should update buffer-file-coding-system
 value if the contents were written using different coding system)
Date: Sun, 13 Oct 2013 19:48:05 +0300

> From: Kenichi Handa <handa <at> gnu.org>
> Cc: eliz <at> gnu.org, 15535 <at> debbugs.gnu.org, dgutov <at> yandex.ru
> Date: Sun, 13 Oct 2013 21:06:59 +0900
> 
> In article <87siwd9lf6.fsf <at> gnu.org>, Kenichi Handa <handa <at> gnu.org> writes:
> 
> > No.  I think the behavior of 24.3 is correct.  So, some
> > change in trunk has a problem.  But, as far as I remember I
> > have not touched any codes that relate to this misbehavior.
> > I'm now investigating what has been changed from 24.3.
> 
> Oops, it was me who enbugged...I had fixed the bug of 24.3
> in a wrong way.  I've just committed the fix to trunk.

Thanks, it works.




Reply sent to Dmitry Gutov <dgutov <at> yandex.ru>:
You have taken responsibility. (Sun, 13 Oct 2013 20:44:02 GMT)

Notification sent to Dmitry Gutov <dgutov <at> yandex.ru>:
bug acknowledged by developer. (Sun, 13 Oct 2013 20:44:02 GMT)

Message #34 received at 15535-done <at> debbugs.gnu.org:

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Eli Zaretskii <eliz <at> gnu.org>, Kenichi Handa <handa <at> gnu.org>
Cc: 15535-done <at> debbugs.gnu.org
Subject: Re: bug#15535: Acknowledgement (24.3.50; basic-save-buffer should
 update buffer-file-coding-system value if the contents were written using
 different coding system)
Date: Sun, 13 Oct 2013 23:43:29 +0300

Version: 24.4

Fixed for me too, thanks.




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Mon, 11 Nov 2013 12:24:03 GMT)

This bug report was last modified 11 years and 227 days ago.
