GNU bug report logs - #23595
25.1.50; file with chinese/japanse chars, vc-diff fails (HG, Git, RCS)

Package: emacs;

Reported by: Uwe Brauer <oub <at> mat.ucm.es>

Date: Sat, 21 May 2016 13:03:01 UTC

Severity: normal

Found in version 25.1.50

Done: Dmitry Gutov <dgutov <at> yandex.ru>

Bug is archived. No further changes may be made.

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: oub <at> mat.ucm.es, Paul Eggert <eggert <at> cs.ucla.edu>, 23595 <at> debbugs.gnu.org
Subject: bug#23595: 25.1.50; file with chinese/japanse chars, vc-diff fails (HG, Git, RCS)
Date: Tue, 24 May 2016 00:02:36 +0300

On 05/23/2016 07:48 PM, Eli Zaretskii wrote:

>>> The resulting diff contains either rubbish or fails to run.
>>> Files attached.
>
> I don't see any rubbish in the Git output.

Might that have something to do with your OS? I see the mojibake, like 
the others do.

> Setting coding-system-for-read is correct, because the important use
> case is when the diffs are actually output.  The problem is that
> UTF-16 is not ASCII-compatible, and so text output by Git itself will
> be mishandled.  Another problem is that Git doesn't show the diffs at
> all.

Apparently so.
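
To illustrate the point about ASCII compatibility, here is a minimal sketch, assuming a stock Emacs where the `utf-8`, `utf-16` and `utf-16le` coding systems are defined. It is not taken from vc.el; it only shows what the `:ascii-compatible-p` property reports and what happens to ASCII text, such as the headers Git prints, when it is decoded as UTF-16.

```elisp
;; Ask Emacs whether a coding system is ASCII-compatible.
(coding-system-get 'utf-8  :ascii-compatible-p)   ; non-nil
(coding-system-get 'utf-16 :ascii-compatible-p)   ; nil

;; ASCII bytes misread as little-endian UTF-16: each pair of bytes
;; becomes one unrelated character, so the 4-byte string "diff"
;; decodes to the two characters U+6964 and U+6666.
(decode-coding-string "diff" 'utf-16le)           ; => "\u6964\u6666"
```

The second form is the kind of mojibake one would expect whenever Git's own ASCII output is read through a coding system that is not ASCII-compatible.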

>> Which is weird, considering that vc-diff-internal and vc-coding-system-for-diff have both been virtually untouched for the last couple of years.
>
> Not sure what you see as weird.

That we have a regression while the relevant functions didn't change. 
Something probably changed at a lower level, and we would be wise to 
figure out what (unless somebody already knows, and just didn't point 
that out because it's not a bug).

>> But even if we figure out why this happens, you (Uwe) probably want Git, Hg, etc. to treat this file as text, and not binary. Only then will you be able to get meaningful diffs. I don't have any specific advice on that.
>
> Why can't we invoke "git diff --text"?  That should fix the second
> problem, I think.

It does not. It forces Git to diff the file as text, but neither the 
current code nor the patch at the end gets the displayed file contents 
decoded correctly.

I haven't tried Paul's solution for this myself, but it seems to be the 
way to go.

> As for the first problem, we should probably refrain from binding
> coding-system-for-read to a CODING-SYSTEM for which
>
>    (coding-system-get CODING-SYSTEM :ascii-compatible-p)
>
> returns nil.  We should instead bind it to no-conversion and decode
> the file data parts by hand, skipping the parts that Git itself
> outputs (yes, this is messy).  Patches to that effect are welcome.

Not sure what the best place to do it is, but the patch below gives me 
24.5's behavior (correctly decoding the short "Binary files ... differ" 
output). Could someone try it together with Paul's solution?

diff --git a/lisp/vc/vc.el b/lisp/vc/vc.el
index 25b41e3..b62b68d 100644
--- a/lisp/vc/vc.el
+++ b/lisp/vc/vc.el
@@ -1696,6 +1696,8 @@ vc-diff-internal
 	(setq coding-system-for-read
 	      (coding-system-change-eol-conversion coding-system-for-read
 						   'dos)))
+    (unless (coding-system-get coding-system-for-read :ascii-compatible-p)
+      (setq coding-system-for-read nil))
     (vc-setup-buffer buffer)
     (message "%s" (car messages))
     ;; Many backends don't handle well the case of a file that has been
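
The check added by the patch can also be read in isolation. The following is a minimal sketch only: `my-vc-diff-safe-coding-system` is a hypothetical helper name, not something that exists in vc.el, and it merely restates the patch's test as a function that can be tried in `*scratch*`.

```elisp
;; Hypothetical helper, not part of vc.el: mirrors the check in the
;; patch above.
(defun my-vc-diff-safe-coding-system (coding-system)
  "Return CODING-SYSTEM if it is ASCII-compatible, otherwise nil.
A nil result, when bound to `coding-system-for-read', means that no
override is imposed on how the output is decoded."
  (and coding-system
       (coding-system-get coding-system :ascii-compatible-p)
       coding-system))

(my-vc-diff-safe-coding-system 'utf-8)   ; => utf-8
(my-vc-diff-safe-coding-system 'utf-16)  ; => nil
```

With a result of nil, the diff output is decoded by Emacs's usual defaults instead of being forced through UTF-16, which would explain why the short "Binary files ... differ" line becomes readable again.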





This bug report was last modified 9 years and 24 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.