GNU bug report logs -
#23595
25.1.50; file with chinese/japanese chars, vc-diff fails (HG, Git, RCS)
Reported by: Uwe Brauer <oub <at> mat.ucm.es>
Date: Sat, 21 May 2016 13:03:01 UTC
Severity: normal
Found in version 25.1.50
Done: Dmitry Gutov <dgutov <at> yandex.ru>
Bug is archived. No further changes may be made.
Message #44 received at 23595 <at> debbugs.gnu.org (full text, mbox):
On 05/23/2016 07:48 PM, Eli Zaretskii wrote:
>>> The resulting diff contains either rubbish or fails to run.
>>> Files attached.
>
> I don't see any rubbish in the Git output.
Might that have something to do with your OS? I see the mojibake, like
the others.
> Setting coding-system-for-read is correct, because the important use
> case is when the diffs are actually output. The problem is that
> UTF-16 is not ASCII-compatible, and so text output by Git itself will
> be mishandled. Another problem is that Git doesn't show the diffs at
> all.
Apparently so.
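To make the ASCII-compatibility point concrete, here is a minimal sketch in Python (not Emacs Lisp, and not what vc.el does internally). It assumes a single read encoding is applied to the whole output, and the header bytes and the helper name decode_output are hypothetical stand-ins for a line that Git itself writes.

```python
# Hypothetical stand-in for a header line that Git itself writes (plain ASCII).
GIT_HEADER = b"index 25b41e3..b62b68d 100644\n"


def decode_output(raw: bytes, encoding: str) -> str:
    """Decode tool output the way one global read encoding would."""
    return raw.decode(encoding, errors="replace")


# An ASCII-compatible encoding leaves Git's own text intact.
as_utf8 = decode_output(GIT_HEADER, "utf-8")

# UTF-16 consumes two bytes per code unit, so pairs of ASCII bytes are
# fused into unrelated characters: the header becomes mojibake.
as_utf16 = decode_output(GIT_HEADER, "utf-16-le")

print(repr(as_utf8))
print(repr(as_utf16))
```

The file contents may still need UTF-16 decoding, which is why one encoding for the whole buffer cannot serve both parts of the output.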
>> Which is weird, considering vc-diff-internal and vc-coding-system-for-diff have both been virtually untouched for the last couple of years.
>
> Not sure what you see as weird.
That we have a regression while the relevant functions didn't change.
Something probably changed on the lower level, and we might be wise to
figure out what (unless somebody already knows, and just didn't point
that out because it's not a bug).
>> But even if we figure out why this happens, you (Uwe) probably want Git, Hg, etc., to treat this file as text, and not binary. Only then will you be able to get meaningful diffs. I don't have any specific advice on that.
>
> Why can't we invoke "git diff --text"? That should fix the second
> problem, I think.
It does not. It forces Git to diff the file as text, but neither the
current code nor the patch at the end makes the displayed file contents
decode correctly.
I haven't tried Paul's solution for this myself, but it seems to be the
way to go.
> As for the first problem, we should probably refrain from binding
> coding-system-for-read to a CODING-SYSTEM for which
>
> (coding-system-get CODING-SYSTEM :ascii-compatible-p)
>
> returns nil. We should instead bind it to no-conversion and decode
> the file data parts by hand, skipping the parts that Git itself
> outputs (yes, this is messy). Patches to that effect are welcome.
Not sure what's the best place to do it, but the patch below gives me
24.5's behavior (correctly decoding the short "Binary files ... differ"
output). Could someone try it together with Paul's solution?
diff --git a/lisp/vc/vc.el b/lisp/vc/vc.el
index 25b41e3..b62b68d 100644
--- a/lisp/vc/vc.el
+++ b/lisp/vc/vc.el
@@ -1696,6 +1696,8 @@ vc-diff-internal
        (setq coding-system-for-read
              (coding-system-change-eol-conversion coding-system-for-read
                                                   'dos)))
+    (unless (coding-system-get coding-system-for-read :ascii-compatible-p)
+      (setq coding-system-for-read nil))
     (vc-setup-buffer buffer)
     (message "%s" (car messages))
     ;; Many backends don't handle well the case of a file that has been
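For readers who want to experiment with the idea outside Emacs, the guard in the patch above can be approximated in Python. This is only a rough analogue under stated assumptions: is_ascii_compatible and choose_read_encoding are hypothetical names, the byte-for-byte check is my own stand-in for the :ascii-compatible-p property, and returning None plays the role of setting coding-system-for-read to nil.

```python
from typing import Optional

# All 128 ASCII code points as raw bytes.
ASCII_BYTES = bytes(range(128))


def is_ascii_compatible(encoding: str) -> bool:
    """True if every ASCII character encodes to the same single byte."""
    try:
        return ASCII_BYTES.decode("ascii").encode(encoding) == ASCII_BYTES
    except (LookupError, UnicodeError):
        # Unknown codec, or one that cannot represent ASCII at all.
        return False


def choose_read_encoding(preferred: str) -> Optional[str]:
    """Keep `preferred` only if it is safe for mixed tool output.

    None means "do not force an encoding", mirroring the patch's
    fallback of setting coding-system-for-read to nil.
    """
    return preferred if is_ascii_compatible(preferred) else None


for name in ("utf-8", "latin-1", "utf-16", "utf-16-le"):
    print(name, "->", choose_read_encoding(name))
```

As in the patch, this only protects the text the VCS tool writes itself; it does nothing to decode the UTF-16 file contents.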
This bug report was last modified 9 years and 24 days ago.