From unknown Mon Jun 23 07:50:19 2025
Subject: bug#2741: Mixed UTF-8 and raw bytes in output of vc-annotate after (set-language-environment "UTF-8")
X-Emacs-PR-Message: report 2741
From: Juanma Barranquero
To: Emacs Bug Tracker
Date: Sun, 22 Mar 2009 00:23:32 +0100
Content-Type: text/plain; charset=UTF-8

1) Create a Git repository and add a Latin-1 file with some non-ASCII
characters. In my example, the archive test.txt contains the following
text:

  A few Spanish characters: áéíóúüñ

2) Execute "emacs -Q test.txt -f vc-annotate". The resulting
*Annotate test.txt* buffer has buffer-file-coding-system
`iso-latin-1-dos' and shows:

  ^7fb00c1 (Juanma Barranquero 2009-03-22 00:01:39 +0100 1) A few Spanish characters: áéíóúüñ

3) Set LANG to UTF-8 (for example, "set LANG=en_US.UTF-8"), and repeat
"emacs -Q test.txt -f vc-annotate". Now the *Annotate* buffer is in
`utf-8-dos', and shows:

  ^7fb00c1 (Juanma Barranquero 2009-03-22 00:01:39 +0100 1) A few Spanish characters: áéíóúüñ

4) Finally, after unsetting LANG or not (it is irrelevant) do

  emacs -Q --eval "(set-language-environment \"UTF-8\")" test.txt -f vc-annotate

Now the *Annotate* buffer is in `utf-8-dos', but contains a mixture
of utf-8 and raw bytes:

  ^7fb00c1 (Juanma Barranquero 2009-03-22 00:01:39 +0100 1) A few Spanish characters: \341\351\355\363\372\374\361

    Juanma

From unknown Mon Jun 23 07:50:19 2025
Subject: bug#2741: Mixed UTF-8 and raw bytes in output of vc-annotate after (set-language-environment "UTF-8")
X-Emacs-PR-Message: followup 2741
From: Stefan Monnier
To: Juanma Barranquero
Cc: 2741@debbugs.gnu.org, Emacs Bug Tracker
Date: Sat, 21 Mar 2009 21:23:16 -0400
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/23.0.91 (gnu/linux)
Content-Type: text/plain; charset=us-ascii

> 4) Finally, after unsetting LANG or not (it is irrelevant) do
> emacs -Q --eval "(set-language-environment \"UTF-8\")" test.txt -f
> vc-annotate
> Now the *Annotate* buffer is in `utf-8-dos', but contains a mixture
> of utf-8 and raw bytes:
> ^7fb00c1 (Juanma Barranquero 2009-03-22 00:01:39 +0100 1) A few
> Spanish characters: \341\351\355\363\372\374\361

I don't see a mixture of anything, I just see latin-1 encoded chars
decoded incorrectly because Emacs somehow decided to try and decode the
stream using the utf-8 coding-system.

But yes that's a bug. `vc-annotate' should use the main file's
coding-system to decode the annotated text, regardless of
language environment.

        Stefan

From unknown Mon Jun 23 07:50:19 2025
Subject: bug#2741: Mixed UTF-8 and raw bytes in output of vc-annotate after (set-language-environment "UTF-8")
X-Emacs-PR-Message: followup 2741
From: Juanma Barranquero
To: Stefan Monnier
Cc: 2741@debbugs.gnu.org
Date: Sun, 22 Mar 2009 02:31:26 +0100
Content-Type: text/plain; charset=UTF-8

On Sun, Mar 22, 2009 at 02:23, Stefan Monnier wrote:

> I don't see a mixture of anything, I just see latin-1 encoded chars
> decoded incorrectly because Emacs somehow decided to try and decode the
> stream using the utf-8 coding-system.

Whatever. What I meant is that the buffer is nominally utf-8, but
contains raw bytes.

> But yes that's a bug. `vc-annotate' should use the main file's
> coding-system to decode the annotated text, regardless of
> language environment.

It seems also a bug that the behavior is different between

  emacs -Q --eval "(set-language-environment \"UTF-8\")"

and

  set LANG=utf8.UTF-8
  emacs -Q

when, in both cases, `current-language-environment' is "UTF-8".

    Juanma

From lekktu@gmail.com Mon Mar 23 03:06:30 2009
From: Juanma Barranquero
To: control@debbugs.gnu.org
Date: Mon, 23 Mar 2009 11:06:06 +0100
Subject: Re: Processed (with 1 errors): your mail
Content-Type: text/plain; charset=UTF-8

retitle 2741 Decoding of vc-annotate output affected by language environment
quit

From unknown Mon Jun 23 07:50:19 2025
Subject: bug#2741: Mixed UTF-8 and raw bytes in output of vc-annotate after (set-language-environment "UTF-8")
X-Emacs-PR-Message: followup 2741
From: Juanma Barranquero
To: Stefan Monnier
Cc: 2741@debbugs.gnu.org
Date: Thu, 10 Sep 2009 01:18:20 +0200
Content-Type: text/plain; charset=UTF-8

On Sun, Mar 22, 2009 at 03:23, Stefan Monnier wrote:

> I don't see a mixture of anything, I just see latin-1 encoded chars
> decoded incorrectly because Emacs somehow decided to try and decode the
> stream using the utf-8 coding-system.

> But yes that's a bug. `vc-annotate' should use the main file's
> coding-system to decode the annotated text, regardless of
> language environment.

The following patch fixes it. The change is in `vc-annotate' and not
`vc-git-annotate-command' because the bug is not git-specific. I can
easily reproduce it with bzr, for example.

    Juanma


2009-09-09  Juanma Barranquero

        * vc-annotate.el (vc-annotate): Use the main file's coding-system to
        decode annotated text, regardless of language environment.  (Bug#2741)


Index: vc-annotate.el
===================================================================
RCS file: /cvsroot/emacs/emacs/lisp/vc-annotate.el,v
retrieving revision 1.8
diff -u -2 -r1.8 vc-annotate.el
--- vc-annotate.el      10 Mar 2009 00:59:09 -0000      1.8
+++ vc-annotate.el      9 Sep 2009 23:11:24 -0000
@@ -376,5 +376,6 @@
             (setq temp-buffer-name (buffer-name))))
     (with-output-to-temp-buffer temp-buffer-name
-      (let ((backend (vc-backend file)))
+      (let ((backend (vc-backend file))
+            (coding-system-for-read buffer-file-coding-system))
         (vc-call-backend backend 'annotate-command file
                          (get-buffer temp-buffer-name) rev)

From unknown Mon Jun 23 07:50:19 2025
MIME-Version: 1.0
From: help-debbugs@gnu.org (Emacs bug Tracking System)
To: Juanma Barranquero
Subject: bug#2741 closed by Juanma Barranquero (Re: bug#2741: Mixed UTF-8 and raw bytes in output of vc-annotate after (set-language-environment "UTF-8"))
X-Emacs-PR-Message: they-closed 2741
Reply-To: 2741@debbugs.gnu.org
Date: Fri, 11 Sep 2009 11:10:09 +0000
Content-Type: multipart/mixed; boundary="----------=_1252667409-26938-1"

This is a multi-part message in MIME format...

------------=_1252667409-26938-1
Content-Disposition: inline
Content-Type: text/plain; charset="utf-8"

This is an automatic notification regarding your bug report
which was filed against the emacs package:

#2741: Decoding of vc-annotate output affected by language environment

It has been closed by Juanma Barranquero.

Their explanation is attached below along with your original report.
If this explanation is unsatisfactory and you have not received a
better one in a separate message then please contact Juanma Barranquero by
replying to this email.

-- 
2741: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=2741
Emacs Bug Tracking System
Contact help-debbugs@gnu.org with problems

------------=_1252667409-26938-1
Content-Type: message/rfc822
Content-Disposition: inline

From: Juanma Barranquero
To: Stefan Monnier
Cc: 2741-done@debbugs.gnu.org
Date: Fri, 11 Sep 2009 13:02:51 +0200
Subject: Re: bug#2741: Mixed UTF-8 and raw bytes in output of vc-annotate after (set-language-environment "UTF-8")
Content-Type: text/plain; charset=UTF-8

On Thu, Sep 10, 2009 at 01:18, Juanma Barranquero wrote:

>        * vc-annotate.el (vc-annotate): Use the main file's coding-system to
>        decode annotated text, regardless of language environment.  (Bug#2741)

I've installed this change.

    Juanma

------------=_1252667409-26938-1
Content-Type: message/rfc822
Content-Disposition: inline

From: Juanma Barranquero
To: Emacs Bug Tracker
Date: Sun, 22 Mar 2009 00:23:32 +0100
Subject: Mixed UTF-8 and raw bytes in output of vc-annotate after (set-language-environment "UTF-8")
Content-Type: text/plain; charset=UTF-8

1) Create a Git repository and add a Latin-1 file with some non-ASCII
characters. In my example, the archive test.txt contains the following
text:

  A few Spanish characters: áéíóúüñ

2) Execute "emacs -Q test.txt -f vc-annotate". The resulting
*Annotate test.txt* buffer has buffer-file-coding-system
`iso-latin-1-dos' and shows:

  ^7fb00c1 (Juanma Barranquero 2009-03-22 00:01:39 +0100 1) A few Spanish characters: áéíóúüñ

3) Set LANG to UTF-8 (for example, "set LANG=en_US.UTF-8"), and repeat
"emacs -Q test.txt -f vc-annotate". Now the *Annotate* buffer is in
`utf-8-dos', and shows:

  ^7fb00c1 (Juanma Barranquero 2009-03-22 00:01:39 +0100 1) A few Spanish characters: áéíóúüñ

4) Finally, after unsetting LANG or not (it is irrelevant) do

  emacs -Q --eval "(set-language-environment \"UTF-8\")" test.txt -f vc-annotate

Now the *Annotate* buffer is in `utf-8-dos', but contains a mixture
of utf-8 and raw bytes:

  ^7fb00c1 (Juanma Barranquero 2009-03-22 00:01:39 +0100 1) A few Spanish characters: \341\351\355\363\372\374\361

    Juanma

------------=_1252667409-26938-1--
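
The octal escapes in step 4 of the report (\341\351\355\363\372\374\361)
are the Latin-1 bytes of the accented characters, which is what Stefan's
reply describes: Latin-1 text decoded as if it were UTF-8. The sketch
below illustrates that mismatch outside of vc-annotate. It is only an
illustration under stated assumptions: `bug2741-decode-demo' is a
hypothetical helper name that does not exist in vc-annotate.el, and
`unibyte-string' is assumed to be available (Emacs 23.1 or later).

```elisp
;; Hypothetical helper for illustration only; not part of vc-annotate.el.
(defun bug2741-decode-demo ()
  "Decode the Latin-1 bytes from the report in two ways.
Return a list (AS-LATIN-1 AS-UTF-8)."
  ;; The seven bytes shown as octal escapes in step 4 of the report.
  (let ((raw (unibyte-string #o341 #o351 #o355 #o363 #o372 #o374 #o361)))
    (list
     ;; Decoding with the file's real coding system gives readable text.
     (decode-coding-string raw 'iso-latin-1)
     ;; Decoding the same bytes as UTF-8 cannot produce that text,
     ;; because the byte sequence is not valid UTF-8.
     (decode-coding-string raw 'utf-8))))
```

Evaluating (bug2741-decode-demo) should give the readable Spanish
characters as the first element. The second element is expected to
keep the undecodable bytes as raw bytes, which Emacs displays as octal
escapes, consistent with the *Annotate* buffer in step 4. The patch in
the thread avoids this by binding `coding-system-for-read' to the
visited file's `buffer-file-coding-system' while the backend's
annotate command runs.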