From: starback@stp.lingfil.uu.se (Per Starbäck)
Subject: bug#11694: 24.1; in-is13194-devanagari with \200
Date: Wed, 13 Jun 2012 16:28:20 +0200

In GNU Emacs 24.1.1 (i686-pc-linux-gnu, GTK+ Version 2.18.9)
value of $LANG: en_US.utf8
locale-coding-system: utf-8-unix

$ echo -e 'Sm\xf6rg\xe5sbord' >test1.txt

When opened by Emacs that file is detected correctly as being Latin-1.

$ echo -e 'Sm\xf6rg\xe5sbord \x805' >test2.txt

Here I add a price of 5 Euro for the Smorgasbord, written with
Microsoft's windows-1252 (which is very similar to latin-1).

With Emacs 23 this test2.txt is opened as raw-text-unix which is OK
I guess.  (Detecting windows-1252 correctly isn't that easy.)

But now, with Emacs 24, it is opened as in-is13194-devanagari-unix,
which surprises me.  I think this is a bug.

The problem is that \x80 is still shown as "\200" under that coding
system.  Why guess a coding which doesn't make sense of all the
characters?  And if we really want to guess something even if it
doesn't make sense of that character, why prefer this to Latin-1,
which makes sense of the same part of the input and which was
preferable when the "strange" part didn't exist?

I don't know where this change comes from.  "C-h C" shows the same
"Priority order for recognizing coding systems when reading files" as
I had in 23.4, starting

  Priority order for recognizing coding systems when reading files:
    1. utf-8 (alias: mule-utf-8)
    2. iso-2022-7bit
    3. iso-latin-1 (alias: iso-8859-1 latin-1)
    4. iso-2022-7bit-lock (alias: iso-2022-int-1)
    5. iso-2022-8bit-ss2
    6. emacs-mule
    7. raw-text
    8. iso-2022-jp (alias: junet)
    9. in-is13194-devanagari (alias: devanagari)

windows-1252 isn't there at all, so of course I can't detect that.
(Maybe it ought to be there somewhere, but that's another issue in
that case.)

After
  (set-coding-system-priority 'windows-1252)
  (set-coding-system-priority 'in-is13194-devanagari)
so that windows-1252 is recognized, but in-is13194-devanagari is
preferred, test2.txt is still detected as in-is13194-devanagari, even
though there is now a possibility that makes sense of all characters.

From: npostavs@users.sourceforge.net
To: starback@stp.lingfil.uu.se (Per Starbäck)
Cc: 11694@debbugs.gnu.org
Subject: bug#11694: 24.1; in-is13194-devanagari with \200
Date: Fri, 08 Jul 2016 19:11:16 -0400
Message-ID: <87wpkviv3f.fsf@users.sourceforge.net>

found 11694 24.2
tags 11694 fixed
close 11694 24.3
quit

starback@stp.lingfil.uu.se (Per Starbäck) writes:

> In GNU Emacs 24.1.1 (i686-pc-linux-gnu, GTK+ Version 2.18.9)

> $ echo -e 'Sm\xf6rg\xe5sbord \x805' >test2.txt
>
> Here I add a price of 5 Euro for the Smorgasbord, written with
> Microsoft's windows-1252 (which is very similar to latin-1).
>
> With Emacs 23 this test2.txt is opened as raw-text-unix which is OK
> I guess.  (Detecting windows-1252 correctly isn't that easy.)

With Emacs 24.3 it's detected as raw-text-unix again.

>
> After
>   (set-coding-system-priority 'windows-1252)
>   (set-coding-system-priority 'in-is13194-devanagari)

And it's correctly recognized as windows-1252 after doing this.
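
For readers who want to check the byte-level claims in the report
outside Emacs, here is a minimal Python sketch.  It is not how Emacs
detects coding systems; it only tries a few candidate codecs in
priority order and accepts the first one that decodes every byte
without leaving a control character behind.  The helper names
(decodes_cleanly, first_clean_codec), the candidate list, and the
"no stray control characters" rule are all assumptions made for this
illustration.  Note that "\200" is the octal spelling of byte 0x80.

```python
import unicodedata

# The two test files from the report, as raw bytes (echo adds the newline).
TEST1 = b"Sm\xf6rg\xe5sbord\n"
TEST2 = b"Sm\xf6rg\xe5sbord \x805\n"

# Hypothetical priority list for this sketch; not Emacs's own list.
CANDIDATES = ["utf-8", "latin-1", "cp1252"]


def decodes_cleanly(data: bytes, codec: str) -> bool:
    """True if every byte decodes and no stray control character remains."""
    try:
        text = data.decode(codec)
    except UnicodeDecodeError:
        return False
    # Reject C0/C1 controls (category "Cc") other than ordinary whitespace.
    # Under latin-1, byte 0x80 becomes the C1 control U+0080.
    return not any(
        unicodedata.category(ch) == "Cc" and ch not in "\t\n\r"
        for ch in text
    )


def first_clean_codec(data: bytes, candidates):
    """Return the first candidate codec that decodes data cleanly, or None."""
    for codec in candidates:
        if decodes_cleanly(data, codec):
            return codec
    return None


if __name__ == "__main__":
    for name, data in (("test1.txt", TEST1), ("test2.txt", TEST2)):
        print(name, first_clean_codec(data, CANDIDATES))
```

Under these assumptions test1.txt is accepted as latin-1, while
test2.txt is rejected as latin-1 because of the 0x80 byte and only
decodes cleanly as cp1252 (Python's name for windows-1252), where 0x80
is the euro sign.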