From unknown Sat Aug 16 21:23:56 2025 X-Loop: help-debbugs@gnu.org Subject: bug#35766: emacs saves utf-16 le xml files as utf-16 be Resent-From: J S Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Thu, 16 May 2019 17:58:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: report 35766 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: To: 35766@debbugs.gnu.org X-Debbugs-Original-To: "bug-gnu-emacs@gnu.org" Received: via spool by submit@debbugs.gnu.org id=B.155802946228717 (code B ref -1); Thu, 16 May 2019 17:58:01 +0000 Received: (at submit) by debbugs.gnu.org; 16 May 2019 17:57:42 +0000 Received: from localhost ([127.0.0.1]:56386 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1hRKdS-0007T7-8U for submit@debbugs.gnu.org; Thu, 16 May 2019 13:57:42 -0400 Received: from eggs.gnu.org ([209.51.188.92]:42999) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1hRJum-0006Gc-He for submit@debbugs.gnu.org; Thu, 16 May 2019 13:11:33 -0400 Received: from lists.gnu.org ([209.51.188.17]:50957) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1hRJuh-0003SL-FZ for submit@debbugs.gnu.org; Thu, 16 May 2019 13:11:27 -0400 Received: from eggs.gnu.org ([209.51.188.92]:57376) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hRJug-0004GH-Il for bug-gnu-emacs@gnu.org; Thu, 16 May 2019 13:11:27 -0400 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: *** X-Spam-Status: No, score=3.6 required=5.0 tests=BAYES_50, FREEMAIL_ENVFROM_END_DIGIT,FREEMAIL_FROM,HTML_MESSAGE,RCVD_IN_DNSWL_NONE, RECEIVED_FROM_WINDOWS_HOST autolearn=disabled version=3.3.2 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1hRJuf-0003Ph-NL for bug-gnu-emacs@gnu.org; Thu, 16 May 2019 13:11:26 -0400 Received: from mail-oln040092011071.outbound.protection.outlook.com ([40.92.11.71]:65121 
helo=NAM04-SN1-obe.outbound.protection.outlook.com) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1hRJue-0003Lz-Vx for bug-gnu-emacs@gnu.org; Thu, 16 May 2019 13:11:25 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=hotmail.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=FPvWpkUH3GdHCWbEy6WSxstd0rekrKu/nwPV1HWgXLI=; b=LthImiCPuXAJQh+qU1jQL7cGWjvkAG9sMFqDKEPqTk70NzFi3/GBw6z4MUqkKzsyo+frXi+/FHFAJqn+ErrAAkuKnDlwra5/0/BRKQrGkWJvh33tNT3zZBjzoGCH/ssdjRWxIqBiwzUJacSRe5vxkE/HnzyRJGDX3lD8Vx5xYcIGBxfkxUpE+8DglaIdxaX3s/MyuRYwM40igYWK/H/TH3gWiff7R+c13sKUUMTDwyzqwy3wHrJFVk8OcmtREqFYBX2Qgo8knETXIpygcjk5ZA6JRzfpcRMs8CDtMv9gwfYxU/cYBvZos0nZH4WaveK8TpOvjXUpUauFDS/9+t9n4w== Received: from CO1NAM04FT058.eop-NAM04.prod.protection.outlook.com (10.152.90.58) by CO1NAM04HT036.eop-NAM04.prod.protection.outlook.com (10.152.91.91) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384) id 15.20.1856.15; Thu, 16 May 2019 17:11:21 +0000 Received: from BL0PR11MB3475.namprd11.prod.outlook.com (10.152.90.59) by CO1NAM04FT058.mail.protection.outlook.com (10.152.91.93) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384) id 15.20.1856.15 via Frontend Transport; Thu, 16 May 2019 17:11:21 +0000 Received: from BL0PR11MB3475.namprd11.prod.outlook.com ([fe80::111f:6124:13a4:baec]) by BL0PR11MB3475.namprd11.prod.outlook.com ([fe80::111f:6124:13a4:baec%7]) with mapi id 15.20.1878.028; Thu, 16 May 2019 17:11:21 +0000 From: J S Thread-Topic: emacs saves utf-16 le xml files as utf-16 be Thread-Index: AQHVDAo+Qjldgmx4b0y+0fxmDRw/MA== Date: Thu, 16 May 2019 17:11:21 +0000 Message-ID: Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-incomingtopheadermarker: OriginalChecksum:4045900D75277826565E439E2B53E685FAA48A508FB40938E2DDC8A09381B699; 
UpperCasedChecksum:0724840654B902D8D830CB79ED6C9425C557B89F897487E2990369C9F4D2C194; SizeAsReceived:6517; Count:40 x-tmn: [FfeErehBerL+J7TWc4SQW8az6ndiXHd8] x-ms-publictraffictype: Email x-incomingheadercount: 40 x-eopattributedmessage: 0 x-microsoft-antispam: BCL:0; PCL:0; RULEID:(2390118)(5050001)(7020095)(20181119110)(201702061078)(5061506573)(5061507331)(1603103135)(2017031320274)(2017031323274)(2017031324274)(2017031322404)(1601125500)(1603101475)(1701031045); SRVR:CO1NAM04HT036; x-ms-traffictypediagnostic: CO1NAM04HT036: x-microsoft-antispam-message-info: 5Ol+BoEJuTHOdkG0G99C6xcVE+phFmG8lakxXRZX6cXK5CsDF8pni8RjL04lQx0isr2+0vxwCumr5Pk/wjFzZU0mcskv/ukjrsacqq2FKslq3uQkDq5jnc0ZZLxVCucZytFmZnn8bS6Ji4hFJNdgsy0XdEQCSsrZYYWdSwFiQjXR5yOw8IUWXNNLop1hTYBE Content-Type: multipart/alternative; boundary="_000_BL0PR11MB34754FD55C8A9DF928B582A09E0A0BL0PR11MB3475namp_" MIME-Version: 1.0 X-OriginatorOrg: hotmail.com X-MS-Exchange-CrossTenant-RMS-PersistedConsumerOrg: 00000000-0000-0000-0000-000000000000 X-MS-Exchange-CrossTenant-Network-Message-Id: 3b9f9a5a-d0f3-4a29-0768-08d6da2184b5 X-MS-Exchange-CrossTenant-rms-persistedconsumerorg: 00000000-0000-0000-0000-000000000000 X-MS-Exchange-CrossTenant-originalarrivaltime: 16 May 2019 17:11:21.3766 (UTC) X-MS-Exchange-CrossTenant-fromentityheader: Internet X-MS-Exchange-CrossTenant-id: 84df9e7f-e9f6-40af-b435-aaaaaaaaaaaa X-MS-Exchange-Transport-CrossTenantHeadersStamped: CO1NAM04HT036 X-detected-operating-system: by eggs.gnu.org: Windows 7 or 8 [fuzzy] X-Received-From: 40.92.11.71 X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6.x X-Spam-Score: -1.1 (-) X-Mailman-Approved-At: Thu, 16 May 2019 13:57:40 -0400 X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -2.1 (--) 
--_000_BL0PR11MB34754FD55C8A9DF928B582A09E0A0BL0PR11MB3475namp_ Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable Xml files with this tag are saved as utf-16 be by emacs, even if the file was originally utf-16 le. Using "UTF-16LE" instead will break the encoding and remove the BOM. --_000_BL0PR11MB34754FD55C8A9DF928B582A09E0A0BL0PR11MB3475namp_ Content-Type: text/html; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable
Xml files with this tag are saved as utf-16 be by emacs, even if the file was originally utf-16 le.  Using "UTF-16LE" instead will break the encoding and remove the BOM.

<?xml version="1.0" encoding="UTF-16"?>
--_000_BL0PR11MB34754FD55C8A9DF928B582A09E0A0BL0PR11MB3475namp_-- From unknown Sat Aug 16 21:23:56 2025 X-Loop: help-debbugs@gnu.org Subject: bug#35766: emacs saves utf-16 le xml files as utf-16 be Resent-From: Eli Zaretskii Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Thu, 16 May 2019 18:23:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 35766 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: To: J S Cc: 35766@debbugs.gnu.org Received: via spool by 35766-submit@debbugs.gnu.org id=B35766.155803095631204 (code B ref 35766); Thu, 16 May 2019 18:23:02 +0000 Received: (at 35766) by debbugs.gnu.org; 16 May 2019 18:22:36 +0000 Received: from localhost ([127.0.0.1]:56444 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1hRL1X-00087D-L6 for submit@debbugs.gnu.org; Thu, 16 May 2019 14:22:35 -0400 Received: from eggs.gnu.org ([209.51.188.92]:55472) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1hRL1W-000871-C2 for 35766@debbugs.gnu.org; Thu, 16 May 2019 14:22:34 -0400 Received: from fencepost.gnu.org ([2001:470:142:3::e]:56568) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hRL1Q-0004mr-Uq; Thu, 16 May 2019 14:22:29 -0400 Received: from [176.228.60.248] (port=3868 helo=home-c4e4a596f7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1hRL1Q-00048S-AF; Thu, 16 May 2019 14:22:28 -0400 Date: Thu, 16 May 2019 21:22:19 +0300 Message-Id: <837eaqcl9g.fsf@gnu.org> From: Eli Zaretskii In-reply-to: (message from J S on Thu, 16 May 2019 17:11:21 +0000) References: X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Spam-Score: -2.3 (--) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: 
"Debbugs-submit" X-Spam-Score: -3.3 (---) > From: J S > Date: Thu, 16 May 2019 17:11:21 +0000 > > Xml files with this tag are saved as utf-16 be by emacs, even if the file was originally utf-16 le. Using > "UTF-16LE" instead will break the encoding and remove the BOM. > > Did you try using utf-16le-with-signature? Or maybe I don't understand the scenario: would you please describe a full reproduction recipe, starting from "emacs -Q"? Thanks. From unknown Sat Aug 16 21:23:56 2025 X-Loop: help-debbugs@gnu.org Subject: bug#35766: emacs saves utf-16 le xml files as utf-16 be Resent-From: J S Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Thu, 16 May 2019 19:24:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 35766 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: To: Eli Zaretskii Cc: "35766@debbugs.gnu.org" <35766@debbugs.gnu.org> Received: via spool by 35766-submit@debbugs.gnu.org id=B35766.15580345905065 (code B ref 35766); Thu, 16 May 2019 19:24:01 +0000 Received: (at 35766) by debbugs.gnu.org; 16 May 2019 19:23:10 +0000 Received: from localhost ([127.0.0.1]:56507 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1hRLy9-0001Jd-LB for submit@debbugs.gnu.org; Thu, 16 May 2019 15:23:10 -0400 Received: from mail-oln040092011100.outbound.protection.outlook.com ([40.92.11.100]:20544 helo=NAM04-SN1-obe.outbound.protection.outlook.com) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1hRLwl-0001G5-9A for 35766@debbugs.gnu.org; Thu, 16 May 2019 15:21:43 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=hotmail.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=h0ydxBe5p+DJbILtXNZa1ZA/cl7chA0k+dWtzgGNtoI=; 
b=JSNemg8BspB1OFn5SyOI2E6wkDIGoWNkZVX+jbCztSH7wT0RsgigHfCMCzZAIelIXNojjlHQAt1b+xBjTfovZdJM6eVi1wos+kssG6unTSQcH+LITWwFLGSS95GSB1JR8mhDof1OZmHvudRzWf7zaq2Z8Jj5CJCx7IP2TnqbUBYkoV1kuW02QdjrY2lsNC5d6rrJ1kWFbaSikTVZmJ+v5yR2brcNEg2O2BDKM3/y4pDFhaIeO8jwwrl1ah/ciNKxof9y/WMvU6+/sBd9mScYnbfcMjrrvy6oOSgIMOrthQquTJUh+9n1CFY9aOi/KUtVy4dL5CEgV5qotEzv+Msx8g== Received: from CO1NAM04FT015.eop-NAM04.prod.protection.outlook.com (10.152.90.57) by CO1NAM04HT148.eop-NAM04.prod.protection.outlook.com (10.152.90.232) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384) id 15.20.1856.15; Thu, 16 May 2019 19:21:35 +0000 Received: from BL0PR11MB3475.namprd11.prod.outlook.com (10.152.90.59) by CO1NAM04FT015.mail.protection.outlook.com (10.152.90.169) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384) id 15.20.1856.15 via Frontend Transport; Thu, 16 May 2019 19:21:35 +0000 Received: from BL0PR11MB3475.namprd11.prod.outlook.com ([fe80::111f:6124:13a4:baec]) by BL0PR11MB3475.namprd11.prod.outlook.com ([fe80::111f:6124:13a4:baec%7]) with mapi id 15.20.1878.028; Thu, 16 May 2019 19:21:34 +0000 From: J S Thread-Topic: bug#35766: emacs saves utf-16 le xml files as utf-16 be Thread-Index: AQHVDAo+Qjldgmx4b0y+0fxmDRw/MKZuEJPQgAAOtn2AAAGb0Q== Date: Thu, 16 May 2019 19:21:34 +0000 Message-ID: References: , <837eaqcl9g.fsf@gnu.org>, In-Reply-To: Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-incomingtopheadermarker: OriginalChecksum:C250D626F09CD5FB5A063F44116AAD11AD2C109FC617948DD8EF6DF06E046D45; UpperCasedChecksum:A257FEAB49B71EA9316CE567CDE199084D7CF2A207BBA3412F5D4DEBF7CE178F; SizeAsReceived:6941; Count:44 x-ms-exchange-messagesentrepresentingtype: 1 x-tmn: [WYgNQgbN/2tY1pNNXZlIqvkFMf8g5slR] x-ms-publictraffictype: Email x-incomingheadercount: 44 x-eopattributedmessage: 0 x-microsoft-antispam: BCL:0; PCL:0; 
RULEID:(2390118)(5050001)(7020095)(20181119110)(201702061078)(5061506573)(5061507331)(1603103135)(2017031320274)(2017031323274)(2017031324274)(2017031322404)(1601125500)(1603101475)(1701031045); SRVR:CO1NAM04HT148; x-ms-traffictypediagnostic: CO1NAM04HT148: x-microsoft-antispam-message-info: YsWKsILPXwTURrKxZb2CTqKDl44aZ0S5hk872cyELD6zwMQuKnv9NiHLurAMzpnWmdmKNzE/SwpnjdAQp5GxyHQNwD3ZAywfWE0ZQxQzEeQD9CHl5cKxIDRanMQ6UIk+McJwhPUHK/WSqTn/Q3M91+XqqhqD5cUkmPXwOQenyYadUKKEG4mVLEZZMi1HdB8i Content-Type: multipart/alternative; boundary="_000_BL0PR11MB3475BA798A629AA800D2D63C9E0A0BL0PR11MB3475namp_" MIME-Version: 1.0 X-OriginatorOrg: hotmail.com X-MS-Exchange-CrossTenant-RMS-PersistedConsumerOrg: 00000000-0000-0000-0000-000000000000 X-MS-Exchange-CrossTenant-Network-Message-Id: db16d569-8b45-4166-f52a-08d6da33b5e5 X-MS-Exchange-CrossTenant-rms-persistedconsumerorg: 00000000-0000-0000-0000-000000000000 X-MS-Exchange-CrossTenant-originalarrivaltime: 16 May 2019 19:21:34.8658 (UTC) X-MS-Exchange-CrossTenant-fromentityheader: Internet X-MS-Exchange-CrossTenant-id: 84df9e7f-e9f6-40af-b435-aaaaaaaaaaaa X-MS-Exchange-Transport-CrossTenantHeadersStamped: CO1NAM04HT148 X-Spam-Score: 0.3 (/) X-Mailman-Approved-At: Thu, 16 May 2019 15:23:08 -0400 X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.7 (/) --_000_BL0PR11MB3475BA798A629AA800D2D63C9E0A0BL0PR11MB3475namp_ Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable Try saving this xml file and opening it again: ________________________________ From: J S Sent: Thursday, May 16, 2019 7:15 PM To: Eli Zaretskii Subject: Re: bug#35766: emacs saves utf-16 le xml files as utf-16 be Try saving this xml file and opening it again: ________________________________ From: Eli Zaretskii Sent: Thursday, May 
16, 2019 6:22 PM To: J S Cc: 35766@debbugs.gnu.org Subject: Re: bug#35766: emacs saves utf-16 le xml files as utf-16 be > From: J S > Date: Thu, 16 May 2019 17:11:21 +0000 > > Xml files with this tag are saved as utf-16 be by emacs, even if the file was originally utf-16 le. Using > "UTF-16LE" instead will break the encoding and remove the BOM. > > Did you try using utf-16le-with-signature? Or maybe I don't understand the scenario: would you please describe a full reproduction recipe, starting from "emacs -Q"? Thanks. --_000_BL0PR11MB3475BA798A629AA800D2D63C9E0A0BL0PR11MB3475namp_ Content-Type: text/html; charset="us-ascii" Content-Transfer-Encoding: quoted-printable
Try saving this xml file and opening it again:

<?xml version="1.0" encoding="UTF-16LE"?>

From: J S <jszabo_98@hotmail.com>
Sent: Thursday, May 16, 2019 7:15 PM
To: Eli Zaretskii
Subject: Re: bug#35766: emacs saves utf-16 le xml files as utf-16 be
 
Try saving this xml file and opening it again:

<?xml version="1.0" encoding="UTF-16LE"?>



From: Eli Zaretskii <eliz@gnu.org>
Sent: Thursday, May 16, 2019 6:22 PM
To: J S
Cc: 35766@debbugs.gnu.org
Subject: Re: bug#35766: emacs saves utf-16 le xml files as utf-16 be
 
> From: J S <jszabo_98@hotmail.com>
> Date: Thu, 16 May 2019 17:11:21 +0000
>
> Xml files with this tag are saved as utf-16 be by emacs, even if the file was originally utf-16 le.  Using
> "UTF-16LE" instead will break the encoding and remove the BOM.
>
> <?xml version="1.0" encoding="UTF-16"?>
Did you try using utf-16le-with-signature?

Or maybe I don't understand the scenario: would you please describe a
full reproduction recipe, starting from "emacs -Q"?

Thanks.
--_000_BL0PR11MB3475BA798A629AA800D2D63C9E0A0BL0PR11MB3475namp_-- From unknown Sat Aug 16 21:23:56 2025 X-Loop: help-debbugs@gnu.org Subject: bug#35766: emacs saves utf-16 le xml files as utf-16 be Resent-From: J S Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Thu, 16 May 2019 20:58:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 35766 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: To: Eli Zaretskii Cc: "35766@debbugs.gnu.org" <35766@debbugs.gnu.org> Received: via spool by 35766-submit@debbugs.gnu.org id=B35766.155804026615229 (code B ref 35766); Thu, 16 May 2019 20:58:01 +0000 Received: (at 35766) by debbugs.gnu.org; 16 May 2019 20:57:46 +0000 Received: from localhost ([127.0.0.1]:56604 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1hRNRf-0003xX-Vx for submit@debbugs.gnu.org; Thu, 16 May 2019 16:57:45 -0400 Received: from mail-oln040092001012.outbound.protection.outlook.com ([40.92.1.12]:6252 helo=NAM01-BY2-obe.outbound.protection.outlook.com) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1hRNRd-0003x9-Em for 35766@debbugs.gnu.org; Thu, 16 May 2019 16:57:42 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=hotmail.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=wmr1Vy498S0BuAty+vnoWPasNY376MSjKbSAydnzmm8=; b=Ud6hTXlOop6yAnnAFcNOzYjg2V/pWeyAAcsT7me6JvF5OKWHakloKcS3hlHMP7U4nLgPNR4gcSBY4m/OySNSgYcvc2Q9PtPYZurzgD70GZkPNBqf++t/NFO7U3GevebRJx3LQye/TQhzG6a7HqpLQ4MQnCcdEUENgxMw5NZAb9SPBotTBgofa8M5JS3g8RR748XHHz7vULrNnUlkZ/cHZfqWcToJptvhenKea2lweAjX1G0yMRyLld6QzC2ySKBbHZ+TmIkCUgCL/zC81lq1KJSJ4K2Ebgk0WZcK2ctBO67hCwqHUDRUHHfUvDfUGWZ/2xGQHH6gPEvkXE8KuuRoUg== Received: from BY2NAM01FT045.eop-nam01.prod.protection.outlook.com (10.152.68.56) by BY2NAM01HT089.eop-nam01.prod.protection.outlook.com (10.152.68.197) with Microsoft SMTP Server (version=TLS1_2, 
cipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384) id 15.20.1856.11; Thu, 16 May 2019 20:57:34 +0000 Received: from BL0PR11MB3475.namprd11.prod.outlook.com (10.152.68.52) by BY2NAM01FT045.mail.protection.outlook.com (10.152.68.206) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384) id 15.20.1900.16 via Frontend Transport; Thu, 16 May 2019 20:57:34 +0000 Received: from BL0PR11MB3475.namprd11.prod.outlook.com ([fe80::111f:6124:13a4:baec]) by BL0PR11MB3475.namprd11.prod.outlook.com ([fe80::111f:6124:13a4:baec%7]) with mapi id 15.20.1878.028; Thu, 16 May 2019 20:57:34 +0000 From: J S Thread-Topic: bug#35766: emacs saves utf-16 le xml files as utf-16 be Thread-Index: AQHVDAo+Qjldgmx4b0y+0fxmDRw/MKZuEJPQgAAOtn2AAAGb0YAAGqeF Date: Thu, 16 May 2019 20:57:34 +0000 Message-ID: References: , <837eaqcl9g.fsf@gnu.org>, , In-Reply-To: Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-incomingtopheadermarker: OriginalChecksum:00F23017A7606FA35FB78276E125A5F1718FFABC4DEEEB7DE09C29F4893C9B11; UpperCasedChecksum:3B429D11154167DABE5AE55441423C31871832FC95C46E5D360593A73448CA02; SizeAsReceived:7023; Count:44 x-ms-exchange-messagesentrepresentingtype: 1 x-tmn: [MV77oaqUQ7TjKrHdDtkiZs6aa8BFHHPa] x-ms-publictraffictype: Email x-incomingheadercount: 44 x-eopattributedmessage: 0 x-microsoft-antispam: BCL:0; PCL:0; RULEID:(2390118)(5050001)(7020095)(20181119110)(201702061078)(5061506573)(5061507331)(1603103135)(2017031320274)(2017031323274)(2017031324274)(2017031322404)(1601125500)(1603101475)(1701031045); SRVR:BY2NAM01HT089; x-ms-traffictypediagnostic: BY2NAM01HT089: x-microsoft-antispam-message-info: bZbz7EqZRzIbeCDSZx+TgetEIGDj7iSprdV+/vGmFiRfrU5mpQrOl4MBDN7JWc+fqF1BGjl1ilEROkWiQrUsSOPWQvcDyyyJV+UG5ehITxYS80jSNWr4BkvPKuV+L7jqYswkGW325+DOeQjsYTlFsen2dsgFmsQr3OSO8K6Hj6XBuV9XkYELRqXBS7eWGHrd Content-Type: multipart/alternative; boundary="_000_BL0PR11MB347526C5880C5A4B2A9EB97D9E0A0BL0PR11MB3475namp_" MIME-Version: 1.0 
X-OriginatorOrg: hotmail.com X-MS-Exchange-CrossTenant-RMS-PersistedConsumerOrg: 00000000-0000-0000-0000-000000000000 X-MS-Exchange-CrossTenant-Network-Message-Id: bb6a36d3-3761-48fd-e594-08d6da411ec7 X-MS-Exchange-CrossTenant-rms-persistedconsumerorg: 00000000-0000-0000-0000-000000000000 X-MS-Exchange-CrossTenant-originalarrivaltime: 16 May 2019 20:57:34.3100 (UTC) X-MS-Exchange-CrossTenant-fromentityheader: Internet X-MS-Exchange-CrossTenant-id: 84df9e7f-e9f6-40af-b435-aaaaaaaaaaaa X-MS-Exchange-Transport-CrossTenantHeadersStamped: BY2NAM01HT089 X-Spam-Score: 0.3 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.7 (/) --_000_BL0PR11MB347526C5880C5A4B2A9EB97D9E0A0BL0PR11MB3475namp_ Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable I should say that I'm using emacs for windows. And it's preferring saving = in big endian to little endian when this is the tag: ________________________________ From: J S Sent: Thursday, May 16, 2019 7:21 PM To: Eli Zaretskii Cc: 35766@debbugs.gnu.org Subject: Re: bug#35766: emacs saves utf-16 le xml files as utf-16 be Try saving this xml file and opening it again: ________________________________ From: J S Sent: Thursday, May 16, 2019 7:15 PM To: Eli Zaretskii Subject: Re: bug#35766: emacs saves utf-16 le xml files as utf-16 be Try saving this xml file and opening it again: ________________________________ From: Eli Zaretskii Sent: Thursday, May 16, 2019 6:22 PM To: J S Cc: 35766@debbugs.gnu.org Subject: Re: bug#35766: emacs saves utf-16 le xml files as utf-16 be > From: J S > Date: Thu, 16 May 2019 17:11:21 +0000 > > Xml files with this tag are saved as utf-16 be by emacs, even if the file= was originally utf-16 le. 
Using > "UTF-16LE" instead will break the encoding and remove the BOM. > > Did you try using utf-16le-with-signature? Or maybe I don't understand the scenario: would you please describe a full reproduction recipe, starting from "emacs -Q"? Thanks. --_000_BL0PR11MB347526C5880C5A4B2A9EB97D9E0A0BL0PR11MB3475namp_ Content-Type: text/html; charset="us-ascii" Content-Transfer-Encoding: quoted-printable
I should say that I'm using emacs for windows.  And it's preferring saving in big endian to little endian when this is the tag:

<?xml version="1.0" encoding="UTF-16"?>


From: J S <jszabo_98@hotmail.com>
Sent: Thursday, May 16, 2019 7:21 PM
To: Eli Zaretskii
Cc: 35766@debbugs.gnu.org
Subject: Re: bug#35766: emacs saves utf-16 le xml files as utf-16 be
 
Try saving this xml file and opening it again:

<?xml version="1.0" encoding="UTF-16LE"?>

From: J S <jszabo_98@hotmail.com>
Sent: Thursday, May 16, 2019 7:15 PM
To: Eli Zaretskii
Subject: Re: bug#35766: emacs saves utf-16 le xml files as utf-16 be
 
Try saving this xml file and opening it again:

<?xml version="1.0" encoding="UTF-16LE"?>



From: Eli Zaretskii <eliz@gnu.org>
Sent: Thursday, May 16, 2019 6:22 PM
To: J S
Cc: 35766@debbugs.gnu.org
Subject: Re: bug#35766: emacs saves utf-16 le xml files as utf-16 be
 
> From: J S <jszabo_98@hotmail.com>
> Date: Thu, 16 May 2019 17:11:21 +0000
>
> Xml files with this tag are saved as utf-16 be by emacs, even if the file was originally utf-16 le.  Using
> "UTF-16LE" instead will break the encoding and remove the BOM.
>
> <?xml version="1.0" encoding="UTF-16"?>
Did you try using utf-16le-with-signature?

Or maybe I don't understand the scenario: would you please describe a
full reproduction recipe, starting from "emacs -Q"?

Thanks.
--_000_BL0PR11MB347526C5880C5A4B2A9EB97D9E0A0BL0PR11MB3475namp_-- From unknown Sat Aug 16 21:23:56 2025 X-Loop: help-debbugs@gnu.org Subject: bug#35766: emacs saves utf-16 le xml files as utf-16 be Resent-From: Eli Zaretskii Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Fri, 17 May 2019 09:27:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 35766 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: To: J S Cc: 35766@debbugs.gnu.org Received: via spool by 35766-submit@debbugs.gnu.org id=B35766.15580852139853 (code B ref 35766); Fri, 17 May 2019 09:27:01 +0000 Received: (at 35766) by debbugs.gnu.org; 17 May 2019 09:26:53 +0000 Received: from localhost ([127.0.0.1]:57258 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1hRZ8f-0002Yr-7b for submit@debbugs.gnu.org; Fri, 17 May 2019 05:26:53 -0400 Received: from eggs.gnu.org ([209.51.188.92]:51629) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1hRZ8b-0002Yb-8z for 35766@debbugs.gnu.org; Fri, 17 May 2019 05:26:51 -0400 Received: from fencepost.gnu.org ([2001:470:142:3::e]:40651) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hRZ8W-0005K6-2i; Fri, 17 May 2019 05:26:44 -0400 Received: from [176.228.60.248] (port=4170 helo=home-c4e4a596f7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1hRZ8V-0003d4-Cj; Fri, 17 May 2019 05:26:43 -0400 Date: Fri, 17 May 2019 12:26:34 +0300 Message-Id: <83lfz5bfed.fsf@gnu.org> From: Eli Zaretskii In-reply-to: (message from J S on Thu, 16 May 2019 20:57:34 +0000) References: , <837eaqcl9g.fsf@gnu.org>, , X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Spam-Score: -2.3 (--) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: 
debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) > From: J S > CC: "35766@debbugs.gnu.org" <35766@debbugs.gnu.org> > Date: Thu, 16 May 2019 20:57:34 +0000 > > I should say that I'm using emacs for windows. And it's preferring saving in big endian to little endian when > this is the tag: > > This is the default, yes. "C-h C utf-16 RET" says: UTF-16 (detect endian on decoding, use big endian on encoding with BOM). ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ If you want to encode in UTF-16LE, you need to tell Emacs to do this explicitly: C-x RET c utf-16le-with-signature RET C-x C-s > Try saving this xml file and opening it again: > > AFAIU, encoding="UTF-16LE" is invalid in XML. If you see this documented somewhere in XML docs, please tell me where it is described. From unknown Sat Aug 16 21:23:56 2025 X-Loop: help-debbugs@gnu.org Subject: bug#35766: emacs saves utf-16 le xml files as utf-16 be Resent-From: J S Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Fri, 17 May 2019 11:27:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 35766 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: To: Eli Zaretskii Cc: "35766@debbugs.gnu.org" <35766@debbugs.gnu.org> Received: via spool by 35766-submit@debbugs.gnu.org id=B35766.155809238421051 (code B ref 35766); Fri, 17 May 2019 11:27:01 +0000 Received: (at 35766) by debbugs.gnu.org; 17 May 2019 11:26:24 +0000 Received: from localhost ([127.0.0.1]:57537 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1hRb0J-0005TT-Ir for submit@debbugs.gnu.org; Fri, 17 May 2019 07:26:23 -0400 Received: from mail-oln040092010041.outbound.protection.outlook.com ([40.92.10.41]:28742 helo=NAM04-CO1-obe.outbound.protection.outlook.com) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1hRb0H-0005T7-Nz for 35766@debbugs.gnu.org; Fri, 17 May 2019 07:26:22 -0400 DKIM-Signature: v=1; a=rsa-sha256; 
c=relaxed/relaxed; d=hotmail.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=lU7yi6z0Yso1KFBbXePSgFg8tqiI9OsZjAUTuZVWUXo=; b=a0bEyUFfw73GrVeW46KxkfAjX2QZdUocpywQthUcj+D5C8Hnwu4wcyK7xTbIU5REdiye+LMEjHEVwuYjO/zwIlH/ofMm2xMfDPSSl/uuddEHqLS5U0AAUYgdNqMXJJ8F5V49szJEQcelXbrYt57jR1uFMgRFhcI/Iv6iUbPFqSjmN9JX3EH8dmApaTlEPS/QkXY8lV9GAeypsLuFcHeVVy5DyLgyr3Wpd0nrWWu+7jHKSj7IakCJyTe1vDhlsSrw+qKBZ9OfUmd5WzmNKvZQUGDWIqYtfIClEvwKM2QfBAEn7JFrZ9EsS/VfEr/72hhhNOqkvCnVP/SCZGtHreT07w== Received: from SN1NAM04FT017.eop-NAM04.prod.protection.outlook.com (10.152.88.57) by SN1NAM04HT216.eop-NAM04.prod.protection.outlook.com (10.152.89.33) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384) id 15.20.1856.15; Fri, 17 May 2019 11:26:14 +0000 Received: from BL0PR11MB3475.namprd11.prod.outlook.com (10.152.88.60) by SN1NAM04FT017.mail.protection.outlook.com (10.152.88.154) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384) id 15.20.1900.16 via Frontend Transport; Fri, 17 May 2019 11:26:14 +0000 Received: from BL0PR11MB3475.namprd11.prod.outlook.com ([fe80::111f:6124:13a4:baec]) by BL0PR11MB3475.namprd11.prod.outlook.com ([fe80::111f:6124:13a4:baec%7]) with mapi id 15.20.1900.010; Fri, 17 May 2019 11:26:14 +0000 From: J S Thread-Topic: bug#35766: emacs saves utf-16 le xml files as utf-16 be Thread-Index: AQHVDAo+Qjldgmx4b0y+0fxmDRw/MKZuEJPQgAAOtn2AAAGb0YAAGqeFgADRqwCAACD5gA== Date: Fri, 17 May 2019 11:26:14 +0000 Message-ID: References: , <837eaqcl9g.fsf@gnu.org>, , , <83lfz5bfed.fsf@gnu.org> In-Reply-To: <83lfz5bfed.fsf@gnu.org> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-incomingtopheadermarker: OriginalChecksum:ACAE41679960FCB458DA8E333C6E3CFE8592442D94AC6944ED313B22160274FA; UpperCasedChecksum:60BAED002C98FCBDFCAE2933C1D0FD252C2F8774AA3D95E09D4E2383FC29B84F; SizeAsReceived:7100; Count:44 
x-ms-exchange-messagesentrepresentingtype: 1 x-tmn: [JchM+hrzLNOcI4u15illTvrpRt+FHrFI] x-ms-publictraffictype: Email x-incomingheadercount: 44 x-eopattributedmessage: 0 x-microsoft-antispam: BCL:0; PCL:0; RULEID:(2390118)(5050001)(7020095)(20181119110)(201702061078)(5061506573)(5061507331)(1603103135)(2017031320274)(2017031323274)(2017031324274)(2017031322404)(1601125500)(1603101475)(1701031045); SRVR:SN1NAM04HT216; x-ms-traffictypediagnostic: SN1NAM04HT216: x-ms-exchange-purlcount: 1 x-microsoft-antispam-message-info: MO6aacZ0DYmp+7gYvHVASdqHdmTchoqEB0PtX8z5YcZuoCy5qZtMWjnebCVGm1Jx+mvrlOV/BhRha8jBgXAqSkBmUxpc8HzHqdyIIwE5HTbh3dMkZmu2KS8qYXxtDvh+Eo87uN8SbudCR78v/jxHcwLVUWwk4RhjANcUv+D9FFQm6S4YIXIKT8P+89bjU+zo Content-Type: multipart/alternative; boundary="_000_BL0PR11MB3475F70B777717241FB7AB449E0B0BL0PR11MB3475namp_" MIME-Version: 1.0 X-OriginatorOrg: hotmail.com X-MS-Exchange-CrossTenant-RMS-PersistedConsumerOrg: 00000000-0000-0000-0000-000000000000 X-MS-Exchange-CrossTenant-Network-Message-Id: 52430f19-c27c-47e8-0cee-08d6daba78bc X-MS-Exchange-CrossTenant-rms-persistedconsumerorg: 00000000-0000-0000-0000-000000000000 X-MS-Exchange-CrossTenant-originalarrivaltime: 17 May 2019 11:26:14.3216 (UTC) X-MS-Exchange-CrossTenant-fromentityheader: Internet X-MS-Exchange-CrossTenant-id: 84df9e7f-e9f6-40af-b435-aaaaaaaaaaaa X-MS-Exchange-Transport-CrossTenantHeadersStamped: SN1NAM04HT216 X-Spam-Score: 0.3 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.7 (/) --_000_BL0PR11MB3475F70B777717241FB7AB449E0B0BL0PR11MB3475namp_ Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable It would change color in emacs if encoding=3D"UTF16-LE" were invalid. It's= hard to find the docs for it. 
UTF-16LE is listed here: http://help.eclipse.org/kepler/index.jsp?topic=%2Forg.eclipse.wst.xmleditor.doc.user%2Ftopics%2Fcxmlenc.html ________________________________ From: Eli Zaretskii Sent: Friday, May 17, 2019 9:26 AM To: J S Cc: 35766@debbugs.gnu.org Subject: Re: bug#35766: emacs saves utf-16 le xml files as utf-16 be > From: J S > CC: "35766@debbugs.gnu.org" <35766@debbugs.gnu.org> > Date: Thu, 16 May 2019 20:57:34 +0000 > > I should say that I'm using emacs for windows. And it's preferring saving in big endian to little endian when > this is the tag: > > This is the default, yes. "C-h C utf-16 RET" says: UTF-16 (detect endian on decoding, use big endian on encoding with BOM). ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ If you want to encode in UTF-16LE, you need to tell Emacs to do this explicitly: C-x RET c utf-16le-with-signature RET C-x C-s > Try saving this xml file and opening it again: > > AFAIU, encoding="UTF-16LE" is invalid in XML. If you see this documented somewhere in XML docs, please tell me where it is described. --_000_BL0PR11MB3475F70B777717241FB7AB449E0B0BL0PR11MB3475namp_ Content-Type: text/html; charset="us-ascii" Content-Transfer-Encoding: quoted-printable
It would change color in emacs if encoding="UTF16-LE" were invalid.  It's hard to find the docs for it.  UTF-16LE is listed here:  http://help.eclipse.org/kepler/index.jsp?topic=%2Forg.eclipse.wst.xmleditor.doc.user%2Ftopics%2Fcxmlenc.html



From: Eli Zaretskii <eliz@gnu.org>
Sent: Friday, May 17, 2019 9:26 AM
To: J S
Cc: 35766@debbugs.gnu.org
Subject: Re: bug#35766: emacs saves utf-16 le xml files as utf-16 be

> From: J S <jszabo_98@hotmail.com>
> CC: "35766@debbugs.gnu.org" <35766@debbugs.gnu.org>
> Date: Thu, 16 May 2019 20:57:34 +0000
>
> I should say that I'm using emacs for windows.  And it's preferring saving in big endian to little endian when
> this is the tag:
>
> <?xml version="1.0" encoding="UTF-16"?>

This is the default, yes.  "C-h C utf-16 RET" says:

  UTF-16 (detect endian on decoding, use big endian on encoding with BOM).
                                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

If you want to encode in UTF-16LE, you need to tell Emacs to do this
explicitly:

  C-x RET c utf-16le-with-signature RET C-x C-s

> Try saving this xml file and opening it again:
>
> <?xml version="1.0" encoding="UTF-16LE"?>

AFAIU, encoding="UTF-16LE" is invalid in XML.  If you see this
documented somewhere in XML docs, please tell me where it is
described.
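
To make the byte-order difference discussed above concrete, here is a minimal sketch in Python (outside Emacs, purely illustrative) of the two encodings of the XML declaration from the report. The helper name `encode_with_bom` is hypothetical; the big-endian case corresponds to what the "C-h C utf-16" text above describes, and the little-endian case to what utf-16le-with-signature is expected to produce.

```python
import codecs

# The XML declaration from the report.
DECL = '<?xml version="1.0" encoding="UTF-16"?>'


def encode_with_bom(text: str, little_endian: bool) -> bytes:
    """Encode text as UTF-16 with an explicit byte order mark (hypothetical helper)."""
    if little_endian:
        return codecs.BOM_UTF16_LE + text.encode("utf-16-le")
    return codecs.BOM_UTF16_BE + text.encode("utf-16-be")


be = encode_with_bom(DECL, little_endian=False)
le = encode_with_bom(DECL, little_endian=True)

# The first six bytes: BOM, then "<" and "?" as 16-bit code units.
print(be[:6].hex(" "))  # fe ff 00 3c 00 3f
print(le[:6].hex(" "))  # ff fe 3c 00 3f 00
```

Both byte strings decode to the same text; only the BOM and the order of bytes within each 16-bit unit differ.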
--_000_BL0PR11MB3475F70B777717241FB7AB449E0B0BL0PR11MB3475namp_-- From unknown Sat Aug 16 21:23:56 2025 X-Loop: help-debbugs@gnu.org Subject: bug#35766: emacs saves utf-16 le xml files as utf-16 be Resent-From: Noam Postavsky Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Fri, 17 May 2019 11:49:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 35766 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: To: J S Cc: Eli Zaretskii , "35766@debbugs.gnu.org" <35766@debbugs.gnu.org> Received: via spool by 35766-submit@debbugs.gnu.org id=B35766.155809372131357 (code B ref 35766); Fri, 17 May 2019 11:49:01 +0000 Received: (at 35766) by debbugs.gnu.org; 17 May 2019 11:48:41 +0000 Received: from localhost ([127.0.0.1]:57598 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1hRbLs-00089h-LB for submit@debbugs.gnu.org; Fri, 17 May 2019 07:48:40 -0400 Received: from mail-io1-f66.google.com ([209.85.166.66]:38640) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1hRbLr-00089U-2P for 35766@debbugs.gnu.org; Fri, 17 May 2019 07:48:39 -0400 Received: by mail-io1-f66.google.com with SMTP id x24so5230230ion.5 for <35766@debbugs.gnu.org>; Fri, 17 May 2019 04:48:38 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:references:date:in-reply-to:message-id :user-agent:mime-version; bh=k6trEDFkJOU+PlaIO1FD4ZwPKKyfqfxNcbBIqfRSzMc=; b=hsU25TpFqMxIBbL3a8Jgou3fHUHrG7DdzTs8vHcgTqzXtT6gxYKQTaho9w88uieSaB 5NczfGrrbebqlD0NRdD6KG+vNvLv3pUif12s0IEQoQt1Z2bg3CbUKl1E/vtft8d2RjKi PGNIaZMf8+nLEDZgmtlAQ5QtoW1jY5HksSaJu8HjFcgT5RTf/Ypl8b1Iznc1obob+USg NNjLfX2sLtZd8e2qqfLtYJo7wjdRgkyWmHHkRlYZaRbC6EOwd4j92Ds1m+86N+D2Z6JT da2a014Yk7C0dS7WE3FCceEJckiKXxhJBJzdIR2SYyo9gXg9OAdjM4dhTQ7uPLxd+izD mtSg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; 
h=x-gm-message-state:from:to:cc:subject:references:date:in-reply-to :message-id:user-agent:mime-version; bh=k6trEDFkJOU+PlaIO1FD4ZwPKKyfqfxNcbBIqfRSzMc=; b=evrXzFuOHjsc015QNSZHKf6OyOQykwcOKXfkX3MJDlJwQ5hGpk4hZlvdF7VkHBxHRT K7ysrcoIpfJai6rlTHBgAVdyP1bgxAgcwbkKd7C5yRQJG+p46eMwijfD+MPKAQYEFZ+1 fjQLnVlfM6JrJayIFUcIpl7fVvku3VPVNYAyThUA8ZN9y/mZNTATflD8Tipgn/2SW8LS LvKlE9IffuxHiYsjeBAuHkvbJq6ECw37AKCUQqyOxJBMOaDBN7C8wvDx5pbWOZV2qlz5 7NA+T+XVBRrWpjXPGPoxtbuv8vGmF+Tvi4adR4w32YI82Hdn6xekAV9442B9ZefCAHDu g+wA== X-Gm-Message-State: APjAAAUC+/3xaS/MFNaarw39+TtLNeZTWjges3Xmvb2iTioQ58DyaNP5 9GnvyzsIRapAIk+9loIo2P6LhywZ X-Google-Smtp-Source: APXvYqwWeBqo6YGIu5daO49maaprgcnOHQXb31VPMKkLg4tmQoDbdBqH9EDWJE0SjP5ohADWN55/nQ== X-Received: by 2002:a05:6602:4f:: with SMTP id z15mr6780379ioz.108.1558093712097; Fri, 17 May 2019 04:48:32 -0700 (PDT) Received: from minid (cbl-45-2-119-34.yyz.frontiernetworks.ca. [45.2.119.34]) by smtp.gmail.com with ESMTPSA id q72sm93044ita.26.2019.05.17.04.48.31 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Fri, 17 May 2019 04:48:31 -0700 (PDT) From: Noam Postavsky References: <837eaqcl9g.fsf@gnu.org> <83lfz5bfed.fsf@gnu.org> Date: Fri, 17 May 2019 07:48:30 -0400 In-Reply-To: (J. S.'s message of "Fri, 17 May 2019 11:26:14 +0000") Message-ID: <87a7fle1yp.fsf@gmail.com> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.2 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain X-Spam-Score: 0.0 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-) J S writes: > It would change color in emacs if encoding="UTF16-LE" were invalid. > It's hard to find the docs for it. 
UTF-16LE is listed here: > http://help.eclipse.org/kepler/index.jsp?topic=%2Forg.eclipse.wst.xmleditor.doc.user%2Ftopics%2Fcxmlenc.html A more official reference: https://www.w3.org/TR/xml/#NT-EncName It is RECOMMENDED that character encodings registered (as charsets) with the Internet Assigned Numbers Authority [IANA-CHARSETS], other than those just listed, be referred to using their registered names; [IANA-CHARSETS]: http://www.iana.org/assignments/character-sets/character-sets.xhtml UTF-16LE 1014 [RFC2781] [RFC2781] csUTF16LE From unknown Sat Aug 16 21:23:56 2025 X-Loop: help-debbugs@gnu.org Subject: bug#35766: emacs saves utf-16 le xml files as utf-16 be Resent-From: Eli Zaretskii Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Fri, 17 May 2019 15:36:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 35766 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: To: Noam Postavsky Cc: jszabo_98@hotmail.com, 35766@debbugs.gnu.org Received: via spool by 35766-submit@debbugs.gnu.org id=B35766.15581073244879 (code B ref 35766); Fri, 17 May 2019 15:36:02 +0000 Received: (at 35766) by debbugs.gnu.org; 17 May 2019 15:35:24 +0000 Received: from localhost ([127.0.0.1]:58806 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1hRetI-0001Gc-Hd for submit@debbugs.gnu.org; Fri, 17 May 2019 11:35:24 -0400 Received: from eggs.gnu.org ([209.51.188.92]:39805) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1hRetG-0001GL-7l for 35766@debbugs.gnu.org; Fri, 17 May 2019 11:35:22 -0400 Received: from fencepost.gnu.org ([2001:470:142:3::e]:56274) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hRet9-00032k-9O; Fri, 17 May 2019 11:35:15 -0400 Received: from [176.228.60.248] (port=3266 helo=home-c4e4a596f7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1hRet4-0005ar-0X; Fri, 17 May 2019 11:35:13 -0400 Date: Fri, 17 
May 2019 18:34:48 +0300 Message-Id: <83tvdt9js7.fsf@gnu.org> From: Eli Zaretskii In-reply-to: <87a7fle1yp.fsf@gmail.com> (message from Noam Postavsky on Fri, 17 May 2019 07:48:30 -0400) References: <837eaqcl9g.fsf@gnu.org> <83lfz5bfed.fsf@gnu.org> <87a7fle1yp.fsf@gmail.com> X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Spam-Score: -2.3 (--) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) > From: Noam Postavsky > Cc: Eli Zaretskii , "35766\@debbugs.gnu.org" <35766@debbugs.gnu.org> > Date: Fri, 17 May 2019 07:48:30 -0400 > > UTF-16LE 1014 [RFC2781] [RFC2781] csUTF16LE Ouch, I was looking at the wrong column in that document. The problem is that our detection of encoding of XML files is based on the assumption that the header is in ASCII-compatible encoding, which UTF-16 isn't. So regexp search for the XML header fails, and the detection fails with it. The patch below make us at least recognize UTF-16 with BOM, and also stop the encoding from frightening the user when she specifies UTF-16 with BOM at buffer-save time. But by default, saving a buffer with UTF-16BE or UTF-16LE still produces a file without BOM, and that cannot be detected by our encoding-detection machinery, leaving it to the user to use "C-x RET c" or "C-x RET r". Perhaps we should by default produce encoding with BOM when XML header specifies UTF-16? diff --git a/lisp/international/mule-cmds.el b/lisp/international/mule-cmds.el index dfa9e4e..a248ef8 100644 --- a/lisp/international/mule-cmds.el +++ b/lisp/international/mule-cmds.el @@ -1029,7 +1029,11 @@ select-safe-coding-system ;; This check perhaps isn't ideal, but is probably ;; the best thing to do. 
(not (auto-coding-alist-lookup (or file buffer-file-name ""))) - (not (coding-system-equal coding-system auto-cs))) + (not (coding-system-equal coding-system auto-cs)) + (or (equal (coding-system-type auto-cs) 'charset) + (not (coding-system-equal (coding-system-type auto-cs) + (coding-system-type + coding-system))))) (unless (yes-or-no-p (format "Selected encoding %s disagrees with \ %s specified by file contents. Really save (else edit coding cookies \ diff --git a/lisp/international/mule.el b/lisp/international/mule.el index b5414de..fcdcd3c 100644 --- a/lisp/international/mule.el +++ b/lisp/international/mule.el @@ -2587,9 +2587,14 @@ xml-find-file-coding-system (let ((detected (with-coding-priority '(utf-8) (coding-system-base - (detect-coding-region (point-min) (point-max) t))))) - ;; Pure ASCII always comes back as undecided. + (detect-coding-region (point-min) (point-max) t)))) + (bom (list (char-after 1) (char-after 2)))) (cond + ((equal bom '(#xFE #xFF)) + 'utf-16be-with-signature) + ((equal bom '(#xFF #xFE)) + 'utf-16le-with-signature) + ;; Pure ASCII always comes back as undecided. 
((memq detected '(utf-8 undecided)) 'utf-8) ((eq detected 'utf-16le-with-signature) 'utf-16le-with-signature) From unknown Sat Aug 16 21:23:56 2025 X-Loop: help-debbugs@gnu.org Subject: bug#35766: emacs saves utf-16 le xml files as utf-16 be Resent-From: npostavs@gmail.com Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Fri, 17 May 2019 16:29:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 35766 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: To: Eli Zaretskii Cc: jszabo_98@hotmail.com, 35766@debbugs.gnu.org, Noam Postavsky Received: via spool by 35766-submit@debbugs.gnu.org id=B35766.15581104829798 (code B ref 35766); Fri, 17 May 2019 16:29:02 +0000 Received: (at 35766) by debbugs.gnu.org; 17 May 2019 16:28:02 +0000 Received: from localhost ([127.0.0.1]:58837 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1hRfiE-0002Xp-04 for submit@debbugs.gnu.org; Fri, 17 May 2019 12:28:02 -0400 Received: from mail-it1-f181.google.com ([209.85.166.181]:56087) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1hRfiB-0002XU-Tj for 35766@debbugs.gnu.org; Fri, 17 May 2019 12:28:00 -0400 Received: by mail-it1-f181.google.com with SMTP id q132so12879301itc.5 for <35766@debbugs.gnu.org>; Fri, 17 May 2019 09:27:59 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:references:date:in-reply-to:message-id :user-agent:mime-version; bh=hZBy7HtkZjKGBSXXtQbFODVQKwGNQl1+gbroIqNyqjE=; b=IzaP7b8j2Fgn8kQ+R+Ld0BNmMOSCs8JRC7wC05kgUKVYqMdCPkXmlDJudQvz37iWGS 0kM9gWM5Uj0MEfvRAv6lBkTF5d+o/2ZNUWiDIxrqjeuhaMmtM0eJEzmUer0CSH3C7YhA w2knwN7xlLhXADw+CRBg0MwlRxO2NN0Pcz6PvJ9waaaoGpDXn4NnECGvHq+FHKy5BEpB 36xThMmOG9RsAjCohBK3ERhc3WH8Ob0/nuVubnHzXv3mL+Mr+35H/TmlHyTXrzGlpOdX SJ45J+Gi8mYOlcZVUzCaj5mj+pAwpIn76yjt67CeysikQjmElgTBeksNkjtgKJymoycW HZbA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; 
s=20161025; h=x-gm-message-state:from:to:cc:subject:references:date:in-reply-to :message-id:user-agent:mime-version; bh=hZBy7HtkZjKGBSXXtQbFODVQKwGNQl1+gbroIqNyqjE=; b=diGjoFZV0G+T0qh7y0TjU6Iib4JmvwpZTuIFG8h+ZnxfffRMaZ9B5LbwpPBBwpxm65 DGCri4KpWXySrIHTRatUZz1UYG16eD1DoXk3PE8nE9fDQRQDxcmD2LszEhQCkjDFTArP 4XrNe/KtRwIysrUTmFHk3Fm52oMJJppadD6x7EoCBdrXpLLGM9HVagbpIyN7dPRosL1f 4CoBLgaDmEATQnuZ+InUVOpOWXtRMy71fbHY9K7skRX3eXQO1Gu0V2BIJAultZXp8mpe zLsPRfDiydBHUks9KTFofQty2dZOHhN5yGorqQMl7uvPKWAh/xn+Vu0yNbODG9oZYVnO VJDQ== X-Gm-Message-State: APjAAAXZFpFhUhsqGYq1rOXTeTl2o8fM8ZxqdKB8FhIh5zSMsKJ6wPjd CVYZhOn9tCF9I5GVPxLhd+ysu86X X-Google-Smtp-Source: APXvYqxQ+4a+Cx5yZ4/pdoYx0DPtuJV5d2ZkSGMpZM7rKC4Z3/JVm3B/kZIeQkqC/CUkxAE6Y1r9LA== X-Received: by 2002:a02:224b:: with SMTP id o72mr7175913jao.16.1558110472620; Fri, 17 May 2019 09:27:52 -0700 (PDT) Received: from vhost2 (CPE001143542e1f-CMf81d0f809fa0.cpe.net.cable.rogers.com. [99.230.51.196]) by smtp.gmail.com with ESMTPSA id s10sm2859586iob.29.2019.05.17.09.27.51 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Fri, 17 May 2019 09:27:51 -0700 (PDT) From: npostavs@gmail.com References: <837eaqcl9g.fsf@gnu.org> <83lfz5bfed.fsf@gnu.org> <87a7fle1yp.fsf@gmail.com> <83tvdt9js7.fsf@gnu.org> Date: Fri, 17 May 2019 12:27:50 -0400 In-Reply-To: <83tvdt9js7.fsf@gnu.org> (Eli Zaretskii's message of "Fri, 17 May 2019 18:34:48 +0300") Message-ID: <85a7fldp15.fsf@gmail.com> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.1.92 (windows-nt) MIME-Version: 1.0 Content-Type: text/plain X-Spam-Score: 0.0 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-) Eli Zaretskii writes: > Perhaps we should by default produce encoding with BOM when XML header > specifies UTF-16? 
I think yes, https://www.w3.org/TR/xml/#charencoding says Entities encoded in UTF-16 MUST [...] begin with the Byte Order Mark By the way, is Bug#8282 the same as this one, or just closely related? It's talking about sgml-html-meta-auto-coding-function (though maybe sgml-xml-auto-coding-function is more relevant). I'm getting a little confused between all the different *-find/auto-coding-* functions. There is also nxml-set-auto-coding which seems to be mostly unused. From unknown Sat Aug 16 21:23:56 2025 X-Loop: help-debbugs@gnu.org Subject: bug#35766: emacs saves utf-16 le xml files as utf-16 be Resent-From: J S Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Fri, 17 May 2019 16:58:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 35766 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: To: "npostavs@gmail.com" , Eli Zaretskii Cc: "35766@debbugs.gnu.org" <35766@debbugs.gnu.org> Received: via spool by 35766-submit@debbugs.gnu.org id=B35766.155811225212599 (code B ref 35766); Fri, 17 May 2019 16:58:02 +0000 Received: (at 35766) by debbugs.gnu.org; 17 May 2019 16:57:32 +0000 Received: from localhost ([127.0.0.1]:58867 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1hRgAl-0003H9-MV for submit@debbugs.gnu.org; Fri, 17 May 2019 12:57:32 -0400 Received: from mail-oln040092009103.outbound.protection.outlook.com ([40.92.9.103]:38466 helo=NAM04-BN3-obe.outbound.protection.outlook.com) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1hRgAj-0003Gu-Gd for 35766@debbugs.gnu.org; Fri, 17 May 2019 12:57:30 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=hotmail.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=D2k9BOMbAbJtKfElN1KreAYhE1oyMCSlcvcCLyQoAKU=; 
b=dgbL8CbldwPA3/QX9p+dwLbLQGIocLoOrZbeSjNWxkO1hOmMXTmHmnH5+TS9Ngst+1whs7QIUaQREHkAEF024fnJ5cMEi3hdAjLnAWqKO0A2o27FNWp/B9CFlJ5uT5bkndolabLxKhhnaA3Fu2qWaFR0qyv7vhcNNsoMBkM8NX+zqdflelpjA6sdezQhTnMXkxeoAWjC/MiEH//eSeZbJfm9LvKFc1N2f0X9LTXdeohOJpBAozpDuHYYnMOQUMRWTcC+g3F+ZVP/9LM7AUQBl84RyXGUAfRzy0Xcoj3D6AgNgRfDQrLxfxnWKCGSVPrOErAwXFOgh8q0iQ/6vdbSMw== Received: from SN1NAM04FT034.eop-NAM04.prod.protection.outlook.com (10.152.88.59) by SN1NAM04HT011.eop-NAM04.prod.protection.outlook.com (10.152.88.128) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384) id 15.20.1856.15; Fri, 17 May 2019 16:57:23 +0000 Received: from BL0PR11MB3475.namprd11.prod.outlook.com (10.152.88.54) by SN1NAM04FT034.mail.protection.outlook.com (10.152.88.156) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384) id 15.20.1900.16 via Frontend Transport; Fri, 17 May 2019 16:57:23 +0000 Received: from BL0PR11MB3475.namprd11.prod.outlook.com ([fe80::111f:6124:13a4:baec]) by BL0PR11MB3475.namprd11.prod.outlook.com ([fe80::111f:6124:13a4:baec%7]) with mapi id 15.20.1900.010; Fri, 17 May 2019 16:57:23 +0000 From: J S Thread-Topic: bug#35766: emacs saves utf-16 le xml files as utf-16 be Thread-Index: AQHVDAo+Qjldgmx4b0y+0fxmDRw/MKZuEJPQgAAOtn2AAAGb0YAAGqeFgADRqwCAACD5gIAABqSpgAA/Xq2AAA6vR4AACAlM Date: Fri, 17 May 2019 16:57:23 +0000 Message-ID: References: <837eaqcl9g.fsf@gnu.org> <83lfz5bfed.fsf@gnu.org> <87a7fle1yp.fsf@gmail.com> <83tvdt9js7.fsf@gnu.org>,<85a7fldp15.fsf@gmail.com> In-Reply-To: <85a7fldp15.fsf@gmail.com> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-incomingtopheadermarker: OriginalChecksum:D2CD7369E676FCAACF9597C0AD099DE9BE5F6A941D33E8D6BF12392D6D268792; UpperCasedChecksum:EE4A1B4788893C960C0A92920E822BD502520D1108FC2E7BE54C62FD9001CECD; SizeAsReceived:7350; Count:44 x-ms-exchange-messagesentrepresentingtype: 1 x-tmn: [HjUUyh+zmmUzJqUv3TyhN0sdyYVsDMTV] x-ms-publictraffictype: 
Email x-incomingheadercount: 44 x-eopattributedmessage: 0 x-microsoft-antispam: BCL:0; PCL:0; RULEID:(2390118)(5050001)(7020095)(20181119110)(201702061078)(5061506573)(5061507331)(1603103135)(2017031320274)(2017031323274)(2017031324274)(2017031322404)(1601125500)(1603101475)(1701031045); SRVR:SN1NAM04HT011; x-ms-traffictypediagnostic: SN1NAM04HT011: x-ms-exchange-purlcount: 1 x-microsoft-antispam-message-info: w8sESUNJ7ZVxnJEHurS4KNB345urv6lVqaLwre5UUq0gpWl0lSulRL+E2utXK19rRz4pSKY/+DkXaXhVJ5j4bnA/1CGurhGRnaoEDbCY3FndGVaQGafey4S1ZCCOVChSPy4sqBunt92cy2wYOp9DUCy9jnMnEjTHiSpvFZL8wdmb/7PsT2TWkFrY0GhKkV+y Content-Type: multipart/alternative; boundary="_000_BL0PR11MB3475CE9945A64367B885EA829E0B0BL0PR11MB3475namp_" MIME-Version: 1.0 X-OriginatorOrg: hotmail.com X-MS-Exchange-CrossTenant-RMS-PersistedConsumerOrg: 00000000-0000-0000-0000-000000000000 X-MS-Exchange-CrossTenant-Network-Message-Id: 1a262f40-5369-41ca-bc62-08d6dae8bb9b X-MS-Exchange-CrossTenant-rms-persistedconsumerorg: 00000000-0000-0000-0000-000000000000 X-MS-Exchange-CrossTenant-originalarrivaltime: 17 May 2019 16:57:23.3307 (UTC) X-MS-Exchange-CrossTenant-fromentityheader: Internet X-MS-Exchange-CrossTenant-id: 84df9e7f-e9f6-40af-b435-aaaaaaaaaaaa X-MS-Exchange-Transport-CrossTenantHeadersStamped: SN1NAM04HT011 X-Spam-Score: 1.3 (+) X-Spam-Report: Spam detection software, running on the system "debbugs.gnu.org", has NOT identified this incoming email as spam. The original message has been attached to this so you can view it or label similar future email. If you have any questions, see the administrator of that system for details. Content preview: When an xml file just says encoding="UTF-16", how does an application pick big endian vs little endian? From: npostavs@gmail.com Sent: Friday, May 17, 2019 4:27 PM To: Eli Zaretskii Cc: Noam Postavsky; jszabo_98@hotmail.com; Re: bug#35766: emacs saves utf-16 le xml f [...] 
Content analysis details: (1.3 points, 10.0 required) pts rule name description ---- ---------------------- -------------------------------------------------- 0.0 URIBL_BLOCKED ADMINISTRATOR NOTICE: The query to URIBL was blocked. See http://wiki.apache.org/spamassassin/DnsBlocklists#dnsbl-block for more information. [URIs: gnu.org] 0.0 FREEMAIL_FROM Sender email is commonly abused enduser mail provider (jszabo_98[at]hotmail.com) -0.0 SPF_PASS SPF: sender matches SPF record 0.2 FREEMAIL_ENVFROM_END_DIGIT Envelope-from freemail username ends in digit (jszabo_98[at]hotmail.com) -0.0 SPF_HELO_PASS SPF: HELO matches SPF record -0.0 RCVD_IN_DNSWL_NONE RBL: Sender listed at https://www.dnswl.org/, no trust [40.92.9.103 listed in list.dnswl.org] 0.0 HTML_MESSAGE BODY: HTML included in message 1.0 FREEMAIL_REPLY From and body contain different freemails X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.7 (/) --_000_BL0PR11MB3475CE9945A64367B885EA829E0B0BL0PR11MB3475namp_ Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable When an xml file just says encoding=3D"UTF-16", how does an application pic= k big endian vs little endian? ________________________________ From: npostavs@gmail.com Sent: Friday, May 17, 2019 4:27 PM To: Eli Zaretskii Cc: Noam Postavsky; jszabo_98@hotmail.com; 35766@debbugs.gnu.org Subject: Re: bug#35766: emacs saves utf-16 le xml files as utf-16 be Eli Zaretskii writes: > Perhaps we should by default produce encoding with BOM when XML header > specifies UTF-16? I think yes, https://www.w3.org/TR/xml/#charencoding says Entities encoded in UTF-16 MUST [...] begin with the Byte Order Mark By the way, is Bug#8282 the same as this one, or just closely related? 
It's talking about sgml-html-meta-auto-coding-function (though maybe
sgml-xml-auto-coding-function is more relevant). I'm getting a little
confused between all the different *-find/auto-coding-* functions.
There is also nxml-set-auto-coding which seems to be mostly unused.

--_000_BL0PR11MB3475CE9945A64367B885EA829E0B0BL0PR11MB3475namp_
Content-Type: text/html; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable

When an xml file just says encoding="UTF-16", how does an application pick big endian vs little endian?


From: npostavs@gmail.com <npostavs@gmail.com>
Sent: Friday, May 17, 2019 4:27 PM
To: Eli Zaretskii
Cc: Noam Postavsky; jszabo_98@hotmail.com; 35766@debbugs.gnu.org
Subject: Re: bug#35766: emacs saves utf-16 le xml files as utf-16 be
 
Eli Zaretskii <eliz@gnu.org> writes:

> Perhaps we should by default produce encoding with BOM when XML header
> specifies UTF-16?

I think yes, https://www.w3.org/TR/xml/#charencoding says

    Entities encoded in UTF-16 MUST [...] begin with the Byte Order Mark

By the way, is Bug#8282 the same as this one, or just closely related?
It's talking about sgml-html-meta-auto-coding-function (though maybe
sgml-xml-auto-coding-function is more relevant).  I'm getting a little
confused between all the different *-find/auto-coding-* functions.
There is also nxml-set-auto-coding which seems to be mostly unused.
--_000_BL0PR11MB3475CE9945A64367B885EA829E0B0BL0PR11MB3475namp_-- From unknown Sat Aug 16 21:23:56 2025 X-Loop: help-debbugs@gnu.org Subject: bug#35766: emacs saves utf-16 le xml files as utf-16 be Resent-From: Eli Zaretskii Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Fri, 17 May 2019 19:48:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 35766 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: To: J S Cc: 35766@debbugs.gnu.org, npostavs@gmail.com Received: via spool by 35766-submit@debbugs.gnu.org id=B35766.155812243311829 (code B ref 35766); Fri, 17 May 2019 19:48:02 +0000 Received: (at 35766) by debbugs.gnu.org; 17 May 2019 19:47:13 +0000 Received: from localhost ([127.0.0.1]:58992 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1hRioz-00034j-Ho for submit@debbugs.gnu.org; Fri, 17 May 2019 15:47:13 -0400 Received: from eggs.gnu.org ([209.51.188.92]:57651) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1hRiox-00034X-JO for 35766@debbugs.gnu.org; Fri, 17 May 2019 15:47:11 -0400 Received: from fencepost.gnu.org ([2001:470:142:3::e]:32905) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hRior-000679-Sv; Fri, 17 May 2019 15:47:05 -0400 Received: from [176.228.60.248] (port=2733 helo=home-c4e4a596f7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1hRior-000087-AR; Fri, 17 May 2019 15:47:05 -0400 Date: Fri, 17 May 2019 22:46:59 +0300 Message-Id: <83r28wamoc.fsf@gnu.org> From: Eli Zaretskii In-reply-to: (message from J S on Fri, 17 May 2019 16:57:23 +0000) References: <837eaqcl9g.fsf@gnu.org> <83lfz5bfed.fsf@gnu.org> <87a7fle1yp.fsf@gmail.com> <83tvdt9js7.fsf@gnu.org>, <85a7fldp15.fsf@gmail.com> X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Spam-Score: -2.3 (--) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list 
List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) > From: J S > CC: "35766@debbugs.gnu.org" <35766@debbugs.gnu.org> > Date: Fri, 17 May 2019 16:57:23 +0000 > > When an xml file just says encoding="UTF-16", how does an application pick big endian vs little endian? What is "an application" in this context? From unknown Sat Aug 16 21:23:56 2025 X-Loop: help-debbugs@gnu.org Subject: bug#35766: emacs saves utf-16 le xml files as utf-16 be Resent-From: J S Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Fri, 17 May 2019 20:17:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 35766 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: To: Eli Zaretskii Cc: "35766@debbugs.gnu.org" <35766@debbugs.gnu.org>, "npostavs@gmail.com" Received: via spool by 35766-submit@debbugs.gnu.org id=B35766.155812420914702 (code B ref 35766); Fri, 17 May 2019 20:17:02 +0000 Received: (at 35766) by debbugs.gnu.org; 17 May 2019 20:16:49 +0000 Received: from localhost ([127.0.0.1]:59036 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1hRjHd-0003p4-2x for submit@debbugs.gnu.org; Fri, 17 May 2019 16:16:49 -0400 Received: from mail-oln040092009096.outbound.protection.outlook.com ([40.92.9.96]:6308 helo=NAM04-BN3-obe.outbound.protection.outlook.com) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1hRjHb-0003oo-9I for 35766@debbugs.gnu.org; Fri, 17 May 2019 16:16:47 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=hotmail.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=FnwQccDqoIb4NCQDM7gG6JMuufDOZCX+DeFffh5ftG0=; 
b=V6Fj1X04y40zCBoYLNm7OWhAOw1CmLYEirwEetKGRejDxY7KqVJ3CSVHWfVkg69OcC8EBZiUk/3OzQa005i2tTziR7VhO75ZxZ9sViH8NQY5Izi94PeIROfhskZZY9nA7uOdgzaTtFArIYlmV51WLJl5/0ZIU02LrcxgiIgujLOERttK8z7ovBnpd8x87PgPCAAedLTAz9a+dq5NlfnrkhAXpSwQdbZExhRF+axXjnL5OfRpPkuc7tVWw6q7KWMwqGnjV6F4+/TQpcCOUYg/quCdNR1P0BQw2cJF5sGDiQAibcyF5ObwkpyIoc0UetXgC/QSmVx1VLiqSmzWUVaGYw== Received: from BN3NAM04FT029.eop-NAM04.prod.protection.outlook.com (10.152.92.52) by BN3NAM04HT006.eop-NAM04.prod.protection.outlook.com (10.152.92.92) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384) id 15.20.1856.15; Fri, 17 May 2019 20:16:41 +0000 Received: from BL0PR11MB3475.namprd11.prod.outlook.com (10.152.92.53) by BN3NAM04FT029.mail.protection.outlook.com (10.152.92.148) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384) id 15.20.1900.16 via Frontend Transport; Fri, 17 May 2019 20:16:41 +0000 Received: from BL0PR11MB3475.namprd11.prod.outlook.com ([fe80::111f:6124:13a4:baec]) by BL0PR11MB3475.namprd11.prod.outlook.com ([fe80::111f:6124:13a4:baec%7]) with mapi id 15.20.1900.010; Fri, 17 May 2019 20:16:41 +0000 From: J S Thread-Topic: bug#35766: emacs saves utf-16 le xml files as utf-16 be Thread-Index: AQHVDAo+Qjldgmx4b0y+0fxmDRw/MKZuEJPQgAAOtn2AAAGb0YAAGqeFgADRqwCAACD5gIAABqSpgAA/Xq2AAA6vR4AACAlMgAAvoiuAAAex1Q== Date: Fri, 17 May 2019 20:16:41 +0000 Message-ID: References: <837eaqcl9g.fsf@gnu.org> <83lfz5bfed.fsf@gnu.org> <87a7fle1yp.fsf@gmail.com> <83tvdt9js7.fsf@gnu.org>,<85a7fldp15.fsf@gmail.com> , <83r28wamoc.fsf@gnu.org> In-Reply-To: <83r28wamoc.fsf@gnu.org> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-incomingtopheadermarker: OriginalChecksum:BD794E6B90068C843E466CAD8B87A52EF4B5DCF0B3116F2344F941D127958A4E; UpperCasedChecksum:D424F18261300167425ED43976431F29529DDCE87C2B760CCC575176ECF168BA; SizeAsReceived:7473; Count:44 x-ms-exchange-messagesentrepresentingtype: 1 x-tmn: 
[s7lQdY1osMT/v5Va6eKHOwDApLMnId5G] x-ms-publictraffictype: Email x-incomingheadercount: 44 x-eopattributedmessage: 0 x-microsoft-antispam: BCL:0; PCL:0; RULEID:(2390118)(5050001)(7020095)(20181119110)(201702061078)(5061506573)(5061507331)(1603103135)(2017031320274)(2017031323274)(2017031324274)(2017031322404)(1601125500)(1603101475)(1701031045); SRVR:BN3NAM04HT006; x-ms-traffictypediagnostic: BN3NAM04HT006: x-microsoft-antispam-message-info: UYkyKparilgc8OzoKdU7bol0y/DnEaSHq+a1/VSpShrxVPTmY51iUUEgESYo7zQ23GPpEcgwQJszhGDo1KLIUVsp2gAiJLyk3MgRCUGACqfx9+d/pnYjHPkAICwZCAmEgxP9SB4owtWDIUdwWsxuJnuWoOhTmIcnTErpGGIMzQ1rlcAgUfZOhgPDxSPhEHSi Content-Type: multipart/alternative; boundary="_000_BL0PR11MB3475DC9D8B47FD88CE5E30E09E0B0BL0PR11MB3475namp_" MIME-Version: 1.0 X-OriginatorOrg: hotmail.com X-MS-Exchange-CrossTenant-RMS-PersistedConsumerOrg: 00000000-0000-0000-0000-000000000000 X-MS-Exchange-CrossTenant-Network-Message-Id: 2d510d30-0dcb-4736-12b8-08d6db04930c X-MS-Exchange-CrossTenant-rms-persistedconsumerorg: 00000000-0000-0000-0000-000000000000 X-MS-Exchange-CrossTenant-originalarrivaltime: 17 May 2019 20:16:41.1958 (UTC) X-MS-Exchange-CrossTenant-fromentityheader: Internet X-MS-Exchange-CrossTenant-id: 84df9e7f-e9f6-40af-b435-aaaaaaaaaaaa X-MS-Exchange-Transport-CrossTenantHeadersStamped: BN3NAM04HT006 X-Spam-Score: 1.3 (+) X-Spam-Report: Spam detection software, running on the system "debbugs.gnu.org", has NOT identified this incoming email as spam. The original message has been attached to this so you can view it or label similar future email. If you have any questions, see the administrator of that system for details. 
Content preview: For example, if I save this xml file in emacs, it saves it as utf-16 big endian: If I do this in powershell (really a .net method), it saves it as utf-16 little endian (osx or windows): Content analysis details: (1.3 points, 10.0 required) pts rule name description ---- ---------------------- -------------------------------------------------- 0.0 URIBL_BLOCKED ADMINISTRATOR NOTICE: The query to URIBL was blocked. See http://wiki.apache.org/spamassassin/DnsBlocklists#dnsbl-block for more information. [URIs: gnu.org] 0.0 FREEMAIL_FROM Sender email is commonly abused enduser mail provider (jszabo_98[at]hotmail.com) -0.0 SPF_PASS SPF: sender matches SPF record 0.2 FREEMAIL_ENVFROM_END_DIGIT Envelope-from freemail username ends in digit (jszabo_98[at]hotmail.com) -0.0 SPF_HELO_PASS SPF: HELO matches SPF record 0.0 HTML_MESSAGE BODY: HTML included in message -0.0 RCVD_IN_DNSWL_NONE RBL: Sender listed at https://www.dnswl.org/, no trust [40.92.9.96 listed in list.dnswl.org] 1.0 FREEMAIL_REPLY From and body contain different freemails X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.7 (/) --_000_BL0PR11MB3475DC9D8B47FD88CE5E30E09E0B0BL0PR11MB3475namp_ Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable For example, if I save this xml file in emacs, it saves it as utf-16 big en= dian: If I do this in powershell (really a .net method), it saves it as utf-16 li= ttle endian (osx or windows): [xml]$xml =3D get-content file.xml $xml.save('file.xml') ________________________________ From: Eli Zaretskii Sent: Friday, May 17, 2019 7:46 PM To: J S Cc: npostavs@gmail.com; 35766@debbugs.gnu.org Subject: Re: bug#35766: emacs saves utf-16 le xml files as utf-16 be > From: J S > CC: "35766@debbugs.gnu.org" 
<35766@debbugs.gnu.org>
> Date: Fri, 17 May 2019 16:57:23 +0000
>
> When an xml file just says encoding="UTF-16", how does an application pick big endian vs little endian?

What is "an application" in this context?

--_000_BL0PR11MB3475DC9D8B47FD88CE5E30E09E0B0BL0PR11MB3475namp_
Content-Type: text/html; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
For example, if I save this xml file in emacs, it saves it as utf-16 big endian:

<?xml version="1.0" encoding="UTF-16"?>


If I do this in powershell (really a .net method), it saves it as utf-16 little endian (osx or windows):

[xml]$xml = get-content file.xml
$xml.save('file.xml')
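
A quick way to see which byte order a tool actually wrote is to look at the first few bytes of the saved file. The following is only a minimal Python sketch of the byte layouts being discussed; the helper name first_bytes is made up for illustration, and it does not reflect what Emacs or the .NET method do internally.

```python
import codecs

# The XML declaration from the example file.
XML_DECL = '<?xml version="1.0" encoding="UTF-16"?>'


def first_bytes(data: bytes, count: int = 4) -> str:
    """Return the first `count` bytes as space-separated hex (hypothetical helper)."""
    return " ".join(f"{b:02x}" for b in data[:count])


# Same text, three possible on-disk forms.
big_endian = XML_DECL.encode("utf-16-be")            # no BOM, high byte first
little_endian = XML_DECL.encode("utf-16-le")         # no BOM, low byte first
little_endian_bom = codecs.BOM_UTF16_LE + little_endian  # BOM FF FE, then LE text

print(first_bytes(big_endian))         # 00 3c 00 3f
print(first_bytes(little_endian))      # 3c 00 3f 00
print(first_bytes(little_endian_bom))  # ff fe 3c 00
```

Comparing the output of a hex dump of file.xml against these patterns shows whether a given save produced big endian, little endian, or little endian with a byte order mark.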



From: Eli Zaretskii <eliz@gnu.org>
Sent: Friday, May 17, 2019 7:46 PM
To: J S
Cc: npostavs@gmail.com; 35766@debbugs.gnu.org
Subject: Re: bug#35766: emacs saves utf-16 le xml files as utf-16 be
 
> From: J S <jszabo_98@hotmail.com>
> CC: "35766@debbugs.gnu.org" <35766@debbugs.gnu.org>
> Date: Fri, 17 May 2019 16:57:23 +0000
>
> When an xml file just says encoding="UTF-16", how does an application pick big endian vs little endian?

What is "an application" in this context?
--_000_BL0PR11MB3475DC9D8B47FD88CE5E30E09E0B0BL0PR11MB3475namp_-- From unknown Sat Aug 16 21:23:56 2025 X-Loop: help-debbugs@gnu.org Subject: bug#35766: emacs saves utf-16 le xml files as utf-16 be Resent-From: Eli Zaretskii Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Sat, 18 May 2019 05:34:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 35766 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: To: J S Cc: 35766@debbugs.gnu.org, npostavs@gmail.com Received: via spool by 35766-submit@debbugs.gnu.org id=B35766.155815761210585 (code B ref 35766); Sat, 18 May 2019 05:34:02 +0000 Received: (at 35766) by debbugs.gnu.org; 18 May 2019 05:33:32 +0000 Received: from localhost ([127.0.0.1]:59478 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1hRryN-0002kf-QR for submit@debbugs.gnu.org; Sat, 18 May 2019 01:33:32 -0400 Received: from eggs.gnu.org ([209.51.188.92]:51624) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1hRryM-0002kS-1S for 35766@debbugs.gnu.org; Sat, 18 May 2019 01:33:30 -0400 Received: from fencepost.gnu.org ([2001:470:142:3::e]:41626) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hRryF-0002SV-5i; Sat, 18 May 2019 01:33:23 -0400 Received: from [176.228.60.248] (port=2719 helo=home-c4e4a596f7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1hRryE-0001nX-2E; Sat, 18 May 2019 01:33:22 -0400 Date: Sat, 18 May 2019 08:33:17 +0300 Message-Id: <83o9409vj6.fsf@gnu.org> From: Eli Zaretskii In-reply-to: (message from J S on Fri, 17 May 2019 20:16:41 +0000) References: <837eaqcl9g.fsf@gnu.org> <83lfz5bfed.fsf@gnu.org> <87a7fle1yp.fsf@gmail.com> <83tvdt9js7.fsf@gnu.org>,<85a7fldp15.fsf@gmail.com> , <83r28wamoc.fsf@gnu.org> X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Spam-Score: -2.3 (--) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 
2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) > From: J S > CC: "npostavs@gmail.com" , "35766@debbugs.gnu.org" > <35766@debbugs.gnu.org> > Date: Fri, 17 May 2019 20:16:41 +0000 > > For example, if I save this xml file in emacs, it saves it as utf-16 big endian: > > This is the Emacs default, which is well documented, and is also according to what the UTF-16 spec (RFC 2781) says. > If I do this in powershell (really a .net method), it saves it as utf-16 little endian (osx or windows): Then PowerShell behaves in violation of RFC 2781. From unknown Sat Aug 16 21:23:56 2025 X-Loop: help-debbugs@gnu.org Subject: bug#35766: emacs saves utf-16 le xml files as utf-16 be Resent-From: Eli Zaretskii Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Sat, 18 May 2019 07:27:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 35766 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: To: npostavs@gmail.com Cc: jszabo_98@hotmail.com, 35766@debbugs.gnu.org Received: via spool by 35766-submit@debbugs.gnu.org id=B35766.155816438521392 (code B ref 35766); Sat, 18 May 2019 07:27:01 +0000 Received: (at 35766) by debbugs.gnu.org; 18 May 2019 07:26:25 +0000 Received: from localhost ([127.0.0.1]:59508 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1hRtjc-0005Yt-Tw for submit@debbugs.gnu.org; Sat, 18 May 2019 03:26:25 -0400 Received: from eggs.gnu.org ([209.51.188.92]:42231) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1hRtjb-0005Yb-3C; Sat, 18 May 2019 03:26:23 -0400 Received: from fencepost.gnu.org ([2001:470:142:3::e]:42915) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hRtjT-0001Vs-Qv; Sat, 18 May 2019 03:26:17 -0400 Received: from [176.228.60.248] (port=1700 helo=home-c4e4a596f7) by 
fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1hRtjS-0007fS-R2; Sat, 18 May 2019 03:26:15 -0400 Date: Sat, 18 May 2019 10:26:09 +0300 Message-Id: <83ef4w9qb2.fsf@gnu.org> From: Eli Zaretskii In-reply-to: <85a7fldp15.fsf@gmail.com> (npostavs@gmail.com) References: <837eaqcl9g.fsf@gnu.org> <83lfz5bfed.fsf@gnu.org> <87a7fle1yp.fsf@gmail.com> <83tvdt9js7.fsf@gnu.org> <85a7fldp15.fsf@gmail.com> X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Spam-Score: -2.3 (--) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) merge 8282 35766 close 36766 thanks > From: npostavs@gmail.com > Cc: Noam Postavsky , jszabo_98@hotmail.com, 35766@debbugs.gnu.org > Date: Fri, 17 May 2019 12:27:50 -0400 > > Eli Zaretskii writes: > > > Perhaps we should by default produce encoding with BOM when XML header > > specifies UTF-16? > > I think yes, https://www.w3.org/TR/xml/#charencoding says > > Entities encoded in UTF-16 MUST [...] begin with the Byte Order Mark OK, I did that as well, and pushed the changes to master. > By the way, is Bug#8282 the same as this one, or just closely related? It's the same problem; merged the bugs. > It's talking about sgml-html-meta-auto-coding-function (though maybe > sgml-xml-auto-coding-function is more relevant). I'm getting a little > confused between all the different *-find/auto-coding-* functions. The function relevant for the recipe in bug#8282 is sgml-xml-auto-coding-function, which is where I made the changes. If the HTML and/or SGML specs also mandate that we use BOM, then maybe we need the same changes in sgml-html-meta-auto-coding-function as well. 
Note that there's no equivalent for xml-find-file-coding-system for non-XML files, so recognition of visited UTF-16 HTML files will not work even if they do have a BOM. > There is also nxml-set-auto-coding which seems to be mostly unused. It is supposed to be used by packages that build on top of nXml, AFAIU. Thanks. From debbugs-submit-bounces@debbugs.gnu.org Sat May 18 03:44:25 2019 Received: (at control) by debbugs.gnu.org; 18 May 2019 07:44:25 +0000 Received: from localhost ([127.0.0.1]:59517 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1hRu13-00085c-IX for submit@debbugs.gnu.org; Sat, 18 May 2019 03:44:25 -0400 Received: from eggs.gnu.org ([209.51.188.92]:44432) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1hRu10-00085O-K7 for control@debbugs.gnu.org; Sat, 18 May 2019 03:44:22 -0400 Received: from fencepost.gnu.org ([2001:470:142:3::e]:43049) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hRu0v-00077e-FD for control@debbugs.gnu.org; Sat, 18 May 2019 03:44:17 -0400 Received: from [176.228.60.248] (port=2805 helo=home-c4e4a596f7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1hRu0u-00084m-3Q for control@debbugs.gnu.org; Sat, 18 May 2019 03:44:16 -0400 Date: Sat, 18 May 2019 10:44:10 +0300 Message-Id: <83d0kg9ph1.fsf@gnu.org> From: Eli Zaretskii To: control@debbugs.gnu.org Subject: Close 35766 X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: control X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) close 35766 thanks From debbugs-submit-bounces@debbugs.gnu.org Sat May 18 07:29:41 2019 Received: (at control) by debbugs.gnu.org; 18 May 2019 11:29:42 
+0000 Received: from localhost ([127.0.0.1]:60028 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1hRxX3-0007bX-N2 for submit@debbugs.gnu.org; Sat, 18 May 2019 07:29:41 -0400 Received: from mail-io1-f50.google.com ([209.85.166.50]:44917) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1hRxX1-0007bK-FN for control@debbugs.gnu.org; Sat, 18 May 2019 07:29:39 -0400 Received: by mail-io1-f50.google.com with SMTP id f22so7483743iol.11 for ; Sat, 18 May 2019 04:29:39 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:subject:date:message-id:mime-version; bh=kuE16Gblw+x5ogjh+rDlj3sS44QwGRcJC498d0FMm4Y=; b=e9/H1mgTBkb4NbK1fm2aydFImAzZS1rtlS9TvnnV1kQYaVCUMgPFWKWi0ut0U67ipK ag2Nc0+YyqA1aB0LUNk1iBvmHdyQ76CT8zsj1h27hKw6Ax55Dn9573s7oYzhsoPMNFKD HMNKwiVTfzpfKPF3SK0L//ioXw6UAcZDH2TMOIom+IVfqrn/cHqPddfcpNpazXpjvpuj vrh+NV0neRwcaC5O7qhcWvs8RU0Zk0+HD4kByq+Tb/CLC93zF48r2VECiAk1Hi/bIhun qcsKK1zvGzM0RBR9kwBjWHncOSBRSLYNSkhWCegBRClLBItclvQNFk+N96HlDQ/xJaoQ uWKQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:mime-version; bh=kuE16Gblw+x5ogjh+rDlj3sS44QwGRcJC498d0FMm4Y=; b=dVSIDk3/RfTI+pAqVytuIq+Ztf65ofLkaMU00b/J3bSXQPsc//u7I/lGOkNUlbxXoe a4XVvL8jq+YDhj88TFzpECnLtfIxG5HMe9fQhAXH1Zon/RUxfw1AFmdlNMf40PSZc4nP FRs5IlSK6qR7xhjemqBdpaNq1OaDZVeTPl+neVB7GKWvvLxjo6673/tsA5IK9xTY7A68 y4nPgatNG/nk/cdbYBFn8ryZ/j2utpl4T733QAg4tevpGsZIZlXyOeoUorNWjxNpc/ZT I6I11xh0NMYUe1NYfPA3L7a6DwM29l+njhtPeprMtn0wCV5nidA8IYM2sdp9HdDxyMGV SCdA== X-Gm-Message-State: APjAAAXtqz4yxByt6st8h5hW+ZhxuDVUc3GBTjXIS2lr0VrD9o7BoXDe VKOGwAeDNA3l/ksEgZpMLG8/HGl7 X-Google-Smtp-Source: APXvYqxhdhSjQVf+H+s8rMiJmPNBCVyNAOrZ5gyeeNpMhyDaBDie9cJj+UbkoV1ymH9q6fViKeftQQ== X-Received: by 2002:a6b:dd07:: with SMTP id f7mr35463201ioc.244.1558178973745; Sat, 18 May 2019 04:29:33 -0700 (PDT) Received: from minid 
(cbl-45-2-119-34.yyz.frontiernetworks.ca. [45.2.119.34]) by smtp.gmail.com with ESMTPSA id x11sm3712463ion.10.2019.05.18.04.29.32 for (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Sat, 18 May 2019 04:29:32 -0700 (PDT) From: Noam Postavsky To: control@debbugs.gnu.org Subject: control message for bug #35766 Date: Sat, 18 May 2019 07:29:31 -0400 Message-ID: <874l5sdmqs.fsf@gmail.com> MIME-Version: 1.0 Content-Type: text/plain X-Spam-Score: 0.0 (/) X-Debbugs-Envelope-To: control X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-) fixed 35766 27.1 quit From unknown Sat Aug 16 21:23:56 2025 X-Loop: help-debbugs@gnu.org Subject: bug#35766: emacs saves utf-16 le xml files as utf-16 be Resent-From: J S Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Sat, 18 May 2019 20:58:03 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 35766 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: To: Eli Zaretskii Cc: "35766@debbugs.gnu.org" <35766@debbugs.gnu.org>, "npostavs@gmail.com" Received: via spool by 35766-submit@debbugs.gnu.org id=B35766.15582130811920 (code B ref 35766); Sat, 18 May 2019 20:58:03 +0000 Received: (at 35766) by debbugs.gnu.org; 18 May 2019 20:58:01 +0000 Received: from localhost ([127.0.0.1]:33063 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1hS6P2-0000Ut-Ko for submit@debbugs.gnu.org; Sat, 18 May 2019 16:58:00 -0400 Received: from mail-oln040092005045.outbound.protection.outlook.com ([40.92.5.45]:36226 helo=NAM02-SN1-obe.outbound.protection.outlook.com) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1hS6P0-0000UW-Ef for 35766@debbugs.gnu.org; Sat, 18 May 2019 16:57:59 -0400 DKIM-Signature: v=1; a=rsa-sha256; 
c=relaxed/relaxed; d=hotmail.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=g1b6Df9GU1Ab9Kq6exN7nxQWr44H4a04iOkkgwTIfp0=; b=tZr/rnEBSZoiv2U+wWMTXeApz3nGe9/6alzbd8mg6cIaHIPybVkP1FOflaR17JZiBwrgGjIzQZ9m2EhQ4cxAf5IJlW/AjzVz3ytH0/A0rlg8Dkj+4AwoeJamk/rTjqG2Lq+fVu2FJnR2EQtuJGb2Mq/3dDgytvnEnVFnP0JjWTHYr6aejWX3+6BBwdgysqLQBPbRRpO2G8BYdRjgEm6DrcAdWjCupNNKCGEDmoIWIrPHSgtVIEK2tj9ghFb9+7qh1WYXKjIWvBXw248zXvQ4eUf34ZDJyQrLKttp25W5rvAEBSINSQNhJEXfWPf6btV6oIWnT8LpiQ3dfy3rJE0zbg== Received: from BL2NAM02FT019.eop-nam02.prod.protection.outlook.com (10.152.76.55) by BL2NAM02HT054.eop-nam02.prod.protection.outlook.com (10.152.77.1) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384) id 15.20.1900.16; Sat, 18 May 2019 20:57:51 +0000 Received: from BL0PR11MB3475.namprd11.prod.outlook.com (10.152.76.58) by BL2NAM02FT019.mail.protection.outlook.com (10.152.77.166) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384) id 15.20.1900.16 via Frontend Transport; Sat, 18 May 2019 20:57:51 +0000 Received: from BL0PR11MB3475.namprd11.prod.outlook.com ([fe80::111f:6124:13a4:baec]) by BL0PR11MB3475.namprd11.prod.outlook.com ([fe80::111f:6124:13a4:baec%7]) with mapi id 15.20.1900.010; Sat, 18 May 2019 20:57:51 +0000 From: J S Thread-Topic: bug#35766: emacs saves utf-16 le xml files as utf-16 be Thread-Index: AQHVDAo+Qjldgmx4b0y+0fxmDRw/MKZuEJPQgAAOtn2AAAGb0YAAGqeFgADRqwCAACD5gIAABqSpgAA/Xq2AAA6vR4AACAlMgAAvoiuAAAex1YAAnB/ogAEAfdY= Date: Sat, 18 May 2019 20:57:51 +0000 Message-ID: References: <837eaqcl9g.fsf@gnu.org> <83lfz5bfed.fsf@gnu.org> <87a7fle1yp.fsf@gmail.com> <83tvdt9js7.fsf@gnu.org>,<85a7fldp15.fsf@gmail.com> , <83r28wamoc.fsf@gnu.org> , <83o9409vj6.fsf@gnu.org> In-Reply-To: <83o9409vj6.fsf@gnu.org> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-incomingtopheadermarker: 
OriginalChecksum:F51A0B925050C6F9FA4ABA41B8661FB8F164D6E92EDE4D4DC1F0580B12106D5E; UpperCasedChecksum:08908E137B6B01A57112A0263DDBE48933404A43E7EEADCE186B50B7D3EDAF16; SizeAsReceived:7585; Count:44 x-ms-exchange-messagesentrepresentingtype: 1 x-tmn: [RZZhLndqI+obpTM6CZjpZ8VFvILXrj6m] x-ms-publictraffictype: Email x-incomingheadercount: 44 x-eopattributedmessage: 0 x-microsoft-antispam: BCL:0; PCL:0; RULEID:(2390118)(5050001)(7020095)(20181119110)(201702061078)(5061506573)(5061507331)(1603103135)(2017031320274)(2017031323274)(2017031324274)(2017031322404)(1601125500)(1603101475)(1701031045); SRVR:BL2NAM02HT054; x-ms-traffictypediagnostic: BL2NAM02HT054: x-microsoft-antispam-message-info: vIx9bkbbwP5C3SBi7wj17HhSUL4lJKOoFGp52RTo1Dq6kf6okrtxeEII6Ge43xhsPRQ83hqiD5B/h8niXb/oOlrLFxFCGmDfwXpZfda3QseYisL9a5iaX8fRIYGA3h1gf+Zcyth/qR9YIpiooBvjwQ4u+m4i+nhlnrt1ROQzjNX1XdoHERlA11yGHWCqNlRN Content-Type: multipart/alternative; boundary="_000_BL0PR11MB3475AB95C0692C46A6D26F5C9E040BL0PR11MB3475namp_" MIME-Version: 1.0 X-OriginatorOrg: hotmail.com X-MS-Exchange-CrossTenant-RMS-PersistedConsumerOrg: 00000000-0000-0000-0000-000000000000 X-MS-Exchange-CrossTenant-Network-Message-Id: cadd3165-43a9-4619-e063-08d6dbd37e02 X-MS-Exchange-CrossTenant-rms-persistedconsumerorg: 00000000-0000-0000-0000-000000000000 X-MS-Exchange-CrossTenant-originalarrivaltime: 18 May 2019 20:57:51.7834 (UTC) X-MS-Exchange-CrossTenant-fromentityheader: Internet X-MS-Exchange-CrossTenant-id: 84df9e7f-e9f6-40af-b435-aaaaaaaaaaaa X-MS-Exchange-Transport-CrossTenantHeadersStamped: BL2NAM02HT054 X-Spam-Score: 1.3 (+) X-Spam-Report: Spam detection software, running on the system "debbugs.gnu.org", has NOT identified this incoming email as spam. The original message has been attached to this so you can view it or label similar future email. If you have any questions, see the administrator of that system for details. 
Content preview: RFC 2781 under "4.3 Interpreting text labelled as UTF-16" says is that if a document is labelled "UTF-16", the application should check the byte order mark to see if it is little endian or big endian [...] Content analysis details: (1.3 points, 10.0 required) pts rule name description ---- ---------------------- -------------------------------------------------- 0.0 URIBL_BLOCKED ADMINISTRATOR NOTICE: The query to URIBL was blocked. See http://wiki.apache.org/spamassassin/DnsBlocklists#dnsbl-block for more information. [URIs: gnu.org] 0.0 FREEMAIL_FROM Sender email is commonly abused enduser mail provider (jszabo_98[at]hotmail.com) -0.0 RCVD_IN_DNSWL_NONE RBL: Sender listed at https://www.dnswl.org/, no trust [40.92.5.45 listed in list.dnswl.org] 0.2 FREEMAIL_ENVFROM_END_DIGIT Envelope-from freemail username ends in digit (jszabo_98[at]hotmail.com) -0.0 SPF_PASS SPF: sender matches SPF record -0.0 SPF_HELO_PASS SPF: HELO matches SPF record 0.0 HTML_MESSAGE BODY: HTML included in message 1.0 FREEMAIL_REPLY From and body contain different freemails X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.7 (/) --_000_BL0PR11MB3475AB95C0692C46A6D26F5C9E040BL0PR11MB3475namp_ Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable RFC 2781 under "4.3 Interpreting text labelled as UTF-16" says is that if a= document is labelled "UTF-16", the application should check the byte order= mark to see if it is little endian or big endian Only if there's no byte= order mark, should the document be interpreted as big endian. 
________________________________ From: Eli Zaretskii Sent: Saturday, May 18, 2019 5:33 AM To: J S Cc: npostavs@gmail.com; 35766@debbugs.gnu.org Subject: Re: bug#35766: emacs saves utf-16 le xml files as utf-16 be > From: J S > CC: "npostavs@gmail.com" , "35766@debbugs.gnu.org" > <35766@debbugs.gnu.org> > Date: Fri, 17 May 2019 20:16:41 +0000 > > For example, if I save this xml file in emacs, it saves it as utf-16 big = endian: > > This is the Emacs default, which is well documented, and is also according to what the UTF-16 spec (RFC 2781) says. > If I do this in powershell (really a .net method), it saves it as utf-16 = little endian (osx or windows): Then PowerShell behaves in violation of RFC 2781. --_000_BL0PR11MB3475AB95C0692C46A6D26F5C9E040BL0PR11MB3475namp_ Content-Type: text/html; charset="us-ascii" Content-Transfer-Encoding: quoted-printable
RFC 2781, under "4.3 Interpreting text labelled as UTF-16", says that if a document is labelled "UTF-16", the application should check the byte order mark to see if it is little endian or big endian. Only if there's no byte order mark should the document be interpreted as big endian.
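
The rule above can be sketched in a few lines of Python. This is only an illustration of the reading rule as described here, under the assumption that the input is the raw bytes of a document labelled plain "UTF-16"; the function name decode_labelled_utf16 is invented for the example and is not part of Emacs or any library.

```python
import codecs


def decode_labelled_utf16(data: bytes) -> str:
    """Decode bytes labelled "UTF-16": honour a BOM if present, else assume big endian."""
    if data.startswith(codecs.BOM_UTF16_LE):   # FF FE: little endian
        return data[len(codecs.BOM_UTF16_LE):].decode("utf-16-le")
    if data.startswith(codecs.BOM_UTF16_BE):   # FE FF: big endian
        return data[len(codecs.BOM_UTF16_BE):].decode("utf-16-be")
    # No byte order mark: interpret the text as big endian.
    return data.decode("utf-16-be")


# The same two characters in each of the three forms decode identically.
print(decode_labelled_utf16(b"\xff\xfe<\x00?\x00"))  # little endian with BOM
print(decode_labelled_utf16(b"\xfe\xff\x00<\x00?"))  # big endian with BOM
print(decode_labelled_utf16(b"\x00<\x00?"))          # no BOM, read as big endian
```

Note that this only covers reading. It says nothing about which byte order, or whether a byte order mark, should be written when saving.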


From: Eli Zaretskii <eliz@gnu.org>
Sent: Saturday, May 18, 2019 5:33 AM
To: J S
Cc: npostavs@gmail.com; 35766@debbugs.gnu.org
Subject: Re: bug#35766: emacs saves utf-16 le xml files as utf-16 be
 
> From: J S <jszabo_98@hotmail.com>
> CC: "npostavs@gmail.com" <npostavs@gmail.com>, "35766@debbugs.gnu.org"
>        <35766@debbugs.gnu.org>
> Date: Fri, 17 May 2019 20:16:41 +0000
>
> For example, if I save this xml file in emacs, it saves it as utf-16 big endian:
>
> <?xml version="1.0" encoding="UTF-16"?>

This is the Emacs default, which is well documented, and is also
according to what the UTF-16 spec (RFC 2781) says.

> If I do this in powershell (really a .net method), it saves it as utf-16 little endian (osx or windows):

Then PowerShell behaves in violation of RFC 2781.
--_000_BL0PR11MB3475AB95C0692C46A6D26F5C9E040BL0PR11MB3475namp_-- From unknown Sat Aug 16 21:23:56 2025 X-Loop: help-debbugs@gnu.org Subject: bug#35766: emacs saves utf-16 le xml files as utf-16 be Resent-From: Eli Zaretskii Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Sun, 19 May 2019 04:59:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 35766 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: To: J S Cc: "35766@debbugs.gnu.org" <35766@debbugs.gnu.org>, "npostavs@gmail.com" Received: via spool by 35766-submit@debbugs.gnu.org id=B35766.155824189414701 (code B ref 35766); Sun, 19 May 2019 04:59:01 +0000 Received: (at 35766) by debbugs.gnu.org; 19 May 2019 04:58:14 +0000 Received: from localhost ([127.0.0.1]:33605 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1hSDtm-0003p3-3A for submit@debbugs.gnu.org; Sun, 19 May 2019 00:58:14 -0400 Received: from eggs.gnu.org ([209.51.188.92]:33307) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1hSDtk-0003oq-J1 for 35766@debbugs.gnu.org; Sun, 19 May 2019 00:58:12 -0400 Received: from fencepost.gnu.org ([2001:470:142:3::e]:58480) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hSDte-0005Ym-OV; Sun, 19 May 2019 00:58:06 -0400 Received: from [176.12.219.164] (port=36765 helo=[10.209.164.91]) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_128_CBC_SHA1:128) (Exim 4.82) (envelope-from ) id 1hSDtd-0006Vs-Be; Sun, 19 May 2019 00:58:06 -0400 Date: Sun, 19 May 2019 07:58:01 +0300 User-Agent: K-9 Mail for Android In-Reply-To: References: <837eaqcl9g.fsf@gnu.org> <83lfz5bfed.fsf@gnu.org> <87a7fle1yp.fsf@gmail.com> <83tvdt9js7.fsf@gnu.org>, <85a7fldp15.fsf@gmail.com> , <83r28wamoc.fsf@gnu.org> , <83o9409vj6.fsf@gnu.org> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable From: Eli Zaretskii Message-ID: <062CAA71-E754-4977-90FC-9F94487415A7@gnu.org> 
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Spam-Score: -2.3 (--) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) On May 18, 2019 11:57:51 PM GMT+03:00, J S wrote: > RFC 2781 under "4=2E3 Interpreting text labelled as UTF-16" says is that > if a document is labelled "UTF-16", the application should check the > byte order mark to see if it is little endian or big endian Only if > there's no byte order mark, should the document be interpreted as big > endian=2E >=20 If you are talking about visiting an existing file, then the change I inst= alled does just that=2E I was talking about saving a file, in which case t= here's no BOM, since it isn't present in the buffer=20 From unknown Sat Aug 16 21:23:56 2025 X-Loop: help-debbugs@gnu.org Subject: bug#35766: emacs saves utf-16 le xml files as utf-16 be Resent-From: J S Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Sun, 19 May 2019 14:13:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 35766 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: To: Eli Zaretskii Cc: "35766@debbugs.gnu.org" <35766@debbugs.gnu.org>, "npostavs@gmail.com" Received: via spool by 35766-submit@debbugs.gnu.org id=B35766.155827514421648 (code B ref 35766); Sun, 19 May 2019 14:13:02 +0000 Received: (at 35766) by debbugs.gnu.org; 19 May 2019 14:12:24 +0000 Received: from localhost ([127.0.0.1]:35237 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1hSMY4-0005d5-8f for submit@debbugs.gnu.org; Sun, 19 May 2019 10:12:24 -0400 Received: from mail-oln040092005077.outbound.protection.outlook.com ([40.92.5.77]:16848 helo=NAM02-SN1-obe.outbound.protection.outlook.com) by debbugs.gnu.org with esmtp (Exim 
4.84_2) (envelope-from ) id 1hSMY1-0005cr-5B for 35766@debbugs.gnu.org; Sun, 19 May 2019 10:12:21 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=hotmail.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=XKTFyOlD2/BFFpccu27+9un6SprXfT62owNR7vsN8r8=; b=e+AUPJE4ezM/nlOf0HBFJISr3Ys29dtKLPScbASqgNZizeJGAdAUzAoFeKc4tQ8Y12N2L5ouGv0e0e3I7uoBhC3jjg/YZOGboDqbrQU/IuxPjcUtZEQbRgr3vRncrQ+/W9plWyQWhD9DDGCxvSWQ2VBtWLSeADAYvbr/K3vgwGS9Ner9Kis6ZOvJ1T1Rfs9AeXEOG/TM6h0RhhTRhTutx5qAv00eRTbyo9uTz0z2DsYO2x5L/40joshfnlig5EBktIKQWSzSy0s3mYHr8GHjNmGgzUfWuWm1an8yXEv6/sEjCQ9FyxxANfzraLFZiJ7gK7sSzyCGl+LaLZzDzt6gcQ== Received: from CY1NAM02FT027.eop-nam02.prod.protection.outlook.com (10.152.74.55) by CY1NAM02HT185.eop-nam02.prod.protection.outlook.com (10.152.74.77) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384) id 15.20.1900.16; Sun, 19 May 2019 14:12:14 +0000 Received: from BL0PR11MB3475.namprd11.prod.outlook.com (10.152.74.56) by CY1NAM02FT027.mail.protection.outlook.com (10.152.75.159) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384) id 15.20.1900.16 via Frontend Transport; Sun, 19 May 2019 14:12:14 +0000 Received: from BL0PR11MB3475.namprd11.prod.outlook.com ([fe80::111f:6124:13a4:baec]) by BL0PR11MB3475.namprd11.prod.outlook.com ([fe80::111f:6124:13a4:baec%7]) with mapi id 15.20.1900.020; Sun, 19 May 2019 14:12:14 +0000 From: J S Thread-Topic: bug#35766: emacs saves utf-16 le xml files as utf-16 be Thread-Index: AQHVDAo+Qjldgmx4b0y+0fxmDRw/MKZuEJPQgAAOtn2AAAGb0YAAGqeFgADRqwCAACD5gIAABqSpgAA/Xq2AAA6vR4AACAlMgAAvoiuAAAex1YAAnB/ogAEAfdaAAIfxgIAAms2g Date: Sun, 19 May 2019 14:12:14 +0000 Message-ID: References: <837eaqcl9g.fsf@gnu.org> <83lfz5bfed.fsf@gnu.org> <87a7fle1yp.fsf@gmail.com> <83tvdt9js7.fsf@gnu.org>,<85a7fldp15.fsf@gmail.com> , <83r28wamoc.fsf@gnu.org> , <83o9409vj6.fsf@gnu.org> , <062CAA71-E754-4977-90FC-9F94487415A7@gnu.org> 
In-Reply-To: <062CAA71-E754-4977-90FC-9F94487415A7@gnu.org> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-incomingtopheadermarker: OriginalChecksum:7D6D6A72EC6583E11B1E08FBED355D4887F6526293BD2536AEEBDDAEC896B78B; UpperCasedChecksum:447DBD24F5512074C3FB3FF2549DF2C47A11D70208296DB287A8C85896ABE4F7; SizeAsReceived:7757; Count:44 x-ms-exchange-messagesentrepresentingtype: 1 x-tmn: [nGvSZqjAVPFQxrPEBQTIUZtZrvzqKJC7] x-ms-publictraffictype: Email x-incomingheadercount: 44 x-eopattributedmessage: 0 x-microsoft-antispam: BCL:0; PCL:0; RULEID:(2390118)(5050001)(7020095)(20181119110)(201702061078)(5061506573)(5061507331)(1603103135)(2017031320274)(2017031323274)(2017031324274)(2017031322404)(1601125500)(1603101475)(1701031045); SRVR:CY1NAM02HT185; x-ms-traffictypediagnostic: CY1NAM02HT185: x-microsoft-antispam-message-info: dL4Hl37d/1ypJu3R7wAokBIENYh0Riry8pfA8T3xlcVKDjXCv6LTakBugiD6MLPEVOBLaYTboGe70CHS6j8k3W0Hob8GhVfbaXaZ/P9N7wwoAaJZ3vZPKgUDmi+tWs5M3Y1eIxWlwJrmDJneNxKwafRUMB1I/m1OSXRY7PH9j3jaRGaGLZxig2NYyE7rDE+h Content-Type: multipart/alternative; boundary="_000_BL0PR11MB3475CD395124344ECF9E6C8A9E050BL0PR11MB3475namp_" MIME-Version: 1.0 X-OriginatorOrg: hotmail.com X-MS-Exchange-CrossTenant-RMS-PersistedConsumerOrg: 00000000-0000-0000-0000-000000000000 X-MS-Exchange-CrossTenant-Network-Message-Id: eed0fd19-bc1a-448d-56e4-08d6dc63fe34 X-MS-Exchange-CrossTenant-rms-persistedconsumerorg: 00000000-0000-0000-0000-000000000000 X-MS-Exchange-CrossTenant-originalarrivaltime: 19 May 2019 14:12:14.3570 (UTC) X-MS-Exchange-CrossTenant-fromentityheader: Internet X-MS-Exchange-CrossTenant-id: 84df9e7f-e9f6-40af-b435-aaaaaaaaaaaa X-MS-Exchange-Transport-CrossTenantHeadersStamped: CY1NAM02HT185 X-Spam-Score: 1.3 (+) X-Spam-Report: Spam detection software, running on the system "debbugs.gnu.org", has NOT identified this incoming email as spam. The original message has been attached to this so you can view it or label similar future email. 
If you have any questions, see the administrator of that system for details. Content preview: Sounds good. From: Eli Zaretskii Sent: Sunday, May 19, 2019 4:58 AM To: J S Cc: npostavs@gmail.com; Re: bug#35766: emacs saves utf-16 le xml files as utf-16 be On May 18, 2019 11:57:51 PM GMT+03:00, J S wrote: > RFC 2781 under "4.3 Interpreting text labelled as UTF-16" says is that > if a document is labelled "UTF-16", the application should check the > by [...] Content analysis details: (1.3 points, 10.0 required) pts rule name description ---- ---------------------- -------------------------------------------------- 0.0 URIBL_BLOCKED ADMINISTRATOR NOTICE: The query to URIBL was blocked. See http://wiki.apache.org/spamassassin/DnsBlocklists#dnsbl-block for more information. [URIs: gnu.org] -0.0 RCVD_IN_DNSWL_NONE RBL: Sender listed at https://www.dnswl.org/, no trust [40.92.5.77 listed in list.dnswl.org] -0.0 SPF_PASS SPF: sender matches SPF record -0.0 SPF_HELO_PASS SPF: HELO matches SPF record 0.0 FREEMAIL_FROM Sender email is commonly abused enduser mail provider (jszabo_98[at]hotmail.com) 0.2 FREEMAIL_ENVFROM_END_DIGIT Envelope-from freemail username ends in digit (jszabo_98[at]hotmail.com) 0.0 HTML_MESSAGE BODY: HTML included in message 1.0 FREEMAIL_REPLY From and body contain different freemails X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.7 (/) --_000_BL0PR11MB3475CD395124344ECF9E6C8A9E050BL0PR11MB3475namp_ Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable Sounds good. 
________________________________ From: Eli Zaretskii Sent: Sunday, May 19, 2019 4:58 AM To: J S Cc: npostavs@gmail.com; 35766@debbugs.gnu.org Subject: Re: bug#35766: emacs saves utf-16 le xml files as utf-16 be On May 18, 2019 11:57:51 PM GMT+03:00, J S wrote: > RFC 2781 under "4.3 Interpreting text labelled as UTF-16" says is that > if a document is labelled "UTF-16", the application should check the > byte order mark to see if it is little endian or big endian Only if > there's no byte order mark, should the document be interpreted as big > endian. > If you are talking about visiting an existing file, then the change I insta= lled does just that. I was talking about saving a file, in which case ther= e's no BOM, since it isn't present in the buffer --_000_BL0PR11MB3475CD395124344ECF9E6C8A9E050BL0PR11MB3475namp_ Content-Type: text/html; charset="us-ascii" Content-Transfer-Encoding: quoted-printable
Sounds good.

From: Eli Zaretskii <eliz@gnu.org>
Sent: Sunday, May 19, 2019 4:58 AM
To: J S
Cc: npostavs@gmail.com; 35766@debbugs.gnu.org
Subject: Re: bug#35766: emacs saves utf-16 le xml files as utf-16 be
 
On May 18, 2019 11:57:51 PM GMT+03:00, J S <jszabo_98@hotmail.com> wrote:
> RFC 2781 under "4.3 Interpreting text labelled as UTF-16" says is that
> if a document is labelled "UTF-16", the application should check the
> byte order mark to see if it is little endian or big endian   Only if
> there's no byte order mark, should the document be interpreted as big
> endian.
>


If you are talking about visiting an existing file, then the change I installed does just that.  I was talking about saving a file, in which case there's no BOM, since it isn't present in the buffer.
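
Since the byte order mark is not part of the text being saved, whatever writes the file has to prepend it explicitly. A minimal Python sketch of that idea follows; it is not the change that was installed in Emacs, and the function name encode_utf16_with_bom and its little_endian parameter are hypothetical.

```python
import codecs


def encode_utf16_with_bom(text: str, little_endian: bool = True) -> bytes:
    """Encode `text` as UTF-16 and prepend the matching byte order mark."""
    if little_endian:
        # FF FE followed by low-byte-first code units.
        return codecs.BOM_UTF16_LE + text.encode("utf-16-le")
    # FE FF followed by high-byte-first code units.
    return codecs.BOM_UTF16_BE + text.encode("utf-16-be")


decl = '<?xml version="1.0" encoding="UTF-16"?>'
print(encode_utf16_with_bom(decl)[:4].hex())                       # fffe3c00
print(encode_utf16_with_bom(decl, little_endian=False)[:4].hex())  # feff003c
```

Either output starts with a byte order mark, so a reader that checks the mark first can recover the byte order without guessing.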
--_000_BL0PR11MB3475CD395124344ECF9E6C8A9E050BL0PR11MB3475namp_--