From: Mark H Weaver
To: 33053@debbugs.gnu.org
Date: Mon, 15 Oct 2018 16:45:40 -0400
Subject: bug#33053: scm_i_mirror_backslashes assumes ASCII-compatible locale encoding
Message-ID: <87lg6zos5n.fsf@netris.org>

The 'scm_i_mirror_backslashes' function in load.c operates on C strings
in the locale encoding, and assumes that the locale encoding is ASCII
compatible.  In the Shift_JIS encoding, used in the "JP_jp.sjis" locale,
backslash '\' is mapped to a multibyte character, and the Yen sign '¥'
is represented using code 0x5C, the same code as backslash '\' in ASCII.

As a result, users of the "JP_jp.sjis" locale will have Yen signs '¥' in
their file names converted into slashes by this function.

Mark

From: Mark H Weaver
To: 33053@debbugs.gnu.org
Date: Mon, 15 Oct 2018 19:06:53 -0400
Subject: bug#33053: scm_i_mirror_backslashes assumes ASCII-compatible locale encoding
Message-ID: <87d0saq06q.fsf@netris.org>
In-Reply-To: <87lg6zos5n.fsf@netris.org>

Mark H Weaver writes:

> The 'scm_i_mirror_backslashes' in load.c operates on C strings in the
> locale encoding, and assumes that the locale encoding is ASCII
> compatible.  In the Shift_JIS encoding, used in the "JP_jp.sjis" locale,
> backslash '\' is mapped to a multibyte character, and the Yen sign '¥'
> is represented using code 0x5C, the same code as backslash '\' in ASCII.
>
> As a result, users of the "JP_jp.sjis" locale will have Yen signs '¥' in
> their file names converted into slashes by this function.

I miswrote the locale name above.  The locale name is "ja_JP.sjis".

Mark

From: Mark H Weaver
To: 33053@debbugs.gnu.org
Date: Fri, 19 Oct 2018 21:21:49 -0400
Subject: bug#33053: scm_i_mirror_backslashes assumes ASCII-compatible locale encoding
Message-ID: <87va5x4dle.fsf@netris.org>
In-Reply-To: <87d0saq06q.fsf@netris.org>

tags 33053 + notabug
close 33053
thanks

Mark H Weaver writes:

> Mark H Weaver writes:
>
>> The 'scm_i_mirror_backslashes' in load.c operates on C strings in the
>> locale encoding, and assumes that the locale encoding is ASCII
>> compatible.  In the Shift_JIS encoding, used in the "JP_jp.sjis" locale,
>> backslash '\' is mapped to a multibyte character, and the Yen sign '¥'
>> is represented using code 0x5C, the same code as backslash '\' in ASCII.
>>
>> As a result, users of the "JP_jp.sjis" locale will have Yen signs '¥' in
>> their file names converted into slashes by this function.
>
> I miswrote the locale name above.  The locale name is "ja_JP.sjis".

It seems that I was mistaken in my assumption that '\' is mapped to a
multibyte character in Shift_JIS.  According to John Cowan, "the
character at #\x5C is *functionally* a backslash that is *displayed* as
a yen sign".

It seems that this is not actually a bug in 'scm_i_mirror_backslashes',
so I'm closing this bug.

Mark
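
For readers following the thread, here is a minimal sketch of the kind of
byte-wise replacement under discussion.  It is not the Guile source: the
function name 'mirror_backslashes_bytewise' is hypothetical, and the
sketch only assumes what the messages above state, namely that the
string is a C string in the locale encoding and that every byte 0x5C is
turned into a slash.

```c
#include <stddef.h>

/* Hypothetical illustration, NOT the actual 'scm_i_mirror_backslashes'
   from load.c.  Replaces every byte equal to 0x5C with '/' in place,
   without consulting the locale encoding.  */
char *
mirror_backslashes_bytewise (char *path)
{
  if (path == NULL)
    return NULL;

  for (char *p = path; *p != '\0'; p++)
    {
      /* 0x5C is '\' in ASCII.  Per the thread, the same byte in
         Shift_JIS is functionally a backslash that is displayed as a
         yen sign, so it is rewritten here in either encoding.  */
      if (*p == 0x5C)
        *p = '/';
    }

  return path;
}
```

Because the loop compares raw bytes, it treats a file name typed under
"ja_JP.sjis" and one typed under an ASCII locale identically: whatever
glyph the terminal shows for 0x5C, the byte is rewritten as a path
separator, which matches the conclusion above that 0x5C acts as a
backslash.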