Date: Tue, 27 Feb 2024 20:14:58 +0000
From: Zefram <zefram@fysh.org>
To: bug-guile@gnu.org
Subject: basename faulty with nul and suffix

Trying out the basename function in Guile 3.0.9:

scheme@(guile-user)> (basename "foo/bar")
$1 = "bar"
scheme@(guile-user)> (basename "foo/bar" "r")
$2 = "ba"
scheme@(guile-user)> (basename "foo/bar" "x")
$3 = "bar"
scheme@(guile-user)> (basename "foo/bar\0baz/quux")
$4 = "bar"
scheme@(guile-user)> (basename "foo/bar\0baz/quux" "r")
$5 = "bar"
scheme@(guile-user)> (basename "foo/bar\0baz/quux" "x")
$6 = "ba"

The first three cases here show the function operating correctly on a
mundane pathname string.

The fourth case shows it operating in a debatable manner on a pathname
string that has an embedded nul. The treatment of this case is based on
the idea that the string is acceptable as a pathname for file I/O
functions, and that the nul will serve to terminate the pathname (which
is what happens naturally in a naive treatment of passing the string to
a system call). I note that the open-file function had that treatment
of embedded nuls in Guile 1.6, but that since Guile 1.8 it has instead
signalled an error on such a pathname string. Are there any remaining
functions that accept embedded nuls in pathname strings? If not, then
the basename and dirname functions probably ought to correspondingly
signal an error for an embedded nul.
(Incidentally, basename and dirname had unambiguously incorrect
treatment of embedded nuls prior to Guile 2.0, so there's never yet
been a version in which open-file and basename had matching treatment
of embedded nuls.)

But what I'm really interested in here is the fifth and sixth cases,
where there's an embedded nul in the pathname string and also a suffix
argument. Accepting the interpretation of the embedded nul as correct,
the treatment here of the suffix is clearly faulty. Whether the suffix
matches is being incorrectly checked against the original string,
before nul truncation, but the suffix removal is being applied to the
correct basename. It's possible to give a suffix that matches the
untruncated string but is longer than the true basename, causing the
suffix removal to error due to indexing outside the string. Whether
the suffix matches should instead be checked against the nul-truncated
string.

-zefram


Date: Wed, 28 Feb 2024 06:31:52 +0100
From: tomas@tuxteam.de
To: bug-guile@gnu.org
Subject: Re: bug#69438: basename faulty with nul and suffix

On Tue, Feb 27, 2024 at 08:14:58PM +0000, Zefram via Bug reports for GUILE, GNU's Ubiquitous Extension Language wrote:
> Trying out the basename function in Guile 3.0.9:
>
> scheme@(guile-user)> (basename "foo/bar")
> $1 = "bar"
> scheme@(guile-user)> (basename "foo/bar" "r")
> $2 = "ba"
> scheme@(guile-user)> (basename "foo/bar" "x")
> $3 = "bar"
> scheme@(guile-user)> (basename "foo/bar\0baz/quux")
> $4 = "bar"
> scheme@(guile-user)> (basename "foo/bar\0baz/quux" "r")
> $5 = "bar"
> scheme@(guile-user)> (basename "foo/bar\0baz/quux" "x")
> $6 = "ba"
>
> The first three cases here show the function operating correctly on a
> mundane pathname string.

Hm. The functions are clearly designated as POSIX. In POSIX, \0 is
explicitly excluded as a legal character in paths [1] [2].

I guess the implementation is just passing the buck to the OS's
basename(3). Arguably, it'd be friendlier for the function to throw an
error, but then, it'd be inconsistent with the OS.

You can't win :-)

Perhaps, a note in the docs wouldn't harm, though.

Cheers

[1] https://pubs.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap03.html#tag_03_266
[2] https://pubs.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap03.html#tag_03_169

-- 
t
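
For illustration, the suffix mismatch described in the original report
(cases five and six) can be modelled outside Guile. The following is a
minimal Python sketch, not Guile's actual implementation: the function
names (c_string_view, final_component, basename_as_reported,
basename_with_fix) are hypothetical, the basename logic is simplified,
and the IndexError only stands in for the out-of-range indexing error
mentioned in the report. It assumes the nul-truncation interpretation
that the report accepts for the sake of argument.

```python
from typing import Optional


def c_string_view(s: str) -> str:
    """Return the part of s before the first NUL, as a C-string consumer would see it."""
    return s.split("\0", 1)[0]


def final_component(path: str) -> str:
    """Simplified basename: the last component, ignoring trailing slashes."""
    stripped = path.rstrip("/")
    if not stripped:
        return "/" if path else "."
    return stripped.rsplit("/", 1)[-1]


def basename_as_reported(path: str, suffix: Optional[str] = None) -> str:
    """Model of the reported behaviour (an assumption, reconstructed from the transcript)."""
    base = final_component(c_string_view(path))
    # Faulty step: the suffix is tested against the untruncated string...
    if suffix and path.endswith(suffix):
        # ...but removed from the NUL-truncated basename.
        cut = len(base) - len(suffix)
        if cut < 0:
            raise IndexError("suffix is longer than the truncated basename")
        return base[:cut]
    return base


def basename_with_fix(path: str, suffix: Optional[str] = None) -> str:
    """Model of the proposed fix: test the suffix against the truncated basename."""
    base = final_component(c_string_view(path))
    if suffix and base.endswith(suffix):
        return base[: len(base) - len(suffix)]
    return base
```

Under these assumptions, basename_as_reported reproduces all six
results from the transcript, while basename_with_fix treats
"foo/bar\0baz/quux" exactly like "foo/bar" for any suffix. Signalling
an error on an embedded nul, as open-file does since Guile 1.8, would
be an alternative to both.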