From unknown Sun Jun 22 11:39:45 2025 X-Loop: help-debbugs@gnu.org Subject: bug#7159: 24.0.50; (1) `file-name-(non)directory': bad return values, (2) `directory-sep-char' Resent-From: "Drew Adams" Original-Sender: debbugs-submit-bounces@debbugs.gnu.org Resent-To: owner@debbugs.gnu.org Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Mon, 04 Oct 2010 17:59:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: report 7159 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: To: 7159@debbugs.gnu.org X-Debbugs-Original-To: Received: via spool by submit@debbugs.gnu.org id=B.128621513232252 (code B ref -1); Mon, 04 Oct 2010 17:59:02 +0000 Received: (at submit) by debbugs.gnu.org; 4 Oct 2010 17:58:52 +0000 Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1P2pJM-0008O9-Ht for submit@debbugs.gnu.org; Mon, 04 Oct 2010 13:58:52 -0400 Received: from eggs.gnu.org ([140.186.70.92]) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1P2pJK-0008O4-BP for submit@debbugs.gnu.org; Mon, 04 Oct 2010 13:58:50 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1P2pMJ-0000yR-V9 for submit@debbugs.gnu.org; Mon, 04 Oct 2010 14:01:57 -0400 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=-4.2 required=5.0 tests=BAYES_00,RCVD_IN_DNSWL_MED, T_RP_MATCHES_RCVD, UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1 Received: from lists.gnu.org ([199.232.76.165]:43437) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1P2pMI-0000yE-CW for submit@debbugs.gnu.org; Mon, 04 Oct 2010 14:01:55 -0400 Received: from [140.186.70.92] (port=52235 helo=eggs.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1P2pMF-0000Ux-7L for bug-gnu-emacs@gnu.org; Mon, 04 Oct 2010 14:01:52 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1P2pMC-0000xU-1t for 
bug-gnu-emacs@gnu.org; Mon, 04 Oct 2010 14:01:49 -0400 Received: from rcsinet10.oracle.com ([148.87.113.121]:38562) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1P2pMB-0000x6-T8 for bug-gnu-emacs@gnu.org; Mon, 04 Oct 2010 14:01:48 -0400 Received: from acsinet15.oracle.com (acsinet15.oracle.com [141.146.126.227]) by rcsinet10.oracle.com (Switch-3.4.2/Switch-3.4.2) with ESMTP id o94I1fgR007872 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 4 Oct 2010 18:01:42 GMT Received: from acsmt353.oracle.com (acsmt353.oracle.com [141.146.40.153]) by acsinet15.oracle.com (Switch-3.4.2/Switch-3.4.1) with ESMTP id o94HLnrL013075 for ; Mon, 4 Oct 2010 18:01:40 GMT Received: from abhmt010.oracle.com by acsmt354.oracle.com with ESMTP id 660871711286215176; Mon, 04 Oct 2010 10:59:36 -0700 Received: from dradamslap1 (/130.35.179.10) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Mon, 04 Oct 2010 10:59:12 -0700 From: "Drew Adams" Date: Mon, 4 Oct 2010 10:59:12 -0700 Message-ID: <929AEFC3EF4C43C18E17D1F36A0CB6FE@us.oracle.com> MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit X-Mailer: Microsoft Office Outlook 11 Thread-Index: Actj7dmmJdhqupFWQvGaQY16WkT4Ug== X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.5994 X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6 (newer, 3) X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6, seldom 2.4 (older, 4) X-Spam-Score: -6.3 (------) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.11 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -6.3 (------) emacs -Q 1. 
(setq toto (wildcard-to-regexp "c:/foo/bar/b*.el"))
gives "\\`c:/foo/bar/b[^^@]*\\.el\\'" (where ^@ is a control char)

(setq titi (substring toto 2))
gives "c:/foo/bar/b[^^@]*\\.el\\'" (^@ is a control char)

(file-name-absolute-p toto) ; -> t
(file-name-absolute-p titi) ; -> t

That is all as one would expect. `file-name-absolute-p' has no problem
with either file-name string, even though neither string is a
legitimate file name and both contain a control char. This is normal
(IMO).

BUT:

(file-name-directory titi) ; gives "c:/foo/bar/b[^^@]*\\.el\\"
(file-name-nondirectory titi) ; gives "'"

These functions should know how to parse titi to produce "c:/foo/bar/"
and "b[^^@]*\\.el\\'", respectively (where ^@ is the control char).

It is not expected that these functions return names that necessarily
map to actual directories or files. What is expected is that they
remove the non-directory and directory components of the strings they
are passed. That is not happening here.

Also:

(setq baz "c:/foo/bar/*\\.el\\'")
(file-name-nondirectory baz) ; gives "'"

(setq baz "c:/foo/bar/*\\.el\\ABC")
(file-name-nondirectory baz) ; gives "ABC"

So I suspect that the `file-name-nondirectory' part of this bug is at
least in part a Windows problem. The code seems to be interpreting the
backslash (?\) near the end as a directory separator. If so, that is
definitely wrong. Even on Windows, the code should use the value of
`directory-sep-char', which is ?/, not ?\.

2. However, I see from the doc string that `directory-sep-char' has
been made obsolete:

directory-sep-char is a variable defined in `subr.el'.
Its value is 47

This variable is obsolete since 21.1;
do not use it, just use `/'.
This variable is potentially risky when used as a file local variable.

Documentation:
Directory separator character for built-in functions that return file names.
The value is always ?/.

That seems misguided, and the buggy behavior noted above is a good
example of why.
The correct way to handle this would be to make `directory-sep-char' a
defconst with value ?/. And code should always use this named
constant, NOT a literal ?/. The bugged behavior here shows why:
someone coding `file-name-nondirectory' seems to have treated
(hard-coded) ?\ as the directory separator on Windows (just a guess).

Note too that the code has another minor bug: The call to
`make-obsolete-variable' (which should anyway be removed, and the
defvar simply replaced by a defconst) incorrectly uses "`/'" instead
of "?/". The doc string itself is correct in referring to "?/".

(defconst directory-sep-char ?/
  "Directory separator character for built-in functions that return file names.
The value is always ?/.")
(make-obsolete-variable 'directory-sep-char "do not use it, just use `/'." "21.1")
                                                                     ^^^

In GNU Emacs 24.0.50.1 (i386-mingw-nt5.1.2600)
 of 2010-09-20 on 3249CTO
Windowing system distributor `Microsoft Corp.', version 5.1.2600
configured using `configure --with-gcc (4.4) --no-opt --cflags -Ic:/imagesupport/include'

From unknown Sun Jun 22 11:39:45 2025
MIME-Version: 1.0
X-Mailer: MIME-tools 5.427 (Entity 5.427)
X-Loop: help-debbugs@gnu.org
From: help-debbugs@gnu.org (GNU bug Tracking System)
To: "Drew Adams"
Subject: bug#7159: closed (Re: bug#7159: 24.0.50; (1) `file-name-(non)directory': bad return values, (2) `directory-sep-char')
Message-ID:
References: <837hhxpxi3.fsf@gnu.org> <929AEFC3EF4C43C18E17D1F36A0CB6FE@us.oracle.com>
X-Gnu-PR-Message: they-closed 7159
X-Gnu-PR-Package: emacs
Reply-To: 7159@debbugs.gnu.org
Date: Mon, 04 Oct 2010 19:27:02 +0000
Content-Type: multipart/mixed; boundary="----------=_1286220422-6256-1"

This is a multi-part message in MIME format...
------------=_1286220422-6256-1 Content-Disposition: inline Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Your bug report #7159: 24.0.50; (1) `file-name-(non)directory': bad return values, (2) `dir= ectory-sep-char' which was filed against the emacs package, has been closed. The explanation is attached below, along with your original report. If you require more details, please reply to 7159@debbugs.gnu.org. --=20 7159: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=3D7159 GNU Bug Tracking System Contact help-debbugs@gnu.org with problems ------------=_1286220422-6256-1 Content-Type: message/rfc822 Content-Disposition: inline Content-Transfer-Encoding: 7bit Received: (at 7159-done) by debbugs.gnu.org; 4 Oct 2010 19:26:29 +0000 Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1P2qg9-0001ch-Mh for submit@debbugs.gnu.org; Mon, 04 Oct 2010 15:26:29 -0400 Received: from mtaout21.012.net.il ([80.179.55.169]) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1P2qg7-0001ca-7y for 7159-done@debbugs.gnu.org; Mon, 04 Oct 2010 15:26:28 -0400 Received: from conversion-daemon.a-mtaout21.012.net.il by a-mtaout21.012.net.il (HyperSendmail v2007.08) id <0L9S00B0056JMS00@a-mtaout21.012.net.il> for 7159-done@debbugs.gnu.org; Mon, 04 Oct 2010 21:29:32 +0200 (IST) Received: from HOME-C4E4A596F7 ([77.127.80.126]) by a-mtaout21.012.net.il (HyperSendmail v2007.08) with ESMTPA id <0L9S00B8V657G880@a-mtaout21.012.net.il>; Mon, 04 Oct 2010 21:29:32 +0200 (IST) Date: Mon, 04 Oct 2010 21:29:40 +0200 From: Eli Zaretskii Subject: Re: bug#7159: 24.0.50; (1) `file-name-(non)directory': bad return values, (2) `directory-sep-char' In-reply-to: <929AEFC3EF4C43C18E17D1F36A0CB6FE@us.oracle.com> X-012-Sender: halo1@inter.net.il To: Drew Adams Message-id: <837hhxpxi3.fsf@gnu.org> References: <929AEFC3EF4C43C18E17D1F36A0CB6FE@us.oracle.com> X-Spam-Score: -1.9 (-) X-Debbugs-Envelope-To: 
7159-done
Cc: 7159-done@debbugs.gnu.org
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.11
Precedence: list
Reply-To: Eli Zaretskii
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Sender: debbugs-submit-bounces@debbugs.gnu.org
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
X-Spam-Score: -2.0 (--)

> From: "Drew Adams"
> Date: Mon, 4 Oct 2010 10:59:12 -0700
> Cc:
>
> BUT:
>
> (file-name-directory titi) ; gives "c:/foo/bar/b[^^@]*\\.el\\"
> (file-name-nondirectory titi) ; gives "'"
>
> These functions should know how to parse titi to produce "c:/foo/bar/"
> and "b[^^@]*\\.el\\'", respectively (where ^@ is the control char).

You are forgetting the backslashes that wildcard-to-regexp inserted.
On DOS and Windows, Emacs treats backslashes as directory separators,
as you'd expect. So "c:/foo/bar/b[^^@]*\\.el\\" looks like a leading
directory of a file whose basename is "'".

In other words, don't pass a regexp with backslashes to these
functions, because you won't get what you think you will.

> It is not expected that these functions return names that necessarily
> map to actual directories or files.

And indeed, they don't.

> So I suspect that the `file-name-nondirectory' part of this bug
> is at least in part a Windows problem. The code seems to be
> interpreting the backslash (?\) near the end as a directory
> separator.

It does, by design.

> If so, that is definitely wrong. Even on Windows, the
> code should use the value of `directory-sep-char', which is ?/,
> not ?\.

On Windows, we support both, and we always will. Anything else means
a terrible breakage, believe me. For example, it would be very hard
to parse output of programs that emit file names with backslashes.
With the current setup, this is seamless, even if the file names use
mixed forward- and back-slashes (yes, it happens with GCC and GDB, for
example, or even with Make sometimes).
> However, I see from the doc string that `directory-sep-char' has
> been made obsolete:

In fact, just yesterday it was removed altogether, because it has no
effect on what Emacs does. That's been like that for years, and we
saw no complaints.

I'm closing this bug.

------------=_1286220422-6256-1--

From unknown Sun Jun 22 11:39:45 2025
MIME-Version: 1.0
X-Mailer: MIME-tools 5.427 (Entity 5.427)
X-Loop: help-debbugs@gnu.org
From: help-debbugs@gnu.org (GNU bug Tracking System)
To: "Drew Adams"
Subject: bug#7159: closed (RE: bug#7159: 24.0.50; (1) `file-name-(non)directory': bad return values, (2) `directory-sep-char')
Message-ID:
References: <929AEFC3EF4C43C18E17D1F36A0CB6FE@us.oracle.com>
X-Gnu-PR-Message: they-closed 7159
X-Gnu-PR-Package: emacs
Reply-To: 7159@debbugs.gnu.org
Date: Mon, 04 Oct 2010 22:22:02 +0000
Content-Type: multipart/mixed; boundary="----------=_1286230922-10754-1"

This is a multi-part message in MIME format...

------------=_1286230922-10754-1
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

Your bug report

#7159: 24.0.50; (1) `file-name-(non)directory': bad return values, (2) `dir=
ectory-sep-char'

which was filed against the emacs package, has been closed.

The explanation is attached below, along with your original report.
If you require more details, please reply to 7159@debbugs.gnu.org.
--=20 7159: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=3D7159 GNU Bug Tracking System Contact help-debbugs@gnu.org with problems ------------=_1286230922-10754-1 Content-Type: message/rfc822 Content-Disposition: inline Content-Transfer-Encoding: 7bit Received: (at 7159-done) by debbugs.gnu.org; 4 Oct 2010 22:21:08 +0000 Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1P2tPA-0002n1-1G for submit@debbugs.gnu.org; Mon, 04 Oct 2010 18:21:08 -0400 Received: from rcsinet10.oracle.com ([148.87.113.121]) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1P2tP8-0002mR-9e for 7159-done@debbugs.gnu.org; Mon, 04 Oct 2010 18:21:07 -0400 Received: from rcsinet13.oracle.com (rcsinet13.oracle.com [148.87.113.125]) by rcsinet10.oracle.com (Switch-3.4.2/Switch-3.4.2) with ESMTP id o94MOBWl010205 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK); Mon, 4 Oct 2010 22:24:12 GMT Received: from acsmt355.oracle.com (acsmt355.oracle.com [141.146.40.155]) by rcsinet13.oracle.com (Switch-3.4.2/Switch-3.4.1) with ESMTP id o94MOASk014498; Mon, 4 Oct 2010 22:24:10 GMT Received: from abhmt001.oracle.com by acsmt353.oracle.com with ESMTP id 653420691286230946; Mon, 04 Oct 2010 15:22:26 -0700 Received: from dradamslap1 (/130.35.179.10) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Mon, 04 Oct 2010 15:22:25 -0700 From: "Drew Adams" To: "'Eli Zaretskii'" References: <929AEFC3EF4C43C18E17D1F36A0CB6FE@us.oracle.com> <837hhxpxi3.fsf@gnu.org> Subject: RE: bug#7159: 24.0.50; (1) `file-name-(non)directory': bad return values, (2) `directory-sep-char' Date: Mon, 4 Oct 2010 15:22:26 -0700 Message-ID: MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit X-Mailer: Microsoft Office Outlook 11 In-Reply-To: <837hhxpxi3.fsf@gnu.org> Thread-Index: Actj+nqYMlWNlqnBQZeT+/+RH1cK+QAA9xTw X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.5994 X-Spam-Score: -5.0 
(-----)
X-Debbugs-Envelope-To: 7159-done
Cc: 7159-done@debbugs.gnu.org
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.11
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Sender: debbugs-submit-bounces@debbugs.gnu.org
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
X-Spam-Score: -6.3 (------)

> > (file-name-directory titi) ; gives "c:/foo/bar/b[^^@]*\\.el\\"
> > (file-name-nondirectory titi) ; gives "'"
> >
> > These functions should know how to parse titi to produce
> > "c:/foo/bar/" and "b[^^@]*\\.el\\'", respectively (where ^@
> > is the control char).
>
> You are forgetting the backslashes that wildcard-to-regexp inserted.

It should be obvious that I am NOT forgetting such backslashes.

> On DOS and Windows, Emacs treats backslashes as directory separators,
> as you'd expect. So "c:/foo/bar/b[^^@]*\\.el\\" looks like a leading
> directory of a file whose basename is "'".

No. Well, let me put it another way: That is just what this bug report
is about: Backslashes are NOT directory separators for Emacs - or at
least they should not be. Even on Windows. This bug report says we
should get rid of any such vestigial treatment.

As the doc string for `directory-sep-char' indicates, ?/ is the only
directory separator for Emacs - or at least it should be.

"Directory separator character for built-in functions that return
file names. The value is always ?/."

It says "return" rather than "accept or return" because such functions
don't yet DTRT wrt input. (And it says "built-in" rather than
"standard", which would be better.)

> In other words, don't pass a regexp with backslashes to these
> functions, because you won't get what you think you will.

Correction: You won't get what you should get, which is just the
directory or non-directory portion of the name, respecting ?/ as the
only separator.

And it's not just about regexps - I used that as an example of a name
that included a backslash.
The point was that file-name decomposition functions should pay no
attention to backslashes. There is no reason they should consider ?\
to be a directory separator.

After fixing this we will also be able to remove this parenthetical
phrase in the Elisp manual: "(backslash is also allowed in input on
MS-DOS or MS-Windows)". This is the _only_ (whispered, parenthetical)
mention of such a vestigial crutch.

> > So I suspect that the `file-name-nondirectory' part of this bug
> > is at least in part a Windows problem. The code seems to be
> > interpreting the backslash (?\) near the end as a directory
> > separator.
>
> It does, by design.

Bad design, if so. More likely it is a vestige. Perhaps it seemed like
the best or the only possible thing to do at the time, but it is not
TRT.

> > If so, that is definitely wrong. Even on Windows, the
> > code should use the value of `directory-sep-char', which is ?/,
> > not ?\.
>
> On Windows, we support both, and we always will. Anything else means
> a terrible breakage, believe me. For example, it would be very hard
> to parse output of programs that emit file names with backslashes.

Parsing output of programs is something altogether different. You
should not throw that in here. Emacs standard functions for
decomposing file names should not be tainted with an eye to parsing
arbitrary Windows program output. That is a completely different
requirement and should be handled, naturally, by special-purpose code
(i.e. at a different level) - code that knows just what to expect from
those particular programs.

We can have code in Emacs that parses many different kinds of output,
including Windows file names. But the need for such special-purpose
parsing code is unrelated to general, standard functions that expect a
file name. In Emacs, such functions should not treat backslashes as
directory separators. There is no need for that.

Why? Because ?/ as dir separator works fine for Emacs code even in
Windows.
And because ?/ works always, we should use ONLY ?/. What is the real
requirement to support also ?\? Please don't say that it is handling
the output from some Windows programs - that is a red herring.

Note that this is very different from the path-separator (";" for
Windows, ":" for UNIX). In that case, ":" does NOT work for Emacs on
Windows - there is no canonical separator. But for directories, ?/
_always works_, and it should therefore be the only char recognized as
a dir separator.

For general file-name functions, that is. Nothing prevents some
specialized Windows parsing code from processing Windows file names
that use ?\ (e.g. creating a file name that uses the standard
separator, which can then be handled in the standard way).

> With the current setup, this is seamless,

Well, it's apparently been hard-coded here and there to such an extent
that you are screaming that there would be a lot to change to clean it
up. That in itself is a hefty price for such "seamlessness".

But the real price is the loss of simple standard functions for
manipulating file names correctly. By pushing special-purpose parsing
into the code everywhere you might think things have been made
"seamless", but in fact a muddy mess has been created.

Emacs's handling of ?\ in a file name output by an external program
should proceed in two stages: (1) translation to an Emacs file name
(if needed), which means using ?/ as separator, then (2) handling of
the Emacs file name using the standard file-name functions (e.g.
`file-name-directory'). That's the clean way to handle such
special-casing. (And any such use of special-case parsing should be
the exception, not the rule.)

> even if the file names use mixed forward- and back-slashes (yes, it
> happens with GCC and GDB, for example, or even with Make sometimes).

Again, there is nothing wrong with having specialized code that
handles such cases on an individual basis, if they require it.
But the general file-name handling code of Emacs should handle _Emacs_
file names, which use only ?/ as the dir separator.

You are muddying the waters by throwing in lots of other stuff here.
Of _course_ it can happen that some program might need to parse
special syntax - any special syntax. But this is about the normal
Emacs syntax for file names. And for that syntax the Emacs directory
separator is ?/.

If some particular Emacs code is forced by some other code (e.g. GDB)
to digest a name that uses both ?/ and ?\ as directory separators
(quelle horreur), then appropriate Emacs code can be used to fix such
names before Emacs tries to deal with them using the standard
file-name functions (e.g. `file-name-directory').

IOW, tack a translation mapping onto the output of GDB or Make or
whatever to standardize such bastard file names (w/ mixed separators).
That can be done by Emacs, but we should not foul the standard Emacs
file-name handling with such considerations.

"Seamless", indeed. Putting special-case handling throughout the code
doesn't make things seamless; it makes them quite seamy.

> > However, I see from the doc string that `directory-sep-char' has
> > been made obsolete:
>
> In fact, just yesterday it was removed altogether, because it has no
> effect on what Emacs does. That's been like that for years, and we
> saw no complaints.

The complaint/suggestion wrt `directory-sep-char' is only that it
should be a constant. We should not be advising people to hard-code
?/, but rather to use a constant with a name that proclaims what it is
and with a value of ?/. But this is only a minor, stylistic concern.
It is not directly related to this bug.

> I'm closing this bug.

I'm reopening it. To me, this is broken, and this dysfunction is not
an inevitable price to be paid because GDB or whatever outputs Windows
file names using backslashes. That argument is a copout.
The simple functions `file-name-directory' and
`file-name-nondirectory' should be robust enough to just remove the
non-directory and directory portion - always. That should be so
regardless of the presence of backslashes. Those functions are broken
on Windows when backslashes are present.

If you don't want to fix this bug, fine; maybe someone else will
someday. Maybe you don't want to make the effort required to remove
such ad-hoc backslash handling here and there from the Windows Emacs
code, but maybe someone else will someday.

I believe you that the effort might be great, and I accept that
therefore this cannot be a high priority now (there are _many_
outstanding bugs). But that does not mean that we currently handle
Windows file names correctly. That we choose not to fix something now
does not imply that it doesn't need fixing.

Your reason, "Anything else means a terrible breakage, believe me"
suggests that the fix is non-trivial because there is (apparently)
lots of code here and there that still special-cases backslashes, on
Windows.

Your example of such breakage, "to parse output of programs that emit
file names with backslashes", suggests that you do not distinguish
parsing Windows program output from Emacs's general-purpose file-name
handling functions.

It is not right to mess up general-purpose file-name-handling
functions just for the benefit of some special-purpose Windows-output
parsing here and there. Write Windows-output specific code to do that
according to the particular case (need), and make the general-purpose
file-name handling functions do as they logically should: recognize ?/
as the only directory separator.
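
For illustration only, here is a minimal sketch of the two-stage
handling proposed above: first translate a name emitted by an external
program so that it uses only ?/, then pass the result to the standard
file-name functions. The helper names `my-normalize-dir-separators'
and `my-split-external-file-name' are hypothetical (they are not part
of Emacs), and the sketch assumes that every backslash in the input is
meant as a directory separator, which would not hold for a regexp such
as the one in the original report.

```elisp
;; Hypothetical helpers, not part of Emacs.  Assumes every backslash
;; in NAME is intended as a directory separator.

(defun my-normalize-dir-separators (name)
  "Return a copy of NAME with each backslash replaced by a forward slash."
  ;; Stage 1: translate the external name to one that uses only ?/.
  (subst-char-in-string ?\\ ?/ name))

(defun my-split-external-file-name (name)
  "Return (DIRECTORY . NONDIRECTORY) for NAME after normalizing separators."
  ;; Stage 2: decompose the normalized name with the standard functions.
  (let ((emacs-name (my-normalize-dir-separators name)))
    (cons (file-name-directory emacs-name)
          (file-name-nondirectory emacs-name))))

;; A name with mixed separators, as a compiler or debugger might emit it.
(my-split-external-file-name "c:\\foo/bar\\baz.el")
;; => ("c:/foo/bar/" . "baz.el")
```

Because the name is normalized before it is decomposed, the result of
this sketch does not depend on whether the standard functions
themselves treat ?\ as a separator.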
------------=_1286230922-10754-1 Content-Type: message/rfc822 Content-Disposition: inline Content-Transfer-Encoding: 7bit Received: (at submit) by debbugs.gnu.org; 4 Oct 2010 17:58:52 +0000 Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1P2pJM-0008O9-Ht for submit@debbugs.gnu.org; Mon, 04 Oct 2010 13:58:52 -0400 Received: from eggs.gnu.org ([140.186.70.92]) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1P2pJK-0008O4-BP for submit@debbugs.gnu.org; Mon, 04 Oct 2010 13:58:50 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1P2pMJ-0000yR-V9 for submit@debbugs.gnu.org; Mon, 04 Oct 2010 14:01:57 -0400 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=-4.2 required=5.0 tests=BAYES_00,RCVD_IN_DNSWL_MED, T_RP_MATCHES_RCVD, UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1 Received: from lists.gnu.org ([199.232.76.165]:43437) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1P2pMI-0000yE-CW for submit@debbugs.gnu.org; Mon, 04 Oct 2010 14:01:55 -0400 Received: from [140.186.70.92] (port=52235 helo=eggs.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1P2pMF-0000Ux-7L for bug-gnu-emacs@gnu.org; Mon, 04 Oct 2010 14:01:52 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1P2pMC-0000xU-1t for bug-gnu-emacs@gnu.org; Mon, 04 Oct 2010 14:01:49 -0400 Received: from rcsinet10.oracle.com ([148.87.113.121]:38562) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1P2pMB-0000x6-T8 for bug-gnu-emacs@gnu.org; Mon, 04 Oct 2010 14:01:48 -0400 Received: from acsinet15.oracle.com (acsinet15.oracle.com [141.146.126.227]) by rcsinet10.oracle.com (Switch-3.4.2/Switch-3.4.2) with ESMTP id o94I1fgR007872 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 4 Oct 2010 18:01:42 GMT Received: from 
acsmt353.oracle.com (acsmt353.oracle.com [141.146.40.153]) by acsinet15.oracle.com (Switch-3.4.2/Switch-3.4.1) with ESMTP id o94HLnrL013075 for ; Mon, 4 Oct 2010 18:01:40 GMT Received: from abhmt010.oracle.com by acsmt354.oracle.com with ESMTP id 660871711286215176; Mon, 04 Oct 2010 10:59:36 -0700 Received: from dradamslap1 (/130.35.179.10) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Mon, 04 Oct 2010 10:59:12 -0700 From: "Drew Adams" To: Subject: 24.0.50; (1) `file-name-(non)directory': bad return values, (2) `directory-sep-char' Date: Mon, 4 Oct 2010 10:59:12 -0700 Message-ID: <929AEFC3EF4C43C18E17D1F36A0CB6FE@us.oracle.com> MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit X-Mailer: Microsoft Office Outlook 11 Thread-Index: Actj7dmmJdhqupFWQvGaQY16WkT4Ug== X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.5994 X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6 (newer, 3) X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6, seldom 2.4 (older, 4) X-Spam-Score: -6.3 (------) X-Debbugs-Envelope-To: submit X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.11 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -6.3 (------)

emacs -Q

1. (setq toto (wildcard-to-regexp "c:/foo/bar/b*.el"))
   gives "\\`c:/foo/bar/b[^^@]*\\.el\\'" (where ^@ is a control char)

   (setq titi (substring toto 2))
   gives "c:/foo/bar/b[^^@]*\\.el\\'" (^@ is a control char)

   (file-name-absolute-p toto) ; -> t
   (file-name-absolute-p titi) ; -> t

That is all as one would expect. `file-name-absolute-p' has no problem with either file-name string, even though neither string is a legitimate file name and both contain a control char. This is normal (IMO).
BUT:

(file-name-directory titi)    ; gives "c:/foo/bar/b[^^@]*\\.el\\"
(file-name-nondirectory titi) ; gives "'"

These functions should know how to parse titi to produce "c:/foo/bar/" and "b[^^@]*\\.el\\'", respectively (where ^@ is the control char).

It is not expected that these functions return names that necessarily map to actual directories or files. What is expected is that they remove the non-directory and directory components of the strings they are passed. That is not happening here.

Also:

(setq baz "c:/foo/bar/*\\.el\\'")
(file-name-nondirectory baz) ; gives "'"

(setq baz "c:/foo/bar/*\\.el\\ABC")
(file-name-nondirectory baz) ; gives "ABC"

So I suspect that the `file-name-nondirectory' part of this bug is at least in part a Windows problem. The code seems to be interpreting the backslash (?\) near the end as a directory separator. If so, that is definitely wrong. Even on Windows, the code should use the value of `directory-sep-char', which is ?/, not ?\.

2. However, I see from the doc string that `directory-sep-char' has been made obsolete:

directory-sep-char is a variable defined in `subr.el'.
Its value is 47

  This variable is obsolete since 21.1;
  do not use it, just use `/'.
  This variable is potentially risky when used as a file local variable.

Documentation:
Directory separator character for built-in functions that return file names.
The value is always ?/.

That seems misguided, and the buggy behavior noted above is a good example of why. The correct way to handle this would be to make `directory-sep-char' a defconst with value ?/. And code should always use this named constant, NOT a literal ?/. The bugged behavior here shows why: someone coding `file-name-nondirectory' seems to have treated (hard-coded) ?\ as the directory separator on Windows (just a guess).
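To make the expected behavior concrete, here is a minimal sketch of splitting a name on one named separator constant only. Every name prefixed with `my-' is hypothetical. This is not the actual Emacs implementation, and it deliberately ignores drive letters without a slash and remote file-name syntax.

```elisp
;; Hypothetical names throughout; none of this is existing Emacs code.
(defconst my-directory-sep-char ?/
  "The one character this sketch treats as a directory separator.")

(defun my-last-separator-position (name)
  "Return the index of the last `my-directory-sep-char' in NAME, or nil."
  (let ((i (1- (length name))))
    ;; Scan backward until the separator is found or the string is exhausted.
    (while (and (>= i 0)
                (/= (aref name i) my-directory-sep-char))
      (setq i (1- i)))
    (and (>= i 0) i)))

(defun my-file-name-directory (name)
  "Return NAME up to and including its last separator, or nil if none."
  (let ((pos (my-last-separator-position name)))
    (and pos (substring name 0 (1+ pos)))))

(defun my-file-name-nondirectory (name)
  "Return the part of NAME after its last separator (all of NAME if none)."
  (let ((pos (my-last-separator-position name)))
    (if pos (substring name (1+ pos)) name)))

(my-file-name-directory "c:/foo/bar/*\\.el\\ABC")    ; => "c:/foo/bar/"
(my-file-name-nondirectory "c:/foo/bar/*\\.el\\ABC") ; => "*\\.el\\ABC"
```

Because the separator is named in one place, no caller has a reason to hard-code ?\ (or ?/) on its own.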
Note too that the code has another minor bug: The call to `make-obsolete-variable' (which should anyway be removed, and the defvar simply replaced by a defconst) incorrectly uses "`/'" instead of "?/". The doc string itself is correct in referring to "?/".

(defconst directory-sep-char ?/
  "Directory separator character for built-in functions that return file names.
The value is always ?/.")

(make-obsolete-variable 'directory-sep-char "do not use it, just use `/'." "21.1")
                                                                     ^^^

In GNU Emacs 24.0.50.1 (i386-mingw-nt5.1.2600)
 of 2010-09-20 on 3249CTO
Windowing system distributor `Microsoft Corp.', version 5.1.2600
configured using `configure --with-gcc (4.4) --no-opt --cflags -Ic:/imagesupport/include'

------------=_1286230922-10754-1--