From unknown Sun Jun 22 11:31:06 2025
From: bug#7159 <7159@debbugs.gnu.org>
To: bug#7159 <7159@debbugs.gnu.org>
Subject: Status: 24.0.50; (1) `file-name-(non)directory': bad return values, (2) `directory-sep-char'
Reply-To: bug#7159 <7159@debbugs.gnu.org>
Date: Sun, 22 Jun 2025 18:31:06 +0000

retitle 7159 24.0.50; (1) `file-name-(non)directory': bad return values, (2) `directory-sep-char'
reassign 7159 emacs
submitter 7159 "Drew Adams"
severity 7159 minor
tag 7159 wontfix
thanks

From debbugs-submit-bounces@debbugs.gnu.org Mon Oct 04 13:58:52 2010
From: "Drew Adams"
Subject: 24.0.50; (1) `file-name-(non)directory': bad return values, (2) `directory-sep-char'
Date: Mon, 4 Oct 2010 10:59:12 -0700
Message-ID: <929AEFC3EF4C43C18E17D1F36A0CB6FE@us.oracle.com>

emacs -Q

1.
(setq toto (wildcard-to-regexp "c:/foo/bar/b*.el"))
gives "\\`c:/foo/bar/b[^^@]*\\.el\\'" (where ^@ is a control char)

(setq titi (substring toto 2))
gives "c:/foo/bar/b[^^@]*\\.el\\'" (^@ is a control char)

(file-name-absolute-p toto) ; -> t
(file-name-absolute-p titi) ; -> t

That is all as one would expect. `file-name-absolute-p' has no problem
with either file-name string, even though neither string is a
legitimate file name and both contain a control char. This is normal
(IMO).

BUT:

(file-name-directory titi) ; gives "c:/foo/bar/b[^^@]*\\.el\\"
(file-name-nondirectory titi) ; gives "'"

These functions should know how to parse titi to produce "c:/foo/bar/"
and "b[^^@]*\\.el\\'", respectively (where ^@ is the control char).

It is not expected that these functions return names that necessarily
map to actual directories or files. What is expected is that they
remove the non-directory and directory components of the strings they
are passed. That is not happening here.

Also:

(setq baz "c:/foo/bar/*\\.el\\'")
(file-name-nondirectory baz) ; gives "'"

(setq baz "c:/foo/bar/*\\.el\\ABC")
(file-name-nondirectory baz) ; gives "ABC"

So I suspect that the `file-name-nondirectory' part of this bug
is at least in part a Windows problem. The code seems to be
interpreting the backslash (?\) near the end as a directory
separator. If so, that is definitely wrong. Even on Windows, the
code should use the value of `directory-sep-char', which is ?/,
not ?\.

2. However, I see from the doc string that `directory-sep-char' has
been made obsolete:

directory-sep-char is a variable defined in `subr.el'.
Its value is 47

This variable is obsolete since 21.1;
do not use it, just use `/'.
This variable is potentially risky when used as a file local variable.

Documentation:
Directory separator character for built-in functions that return file names.
The value is always ?/.

That seems misguided, and the buggy behavior noted above is a good
example of why.
The correct way to handle this would be to make `directory-sep-char' a
defconst with value ?/. And code should always use this named
constant, NOT a literal ?/. The bugged behavior here shows why: someone
coding `file-name-nondirectory' seems to have treated (hard-coded) ?\
as the directory separator on Windows (just a guess).

Note too that the code has another minor bug: The call to
`make-obsolete-variable' (which should anyway be removed, and the
defvar simply replaced by a defconst) incorrectly uses "`/'" instead
of "?/". The doc string itself is correct in referring to "?/".

(defconst directory-sep-char ?/
  "Directory separator character for built-in functions that return file names.
The value is always ?/.")
(make-obsolete-variable 'directory-sep-char "do not use it, just use `/'." "21.1")
                                                                     ^^^

In GNU Emacs 24.0.50.1 (i386-mingw-nt5.1.2600)
 of 2010-09-20 on 3249CTO
Windowing system distributor `Microsoft Corp.', version 5.1.2600
configured using `configure --with-gcc (4.4) --no-opt --cflags -Ic:/imagesupport/include'

From debbugs-submit-bounces@debbugs.gnu.org Mon Oct 04 15:26:29 2010
Date: Mon, 04 Oct 2010 21:29:40 +0200
From: Eli Zaretskii
Subject: Re: bug#7159: 24.0.50; (1) `file-name-(non)directory': bad return values, (2) `directory-sep-char'
In-reply-to: <929AEFC3EF4C43C18E17D1F36A0CB6FE@us.oracle.com>
To: Drew Adams
Message-id: <837hhxpxi3.fsf@gnu.org>
Cc: 7159-done@debbugs.gnu.org

> From: "Drew Adams"
> Date: Mon, 4 Oct 2010 10:59:12 -0700
> Cc:
>
> BUT:
>
> (file-name-directory titi) ; gives "c:/foo/bar/b[^^@]*\\.el\\"
> (file-name-nondirectory titi) ; gives "'"
>
> These functions should know how to parse titi to produce "c:/foo/bar/"
> and "b[^^@]*\\.el\\'", respectively (where ^@ is the control char).

You are forgetting the backslashes that wildcard-to-regexp inserted.
On DOS and Windows, Emacs treats backslashes as directory separators,
as you'd expect. So "c:/foo/bar/b[^^@]*\\.el\\" looks like a leading
directory of a file whose basename is "'".

In other words, don't pass a regexp with backslashes to these
functions, because you won't get what you think you will.

> It is not expected that these functions return names that necessarily
> map to actual directories or files.

And indeed, they don't.

> So I suspect that the `file-name-nondirectory' part of this bug
> is at least in part a Windows problem. The code seems to be
> interpreting the backslash (?\) near the end as a directory
> separator.

It does, by design.

> If so, that is definitely wrong. Even on Windows, the
> code should use the value of `directory-sep-char', which is ?/,
> not ?\.

On Windows, we support both, and we always will.
Anything else means a terrible breakage, believe me. For example, it
would be very hard to parse output of programs that emit file names
with backslashes. With the current setup, this is seamless, even if
the file names use mixed forward- and back-slashes (yes, it happens
with GCC and GDB, for example, or even with Make sometimes).

> However, I see from the doc string that `directory-sep-char' has
> been made obsolete:

In fact, just yesterday it was removed altogether, because it has no
effect on what Emacs does. That's been like that for years, and we
saw no complaints.

I'm closing this bug.

From unknown Sun Jun 22 11:31:06 2025
To: internal_control@debbugs.gnu.org
From: Debbugs Internal Request
Subject: Internal Control
Message-Id: Did not alter fixed versions and reopened.
Date: Mon, 04 Oct 2010 22:20:03 +0000
User-Agent: Fakemail v42.6.9

# This is a fake control message.
#
# The action:
# Did not alter fixed versions and reopened.
thanks
# This fakemail brought to you by your local debbugs
# administrator

From debbugs-submit-bounces@debbugs.gnu.org Mon Oct 04 18:21:08 2010
From: "Drew Adams"
To: "'Eli Zaretskii'"
Subject: RE: bug#7159: 24.0.50; (1) `file-name-(non)directory': bad return values, (2) `directory-sep-char'
Date: Mon, 4 Oct 2010 15:22:26 -0700
In-Reply-To: <837hhxpxi3.fsf@gnu.org>
Cc: 7159-done@debbugs.gnu.org

> > (file-name-directory titi) ; gives "c:/foo/bar/b[^^@]*\\.el\\"
> > (file-name-nondirectory titi) ; gives "'"
> >
> > These functions should know how to parse titi to produce
> > "c:/foo/bar/" and "b[^^@]*\\.el\\'", respectively (where ^@
> > is the control char).
>
> You are forgetting the backslashes that wildcard-to-regexp inserted.

It should be obvious that I am NOT forgetting such backslashes.

> On DOS and Windows, Emacs treats backslashes as directory separators,
> as you'd expect. So "c:/foo/bar/b[^^@]*\\.el\\" looks like a leading
> directory of a file whose basename is "'".

No.
Well, let me put it another way: That is just what this bug report is
about: Backslashes are NOT directory separators for Emacs - or at
least they should not be. Even on Windows. This bug report says we
should get rid of any such vestigial treatment.

As the doc string for `directory-sep-char' indicates, ?/ is the only
directory separator for Emacs - or at least it should be.

"Directory separator character for built-in functions that return
file names. The value is always ?/."

It says "return" rather than "accept or return" because such functions
don't yet DTRT wrt input. (And it says "built-in" rather than
"standard", which would be better.)

> In other words, don't pass a regexp with backslashes to these
> functions, because you won't get what you think you will.

Correction: You won't get what you should get, which is just the
directory or non-directory portion of the name, respecting ?/ as the
only separator.

And it's not just about regexps - I used that as an example of a name
that included a backslash. The point was that file-name decomposition
functions should pay no attention to backslashes. There is no reason
they should consider ?\ to be a directory separator.

After fixing this we will also be able to remove this parenthetical
phrase in the Elisp manual: "(backslash is also allowed in input on
MS-DOS or MS-Windows)". This is the _only_ (whispered, parenthetical)
mention of such a vestigial crutch.

> > So I suspect that the `file-name-nondirectory' part of this bug
> > is at least in part a Windows problem. The code seems to be
> > interpreting the backslash (?\) near the end as a directory
> > separator.
>
> It does, by design.

Bad design, if so. More likely it is a vestige. Perhaps it seemed
like the best or the only possible thing to do at the time, but it is
not TRT.

> > If so, that is definitely wrong. Even on Windows, the
> > code should use the value of `directory-sep-char', which is ?/,
> > not ?\.
>
> On Windows, we support both, and we always will.
> Anything else means a terrible breakage, believe me. For example,
> it would be very hard to parse output of programs that emit file
> names with backslashes.

Parsing output of programs is something altogether different. You
should not throw that in here. Emacs standard functions for
decomposing file names should not be tainted with an eye to parsing
arbitrary Windows program output.

That is a completely different requirement and should be handled,
naturally, by special-purpose code (i.e. at a different level) - code
that knows just what to expect from those particular programs.

We can have code in Emacs that parses many different kinds of output,
including Windows file names. But the need for such special-purpose
parsing code is unrelated to general, standard functions that expect a
file name. In Emacs, such functions should not treat backslashes as
directory separators. There is no need for that.

Why? Because ?/ as dir separator works fine for Emacs code even in
Windows. And because ?/ works always, we should use ONLY ?/. What is
the real requirement to support also ?\? Please don't say that it is
handling the output from some Windows programs - that is a red
herring.

Note that this is very different from the path-separator (";" for
Windows, ":" for UNIX). In that case, ":" does NOT work for Emacs on
Windows - there is no canonical separator. But for directories, ?/
_always works_, and it should therefore be the only char recognized as
a dir separator. For general file-name functions, that is.

Nothing prevents some specialized Windows parsing code from processing
Windows file names that use ?\ (e.g. creating a file name that uses
the standard separator, which can then be handled in the standard
way).

> With the current setup, this is seamless,

Well, it's apparently been hard-coded here and there to such an extent
that you are screaming that there would be a lot to change to clean it
up. That in itself is a hefty price for such "seamlessness".
But the real price is the loss of simple standard functions for
manipulating file names correctly. By pushing special-purpose parsing
into the code everywhere you might think things have been made
"seamless", but in fact a muddy mess has been created.

Emacs's handling of ?\ in a file name output by an external program
should proceed in two stages: (1) translation to an Emacs file name
(if needed), which means using ?/ as separator, then (2) handling of
the Emacs file name using the standard file-name functions (e.g.
`file-name-directory').

That's the clean way to handle such special-casing. (And any such use
of special-case parsing should be the exception, not the rule.)

> even if the file names use mixed forward- and back-slashes (yes, it
> happens with GCC and GDB, for example, or even with Make sometimes).

Again, there is nothing wrong with having specialized code that
handles such cases on an individual basis, if they require it. But
the general file-name handling code of Emacs should handle _Emacs_
file names, which use only ?/ as the dir separator.

You are muddying the waters by throwing in lots of other stuff here.
Of _course_ it can happen that some program might need to parse
special syntax - any special syntax. But this is about the normal
Emacs syntax for file names. And for that syntax the Emacs directory
separator is ?/.

If some particular Emacs code is forced by some other code (e.g. GDB)
to digest a name that uses both ?/ and ?\ as directory separators
(quelle horreur), then appropriate Emacs code can be used to fix such
names before Emacs tries to deal with them using the standard
file-name functions (e.g. `file-name-directory').

IOW, tack a translation mapping onto the output of GDB or Make or
whatever to standardize such bastard file names (w/ mixed separators).
That can be done by Emacs, but we should not foul the standard Emacs
file-name handling with such considerations.

"Seamless", indeed.
Putting special-case handling throughout the code doesn't make things
seamless; it makes them quite seamy.

> > However, I see from the doc string that `directory-sep-char' has
> > been made obsolete:
>
> In fact, just yesterday it was removed altogether, because it has no
> effect on what Emacs does. That's been like that for years, and we
> saw no complaints.

The complaint/suggestion wrt `directory-sep-char' is only that it
should be a constant. We should not be advising people to hard-code
?/, but rather to use a constant with a name that proclaims what it is
and with a value of ?/. But this is only a minor, stylistic concern.
It is not directly related to this bug.

> I'm closing this bug.

I'm reopening it. To me, this is broken, and this dysfunction is not
an inevitable price to be paid because GDB or whatever outputs Windows
file names using backslashes. That argument is a copout.

The simple functions `file-name-directory' and
`file-name-nondirectory' should be robust enough to just remove the
non-directory and directory portion - always. That should be so
regardless of the presence of backslashes. Those functions are broken
on Windows when backslashes are present.

If you don't want to fix this bug, fine; maybe someone else will
someday. Maybe you don't want to make the effort required to remove
such ad-hoc backslash handling here and there from the Windows Emacs
code, but maybe someone else will someday.

I believe you that the effort might be great, and I accept that
therefore this cannot be a high priority now (there are _many_
outstanding bugs). But that does not mean that we currently handle
Windows file names correctly. That we choose not to fix something now
does not imply that it doesn't need fixing.

Your reason, "Anything else means a terrible breakage, believe me"
suggests that the fix is non-trivial because there is (apparently)
lots of code here and there that still special-cases backslashes, on
Windows.
Your example of such breakage, "to parse output of programs that emit
file names with backslashes", suggests that you do not distinguish
parsing Windows program output from Emacs's general-purpose file-name
handling functions.

It is not right to mess up general-purpose file-name-handling
functions just for the benefit of some special-purpose Windows-output
parsing here and there. Write Windows-output specific code to do that
according to the particular case (need), and make the general-purpose
file-name handling functions do as they logically should: recognize ?/
as the only directory separator.

From debbugs-submit-bounces@debbugs.gnu.org Tue Oct 05 00:07:41 2010
Date: Tue, 05 Oct 2010 06:10:48 +0200
From: Eli Zaretskii
Subject: Re: bug#7159: 24.0.50; (1) `file-name-(non)directory': bad return values, (2) `directory-sep-char'
To: Drew Adams
Message-id: <834od1p9dj.fsf@gnu.org>
Cc: 7159-done@debbugs.gnu.org

> From: "Drew Adams"
> Cc: <7159-done@debbugs.gnu.org>
> Date: Mon, 4 Oct 2010 15:22:26 -0700
>
> Well, let me put it another way: That is just what this bug report is about:
> Backslashes are NOT directory separators for Emacs - or at least they should not
> be. Even on Windows.

We will have to disagree on that. I've been porting Unix programs to
DOS and Windows for the last 20 years, and in my experience, any
program that does not treat these two flavors equivalently is
fundamentally broken on these systems.

> > In other words, don't pass a regexp with backslashes to these
> > functions, because you won't get what you think you will.
>
> Correction: You won't get what you should get, which is just the
> directory or non-directory portion of the name, respecting ?/ as the
> only separator.

These two functions are not supposed to be handed regexps anyway, even
on Unix. For example, what's the filename part of
"/foo\\(/a?\\)bar"?

  (file-name-nondirectory "/foo\\(/a?\\)bar") => "a?\\)bar"

Or how about this:

  (file-name-nondirectory "/foo[^/]*") => "]*"

This simply doesn't work, on any system.

Even in your example, you cheated: you passed a substring of a regexp,
to avoid the problems with leading drive spec, which must be at the
beginning of the file name to work with these functions.

> > On Windows, we support both, and we always will. Anything else means
> > a terrible breakage, believe me. For example, it would be very hard
> > to parse output of programs that emit file names with backslashes.
>
> Parsing output of programs is something altogether different.

It's not different.
These functions are used all the time for parsing file names,
including those in output of other programs. There are also users who
use backslashes in their ~/.emacs files, when they specify file names
and programs.

> That is a completely different requirement and should be handled, naturally, by
> special-purpose code (i.e. at a different level) - code that knows just what to
> expect from those particular programs.

If you do this, you will flood the application sources with ugly
system-dependent conditions. Hardly a good idea.

> What is the real requirement to support also ?\?

That it is used in DOS/Windows file names in many situations.

> "Seamless", indeed. Putting special-case handling throughout the code doesn't
> make things seamless; it makes them quite seamy.

The special-case code has to be somewhere. Having it at the current
low level in C, hidden from the Lisp programs, is the best we can do.

> The simple functions `file-name-directory' and `file-name-nondirectory' should
> be robust enough to just remove the non-directory and directory portion -
> always.

It does that today.
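
The two splitting policies argued over in this exchange can be modeled in a few lines of Emacs Lisp. This is only a minimal sketch: `my-nondirectory' is a hypothetical helper, not an Emacs function, and it ignores details the real `file-name-nondirectory' handles (such as drive specs). It takes the set of separator characters as an argument so both policies can be compared on any platform.

```elisp
;; Hypothetical helper for illustration only; not part of Emacs.
(defun my-nondirectory (name separators)
  "Return the part of NAME after its last character found in SEPARATORS.
SEPARATORS is a list of characters treated as directory separators."
  (let ((pos (length name)))
    ;; Walk backwards until the character just before POS is a separator.
    (while (and (> pos 0)
                (not (memq (aref name (1- pos)) separators)))
      (setq pos (1- pos)))
    (substring name pos)))

;; Both ?/ and ?\\ split, as described for DOS/Windows in this thread.
(my-nondirectory "c:/foo/bar/*\\.el\\ABC" '(?/ ?\\))  ; => "ABC"

;; Only ?/ splits, as the bug report proposes.
(my-nondirectory "c:/foo/bar/*\\.el\\ABC" '(?/))      ; => "*\\.el\\ABC"

;; Splitting is purely character-based, so regexp syntax is not understood.
(my-nondirectory "/foo[^/]*" '(?/))                   ; => "]*"
```

The last call mirrors the "/foo[^/]*" example given earlier in this message: whichever policy is chosen, the split happens at the last separator character and nothing else about the string is interpreted.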
From debbugs-submit-bounces@debbugs.gnu.org Tue Oct 05 01:05:06 2010
From: "Drew Adams"
To: "'Eli Zaretskii'"
Subject: RE: bug#7159: 24.0.50; (1) `file-name-(non)directory': bad return values, (2) `directory-sep-char'
Date: Mon, 4 Oct 2010 22:06:26 -0700
Message-ID: <94BA2845D4EF4A7CBE409B90D6A377AB@us.oracle.com>
In-Reply-To: <834od1p9dj.fsf@gnu.org>
Cc: 7159-done@debbugs.gnu.org

> > > In other words, don't pass a regexp with backslashes to these
> > > functions, because you won't get what you think you will.
> >
> > Correction: You won't get what you should get, which is just the
> > directory or non-directory portion of the name, respecting ?/ as the
> > only separator.
>
> These two functions are not supposed to be handed regexps anyway, even
> on Unix. For example, ...

No one said that anyone is likely to pass a regexp in place of a file
name. Sorry if my example misled you.

The point is that these _general_, standard functions for simply
removing a file name's (non)directory portion should be able to handle
_backslash_ characters without interpreting them as directory
separators - on Windows as on Unix or any other platform.

I said that the use of a regexp in the example I gave was just that:
an example of a name that contains backslashes. Nothing more. I
probably should have just used a literal string, to avoid confusion.

These functions should DTRT with such a name, whether or not it
corresponds to a real file (you agreed with that). Our difference of
opinion is wrt whether a backslash should ever be considered as a
(second kind of) directory separator in Emacs: you say yes (for
Windows); I say no.

I say that even on Windows ?/ is enough; there is no need for two dir
separators in Emacs file names. You say Emacs must recognize ?\ today
at least, because mumblemumble things are complicated. I say that
even if that is so (and I believe you that it is), that's not the same
as claiming that that _should_ be so. This is a bug, a poor
design/implementation decision, that we can hope to fix at some point.

> what's the filename part of "/foo\\(/a?\\)bar"?
> (file-name-nondirectory "/foo\\(/a?\\)bar") => "a?\\)bar"
> Or how about this:
> (file-name-nondirectory "/foo[^/]*") => "]*"

The answer, for Emacs, should simply be to interpret the chars other
than ?/ as file-name chars, not as directory separators. It has
nothing to do with interpreting regexp syntax. It has only to do with
interpreting a directory separator. The only question/disagreement is
wrt ?\ as a directory separator. IMO, it should not be treated as
such.

> > Parsing output of programs is something altogether different.
>
> It's not different. These functions are used all the time for parsing
> file names, including those in output of other programs.

But they should be used to parse Emacs file names (i.e., names that
use only ?/ as dir separator), nothing more.

If the output of some program is &(*^*&#HI&*U@);';.1?>>!, and that
program considers that to be a file name for some platform, that is
(or should be) irrelevant to standard Emacs file-name decomposition.
After your specialized code translates that name to its Emacs file
name of, say, /foo/bar/toto.c, _then_ that is something that the
standard file-name functions can decompose.

That's all I'm suggesting: keep `file-name-(non)directory' for Emacs
file names, where that notion is platform-independent wrt dir
separator. Use other code as needed to translate to names that use
only ?/ as dir separator.

> There are also users who use backslashes in their ~/.emacs files, when
> they specify file names and programs.

So what? Again, however & whenever Emacs receives such names, it can
use code that translates them to Emacs file names (names with ?/ as
dir separator).

> > That is a completely different requirement and should be
> > handled, naturally, by special-purpose code (i.e. at a
> > different level) - code that knows just what to
> > expect from those particular programs.
>
> If you do this, you will flood the application sources with ugly
> system-dependent conditions. Hardly a good idea.
I didn't say that the "application sources" should do that (though I'm not sure what you mean by that term). I said that if some Emacs code expects a "file name" in some format different from the standard Emacs syntax - i.e., with some other directory separator, then specialized _Emacs_ code that recognizes such a format can translate it to the standard format: replace ?\ or another directory separator by ?/, the directory separator used by Emacs. The Emacs code that receives such a non-standard (for Emacs) format is the code that should deal with this. Not the application code (if by that you mean the code that produces such output). > > What is the real requirement to support also ?\? > > That it is used in DOS/Windows file names in many situations. Not inside Emacs. It's not needed. For Emacs, ?/ is sufficient even for Windows. So it should suffice - there is no (logical) need for Emacs to have two standard directory separators (on Windows). There might be a historical (legacy) reason why we have two today, but only one is needed: ?/ _always_ works within Emacs. > > "Seamless", indeed. Putting special-case handling > > throughout the code doesn't make things seamless; it makes > > them quite seamy. > > The special-case code has to be somewhere. Having it at the current > low level in C, hidden from the Lisp programs, is the best we can do. I agree that it has to be somewhere. And I recognize that you are far more familiar with the Emacs implementation (and with Windows) than I. My point is that there is no logical reason why the _standard_, _general_ Emacs functions for decomposing file names (within Emacs) should have to recognize two different chars as directory separators. That is just not necessary, since ?/ is all that is needed, even for Emacs on Windows. Emacs already DTRT for ?/ on Windows. You say that Emacs can sometimes receive file names output by some external programs in a syntax that uses ?\ as dir separators. 
I say fine, then the Emacs code that receives such names can translate ?\ to ?/, so that when it comes to decomposing an Emacs file name we can use simple, standard functions that expect only ?/ as dir separator. That's the difference in our points of view, as I see it. And again, you have a better idea of where those places are that Emacs can expect to receive such file-name syntax (you mentioned GDB, Make, .emacs...). Those are the places where I'd suggest we make the transition to the standard syntax (with only ?/ as separator). It is the code that accepts/receives such names that should digest them to produce standard Emacs file names (i.e., with only ?/ separators). IOW, if the problem is _external_ programs and an _external_ syntax, then do the translation at the external/internal boundary: at the point where Emacs gets the file name from outside. Don't do it (in effect) at each call to a standard name-decomposition function. The translation code could be in C or Lisp for all I care. All I would like to see is simple file-name decomposition functions - no special-casing of ?\ on Windows. > > The simple functions `file-name-directory' and > > `file-name-nondirectory' should be robust enough to just > > remove the non-directory and directory portion - always. > > It does that today. No, they do not, because they falsely interpret ?\ as a directory separator (and only on Windows). 
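
The boundary-translation idea described in the message above can be sketched in a few lines of Emacs Lisp. This is only an illustration under stated assumptions: `my-slashify-file-name' and `my-external-name-parts' are hypothetical helpers, not existing Emacs functions, and the sketch does nothing beyond replacing backslashes; drive letters, UNC names and remote file names are not given any special treatment.

```elisp
;; Hypothetical helpers for illustration; not part of Emacs.
(defun my-slashify-file-name (name)
  "Return a copy of NAME with each backslash replaced by a forward slash.
Meant to be called once, at the point where an externally produced
file name enters Emacs."
  (subst-char-in-string ?\\ ?/ name))

;; After translation only ?/ remains in the name, so the standard
;; primitives split it the same way whether or not they also
;; accept ?\\ as a separator on the current platform.
(defun my-external-name-parts (name)
  "Return (DIRECTORY . NONDIRECTORY) for the external file name NAME."
  (let ((std (my-slashify-file-name name)))
    (cons (file-name-directory std)
          (file-name-nondirectory std))))
```

With these definitions, (my-external-name-parts "dir\\sub\\file.txt") splits the name at the last translated separator. The open question in this thread is where such a call belongs: explicitly at each point where an external name enters Emacs, or implicitly inside the primitives themselves.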
From debbugs-submit-bounces@debbugs.gnu.org Tue Oct 05 03:06:26 2010
From: Eli Zaretskii
To: "Drew Adams"
Cc: 7159-done@debbugs.gnu.org
In-reply-to: <94BA2845D4EF4A7CBE409B90D6A377AB@us.oracle.com> (drew.adams@oracle.com)
Subject: Re: bug#7159: 24.0.50; (1) `file-name-(non)directory': bad return values, (2) `directory-sep-char'
References: <929AEFC3EF4C43C18E17D1F36A0CB6FE@us.oracle.com> <837hhxpxi3.fsf@gnu.org> <834od1p9dj.fsf@gnu.org> <94BA2845D4EF4A7CBE409B90D6A377AB@us.oracle.com>
Date: Tue, 05 Oct 2010 03:09:30 -0400

> From: "Drew Adams"
> Cc: <7159-done@debbugs.gnu.org>
> Date: Mon, 4 Oct 2010 22:06:26 -0700
>
> No one said that anyone is likely to pass a regexp in place of a file name.
> Sorry if my example misled you.

What other real-life use-cases exist that require such a functionality?

> You say Emacs must recognize ?\ today at least, because mumblemumble things are
> complicated.
I say that even if that is so (and I believe you that it is), > that's not the same as claiming that that _should_ be so. This is a bug, a poor > design/implementation decision, that we can hope to fix at some point. It isn't a bug, it's a feature that is necessary on DOS and Windows. "Fixing" that would introduce bugs, some subtle, others glaring. So these primitives, which are widely used in Emacs's own Lisp sources, must retain their equal handling of both flavors of slashes in file names. If you still disagree, let's leave it at that, because we will never agree. > > > what's the filename part of "/foo\\(/a?\\)bar"? > > (file-name-nondirectory "/foo\\(/a?\\)bar") => "a?\\)bar" > > Or how about this: > > (file-name-nondirectory "/foo[^/]*") => "]*" > > The answer, for Emacs, should simply be to interpret the chars other than ?/ as > file-name chars, not as directory separators. It has nothing to do with > interpreting regexp syntax. It has only to do with interpreting a directory > separator. But the above output doesn't make sense. The result is by no means what file-name-directory and file-name-nondirectory are documented to produce. And the reason is that the argument is not a file name. So why what you are asking makes sense, and when will it be useful in Emacs in practice? > > It's not different. These functions are used all the time for parsing > > file names, including those in output of other programs. > > But they should be used to parse Emacs file names (i.e., names that use only ?/ > as dir separator), nothing more. No. They are designed to parse file names that are valid on the underlying OS. > That's all I'm suggesting: keep `file-name-(non)directory' for Emacs file names, > where that notion is platform-independent wrt dir separator. Use other code as > needed to translate to names that use only ?/ as dir separator. > > > There are also users who use backslashes in their ~/.emacs files, when > > they specify file names and programs. > > So what? 
Again, however & whenever Emacs receives such names, it can use code > that translates them to Emacs file names (names with ?/ as dir separator). But we already use that "other code": these two primitives (and others) which DTRT with any file name that is valid on Windows. There's no need to change anything. It appears that you are asking for an additional set of functions, which ignore backslashes on Windows. If such functions are to become part of Emacs, we need to hear the practical use-cases where they would be useful. You presented a single example, which you now say was not relevant. Please present relevant examples that would justify yet another set of file-name APIs. Otherwise, you can always write such functions yourself, it's hardly a big job. Btw, I suggest to move the rest of the discussion to emacs-devel, as it's no longer relevant to the original bug report. That mailing list has more subscribers than the bug-reporting list, who may contribute to the discussion. > > If you do this, you will flood the application sources with ugly > > system-dependent conditions. Hardly a good idea. > > I didn't say that the "application sources" should do that (though I'm not sure > what you mean by that term). > > I said that if some Emacs code expects a "file name" in some format different > from the standard Emacs syntax - i.e., with some other directory separator, then > specialized _Emacs_ code that recognizes such a format can translate it to the > standard format: replace ?\ or another directory separator by ?/, the directory > separator used by Emacs. This code will be specific to Windows, and will clutter Lisp application-level sources, such as gud.el, grep.el, compile.el, etc. with the kinds of `(if (eq system-type 'windows-nt) fix-file-names-for-drew)'. That's ugly and unergonomic. The current situation, though not ideal, is much better. > > The special-case code has to be somewhere. 
Having it at the current > > low level in C, hidden from the Lisp programs, is the best we can do. > > I agree that it has to be somewhere. And I recognize that you are far more > familiar with the Emacs implementation (and with Windows) than I. My point is > that there is no logical reason why the _standard_, _general_ Emacs functions > for decomposing file names (within Emacs) should have to recognize two different > chars as directory separators. Yes, there's a perfectly valid reason: because these primitives are used everywhere in Emacs packages, and those packages don't want to know about differences in file format between Posix and Windows platforms. Again, we do that in a lot of places, most of which I don't even remember. The reason I can safely forget about them is _precisely_ that Lisp code doesn't have to worry about these issues, because the primitives DTRT. But here's one more example I just recalled: type "M-x getenv RET PATH RET" and look at the value. I'm sure if I think more, I will recall more examples. But why waste energy on a problem that doesn't exist? 
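
The PATH example mentioned at the end of the message above can also be inspected from Lisp. The sketch below is an illustration, not code from Emacs: `my-path-entries' is a hypothetical helper that splits a PATH-style string on a separator (defaulting to the standard `path-separator' variable) so that the individual directory names can be examined.

```elisp
;; Hypothetical helper for illustration; not part of Emacs.
(defun my-path-entries (path-value &optional separator)
  "Split PATH-VALUE into a list of directory names.
SEPARATOR defaults to the value of `path-separator'."
  (split-string path-value
                (regexp-quote (or separator path-separator))
                t))                     ; drop empty entries

;; Example with a Windows-style value.  The separator is passed
;; explicitly so the result does not depend on the platform.
(my-path-entries "C:\\Windows\\system32;C:\\bin" ";")
```

On MS-Windows, evaluating (my-path-entries (getenv "PATH")) would typically show entries written with backslashes, which is the kind of input the existing primitives are expected to accept there.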
From debbugs-submit-bounces@debbugs.gnu.org Tue Oct 05 10:56:32 2010
From: "Drew Adams"
To: "'Eli Zaretskii'"
Cc: 7159-done@debbugs.gnu.org
References: <929AEFC3EF4C43C18E17D1F36A0CB6FE@us.oracle.com> <837hhxpxi3.fsf@gnu.org> <834od1p9dj.fsf@gnu.org> <94BA2845D4EF4A7CBE409B90D6A377AB@us.oracle.com>
Subject: RE: bug#7159: 24.0.50; (1) `file-name-(non)directory': bad return values, (2) `directory-sep-char'
Date: Tue, 5 Oct 2010 07:58:23 -0700
Message-ID: <30449C251E4C44978BA588627B14AAA1@us.oracle.com>

> What other real-life use-cases exist that require such a
> functionality?

Ah, the "real world" argument. The old no-one-would-ever-do-that refrain. Why would a user or a program ever try to decompose a file name that contains backslashes?

Answers: (1) Users and programs will always do what you don't expect - and sometimes there is no reason they shouldn't. (2) Occam's razor: There is no reason to special-case backslash here: slash alone is sufficient. If not true, then treat ?\ as a dir separator in Emacs on Unix also.

> > You say Emacs must recognize ?\ today at least, because
> > mumblemumble things are complicated. I say that even if
> > that is so (and I believe you that it is),
> > that's not the same as claiming that that _should_ be so.
> > This is a bug, a poor design/implementation decision, that
> > we can hope to fix at some point.
>
> It isn't a bug, it's a feature that is necessary on DOS and Windows.

What's not necessary is treating ?\ as a dir separator in Emacs, even on Windows. That's clear.

> "Fixing" that would introduce bugs, some subtle, others glaring. So
> these primitives, which are widely used in Emacs's own Lisp sources,
> must retain their equal handling of both flavors of slashes in file
> names. If you still disagree, let's leave it at that, because we will
> never agree.

Yes, I still disagree - or rather you do. That is why I asked to leave the bug open in hopes that someone else will eventually fix it. It's clear that you disagree that there is a bug.

Those places you refer to can be fixed.
Not fixing the standard decomposer functions makes the bug a self-fulfilling prophecy: Of course these functions will be widely used with no intermediary in places where backslashes are present. Today, they interpret backslashes, so naturally the mess is widespread.

This can be cleaned up progressively:

1. The first step is to remove mention of ?\ from the doc for these functions, and thus not encourage people to depend on this behavior.

2. The second step is to start fixing code where these functions are used directly in contexts where ?\ might enter Emacs within a file name, translating such an outside format by replacing ?\ with ?/.

3. The final step is to stop those functions from recognizing ?\ as a dir separator.

Only in this final step will any remaining bugs surface: places where we neglected to clean up the file name before passing it to these functions. If the second step is done right then the transition will be "seamless". Any bugs uncovered after fixing the function definitions (code) will be the exception.

> > The answer, for Emacs, should simply be to interpret the
> > chars other than ?/ as file-name chars, not as directory
> > separators. It has nothing to do with interpreting regexp syntax.
> > It has only to do with interpreting a directory separator.
>
> But the above output doesn't make sense.

It does if that is the argument passed. We have agreed that the argument need not name an existing file. These functions should be robust and simple enough that they do nothing other than split the string at the last directory separator.

> The result is by no means what file-name-directory and
> file-name-nondirectory are documented to produce.
> And the reason is that the argument is not a file name.

You already agreed that the argument need not be a file name. It might not name an existing file. And the functions should be robust and simple enough to DTRT with any string, even a string that could not possibly be a file name.
But that latter part is a bit beside the point here. Nothing in the bug report _requires_ this to be about names that could never name a file. This is only about names that contain backslashes. It is only about not having these functions treat ?\ as a dir separator.

> So why what you are asking makes sense, and when will it be
> useful in Emacs in practice?
>
> > > It's not different. These functions are used all the
> > > time for parsing file names, including those in output
> > > of other programs.
> >
> > But they should be used to parse Emacs file names (i.e.,
> > names that use only ?/ as dir separator), nothing more.
>
> No. They are designed to parse file names that are valid on the
> underlying OS.

That's where we disagree. I can't speak to their original intent, but what they should do is retrieve the directory and non-directory portions of an _Emacs_ file name, where the latter means a name that uses ?/ as dir separator.

These functions should know nothing about the underlying OS. They should be handed _Emacs_ file names, that is, names with ?/ as the dir separator.

> > That's all I'm suggesting: keep `file-name-(non)directory'
> > for Emacs file names, where that notion is platform-independent
> > wrt dir separator. Use other code as needed to translate to
> > names that use only ?/ as dir separator.
> >
> > > There are also users who use backslashes in their
> > > ~/.emacs files, when they specify file names and programs.
> >
> > So what? Again, however & whenever Emacs receives such
> > names, it can use code that translates them to Emacs file
> > names (names with ?/ as dir separator).
>
> But we already use that "other code": these two primitives (and
> others) which DTRT with any file name that is valid on Windows.
> There's no need to change anything.

It should be clear from my use above that by "other" I meant "other than these functions". These standard functions should be only for operating on Emacs file names.
"Other" means platform-specific translation to platform-independent Emacs file names (with ?/ separators). > It appears that you are asking for an additional set of functions, > which ignore backslashes on Windows. No, I am asking for _these_ functions to stop being special-cased according to the platform, to stop treating ?\ as dir separator on Windows (since ?/ works on Windows too). This is about separating out the platform-specific treatment to only the places where it is needed, and having the standard functions that access parts of a file name use and expect the standard Emacs file-name syntax: ?/ as dir separator. > If such functions are to become > part of Emacs, we need to hear the practical use-cases where they > would be useful. You presented a single example, which you now say > was not relevant. Please present relevant examples that would justify > yet another set of file-name APIs. Otherwise, you can always write > such functions yourself, it's hardly a big job. > > Btw, I suggest to move the rest of the discussion to emacs-devel, as > it's no longer relevant to the original bug report. That mailing list > has more subscribers than the bug-reporting list, who may contribute > to the discussion. Just leave the bug open please. Mark it as "wishlist" if such is your wont. I don't have time to argue anymore about this. I've made my argument here clear. > > > If you do this, you will flood the application sources with ugly > > > system-dependent conditions. Hardly a good idea. > > > > I didn't say that the "application sources" should do that > > (though I'm not sure what you mean by that term). 
> >
> > I said that if some Emacs code expects a "file name" in
> > some format different from the standard Emacs syntax - i.e.,
> > with some other directory separator, then specialized _Emacs_
> > code that recognizes such a format can translate it to the
> > standard format: replace ?\ or another directory separator
> > by ?/, the directory separator used by Emacs.
>
> This code will be specific to Windows, and will clutter Lisp
> application-level sources, such as gud.el, grep.el, compile.el,
> etc. with the kinds of `(if (eq system-type 'windows-nt)
> fix-file-names-for-drew)'. That's ugly and unergonomic.
> The current situation, though not ideal, is much better.

(standard-file-name the-input) is exactly what we should have. It shows clearly what is involved. It is such a function that would do the `(if (eq system-type 'windows-nt) replace-\-by-/)'.

The point is that at places in the code where you see (file-name-directory (standard-file-name the-input)) it will be clear that `the-input' might not be in the standard Emacs format (with only ?/ as dir separator).

More typically, such places will call `standard-file-name' only once to convert the external input once and for all. From then on, Emacs will be dealing only with a standard file name.

Everywhere you do not see `standard-file-name' you will be sure that the file name is an Emacs name (?/ as separator).

> > > The special-case code has to be somewhere. Having it at
> > > the current low level in C, hidden from the Lisp programs,
> > > is the best we can do.
> >
> > I agree that it has to be somewhere. And I recognize that
> > you are far more familiar with the Emacs implementation
> > (and with Windows) than I. My point is that there is no
> > logical reason why the _standard_, _general_ Emacs functions
> > for decomposing file names (within Emacs) should have to
> > recognize two different chars as directory separators.
> > Yes, there's a perfectly valid reason: because these primitives are > used everywhere in Emacs packages, and those packages don't want to > know about differences in file format between Posix and Windows > platforms. They need not know. Only places where an external-format name is introduced into Emacs need call `standard-file-name' (or whatever name is used). And even if those places are also numerous, they need call it only once. Once the file name has been converted to the standard Emacs form (only ?/ as separator), it can travel on its merry way throughout Emacs, with no code needing to worry about anything platform-dependent in the name. > Again, we do that in a lot of places, most of which I don't even > remember. The reason I can safely forget about them is _precisely_ > that Lisp code doesn't have to worry about these issues, because the > primitives DTRT. You don't know where they are, and cannot tell, precisely because there is no explicit call to a function that translates to the standard form. If every such location where an external format might enter Emacs had a call to `standard-file-name' then (a) we would easily recognize those places and (b) all other code would be sure to be dealing with simple, standard, Emacs file names. It's not about you remembering all such locations. It's about identifying them clearly, making _them_ alone do the translation, explicitly. > But here's one more example I just recalled: type > "M-x getenv RET PATH RET" and look at the value. I'm sure if I think > more, I will recall more examples. But why waste energy on a problem > that doesn't exist? To convert a PATH you need only iterate wrt `path-separator', calling `standard-file-name' on each path component. Again, doing that makes it explicit at that point in the code that what is being handled is a list of file names that are not necessarily in standard form. We can agree to disagree - I don't think we're going to convince each other. 
Please leave the bug report open, for possible consideration in the future or by others.

From debbugs-submit-bounces@debbugs.gnu.org Tue Oct 05 15:14:05 2010
Date: Tue, 05 Oct 2010 21:17:17 +0200
From: Eli Zaretskii
Subject: Re: bug#7159: 24.0.50; (1) `file-name-(non)directory': bad return values, (2) `directory-sep-char'
In-reply-to: <30449C251E4C44978BA588627B14AAA1@us.oracle.com>
To: Drew Adams
Cc: 7159-done@debbugs.gnu.org
Message-id: <83zkuso3eq.fsf@gnu.org>
References: <929AEFC3EF4C43C18E17D1F36A0CB6FE@us.oracle.com> <837hhxpxi3.fsf@gnu.org> <834od1p9dj.fsf@gnu.org> <94BA2845D4EF4A7CBE409B90D6A377AB@us.oracle.com> <30449C251E4C44978BA588627B14AAA1@us.oracle.com>

> From: "Drew Adams"
> Cc:
<7159-done@debbugs.gnu.org> > Date: Tue, 5 Oct 2010 07:58:23 -0700 > > > What other real-life use-cases exist that require such a > > functionality? > > Ah, the "real world" argument. What argument? I asked a reasonable question, and as someone who requests to change existing APIs, it is not unreasonable to expect a decent answer. At the very least, you could tell about your own use-case, which prompted you to ask for this. If you aren't prepared to present your case, then I don't see any reason to continue discussing this, let alone consider any changes in veteran and well-established APIs. > Answers: (1) Users and programs will always do what you don't expect - and > sometimes there is no reason they shouldn't. (2) Occam's razor: There is no > reason to special-case backslash here: slash alone is sufficient. If not true, > then treat ?\ as a dir separator in Emacs on Unix also. That's not an answer, sorry. I asked for specific, practical use-cases, not about theoretical "why not?" philosophical arguments. We don't generally implement in Emacs APIs for situations that don't happen in practice. Life's too short. > You already agreed that the argument need not be a file name. No, I didn't. I agreed that the argument need not name an existing file. But it still must be a valid file name, by the rules of the underlying filesystem. We sometimes extend these rules (cf. Tramp and the "remote" file names in general) to allow Emacs-specific features, but we never restrict them. > > No. They are designed to parse file names that are valid on the > > underlying OS. > > That's where we disagree. Yes, we do. > These functions should know nothing about the underlying OS. They should be > handed _Emacs_ file names, that is, names with ?/ as the dir separator. Not going to happen. At most, you may have (or write) additional functions. An attempt at imposing on the Emacs developers to go through all the sources and change code that works perfectly well is not going to fly. 
> > But we already use that "other code": these two primitives (and
> > others) which DTRT with any file name that is valid on Windows.
> > There's no need to change anything.
>
> It should be clear from my use above that by "other" I meant "other than these
> functions". These standard functions should be only for operating on Emacs file
> names. "Other" means platform-specific translation to platform-independent
> Emacs file names (with ?/ separators).

What is so special about "these" functions that you insist that they and only they do what you want? What's in a name?

> This is about separating out the platform-specific treatment to only the places
> where it is needed

My point, which I obviously fail to drive home, is that they are needed everywhere where we use file names in Emacs.

> (standard-file-name the-input) is exactly what we should have.

Please take a second look at that function (or, rather, at w32-convert-standard-filename, its w32 implementation). You will see that it does things that are far from the "simple" slash parsing you asked for. For example, if we use w32-convert-standard-filename indiscriminately, as you suggest, we will be unable to support remote file names on Windows.

> The point is that at places in the code where you see (file-name-directory
> (standard-file-name the-input)) it will be clear that `the-input' might not be
> in the standard Emacs format (with only ?/ as dir separator).

You are asking Emacs developers to:

  . find all the places in Lisp code that may potentially get file names with backslashes
  . add a call to some function to all those places
  . somehow remember to insert a call to that function to any place in future Lisp code that is added to Emacs
  . remember to teach all Emacs contributors who work on platforms other than Windows to use that paradigm in their contributions

All that when we already have APIs that work correctly everywhere, and don't suffer from the above maintenance burden.
Now, please try to guess what are the chances of this to happen, even if I did agree with your arguments.

> More typically, such places will call `standard-file-name' only once to convert
> the external input once and for all.

There _is_ no "only once". Emacs does not get _filenames_ from all those sources, it gets _text_. To understand which parts of that text are file names, it needs to analyze the text. That analysis is not always in a single place and on a single level. All those places need to be changed, and for no good reason, because they already work fine.

> > But here's one more example I just recalled: type
> > "M-x getenv RET PATH RET" and look at the value. I'm sure if I think
> > more, I will recall more examples. But why waste energy on a problem
> > that doesn't exist?
>
> To convert a PATH you need only iterate wrt `path-separator', calling
> `standard-file-name' on each path component.

That would break some programs when Emacs invokes them on Windows and DOS, because they are not prepared to handle directories with forward slashes in PATH. That's the reason we don't translate PATH to use forward slashes to begin with.

You see, life isn't as simple as you think it is. Please try to give a bit more respect to what we have now in Emacs, and please don't assume that its current shape is due to some omission or lack of foresight, but rather that it's based on a lot of experience and grey hair.

> Please leave the bug report open, for possible consideration in the future or by
> others.

No. The bug is there for everyone to see and "consider", but there's no reason to leave it open, because I will do my best to block any "fixes" like that to those APIs.

From unknown Sun Jun 22 11:31:06 2025
Received: (at fakecontrol) by fakecontrolmessage;
To: internal_control@debbugs.gnu.org
From: Debbugs Internal Request
Subject: Internal Control
Message-Id: bug archived.
Date: Wed, 03 Nov 2010 11:24:04 +0000
User-Agent: Fakemail v42.6.9

# This is a fake control message.
#
# The action:
# bug archived.
thanks
# This fakemail brought to you by your local debbugs
# administrator