GNU bug report logs - #7159
24.0.50; (1) `file-name-(non)directory': bad return values, (2) `directory-sep-char'


Package: emacs;

Reported by: "Drew Adams" <drew.adams <at> oracle.com>

Date: Mon, 4 Oct 2010 17:59:02 UTC

Severity: minor

Tags: wontfix

Found in version 24.0.50

Done: "Drew Adams" <drew.adams <at> oracle.com>

Bug is archived. No further changes may be made.



Message #19 received at 7159-done <at> debbugs.gnu.org (full text, mbox):

From: "Drew Adams" <drew.adams <at> oracle.com>
To: "'Eli Zaretskii'" <eliz <at> gnu.org>
Cc: 7159-done <at> debbugs.gnu.org
Subject: RE: bug#7159: 24.0.50; (1) `file-name-(non)directory': bad return values, (2) `directory-sep-char'
Date: Mon, 4 Oct 2010 22:06:26 -0700

> > > In other words, don't pass a regexp with backslashes to these
> > > functions, because you won't get what you think you will.
> > 
> > Correction: You won't get what you should get, which is just the
> > directory or non-directory portion of the name, respecting ?/ as the
> > only separator.
> 
> These two functions are not supposed to be handed regexps anyway, even
> on Unix.  For example, ...

No one said that anyone is likely to pass a regexp in place of a file name.
Sorry if my example misled you.

The point is that these _general_, standard functions for simply removing a file
name's (non)directory portion should be able to handle _backslash_ characters
without interpreting them as directory separators - on Windows as on Unix or any
other platform.

I said that the use of a regexp in the example I gave was just that: an example
of a name that contains backslashes.  Nothing more.  I probably should have just
used a literal string, to avoid confusion.

These functions should DTRT with such a name, whether or not it corresponds to a
real file (you agreed with that).  Our difference of opinion is wrt whether a
backslash should ever be considered as a (second kind of) directory separator in
Emacs: you say yes (for Windows); I say no.  I say that even on Windows ?/ is
enough; there is no need for two dir separators in Emacs file names.

You say Emacs must recognize ?\ today at least, because mumblemumble things are
complicated.  I say that even if that is so (and I believe you that it is),
it does not follow that it _should_ be so.  This is a bug, a poor
design/implementation decision, that we can hope to fix at some point.

> what's the filename part of "/foo\\(/a?\\)bar"?
>   (file-name-nondirectory "/foo\\(/a?\\)bar") => "a?\\)bar"
> Or how about this:
>   (file-name-nondirectory "/foo[^/]*") => "]*"

The answer, for Emacs, should simply be to interpret the chars other than ?/ as
file-name chars, not as directory separators.  It has nothing to do with
interpreting regexp syntax.  It has only to do with interpreting a directory
separator.  The only question/disagreement is wrt ?\ as a directory separator.
IMO, it should not be treated as such.
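
To make the suggested behavior concrete, here is a minimal sketch of
decomposition functions that treat ?/ as the only directory separator on every
platform.  The names `my-slash-only-nondirectory' and `my-slash-only-directory'
are hypothetical, not part of Emacs, and the sketch deliberately ignores drive
letters and other platform-specific syntax that the real functions handle.

```elisp
;; Hypothetical helpers, not part of Emacs.  They treat ?/ as the only
;; directory separator, on every platform.  Drive letters and other
;; platform-specific syntax are ignored in this sketch.

(defun my-slash-only-nondirectory (name)
  "Return the part of NAME after its last ?/, or all of NAME if it has none."
  ;; "[^/]*\\'" first matches at the position just after the last slash
  ;; (or at position 0 if NAME contains no slash at all).
  (let ((pos (string-match "[^/]*\\'" name)))
    (substring name pos)))

(defun my-slash-only-directory (name)
  "Return NAME up to and including its last ?/, or nil if it has none."
  (let ((pos (string-match "[^/]*\\'" name)))
    (and (> pos 0) (substring name 0 pos))))

;; A backslash is just another file-name character here:
(my-slash-only-nondirectory "/foo/a\\b")   ; => "a\\b"
(my-slash-only-directory "/foo/a\\b")      ; => "/foo/"
(my-slash-only-directory "a\\b")           ; => nil
```

With these definitions the result does not depend on the platform, which is
the property being asked for in this report.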

> > Parsing output of programs is something altogether different.
> 
> It's not different.  These functions are used all the time for parsing
> file names, including those in output of other programs.

But they should be used to parse Emacs file names (i.e., names that use only ?/
as dir separator), nothing more.

If the output of some program is &(*^*&#HI&*U@);';.1?>>!, and that program
considers that to be a file name for some platform, that is (or should be)
irrelevant to standard Emacs file-name decomposition.  After your specialized
code translates that name to its Emacs file name of, say, /foo/bar/toto.c,
_then_ that is something that the standard file-name functions can decompose.

That's all I'm suggesting: keep `file-name-(non)directory' for Emacs file names,
where that notion is platform-independent wrt dir separator.  Use other code as
needed to translate to names that use only ?/ as dir separator.
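
As an illustration only, the translation step could be as small as the sketch
below.  The function name `my-external-to-emacs-file-name' is hypothetical, and
the sketch assumes that ?\ is the only foreign separator to handle; real
external formats may need more than a character substitution.

```elisp
;; Hypothetical boundary helper, not part of Emacs.  Assumes the only
;; foreign directory separator in NAME is ?\.

(defun my-external-to-emacs-file-name (name)
  "Return a copy of NAME with every ?\\ replaced by ?/."
  ;; Without the optional INPLACE argument, `subst-char-in-string'
  ;; returns a new string and leaves NAME unchanged.
  (subst-char-in-string ?\\ ?/ name))

;; Translate once, where the name enters Emacs ...
(my-external-to-emacs-file-name "c:\\foo\\bar\\toto.c")
;; => "c:/foo/bar/toto.c"

;; ... and the standard functions then need only ?/ to decompose it.
(file-name-nondirectory
 (my-external-to-emacs-file-name "c:\\foo\\bar\\toto.c"))
;; => "toto.c"
```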

> There are also users who use backslashes in their ~/.emacs files, when
> they specify file names and programs.

So what?  Again, however & whenever Emacs receives such names, it can use code
that translates them to Emacs file names (names with ?/ as dir separator).

> > That is a completely different requirement and should be 
> > handled, naturally, by special-purpose code (i.e. at a
> > different level) - code that knows just what to
> > expect from those particular programs.
> 
> If you do this, you will flood the application sources with ugly
> system-dependent conditions.  Hardly a good idea.

I didn't say that the "application sources" should do that (though I'm not sure
what you mean by that term).

I said that if some Emacs code expects a "file name" in some format different
from the standard Emacs syntax - i.e., with some other directory separator, then
specialized _Emacs_ code that recognizes such a format can translate it to the
standard format: replace ?\ or another directory separator by ?/, the directory
separator used by Emacs.

The Emacs code that receives such a non-standard (for Emacs) format is the code
that should deal with this.  Not the application code (if by that you mean the
code that produces such output).

> > What is the real requirement to support also ?\?
> 
> That it is used in DOS/Windows file names in many situations.

Not inside Emacs.  It's not needed.  For Emacs, ?/ is sufficient even for
Windows.  So it should suffice - there is no (logical) need for Emacs to have
two standard directory separators (on Windows).

There might be a historical (legacy) reason why we have two today, but only one
is needed: ?/ _always_ works within Emacs.

> > "Seamless", indeed.  Putting special-case handling 
> > throughout the code doesn't make things seamless; it makes
> > them quite seamy.
> 
> The special-case code has to be somewhere.  Having it at the current
> low level in C, hidden from the Lisp programs, is the best we can do.

I agree that it has to be somewhere.  And I recognize that you are far more
familiar with the Emacs implementation (and with Windows) than I.  My point is
that there is no logical reason why the _standard_, _general_ Emacs functions
for decomposing file names (within Emacs) should have to recognize two different
chars as directory separators.  That is just not necessary, since ?/ is all that
is needed, even for Emacs on Windows.  Emacs already DTRT for ?/ on Windows.

You say that Emacs can sometimes receive file names output by some external
programs in a syntax that uses ?\ as dir separators.  I say fine, then the Emacs
code that receives such names can translate ?\ to ?/, so that when it comes to
decomposing an Emacs file name we can use simple, standard functions that expect
only ?/ as dir separator. 

That's the difference in our points of view, as I see it.  And again, you have a
better idea of where those places are that Emacs can expect to receive such
file-name syntax (you mentioned GDB, Make, .emacs...).  Those are the places
where I'd suggest we make the transition to the standard syntax (with only ?/ as
separator).  It is the code that accepts/receives such names that should digest
them to produce standard Emacs file names (i.e., with only ?/ separators).

IOW, if the problem is _external_ programs and an _external_ syntax, then do the
translation at the external/internal boundary: at the point where Emacs gets the
file name from outside.  Don't do it (in effect) at each call to a standard
name-decomposition function.  The translation code could be in C or Lisp for all
I care.  All I would like to see is simple file-name decomposition functions -
no special-casing of ?\ on Windows.

> > The simple functions `file-name-directory' and 
> > `file-name-nondirectory' should be robust enough to just
> > remove the non-directory and directory portion - always.
> 
> It does that today.

No, they do not, because they wrongly interpret ?\ as a directory separator (and
do so only on Windows).






This bug report was last modified 14 years and 235 days ago.



GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.