GNU bug report logs - #47799
28.0.50; Default `project-files' implementation doesn't work with quoted filenames


Package: emacs;

Reported by: Philipp Stephani <p.stephani2 <at> gmail.com>

Date: Thu, 15 Apr 2021 13:45:02 UTC

Severity: normal

Found in version 28.0.50



Message #44 received at 47799 <at> debbugs.gnu.org (full text, mbox):

From: Philipp <p.stephani2 <at> gmail.com>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: 47799 <at> debbugs.gnu.org
Subject: Re: bug#47799: 28.0.50; Default `project-files' implementation
 doesn't work with quoted filenames
Date: Mon, 5 Jul 2021 21:05:01 +0200

> On 17.05.2021 at 01:22, Dmitry Gutov <dgutov <at> yandex.ru> wrote:
> 
> On 16.05.2021 16:37, Philipp wrote:
> 
>> One thing that came to my mind is: in general, in Elisp (not just XRef), we spend lots of time parsing filenames to support remote and quoted filenames.  Other languages probably solve this by introducing proper types for filenames (e.g. the Java Path class), which can then hold preprocessed information about the underlying filesystem (or special file name handler, in the case of Elisp).  How about doing similar for Elisp?  For example, introduce a `parsed-file-name' class or structure holding the remote/quoting state, or attach it to string properties?  I haven't tried out that idea, but I think it could significantly speed up the parsing (since we'd only have to do it once and don't have to search for filename handlers all the time), as well as remain backward-compatible to "plain" unparsed filenames by allowing both strings and this new object type.  WDYT?
> 
> That sounds like an interesting idea to explore.
> 
> We create/concatenate those file names inside project-files, and then "parse" them again to convert to local names inside xref-matches-in-files. Creating such structures might indeed save us on some parsing and garbage generation.
> 
> Experiments and patches welcome.
> 
> What I was also thinking of previously, is some "fileset" data structure which could contain a list of local file names and their connection in a separate slot. Maybe even separating the parent/root directory into a separate slot when feasible, to minimize GC further, though that might complicate applications.
> 
> A more structured "file" value format might make this stuff easier to use indeed, and perhaps the performance difference will be negligible.

I think those are very good ideas.  The "fileset" structure sounds like a pretty good abstraction.
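
As a rough sketch of what such structures might look like: everything prefixed with `my-' below is hypothetical and exists neither in Emacs nor in project.el; only `file-remote-p', `file-local-name', `file-name-quoted-p' and `file-name-unquote' are existing functions (the last two need Emacs 26.1 or later).  The sketch assumes that the fileset root is a local directory name ending in a slash.

```elisp
(require 'cl-lib)

;; Hypothetical type: a file name that has been parsed exactly once.
(cl-defstruct (my-parsed-file-name
               (:constructor my-parsed-file-name--create))
  remote   ; remote prefix such as "/ssh:host:", or nil for local files
  quoted   ; non-nil if the name carried the "/:" quoting prefix
  local)   ; unquoted local part of the name

(defun my-parse-file-name (name)
  "Parse the string NAME once.
An already parsed value is returned unchanged, so callers may pass
either plain strings or `my-parsed-file-name' objects."
  (if (my-parsed-file-name-p name)
      name
    (let ((local (file-local-name name)))
      (my-parsed-file-name--create
       :remote (file-remote-p name)
       :quoted (file-name-quoted-p local)
       :local (file-name-unquote local)))))

;; Hypothetical "fileset": the connection and the root directory are
;; stored once instead of being repeated in every file name.
(cl-defstruct (my-fileset (:constructor my-fileset-create))
  remote   ; shared remote prefix, or nil
  root     ; shared root directory (local name, ends in "/")
  files)   ; list of file names relative to ROOT

(defun my-fileset-expand (fileset)
  "Return the files in FILESET as a list of full file-name strings."
  (let ((prefix (concat (or (my-fileset-remote fileset) "")
                        (my-fileset-root fileset))))
    (mapcar (lambda (file) (concat prefix file))
            (my-fileset-files fileset))))
```

A caller that only needs local names could use the `files' and `root' slots directly, while legacy callers would go through something like `my-fileset-expand' to get plain strings back.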

> 
> The difficulty is having a method like project-files return one format for some users, and another for users who want to take advantage of this performance improvement. Or we break the compatibility and/or introduce a new method with this new behavior.

A general design approach in OOP is not to treat abstract virtual functions (generic functions in Elisp terminology) as part of the public interface of a type; i.e., abstract functions can be implemented, but shouldn't be called outside of the module that defines them (project.el in this case).  That allows for changes like this: implementers could freely return the new fileset structure because only project.el would call project-files.  I'm not sure how much Elisp code adheres to this principle, though.  If too much code outside of project.el relies on project-files returning a list, we would indeed need to fall back to one of the other options.
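
To illustrate the principle, here is a minimal sketch with a backend-only generic function and a public wrapper.  All names are hypothetical and this is not how project.el is currently organized; the two toy project types (lists whose car is `my-legacy' or `my-compact') and the `my-file-group' structure exist only for the example.

```elisp
(require 'cl-lib)

;; Hypothetical compact return value: one root plus relative names.
(cl-defstruct (my-file-group (:constructor my-file-group-create))
  root     ; directory name ending in "/"
  names)   ; file names relative to ROOT

;; Backend hook.  Backends implement it, but only the defining module
;; is supposed to call it, so its return format can evolve.
(cl-defgeneric my-project--files (project)
  "Return PROJECT's files as a list of strings or a `my-file-group'.")

;; A backend that still returns a plain list of strings.
(cl-defmethod my-project--files ((project (head my-legacy)))
  (cdr project))

;; A backend that returns the compact structure.
(cl-defmethod my-project--files ((project (head my-compact)))
  (my-file-group-create :root (nth 1 project)
                        :names (nthcdr 2 project)))

;; Public entry point.  External code calls this and always gets a
;; list of strings, whatever the backend returned.
(defun my-project-file-list (project)
  "Return the files of PROJECT as a list of file-name strings."
  (let ((result (my-project--files project)))
    (if (my-file-group-p result)
        (mapcar (lambda (name)
                  (concat (my-file-group-root result) name))
                (my-file-group-names result))
      result)))
```

With this split, a performance-sensitive caller inside the module could consume the structure directly, while everything else keeps seeing a list.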





This bug report was last modified 3 years and 268 days ago.



GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.