GNU bug report logs -
#50733
28.0.1; project-find-regexp can block Emacs for a long time
Message #122 received at 50733 <at> debbugs.gnu.org (full text, mbox):
>
> To get back to the issue at hand: we are talking (or at least I was
> talking) about scalability of an algorithm, not about some particular
> implementation of the algorithm.
>
Are you now again shifting the discussion to something else, a theoretical
comparison between various algorithms?
>
> Ripgrep is a multithreaded program, whereas idutils is single-threaded.
> So for a fair comparison of scalability of these two main ideas:
> file-based search vs DB search, you need at the very least to limit
> ripgrep to a single thread. And then you need to run each program on
> code bases of various sizes, preferably those which differ by orders of
> magnitude or close to that, and see their O(n) behavior. And exclude
> from your comparison command-line options that require IDUtils to access
> the files in addition to the DB. That would be at least an
> approximation to comparing apples to apples.
>
You're asking me to disable everything that makes ripgrep a modern tool,
and to disable everything that makes idutils an outdated tool, to make the
outdated tool shine in comparison? Interesting viewpoint.
>
> But frankly, I don't understand why this all would be needed at all,
> because it should be absolutely clear that searching the files in the
> filesystem will always scale worse than reading a well-indexed DB.
>
Which is precisely what I don't believe. It is, at least to me, not at
all "absolutely clear" when you look at the whole picture, IOW, when you
include the necessity to create and keep a database up to date in your
comparison, the added complexity of that solution, and the purpose of the
tool.
>
> IDUtils is an example of the latter, and it beats many utilities that
> search the files, including ripgrep, as long as it doesn't need to
> access the files themselves. But even if it doesn't always beat them
> (which you didn't yet demonstrate), it just means the ideas of its
> design should be taken further and/or implemented better, that's all.
>
I provided you with many numbers and comparisons, which IMO demonstrate
what they were meant to demonstrate. A tool which finds matches for a
regexp in an O(100 MB) code base in O(10 ms), and in an O(1 GB) code base in
O(100 ms), is clearly good enough in practice. (Note that I made these
comparisons on a six- or seven-year-old laptop; these numbers would be
even lower on a more recent machine.)
I'm still waiting for some numbers from you to demonstrate *your*
viewpoint.
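
For anyone who wants to produce such numbers, here is a minimal sketch of
a timing harness, not the script used for the figures above. The helper
names best_wall_time and growth_factor are made up for this illustration,
and the ripgrep invocation in the trailing comment assumes rg is on PATH
(--threads is ripgrep's option for limiting the thread count).

```python
import subprocess
import time


def best_wall_time(cmd, repeats=5):
    """Return the fastest wall-clock time, in seconds, over `repeats` runs of `cmd`."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        # Discard output so that printing matches does not dominate the timing.
        subprocess.run(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        timings.append(time.perf_counter() - start)
    return min(timings)


def growth_factor(size_small, time_small, size_large, time_large):
    """Time growth divided by size growth; a value near 1.0 means roughly linear scaling."""
    return (time_large / time_small) / (size_large / size_small)


# Hypothetical usage, with placeholder paths and pattern:
#   t_small = best_wall_time(["rg", "--threads", "1", "some_regexp", "small-tree/"])
#   t_large = best_wall_time(["rg", "--threads", "1", "some_regexp", "large-tree/"])
#   print(growth_factor(100e6, t_small, 1e9, t_large))
```

Taking the minimum over several runs reduces noise from a cold file cache
and from other processes; the same harness can wrap any other search
command, so both tools are measured the same way.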
>
> I said that such tools are the future, not that IDUtils itself is
> necessarily the future (though it could be, if someone picks up its
> development).
>
Is it not simply because it's not useful/better in practice that nobody is
picking up its development (and pretty much nobody is using it)?
>
> Again, this is about looking for the best tools for this job, and I
> still stand by my opinion: focusing only on general-purpose search tools
> is sub-optimal.
>
The message to which you replied and which started this subthread did not
suggest "focusing only on general-purpose search tools"; it suggested
focusing only on *one* particular general-purpose search tool, ripgrep, which
is currently the best tool for the job, and bundling it with Emacs. It
has a public domain license, so I guess it should be possible.
This bug report was last modified 3 years and 261 days ago.