GNU bug report logs - #73484
31.0.50; Abolishing etags-regen-file-extensions
On 06/10/2024 09:22, Eli Zaretskii wrote:
>> Then, the total time increased a lot: from 30 s to 30-40 min.
>
> I don't understand why. How many files with no extensions are in that
> tree, and what was the etags command line in both cases?
Sorry, I have to add a correction: it's about 15 min either way. Seems
like the first time I either messed up the start time, or the directory
was in "cold" cache, or the etags used was some much older version.
So to reiterate: the current etags-regen scans in around 30s, and the
simple switch scans the directory in 15 minutes. Retesting the change
from previous email, it doesn't really help.
And the 'find-tag' scan did become slower - i.e. from 400 ms to 1200 ms.
Not clear about the mechanics (the size of TAGS only went up from 65 to
88 MB).
>> But parsing HTML files seems to remain the slowest part. There are a lot
>> of them in that project (many test cases), but maybe 3x the number of
>> code files, not 60x their number. And they're pretty small, on average.
>> If somebody wants to test that locally, here's the repository:
>> https://github.com/mozilla/gecko-dev
>
> If HTML files is what explains the slowdown, then why this change
> triggered it? HTML files are supposed to have extensions that tell
> etags they are HTML.
Okay, I've commented out the most obvious suspects (html, asm, makefile)
- all their entries in 'lang_names' - but the scan still takes too long.
Maybe it's some other file type, which I haven't found yet.
But what I see when monitoring the running scan with 'tail -f TAGS' is
that the output sometimes stops for about 20 seconds, in the middle of
outputting the tags of some common code file (such as .cpp or .py), and
then resumes, with files of the same type around this one.
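One way to narrow down such stalls might be to time the tagger on each file separately and look at the slowest ones. The following is only a rough sketch, not something that was run for this report: the helper names `time_per_file` and `etags_cmd` are made up for illustration, and it assumes an `etags` binary on PATH that accepts `-o -` to write tags to standard output.

```python
import subprocess
import sys
import time


def time_per_file(paths, make_cmd):
    """Run make_cmd(path) once per path; return (seconds, path) pairs, slowest first."""
    timings = []
    for path in paths:
        start = time.perf_counter()
        # Output is discarded: only the wall-clock time per file matters here.
        subprocess.run(make_cmd(path), stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL, check=False)
        timings.append((time.perf_counter() - start, path))
    return sorted(timings, reverse=True)


def etags_cmd(path):
    # Assumption: `etags` is on PATH; "-o -" sends the tags to stdout.
    return ["etags", "-o", "-", path]
```

Calling `time_per_file(paths, etags_cmd)[:20]` on a sample of the project's files would list the twenty slowest ones. Starting one process per file adds overhead, so the absolute numbers would not match a single full scan; only the outliers are of interest.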
> And if they don't have extensions, the code you
> removed would have caused etags to scan these files anyway, looking
> for Fortran or C tags. So how come the change slowed down etags so
> much? What am I missing?
I think it would also concern "unknown" extensions, right? Like .txt,
.png and so on.
Anyway, the difference is either due to the different set of files (all
project files, rather than files in the specified list of extensions),
or due to all file names being printed. Not sure how to verify, yet.
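To check the first of these two explanations, one could count the project files per extension and see how many fall outside the extension list. This is a minimal sketch under assumptions: `count_by_extension` and `split_counts` are hypothetical helpers, and the set of listed extensions has to be supplied by hand (it is not read from 'etags-regen-file-extensions').

```python
import os
from collections import Counter


def count_by_extension(root):
    """Count files under root by lowercased extension ('' means no extension)."""
    counts = Counter()
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip the repository metadata directory.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            ext = os.path.splitext(name)[1].lstrip(".").lower()
            counts[ext] += 1
    return counts


def split_counts(counts, listed):
    """Return (files whose extension is in listed, all other files)."""
    inside = sum(n for ext, n in counts.items() if ext in listed)
    return inside, sum(counts.values()) - inside
```

For example, `split_counts(count_by_extension("gecko-dev"), {"c", "cpp", "py"})` would show how many extra files the unrestricted scan has to visit, and `count_by_extension(...).most_common(20)` would show which "unknown" extensions dominate. The extension set in that call is only a placeholder.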
This bug report was last modified 225 days ago.