GNU bug report logs - #24126
vc-hg-state can be extremely slow

Package: emacs

Reported by: Jonathan Kotta <jpkotta <at> gmail.com>

Date: Mon, 1 Aug 2016 18:42:01 UTC

Severity: normal

To reply to this bug, email your comments to 24126 AT debbugs.gnu.org.


Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#24126; Package emacs. (Mon, 01 Aug 2016 18:42:01 GMT)

Acknowledgement sent to Jonathan Kotta <jpkotta <at> gmail.com>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Mon, 01 Aug 2016 18:42:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Jonathan Kotta <jpkotta <at> gmail.com>
To: bug-gnu-emacs <at> gnu.org
Subject: vc-hg-state can be extremely slow
Date: Mon, 1 Aug 2016 13:40:36 -0500
Emacs uses `hg status -A` in vc-hg-state, which in turn is used in many vc
commands (e.g. vc-root-diff).  The "-A" option makes mercurial look at all
files under the directory, even the ignored ones.  If there are a lot of
ignored files, this will be very slow.

As an example, I have a repo that's 38MB / 300 files when freshly checked
out, and 34GB / 1.2M files when the build finishes (if you're curious, it's
a yocto project).  Without clearing the disk cache, `hg stat -A >
/dev/null` takes 28s; it's far longer if the disk cache isn't warmed up or
the output is actually used.  `hg status` takes about 90ms.

vc-git-state does not have this problem; currently it behaves like `hg
status`, i.e. honoring the ignore rules.  There is actually a FIXME
comment regarding this functionality, noting that `git ls-files -i -o
--exclude-standard` is the equivalent of `hg status -A`; this takes over
400s (I got sick of waiting).

I'm guessing VC assumes that vc-x-state will return all files.  Maybe the
command could bail out after taking too long and fall back to `hg
status`.  Maybe the command options could be configurable.  Personally, I'd
prefer just dropping the "-A", because I've never used it and I don't
really see why you'd want to get ignored files by default; this is my
current solution.

I'm using Emacs 24.5.1.  I've tested Emacs 25.1.1 and it still has the
issue; though vc-hg-state has changed, it still uses "-A" and is still very
slow on my repo.

-- 
Thanks,

Jonathan Kotta

Hofstadter's Law:
    It always takes longer than you expect, even
    when you take into account Hofstadter's Law.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#24126; Package emacs. (Tue, 02 Aug 2016 13:33:02 GMT)

Message #8 received at 24126 <at> debbugs.gnu.org:

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Jonathan Kotta <jpkotta <at> gmail.com>, 24126 <at> debbugs.gnu.org
Subject: Re: bug#24126: vc-hg-state can be extremely slow
Date: Tue, 2 Aug 2016 16:32:37 +0300
On 08/01/2016 09:40 PM, Jonathan Kotta wrote:
> Emacs uses `hg status -A` in vc-hg-state, which in turn is used in many
> vc commands (e.g. vc-root-diff).  The "-A" option makes mercurial look
> at all files under the directory, even the ignored ones.  If there are a
> lot of ignored files, this will be very slow.

Why does it do that? We're passing a specific file name to it.

> As an example, I have a repo that's 38MB / 300 files when freshly
> checked out, and 34GB / 1.2M files when the build finishes (if you're
> curious, it's a yocto project).  Without clearing the disk cache, `hg
> stat -A > /dev/null` takes 28s;

What about 'hg status -A file/name > /dev/null'?

> vc-git-state does not have this problem; currently it behaves like `hg
> status`, i.e. honoring the ignore rules.  There is actually a FIXME
> comment regarding this functionality, noting that `git ls-files -i -o
> --exclude-standard` is the equivalent of `hg status -A`; this takes over
> 400s (I got sick of waiting).

The FIXME is outdated, we'll do it by parsing 'git status --porcelain'.

Does 'git status --ignored --porcelain -- file/name' take a lot of time 
for you as well?

> I'm guessing VC has some sort of assumption that vc-x-state will return
> all files.

vc-x-state returns the state of a single file.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#24126; Package emacs. (Tue, 02 Aug 2016 15:59:02 GMT)

Message #11 received at 24126 <at> debbugs.gnu.org:

From: Jonathan Kotta <jpkotta <at> gmail.com>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: 24126 <at> debbugs.gnu.org
Subject: Re: bug#24126: vc-hg-state can be extremely slow
Date: Tue, 2 Aug 2016 10:57:33 -0500
On Tue, Aug 2, 2016 at 8:32 AM, Dmitry Gutov <dgutov <at> yandex.ru> wrote:

> On 08/01/2016 09:40 PM, Jonathan Kotta wrote:
>
>> Emacs uses `hg status -A` in vc-hg-state, which in turn is used in many
>> vc commands (e.g. vc-root-diff).  The "-A" option makes mercurial look
>> at all files under the directory, even the ignored ones.  If there are a
>> lot of ignored files, this will be very slow.
>>
>
> Why does it do that? We're passing a specific file name to it.
>

In the case of `vc-root-diff` at least, it's passing in a directory; the
command is essentially `hg status -A ./`, and the CWD is indeed the repo
root.  I did this by running `vc-root-diff` from a dired buffer visiting
the repo root.

-- 
Thanks,

Jonathan Kotta

Hofstadter's Law:
    It always takes longer than you expect, even
    when you take into account Hofstadter's Law.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#24126; Package emacs. (Tue, 02 Aug 2016 16:20:01 GMT)

Message #14 received at 24126 <at> debbugs.gnu.org:

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Jonathan Kotta <jpkotta <at> gmail.com>
Cc: 24126 <at> debbugs.gnu.org
Subject: Re: bug#24126: vc-hg-state can be extremely slow
Date: Tue, 2 Aug 2016 19:19:30 +0300
On 08/02/2016 06:57 PM, Jonathan Kotta wrote:

> In the case of `vc-root-diff` at least, it's passing in a directory; the
> command is essentially `hg status -A ./`, and the CWD is indeed the repo
> root.  I did this by running `vc-root-diff` from a dired buffer visiting
> the repo root.

OK, thanks. The chain of calls looks like this:

  vc-hg-state("~/vc/mozilla-central/")
  vc-hg-registered("~/vc/mozilla-central/")
  apply(vc-hg-registered "~/vc/mozilla-central/")
  vc-call-backend(Hg registered "~/vc/mozilla-central/")
  vc-registered("~/vc/mozilla-central/")
  vc-backend("~/vc/mozilla-central/")
  vc-working-revision("~/vc/mozilla-central/")
  vc-root-diff(nil t)

Maybe vc-hg-registered shouldn't delegate to vc-hg-state, and should
instead call 'hg status' on its own without '-A'.
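
As a rough illustration of how cheap a "registered" check for a directory
can be, here is a minimal shell sketch.  It swaps in a different technique
from the one suggested above: rather than calling 'hg status', it walks up
the directory tree looking for a `.hg` directory, so it never scans the
working copy.  `find_hg_root` is a hypothetical helper name, and nothing
here is claimed to be what vc-hg-registered does.

```shell
# Hypothetical helper (not part of Emacs or Mercurial): print the root of
# the Mercurial repository that contains directory "$1", or return 1.
find_hg_root() {
    # Resolve the argument to an absolute path; fail if it is not a directory.
    dir=$(cd "${1:-.}" 2>/dev/null && pwd) || return 1
    while :; do
        if [ -d "$dir/.hg" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
        # Reached the filesystem root without finding a repository.
        [ "$dir" = "/" ] && return 1
        dir=$(dirname "$dir")
    done
}
```

The cost depends only on the depth of the directory, not on how many
ignored files sit underneath it.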




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#24126; Package emacs. (Sun, 15 Aug 2021 12:31:02 GMT)

Message #17 received at 24126 <at> debbugs.gnu.org:

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: 24126 <at> debbugs.gnu.org, Jonathan Kotta <jpkotta <at> gmail.com>
Subject: Re: bug#24126: vc-hg-state can be extremely slow
Date: Sun, 15 Aug 2021 14:30:22 +0200
Dmitry Gutov <dgutov <at> yandex.ru> writes:

>> In the case of `vc-root-diff` at least, it's passing in a directory; the
>> command is essentially `hg status -A ./`, and the CWD is indeed the repo
>> root.  I did this by running `vc-root-diff` from a dired buffer visiting
>> the repo root.
>
> OK, thanks. The chain of calls looks like this:
>
>   vc-hg-state("~/vc/mozilla-central/")
>   vc-hg-registered("~/vc/mozilla-central/")
>   apply(vc-hg-registered "~/vc/mozilla-central/")
>   vc-call-backend(Hg registered "~/vc/mozilla-central/")
>   vc-registered("~/vc/mozilla-central/")
>   vc-backend("~/vc/mozilla-central/")
>   vc-working-revision("~/vc/mozilla-central/")
>   vc-root-diff(nil t)

This was five years ago -- I tried reproducing this by instrumenting
`vc-hg-state' in a hg-covered directory, and that function was not
called when doing vc-root-diff there.

So has this been fixed in the intervening years?

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#24126; Package emacs. (Tue, 17 Aug 2021 00:15:01 GMT)

Message #20 received at 24126 <at> debbugs.gnu.org:

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Lars Ingebrigtsen <larsi <at> gnus.org>
Cc: 24126 <at> debbugs.gnu.org, Jonathan Kotta <jpkotta <at> gmail.com>
Subject: Re: bug#24126: vc-hg-state can be extremely slow
Date: Tue, 17 Aug 2021 03:14:13 +0300
On 15.08.2021 15:30, Lars Ingebrigtsen wrote:
> This was five years ago -- I tried reproducing this by instrumenting
> `vc-hg-state' in a hg-covered directory, and that function was not
> called when doing vc-root-diff there.

I think that only happens once per root per Emacs session (after that 
the directory's backend is cached).

> So has this been fixed in the intervening years?

Probably not.

It's not too slow here in the project I've tried it with, but it doesn't 
have many ignored files (which was the situation in the report).




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#24126; Package emacs. (Wed, 18 Aug 2021 14:41:01 GMT)

Message #23 received at 24126 <at> debbugs.gnu.org:

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: 24126 <at> debbugs.gnu.org, Jonathan Kotta <jpkotta <at> gmail.com>
Subject: Re: bug#24126: vc-hg-state can be extremely slow
Date: Wed, 18 Aug 2021 16:39:57 +0200
Dmitry Gutov <dgutov <at> yandex.ru> writes:

> I think that only happens once per root per Emacs session (after that
> the directory's backend is cached).

Yup.  If I instrument the function and then do a vc-root-diff in a fresh
Emacs session, I get the expected backtrace:

Debugger entered--entering a function:
* vc-hg-state("/tmp/hg/")
  apply(vc-hg-state "/tmp/hg/")
  vc-call-backend(Hg state "/tmp/hg/")
  vc-state-refresh("/tmp/hg/" Hg)
  vc-state("/tmp/hg/" Hg)
  vc-hg-registered("/tmp/hg/")
  apply(vc-hg-registered "/tmp/hg/")
  vc-call-backend(Hg registered "/tmp/hg/")
  #f(compiled-function (b) #<bytecode -0x1056748a7cc3b17d>)(Hg)
  mapc(#f(compiled-function (b) #<bytecode -0x1056748a7cc3b17d>) (RCS CVS SVN SCCS SRC Bzr Git Hg))
  vc-registered("/tmp/hg/")
  vc-backend("/tmp/hg/")
  vc-working-revision("/tmp/hg/")
  vc-root-diff(nil)
  eval((vc-root-diff nil) t)
  eval-expression((vc-root-diff nil) nil nil 127)
  funcall-interactively(eval-expression (vc-root-diff nil) nil nil 127)
  call-interactively(eval-expression nil nil)
  command-execute(eval-expression)

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#24126; Package emacs. (Sat, 08 Mar 2025 02:58:01 GMT)

Message #26 received at 24126 <at> debbugs.gnu.org:

From: Sean Whitton <spwhitton <at> spwhitton.name>
To: jpkotta <at> gmail.com
Cc: control <at> debbugs.gnu.org, 24126 <at> debbugs.gnu.org
Subject: Bug#24126: vc-hg-state can be extremely slow
Date: Sat, 08 Mar 2025 10:57:16 +0800
tag 24126 + moreinfo
thanks

Hello,

It sounds like this bug may well still exist but there isn't enough
information in this particular report for anyone to do any work on it.

Therefore I would propose we close it, unless, Jonathan, you would be
able to provide a tarball of a Mercurial repository that shows the
problem, perhaps.

-- 
Sean Whitton




Added tag(s) moreinfo. Request was from Sean Whitton <spwhitton <at> spwhitton.name> to control <at> debbugs.gnu.org. (Sat, 08 Mar 2025 02:58:02 GMT)

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#24126; Package emacs. (Mon, 10 Mar 2025 21:25:01 GMT)

Message #31 received at 24126 <at> debbugs.gnu.org:

From: Jonathan Kotta <jpkotta <at> gmail.com>
To: Sean Whitton <spwhitton <at> spwhitton.name>
Cc: control <at> debbugs.gnu.org, 24126 <at> debbugs.gnu.org
Subject: Re: Bug#24126: vc-hg-state can be extremely slow
Date: Mon, 10 Mar 2025 16:24:02 -0500
Unfortunately I don't even have any hg repos I work with any more.  I use
magit for almost everything these days.  So I guess from my perspective it
doesn't matter any more.

I tested `vc-root-diff` on Emacs 29.4 with the repo described below, and
though it's slow the first time it seems to be much faster on subsequent
calls, probably due to caching in Emacs.  I'd argue it's still unacceptably
slow, because there can be an arbitrary number of ignored files.  The
fundamental bug is assuming `hg status -A some_directory` is a fast
operation, when it can easily take many seconds or even minutes because
it's proportional to the number of files under some_directory.

mkdir /tmp/foo
cd /tmp/foo
hg init
echo -e "syntax: glob\n*" > .hgignore
for i in $(seq 1000) ; do mkdir $i ; (cd $i ; touch $(seq -s ' ' 1000)) ; done

$ time hg status -A ./ | wc
1000001 2000002 9786012

real 0m4.748s
user 0m3.703s
sys 0m1.063s

$ time hg status ./ | wc
      0       0       0

real 0m0.124s
user 0m0.101s
sys 0m0.024s

Here's some timing on one of my Yocto repos:

$ time hg status -A | wc
5074646 10180682 727258295

real 1m6.047s
user 0m23.774s
sys 0m26.154s

$ time hg status | wc
      0       0       0

real 0m0.204s
user 0m0.090s
sys 0m0.054s
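
To see how much of the `hg status -A` output is made up of ignored files,
the lines can be tallied by their leading status code.  This is a minimal
sketch, assuming the usual one-letter codes at the start of each line
(`I` ignored, `C` clean, `?` unknown, and so on); `summarize_status` is a
hypothetical helper name, not part of Mercurial or Emacs.

```shell
# Hypothetical helper: read `hg status` output on stdin and print one
# "CODE COUNT" line per status code found in the first column.
summarize_status() {
    awk '{ count[substr($0, 1, 1)]++ }
         END { for (code in count) print code, count[code] }' |
        LC_ALL=C sort
}

# Example use (needs hg and a repository, so it is left commented out):
#   hg status -A ./ | summarize_status
```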


-- 
Thanks,

Jonathan Kotta

Hofstadter's Law:
    It always takes longer than you expect, even
    when you take into account Hofstadter's Law.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#24126; Package emacs. (Fri, 14 Mar 2025 05:38:02 GMT)

Message #34 received at 24126 <at> debbugs.gnu.org:

From: Sean Whitton <spwhitton <at> spwhitton.name>
To: Jonathan Kotta <jpkotta <at> gmail.com>
Cc: control <at> debbugs.gnu.org, 24126 <at> debbugs.gnu.org
Subject: Re: bug#24126: vc-hg-state can be extremely slow
Date: Fri, 14 Mar 2025 13:37:18 +0800
tag 24126 - moreinfo
thanks

Hello,

On Mon 10 Mar 2025 at 04:24pm -05, Jonathan Kotta wrote:

> I tested `vc-root-diff` on Emacs 29.4 with the repo described below,
> and though it's slow the first time it seems to be much faster on
> subsequent calls, probably due to caching in Emacs.  I'd argue it's
> still unacceptably slow, because there can be an arbitrary number of
> ignored files.  The fundamental bug is assuming `hg status -A
> some_directory` is a fast operation, when it can easily take many
> seconds or even minutes because it's proportional to the number of
> files under some_directory.

Thanks, Jonathan, for the info.

The next step here is for someone to find out if there is any equivalent
hg command that will get us the information we need without slowing down
based on the number of ignored files.  Otherwise this is just unfixable.
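
One candidate worth measuring is sketched below.  It assumes that asking
for each status class explicitly (`-m` modified, `-a` added, `-r` removed,
`-d` deleted, `-c` clean, `-u` unknown) while leaving out `-i` and `-A`
lets Mercurial skip ignored files; that has not been verified here, and
the result can no longer report a file as ignored.
`hg_status_without_ignored` and the `DRY_RUN` variable are hypothetical
names used only for illustration.

```shell
# Sketch only: query every status class except "ignored".
# "$1" is the file or directory to query (defaults to the current directory).
hg_status_without_ignored() {
    # Build the command line in the positional parameters.
    set -- hg status -mardcu -- "${1:-.}"
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$@"    # hypothetical dry-run mode: print the command only
    else
        "$@"
    fi
}
```

Timing `hg_status_without_ignored ./` against `hg status -A ./` in the
test repository from this report would show whether the assumption holds.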

-- 
Sean Whitton




Removed tag(s) moreinfo. Request was from Sean Whitton <spwhitton <at> spwhitton.name> to control <at> debbugs.gnu.org. (Fri, 14 Mar 2025 05:38:03 GMT)

This bug report was last modified 99 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.