GNU bug report logs - #46881
28.0.50; pdumper dumping causes way too many syscalls

Package: emacs

Reported by: Pip Cet <pipcet <at> gmail.com>

Date: Tue, 2 Mar 2021 20:35:01 UTC

Severity: normal

Found in version 28.0.50

Done: Mattias Engdegård <mattiase <at> acm.org>

Bug is archived. No further changes may be made.


Message #68 received at 46881 <at> debbugs.gnu.org:

From: Pip Cet <pipcet <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 46881 <at> debbugs.gnu.org, Daniel Colascione <dancol <at> dancol.org>,
 eggert <at> cs.ucla.edu
Subject: Re: bug#46881: 28.0.50; pdumper dumping causes way too many syscalls
Date: Fri, 5 Mar 2021 13:16:14 +0000

On Fri, Mar 5, 2021 at 12:07 PM Eli Zaretskii <eliz <at> gnu.org> wrote:
> > From: Pip Cet <pipcet <at> gmail.com>
> > Date: Fri, 5 Mar 2021 09:54:32 +0000
> > Cc: Daniel Colascione <dancol <at> dancol.org>, eggert <at> cs.ucla.edu, 46881 <at> debbugs.gnu.org
> >
> > My patch:
> >
> > real    0m1.988s
> > user    0m1.916s
> > sys    0m0.073s
> >
> > fwrite-based patch:
> >
> > real    0m3.576s
> > user    0m2.571s
> > sys    0m1.006s
>
> 30% slowdown and 1.5 sec absolute time difference doesn't sound bad
> enough to me to

It's a 30% slowdown of the entire dump process, including the
CPU-intensive part which loads Emacs. I think you get a better idea of
the performance difference from the "sys" numbers above.

And the absolute time difference is more than that, because Emacs is
dumped twice during each build; the first dump file is about 2.5 times
the size of the ultimate dump file, so my guess (as I said before,
unfortunately Intel decided to make this system not have a predictable
CPU clock, so I can't really run good benchmarks) is we're talking
about 4.5 seconds here.

> justify a homemade solution.

"Create a buffer in memory and do all the IO at once" is such an old
solution that even the GNU Coding Standards explicitly recommend it
(albeit for input files):

    You could keep the entire input file in memory and scan it there
    instead of using stdio
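
As a rough illustration of the "buffer in memory, do all the IO at once" idea, here is a minimal sketch. It is not the code from either patch in this thread; the names membuf, membuf_reserve, membuf_write_at and membuf_flush are hypothetical. A "seek, then write" becomes a memcpy at an offset, and the file only sees one fwrite at the end.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable output buffer; not the pdumper's actual type.  */
struct membuf
{
  unsigned char *data;
  size_t len;                   /* bytes in use */
  size_t cap;                   /* bytes allocated */
};

/* Make sure the buffer covers at least END bytes, zero-filling any
   newly allocated space.  Return 0 on success, -1 if out of memory.
   (Overflow checks are omitted to keep the sketch short.)  */
static int
membuf_reserve (struct membuf *b, size_t end)
{
  if (end > b->cap)
    {
      size_t cap = b->cap ? b->cap : 4096;
      while (cap < end)
        cap *= 2;
      unsigned char *p = realloc (b->data, cap);
      if (!p)
        return -1;
      memset (p + b->cap, 0, cap - b->cap);
      b->data = p;
      b->cap = cap;
    }
  if (end > b->len)
    b->len = end;
  assert (b->len <= b->cap);
  return 0;
}

/* Store LEN bytes at OFFSET: the in-memory stand-in for
   "seek to OFFSET, then write".  No syscall is involved.  */
static int
membuf_write_at (struct membuf *b, size_t offset, const void *src,
                 size_t len)
{
  if (membuf_reserve (b, offset + len) != 0)
    return -1;
  memcpy (b->data + offset, src, len);
  return 0;
}

/* Hand the whole buffer to stdio in a single fwrite call.  */
static int
membuf_flush (const struct membuf *b, FILE *fp)
{
  return fwrite (b->data, 1, b->len, fp) == b->len ? 0 : -1;
}
```

With this shape, patching a header after the body has been emitted is just another membuf_write_at call at offset 0, at the cost of holding the whole dump in memory until the flush.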

> I say let's go with stdio.

Maybe setbuffer(3) could help us here? I could run some benchmarks for
that if the idea isn't out of the question.
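
A minimal sketch of what that experiment might look like, using setvbuf, the ISO C relative of setbuffer(3). The helper name open_with_big_buffer and the plain fopen call are assumptions for illustration; the real code would go through emacs_fopen.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: open PATH for writing with a fully buffered
   stream whose buffer is BUFSIZE bytes.  Returns NULL on failure.  */
static FILE *
open_with_big_buffer (const char *path, size_t bufsize)
{
  assert (path != NULL);
  FILE *fp = fopen (path, "wb");
  if (!fp)
    return NULL;
  /* setvbuf must be called before any other operation on the stream.
     Passing NULL lets stdio allocate the buffer itself.  */
  if (setvbuf (fp, NULL, _IOFBF, bufsize) != 0)
    {
      fclose (fp);
      return NULL;
    }
  return fp;
}
```

Whether a larger buffer actually reduces the syscall count when writes are interleaved with fseek depends on the stdio implementation, which is why this would need benchmarking rather than guessing.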

> > > > Also, we're not currently using fseek-and-write anywhere in Emacs.
> > >
> > > I don't see why this would be important.
> >
> > Because the stream returned by emacs_fopen might not be generally seekable?
>
> I don't see how that could happen.

It has, to me, but I'm willing to accept I did some inadvisable things first.

> > By preparing the data in memory and writing it in one go, which
> > doesn't require any of the major complications of implementing
> > buffered streams.
>
> There are no complications I can see, not in our sources.  (And you
> don't actually write it in one go anyway, see emacs_full_write.)

Er, precisely. I was the one saying there are no complications, so we
shouldn't let the idea of "implementing our own buffered streams"
scare us, because that is a complicated project but it's also not what
we are doing.
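
For readers unfamiliar with the "not in one go" point: a full-write helper is typically just a retry loop around write(2). The following is a hedged sketch of that general pattern, not a copy of emacs_full_write; the name full_write_sketch is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: keep calling write until LEN bytes from BUF
   have been written to FD or an error other than EINTR occurs.
   Returns the number of bytes written, which is less than LEN only
   on failure.  */
static size_t
full_write_sketch (int fd, const void *buf, size_t len)
{
  const char *p = buf;
  size_t done = 0;
  assert (buf != NULL || len == 0);
  while (done < len)
    {
      ssize_t n = write (fd, p + done, len - done);
      if (n < 0 && errno == EINTR)
        continue;               /* interrupted by a signal: retry */
      if (n <= 0)
        break;                  /* real error, or no progress */
      done += (size_t) n;
    }
  return done;
}
```

For a regular file the loop usually finishes in one iteration, so a buffer prepared in memory still costs roughly one write call; the loop only exists to cope with partial writes and interruptions.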

> So let's go with the stdio solution, please.

Should I add a sync after every seek to make absolutely certain,
rather than merely likely, this will destroy someone's flash chip one
day?

Pip




This bug report was last modified 4 years and 34 days ago.

