GNU bug report logs - #71355
30.0.50; [PATCH] Improve performance of buffered output in Eshell

Previous Next

Package: emacs;

Reported by: Jim Porter <jporterbugs <at> gmail.com>

Date: Tue, 4 Jun 2024 05:37:02 UTC

Severity: normal

Tags: patch

Found in version 30.0.50

Done: Jim Porter <jporterbugs <at> gmail.com>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 71355 in the body.
You can then email your comments to 71355 AT debbugs.gnu.org in the normal way.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox


Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#71355; Package emacs. (Tue, 04 Jun 2024 05:37:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to Jim Porter <jporterbugs <at> gmail.com>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Tue, 04 Jun 2024 05:37:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Jim Porter <jporterbugs <at> gmail.com>
To: bug-gnu-emacs <at> gnu.org
Subject: 30.0.50; [PATCH] Improve performance of buffered output in Eshell
Date: Mon, 3 Jun 2024 22:36:29 -0700
[Message part 1 (text/plain, inline)]
In Eshell, if I run "time cat config.log" from my Emacs build directory, 
it reports that it takes about 7.5s. It also doesn't show *any* output 
until it's completely finished. With my attached patches, it now takes 
about 0.6s and also shows the output iteratively, redisplaying 
periodically so users can see that something is happening.

The other command most likely to be impacted by this is the built-in 
version of "ls". When I run "ls -Al /usr/bin" on my system, I go from 
2.1s before my patch to 1.2s after. Not as big an improvement, but still 
noticeable, and it *feels* a lot faster too with the iterative redisplay.

I don't usually add a NEWS entry for perf improvements, but this one 
seemed notable enough that I figured it was worth tooting my own horn. :)
[0001-Be-more-efficient-when-buffering-output-in-Eshell.patch (text/plain, attachment)]
[0002-Improve-implementations-of-some-Eshell-output-filter.patch (text/plain, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#71355; Package emacs. (Tue, 04 Jun 2024 22:02:01 GMT) Full text and rfc822 format available.

Message #8 received at 71355 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Kangas <stefankangas <at> gmail.com>
To: Jim Porter <jporterbugs <at> gmail.com>, 71355 <at> debbugs.gnu.org
Subject: Re: bug#71355: 30.0.50; [PATCH] Improve performance of buffered
 output in Eshell
Date: Tue, 4 Jun 2024 17:52:03 -0400
Jim Porter <jporterbugs <at> gmail.com> writes:

> In Eshell, if I run "time cat config.log" from my Emacs build directory,
> it reports that it takes about 7.5s. It also doesn't show *any* output
> until it's completely finished. With my attached patches, it now takes
> about 0.6s and also shows the output iteratively, redisplaying
> periodically so users can see that something is happening.
>
> The other command most likely to be impacted by this is the built-in
> version of "ls". When I run "ls -Al /usr/bin" on my system, I go from
> 2.1s before my patch to 1.2s after. Not as big an improvement, but still
> noticeable, and it *feels* a lot faster too with the iterative redisplay.
>
> I don't usually add a NEWS entry for perf improvements, but this one
> seemed notable enough that I figured it was worth tooting my own horn. :)

Nice, thanks for working on this.  Your patch makes sense to me at first
glance, but I didn't test it.  The performance improvement definitely
seems highly worthwhile based on your measurements.

Bonus points for adding tests, as always.

> +(defcustom eshell-buffered-print-size 2048
> +  "The size of the print queue in characters, for doing buffered printing.
>  This is basically a speed enhancement, to avoid blocking the Lisp code
>  from executing while Emacs is redisplaying."

How did you decide on this value?

Could the docstring be expanded to explain what a user can expect to
happen if they increase or decrease this value?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#71355; Package emacs. (Wed, 05 Jun 2024 01:57:02 GMT) Full text and rfc822 format available.

Message #11 received at 71355 <at> debbugs.gnu.org (full text, mbox):

From: Jim Porter <jporterbugs <at> gmail.com>
To: Stefan Kangas <stefankangas <at> gmail.com>, 71355 <at> debbugs.gnu.org
Subject: Re: bug#71355: 30.0.50; [PATCH] Improve performance of buffered
 output in Eshell
Date: Tue, 4 Jun 2024 18:55:00 -0700
On 6/4/2024 2:52 PM, Stefan Kangas wrote:
> Nice, thanks for working on this.  Your patch makes sense to me at first
> glance, but I didn't test it.  The performance improvement definitely
> seems highly worthwhile based on your measurements.

If anyone has time to test, I'd be interested to see if the results are 
similarly good on other platforms (I'm testing on x86_64 GNU/Linux), but 
I imagine they're comparable. The main areas of improvement are:

1. Batching output into larger chunks reduces the constant-time 
overheads associated with each write to the screen.

2. Scanning for control characters in output is much faster thanks to 
're-search-forward'.

3. The above improvements made it possible to add periodic redisplays, 
which eat into the perf improvements. I mitigated this by throttling the 
redisplay to 40Hz (the slowest value I found where the throttling isn't 
obvious).

> Bonus points for adding tests, as always.

I wrote some tests for the control character fix, since my changes to 
that function were nontrivial. I'll see if I can think of a few others, 
though I'm partly relying on the now-fairly-extensive Eshell test suite 
to catch any mistakes.

>> +(defcustom eshell-buffered-print-size 2048
>> +  "The size of the print queue in characters, for doing buffered printing.
>>   This is basically a speed enhancement, to avoid blocking the Lisp code
>>   from executing while Emacs is redisplaying."
> 
> How did you decide on this value?

Basically, I tried a few different command invocations that normally 
take a while (the main ones being "cat config.log" and "ls -Al 
/usr/bin") and scaled up the value by factors of two until I stopped 
seeing any major perf improvements. (Larger values are faster still, but 
not by a whole lot.)

> Could the docstring be expanded to explain what a user can expect to
> happen if they increase or decrease this value?

Sure, that makes sense. Essentially, smaller values will be slower, but 
may update faster (subject to the redisplay throttling), whereas larger 
values are the opposite.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#71355; Package emacs. (Wed, 05 Jun 2024 03:53:01 GMT) Full text and rfc822 format available.

Message #14 received at 71355 <at> debbugs.gnu.org (full text, mbox):

From: Jim Porter <jporterbugs <at> gmail.com>
To: Stefan Kangas <stefankangas <at> gmail.com>, 71355 <at> debbugs.gnu.org
Subject: Re: bug#71355: 30.0.50; [PATCH] Improve performance of buffered
 output in Eshell
Date: Tue, 4 Jun 2024 20:50:52 -0700
[Message part 1 (text/plain, inline)]
On 6/4/2024 6:55 PM, Jim Porter wrote:
> On 6/4/2024 2:52 PM, Stefan Kangas wrote:
>> Could the docstring be expanded to explain what a user can expect to
>> happen if they increase or decrease this value?
> 
> Sure, that makes sense. Essentially, smaller values will be slower, but 
> may update faster (subject to the redisplay throttling), whereas larger 
> values are the opposite.

I've expanded this docstring and a few others, plus added a simple test 
to make sure Eshell's built-in "cat" still works.
[0001-Be-more-efficient-when-buffering-output-in-Eshell.patch (text/plain, attachment)]
[0002-Improve-implementations-of-some-Eshell-output-filter.patch (text/plain, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#71355; Package emacs. (Wed, 05 Jun 2024 12:25:02 GMT) Full text and rfc822 format available.

Message #17 received at 71355 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Jim Porter <jporterbugs <at> gmail.com>
Cc: 71355 <at> debbugs.gnu.org, stefankangas <at> gmail.com
Subject: Re: bug#71355: 30.0.50;
 [PATCH] Improve performance of buffered output in Eshell
Date: Wed, 05 Jun 2024 15:06:21 +0300
> Date: Tue, 4 Jun 2024 20:50:52 -0700
> From: Jim Porter <jporterbugs <at> gmail.com>
> 
> On 6/4/2024 6:55 PM, Jim Porter wrote:
> > On 6/4/2024 2:52 PM, Stefan Kangas wrote:
> >> Could the docstring be expanded to explain what a user can expect to
> >> happen if they increase or decrease this value?
> > 
> > Sure, that makes sense. Essentially, smaller values will be slower, but 
> > may update faster (subject to the redisplay throttling), whereas larger 
> > values are the opposite.
> 
> I've expanded this docstring and a few others, plus added a simple test 
> to make sure Eshell's built-in "cat" still works.

Is 2K indeed the optimal size?  It is about 25 80-column lines, which
is quite a lot.  "Normal" shells send output to the screen in smaller
chunks.  How about 128 instead? or maybe some small multiple of the
line length, like 1 or 2?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#71355; Package emacs. (Wed, 05 Jun 2024 16:53:01 GMT) Full text and rfc822 format available.

Message #20 received at 71355 <at> debbugs.gnu.org (full text, mbox):

From: Jim Porter <jporterbugs <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 71355 <at> debbugs.gnu.org, stefankangas <at> gmail.com
Subject: Re: bug#71355: 30.0.50; [PATCH] Improve performance of buffered
 output in Eshell
Date: Wed, 5 Jun 2024 09:42:43 -0700
On 6/5/2024 5:06 AM, Eli Zaretskii wrote:
> Is 2K indeed the optimal size?  It is about 25 80-column lines, which
> is quite a lot.  "Normal" shells send output to the screen in smaller
> chunks.  How about 128 instead? or maybe some small multiple of the
> line length, like 1 or 2?

Yes, I believe 2k is the optimal size, or close to it. Trying a value of 
128 results in basically no change in performance from baseline. That 
makes sense to me, since 128 is actually fairly close to the old value 
for this buffering (which was five *lines*[1]; the old code measured 
this differently).

In starting on this work, I compared to the amount of data that I get 
each time a child process filter is called; that's normally 4095 bytes 
in my tests. Then I started a bit lower (512) and gradually doubled the 
buffer size until I was able to get most of the performance benefit.

A lot of this improvement does come from reducing the number of times 
that we call 'eshell-output-filter-functions', so if we made those 
functions much faster, or throttled them somewhere else, then we could 
get a large perf improvement while using a small buffer size. However, 
even if I remove everything from 'eshell-output-filter-functions', a 
larger buffer is *still* faster (0.2s for 2048 vs 0.5s for 128). For 
comparison, the same command in an ordinary shell takes about 0.1s, so 
that's my ultimate target.

Note that this buffered output code is really only used when Eshell will 
output a (relatively) large block of text all at once from a Lisp 
command. External processes and most ordinary Lisp code won't use this, 
since you have to use the special functions for buffering your writes.

[1] Well, 5 calls to the print function, but most callers printed a line 
at a time.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#71355; Package emacs. (Wed, 05 Jun 2024 16:53:01 GMT) Full text and rfc822 format available.

Message #23 received at 71355 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Jim Porter <jporterbugs <at> gmail.com>
Cc: 71355 <at> debbugs.gnu.org, stefankangas <at> gmail.com
Subject: Re: bug#71355: 30.0.50; [PATCH] Improve performance of buffered
 output in Eshell
Date: Wed, 05 Jun 2024 19:51:39 +0300
> Date: Wed, 5 Jun 2024 09:42:43 -0700
> Cc: 71355 <at> debbugs.gnu.org, stefankangas <at> gmail.com
> From: Jim Porter <jporterbugs <at> gmail.com>
> 
> On 6/5/2024 5:06 AM, Eli Zaretskii wrote:
> > Is 2K indeed the optimal size?  It is about 25 80-column lines, which
> > is quite a lot.  "Normal" shells send output to the screen in smaller
> > chunks.  How about 128 instead? or maybe some small multiple of the
> > line length, like 1 or 2?
> 
> Yes, I believe 2k is the optimal size, or close to it. Trying a value of 
> 128 results in basically no change in performance from baseline. That 
> makes sense to me, since 128 is actually fairly close to the old value 
> for this buffering (which was five *lines*[1]; the old code measured 
> this differently).

That's strange, because I see no output at all until all of it is
available and showsn, and I thought you said the same in your OP?

> In starting on this work, I compared to the amount of data that I get 
> each time a child process filter is called; that's normally 4095 bytes 
> in my tests. Then I started a bit lower (512) and gradually doubled the 
> buffer size until I was able to get most of the performance benefit.

So why do I see only all of it, in one go?  I see this both on Windows
and on GNU/Linux, btw.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#71355; Package emacs. (Wed, 05 Jun 2024 17:37:02 GMT) Full text and rfc822 format available.

Message #26 received at 71355 <at> debbugs.gnu.org (full text, mbox):

From: Jim Porter <jporterbugs <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 71355 <at> debbugs.gnu.org, stefankangas <at> gmail.com
Subject: Re: bug#71355: 30.0.50; [PATCH] Improve performance of buffered
 output in Eshell
Date: Wed, 5 Jun 2024 10:35:08 -0700
On 6/5/2024 9:51 AM, Eli Zaretskii wrote:
>> Date: Wed, 5 Jun 2024 09:42:43 -0700
>> Cc: 71355 <at> debbugs.gnu.org, stefankangas <at> gmail.com
>> From: Jim Porter <jporterbugs <at> gmail.com>
>>
>> On 6/5/2024 5:06 AM, Eli Zaretskii wrote:
>>> Is 2K indeed the optimal size?  It is about 25 80-column lines, which
>>> is quite a lot.  "Normal" shells send output to the screen in smaller
>>> chunks.  How about 128 instead? or maybe some small multiple of the
>>> line length, like 1 or 2?
>>
>> Yes, I believe 2k is the optimal size, or close to it. Trying a value of
>> 128 results in basically no change in performance from baseline. That
>> makes sense to me, since 128 is actually fairly close to the old value
>> for this buffering (which was five *lines*[1]; the old code measured
>> this differently).
> 
> That's strange, because I see no output at all until all of it is
> available and showsn, and I thought you said the same in your OP?

Yes, without my patch that's expected. When I talk about changes in 
performance, I mean the total time to complete the command, as measured 
by, e.g. "time cat config.log".

Here's what's happening: all of the output in 'eshell/cat' occurs in a 
loop, periodically calling 'eshell-interactive-print' (how often it 
calls this depends on the buffering settings). That runs the functions 
in 'eshell-output-filter-functions', which can be fairly expensive. So 
one way to make output faster would be to optimize those functions, 
which I did in my second patch. However, a larger buffer size is still 
faster even when there are no output filter functions, due to other 
overheads in the code. So I think even if we could make 
'eshell-output-filter-functions' all very cheap, it's worth increasing 
the buffer size.

In addition to this, the performance improvements I made allowed me to 
add in the extra work of redisplaying periodically when using this 
buffered output scheme. That's all new in my patch, and previously you'd 
have to wait until the command was finished to see any output. From 
Emacs's perspective, everything in 'eshell/cat' is synchronous, so I 
needed to manually trigger the redisplay (or do some other sorcery to 
hand control back to the command loop).




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#71355; Package emacs. (Wed, 05 Jun 2024 17:59:01 GMT) Full text and rfc822 format available.

Message #29 received at 71355 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Jim Porter <jporterbugs <at> gmail.com>
Cc: 71355 <at> debbugs.gnu.org, stefankangas <at> gmail.com
Subject: Re: bug#71355: 30.0.50; [PATCH] Improve performance of buffered
 output in Eshell
Date: Wed, 05 Jun 2024 20:57:59 +0300
> Date: Wed, 5 Jun 2024 10:35:08 -0700
> Cc: 71355 <at> debbugs.gnu.org, stefankangas <at> gmail.com
> From: Jim Porter <jporterbugs <at> gmail.com>
> 
> On 6/5/2024 9:51 AM, Eli Zaretskii wrote:
> >> Date: Wed, 5 Jun 2024 09:42:43 -0700
> >> Cc: 71355 <at> debbugs.gnu.org, stefankangas <at> gmail.com
> >> From: Jim Porter <jporterbugs <at> gmail.com>
> >>
> >> On 6/5/2024 5:06 AM, Eli Zaretskii wrote:
> >>> Is 2K indeed the optimal size?  It is about 25 80-column lines, which
> >>> is quite a lot.  "Normal" shells send output to the screen in smaller
> >>> chunks.  How about 128 instead? or maybe some small multiple of the
> >>> line length, like 1 or 2?
> >>
> >> Yes, I believe 2k is the optimal size, or close to it. Trying a value of
> >> 128 results in basically no change in performance from baseline. That
> >> makes sense to me, since 128 is actually fairly close to the old value
> >> for this buffering (which was five *lines*[1]; the old code measured
> >> this differently).
> > 
> > That's strange, because I see no output at all until all of it is
> > available and showsn, and I thought you said the same in your OP?
> 
> Yes, without my patch that's expected. When I talk about changes in 
> performance, I mean the total time to complete the command, as measured 
> by, e.g. "time cat config.log".
> 
> Here's what's happening: all of the output in 'eshell/cat' occurs in a 
> loop, periodically calling 'eshell-interactive-print' (how often it 
> calls this depends on the buffering settings). That runs the functions 
> in 'eshell-output-filter-functions', which can be fairly expensive. So 
> one way to make output faster would be to optimize those functions, 
> which I did in my second patch. However, a larger buffer size is still 
> faster even when there are no output filter functions, due to other 
> overheads in the code. So I think even if we could make 
> 'eshell-output-filter-functions' all very cheap, it's worth increasing 
> the buffer size.
> 
> In addition to this, the performance improvements I made allowed me to 
> add in the extra work of redisplaying periodically when using this 
> buffered output scheme. That's all new in my patch, and previously you'd 
> have to wait until the command was finished to see any output. From 
> Emacs's perspective, everything in 'eshell/cat' is synchronous, so I 
> needed to manually trigger the redisplay (or do some other sorcery to 
> hand control back to the command loop).

I think we are miscommunicating.  I wasn't talking about performance,
I was talking about the fact that I don't see text delivered to the
screen in chunks.  You said that the current code sends text to the
screen in chunks of 5 lines, and that therefore using the value 128 is
almost the same.  But at least part of your patch calls redisplay
after each chunk (AFAIU), something that is not done with the current
code.  So I expect the effect to be a difference in behavior, whereby
test appears on the screen in chunks, and the user does not need to
wait till all of it is sent before he/she sees anything at all
displayed.

I hope I've succeeded to explain myself now.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#71355; Package emacs. (Wed, 05 Jun 2024 18:58:01 GMT) Full text and rfc822 format available.

Message #32 received at 71355 <at> debbugs.gnu.org (full text, mbox):

From: Jim Porter <jporterbugs <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 71355 <at> debbugs.gnu.org, stefankangas <at> gmail.com
Subject: Re: bug#71355: 30.0.50; [PATCH] Improve performance of buffered
 output in Eshell
Date: Wed, 5 Jun 2024 11:47:40 -0700
On 6/5/2024 10:57 AM, Eli Zaretskii wrote:
> I think we are miscommunicating.  I wasn't talking about performance,
> I was talking about the fact that I don't see text delivered to the
> screen in chunks.  You said that the current code sends text to the
> screen in chunks of 5 lines, and that therefore using the value 128 is
> almost the same.

You had asked about changing the buffer size; in my initial explanation, 
I was just trying to explain my reasoning for why a larger buffer size 
improves the overall execution time. So I was only talking about the 
performance (the redisplay changes are mostly independent from the 
buffer size setting, since I throttle the redisplays by time).

To be more precise though: the current code writes text into the 
*buffer* in chunks of 5 lines. (That text will end up on the screen at 
the next redisplay, whenever that is.) With my patch, if I set the 
buffer size to 128, it will write text into the buffer every 128 chars, 
which results in a similar number of write operations as before the 
patch. Since those write operations are what makes the current code 
slow, a 128-char buffer with my patch is similarly slow.

> But at least part of your patch calls redisplay
> after each chunk (AFAIU), something that is not done with the current
> code.  So I expect the effect to be a difference in behavior, whereby
> test appears on the screen in chunks, and the user does not need to
> wait till all of it is sent before he/she sees anything at all
> displayed.

Correct. The redisplay logic is a new behavior, and not *directly* a 
part of the performance improvements I made by increasing the buffer 
size. On the contrary, I added the redisplay logic because the buffer 
size improvements made the total execution time so much better that I 
felt Eshell could now afford the extra cost of redisplaying periodically 
for these commands.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#71355; Package emacs. (Wed, 05 Jun 2024 18:59:02 GMT) Full text and rfc822 format available.

Message #35 received at 71355 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Jim Porter <jporterbugs <at> gmail.com>
Cc: 71355 <at> debbugs.gnu.org, stefankangas <at> gmail.com
Subject: Re: bug#71355: 30.0.50; [PATCH] Improve performance of buffered
 output in Eshell
Date: Wed, 05 Jun 2024 21:58:22 +0300
> Date: Wed, 5 Jun 2024 11:47:40 -0700
> Cc: 71355 <at> debbugs.gnu.org, stefankangas <at> gmail.com
> From: Jim Porter <jporterbugs <at> gmail.com>
> 
> > But at least part of your patch calls redisplay
> > after each chunk (AFAIU), something that is not done with the current
> > code.  So I expect the effect to be a difference in behavior, whereby
> > test appears on the screen in chunks, and the user does not need to
> > wait till all of it is sent before he/she sees anything at all
> > displayed.
> 
> Correct. The redisplay logic is a new behavior, and not *directly* a 
> part of the performance improvements I made by increasing the buffer 
> size. On the contrary, I added the redisplay logic because the buffer 
> size improvements made the total execution time so much better that I 
> felt Eshell could now afford the extra cost of redisplaying periodically 
> for these commands.

So now that we agree about this aspect, I ask again: wouldn't it make
sense to show the text to the user in smaller chunks?  2K characters
is 2 dozen lines, and I expect users to be somewhat unhappy about
being presented the text in such large chunks.  That's now what they
see when invoking 'cat' from the shell prompt outside Emacs.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#71355; Package emacs. (Wed, 05 Jun 2024 20:11:02 GMT) Full text and rfc822 format available.

Message #38 received at 71355 <at> debbugs.gnu.org (full text, mbox):

From: Jim Porter <jporterbugs <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 71355 <at> debbugs.gnu.org, stefankangas <at> gmail.com
Subject: Re: bug#71355: 30.0.50; [PATCH] Improve performance of buffered
 output in Eshell
Date: Wed, 5 Jun 2024 13:07:48 -0700
On 6/5/2024 11:58 AM, Eli Zaretskii wrote:
> So now that we agree about this aspect, I ask again: wouldn't it make
> sense to show the text to the user in smaller chunks?  2K characters
> is 2 dozen lines, and I expect users to be somewhat unhappy about
> being presented the text in such large chunks.  That's now what they
> see when invoking 'cat' from the shell prompt outside Emacs.

Ok, I see what you mean. I think the thing users would be unhappy about 
is "long" periods of time between display updates. (If we flush and/or 
redisplay faster than the user's monitor refreshes, those updates are 
wasted.)

For the kinds of output that Eshell's buffered-print is designed for, we 
can get the text we want to print very quickly, so even with a buffer 
size of 2048, we flush more than 60 times a second (testing with "cat" 
and "ls" on a spinny disk). In situations where the buffering caused 
unacceptable delays, a command would either a) not use buffered output 
or b) manually flush at opportune times. I'm not sure those would come 
up in practice though.

Ultimately, in the cases where Eshell does buffered-printing now, the 
thing that limits the user seeing updates is the redisplay throttle, not 
the buffer size.

A smarter version of 'eshell-buffered-print' could flush before the 
buffer was full if enough time has passed, but that would add complexity 
without a lot of immediate benefit. (For example, would we set up a 
timer to flush? I'm not sure how that would interact with the rest of 
this code, which is all synchronous.)




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#71355; Package emacs. (Thu, 06 Jun 2024 05:23:02 GMT) Full text and rfc822 format available.

Message #41 received at 71355 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Jim Porter <jporterbugs <at> gmail.com>
Cc: 71355 <at> debbugs.gnu.org, stefankangas <at> gmail.com
Subject: Re: bug#71355: 30.0.50; [PATCH] Improve performance of buffered
 output in Eshell
Date: Thu, 06 Jun 2024 07:43:58 +0300
> Date: Wed, 5 Jun 2024 13:07:48 -0700
> Cc: 71355 <at> debbugs.gnu.org, stefankangas <at> gmail.com
> From: Jim Porter <jporterbugs <at> gmail.com>
> 
> On 6/5/2024 11:58 AM, Eli Zaretskii wrote:
> > So now that we agree about this aspect, I ask again: wouldn't it make
> > sense to show the text to the user in smaller chunks?  2K characters
> > is 2 dozen lines, and I expect users to be somewhat unhappy about
> > being presented the text in such large chunks.  That's now what they
> > see when invoking 'cat' from the shell prompt outside Emacs.
> 
> Ok, I see what you mean. I think the thing users would be unhappy about 
> is "long" periods of time between display updates. (If we flush and/or 
> redisplay faster than the user's monitor refreshes, those updates are 
> wasted.)
> 
> For the kinds of output that Eshell's buffered-print is designed for, we 
> can get the text we want to print very quickly, so even with a buffer 
> size of 2048, we flush more than 60 times a second (testing with "cat" 
> and "ls" on a spinny disk). In situations where the buffering caused 
> unacceptable delays, a command would either a) not use buffered output 
> or b) manually flush at opportune times. I'm not sure those would come 
> up in practice though.
> 
> Ultimately, in the cases where Eshell does buffered-printing now, the 
> thing that limits the user seeing updates is the redisplay throttle, not 
> the buffer size.
> 
> A smarter version of 'eshell-buffered-print' could flush before the 
> buffer was full if enough time has passed, but that would add complexity 
> without a lot of immediate benefit. (For example, would we set up a 
> timer to flush? I'm not sure how that would interact with the rest of 
> this code, which is all synchronous.)

My main point is that 'cat' is used to show the contents of a file to
the user, so the assumption is that the user _wants_ to see that
stuff.  So having the stuff flash before user's eyes in an instant is
not necessarily the best idea, even though it is faster, and thus
performance-wise we win.

If the above is agreed, and you still think 2K characters is the best
default value, I'm fine with that.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#71355; Package emacs. (Thu, 06 Jun 2024 09:22:01 GMT) Full text and rfc822 format available.

Message #44 received at 71355 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Kangas <stefankangas <at> gmail.com>
To: Jim Porter <jporterbugs <at> gmail.com>, 71355 <at> debbugs.gnu.org
Subject: Re: bug#71355: 30.0.50; [PATCH] Improve performance of buffered
 output in Eshell
Date: Thu, 6 Jun 2024 05:20:14 -0400
Jim Porter <jporterbugs <at> gmail.com> writes:

> If anyone has time to test, I'd be interested to see if the results are
> similarly good on other platforms (I'm testing on x86_64 GNU/Linux), but
> I imagine they're comparable.

0. emacs -Q
1. M-x eshell RET
2. time cat config.log RET

I did three attempts:

Before your patch:  4.520  4.687  4.609 secs
With your patch:    0.958  0.968  1.048 secs

In GNU Emacs 30.0.50 (build 3, aarch64-apple-darwin23.5.0, NS
 appkit-2487.60 Version 14.5 (Build 23F79)) of 2024-06-06 built on
 Stefans-MacBook-Pro.local
Repository revision: a525cfb3af0c49c5c64e8af548ab23d086348fed
Repository branch: master
Windowing system distributor 'Apple', version 10.3.2487
System Description:  macOS 14.5




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#71355; Package emacs. (Thu, 06 Jun 2024 18:05:02 GMT) Full text and rfc822 format available.

Message #47 received at 71355 <at> debbugs.gnu.org (full text, mbox):

From: Jim Porter <jporterbugs <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 71355 <at> debbugs.gnu.org, stefankangas <at> gmail.com
Subject: Re: bug#71355: 30.0.50; [PATCH] Improve performance of buffered
 output in Eshell
Date: Thu, 6 Jun 2024 11:02:59 -0700
On 6/5/2024 9:43 PM, Eli Zaretskii wrote:
> My main point is that 'cat' is used to show the contents of a file to
> the user, so the assumption is that the user _wants_ to see that
> stuff.  So having the stuff flash before user's eyes in an instant is
> not necessarily the best idea, even though it is faster, and thus
> performance-wise we win.

For "cat" specifically, I think what we want is just to finish as fast 
as we can so that Eshell hands control back to the Emacs command loop, 
and then the user can start examining the output. (For example, by 
pressing C-c C-r to go to the beginning of the output, or using the 
"Smart" module[1].)

> If the above is agreed, and you still think 2K characters is the best
> default value, I'm fine with that.

Agreed. I'd definitely like to improve the usability for dealing with 
long-running commands or ones that output a lot of text in Eshell. I 
think that's a separate task though. (For example, the text flashing 
before the user's eyes also happens when running the external 
/usr/bin/cat, and I haven't touched that code path here. A fix for that 
behavior would go elsewhere so that both the internal and external cats 
could benefit.)

[1] I don't use the Smart module so I don't know a ton about it, but it 
avoids scrolling the output as it comes in and stays at the beginning.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#71355; Package emacs. (Thu, 06 Jun 2024 18:07:02 GMT) Full text and rfc822 format available.

Message #50 received at 71355 <at> debbugs.gnu.org (full text, mbox):

From: Jim Porter <jporterbugs <at> gmail.com>
To: Stefan Kangas <stefankangas <at> gmail.com>, 71355 <at> debbugs.gnu.org
Subject: Re: bug#71355: 30.0.50; [PATCH] Improve performance of buffered
 output in Eshell
Date: Thu, 6 Jun 2024 11:04:57 -0700
On 6/6/2024 2:20 AM, Stefan Kangas wrote:
> Jim Porter <jporterbugs <at> gmail.com> writes:
> 
>> If anyone has time to test, I'd be interested to see if the results are
>> similarly good on other platforms (I'm testing on x86_64 GNU/Linux), but
>> I imagine they're comparable.
> 
> 0. emacs -Q
> 1. M-x eshell RET
> 2. time cat config.log RET
> 
> I did three attempts:
> 
> Before your patch:  4.520  4.687  4.609 secs
> With your patch:    0.958  0.968  1.048 secs

Thanks. Those differences aren't quite as stark as in my tests, but 4.5x 
faster is still pretty substantial, so it's good to see this improves 
things on other systems too.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#71355; Package emacs. (Thu, 06 Jun 2024 23:24:02 GMT) Full text and rfc822 format available.

Message #53 received at 71355 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Kangas <stefankangas <at> gmail.com>
To: Jim Porter <jporterbugs <at> gmail.com>, 71355 <at> debbugs.gnu.org
Subject: Re: bug#71355: 30.0.50; [PATCH] Improve performance of buffered
 output in Eshell
Date: Thu, 6 Jun 2024 19:14:44 -0400
Jim Porter <jporterbugs <at> gmail.com> writes:

>>> +(defcustom eshell-buffered-print-size 2048
>>> +  "The size of the print queue in characters, for doing buffered printing.
>>>   This is basically a speed enhancement, to avoid blocking the Lisp code
>>>   from executing while Emacs is redisplaying."
>>
>> How did you decide on this value?
>
> Basically, I tried a few different command invocations that normally
> take a while (the main ones being "cat config.log" and "ls -Al
> /usr/bin") and scaled up the value by factors of two until I stopped
> seeing any major perf improvements. (Larger values are faster still, but
> not by a whole lot.)

FWIW, I've experimented a bit on my machine, and I see the following
using the command "time cat config.log" from an initially empty eshell
buffer:

| eshell-buffered-print-size | secs       |       |
|----------------------------+------------+-------|
|                        256 | 1.922 secs |     1 |
|                        512 | 1.413 secs |  0.74 |
|                       1024 | 1.065 secs |  0.55 |
|                       2048 | 0.996 secs |  0.52 |
|                       4096 | 0.860 secs |  0.45 |
|                       8192 | 0.835 secs |  0.43 |
|                      16384 | 0.829 secs |  0.43 |

To me, these numbers seem to suggest that, at least on this system,
there is a sweet spot around 4096, but 2048 admittedly does already get
us most of the way there.  However, going above 8192 doesn't lead to any
appreciable speedup.  This is on a fast M2 machine; it would be
interesting to see some experiments on slower machines as well.

I'm assuming that we don't want to set it to some arbitrarily large
number, but do we expect any adverse affects from choosing a slightly
higher value?  If not, is there a case to be made for choosing 4096 as
the default?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#71355; Package emacs. (Fri, 07 Jun 2024 00:11:02 GMT) Full text and rfc822 format available.

Message #56 received at 71355 <at> debbugs.gnu.org (full text, mbox):

From: Jim Porter <jporterbugs <at> gmail.com>
To: Stefan Kangas <stefankangas <at> gmail.com>, 71355 <at> debbugs.gnu.org
Subject: Re: bug#71355: 30.0.50; [PATCH] Improve performance of buffered
 output in Eshell
Date: Thu, 6 Jun 2024 17:09:18 -0700
On 6/6/2024 4:14 PM, Stefan Kangas wrote:
> FWIW, I've experimented a bit on my machine, and I see the following
> using the command "time cat config.log" from an initially empty eshell
> buffer:
> 
> | eshell-buffered-print-size | secs       |       |
> |----------------------------+------------+-------|
> |                        256 | 1.922 secs |     1 |
> |                        512 | 1.413 secs |  0.74 |
> |                       1024 | 1.065 secs |  0.55 |
> |                       2048 | 0.996 secs |  0.52 |
> |                       4096 | 0.860 secs |  0.45 |
> |                       8192 | 0.835 secs |  0.43 |
> |                      16384 | 0.829 secs |  0.43 |
> 
> To me, these numbers seem to suggest that, at least on this system,
> there is a sweet spot around 4096, but 2048 admittedly does already get
> us most of the way there.  However, going above 8192 doesn't lead to any
> appreciable speedup.  This is on a fast M2 machine; it would be
> interesting to see some experiments on slower machines as well.
> 
> I'm assuming that we don't want to set it to some arbitrarily large
> number, but do we expect any adverse affects from choosing a slightly
> higher value?  If not, is there a case to be made for choosing 4096 as
> the default?

This is consistent with what I saw, but I thought it best to err on the 
side of smallness for the buffer size, just to ensure that we flush the 
buffer faster than the redisplay rate (defaulting to 40Hz), even on 
slower machines. Since the difference between 2048 and 4096 is small, I 
figured it wasn't too high a price to pay.

In your tests, just like mine, the difference between 1024 and 2048 is 
also small, but I opted for the larger 2048 because it produced a 
more-noticeable improvement when running `time ls -Al /usr/bin`.

That said, if we wanted to use a buffer size of 4096, I don't think that 
would cause any major issues. (At least on my system, when running an 
external command in Eshell, it writes 4095 bytes at a time, since that's 
how much data we get at once from the process filter. As far as I'm 
aware, it's been that way forever.)

On some level, I think any value between about 1024 and 4096 is probably 
somewhat arbitrary. They'd likely all be fine in most cases, and have 
much better performance than what we have today.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#71355; Package emacs. (Fri, 07 Jun 2024 08:53:01 GMT) Full text and rfc822 format available.

Message #59 received at 71355 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Kangas <stefankangas <at> gmail.com>
To: Jim Porter <jporterbugs <at> gmail.com>, 71355 <at> debbugs.gnu.org
Subject: Re: bug#71355: 30.0.50; [PATCH] Improve performance of buffered
 output in Eshell
Date: Fri, 7 Jun 2024 10:51:00 +0200
Jim Porter <jporterbugs <at> gmail.com> writes:

> This is consistent with what I saw, but I thought it best to err on the
> side of smallness for the buffer size, just to ensure that we flush the
> buffer faster than the redisplay rate (defaulting to 40Hz), even on
> slower machines. Since the difference between 2048 and 4096 is small, I
> figured it wasn't too high a price to pay.

Makes sense to me, thanks.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#71355; Package emacs. (Sat, 08 Jun 2024 04:27:02 GMT) Full text and rfc822 format available.

Message #62 received at 71355 <at> debbugs.gnu.org (full text, mbox):

From: Jim Porter <jporterbugs <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 71355 <at> debbugs.gnu.org, stefankangas <at> gmail.com
Subject: Re: bug#71355: 30.0.50; [PATCH] Improve performance of buffered
 output in Eshell
Date: Fri, 7 Jun 2024 21:25:16 -0700
[Message part 1 (text/plain, inline)]
Whoops, one of the intermediate changes I'd made while working on these 
patches broke a bunch of the regression tests, so here's the fixed 
version. I also renamed a few of the new variables to be clearer that 
they're for internal use only.

Since it sounds like all the other concerns have been addressed, I'll 
probably merge this in the next day or so. (This is the last non-bugfix 
change I'd like to land for Eshell in Emacs 30.)
[0001-Be-more-efficient-when-buffering-output-in-Eshell.patch (text/plain, attachment)]
[0002-Improve-implementations-of-some-Eshell-output-filter.patch (text/plain, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#71355; Package emacs. (Sat, 08 Jun 2024 07:35:01 GMT) Full text and rfc822 format available.

Message #65 received at 71355 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Kangas <stefankangas <at> gmail.com>
To: Jim Porter <jporterbugs <at> gmail.com>, Eli Zaretskii <eliz <at> gnu.org>
Cc: 71355 <at> debbugs.gnu.org
Subject: Re: bug#71355: 30.0.50; [PATCH] Improve performance of buffered
 output in Eshell
Date: Sat, 8 Jun 2024 03:33:20 -0400
Jim Porter <jporterbugs <at> gmail.com> writes:

> Since it sounds like all the other concerns have been addressed, I'll
> probably merge this in the next day or so. (This is the last non-bugfix
> change I'd like to land for Eshell in Emacs 30.)

Sounds good to me.




Reply sent to Jim Porter <jporterbugs <at> gmail.com>:
You have taken responsibility. (Sat, 08 Jun 2024 19:45:01 GMT) Full text and rfc822 format available.

Notification sent to Jim Porter <jporterbugs <at> gmail.com>:
bug acknowledged by developer. (Sat, 08 Jun 2024 19:45:02 GMT) Full text and rfc822 format available.

Message #70 received at 71355-done <at> debbugs.gnu.org (full text, mbox):

From: Jim Porter <jporterbugs <at> gmail.com>
To: Stefan Kangas <stefankangas <at> gmail.com>, Eli Zaretskii <eliz <at> gnu.org>
Cc: 71355-done <at> debbugs.gnu.org
Subject: Re: bug#71355: 30.0.50; [PATCH] Improve performance of buffered
 output in Eshell
Date: Sat, 8 Jun 2024 12:43:18 -0700
On 6/8/2024 12:33 AM, Stefan Kangas wrote:
> Jim Porter <jporterbugs <at> gmail.com> writes:
> 
>> Since it sounds like all the other concerns have been addressed, I'll
>> probably merge this in the next day or so. (This is the last non-bugfix
>> change I'd like to land for Eshell in Emacs 30.)
> 
> Sounds good to me.

I took one last look after sleeping on it, and everything seems correct 
to me, so I've now merged this to the master branch as 15f515c7a37.




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sun, 07 Jul 2024 11:24:08 GMT) Full text and rfc822 format available.

This bug report was last modified 1 year and 42 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.