Package: coreutils;
Reported by: Ole Tange <tange <at> gnu.org>
Date: Sat, 7 Mar 2015 11:50:02 UTC
Severity: normal
Done: Pádraig Brady <P <at> draigBrady.com>
Bug is archived. No further changes may be made.
Message #26 received at 20029 <at> debbugs.gnu.org (full text, mbox):
From: Giuseppe Scrivano <gscrivano <at> gnu.org> To: Pádraig Brady <P <at> draigBrady.com> Cc: 20029 <at> debbugs.gnu.org, Ole Tange <tange <at> gnu.org> Subject: Re: bug#20029: 'yes' surprisingly slow Date: Tue, 10 Mar 2015 01:31:36 +0100
Pádraig Brady <P <at> draigBrady.com> writes: > The attached should make things more efficient here. > > thanks, > Pádraig. > > > From 7959bbf19307705e98f08cfa32a9dcf67672590c Mon Sep 17 00:00:00 2001 > From: =?UTF-8?q?P=C3=A1draig=20Brady?= <P <at> draigBrady.com> > Date: Mon, 9 Mar 2015 19:27:32 +0000 > Subject: [PATCH] yes: output data more efficiently > > yes(1) may be used to generate repeating patterns of text > for test inputs etc., so adjust to be more efficient. > > Profiling the case where yes(1) is outputting small items > through stdio (which was the default case), shows the overhead > continuously processing small items in main() and in stdio: > > $ yes >/dev/null & perf top -p $! > 31.02% yes [.] main > 27.36% libc-2.20.so [.] _IO_file_xsputn@@GLIBC_2.2.5 > 14.51% libc-2.20.so [.] fputs_unlocked > 13.50% libc-2.20.so [.] strlen > 10.66% libc-2.20.so [.] __GI___mempcpy > 1.98% yes [.] fputs_unlocked <at> plta > > Sending more data per stdio call improves the situation, > but still, there is significant stdio overhead due to memory copies, > and the repeated string length checking: > > $ yes "`echo {1..1000}`" >/dev/null & perf top -p $! > 42.26% libc-2.20.so [.] __GI___mempcpy > 17.38% libc-2.20.so [.] strlen > 5.21% [kernel] [k] __srcu_read_lock > 4.58% [kernel] [k] __srcu_read_unlock > 4.27% libc-2.20.so [.] _IO_file_xsputn@@GLIBC_2.2.5 > 2.50% libc-2.20.so [.] __GI___libc_write > 2.45% [kernel] [k] system_call > 2.40% [kernel] [k] system_call_after_swapgs > 2.27% [kernel] [k] vfs_write > 2.09% libc-2.20.so [.] _IO_do_write@@GLIBC_2.2.5 > 2.01% [kernel] [k] fsnotify > 1.95% libc-2.20.so [.] _IO_file_write@@GLIBC_2.2.5 > 1.44% yes [.] main > > We can avoid all stdio overhead by building up the buffer > _once_ and outputting that, and the profile below shows > the bottleneck moved to the kernel: > > $ src/yes >/dev/null & perf top -p $! > 15.42% [kernel] [k] __srcu_read_lock > 12.98% [kernel] [k] __srcu_read_unlock > 9.41% libc-2.20.so [.] __GI___libc_write > 9.11% [kernel] [k] vfs_write > 8.35% [kernel] [k] fsnotify > 8.02% [kernel] [k] system_call > 5.84% [kernel] [k] system_call_after_swapgs > 4.54% [kernel] [k] __fget_light > 3.98% [kernel] [k] sys_write > 3.65% [kernel] [k] selinux_file_permission > 3.44% [kernel] [k] rw_verify_area > 2.94% [kernel] [k] __fsnotify_parent > 2.76% [kernel] [k] security_file_permission > 2.39% yes [.] main > 2.17% [kernel] [k] __fdget_pos > 2.13% [kernel] [k] sysret_check > 0.81% [kernel] [k] write_null > 0.36% yes [.] write <at> plt > > Note this change also ensures that yes will only write complete lines > for lines softer than BUFSIZ. > > * src/yes.c (main): Build up a BUFSIZ buffer of lines, > and output that, rather than having stdio process each item. > * tests/misc/yes.sh: Add a new test for various buffer sizes. > * tests/local.mk: Reference the new test. > Fixes http://bugs.gnu.org/20029 > --- > src/yes.c | 43 +++++++++++++++++++++++++++++++++++++++++-- > tests/local.mk | 1 + > tests/misc/yes.sh | 28 ++++++++++++++++++++++++++++ > 3 files changed, 70 insertions(+), 2 deletions(-) > create mode 100755 tests/misc/yes.sh > > diff --git a/src/yes.c b/src/yes.c > index b35b13f..91dea11 100644 > --- a/src/yes.c > +++ b/src/yes.c > @@ -58,6 +58,10 @@ Repeatedly output a line with all specified STRING(s), or 'y'.\n\ > int > main (int argc, char **argv) > { > + char buf[BUFSIZ]; > + char *pbuf = buf; > + int i; > + > initialize_main (&argc, &argv); > set_program_name (argv[0]); > setlocale (LC_ALL, ""); > @@ -77,9 +81,44 @@ main (int argc, char **argv) > argv[argc++] = bad_cast ("y"); > } > > - while (true) > + /* Buffer data locally once, rather than having the > + large overhead of stdio buffering each item. */ > + for (i = optind; i < argc; i++) > + { > + size_t len = strlen (argv[i]); > + if (BUFSIZ < len || BUFSIZ - len <= pbuf - buf) > + break; > + memcpy (pbuf, argv[i], len); > + pbuf += len; > + *pbuf++ = i == argc - 1 ? '\n' : ' '; > + } > + if (i < argc) > + pbuf = NULL; since the buffer is partly filled, wouldn't be better to reuse it? Something like this (barely tested): diff --git a/src/yes.c b/src/yes.c index 91dea11..ac690ce 100644 --- a/src/yes.c +++ b/src/yes.c @@ -92,9 +92,7 @@ main (int argc, char **argv) pbuf += len; *pbuf++ = i == argc - 1 ? '\n' : ' '; } - if (i < argc) - pbuf = NULL; - else + if (i == argc) { size_t line_len = pbuf - buf; size_t lines = BUFSIZ / line_len; @@ -106,7 +104,7 @@ main (int argc, char **argv) } /* The normal case is to continuously output the local buffer. */ - while (pbuf) + while (i == argc) { if (write (STDOUT_FILENO, buf, pbuf - buf) == -1) { @@ -117,11 +115,17 @@ main (int argc, char **argv) /* If the data doesn't fit in BUFSIZ then it's large and not too onerous to output using stdio in any case. */ - while (! pbuf) + while (i != argc) { - for (i = optind; i < argc; i++) - if (fputs (argv[i], stdout) == EOF - || putchar (i == argc - 1 ? '\n' : ' ') == EOF) + int j; + if ((pbuf - buf) && fwrite (buf, pbuf - buf, 1, stdout) == 0) + { + error (0, errno, _("standard output")); + return EXIT_FAILURE; + } + for (j = i; j < argc; j++) + if (fputs (argv[j], stdout) == EOF + || putchar (j == argc - 1 ? '\n' : ' ') == EOF) { error (0, errno, _("standard output")); return EXIT_FAILURE; The difference seems to be (without the patch): $ time src/yes `echo {1..2000}` | head -c 2000M | md5sum 55c293324aa6ecce14f0bf30da5a4686 - real 0m7.994s user 0m11.093s sys 0m2.953s versus (with the patch): $ time src/yes `echo {1..2000}` | head -c 2000M | md5sum 55c293324aa6ecce14f0bf30da5a4686 - real 0m3.534s user 0m4.164s sys 0m1.803s Regards, Giuseppe
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.