#20006 - Get rid of excessive sed forks

GNU bug report logs - #20006
Get rid of excessive sed forks

Reported by: Harald Hoyer <harald <at> redhat.com>

Date: Thu, 5 Mar 2015 08:59:03 UTC

Severity: normal

Merged with 20005

Fixed in versions 2.4.6.19-f187, 2.4.6.19-aabc

Done: Pavel Raiskup <praiskup <at> redhat.com>

Bug is archived. No further changes may be made.

Message #8 received at 20006 <at> debbugs.gnu.org (full text, mbox):

From: Pavel Raiskup <praiskup <at> redhat.com>
To: libtool <at> gnu.org
Cc: Eric Blake <eblake <at> redhat.com>, 20006 <at> debbugs.gnu.org, Richard Purdie <richard.purdie <at> linuxfoundation.org>, Mike Frysinger <vapier <at> gentoo.org>
Subject: Re: Bash-specific performance by avoiding sed
Date: Mon, 05 Oct 2015 00:45:50 +0200


forcemerge 20006 20005
thanks

On Monday 09 of March 2015 18:04:34 Mike Frysinger wrote:
> On 09 Mar 2015 14:48, Eric Blake wrote:
> > On 03/09/2015 01:50 PM, Bob Friesenhahn wrote:
> > > On Mon, 9 Mar 2015, Mike Gran wrote:
> > >> I don't know if y'all saw this blogpost where a guy pushed
> > >> the sed regular expression handling into bash-specific
> > >> regular expressions when bash was available. He claims
> > >> there's a significant performance improvement because of
> > >> reduced forking.
> > >>
> > >> http://harald.hoyer.xyz/2015/03/05/libtool-getting-rid-of-180000-sed-forks/
> > >
> > > There is an issue in the libtool bug tracker regarding this.
> > >
> > > This solution only works with GNU bash. It would be good if volunteers
> > > could research to see if there are similar solutions which can work with
> > > other common shells (e.g. dash, ksh, zsh).
> >
> > For context, we're trying to speed up:
> >
> > sed_quote_subst='s|\([`"$\\]\)|\\\1|g'
> > _G_unquoted_arg=`printf '%s\n' "$1" |$SED "$sed_quote_subst"`
> >
> > How about this, which should be completely portable to XSI shells (alas,
> > it still uses ${a#b} and ${a%b} at the end, so it is not portable to
> > ancient Solaris /bin/sh):
> >
> > # func_quote STRING
> > # Escapes all \`"$ in STRING with another \, and stores that in $quoted
> > func_quote () {
> >   case $1 in
> >     *[\\\`\"\$]*)
> >       save_IFS=$IFS pre=.$1.
> >       for char in '\' '`' '"' '$'; do
> >         post= IFS=$char
> >         for part in $pre; do
> >           post=${post:+$post\\$char}$part
> >         done
> >         pre=$post
> >       done
>
> should we test the size of the string first ? i've written such raw shell
> string parsing functions before, and once you hit a certain size (like 1k+
> iirc), forking out to sed is way faster, especially when running in multibyte
> locales (like UTF8) which most people are doing nowadays.
> -mike

Well, that optimization would require a (fast) strlen()-like construct.
Anyway, the vast majority of calls to the func_quote () function will have
a short ARG, and its complexity is still "just" linear. We could optimize
later if that was a real issue.

I would like to propose a solution based on Eric's one, without using the
'${VAR%.}' and '${VAR#.}' constructs -- sounds like this could be even more
portable while it keeps almost the same speed (if we can use += it's even
faster).

I have yet another patch trying to minimize option-parser overhead (that is
focused on the POV of Richard, but that needs to be cleaned up a bit, I'll
post it hopefully tomorrow).

Any comment is welcome!

Pavel
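For readers who want to try the IFS-splitting idea discussed above, here is a minimal, self-contained sketch. It is not the patch proposed in this message. The quoted func_quote is cut off after its loops, so the tail (restoring IFS and stripping the guard dots with ${post%.} and ${post#.}) is reconstructed from Eric's description, and the temporary `set -f` with its hypothetical `restore_glob` helper variable is an added assumption to keep the unquoted `$pre` from being glob-expanded.

```shell
# func_quote STRING
# Escapes every \ ` " $ in STRING with a backslash; result goes to $quoted.
# Sketch only: the end of the function is reconstructed, not quoted.
func_quote ()
{
  case $1 in
    *[\\\`\"\$]*)
      save_IFS=$IFS
      # Guard dots ensure the string never starts or ends with a
      # delimiter, so no leading/trailing field is lost when splitting.
      pre=.$1.
      # Assumption (not in the mail): disable globbing so the unquoted
      # $pre below is only field-split, never pathname-expanded.
      case $- in
        *f*) restore_glob=: ;;
        *)   restore_glob='set +f'; set -f ;;
      esac
      # Backslash must be handled first, since later passes add backslashes.
      for char in '\' '`' '"' '$'; do
        post= IFS=$char
        # Split on $char and rejoin the pieces with \$char between them.
        for part in $pre; do
          post=${post:+$post\\$char}$part
        done
        pre=$post
      done
      IFS=$save_IFS
      $restore_glob
      # Strip the guard dots again (the XSI constructs mentioned above).
      post=${post%.}
      quoted=${post#.} ;;
    *) quoted=$1 ;;
  esac
}

func_quote 'say "hi" to $USER'
printf '%s\n' "$quoted"    # prints: say \"hi\" to \$USER
```

No process is forked per call, which is the point of the exercise. The cost is four passes over the string, so Mike's concern about long inputs still applies to this sketch.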

This bug report was last modified 9 years and 258 days ago.

