GNU bug report logs -
#20768
RFC: Multithreaded grep
On Tue, Jun 09, 2015 at 12:04:11PM +0100, Aaron Crane wrote:
>Zev Weiss <zev <at> bewilderbeest.net> wrote:
>> Hmm -- I picked --parallel largely for consistency with the corresponding
>> flag for coreutils' sort, which strikes me as a closer relative to grep than
>> either make or parallel.
>
>That's a good point; I wasn't aware of sort's --parallel option.
>Though I also note that "sort --parallel=4" limits the number of
>threads to 4, rather than increasing the number of threads from 1 to
>4, so the comparison isn't exact.
>
>> sort doesn't
>> have a matching short option though, so I went with -M to suggest
>> "multithreaded" (since, as you point out, -P is already in use). Though I
>> notice now that lower-case -p is still available; perhaps that might be
>> better than -M.
>
>I'm a little unhappy about the idea of proliferating the world's set
>of short options in this space, to be honest. If grep didn't already
>have -P, I'd be happy enough with -P and either --parallel or
>--max-procs, but I'm not terribly fond of the idea of introducing
>either -M or -p.
>
>--
>Aaron Crane ** http://aaroncrane.co.uk/

True, I suppose that's a reasonable concern (especially given how many
short options grep already has). My thought was that, at least for me
(and it sounds like perhaps Paul as well), this would be a commonly used
option, so I'd like a nice concise way of enabling it. With sort
there's no real downside to enabling multithreading by default, so
a longopt-only flag is fine. With grep, however (at least with my
current implementation), there are tradeoffs with output ordering that
may be undesirable, and I don't see a good way around them without
introducing a bunch of potentially complicated and performance-reducing
per-file output buffering, so I kept it off by default.
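
To make the buffering tradeoff concrete, here is a minimal sketch of what per-file output buffering could look like. It is not taken from the patch: the names (struct job, search_job, search_ordered) are hypothetical, in-memory strings stand in for file contents, a plain substring test stands in for the real matcher, and it starts one thread per input rather than a fixed pool. It assumes POSIX threads and open_memstream.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical job record: one per input "file". */
struct job {
  const char *name;    /* label printed before each matching line */
  const char *text;    /* in-memory stand-in for the file contents */
  const char *needle;  /* fixed string to search for */
  char *buf;           /* private output buffer, filled by the worker */
  size_t len;          /* number of bytes in buf */
};

/* Worker: writes matches for one input into its own memory stream. */
static void *
search_job (void *arg)
{
  struct job *j = arg;
  j->buf = NULL;
  j->len = 0;
  FILE *out = open_memstream (&j->buf, &j->len);
  if (!out)
    return NULL;

  size_t needle_len = strlen (j->needle);
  const char *line = j->text;
  while (*line)
    {
      const char *eol = strchr (line, '\n');
      size_t n = eol ? (size_t) (eol - line) : strlen (line);
      /* The first hit at or after LINE decides whether this line matches. */
      const char *hit = strstr (line, j->needle);
      if (hit && hit + needle_len <= line + n)
        fprintf (out, "%s:%.*s\n", j->name, (int) n, line);
      line += n;
      if (*line == '\n')
        line++;
    }
  fclose (out);   /* finalizes j->buf and j->len */
  return NULL;
}

/* Searches all jobs concurrently, then emits the buffers in input order. */
static int
search_ordered (struct job *jobs, size_t njobs, FILE *out)
{
  if (njobs == 0)
    return 0;
  pthread_t *tids = malloc (njobs * sizeof *tids);
  if (!tids)
    return -1;

  size_t started = 0;
  for (; started < njobs; started++)
    if (pthread_create (&tids[started], NULL, search_job, &jobs[started]) != 0)
      break;

  /* Joining in index order is what restores deterministic output. */
  for (size_t i = 0; i < started; i++)
    {
      pthread_join (tids[i], NULL);
      if (jobs[i].buf)
        {
          fwrite (jobs[i].buf, 1, jobs[i].len, out);
          free (jobs[i].buf);
          jobs[i].buf = NULL;
        }
    }
  free (tids);
  return started == njobs ? 0 : -1;
}
```

The costs mentioned above are visible even in this toy version: every match is copied once into a buffer and again to the real output, nothing from a later input can be printed until all earlier inputs have finished, and memory use grows with the amount of matching output.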

There's also the question of the argument parsing mentioned in my
original email -- as it stands now, '-M' would be the only short option
with an optional argument, which has the potential to be confusing.
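
As an illustration of why an optional argument on a short option can surprise people, here is a small sketch using glibc's getopt_long. The helper name parse_parallel, its return convention, and the 1024 cap are hypothetical and not from the patch; only the -M and --parallel spellings come from this thread. With GNU getopt, an optional argument must be attached to the option (-M4 or --parallel=4); a separate word is treated as an ordinary operand.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <getopt.h>
#include <stdlib.h>

/* Hypothetical helper, not taken from the patch.
   Returns the thread count requested on the command line:
      1  if neither -M nor --parallel was given,
      0  if the option was given without a count (caller picks a default),
      N  if a count N was attached,
     -1  on an unknown option or a malformed count.
   *first_operand receives the index of the first non-option argument. */
static int
parse_parallel (int argc, char **argv, int *first_operand)
{
  static const struct option longopts[] = {
    {"parallel", optional_argument, NULL, 'M'},
    {NULL, 0, NULL, 0}
  };
  int threads = 1;
  int c;

  assert (first_operand != NULL);
  optind = 0;   /* glibc: force getopt to reinitialize its state */
  opterr = 0;   /* keep the sketch quiet on errors */

  /* "M::" is the GNU extension for a short option with an optional argument. */
  while ((c = getopt_long (argc, argv, "M::", longopts, NULL)) != -1)
    {
      if (c != 'M')
        return -1;
      if (optarg == NULL)
        threads = 0;            /* bare -M or --parallel */
      else
        {
          char *end;
          long n = strtol (optarg, &end, 10);
          /* 1024 is an arbitrary sanity cap chosen for this sketch. */
          if (end == optarg || *end != '\0' || n < 1 || n > 1024)
            return -1;
          threads = (int) n;
        }
    }
  *first_operand = optind;
  return threads;
}
```

Under these assumptions, "grep -M4 foo" asks for four threads and searches for "foo", whereas "grep -M 4 foo" takes -M without a count and treats "4" as the pattern, which is the kind of confusion at issue.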

Thinking about it a bit more, I realize that what I really want out of
the short flag is just a shorter way to say --parallel=NUMCPUS (without
having to remember how many CPUs the machine I'm on has), so perhaps
another possibility on that front would be to leave the long option
as-is but have the short flag (assuming there is one) not take an
argument (though I suppose that could be seen as confusing in its own
way too).

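
For reference, a rough sketch of how an argument-less short flag could pick its thread count. The function name default_thread_count is hypothetical; sysconf(_SC_NPROCESSORS_ONLN) is not required by POSIX but is available on glibc, the BSDs, and macOS, and this sketch assumes one of those platforms.

```c
#include <unistd.h>

/* Hypothetical helper: number of threads to use when the flag is given
   without an explicit count.  Falls back to 1 if the query fails. */
static long
default_thread_count (void)
{
  long n = sysconf (_SC_NPROCESSORS_ONLN);   /* CPUs currently online */
  return n > 0 ? n : 1;
}
```

With something like this, the short flag would behave like --parallel=NUMCPUS on whatever machine it runs on, while the long option keeps its explicit count.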
Zev