GNU bug report logs - #32073: Improvements in Grep
On Fri, Jul 6, 2018 at 9:26 AM, Sergiu Hlihor <sh <at> discovergy.com> wrote:
> Hello,
> I'm using grep on Ubuntu Server 14.04 (grep version 2.16). While
> grepping large files I've noticed that grep is painfully slow. The
> bottleneck seems to be the read block size, which is extremely small (it
> looks like 64KB). For large files residing on big HDD RAID arrays, a
> request of this size barely reaches one drive and, judging by CPU usage,
> grep is mostly idle. In my tests of such scenarios, a read block size of
> at least 512KB would be far more efficient, and the optimum is very likely
> 1MB+. A larger buffer would also slightly benefit SSDs, where maximum
> sequential throughput is usually achieved when reading at 256KB+ block
> sizes.
> If this is already possible or configurable in newer versions, I'd
> appreciate some hints about which version contains the change or how
> I can configure grep to increase the read block size.
Thanks for raising the issue.
This makes me think we should follow Coreutils' lead[0] and increase
grep's initial buffer size from 32KiB, probably to 128KiB. I will time
grep with the attached diff applied on a few systems.
[0] https://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=v8.22-103-g74ca6e84c
[grep-bufsize-increase.diff (application/octet-stream, attachment)]
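For anyone who wants to compare read block sizes on their own hardware
independently of grep, here is a minimal sketch in Python. It is not the
attached diff and not grep's code; the function names
(time_sequential_read, compare_block_sizes) and the list of sizes are
assumptions made for illustration. It only times unbuffered sequential
reads, so it says nothing about grep's matching cost.

```python
import os
import tempfile
import time


def time_sequential_read(path, block_size):
    """Read `path` start to finish in requests of `block_size` bytes.

    Returns (total_bytes_read, elapsed_seconds).
    """
    total = 0
    start = time.perf_counter()
    # buffering=0 yields a raw file object, so each read() call issues
    # at most one underlying read request of the given size.
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(block_size)
            if not chunk:  # empty bytes object means end of file
                break
            total += len(chunk)
    return total, time.perf_counter() - start


def compare_block_sizes(path, sizes_kib=(32, 64, 128, 512, 1024)):
    """Time a full sequential read of `path` once per block size (in KiB)."""
    results = {}
    for kib in sizes_kib:
        results[kib] = time_sequential_read(path, kib * 1024)
    return results


if __name__ == "__main__":
    # Hypothetical 8 MiB scratch file, just to show the call pattern.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(os.urandom(8 * 1024 * 1024))
        scratch = tmp.name
    try:
        for kib, (total, elapsed) in compare_block_sizes(scratch).items():
            print(f"{kib:5d} KiB blocks: {total} bytes in {elapsed:.4f} s")
    finally:
        os.remove(scratch)
```

A small scratch file like the one above will be served from the page
cache, so the numbers it prints are not meaningful for disk throughput.
For a real comparison, point compare_block_sizes at a file much larger
than RAM (or drop caches between runs) on the storage in question.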
This bug report was last modified 5 years and 228 days ago.