GNU bug report logs - #32073
Improvements in Grep

Package: grep

Reported by: Sergiu Hlihor <sh <at> discovergy.com>

Date: Fri, 6 Jul 2018 21:32:02 UTC

Severity: wishlist

From: Jim Meyering <jim <at> meyering.net>
To: Sergiu Hlihor <sh <at> discovergy.com>
Cc: 32073 <at> debbugs.gnu.org
Subject: bug#32073: Improvements in Grep
Date: Fri, 6 Jul 2018 17:33:08 -0700
On Fri, Jul 6, 2018 at 9:26 AM, Sergiu Hlihor <sh <at> discovergy.com> wrote:
> Hello,
>      I'm using grep on Ubuntu Server 14.04 (grep version 2.16). While
> grepping large files I've noticed that grep is painfully slow. The
> bottleneck seems to be the read block size, which is extremely small (it
> looks like 64KB). For large files residing on big HDD RAID arrays, such a
> request barely reaches one drive, and judging by CPU usage, grep is mostly
> idle. Based on my tests for such scenarios, a read block size of at least
> 512KB would be far more efficient, and the optimum is very likely 1MB+.
> A larger buffer would also slightly benefit SSDs, where maximum
> sequential throughput is usually achieved when reading at 256KB+ block
> sizes.
>      If this is already possible or configurable in newer versions, I'd
> appreciate some hints about which version contains it, or about how
> I can configure grep to increase the read block size.

Thanks for raising the issue.
This makes me think we should follow Coreutils' lead[0] and increase
grep's initial buffer size from 32KiB, probably to 128KiB. I will time
with the attached diff on a few systems.

[0] https://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=v8.22-103-g74ca6e84c
[grep-bufsize-increase.diff (application/octet-stream, attachment)]
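
For readers who want to try a comparable measurement themselves, here is a
minimal sketch of timing sequential reads at several block sizes. It is not
the attached diff and not how grep itself reads files. The function names
time_sequential_read and compare_block_sizes are hypothetical, and the sizes
are simply the ones mentioned in this thread (32KiB, 128KiB, 512KiB, 1MiB).
Results depend heavily on the page cache, so a file that was read recently
may show little difference between sizes.

```python
import os
import time


def time_sequential_read(path, block_size):
    """Read `path` front to back using reads of at most `block_size` bytes.

    Hypothetical helper for illustration only.
    Returns (total_bytes, elapsed_seconds, read_calls).
    """
    total = 0
    calls = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        start = time.perf_counter()
        while True:
            # os.read issues a single read on the descriptor, with no
            # user-space buffering layered on top.
            chunk = os.read(fd, block_size)
            if not chunk:
                break  # an empty result means end of file
            total += len(chunk)
            calls += 1
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return total, elapsed, calls


def compare_block_sizes(path, sizes=(32 * 1024, 128 * 1024,
                                     512 * 1024, 1024 * 1024)):
    """Time one full sequential read of `path` for each block size.

    Returns a dict mapping block size to (bytes, seconds, read_calls).
    """
    return {size: time_sequential_read(path, size) for size in sizes}
```

Calling compare_block_sizes on a large file and comparing the elapsed times
gives a rough indication of whether a larger buffer helps on a given storage
setup. Each size should be timed several times, ideally against a cold
cache, before drawing any conclusion.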

This bug report was last modified 5 years and 228 days ago.
