#65416 - Feature request: include first line of file in output

GNU bug report logs - #65416
Feature request: include first line of file in output

Package: grep

Reported by: Daniel Green <ddgreen <at> gmail.com>

Date: Mon, 21 Aug 2023 07:16:02 UTC

Severity: wishlist

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.

Message #26 received at 65416 <at> debbugs.gnu.org:

From: Daniel Green <ddgreen <at> gmail.com>
To: arnold <at> skeeve.com
Cc: 65416 <at> debbugs.gnu.org
Subject: Re: bug#65416: Feature request: include first line of file in output
Date: Tue, 22 Aug 2023 22:12:25 -0400


I don't have access to a newer gawk where I did the initial timings, but I
ran an almost identical test on my home machine.

grep (v3.11): ~0.60s
perl (v5.38.0): ~3.21s
gawk (v4.0.2 built from source with `-O3 -march=native`): ~10.22s
gawk (v5.2.2 built from source with `-O3 -march=native`): ~4.95s

If grep will never add this functionality I'll survive, it just seemed like
it might not be too much work to implement, and would probably still be
much faster than using awk/perl. I've never looked at the grep source code
before, but could be tempted to try implementing it myself if there was any
chance of the patch being accepted.

Dan

On Mon, Aug 21, 2023 at 2:37 PM <arnold <at> skeeve.com> wrote:

> Gawk 4.0.2 is 11 years old. Try timing the current version,
> I'll bet it's faster. And it solves your problem NOW,
> instead of waiting for a feature that the grep developers
> aren't likely to add.
>
> My two cents of course.
>
> Arnold
>
> Daniel Green <ddgreen <at> gmail.com> wrote:
>
> > That works, as well as the Perl version I've been using:
> >
> > perl -ne 'print if ($. == 1 || /pattern/)'
> >
> > But timings for a real-life example (3GB file with ~16m lines, CentOS 7)
> > show the problem:
> >
> > grep (v2.20): ~1.15s
> > perl (v5.36.1): ~4.48s
> > awk (v4.0.2): ~10.81s
> >
> > Admittedly grep is just searching in those timings, but I suspect it could
> > accomplish the full task with a minimal decrease in speed.
> >
> > Dan
> >
> > On Mon, Aug 21, 2023 at 12:57 PM <arnold <at> skeeve.com> wrote:
> >
> > > Daniel Green <ddgreen <at> gmail.com> wrote:
> > >
> > > > I'm frequently searching CSV files with 20-30 columns, and when there's a
> > > > hit it can be hard to know what the columns are. An option to also print
> > > > the first line of a file (either always, or only if that file had a match
> > > > to the pattern) in addition to any hits would be nice.
> > > >
> > > > Thanks,
> > > > Dan
> > >
> > > It sounds like awk would be a better tool:
> > >
> > > awk 'FNR == 1 || /pattern/' files ...
> > >
> > > should do the trick.
> > >
> > > HTH,
> > >
> > > Arnold
> > >
> >
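
The original request distinguishes two behaviours: print the first line always, or print it only when the file has at least one match. The awk and Perl one-liners quoted in the thread cover the first case. A minimal Python sketch of both cases follows. The function name `header_and_matches` and the `always_header` parameter are hypothetical, chosen for illustration only; this is not how grep, gawk, or Perl implement anything, and it makes no claim about speed.

```python
import re
from typing import Iterable, Iterator


def header_and_matches(lines: Iterable[str], pattern: str,
                       always_header: bool = False) -> Iterator[str]:
    """Yield the first line of one file, then every later line matching pattern.

    Assumption: the first line is a header and is never tested against the
    pattern itself. With always_header=False the header is emitted only once
    the first matching line is found; with always_header=True it is emitted
    even when nothing matches.
    """
    regex = re.compile(pattern)
    header = None
    header_sent = False
    for number, line in enumerate(lines, start=1):
        if number == 1:
            header = line
            if always_header:
                header_sent = True
                yield header
            continue
        if regex.search(line):
            if not header_sent:
                # Emit the header lazily, just before the first hit.
                header_sent = True
                yield header
            yield line


if __name__ == "__main__":
    # Illustrative in-memory "CSV file"; real use would pass an open file object.
    rows = ["id,name,city\n", "1,Ann,Oslo\n", "2,Bob,Rome\n"]
    print("".join(header_and_matches(rows, "Rome")), end="")
```

To process several files, call the function once per open file, so the line counter restarts for each file in the same way awk's `FNR` does.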


This bug report was last modified 1 year and 321 days ago.

