GNU bug report logs - #20638
BUG: standard & extended RE's don't find NUL's :-(


Package: grep

Reported by: "L. A. Walsh" <gnu <at> tlinx.org>

Date: Sun, 24 May 2015 00:06:02 UTC

Severity: normal

Tags: notabug

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.


From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Linda Walsh <gnu <at> tlinx.org>, Eric Blake <eblake <at> redhat.com>
Cc: 20638 <at> debbugs.gnu.org
Subject: bug#20638: BUG: standard & extended RE's don't find NUL's :-(
Date: Mon, 25 May 2015 08:18:56 -0700

Linda Walsh wrote:

> I had one file that it bailed on
> saying it has an invalid UTF-8 encoding -- but the line was
> recursive starting from '.' -- and it didn't name the file

That's pretty vague.  Can you reproduce that problem?  I don't observe it:

$ mkdir d
$ printf 'a\200\n' >d/f
$ printf 'b\200\n' >d/g
$ grep -r a d
Binary file d/f matches

> "-a" doesn't work, BTW:
>
> Ishtar:/tmp> grep -a '\000\000' zeros
> Ishtar:/tmp> echo $?
> 1

That's the way 'grep' has always behaved.  The regular expression '\0' matches 
the string "0", not the NUL byte.
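
A minimal sketch of that behavior, assuming GNU grep and a POSIX shell; the directory and file names are made up for illustration. It writes one file containing the digit "0" and one containing an actual NUL byte, then counts the lines each file matches against the same pattern. Newer grep releases may also warn about the stray backslash, so stderr is discarded here.

```shell
# Hypothetical scratch files; assumes GNU grep.
tmp=$(mktemp -d)
printf 'a0b\n'  > "$tmp/digit"   # the character "0" between a and b
printf 'a\0b\n' > "$tmp/nul"     # a real NUL byte between a and b

# '\0' in the pattern is treated as the ordinary character "0" ...
zero_hits=$(grep -c 'a\0b' "$tmp/digit" 2>/dev/null)

# ... so it does not match the NUL byte, even with -a (treat as text).
nul_hits=$(grep -a -c 'a\0b' "$tmp/nul" 2>/dev/null)

echo "lines matching in digit file: $zero_hits"
echo "lines matching in NUL file:   $nul_hits"

rm -rf "$tmp"
```

With GNU grep the first count should be 1 and the second 0, which is consistent with the exit status of 1 in the quoted transcript.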

> Ishtar:/tmp> grep -P '\000\000' zeros Binary file zeros matches

I don't follow this example; perhaps some text was omitted?  Anyway, -P has 
always treated files containing NUL bytes as binary files, ever since -P was 
introduced.  The behavior is the same as without -P.

> But there it is -- if grep wasn't meant to handle binary files,
> it wouldn't know to call 'zeroes' a binary file.

Obviously, grep *is* meant to handle binary files; it's documented to handle 
them in a particular way.

> how can 'shuf' claim to work on input lines yet have this allowed:
>
>    -z, --zero-terminated
> line delimiter is NUL, not newline.

I don't follow this point.  -z is a nice feature; we don't want to get rid of it.

> People argue to dumb down POSIX
> utils, because some corp wants to get a posix label but
> has a few shortcomings -- so they donate enough money and
> posix changes its rules.

I'm afraid you've gone off the deep end here.




This bug report was last modified 9 years and 363 days ago.


