GNU bug report logs -
#20638
BUG: standard & extended RE's don't find NUL's :-(
Reported by: "L. A. Walsh" <gnu <at> tlinx.org>
Date: Sun, 24 May 2015 00:06:02 UTC
Severity: normal
Tags: notabug
Done: Paul Eggert <eggert <at> cs.ucla.edu>
Bug is archived. No further changes may be made.
Message #14 received at 20638 <at> debbugs.gnu.org (full text, mbox):
Linda Walsh wrote:
> I had one file that it bailed on
> saying it has an invalid UTF-8 encoding -- but the line was
> recursive starting from '.' -- and it didn't name the file
That's pretty vague. Can you reproduce that problem? I don't observe it:
$ mkdir d
$ printf 'a\200\n' >d/f
$ printf 'b\200\n' >d/g
$ grep -r a d
Binary file d/f matches
> "-a" doesn't work, BTW:
>
> Ishtar:/tmp> grep -a '\000\000' zeros
> Ishtar:/tmp> echo $?
> 1
That's the way 'grep' has always behaved. The regular expression '\0' matches
the string "0", not the NUL byte.
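To make that concrete, here is a rough sketch, assuming GNU grep and a POSIX tr. The scratch directory and the file names "zeros" and "digits" are made up for illustration. It shows the pattern finding six literal "0" characters but not two NUL bytes, and then one regex-free way to count NUL bytes.

```shell
# Sketch only: assumes GNU grep and a POSIX tr; file names are hypothetical.
export LC_ALL=C
workdir=$(mktemp -d)
printf 'x\000\000y\n' > "$workdir/zeros"    # two real NUL bytes
printf 'x000000y\n'   > "$workdir/digits"   # six ASCII "0" characters

# '\000\000' is read as the text "000000", so it finds the digits...
# (stderr is dropped because newer releases may warn about the stray backslash)
digit_hits=$(grep -a -c '\000\000' "$workdir/digits" 2>/dev/null) || true
# ...and does not find the NUL bytes.
nul_hits=$(grep -a -c '\000\000' "$workdir/zeros" 2>/dev/null) || true

# One way to count NUL bytes without a regular expression:
# delete every byte that is not NUL, then count what is left.
nul_count=$(tr -cd '\000' < "$workdir/zeros" | wc -c)

echo "digits: $digit_hits, zeros: $nul_hits, NUL bytes: $((nul_count))"
rm -rf "$workdir"
```

The tr approach is only one option; it avoids the question of how a given regex flavor spells NUL.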
> Ishtar:/tmp> grep -P '\000\000' zeros
> Binary file zeros matches
I don't follow this example; perhaps some text was omitted? Anyway, -P has
always treated files containing zeros as binary files too, ever since -P was
introduced. The behavior is the same as without -P.
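A small sketch of the binary-file handling, assuming GNU grep. The scratch file name is hypothetical, and the exact wording and output stream of the notice vary between releases, so both streams are captured. It uses an ordinary pattern without -P, and contrasts the default notice with the -a output.

```shell
# Sketch only: assumes GNU grep; the scratch file name is hypothetical.
export LC_ALL=C
workdir=$(mktemp -d)
printf 'needle\000\n' > "$workdir/zeros"

# Without -a, a match in a file containing NUL is reported only as a notice.
# Newer releases print the notice on stderr, so capture both streams.
notice=$(grep needle "$workdir/zeros" 2>&1) || true

# With -a the file is treated as text and the matching line is printed,
# NUL byte included; tr strips it here so the shell variable stays clean.
line=$(grep -a needle "$workdir/zeros" | tr -d '\000')

printf '%s\n%s\n' "$notice" "$line"
rm -rf "$workdir"
```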
> But there it is -- if grep wasn't meant to handle binary files,
> it wouldn't know to call 'zeroes' a binary file.
Obviously, grep *is* meant to handle binary files; it's documented to handle
them in a particular way.
> how can 'shuf' claim to work on input lines yet have this allowed:
>
> -z, --zero-terminated
> line delimiter is NUL, not newline.
I don't follow this point. -z is a nice feature; we don't want to get rid of it.
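For reference, a minimal sketch of what -z buys you, assuming GNU coreutils shuf. The sample records are made up. Because each record ends with NUL, a record can itself contain a newline and still be shuffled as one unit.

```shell
# Sketch only: assumes GNU coreutils shuf; the sample records are made up.
export LC_ALL=C

# Three NUL-terminated records; the second one contains an embedded newline.
# After shuffling, turn embedded newlines into spaces and NULs into newlines
# so the result can be shown one record per line, then sort for a stable view.
records=$(printf 'one\000two\nlines\000three\000' |
  shuf -z |
  tr '\n\000' ' \n' |
  sort)

printf '%s\n' "$records"
```

The sort at the end is there only to make the output repeatable; drop it to see the shuffled order.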
> People argue to dumb down POSIX
> utils, because some corp wants to get a posix label but
> has a few shortcomings -- so they donate enough money and
> posix changes it's rules.
I'm afraid you've gone off the deep end here.
This bug report was last modified 9 years and 363 days ago.