GNU bug report logs - #32704
Can grep search for a line feed and a null character at the same time?

Package: grep

Reported by: 21naown <at> gmail.com

Date: Tue, 11 Sep 2018 16:27:01 UTC

Severity: wishlist

From: Eric Blake <eblake <at> redhat.com>
To: 21naown <at> gmail.com, 32704 <at> debbugs.gnu.org, Paul Eggert <eggert <at> cs.ucla.edu>
Subject: bug#32704: Can grep search for a line feed and a null character at the same time?
Date: Sat, 15 Sep 2018 15:27:08 -0500

On 9/15/18 12:57 PM, 21naown <at> gmail.com wrote:

>>> But is it at least possible to find “\x0A\x00” with grep?
>>
>> If you bend the rules by throwing -P into the mix, yes :)
>>
> So it is possible to find “\x0A\x00” alone, but for example
> “\x74\x00\x0D\x00\x0A\x00\x74\x00\x65\x00” is impossible to find with the
> “-P” option?

Correct. It is impossible to find the record terminator in the middle of 
a pattern, whether that terminator is \n (default) or NUL (-z).  It is 
therefore impossible to find a multi-record match using grep.  The 
string you listed contains both \x00 and \x0a, so regardless of which of 
those two bytes you pick as the record terminator, it is impossible to 
use grep to find that substring in your file.  You'll have to resort to 
a tool that supports multiline matching, since grep is not such a tool.
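
As a rough illustration of the point above (a minimal sketch, assuming GNU grep and a made-up two-record input), the pattern 't.e' would need the terminator to sit between "t" and "e", so it finds nothing under either choice of terminator:

```shell
# Hypothetical input: "t" and "e" separated by the record terminator.
# With the default terminator (\n) the records are "t" and "e".
newline_hits=$(printf 't\ne\n' | grep -c 't.e' || true)

# With -z the terminator is NUL, and the records are again "t" and "e".
nul_hits=$(printf 't\000e\000' | grep -z -c 't.e' || true)

# grep -c prints the number of matching records; "|| true" only hides the
# exit status 1 that grep returns when nothing matches.
printf 'newline records: %s, NUL records: %s\n' "$newline_hits" "$nul_hits"
```

Both counts should come out as 0, because grep only ever tests the pattern against one record at a time.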

It IS possible, of course, to change your data, for example:

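tr '\0' '\377' < file | grep -a "$modified_pattern" | tr '\377' '\0'

assuming that the byte 0xFF doesn't appear anywhere else in the file
(tr takes octal escapes rather than hex, hence \377, and -a stops grep
from treating the result as binary); although it 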
may make matching harder if you don't have the right record terminators 
any longer.  Or, if your input data is encoded in UTF-16, it's easiest 
to convert it into UTF-8 for the grep:

iconv -f UTF-16 -t UTF-8 < file | grep "$modified_pattern" \
  | iconv -f UTF-8 -t UTF-16
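
To make the byte-substitution idea concrete, here is a small sketch under stated assumptions: the sample file, its contents, and the variable names are all hypothetical, the 0xFF byte is written as octal \377 because tr and printf take octal escapes, and the pattern has been shortened so that it no longer spans the line feed.

```shell
# Hypothetical sample: the bytes from the question, "t\r\nte" with a NUL
# after every character (UTF-16LE without a byte order mark).
sample=$(mktemp)
printf 't\000\r\000\n\000t\000e\000' > "$sample"

# "te" is t NUL e NUL in that encoding; after mapping NUL to 0xFF the
# pattern to search for becomes t 0xFF e 0xFF.
modified_pattern=$(printf 't\377e\377')

# LC_ALL=C makes grep treat 0xFF as an ordinary byte, -a keeps it from
# reporting the input as binary, and -c counts matching records.
hits=$(tr '\000' '\377' < "$sample" | LC_ALL=C grep -a -c "$modified_pattern" || true)
printf 'matching records: %s\n' "$hits"

rm -f "$sample"
```

The line feed in the sample still splits the data into two records, so only the part of the original string after \x0A\x00 can be matched this way; the full ten-byte string remains out of reach for the reason given earlier.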

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org




This bug report was last modified 4 years and 328 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.