GNU bug report logs - #18777
[PATCH] dfa: improvement for checking of multibyte character boundary

Package: grep

Reported by: Norihiro Tanaka <noritnk <at> kcn.ne.jp>

Date: Mon, 20 Oct 2014 15:05:01 UTC

Severity: normal

Tags: patch

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.


Message #20 received at 18777 <at> debbugs.gnu.org (full text, mbox):

From: Norihiro Tanaka <noritnk <at> kcn.ne.jp>
To: arnold <at> skeeve.com
Cc: eblake <at> redhat.com, 18777 <at> debbugs.gnu.org
Subject: Re: bug#18777: [PATCH] dfa: improvement for checking of multibyte
 character boundary
Date: Tue, 21 Oct 2014 22:25:21 +0900

arnold <at> skeeve.com wrote:
> I would think adding a check for '\r' would be safe and would help
> too; given that on Windows systems '\r' generally occurs just as
> frequently as '\n', it should give a nice speedup for gawk on those
> systems.

As far as I know, DFA and regex don't support multiple eolbytes such
as CR-LF, so I don't understand where we could use that change.  Grep
converts Windows text to Unix text by removing CR in advance.
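
To illustrate what removing CR in advance could look like, here is a
minimal sketch.  It is not grep's actual code: the helper name
strip_cr_before_lf is hypothetical, and it assumes that only a CR
immediately followed by LF should be dropped.  After such a pass the
matcher sees a single end-of-line byte only.

```c
#include <stddef.h>

/* Hypothetical helper, not taken from grep: remove, in place, every CR
   that is immediately followed by LF, and return the new length.
   A lone CR (not followed by LF) is kept as ordinary data.  */
static size_t
strip_cr_before_lf (char *buf, size_t len)
{
  size_t out = 0;
  for (size_t in = 0; in < len; in++)
    {
      if (buf[in] == '\r' && in + 1 < len && buf[in + 1] == '\n')
        continue;               /* drop the CR of a CR-LF pair */
      buf[out++] = buf[in];
    }
  return out;
}
```

The buffer is rewritten in place, so the caller would pass the returned
length, not the original one, to the matcher.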

BTW, although I say `newline', strictly speaking it is `eolbyte',
which may be either LF or NUL.




