GNU bug report logs - #17027
[PATCH] grep: prefer regex to DFA for ANYCHAR in non-UTF8 locales


Package: grep

Reported by: Norihiro Tanaka <noritnk <at> kcn.ne.jp>

Date: Mon, 17 Mar 2014 15:02:01 UTC

Severity: normal

Tags: patch

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.


From: help-debbugs <at> gnu.org (GNU bug Tracking System)
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: tracker <at> debbugs.gnu.org
Subject: bug#17027: closed ([PATCH] grep: prefer regex to DFA for ANYCHAR
 in non-UTF8 locales)
Date: Tue, 08 Apr 2014 04:09:02 +0000
[Message part 1 (text/plain, inline)]
Your message dated Mon, 07 Apr 2014 21:08:44 -0700
with message-id <5343764C.9020208 <at> cs.ucla.edu>
and subject line Re: bug#17027: [PATCH] grep: prefer regex to DFA for ANYCHAR in non-UTF8 locales
has caused the debbugs.gnu.org bug report #17027,
regarding [PATCH] grep: prefer regex to DFA for ANYCHAR in non-UTF8 locales
to be marked as done.

(If you believe you have received this mail in error, please contact
help-debbugs <at> gnu.org.)


-- 
17027: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=17027
GNU Bug Tracking System
Contact help-debbugs <at> gnu.org with problems
[Message part 2 (message/rfc822, inline)]
From: Norihiro Tanaka <noritnk <at> kcn.ne.jp>
To: submit <at> debbugs.gnu.org
Subject: [PATCH] grep: prefer regex to DFA for ANYCHAR in non-UTF8 locales
Date: Tue, 18 Mar 2014 00:01:05 +0900
[Message part 3 (text/plain, inline)]
Package: grep
Tags: patch

When a pattern contains ANYCHAR in a non-UTF8 locale, grep prefers the
DFA engine to the regex engine.  However, as far as I have tested, even
after applying Patch#17025, the DFA engine is slower than the regex
engine for ANYCHAR in non-UTF8 locales.

This patch prefers regex to DFA for ANYCHAR in non-UTF8 locales.

Create the test file:

$ yes abcd.abc | head -1000000 > m

Timing before applying the patch:

$ time -p env LC_ALL=ja_JP.eucJP src/grep abcd.abd m
real 1.99
user 1.75
sys 0.28

Timing after applying the patch:

$ time -p env LC_ALL=ja_JP.eucJP src/grep abcd.abd m
real 1.21
user 0.71
sys 0.46
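
The pattern in the timings above, abcd.abd, never matches the generated lines (each line is abcd.abc), so grep has to scan the whole file without finding a hit. The following is a minimal sketch for checking that on a small sample. The count_matches helper, the GREP variable, and the 1000-line sample size are assumptions made for illustration and are not part of the report or the patch.

```shell
# Hypothetical helper: count matching lines under a given locale.
count_matches() {
  # $1 = locale, $2 = pattern, $3 = file.
  # grep -c prints the number of matching lines and exits with status 1
  # when that number is zero, so the exit status is ignored here.
  env LC_ALL="$1" "${GREP:-grep}" -c -- "$2" "$3" || true
}

sample=$(mktemp)
# Same generator as in the report, but far fewer than 1000000 lines.
yes abcd.abc | head -n 1000 > "$sample"

miss=$(count_matches C abcd.abd "$sample")   # pattern used in the timings
hit=$(count_matches C abcd.abc "$sample")    # pattern that does match
echo "abcd.abd: $miss matching lines; abcd.abc: $hit matching lines"
rm -f "$sample"
```

To compare timings rather than counts, the same invocation can be run under time -p with LC_ALL=ja_JP.eucJP, as in the commands above, provided that locale is installed. Set GREP=src/grep to test a freshly built binary.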

Norihiro
[patch2.txt (text/plain, attachment)]
[Message part 5 (message/rfc822, inline)]
From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Norihiro Tanaka <noritnk <at> kcn.ne.jp>
Cc: 17027-done <at> debbugs.gnu.org
Subject: Re: bug#17027: [PATCH] grep: prefer regex to DFA for ANYCHAR in
 non-UTF8 locales
Date: Mon, 07 Apr 2014 21:08:44 -0700
Thanks for this patch too.  I pushed it into the savannah git master, 
with a slightly different commit message.


This bug report was last modified 11 years and 125 days ago.



GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.