GNU bug report logs - #17027
[PATCH] grep: prefer regex to DFA for ANYCHAR in non-UTF8 locales

Package: grep

Reported by: Norihiro Tanaka <noritnk <at> kcn.ne.jp>

Date: Mon, 17 Mar 2014 15:02:01 UTC

Severity: normal

Tags: patch

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.


From: Norihiro Tanaka <noritnk <at> kcn.ne.jp>
To: 17027 <at> debbugs.gnu.org
Subject: bug#17027: [PATCH] grep: prefer regex to DFA for ANYCHAR in non-UTF8 locales
Date: Tue, 18 Mar 2014 00:01:05 +0900
Package: grep
Tags: patch

When a pattern contains ANYCHAR in a non-UTF8 locale, grep prefers the
DFA engine to regex.  However, as far as I have tested, even after
applying patch #17025, the DFA engine is slower than regex for ANYCHAR
in non-UTF8 locales.

This patch prefers regex to DFA for ANYCHAR in non-UTF8 locales.

Create the test file:

$ yes abcd.abc | head -1000000 > m

Timing before applying the patch:

$ time -p env LC_ALL=ja_JP.eucJP src/grep abcd.abd m
real 1.99
user 1.75
sys 0.28

Timing after applying the patch:

$ time -p env LC_ALL=ja_JP.eucJP src/grep abcd.abd m
real 1.21
user 0.71
sys 0.46
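
The setup above can be sketched as a small script that rebuilds the
input and confirms the pattern matches nothing, so that grep has to scan
the whole file.  This is only a sketch: the variable names grep_bin and
locale_name are illustrative and not part of the patch, and it assumes
the chosen locale is installed (grep falls back to the C locale
otherwise, which would not exercise the multibyte code path).

```shell
#!/bin/sh
# Sketch only: grep_bin and locale_name are illustrative names.
# Point grep_bin at src/grep in a build tree to test a patched binary.
grep_bin=${grep_bin:-grep}
locale_name=${locale_name:-ja_JP.eucJP}

# Rebuild the 1,000,000-line input used in the report.
yes abcd.abc | head -1000000 > m

# Every line is "abcd.abc", so the pattern abcd.abd can never match.
# grep -c still prints the count, but exits with status 1 when the
# count is zero, hence the "|| true".
matches=$(env LC_ALL="$locale_name" "$grep_bin" -c abcd.abd m) || true
echo "matching lines: $matches"
```

To compare the two builds, run the same grep command under "time -p" as
shown above, once with each binary.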

Norihiro
[patch2.txt (text/plain, attachment)]

This bug report was last modified 11 years and 125 days ago.
