GNU bug report logs - #16842
[PATCH] Use mbrtowc_cache in DFA engine

Package: grep;

Reported by: Norihiro Tanaka <noritnk <at> kcn.ne.jp>

Date: Sat, 22 Feb 2014 15:47:01 UTC

Severity: normal

Tags: patch

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.

From: help-debbugs <at> gnu.org (GNU bug Tracking System)
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: tracker <at> debbugs.gnu.org
Subject: bug#16842: closed ([PATCH] Use mbrtowc_cache in DFA engine)
Date: Fri, 28 Mar 2014 16:37:02 +0000
[Message part 1 (text/plain, inline)]
Your message dated Fri, 28 Mar 2014 09:36:11 -0700
with message-id <5335A4FB.2000302 <at> cs.ucla.edu>
and subject line Re: bug#16842: [PATCH] Use mbrtowc_cache in DFA engine
has caused the debbugs.gnu.org bug report #16842,
regarding [PATCH] Use mbrtowc_cache in DFA engine
to be marked as done.

(If you believe you have received this mail in error, please contact
help-debbugs <at> gnu.org.)


-- 
16842: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=16842
GNU Bug Tracking System
Contact help-debbugs <at> gnu.org with problems
[Message part 2 (message/rfc822, inline)]
From: Norihiro Tanaka <noritnk <at> kcn.ne.jp>
To: submit <at> debbugs.gnu.org
Subject: [PATCH] Use mbrtowc_cache in DFA engine
Date: Sun, 23 Feb 2014 00:46:27 +0900
[Message part 3 (text/plain, inline)]
Package: grep
Tags: patch

This patch is the DFA version of patch #16544, "Optimazation for is_mb_middle".
It improves performance of the DFA engine in non-UTF-8 locales.
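
To illustrate why a per-byte cache helps, here is a minimal sketch in C. It is not the attached patch: the names byte_cache, fill_byte_cache and decode_char are hypothetical, and it assumes decoding always starts at a character boundary. Bytes that form a complete character on their own are answered from a 256-entry table, so mbrtowc is called only for the remaining bytes.

```c
#include <limits.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical per-byte table (not grep's actual data structure):
   entry B holds the wide character for a byte that is a complete
   character by itself, and WEOF otherwise.  */
static wint_t byte_cache[UCHAR_MAX + 1];

/* Fill the table once, after the locale has been selected.  */
static void
fill_byte_cache (void)
{
  for (int i = 0; i <= UCHAR_MAX; i++)
    {
      char c = (char) i;
      wchar_t wc = 0;
      mbstate_t st;
      memset (&st, 0, sizeof st);
      /* mbrtowc returns 0 or 1 only when the single byte is complete.  */
      byte_cache[i] = mbrtowc (&wc, &c, 1, &st) <= 1 ? (wint_t) wc : WEOF;
    }
}

/* Decode one character from S, which has N > 0 bytes available.
   Store the result in *PWC and return the number of bytes consumed.  */
static size_t
decode_char (wint_t *pwc, const char *s, size_t n)
{
  wint_t cached = byte_cache[(unsigned char) *s];
  if (cached != WEOF)
    {
      *pwc = cached;
      return 1;                 /* fast path: no mbrtowc call */
    }

  wchar_t wc = 0;
  mbstate_t st;
  memset (&st, 0, sizeof st);
  size_t len = mbrtowc (&wc, s, n, &st);
  if (len == (size_t) -1 || len == (size_t) -2 || len == 0)
    {
      *pwc = WEOF;              /* invalid or truncated: consume one byte */
      return 1;
    }
  *pwc = (wint_t) wc;
  return len;
}
```

With input like the test files below, which consist almost entirely of ASCII bytes, nearly every lookup in such a scheme would take the fast path.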

I tested with the commands below.  In both cases, the speed-up was 3-3.5x.

$ yes $(printf '%078dm' 0)|head -1000000 > in
$ for i in `seq 5`; do env LC_ALL=ja_JP.eucJP time src/grep n in; done

$ yes jjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjj | head -1000000 > k
$ for i in `seq 5`; do env LC_ALL=ja_JP.eucJP time src/grep -i foobar k; done

Norihiro
[use_mb_cache_in_dfa.txt (application/octet-stream, attachment)]
[tests.txt (application/octet-stream, attachment)]
[Message part 6 (message/rfc822, inline)]
From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Norihiro Tanaka <noritnk <at> kcn.ne.jp>
Cc: 16842-done <at> debbugs.gnu.org
Subject: Re: bug#16842: [PATCH] Use mbrtowc_cache in DFA engine
Date: Fri, 28 Mar 2014 09:36:11 -0700
[Message part 7 (text/plain, inline)]
Thanks for the review and the fixes.  I found a couple more things. 
First, it's not portable to cast wint_t * to wchar_t *, since the 
pointed-to types might be different sizes or representations. Second, we 
can put the cache directly in the struct dfa, saving the overhead of 
doing a separate malloc.
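
As a rough illustration of both points, here is a sketch in C. It is not the attached patch: struct toy_dfa and toy_dfa_new are hypothetical stand-ins, and only the member name mbrtowc_cache is borrowed from the subject line. The cache is an array member, so it is allocated together with the struct, and mbrtowc writes into a real wchar_t whose value is then converted to wint_t.

```c
#include <limits.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical stand-in for the matcher state; not grep's struct dfa.  */
struct toy_dfa
{
  /* Embedded by value, so allocating the struct also allocates the
     cache.  A 'wint_t *' member would instead need its own malloc
     and an extra pointer dereference on each lookup.  */
  wint_t mbrtowc_cache[UCHAR_MAX + 1];
};

/* Allocate a toy_dfa and fill its cache for the current locale.
   Return NULL if memory is exhausted.  */
static struct toy_dfa *
toy_dfa_new (void)
{
  struct toy_dfa *d = malloc (sizeof *d);   /* the only allocation */
  if (!d)
    return NULL;

  for (int i = 0; i <= UCHAR_MAX; i++)
    {
      char c = (char) i;
      wchar_t wc = 0;           /* a genuine wchar_t for mbrtowc */
      mbstate_t st;
      memset (&st, 0, sizeof st);
      /* Not portable:
           mbrtowc ((wchar_t *) &d->mbrtowc_cache[i], &c, 1, &st)
         because wint_t and wchar_t may differ in size or
         representation.  Portable: convert the value afterwards.  */
      d->mbrtowc_cache[i]
        = mbrtowc (&wc, &c, 1, &st) <= 1 ? (wint_t) wc : WEOF;
    }
  return d;
}
```

A caller would release everything with a single free (d), since no separately allocated cache remains.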

The attached further patch should address these problems.  I pushed 
this, along with the earlier two patches in this sequence, and am 
marking this as done.


[0003-dfa-avoid-an-indirection-and-port-wint_t-usage.patch (text/x-patch, attachment)]

This bug report was last modified 11 years and 54 days ago.

