GNU bug report logs - #16842
[PATCH] Use mbrtowc_cache in DFA engine

Package: grep

Reported by: Norihiro Tanaka <noritnk <at> kcn.ne.jp>

Date: Sat, 22 Feb 2014 15:47:01 UTC

Severity: normal

Tags: patch

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.


From: help-debbugs <at> gnu.org (GNU bug Tracking System)
To: Norihiro Tanaka <noritnk <at> kcn.ne.jp>
Subject: bug#16842: closed (Re: bug#16842: [PATCH] Use mbrtowc_cache in
 DFA engine)
Date: Fri, 28 Mar 2014 16:37:03 +0000
[Message part 1 (text/plain, inline)]
Your bug report

#16842: [PATCH] Use mbrtowc_cache in DFA engine

which was filed against the grep package, has been closed.

The explanation is attached below, along with your original report.
If you require more details, please reply to 16842 <at> debbugs.gnu.org.

-- 
16842: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=16842
GNU Bug Tracking System
Contact help-debbugs <at> gnu.org with problems
[Message part 2 (message/rfc822, inline)]
From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Norihiro Tanaka <noritnk <at> kcn.ne.jp>
Cc: 16842-done <at> debbugs.gnu.org
Subject: Re: bug#16842: [PATCH] Use mbrtowc_cache in DFA engine
Date: Fri, 28 Mar 2014 09:36:11 -0700
[Message part 3 (text/plain, inline)]
Thanks for the review and the fixes.  I found a couple more things. 
First, it's not portable to cast wint_t * to wchar_t *, since the 
pointed-to types might be different sizes or representations. Second, we 
can put the cache directly in the struct dfa, saving the overhead of 
doing a separate malloc.
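
A minimal sketch of those two points, under stated assumptions; it is not the pushed patch. The names `struct demo_dfa`, `cache`, and `demo_fetch` are hypothetical. The cache is an array member of the struct rather than a pointer to separately allocated storage, and a `wint_t` is turned into a `wchar_t` by converting the value instead of reinterpreting its address.

```c
#include <limits.h>
#include <wchar.h>

/* Hypothetical stand-in for struct dfa; only the cache member is shown.  */
struct demo_dfa
{
  /* Embedded array: no separate malloc and no extra pointer dereference.  */
  wint_t cache[UCHAR_MAX + 1];
};

/* Non-portable, because wint_t and wchar_t may differ in size or
   representation:
     wint_t wi = d->cache[b];
     wchar_t *p = (wchar_t *) &wi;
   Portable: convert the value, not the pointer.  */
static int
demo_fetch (struct demo_dfa const *d, unsigned char b, wchar_t *out)
{
  wint_t wi = d->cache[b];
  if (wi == WEOF)
    return 0;                   /* no cached single-byte character */
  *out = (wchar_t) wi;          /* value conversion */
  return 1;
}
```

A caller would test the return value before reading `*out`, since `WEOF` entries leave it untouched.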

The attached further patch should address these problems.  I pushed 
this, along with the earlier two patches in this sequence, and am 
marking this as done.


[0003-dfa-avoid-an-indirection-and-port-wint_t-usage.patch (text/x-patch, attachment)]
[Message part 5 (message/rfc822, inline)]
From: Norihiro Tanaka <noritnk <at> kcn.ne.jp>
To: submit <at> debbugs.gnu.org
Subject: [PATCH] Use mbrtowc_cache in DFA engine
Date: Sun, 23 Feb 2014 00:46:27 +0900
[Message part 6 (text/plain, inline)]
Package: grep
Tags: patch

This patch is the DFA version of patch #16544, "Optimization for is_mb_middle".
It improves performance for non-UTF-8 locales in the DFA engine.
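
For readers unfamiliar with the idea, here is a rough sketch of what a per-byte `mbrtowc` cache can look like. It is an illustration under assumptions, not the attached patch: `demo_cache`, `demo_build_cache`, and `demo_decode` are hypothetical names. Each of the 256 byte values is converted once; bytes that form a complete character on their own are stored, and all others are marked `WEOF` so that the caller falls back to a real `mbrtowc` call.

```c
#include <limits.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical table: one entry per possible byte value.  */
static wint_t demo_cache[UCHAR_MAX + 1];

/* Fill the table for the current locale.  */
static void
demo_build_cache (void)
{
  for (int i = 0; i <= UCHAR_MAX; i++)
    {
      char c = (char) i;
      wchar_t wc = 0;
      mbstate_t s;
      memset (&s, 0, sizeof s);
      size_t n = mbrtowc (&wc, &c, 1, &s);
      /* n is 0 (NUL) or 1 for a complete single-byte character,
         and (size_t) -1 or (size_t) -2 otherwise.  */
      demo_cache[i] = n <= 1 ? (wint_t) wc : WEOF;
    }
}

/* Decode one character at P, with N > 0 bytes available.
   Store the character (or WEOF on a decoding problem) in *OUT
   and return the number of bytes consumed, at least 1.  */
static size_t
demo_decode (char const *p, size_t n, wint_t *out, mbstate_t *s)
{
  wint_t wi = demo_cache[(unsigned char) *p];
  if (wi != WEOF)
    {
      *out = wi;                /* fast path: table lookup only */
      return 1;
    }
  wchar_t wc = 0;
  size_t len = mbrtowc (&wc, p, n, s);
  if (len == 0 || len == (size_t) -1 || len == (size_t) -2)
    {
      memset (s, 0, sizeof *s); /* reset state and skip one byte */
      *out = WEOF;
      return 1;
    }
  *out = (wint_t) wc;
  return len;
}
```

In this sketch the table must be rebuilt after any locale change, because its contents depend on the locale in effect when `demo_build_cache` runs.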

I tested with the commands below.  In both cases, the speed-up was 3-3.5x.

$ yes $(printf '%078dm' 0)|head -1000000 > in
$ for i in `seq 5`; do env LC_ALL=ja_JP.eucJP time src/grep n in; done

$ yes jjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjj | head -1000000 > k
$ for i in `seq 5`; do env LC_ALL=ja_JP.eucJP time src/grep -i foobar k; done

Norihiro
[use_mb_cache_in_dfa.txt (application/octet-stream, attachment)]
[tests.txt (application/octet-stream, attachment)]

This bug report was last modified 11 years and 53 days ago.
