GNU bug report logs - #18777
[PATCH] dfa: improvement for checking of multibyte character boundary


Package: grep;

Reported by: Norihiro Tanaka <noritnk <at> kcn.ne.jp>

Date: Mon, 20 Oct 2014 15:05:01 UTC

Severity: normal

Tags: patch

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.



Message #38 received at 18777 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Norihiro Tanaka <noritnk <at> kcn.ne.jp>
Cc: Eric Blake <eblake <at> redhat.com>, 18777 <at> debbugs.gnu.org
Subject: Re: bug#18777: [PATCH] dfa: improvement for checking of multibyte
 character boundary
Date: Tue, 16 Dec 2014 09:12:21 -0800

On 12/16/2014 04:42 AM, Norihiro Tanaka wrote:
> Thanks for the review and suggestion.  If using_utf8 () is true, we can
> set always_character_boundary to true except 0x80-0xbf.

Even better, thanks.
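
To make the UTF-8 point concrete, here is a minimal sketch of such a table. The names always_boundary, init_utf8_boundary_table and starts_character are hypothetical stand-ins for illustration, not the identifiers or layout used in dfa.c; the sketch assumes only that the locale is UTF-8, where the trailing bytes of a multibyte character are exactly 0x80-0xBF.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical table modeled on the always_character_boundary idea
   discussed above; not the actual dfa.c data structure.  */
static bool always_boundary[UCHAR_MAX + 1];

/* Fill the table for a UTF-8 locale.  Only the continuation bytes
   0x80..0xBF can appear in the middle of a character, so every other
   byte value always sits at a character boundary.  */
static void
init_utf8_boundary_table (void)
{
  for (int b = 0; b <= UCHAR_MAX; b++)
    always_boundary[b] = ! (0x80 <= b && b <= 0xBF);
}

/* Return true if the byte at P is known to start a character,
   without looking at any earlier bytes.  */
static bool
starts_character (unsigned char const *p)
{
  return always_boundary[*p];
}
```

With a table like this, a boundary check on most bytes is a single lookup; only a byte in 0x80-0xBF requires rescanning from an earlier known boundary.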


>> This won't assign anything to *WCP, contrary to the documented API for
>> skip_remains_mb.  This is OK (as callers don't care) but the API
>> documentation should be changed to reflect the actual behavior.
> Oh!  If WCP is needed, we must go through step by step, as a wide
> character before P is set to *WCP.  I fixed it and updated the API
> documentation.

This part of the patch does too much work, as the caller inspects *WCP 
only when skip_remains_mb returns a value not equal to p.  So there's no 
need for the "wcp == NULL &&" test in the patch. Instead, the documented 
API can change, saying that *WCP is assigned to only if WCP is non-NULL 
and the result is greater than p.
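
The proposed contract can be sketched as follows. This is an illustration under stated assumptions, not the code in dfa.c: skip_to_boundary and decode_utf8 are hypothetical names, the decoder handles UTF-8 only and consumes malformed or truncated input one byte at a time, and unsigned long stands in for the wide-character type. The fast path returns P without touching *WCP, which the contract allows because the result is then not greater than P.

```c
#include <stddef.h>

/* Hypothetical helper: decode one UTF-8 character at S (S < END),
   store its code point in *CP and return its length in bytes.
   Simplified: malformed or truncated input is consumed one byte at
   a time, with *CP set to that byte's value.  */
static size_t
decode_utf8 (unsigned char const *s, unsigned char const *end,
             unsigned long *cp)
{
  size_t len = (*s < 0x80 ? 1
                : (*s & 0xE0) == 0xC0 ? 2
                : (*s & 0xF0) == 0xE0 ? 3
                : (*s & 0xF8) == 0xF0 ? 4 : 1);
  if (len == 1 || (size_t) (end - s) < len)
    {
      *cp = *s;
      return 1;
    }
  unsigned long c = *s & (0xFF >> (len + 1));
  for (size_t i = 1; i < len; i++)
    {
      if ((s[i] & 0xC0) != 0x80)
        {
          *cp = *s;
          return 1;
        }
      c = c << 6 | (s[i] & 0x3F);
    }
  *cp = c;
  return len;
}

/* MBP is a known character boundary at or before P, and P < END.
   Return the first character boundary at or after P.  *WCP is
   assigned only if WCP is non-null and the result is greater than P;
   it then holds the character that straddles P.  */
static unsigned char const *
skip_to_boundary (unsigned char const *p, unsigned char const *mbp,
                  unsigned char const *end, unsigned long *wcp)
{
  /* Fast path: a non-continuation byte always starts a character,
     so P is already a boundary and *WCP is left alone.  */
  if ((*p & 0xC0) != 0x80)
    return p;

  unsigned long wc = 0;
  while (mbp < p)
    mbp += decode_utf8 (mbp, end, &wc);
  if (wcp && p < mbp)
    *wcp = wc;
  return mbp;
}
```

A caller that follows this contract compares the result with P first and reads *WCP only when the two differ, so the fast path needs no test of whether WCP is null.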




This bug report was last modified 9 years and 75 days ago.


