GNU bug report logs - #24071
[PATCH] Refactor regex character class parsing in [:name:]


Package: emacs

Reported by: Michal Nazarewicz <mina86 <at> mina86.com>

Date: Mon, 25 Jul 2016 22:55:02 UTC

Severity: wishlist

Tags: patch

Done: Michal Nazarewicz <mina86 <at> mina86.com>

Bug is archived. No further changes may be made.


From: Michal Nazarewicz <mina86 <at> mina86.com>
To: Eli Zaretskii <eliz <at> gnu.org>, Dima Kogan <lists <at> dima.secretsauce.net>
Cc: 24071 <at> debbugs.gnu.org
Subject: bug#24071: [PATCH] Refactor regex character class parsing in [:name:]
Date: Wed, 27 Jul 2016 17:29:04 +0200
On Tue, Jul 26 2016, Eli Zaretskii wrote:
>> From: Michal Nazarewicz <mina86 <at> mina86.com>
>> Date: Tue, 26 Jul 2016 00:54:05 +0200
>> 
>> re_wctype function is used in three separate places and in all of
>> those places almost exact code extracting the name from [:name:]
>> surrounds it.  Furthermore, re_wctype requires a NUL-terminated
>> string, so the name of the character class is copied to a temporary
>> buffer.
>> 
>> The code duplication and unnecessary memory copying can be avoided by
>> pushing the responsibility of parsing the whole [:name:] sequence to
>> the function.
>> 
>> Furthermore, since now the function has access to the length of the
>> character class name (since it’s doing the parsing), it can take
>> advantage of that information in skipping some string comparisons and
>> using a constant-length memcmp instead of strcmp which needs to take
>> care of NUL bytes.
>
> Thanks.
>
> If we are going to make some serious refactoring in regex.c, I think
> we should start with having a test suite for it.

I agree.  Which is why I started test/src/regex-tests.el¹.  Since this
patch touches only character classes I limited the tests to character
classes.

¹ In fact, the bug I fixed with the previous patch was discovered
precisely because I was writing tests for this patch.
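
For readers without the patch at hand, the idea from the quoted
description (let one function parse the whole [:name:] sequence, then
use the known name length to pick constant-length memcmp calls) can be
sketched roughly as below.  This is only an illustration under stated
assumptions, not the code from the patch: the name parse_char_class,
the cc_class enum and the reduced list of classes are all hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical class identifiers, for illustration only; the real
   enum in regex.c may differ.  */
typedef enum
{
  CC_ERROR, CC_ALNUM, CC_ALPHA, CC_DIGIT,
  CC_LOWER, CC_SPACE, CC_UPPER, CC_XDIGIT
} cc_class;

/* Parse a "[:name:]" sequence at *STRP, reading at most LIMIT bytes.
   On success, advance *STRP past the closing ']' and return the class.
   On failure, leave *STRP untouched and return CC_ERROR.  The input
   does not need to be NUL-terminated and nothing is copied.  */
static cc_class
parse_char_class (const char **strp, size_t limit)
{
  const char *beg = *strp;
  const char *end;
  size_t len;
  cc_class cls = CC_ERROR;

  if (limit < 4 || beg[0] != '[' || beg[1] != ':')
    return CC_ERROR;
  beg += 2;
  limit -= 2;

  /* Find the ':' closing the name; a ']' must follow within bounds.  */
  end = memchr (beg, ':', limit);
  if (end == NULL || (size_t) (end - beg) + 1 >= limit || end[1] != ']')
    return CC_ERROR;
  len = (size_t) (end - beg);

  /* Inside each case the length is already known to match, so the
     comparison length is a compile-time constant.  */
#define NAME_IS(lit) (memcmp (beg, lit, sizeof (lit) - 1) == 0)
  switch (len)
    {
    case 5:
      if (NAME_IS ("alnum"))      cls = CC_ALNUM;
      else if (NAME_IS ("alpha")) cls = CC_ALPHA;
      else if (NAME_IS ("digit")) cls = CC_DIGIT;
      else if (NAME_IS ("lower")) cls = CC_LOWER;
      else if (NAME_IS ("space")) cls = CC_SPACE;
      else if (NAME_IS ("upper")) cls = CC_UPPER;
      break;
    case 6:
      if (NAME_IS ("xdigit"))     cls = CC_XDIGIT;
      break;
    default:
      break;
    }
#undef NAME_IS

  if (cls != CC_ERROR)
    *strp = end + 2;
  return cls;
}
```

A caller would pass a pointer to the '[' together with the number of
bytes left in the pattern, and treat CC_ERROR as "not a character
class here".  Switching on the length first means names of the wrong
length are rejected without any string comparison at all.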

> The dima_regex_embedded_modifiers branch, created by Dima Kogan
> (CC'ed) in the Emacs repository includes a suite taken from glibc.
> Dima, could you perhaps merge the parts of the test suite that can
> already be used to the master branch, so that we could use them to
> verify changes in regex.c?

This looks relatively straightforward;  I can take care of it.  I’ll
send a link to the result soon.

-- 
Best regards
ミハウ “𝓶𝓲𝓷𝓪86” ナザレヴイツ
«If at first you don’t succeed, give up skydiving»




This bug report was last modified 8 years and 351 days ago.


