GNU bug report logs - #24914
24.5; isearch-regexp: wrong error message

Package: emacs;

Reported by: Drew Adams <drew.adams <at> oracle.com>

Date: Wed, 9 Nov 2016 22:31:01 UTC

Severity: minor

Tags: confirmed, fixed, patch

Found in versions 24.5, 25.2

Fixed in version 27.1

Done: Noam Postavsky <npostavs <at> users.sourceforge.net>

Bug is archived. No further changes may be made.

From: Noam Postavsky <npostavs <at> users.sourceforge.net>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: drew.adams <at> oracle.com, 24914 <at> debbugs.gnu.org
Subject: bug#24914: 24.5; isearch-regexp: wrong error message
Date: Sat, 09 Dec 2017 21:18:05 -0500
[Message part 1 (text/plain, inline)]
Eli Zaretskii <eliz <at> gnu.org> writes:

>> I thought it would be easier to document the limit if it's fixed across
>> all machines.  Otherwise we would have to say something like "For both
>> forms, m and n, if specified, may be no larger than INT_MAX, which is
>> usually 2**31 - 1, but could be 2**63 - 1 depending on the compiler used
>> for building Emacs".
>
> Isn't int 32 bit wide everywhere?

I might have been mixing up int with long when I was thinking about
this; it seems only a few very obscure platforms have 64-bit ints.
According to [1], every platform except "HAL Computer Systems port of
Solaris to the SPARC64" and "Classic UNICOS" has 32-bit ints.

[1]: https://en.wikipedia.org/wiki/64-bit_computing#64-bit_data_models

> And anyway, since the bitmap is stored in an int, isn't INT_MAX TRT?

Unfortunately, all this discussion of int size seems to be academic.  I
took another look at the code, and there is another limit imposed by the
regexp opcode format: repetition counts are stored in two bytes.  We can
still raise the limit to 2^16-1, though.

Here is the use of RE_DUP_MAX, which makes it seem like int-size is the
main limit:

    /* Get the next unsigned number in the uncompiled pattern.  */
    #define GET_INTERVAL_COUNT(num)					\
      ...
                if (RE_DUP_MAX / 10 - (RE_DUP_MAX % 10 < c - '0') < num)	\
                  FREE_STACK_RETURN (REG_ESIZEBR);				\


    static reg_errcode_t
    regex_compile (const_re_char *pattern, size_t size,
    {
      ...
		int lower_bound = 0, upper_bound = -1;
                [...]
		GET_INTERVAL_COUNT (lower_bound);
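
To make the overflow test in GET_INTERVAL_COUNT easier to follow, here is a
minimal standalone sketch of the same check.  It is not the Emacs code:
DUP_MAX_SKETCH and parse_interval_count are hypothetical names, and the
limit value is only an assumption chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for RE_DUP_MAX; the value is an assumption
   made for this sketch, not taken from the Emacs sources.  */
#define DUP_MAX_SKETCH 65535

/* Parse a run of decimal digits at the start of S into *OUT.
   Return 0 on success, or -1 if S does not start with a digit or the
   value would exceed DUP_MAX_SKETCH.  */
static int
parse_interval_count (const char *s, int *out)
{
  int num = -1;

  assert (s != NULL && out != NULL);

  for (; *s >= '0' && *s <= '9'; s++)
    {
      int digit = *s - '0';

      if (num < 0)
        num = 0;

      /* Same shape as the check in GET_INTERVAL_COUNT: reject before
         computing num * 10 + digit, so the sum never exceeds the limit
         and the int arithmetic cannot overflow.  */
      if (DUP_MAX_SKETCH / 10 - (DUP_MAX_SKETCH % 10 < digit) < num)
        return -1;

      num = num * 10 + digit;
    }

  if (num < 0)
    return -1;

  *out = num;
  return 0;
}
```

With this limit, "65535" is accepted and "65536" is rejected, so the check
by itself bounds the count only by whatever the limit macro is set to.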

But then

			INSERT_JUMP2 (succeed_n, laststart,
				      b + 5 + nbytes,
				      lower_bound);

    /* Like `STORE_JUMP2', but for inserting.  Assume `b' is the buffer end.  */
    #define INSERT_JUMP2(op, loc, to, arg) \
      insert_op2 (op, loc, (to) - (loc) - 3, arg, b)

    /* Like `insert_op1', but for two two-byte parameters ARG1 and ARG2.  */
                                      ^^^^^^^^
    static void
    insert_op2 (re_opcode_t op, unsigned char *loc, int arg1, int arg2, unsigned char *end)
    {
      ...
      store_op2 (op, loc, arg1, arg2);
    }

    /* Like `store_op1', but for two two-byte parameters ARG1 and ARG2.  */
                                     ^^^^^^^^
    static void
    store_op2 (re_opcode_t op, unsigned char *loc, int arg1, int arg2)
    {
      *loc = (unsigned char) op;
      STORE_NUMBER (loc + 1, arg1);
      STORE_NUMBER (loc + 3, arg2);
    }

    /* Store NUMBER in two contiguous bytes starting at DESTINATION.  */
                       ^^^^^^^^^^^^^^^^^^^^

    #define STORE_NUMBER(destination, number)				\
      do {									\
        (destination)[0] = (number) & 0377;					\
        (destination)[1] = (number) >> 8;					\
      } while (0)
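
A small sketch of why the two-byte encoding caps the count: anything that
does not fit in 16 bits is silently truncated on store.  The function
store_number mirrors the STORE_NUMBER macro above; extract_unsigned and
round_trip are hypothetical helpers, and reading the two bytes back as an
unsigned value is an assumption of this sketch rather than a description
of the existing reader in the regexp code.

```c
#include <assert.h>

/* Store NUMBER in two contiguous bytes at DEST, low byte first,
   mirroring the STORE_NUMBER macro.  */
static void
store_number (unsigned char *dest, int number)
{
  dest[0] = number & 0377;
  /* The macro relies on implicit conversion to unsigned char; the cast
     here just makes the truncation explicit.  */
  dest[1] = (unsigned char) (number >> 8);
}

/* Hypothetical reader: reassemble the two bytes as an unsigned
   16-bit quantity.  */
static int
extract_unsigned (const unsigned char *src)
{
  return src[0] | (src[1] << 8);
}

/* Store a repetition count and read it back.  */
static int
round_trip (int number)
{
  unsigned char buf[2];

  assert (number >= 0);   /* repetition counts are non-negative */
  store_number (buf, number);
  return extract_unsigned (buf);
}
```

Here round_trip (65535) gives back 65535, while round_trip (65536) gives 0
and round_trip (65537) gives 1, so 2^16-1 is the largest count this
encoding can represent no matter how wide int is.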


Here is the updated patch:

[0001-Raise-limit-of-regexp-repetition-Bug-24914.patch (text/plain, attachment)]

This bug report was last modified 7 years and 119 days ago.
