GNU bug report logs - #43598
replace-in-string: finishing touches

Package: emacs;

Reported by: Mattias Engdegård <mattiase <at> acm.org>

Date: Thu, 24 Sep 2020 20:53:02 UTC

Severity: normal

Done: Mattias Engdegård <mattiase <at> acm.org>

Bug is archived. No further changes may be made.


Message #41 received at 43598 <at> debbugs.gnu.org (full text, mbox):

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Mattias Engdegård <mattiase <at> acm.org>
Cc: 43598 <at> debbugs.gnu.org
Subject: Re: bug#43598: replace-in-string: finishing touches
Date: Sun, 27 Sep 2020 00:44:38 +0200

Mattias Engdegård <mattiase <at> acm.org> writes:

> We should also consider the optimisations:
> - If SCHARS(needle)>SCHARS(haystack) then no match is possible.

I've now done this.

> - If either needle or haystack is all-ASCII (all bytes in 0..127),
> then we can use memmem without conversion.

I thought that surely there'd be a function like that in Emacs, but I
can't find one.

Instead there's code like

          && (STRING_MULTIBYTE (string)
              ? (chars == bytes) : string_ascii_p (string))
[...]
/* Whether STRING only contains chars in the 0..127 range.  */
static bool
string_ascii_p (Lisp_Object string)
{
  ptrdiff_t nbytes = SBYTES (string);
  for (ptrdiff_t i = 0; i < nbytes; i++)
    if (SREF (string, i) > 127)
      return false;
  return true;
}

and

          unsigned char *p = SDATA (name);
          while (*p && ASCII_CHAR_P (*p))
            p++;

sprinkled around the code base.

Would it make sense to add a new utility function that does the right
thing for both multibyte and unibyte strings?  (The multibyte case is
just chars == bytes, but the unibyte case would be a loop.)
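
As a rough sketch of what such a utility might look like, here is a
standalone model in plain C.  The struct model_string and the function
model_string_ascii_p are hypothetical names invented for illustration;
they only stand in for Lisp_Object, SBYTES, SCHARS and STRING_MULTIBYTE
and are not actual Emacs code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an Emacs string; this is NOT the real
   Lisp_Object representation, just enough state to show the idea.  */
struct model_string
{
  const unsigned char *data;  /* raw bytes of the string */
  ptrdiff_t nbytes;           /* length in bytes (cf. SBYTES) */
  ptrdiff_t nchars;           /* length in characters (cf. SCHARS) */
  bool multibyte;             /* cf. STRING_MULTIBYTE */
};

/* Whether S only contains chars in the 0..127 range, for both
   multibyte and unibyte strings.  */
static bool
model_string_ascii_p (const struct model_string *s)
{
  /* Multibyte: a non-ASCII character occupies more than one byte,
     so the string is all-ASCII exactly when the counts agree.  */
  if (s->multibyte)
    return s->nchars == s->nbytes;

  /* Unibyte: the counts always agree, so scan the bytes.  */
  for (ptrdiff_t i = 0; i < s->nbytes; i++)
    if (s->data[i] > 127)
      return false;
  return true;
}
```

The multibyte branch is constant-time, so only unibyte strings pay for
the loop.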

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




This bug report was last modified 4 years and 295 days ago.
