GNU bug report logs - #56347
Optimize/simplify STRING_SET_MULTIBYTE

Package: emacs;

Reported by: Stefan Monnier <monnier <at> iro.umontreal.ca>

Date: Fri, 1 Jul 2022 23:33:01 UTC

Severity: wishlist

Tags: patch

Done: Stefan Monnier <monnier <at> iro.umontreal.ca>

Bug is archived. No further changes may be made.


Message #16 received at 56347 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 56347 <at> debbugs.gnu.org
Subject: Re: bug#56347: Optimize/simplify STRING_SET_MULTIBYTE
Date: Sat, 02 Jul 2022 19:24:02 +0300
> From: Stefan Monnier <monnier <at> iro.umontreal.ca>
> Cc: 56347 <at> debbugs.gnu.org
> Date: Sat, 02 Jul 2022 12:12:06 -0400
> 
> STRING_SET_MULTIBYTE is fundamentally evil because it changes the nature
> of an object.  Its current definition (like that of STRING_SET_UNIBYTE)
> is rather scary (it sometimes changes the nature of the arg passed to
> it, and sometimes replaces the arg with something else).

But do we have any alternatives?

> >> -	  /* STRING is a pure-ASCII string, so we can convert it (or,
> >> -	     rather, its copy) to multibyte and use that thereafter.  */
> >> -	  Lisp_Object string_copy = Fconcat (1, &string);
> >> -	  STRING_SET_MULTIBYTE (string_copy);
> >> -	  string = string_copy;
> >> +	  /* STRING is a pure-ASCII string, so we can treat it as multibyte.  */
> >
> > Did you actually try your change in the situations where this problem
> > pops up?
> 
> I don't even know how to go about doing that, no.

Make a character-composition rule that composes, say, two '-'
characters, and then display a buffer where you have adjacent dashes.

> > AFAIR, the code makes a copy of the string for good reasons:
> > the rest of the handling of the string down the line barfs if we
> > keep a multibyte string here.
> 
> [ I assume you meant "barfs if we keep a *uni*byte string here".  ]

Yes.

> Where?

I don't remember, sorry.

> >> -#define STRING_SET_MULTIBYTE(STR)			\
> >> -  do {							\
> >> -    if (XSTRING (STR)->u.s.size == 0)			\
> >> -      (STR) = empty_multibyte_string;			\
> >> -    else						\
> >> -      XSTRING (STR)->u.s.size_byte = XSTRING (STR)->u.s.size; \
> >> +#define STRING_SET_MULTIBYTE(STR)			    \
> >> +  do {							    \
> >> +    eassert (XSTRING (STR)->u.s.size > 0);		    \
> >> +    XSTRING (STR)->u.s.size_byte = XSTRING (STR)->u.s.size; \
> >>    } while (false)
> >>  
> >>  /* Convenience functions for dealing with Lisp strings.  */
> >
> > You want to disallow uses of empty_multibyte_string?  Why?
> 
> No, I want to reduce the scope of semantics of the macro, e.g. so it can
> be implemented as a function rather than a macro and so it doesn't
> magically substitute empty_multibyte_string into a variable that held
> something else.

But the effect is that you disallow calling STRING_SET_MULTIBYTE on an
empty string, isn't it?
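
To make the difference between the two definitions concrete, here is a minimal sketch in standalone C.  It is a toy model, not Emacs code: `struct toy_string`, `toy_empty_multibyte`, `TOY_SET_MULTIBYTE_OLD`, `toy_set_multibyte_new` and `toy_multibyte_p` are hypothetical names, the struct keeps only the two fields the macro touches, and it assumes that a negative `size_byte` marks a unibyte string.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a Lisp string (hypothetical, not struct Lisp_String).
   Assumption: size_byte < 0 means "unibyte".  */
struct toy_string
{
  ptrdiff_t size;       /* length in characters */
  ptrdiff_t size_byte;  /* length in bytes, or negative if unibyte */
};

/* Toy counterpart of empty_multibyte_string.  */
static struct toy_string toy_empty_multibyte = { 0, 0 };

/* Models the old macro: for an empty string it rebinds the caller's
   variable to the shared empty multibyte string; otherwise it
   modifies the object in place.  */
#define TOY_SET_MULTIBYTE_OLD(STR)              \
  do {                                          \
    if ((STR)->size == 0)                       \
      (STR) = &toy_empty_multibyte;             \
    else                                        \
      (STR)->size_byte = (STR)->size;           \
  } while (false)

/* Models the proposed definition: it never rebinds anything, so it
   can be an ordinary function, but it requires a non-empty string.  */
static void
toy_set_multibyte_new (struct toy_string *str)
{
  assert (str->size > 0);
  str->size_byte = str->size;
}

/* True if STR is marked multibyte in this toy model.  */
static bool
toy_multibyte_p (const struct toy_string *str)
{
  return str->size_byte >= 0;
}
```

In this model, a caller holding a pointer to an empty unibyte string ends up with a different object after `TOY_SET_MULTIBYTE_OLD`, while the original object stays unibyte.  With `toy_set_multibyte_new`, the same caller would have to check for the empty case itself before calling.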




This bug report was last modified 3 years and 9 days ago.
