GNU bug report logs - #58168
string-lessp glitches and inconsistencies

Previous Next

Package: emacs;

Reported by: Mattias Engdegård <mattias.engdegard <at> gmail.com>

Date: Thu, 29 Sep 2022 16:25:01 UTC

Severity: normal

Full log


Message #107 received at 58168 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Mattias Engdegård <mattias.engdegard <at> gmail.com>
Cc: 58168 <at> debbugs.gnu.org, larsi <at> gnus.org
Subject: Re: bug#58168: string-lessp glitches and inconsistencies
Date: Fri, 07 Oct 2022 18:33:57 +0300
> From: Mattias Engdegård <mattias.engdegard <at> gmail.com>
> Date: Fri, 7 Oct 2022 16:45:57 +0200
> Cc: larsi <at> gnus.org,
>  58168 <at> debbugs.gnu.org
> 
> > I don't think it matters much, because whatever we produce we cannot
> > be sure it will look identical to the original format string.
> 
> I agree. What about this patch then?
> 
> diff --git a/src/editfns.c b/src/editfns.c
> index c1414071c7..5a99a40656 100644
> --- a/src/editfns.c
> +++ b/src/editfns.c
> @@ -3551,10 +3551,15 @@ styled_format (ptrdiff_t nargs, Lisp_Object *args, bool message)
>  		      || float_conversion || conversion == 'i'
>  		      || conversion == 'o' || conversion == 'x'
>  		      || conversion == 'X'))
> -	    error ("Invalid format operation %%%c",
> -		   multibyte_format
> -		   ? STRING_CHAR ((unsigned char *) format - 1)
> -		   : *((unsigned char *) format - 1));
> +	    {
> +	      unsigned char *p = (unsigned char *) format - 1;
> +	      if (multibyte_format)
> +		error ("Invalid format operation %%%c", STRING_CHAR (p));
> +	      else
> +		error ((*p <= 127 ? "Invalid format operation %%%c"
> +			: "Invalid format operation char #x%02x"),
> +		       *p);
> +	    }
>  	  else if (! (FIXNUMP (arg) || ((BIGNUMP (arg) || FLOATP (arg))
>  					&& conversion != 'c')))
>  	    error ("Format specifier doesn't match argument type");

I'd prefer to show it in octal, not in hex, since that's the default
in Emacs.

> > A test suite doesn't have to
> > assume that the internals work as they should, it should just test
> > that.  So testing both sounds to me better than testing just one
> > assuming that this one covers both.
> 
> I agree, we should definitely test raw bytes inserted from both unibyte and multibyte strings. The current code doesn't do that, hence my patch.
> 
> With respect to other values there should be a reasonable qualitative difference between test cases: testing both #x80 and #xfc isn't more useful than testing the display of both the letters 'A' and 'B'. But if you think that the current code is deficient in coverage, please do propose extensions and I'll adapt the patch accordingly.

I think the more values we test the better.  The codepoints between
#x80 and #xA0 in particular are treated differently than those above
#xA0, so at least one from each range should be beneficial.




This bug report was last modified 2 years and 276 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.