From unknown Tue Jun 17 20:14:48 2025 Content-Disposition: inline Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-Mailer: MIME-tools 5.509 (Entity 5.509) Content-Type: text/plain; charset=utf-8 From: bug#43439 <43439@debbugs.gnu.org> To: bug#43439 <43439@debbugs.gnu.org> Subject: Status: [PATCH] doprnt improvements Reply-To: bug#43439 <43439@debbugs.gnu.org> Date: Wed, 18 Jun 2025 03:14:48 +0000 retitle 43439 [PATCH] doprnt improvements reassign 43439 emacs submitter 43439 Paul Eggert severity 43439 normal tag 43439 patch thanks From debbugs-submit-bounces@debbugs.gnu.org Tue Sep 15 21:51:10 2020 Received: (at submit) by debbugs.gnu.org; 16 Sep 2020 01:51:10 +0000 Received: from localhost ([127.0.0.1]:32788 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kIMbF-0003vU-IE for submit@debbugs.gnu.org; Tue, 15 Sep 2020 21:51:10 -0400 Received: from lists.gnu.org ([209.51.188.17]:55378) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kIMb9-0003vF-MF for submit@debbugs.gnu.org; Tue, 15 Sep 2020 21:51:07 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:41940) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kIMb9-0007vf-Dd for bug-gnu-emacs@gnu.org; Tue, 15 Sep 2020 21:51:03 -0400 Received: from zimbra.cs.ucla.edu ([131.179.128.68]:41204) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kIMb6-0001NC-2X for bug-gnu-emacs@gnu.org; Tue, 15 Sep 2020 21:51:02 -0400 Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id DE6EB1600F3 for ; Tue, 15 Sep 2020 18:50:56 -0700 (PDT) Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10032) with ESMTP id KRDrXpZrnvBq; Tue, 15 Sep 2020 18:50:54 -0700 (PDT) Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id E5B4C1600F1; Tue, 15 Sep 2020 18:50:54 -0700 (PDT) X-Virus-Scanned: amavisd-new at zimbra.cs.ucla.edu Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id sUtAa8JBDtXA; Tue, 15 Sep 2020 18:50:54 -0700 (PDT) Received: from day.example.com (cpe-75-82-69-226.socal.res.rr.com [75.82.69.226]) by zimbra.cs.ucla.edu (Postfix) with ESMTPSA id B6DD11600DE; Tue, 15 Sep 2020 18:50:54 -0700 (PDT) From: Paul Eggert To: bug-gnu-emacs@gnu.org Subject: [PATCH] doprnt improvements Date: Tue, 15 Sep 2020 18:50:51 -0700 Message-Id: <20200916015051.20517-1-eggert@cs.ucla.edu> X-Mailer: git-send-email 2.17.1 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Received-SPF: pass client-ip=131.179.128.68; envelope-from=eggert@cs.ucla.edu; helo=zimbra.cs.ucla.edu X-detected-operating-system: by eggs.gnu.org: First seen = 2020/09/15 21:50:57 X-ACL-Warn: Detected OS = Linux 3.1-3.10 [fuzzy] X-Spam_score_int: -41 X-Spam_score: -4.2 X-Spam_bar: ---- X-Spam_report: (-4.2 / 5.0 requ) BAYES_00=-1.9, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-Spam-Score: -1.3 (-) X-Debbugs-Envelope-To: submit Cc: Paul Eggert X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -2.3 (--) Improve doprnt performance, internal checking, and internal documentation. On my platform (Ubuntu 18.04.5 x86-64), this improved CPU speed of =E2=80=98make -C lisp compile-always=E2=80=99 by 6%= . This patch implements some of my suggestions in Bug#8545, with further changes suggested by Eli Zaretskii. * src/doprnt.c: Improve comments. (SIZE_BOUND_EXTRA): Now at top level, for parse_format_integer. (parse_format_integer): New static function, containing some of the old doprint. Fix a bug that caused doprnt to infloop on formats like "%10s" that Emacs does not use. We could simplify doprnt further if we dropped support for these never-used formats. (doprnt): Omit FORMAT_END argument, since it=E2=80=99s always NULL, which means doprnt must call strlen on FORMAT; doing this means doprnt needs just one pass over FORMAT, not two. All callers changed. Assume C99 to make code clearer. Do not use malloc or alloca to allocate a copy of the format FMTCPY; instead, use a small fixed-size array FMTSTAR, and use '*' in that array to represent width and precision, passing them as separate int arguments. Use eassume to pacify GCC in switch statements. Drop support for "%S" which is never used and which would cause GCC to warn anyway. * src/lisp.h (doprnt): Give it ATTRIBUTE_FORMAT_PRINTF (3, 0), since GCC can grok doprnt's new API. --- src/doprnt.c | 223 ++++++++++++++++++++++++++------------------------- src/lisp.h | 4 +- src/sysdep.c | 2 +- src/xdisp.c | 5 +- 4 files changed, 117 insertions(+), 117 deletions(-) diff --git a/src/doprnt.c b/src/doprnt.c index b0ba12552b..f154578c0d 100644 --- a/src/doprnt.c +++ b/src/doprnt.c @@ -28,6 +28,7 @@ . For %s and %c, when field width is specified (e.g., %25s), it accou= nts for the display width of each character, according to char-width-table.= That is, it does not assume that each character takes one column on disp= lay. + Nor does it assume that each character is a single byte. =20 . If the size of the buffer is not enough to produce the formatted st= ring in its entirety, it makes sure that truncation does not chop the last @@ -42,38 +43,41 @@ Emacs can handle. =20 OTOH, this function supports only a small subset of the standard C fo= rmatted - output facilities. E.g., %u and %ll are not supported, and precision= is - ignored %s and %c conversions. (See below for the detailed documenta= tion of - what is supported.) However, this is okay, as this function is suppo= sed to - be called from `error' and similar functions, and thus does not need = to - support features beyond those in `Fformat_message', which is used - by `error' on the Lisp level. */ + output facilities. E.g., %u is not supported, precision is ignored + in %s and %c conversions, and although %lld often works it is not + supported and code should use something like %"pM"d with intmax_t ins= tead. + (See below for the detailed documentation of what is supported.) + However, this is okay, as this function is supposed to be called + from 'error' and similar C functions, and thus does not need to + support all the features of 'Fformat_message', which is used by the + Lisp 'error' function. */ =20 /* In the FORMAT argument this function supports ` and ' as directives that output left and right quotes as per =E2=80=98text-quoting style=E2= =80=99. It also supports the following %-sequences: =20 %s means print a string argument. - %S is treated as %s, for loose compatibility with `Fformat_message'. %d means print a `signed int' argument in decimal. %o means print an `unsigned int' argument in octal. %x means print an `unsigned int' argument in hex. %e means print a `double' argument in exponential notation. %f means print a `double' argument in decimal-point notation. %g means print a `double' argument in exponential notation - or in decimal-point notation, whichever uses fewer characters. + or in decimal-point notation, depending on the value; + this is often (though not always) the shorter of the two notations %c means print a `signed int' argument as a single character. %% means produce a literal % character. =20 - A %-sequence may contain optional flag, width, and precision specifie= rs, and - a length modifier, as follows: + A %-sequence other than %% may contain optional flags, width, precisi= on, + and length, as follows: =20 %character =20 where flags is [+ -0], width is [0-9]+, precision is .[0-9]+, and len= gth is empty or l or the value of the pD or pI or PRIdMAX (sans "d") macr= os. - Also, %% in a format stands for a single % in the output. A % that - does not introduce a valid %-sequence causes undefined behavior. + A % that does not introduce a valid %-sequence causes undefined behav= ior. + ASCII bytes in FORMAT other than % are copied through as-is; + non-ASCII bytes should not appear in FORMAT. =20 The + flag character inserts a + before any positive number, while a = space inserts a space before any positive number; these flags only affect %= d, %o, @@ -99,7 +103,9 @@ =20 For %e, %f, and %g sequences, the number after the "." in the precisi= on specifier says how many decimal places to show; if zero, the decimal = point - itself is omitted. For %s and %S, the precision specifier is ignored= . */ + itself is omitted. For %d, %o, and %x sequences, the precision speci= fies + the minimum number of digits to appear. Precision specifiers are + not supported for other %-sequences. */ =20 #include #include @@ -115,9 +121,29 @@ another macro. */ #include "character.h" =20 -/* Generate output from a format-spec FORMAT, - terminated at position FORMAT_END. - (*FORMAT_END is not part of the format, but must exist and be readabl= e.) +/* Enough to handle floating point formats with large numbers. */ +enum { SIZE_BOUND_EXTRA =3D DBL_MAX_10_EXP + 50 }; + +/* Parse FMT as an unsigned decimal integer, putting its value into *VAL= UE. + Return the address of the first byte after the integer. + If FMT is not an integer, return FMT and store zero into *VALUE. */ +static char const * +parse_format_integer (char const *fmt, int *value) +{ + int n =3D 0; + bool overflow =3D false; + for (; '0' <=3D *fmt && *fmt <=3D '9'; fmt++) + { + overflow |=3D INT_MULTIPLY_WRAPV (n, 10, &n); + overflow |=3D INT_ADD_WRAPV (n, *fmt - '0', &n); + } + if (overflow || min (PTRDIFF_MAX, SIZE_MAX) - SIZE_BOUND_EXTRA < n) + error ("Format width or precision too large"); + *value =3D n; + return fmt; +} + +/* Generate output from a format-spec FORMAT. Output goes in BUFFER, which has room for BUFSIZE chars. BUFSIZE must be positive. If the output does not fit, truncate it to fit and return BUFSIZE - 1; if this truncates a multibyte @@ -128,15 +154,11 @@ Integers are passed as C integers. */ =20 ptrdiff_t -doprnt (char *buffer, ptrdiff_t bufsize, const char *format, - const char *format_end, va_list ap) +doprnt (char *buffer, ptrdiff_t bufsize, const char *format, va_list ap) { const char *fmt =3D format; /* Pointer into format string. */ char *bufptr =3D buffer; /* Pointer into output buffer. */ =20 - /* Enough to handle floating point formats with large numbers. */ - enum { SIZE_BOUND_EXTRA =3D DBL_MAX_10_EXP + 50 }; - /* Use this for sprintf unless we need something really big. */ char tembuf[SIZE_BOUND_EXTRA + 50]; =20 @@ -150,103 +172,91 @@ doprnt (char *buffer, ptrdiff_t bufsize, const cha= r *format, char *big_buffer =3D NULL; =20 enum text_quoting_style quoting_style =3D text_quoting_style (); - ptrdiff_t tem =3D -1; - char *string; - char fixed_buffer[20]; /* Default buffer for small formatting. */ - char *fmtcpy; - int minlen; - char charbuf[MAX_MULTIBYTE_LENGTH + 1]; /* Used for %c. */ - USE_SAFE_ALLOCA; - - if (format_end =3D=3D 0) - format_end =3D format + strlen (format); - - fmtcpy =3D (format_end - format < sizeof (fixed_buffer) - 1 - ? fixed_buffer - : SAFE_ALLOCA (format_end - format + 1)); =20 bufsize--; =20 /* Loop until end of format string or buffer full. */ - while (fmt < format_end && bufsize > 0) + while (*fmt && bufsize > 0) { char const *fmt0 =3D fmt; char fmtchar =3D *fmt++; if (fmtchar =3D=3D '%') { - ptrdiff_t size_bound =3D 0; ptrdiff_t width; /* Columns occupied by STRING on display. */ enum { pDlen =3D sizeof pD - 1, pIlen =3D sizeof pI - 1, - pMlen =3D sizeof PRIdMAX - 2 + pMlen =3D sizeof PRIdMAX - 2, + maxmlen =3D max (max (1, pDlen), max (pIlen, pMlen)) }; enum { no_modifier, long_modifier, pD_modifier, pI_modifier, pM_modifier } length_modifier =3D no_modifier; static char const modifier_len[] =3D { 0, 1, pDlen, pIlen, pMlen }; - int maxmlen =3D max (max (1, pDlen), max (pIlen, pMlen)); int mlen; + char charbuf[MAX_MULTIBYTE_LENGTH + 1]; /* Used for %c. */ =20 - /* Copy this one %-spec into fmtcpy. */ - string =3D fmtcpy; + /* Width and precision specified by this %-sequence. */ + int wid =3D 0, prec =3D -1; + + /* FMTSTAR will be a "%*.*X"-like version of this %-sequence. + Start by putting '%' into FMTSTAR. */ + char fmtstar[sizeof "%-+ 0*.*d" + maxmlen]; + char *string =3D fmtstar; *string++ =3D '%'; - while (fmt < format_end) + + /* Copy at most one instance of each flag into FMTSTAR. */ + bool minusflag =3D false, plusflag =3D false, zeroflag =3D false, + spaceflag =3D false; + for (;; fmt++) { - *string++ =3D *fmt; - if ('0' <=3D *fmt && *fmt <=3D '9') + *string =3D *fmt; + switch (*fmt) { - /* Get an idea of how much space we might need. - This might be a field width or a precision; e.g. - %1.1000f and %1000.1f both might need 1000+ bytes. - Parse the width or precision, checking for overflow. */ - int n =3D *fmt - '0'; - bool overflow =3D false; - while (fmt + 1 < format_end - && '0' <=3D fmt[1] && fmt[1] <=3D '9') - { - overflow |=3D INT_MULTIPLY_WRAPV (n, 10, &n); - overflow |=3D INT_ADD_WRAPV (n, fmt[1] - '0', &n); - *string++ =3D *++fmt; - } - - if (overflow - || min (PTRDIFF_MAX, SIZE_MAX) - SIZE_BOUND_EXTRA < n) - error ("Format width or precision too large"); - if (size_bound < n) - size_bound =3D n; + case '-': string +=3D !minusflag; minusflag =3D true; continue; + case '+': string +=3D !plusflag; plusflag =3D true; continue; + case ' ': string +=3D !spaceflag; spaceflag =3D true; continue; + case '0': string +=3D !zeroflag; zeroflag =3D true; continue; } - else if (! (*fmt =3D=3D '-' || *fmt =3D=3D ' ' || *fmt =3D=3D '.' - || *fmt =3D=3D '+')) - break; - fmt++; + break; } =20 + /* Parse width and precision, putting "*.*" into FMTSTAR. */ + if ('1' <=3D *fmt && *fmt <=3D '9') + fmt =3D parse_format_integer (fmt, &wid); + if (*fmt =3D=3D '.') + fmt =3D parse_format_integer (fmt + 1, &prec); + *string++ =3D '*'; + *string++ =3D '.'; + *string++ =3D '*'; + /* Check for the length modifiers in textual length order, so that longer modifiers override shorter ones. */ for (mlen =3D 1; mlen <=3D maxmlen; mlen++) { - if (format_end - fmt < mlen) - break; if (mlen =3D=3D 1 && *fmt =3D=3D 'l') length_modifier =3D long_modifier; - if (mlen =3D=3D pDlen && memcmp (fmt, pD, pDlen) =3D=3D 0) + if (mlen =3D=3D pDlen && strncmp (fmt, pD, pDlen) =3D=3D 0) length_modifier =3D pD_modifier; - if (mlen =3D=3D pIlen && memcmp (fmt, pI, pIlen) =3D=3D 0) + if (mlen =3D=3D pIlen && strncmp (fmt, pI, pIlen) =3D=3D 0) length_modifier =3D pI_modifier; - if (mlen =3D=3D pMlen && memcmp (fmt, PRIdMAX, pMlen) =3D=3D 0) + if (mlen =3D=3D pMlen && strncmp (fmt, PRIdMAX, pMlen) =3D=3D 0) length_modifier =3D pM_modifier; } =20 + /* Copy optional length modifier and conversion specifier + character into FMTSTAR, and append a NUL. */ mlen =3D modifier_len[length_modifier]; - memcpy (string, fmt + 1, mlen); - string +=3D mlen; + string =3D mempcpy (string, fmt, mlen + 1); fmt +=3D mlen; *string =3D 0; =20 - /* Make the size bound large enough to handle floating point formats + /* An idea of how much space we might need. + This might be a field width or a precision; e.g. + %1.1000f and %1000.1f both might need 1000+ bytes. + Make it large enough to handle floating point formats with large numbers. */ - size_bound +=3D SIZE_BOUND_EXTRA; + ptrdiff_t size_bound =3D max (wid, prec) + SIZE_BOUND_EXTRA; =20 /* Make sure we have that much. */ if (size_bound > size_allocated) @@ -257,48 +267,49 @@ doprnt (char *buffer, ptrdiff_t bufsize, const char= *format, sprintf_buffer =3D big_buffer; size_allocated =3D size_bound; } - minlen =3D 0; + int minlen =3D 0; + ptrdiff_t tem; switch (*fmt++) { default: - error ("Invalid format operation %s", fmtcpy); + error ("Invalid format operation %s", fmt0); =20 -/* case 'b': */ - case 'l': case 'd': switch (length_modifier) { case no_modifier: { int v =3D va_arg (ap, int); - tem =3D sprintf (sprintf_buffer, fmtcpy, v); + tem =3D sprintf (sprintf_buffer, fmtstar, wid, prec, v); } break; case long_modifier: { long v =3D va_arg (ap, long); - tem =3D sprintf (sprintf_buffer, fmtcpy, v); + tem =3D sprintf (sprintf_buffer, fmtstar, wid, prec, v); } break; case pD_modifier: signed_pD_modifier: { ptrdiff_t v =3D va_arg (ap, ptrdiff_t); - tem =3D sprintf (sprintf_buffer, fmtcpy, v); + tem =3D sprintf (sprintf_buffer, fmtstar, wid, prec, v); } break; case pI_modifier: { EMACS_INT v =3D va_arg (ap, EMACS_INT); - tem =3D sprintf (sprintf_buffer, fmtcpy, v); + tem =3D sprintf (sprintf_buffer, fmtstar, wid, prec, v); } break; case pM_modifier: { intmax_t v =3D va_arg (ap, intmax_t); - tem =3D sprintf (sprintf_buffer, fmtcpy, v); + tem =3D sprintf (sprintf_buffer, fmtstar, wid, prec, v); } break; + default: + eassume (false); } /* Now copy into final output, truncating as necessary. */ string =3D sprintf_buffer; @@ -311,13 +322,13 @@ doprnt (char *buffer, ptrdiff_t bufsize, const char= *format, case no_modifier: { unsigned v =3D va_arg (ap, unsigned); - tem =3D sprintf (sprintf_buffer, fmtcpy, v); + tem =3D sprintf (sprintf_buffer, fmtstar, wid, prec, v); } break; case long_modifier: { unsigned long v =3D va_arg (ap, unsigned long); - tem =3D sprintf (sprintf_buffer, fmtcpy, v); + tem =3D sprintf (sprintf_buffer, fmtstar, wid, prec, v); } break; case pD_modifier: @@ -325,15 +336,17 @@ doprnt (char *buffer, ptrdiff_t bufsize, const char= *format, case pI_modifier: { EMACS_UINT v =3D va_arg (ap, EMACS_UINT); - tem =3D sprintf (sprintf_buffer, fmtcpy, v); + tem =3D sprintf (sprintf_buffer, fmtstar, wid, prec, v); } break; case pM_modifier: { uintmax_t v =3D va_arg (ap, uintmax_t); - tem =3D sprintf (sprintf_buffer, fmtcpy, v); + tem =3D sprintf (sprintf_buffer, fmtstar, wid, prec, v); } break; + default: + eassume (false); } /* Now copy into final output, truncating as necessary. */ string =3D sprintf_buffer; @@ -344,22 +357,18 @@ doprnt (char *buffer, ptrdiff_t bufsize, const char= *format, case 'g': { double d =3D va_arg (ap, double); - tem =3D sprintf (sprintf_buffer, fmtcpy, d); + tem =3D sprintf (sprintf_buffer, fmtstar, wid, prec, d); /* Now copy into final output, truncating as necessary. */ string =3D sprintf_buffer; goto doit; } =20 - case 'S': - string[-1] =3D 's'; - FALLTHROUGH; case 's': - if (fmtcpy[1] !=3D 's') - minlen =3D atoi (&fmtcpy[1]); + minlen =3D minusflag ? -wid : wid; string =3D va_arg (ap, char *); tem =3D strnlen (string, STRING_BYTES_BOUND + 1); if (tem =3D=3D STRING_BYTES_BOUND + 1) - error ("String for %%s or %%S format is too long"); + error ("String for %%s format is too long"); width =3D strwidth (string, tem); goto doit1; =20 @@ -432,14 +441,12 @@ doprnt (char *buffer, ptrdiff_t bufsize, const char= *format, string =3D charbuf; string[tem] =3D 0; width =3D strwidth (string, tem); - if (fmtcpy[1] !=3D 'c') - minlen =3D atoi (&fmtcpy[1]); + minlen =3D minusflag ? -wid : wid; goto doit1; } =20 case '%': /* Treat this '%' as normal. */ - fmt0 =3D fmt - 1; break; } } @@ -450,13 +457,13 @@ doprnt (char *buffer, ptrdiff_t bufsize, const char= *format, src =3D uLSQM, srclen =3D sizeof uLSQM - 1; else if (quoting_style =3D=3D CURVE_QUOTING_STYLE && fmtchar =3D=3D= '\'') src =3D uRSQM, srclen =3D sizeof uRSQM - 1; - else if (quoting_style =3D=3D STRAIGHT_QUOTING_STYLE && fmtchar =3D= =3D '`') - src =3D "'", srclen =3D 1; else { - while (fmt < format_end && !CHAR_HEAD_P (*fmt)) - fmt++; - src =3D fmt0, srclen =3D fmt - fmt0; + if (quoting_style =3D=3D STRAIGHT_QUOTING_STYLE && fmtchar =3D=3D '`'= ) + fmtchar =3D '\''; + eassert (ASCII_CHAR_P (fmtchar)); + *bufptr++ =3D fmtchar; + continue; } =20 if (bufsize < srclen) @@ -479,8 +486,6 @@ doprnt (char *buffer, ptrdiff_t bufsize, const char *= format, xfree (big_buffer); =20 *bufptr =3D 0; /* Make sure our string ends with a '\0' */ - - SAFE_FREE (); return bufptr - buffer; } =20 @@ -495,10 +500,9 @@ doprnt (char *buffer, ptrdiff_t bufsize, const char = *format, ptrdiff_t esprintf (char *buf, char const *format, ...) { - ptrdiff_t nbytes; va_list ap; va_start (ap, format); - nbytes =3D doprnt (buf, TYPE_MAXIMUM (ptrdiff_t), format, 0, ap); + ptrdiff_t nbytes =3D doprnt (buf, TYPE_MAXIMUM (ptrdiff_t), format, ap= ); va_end (ap); return nbytes; } @@ -534,10 +538,9 @@ evxprintf (char **buf, ptrdiff_t *bufsize, { for (;;) { - ptrdiff_t nbytes; va_list ap_copy; va_copy (ap_copy, ap); - nbytes =3D doprnt (*buf, *bufsize, format, 0, ap_copy); + ptrdiff_t nbytes =3D doprnt (*buf, *bufsize, format, ap_copy); va_end (ap_copy); if (nbytes < *bufsize - 1) return nbytes; diff --git a/src/lisp.h b/src/lisp.h index a24898004d..957ca41702 100644 --- a/src/lisp.h +++ b/src/lisp.h @@ -4034,8 +4034,8 @@ #define FLOAT_TO_STRING_BUFSIZE 350 extern void syms_of_print (void); =20 /* Defined in doprnt.c. */ -extern ptrdiff_t doprnt (char *, ptrdiff_t, const char *, const char *, - va_list); +extern ptrdiff_t doprnt (char *, ptrdiff_t, const char *, va_list) + ATTRIBUTE_FORMAT_PRINTF (3, 0); extern ptrdiff_t esprintf (char *, char const *, ...) ATTRIBUTE_FORMAT_PRINTF (2, 3); extern ptrdiff_t exprintf (char **, ptrdiff_t *, char const *, ptrdiff_t= , diff --git a/src/sysdep.c b/src/sysdep.c index e161172a79..790ae084d3 100644 --- a/src/sysdep.c +++ b/src/sysdep.c @@ -2192,7 +2192,7 @@ snprintf (char *buf, size_t bufsize, char const *fo= rmat, ...) if (size) { va_start (ap, format); - nbytes =3D doprnt (buf, size, format, 0, ap); + nbytes =3D doprnt (buf, size, format, ap); va_end (ap); } =20 diff --git a/src/xdisp.c b/src/xdisp.c index 615f0ca7cf..213c2a464a 100644 --- a/src/xdisp.c +++ b/src/xdisp.c @@ -11269,13 +11269,10 @@ vmessage (const char *m, va_list ap) { if (m) { - ptrdiff_t len; ptrdiff_t maxsize =3D FRAME_MESSAGE_BUF_SIZE (f); USE_SAFE_ALLOCA; char *message_buf =3D SAFE_ALLOCA (maxsize + 1); - - len =3D doprnt (message_buf, maxsize, m, 0, ap); - + ptrdiff_t len =3D doprnt (message_buf, maxsize, m, ap); message3 (make_string (message_buf, len)); SAFE_FREE (); } --=20 2.17.1 From debbugs-submit-bounces@debbugs.gnu.org Wed Sep 16 10:58:24 2020 Received: (at 43439) by debbugs.gnu.org; 16 Sep 2020 14:58:24 +0000 Received: from localhost ([127.0.0.1]:35683 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kIYt5-00018F-T3 for submit@debbugs.gnu.org; Wed, 16 Sep 2020 10:58:24 -0400 Received: from eggs.gnu.org ([209.51.188.92]:46756) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kIYt3-000182-HA for 43439@debbugs.gnu.org; Wed, 16 Sep 2020 10:58:22 -0400 Received: from fencepost.gnu.org ([2001:470:142:3::e]:42128) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kIYsx-0001My-Ao; Wed, 16 Sep 2020 10:58:15 -0400 Received: from [176.228.60.248] (port=3459 helo=home-c4e4a596f7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1kIYsw-00007u-EC; Wed, 16 Sep 2020 10:58:14 -0400 Date: Wed, 16 Sep 2020 17:58:24 +0300 Message-Id: <83o8m57oq7.fsf@gnu.org> From: Eli Zaretskii To: Paul Eggert In-Reply-To: <20200916015051.20517-1-eggert@cs.ucla.edu> (message from Paul Eggert on Tue, 15 Sep 2020 18:50:51 -0700) Subject: Re: bug#43439: [PATCH] doprnt improvements References: <20200916015051.20517-1-eggert@cs.ucla.edu> MIME-version: 1.0 Content-type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 43439 Cc: 43439@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) > From: Paul Eggert > Date: Tue, 15 Sep 2020 18:50:51 -0700 > Cc: Paul Eggert > > (doprnt): Omit FORMAT_END argument, since it’s always NULL, > which means doprnt must call strlen on FORMAT; doing this means > doprnt needs just one pass over FORMAT, not two. All callers changed. This loses a feature. Emacs traditionally supports strings with embedded null characters, and this feature is in line with that. It is true that it is currently unused, but why is it a good idea to remove it? If the problem is the slight inefficiency caused by the call to strlen, we could instead solve it in the callers: all the formats I've seen are const strings, so the value of FORMAT_END can be computed at compile time, and used instead of passing NULL. > Drop support for > "%S" which is never used and which would cause GCC to warn anyway. This is an old compatibility feature, I'd rather not drop it. Who knows what code relies on the fact that 'message' and 'format-message' support it? Thanks. From debbugs-submit-bounces@debbugs.gnu.org Wed Sep 16 18:09:59 2020 Received: (at 43439) by debbugs.gnu.org; 16 Sep 2020 22:09:59 +0000 Received: from localhost ([127.0.0.1]:36722 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kIfcl-0005Kk-48 for submit@debbugs.gnu.org; Wed, 16 Sep 2020 18:09:59 -0400 Received: from zimbra.cs.ucla.edu ([131.179.128.68]:53562) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kIfcj-0005KX-Jl for 43439@debbugs.gnu.org; Wed, 16 Sep 2020 18:09:58 -0400 Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id 2F46A16013B; Wed, 16 Sep 2020 15:09:52 -0700 (PDT) Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10032) with ESMTP id kBG_5QO4ZFsU; Wed, 16 Sep 2020 15:09:51 -0700 (PDT) Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id 344F316013C; Wed, 16 Sep 2020 15:09:51 -0700 (PDT) X-Virus-Scanned: amavisd-new at zimbra.cs.ucla.edu Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id C5FQTgDC4CYh; Wed, 16 Sep 2020 15:09:51 -0700 (PDT) Received: from [192.168.1.9] (cpe-75-82-69-226.socal.res.rr.com [75.82.69.226]) by zimbra.cs.ucla.edu (Postfix) with ESMTPSA id 0C5FB16013B; Wed, 16 Sep 2020 15:09:51 -0700 (PDT) Subject: Re: bug#43439: [PATCH] doprnt improvements To: Eli Zaretskii References: <20200916015051.20517-1-eggert@cs.ucla.edu> <83o8m57oq7.fsf@gnu.org> From: Paul Eggert Autocrypt: addr=eggert@cs.ucla.edu; prefer-encrypt=mutual; keydata= LS0tLS1CRUdJTiBQR1AgUFVCTElDIEtFWSBCTE9DSy0tLS0tCgptUUlOQkV5QWNtUUJFQURB QXlIMnhvVHU3cHBHNUQzYThGTVpFb243NGRDdmM0K3ExWEEySjJ0QnkycHdhVHFmCmhweHhk R0E5Smo1MFVKM1BENGJTVUVnTjh0TFowc2FuNDdsNVhUQUZMaTI0NTZjaVNsNW04c0thSGxH ZHQ5WG0KQUF0bVhxZVpWSVlYL1VGUzk2ZkR6ZjR4aEVtbS95N0xiWUVQUWRVZHh1NDd4QTVL aFRZcDVibHRGM1dZRHoxWQpnZDdneDA3QXV3cDdpdzdlTnZub0RUQWxLQWw4S1lEWnpiRE5D UUdFYnBZM2VmWkl2UGRlSStGV1FONFcra2doCnkrUDZhdTZQcklJaFlyYWV1YTdYRGRiMkxT MWVuM1NzbUUzUWpxZlJxSS9BMnVlOEpNd3N2WGUvV0szOEV6czYKeDc0aVRhcUkzQUZINmls QWhEcXBNbmQvbXNTRVNORnQ3NkRpTzFaS1FNcjlhbVZQa25qZlBtSklTcWRoZ0IxRApsRWR3 MzRzUk9mNlY4bVp3MHhmcVQ2UEtFNDZMY0ZlZnpzMGtiZzRHT1JmOHZqRzJTZjF0azVlVThN Qml5Ti9iClowM2JLTmpOWU1wT0REUVF3dVA4NGtZTGtYMndCeHhNQWhCeHdiRFZadWR6eERa SjFDMlZYdWpDT0pWeHEya2wKakJNOUVUWXVVR3FkNzVBVzJMWHJMdzYrTXVJc0hGQVlBZ1Jy NytLY3dEZ0JBZndoUEJZWDM0blNTaUhsbUxDKwpLYUhMZUNMRjVaSTJ2S20zSEVlQ1R0bE9n N3haRU9OZ3d6TCtmZEtvK0Q2U29DOFJSeEpLczhhM3NWZkk0dDZDCm5yUXp2SmJCbjZneGRn Q3U1aTI5SjFRQ1lyQ1l2cWwyVXlGUEFLK2RvOTkvMWpPWFQ0bTI4MzZqMXdBUkFRQUIKdENC UVlYVnNJRVZuWjJWeWRDQThaV2RuWlhKMFFHTnpMblZqYkdFdVpXUjFQb2tDVlFRVEFRZ0FQ d0liQXdZTApDUWdIQXdJR0ZRZ0NDUW9MQkJZQ0F3RUNIZ0VDRjRBV0lRUitONUtwMkt6MzFq TzhGWWp0bCtrT1lxcCtOQVVDClh5Vzlsd1VKRks0THN3QUtDUkR0bCtrT1lxcCtOS05WRC85 SE1zSTE2MDZuMFV1VFhId0lUc3lPakFJOVNET1QKK0MzRFV2NnFsTTVCSDJuV0FNVGlJaXlB NXVnbHNKdjkzb2kydk50RmYvUS9tLzFjblpXZ25WbkV4a3lMSTRFTgpTZDF1QnZyMC9sQ1Nk UGxQME1nNkdXU3BYTXUreDB2ZFQwQWFaTk9URTBGblB1b2xkYzNYRDc2QzJxZzhzWC9pCmF4 WFRLSHk5UCtCbEFxL0NzNy9weERRMEV6U24wVVNaMkMwbDV2djRQTXBBL3BpY25TNks2MDlK dkRHYU9SbXcKWmVYSVpxUU5aVitaUXMrVVl0Vm9ndURUcWJ5M0lVWTFJOEJsWEhScHRhajlB TW40VW9oL0NxcFFsVm9qb3lXbApIcWFGbm5KQktlRjBodko5U0F5YWx3dXpBakc3dlFXMDdN WW5jYU9GbTB3b2lLYmc1SkxPOEY0U0JUSWt1TzBECkNmOW5MQWF5NlZzQjRyendkRWZSd2pQ TFlBbjdNUjNmdkhDRXpmcmtsZFRyYWlCTzFUMGllREs4MEk3c0xmNnAKTWVDWUkxOXBVbHgw L05STUdDZGRpRklRZGZ0aEtXWEdSUzVMQXM4andCZjhINkc1UFdpblByRUlhb21JUDIxaQp2 dWhRRDA3YllxOUlpSWRlbGpqVWRIY0dJMGkvQjRNNTZaYWE4RmYzOGluaU9sckRZQ21ZV1I0 ZENXWml1UWVaCjNPZ3FlUXM5YTZqVHZnZERHVm1SVnFZK2p6azhQbGFIZmNvazhST2hGY0hL a2NmaHVCaEwyNWhsUklzaFJET0UKc2tYcUt3bnpyYnFnYTNHWFpYZnNYQW9GYnpOaExkTHY5 QStMSkFZU2tYUDYvNXFkVHBFTFZHb3N5SDg4NFZkYgpCcGtHSTA0b1lWcXVsYmtDRFFSTWdI SmtBUkFBcG9YcnZ4UDNESWZqQ05PdFhVL1Bkd01TaEtkWC9SbFNzNVBmCnVuVjF3YktQOGhl clhIcnZRZEZWcUVDYVRTeG1saHpiazhYMFBrWTlnY1ZhVTJPNDlUM3FzT2QxY0hlRjUyWUYK R0V0MExoc0JlTWpnTlg1dVoxVjc2cjhneWVWbEZwV1diMFNJd0pVQkhyRFhleEY2N3VwZVJi MnZkSEJqWUROZQp5U24rMEI3Z0ZFcXZWbVp1K0xhZHVkRHA2a1FMamF0RnZIUUhVU0dOc2hC bmtrY2FUYmlJOVBzdDBHQ2MyYWl6Cm5CaVBQQTJXUXhBUGxQUmgzT0dUc241VEhBRG1ianFZ NkZFTUxhc1ZYOERTQ2JsTXZMd05lTy84U3h6aUJpZGgKcUxwSkNxZFFSV0hrdTVYeGdJa0dl S096NU9MRHZYSFdKeWFmckVZamprUzZBazZCNXo2c3ZLbGlDbFduakhRYwpqbFB6eW9GRmdL VEVmY3FEeENqNFJZMEQwRGd0RkQwTmZ5ZU9pZHJTQi9TelRlMmh3cnlRRTNycFNpcW8rMGNH CmR6aDR5QUhLWUorVXJYWjRwOTNaaGpHZktEMXhsck5ZRGxXeVc5UEdtYnZxRnVEbWlJQVFm OVdEL3d6RWZJQ2MKK0YrdURESSt1WWtSeFVGcDkyeWttZGhERUZnMXlqWXNVOGlHVTY5YUh5 dmhxMzZ6NHpjdHZicWhSTnpPV0IxYgpWSi9kSU1EdnNFeEdjWFFWRElUN3NETlh2MHdFM2pL U0twcDdOREcxb1hVWEwrMitTRjk5S2p5NzUzQWJRU0FtCkg2MTdmeUJOd2hKV3ZRWWcrbVV2 UHBpR090c2VzOUVYVUkzbFM0djBNRWFQRzQzZmxFczFVUisxcnBGUVdWSG8KMXkxT08rc0FF UUVBQVlrQ1BBUVlBUWdBSmdJYkRCWWhCSDQza3FuWXJQZldNN3dWaU8yWDZRNWlxbjQwQlFK ZgpKYjJ6QlFrVXJndlBBQW9KRU8yWDZRNWlxbjQwY25NUC8xN0NnVWtYVDlhSUpyaVBNOHdi Y2VZcmNsNytiZFlFCmY3OVNsd1NiYkhON1I0Q29JSkZPbE45Uy8zNHR5cEdWWXZwZ21DSkRZ RlRCeHlQTzkyaU1YRGdBNCtjV0h6dDUKVDFhWU85aHNLaGg3dkR0Sys2UHJvWkdjKzA4Z1VU WEhoYjk3aE1NUWhrbkpsbmZqcFNFQzllbTkwNkZVK0k5MwpUMWZUR3VwbkJhM2FXY0s4ak0w SmFCR2J5MmhHMVMzb2xhRExTVHRCSU5OQlltdnVXUjlNS09oaHFEcmxrNWN3CkZESkxoNU5y WHRlRVkwOFdBemNMekczcGtyWFBIa0ZlTVF0ZnFrMGpMZEdHdkdDM05DSWtxWXJkTGhpUnZH cHIKdTM4QzI2UkVuNWY0STB2R0UzVmZJWEhlOFRNQ05tUXV0MU50TXVVbXBESXkxYUx4R3p1 cHRVaG5PSk4vL3IrVgpqRFBvaTNMT3lTTllwaHFlL2RNdWJzZlVyNm9oUDQxbUtGODFGdXdJ NGFtcUp0cnFJTDJ5cWF4M2EwcWxmd0N4ClhmdGllcUpjdWVrWCtlQ1BEQ0tyWU1YUjBGWWd3 cEcySVRaVUd0ckVqRVNsRTZEc2N4NzM0SEtkcjVPUklvY0wKVVVLRU9HZWlVNkRHaEdGZGI1 VHd1MFNuK3UxbVVQRE4wTSsrQ2RNdkNsSUU4a2xvNEc5MUVPSW11MVVwYjh4YwpPUFF3eGgx andxU3JVNVF3b05tU1llZ1FTSExwSVV1ckZ6MWlRVWgxdnBQWHpLaW5rV0VxdjRJcUExY2lM K0x5CnlTdUxrcDdNc0pwVlJNYldKQ05XT09TYmFING9EQko1ZEhNR2MzNXg1bW9zQ2s5MFBY a251RkREc1lIZkRvNXMKbWY5bG82WVh4N045Cj0zTGFJCi0tLS0tRU5EIFBHUCBQVUJMSUMg S0VZIEJMT0NLLS0tLS0K Organization: UCLA Computer Science Department Message-ID: <31fb34a7-26c3-552d-e8d4-74624455ffcb@cs.ucla.edu> Date: Wed, 16 Sep 2020 15:09:50 -0700 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.10.0 MIME-Version: 1.0 In-Reply-To: <83o8m57oq7.fsf@gnu.org> Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 7bit X-Spam-Score: -2.4 (--) X-Debbugs-Envelope-To: 43439 Cc: 43439@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.4 (---) On 9/16/20 7:58 AM, Eli Zaretskii wrote: > Emacs traditionally supports strings with > embedded null characters, and this feature is in line with that. It > is true that it is currently unused, but why is it a good idea to > remove it? It's a good idea not only because the feature is currently unused and its support complicates and adds bugs to the code, but also because it would be a bad idea to ever use the feature. The Emacs feature is for Lisp strings. Emacs does not (and for API reasons, it cannot practically) rely on embedded NULs in C strings. Among other things, if we tried to use C-style printf formats with embedded NULs, GCC's warnings about formats not matching their arguments would stop working. These GCC warnings are quite useful for preventing bugs in Emacs's C code and have helped to catch many such bugs, and we should not give them up. More generally, the vestigial support for NULs and %S in doprnt's C formats dates back to long ago, before GCC warned about these features. It may have made sense back then but it does not make sense now. Any C-level formatting facility that supports NULs and %S should not attempt to use the longstanding printf API that is incompatible with such support - it should be a separate facility. And Emacs C code already has a non-printf facility for formatting with NULs and %S - e.g., Fformat and Fformat_message - so it doesn't need yet another one. > If the problem is the slight inefficiency caused by the call to > strlen, we could instead solve it in the callers: all the formats I've > seen are const strings, so the value of FORMAT_END can be computed at > compile time, and used instead of passing NULL. This would require unnecessary complication of the code and runtime overhead. doprnt is called directly only in few places: esprintf, evxprintf, vmessage, and the (rarely-if-ever used) snprintf replacement. If we modified callers to do what you're suggesting, we'd need to modify the callers of these functions, and their callers too, all the way back to the original ancestor call that specifies the format. This would clutter and bloat the code and would add runtime cost to all the callers. For example, we'd have to change all calls to the 'error' function from something like this: if (ret < GNUTLS_E_SUCCESS) error ("GnuTLS AEAD cipher %s/%s initialization failed: %s", gnutls_cipher_get_name (gca), desc, emacs_gnutls_strerror (ret)); to something like this: if (ret < GNUTLS_E_SUCCESS) { char const msg[] = "GnuTLS AEAD cipher %s/%s initialization failed: %s"; error (msg, sizeof msg - 1, gnutls_cipher_get_name (gca), desc, emacs_gnutls_strerror (ret)); } Of course we could invent a new macro ERROR to package this up, but such a macro would still be less efficient than what we have, and worse it would not always work, for cases like this: if (NILP (tem) || (XBUFFER (tem) != current_buffer)) error (for_region ? "The mark is not set now, so there is no region" : "The mark is not set now"); or like this: error (format, string); // in x_check_errors And of course this could be gotten around as well, but we're now talking about a reasonably large amount of code surgery that will hurt code readability and runtime performance, all to support a low-level C feature that Emacs does not use and won't ever reasonably use. Instead, we should leave most of the C code alone and just adjust 'doprnt' and its very few callers. >> Drop support for >> "%S" which is never used and which would cause GCC to warn anyway. > > This is an old compatibility feature, I'd rather not drop it. Who > knows what code relies on the fact that 'message' and 'format-message' > support it? I know because I checked all the code. No Emacs C code uses %S. And none is likely to use it in the future because GCC warns about it if you try. (To be specific: GCC warns unless you use %S compatibly with its standard C meaning, which differs from that of Emacs doprnt - which is yet another compatibility minefield if we insist on keeping doprnt's unused %S feature.) From debbugs-submit-bounces@debbugs.gnu.org Fri Sep 18 03:30:51 2020 Received: (at 43439) by debbugs.gnu.org; 18 Sep 2020 07:30:51 +0000 Received: from localhost ([127.0.0.1]:40539 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kJAr4-0005rE-TW for submit@debbugs.gnu.org; Fri, 18 Sep 2020 03:30:51 -0400 Received: from eggs.gnu.org ([209.51.188.92]:57756) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kJAr3-0005r0-9N for 43439@debbugs.gnu.org; Fri, 18 Sep 2020 03:30:50 -0400 Received: from fencepost.gnu.org ([2001:470:142:3::e]:57485) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJAqw-0004Di-Pa; Fri, 18 Sep 2020 03:30:43 -0400 Received: from [176.228.60.248] (port=1552 helo=home-c4e4a596f7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1kJAql-00023z-Dz; Fri, 18 Sep 2020 03:30:37 -0400 Date: Fri, 18 Sep 2020 10:30:44 +0300 Message-Id: <838sd75yor.fsf@gnu.org> From: Eli Zaretskii To: Paul Eggert In-Reply-To: <31fb34a7-26c3-552d-e8d4-74624455ffcb@cs.ucla.edu> (message from Paul Eggert on Wed, 16 Sep 2020 15:09:50 -0700) Subject: Re: bug#43439: [PATCH] doprnt improvements References: <20200916015051.20517-1-eggert@cs.ucla.edu> <83o8m57oq7.fsf@gnu.org> <31fb34a7-26c3-552d-e8d4-74624455ffcb@cs.ucla.edu> X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 43439 Cc: 43439@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) > Cc: 43439@debbugs.gnu.org > From: Paul Eggert > Date: Wed, 16 Sep 2020 15:09:50 -0700 > > On 9/16/20 7:58 AM, Eli Zaretskii wrote: > > > Emacs traditionally supports strings with > > embedded null characters, and this feature is in line with that. It > > is true that it is currently unused, but why is it a good idea to > > remove it? > > It's a good idea not only because the feature is currently unused and its > support complicates and adds bugs to the code, but also because it would be a > bad idea to ever use the feature. > > The Emacs feature is for Lisp strings. Emacs does not (and for API reasons, it > cannot practically) rely on embedded NULs in C strings. Among other things, if > we tried to use C-style printf formats with embedded NULs, GCC's warnings about > formats not matching their arguments would stop working. These GCC warnings are > quite useful for preventing bugs in Emacs's C code and have helped to catch many > such bugs, and we should not give them up. How about a compromise: we modify doprint to exit when either it finds NUL or reaches the character specified by FORMAT_END? This will allow us to keep some of the feature, and I think the amount of changes will be smaller. It should also not be much slower than what you propose. > More generally, the vestigial support for NULs and %S in doprnt's C formats > dates back to long ago, before GCC warned about these features. I understand your desire to have GCC warnings about this, but GCC is a tool, it shouldn't dictate what features we keep and which ones we drop. We should do it the other way around. doprnt callers don't change much, so the GCC diagnostic features are not very important in this case. Thanks. From debbugs-submit-bounces@debbugs.gnu.org Thu Oct 15 13:59:02 2020 Received: (at 43439) by debbugs.gnu.org; 15 Oct 2020 17:59:02 +0000 Received: from localhost ([127.0.0.1]:56503 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kT7Wn-0004dh-Ac for submit@debbugs.gnu.org; Thu, 15 Oct 2020 13:59:02 -0400 Received: from zimbra.cs.ucla.edu ([131.179.128.68]:60084) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kT7Wl-0004dU-5w for 43439@debbugs.gnu.org; Thu, 15 Oct 2020 13:59:00 -0400 Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id CB2761600ED; Thu, 15 Oct 2020 10:58:52 -0700 (PDT) Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10032) with ESMTP id wNCf-fXbiplT; Thu, 15 Oct 2020 10:58:50 -0700 (PDT) Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id 73B541600F9; Thu, 15 Oct 2020 10:58:50 -0700 (PDT) X-Virus-Scanned: amavisd-new at zimbra.cs.ucla.edu Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id NyX__Dq8Ioli; Thu, 15 Oct 2020 10:58:50 -0700 (PDT) Received: from [192.168.1.9] (cpe-23-243-218-95.socal.res.rr.com [23.243.218.95]) by zimbra.cs.ucla.edu (Postfix) with ESMTPSA id 3E7271600ED; Thu, 15 Oct 2020 10:58:50 -0700 (PDT) Subject: Re: bug#43439: [PATCH] doprnt improvements To: Eli Zaretskii References: <20200916015051.20517-1-eggert@cs.ucla.edu> <83o8m57oq7.fsf@gnu.org> <31fb34a7-26c3-552d-e8d4-74624455ffcb@cs.ucla.edu> <838sd75yor.fsf@gnu.org> From: Paul Eggert Autocrypt: addr=eggert@cs.ucla.edu; prefer-encrypt=mutual; keydata= mQINBEyAcmQBEADAAyH2xoTu7ppG5D3a8FMZEon74dCvc4+q1XA2J2tBy2pwaTqfhpxxdGA9 Jj50UJ3PD4bSUEgN8tLZ0san47l5XTAFLi2456ciSl5m8sKaHlGdt9XmAAtmXqeZVIYX/UFS 96fDzf4xhEmm/y7LbYEPQdUdxu47xA5KhTYp5bltF3WYDz1Ygd7gx07Auwp7iw7eNvnoDTAl KAl8KYDZzbDNCQGEbpY3efZIvPdeI+FWQN4W+kghy+P6au6PrIIhYraeua7XDdb2LS1en3Ss mE3QjqfRqI/A2ue8JMwsvXe/WK38Ezs6x74iTaqI3AFH6ilAhDqpMnd/msSESNFt76DiO1ZK QMr9amVPknjfPmJISqdhgB1DlEdw34sROf6V8mZw0xfqT6PKE46LcFefzs0kbg4GORf8vjG2 Sf1tk5eU8MBiyN/bZ03bKNjNYMpODDQQwuP84kYLkX2wBxxMAhBxwbDVZudzxDZJ1C2VXujC OJVxq2kljBM9ETYuUGqd75AW2LXrLw6+MuIsHFAYAgRr7+KcwDgBAfwhPBYX34nSSiHlmLC+ KaHLeCLF5ZI2vKm3HEeCTtlOg7xZEONgwzL+fdKo+D6SoC8RRxJKs8a3sVfI4t6CnrQzvJbB n6gxdgCu5i29J1QCYrCYvql2UyFPAK+do99/1jOXT4m2836j1wARAQABtCBQYXVsIEVnZ2Vy dCA8ZWdnZXJ0QGNzLnVjbGEuZWR1PokCVQQTAQgAPwIbAwYLCQgHAwIGFQgCCQoLBBYCAwEC HgECF4AWIQR+N5Kp2Kz31jO8FYjtl+kOYqp+NAUCXyW9lwUJFK4LswAKCRDtl+kOYqp+NKNV D/9HMsI1606n0UuTXHwITsyOjAI9SDOT+C3DUv6qlM5BH2nWAMTiIiyA5uglsJv93oi2vNtF f/Q/m/1cnZWgnVnExkyLI4ENSd1uBvr0/lCSdPlP0Mg6GWSpXMu+x0vdT0AaZNOTE0FnPuol dc3XD76C2qg8sX/iaxXTKHy9P+BlAq/Cs7/pxDQ0EzSn0USZ2C0l5vv4PMpA/picnS6K609J vDGaORmwZeXIZqQNZV+ZQs+UYtVoguDTqby3IUY1I8BlXHRptaj9AMn4Uoh/CqpQlVojoyWl HqaFnnJBKeF0hvJ9SAyalwuzAjG7vQW07MYncaOFm0woiKbg5JLO8F4SBTIkuO0DCf9nLAay 6VsB4rzwdEfRwjPLYAn7MR3fvHCEzfrkldTraiBO1T0ieDK80I7sLf6pMeCYI19pUlx0/NRM GCddiFIQdfthKWXGRS5LAs8jwBf8H6G5PWinPrEIaomIP21ivuhQD07bYq9IiIdeljjUdHcG I0i/B4M56Zaa8Ff38iniOlrDYCmYWR4dCWZiuQeZ3OgqeQs9a6jTvgdDGVmRVqY+jzk8PlaH fcok8ROhFcHKkcfhuBhL25hlRIshRDOEskXqKwnzrbqga3GXZXfsXAoFbzNhLdLv9A+LJAYS kXP6/5qdTpELVGosyH884VdbBpkGI04oYVqulbkCDQRMgHJkARAApoXrvxP3DIfjCNOtXU/P dwMShKdX/RlSs5PfunV1wbKP8herXHrvQdFVqECaTSxmlhzbk8X0PkY9gcVaU2O49T3qsOd1 cHeF52YFGEt0LhsBeMjgNX5uZ1V76r8gyeVlFpWWb0SIwJUBHrDXexF67upeRb2vdHBjYDNe ySn+0B7gFEqvVmZu+LadudDp6kQLjatFvHQHUSGNshBnkkcaTbiI9Pst0GCc2aiznBiPPA2W QxAPlPRh3OGTsn5THADmbjqY6FEMLasVX8DSCblMvLwNeO/8SxziBidhqLpJCqdQRWHku5Xx gIkGeKOz5OLDvXHWJyafrEYjjkS6Ak6B5z6svKliClWnjHQcjlPzyoFFgKTEfcqDxCj4RY0D 0DgtFD0NfyeOidrSB/SzTe2hwryQE3rpSiqo+0cGdzh4yAHKYJ+UrXZ4p93ZhjGfKD1xlrNY DlWyW9PGmbvqFuDmiIAQf9WD/wzEfICc+F+uDDI+uYkRxUFp92ykmdhDEFg1yjYsU8iGU69a Hyvhq36z4zctvbqhRNzOWB1bVJ/dIMDvsExGcXQVDIT7sDNXv0wE3jKSKpp7NDG1oXUXL+2+ SF99Kjy753AbQSAmH617fyBNwhJWvQYg+mUvPpiGOtses9EXUI3lS4v0MEaPG43flEs1UR+1 rpFQWVHo1y1OO+sAEQEAAYkCPAQYAQgAJgIbDBYhBH43kqnYrPfWM7wViO2X6Q5iqn40BQJf Jb2zBQkUrgvPAAoJEO2X6Q5iqn40cnMP/17CgUkXT9aIJriPM8wbceYrcl7+bdYEf79SlwSb bHN7R4CoIJFOlN9S/34typGVYvpgmCJDYFTBxyPO92iMXDgA4+cWHzt5T1aYO9hsKhh7vDtK +6ProZGc+08gUTXHhb97hMMQhknJlnfjpSEC9em906FU+I93T1fTGupnBa3aWcK8jM0JaBGb y2hG1S3olaDLSTtBINNBYmvuWR9MKOhhqDrlk5cwFDJLh5NrXteEY08WAzcLzG3pkrXPHkFe MQtfqk0jLdGGvGC3NCIkqYrdLhiRvGpru38C26REn5f4I0vGE3VfIXHe8TMCNmQut1NtMuUm pDIy1aLxGzuptUhnOJN//r+VjDPoi3LOySNYphqe/dMubsfUr6ohP41mKF81FuwI4amqJtrq IL2yqax3a0qlfwCxXftieqJcuekX+eCPDCKrYMXR0FYgwpG2ITZUGtrEjESlE6Dscx734HKd r5ORIocLUUKEOGeiU6DGhGFdb5Twu0Sn+u1mUPDN0M++CdMvClIE8klo4G91EOImu1Upb8xc OPQwxh1jwqSrU5QwoNmSYegQSHLpIUurFz1iQUh1vpPXzKinkWEqv4IqA1ciL+LyySuLkp7M sJpVRMbWJCNWOOSbaH4oDBJ5dHMGc35x5mosCk90PXknuFDDsYHfDo5smf9lo6YXx7N9 Organization: UCLA Computer Science Department Message-ID: Date: Thu, 15 Oct 2020 10:58:49 -0700 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.10.0 MIME-Version: 1.0 In-Reply-To: <838sd75yor.fsf@gnu.org> Content-Type: multipart/mixed; boundary="------------07474DA9A9758F7BDC2793DE" Content-Language: en-US X-Spam-Score: -3.3 (---) X-Debbugs-Envelope-To: 43439 Cc: 43439@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -4.3 (----) This is a multi-part message in MIME format. --------------07474DA9A9758F7BDC2793DE Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit On 9/18/20 12:30 AM, Eli Zaretskii wrote: > How about ... we modify doprint to exit when either it finds > NUL or reaches the character specified by FORMAT_END? This will allow > us to keep some of the feature, and I think the amount of changes will > be smaller. It should also not be much slower than what you propose. Better yet, let's leave doprnt's API unchanged, and add a function evsnprintf (named by analogy from esprintf) whose API is like C vsnprintf but which does formatting the Emacs way. We can avoid duplication of code by implementing doprnt in terms of evsnprintf. This fixes the performance issue with current Emacs, and avoids the need for evsnprintf having to check for both NULs and FORMAT_END etc. Updated patch attached. --------------07474DA9A9758F7BDC2793DE Content-Type: text/x-patch; charset=UTF-8; name="0001-New-function-evsnprintf-to-speed-clean-up-doprnt.patch" Content-Disposition: attachment; filename*0="0001-New-function-evsnprintf-to-speed-clean-up-doprnt.patch" Content-Transfer-Encoding: quoted-printable >From c072575e7936c6e9f30864733677f19b1dcdc31a Mon Sep 17 00:00:00 2001 From: Paul Eggert Date: Thu, 15 Oct 2020 10:47:10 -0700 Subject: [PATCH] New function evsnprintf to speed/clean up doprnt MIME-Version: 1.0 Content-Type: text/plain; charset=3DUTF-8 Content-Transfer-Encoding: 8bit The new low-level C function evsnprintf improves on doprnt=E2=80=99s performance and internal checking. On my platform, this improved CPU speed of =E2=80=98make -C lisp compile-always=E2=80=99 by 6%= . This patch implements some of my suggestions in Bug#8545, with further changes suggested by Eli Zaretskii (Bug#43439). * src/doprnt.c: Improve comments. (SIZE_BOUND_EXTRA): Now at top level, for parse_format_integer. (parse_format_integer): New static function, containing some of the old doprnt. Fix a bug that caused doprnt to infloop on formats like "%10s" that Emacs does not use. We could simplify doprnt further if we dropped support for these never-used formats. (doprnt): Implement in terms of the new evsnprintf function. Put inside #if 0 since it=E2=80=99s not currently used. (evsnprintf): New function, that takes just one pass over FORMAT, not two. All doprnt callers changed to use evsnprintf. This new function has the contents of the old doprnt implementation, except its format argument argument is a null-terminated string. Assume C99 to make code clearer. Do not use malloc or alloca to allocate a copy of the format FMTCPY; instead, use a small fixed-size array FMTSTAR, and use '*' in that array to represent width and precision, passing them as separate int arguments. Use eassume to pacify GCC in switch statements. Drop support for "%S" which is not needed, is never used and which would cause GCC to warn anyway. --- src/doprnt.c | 260 ++++++++++++++++++++++++++++++--------------------- src/image.c | 2 +- src/lisp.h | 4 +- src/sysdep.c | 2 +- src/xdisp.c | 5 +- 5 files changed, 159 insertions(+), 114 deletions(-) diff --git a/src/doprnt.c b/src/doprnt.c index ceadf3bdfa..022b9ef16b 100644 --- a/src/doprnt.c +++ b/src/doprnt.c @@ -28,6 +28,7 @@ . For %s and %c, when field width is specified (e.g., %25s), it accou= nts for the display width of each character, according to char-width-table.= That is, it does not assume that each character takes one column on disp= lay. + Nor does it assume that each character is a single byte. =20 . If the size of the buffer is not enough to produce the formatted st= ring in its entirety, it makes sure that truncation does not chop the last @@ -42,38 +43,41 @@ Emacs can handle. =20 OTOH, this function supports only a small subset of the standard C fo= rmatted - output facilities. E.g., %u and %ll are not supported, and precision= is - ignored %s and %c conversions. (See below for the detailed documenta= tion of - what is supported.) However, this is okay, as this function is suppo= sed to - be called from `error' and similar functions, and thus does not need = to - support features beyond those in `Fformat_message', which is used - by `error' on the Lisp level. */ + output facilities. E.g., %u is not supported, precision is ignored + in %s and %c conversions, and %lld does not necessarily work and + code should use something like %"pM"d with intmax_t instead. + (See below for the detailed documentation of what is supported.) + However, this is okay, as this function is supposed to be called + from 'error' and similar C functions, and thus does not need to + support all the features of 'Fformat_message', which is used by the + Lisp 'error' function. */ =20 /* In the FORMAT argument this function supports ` and ' as directives that output left and right quotes as per =E2=80=98text-quoting style=E2= =80=99. It also supports the following %-sequences: =20 %s means print a string argument. - %S is treated as %s, for loose compatibility with `Fformat_message'. %d means print a `signed int' argument in decimal. %o means print an `unsigned int' argument in octal. %x means print an `unsigned int' argument in hex. %e means print a `double' argument in exponential notation. %f means print a `double' argument in decimal-point notation. %g means print a `double' argument in exponential notation - or in decimal-point notation, whichever uses fewer characters. + or in decimal-point notation, depending on the value; + this is often (though not always) the shorter of the two notations= . %c means print a `signed int' argument as a single character. %% means produce a literal % character. =20 - A %-sequence may contain optional flag, width, and precision specifie= rs, and - a length modifier, as follows: + A %-sequence other than %% may contain optional flags, width, precisi= on, + and length, as follows: =20 %character =20 where flags is [+ -0], width is [0-9]+, precision is .[0-9]+, and len= gth is empty or l or the value of the pD or pI or PRIdMAX (sans "d") macr= os. - Also, %% in a format stands for a single % in the output. A % that - does not introduce a valid %-sequence causes undefined behavior. + A % that does not introduce a valid %-sequence causes undefined behav= ior. + ASCII bytes in FORMAT other than % are copied through as-is; + non-ASCII bytes should not appear in FORMAT. =20 The + flag character inserts a + before any positive number, while a = space inserts a space before any positive number; these flags only affect %= d, %o, @@ -99,7 +103,9 @@ =20 For %e, %f, and %g sequences, the number after the "." in the precisi= on specifier says how many decimal places to show; if zero, the decimal = point - itself is omitted. For %s and %S, the precision specifier is ignored= . */ + itself is omitted. For %d, %o, and %x sequences, the precision speci= fies + the minimum number of digits to appear. Precision specifiers are + not supported for other %-sequences. */ =20 #include #include @@ -115,6 +121,36 @@ another macro. */ #include "character.h" =20 +/* Enough to handle floating point formats with large numbers. */ +enum { SIZE_BOUND_EXTRA =3D DBL_MAX_10_EXP + 50 }; + +/* Parse FMT as an unsigned decimal integer, putting its value into *VAL= UE. + Return the address of the first byte after the integer. + If FMT is not an integer, return FMT and store zero into *VALUE. */ +static char const * +parse_format_integer (char const *fmt, int *value) +{ + int n =3D 0; + bool overflow =3D false; + for (; '0' <=3D *fmt && *fmt <=3D '9'; fmt++) + { + overflow |=3D INT_MULTIPLY_WRAPV (n, 10, &n); + overflow |=3D INT_ADD_WRAPV (n, *fmt - '0', &n); + } + if (overflow || min (PTRDIFF_MAX, SIZE_MAX) - SIZE_BOUND_EXTRA < n) + error ("Format width or precision too large"); + *value =3D n; + return fmt; +} + +#if 0 +/* The doprnt function is not currently used in Emacs. + It's here in case it's needed later. + It acts like evsnprintf, except it also supports NULs in formats. */ + +ptrdiff_t doprnt (char *, ptrdiff_t, char const *, char const *, va_list= ) + ATTRIBUTE_FORMAT_PRINTF (3, 0); + /* Generate output from a format-spec FORMAT, terminated at position FORMAT_END. (*FORMAT_END is not part of the format, but must exist and be readabl= e.) @@ -130,13 +166,44 @@ ptrdiff_t doprnt (char *buffer, ptrdiff_t bufsize, const char *format, const char *format_end, va_list ap) +{ + ptrdiff_t nbytes =3D 0; + char const *f =3D format; + for (char const *fend; (fend =3D memchr (f, 0, format_end - f)); f =3D= fend + 1) + { + nbytes +=3D evsnprintf (buffer + nbytes, bufsize - nbytes, f, ap); + if (nbytes =3D=3D bufsize - 1) + return nbytes; + nbytes++; + } + + USE_SAFE_ALLOCA; + ptrdiff_t ftaillen =3D format_end - f; + char *ftail =3D SAFE_ALLOCA (ftaillen + 1); + memcpy (ftail, f, ftaillen); + ftail[ftaillen] =3D 0; + nbytes +=3D evsnprintf (buffer + nbytes, bufsize - nbytes, ftail, ap); + SAFE_FREE (); + return nbytes; +} +#endif + +/* Generate output from a format-spec FORMAT. + Output goes in BUFFER, which has room for BUFSIZE chars. + BUFSIZE must be positive. If the output does not fit, truncate it + to fit and return BUFSIZE - 1; if this truncates a multibyte + sequence, store '\0' into the sequence's first byte. + Return the number of bytes stored into BUFFER, excluding + the terminating null byte. Output is always null-terminated. + String arguments are passed as C strings. + Integers are passed as C integers. */ + +ptrdiff_t +evsnprintf (char *buffer, ptrdiff_t bufsize, char const *format, va_list= ap) { const char *fmt =3D format; /* Pointer into format string. */ char *bufptr =3D buffer; /* Pointer into output buffer. */ =20 - /* Enough to handle floating point formats with large numbers. */ - enum { SIZE_BOUND_EXTRA =3D DBL_MAX_10_EXP + 50 }; - /* Use this for sprintf unless we need something really big. */ char tembuf[SIZE_BOUND_EXTRA + 50]; =20 @@ -150,103 +217,91 @@ doprnt (char *buffer, ptrdiff_t bufsize, const cha= r *format, char *big_buffer =3D NULL; =20 enum text_quoting_style quoting_style =3D text_quoting_style (); - ptrdiff_t tem =3D -1; - char *string; - char fixed_buffer[20]; /* Default buffer for small formatting. */ - char *fmtcpy; - int minlen; - char charbuf[MAX_MULTIBYTE_LENGTH + 1]; /* Used for %c. */ - USE_SAFE_ALLOCA; - - if (format_end =3D=3D 0) - format_end =3D format + strlen (format); - - fmtcpy =3D (format_end - format < sizeof (fixed_buffer) - 1 - ? fixed_buffer - : SAFE_ALLOCA (format_end - format + 1)); =20 bufsize--; =20 /* Loop until end of format string or buffer full. */ - while (fmt < format_end && bufsize > 0) + while (*fmt && bufsize > 0) { char const *fmt0 =3D fmt; char fmtchar =3D *fmt++; if (fmtchar =3D=3D '%') { - ptrdiff_t size_bound =3D 0; ptrdiff_t width; /* Columns occupied by STRING on display. */ enum { pDlen =3D sizeof pD - 1, pIlen =3D sizeof pI - 1, - pMlen =3D sizeof PRIdMAX - 2 + pMlen =3D sizeof PRIdMAX - 2, + maxmlen =3D max (max (1, pDlen), max (pIlen, pMlen)) }; enum { no_modifier, long_modifier, pD_modifier, pI_modifier, pM_modifier } length_modifier =3D no_modifier; static char const modifier_len[] =3D { 0, 1, pDlen, pIlen, pMlen }; - int maxmlen =3D max (max (1, pDlen), max (pIlen, pMlen)); int mlen; + char charbuf[MAX_MULTIBYTE_LENGTH + 1]; /* Used for %c. */ =20 - /* Copy this one %-spec into fmtcpy. */ - string =3D fmtcpy; + /* Width and precision specified by this %-sequence. */ + int wid =3D 0, prec =3D -1; + + /* FMTSTAR will be a "%*.*X"-like version of this %-sequence. + Start by putting '%' into FMTSTAR. */ + char fmtstar[sizeof "%-+ 0*.*d" + maxmlen]; + char *string =3D fmtstar; *string++ =3D '%'; - while (fmt < format_end) + + /* Copy at most one instance of each flag into FMTSTAR. */ + bool minusflag =3D false, plusflag =3D false, zeroflag =3D false, + spaceflag =3D false; + for (;; fmt++) { - *string++ =3D *fmt; - if ('0' <=3D *fmt && *fmt <=3D '9') + *string =3D *fmt; + switch (*fmt) { - /* Get an idea of how much space we might need. - This might be a field width or a precision; e.g. - %1.1000f and %1000.1f both might need 1000+ bytes. - Parse the width or precision, checking for overflow. */ - int n =3D *fmt - '0'; - bool overflow =3D false; - while (fmt + 1 < format_end - && '0' <=3D fmt[1] && fmt[1] <=3D '9') - { - overflow |=3D INT_MULTIPLY_WRAPV (n, 10, &n); - overflow |=3D INT_ADD_WRAPV (n, fmt[1] - '0', &n); - *string++ =3D *++fmt; - } - - if (overflow - || min (PTRDIFF_MAX, SIZE_MAX) - SIZE_BOUND_EXTRA < n) - error ("Format width or precision too large"); - if (size_bound < n) - size_bound =3D n; + case '-': string +=3D !minusflag; minusflag =3D true; continue; + case '+': string +=3D !plusflag; plusflag =3D true; continue; + case ' ': string +=3D !spaceflag; spaceflag =3D true; continue; + case '0': string +=3D !zeroflag; zeroflag =3D true; continue; } - else if (! (*fmt =3D=3D '-' || *fmt =3D=3D ' ' || *fmt =3D=3D '.' - || *fmt =3D=3D '+')) - break; - fmt++; + break; } =20 + /* Parse width and precision, putting "*.*" into FMTSTAR. */ + if ('1' <=3D *fmt && *fmt <=3D '9') + fmt =3D parse_format_integer (fmt, &wid); + if (*fmt =3D=3D '.') + fmt =3D parse_format_integer (fmt + 1, &prec); + *string++ =3D '*'; + *string++ =3D '.'; + *string++ =3D '*'; + /* Check for the length modifiers in textual length order, so that longer modifiers override shorter ones. */ for (mlen =3D 1; mlen <=3D maxmlen; mlen++) { - if (format_end - fmt < mlen) - break; if (mlen =3D=3D 1 && *fmt =3D=3D 'l') length_modifier =3D long_modifier; - if (mlen =3D=3D pDlen && memcmp (fmt, pD, pDlen) =3D=3D 0) + if (mlen =3D=3D pDlen && strncmp (fmt, pD, pDlen) =3D=3D 0) length_modifier =3D pD_modifier; - if (mlen =3D=3D pIlen && memcmp (fmt, pI, pIlen) =3D=3D 0) + if (mlen =3D=3D pIlen && strncmp (fmt, pI, pIlen) =3D=3D 0) length_modifier =3D pI_modifier; - if (mlen =3D=3D pMlen && memcmp (fmt, PRIdMAX, pMlen) =3D=3D 0) + if (mlen =3D=3D pMlen && strncmp (fmt, PRIdMAX, pMlen) =3D=3D 0) length_modifier =3D pM_modifier; } =20 + /* Copy optional length modifier and conversion specifier + character into FMTSTAR, and append a NUL. */ mlen =3D modifier_len[length_modifier]; - memcpy (string, fmt + 1, mlen); - string +=3D mlen; + string =3D mempcpy (string, fmt, mlen + 1); fmt +=3D mlen; *string =3D 0; =20 - /* Make the size bound large enough to handle floating point formats + /* An idea of how much space we might need. + This might be a field width or a precision; e.g. + %1.1000f and %1000.1f both might need 1000+ bytes. + Make it large enough to handle floating point formats with large numbers. */ - size_bound +=3D SIZE_BOUND_EXTRA; + ptrdiff_t size_bound =3D max (wid, prec) + SIZE_BOUND_EXTRA; =20 /* Make sure we have that much. */ if (size_bound > size_allocated) @@ -257,48 +312,49 @@ doprnt (char *buffer, ptrdiff_t bufsize, const char= *format, sprintf_buffer =3D big_buffer; size_allocated =3D size_bound; } - minlen =3D 0; + int minlen =3D 0; + ptrdiff_t tem; switch (*fmt++) { default: - error ("Invalid format operation %s", fmtcpy); + error ("Invalid format operation %s", fmt0); =20 -/* case 'b': */ - case 'l': case 'd': switch (length_modifier) { case no_modifier: { int v =3D va_arg (ap, int); - tem =3D sprintf (sprintf_buffer, fmtcpy, v); + tem =3D sprintf (sprintf_buffer, fmtstar, wid, prec, v); } break; case long_modifier: { long v =3D va_arg (ap, long); - tem =3D sprintf (sprintf_buffer, fmtcpy, v); + tem =3D sprintf (sprintf_buffer, fmtstar, wid, prec, v); } break; case pD_modifier: signed_pD_modifier: { ptrdiff_t v =3D va_arg (ap, ptrdiff_t); - tem =3D sprintf (sprintf_buffer, fmtcpy, v); + tem =3D sprintf (sprintf_buffer, fmtstar, wid, prec, v); } break; case pI_modifier: { EMACS_INT v =3D va_arg (ap, EMACS_INT); - tem =3D sprintf (sprintf_buffer, fmtcpy, v); + tem =3D sprintf (sprintf_buffer, fmtstar, wid, prec, v); } break; case pM_modifier: { intmax_t v =3D va_arg (ap, intmax_t); - tem =3D sprintf (sprintf_buffer, fmtcpy, v); + tem =3D sprintf (sprintf_buffer, fmtstar, wid, prec, v); } break; + default: + eassume (false); } /* Now copy into final output, truncating as necessary. */ string =3D sprintf_buffer; @@ -311,13 +367,13 @@ doprnt (char *buffer, ptrdiff_t bufsize, const char= *format, case no_modifier: { unsigned v =3D va_arg (ap, unsigned); - tem =3D sprintf (sprintf_buffer, fmtcpy, v); + tem =3D sprintf (sprintf_buffer, fmtstar, wid, prec, v); } break; case long_modifier: { unsigned long v =3D va_arg (ap, unsigned long); - tem =3D sprintf (sprintf_buffer, fmtcpy, v); + tem =3D sprintf (sprintf_buffer, fmtstar, wid, prec, v); } break; case pD_modifier: @@ -325,15 +381,17 @@ doprnt (char *buffer, ptrdiff_t bufsize, const char= *format, case pI_modifier: { EMACS_UINT v =3D va_arg (ap, EMACS_UINT); - tem =3D sprintf (sprintf_buffer, fmtcpy, v); + tem =3D sprintf (sprintf_buffer, fmtstar, wid, prec, v); } break; case pM_modifier: { uintmax_t v =3D va_arg (ap, uintmax_t); - tem =3D sprintf (sprintf_buffer, fmtcpy, v); + tem =3D sprintf (sprintf_buffer, fmtstar, wid, prec, v); } break; + default: + eassume (false); } /* Now copy into final output, truncating as necessary. */ string =3D sprintf_buffer; @@ -344,22 +402,18 @@ doprnt (char *buffer, ptrdiff_t bufsize, const char= *format, case 'g': { double d =3D va_arg (ap, double); - tem =3D sprintf (sprintf_buffer, fmtcpy, d); + tem =3D sprintf (sprintf_buffer, fmtstar, wid, prec, d); /* Now copy into final output, truncating as necessary. */ string =3D sprintf_buffer; goto doit; } =20 - case 'S': - string[-1] =3D 's'; - FALLTHROUGH; case 's': - if (fmtcpy[1] !=3D 's') - minlen =3D atoi (&fmtcpy[1]); + minlen =3D minusflag ? -wid : wid; string =3D va_arg (ap, char *); tem =3D strnlen (string, STRING_BYTES_BOUND + 1); if (tem =3D=3D STRING_BYTES_BOUND + 1) - error ("String for %%s or %%S format is too long"); + error ("String for %%s format is too long"); width =3D strwidth (string, tem); goto doit1; =20 @@ -432,14 +486,12 @@ doprnt (char *buffer, ptrdiff_t bufsize, const char= *format, string =3D charbuf; string[tem] =3D 0; width =3D strwidth (string, tem); - if (fmtcpy[1] !=3D 'c') - minlen =3D atoi (&fmtcpy[1]); + minlen =3D minusflag ? -wid : wid; goto doit1; } =20 case '%': /* Treat this '%' as normal. */ - fmt0 =3D fmt - 1; break; } } @@ -450,13 +502,13 @@ doprnt (char *buffer, ptrdiff_t bufsize, const char= *format, src =3D uLSQM, srclen =3D sizeof uLSQM - 1; else if (quoting_style =3D=3D CURVE_QUOTING_STYLE && fmtchar =3D=3D= '\'') src =3D uRSQM, srclen =3D sizeof uRSQM - 1; - else if (quoting_style =3D=3D STRAIGHT_QUOTING_STYLE && fmtchar =3D= =3D '`') - src =3D "'", srclen =3D 1; else { - while (fmt < format_end && !CHAR_HEAD_P (*fmt)) - fmt++; - src =3D fmt0, srclen =3D fmt - fmt0; + if (quoting_style =3D=3D STRAIGHT_QUOTING_STYLE && fmtchar =3D=3D '`'= ) + fmtchar =3D '\''; + eassert (ASCII_CHAR_P (fmtchar)); + *bufptr++ =3D fmtchar; + continue; } =20 if (bufsize < srclen) @@ -479,15 +531,13 @@ doprnt (char *buffer, ptrdiff_t bufsize, const char= *format, xfree (big_buffer); =20 *bufptr =3D 0; /* Make sure our string ends with a '\0' */ - - SAFE_FREE (); return bufptr - buffer; } =20 /* Format to an unbounded buffer BUF. This is like sprintf, except it is not limited to returning an 'int' so it doesn't have a silly 2 GiB limit on typical 64-bit hosts. However, it is limited to the - Emacs-style formats that doprnt supports, and it requotes ` and ' + Emacs-style formats that evsnprintf supports, and it requotes ` and ' as per =E2=80=98text-quoting-style=E2=80=99. =20 Return the number of bytes put into BUF, excluding the terminating @@ -495,10 +545,9 @@ doprnt (char *buffer, ptrdiff_t bufsize, const char = *format, ptrdiff_t esprintf (char *buf, char const *format, ...) { - ptrdiff_t nbytes; va_list ap; va_start (ap, format); - nbytes =3D doprnt (buf, TYPE_MAXIMUM (ptrdiff_t), format, 0, ap); + ptrdiff_t nbytes =3D evsnprintf (buf, TYPE_MAXIMUM (ptrdiff_t), format= , ap); va_end (ap); return nbytes; } @@ -534,10 +583,9 @@ evxprintf (char **buf, ptrdiff_t *bufsize, { for (;;) { - ptrdiff_t nbytes; va_list ap_copy; va_copy (ap_copy, ap); - nbytes =3D doprnt (*buf, *bufsize, format, 0, ap_copy); + ptrdiff_t nbytes =3D evsnprintf (*buf, *bufsize, format, ap_copy); va_end (ap_copy); if (nbytes < *bufsize - 1) return nbytes; diff --git a/src/image.c b/src/image.c index 25d5af8a8d..8a030c154a 100644 --- a/src/image.c +++ b/src/image.c @@ -7826,7 +7826,7 @@ tiff_handler (const char *, const char *, const cha= r *, va_list) tiff_handler (const char *log_format, const char *title, const char *format, va_list ap) { - /* doprnt is not suitable here, as TIFF handlers are called from + /* evsnprintf is not suitable here, as TIFF handlers are called from libtiff and are passed arbitrary printf directives. Instead, use vsnprintf, taking care to be portable to nonstandard environments where vsnprintf returns -1 on buffer overflow. Since it's just a diff --git a/src/lisp.h b/src/lisp.h index 45353fbef3..e4de9c8e9c 100644 --- a/src/lisp.h +++ b/src/lisp.h @@ -4034,8 +4034,8 @@ #define FLOAT_TO_STRING_BUFSIZE 350 extern void syms_of_print (void); =20 /* Defined in doprnt.c. */ -extern ptrdiff_t doprnt (char *, ptrdiff_t, const char *, const char *, - va_list); +extern ptrdiff_t evsnprintf (char *, ptrdiff_t, const char *, va_list) + ATTRIBUTE_FORMAT_PRINTF (3, 0); extern ptrdiff_t esprintf (char *, char const *, ...) ATTRIBUTE_FORMAT_PRINTF (2, 3); extern ptrdiff_t exprintf (char **, ptrdiff_t *, char const *, ptrdiff_t= , diff --git a/src/sysdep.c b/src/sysdep.c index f6c0ddee01..da176634a5 100644 --- a/src/sysdep.c +++ b/src/sysdep.c @@ -2192,7 +2192,7 @@ snprintf (char *buf, size_t bufsize, char const *fo= rmat, ...) if (size) { va_start (ap, format); - nbytes =3D doprnt (buf, size, format, 0, ap); + nbytes =3D evsnprintf (buf, size, format, ap); va_end (ap); } =20 diff --git a/src/xdisp.c b/src/xdisp.c index 5a62cd6eb5..2f1fe74e1d 100644 --- a/src/xdisp.c +++ b/src/xdisp.c @@ -11269,13 +11269,10 @@ vmessage (const char *m, va_list ap) { if (m) { - ptrdiff_t len; ptrdiff_t maxsize =3D FRAME_MESSAGE_BUF_SIZE (f); USE_SAFE_ALLOCA; char *message_buf =3D SAFE_ALLOCA (maxsize + 1); - - len =3D doprnt (message_buf, maxsize, m, 0, ap); - + ptrdiff_t len =3D evsnprintf (message_buf, maxsize, m, ap); message3 (make_string (message_buf, len)); SAFE_FREE (); } --=20 2.25.1 --------------07474DA9A9758F7BDC2793DE-- From debbugs-submit-bounces@debbugs.gnu.org Thu Oct 15 14:12:21 2020 Received: (at 43439) by debbugs.gnu.org; 15 Oct 2020 18:12:21 +0000 Received: from localhost ([127.0.0.1]:56519 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kT7jg-00053a-TO for submit@debbugs.gnu.org; Thu, 15 Oct 2020 14:12:21 -0400 Received: from eggs.gnu.org ([209.51.188.92]:56342) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kT7je-00053M-SG for 43439@debbugs.gnu.org; Thu, 15 Oct 2020 14:12:19 -0400 Received: from fencepost.gnu.org ([2001:470:142:3::e]:55677) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kT7jY-00039m-Dl; Thu, 15 Oct 2020 14:12:12 -0400 Received: from [176.228.60.248] (port=2094 helo=home-c4e4a596f7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1kT7jV-0001Zg-CF; Thu, 15 Oct 2020 14:12:11 -0400 Date: Thu, 15 Oct 2020 21:12:05 +0300 Message-Id: <83y2k7we8a.fsf@gnu.org> From: Eli Zaretskii To: Paul Eggert In-Reply-To: (message from Paul Eggert on Thu, 15 Oct 2020 10:58:49 -0700) Subject: Re: bug#43439: [PATCH] doprnt improvements References: <20200916015051.20517-1-eggert@cs.ucla.edu> <83o8m57oq7.fsf@gnu.org> <31fb34a7-26c3-552d-e8d4-74624455ffcb@cs.ucla.edu> <838sd75yor.fsf@gnu.org> X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 43439 Cc: 43439@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) > Cc: 43439@debbugs.gnu.org > From: Paul Eggert > Date: Thu, 15 Oct 2020 10:58:49 -0700 > > On 9/18/20 12:30 AM, Eli Zaretskii wrote: > > > How about ... we modify doprint to exit when either it finds > > NUL or reaches the character specified by FORMAT_END? This will allow > > us to keep some of the feature, and I think the amount of changes will > > be smaller. It should also not be much slower than what you propose. > > Better yet, let's leave doprnt's API unchanged, and add a function evsnprintf > (named by analogy from esprintf) whose API is like C vsnprintf but which does > formatting the Emacs way. No, let's not, please. I didn't agree to modifying doprnt in significant ways, so you are now suggesting to do an even more radical modification, just under another name? This is moving away of a potential compromise point, not towards it. From debbugs-submit-bounces@debbugs.gnu.org Thu Oct 15 14:50:57 2020 Received: (at 43439) by debbugs.gnu.org; 15 Oct 2020 18:50:57 +0000 Received: from localhost ([127.0.0.1]:56545 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kT8L3-0005yp-MC for submit@debbugs.gnu.org; Thu, 15 Oct 2020 14:50:57 -0400 Received: from zimbra.cs.ucla.edu ([131.179.128.68]:44012) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kT8L1-0005yc-Pn for 43439@debbugs.gnu.org; Thu, 15 Oct 2020 14:50:56 -0400 Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id 3D5A81600E9; Thu, 15 Oct 2020 11:50:50 -0700 (PDT) Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10032) with ESMTP id 6mdDVRJ0F0Ta; Thu, 15 Oct 2020 11:50:49 -0700 (PDT) Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id 6BB4F1600F9; Thu, 15 Oct 2020 11:50:49 -0700 (PDT) X-Virus-Scanned: amavisd-new at zimbra.cs.ucla.edu Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id zaT963zOjjSE; Thu, 15 Oct 2020 11:50:49 -0700 (PDT) Received: from [192.168.1.9] (cpe-23-243-218-95.socal.res.rr.com [23.243.218.95]) by zimbra.cs.ucla.edu (Postfix) with ESMTPSA id 13E901600E9; Thu, 15 Oct 2020 11:50:49 -0700 (PDT) Subject: Re: bug#43439: [PATCH] doprnt improvements To: Eli Zaretskii References: <20200916015051.20517-1-eggert@cs.ucla.edu> <83o8m57oq7.fsf@gnu.org> <31fb34a7-26c3-552d-e8d4-74624455ffcb@cs.ucla.edu> <838sd75yor.fsf@gnu.org> <83y2k7we8a.fsf@gnu.org> From: Paul Eggert Autocrypt: addr=eggert@cs.ucla.edu; prefer-encrypt=mutual; keydata= mQINBEyAcmQBEADAAyH2xoTu7ppG5D3a8FMZEon74dCvc4+q1XA2J2tBy2pwaTqfhpxxdGA9 Jj50UJ3PD4bSUEgN8tLZ0san47l5XTAFLi2456ciSl5m8sKaHlGdt9XmAAtmXqeZVIYX/UFS 96fDzf4xhEmm/y7LbYEPQdUdxu47xA5KhTYp5bltF3WYDz1Ygd7gx07Auwp7iw7eNvnoDTAl KAl8KYDZzbDNCQGEbpY3efZIvPdeI+FWQN4W+kghy+P6au6PrIIhYraeua7XDdb2LS1en3Ss mE3QjqfRqI/A2ue8JMwsvXe/WK38Ezs6x74iTaqI3AFH6ilAhDqpMnd/msSESNFt76DiO1ZK QMr9amVPknjfPmJISqdhgB1DlEdw34sROf6V8mZw0xfqT6PKE46LcFefzs0kbg4GORf8vjG2 Sf1tk5eU8MBiyN/bZ03bKNjNYMpODDQQwuP84kYLkX2wBxxMAhBxwbDVZudzxDZJ1C2VXujC OJVxq2kljBM9ETYuUGqd75AW2LXrLw6+MuIsHFAYAgRr7+KcwDgBAfwhPBYX34nSSiHlmLC+ KaHLeCLF5ZI2vKm3HEeCTtlOg7xZEONgwzL+fdKo+D6SoC8RRxJKs8a3sVfI4t6CnrQzvJbB n6gxdgCu5i29J1QCYrCYvql2UyFPAK+do99/1jOXT4m2836j1wARAQABtCBQYXVsIEVnZ2Vy dCA8ZWdnZXJ0QGNzLnVjbGEuZWR1PokCVQQTAQgAPwIbAwYLCQgHAwIGFQgCCQoLBBYCAwEC HgECF4AWIQR+N5Kp2Kz31jO8FYjtl+kOYqp+NAUCXyW9lwUJFK4LswAKCRDtl+kOYqp+NKNV D/9HMsI1606n0UuTXHwITsyOjAI9SDOT+C3DUv6qlM5BH2nWAMTiIiyA5uglsJv93oi2vNtF f/Q/m/1cnZWgnVnExkyLI4ENSd1uBvr0/lCSdPlP0Mg6GWSpXMu+x0vdT0AaZNOTE0FnPuol dc3XD76C2qg8sX/iaxXTKHy9P+BlAq/Cs7/pxDQ0EzSn0USZ2C0l5vv4PMpA/picnS6K609J vDGaORmwZeXIZqQNZV+ZQs+UYtVoguDTqby3IUY1I8BlXHRptaj9AMn4Uoh/CqpQlVojoyWl HqaFnnJBKeF0hvJ9SAyalwuzAjG7vQW07MYncaOFm0woiKbg5JLO8F4SBTIkuO0DCf9nLAay 6VsB4rzwdEfRwjPLYAn7MR3fvHCEzfrkldTraiBO1T0ieDK80I7sLf6pMeCYI19pUlx0/NRM GCddiFIQdfthKWXGRS5LAs8jwBf8H6G5PWinPrEIaomIP21ivuhQD07bYq9IiIdeljjUdHcG I0i/B4M56Zaa8Ff38iniOlrDYCmYWR4dCWZiuQeZ3OgqeQs9a6jTvgdDGVmRVqY+jzk8PlaH fcok8ROhFcHKkcfhuBhL25hlRIshRDOEskXqKwnzrbqga3GXZXfsXAoFbzNhLdLv9A+LJAYS kXP6/5qdTpELVGosyH884VdbBpkGI04oYVqulbkCDQRMgHJkARAApoXrvxP3DIfjCNOtXU/P dwMShKdX/RlSs5PfunV1wbKP8herXHrvQdFVqECaTSxmlhzbk8X0PkY9gcVaU2O49T3qsOd1 cHeF52YFGEt0LhsBeMjgNX5uZ1V76r8gyeVlFpWWb0SIwJUBHrDXexF67upeRb2vdHBjYDNe ySn+0B7gFEqvVmZu+LadudDp6kQLjatFvHQHUSGNshBnkkcaTbiI9Pst0GCc2aiznBiPPA2W QxAPlPRh3OGTsn5THADmbjqY6FEMLasVX8DSCblMvLwNeO/8SxziBidhqLpJCqdQRWHku5Xx gIkGeKOz5OLDvXHWJyafrEYjjkS6Ak6B5z6svKliClWnjHQcjlPzyoFFgKTEfcqDxCj4RY0D 0DgtFD0NfyeOidrSB/SzTe2hwryQE3rpSiqo+0cGdzh4yAHKYJ+UrXZ4p93ZhjGfKD1xlrNY DlWyW9PGmbvqFuDmiIAQf9WD/wzEfICc+F+uDDI+uYkRxUFp92ykmdhDEFg1yjYsU8iGU69a Hyvhq36z4zctvbqhRNzOWB1bVJ/dIMDvsExGcXQVDIT7sDNXv0wE3jKSKpp7NDG1oXUXL+2+ SF99Kjy753AbQSAmH617fyBNwhJWvQYg+mUvPpiGOtses9EXUI3lS4v0MEaPG43flEs1UR+1 rpFQWVHo1y1OO+sAEQEAAYkCPAQYAQgAJgIbDBYhBH43kqnYrPfWM7wViO2X6Q5iqn40BQJf Jb2zBQkUrgvPAAoJEO2X6Q5iqn40cnMP/17CgUkXT9aIJriPM8wbceYrcl7+bdYEf79SlwSb bHN7R4CoIJFOlN9S/34typGVYvpgmCJDYFTBxyPO92iMXDgA4+cWHzt5T1aYO9hsKhh7vDtK +6ProZGc+08gUTXHhb97hMMQhknJlnfjpSEC9em906FU+I93T1fTGupnBa3aWcK8jM0JaBGb y2hG1S3olaDLSTtBINNBYmvuWR9MKOhhqDrlk5cwFDJLh5NrXteEY08WAzcLzG3pkrXPHkFe MQtfqk0jLdGGvGC3NCIkqYrdLhiRvGpru38C26REn5f4I0vGE3VfIXHe8TMCNmQut1NtMuUm pDIy1aLxGzuptUhnOJN//r+VjDPoi3LOySNYphqe/dMubsfUr6ohP41mKF81FuwI4amqJtrq IL2yqax3a0qlfwCxXftieqJcuekX+eCPDCKrYMXR0FYgwpG2ITZUGtrEjESlE6Dscx734HKd r5ORIocLUUKEOGeiU6DGhGFdb5Twu0Sn+u1mUPDN0M++CdMvClIE8klo4G91EOImu1Upb8xc OPQwxh1jwqSrU5QwoNmSYegQSHLpIUurFz1iQUh1vpPXzKinkWEqv4IqA1ciL+LyySuLkp7M sJpVRMbWJCNWOOSbaH4oDBJ5dHMGc35x5mosCk90PXknuFDDsYHfDo5smf9lo6YXx7N9 Organization: UCLA Computer Science Department Message-ID: Date: Thu, 15 Oct 2020 11:50:48 -0700 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.10.0 MIME-Version: 1.0 In-Reply-To: <83y2k7we8a.fsf@gnu.org> Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 7bit X-Spam-Score: -3.3 (---) X-Debbugs-Envelope-To: 43439 Cc: 43439@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -4.3 (----) On 10/15/20 11:12 AM, Eli Zaretskii wrote: > I didn't agree to modifying doprnt in > significant ways, so you are now suggesting to do an even more radical > modification, just under another name? If you'd rather have the patch keep doprnt entirely as-is (i.e., not change doprnt's implementation at all), I can easily modify the patch to do that. All current Emacs code that calls doprnt would benefit from switching to the proposed evsnprintf function, an API that is simpler and faster and that has better static checking with GCC. From debbugs-submit-bounces@debbugs.gnu.org Thu Oct 15 15:05:29 2020 Received: (at 43439) by debbugs.gnu.org; 15 Oct 2020 19:05:29 +0000 Received: from localhost ([127.0.0.1]:56597 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kT8Z6-0006OD-NW for submit@debbugs.gnu.org; Thu, 15 Oct 2020 15:05:28 -0400 Received: from eggs.gnu.org ([209.51.188.92]:40788) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kT8Z5-0006Nx-4P for 43439@debbugs.gnu.org; Thu, 15 Oct 2020 15:05:27 -0400 Received: from fencepost.gnu.org ([2001:470:142:3::e]:56703) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kT8Yy-0001eL-Nk; Thu, 15 Oct 2020 15:05:20 -0400 Received: from [176.228.60.248] (port=1387 helo=home-c4e4a596f7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1kT8Yx-0003Ic-S2; Thu, 15 Oct 2020 15:05:20 -0400 Date: Thu, 15 Oct 2020 22:05:16 +0300 Message-Id: <83sgafwbrn.fsf@gnu.org> From: Eli Zaretskii To: Paul Eggert In-Reply-To: (message from Paul Eggert on Thu, 15 Oct 2020 11:50:48 -0700) Subject: Re: bug#43439: [PATCH] doprnt improvements References: <20200916015051.20517-1-eggert@cs.ucla.edu> <83o8m57oq7.fsf@gnu.org> <31fb34a7-26c3-552d-e8d4-74624455ffcb@cs.ucla.edu> <838sd75yor.fsf@gnu.org> <83y2k7we8a.fsf@gnu.org> X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 43439 Cc: 43439@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) > Cc: 43439@debbugs.gnu.org > From: Paul Eggert > Date: Thu, 15 Oct 2020 11:50:48 -0700 > > On 10/15/20 11:12 AM, Eli Zaretskii wrote: > > I didn't agree to modifying doprnt in > > significant ways, so you are now suggesting to do an even more radical > > modification, just under another name? > > If you'd rather have the patch keep doprnt entirely as-is (i.e., not change > doprnt's implementation at all), I can easily modify the patch to do that. No, I want to continue calling doprnt directly, not replace all its calls with a call to another function. doprnt by itself is useless unless it is used by the relevant primitives. I see no need to replace it with another function, because doprnt works and works well. > All current Emacs code that calls doprnt would benefit from switching to the > proposed evsnprintf function, an API that is simpler and faster and that has > better static checking with GCC. Yes, that's exactly where we disagree. I made my proposal to find some kind of middle ground, and was disappointed to see you suggesting to move even farther from a potential agreement. In general, I'm against messing with code that has been stable for ages, for ephemeral benefits or minor stylistic reasons. If nothing else, it gets in the way of maintaining Emacs because code I've known for years and could find with my eyes closed constantly shifts and changes under my feet. Another example of this is that src/lisp.h macros look nowadays completely different from what they were several years ago. This need to constantly unlearn that which was burned into my muscle memory is not pleasant at all. Changes that take us forward because they are needed for new and improved features are welcome and justified, but there are no new features in all those changes, including in the doprnt patch. I wish this fever would stop. From debbugs-submit-bounces@debbugs.gnu.org Thu Oct 15 16:06:37 2020 Received: (at 43439) by debbugs.gnu.org; 15 Oct 2020 20:06:37 +0000 Received: from localhost ([127.0.0.1]:56675 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kT9WG-0003id-Lc for submit@debbugs.gnu.org; Thu, 15 Oct 2020 16:06:37 -0400 Received: from zimbra.cs.ucla.edu ([131.179.128.68]:60692) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kT9WE-0003iQ-So for 43439@debbugs.gnu.org; Thu, 15 Oct 2020 16:06:36 -0400 Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id 824621600E9; Thu, 15 Oct 2020 13:06:28 -0700 (PDT) Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10032) with ESMTP id oYlhTGKfJ5A9; Thu, 15 Oct 2020 13:06:26 -0700 (PDT) Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id 8745D1600F9; Thu, 15 Oct 2020 13:06:26 -0700 (PDT) X-Virus-Scanned: amavisd-new at zimbra.cs.ucla.edu Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id lrmAR1-YzpnR; Thu, 15 Oct 2020 13:06:26 -0700 (PDT) Received: from [192.168.1.9] (cpe-23-243-218-95.socal.res.rr.com [23.243.218.95]) by zimbra.cs.ucla.edu (Postfix) with ESMTPSA id 542211600E9; Thu, 15 Oct 2020 13:06:26 -0700 (PDT) Subject: Re: bug#43439: [PATCH] doprnt improvements To: Eli Zaretskii References: <20200916015051.20517-1-eggert@cs.ucla.edu> <83o8m57oq7.fsf@gnu.org> <31fb34a7-26c3-552d-e8d4-74624455ffcb@cs.ucla.edu> <838sd75yor.fsf@gnu.org> <83y2k7we8a.fsf@gnu.org> <83sgafwbrn.fsf@gnu.org> From: Paul Eggert Autocrypt: addr=eggert@cs.ucla.edu; prefer-encrypt=mutual; keydata= mQINBEyAcmQBEADAAyH2xoTu7ppG5D3a8FMZEon74dCvc4+q1XA2J2tBy2pwaTqfhpxxdGA9 Jj50UJ3PD4bSUEgN8tLZ0san47l5XTAFLi2456ciSl5m8sKaHlGdt9XmAAtmXqeZVIYX/UFS 96fDzf4xhEmm/y7LbYEPQdUdxu47xA5KhTYp5bltF3WYDz1Ygd7gx07Auwp7iw7eNvnoDTAl KAl8KYDZzbDNCQGEbpY3efZIvPdeI+FWQN4W+kghy+P6au6PrIIhYraeua7XDdb2LS1en3Ss mE3QjqfRqI/A2ue8JMwsvXe/WK38Ezs6x74iTaqI3AFH6ilAhDqpMnd/msSESNFt76DiO1ZK QMr9amVPknjfPmJISqdhgB1DlEdw34sROf6V8mZw0xfqT6PKE46LcFefzs0kbg4GORf8vjG2 Sf1tk5eU8MBiyN/bZ03bKNjNYMpODDQQwuP84kYLkX2wBxxMAhBxwbDVZudzxDZJ1C2VXujC OJVxq2kljBM9ETYuUGqd75AW2LXrLw6+MuIsHFAYAgRr7+KcwDgBAfwhPBYX34nSSiHlmLC+ KaHLeCLF5ZI2vKm3HEeCTtlOg7xZEONgwzL+fdKo+D6SoC8RRxJKs8a3sVfI4t6CnrQzvJbB n6gxdgCu5i29J1QCYrCYvql2UyFPAK+do99/1jOXT4m2836j1wARAQABtCBQYXVsIEVnZ2Vy dCA8ZWdnZXJ0QGNzLnVjbGEuZWR1PokCVQQTAQgAPwIbAwYLCQgHAwIGFQgCCQoLBBYCAwEC HgECF4AWIQR+N5Kp2Kz31jO8FYjtl+kOYqp+NAUCXyW9lwUJFK4LswAKCRDtl+kOYqp+NKNV D/9HMsI1606n0UuTXHwITsyOjAI9SDOT+C3DUv6qlM5BH2nWAMTiIiyA5uglsJv93oi2vNtF f/Q/m/1cnZWgnVnExkyLI4ENSd1uBvr0/lCSdPlP0Mg6GWSpXMu+x0vdT0AaZNOTE0FnPuol dc3XD76C2qg8sX/iaxXTKHy9P+BlAq/Cs7/pxDQ0EzSn0USZ2C0l5vv4PMpA/picnS6K609J vDGaORmwZeXIZqQNZV+ZQs+UYtVoguDTqby3IUY1I8BlXHRptaj9AMn4Uoh/CqpQlVojoyWl HqaFnnJBKeF0hvJ9SAyalwuzAjG7vQW07MYncaOFm0woiKbg5JLO8F4SBTIkuO0DCf9nLAay 6VsB4rzwdEfRwjPLYAn7MR3fvHCEzfrkldTraiBO1T0ieDK80I7sLf6pMeCYI19pUlx0/NRM GCddiFIQdfthKWXGRS5LAs8jwBf8H6G5PWinPrEIaomIP21ivuhQD07bYq9IiIdeljjUdHcG I0i/B4M56Zaa8Ff38iniOlrDYCmYWR4dCWZiuQeZ3OgqeQs9a6jTvgdDGVmRVqY+jzk8PlaH fcok8ROhFcHKkcfhuBhL25hlRIshRDOEskXqKwnzrbqga3GXZXfsXAoFbzNhLdLv9A+LJAYS kXP6/5qdTpELVGosyH884VdbBpkGI04oYVqulbkCDQRMgHJkARAApoXrvxP3DIfjCNOtXU/P dwMShKdX/RlSs5PfunV1wbKP8herXHrvQdFVqECaTSxmlhzbk8X0PkY9gcVaU2O49T3qsOd1 cHeF52YFGEt0LhsBeMjgNX5uZ1V76r8gyeVlFpWWb0SIwJUBHrDXexF67upeRb2vdHBjYDNe ySn+0B7gFEqvVmZu+LadudDp6kQLjatFvHQHUSGNshBnkkcaTbiI9Pst0GCc2aiznBiPPA2W QxAPlPRh3OGTsn5THADmbjqY6FEMLasVX8DSCblMvLwNeO/8SxziBidhqLpJCqdQRWHku5Xx gIkGeKOz5OLDvXHWJyafrEYjjkS6Ak6B5z6svKliClWnjHQcjlPzyoFFgKTEfcqDxCj4RY0D 0DgtFD0NfyeOidrSB/SzTe2hwryQE3rpSiqo+0cGdzh4yAHKYJ+UrXZ4p93ZhjGfKD1xlrNY DlWyW9PGmbvqFuDmiIAQf9WD/wzEfICc+F+uDDI+uYkRxUFp92ykmdhDEFg1yjYsU8iGU69a Hyvhq36z4zctvbqhRNzOWB1bVJ/dIMDvsExGcXQVDIT7sDNXv0wE3jKSKpp7NDG1oXUXL+2+ SF99Kjy753AbQSAmH617fyBNwhJWvQYg+mUvPpiGOtses9EXUI3lS4v0MEaPG43flEs1UR+1 rpFQWVHo1y1OO+sAEQEAAYkCPAQYAQgAJgIbDBYhBH43kqnYrPfWM7wViO2X6Q5iqn40BQJf Jb2zBQkUrgvPAAoJEO2X6Q5iqn40cnMP/17CgUkXT9aIJriPM8wbceYrcl7+bdYEf79SlwSb bHN7R4CoIJFOlN9S/34typGVYvpgmCJDYFTBxyPO92iMXDgA4+cWHzt5T1aYO9hsKhh7vDtK +6ProZGc+08gUTXHhb97hMMQhknJlnfjpSEC9em906FU+I93T1fTGupnBa3aWcK8jM0JaBGb y2hG1S3olaDLSTtBINNBYmvuWR9MKOhhqDrlk5cwFDJLh5NrXteEY08WAzcLzG3pkrXPHkFe MQtfqk0jLdGGvGC3NCIkqYrdLhiRvGpru38C26REn5f4I0vGE3VfIXHe8TMCNmQut1NtMuUm pDIy1aLxGzuptUhnOJN//r+VjDPoi3LOySNYphqe/dMubsfUr6ohP41mKF81FuwI4amqJtrq IL2yqax3a0qlfwCxXftieqJcuekX+eCPDCKrYMXR0FYgwpG2ITZUGtrEjESlE6Dscx734HKd r5ORIocLUUKEOGeiU6DGhGFdb5Twu0Sn+u1mUPDN0M++CdMvClIE8klo4G91EOImu1Upb8xc OPQwxh1jwqSrU5QwoNmSYegQSHLpIUurFz1iQUh1vpPXzKinkWEqv4IqA1ciL+LyySuLkp7M sJpVRMbWJCNWOOSbaH4oDBJ5dHMGc35x5mosCk90PXknuFDDsYHfDo5smf9lo6YXx7N9 Organization: UCLA Computer Science Department Message-ID: <74baf31e-43ec-677a-e976-889a101c1e9f@cs.ucla.edu> Date: Thu, 15 Oct 2020 13:06:26 -0700 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.10.0 MIME-Version: 1.0 In-Reply-To: <83sgafwbrn.fsf@gnu.org> Content-Type: multipart/mixed; boundary="------------D006C7B2D38DCAF19C17397E" Content-Language: en-US X-Spam-Score: -3.3 (---) X-Debbugs-Envelope-To: 43439 Cc: 43439@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -4.3 (----) This is a multi-part message in MIME format. --------------D006C7B2D38DCAF19C17397E Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit On 10/15/20 12:05 PM, Eli Zaretskii wrote: > I want to continue calling doprnt directly, not replace all its > calls with a call to another function. OK, then attached is a patch that does things that way. This patch affects only the implementation of doprnt; no doprnt callers are affected. The code should have most of the proposed performance benefits of the earlier patches I proposed. --------------D006C7B2D38DCAF19C17397E Content-Type: text/x-patch; charset=UTF-8; name="0001-Improve-doprnt-performance.patch" Content-Disposition: attachment; filename="0001-Improve-doprnt-performance.patch" Content-Transfer-Encoding: quoted-printable >From 9c33be96d1a74b2910753117542358c3d7cc4cf0 Mon Sep 17 00:00:00 2001 From: Paul Eggert Date: Thu, 15 Oct 2020 12:09:55 -0700 Subject: [PATCH] Improve doprnt performance This patch implements some of my suggestions in Bug#8545, with further changes suggested by Eli Zaretskii (Bug#43439). * src/doprnt.c: Improve comments. (SIZE_BOUND_EXTRA): Now at top level, for parse_format_integer. (parse_format_integer): New static function, containing some of the old doprnt. Fix a bug that caused doprnt to infloop on formats like "%10s" that Emacs does not use. We could simplify doprnt further if we dropped support for these never-used formats. (doprnt_nul): New function. (doprnt): Use it. In the typical case where FORMAT_END is null, take just one pass over FORMAT, not two. Assume C99 to make code clearer. Do not use malloc or alloca to allocate a copy of the format FMTCPY; instead, use a small fixed-size array FMTSTAR, and use '*' in that array to represent width and precision, passing them as separate int arguments. Use eassume to pacify GCC in switch statements. Drop support for "%S" which is not needed, is never used and which would cause GCC to warn anyway. --- src/doprnt.c | 241 ++++++++++++++++++++++++++++++--------------------- 1 file changed, 140 insertions(+), 101 deletions(-) diff --git a/src/doprnt.c b/src/doprnt.c index ceadf3bdfa..f1dbf0cb2c 100644 --- a/src/doprnt.c +++ b/src/doprnt.c @@ -28,6 +28,7 @@ . For %s and %c, when field width is specified (e.g., %25s), it accou= nts for the display width of each character, according to char-width-table.= That is, it does not assume that each character takes one column on disp= lay. + Nor does it assume that each character is a single byte. =20 . If the size of the buffer is not enough to produce the formatted st= ring in its entirety, it makes sure that truncation does not chop the last @@ -42,38 +43,41 @@ Emacs can handle. =20 OTOH, this function supports only a small subset of the standard C fo= rmatted - output facilities. E.g., %u and %ll are not supported, and precision= is - ignored %s and %c conversions. (See below for the detailed documenta= tion of - what is supported.) However, this is okay, as this function is suppo= sed to - be called from `error' and similar functions, and thus does not need = to - support features beyond those in `Fformat_message', which is used - by `error' on the Lisp level. */ + output facilities. E.g., %u is not supported, precision is ignored + in %s and %c conversions, and %lld does not necessarily work and + code should use something like %"pM"d with intmax_t instead. + (See below for the detailed documentation of what is supported.) + However, this is okay, as this function is supposed to be called + from 'error' and similar C functions, and thus does not need to + support all the features of 'Fformat_message', which is used by the + Lisp 'error' function. */ =20 /* In the FORMAT argument this function supports ` and ' as directives that output left and right quotes as per =E2=80=98text-quoting style=E2= =80=99. It also supports the following %-sequences: =20 %s means print a string argument. - %S is treated as %s, for loose compatibility with `Fformat_message'. %d means print a `signed int' argument in decimal. %o means print an `unsigned int' argument in octal. %x means print an `unsigned int' argument in hex. %e means print a `double' argument in exponential notation. %f means print a `double' argument in decimal-point notation. %g means print a `double' argument in exponential notation - or in decimal-point notation, whichever uses fewer characters. + or in decimal-point notation, depending on the value; + this is often (though not always) the shorter of the two notations= . %c means print a `signed int' argument as a single character. %% means produce a literal % character. =20 - A %-sequence may contain optional flag, width, and precision specifie= rs, and - a length modifier, as follows: + A %-sequence other than %% may contain optional flags, width, precisi= on, + and length, as follows: =20 %character =20 where flags is [+ -0], width is [0-9]+, precision is .[0-9]+, and len= gth is empty or l or the value of the pD or pI or PRIdMAX (sans "d") macr= os. - Also, %% in a format stands for a single % in the output. A % that - does not introduce a valid %-sequence causes undefined behavior. + A % that does not introduce a valid %-sequence causes undefined behav= ior. + ASCII bytes in FORMAT other than % are copied through as-is; + non-ASCII bytes should not appear in FORMAT. =20 The + flag character inserts a + before any positive number, while a = space inserts a space before any positive number; these flags only affect %= d, %o, @@ -99,7 +103,9 @@ =20 For %e, %f, and %g sequences, the number after the "." in the precisi= on specifier says how many decimal places to show; if zero, the decimal = point - itself is omitted. For %s and %S, the precision specifier is ignored= . */ + itself is omitted. For %d, %o, and %x sequences, the precision speci= fies + the minimum number of digits to appear. Precision specifiers are + not supported for other %-sequences. */ =20 #include #include @@ -115,6 +121,56 @@ another macro. */ #include "character.h" =20 +/* Enough to handle floating point formats with large numbers. */ +enum { SIZE_BOUND_EXTRA =3D DBL_MAX_10_EXP + 50 }; + +/* Parse FMT as an unsigned decimal integer, putting its value into *VAL= UE. + Return the address of the first byte after the integer. + If FMT is not an integer, return FMT and store zero into *VALUE. */ +static char const * +parse_format_integer (char const *fmt, int *value) +{ + int n =3D 0; + bool overflow =3D false; + for (; '0' <=3D *fmt && *fmt <=3D '9'; fmt++) + { + overflow |=3D INT_MULTIPLY_WRAPV (n, 10, &n); + overflow |=3D INT_ADD_WRAPV (n, *fmt - '0', &n); + } + if (overflow || min (PTRDIFF_MAX, SIZE_MAX) - SIZE_BOUND_EXTRA < n) + error ("Format width or precision too large"); + *value =3D n; + return fmt; +} + +/* Like doprnt, except FORMAT_END must be non-null. + Although this function is never exercised in current Emacs, + it is retained in case some future Emacs version contains doprnt + callers with a nonnull FORMAT_END. */ +static ptrdiff_t +doprnt_nul (char *buffer, ptrdiff_t bufsize, char const *format, + char const *format_end, va_list ap) +{ + ptrdiff_t nbytes =3D 0; + char const *f =3D format; + for (char const *fend; (fend =3D memchr (f, 0, format_end - f)); f =3D= fend + 1) + { + nbytes +=3D doprnt (buffer + nbytes, bufsize - nbytes, f, NULL, ap= ); + if (nbytes =3D=3D bufsize - 1) + return nbytes; + nbytes++; + } + + USE_SAFE_ALLOCA; + ptrdiff_t ftaillen =3D format_end - f; + char *ftail =3D SAFE_ALLOCA (ftaillen + 1); + memcpy (ftail, f, ftaillen); + ftail[ftaillen] =3D 0; + nbytes +=3D doprnt (buffer + nbytes, bufsize - nbytes, ftail, NULL, ap= ); + SAFE_FREE (); + return nbytes; +} + /* Generate output from a format-spec FORMAT, terminated at position FORMAT_END. (*FORMAT_END is not part of the format, but must exist and be readabl= e.) @@ -131,12 +187,12 @@ doprnt (char *buffer, ptrdiff_t bufsize, const char *format, const char *format_end, va_list ap) { + if (format_end) + return doprnt_nul (buffer, bufsize, format, format_end, ap); + const char *fmt =3D format; /* Pointer into format string. */ char *bufptr =3D buffer; /* Pointer into output buffer. */ =20 - /* Enough to handle floating point formats with large numbers. */ - enum { SIZE_BOUND_EXTRA =3D DBL_MAX_10_EXP + 50 }; - /* Use this for sprintf unless we need something really big. */ char tembuf[SIZE_BOUND_EXTRA + 50]; =20 @@ -150,103 +206,91 @@ doprnt (char *buffer, ptrdiff_t bufsize, const cha= r *format, char *big_buffer =3D NULL; =20 enum text_quoting_style quoting_style =3D text_quoting_style (); - ptrdiff_t tem =3D -1; - char *string; - char fixed_buffer[20]; /* Default buffer for small formatting. */ - char *fmtcpy; - int minlen; - char charbuf[MAX_MULTIBYTE_LENGTH + 1]; /* Used for %c. */ - USE_SAFE_ALLOCA; - - if (format_end =3D=3D 0) - format_end =3D format + strlen (format); - - fmtcpy =3D (format_end - format < sizeof (fixed_buffer) - 1 - ? fixed_buffer - : SAFE_ALLOCA (format_end - format + 1)); =20 bufsize--; =20 /* Loop until end of format string or buffer full. */ - while (fmt < format_end && bufsize > 0) + while (*fmt && bufsize > 0) { char const *fmt0 =3D fmt; char fmtchar =3D *fmt++; if (fmtchar =3D=3D '%') { - ptrdiff_t size_bound =3D 0; ptrdiff_t width; /* Columns occupied by STRING on display. */ enum { pDlen =3D sizeof pD - 1, pIlen =3D sizeof pI - 1, - pMlen =3D sizeof PRIdMAX - 2 + pMlen =3D sizeof PRIdMAX - 2, + maxmlen =3D max (max (1, pDlen), max (pIlen, pMlen)) }; enum { no_modifier, long_modifier, pD_modifier, pI_modifier, pM_modifier } length_modifier =3D no_modifier; static char const modifier_len[] =3D { 0, 1, pDlen, pIlen, pMlen }; - int maxmlen =3D max (max (1, pDlen), max (pIlen, pMlen)); int mlen; + char charbuf[MAX_MULTIBYTE_LENGTH + 1]; /* Used for %c. */ =20 - /* Copy this one %-spec into fmtcpy. */ - string =3D fmtcpy; + /* Width and precision specified by this %-sequence. */ + int wid =3D 0, prec =3D -1; + + /* FMTSTAR will be a "%*.*X"-like version of this %-sequence. + Start by putting '%' into FMTSTAR. */ + char fmtstar[sizeof "%-+ 0*.*d" + maxmlen]; + char *string =3D fmtstar; *string++ =3D '%'; - while (fmt < format_end) + + /* Copy at most one instance of each flag into FMTSTAR. */ + bool minusflag =3D false, plusflag =3D false, zeroflag =3D false, + spaceflag =3D false; + for (;; fmt++) { - *string++ =3D *fmt; - if ('0' <=3D *fmt && *fmt <=3D '9') + *string =3D *fmt; + switch (*fmt) { - /* Get an idea of how much space we might need. - This might be a field width or a precision; e.g. - %1.1000f and %1000.1f both might need 1000+ bytes. - Parse the width or precision, checking for overflow. */ - int n =3D *fmt - '0'; - bool overflow =3D false; - while (fmt + 1 < format_end - && '0' <=3D fmt[1] && fmt[1] <=3D '9') - { - overflow |=3D INT_MULTIPLY_WRAPV (n, 10, &n); - overflow |=3D INT_ADD_WRAPV (n, fmt[1] - '0', &n); - *string++ =3D *++fmt; - } - - if (overflow - || min (PTRDIFF_MAX, SIZE_MAX) - SIZE_BOUND_EXTRA < n) - error ("Format width or precision too large"); - if (size_bound < n) - size_bound =3D n; + case '-': string +=3D !minusflag; minusflag =3D true; continue; + case '+': string +=3D !plusflag; plusflag =3D true; continue; + case ' ': string +=3D !spaceflag; spaceflag =3D true; continue; + case '0': string +=3D !zeroflag; zeroflag =3D true; continue; } - else if (! (*fmt =3D=3D '-' || *fmt =3D=3D ' ' || *fmt =3D=3D '.' - || *fmt =3D=3D '+')) - break; - fmt++; + break; } =20 + /* Parse width and precision, putting "*.*" into FMTSTAR. */ + if ('1' <=3D *fmt && *fmt <=3D '9') + fmt =3D parse_format_integer (fmt, &wid); + if (*fmt =3D=3D '.') + fmt =3D parse_format_integer (fmt + 1, &prec); + *string++ =3D '*'; + *string++ =3D '.'; + *string++ =3D '*'; + /* Check for the length modifiers in textual length order, so that longer modifiers override shorter ones. */ for (mlen =3D 1; mlen <=3D maxmlen; mlen++) { - if (format_end - fmt < mlen) - break; if (mlen =3D=3D 1 && *fmt =3D=3D 'l') length_modifier =3D long_modifier; - if (mlen =3D=3D pDlen && memcmp (fmt, pD, pDlen) =3D=3D 0) + if (mlen =3D=3D pDlen && strncmp (fmt, pD, pDlen) =3D=3D 0) length_modifier =3D pD_modifier; - if (mlen =3D=3D pIlen && memcmp (fmt, pI, pIlen) =3D=3D 0) + if (mlen =3D=3D pIlen && strncmp (fmt, pI, pIlen) =3D=3D 0) length_modifier =3D pI_modifier; - if (mlen =3D=3D pMlen && memcmp (fmt, PRIdMAX, pMlen) =3D=3D 0) + if (mlen =3D=3D pMlen && strncmp (fmt, PRIdMAX, pMlen) =3D=3D 0) length_modifier =3D pM_modifier; } =20 + /* Copy optional length modifier and conversion specifier + character into FMTSTAR, and append a NUL. */ mlen =3D modifier_len[length_modifier]; - memcpy (string, fmt + 1, mlen); - string +=3D mlen; + string =3D mempcpy (string, fmt, mlen + 1); fmt +=3D mlen; *string =3D 0; =20 - /* Make the size bound large enough to handle floating point formats + /* An idea of how much space we might need. + This might be a field width or a precision; e.g. + %1.1000f and %1000.1f both might need 1000+ bytes. + Make it large enough to handle floating point formats with large numbers. */ - size_bound +=3D SIZE_BOUND_EXTRA; + ptrdiff_t size_bound =3D max (wid, prec) + SIZE_BOUND_EXTRA; =20 /* Make sure we have that much. */ if (size_bound > size_allocated) @@ -257,48 +301,49 @@ doprnt (char *buffer, ptrdiff_t bufsize, const char= *format, sprintf_buffer =3D big_buffer; size_allocated =3D size_bound; } - minlen =3D 0; + int minlen =3D 0; + ptrdiff_t tem; switch (*fmt++) { default: - error ("Invalid format operation %s", fmtcpy); + error ("Invalid format operation %s", fmt0); =20 -/* case 'b': */ - case 'l': case 'd': switch (length_modifier) { case no_modifier: { int v =3D va_arg (ap, int); - tem =3D sprintf (sprintf_buffer, fmtcpy, v); + tem =3D sprintf (sprintf_buffer, fmtstar, wid, prec, v); } break; case long_modifier: { long v =3D va_arg (ap, long); - tem =3D sprintf (sprintf_buffer, fmtcpy, v); + tem =3D sprintf (sprintf_buffer, fmtstar, wid, prec, v); } break; case pD_modifier: signed_pD_modifier: { ptrdiff_t v =3D va_arg (ap, ptrdiff_t); - tem =3D sprintf (sprintf_buffer, fmtcpy, v); + tem =3D sprintf (sprintf_buffer, fmtstar, wid, prec, v); } break; case pI_modifier: { EMACS_INT v =3D va_arg (ap, EMACS_INT); - tem =3D sprintf (sprintf_buffer, fmtcpy, v); + tem =3D sprintf (sprintf_buffer, fmtstar, wid, prec, v); } break; case pM_modifier: { intmax_t v =3D va_arg (ap, intmax_t); - tem =3D sprintf (sprintf_buffer, fmtcpy, v); + tem =3D sprintf (sprintf_buffer, fmtstar, wid, prec, v); } break; + default: + eassume (false); } /* Now copy into final output, truncating as necessary. */ string =3D sprintf_buffer; @@ -311,13 +356,13 @@ doprnt (char *buffer, ptrdiff_t bufsize, const char= *format, case no_modifier: { unsigned v =3D va_arg (ap, unsigned); - tem =3D sprintf (sprintf_buffer, fmtcpy, v); + tem =3D sprintf (sprintf_buffer, fmtstar, wid, prec, v); } break; case long_modifier: { unsigned long v =3D va_arg (ap, unsigned long); - tem =3D sprintf (sprintf_buffer, fmtcpy, v); + tem =3D sprintf (sprintf_buffer, fmtstar, wid, prec, v); } break; case pD_modifier: @@ -325,15 +370,17 @@ doprnt (char *buffer, ptrdiff_t bufsize, const char= *format, case pI_modifier: { EMACS_UINT v =3D va_arg (ap, EMACS_UINT); - tem =3D sprintf (sprintf_buffer, fmtcpy, v); + tem =3D sprintf (sprintf_buffer, fmtstar, wid, prec, v); } break; case pM_modifier: { uintmax_t v =3D va_arg (ap, uintmax_t); - tem =3D sprintf (sprintf_buffer, fmtcpy, v); + tem =3D sprintf (sprintf_buffer, fmtstar, wid, prec, v); } break; + default: + eassume (false); } /* Now copy into final output, truncating as necessary. */ string =3D sprintf_buffer; @@ -344,22 +391,18 @@ doprnt (char *buffer, ptrdiff_t bufsize, const char= *format, case 'g': { double d =3D va_arg (ap, double); - tem =3D sprintf (sprintf_buffer, fmtcpy, d); + tem =3D sprintf (sprintf_buffer, fmtstar, wid, prec, d); /* Now copy into final output, truncating as necessary. */ string =3D sprintf_buffer; goto doit; } =20 - case 'S': - string[-1] =3D 's'; - FALLTHROUGH; case 's': - if (fmtcpy[1] !=3D 's') - minlen =3D atoi (&fmtcpy[1]); + minlen =3D minusflag ? -wid : wid; string =3D va_arg (ap, char *); tem =3D strnlen (string, STRING_BYTES_BOUND + 1); if (tem =3D=3D STRING_BYTES_BOUND + 1) - error ("String for %%s or %%S format is too long"); + error ("String for %%s format is too long"); width =3D strwidth (string, tem); goto doit1; =20 @@ -432,14 +475,12 @@ doprnt (char *buffer, ptrdiff_t bufsize, const char= *format, string =3D charbuf; string[tem] =3D 0; width =3D strwidth (string, tem); - if (fmtcpy[1] !=3D 'c') - minlen =3D atoi (&fmtcpy[1]); + minlen =3D minusflag ? -wid : wid; goto doit1; } =20 case '%': /* Treat this '%' as normal. */ - fmt0 =3D fmt - 1; break; } } @@ -450,13 +491,13 @@ doprnt (char *buffer, ptrdiff_t bufsize, const char= *format, src =3D uLSQM, srclen =3D sizeof uLSQM - 1; else if (quoting_style =3D=3D CURVE_QUOTING_STYLE && fmtchar =3D=3D= '\'') src =3D uRSQM, srclen =3D sizeof uRSQM - 1; - else if (quoting_style =3D=3D STRAIGHT_QUOTING_STYLE && fmtchar =3D= =3D '`') - src =3D "'", srclen =3D 1; else { - while (fmt < format_end && !CHAR_HEAD_P (*fmt)) - fmt++; - src =3D fmt0, srclen =3D fmt - fmt0; + if (quoting_style =3D=3D STRAIGHT_QUOTING_STYLE && fmtchar =3D=3D '`'= ) + fmtchar =3D '\''; + eassert (ASCII_CHAR_P (fmtchar)); + *bufptr++ =3D fmtchar; + continue; } =20 if (bufsize < srclen) @@ -479,8 +520,6 @@ doprnt (char *buffer, ptrdiff_t bufsize, const char *= format, xfree (big_buffer); =20 *bufptr =3D 0; /* Make sure our string ends with a '\0' */ - - SAFE_FREE (); return bufptr - buffer; } =20 --=20 2.25.1 --------------D006C7B2D38DCAF19C17397E-- From debbugs-submit-bounces@debbugs.gnu.org Sat Oct 17 14:32:54 2020 Received: (at 43439) by debbugs.gnu.org; 17 Oct 2020 18:32:54 +0000 Received: from localhost ([127.0.0.1]:35530 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kTr0g-00039h-4c for submit@debbugs.gnu.org; Sat, 17 Oct 2020 14:32:54 -0400 Received: from eggs.gnu.org ([209.51.188.92]:39182) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kTr0e-00039T-1d for 43439@debbugs.gnu.org; Sat, 17 Oct 2020 14:32:52 -0400 Received: from fencepost.gnu.org ([2001:470:142:3::e]:59763) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kTr0W-0007sJ-Pq; Sat, 17 Oct 2020 14:32:45 -0400 Received: from [176.228.60.248] (port=2267 helo=home-c4e4a596f7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1kTr0W-0008JY-4l; Sat, 17 Oct 2020 14:32:44 -0400 Date: Sat, 17 Oct 2020 21:32:44 +0300 Message-Id: <83pn5gsnxv.fsf@gnu.org> From: Eli Zaretskii To: Paul Eggert In-Reply-To: <74baf31e-43ec-677a-e976-889a101c1e9f@cs.ucla.edu> (message from Paul Eggert on Thu, 15 Oct 2020 13:06:26 -0700) Subject: Re: bug#43439: [PATCH] doprnt improvements References: <20200916015051.20517-1-eggert@cs.ucla.edu> <83o8m57oq7.fsf@gnu.org> <31fb34a7-26c3-552d-e8d4-74624455ffcb@cs.ucla.edu> <838sd75yor.fsf@gnu.org> <83y2k7we8a.fsf@gnu.org> <83sgafwbrn.fsf@gnu.org> <74baf31e-43ec-677a-e976-889a101c1e9f@cs.ucla.edu> X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 43439 Cc: 43439@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) > Cc: 43439@debbugs.gnu.org > From: Paul Eggert > Date: Thu, 15 Oct 2020 13:06:26 -0700 > > On 10/15/20 12:05 PM, Eli Zaretskii wrote: > > I want to continue calling doprnt directly, not replace all its > > calls with a call to another function. > > OK, then attached is a patch that does things that way. This patch affects only > the implementation of doprnt; no doprnt callers are affected. The code should > have most of the proposed performance benefits of the earlier patches I proposed. I'm sorry, but this is still nowhere near the compromise I suggested. And it loses %S, something I didn't agree to. From debbugs-submit-bounces@debbugs.gnu.org Sat Oct 17 22:24:16 2020 Received: (at 43439) by debbugs.gnu.org; 18 Oct 2020 02:24:16 +0000 Received: from localhost ([127.0.0.1]:35944 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kTyMp-0003wM-LU for submit@debbugs.gnu.org; Sat, 17 Oct 2020 22:24:16 -0400 Received: from zimbra.cs.ucla.edu ([131.179.128.68]:41848) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kTyMn-0003w5-D0 for 43439@debbugs.gnu.org; Sat, 17 Oct 2020 22:24:14 -0400 Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id 6BD73160079; Sat, 17 Oct 2020 19:24:07 -0700 (PDT) Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10032) with ESMTP id LbHCZFDjBvwv; Sat, 17 Oct 2020 19:24:05 -0700 (PDT) Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id 0D561160112; Sat, 17 Oct 2020 19:24:05 -0700 (PDT) X-Virus-Scanned: amavisd-new at zimbra.cs.ucla.edu Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id lxUVlxUJzevP; Sat, 17 Oct 2020 19:24:04 -0700 (PDT) Received: from [192.168.1.9] (cpe-23-243-218-95.socal.res.rr.com [23.243.218.95]) by zimbra.cs.ucla.edu (Postfix) with ESMTPSA id 99A2F160079; Sat, 17 Oct 2020 19:24:04 -0700 (PDT) Subject: Re: bug#43439: [PATCH] doprnt improvements To: Eli Zaretskii References: <20200916015051.20517-1-eggert@cs.ucla.edu> <83o8m57oq7.fsf@gnu.org> <31fb34a7-26c3-552d-e8d4-74624455ffcb@cs.ucla.edu> <838sd75yor.fsf@gnu.org> <83y2k7we8a.fsf@gnu.org> <83sgafwbrn.fsf@gnu.org> <74baf31e-43ec-677a-e976-889a101c1e9f@cs.ucla.edu> <83pn5gsnxv.fsf@gnu.org> From: Paul Eggert Autocrypt: addr=eggert@cs.ucla.edu; prefer-encrypt=mutual; keydata= mQINBEyAcmQBEADAAyH2xoTu7ppG5D3a8FMZEon74dCvc4+q1XA2J2tBy2pwaTqfhpxxdGA9 Jj50UJ3PD4bSUEgN8tLZ0san47l5XTAFLi2456ciSl5m8sKaHlGdt9XmAAtmXqeZVIYX/UFS 96fDzf4xhEmm/y7LbYEPQdUdxu47xA5KhTYp5bltF3WYDz1Ygd7gx07Auwp7iw7eNvnoDTAl KAl8KYDZzbDNCQGEbpY3efZIvPdeI+FWQN4W+kghy+P6au6PrIIhYraeua7XDdb2LS1en3Ss mE3QjqfRqI/A2ue8JMwsvXe/WK38Ezs6x74iTaqI3AFH6ilAhDqpMnd/msSESNFt76DiO1ZK QMr9amVPknjfPmJISqdhgB1DlEdw34sROf6V8mZw0xfqT6PKE46LcFefzs0kbg4GORf8vjG2 Sf1tk5eU8MBiyN/bZ03bKNjNYMpODDQQwuP84kYLkX2wBxxMAhBxwbDVZudzxDZJ1C2VXujC OJVxq2kljBM9ETYuUGqd75AW2LXrLw6+MuIsHFAYAgRr7+KcwDgBAfwhPBYX34nSSiHlmLC+ KaHLeCLF5ZI2vKm3HEeCTtlOg7xZEONgwzL+fdKo+D6SoC8RRxJKs8a3sVfI4t6CnrQzvJbB n6gxdgCu5i29J1QCYrCYvql2UyFPAK+do99/1jOXT4m2836j1wARAQABtCBQYXVsIEVnZ2Vy dCA8ZWdnZXJ0QGNzLnVjbGEuZWR1PokCVQQTAQgAPwIbAwYLCQgHAwIGFQgCCQoLBBYCAwEC HgECF4AWIQR+N5Kp2Kz31jO8FYjtl+kOYqp+NAUCXyW9lwUJFK4LswAKCRDtl+kOYqp+NKNV D/9HMsI1606n0UuTXHwITsyOjAI9SDOT+C3DUv6qlM5BH2nWAMTiIiyA5uglsJv93oi2vNtF f/Q/m/1cnZWgnVnExkyLI4ENSd1uBvr0/lCSdPlP0Mg6GWSpXMu+x0vdT0AaZNOTE0FnPuol dc3XD76C2qg8sX/iaxXTKHy9P+BlAq/Cs7/pxDQ0EzSn0USZ2C0l5vv4PMpA/picnS6K609J vDGaORmwZeXIZqQNZV+ZQs+UYtVoguDTqby3IUY1I8BlXHRptaj9AMn4Uoh/CqpQlVojoyWl HqaFnnJBKeF0hvJ9SAyalwuzAjG7vQW07MYncaOFm0woiKbg5JLO8F4SBTIkuO0DCf9nLAay 6VsB4rzwdEfRwjPLYAn7MR3fvHCEzfrkldTraiBO1T0ieDK80I7sLf6pMeCYI19pUlx0/NRM GCddiFIQdfthKWXGRS5LAs8jwBf8H6G5PWinPrEIaomIP21ivuhQD07bYq9IiIdeljjUdHcG I0i/B4M56Zaa8Ff38iniOlrDYCmYWR4dCWZiuQeZ3OgqeQs9a6jTvgdDGVmRVqY+jzk8PlaH fcok8ROhFcHKkcfhuBhL25hlRIshRDOEskXqKwnzrbqga3GXZXfsXAoFbzNhLdLv9A+LJAYS kXP6/5qdTpELVGosyH884VdbBpkGI04oYVqulbkCDQRMgHJkARAApoXrvxP3DIfjCNOtXU/P dwMShKdX/RlSs5PfunV1wbKP8herXHrvQdFVqECaTSxmlhzbk8X0PkY9gcVaU2O49T3qsOd1 cHeF52YFGEt0LhsBeMjgNX5uZ1V76r8gyeVlFpWWb0SIwJUBHrDXexF67upeRb2vdHBjYDNe ySn+0B7gFEqvVmZu+LadudDp6kQLjatFvHQHUSGNshBnkkcaTbiI9Pst0GCc2aiznBiPPA2W QxAPlPRh3OGTsn5THADmbjqY6FEMLasVX8DSCblMvLwNeO/8SxziBidhqLpJCqdQRWHku5Xx gIkGeKOz5OLDvXHWJyafrEYjjkS6Ak6B5z6svKliClWnjHQcjlPzyoFFgKTEfcqDxCj4RY0D 0DgtFD0NfyeOidrSB/SzTe2hwryQE3rpSiqo+0cGdzh4yAHKYJ+UrXZ4p93ZhjGfKD1xlrNY DlWyW9PGmbvqFuDmiIAQf9WD/wzEfICc+F+uDDI+uYkRxUFp92ykmdhDEFg1yjYsU8iGU69a Hyvhq36z4zctvbqhRNzOWB1bVJ/dIMDvsExGcXQVDIT7sDNXv0wE3jKSKpp7NDG1oXUXL+2+ SF99Kjy753AbQSAmH617fyBNwhJWvQYg+mUvPpiGOtses9EXUI3lS4v0MEaPG43flEs1UR+1 rpFQWVHo1y1OO+sAEQEAAYkCPAQYAQgAJgIbDBYhBH43kqnYrPfWM7wViO2X6Q5iqn40BQJf Jb2zBQkUrgvPAAoJEO2X6Q5iqn40cnMP/17CgUkXT9aIJriPM8wbceYrcl7+bdYEf79SlwSb bHN7R4CoIJFOlN9S/34typGVYvpgmCJDYFTBxyPO92iMXDgA4+cWHzt5T1aYO9hsKhh7vDtK +6ProZGc+08gUTXHhb97hMMQhknJlnfjpSEC9em906FU+I93T1fTGupnBa3aWcK8jM0JaBGb y2hG1S3olaDLSTtBINNBYmvuWR9MKOhhqDrlk5cwFDJLh5NrXteEY08WAzcLzG3pkrXPHkFe MQtfqk0jLdGGvGC3NCIkqYrdLhiRvGpru38C26REn5f4I0vGE3VfIXHe8TMCNmQut1NtMuUm pDIy1aLxGzuptUhnOJN//r+VjDPoi3LOySNYphqe/dMubsfUr6ohP41mKF81FuwI4amqJtrq IL2yqax3a0qlfwCxXftieqJcuekX+eCPDCKrYMXR0FYgwpG2ITZUGtrEjESlE6Dscx734HKd r5ORIocLUUKEOGeiU6DGhGFdb5Twu0Sn+u1mUPDN0M++CdMvClIE8klo4G91EOImu1Upb8xc OPQwxh1jwqSrU5QwoNmSYegQSHLpIUurFz1iQUh1vpPXzKinkWEqv4IqA1ciL+LyySuLkp7M sJpVRMbWJCNWOOSbaH4oDBJ5dHMGc35x5mosCk90PXknuFDDsYHfDo5smf9lo6YXx7N9 Organization: UCLA Computer Science Department Message-ID: <26cb04dd-5811-bf1e-4368-5b9f255c43d1@cs.ucla.edu> Date: Sat, 17 Oct 2020 19:24:04 -0700 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.10.0 MIME-Version: 1.0 In-Reply-To: <83pn5gsnxv.fsf@gnu.org> Content-Type: multipart/mixed; boundary="------------FB1E8099BA72135EF6BCC0DF" Content-Language: en-US X-Spam-Score: -2.5 (--) X-Debbugs-Envelope-To: 43439 Cc: 43439@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.5 (---) This is a multi-part message in MIME format. --------------FB1E8099BA72135EF6BCC0DF Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit On 10/17/20 11:32 AM, Eli Zaretskii wrote: > this is still nowhere near the compromise I suggested. > And it loses %S, something I didn't agree to. Attached is a revised patch that implements what you suggested, and it does not lose %S. Here is what you suggested in : > How about a compromise: we modify doprint to exit when either it finds > NUL or reaches the character specified by FORMAT_END? and the attached patch implements this proposed API change. --------------FB1E8099BA72135EF6BCC0DF Content-Type: text/x-patch; charset=UTF-8; name="0001-Improve-doprnt-performance.patch" Content-Disposition: attachment; filename="0001-Improve-doprnt-performance.patch" Content-Transfer-Encoding: quoted-printable >From 43ecff27e57eb4ca6e340827315272c3159fcee7 Mon Sep 17 00:00:00 2001 From: Paul Eggert Date: Sat, 17 Oct 2020 18:35:28 -0700 Subject: [PATCH] Improve doprnt performance This patch implements some of my suggestions in Bug#8545, with further changes suggested by Eli Zaretskii (Bug#43439). * src/doprnt.c: Improve comments. (SIZE_BOUND_EXTRA): Now at top level, for parse_format_integer. (parse_format_integer): New static function, containing some of the old doprnt. Fix a bug that caused doprnt to infloop on formats like "%10s" that Emacs does not use. We could simplify doprnt further if we dropped support for these never-used formats. (doprnt_nul): New function. (doprnt): Use it. Change doprnt API to exit when either it finds NUL or reaches the character specified by FORMAT_END. In the typical case where FORMAT_END is null, take just one pass over FORMAT, not two. Assume C99 to make code clearer. Do not use malloc or alloca to allocate a copy of the format FMTCPY; instead, use a small fixed-size array FMTSTAR, and use '*' in that array to represent width and precision, passing them as separate int arguments. Use eassume to pacify GCC in switch statements. --- src/doprnt.c | 230 +++++++++++++++++++++++++++++---------------------- 1 file changed, 132 insertions(+), 98 deletions(-) diff --git a/src/doprnt.c b/src/doprnt.c index ceadf3bdfa..07c4d8d797 100644 --- a/src/doprnt.c +++ b/src/doprnt.c @@ -28,6 +28,7 @@ . For %s and %c, when field width is specified (e.g., %25s), it accou= nts for the display width of each character, according to char-width-table.= That is, it does not assume that each character takes one column on disp= lay. + Nor does it assume that each character is a single byte. =20 . If the size of the buffer is not enough to produce the formatted st= ring in its entirety, it makes sure that truncation does not chop the last @@ -42,12 +43,14 @@ Emacs can handle. =20 OTOH, this function supports only a small subset of the standard C fo= rmatted - output facilities. E.g., %u and %ll are not supported, and precision= is - ignored %s and %c conversions. (See below for the detailed documenta= tion of - what is supported.) However, this is okay, as this function is suppo= sed to - be called from `error' and similar functions, and thus does not need = to - support features beyond those in `Fformat_message', which is used - by `error' on the Lisp level. */ + output facilities. E.g., %u is not supported, precision is ignored + in %s and %c conversions, and %lld does not necessarily work and + code should use something like %"pM"d with intmax_t instead. + (See below for the detailed documentation of what is supported.) + However, this is okay, as this function is supposed to be called + from 'error' and similar C functions, and thus does not need to + support all the features of 'Fformat_message', which is used by the + Lisp 'error' function. */ =20 /* In the FORMAT argument this function supports ` and ' as directives that output left and right quotes as per =E2=80=98text-quoting style=E2= =80=99. It @@ -61,19 +64,21 @@ %e means print a `double' argument in exponential notation. %f means print a `double' argument in decimal-point notation. %g means print a `double' argument in exponential notation - or in decimal-point notation, whichever uses fewer characters. + or in decimal-point notation, depending on the value; + this is often (though not always) the shorter of the two notations= . %c means print a `signed int' argument as a single character. %% means produce a literal % character. =20 - A %-sequence may contain optional flag, width, and precision specifie= rs, and - a length modifier, as follows: + A %-sequence other than %% may contain optional flags, width, precisi= on, + and length, as follows: =20 %character =20 where flags is [+ -0], width is [0-9]+, precision is .[0-9]+, and len= gth is empty or l or the value of the pD or pI or PRIdMAX (sans "d") macr= os. - Also, %% in a format stands for a single % in the output. A % that - does not introduce a valid %-sequence causes undefined behavior. + A % that does not introduce a valid %-sequence causes undefined behav= ior. + ASCII bytes in FORMAT other than % are copied through as-is; + non-ASCII bytes should not appear in FORMAT. =20 The + flag character inserts a + before any positive number, while a = space inserts a space before any positive number; these flags only affect %= d, %o, @@ -99,7 +104,9 @@ =20 For %e, %f, and %g sequences, the number after the "." in the precisi= on specifier says how many decimal places to show; if zero, the decimal = point - itself is omitted. For %s and %S, the precision specifier is ignored= . */ + itself is omitted. For %d, %o, and %x sequences, the precision speci= fies + the minimum number of digits to appear. Precision specifiers are + not supported for other %-sequences. */ =20 #include #include @@ -115,7 +122,50 @@ another macro. */ #include "character.h" =20 +/* Enough to handle floating point formats with large numbers. */ +enum { SIZE_BOUND_EXTRA =3D DBL_MAX_10_EXP + 50 }; + +/* Parse FMT as an unsigned decimal integer, putting its value into *VAL= UE. + Return the address of the first byte after the integer. + If FMT is not an integer, return FMT and store zero into *VALUE. */ +static char const * +parse_format_integer (char const *fmt, int *value) +{ + int n =3D 0; + bool overflow =3D false; + for (; '0' <=3D *fmt && *fmt <=3D '9'; fmt++) + { + overflow |=3D INT_MULTIPLY_WRAPV (n, 10, &n); + overflow |=3D INT_ADD_WRAPV (n, *fmt - '0', &n); + } + if (overflow || min (PTRDIFF_MAX, SIZE_MAX) - SIZE_BOUND_EXTRA < n) + error ("Format width or precision too large"); + *value =3D n; + return fmt; +} + +/* Like doprnt, except FORMAT must not contain NUL bytes and + FORMAT_END must be non-null. Although this function is never + exercised in current Emacs, it is retained in case some future + Emacs version contains doprnt callers that need such formats. + Having a separate function helps GCC optimize doprnt better. */ +static ptrdiff_t +doprnt_nul (char *buffer, ptrdiff_t bufsize, char const *format, + char const *format_end, va_list ap) +{ + USE_SAFE_ALLOCA; + ptrdiff_t fmtlen =3D format_end - format; + char *fmt =3D SAFE_ALLOCA (fmtlen + 1); + memcpy (fmt, format, fmtlen); + fmt[fmtlen] =3D 0; + ptrdiff_t nbytes =3D doprnt (buffer, bufsize, fmt, NULL, ap); + SAFE_FREE (); + return nbytes; +} + /* Generate output from a format-spec FORMAT, + terminated at either the first NUL or (if FORMAT_END is non-null + and there are no NUL bytes between FORMAT and FORMAT_END) terminated at position FORMAT_END. (*FORMAT_END is not part of the format, but must exist and be readabl= e.) Output goes in BUFFER, which has room for BUFSIZE chars. @@ -131,12 +181,12 @@ doprnt (char *buffer, ptrdiff_t bufsize, const char *format, const char *format_end, va_list ap) { + if (format_end && !memchr (format, 0, format_end - format)) + return doprnt_nul (buffer, bufsize, format, format_end, ap); + const char *fmt =3D format; /* Pointer into format string. */ char *bufptr =3D buffer; /* Pointer into output buffer. */ =20 - /* Enough to handle floating point formats with large numbers. */ - enum { SIZE_BOUND_EXTRA =3D DBL_MAX_10_EXP + 50 }; - /* Use this for sprintf unless we need something really big. */ char tembuf[SIZE_BOUND_EXTRA + 50]; =20 @@ -150,103 +200,91 @@ doprnt (char *buffer, ptrdiff_t bufsize, const cha= r *format, char *big_buffer =3D NULL; =20 enum text_quoting_style quoting_style =3D text_quoting_style (); - ptrdiff_t tem =3D -1; - char *string; - char fixed_buffer[20]; /* Default buffer for small formatting. */ - char *fmtcpy; - int minlen; - char charbuf[MAX_MULTIBYTE_LENGTH + 1]; /* Used for %c. */ - USE_SAFE_ALLOCA; - - if (format_end =3D=3D 0) - format_end =3D format + strlen (format); - - fmtcpy =3D (format_end - format < sizeof (fixed_buffer) - 1 - ? fixed_buffer - : SAFE_ALLOCA (format_end - format + 1)); =20 bufsize--; =20 /* Loop until end of format string or buffer full. */ - while (fmt < format_end && bufsize > 0) + while (*fmt && bufsize > 0) { char const *fmt0 =3D fmt; char fmtchar =3D *fmt++; if (fmtchar =3D=3D '%') { - ptrdiff_t size_bound =3D 0; ptrdiff_t width; /* Columns occupied by STRING on display. */ enum { pDlen =3D sizeof pD - 1, pIlen =3D sizeof pI - 1, - pMlen =3D sizeof PRIdMAX - 2 + pMlen =3D sizeof PRIdMAX - 2, + maxmlen =3D max (max (1, pDlen), max (pIlen, pMlen)) }; enum { no_modifier, long_modifier, pD_modifier, pI_modifier, pM_modifier } length_modifier =3D no_modifier; static char const modifier_len[] =3D { 0, 1, pDlen, pIlen, pMlen }; - int maxmlen =3D max (max (1, pDlen), max (pIlen, pMlen)); int mlen; + char charbuf[MAX_MULTIBYTE_LENGTH + 1]; /* Used for %c. */ =20 - /* Copy this one %-spec into fmtcpy. */ - string =3D fmtcpy; + /* Width and precision specified by this %-sequence. */ + int wid =3D 0, prec =3D -1; + + /* FMTSTAR will be a "%*.*X"-like version of this %-sequence. + Start by putting '%' into FMTSTAR. */ + char fmtstar[sizeof "%-+ 0*.*d" + maxmlen]; + char *string =3D fmtstar; *string++ =3D '%'; - while (fmt < format_end) + + /* Copy at most one instance of each flag into FMTSTAR. */ + bool minusflag =3D false, plusflag =3D false, zeroflag =3D false, + spaceflag =3D false; + for (;; fmt++) { - *string++ =3D *fmt; - if ('0' <=3D *fmt && *fmt <=3D '9') + *string =3D *fmt; + switch (*fmt) { - /* Get an idea of how much space we might need. - This might be a field width or a precision; e.g. - %1.1000f and %1000.1f both might need 1000+ bytes. - Parse the width or precision, checking for overflow. */ - int n =3D *fmt - '0'; - bool overflow =3D false; - while (fmt + 1 < format_end - && '0' <=3D fmt[1] && fmt[1] <=3D '9') - { - overflow |=3D INT_MULTIPLY_WRAPV (n, 10, &n); - overflow |=3D INT_ADD_WRAPV (n, fmt[1] - '0', &n); - *string++ =3D *++fmt; - } - - if (overflow - || min (PTRDIFF_MAX, SIZE_MAX) - SIZE_BOUND_EXTRA < n) - error ("Format width or precision too large"); - if (size_bound < n) - size_bound =3D n; + case '-': string +=3D !minusflag; minusflag =3D true; continue; + case '+': string +=3D !plusflag; plusflag =3D true; continue; + case ' ': string +=3D !spaceflag; spaceflag =3D true; continue; + case '0': string +=3D !zeroflag; zeroflag =3D true; continue; } - else if (! (*fmt =3D=3D '-' || *fmt =3D=3D ' ' || *fmt =3D=3D '.' - || *fmt =3D=3D '+')) - break; - fmt++; + break; } =20 + /* Parse width and precision, putting "*.*" into FMTSTAR. */ + if ('1' <=3D *fmt && *fmt <=3D '9') + fmt =3D parse_format_integer (fmt, &wid); + if (*fmt =3D=3D '.') + fmt =3D parse_format_integer (fmt + 1, &prec); + *string++ =3D '*'; + *string++ =3D '.'; + *string++ =3D '*'; + /* Check for the length modifiers in textual length order, so that longer modifiers override shorter ones. */ for (mlen =3D 1; mlen <=3D maxmlen; mlen++) { - if (format_end - fmt < mlen) - break; if (mlen =3D=3D 1 && *fmt =3D=3D 'l') length_modifier =3D long_modifier; - if (mlen =3D=3D pDlen && memcmp (fmt, pD, pDlen) =3D=3D 0) + if (mlen =3D=3D pDlen && strncmp (fmt, pD, pDlen) =3D=3D 0) length_modifier =3D pD_modifier; - if (mlen =3D=3D pIlen && memcmp (fmt, pI, pIlen) =3D=3D 0) + if (mlen =3D=3D pIlen && strncmp (fmt, pI, pIlen) =3D=3D 0) length_modifier =3D pI_modifier; - if (mlen =3D=3D pMlen && memcmp (fmt, PRIdMAX, pMlen) =3D=3D 0) + if (mlen =3D=3D pMlen && strncmp (fmt, PRIdMAX, pMlen) =3D=3D 0) length_modifier =3D pM_modifier; } =20 + /* Copy optional length modifier and conversion specifier + character into FMTSTAR, and append a NUL. */ mlen =3D modifier_len[length_modifier]; - memcpy (string, fmt + 1, mlen); - string +=3D mlen; + string =3D mempcpy (string, fmt, mlen + 1); fmt +=3D mlen; *string =3D 0; =20 - /* Make the size bound large enough to handle floating point formats + /* An idea of how much space we might need. + This might be a field width or a precision; e.g. + %1.1000f and %1000.1f both might need 1000+ bytes. + Make it large enough to handle floating point formats with large numbers. */ - size_bound +=3D SIZE_BOUND_EXTRA; + ptrdiff_t size_bound =3D max (wid, prec) + SIZE_BOUND_EXTRA; =20 /* Make sure we have that much. */ if (size_bound > size_allocated) @@ -257,48 +295,49 @@ doprnt (char *buffer, ptrdiff_t bufsize, const char= *format, sprintf_buffer =3D big_buffer; size_allocated =3D size_bound; } - minlen =3D 0; + int minlen =3D 0; + ptrdiff_t tem; switch (*fmt++) { default: - error ("Invalid format operation %s", fmtcpy); + error ("Invalid format operation %s", fmt0); =20 -/* case 'b': */ - case 'l': case 'd': switch (length_modifier) { case no_modifier: { int v =3D va_arg (ap, int); - tem =3D sprintf (sprintf_buffer, fmtcpy, v); + tem =3D sprintf (sprintf_buffer, fmtstar, wid, prec, v); } break; case long_modifier: { long v =3D va_arg (ap, long); - tem =3D sprintf (sprintf_buffer, fmtcpy, v); + tem =3D sprintf (sprintf_buffer, fmtstar, wid, prec, v); } break; case pD_modifier: signed_pD_modifier: { ptrdiff_t v =3D va_arg (ap, ptrdiff_t); - tem =3D sprintf (sprintf_buffer, fmtcpy, v); + tem =3D sprintf (sprintf_buffer, fmtstar, wid, prec, v); } break; case pI_modifier: { EMACS_INT v =3D va_arg (ap, EMACS_INT); - tem =3D sprintf (sprintf_buffer, fmtcpy, v); + tem =3D sprintf (sprintf_buffer, fmtstar, wid, prec, v); } break; case pM_modifier: { intmax_t v =3D va_arg (ap, intmax_t); - tem =3D sprintf (sprintf_buffer, fmtcpy, v); + tem =3D sprintf (sprintf_buffer, fmtstar, wid, prec, v); } break; + default: + eassume (false); } /* Now copy into final output, truncating as necessary. */ string =3D sprintf_buffer; @@ -311,13 +350,13 @@ doprnt (char *buffer, ptrdiff_t bufsize, const char= *format, case no_modifier: { unsigned v =3D va_arg (ap, unsigned); - tem =3D sprintf (sprintf_buffer, fmtcpy, v); + tem =3D sprintf (sprintf_buffer, fmtstar, wid, prec, v); } break; case long_modifier: { unsigned long v =3D va_arg (ap, unsigned long); - tem =3D sprintf (sprintf_buffer, fmtcpy, v); + tem =3D sprintf (sprintf_buffer, fmtstar, wid, prec, v); } break; case pD_modifier: @@ -325,15 +364,17 @@ doprnt (char *buffer, ptrdiff_t bufsize, const char= *format, case pI_modifier: { EMACS_UINT v =3D va_arg (ap, EMACS_UINT); - tem =3D sprintf (sprintf_buffer, fmtcpy, v); + tem =3D sprintf (sprintf_buffer, fmtstar, wid, prec, v); } break; case pM_modifier: { uintmax_t v =3D va_arg (ap, uintmax_t); - tem =3D sprintf (sprintf_buffer, fmtcpy, v); + tem =3D sprintf (sprintf_buffer, fmtstar, wid, prec, v); } break; + default: + eassume (false); } /* Now copy into final output, truncating as necessary. */ string =3D sprintf_buffer; @@ -344,18 +385,15 @@ doprnt (char *buffer, ptrdiff_t bufsize, const char= *format, case 'g': { double d =3D va_arg (ap, double); - tem =3D sprintf (sprintf_buffer, fmtcpy, d); + tem =3D sprintf (sprintf_buffer, fmtstar, wid, prec, d); /* Now copy into final output, truncating as necessary. */ string =3D sprintf_buffer; goto doit; } =20 case 'S': - string[-1] =3D 's'; - FALLTHROUGH; case 's': - if (fmtcpy[1] !=3D 's') - minlen =3D atoi (&fmtcpy[1]); + minlen =3D minusflag ? -wid : wid; string =3D va_arg (ap, char *); tem =3D strnlen (string, STRING_BYTES_BOUND + 1); if (tem =3D=3D STRING_BYTES_BOUND + 1) @@ -432,14 +470,12 @@ doprnt (char *buffer, ptrdiff_t bufsize, const char= *format, string =3D charbuf; string[tem] =3D 0; width =3D strwidth (string, tem); - if (fmtcpy[1] !=3D 'c') - minlen =3D atoi (&fmtcpy[1]); + minlen =3D minusflag ? -wid : wid; goto doit1; } =20 case '%': /* Treat this '%' as normal. */ - fmt0 =3D fmt - 1; break; } } @@ -450,13 +486,13 @@ doprnt (char *buffer, ptrdiff_t bufsize, const char= *format, src =3D uLSQM, srclen =3D sizeof uLSQM - 1; else if (quoting_style =3D=3D CURVE_QUOTING_STYLE && fmtchar =3D=3D= '\'') src =3D uRSQM, srclen =3D sizeof uRSQM - 1; - else if (quoting_style =3D=3D STRAIGHT_QUOTING_STYLE && fmtchar =3D= =3D '`') - src =3D "'", srclen =3D 1; else { - while (fmt < format_end && !CHAR_HEAD_P (*fmt)) - fmt++; - src =3D fmt0, srclen =3D fmt - fmt0; + if (quoting_style =3D=3D STRAIGHT_QUOTING_STYLE && fmtchar =3D=3D '`'= ) + fmtchar =3D '\''; + eassert (ASCII_CHAR_P (fmtchar)); + *bufptr++ =3D fmtchar; + continue; } =20 if (bufsize < srclen) @@ -479,8 +515,6 @@ doprnt (char *buffer, ptrdiff_t bufsize, const char *= format, xfree (big_buffer); =20 *bufptr =3D 0; /* Make sure our string ends with a '\0' */ - - SAFE_FREE (); return bufptr - buffer; } =20 --=20 2.25.1 --------------FB1E8099BA72135EF6BCC0DF-- From debbugs-submit-bounces@debbugs.gnu.org Sat Oct 24 06:39:59 2020 Received: (at 43439) by debbugs.gnu.org; 24 Oct 2020 10:39:59 +0000 Received: from localhost ([127.0.0.1]:32819 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kWGxq-0000sj-PR for submit@debbugs.gnu.org; Sat, 24 Oct 2020 06:39:59 -0400 Received: from eggs.gnu.org ([209.51.188.92]:32840) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kWGxp-0000sX-OZ for 43439@debbugs.gnu.org; Sat, 24 Oct 2020 06:39:58 -0400 Received: from fencepost.gnu.org ([2001:470:142:3::e]:45288) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kWGxi-0007F1-Vp; Sat, 24 Oct 2020 06:39:50 -0400 Received: from [176.228.60.248] (port=2845 helo=home-c4e4a596f7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1kWGxi-0007l3-EL; Sat, 24 Oct 2020 06:39:50 -0400 Date: Sat, 24 Oct 2020 13:39:37 +0300 Message-Id: <83mu0bhpqu.fsf@gnu.org> From: Eli Zaretskii To: Paul Eggert In-Reply-To: <26cb04dd-5811-bf1e-4368-5b9f255c43d1@cs.ucla.edu> (message from Paul Eggert on Sat, 17 Oct 2020 19:24:04 -0700) Subject: Re: bug#43439: [PATCH] doprnt improvements References: <20200916015051.20517-1-eggert@cs.ucla.edu> <83o8m57oq7.fsf@gnu.org> <31fb34a7-26c3-552d-e8d4-74624455ffcb@cs.ucla.edu> <838sd75yor.fsf@gnu.org> <83y2k7we8a.fsf@gnu.org> <83sgafwbrn.fsf@gnu.org> <74baf31e-43ec-677a-e976-889a101c1e9f@cs.ucla.edu> <83pn5gsnxv.fsf@gnu.org> <26cb04dd-5811-bf1e-4368-5b9f255c43d1@cs.ucla.edu> X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 43439 Cc: 43439@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) > Cc: 43439@debbugs.gnu.org > From: Paul Eggert > Date: Sat, 17 Oct 2020 19:24:04 -0700 > > On 10/17/20 11:32 AM, Eli Zaretskii wrote: > > this is still nowhere near the compromise I suggested. > > And it loses %S, something I didn't agree to. > > Attached is a revised patch that implements what you suggested, and it does not > lose %S. > > Here is what you suggested in > : > > > How about a compromise: we modify doprint to exit when either it finds > > NUL or reaches the character specified by FORMAT_END? > > and the attached patch implements this proposed API change. Thanks, I'm okay with installing this on master. From debbugs-submit-bounces@debbugs.gnu.org Sat Oct 24 17:02:35 2020 Received: (at 43439-done) by debbugs.gnu.org; 24 Oct 2020 21:02:35 +0000 Received: from localhost ([127.0.0.1]:35287 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kWQgN-0006W3-9c for submit@debbugs.gnu.org; Sat, 24 Oct 2020 17:02:35 -0400 Received: from zimbra.cs.ucla.edu ([131.179.128.68]:38126) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kWQgK-0006Vp-TA for 43439-done@debbugs.gnu.org; Sat, 24 Oct 2020 17:02:34 -0400 Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id 1A8321600C4; Sat, 24 Oct 2020 14:02:27 -0700 (PDT) Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10032) with ESMTP id r8sA40H92bnY; Sat, 24 Oct 2020 14:02:26 -0700 (PDT) Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id 1B639160126; Sat, 24 Oct 2020 14:02:26 -0700 (PDT) X-Virus-Scanned: amavisd-new at zimbra.cs.ucla.edu Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id EF6z-7zG09Vs; Sat, 24 Oct 2020 14:02:25 -0700 (PDT) Received: from [192.168.1.9] (cpe-23-243-218-95.socal.res.rr.com [23.243.218.95]) by zimbra.cs.ucla.edu (Postfix) with ESMTPSA id D2E711600C4; Sat, 24 Oct 2020 14:02:25 -0700 (PDT) Subject: Re: bug#43439: [PATCH] doprnt improvements To: Eli Zaretskii References: <20200916015051.20517-1-eggert@cs.ucla.edu> <83o8m57oq7.fsf@gnu.org> <31fb34a7-26c3-552d-e8d4-74624455ffcb@cs.ucla.edu> <838sd75yor.fsf@gnu.org> <83y2k7we8a.fsf@gnu.org> <83sgafwbrn.fsf@gnu.org> <74baf31e-43ec-677a-e976-889a101c1e9f@cs.ucla.edu> <83pn5gsnxv.fsf@gnu.org> <26cb04dd-5811-bf1e-4368-5b9f255c43d1@cs.ucla.edu> <83mu0bhpqu.fsf@gnu.org> From: Paul Eggert Autocrypt: addr=eggert@cs.ucla.edu; prefer-encrypt=mutual; keydata= mQINBEyAcmQBEADAAyH2xoTu7ppG5D3a8FMZEon74dCvc4+q1XA2J2tBy2pwaTqfhpxxdGA9 Jj50UJ3PD4bSUEgN8tLZ0san47l5XTAFLi2456ciSl5m8sKaHlGdt9XmAAtmXqeZVIYX/UFS 96fDzf4xhEmm/y7LbYEPQdUdxu47xA5KhTYp5bltF3WYDz1Ygd7gx07Auwp7iw7eNvnoDTAl KAl8KYDZzbDNCQGEbpY3efZIvPdeI+FWQN4W+kghy+P6au6PrIIhYraeua7XDdb2LS1en3Ss mE3QjqfRqI/A2ue8JMwsvXe/WK38Ezs6x74iTaqI3AFH6ilAhDqpMnd/msSESNFt76DiO1ZK QMr9amVPknjfPmJISqdhgB1DlEdw34sROf6V8mZw0xfqT6PKE46LcFefzs0kbg4GORf8vjG2 Sf1tk5eU8MBiyN/bZ03bKNjNYMpODDQQwuP84kYLkX2wBxxMAhBxwbDVZudzxDZJ1C2VXujC OJVxq2kljBM9ETYuUGqd75AW2LXrLw6+MuIsHFAYAgRr7+KcwDgBAfwhPBYX34nSSiHlmLC+ KaHLeCLF5ZI2vKm3HEeCTtlOg7xZEONgwzL+fdKo+D6SoC8RRxJKs8a3sVfI4t6CnrQzvJbB n6gxdgCu5i29J1QCYrCYvql2UyFPAK+do99/1jOXT4m2836j1wARAQABtCBQYXVsIEVnZ2Vy dCA8ZWdnZXJ0QGNzLnVjbGEuZWR1PokCVQQTAQgAPwIbAwYLCQgHAwIGFQgCCQoLBBYCAwEC HgECF4AWIQR+N5Kp2Kz31jO8FYjtl+kOYqp+NAUCXyW9lwUJFK4LswAKCRDtl+kOYqp+NKNV D/9HMsI1606n0UuTXHwITsyOjAI9SDOT+C3DUv6qlM5BH2nWAMTiIiyA5uglsJv93oi2vNtF f/Q/m/1cnZWgnVnExkyLI4ENSd1uBvr0/lCSdPlP0Mg6GWSpXMu+x0vdT0AaZNOTE0FnPuol dc3XD76C2qg8sX/iaxXTKHy9P+BlAq/Cs7/pxDQ0EzSn0USZ2C0l5vv4PMpA/picnS6K609J vDGaORmwZeXIZqQNZV+ZQs+UYtVoguDTqby3IUY1I8BlXHRptaj9AMn4Uoh/CqpQlVojoyWl HqaFnnJBKeF0hvJ9SAyalwuzAjG7vQW07MYncaOFm0woiKbg5JLO8F4SBTIkuO0DCf9nLAay 6VsB4rzwdEfRwjPLYAn7MR3fvHCEzfrkldTraiBO1T0ieDK80I7sLf6pMeCYI19pUlx0/NRM GCddiFIQdfthKWXGRS5LAs8jwBf8H6G5PWinPrEIaomIP21ivuhQD07bYq9IiIdeljjUdHcG I0i/B4M56Zaa8Ff38iniOlrDYCmYWR4dCWZiuQeZ3OgqeQs9a6jTvgdDGVmRVqY+jzk8PlaH fcok8ROhFcHKkcfhuBhL25hlRIshRDOEskXqKwnzrbqga3GXZXfsXAoFbzNhLdLv9A+LJAYS kXP6/5qdTpELVGosyH884VdbBpkGI04oYVqulbkCDQRMgHJkARAApoXrvxP3DIfjCNOtXU/P dwMShKdX/RlSs5PfunV1wbKP8herXHrvQdFVqECaTSxmlhzbk8X0PkY9gcVaU2O49T3qsOd1 cHeF52YFGEt0LhsBeMjgNX5uZ1V76r8gyeVlFpWWb0SIwJUBHrDXexF67upeRb2vdHBjYDNe ySn+0B7gFEqvVmZu+LadudDp6kQLjatFvHQHUSGNshBnkkcaTbiI9Pst0GCc2aiznBiPPA2W QxAPlPRh3OGTsn5THADmbjqY6FEMLasVX8DSCblMvLwNeO/8SxziBidhqLpJCqdQRWHku5Xx gIkGeKOz5OLDvXHWJyafrEYjjkS6Ak6B5z6svKliClWnjHQcjlPzyoFFgKTEfcqDxCj4RY0D 0DgtFD0NfyeOidrSB/SzTe2hwryQE3rpSiqo+0cGdzh4yAHKYJ+UrXZ4p93ZhjGfKD1xlrNY DlWyW9PGmbvqFuDmiIAQf9WD/wzEfICc+F+uDDI+uYkRxUFp92ykmdhDEFg1yjYsU8iGU69a Hyvhq36z4zctvbqhRNzOWB1bVJ/dIMDvsExGcXQVDIT7sDNXv0wE3jKSKpp7NDG1oXUXL+2+ SF99Kjy753AbQSAmH617fyBNwhJWvQYg+mUvPpiGOtses9EXUI3lS4v0MEaPG43flEs1UR+1 rpFQWVHo1y1OO+sAEQEAAYkCPAQYAQgAJgIbDBYhBH43kqnYrPfWM7wViO2X6Q5iqn40BQJf Jb2zBQkUrgvPAAoJEO2X6Q5iqn40cnMP/17CgUkXT9aIJriPM8wbceYrcl7+bdYEf79SlwSb bHN7R4CoIJFOlN9S/34typGVYvpgmCJDYFTBxyPO92iMXDgA4+cWHzt5T1aYO9hsKhh7vDtK +6ProZGc+08gUTXHhb97hMMQhknJlnfjpSEC9em906FU+I93T1fTGupnBa3aWcK8jM0JaBGb y2hG1S3olaDLSTtBINNBYmvuWR9MKOhhqDrlk5cwFDJLh5NrXteEY08WAzcLzG3pkrXPHkFe MQtfqk0jLdGGvGC3NCIkqYrdLhiRvGpru38C26REn5f4I0vGE3VfIXHe8TMCNmQut1NtMuUm pDIy1aLxGzuptUhnOJN//r+VjDPoi3LOySNYphqe/dMubsfUr6ohP41mKF81FuwI4amqJtrq IL2yqax3a0qlfwCxXftieqJcuekX+eCPDCKrYMXR0FYgwpG2ITZUGtrEjESlE6Dscx734HKd r5ORIocLUUKEOGeiU6DGhGFdb5Twu0Sn+u1mUPDN0M++CdMvClIE8klo4G91EOImu1Upb8xc OPQwxh1jwqSrU5QwoNmSYegQSHLpIUurFz1iQUh1vpPXzKinkWEqv4IqA1ciL+LyySuLkp7M sJpVRMbWJCNWOOSbaH4oDBJ5dHMGc35x5mosCk90PXknuFDDsYHfDo5smf9lo6YXx7N9 Organization: UCLA Computer Science Department Message-ID: <3ac480a0-bd03-5068-ae9f-ccbe338245d6@cs.ucla.edu> Date: Sat, 24 Oct 2020 14:02:25 -0700 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.10.0 MIME-Version: 1.0 In-Reply-To: <83mu0bhpqu.fsf@gnu.org> Content-Type: multipart/mixed; boundary="------------0534F73B87272FE806262959" Content-Language: en-US X-Spam-Score: -2.4 (--) X-Debbugs-Envelope-To: 43439-done Cc: 43439-done@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.4 (---) This is a multi-part message in MIME format. --------------0534F73B87272FE806262959 Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit OK, I installed the patch, followed by the attached minor cleanups. Closing the bug report. --------------0534F73B87272FE806262959 Content-Type: text/x-patch; charset=UTF-8; name="0001-Rename-doprnt_nul-to-doprnt_non_null_end.patch" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="0001-Rename-doprnt_nul-to-doprnt_non_null_end.patch" >From 28d2931b4bc934d06f449c01e067258d76a16738 Mon Sep 17 00:00:00 2001 From: Paul Eggert Date: Sat, 24 Oct 2020 13:46:46 -0700 Subject: [PATCH 1/2] Rename doprnt_nul to doprnt_non_null_end * src/doprnt.c (doprnt_non_null_end): Rename from doprnt_nul, as the old name was misleading (left over from a previous proposal). Caller changed. --- src/doprnt.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/src/doprnt.c b/src/doprnt.c index 07c4d8d797..be256f4497 100644 --- a/src/doprnt.c +++ b/src/doprnt.c @@ -150,8 +150,8 @@ parse_format_integer (char const *fmt, int *value) Emacs version contains doprnt callers that need such formats. Having a separate function helps GCC optimize doprnt better. */ static ptrdiff_t -doprnt_nul (char *buffer, ptrdiff_t bufsize, char const *format, - char const *format_end, va_list ap) +doprnt_non_null_end (char *buffer, ptrdiff_t bufsize, char const *format, + char const *format_end, va_list ap) { USE_SAFE_ALLOCA; ptrdiff_t fmtlen = format_end - format; @@ -182,7 +182,7 @@ doprnt (char *buffer, ptrdiff_t bufsize, const char *format, const char *format_end, va_list ap) { if (format_end && !memchr (format, 0, format_end - format)) - return doprnt_nul (buffer, bufsize, format, format_end, ap); + return doprnt_non_null_end (buffer, bufsize, format, format_end, ap); const char *fmt = format; /* Pointer into format string. */ char *bufptr = buffer; /* Pointer into output buffer. */ -- 2.25.1 --------------0534F73B87272FE806262959 Content-Type: text/x-patch; charset=UTF-8; name="0002-Minor-doprnt-cleanup-remove-memchr-call.patch" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="0002-Minor-doprnt-cleanup-remove-memchr-call.patch" >From 32e427cca112f5471356c1fa95ba1ed256d200b6 Mon Sep 17 00:00:00 2001 From: Paul Eggert Date: Sat, 24 Oct 2020 13:50:29 -0700 Subject: [PATCH 2/2] Minor doprnt cleanup: remove memchr call * src/doprnt.c (doprnt): Remove unnecessary call to memchr. --- src/doprnt.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/src/doprnt.c b/src/doprnt.c index be256f4497..ce259d07cf 100644 --- a/src/doprnt.c +++ b/src/doprnt.c @@ -144,10 +144,10 @@ parse_format_integer (char const *fmt, int *value) return fmt; } -/* Like doprnt, except FORMAT must not contain NUL bytes and - FORMAT_END must be non-null. Although this function is never - exercised in current Emacs, it is retained in case some future - Emacs version contains doprnt callers that need such formats. +/* Like doprnt, except FORMAT_END must be non-null. + Although this function is never exercised in current Emacs, + it is retained in case some future Emacs version + contains doprnt callers that need such formats. Having a separate function helps GCC optimize doprnt better. */ static ptrdiff_t doprnt_non_null_end (char *buffer, ptrdiff_t bufsize, char const *format, @@ -181,7 +181,7 @@ doprnt_non_null_end (char *buffer, ptrdiff_t bufsize, char const *format, doprnt (char *buffer, ptrdiff_t bufsize, const char *format, const char *format_end, va_list ap) { - if (format_end && !memchr (format, 0, format_end - format)) + if (format_end) return doprnt_non_null_end (buffer, bufsize, format, format_end, ap); const char *fmt = format; /* Pointer into format string. */ -- 2.25.1 --------------0534F73B87272FE806262959-- From unknown Tue Jun 17 20:14:48 2025 Received: (at fakecontrol) by fakecontrolmessage; To: internal_control@debbugs.gnu.org From: Debbugs Internal Request Subject: Internal Control Message-Id: bug archived. Date: Sun, 22 Nov 2020 12:24:05 +0000 User-Agent: Fakemail v42.6.9 # This is a fake control message. # # The action: # bug archived. thanks # This fakemail brought to you by your local debbugs # administrator