GNU bug report logs - #73307
Fix ctype(3) usage

Previous Next

Package: emacs;

Reported by: Thomas Klausner <wiz <at> gatalith.at>

Date: Mon, 16 Sep 2024 21:26:01 UTC

Severity: normal

Tags: patch

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.

Full log


Message #22 received at 73307 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Paul Eggert <eggert <at> cs.ucla.edu>, Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: luangruo <at> yahoo.com, 73307 <at> debbugs.gnu.org, wiz <at> gatalith.at
Subject: Re: bug#73307: Fix ctype(3) usage
Date: Wed, 18 Sep 2024 14:45:25 +0300
> Date: Tue, 17 Sep 2024 17:05:29 -0700
> Cc: 73307-done <at> debbugs.gnu.org, Thomas Klausner <wiz <at> gatalith.at>,
>  Po Lu <luangruo <at> yahoo.com>
> From: Paul Eggert <eggert <at> cs.ucla.edu>
> 
> On 2024-09-17 04:52, Eli Zaretskii wrote:
> > However, I'm not sure this is the right fix, the function is
> > defined with argument type of 'int'.  Paul, any comments?
> 
> Although that patch was an improvement it still had problems, as it 
> incorrectly assumed the string does not end in a multibyte space, and 
> that Emacs's locale matches the system's.
> 
> Emacs itself should not use <ctype.h> unless it knows the string is 
> unibyte and the system locale matches Emacs's. I scanned through its 
> source code looking for all problematic instances of <ctype.h> that have 
> crept in (except I didn't scan the MS-Windows code, where you're the 
> expert), and found five other places where ctype.h was obviously 
> misused. I installed the attached to fix these glitches and am boldly 
> closing this the report.
> 
> I can't easily test patch 0003, which fixes Android-specific code. 
> Although I think it's an improvement, in unlikely cases I suspect it 
> still doesn't exactly match what the Android kernel does with #! lines. 
> I don't know whether that matters. I'll CC this to Po Lu (my goto person 
> for Android) as a heads-up.

Thanks.

This part doesn't look right to me:

> -  {
> -    char *s = SSDATA (prompt);
> -    ptrdiff_t len = strlen (s);
> -    if ((len > 0) && !isspace (s[len - 1]))
> -      prompt = CALLN (Fconcat, prompt, build_string (" "));
> -  }
> -  prompt = CALLN (Fconcat, prompt, Vyes_or_no_prompt);
> +  ptrdiff_t promptlen = SCHARS (prompt);
> +  bool prompt_ends_in_nonspace
> +    = (0 < promptlen
> +       && (SYNTAX (XFIXNAT (Faref (prompt, make_fixnum (promptlen - 1))))
> +	   != Swhitespace));
> +  AUTO_STRING (space_string, " ");
> +  prompt = CALLN (Fconcat, prompt,
> +		  prompt_ends_in_nonspace ? space_string : empty_unibyte_string,
> +		  Vyes_or_no_prompt);

You are testing characters for whitespace syntax, which is AFAIU
subject to buffer-local syntax tables.  Thus, a strange enough setting
of buffer syntax could make this code return unexpected results.

Stefan, am I right?

Why can't we use something closer to the original code, which doesn't
depend on buffer-local customizations?  In particular, what's the
problem with using c_isspace (and what do you mean by "multibyte
space"?)





This bug report was last modified 296 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.