GNU bug report logs - #56469
29.0.50; Unibyte dir in directory_files_internal


Package: emacs;

Reported by: Stefan Monnier <monnier <at> iro.umontreal.ca>

Date: Sat, 9 Jul 2022 17:46:01 UTC

Severity: normal

Found in version 29.0.50

Done: Eli Zaretskii <eliz <at> gnu.org>

Bug is archived. No further changes may be made.



Message #25 received at 56469 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 56469 <at> debbugs.gnu.org
Subject: Re: bug#56469: 29.0.50; Unibyte dir in directory_files_internal
Date: Sun, 10 Jul 2022 10:58:30 -0400

Eli Zaretskii [2022-07-10 17:32:17] wrote:

>> From: Stefan Monnier <monnier <at> iro.umontreal.ca>
>> Cc: 56469 <at> debbugs.gnu.org
>> Date: Sun, 10 Jul 2022 10:23:28 -0400
>> 
>> W.r.t. the comment, it's indeed unrelated to the patch (other than
>> the fact that it touches the same code).  The question is when we do:
>> 
>> 	  finalname = (nchars == nbytes)
>> 	              ? make_uninit_string (nbytes)
>> 	              : make_uninit_multibyte_string (nchars, nbytes);
>> 
>> the actual bytes are "decoded" (i.e. in our internal UTF-8 encoding), so
>> (nchars == nbytes) checks whether it's "pure ASCII" or not and if it's
>> pure ASCII we return a unibyte string.
>
> I don't think this is true, because early during startup we don't yet
> have the coding-systems set up, and so the file names are unibyte and
> undecoded.  So that place in dired.c doesn't only handle ASCII when it
> sees that nchars == nbytes.

Hmm... the early startup is actually not a worry here (according to my
tests `directory_files_internal` is first called when we get to
native-compile the macroexp/bytecomp, at which point all our coding
systems have been set up).

But indeed, if the file name coding system is something like `binary`,
DECODE_FILE will always return a unibyte string, so we may have non-ASCII
bytes when (nchars == nbytes).
Thanks, I'll update the comment accordingly.
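
To illustrate the distinction, here is a minimal standalone sketch (not the
actual dired.c code; `utf8_char_count` and `all_ascii` are hypothetical
helpers made up for this example).  It assumes decoded text is plain UTF-8
and that a unibyte string counts one character per byte, so
nchars == nbytes holds for it by definition:

```c
#include <stdbool.h>
#include <stddef.h>

/* Count the characters in a UTF-8 byte sequence by skipping
   continuation bytes (those of the form 10xxxxxx).
   Hypothetical helper, not an Emacs function.  */
static size_t
utf8_char_count (const unsigned char *s, size_t nbytes)
{
  size_t nchars = 0;
  for (size_t i = 0; i < nbytes; i++)
    if ((s[i] & 0xC0) != 0x80)
      nchars++;
  return nchars;
}

/* Return true if every byte is below 0x80, i.e. the name is pure ASCII.
   Hypothetical helper, not an Emacs function.  */
static bool
all_ascii (const unsigned char *s, size_t nbytes)
{
  for (size_t i = 0; i < nbytes; i++)
    if (s[i] >= 0x80)
      return false;
  return true;
}
```

For a decoded name, `utf8_char_count` equals nbytes only when `all_ascii`
holds: the two UTF-8 bytes 0xC3 0xA9 count as one character.  For an
undecoded name, e.g. the single Latin-1 byte 0xE9 under a `binary` file name
coding system, nchars == nbytes is trivially true even though `all_ascii`
is false.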


        Stefan





This bug report was last modified 2 years and 258 days ago.
