GNU bug report logs -
#56469
29.0.50; Unibyte dir in directory_files_internal
Message #25 received at 56469 <at> debbugs.gnu.org (full text, mbox):
Eli Zaretskii [2022-07-10 17:32:17] wrote:
>> From: Stefan Monnier <monnier <at> iro.umontreal.ca>
>> Cc: 56469 <at> debbugs.gnu.org
>> Date: Sun, 10 Jul 2022 10:23:28 -0400
>>
>> W.r.t. the comment, it's indeed unrelated to the patch (other than
>> the fact that it touches the same code). The question is when we do:
>>
>>     finalname = (nchars == nbytes)
>>       ? make_uninit_string (nbytes)
>>       : make_uninit_multibyte_string (nchars, nbytes);
>>
>> the actual bytes are "decoded" (i.e. in our internal UTF-8 encoding), so
>> (nchars == nbytes) checks whether it's "pure ASCII" or not and if it's
>> pure ASCII we return a unibyte string.
>
> I don't think this is true, because early during startup we don't yet
> have the coding-systems set up, and so the file names are unibyte and
> undecoded. So that place in dired.c doesn't only handle ASCII when it
> sees that nchars == nbytes.
Hmm... the early startup is actually not a worry here (according to my
tests `directory_files_internal` is first called when we get to
native-compile the macroexp/bytecomp, at which point all our coding
systems have been set up).
But indeed, if the file name coding system is something like `binary`,
DECODE_FILE will always return a unibyte string, so we may have non-ASCII
bytes when (nchars == nbytes).
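To make the distinction concrete, here is a minimal standalone sketch in C.
It is not the code in dired.c: the helper names are hypothetical, and plain
UTF-8 stands in for the internal multibyte encoding.  It only illustrates
why (nchars == nbytes) implies pure ASCII for a decoded name but not for a
name left undecoded, as with a `binary` file name coding system.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: count characters in a decoded (UTF-8-like)
   buffer.  Every byte that is not a continuation byte (10xxxxxx)
   starts a new character.  */
static size_t
count_chars_multibyte (const unsigned char *s, size_t nbytes)
{
  size_t nchars = 0;
  for (size_t i = 0; i < nbytes; i++)
    if ((s[i] & 0xC0) != 0x80)
      nchars++;
  return nchars;
}

/* Hypothetical helper: in an undecoded (unibyte) buffer every byte
   counts as one character, so nchars always equals nbytes.  */
static size_t
count_chars_unibyte (size_t nbytes)
{
  return nbytes;
}

/* Return true if no byte has its high bit set.  */
static bool
all_ascii (const unsigned char *s, size_t nbytes)
{
  for (size_t i = 0; i < nbytes; i++)
    if (s[i] >= 0x80)
      return false;
  return true;
}
```

For the five bytes 'c' 'a' 'f' 0xC3 0xA9 ("café" in UTF-8), the decoded
count is 4 characters in 5 bytes, so the test correctly reports non-ASCII.
Treated as undecoded bytes, the same buffer has 5 characters in 5 bytes,
so the test passes even though all_ascii returns false.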
Thanks, I'll update the comment accordingly.
Stefan
This bug report was last modified 2 years and 258 days ago.