GNU bug report logs -
#40407
[PATCH] slow ENCODE_FILE and DECODE_FILE
Reported by: Mattias Engdegård <mattiase <at> acm.org>
Date: Fri, 3 Apr 2020 16:11:01 UTC
Severity: normal
Tags: patch
Done: Mattias Engdegård <mattiase <at> acm.org>
Bug is archived. No further changes may be made.
Message #59 received at 40407 <at> debbugs.gnu.org (full text, mbox):
> From: Mattias Engdegård <mattiase <at> acm.org>
> Date: Sun, 5 Apr 2020 12:14:59 +0200
> Cc: 40407 <at> debbugs.gnu.org
>
> > I think in the use case where we return a copy, we should make sure
> > the return value is unibyte when encoding and multibyte when decoding.
>
> I'm not necessarily opposed to the suggestion, but why not return a unibyte string in both cases, simplifying the code?
For compatibility with what happens now:
(multibyte-string-p (decode-coding-string "abc" 'utf-8)) => t
> In addition, some operations (aref) are faster on unibyte. Either way, it's nothing that a caller could rely on, is it? (In particular when taking NOCOPY into account.)
That is true, of course, but many/most of our strings are multibyte
nowadays, even if they are ASCII. Suddenly getting a unibyte string
instead would be surprising, I think, even if no one should depend on
it not happening. (The NOCOPY case is different: then it's the caller's
responsibility to deal with the issue.) So I'd rather we produced a
multibyte string when "decoding" by copying.
> +/* Whether a (unibyte) string only contains chars in the 0..127 range. */
One subtle point regarding this comment: I'd remove the "unibyte"
part, because (1) you apply this test to multibyte strings as well,
and (2) strings encoded in iso-2022 will look "pure-ASCII", but they
aren't. The latter subtlety doesn't interfere with the caller,
because iso-2022 is not ASCII-compatible, but it's something I'd
mention in the comment, lest someone use this function for some
other use case.
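
To illustrate point (2), here is a minimal sketch in C. It is not the
code from the patch: the helper name bytes_all_ascii and both sample
arrays are made up for this example. It shows a byte-range check of
the kind the comment describes, together with the bytes of a short
Japanese word in ISO-2022-JP (one member of the iso-2022 family),
which all fall in the 0..127 range even though the text is not ASCII.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not the function from the patch: return true
   if every byte in DATA[0..LEN) is in the range 0..127.  */
static bool
bytes_all_ascii (const unsigned char *data, size_t len)
{
  for (size_t i = 0; i < len; i++)
    if (data[i] >= 0x80)
      return false;
  return true;
}

/* The word "日本語" encoded in ISO-2022-JP: ESC $ B switches to
   JIS X 0208, then come three two-byte codes, then ESC ( B switches
   back to ASCII.  Every byte is below 0x80.  */
static const unsigned char iso2022jp_sample[] = {
  0x1B, '$', 'B', 0x46, 0x7C, 0x4B, 0x5C, 0x38, 0x6C, 0x1B, '(', 'B'
};

/* The same word encoded in UTF-8: every byte is 0x80 or above.  */
static const unsigned char utf8_sample[] = {
  0xE6, 0x97, 0xA5, 0xE6, 0x9C, 0xAC, 0xE8, 0xAA, 0x9E
};
```

With these definitions, bytes_all_ascii returns true for
iso2022jp_sample and false for utf8_sample. So a true result only
means "no decoding needed" when the coding system in question is
ASCII-compatible.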
The patch is OK otherwise. Thanks.
This bug report was last modified 5 years and 91 days ago.