GNU bug report logs - #40407
[PATCH] slow ENCODE_FILE and DECODE_FILE


Package: emacs;

Reported by: Mattias Engdegård <mattiase <at> acm.org>

Date: Fri, 3 Apr 2020 16:11:01 UTC

Severity: normal

Tags: patch

Done: Mattias Engdegård <mattiase <at> acm.org>

Bug is archived. No further changes may be made.



Message #95 received at 40407 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Mattias Engdegård <mattiase <at> acm.org>
Cc: 40407 <at> debbugs.gnu.org, handa <at> gnu.org, hirofumi <at> mail.parknet.co.jp
Subject: Re: bug#40407: [PATCH] slow ENCODE_FILE and DECODE_FILE
Date: Mon, 06 Apr 2020 20:18:05 +0300

> From: Mattias Engdegård <mattiase <at> acm.org>
> Date: Mon, 6 Apr 2020 18:55:30 +0200
> Cc: hirofumi <at> mail.parknet.co.jp, handa <at> gnu.org, 40407 <at> debbugs.gnu.org
> 
> 6 apr. 2020 kl. 18.33 skrev Eli Zaretskii <eliz <at> gnu.org>:
> 
> > I think it might be just some convenience thing: utf-7 and utf-8 have
> > something in common that made it convenient to treat them the same in
> > the internal routines.  Or maybe it's just an accident.
> 
> There is nothing common between utf-7 and utf-8 at all (apart from a subset of ASCII being encoded in the same way, and the fact that both encode the Unicode repertoire).

By "in common" I meant in terms of how the two encodings are treated
internally.

> > I don't think 'charset' is the right type for this encoding (any
> > reason why you've chosen it?), but I will let Handa-san comment.
> 
> We could use 'raw-text' as well but that implies that any byte value could be part of a utf-7[-imap] text, which is incorrect.
> In fact, utf-7-imap only uses codes 0x20-0x7e (utf-7 is allowed to use a few C0 controls too, as mentioned).
> 
> Arguably the heuristics of define-coding-system-internal are somewhat inscrutable. There seem to be leaks between layers -- ascii-compatible-p is an end-to-end property and cannot really be set the way it is by that function. But since it is, fixing it afterwards should be the correct way.

I prefer to wait for Handa-san's response, and meanwhile install the
least disruptive change, which just fixes the one aspect that got
broken.  Call me a coward, if you wish.
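
For readers following the thread, the two properties discussed above (the
restricted byte range of utf-7 output, and ASCII compatibility as an
end-to-end property) can be illustrated with a small sketch. It uses
Python's built-in "utf-7" codec (RFC 2152) rather than Emacs's coding
systems, and it does not cover the utf-7-imap variant; the helper names
are hypothetical and chosen only for this illustration.

```python
# Illustrative sketch only: Python's "utf-7" codec, not Emacs internals
# and not utf-7-imap. Helper names are hypothetical.

def is_ascii_compatible_for(text: str, codec: str) -> bool:
    """Return True if encoding ASCII-only `text` gives the plain ASCII bytes."""
    return text.encode(codec) == text.encode("ascii")


def uses_only_printable_ascii(data: bytes) -> bool:
    """Return True if every byte lies in the range 0x20-0x7e."""
    return all(0x20 <= b <= 0x7E for b in data)


sample = "a+b"
# utf-8 leaves ASCII text untouched; utf-7 must escape '+' (it becomes "+-"),
# so the encoded bytes differ from the ASCII input.
print(is_ascii_compatible_for(sample, "utf-8"))
print(is_ascii_compatible_for(sample, "utf-7"))

# Non-ASCII characters are shifted into a base64 run, so the output of
# this sample still consists of printable ASCII bytes only.
encoded = "naïve café".encode("utf-7")
print(uses_only_printable_ascii(encoded))
print(encoded.decode("utf-7") == "naïve café")
```

The '+' case is why a utf-7 encoding cannot be called ASCII-compatible
even though its output uses only ASCII bytes.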




This bug report was last modified 5 years and 91 days ago.


