GNU bug report logs - #40407
[PATCH] slow ENCODE_FILE and DECODE_FILE

Previous Next

Package: emacs;

Reported by: Mattias Engdegård <mattiase <at> acm.org>

Date: Fri, 3 Apr 2020 16:11:01 UTC

Severity: normal

Tags: patch

Done: Mattias Engdegård <mattiase <at> acm.org>

Bug is archived. No further changes may be made.

Full log


View this message in rfc822 format

From: Eli Zaretskii <eliz <at> gnu.org>
To: Mattias Engdegård <mattiase <at> acm.org>
Cc: 40407 <at> debbugs.gnu.org
Subject: bug#40407: [PATCH] slow ENCODE_FILE and DECODE_FILE
Date: Sun, 05 Apr 2020 16:39:25 +0300
> From: Mattias Engdegård <mattiase <at> acm.org>
> Date: Sun, 5 Apr 2020 12:48:51 +0200
> Cc: 40407 <at> debbugs.gnu.org
> 
> 4 apr. 2020 kl. 20.25 skrev Eli Zaretskii <eliz <at> gnu.org>:
> 
> > That's true.  But as a matter of fact, I don't see any calls to
> > code_convert_string with NOCOPY non-zero, they all pass zero or false
> > to it.  So none of the existing direct calls from C wants or expects
> > to get the same string.
> 
> Right. However, I did some reading and believe that nocopy=true is actually correct for all uses of {EN,DE}CODE_FILE, and in fact all calls to code_convert_string_norecord.

I don't think I follow.  We call code_convert_string_norecord, which
invokes code_convert_string with NOCOPY set to 'false'.  So all those
users should NOT receive the same string as the argument, and I don't
believe they expect that and can cope with it.

Perhaps what you meant was that NOCOPY = false was actually acting as
if the value were 'true', due to the bug.  But if so, that didn't
affect ENCODE_FILE and DECODE_FILE, because the bug is only visible
when the CODING_SYSTEM argument is nil, and both ENCODE_FILE and
DECODE_FILE never let that happen:

  if (! NILP (Vfile_name_coding_system))
    return code_convert_string_norecord (fname, Vfile_name_coding_system, 1);
  else if (! NILP (Vdefault_file_name_coding_system))
    return code_convert_string_norecord (fname,
					 Vdefault_file_name_coding_system, 1);
  else
    return fname;

So in practice this bug was probably never seen until now.

> One of the reasons is that the callers need to be careful with mutation wrt GC anyway; any post-recoding mutation is done on copies. (Not being able to change the length of strings also helps.)

I don't think I understand your line of reasoning here.  I don't think
GC is relevant, and as long as we are talking about file names, the
first null byte terminates it even though the Lisp string's length
could be larger.

> Given the limited scope of the change, would you agree to a backport of that to emacs-27?

That'd be a mistake, I think.  My reasoning goes like this: If I'm
right that this bug was never seen, fixing it on emacs-27 will have no
visible effect; and if I'm wrong, then we will break the release
branch.  The danger of breakage in the latter case is much more severe
than the gain from the fix in the former case.

> For the reasons above, I think it's correct and proper to do (on master)
> 
> --- a/src/coding.c
> +++ b/src/coding.c
> @@ -9554,7 +9554,7 @@ code_convert_string (Lisp_Object string, Lisp_Object coding_system,
>  code_convert_string_norecord (Lisp_Object string, Lisp_Object coding_system,
>                               bool encodep)
>  {
> -  return code_convert_string (string, coding_system, Qt, encodep, 0, 1);
> +  return code_convert_string (string, coding_system, Qt, encodep, 1, 1);
>  }

I hope you now agree with me that we should not do this.  The default
should stay NOCOPY = false, and any caller that wants otherwise must
explicitly request that by calling code_convert_string.

Thanks.




This bug report was last modified 5 years and 91 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.