GNU bug report logs - #3603
23.0.94; takes much time to save large non-ASCII buffers

Package: emacs;

Reported by: YAMAMOTO Mitsuharu <mituharu <at> math.s.chiba-u.ac.jp>

Date: Thu, 18 Jun 2009 09:40:05 UTC

Severity: normal

Done: Chong Yidong <cyd <at> stupidchicken.com>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 3603 in the body.
You can then email your comments to 3603 AT debbugs.gnu.org in the normal way.


Report forwarded to bug-submit-list <at> lists.donarmstrong.com, Emacs Bugs <bug-gnu-emacs <at> gnu.org>:
bug#3603; Package emacs. (Thu, 18 Jun 2009 09:40:05 GMT)

Acknowledgement sent to YAMAMOTO Mitsuharu <mituharu <at> math.s.chiba-u.ac.jp>:
New bug report received and forwarded. Copy sent to Emacs Bugs <bug-gnu-emacs <at> gnu.org>. (Thu, 18 Jun 2009 09:40:05 GMT)

Message #5 received at submit <at> emacsbugs.donarmstrong.com:

From: YAMAMOTO Mitsuharu <mituharu <at> math.s.chiba-u.ac.jp>
To: emacs-pretest-bug <at> gnu.org
Subject: 23.0.94; takes much time to save large non-ASCII buffers
Date: Thu, 18 Jun 2009 18:32:37 +0900

Steps to reproduce:

  1. emacs -Q
  2. C-x ( C-x i .../etc/tutorials/TUTORIAL.ja RET C-x )
  3. C-u 20 C-x e
  4. C-x C-s SOME-NEW-FILE-NAME RET

Result:

  It takes much time (~10 sec.) to save this ~1MB buffer.
  Emacs 22 can save it instantly.

The slowness comes from that of select-safe-coding-system, in
particular, find-coding-systems-region(-internal) in it.  The
following patch makes it much faster (a few sec.) than the current
version.

Index: src/coding.c
===================================================================
RCS file: /sources/emacs/emacs/src/coding.c,v
retrieving revision 1.434
diff -c -p -r1.434 coding.c
*** src/coding.c	17 Jun 2009 00:42:07 -0000	1.434
--- src/coding.c	18 Jun 2009 06:05:04 -0000
*************** DEFUN ("find-coding-systems-region-inter
*** 8638,8644 ****
    EMACS_INT start_byte, end_byte;
    const unsigned char *p, *pbeg, *pend;
    int c;
!   Lisp_Object tail, elt;
  
    if (STRINGP (start))
      {
--- 8638,8644 ----
    EMACS_INT start_byte, end_byte;
    const unsigned char *p, *pbeg, *pend;
    int c;
!   Lisp_Object tail, elt, chars_checked;
  
    if (STRINGP (start))
      {
*************** DEFUN ("find-coding-systems-region-inter
*** 8696,8701 ****
--- 8696,8702 ----
    while (p < pend && ASCII_BYTE_P (*p)) p++;
    while (p < pend && ASCII_BYTE_P (*(pend - 1))) pend--;
  
+   chars_checked = Fmake_char_table (Qnil, Qnil);
    while (p < pend)
      {
        if (ASCII_BYTE_P (*p))
*************** DEFUN ("find-coding-systems-region-inter
*** 8703,8708 ****
--- 8704,8711 ----
        else
  	{
  	  c = STRING_CHAR_ADVANCE (p);
+ 	  if (!NILP (char_table_ref (chars_checked, c)))
+ 	    continue;
  
  	  charset_map_loaded = 0;
  	  for (tail = coding_attrs_list; CONSP (tail);)
*************** DEFUN ("find-coding-systems-region-inter
*** 8734,8739 ****
--- 8737,8743 ----
  	      p = pbeg + p_offset;
  	      pend = pbeg + pend_offset;
  	    }
+ 	  char_table_set (chars_checked, c, Qt);
  	}
      }
  

Some notes:

  1. It's still much slower than Emacs 22.  I guess we need to rewrite
     select-safe-coding-system if we try to make its performance
     comparable with Emacs 22.  But perhaps we should avoid such
     changes at this moment.
  2. If the "if (charset_map_loaded) ..." clause in
     Ffind_coding_systems_region_internal is intended for the
     relocation caused by GC, then maybe `chars_checked' above (and
     also `coding_attrs_list') should be GCPROed.
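
The relocation adjustment mentioned in note 2 follows a common C pattern: save offsets before a call that may move the underlying storage, then rebuild the raw pointers from the new base. Below is a minimal standalone sketch of that pattern, not the Emacs code. The names `buffer_t`, `maybe_relocate` and `count_non_ascii` are hypothetical, and `malloc` stands in for whatever could move buffer text inside Emacs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for buffer text whose storage may move. */
typedef struct {
    unsigned char *data;
    size_t len;
} buffer_t;

/* Hypothetical operation that may move the text.  Returns 1 if the
   storage was replaced, 0 if unchanged, -1 on allocation failure
   (in which case the storage is also unchanged). */
static int maybe_relocate(buffer_t *b)
{
    unsigned char *fresh = malloc(b->len ? b->len : 1);
    if (!fresh)
        return -1;
    memcpy(fresh, b->data, b->len);
    free(b->data);
    b->data = fresh;
    return 1;
}

/* Count non-ASCII bytes while something inside the loop may relocate
   the text.  Offsets are saved first, pointers are re-derived after. */
static size_t count_non_ascii(buffer_t *b)
{
    assert(b != NULL && b->data != NULL);
    const unsigned char *pbeg = b->data;
    const unsigned char *p = pbeg;
    const unsigned char *pend = pbeg + b->len;
    size_t count = 0;

    while (p < pend) {
        if (*p++ >= 0x80) {
            count++;
            ptrdiff_t p_offset = p - pbeg;
            ptrdiff_t pend_offset = pend - pbeg;
            if (maybe_relocate(b) > 0) {
                /* Old pointers are stale now; rebuild them. */
                pbeg = b->data;
                p = pbeg + p_offset;
                pend = pbeg + pend_offset;
            }
        }
    }
    return count;
}
```

If nothing in the loop can move the storage, the offset bookkeeping is dead code and can be dropped.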

				     YAMAMOTO Mitsuharu
				mituharu <at> math.s.chiba-u.ac.jp

If Emacs crashed, and you have the Emacs process in the gdb debugger,
please include the output from the following gdb commands:
    `bt full' and `xbacktrace'.
If you would like to further debug the crash, please read the file
/usr/local/share/emacs/23.0.94/etc/DEBUG for instructions.


In GNU Emacs 23.0.94.1 (powerpc-apple-darwin9.7.0, X toolkit)
 of 2009-06-18 on yamamoto-mitsuharu-no-power-mac-g5.local
Windowing system distributor `The X.Org Foundation', version 11.0.10402000
configured using `configure  '--without-gif' '--without-jpeg' '--without-tiff''

Important settings:
  value of $LC_ALL: nil
  value of $LC_COLLATE: nil
  value of $LC_CTYPE: nil
  value of $LC_MESSAGES: nil
  value of $LC_MONETARY: nil
  value of $LC_NUMERIC: nil
  value of $LC_TIME: nil
  value of $LANG: ja_JP.UTF-8
  value of $XMODIFIERS: nil
  locale-coding-system: utf-8-unix
  default-enable-multibyte-characters: t

Major mode: Lisp Interaction

Minor modes in effect:
  tooltip-mode: t
  tool-bar-mode: t
  mouse-wheel-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  blink-cursor-mode: t
  global-auto-composition-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  line-number-mode: t
  transient-mark-mode: t



Information forwarded to bug-submit-list <at> lists.donarmstrong.com, Emacs Bugs <bug-gnu-emacs <at> gnu.org>:
bug#3603; Package emacs. (Thu, 18 Jun 2009 11:50:03 GMT)

Acknowledgement sent to Kenichi Handa <handa <at> m17n.org>:
Extra info received and forwarded to list. Copy sent to Emacs Bugs <bug-gnu-emacs <at> gnu.org>. (Thu, 18 Jun 2009 11:50:03 GMT)

Message #10 received at 3603 <at> emacsbugs.donarmstrong.com:

From: Kenichi Handa <handa <at> m17n.org>
To: YAMAMOTO Mitsuharu <mituharu <at> math.s.chiba-u.ac.jp>,
        3603 <at> debbugs.gnu.org
Subject: Re: bug#3603: 23.0.94; takes much time to save large non-ASCII buffers
Date: Thu, 18 Jun 2009 20:43:28 +0900

In article <wleithrioa.wl%mituharu <at> math.s.chiba-u.ac.jp>, YAMAMOTO Mitsuharu <mituharu <at> math.s.chiba-u.ac.jp> writes:

> Steps to reproduce:
>   1. emacs -Q
>   2. C-x ( C-x i .../etc/tutorials/TUTORIAL.ja RET C-x )
>   3. C-u 20 C-x e
>   4. C-x C-s SOME-NEW-FILE-NAME RET

> Result:

>   It takes much time (~10 sec.) to save this ~1MB buffer.
>   Emacs 22 can save it instantly.

I observed it too.

> The slowness comes from that of select-safe-coding-system, in
> particular, find-coding-systems-region(-internal) in it.  The
> following patch makes it much faster (a few sec.) than the current
> version.

It seems that your patch is correct.  Actually, Emacs 22
used the similar method, but I forgot to implement that part
when I re-wrote find-coding-systems-region-internal.  :-(
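
As a standalone illustration of the caching idea in the patch (not the Emacs code itself), the sketch below runs an expensive per-character check only once per distinct non-ASCII character. It assumes a plain bitmap over the Unicode range in place of the char-table used in the patch. The names `scan_region` and `check_char_against_all_codings` are hypothetical, and the "expensive" check merely counts its own calls.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: only the Unicode range is covered here. */
#define MAX_CHAR 0x110000u

/* Hypothetical stand-in for testing one character against every
   candidate coding system; it only counts how often it is called. */
static size_t expensive_checks;

static void check_char_against_all_codings(uint32_t c)
{
    (void)c;
    expensive_checks++;
}

/* Scan N characters and run the expensive check once per distinct
   non-ASCII character.  Returns the number of checks performed. */
static size_t scan_region(const uint32_t *chars, size_t n)
{
    static uint8_t seen[MAX_CHAR / 8];   /* one bit per character */
    memset(seen, 0, sizeof seen);
    expensive_checks = 0;

    for (size_t i = 0; i < n; i++) {
        uint32_t c = chars[i];
        assert(c < MAX_CHAR);
        if (c < 0x80)
            continue;                    /* ASCII needs no check */
        if (seen[c >> 3] & (1u << (c & 7)))
            continue;                    /* already checked: skip */
        check_char_against_all_codings(c);
        seen[c >> 3] |= (uint8_t)(1u << (c & 7));
    }
    return expensive_checks;
}
```

For text that repeats a small set of characters many times, as in the reproduction recipe, the number of expensive checks then depends on the number of distinct characters rather than on the buffer size.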

[...]
>   1. It's still much slower than Emacs 22.  I guess we need to rewrite
>      select-safe-coding-system if we try to make its performance
>      comparable with Emacs 22.  But perhaps we should avoid such
>      changes at this moment.

One possible strategy is to check, at first, whether or not
the default coding system(s) used for encoding (usually
buffer-file-coding-system) can encode the text.
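
A rough sketch of this strategy, under simplifying assumptions: a coding system is modelled as a name plus a predicate saying whether it can encode a given character, and the search over all candidates is attempted only when the preferred one fails. The type `coding_t` and the functions `encodes_all` and `choose_coding` are hypothetical and do not correspond to anything in coding.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a coding system. */
typedef struct {
    const char *name;
    bool (*can_encode)(uint32_t c);
} coding_t;

/* Toy predicates, used only for illustration. */
static bool ascii_only(uint32_t c) { return c < 0x80; }
static bool latin1(uint32_t c)     { return c < 0x100; }
static bool unicode(uint32_t c)
{
    return c < 0x110000 && !(c >= 0xD800 && c <= 0xDFFF);
}

/* True if CS can encode every character of TEXT. */
static bool encodes_all(const coding_t *cs, const uint32_t *text, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (!cs->can_encode(text[i]))
            return false;
    return true;
}

/* Fast path: return PREFERRED if it can encode TEXT.  Otherwise fall
   back to the first candidate that can, or NULL if none can. */
static const coding_t *choose_coding(const coding_t *preferred,
                                     const coding_t *candidates, size_t ncand,
                                     const uint32_t *text, size_t n)
{
    assert(preferred != NULL);
    if (encodes_all(preferred, text, n))
        return preferred;
    for (size_t k = 0; k < ncand; k++)
        if (encodes_all(&candidates[k], text, n))
            return &candidates[k];
    return NULL;
}
```

In the common case where the buffer's own coding system is adequate, only one coding system is ever consulted, so the cost of the full search is paid only for buffers that actually need it.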

>   2. If the "if (charset_map_loaded) ..." clause in
>      Ffind_coding_systems_region_internal is intended for the
>      relocation caused by GC, then maybe `chars_checked' above (and
>      also `coding_attrs_list') should be GCPROed.

It was.  But, as we modified load_charset_map_from_file to
disable file-name-handlers a while ago, we don't need that
check anymore.  I just forgot to delete all those checks.

---
Kenichi Handa
handa <at> m17n.org



Information forwarded to bug-submit-list <at> lists.donarmstrong.com, Emacs Bugs <bug-gnu-emacs <at> gnu.org>:
bug#3603; Package emacs. (Fri, 19 Jun 2009 08:50:04 GMT)

Acknowledgement sent to YAMAMOTO Mitsuharu <mituharu <at> math.s.chiba-u.ac.jp>:
Extra info received and forwarded to list. Copy sent to Emacs Bugs <bug-gnu-emacs <at> gnu.org>. (Fri, 19 Jun 2009 08:50:04 GMT)

Message #15 received at 3603 <at> emacsbugs.donarmstrong.com:

From: YAMAMOTO Mitsuharu <mituharu <at> math.s.chiba-u.ac.jp>
To: Kenichi Handa <handa <at> m17n.org>
Cc: 3603 <at> debbugs.gnu.org
Subject: Re: bug#3603: 23.0.94; takes much time to save large non-ASCII buffers
Date: Fri, 19 Jun 2009 17:46:54 +0900

>>>>> On Thu, 18 Jun 2009 20:43:28 +0900, Kenichi Handa <handa <at> m17n.org> said:

>> The slowness comes from that of select-safe-coding-system, in
>> particular, find-coding-systems-region(-internal) in it.  The
>> following patch makes it much faster (a few sec.) than the current
>> version.

> It seems that your patch is correct.  Actually, Emacs 22 used the
> similar method, but I forgot to implement that part when I re-wrote
> find-coding-systems-region-internal.  :-(

I've installed the patch (with changing the variable name to the one
that is consistent with Emacs 22).

>> 2. If the "if (charset_map_loaded) ..." clause in
>> Ffind_coding_systems_region_internal is intended for the relocation
>> caused by GC, then maybe `chars_checked' above (and also
>> `coding_attrs_list') should be GCPROed.

> It was.  But, as we modified load_charset_map_from_file to disable
> file-name-handlers a while ago, we don't need that check anymore.  I
> just forgot to delete all those checks.

Thanks for the explanation.  Actually, I couldn't find the part that
may cause GC, and I wondered why there's an adjustment for relocation.

				     YAMAMOTO Mitsuharu
				mituharu <at> math.s.chiba-u.ac.jp



bug closed, send any further explanations to YAMAMOTO Mitsuharu <mituharu <at> math.s.chiba-u.ac.jp> Request was from Chong Yidong <cyd <at> stupidchicken.com> to control <at> debbugs.gnu.org. (Sat, 16 Jan 2010 22:14:02 GMT)

bug archived. Request was from Debbugs Internal Request <bug-gnu-emacs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sun, 14 Feb 2010 12:24:03 GMT)

This bug report was last modified 15 years and 133 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.