GNU bug report logs - #20154
25.0.50; json-encode-string is too slow for large strings


Package: emacs

Reported by: Dmitry Gutov <dgutov <at> yandex.ru>

Date: Fri, 20 Mar 2015 14:27:01 UTC

Severity: normal

Found in version 25.0.50

Done: Dmitry Gutov <dgutov <at> yandex.ru>

Bug is archived. No further changes may be made.


From: Eli Zaretskii <eliz <at> gnu.org>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: 20154 <at> debbugs.gnu.org
Subject: bug#20154: 25.0.50; json-encode-string is too slow for large strings
Date: Sat, 21 Mar 2015 22:25:48 +0200

> Date: Sat, 21 Mar 2015 22:00:46 +0200
> From: Dmitry Gutov <dgutov <at> yandex.ru>
> CC: 20154 <at> debbugs.gnu.org
> 
> On 03/21/2015 09:58 AM, Eli Zaretskii wrote:
> 
> > It depends on your requirements.  How fast would it need to run to
> > satisfy your needs?
> 
> In this case, the buffer contents are encoded to JSON at most once per 
> keypress. So 50ms or below should be fast enough, especially since most 
> files are smaller than that.

So on each keypress you need to encode the whole buffer, including the
last keypress and all those before it?

I guess I don't really understand why each keypress should trigger
encoding of the whole buffer.

> > You don't really need regexp replacement functions with all its
> > features here, do you?  What you need is a way to skip characters that
> > are "okay", then replace the character that is "not okay" with its
> > encoded form, then repeat.
> 
> It doesn't seem like regexp searching is the slow part: save for the GC 
> pauses, looking for the non-matching regexp in the same string -
> 
> (replace-regexp-in-string "x" "z" s1 t t)
> 
> - only takes ~3ms.

Then a series of calls to replace-regexp-in-string, one for each of
the "special" characters, should get you close to your goal,
right?
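
For illustration, the "one call per special character" idea could be
sketched roughly as follows.  This is only a sketch under stated
assumptions: the function name my-json-escape-by-passes is hypothetical
(it is not part of json.el), and only four characters are handled,
whereas JSON also requires the remaining control characters below
U+0020 to be escaped.

```elisp
;; Hypothetical helper, not part of json.el: one literal
;; replace-regexp-in-string pass per special character.
(defun my-json-escape-by-passes (string)
  "Return STRING with a few JSON special characters escaped.
Illustrative only: handles backslash, double quote, newline and tab."
  (let ((result string))
    ;; The backslash pass must come first, so that the backslashes
    ;; added by the later passes are not escaped a second time.
    (dolist (pair '(("\\\\" . "\\\\")   ; regexp for \  ->  two backslashes
                    ("\""   . "\\\"")   ; "             ->  \"
                    ("\n"   . "\\n")    ; newline       ->  \n
                    ("\t"   . "\\t")))  ; tab           ->  \t
      ;; FIXEDCASE and LITERAL are both t: the replacement text is
      ;; inserted verbatim, with no case conversion or \N expansion.
      (setq result
            (replace-regexp-in-string (car pair) (cdr pair) result t t)))
    result))
```

Each pass scans the whole string once, so the cost grows with the
number of special characters handled; whether that stays within the
50ms budget mentioned above would have to be measured.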

> And likewise, after changing them to use `concat' instead of `format', 
> both alternative json-encode-string implementations that I have "encode" 
> a numbers-only (without newlines) string of the same length in a few 
> milliseconds. Again, save for the GC pauses, which can add 30-40ms.

So does this mean you have your solution?
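
As a rough illustration of the other approach mentioned earlier in this
message (skip the characters that are "okay", replace the one that is
not, then repeat), a buffer-based loop might look like the sketch
below.  The function name my-json-escape-by-skipping is hypothetical,
and the set of handled characters is deliberately limited to four; a
complete encoder would also need to cover the other control characters.

```elisp
;; Hypothetical helper, not part of json.el: skip runs of ordinary
;; characters and rewrite only the characters that need escaping.
(defun my-json-escape-by-skipping (string)
  "Return STRING with a few JSON special characters escaped.
Illustrative only: handles backslash, double quote, newline and tab."
  (let ((escapes '((?\" . "\\\"")
                   (?\\ . "\\\\")
                   (?\n . "\\n")
                   (?\t . "\\t"))))
    (with-temp-buffer
      (insert string)
      (goto-char (point-min))
      (while (progn
               ;; Move over everything except ", \, newline and tab.
               ;; In the skip-chars syntax, "\\\\" denotes one literal
               ;; backslash.
               (skip-chars-forward "^\"\\\\\n\t")
               (not (eobp)))
        ;; Point is now on a special character: replace it with its
        ;; escaped form.  Point ends up after the inserted text.
        (let ((char (char-after)))
          (delete-char 1)
          (insert (cdr (assq char escapes)))))
      (buffer-string))))
```

This makes a single pass over the text regardless of how many kinds of
special characters are handled, at the price of copying the string
into a temporary buffer first.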




This bug report was last modified 10 years and 38 days ago.


