GNU bug report logs - #17814
24.3.91; better string manipulation in subr-x

Previous Next

Package: emacs;

Reported by: Shigeru Fukaya <shigeru.fukaya <at> gmail.com>

Date: Thu, 19 Jun 2014 19:12:01 UTC

Severity: wishlist

Found in version 24.3.91

Done: Noam Postavsky <npostavs <at> gmail.com>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 17814 in the body.
You can then email your comments to 17814 AT debbugs.gnu.org in the normal way.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox


Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#17814; Package emacs. (Thu, 19 Jun 2014 19:12:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to Shigeru Fukaya <shigeru.fukaya <at> gmail.com>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Thu, 19 Jun 2014 19:12:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Shigeru Fukaya <shigeru.fukaya <at> gmail.com>
To: bug-gnu-emacs <at> gnu.org
Subject: 24.3.91; better string manipulation in subr-x
Date: Fri, 20 Jun 2014 04:10:30 +0900
[Message part 1 (text/plain, inline)]
Some string manipulation functions in subr-x have room to optimize.


string-trim-left, string-trim-right -- use `substring' and
`match-beginning/end' instead of `replace-match'.  The formers have
bytecodes and the latter not.

string-trim -- call string-trim-left first would be cost effective.
But, to change the code to trim both sides of the string at once might
be better.

string-remove-suffix -- change the last argument of substring will
shorten the code.


I change the string-trim defined using defun from defsubst, as its
string literal is somewhat big (Actually I suspect most of other
functions would also be better if defined by defun).


Regards,
Shigeru
[subr-x.diff (application/octet-stream, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#17814; Package emacs. (Thu, 19 Jun 2014 20:45:03 GMT) Full text and rfc822 format available.

Message #8 received at 17814 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Shigeru Fukaya <shigeru.fukaya <at> gmail.com>
Cc: 17814 <at> debbugs.gnu.org
Subject: Re: bug#17814: 24.3.91; better string manipulation in subr-x
Date: Thu, 19 Jun 2014 16:44:12 -0400
> I change the string-trim defined using defun from defsubst, as its
> string literal is somewhat big (Actually I suspect most of other
> functions would also be better if defined by defun).

The use of `defsubst' is so that subr-x.el is not needed at run-time.


        Stefan




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#17814; Package emacs. (Fri, 20 Jun 2014 17:15:02 GMT) Full text and rfc822 format available.

Message #11 received at 17814 <at> debbugs.gnu.org (full text, mbox):

From: Shigeru Fukaya <shigeru.fukaya <at> gmail.com>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>,
 17814 <at> debbugs.gnu.org
Subject: Re: bug#17814: 24.3.91; better string manipulation in subr-x
Date: Sat, 21 Jun 2014 02:14:19 +0900
>> I change the string-trim defined using defun from defsubst, as its
>> string literal is somewhat big (Actually I suspect most of other
>> functions would also be better if defined by defun).
>
>The use of `defsubst' is so that subr-x.el is not needed at run-time.

I see.  I didn't know that, and it's very good.
Then, the code is

(defsubst string-trim (string)
  "Remove leading and trailing whitespace from STRING."
  (string-trim-right (string-trim-left string)))

or

(defsubst string-trim (string)
  "Remove leading and trailing whitespace from STRING."
  (string-match "\\`[\s\t\n\r]*\\(.*?\\)[\s\t\n\r]*\\'" string)
  (if (or (< 0 (match-beginning 1)) (< (match-end 1) (match-end 0)))
      (match-string 1 string)
    string))

The latter is shorter in byte-compiled code, and call string-match
only once.  Literal string is seemingly larger, but the overhead of
a string object will cover it, I think.


Regards,
Shigeru




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#17814; Package emacs. (Fri, 20 Jun 2014 19:15:02 GMT) Full text and rfc822 format available.

Message #14 received at 17814 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Shigeru Fukaya <shigeru.fukaya <at> gmail.com>
Cc: 17814 <at> debbugs.gnu.org
Subject: Re: bug#17814: 24.3.91; better string manipulation in subr-x
Date: Fri, 20 Jun 2014 15:14:56 -0400
>   (string-match "\\`[\s\t\n\r]*\\(.*?\\)[\s\t\n\r]*\\'" string)
>   (if (or (< 0 (match-beginning 1)) (< (match-end 1) (match-end 0)))
>       (match-string 1 string)
>     string))

The above string-match will fail on a string that has a newline, and the
subsequent code will use whatever was the old match-data, resulting in
broken behavior.

Other than that, I don't have any opinion on such changes (I've never
heard anyone complain about code size or cpu-time of any of those
functions, so I think it largely doesn't matter either way).

        Stefan




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#17814; Package emacs. (Sat, 21 Jun 2014 04:08:01 GMT) Full text and rfc822 format available.

Message #17 received at 17814 <at> debbugs.gnu.org (full text, mbox):

From: Shigeru Fukaya <shigeru.fukaya <at> gmail.com>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 17814 <at> debbugs.gnu.org
Subject: Re: bug#17814: 24.3.91; better string manipulation in subr-x
Date: Sat, 21 Jun 2014 13:07:27 +0900
>The above string-match will fail on a string that has a newline, and the
>subsequent code will use whatever was the old match-data, resulting in
>broken behavior.

"." in "\\`[\s\t\n\r]*\\(.*?\\)[\s\t\n\r]*\\'" must be "\\(.\\|\n\\)", sorry.

>Other than that, I don't have any opinion on such changes (I've never
>heard anyone complain about code size or cpu-time of any of those
>functions, so I think it largely doesn't matter either way).

Using string-trim-to-right and string-trim-to-left creates unnecessary
temporary string if both sides need triming may matter, then?

Anyway, I think I'm just sending a small proposal.  I don't care much
if you throw it away.  Thank you for spending your time.


Regards,
Shigeru
t




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#17814; Package emacs. (Wed, 19 Sep 2018 01:38:01 GMT) Full text and rfc822 format available.

Message #20 received at 17814 <at> debbugs.gnu.org (full text, mbox):

From: Noam Postavsky <npostavs <at> gmail.com>
To: Shigeru Fukaya <shigeru.fukaya <at> gmail.com>
Cc: 17814 <at> debbugs.gnu.org, Stefan Monnier <monnier <at> iro.umontreal.ca>
Subject: Re: bug#17814: 24.3.91; better string manipulation in subr-x
Date: Tue, 18 Sep 2018 21:37:20 -0400
close 17814
quit

Shigeru Fukaya <shigeru.fukaya <at> gmail.com> writes:

>>The above string-match will fail on a string that has a newline, and the
>>subsequent code will use whatever was the old match-data, resulting in
>>broken behavior.
>
> "." in "\\`[\s\t\n\r]*\\(.*?\\)[\s\t\n\r]*\\'" must be "\\(.\\|\n\\)", sorry.
>
>>Other than that, I don't have any opinion on such changes (I've never
>>heard anyone complain about code size or cpu-time of any of those
>>functions, so I think it largely doesn't matter either way).
>
> Using string-trim-to-right and string-trim-to-left creates unnecessary
> temporary string if both sides need triming may matter, then?
>
> Anyway, I think I'm just sending a small proposal.  I don't care much
> if you throw it away.  Thank you for spending your time.

An optimization similar to the one proposed here was done for
string-trim-to-{left,right} in [1: 1013e0392b].  I think the strim-trim
change isn't worth the extra complexity (especially since it's not even
entirely clear whether it would be faster/smaller), so I'm closing the
bug.

[1: 1013e0392b]: 2018-07-13 11:28:16 -0400
  Tweak subr-x.el substring functions
  https://git.savannah.gnu.org/cgit/emacs.git/commit/?id=1013e0392b78ee0e2199fb51859dc9e939315f9b




bug closed, send any further explanations to 17814 <at> debbugs.gnu.org and Shigeru Fukaya <shigeru.fukaya <at> gmail.com> Request was from Noam Postavsky <npostavs <at> gmail.com> to control <at> debbugs.gnu.org. (Wed, 19 Sep 2018 01:38:02 GMT) Full text and rfc822 format available.

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Wed, 17 Oct 2018 11:24:03 GMT) Full text and rfc822 format available.

This bug report was last modified 6 years and 252 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.