GNU bug report logs - #55838
29.0.50; Eshell string-split subscript indexing splits too much

Package: emacs

Reported by: Jim Porter <jporterbugs <at> gmail.com>

Date: Wed, 8 Jun 2022 01:37:01 UTC

Severity: normal

Found in version 29.0.50

Fixed in version 29.1

Done: Lars Ingebrigtsen <larsi <at> gnus.org>

Bug is archived. No further changes may be made.



Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#55838; Package emacs. (Wed, 08 Jun 2022 01:37:01 GMT)

Acknowledgement sent to Jim Porter <jporterbugs <at> gmail.com>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Wed, 08 Jun 2022 01:37:01 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Jim Porter <jporterbugs <at> gmail.com>
To: bug-gnu-emacs <at> gnu.org
Subject: 29.0.50; Eshell string-split subscript indexing splits too much
Date: Tue, 7 Jun 2022 18:36:09 -0700

From "emacs -Q -f eshell":

  M-: (setq foo "a\nb:c")

  ~ $ echo $foo
  a
  b:c
  ~ $ echo $foo[: 0]
  ("a" "b")

The first command is normal, and just shows that Eshell outputs the 
string with no manipulation. In the second command, we split the string 
on ":" and get the 0th element. However, that gets split *again* (on 
newlines) and returns a list.

I think this is overly aggressive. It's due to `eshell-apply-indices' 
calling `eshell-convert' on the split element(s) of the string. However, 
`eshell-convert' is primarily designed to turn output from external 
command line programs into a Lispy form (so it splits by line to make a 
list, among other things). This would normally happen when doing 
something like this:

  ~ $ echo ${cat some-file.txt}
  ("line 1" "line 2" ...)

In the original case above, I think the split-subscript operator [: 0] 
should only be doing the one thing the user requested: split on ":" and 
get the 0th element.

Patch forthcoming momentarily. Just getting a bug number.
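
For reference, the double split described above can be modeled in plain Emacs Lisp. This is only a sketch under stated assumptions: `sketch-split-lines' is a hypothetical stand-in for the line-splitting step of `eshell-convert', not the real function, and `sketch-old-subscript' is an illustrative name rather than Eshell code.

```elisp
;; Hypothetical stand-in for the line-splitting step of `eshell-convert'.
;; The real function does more than this; only the splitting is modeled.
(defun sketch-split-lines (string)
  "Return STRING as a list of lines if it spans several lines, else STRING."
  (let ((lines (split-string string "\n" t)))
    (if (cdr lines) lines string)))

;; Illustrative model of the reported behavior of `$var[SEP INDEX]'.
(defun sketch-old-subscript (value separator index)
  "Split VALUE on SEPARATOR, take element INDEX, then split it by line."
  (sketch-split-lines (nth index (split-string value separator))))

(sketch-old-subscript "a\nb:c" ":" 0)   ; => ("a" "b"), a list, not "a\nb"
(nth 0 (split-string "a\nb:c" ":"))     ; => "a\nb", only the requested split
```

The second form is what the report argues `[: 0]' should produce: one split on ":" and nothing further.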




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#55838; Package emacs. (Wed, 08 Jun 2022 01:42:02 GMT)

Message #8 received at 55838 <at> debbugs.gnu.org:

From: Jim Porter <jporterbugs <at> gmail.com>
To: 55838 <at> debbugs.gnu.org
Subject: Re: bug#55838: 29.0.50; [PATCH] Eshell string-split subscript
 indexing splits too much
Date: Tue, 7 Jun 2022 18:41:30 -0700

On 6/7/2022 6:36 PM, Jim Porter wrote:
>  From "emacs -Q -f eshell":
> 
>    M-: (setq foo "a\nb:c")
> 
>    ~ $ echo $foo
>    a
>    b:c
>    ~ $ echo $foo[: 0]
>    ("a" "b")
> 
> The first command is normal, and just shows that Eshell outputs the 
> string with no manipulation. In the second command, we split the string 
> on ":" and get the 0th element. However, that gets split *again* (on 
> newlines) and returns a list.

Here's a patch for this. It changes the behavior of 
`eshell-apply-indices' to use `eshell-convert-to-number' (when the 
expansion isn't wrapped in double-quotes) instead of the more-aggressive 
`eshell-convert'. I think `eshell-convert-to-number' is the right thing 
here, since Eshell already converts number-like strings to actual 
numbers in most cases.
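
As a rough illustration of the patched behavior, the sketch below splits on the separator, picks one element, and only converts it when it looks like a number. The helper names and the simple numeric pattern are assumptions made for this example; the real `eshell-convert-to-number' may apply different rules.

```elisp
;; Hypothetical helper: a simplified model of number-like conversion.
;; The pattern below is an assumption, not Eshell's actual regexp.
(defun sketch-to-number-maybe (string)
  "Return STRING as a number if it looks like one, else STRING unchanged."
  (if (and (stringp string)
           (string-match-p "\\`-?[0-9]+\\(\\.[0-9]+\\)?\\'" string))
      (string-to-number string)
    string))

;; Illustrative model of `$var[SEP INDEX]' after the patch:
;; one split, one element, no further line splitting.
(defun sketch-new-subscript (value separator index)
  "Split VALUE on SEPARATOR and return element INDEX, numbers converted."
  (sketch-to-number-maybe (nth index (split-string value separator))))

(sketch-new-subscript "a\nb:c" ":" 0)   ; => "a\nb"
(sketch-new-subscript "10:20" ":" 1)    ; => 20
```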

As a note, if you wanted the old behavior, you could do something like this:

  ~ $ echo $foo[: 0][0 1]
  ("a" "b")

There's also a suggestion in the "Bugs and ideas" section of the Eshell 
manual to add "*" as a subscript to mean "all indices", so you could do 
the above in a more generic fashion like:

  ~ $ echo $foo[: 0][*]
  ;; Doesn't currently work, but it could.
[0001-Don-t-split-Eshell-expansions-by-line-when-using-spl.patch (text/plain, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#55838; Package emacs. (Wed, 08 Jun 2022 12:12:02 GMT)

Message #11 received at 55838 <at> debbugs.gnu.org:

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Jim Porter <jporterbugs <at> gmail.com>
Cc: 55838 <at> debbugs.gnu.org
Subject: Re: bug#55838: 29.0.50; Eshell string-split subscript indexing
 splits too much
Date: Wed, 08 Jun 2022 14:11:33 +0200

Jim Porter <jporterbugs <at> gmail.com> writes:

> Here's a patch for this. It changes the behavior of
> `eshell-apply-indices' to use `eshell-convert-to-number' (when the
> expansion isn't wrapped in double-quotes) instead of the
> more-aggressive `eshell-convert'. I think `eshell-convert-to-number'
> is the right thing here, since Eshell already converts number-like
> strings to actual numbers in most cases.

Sounds good to me; pushed to Emacs 29.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




bug marked as fixed in version 29.1, send any further explanations to 55838 <at> debbugs.gnu.org and Jim Porter <jporterbugs <at> gmail.com> Request was from Lars Ingebrigtsen <larsi <at> gnus.org> to control <at> debbugs.gnu.org. (Wed, 08 Jun 2022 12:12:02 GMT)

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#55838; Package emacs. (Wed, 08 Jun 2022 13:40:02 GMT)

Message #16 received at 55838 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: Jim Porter <jporterbugs <at> gmail.com>
Cc: 55838 <at> debbugs.gnu.org
Subject: Re: bug#55838: 29.0.50;
 [PATCH] Eshell string-split subscript indexing splits too much
Date: Wed, 08 Jun 2022 16:38:59 +0300

> From: Jim Porter <jporterbugs <at> gmail.com>
> Date: Tue, 7 Jun 2022 18:41:30 -0700
> 
> > The first command is normal, and just shows that Eshell outputs the 
> > string with no manipulation. In the second command, we split the string 
> > on ":" and get the 0th element. However, that gets split *again* (on 
> > newlines) and returns a list.
> 
> Here's a patch for this. It changes the behavior of 
> `eshell-apply-indices' to use `eshell-convert-to-number' (when the 
> expansion isn't wrapped in double-quotes) instead of the more-aggressive 
> `eshell-convert'. I think `eshell-convert-to-number' is the right thing 
> here, since Eshell already converts number-like strings to actual 
> numbers in most cases.
> 
> As a note, if you wanted the old behavior, you could do something like this:
> 
>    ~ $ echo $foo[: 0][0 1]
>    ("a" "b")
> 
> There's also a suggestion in the "Bugs and ideas" section of the Eshell 
> manual to add "*" as a subscript to mean "all indices", so you could do 
> the above in a more generic fashion like:
> 
>    ~ $ echo $foo[: 0][*]
>    ;; Doesn't currently work, but it could.

I don't have any objections based on actual experience, and I don't
know what the original design goals of this feature in Eshell were.
However, please note that you are changing the behavior significantly,
and the only reason is that it doesn't make much sense to you.  I
wonder whether this is a strong enough motivation to make such
incompatible behavior changes.  Eshell is not a "normal" shell, in
that it attempts to make sense even if Lisp expressions are mixed with
Posix-ish shell features, so what may not make sense in Bash, Zsh, and
their ilk is not necessarily nonsensical in Eshell.

So maybe we should raise the bar for considering reasons for behavior
changes as valid?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#55838; Package emacs. (Wed, 08 Jun 2022 23:07:02 GMT)

Message #19 received at 55838 <at> debbugs.gnu.org:

From: Jim Porter <jporterbugs <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 55838 <at> debbugs.gnu.org
Subject: Re: bug#55838: 29.0.50; [PATCH] Eshell string-split subscript
 indexing splits too much
Date: Wed, 8 Jun 2022 16:06:45 -0700

On 6/8/2022 6:38 AM, Eli Zaretskii wrote:
> I don't have any objections based on actual experience, and I don't
> know what the original design goals of this feature in Eshell were.
> However, please note that you are changing the behavior significantly,
> and the only reason is that it doesn't make much sense to you.

I probably should have elaborated a bit more on my reasoning in the 
original report. My goal with this (and other Eshell patches in this 
area) is mainly to add tests for some of the more-advanced Eshell syntax 
and also to ensure that it works as documented. There are a few cases 
where it's tricky to decide whether the code is right and the 
documentation is wrong, or vice-versa. This is one of those cases.

Here's what the Emacs 27/28 manuals have to say about this syntax (I've 
already changed/expanded this section in 29, so I'm going back to 28 to 
show what the docs said before I changed them):

  $var[i]

      Expands to the ith element of the value bound to var. If the value
      is a string, it will be split at whitespace to make it a list.
      Again, raises an error if the value is not a sequence.

  $var[: i]

      As above, but now splitting occurs at the colon character.

  $var[: i j]

      As above, but instead of returning just a string, it now returns a
      list of two strings. If the result is being interpolated into a
      larger string, this list will be flattened into one big string,
      with each element separated by a space.

I would interpret the above to mean that the only splitting that should 
happen for `$var[: i]' is with the ":". The last section says that 
`$var[: i]' returns "just a string", and `$var[: i j]' returns a list of 
two strings. However, in my example in the original message, `$foo[: 0 
1]' would return a list containing a list and a string. That's 
inconsistent with what the manual says, and in this case I think it's 
the manual that was right, and the code that wasn't.
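
The difference between the documented result and the old result for `$foo[: 0 1]' can be sketched in plain Emacs Lisp. Everything named `sketch-...' here is hypothetical and written for this illustration; `sketch-split-lines' merely imitates the line splitting attributed to `eshell-convert' and is not Eshell's implementation.

```elisp
;; Hypothetical stand-in for the line splitting done by `eshell-convert'.
(defun sketch-split-lines (string)
  "Return STRING as a list of lines if it spans several lines, else STRING."
  (let ((lines (split-string string "\n" t)))
    (if (cdr lines) lines string)))

;; Illustrative model of `$var[SEP I J ...]'; not Eshell's actual code.
(defun sketch-subscript (value separator indices &optional convert)
  "Split VALUE on SEPARATOR and return the elements at INDICES.
CONVERT, if non-nil, is applied to each selected element."
  (let* ((parts (split-string value separator))
         (picked (mapcar (lambda (i)
                           (let ((item (nth i parts)))
                             (if convert (funcall convert item) item)))
                         indices)))
    ;; A single index yields the element itself, several yield a list.
    (if (cdr picked) picked (car picked))))

;; Documented reading: a list of two strings.
(sketch-subscript "a\nb:c" ":" '(0 1))                       ; => ("a\nb" "c")
;; Old behavior as described: a list containing a list and a string.
(sketch-subscript "a\nb:c" ":" '(0 1) #'sketch-split-lines)  ; => (("a" "b") "c")
```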

Note: the last sentence in the manual excerpt above is also incorrect. 
When the list is "flattened into one big string", it will look like 
'("first" "second")', not 'first second'. Unlike the original bug here, 
which people probably don't encounter very often in practice, changing 
how the list is flattened would probably cause problems for users. It's 
a really common occurrence. Something as simple as `echo a b' will 
return '("a" "b")'. This problem is also discussed in bug#12689.




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Thu, 07 Jul 2022 11:24:07 GMT)


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.