GNU bug report logs -
#6149
24.0.50; shell buffer overflow when input longer than 4096 bytes
To reply to this bug, email your comments to 6149 AT debbugs.gnu.org.
Report forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org: bug#6149; Package emacs. (Mon, 10 May 2010 04:17:01 GMT) Full text and rfc822 format available.
Acknowledgement sent to jidanni <at> jidanni.org: New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Mon, 10 May 2010 04:17:01 GMT)
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
This is a serious bug in M-x shell. It is not a bash or dash bug. It is
not a readline bug. It does not happen in xterm. It does not happen when
using pipes or backticks to get the input. It only happens in M-x
shell... when one gives lines longer than ~4096 characters.
Actually it is not buffer overflow, but buffer truncation, with NO
WARNING to the user. One day the wrong file will get removed via this
mess.
In GNU Emacs 24.0.50.1 (i486-pc-linux-gnu, GTK+ Version 2.20.0)
of 2010-05-01 on elegiac, modified by Debian
(emacs-snapshot package, version 1:20100501-1)
[input_truncation.txt.gz (application/octet-stream, attachment)]
Information forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org: bug#6149; Package emacs. (Tue, 01 Jun 2010 01:51:02 GMT)
Message #8 received at 6149 <at> debbugs.gnu.org (full text, mbox):
>>>>> "jidanni" == jidanni <jidanni <at> jidanni.org> writes:
> This is a serious bug in M-x shell. It is not a bash or dash bug. It is
> not a readline bug. It does not happen in xterm. It does not happen when
> using pipes or backticks to get the input. It only happens in M-x
> shell... when one gives lines longer than ~4096 characters.
> Actually it is not buffer overflow, but buffer truncation, with NO
> WARNING to the user. One day the wrong file will get removed via this
> mess.
> In GNU Emacs 24.0.50.1 (i486-pc-linux-gnu, GTK+ Version 2.20.0)
> of 2010-05-01 on elegiac, modified by Debian
> (emacs-snapshot package, version 1:20100501-1)
Thanks for this nice test case.
It appears it was a silly mistake (code placed in the wrong side of
a #if). I've installed the patch below which should fix it,
Stefan
=== modified file 'src/sysdep.c'
--- src/sysdep.c 2010-05-04 07:40:53 +0000
+++ src/sysdep.c 2010-06-01 01:40:00 +0000
@@ -537,15 +537,6 @@
   s.main.c_cflag = (s.main.c_cflag & ~CBAUD) | B9600; /* baud rate sanity */
 #endif /* AIX */
-#else /* not HAVE_TERMIO */
-
-  s.main.sg_flags &= ~(ECHO | CRMOD | ANYP | ALLDELAY | RAW | LCASE
-                       | CBREAK | TANDEM);
-  s.main.sg_flags |= LPASS8;
-  s.main.sg_erase = 0377;
-  s.main.sg_kill = 0377;
-  s.lmode = LLITOUT | s.lmode; /* Don't strip 8th bit */
-
   /* We used to enable ICANON (and set VEOF to 04), but this leads to
      problems where process.c wants to send EOFs every once in a while
      to force the output, which leads to weird effects when the
@@ -558,6 +549,15 @@
   s.main.c_cc[VMIN] = 1;
   s.main.c_cc[VTIME] = 0;
+#else /* not HAVE_TERMIO */
+
+  s.main.sg_flags &= ~(ECHO | CRMOD | ANYP | ALLDELAY | RAW | LCASE
+                       | CBREAK | TANDEM);
+  s.main.sg_flags |= LPASS8;
+  s.main.sg_erase = 0377;
+  s.main.sg_kill = 0377;
+  s.lmode = LLITOUT | s.lmode; /* Don't strip 8th bit */
+
 #endif /* not HAVE_TERMIO */
   EMACS_SET_TTY (out, &s, 0);
bug closed, send any further explanations to jidanni <at> jidanni.org
Request was from Glenn Morris <rgm <at> gnu.org> to control <at> debbugs.gnu.org. (Tue, 22 Jun 2010 06:31:02 GMT)
bug archived.
Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Tue, 20 Jul 2010 11:24:03 GMT)
bug unarchived.
Request was from charles <at> aurox.ch (Charles A. Roelli) to control <at> debbugs.gnu.org. (Fri, 28 Sep 2018 19:48:00 GMT)
Did not alter fixed versions and reopened.
Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Fri, 28 Sep 2018 19:50:02 GMT)
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#6149; Package emacs. (Fri, 28 Sep 2018 20:11:02 GMT)
Message #19 received at 6149 <at> debbugs.gnu.org (full text, mbox):
> From: jidanni <at> jidanni.org
> Date: Mon, 10 May 2010 12:14:54 +0800
>
> This is a serious bug in M-x shell. It is not a bash or dash bug. It is
> not a readline bug. It does not happen in xterm. It does not happen when
> using pipes or backticks to get the input. It only happens in M-x
> shell... when one gives lines longer than ~4096 characters.
>
> Actually it is not buffer overflow, but buffer truncation, with NO
> WARNING to the user. One day the wrong file will get removed via this
> mess.
>
> In GNU Emacs 24.0.50.1 (i486-pc-linux-gnu, GTK+ Version 2.20.0)
> of 2010-05-01 on elegiac, modified by Debian
> (emacs-snapshot package, version 1:20100501-1)
>
>
> [application/octet-stream input_truncation.txt.gz (2kB)]
I can still reproduce this bug in 26.1 with the following recipe:
M-x shell RET
echo SPC C-SPC
C-u 5000 a RET
C-p C-e
M-=
On GNU/Linux: Region has 2 lines, 2 words, and 9096 characters.
If echo had received all of the input, you would expect around 10000
characters in the region. Instead, there are 5000 + 4096 characters.
Back when EOF chars were used to flush output, we had an "fpathconf"
check as in:
https://git.savannah.gnu.org/cgit/emacs.git/commit/?id=3d082a269ece18058ed82957f8a056822b39789e
It might be possible to reinstate this "fpathconf" check to warn the
user that he has gone over the PTY limit, or maybe to prevent overlong
lines from being sent at all.
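As a rough illustration of the kind of "fpathconf" check described above, here is a minimal Python sketch (not Emacs code). It asks the system for the advertised canonical-line limit of a file descriptor and falls back to an assumed value when the query is unavailable. The names `max_canon`, `line_too_long` and `ASSUMED_FALLBACK` are hypothetical, and the value the system advertises may differ from the point where the kernel actually starts dropping bytes.

```python
import os

# Assumption: fall back to the ~4096-byte figure reported in this thread.
ASSUMED_FALLBACK = 4096


def max_canon(fd, fallback=ASSUMED_FALLBACK):
    """Return the advertised canonical input line limit for fd, or fallback."""
    try:
        limit = os.fpathconf(fd, "PC_MAX_CANON")
    except (AttributeError, OSError, ValueError):
        # AttributeError: os.fpathconf is missing on this platform.
        # OSError: the query is not valid for this descriptor.
        # ValueError: the name or the descriptor was rejected by Python.
        return fallback
    # A non-positive result means "no definite limit"; use the fallback.
    return limit if limit > 0 else fallback


def line_too_long(line, fd):
    """Return True if line (bytes) exceeds the limit reported for fd."""
    return len(line) > max_canon(fd)
```

A caller could run `line_too_long` on each line before sending it and either warn or refuse to send, which are the two options mentioned above.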
There is further discussion at:
http://lists.gnu.org/archive/html/emacs-devel/2010-08/msg00209.html
(Also, repeating this recipe on macOS with Emacs 26.1 results in the
behavior pointed out in Bug#32438.)
Forcibly Merged 6149 12440 24531.
Request was from Lars Ingebrigtsen <larsi <at> gnus.org> to control <at> debbugs.gnu.org. (Sun, 13 Oct 2019 20:38:02 GMT)
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#6149; Package emacs. (Thu, 20 Jul 2023 20:16:02 GMT)
Message #24 received at 6149 <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
I see that this bug is about 13 years old. I think there's a pretty
obvious solution: process-connection-type should default to nil.
Otherwise this is a footgun just waiting to happen for anyone writing
process-interaction code in Emacs.
But if we don't do that, we should at least document it. See my
attached patch.
Btw, just to feed the fire, here's my own reproducer:
(with-temp-buffer
  (make-process :name "broken" :buffer (current-buffer) :command '("cat"))
  (process-send-string nil (make-string 10000 ?x))
  (process-send-eof)
  (sit-for 1)
  (cons (point-min) (point-max)))
[0001-Include-warning-about-long-line-truncation-in-proces.patch (text/x-patch, inline)]
From dcfd129b3f08273a8b0705f03b6074443a7a33c1 Mon Sep 17 00:00:00 2001
From: Spencer Baugh <sbaugh <at> janestreet.com>
Date: Thu, 20 Jul 2023 16:13:56 -0400
Subject: [PATCH] Include warning about long line truncation in
process-send-string
Maybe we can't fix this. But we can at least warn the user about it!
To have no warning anywhere about this default behavior which silently
discards data, is very user-hostile.
* src/process.c (Fprocess_send_string): Include a warning about long
line truncation (bug#6149)
---
src/process.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/src/process.c b/src/process.c
index 67d1d3e425f..82ace1b3a41 100644
--- a/src/process.c
+++ b/src/process.c
@@ -6755,6 +6755,8 @@ DEFUN ("process-send-string", Fprocess_send_string, Sprocess_send_string,
of which depends on the process connection type and the operating
system), it is sent in several bunches. This may happen even for
shorter strings. Output from processes can arrive in between bunches.
+If the process connection type is `pty', then long lines present in
+STRING may be truncated depending on the operating system.
If PROCESS is a non-blocking network process that hasn't been fully
set up yet, this function will block until socket setup has completed. */)
--
2.39.3
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#6149; Package emacs. (Thu, 20 Jul 2023 21:23:02 GMT)
Message #27 received at 6149 <at> debbugs.gnu.org (full text, mbox):
> I see that this bug is about 13 years old. I think there's a pretty
> obvious solution: process-connection-type should default to nil.
In practice, that can't fly because it'll break existing code.
Also I don't think either value of `process-connection-type` is a good
option. IOW, I think that the connection type should be a mandatory
argument when creating an async process (except maybe for those
processes with no input/output).
So maybe, the default value of `process-connection-type` should be
`unspecified` and the process creation code should emit a warning when
creating a process whose connection type is `unspecified` (just
a warning, tho: it should then pursue execution as if that value was t,
as usual, so as to preserve compatibility).
Stefan
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#6149; Package emacs. (Fri, 21 Jul 2023 05:40:02 GMT)
Message #30 received at 6149 <at> debbugs.gnu.org (full text, mbox):
> Cc: 24531 <at> debbugs.gnu.org, 6149 <at> debbugs.gnu.org, jidanni <at> jidanni.org
> Date: Thu, 20 Jul 2023 17:21:53 -0400
> From: Stefan Monnier via "Bug reports for GNU Emacs,
> the Swiss army knife of text editors" <bug-gnu-emacs <at> gnu.org>
>
> > I see that this bug is about 13 years old. I think there's a pretty
> > obvious solution: process-connection-type should default to nil.
>
> In practice, that can't fly because it'll break existing code.
Indeed. A large portion (I think the majority, but I'm not sure) of
Lisp programs that use bidirectional communications with async
subprocesses actually _want_ the PTY interface, because they act as
GUI front-ends to other programs. Think "M-x term", "M-x gdb",
inferior-python-mode, etc. Even "M-x grep" and the likes need that
because they rely on default color output, which only happens if Grep
is connected to a terminal device. Some of the Emacs features based
on this don't work or don't work well on MS-Windows because Windows
only supports pipes. Do we really want such semi-broken behavior on
GNU and Unix systems?
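The point about programs changing behavior depending on whether they are connected to a terminal can be seen with a small Python sketch (an illustration only, not Emacs code). A child process whose standard output is a pipe is asked whether it believes it is writing to a terminal; tools that enable color by default typically make the same kind of check. The function name `child_sees_terminal_on_pipe` is hypothetical.

```python
import subprocess
import sys


def child_sees_terminal_on_pipe():
    """Run a child with stdout on a pipe and report its isatty() answer."""
    probe = "import sys; print(sys.stdout.isatty())"
    result = subprocess.run(
        [sys.executable, "-c", probe],
        stdout=subprocess.PIPE,  # the child's stdout is a pipe, not a PTY
        text=True,
        check=True,
    )
    return result.stdout.strip() == "True"
```

With a pipe the child answers that it is not on a terminal, which is why such programs fall back to their non-interactive behavior when no PTY is used.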
The number of applications that (a) don't need console-like behavior
and (b) need to send larger-than-4KB buffers to sub-processes is quite
small. Which is why this issue comes up only very rarely. So making
pipes the default will fix a very small fraction of applications, and
break the vast majority -- clearly a wrong balance.
> Also I don't think either value of `process-connection-type` is a good
> option. IOW, I think that the connection type should be a mandatory
> argument when creating an async process (except maybe for those
> processes with no input/output).
If we go that way, we should start by specifying :connection-type for
all the uses of make-process and start-process we have in the core.
It's a large job, but before it is done we cannot in good faith make
such an incompatible transition.
> So maybe, the default value of `process-connection-type` should be
> `unspecified` and the process creation code should emit a warning when
> creating a process whose connection type is `unspecified` (just
> a warning, tho: it should then pursue execution as if that value was t,
> as usual, so as to preserve compatibility).
Something like that, yes.
But I'm actually wondering how come modern Linux kernels don't have a
way of lifting this restriction, or at least enlarging the limit so it
makes the problem even less frequent. Is there some inherent
limitation that this must be 4KB and nothing larger?
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#6149; Package emacs. (Fri, 21 Jul 2023 13:59:02 GMT)
Message #33 received at 6149 <at> debbugs.gnu.org (full text, mbox):
Eli Zaretskii <eliz <at> gnu.org> writes:
>> Cc: 24531 <at> debbugs.gnu.org, 6149 <at> debbugs.gnu.org, jidanni <at> jidanni.org
>> Date: Thu, 20 Jul 2023 17:21:53 -0400
>> From: Stefan Monnier via "Bug reports for GNU Emacs,
>> the Swiss army knife of text editors" <bug-gnu-emacs <at> gnu.org>
>>
>> > I see that this bug is about 13 years old. I think there's a pretty
>> > obvious solution: process-connection-type should default to nil.
>>
>> In practice, that can't fly because it'll break existing code.
>
> Indeed.
I agree that we probably can't change the default. However...
> A large portion (I think the majority, but I'm not sure) of
> Lisp programs that use bidirectional communications with async
> subprocesses actually _want_ the PTY interface, because they act as
> GUI front-ends to other programs. Think "M-x term", "M-x gdb",
> inferior-python-mode, etc. Even "M-x grep" and the likes need that
> because they rely on default color output, which only happens if Grep
> is connected to a terminal device. Some of the Emacs features based
> on this don't work or don't work well on MS-Windows because Windows
> only supports pipes. Do we really want such semi-broken behavior on
> GNU and Unix systems?
>
> The number of applications that (a) don't need console-like behavior
> and (b) need to send larger-than-4KB buffers to sub-processes is quite
> small. Which is why this issue comes up only very rarely. So making
> pipes the default will fix a very small fraction of applications, and
> break the vast majority -- clearly a wrong balance.
I see your point, but at the same time, the PTY interface on its own is
not sufficient to make these applications work, not at all. Specialized
modes are necessary to make M-x term (to implement a terminal) and M-x
grep (to parse ANSI color codes) and other such programs work. Running
things in a PTY without such specialized code doesn't give you anything,
AFAIK, because a PTY alone is far from enough to make the Emacs end
behave like a terminal. So such programs need to be aware and careful
about such things anyway, and need additional infrastructure on top of
make-process. So the default being "pty" gives such programs very
little: it doesn't save them any complexity.
Programs that just want to do some data processing with a subprocess, on
the other hand, work fine with just make-process alone, they need no
additional infrastructure, just process-send-string and reading directly
from the process buffer. The default being "pipe" would take away a big
footgun for such programs, since it's easy to forget that and then have
a silently wrong program which will fail once you get large input.
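For comparison with the reproducer earlier in the thread, here is a minimal Python sketch of the same data-processing pattern over pipes (an analogy, not Emacs code). It sends a single 10000-character line with no newline to a child that copies its input to its output, and returns what comes back. The child command and the name `round_trip_through_pipe` are assumptions made for the example.

```python
import subprocess
import sys

# Hypothetical stand-in for "cat": copy stdin to stdout unchanged.
ECHO_CHILD = "import sys; sys.stdout.write(sys.stdin.read())"


def round_trip_through_pipe(payload):
    """Send payload to a child over pipes and return what the child echoes."""
    result = subprocess.run(
        [sys.executable, "-c", ECHO_CHILD],
        input=payload,           # written to the child's stdin pipe, then closed
        stdout=subprocess.PIPE,  # read back over a second pipe
        text=True,
        check=True,
    )
    return result.stdout
```

Calling `round_trip_through_pipe("x" * 10000)` should return all 10000 characters, since pipes have no per-line limit.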
>> Also I don't think either value of `process-connection-type` is a good
>> option. IOW, I think that the connection type should be a mandatory
>> argument when creating an async process (except maybe for those
>> processes with no input/output).
>
> If we go that way, we should start by specifying :connection-type for
> all the uses of make-process and start-process we have in the core.
> It's a large job, but before it is done we cannot in good faith make
> such an incompatible transition.
I can do that.
However, what about my patch adding a warning about this to
process-send-string? I think that is independently valuable. Right now
we have no documentation of this problem...
>> So maybe, the default value of `process-connection-type` should be
>> `unspecified` and the process creation code should emit a warning when
>> creating a process whose connection type is `unspecified` (just
>> a warning, tho: it should then pursue execution as if that value was t,
>> as usual, so as to preserve compatibility).
>
> Something like that, yes.
>
> But I'm actually wondering how come modern Linux kernels don't have a
> way of lifting this restriction, or at least enlarging the limit so it
> makes the problem even less frequent. Is there some inherent
> limitation that this must be 4KB and nothing larger?
Unfortunately from looking at Linux the limit of 4096 seems to be
hardcoded.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#6149; Package emacs. (Fri, 21 Jul 2023 14:19:02 GMT)
Message #36 received at 6149 <at> debbugs.gnu.org (full text, mbox):
> From: Spencer Baugh <sbaugh <at> janestreet.com>
> Cc: Stefan Monnier <monnier <at> iro.umontreal.ca>, 24531 <at> debbugs.gnu.org,
> 6149 <at> debbugs.gnu.org, jidanni <at> jidanni.org
> Date: Fri, 21 Jul 2023 09:58:38 -0400
>
> Eli Zaretskii <eliz <at> gnu.org> writes:
>
> > The number of applications that (a) don't need console-like behavior
> > and (b) need to send larger-than-4KB buffers to sub-processes is quite
> > small. Which is why this issue comes up only very rarely. So making
> > pipes the default will fix a very small fraction of applications, and
> > break the vast majority -- clearly a wrong balance.
>
> I see your point, but at the same time, the PTY interface on its own is
> not sufficient to make these applications work, not at all. Specialized
> modes are necessary to make M-x term (to implement a terminal) and M-x
> grep (to parse ANSI color codes) and other such programs work. Running
> things in a PTY without such specialized code doesn't give you anything,
> AFAIK, because a PTY alone is far from enough to make the Emacs end
> behave like a terminal. So such programs need to be aware and careful
> about such things anyway, and need additional infrastructure on top of
> make-process. So the default being "pty" gives such programs very
> little: it doesn't save them any complexity.
That Emacs needs to do something doesn't invalidate my point. My
point is that communication via a PTY is a necessary (though not a
sufficient) condition for these features. Basically, you cannot use
pipes for any interactive feature, because pipes are buffered.
> However, what about my patch adding a warning about this to
> process-send-string? I think that is independently valuable. Right now
> we have no documentation of this problem...
This should be documented in the ELisp manual, and in more detail, not
just as a vague warning.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#6149; Package emacs. (Fri, 21 Jul 2023 15:10:02 GMT)
Message #39 received at 6149 <at> debbugs.gnu.org (full text, mbox):
>> Also I don't think either value of `process-connection-type` is a good
>> option. IOW, I think that the connection type should be a mandatory
>> argument when creating an async process (except maybe for those
>> processes with no input/output).
> If we go that way, we should start by specifying :connection-type for
> all the uses of make-process and start-process we have in the core.
That would be a good start, yes.
Stefan
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#6149; Package emacs. (Thu, 27 Jul 2023 01:49:02 GMT)
Message #42 received at 6149 <at> debbugs.gnu.org (full text, mbox):
On 21/07/2023 16:58, Spencer Baugh wrote:
>> But I'm actually wondering how come modern Linux kernels don't have a
>> way of lifting this restriction, or at least enlarging the limit so it
>> makes the problem even less frequent. Is there some inherent
>> limitation that this must be 4KB and nothing larger?
> Unfortunately from looking at Linux the limit of 4096 seems to be
> hardcoded.
If some syscall or etc limits the length of a string to 4096, can't we
detect this case, split the string and emit said call multiple times?
This function's docstring already mentions the case of
If STRING is larger than the input buffer of the process, ...
it is sent in several bunches
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#6149; Package emacs. (Thu, 27 Jul 2023 05:42:02 GMT)
Message #45 received at 6149 <at> debbugs.gnu.org (full text, mbox):
> Date: Thu, 27 Jul 2023 04:48:18 +0300
> Cc: 24531 <at> debbugs.gnu.org, 6149 <at> debbugs.gnu.org,
> Stefan Monnier <monnier <at> iro.umontreal.ca>, jidanni <at> jidanni.org
> From: Dmitry Gutov <dmitry <at> gutov.dev>
>
> If some syscall or etc limits the length of a string to 4096, can't we
> detect this case, split the string and emit said call multiple times?
>
> This function's docstring already mentions the case of
>
> If STRING is larger than the input buffer of the process, ...
> it is sent in several bunches
AFAIU, that is based on the errno value returned by a 'write' call
which attempts to write too many bytes (see the would_block function).
I guess writes to PTYs don't do that?
Paul, do you know anything about that?
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#6149; Package emacs. (Thu, 27 Jul 2023 14:01:02 GMT)
Message #48 received at 6149 <at> debbugs.gnu.org (full text, mbox):
Eli Zaretskii <eliz <at> gnu.org> writes:
>> Date: Thu, 27 Jul 2023 04:48:18 +0300
>> Cc: 24531 <at> debbugs.gnu.org, 6149 <at> debbugs.gnu.org,
>> Stefan Monnier <monnier <at> iro.umontreal.ca>, jidanni <at> jidanni.org
>> From: Dmitry Gutov <dmitry <at> gutov.dev>
>>
>> If some syscall or etc limits the length of a string to 4096, can't we
>> detect this case, split the string and emit said call multiple times?
>>
>> This function's docstring already mentions the case of
>>
>> If STRING is larger than the input buffer of the process, ...
>> it is sent in several bunches
Alas it's far more cursed than that. The length of a *line* is limited
to 4096 characters. So regardless of how big or small your buffers for
writing are, if you write more than 4095 characters before writing a
newline, the remaining characters will be discarded. There is no way to
prevent this with ptys.
So even if we wrote one character at a time, characters would start
getting dropped after writing 4095 non-newline characters.
>
> AFAIU, that is based on the errno value returned by a 'write' call
> which attempts to write too many bytes (see the would_block function).
> I guess writes to PTYs don't do that?
Writes to PTYs do tell us when the data has been truncated. There's
just nothing we can do with that information.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#6149; Package emacs. (Thu, 27 Jul 2023 14:52:02 GMT)
Message #51 received at 6149 <at> debbugs.gnu.org (full text, mbox):
On 2023-07-27 06:59, Spencer Baugh wrote:
>> AFAIU, that is based on the errno value returned by a 'write' call
>> which attempts to write too many bytes (see the would_block function).
>> I guess writes to PTYs don't do that?
> Writes to PTYs do tell us when the data has been truncated.
Unfortunately not. Data bytes are silently truncated, at least on Ubuntu
23.04. If I fire up Emacs and type:
M-x shell RET cat >out RET C-u 4096 x RET C-d
the last RET causes Emacs to write 4097 bytes (4096 'x's followed by a
newline) to the pty. This 'write' system call succeeds and returns 4097.
However, the two 'read' calls that 'cat' executes see only 4095 'x's
followed by '\n' ('read' returns 4096) followed by EOF ('read' returns
0). An 'x' was lost, and Emacs has no way to see this directly.
This comes from the canonical mode of Linux's terminal driver, which
silently discards non-newline bytes after the 4095th byte of an input
line. See:
https://github.com/torvalds/linux/blob/v6.4/drivers/tty/n_tty.c#L1648
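The behavior described above can be modeled with a short Python sketch. This is a simplified model, not the kernel's code: it assumes that only a newline ends a line and that exactly 4095 non-newline bytes are kept per line, as reported in this message. The names `LINE_LIMIT` and `simulate_canonical_input` are hypothetical.

```python
LINE_LIMIT = 4095  # non-newline bytes kept per input line, per the report above


def simulate_canonical_input(data, limit=LINE_LIMIT):
    """Return the bytes a reader would see if each line were cut at limit."""
    received = bytearray()
    kept = 0  # non-newline bytes kept so far on the current line
    for byte in data:
        if byte == 0x0A:       # newline: always delivered, starts a new line
            received.append(byte)
            kept = 0
        elif kept < limit:     # still room on this line
            received.append(byte)
            kept += 1
        # otherwise the byte is dropped without any error
    return bytes(received)
```

Feeding it 4096 `x` bytes followed by a newline yields 4095 `x` bytes plus the newline, 4096 bytes in total, which matches the two `read` results described above.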
One possibility is that Emacs could monitor writes to a Linux pty,
looking for too many non-newline bytes in a row, and warn the user if
that number exceeds 4095. That might be the best it can do in this
troublesome environment. (The warning would be irrelevant for ttys
operating in non-canonical mode, which have a different set of problems.)
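A minimal Python sketch of the monitoring idea above (not Emacs code): it counts consecutive non-newline bytes across successive writes and warns when the run passes the limit. The class name `LongLineMonitor`, the warning text, and the 4095 default (taken from this message) are assumptions for illustration.

```python
import warnings


class LongLineMonitor:
    """Track the current run of non-newline bytes across writes."""

    def __init__(self, limit=4095):
        self.limit = limit
        self.run = 0  # non-newline bytes seen since the last newline

    def feed(self, chunk):
        """Record chunk (bytes); return how many of its bytes exceed the limit."""
        over = 0
        for byte in chunk:
            if byte == 0x0A:   # a newline resets the count
                self.run = 0
            else:
                self.run += 1
                if self.run > self.limit:
                    over += 1
        if over:
            warnings.warn(
                f"{over} byte(s) past the {self.limit}-byte line limit "
                "may be discarded by the terminal driver"
            )
        return over
```

Because the count persists between calls to `feed`, the check still triggers when the data is written in small pieces, even one byte at a time.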
This bug report was last modified 1 year and 321 days ago.