GNU bug report logs - #6149
24.0.50; shell buffer overflow when input longer than 4096 bytes


Package: emacs;

Reported by: jidanni <at> jidanni.org

Date: Mon, 10 May 2010 04:17:01 UTC

Severity: normal

Tags: confirmed

Merged with 12440, 24531

Found in versions 24.0.50, 24.2

To reply to this bug, email your comments to 6149 AT debbugs.gnu.org.




Report forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org:
bug#6149; Package emacs. (Mon, 10 May 2010 04:17:01 GMT) Full text and rfc822 format available.

Acknowledgement sent to jidanni <at> jidanni.org:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Mon, 10 May 2010 04:17:01 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: jidanni <at> jidanni.org
To: bug-gnu-emacs <at> gnu.org
Subject: 24.0.50; shell buffer overflow when input longer than 4096 bytes
Date: Mon, 10 May 2010 12:14:54 +0800
This is a serious bug in M-x shell. It is not a bash or dash bug. It is
not a readline bug. It does not happen in xterm. It does not happen when
using pipes or backticks to get the input. It only happens in M-x
shell... when one gives lines longer than ~4096 characters.

Actually it is not buffer overflow, but buffer truncation, with NO
WARNING to the user. One day the wrong file will get removed via this
mess.

In GNU Emacs 24.0.50.1 (i486-pc-linux-gnu, GTK+ Version 2.20.0)
 of 2010-05-01 on elegiac, modified by Debian
 (emacs-snapshot package, version 1:20100501-1)

[input_truncation.txt.gz (application/octet-stream, attachment)]

Information forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org:
bug#6149; Package emacs. (Tue, 01 Jun 2010 01:51:02 GMT) Full text and rfc822 format available.

Message #8 received at 6149 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: jidanni <at> jidanni.org
Cc: 6149 <at> debbugs.gnu.org
Subject: Re: bug#6149: 24.0.50;
	shell buffer overflow when input longer than 4096 bytes
Date: Mon, 31 May 2010 21:50:37 -0400
>>>>> "jidanni" == jidanni  <jidanni <at> jidanni.org> writes:

> This is a serious bug in M-x shell. It is not a bash or dash bug. It is
> not a readline bug. It does not happen in xterm. It does not happen when
> using pipes or backticks to get the input. It only happens in M-x
> shell... when one gives lines longer than ~4096 characters.

> Actually it is not buffer overflow, but buffer truncation, with NO
> WARNING to the user. One day the wrong file will get removed via this
> mess.

> In GNU Emacs 24.0.50.1 (i486-pc-linux-gnu, GTK+ Version 2.20.0)
>  of 2010-05-01 on elegiac, modified by Debian
>  (emacs-snapshot package, version 1:20100501-1)

Thanks for this nice test case.
It appears it was a silly mistake (code placed on the wrong side of
a #if).  I've installed the patch below, which should fix it.


        Stefan


=== modified file 'src/sysdep.c'
--- src/sysdep.c	2010-05-04 07:40:53 +0000
+++ src/sysdep.c	2010-06-01 01:40:00 +0000
@@ -537,15 +537,6 @@
   s.main.c_cflag = (s.main.c_cflag & ~CBAUD) | B9600; /* baud rate sanity */
 #endif /* AIX */
 
-#else /* not HAVE_TERMIO */
-
-  s.main.sg_flags &= ~(ECHO | CRMOD | ANYP | ALLDELAY | RAW | LCASE
-		       | CBREAK | TANDEM);
-  s.main.sg_flags |= LPASS8;
-  s.main.sg_erase = 0377;
-  s.main.sg_kill = 0377;
-  s.lmode = LLITOUT | s.lmode;        /* Don't strip 8th bit */
-
   /* We used to enable ICANON (and set VEOF to 04), but this leads to
      problems where process.c wants to send EOFs every once in a while
      to force the output, which leads to weird effects when the
@@ -558,6 +549,15 @@
   s.main.c_cc[VMIN] = 1;
   s.main.c_cc[VTIME] = 0;
 
+#else /* not HAVE_TERMIO */
+
+  s.main.sg_flags &= ~(ECHO | CRMOD | ANYP | ALLDELAY | RAW | LCASE
+		       | CBREAK | TANDEM);
+  s.main.sg_flags |= LPASS8;
+  s.main.sg_erase = 0377;
+  s.main.sg_kill = 0377;
+  s.lmode = LLITOUT | s.lmode;        /* Don't strip 8th bit */
+
 #endif /* not HAVE_TERMIO */
 
   EMACS_SET_TTY (out, &s, 0);





bug closed, send any further explanations to jidanni <at> jidanni.org Request was from Glenn Morris <rgm <at> gnu.org> to control <at> debbugs.gnu.org. (Tue, 22 Jun 2010 06:31:02 GMT) Full text and rfc822 format available.

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Tue, 20 Jul 2010 11:24:03 GMT) Full text and rfc822 format available.

bug unarchived. Request was from charles <at> aurox.ch (Charles A. Roelli) to control <at> debbugs.gnu.org. (Fri, 28 Sep 2018 19:48:00 GMT) Full text and rfc822 format available.

Did not alter fixed versions and reopened. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Fri, 28 Sep 2018 19:50:02 GMT) Full text and rfc822 format available.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#6149; Package emacs. (Fri, 28 Sep 2018 20:11:02 GMT) Full text and rfc822 format available.

Message #19 received at 6149 <at> debbugs.gnu.org (full text, mbox):

From: charles <at> aurox.ch (Charles A. Roelli)
To: jidanni <at> jidanni.org
Cc: 6149 <at> debbugs.gnu.org
Subject: Re: bug#6149: 24.0.50;
 shell buffer overflow when input longer than 4096 bytes
Date: Fri, 28 Sep 2018 22:13:11 +0200
> From: jidanni <at> jidanni.org
> Date: Mon, 10 May 2010 12:14:54 +0800
> 
> This is a serious bug in M-x shell. It is not a bash or dash bug. It is
> not a readline bug. It does not happen in xterm. It does not happen when
> using pipes or backticks to get the input. It only happens in M-x
> shell... when one gives lines longer than ~4096 characters.
> 
> Actually it is not buffer overflow, but buffer truncation, with NO
> WARNING to the user. One day the wrong file will get removed via this
> mess.
> 
> In GNU Emacs 24.0.50.1 (i486-pc-linux-gnu, GTK+ Version 2.20.0)
>  of 2010-05-01 on elegiac, modified by Debian
>  (emacs-snapshot package, version 1:20100501-1)
> 
> 
> [application/octet-stream input_truncation.txt.gz (2kB)]

I can still reproduce this bug in 26.1 with the following recipe:

M-x shell RET
echo SPC C-SPC
C-u 5000 a RET
C-p C-e
M-=

On GNU/Linux: Region has 2 lines, 2 words, and 9096 characters.

If echo had received all of the input, you would expect around 10000
characters in the region.  Instead, there are 5000 + 4096 characters.

Back when EOF chars were used to flush output, we had an "fpathconf"
check as in:

https://git.savannah.gnu.org/cgit/emacs.git/commit/?id=3d082a269ece18058ed82957f8a056822b39789e

It might be possible to reinstate this "fpathconf" check to warn the
user that he has gone over the PTY limit, or maybe to prevent overlong
lines from being sent at all.

There is further discussion at:

http://lists.gnu.org/archive/html/emacs-devel/2010-08/msg00209.html


(Also, repeating this recipe on macOS with Emacs 26.1 results in the
behavior pointed out in Bug#32438.)




Forcibly Merged 6149 12440 24531. Request was from Lars Ingebrigtsen <larsi <at> gnus.org> to control <at> debbugs.gnu.org. (Sun, 13 Oct 2019 20:38:02 GMT) Full text and rfc822 format available.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#6149; Package emacs. (Thu, 20 Jul 2023 20:16:02 GMT) Full text and rfc822 format available.

Message #24 received at 6149 <at> debbugs.gnu.org (full text, mbox):

From: Spencer Baugh <sbaugh <at> janestreet.com>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 24531 <at> debbugs.gnu.org, 6149 <at> debbugs.gnu.org, jidanni <at> jidanni.org
Subject: Re: bug#24531: process-send-string seems to truncate lines over
 4096 characters
Date: Thu, 20 Jul 2023 16:15:11 -0400
I see that this bug is about 13 years old.  I think there's a pretty
obvious solution: process-connection-type should default to nil.
Otherwise this is a footgun just waiting to happen for anyone writing
process-interaction code in Emacs.

But if we don't do that, we should at least document it.  See my
attached patch.

Btw, just to feed the fire, here's my own reproducer:

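(with-temp-buffer
  ;; Start "cat" with the default connection type (normally a pty).
  (make-process :name "broken" :buffer (current-buffer) :command '("cat"))
  ;; Send one 10000-character line with no newline in it.
  (process-send-string nil (make-string 10000 ?x))
  (process-send-eof)
  ;; Give cat a moment to copy its input back into the buffer.
  (sit-for 1)
  ;; Compare the buffer extent with the 10000 characters that were sent.
  (cons (point-min) (point-max)))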

[0001-Include-warning-about-long-line-truncation-in-proces.patch (text/x-patch, inline)]
From dcfd129b3f08273a8b0705f03b6074443a7a33c1 Mon Sep 17 00:00:00 2001
From: Spencer Baugh <sbaugh <at> janestreet.com>
Date: Thu, 20 Jul 2023 16:13:56 -0400
Subject: [PATCH] Include warning about long line truncation in
 process-send-string

Maybe we can't fix this.  But we can at least warn the user about it!
To have no warning anywhere about this default behavior which silently
discards data, is very user-hostile.

* src/process.c (Fprocess_send_string): Include a warning about long
line truncation (bug#6149)
---
 src/process.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/process.c b/src/process.c
index 67d1d3e425f..82ace1b3a41 100644
--- a/src/process.c
+++ b/src/process.c
@@ -6755,6 +6755,8 @@ DEFUN ("process-send-string", Fprocess_send_string, Sprocess_send_string,
 of which depends on the process connection type and the operating
 system), it is sent in several bunches.  This may happen even for
 shorter strings.  Output from processes can arrive in between bunches.
+If the process connection type is `pty', then long lines present in
+STRING may be truncated depending on the operating system.
 
 If PROCESS is a non-blocking network process that hasn't been fully
 set up yet, this function will block until socket setup has completed.  */)
-- 
2.39.3


Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#6149; Package emacs. (Thu, 20 Jul 2023 21:23:02 GMT) Full text and rfc822 format available.

Message #27 received at 6149 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Spencer Baugh <sbaugh <at> janestreet.com>
Cc: 24531 <at> debbugs.gnu.org, 6149 <at> debbugs.gnu.org, jidanni <at> jidanni.org
Subject: Re: bug#24531: process-send-string seems to truncate lines over
 4096 characters
Date: Thu, 20 Jul 2023 17:21:53 -0400
> I see that this bug is about 13 years old.  I think there's a pretty
> obvious solution: process-connection-type should default to nil.

In practice, that can't fly because it'll break existing code.

Also I don't think either value of `process-connection-type` is a good
option.  IOW, I think that the connection type should be a mandatory
argument when creating an async process (except maybe for those
processes with no input/output).

So maybe, the default value of `process-connection-type` should be
`unspecified` and the process creation code should emit a warning when
creating a process whose connection type is `unspecified` (just
a warning, tho: it should then pursue execution as if that value was t,
as usual, so as to preserve compatibility).


        Stefan





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#6149; Package emacs. (Fri, 21 Jul 2023 05:40:02 GMT) Full text and rfc822 format available.

Message #30 received at 6149 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 24531 <at> debbugs.gnu.org, sbaugh <at> janestreet.com, 6149 <at> debbugs.gnu.org,
 jidanni <at> jidanni.org
Subject: Re: bug#6149: bug#24531: process-send-string seems to truncate lines
 over 4096 characters
Date: Fri, 21 Jul 2023 08:39:52 +0300
> Cc: 24531 <at> debbugs.gnu.org, 6149 <at> debbugs.gnu.org, jidanni <at> jidanni.org
> Date: Thu, 20 Jul 2023 17:21:53 -0400
> From:  Stefan Monnier via "Bug reports for GNU Emacs,
>  the Swiss army knife of text editors" <bug-gnu-emacs <at> gnu.org>
> 
> > I see that this bug is about 13 years old.  I think there's a pretty
> > obvious solution: process-connection-type should default to nil.
> 
> In practice, that can't fly because it'll break existing code.

Indeed.  A large portion (I think the majority, but I'm not sure) of
Lisp programs that use bidirectional communications with async
subprocesses actually _want_ the PTY interface, because they act as
GUI front-ends to other programs.  Think "M-x term", "M-x gdb",
inferior-python-mode, etc.  Even "M-x grep" and the likes need that
because they rely on default color output, which only happens if Grep
is connected to a terminal device.  Some of the Emacs features based
on this don't work or don't work well on MS-Windows because Windows
only supports pipes.  Do we really want such semi-broken behavior on
GNU and Unix systems?

The number of applications that (a) don't need console-like behavior
and (b) need to send larger-than-4KB buffers to sub-processes is quite
small.  Which is why this issue comes up only very rarely.  So making
pipes the default will fix a very small fraction of applications, and
break the vast majority -- clearly a wrong balance.

> Also I don't think either value of `process-connection-type` is a good
> option.  IOW, I think that the connection type should be a mandatory
> argument when creating an async process (except maybe for those
> processes with no input/output).

If we go that way, we should start by specifying :connection-type for
all the uses of make-process and start-process we have in the core.
It's a large job, but before it is done we cannot in good faith make
such an incompatible transition.

> So maybe, the default value of `process-connection-type` should be
> `unspecified` and the process creation code should emit a warning when
> creating a process whose connection type is `unspecified` (just
> a warning, tho: it should then pursue execution as if that value was t,
> as usual, so as to preserve compatibility).

Something like that, yes.

But I'm actually wondering how come modern Linux kernels don't have a
way of lifting this restriction, or at least enlarging the limit so it
makes the problem even less frequent.  Is there some inherent
limitation that this must be 4KB and nothing larger?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#6149; Package emacs. (Fri, 21 Jul 2023 13:59:02 GMT) Full text and rfc822 format available.

Message #33 received at 6149 <at> debbugs.gnu.org (full text, mbox):

From: Spencer Baugh <sbaugh <at> janestreet.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 24531 <at> debbugs.gnu.org, 6149 <at> debbugs.gnu.org,
 Stefan Monnier <monnier <at> iro.umontreal.ca>, jidanni <at> jidanni.org
Subject: Re: bug#24531: process-send-string seems to truncate lines over
 4096 characters
Date: Fri, 21 Jul 2023 09:58:38 -0400
Eli Zaretskii <eliz <at> gnu.org> writes:

>> Cc: 24531 <at> debbugs.gnu.org, 6149 <at> debbugs.gnu.org, jidanni <at> jidanni.org
>> Date: Thu, 20 Jul 2023 17:21:53 -0400
>> From:  Stefan Monnier via "Bug reports for GNU Emacs,
>>  the Swiss army knife of text editors" <bug-gnu-emacs <at> gnu.org>
>> 
>> > I see that this bug is about 13 years old.  I think there's a pretty
>> > obvious solution: process-connection-type should default to nil.
>> 
>> In practice, that can't fly because it'll break existing code.
>
> Indeed.

I agree that we probably can't change the default.  However...

> A large portion (I think the majority, but I'm not sure) of
> Lisp programs that use bidirectional communications with async
> subprocesses actually _want_ the PTY interface, because they act as
> GUI front-ends to other programs.  Think "M-x term", "M-x gdb",
> inferior-python-mode, etc.  Even "M-x grep" and the likes need that
> because they rely on default color output, which only happens if Grep
> is connected to a terminal device.  Some of the Emacs features based
> on this don't work or don't work well on MS-Windows because Windows
> only supports pipes.  Do we really want such semi-broken behavior on
> GNU and Unix systems?
>
> The number of applications that (a) don't need console-like behavior
> and (b) need to send larger-than-4KB buffers to sub-processes is quite
> small.  Which is why this issue comes up only very rarely.  So making
> pipes the default will fix a very small fraction of applications, and
> break the vast majority -- clearly a wrong balance.

I see your point, but at the same time, the PTY interface on its own is
not sufficient to make these applications work, not at all.  Specialized
modes are necessary to make M-x term (to implement a terminal) and M-x
grep (to parse ANSI color codes) and other such programs work.  Running
things in a PTY without such specialized code doesn't give you anything,
AFAIK, because a PTY alone is far from enough to make the Emacs end
behave like a terminal.  So such programs need to be aware and careful
about such things anyway, and need additional infrastructure on top of
make-process.  So the default being "pty" gives such programs very
little: it doesn't save them any complexity.

Programs that just want to do some data processing with a subprocess, on
the other hand, work fine with just make-process alone, they need no
additional infrastructure, just process-send-string and reading directly
from the process buffer.  The default being "pipe" would take away a big
footgun for such programs, since it's easy to forget that and then have
a silently wrong program which will fail once you get large input.

>> Also I don't think either value of `process-connection-type` is a good
>> option.  IOW, I think that the connection type should be a mandatory
>> argument when creating an async process (except maybe for those
>> processes with no input/output).
>
> If we go that way, we should start by specifying :connection-type for
> all the uses of make-process and start-process we have in the core.
> It's a large job, but before it is done we cannot in good faith make
> such an incompatible transition.

I can do that.

However, what about my patch adding a warning about this to
process-send-string?  I think that is independently valuable.  Right now
we have no documentation of this problem...

>> So maybe, the default value of `process-connection-type` should be
>> `unspecified` and the process creation code should emit a warning when
>> creating a process whose connection type is `unspecified` (just
>> a warning, tho: it should then pursue execution as if that value was t,
>> as usual, so as to preserve compatibility).
>
> Something like that, yes.
>
> But I'm actually wondering how come modern Linux kernels don't have a
> way of lifting this restriction, or at least enlarging the limit so it
> makes the problem even less frequent.  Is there some inherent
> limitation that this must be 4KB and nothing larger?

Unfortunately from looking at Linux the limit of 4096 seems to be
hardcoded.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#6149; Package emacs. (Fri, 21 Jul 2023 14:19:02 GMT) Full text and rfc822 format available.

Message #36 received at 6149 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Spencer Baugh <sbaugh <at> janestreet.com>
Cc: 24531 <at> debbugs.gnu.org, 6149 <at> debbugs.gnu.org, monnier <at> iro.umontreal.ca,
 jidanni <at> jidanni.org
Subject: Re: bug#24531: process-send-string seems to truncate lines over
 4096 characters
Date: Fri, 21 Jul 2023 17:18:34 +0300
> From: Spencer Baugh <sbaugh <at> janestreet.com>
> Cc: Stefan Monnier <monnier <at> iro.umontreal.ca>,  24531 <at> debbugs.gnu.org,
>    6149 <at> debbugs.gnu.org,  jidanni <at> jidanni.org
> Date: Fri, 21 Jul 2023 09:58:38 -0400
> 
> Eli Zaretskii <eliz <at> gnu.org> writes:
> 
> > The number of applications that (a) don't need console-like behavior
> > and (b) need to send larger-than-4KB buffers to sub-processes is quite
> > small.  Which is why this issue comes up only very rarely.  So making
> > pipes the default will fix a very small fraction of applications, and
> > break the vast majority -- clearly a wrong balance.
> 
> I see your point, but at the same time, the PTY interface on its own is
> not sufficient to make these applications work, not at all.  Specialized
> modes are necessary to make M-x term (to implement a terminal) and M-x
> grep (to parse ANSI color codes) and other such programs work.  Running
> things in a PTY without such specialized code doesn't give you anything,
> AFAIK, because a PTY alone is far from enough to make the Emacs end
> behave like a terminal.  So such programs need to be aware and careful
> about such things anyway, and need additional infrastructure on top of
> make-process.  So the default being "pty" gives such programs very
> little: it doesn't save them any complexity.

That Emacs needs to do something doesn't invalidate my point.  My
point is that communication via a PTY is a necessary (though not a
sufficient) condition for these features.  Basically, you cannot use
pipes for any interactive feature, because pipes are buffered.

> However, what about my patch adding a warning about this to
> process-send-string?  I think that is independently valuable.  Right now
> we have no documentation of this problem...

This should be documented in the ELisp manual, and in more detail, not
just as a vague warning.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#6149; Package emacs. (Fri, 21 Jul 2023 15:10:02 GMT) Full text and rfc822 format available.

Message #39 received at 6149 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 24531 <at> debbugs.gnu.org, sbaugh <at> janestreet.com, 6149 <at> debbugs.gnu.org,
 jidanni <at> jidanni.org
Subject: Re: bug#6149: bug#24531: process-send-string seems to truncate
 lines over 4096 characters
Date: Fri, 21 Jul 2023 11:09:17 -0400
>> Also I don't think either value of `process-connection-type` is a good
>> option.  IOW, I think that the connection type should be a mandatory
>> argument when creating an async process (except maybe for those
>> processes with no input/output).
> If we go that way, we should start by specifying :connection-type for
> all the uses of make-process and start-process we have in the core.

That would be a good start, yes.


        Stefan





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#6149; Package emacs. (Thu, 27 Jul 2023 01:49:02 GMT) Full text and rfc822 format available.

Message #42 received at 6149 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: Spencer Baugh <sbaugh <at> janestreet.com>, Eli Zaretskii <eliz <at> gnu.org>
Cc: 24531 <at> debbugs.gnu.org, 6149 <at> debbugs.gnu.org,
 Stefan Monnier <monnier <at> iro.umontreal.ca>, jidanni <at> jidanni.org
Subject: Re: bug#6149: bug#24531: process-send-string seems to truncate lines
 over 4096 characters
Date: Thu, 27 Jul 2023 04:48:18 +0300
On 21/07/2023 16:58, Spencer Baugh wrote:
>> But I'm actually wondering how come modern Linux kernels don't have a
>> way of lifting this restriction, or at least enlarging the limit so it
>> makes the problem even less frequent.  Is there some inherent
>> limitation that this must be 4KB and nothing larger?
> Unfortunately from looking at Linux the limit of 4096 seems to be
> hardcoded.

If some syscall or similar limits the length of a string to 4096, can't we
detect this case, split the string and emit said call multiple times?

This function's docstring already mentions the case of

  If STRING is larger than the input buffer of the process, ...
  it is sent in several bunches




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#6149; Package emacs. (Thu, 27 Jul 2023 05:42:02 GMT) Full text and rfc822 format available.

Message #45 received at 6149 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Dmitry Gutov <dmitry <at> gutov.dev>, Paul Eggert <eggert <at> cs.ucla.edu>
Cc: 24531 <at> debbugs.gnu.org, sbaugh <at> janestreet.com, 6149 <at> debbugs.gnu.org,
 monnier <at> iro.umontreal.ca, jidanni <at> jidanni.org
Subject: Re: bug#6149: bug#24531: process-send-string seems to truncate lines
 over 4096 characters
Date: Thu, 27 Jul 2023 08:41:52 +0300
> Date: Thu, 27 Jul 2023 04:48:18 +0300
> Cc: 24531 <at> debbugs.gnu.org, 6149 <at> debbugs.gnu.org,
>  Stefan Monnier <monnier <at> iro.umontreal.ca>, jidanni <at> jidanni.org
> From: Dmitry Gutov <dmitry <at> gutov.dev>
> 
> If some syscall or etc limits the length of a string to 4096, can't we 
> detect this case, split the string and emit said call multiple times?
> 
> This function's docstring already mentions the case of
> 
>    If STRING is larger than the input buffer of the process, ...
>    it is sent in several bunches

AFAIU, that is based on the errno value returned by a 'write' call
which attempts to write too many bytes (see the would_block function).
I guess writes to PTYs don't do that?

Paul, do you know anything about that?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#6149; Package emacs. (Thu, 27 Jul 2023 14:01:02 GMT) Full text and rfc822 format available.

Message #48 received at 6149 <at> debbugs.gnu.org (full text, mbox):

From: Spencer Baugh <sbaugh <at> janestreet.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 6149 <at> debbugs.gnu.org, Dmitry Gutov <dmitry <at> gutov.dev>,
 Paul Eggert <eggert <at> cs.ucla.edu>, monnier <at> iro.umontreal.ca,
 jidanni <at> jidanni.org
Subject: Re: bug#24531: process-send-string seems to truncate lines over
 4096 characters
Date: Thu, 27 Jul 2023 09:59:53 -0400
Eli Zaretskii <eliz <at> gnu.org> writes:
>> Date: Thu, 27 Jul 2023 04:48:18 +0300
>> Cc: 24531 <at> debbugs.gnu.org, 6149 <at> debbugs.gnu.org,
>>  Stefan Monnier <monnier <at> iro.umontreal.ca>, jidanni <at> jidanni.org
>> From: Dmitry Gutov <dmitry <at> gutov.dev>
>> 
>> If some syscall or etc limits the length of a string to 4096, can't we 
>> detect this case, split the string and emit said call multiple times?
>> 
>> This function's docstring already mentions the case of
>> 
>>    If STRING is larger than the input buffer of the process, ...
>>    it is sent in several bunches

Alas it's far more cursed than that.  The length of a *line* is limited
to 4096 characters.  So regardless of how big or small your buffers for
writing are, if you write more than 4095 characters before writing a
newline, the remaining characters will be discarded.  There is no way to
prevent this with ptys.

So even if we wrote one character at a time, characters would start
getting dropped after writing 4095 non-newline characters.
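
A minimal sketch of why chunking does not help, written as a toy model in
C rather than anything taken from the kernel or from Emacs.  It assumes
the 4095-byte per-line limit described above and treats only '\n' as a
line terminator; the names line_model and line_model_write are
hypothetical.  The counter lives across calls, so the result is the same
whether the bytes arrive in one write or one at a time:

```c
#include <stddef.h>

/* Assumed limit: non-newline bytes kept per line (figure taken from
   this thread; not queried from the system).  */
#define MODEL_LINE_LIMIT 4095

/* Hypothetical model state: bytes kept so far on the current line.  */
struct line_model
{
  size_t line_len;
};

/* Feed LEN bytes from BUF into the model.  Return how many of them the
   model keeps; the rest are dropped without any error indication.  */
size_t
line_model_write (struct line_model *m, const char *buf, size_t len)
{
  size_t kept = 0;

  for (size_t i = 0; i < len; i++)
    {
      if (buf[i] == '\n')
        {
          /* A newline is always kept and starts a fresh line.  */
          m->line_len = 0;
          kept++;
        }
      else if (m->line_len < MODEL_LINE_LIMIT)
        {
          m->line_len++;
          kept++;
        }
      /* Otherwise the line is already full: the byte is dropped.  */
    }

  return kept;
}
```

Feeding 5000 'x' bytes plus a newline keeps 4096 bytes in this model,
both when passed as a single buffer and when passed byte by byte.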

>
> AFAIU, that is based on the errno value returned by a 'write' call
> which attempts to write too many bytes (see the would_block function).
> I guess writes to PTYs don't do that?

Writes to PTYs do tell us when the data has been truncated.  There's
just nothing we can do with that information.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#6149; Package emacs. (Thu, 27 Jul 2023 14:52:02 GMT) Full text and rfc822 format available.

Message #51 received at 6149 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Spencer Baugh <sbaugh <at> janestreet.com>, Eli Zaretskii <eliz <at> gnu.org>
Cc: 6149 <at> debbugs.gnu.org, Dmitry Gutov <dmitry <at> gutov.dev>,
 monnier <at> iro.umontreal.ca, jidanni <at> jidanni.org
Subject: Re: bug#24531: process-send-string seems to truncate lines over 4096
 characters
Date: Thu, 27 Jul 2023 07:51:31 -0700
On 2023-07-27 06:59, Spencer Baugh wrote:
>> AFAIU, that is based on the errno value returned by a 'write' call
>> which attempts to write too many bytes (see the would_block function).
>> I guess writes to PTYs don't do that?
> Writes to PTYs do tell us when the data has been truncated.

Unfortunately not. Data bytes are silently truncated, at least on Ubuntu 
23.04. If I fire up Emacs and type:

   M-x shell RET cat >out RET C-u 4096 x RET C-d

the last RET causes Emacs to write 4097 bytes (4096 'x's followed by a 
newline) to the pty. This 'write' system call succeeds and returns 4097. 
However, the two 'read' calls that 'cat' executes see only 4095 'x's 
followed by '\n' ('read' returns 4096) followed by EOF ('read' returns 
0). An 'x' was lost, and Emacs has no way to see this directly.

This comes from the canonical mode of Linux's terminal driver, which 
silently discards non-newline bytes after the 4095th byte of an input 
line. See:

https://github.com/torvalds/linux/blob/v6.4/drivers/tty/n_tty.c#L1648

One possibility is that Emacs could monitor writes to a Linux pty, 
looking for too many non-newline bytes in a row, and warn the user if 
that number exceeds 4095. That might be the best it can do in this 
troublesome environment. (The warning would be irrelevant for ttys 
operating in non-canonical mode, which have a different set of problems.)
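
The monitoring idea could be sketched roughly as follows.  This is only
an illustration in C, not code from Emacs: the struct, the function name
and the PTY_CANON_LINE_MAX constant are hypothetical, the 4095 figure is
the one reported above for Linux, and only '\n' is treated as ending a
line.  The check is meant to be called on every buffer handed to
'write', and it reports at most once per line:

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed canonical-mode limit on non-newline bytes per input line.  */
#define PTY_CANON_LINE_MAX 4095

/* Hypothetical per-pty state carried across writes.  */
struct pty_write_monitor
{
  size_t run;   /* non-newline bytes written since the last newline */
  bool warned;  /* true once the current line has been reported */
};

/* Account for LEN bytes of BUF about to be written.  Return true if
   this write pushes the current line past the limit for the first
   time, i.e. the caller should warn that bytes are likely lost.  */
bool
pty_write_monitor_check (struct pty_write_monitor *mon,
                         const char *buf, size_t len)
{
  bool should_warn = false;

  for (size_t i = 0; i < len; i++)
    {
      if (buf[i] == '\n')
        {
          /* New line: reset the run and allow a new warning.  */
          mon->run = 0;
          mon->warned = false;
        }
      else if (++mon->run > PTY_CANON_LINE_MAX && !mon->warned)
        {
          mon->warned = true;
          should_warn = true;
        }
    }

  return should_warn;
}
```

With the recipe above (4096 'x's followed by a newline) the check
returns true, since the 4096th 'x' is the first byte past the limit; a
line of exactly 4095 'x's does not trigger it.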




This bug report was last modified 1 year and 321 days ago.



GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.