GNU bug report logs -
#5173
23.1.50; interrupted connect() not handled properly
Reported by: Helmut Eller <eller.helmut <at> gmail.com>
Date: Wed, 9 Dec 2009 23:45:04 UTC
Severity: normal
Tags: patch
Done: YAMAMOTO Mitsuharu <mituharu <at> math.s.chiba-u.ac.jp>
Bug is archived. No further changes may be made.
To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 5173 in the body.
You can then email your comments to 5173 AT debbugs.gnu.org in the normal way.
Report forwarded to bug-submit-list <at> lists.donarmstrong.com, Emacs Bugs <bug-gnu-emacs <at> gnu.org>: bug#5173; Package emacs. (Wed, 09 Dec 2009 23:45:04 GMT)
Acknowledgement sent to Helmut Eller <eller.helmut <at> gmail.com>: New bug report received and forwarded. Copy sent to Emacs Bugs <bug-gnu-emacs <at> gnu.org>. (Wed, 09 Dec 2009 23:45:04 GMT)
Message #5 received at submit <at> emacsbugs.donarmstrong.com (full text, mbox):
[Message part 1 (text/plain, inline)]
Interrupts during connect() in make-network-process aren't handled
properly. The following recipe to reproduce the problem is rather
complicated. You'll need:
- Qemu
- a kernel with tuntap support (/dev/net/tun)
- tunctl (from uml-utilities)
- a Linux image for Qemu. If you don't have one,
  use http://www.nongnu.org/qemu/linux-0.2.img.bz2
- netcat
The problem occurs during connect(). To make this window longer and
more controllable, we will use Qemu so that we can stop and resume the
(virtual) TCP stack. Also note that we are dealing with signals here,
so strace or gdb would interfere with the problem.
* Prepare Qemu
Our goal here is to start Qemu with a virtual network interface
like so:
qemu -hda linux-0.2.img -net nic -net tap,ifname=qtap0,script=no
Before we can do that we need to create the qtap0 device:
sudo tunctl -u USER -t qtap0 # replace USER with your user
sudo ifconfig qtap0 192.168.255.1 up
192.168.255.1 will most likely work for you but any non-conflicting IP
address will do.
We'll also need netcat on the virtual machine. So let's copy it to the
image:
mkdir img
sudo mount -o loop linux-0.2.img img
sudo cp -L /bin/netcat img/usr/bin
sudo umount img
Now try
qemu -hda linux-0.2.img -net nic -net tap,ifname=qtap0,script=no
This should boot Linux and present you with a shell.
In the shell configure the network device like so:
sh-2.05b# ifconfig eth0 192.168.255.2
Make sure that you can ping that device from the host system with
ping 192.168.255.2
If it doesn't work, check the output of route on the guest and the host.
* Test netcat
Next test netcat. Inside Qemu do:
sh-2.05b# netcat -l -p 44444
and on the host:
netcat 192.168.255.2 44444
Everything you type on the host should be echoed on the guest. When you
abort with C-c the netcat inside Qemu should also abort.
* Test Function
Now we are almost ready to run real tests. Create a file
connect-eintr.el containing the following function.
(defun testit (vmpid)
  (switch-to-buffer "*Messages*")
  (signal-process vmpid 'SIGSTOP)
  (shell-command
   (concat (format "(sleep 0.4; kill -SIGSTOP %d; " (emacs-pid))
           (format " sleep 0.1; kill -SIGCONT %d; " vmpid)
           (format " sleep 3; kill -SIGCONT %d;)&" (emacs-pid))))
  (let ((sock
         (make-network-process :name "test"
                               :service 44444
                               :host "192.168.255.2"
                               :sentinel (lambda (x y)
                                           (error "sentinel: %s %s" x y)))))
    (process-send-string sock "foo")
    (process-send-string sock "bar")
    (process-send-string sock "baz\n")
    (message "ok")))
The function does the following steps:
1) stop Qemu
2) connect()
3) stop Emacs
4) resume Qemu
5) resume Emacs
6) write some output to the socket
From 2 to 4 Emacs will be inside connect() and we have plenty of time to
press a key to generate an interrupt.
* Run the test
Before running the function create a listening socket inside Qemu as
above:
sh-2.05b# netcat -l -p 44444
For the next step we need the process id of Qemu; let's call it QPID.
Use QPID in the following command line:
emacs -Q -load connect-eintr.el -eval '(testit QPID)' -f kill-emacs
This starts Emacs and runs the test. If you're using X11 and don't
press any key, Emacs will terminate after a few seconds and foobarbaz
will appear in Qemu.
Restart netcat as above and re-run the test, but this time press a key
after Emacs' frame appears. This time Emacs will not terminate, but
instead an error message will be visible in the *Messages* buffer. Also
the netcat process in Qemu will be terminated but without producing any
output.
This latter behavior is wrong. Emacs should handle interrupts
generated by pressing keys more gracefully.
The problem will also occur if you run Emacs without X11, but the SIGSTOP
will return the terminal to the shell, and you will have to put Emacs into
the foreground again with the fg command. Your terminal is most likely
messed up at that point, but the error message should still be visible.
* Probable Cause of the problem
The cause of the problem is that Emacs closes the socket after being
interrupted in connect(). That approach works with servers which accept
many connections but fails for servers which serve one connection only
as the example with netcat above did.
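The failure mode can be reproduced without Qemu. The following is only a
sketch, not Emacs code: the function name second_connect_errno is made up
for illustration, and it assumes a loopback interface and a server that
stops listening after its first accept(), which is how the netcat example
above is described as behaving. The client abandons its first socket and
tries again with a fresh one; the second attempt should then be refused.

```c
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical demo.  Returns the errno of the second connect()
   attempt (0 if it unexpectedly succeeds), or -1 if setup failed.  */
int
second_connect_errno (void)
{
  struct sockaddr_in addr;
  socklen_t len = sizeof addr;
  int result = -1;
  int conn = -1;
  int lsock = socket (AF_INET, SOCK_STREAM, 0);
  int c1 = socket (AF_INET, SOCK_STREAM, 0);
  int c2 = socket (AF_INET, SOCK_STREAM, 0);

  memset (&addr, 0, sizeof addr);
  addr.sin_family = AF_INET;
  addr.sin_addr.s_addr = htonl (INADDR_LOOPBACK);
  addr.sin_port = 0;            /* let the kernel pick a free port */

  if (lsock >= 0 && c1 >= 0 && c2 >= 0
      && bind (lsock, (struct sockaddr *) &addr, sizeof addr) == 0
      && listen (lsock, 1) == 0
      && getsockname (lsock, (struct sockaddr *) &addr, &len) == 0
      /* First attempt: the kernel completes the handshake.  */
      && connect (c1, (struct sockaddr *) &addr, sizeof addr) == 0
      && (conn = accept (lsock, NULL, NULL)) >= 0)
    {
      /* The one-shot server stops listening after its first accept.  */
      close (lsock);
      lsock = -1;
      /* The client throws the first socket away, as Emacs did
         after EINTR.  */
      close (c1);
      c1 = -1;
      /* Second attempt with a new socket: nobody is listening.  */
      if (connect (c2, (struct sockaddr *) &addr, sizeof addr) == 0)
        result = 0;
      else
        result = errno;
    }

  if (lsock >= 0) close (lsock);
  if (c1 >= 0) close (c1);
  if (c2 >= 0) close (c2);
  if (conn >= 0) close (conn);
  return result;
}
```

With a server that keeps accepting connections the second connect() would
succeed, which is why the close-and-retry approach goes unnoticed there.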
* Proposed fix
As described here:
http://www.madore.org/~david/computers/connect-intr.html
the recommended way to handle interrupts during connect() is to use
select() on the socket. The socket will become writable when the
connection is established or when an error occurs. The error can be
obtained with getsockopt.
The patch below implements just that.
The only other addition is the introduction of two macros,
EWOULDBLOCK_P and EINPROGRESS_P, whose only purpose is
to reduce #ifdef/#ifndef clutter.
Helmut
[connect.patch (text/x-diff, inline)]
--- process.c.~1.607.~ 2009-12-04 08:01:43.000000000 +0100
+++ process.c 2009-12-09 23:37:19.000000000 +0100
@@ -234,6 +234,18 @@
#endif /* NON_BLOCKING_CONNECT */
#endif /* BROKEN_NON_BLOCKING_CONNECT */
+#ifdef EWOULDBLOCK
+# define EWOULDBLOCK_P(x) (x == EWOULDBLOCK)
+#else
+# define EWOULDBLOCK_P(x) (0)
+#endif
+
+#ifdef EINPROGRESS
+# define EINPROGRESS_P(x) (x == EINPROGRESS)
+#else
+# define EINPROGRESS_P(x) (0)
+#endif
+
/* Define DATAGRAM_SOCKETS if datagrams can be used safely on
this system. We need to read full packets, so we need a
"non-destructive" select. So we require either native select,
@@ -3338,9 +3350,8 @@
{
#ifndef NON_BLOCKING_CONNECT
error ("Non-blocking connect not supported");
-#else
- is_non_blocking_client = 1;
#endif
+ is_non_blocking_client = 1;
}
name = Fplist_get (contact, QCname);
@@ -3566,10 +3577,8 @@
continue;
}
-#ifdef DATAGRAM_SOCKETS
if (!is_server && socktype == SOCK_DGRAM)
break;
-#endif /* DATAGRAM_SOCKETS */
#ifdef NON_BLOCKING_CONNECT
if (is_non_blocking_client)
@@ -3655,26 +3664,44 @@
ret = connect (s, lres->ai_addr, lres->ai_addrlen);
xerrno = errno;
- turn_on_atimers (1);
+ turn_on_atimers (1);
- if (ret == 0 || xerrno == EISCONN)
- {
+ if (ret == 0
+ || (EWOULDBLOCK_P (xerrno) && is_non_blocking_client)
+ || (EINPROGRESS_P (xerrno) && is_non_blocking_client))
/* The unwind-protect will be discarded afterwards.
Likewise for immediate_quit. */
break;
- }
-#ifdef NON_BLOCKING_CONNECT
-#ifdef EINPROGRESS
- if (is_non_blocking_client && xerrno == EINPROGRESS)
- break;
-#else
-#ifdef EWOULDBLOCK
- if (is_non_blocking_client && xerrno == EWOULDBLOCK)
- break;
-#endif
-#endif
-#endif
+ if (xerrno == EINTR)
+ {
+ /* Unlike most other syscalls connect() cannot be called
+ again. (That would return EALREADY.) The proper way to
+ wait for completion is select(). */
+ int sc;
+ fd_set fdset;
+ retry_select:
+ FD_ZERO (&fdset);
+ FD_SET (s, &fdset);
+ QUIT;
+ sc = select (s + 1, 0, &fdset, 0, 0);
+ if (sc == -1)
+ if (errno == EINTR)
+ goto retry_select;
+ else
+ report_file_error ("select failed", Qnil);
+ eassert (sc > 0);
+ {
+ int len = sizeof xerrno;
+ eassert (FD_ISSET (s, &fdset));
+ if (getsockopt (s, SOL_SOCKET, SO_ERROR, &xerrno, &len) == -1)
+ report_file_error ("getsockopt failed", Qnil);
+ if (xerrno != 0)
+ errno = xerrno, report_file_error ("error during connect", Qnil);
+ else
+ break;
+ }
+ }
immediate_quit = 0;
@@ -3682,9 +3709,6 @@
specpdl_ptr = specpdl + count1;
emacs_close (s);
s = -1;
-
- if (xerrno == EINTR)
- goto retry_connect;
}
if (s >= 0)
Added tag(s) patch. Request was from Glenn Morris <rgm <at> gnu.org> to control <at> debbugs.gnu.org. (Thu, 28 Jan 2010 00:14:01 GMT)
Reply sent to YAMAMOTO Mitsuharu <mituharu <at> math.s.chiba-u.ac.jp>: You have taken responsibility. (Thu, 25 Mar 2010 09:01:02 GMT)
Notification sent to Helmut Eller <eller.helmut <at> gmail.com>: bug acknowledged by developer. (Thu, 25 Mar 2010 09:01:02 GMT)
Message #12 received at 5173-done <at> debbugs.gnu.org (full text, mbox):
Closed with this change:
revno: 99750
author: Helmut Eller <eller.helmut <at> gmail.com>
committer: YAMAMOTO Mitsuharu <mituharu <at> math.s.chiba-u.ac.jp>
branch nick: trunk
timestamp: Thu 2010-03-25 17:48:52 +0900
message:
Call `select' for interrupted `connect' rather than creating new socket (Bug#5173).
bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Thu, 22 Apr 2010 11:24:03 GMT)
This bug report was last modified 15 years and 148 days ago.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.