Interrupts during connect() in make-network-process aren't handled
properly.

The following recipe to reproduce the problem is rather complicated.
You'll need:

 - Qemu
 - a kernel with tuntap support (/dev/net/tun)
 - tunctl (from uml-utilities)
 - a Linux image for Qemu. If you don't have one, use
   http://www.nongnu.org/qemu/linux-0.2.img.bz2
 - netcat

The problem occurs during connect(). To make this period longer and
more controllable we will use Qemu, so that we can stop and resume the
(virtual) TCP stack. Also note that we are dealing with signals here,
and that strace or gdb would interfere with the problem.

* Prepare Qemu

Our goal here is to start Qemu with a virtual network interface like
so:

    qemu -hda linux-0.2.img -net nic -net tap,ifname=qtap0,script=no

Before we can do that we need to create the qtap0 device:

    sudo tunctl -u USER -t qtap0   # replace USER with your user
    sudo ifconfig qtap0 192.168.255.1 up

192.168.255.1 will most likely work for you, but any non-conflicting
IP address will do.

We'll also need netcat on the virtual machine, so let's copy it to
the image:

    mkdir img
    sudo mount -o loop linux-0.2.img img
    sudo cp -L /bin/netcat img/usr/bin
    sudo umount img

Now try

    qemu -hda linux-0.2.img -net nic -net tap,ifname=qtap0,script=no

This should boot Linux and present you with a shell. In the shell,
configure the network device like so:

    sh-2.05b# ifconfig eth0 192.168.255.2

Make sure that you can ping that device from the host system with

    ping 192.168.255.2

If it doesn't work, check the output of route on the guest and the
host.

* Test netcat

Next test netcat. Inside Qemu do:

    sh-2.05b# netcat -l -p 44444

and on the host:

    netcat 192.168.255.2 44444

Everything you type on the host should be echoed on the guest. When
you abort with C-c, the netcat inside Qemu should also abort.

* Test Function

Now we are almost ready to run real tests. Create a file
connect-eintr.el containing the following function.

    (defun testit (vmpid)
      (switch-to-buffer "*Messages*")
      ;; Freeze Qemu so that the upcoming connect() cannot complete.
      (signal-process vmpid 'SIGSTOP)
      ;; In the background: stop Emacs, resume Qemu, then resume Emacs.
      (shell-command
       (concat
        (format "(sleep 0.4; kill -SIGSTOP %d; " (emacs-pid))
        (format " sleep 0.1; kill -SIGCONT %d; " vmpid)
        (format " sleep 3; kill -SIGCONT %d;)&" (emacs-pid))))
      (let ((sock (make-network-process
                   :name "test"
                   :service 44444
                   :host "192.168.255.2"
                   :sentinel (lambda (x y)
                               (error "sentinel: %s %s" x y)))))
        (process-send-string sock "foo")
        (process-send-string sock "bar")
        (process-send-string sock "baz\n")
        (message "ok")))

The function does the following steps:

 1) stop Qemu
 2) connect()
 3) stop Emacs
 4) resume Qemu
 5) resume Emacs
 6) write some output to the socket

From 2 to 4 Emacs will be inside connect() and we have plenty of time
to press a key to generate an interrupt.

* Run the test

Before running the function, create a listening socket inside Qemu as
above:

    sh-2.05b# netcat -l -p 44444

For the next step we need the process id of Qemu; let's call that
QPID. Use QPID in the following command line:

    emacs -Q -load connect-eintr.el -eval '(testit QPID)' -f kill-emacs

This starts Emacs and runs the test. If you're using X11 and don't
press any key, Emacs will terminate after a few seconds and foobarbaz
will appear in Qemu.

Restart netcat as above and re-run the test, but this time press a key
after Emacs' frame appears. This time Emacs will not terminate;
instead an error message will be visible in the *Messages* buffer.
Also, the netcat process in Qemu will be terminated without producing
any output.

This latter behavior is wrong. Emacs should handle interrupts
generated by pressing keys more gracefully.

The problem also occurs if you run Emacs without X11, but the SIGSTOP
will return the terminal to the shell and you have to put Emacs into
the foreground again with the fg command. Your terminal is most likely
messed up at that point, but the error message should still be
visible.

* Probable Cause of the problem

The cause of the problem is that Emacs closes the socket after being
interrupted in connect(). That approach works with servers which
accept many connections, but fails for servers which serve only a
single connection, as netcat does in the example above.

* Proposed fix

As described here:

    http://www.madore.org/~david/computers/connect-intr.html

the recommended way to handle interrupts during connect() is to use
select() on the socket. The socket will become writable when the
connection is established or when an error occurs. The error can be
obtained with getsockopt.

The patch below implements just that. The only other addition is the
introduction of two macros, EWOULDBLOCK_P and EINPROGRESS_P, whose
only purpose is to reduce #ifdef/#ifndef clutter.

Helmut