GNU bug report logs - #15025
emacs --daemon stuck in infinite loop
Reported by: Dan Nicolaescu <dann <at> gnu.org>
Date: Mon, 5 Aug 2013 12:37:01 UTC
Severity: important
Fixed in version 24.4
Done: Glenn Morris <rgm <at> gnu.org>
Bug is archived. No further changes may be made.
To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 15025 in the body.
You can then email your comments to 15025 AT debbugs.gnu.org in the normal way.
Report forwarded to bug-gnu-emacs <at> gnu.org: bug#15025; Package emacs. (Mon, 05 Aug 2013 12:37:01 GMT)
Message #3 received at submit <at> debbugs.gnu.org (full text, mbox):
This seems to be reproducible.
emacs compiled with Lucid toolkit
The recipe here uses Xnest because it is easy to kill/restart; probably the same
happens if the X session is killed.
Xnest :1&
xterm -display :1&
Now type in the xterm above:
emacs --daemon
In a different xterm type:
emacsclient -t Makefile
(or any file that exists).
C-z
now while emacsclient is suspended kill Xnest (using the window manager
close button)
Emacs daemon should still survive, but
emacsclient -t
cannot connect to it.
Looking in the debugger, emacs is stuck in an infinite loop in:
frame.c: next_frame
while (passed < 2)
passed never gets set to more than 1, so the loop never ends.
What is the intention of that code?
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#15025; Package emacs. (Tue, 06 Aug 2013 23:56:01 GMT)
Message #6 received at 15025 <at> debbugs.gnu.org (full text, mbox):
Dan Nicolaescu wrote:
> This seems to be reproducible.
>
> emacs compiled with Lucid toolkit
>
> The recipe here uses Xnest because it is easy to kill/restart; probably the same
> happens if the X session is killed.
>
> Xnest :1&
> xterm -display :1&
>
> Now type in the xterm above:
> emacs --daemon
>
> In a different xterm type:
> emacsclient -t Makefile
> (or any file that exists).
> C-z
>
> now while emacsclient is suspended kill Xnest (using the window manager
> close button)
>
> Emacs daemon should still survive, but
> emacsclient -t
> cannot connect to it.
>
> Looking in the debugger, emacs is stuck in an infinite loop in:
>
> frame.c: next_frame
>
> while (passed < 2)
>
> passed never gets set to more than 1, so the loop never ends.
>
> What is the intention of that code?
It seems this was introduced in
http://lists.gnu.org/archive/html/emacs-diffs/2012-12/msg00093.html
Dmitry, please could you take a look?
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#15025; Package emacs. (Wed, 07 Aug 2013 02:14:01 GMT)
Message #9 received at 15025 <at> debbugs.gnu.org (full text, mbox):
On 08/07/2013 03:55 AM, Glenn Morris wrote:
> Dan Nicolaescu wrote:
>
>> This seems to be reproducible.
>>
>> emacs compiled with Lucid toolkit
>>
>> The recipe here uses Xnest because it is easy to kill/restart; probably the same
>> happens if the X session is killed.
>>
>> Xnest :1&
>> xterm -display :1&
>>
>> Now type in the xterm above:
>> emacs --daemon
>>
>> In a different xterm type:
>> emacsclient -t Makefile
>> (or any file that exists).
>> C-z
Should I run "different xterm" connected to base (:0) server or nested (:1)?
Anyway, I can't reproduce it now. Could you please try to run Emacs daemon with -Q?
Dmitry
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#15025; Package emacs. (Fri, 09 Aug 2013 01:24:02 GMT)
Message #12 received at 15025 <at> debbugs.gnu.org (full text, mbox):
Dmitry Antipov <dmantipov <at> yandex.ru> writes:
> On 08/07/2013 03:55 AM, Glenn Morris wrote:
>
>> Dan Nicolaescu wrote:
>>
>>> This seems to be reproducible.
>>>
>>> emacs compiled with Lucid toolkit
>>>
>>> The recipe here uses Xnest because it is easy to kill/restart; probably the same
>>> happens if the X session is killed.
>>>
>>> Xnest :1&
>>> xterm -display :1&
>>>
>>> Now type in the xterm above:
>>> emacs --daemon
>>>
>>> In a different xterm type:
>>> emacsclient -t Makefile
>>> (or any file that exists).
>>> C-z
>
> Should I run "different xterm" connected to base (:0) server or nested (:1)?
one connected to base.
> Anyway, I can't reproduce it now. Could you please try to run Emacs daemon with -Q?
I rebuilt my emacs, and I can't reproduce it anymore. I do see the
problem from time to time on my work machine, but I cannot reproduce it
reliably.
What should I look for when that happens?
next_frame has that loop "while (passed < 2)" where emacs gets stuck,
but prev_frame does not. Any idea what can make it get stuck there?
I ran into another problem when trying to reproduce this:
Xnest :1&
xterm -display :1&
Now type in the xterm above:
emacs -Q --daemon
In a different xterm in the default display (not in Xnest) type:
emacsclient -t Makefile
(or any file that exists).
M-x
C-z
(suspend while in minibuffer)
kill Xnest
go to a different xterm and type
emacsclient -t FOO
where FOO is a file that exists. This will display "Makefile", not
"FOO"
--dan
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#15025; Package emacs. (Fri, 09 Aug 2013 02:35:01 GMT)
Message #15 received at 15025 <at> debbugs.gnu.org (full text, mbox):
On 08/09/2013 05:23 AM, Dan Nicolaescu wrote:
> What should I look for when that happens?
> next_frame has that loop "while (passed < 2)" where emacs gets stuck,
> but prev_frame does not. Any idea what can make it get stuck there?
Hm... strange values (dead frame?) in Vframe_list may be a reason.
Next time it gets stuck, attach gdb and examine:
1) `frame' arg of next_frame;
2) each entry in Vframe_list, like this:
(gdb) call debug_print (Vframe_list)
(#<frame emacs <at> localhost 0x102d390>)
(gdb) p *(struct frame *)0x102d390
$1 = {header = {size = 4611686018477891605}, name = {i = 12705553}, icon_name = {i = 10920482}, title = {i = 10920482},
focus_frame = {i = 10920482}, root_window = {i = 16966565}, selected_window = {i = 16966565}, minibuffer_window = {i = 16966997},
param_alist = {i = 19029702}, scroll_bars = {i = 17338645}, condemned_scroll_bars = {i = 10920482}, menu_bar_items = {i =
17188581}, face_alist = {i = 19032150}, menu_bar_vector = {i = 42561837}, buffer_predicate = {i = 10920482}, buffer_list = {i =
19028102}, buried_buffer_list = {i = 10920482}, tool_bar_window = {i = 10920482}, tool_bar_items = {i = 14190525},
tool_bar_position = {i = 10970402}, desired_tool_bar_string = {i = 10920482}, current_tool_bar_string = {i = 10920482},
face_cache = 0xc2f6a0, menu_bar_items_used = 0, namebuf = 0x0, current_pool = 0x0, desired_pool = 0x0, desired_matrix = 0x0,
current_matrix = 0x0, glyphs_initialized_p = 1, resized_p = 0, force_flush_display_p = 0, default_face_done_p = 1,
already_hscrolled_p = 0, updated_p = 1, minimize_tool_bar_window_p = 0, external_tool_bar = 1, tool_bar_lines = 0,
n_tool_bar_rows = 0, n_tool_bar_items = 13, decode_mode_spec_buffer = 0xc69d20 "", insert_line_cost = 0x0, delete_line_cost = 0x0,
insert_n_lines_cost = 0x0, delete_n_lines_cost = 0x0, text_lines = 34, text_cols = 80, total_lines = 0, total_cols = 84,
new_text_lines = 0, new_text_cols = 0, left_pos = 0, top_pos = 0, pixel_height = 612, pixel_width = 756, x_pixels_diff = 600,
y_pixels_diff = 85, win_gravity = 1, size_hint_flags = 0, border_width = 0, internal_border_width = 0, column_width = 9,
line_height = 18, output_method = output_x_window, terminal = 0xf7cab8, output_data = {tty = 0xc1e160, x = 0xc1e160, w32 =
0xc1e160, ns = 0xc1e160, nothing = 12706144}, font_driver_list = 0x136d520, font_data_list = 0xc7c1c0, fringe_cols = 2,
left_fringe_width = 9, right_fringe_width = 9, want_fullscreen = FULLSCREEN_NONE, menu_bar_lines = 0, external_menu_bar = 1,
visible = 1, iconified = 0, garbaged = 0, has_minibuffer = 1, wants_modeline = 1, auto_raise = 0, auto_lower = 0, no_split = 0,
explicit_name = 0, window_sizes_changed = 0, mouse_moved = 0, pointer_invisible = 0, vertical_scroll_bar_type =
vertical_scroll_bar_right, desired_cursor = FILLED_BOX_CURSOR, cursor_width = 1, blink_off_cursor = DEFAULT_CURSOR,
blink_off_cursor_width = 0, config_scroll_bar_width = 16, config_scroll_bar_cols = 2, scroll_bar_actual_width = 18,
cost_calculation_baud_rate = 19200, alpha = {-1, -1}, gamma = 0, extra_line_spacing = 0, background_pixel = 16777215,
foreground_pixel = 0}
In particular, if you find a frame with a zero f->terminal pointer,
we have dead frames in the game, which is definitely wrong.
Dmitry
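The check described above can be modelled as a scan for frames whose terminal pointer is null. This is a minimal sketch under assumptions: `struct listed_frame` and `count_dead_frames` are hypothetical names, and a plain array stands in for Vframe_list. In a real session the inspection is done from gdb, as shown in the message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an entry of the frame list: only the
   field that matters for the check is kept.  */
struct listed_frame
{
  const char *name;      /* label used for illustration only */
  const void *terminal;  /* NULL models a dead frame */
};

/* Count the entries whose terminal pointer is NULL, i.e. the
   "dead frames" that should not be on the list at all.  */
size_t
count_dead_frames (const struct listed_frame *frames, size_t n)
{
  assert (frames != NULL || n == 0);
  size_t dead = 0;
  for (size_t i = 0; i < n; i++)
    if (frames[i].terminal == NULL)
      dead++;
  return dead;
}
```

A nonzero result corresponds to the "definitely wrong" situation described in the message.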
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#15025; Package emacs. (Fri, 09 Aug 2013 09:13:01 GMT)
Message #18 received at 15025 <at> debbugs.gnu.org (full text, mbox):
>> next_frame has that loop "while (passed < 2)" where emacs gets stuck,
>> but prev_frame does not. Any idea what can make it get stuck there?
>
> Hm... strange values (dead frame?) in Vframe_list may be a reason.
IIUC failing to increment passed can happen only if the frame next_frame
got called with is not on Vframe_list. We could try asserting that it is.
martin
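A toy model of the point above, not the actual frame.c code: it assumes the loop advances `passed` only when a sweep over the frame list meets the frame that next_frame was called with. The names `frame_id`, `frame_on_list`, and `sweeps_until_done` are hypothetical, and the sweep cap exists only so that the model terminates where the real loop would spin.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a frame: just an integer id.  */
typedef int frame_id;

/* Membership test: the condition the suggested assertion would check.  */
bool
frame_on_list (const frame_id *list, size_t n, frame_id frame)
{
  for (size_t i = 0; i < n; i++)
    if (list[i] == frame)
      return true;
  return false;
}

/* Model of a `while (passed < 2)` scan.  Each sweep walks the whole
   list and bumps `passed` when it meets FRAME.  Returns the number of
   sweeps needed for `passed` to reach 2, or -1 if MAX_SWEEPS is used
   up first (the model's version of "the loop never ends").  */
int
sweeps_until_done (const frame_id *list, size_t n, frame_id frame,
                   int max_sweeps)
{
  assert (max_sweeps > 0);
  int passed = 0;
  for (int sweep = 1; sweep <= max_sweeps; sweep++)
    {
      for (size_t i = 0; i < n; i++)
        if (list[i] == frame)
          passed++;
      if (passed >= 2)
        return sweep;
    }
  return -1;
}
```

With the list {1, 2, 3}, `sweeps_until_done` finishes after two sweeps for frame 2 and reports -1 for a frame 9 that is not on the list. `frame_on_list` is the condition the suggested assertion would check.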
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#15025; Package emacs. (Fri, 09 Aug 2013 13:26:02 GMT)
Message #21 received at 15025 <at> debbugs.gnu.org (full text, mbox):
martin rudalics <rudalics <at> gmx.at> writes:
>>> next_frame has that loop "while (passed < 2)" where emacs gets stuck,
>>> but prev_frame does not. Any idea what can make it get stuck there?
>>
>> Hm... strange values (dead frame?) in Vframe_list may be a reason.
>
> IIUC failing to increment passed can happen only if the frame next_frame
> got called with is not on Vframe_list. We could try asserting that it is.
That gives the idea of killing multiple emacsclients at the same time...
This looks reproducible:
Xnest :1
xterm -display :1
in that xterm:
emacs -Q --daemon
from the :0 display create 2 more xterms in Xnest:
xterm -display :1
xterm -display :1
and in each of them do
emacsclient -t
C-z
in another xterm on :0 do:
emacsclient -t
C-z
at this point we should have 3 suspended emacsclients, 2 in Xnest, one
in the main display.
Kill Xnest
Now emacs is stuck. Vframe_list looks like it's empty.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#15025; Package emacs. (Fri, 09 Aug 2013 15:01:01 GMT)
Message #24 received at 15025 <at> debbugs.gnu.org (full text, mbox):
On 08/09/2013 05:25 PM, Dan Nicolaescu wrote:
> Now emacs is stuck. Vframe_list looks like it's empty.
...and you should hit `eassert (CONSP (Vframe_list))' in next_frame, but...
do you have ENABLE_CHECKING turned on? (Hunting for a bug without it
is just a waste of time).
Dmitry
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#15025; Package emacs. (Fri, 09 Aug 2013 17:10:02 GMT)
Message #27 received at 15025 <at> debbugs.gnu.org (full text, mbox):
> Now emacs is stuck. Vframe_list looks like it's empty.
So is it empty?
martin
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#15025; Package emacs. (Fri, 09 Aug 2013 19:15:02 GMT)
Message #30 received at 15025 <at> debbugs.gnu.org (full text, mbox):
Dmitry Antipov <dmantipov <at> yandex.ru> writes:
> On 08/09/2013 05:25 PM, Dan Nicolaescu wrote:
>
>> Now emacs is stuck. Vframe_list looks like it's empty.
>
> ...and you should hit `eassert (CONSP (Vframe_list))' in next_frame, but...
> do you have ENABLE_CHECKING turned on? (Hunting for a bug without it
> is just a waste of time).
It does not trigger that assert.
Vframe_list is:
(gdb) call debug_print(Vframe_list)
(gdb)
The backtrace is:
(gdb) xbacktrace
"delete-frame" (0x82f2d508)
"server-delete-client" (0x82f2da00)
"server-sentinel" (0x82f2ded8)
(gdb) bt
#0 next_frame (frame=0x1263165, minibuf=0xd47212)
at /fac2/vol6/software/dann/hack/emacs/emacs-writable/trunk/src/frame.c:1016
#1 0x0000000000421e14 in delete_frame (frame=0x1263165, force=0xd18092)
at /fac2/vol6/software/dann/hack/emacs/emacs-writable/trunk/src/frame.c:1216
#2 0x000000000042289a in Fdelete_frame (frame=0x1263165, force=0xd18092)
at /fac2/vol6/software/dann/hack/emacs/emacs-writable/trunk/src/frame.c:1455
#3 0x00000000005dfc9e in Ffuncall (nargs=0x2, args=0x7fff82f2d500)
at /fac2/vol6/software/dann/hack/emacs/emacs-writable/trunk/src/eval.c:2819
#4 0x0000000000628239 in exec_byte_code (bytestr=0x103bf01, vector=0x12c1925,
maxdepth=0x38, args_template=0x804, nargs=0x1, args=0x7fff82f2da08)
at /fac2/vol6/software/dann/hack/emacs/emacs-writable/trunk/src/bytecode.c:905
#5 0x00000000005e0483 in funcall_lambda (fun=0x11d3e3d, nargs=0x1,
arg_vector=0x7fff82f2da00)
at /fac2/vol6/software/dann/hack/emacs/emacs-writable/trunk/src/eval.c:2984
#6 0x00000000005dfe60 in Ffuncall (nargs=0x2, args=0x7fff82f2d9f8)
at /fac2/vol6/software/dann/hack/emacs/emacs-writable/trunk/src/eval.c:2865
#7 0x0000000000628239 in exec_byte_code (bytestr=0x113dc31, vector=0x128c3f5,
maxdepth=0x28, args_template=0x808, nargs=0x2, args=0x7fff82f2dee8)
at /fac2/vol6/software/dann/hack/emacs/emacs-writable/trunk/src/bytecode.c:905
#8 0x00000000005e0483 in funcall_lambda (fun=0x11ef1e5, nargs=0x2,
arg_vector=0x7fff82f2ded8)
at /fac2/vol6/software/dann/hack/emacs/emacs-writable/trunk/src/eval.c:2984
#9 0x00000000005dfe60 in Ffuncall (nargs=0x3, args=0x7fff82f2ded0)
at /fac2/vol6/software/dann/hack/emacs/emacs-writable/trunk/src/eval.c:2865
#10 0x00000000005def01 in Fapply (nargs=0x2, args=0x7fff82f2dfa0)
at /fac2/vol6/software/dann/hack/emacs/emacs-writable/trunk/src/eval.c:2355
#11 0x00000000005df58d in apply1 (fn=0x300b912, arg=0xec2596)
at /fac2/vol6/software/dann/hack/emacs/emacs-writable/trunk/src/eval.c:2589
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#15025; Package emacs. (Fri, 09 Aug 2013 20:36:02 GMT)
Message #33 received at 15025 <at> debbugs.gnu.org (full text, mbox):
> It does not trigger that assert.
So put in an assert that the argument frame of next_frame is
on Vframe_list.
martin
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#15025; Package emacs. (Fri, 09 Aug 2013 21:55:02 GMT)
Message #36 received at 15025 <at> debbugs.gnu.org (full text, mbox):
martin rudalics <rudalics <at> gmx.at> writes:
>> It does not trigger that assert.
>
> So put in an assert that the argument frame of next_frame is
> on Vframe_list.
Unfortunately I won't have time to debug this any time soon.
But the recipe I gave shows the problem every time, so hopefully someone
can track down the problem.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#15025; Package emacs. (Mon, 12 Aug 2013 08:11:02 GMT)
Message #39 received at 15025 <at> debbugs.gnu.org (full text, mbox):
On 08/10/2013 01:54 AM, Dan Nicolaescu wrote:
> Unfortunately I won't have time to debug this any time soon.
> But the recipe I gave shows the problem every time, so hopefully someone
> can track down the problem.
Finally I got it reproduced. Very nasty and long-standing issue (I'm pretty
sure that it was there before r111121).
The problem is that candidate_frame looks for 1) frames which share the
keyboard for graphical frames or 2) frames which share the TTY for termcap
frames. But with Dan's example, we have:
(gdb) p XFRAME (frame)
$20 = (struct frame *) 0x108c6e0 ; argument of `next_frame'
And in Vframe_list:
(gdb) p XFRAME (XCAR (Vframe_list))
$27 = (struct frame *) 0x10756e0
(gdb) p XFRAME (XCAR (Vframe_list))->terminal
$28 = (struct terminal *) 0x1075570 ; TTY /dev/pts/X
(gdb) p XFRAME (XCAR (Vframe_list))->output_method
$29 = output_termcap
(gdb) p XFRAME (XCAR (XCDR (Vframe_list)))
$30 = (struct frame *) 0x108c6e0
(gdb) p XFRAME (XCAR (XCDR (Vframe_list)))->terminal
$31 = (struct terminal *) 0xbeba70 ; TTY /dev/pts/Y
(gdb) p XFRAME (XCAR (XCDR (Vframe_list)))->output_method
$32 = output_termcap
Two TTY frames on different TTYs! So, candidate_frame always returns Qnil.
But the problem is much more interesting than just redesigning this:
if ((!FRAME_TERMCAP_P (c) && !FRAME_TERMCAP_P (f)
&& FRAME_KBOARD (c) == FRAME_KBOARD (f))
|| (FRAME_TERMCAP_P (c) && FRAME_TERMCAP_P (f)
&& FRAME_TTY (c) == FRAME_TTY (f)))
IIUC, the TTY "peers" (which were on xterms connected to :1) of Emacs frames become
invalid when Xnest dies. I suspect that we should have a method to handle such
a "TTY disconnect", but currently I have no idea how to implement this :-(.
Dmitry
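The condition quoted above can be exercised in isolation with a small model. This is a sketch under assumptions: `struct model_frame` and `same_input_source` are hypothetical stand-ins for the real frame structure and the FRAME_TERMCAP_P, FRAME_KBOARD, and FRAME_TTY macros, reduced to a flag and two opaque pointers. It only shows why two termcap frames on different TTYs never satisfy the test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, stripped-down frame: just the three properties the
   quoted condition looks at.  */
struct model_frame
{
  bool termcap;        /* true for a TTY (termcap) frame */
  const void *kboard;  /* keyboard object, compared for graphical frames */
  const void *tty;     /* TTY object, compared for termcap frames */
};

/* Mirror of the quoted test: two graphical frames must share a
   keyboard, two termcap frames must share a TTY, and a mixed pair
   never matches.  */
bool
same_input_source (const struct model_frame *c, const struct model_frame *f)
{
  assert (c != NULL && f != NULL);
  return ((!c->termcap && !f->termcap && c->kboard == f->kboard)
          || (c->termcap && f->termcap && c->tty == f->tty));
}
```

In the situation from the gdb session (two termcap frames, one on /dev/pts/X and one on /dev/pts/Y), the predicate is false for the pair, which matches the observation that no candidate is ever found.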
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#15025; Package emacs. (Tue, 13 Aug 2013 14:42:02 GMT)
Message #42 received at 15025 <at> debbugs.gnu.org (full text, mbox):
Dmitry Antipov <dmantipov <at> yandex.ru> writes:
> On 08/10/2013 01:54 AM, Dan Nicolaescu wrote:
>
>> Unfortunately I won't have time to debug this any time soon.
>> But the recipe I gave shows the problem every time, so hopefully someone
>> can track down the problem.
>
> Finally I got it reproduced. Very nasty and long-standing issue (I'm pretty
> sure that it was there before r111121).
>
> The problem is that candidate_frame looks for 1) frames which share the
> keyboard for graphical frames or 2) frames which share the TTY for termcap
> frames. But with Dan's example, we have:
>
> (gdb) p XFRAME (frame)
> $20 = (struct frame *) 0x108c6e0 ; argument of `next_frame'
>
> And in Vframe_list:
>
> (gdb) p XFRAME (XCAR (Vframe_list))
> $27 = (struct frame *) 0x10756e0
> (gdb) p XFRAME (XCAR (Vframe_list))->terminal
> $28 = (struct terminal *) 0x1075570 ; TTY /dev/pts/X
> (gdb) p XFRAME (XCAR (Vframe_list))->output_method
> $29 = output_termcap
>
> (gdb) p XFRAME (XCAR (XCDR (Vframe_list)))
> $30 = (struct frame *) 0x108c6e0
> (gdb) p XFRAME (XCAR (XCDR (Vframe_list)))->terminal
> $31 = (struct terminal *) 0xbeba70 ; TTY /dev/pts/Y
> (gdb) p XFRAME (XCAR (XCDR (Vframe_list)))->output_method
> $32 = output_termcap
>
> Two TTY frames on different TTYs! So, candidate_frame always returns Qnil.
Does this happen because two TTY frames got killed at the same time
(because the xterm they were running on was killed)?
Does emacs think that those frames are still alive?
What is the goal when looking for frames that share the TTY?
[I don't know what the code in question is trying to do...]
> But the problem is much more interesting than just redesigning this:
>
> if ((!FRAME_TERMCAP_P (c) && !FRAME_TERMCAP_P (f)
> && FRAME_KBOARD (c) == FRAME_KBOARD (f))
> || (FRAME_TERMCAP_P (c) && FRAME_TERMCAP_P (f)
> && FRAME_TTY (c) == FRAME_TTY (f)))
>
> IIUC, the TTY "peers" (which were on xterms connected to :1) of Emacs frames become
> invalid when Xnest dies. I suspect that we should have a method to handle such
> a "TTY disconnect", but currently I have no idea how to implement this :-(.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#15025; Package emacs. (Wed, 14 Aug 2013 17:44:01 GMT)
Message #45 received at 15025 <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
Dan,
could you please try this hack? You should see some noise from
lisp/server.el code (because the server process doesn't know about
dead terminals), and warnings about frame deletion. But it
shouldn't crash or get stuck, at least...
Dmitry
[bug15025_hack.patch (text/plain, attachment)]
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#15025; Package emacs. (Thu, 15 Aug 2013 15:41:01 GMT)
Message #48 received at 15025 <at> debbugs.gnu.org (full text, mbox):
Hopefully should be fixed in r113891.
Dmitry
bug marked as fixed in version 24.4, send any further explanations to
15025 <at> debbugs.gnu.org and Dan Nicolaescu <dann <at> gnu.org>
Request was from Glenn Morris <rgm <at> gnu.org> to control <at> debbugs.gnu.org. (Fri, 16 Aug 2013 18:29:02 GMT)
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#15025; Package emacs. (Fri, 16 Aug 2013 18:37:02 GMT)
Message #53 received at 15025-done <at> debbugs.gnu.org (full text, mbox):
Dmitry Antipov <dmantipov <at> yandex.ru> writes:
> Hopefully should be fixed in r113891.
Thanks, it looks like the infinite loop is gone now.
--dan
bug archived.
Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sat, 14 Sep 2013 11:24:04 GMT)
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.