GNU bug report logs - #63589
29.0.91; crash after creating graphical frames via emacsclient when compiled with cairo-xcb

Package: emacs;

Reported by: Thiago Melo <tmdmelo <at> gmail.com>

Date: Fri, 19 May 2023 15:22:03 UTC

Severity: normal

Found in version 29.0.91

To reply to this bug, email your comments to 63589 AT debbugs.gnu.org.


Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#63589; Package emacs. (Fri, 19 May 2023 15:22:03 GMT) Full text and rfc822 format available.

Acknowledgement sent to Thiago Melo <tmdmelo <at> gmail.com>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Fri, 19 May 2023 15:22:03 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Thiago Melo <tmdmelo <at> gmail.com>
To: bug-gnu-emacs <at> gnu.org
Subject: 29.0.91; crash after creating graphical frames via emacsclient when
 compiled with cairo-xcb
Date: Fri, 19 May 2023 11:17:36 +0000
[Message part 1 (text/plain, inline)]
With emacs 29 compiled with cairo-xcb, after starting emacs-daemon,
repeatedly closing the last graphical frame and creating a new one via
emacsclient will eventually crash emacs (it takes a few seconds for
me). During the process, some frames might fail to be created (they
briefly appear and close themselves).

After the crash, I get the following error messages from the tty where
I started the daemon:

```
emacs: ../../../../src/cairo-xcb-screen.c:219: _get_screen_index:
Assertion `!"reached"' failed.

Fatal error 6: Aborted
```

Affects starting emacs with `emacs -Q --daemon`.

Issue happens since commit de614ec9 ("Use Cairo XCB surfaces when XCB
is available").

Compiling emacs with only cairo is enough to trigger the bug.
 'configure --without-all --with-x-toolkit=no --with-cairo'

Attached are a gdb session log with backtraces and system information.

Looking up the cairo-xcb error message above, I found a related
discussion at the cairo mailing list:

https://lists.cairographics.org/archives/cairo/2017-December/028491.html

There, someone had the same issue with different software in a
similar scenario. One developer gives insight into the behavior and
suggests how to better manage cairo-xcb surfaces.

My workaround for now is patching emacs' configure.ac to disable cairo-xcb.
[system-information.txt (text/plain, attachment)]
[emacs-29-cairo-xcb-gdb-backtrace.org (application/vnd.lotus-organizer, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63589; Package emacs. (Sat, 20 May 2023 01:48:01 GMT) Full text and rfc822 format available.

Message #8 received at 63589 <at> debbugs.gnu.org (full text, mbox):

From: Po Lu <luangruo <at> yahoo.com>
To: Thiago Melo <tmdmelo <at> gmail.com>
Cc: 63589 <at> debbugs.gnu.org
Subject: Re: bug#63589: 29.0.91; crash after creating graphical frames via
 emacsclient when compiled with cairo-xcb
Date: Sat, 20 May 2023 09:46:45 +0800
Thiago Melo <tmdmelo <at> gmail.com> writes:

> emacs: ../../../../src/cairo-xcb-screen.c:219: _get_screen_index:
> Assertion `!"reached"' failed.
>
> Fatal error 6: Aborted
> ```
>
> Affects starting emacs with `emacs -Q --daemon`.
>
> Issue happens since commit de614ec9 ("Use Cairo XCB surfaces when XCB
> is available").

This is one bug.  Thanks for bringing it to our attention.  However,
this crash happens when a display connection is closed, which is not
common in normal use.  As the backtraces you attached show, an unrelated
X error is what caused a connection to be closed.

To really fix this bug, we need to know the details of the X error.
Once you reach the breakpoint on `x_error_quitter', would you please
run:

  (gdb) p *event

and send us the resulting print out?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63589; Package emacs. (Sat, 20 May 2023 11:49:01 GMT) Full text and rfc822 format available.

Message #11 received at 63589 <at> debbugs.gnu.org (full text, mbox):

From: Thiago Melo <tmdmelo <at> gmail.com>
To: Po Lu <luangruo <at> yahoo.com>
Cc: 63589 <at> debbugs.gnu.org
Subject: Re: bug#63589: 29.0.91; crash after creating graphical frames via
 emacsclient when compiled with cairo-xcb
Date: Sat, 20 May 2023 11:47:34 +0000
Thank you for looking at it.

> However,
> this crash happens when a display connection is closed, which is not
> common in normal use.  As the backtraces you attached show, an unrelated
> X error is what caused a connection to be closed.

I must clarify that, after these particular X errors happen (the ones
that trigger the x_error_quitter breakpoint), I might still be able to
create new frames without emacs crashing. And vice versa, emacs might
crash without these X errors happening. So, the issues might or might
not have a common underlying cause.

> To really fix this bug, we need to know the details of the X error.
> Once you reach the breakpoint on `x_error_quitter', would you please
> run:
>
>   (gdb) p *event
>
> and send us the resulting print out?

Sure. Just in case, this time I compiled emacs with better configure
options for debugging (`--enable-checking='yes,glyphs'
--enable-check-lisp-object-type  CFLAGS='-O0 -g3'`) and I was more
careful to run emacs with `-xrm "emacs.synchronous: true"`.

I must also highlight that the following errors in the backtrace
happen one right after the other (i.e., I'm unable to interact with
the zombie emacs frame in between).

```
Breakpoint 2, x_error_quitter (display=0x55555654f4f0,
    event=0x7fffffff71c0) at xterm.c:26126
26126      if (event->error_code == BadName)
(gdb) p *event
$1 = {
  type = 0,
  display = 0x55555654f4f0,
  resourceid = 54526136,
  serial = 706,
  error_code = 14 '\016',
  request_code = 1 '\001',
  minor_code = 0 '\000'
}
(gdb) continue
Continuing.

Breakpoint 2, x_error_quitter (display=0x55555654f4f0,
    event=0x7fffffff6a50) at xterm.c:26126
26126      if (event->error_code == BadName)
(gdb) p *event
$2 = {
  type = 0,
  display = 0x55555654f4f0,
  resourceid = 54526136,
  serial = 707,
  error_code = 3 '\003',
  request_code = 8 '\b',
  minor_code = 0 '\000'
}
(gdb) continue
Continuing.

Breakpoint 2, x_error_quitter (display=0x55555654f4f0,
    event=0x7fffffff6a50) at xterm.c:26126
26126      if (event->error_code == BadName)
(gdb) p *event
$3 = {
  type = 0,
  display = 0x55555654f4f0,
  resourceid = 54526136,
  serial = 708,
  error_code = 3 '\003',
  request_code = 12 '\f',
  minor_code = 0 '\000'
}
(gdb) continue
Continuing.
```




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63589; Package emacs. (Sat, 20 May 2023 22:49:01 GMT) Full text and rfc822 format available.

Message #14 received at 63589 <at> debbugs.gnu.org (full text, mbox):

From: Thiago Melo <tmdmelo <at> gmail.com>
To: 63589 <at> debbugs.gnu.org
Subject: [PATCH] 29.0.91; crash after creating graphical frames via
 emacsclient when compiled with cairo-xcb
Date: Sat, 20 May 2023 22:47:17 +0000
[Message part 1 (text/plain, inline)]
Here's a patch to fix this issue. It was created on top of the
emacs-29 branch, commit 6b60c81.

It's based on the suggestion from the cairo mailing list (see the link
I sent in my original message here). It ensures that the cairo device
associated with the cairo-xcb surfaces in the display is destroyed
before closing the display.

It can probably be improved. It could even be extended to handle
cairo-xlib. In the cairo mailing list, they mentioned one corner case
where the xlib device is not properly destroyed: when cairo is
unloaded before the X11 connection is closed.
[bugfix-63589.patch (text/x-patch, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63589; Package emacs. (Sun, 21 May 2023 00:43:02 GMT) Full text and rfc822 format available.

Message #17 received at 63589 <at> debbugs.gnu.org (full text, mbox):

From: Po Lu <luangruo <at> yahoo.com>
To: Thiago Melo <tmdmelo <at> gmail.com>
Cc: 63589 <at> debbugs.gnu.org
Subject: Re: bug#63589: 29.0.91; crash after creating graphical frames via
 emacsclient when compiled with cairo-xcb
Date: Sun, 21 May 2023 08:42:31 +0800
Thiago Melo <tmdmelo <at> gmail.com> writes:

> I must also highlight that the following errors in the backtrace
> happen one right after the other (i.e., I'm unable to interact with
> the zombie emacs frame in between).

Yes, I know.  Thanks.

> ```
> Breakpoint 2, x_error_quitter (display=0x55555654f4f0,
>     event=0x7fffffff71c0) at xterm.c:26126
> 26126      if (event->error_code == BadName)
> (gdb) p *event
> $1 = {
>   type = 0,
>   display = 0x55555654f4f0,
>   resourceid = 54526136,
>   serial = 706,
>   error_code = 14 '\016',
>   request_code = 1 '\001',
>   minor_code = 0 '\000'
> }
> (gdb) continue
> Continuing.

This means Emacs tried to create a window with an invalid XID.  Would
you please show the backtrace from this error, now that Emacs is
operating synchronously?

> Breakpoint 2, x_error_quitter (display=0x55555654f4f0,
>     event=0x7fffffff6a50) at xterm.c:26126
> 26126      if (event->error_code == BadName)
> (gdb) p *event
> $2 = {
>   type = 0,
>   display = 0x55555654f4f0,
>   resourceid = 54526136,
>   serial = 707,
>   error_code = 3 '\003',
>   request_code = 8 '\b',
>   minor_code = 0 '\000'
> }
> (gdb) continue
> Continuing.
>
> Breakpoint 2, x_error_quitter (display=0x55555654f4f0,
>     event=0x7fffffff6a50) at xterm.c:26126
> 26126      if (event->error_code == BadName)
> (gdb) p *event
> $3 = {
>   type = 0,
>   display = 0x55555654f4f0,
>   resourceid = 54526136,
>   serial = 708,
>   error_code = 3 '\003',
>   request_code = 12 '\f',
>   minor_code = 0 '\000'
> }
> (gdb) continue
> Continuing.
> ```

These further errors are simply a result of the invalid window being
used.
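
For reference, the numeric codes in the gdb output above map onto core
X11 protocol names. What follows is a minimal sketch covering only the
codes that appear in this report; the function names are hypothetical
and do not come from Emacs or Xlib.

```shell
#!/bin/sh
# Decode the core X11 error codes seen in this report.
x_error_name () {
    case "$1" in
        3)  echo BadWindow ;;
        14) echo BadIDChoice ;;
        15) echo BadName ;;
        *)  echo "unknown error $1" ;;
    esac
}

# Decode the core X11 request opcodes seen in this report.
x_request_name () {
    case "$1" in
        1)  echo CreateWindow ;;
        8)  echo MapWindow ;;
        12) echo ConfigureWindow ;;
        *)  echo "unknown request $1" ;;
    esac
}

# describe_x_error ERROR_CODE REQUEST_CODE
describe_x_error () {
    echo "$(x_error_name "$1") on $(x_request_name "$2")"
}

# The three events printed by gdb, in order.
describe_x_error 14 1
describe_x_error 3 8
describe_x_error 3 12
```

Read this way, the first event is a BadIDChoice on a CreateWindow
request, which matches the description of a window being created with
an invalid XID. The two BadWindow events come from MapWindow and
ConfigureWindow requests that carry the same resource ID.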




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63589; Package emacs. (Sun, 21 May 2023 13:42:01 GMT) Full text and rfc822 format available.

Message #20 received at 63589 <at> debbugs.gnu.org (full text, mbox):

From: Po Lu <luangruo <at> yahoo.com>
To: Thiago Melo <tmdmelo <at> gmail.com>
Cc: 63589 <at> debbugs.gnu.org
Subject: Re: bug#63589: [PATCH] 29.0.91; crash after creating graphical
 frames via emacsclient when compiled with cairo-xcb
Date: Sun, 21 May 2023 21:40:45 +0800
Thiago Melo <tmdmelo <at> gmail.com> writes:

> Here's a patch to fix this issue. It was created on top of the
> emacs-29 branch, commit 6b60c81.
>
> It's based on the suggestion from the cairo mailing list (see the link
> I sent in my original message here). It ensures that the cairo device
> associated with the cairo-xcb surfaces in the display is destroyed
> before closing the display.
>
> It can probably be improved. It could even be extended to handle
> cairo-xlib. In the cairo mailing list, they mentioned one corner case
> where the xlib device is not properly destroyed: when cairo is
> unloaded before the X11 connection is closed.

I would like to know the details of the X error that caused the display
connection to be closed in the first place: this change is too large for
the release branch, but we may be able to fix the X error.

Also, please keep in mind that our policy is to place a space between
the function identifier and the opening paren of its parameter list in
function calls, and that the device should probably be destroyed even if
the display no longer exists, which usually happens when
x_delete_terminal is called in response to an IO error.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63589; Package emacs. (Sun, 21 May 2023 14:30:02 GMT) Full text and rfc822 format available.

Message #23 received at 63589 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Po Lu <luangruo <at> yahoo.com>
Cc: 63589 <at> debbugs.gnu.org, tmdmelo <at> gmail.com
Subject: Re: bug#63589: [PATCH] 29.0.91;
 crash after creating graphical frames via emacsclient when compiled
 with cairo-xcb
Date: Sun, 21 May 2023 17:30:09 +0300
> Cc: 63589 <at> debbugs.gnu.org
> Date: Sun, 21 May 2023 21:40:45 +0800
> From:  Po Lu via "Bug reports for GNU Emacs,
>  the Swiss army knife of text editors" <bug-gnu-emacs <at> gnu.org>
> 
> Thiago Melo <tmdmelo <at> gmail.com> writes:
> 
> > Here's a patch to fix this issue. It was created on top of the
> > emacs-29 branch, commit 6b60c81.
> >
> > It's based on the suggestion from the cairo mailing list (see the link
> > I sent in my original message here). It ensures that the cairo device
> > associated with the cairo-xcb surfaces in the display is destroyed
> > before closing the display.
> >
> > It can probably be improved. It could even be extended to handle
> > cairo-xlib. In the cairo mailing list, they mentioned one corner case
> > where the xlib device is not properly destroyed: when cairo is
> > unloaded before the X11 connection is closed.
> 
> I would like to know the details of the X error that caused the display
> connection to be closed in the first place: this change is too large for
> the release branch, but we may be able to fix the X error.

What I would like to understand is how come this didn't happen until
now?  The Cairo build is the default since Emacs 28, is it not?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63589; Package emacs. (Sun, 21 May 2023 16:11:02 GMT) Full text and rfc822 format available.

Message #26 received at 63589 <at> debbugs.gnu.org (full text, mbox):

From: Thiago Melo <tmdmelo <at> gmail.com>
To: Po Lu <luangruo <at> yahoo.com>
Cc: 63589 <at> debbugs.gnu.org
Subject: Re: bug#63589: [PATCH] 29.0.91; crash after creating graphical frames
 via emacsclient when compiled with cairo-xcb
Date: Sun, 21 May 2023 16:09:40 +0000
(sorry, forgot to hit reply all)

> I would like to know the details of the X error that caused the display
> connection to be closed in the first place

I'm happy to contribute, but I'm sorry that I might not have much time
to do it right now. :(

We can do it slowly, but just in case I'm not around, I'm leaving here
one way to trigger this bug more automatically. After starting emacs
(compiled with cairo-xcb) in daemon mode, the bug can be triggered via
a shell script like this:


```
# some elisp code to close all graphical frames
ELISP="(mapcar (lambda (x) (when (frame-parameter x 'display)
(delete-frame x))) (frame-list))"

# repeatedly create graphical frames and close them all
for k in $(seq 10); do
    emacsclient -c -n -a /bin/false &&
        sleep 1 &&
        emacsclient -e "${ELISP}"
done
```
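
A possible variant of the loop above stops at the first cycle in which
emacsclient exits with a non-zero status, so the cycle count at the
time of the crash is recorded. This is only a sketch: the function
name `stress_frames` and the variables `EMACSCLIENT` and
`STRESS_DELAY` are hypothetical, and a non-zero exit from emacsclient
is assumed to mean that the daemon is gone.

```shell
#!/bin/sh
# Hypothetical wrapper around the reproduction loop from this message.
# EMACSCLIENT may be overridden, e.g. to point at a specific build.
EMACSCLIENT=${EMACSCLIENT:-emacsclient}

# Elisp that deletes every graphical frame (same form as above).
ELISP="(mapcar (lambda (x) (when (frame-parameter x 'display) (delete-frame x))) (frame-list))"

# stress_frames N: run up to N create/delete cycles.
# Prints the cycle at which emacsclient failed and returns 1,
# or prints a completion message and returns 0.
stress_frames () {
    n=$1
    k=1
    while [ "$k" -le "$n" ]; do
        # Create a graphical frame without waiting for it to close.
        if ! "$EMACSCLIENT" -c -n -a /bin/false; then
            echo "daemon not reachable at cycle $k"
            return 1
        fi
        sleep "${STRESS_DELAY:-1}"
        # Delete all graphical frames again.
        if ! "$EMACSCLIENT" -e "$ELISP" >/dev/null; then
            echo "daemon not reachable at cycle $k"
            return 1
        fi
        k=$((k + 1))
    done
    echo "completed $n cycles"
}
```

After starting the daemon, something like `stress_frames 10` would
correspond to the ten iterations of the original loop.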


In any case, I'll reply to you about the errors in the other message in
this thread, if you don't mind.

> this change is too large for
> the release branch, but we may be able to fix the X error.

No problem. At least, I wanted to point out one potential direction
for the solution. By the way, I've done my FSF Copyright assignment
already.

> Also, please keep in mind that our policy is to place a space between
> the function identifier and the opening paren of its parameter list in
> function calls

Thank you, I'll keep it in mind.

> and that the device should probably be destroyed even if
> the display no longer exists

It's puzzling, isn't it? The cairo dev also said it should be
destroyed, but that sometimes it doesn't happen for cairo-xcb when (1)
there's a leak somewhere or (2) during some non-leak cases they didn't
specify.

> which usually happens when
> x_delete_terminal is called in response to an IO error.

Here is one thing that I'd like clarified: it seems to me
that you don't expect the display to be closed, as you mentioned
before:

> However,
> this crash happens when a display connection is closed, which is not
> common in normal use.  As the backtraces you attached show, an unrelated
> X error is what caused a connection to be closed.

The thing is, with an emacs daemon, after I close the last graphical
frame, x_delete_terminal is always called and the display is always
closed. It happens no matter if I build it with or without cairo, with
errors or without errors.

So I probably misunderstood something here; please do let me know if I
did. Or perhaps these things only happen on my computer. I would also
love to know if anyone else is able to replicate this issue.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63589; Package emacs. (Sun, 21 May 2023 16:12:01 GMT) Full text and rfc822 format available.

Message #29 received at 63589 <at> debbugs.gnu.org (full text, mbox):

From: Thiago Melo <tmdmelo <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: Po Lu <luangruo <at> yahoo.com>, 63589 <at> debbugs.gnu.org
Subject: Re: bug#63589: [PATCH] 29.0.91; crash after creating graphical frames
 via emacsclient when compiled with cairo-xcb
Date: Sun, 21 May 2023 16:10:35 +0000
(sorry, forgot to hit reply all)

> What I would like to understand is how come this didn't happen until
> now?  The Cairo build is the default since Emacs 28, is it not?

This is not just about cairo, but about cairo with xcb surfaces. That
was introduced in commit de614ec9, which is part of emacs 29. If I
understood correctly, emacs + cairo previously used the xlib device only.

I'll write here again the link to the relevant discussion in the cairo
mailing list:

https://lists.cairographics.org/archives/cairo/2017-December/028491.html

Please do take a look. But to summarize: the cairo-xcb device is not
always destroyed when the display closes. However, the cairo-xlib
device is pretty much always destroyed.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63589; Package emacs. (Sun, 21 May 2023 17:43:01 GMT) Full text and rfc822 format available.

Message #32 received at 63589 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Thiago Melo <tmdmelo <at> gmail.com>
Cc: luangruo <at> yahoo.com, 63589 <at> debbugs.gnu.org
Subject: Re: bug#63589: [PATCH] 29.0.91; crash after creating graphical frames
 via emacsclient when compiled with cairo-xcb
Date: Sun, 21 May 2023 20:42:25 +0300
> From: Thiago Melo <tmdmelo <at> gmail.com>
> Date: Sun, 21 May 2023 16:10:35 +0000
> Cc: Po Lu <luangruo <at> yahoo.com>, 63589 <at> debbugs.gnu.org
> 
> > What I would like to understand is how come this didn't happen until
> > now?  The Cairo build is the default since Emacs 28, is it not?
> 
> This is not just about cairo, but about cairo with xcb surfaces. That
> was introduced in commit de614ec9, which is part of emacs 29. If I
> understood correctly, emacs + cairo previously used the xlib device only.

Which means we must fix this in Emacs 29.1.  If the right fix is too
unsafe for that, perhaps the alternative is to make the xcb surfaces
support be off by default, unless Emacs is explicitly configured to
use it.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63589; Package emacs. (Sun, 21 May 2023 18:27:02 GMT) Full text and rfc822 format available.

Message #35 received at 63589 <at> debbugs.gnu.org (full text, mbox):

From: Thiago Melo <tmdmelo <at> gmail.com>
To: Po Lu <luangruo <at> yahoo.com>
Cc: 63589 <at> debbugs.gnu.org
Subject: Re: bug#63589: 29.0.91; crash after creating graphical frames via
 emacsclient when compiled with cairo-xcb
Date: Sun, 21 May 2023 18:25:17 +0000
> > Breakpoint 2, x_error_quitter (display=0x55555654f4f0,
> >     event=0x7fffffff71c0) at xterm.c:26126
> > 26126      if (event->error_code == BadName)
> > (gdb) p *event
> > $1 = {
> >   type = 0,
> >   display = 0x55555654f4f0,
> >   resourceid = 54526136,
> >   serial = 706,
> >   error_code = 14 '\016',
> >   request_code = 1 '\001',
> >   minor_code = 0 '\000'
> > }
> > (gdb) continue
> > Continuing.
>
> This means Emacs tried to create a window with an invalid XID.  Would
> you please show the backtrace from this error, now that Emacs is
> operating synchronously?

There you go:


$ gdb --args ./emacs-cairo-xcb -xrm "emacs.synchronous: true" -Q --fg-daemon=test
GNU gdb (Debian 10.1-1.7) 10.1.90.20210103-git
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./emacs-cairo-xcb...
SIGINT is used by the debugger.
Are you sure you want to change it? (y or n) [answered Y; input not
from terminal]
DISPLAY = :0
TERM = xterm-256color
Breakpoint 1 at 0x1de341: file emacs.c, line 427.
Breakpoint 2 at 0x1ad020: file xterm.c, line 26126.
(gdb) run
Starting program: /dev/shm/src/emacs-29.0.91/src/emacs-cairo-xcb -xrm emacs.synchronous:\ true -Q --fg-daemon=test
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Starting Emacs daemon.

Breakpoint 2, x_error_quitter (display=0x55555629dca0,
    event=0x7fffffff71a0) at xterm.c:26126
26126      if (event->error_code == BadName)
(gdb) p *event
$1 = {
  type = 0,
  display = 0x55555629dca0,
  resourceid = 41943224,
  serial = 707,
  error_code = 14 '\016',
  request_code = 1 '\001',
  minor_code = 0 '\000'
}
(gdb) backtrace
#0  x_error_quitter (display=0x55555629dca0, event=0x7fffffff71a0) at
xterm.c:26126
#1  0x0000555555701000 in x_error_handler (display=0x55555629dca0,
event=0x7fffffff71a0) at xterm.c:26107
#2  0x00007ffff7e6e864 in _XError () from /usr/lib/x86_64-linux-gnu/libX11.so.6
#3  0x00007ffff7e6b327 in ?? () from /usr/lib/x86_64-linux-gnu/libX11.so.6
#4  0x00007ffff7e6b3c5 in ?? () from /usr/lib/x86_64-linux-gnu/libX11.so.6
#5  0x00007ffff7e6bffa in _XEventsQueued () from
/usr/lib/x86_64-linux-gnu/libX11.so.6
#6  0x00007ffff7e5d931 in XPending () from /usr/lib/x86_64-linux-gnu/libX11.so.6
#7  0x00005555556fe44b in XTread_socket (terminal=0x55555628c1c0,
hold_quit=0x7fffffff7440) at xterm.c:24773
#8  0x000055555574d48d in gobble_input () at keyboard.c:7426
#9  0x000055555574d97a in handle_async_input () at keyboard.c:7657
#10 0x000055555574d999 in process_pending_signals () at keyboard.c:7671
#11 0x000055555574d9d9 in unblock_input_to (level=0) at keyboard.c:7686
#12 0x000055555574d9fd in unblock_input () at keyboard.c:7705
#13 0x00005555558da91b in ftcrfont_text_extents (font=0x5555561339c0,
code=0x7fffffff7608, nglyphs=1, metrics=0x555555f79a18 <metrics>)
    at ftcrfont.c:430
#14 0x000055555561df02 in get_per_char_metric (font=0x5555561339c0,
char2b=0x7fffffff7608) at xdisp.c:29776
#15 0x0000555555626ec6 in gui_produce_glyphs (it=0x7fffffff7760) at
xdisp.c:31946
#16 0x0000555555625acb in produce_special_glyphs (it=0x7fffffff8ba0,
what=IT_CONTINUATION) at xdisp.c:31556
#17 0x00005555555c5790 in init_iterator (it=0x7fffffff8ba0,
w=0x55555628c650, charpos=-1, bytepos=-1, row=0x0,
base_face_id=DEFAULT_FACE_ID)
    at xdisp.c:3321
#18 0x00005555555e70e7 in gui_consider_frame_title
(frame=XIL(0x55555628c3e5)) at xdisp.c:13566
#19 0x00005555555e7690 in prepare_menu_bars () at xdisp.c:13682
#20 0x00005555555ef199 in redisplay_internal () at xdisp.c:16602
#21 0x00005555555f1235 in redisplay_preserve_echo_area (from_where=13)
at xdisp.c:17359
#22 0x0000555555888aab in Fdelete_process
(process=XIL(0x5555562de7cd)) at process.c:1120
#23 0x00005555558256e0 in funcall_subr (subr=0x555555f71320
<Sdelete_process>, numargs=1, args=0x7ffff5bff2b0) at eval.c:3034
#24 0x00005555558802bf in exec_byte_code (fun=XIL(0x7ffff657b8d5),
args_template=514, nargs=2, args=0x7ffff5bff2c0) at bytecode.c:809
#25 0x0000555555825a66 in fetch_and_exec_byte_code
(fun=XIL(0x5555560c6cbd), args_template=514, nargs=2,
args=0x7fffffffbc38) at eval.c:3081
#26 0x0000555555825ed2 in funcall_lambda (fun=XIL(0x5555560c6cbd),
nargs=2, arg_vector=0x7fffffffbc38) at eval.c:3153
#27 0x00005555558251bf in funcall_general (fun=XIL(0x5555560c6cbd),
numargs=2, args=0x7fffffffbc38) at eval.c:2945
#28 0x00005555558254c1 in Ffuncall (nargs=3, args=0x7fffffffbc30) at eval.c:2995
#29 0x0000555555824727 in Fapply (nargs=2, args=0x7fffffffbcf0) at eval.c:2666
#30 0x0000555555824dd9 in apply1 (fn=XIL(0xd99d0),
arg=XIL(0x5555563b0dd3)) at eval.c:2882
#31 0x0000555555894e46 in read_process_output_call
(fun_and_args=XIL(0x5555563b0de3)) at process.c:6070
#32 0x0000555555820bb0 in internal_condition_case_1
(bfun=0x555555894db9 <read_process_output_call>,
arg=XIL(0x5555563b0de3), handlers=XIL(0x90),
    hfun=0x555555894e48 <read_process_output_error_handler>) at eval.c:1498
#33 0x00005555558957b0 in read_and_dispose_of_process_output (p=0x5555562de7c8,
    chars=0x7fffffffbe10 "-env SHELL=/bin/bash -env
SESSION_MANAGER=local/debian-x250:@/tmp/.ICE-unix/1634,unix/debian-x250:/tmp/.ICE-unix/1634
-env WINDOWID=23179042 -env QT_ACCESSIBILITY=1 -env
COLORTERM=truecolor -env XDG_C"..., nbytes=2923,
coding=0x5555560f5840) at process.c:6294
#34 0x0000555555895390 in read_process_output
(proc=XIL(0x5555562de7cd), channel=5) at process.c:6204
#35 0x0000555555894585 in wait_reading_process_output (time_limit=0,
nsecs=0, read_kbd=-1, do_display=true, wait_for_cell=XIL(0),
wait_proc=0x0, just_wait_proc=0) at process.c:5888
#36 0x000055555574355b in kbd_buffer_get_event (kbp=0x7fffffffd4b8,
used_mouse_menu=0x7fffffffdb5f, end_time=0x0) at keyboard.c:4012
#37 0x000055555573ded8 in read_event_from_main_queue (end_time=0x0,
local_getcjmp=0x7fffffffd930, used_mouse_menu=0x7fffffffdb5f) at
keyboard.c:2279
#38 0x000055555573e288 in read_decoded_event_from_main_queue
(end_time=0x0, local_getcjmp=0x7fffffffd930, prev_event=XIL(0),
used_mouse_menu=0x7fffffffdb5f) at keyboard.c:2343
#39 0x000055555574042e in read_char (commandflag=1,
map=XIL(0x5555563a8f33), prev_event=XIL(0),
used_mouse_menu=0x7fffffffdb5f, end_time=0x0) at keyboard.c:2973
#40 0x0000555555754a7b in read_key_sequence (keybuf=0x7fffffffdcf0,
prompt=XIL(0), dont_downcase_last=false, can_return_switch_frame=true,
fix_current_buffer=true, prevent_redisplay=false) at keyboard.c:10083
#41 0x000055555573b05a in command_loop_1 () at keyboard.c:1384
#42 0x0000555555820ad5 in internal_condition_case (bfun=0x55555573ac30
<command_loop_1>, handlers=XIL(0x90), hfun=0x55555573a09c <cmd_error>)
at eval.c:1474
#43 0x000055555573a819 in command_loop_2 (handlers=XIL(0x90)) at keyboard.c:1133
#44 0x000055555581fd0e in internal_catch (tag=XIL(0xf240),
func=0x55555573a7f2 <command_loop_2>, arg=XIL(0x90)) at eval.c:1197
#45 0x000055555573a7ae in command_loop () at keyboard.c:1111
#46 0x0000555555739b5f in recursive_edit_1 () at keyboard.c:720
#47 0x0000555555739d7c in Frecursive_edit () at keyboard.c:803
#48 0x000055555573556a in main (argc=5, argv=0x7fffffffe238) at emacs.c:2529

Lisp Backtrace:
"redisplay_internal (C function)" (0x0)
"delete-process" (0xf5bff2b0)
"server-delete-client" (0xf5bff240)
"server-execute" (0xf5bff1a0)
0x5606ddf0 PVEC_COMPILED
"server-execute-continuation" (0xf5bff0c8)
"server-process-filter" (0xffffbc38)




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63589; Package emacs. (Mon, 22 May 2023 00:58:02 GMT) Full text and rfc822 format available.

Message #38 received at 63589 <at> debbugs.gnu.org (full text, mbox):

From: Po Lu <luangruo <at> yahoo.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 63589 <at> debbugs.gnu.org, Thiago Melo <tmdmelo <at> gmail.com>
Subject: Re: bug#63589: [PATCH] 29.0.91; crash after creating graphical
 frames via emacsclient when compiled with cairo-xcb
Date: Mon, 22 May 2023 08:56:51 +0800
Eli Zaretskii <eliz <at> gnu.org> writes:

> Which means we must fix this in Emacs 29.1.  If the right fix is too
> unsafe for that, perhaps the alternative is to make the xcb surfaces
> support be off by default, unless Emacs is explicitly configured to
> use it.

The situation in which this crash occurs is sufficiently uncommon.  It's
the result of another bug in Emacs, hopefully one that should be safe to
fix.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63589; Package emacs. (Mon, 22 May 2023 01:06:02 GMT) Full text and rfc822 format available.

Message #41 received at 63589 <at> debbugs.gnu.org (full text, mbox):

From: Po Lu <luangruo <at> yahoo.com>
To: Thiago Melo <tmdmelo <at> gmail.com>
Cc: 63589 <at> debbugs.gnu.org
Subject: Re: bug#63589: [PATCH] 29.0.91; crash after creating graphical
 frames via emacsclient when compiled with cairo-xcb
Date: Mon, 22 May 2023 09:05:41 +0800
Thiago Melo <tmdmelo <at> gmail.com> writes:

> I'm happy to contribute, but I'm sorry that I might not have much time
> to do it right now. :(
>
> We can do it slowly, but just in case I'm not around, I'm leaving here
> one way to trigger this bug more automatically. After starting emacs
> (compiled with cairo-xcb) in daemon mode, the bug can be triggered via
> a shell script like this:
>
>
> ```
> # some elisp code to close all graphical frames
> ELISP="(mapcar (lambda (x) (when (frame-parameter x 'display)
> (delete-frame x))) (frame-list))"
>
> # repeatedly create graphical frames and close them all
> for k in $(seq 10); do
>     emacsclient -c -n -a /bin/false &&
>         sleep 1 &&
>         emacsclient -e "${ELISP}"
> done
> ```
>
>
> In any case, I'll reply to you about the errors in the other message in
> this thread, if you don't mind.

I will try to look into this, thanks.

> The thing is, with an emacs daemon, after I close the last graphical
> frame, x_delete_terminal is always called and the display is always
> closed. It happens no matter if I build it with or without cairo, with
> errors or without errors.

Which X toolkit did you build Emacs with?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63589; Package emacs. (Mon, 22 May 2023 02:50:02 GMT) Full text and rfc822 format available.

Message #44 received at 63589 <at> debbugs.gnu.org (full text, mbox):

From: Po Lu <luangruo <at> yahoo.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 63589 <at> debbugs.gnu.org, Thiago Melo <tmdmelo <at> gmail.com>
Subject: Re: bug#63589: [PATCH] 29.0.91; crash after creating graphical
 frames via emacsclient when compiled with cairo-xcb
Date: Mon, 22 May 2023 10:48:43 +0800
Po Lu <luangruo <at> yahoo.com> writes:

> The situation in which this crash occurs is sufficiently uncommon.  It's
> the result of another bug in Emacs, hopefully one that should be safe to
> fix.

Unfortunately, both this crash and its cause (actually, a
RenderBadPicture from a glyph compositing request somewhere within
cairo) are bugs in cairo-xcb itself.  Emacs never allows the display
connection to be closed without dereferencing all Cairo resources
created for that display connection, but Cairo keeps its own references
around.

The only reasonable solution is to disable the use of XCB surfaces by
default.  Is this OK for the release branch?

diff --git a/configure.ac b/configure.ac
index 95167329c28..d7296168ff9 100644
--- a/configure.ac
+++ b/configure.ac
@@ -459,6 +459,7 @@ AC_DEFUN
 OPTION_DEFAULT_ON([lcms2],[don't compile with Little CMS support])
 OPTION_DEFAULT_ON([libsystemd],[don't compile with libsystemd support])
 OPTION_DEFAULT_ON([cairo],[don't compile with Cairo drawing])
+OPTION_DEFAULT_OFF([cairo-xcb], [use XCB surfaces for Cairo support])
 OPTION_DEFAULT_ON([xml2],[don't compile with XML parsing support])
 OPTION_DEFAULT_OFF([imagemagick],[compile with ImageMagick image support])
 OPTION_DEFAULT_ON([native-image-api], [don't use native image APIs (GDI+ on Windows)])
@@ -3571,14 +3572,19 @@ AC_DEFUN
     CAIRO_MODULE="cairo >= $CAIRO_REQUIRED"
     EMACS_CHECK_MODULES([CAIRO], [$CAIRO_MODULE])
     if test $HAVE_CAIRO = yes; then
-      CAIRO_XCB_MODULE="cairo-xcb >= $CAIRO_REQUIRED"
-      EMACS_CHECK_MODULES([CAIRO_XCB], [$CAIRO_XCB_MODULE])
-      if test $HAVE_CAIRO_XCB = yes; then
-	CAIRO_CFLAGS="$CAIRO_CFLAGS $CAIRO_XCB_CFLAGS"
-	CAIRO_LIBS="$CAIRO_LIBS $CAIRO_XCB_LIBS"
-	AC_DEFINE([USE_CAIRO_XCB], [1],
-	  [Define to 1 if cairo XCB surfaces are available.])
-      fi
+      dnl Cairo XCB support is disabled by default, as the Cairo XCB
+      dnl backend itself seems to be buggy: multiple Cairo devices can
+      dnl be created for the same visual on the same connection, and
+      dnl the devices are never destroyed, even when all references go
+      dnl away.
+      AS_IF([test "x$with_cairo_xcb" = "xyes"], [
+	CAIRO_XCB_MODULE="cairo-xcb >= $CAIRO_REQUIRED"
+	EMACS_CHECK_MODULES([CAIRO_XCB], [$CAIRO_XCB_MODULE])
+	AS_IF([test "x$HAVE_CAIRO_XCB" = "xyes"], [
+	  CAIRO_CFLAGS="$CAIRO_CFLAGS $CAIRO_XCB_CFLAGS"
+	  CAIRO_LIBS="$CAIRO_LIBS $CAIRO_XCB_LIBS"
+	  AC_DEFINE([USE_CAIRO_XCB], [1],
+	    [Define to 1 if cairo XCB surfaces are available.])])])
       AC_DEFINE([USE_CAIRO], [1], [Define to 1 if using cairo.])
       CFLAGS="$CFLAGS $CAIRO_CFLAGS"
       LIBS="$LIBS $CAIRO_LIBS"
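
With the option off by default, a build that still wants XCB surfaces would presumably have to ask for them with `./configure --with-cairo-xcb` (the flag name is inferred from the `OPTION_DEFAULT_OFF([cairo-xcb], ...)` line above). A small sketch for checking whether a configured tree actually enabled it; the helper name `has_cairo_xcb` and the `src/config.h` default path are assumptions for illustration, not part of the patch:

```shell
#!/bin/sh
# Hypothetical helper (not part of Emacs): report whether a generated
# config.h defines USE_CAIRO_XCB, the macro set by AC_DEFINE in the patch.
has_cairo_xcb () {
    # $1: path to config.h; src/config.h is assumed as the default location.
    grep -Eq '^#define[[:space:]]+USE_CAIRO_XCB[[:space:]]+1' "${1:-src/config.h}"
}
```

From the top of a configured build tree, `has_cairo_xcb src/config.h && echo "XCB surfaces enabled"` would then print a line only when the macro is defined.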




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63589; Package emacs. (Mon, 22 May 2023 05:24:01 GMT) Full text and rfc822 format available.

Message #47 received at 63589 <at> debbugs.gnu.org (full text, mbox):

From: Thiago Melo <tmdmelo <at> gmail.com>
To: Po Lu <luangruo <at> yahoo.com>
Cc: 63589 <at> debbugs.gnu.org
Subject: Re: bug#63589: [PATCH] 29.0.91; crash after creating graphical frames
 via emacsclient when compiled with cairo-xcb
Date: Mon, 22 May 2023 05:23:08 +0000
> I will try to look into this, thanks.

Thank you, Po Lu.

> > The thing is, with an emacs daemon, after I close the last graphical
> > frame, x_delete_terminal is always called and the display is always
> > closed. It happens no matter if I build it with or without cairo, with
> > errors or without errors.
>
> Which X toolkit did you build Emacs with?

--with-x-toolkit=no

Your question made me take a look at `delete_frame` in frame.c and
realize that the display is not closed in this situation with Lucid or
GTK. My bad, I didn't test these before. That makes the conditions
for this bug even more uncommon.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63589; Package emacs. (Mon, 22 May 2023 11:00:02 GMT) Full text and rfc822 format available.

Message #50 received at 63589 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Po Lu <luangruo <at> yahoo.com>
Cc: 63589 <at> debbugs.gnu.org, tmdmelo <at> gmail.com
Subject: Re: bug#63589: [PATCH] 29.0.91; crash after creating graphical
 frames via emacsclient when compiled with cairo-xcb
Date: Mon, 22 May 2023 13:59:33 +0300
> From: Po Lu <luangruo <at> yahoo.com>
> Cc: 63589 <at> debbugs.gnu.org,  Thiago Melo <tmdmelo <at> gmail.com>
> Date: Mon, 22 May 2023 10:48:43 +0800
> 
> Po Lu <luangruo <at> yahoo.com> writes:
> 
> > The situation in which this crash occurs is sufficiently uncommon.  It's
> > the result of another bug in Emacs, hopefully one that should be safe to
> > fix.
> 
> Unfortunately, both this crash and its cause (actually, a
> RenderBadPicture from a glyph compositing request somewhere within
> cairo) are bugs in cairo-xcb itself.  Emacs never allows the display
> connection to be closed without releasing all Cairo resources
> created for that display connection, but Cairo keeps its own references
> around.

Was this bug reported to the relevant Cairo developers?

> The only reasonable solution is to disable the use of XCB surfaces by
> default.  Is this OK for the release branch?

It's OK, but please also add to NEWS some short notice about this
option and its potential pitfalls, which explain why it is off by
default.  Perhaps also about its advantages, so that users could make
up their minds.

What is the kind of situations in which these crashes could happen?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63589; Package emacs. (Mon, 22 May 2023 11:19:01 GMT) Full text and rfc822 format available.

Message #53 received at 63589 <at> debbugs.gnu.org (full text, mbox):

From: Po Lu <luangruo <at> yahoo.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 63589 <at> debbugs.gnu.org, tmdmelo <at> gmail.com
Subject: Re: bug#63589: [PATCH] 29.0.91; crash after creating graphical
 frames via emacsclient when compiled with cairo-xcb
Date: Mon, 22 May 2023 19:17:44 +0800
Eli Zaretskii <eliz <at> gnu.org> writes:

> Was this bug reported to the relevant Cairo developers?

I will get to that soon.

> It's OK, but please also add to NEWS some short notice about this
> option and its potential pitfalls, which explain why it is off by
> default.  Perhaps also about its advantages, so that users could make
> up their minds.

The advantage is that it is moderately faster when Emacs is running over
the network.

> What is the kind of situations in which these crashes could happen?

Precisely that described in this bug report: when displays are closed
and reopened within a short time period.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63589; Package emacs. (Mon, 22 May 2023 11:41:01 GMT) Full text and rfc822 format available.

Message #56 received at 63589 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Po Lu <luangruo <at> yahoo.com>
Cc: 63589 <at> debbugs.gnu.org, tmdmelo <at> gmail.com
Subject: Re: bug#63589: [PATCH] 29.0.91; crash after creating graphical
 frames via emacsclient when compiled with cairo-xcb
Date: Mon, 22 May 2023 14:40:54 +0300
> From: Po Lu <luangruo <at> yahoo.com>
> Cc: 63589 <at> debbugs.gnu.org,  tmdmelo <at> gmail.com
> Date: Mon, 22 May 2023 19:17:44 +0800
> 
> Eli Zaretskii <eliz <at> gnu.org> writes:
> 
> > Was this bug reported to the relevant Cairo developers?
> 
> I will get to that soon.
> 
> > It's OK, but please also add to NEWS some short notice about this
> > option and its potential pitfalls, which explain why it is off by
> > default.  Perhaps also about its advantages, so that users could make
> > up their minds.
> 
> The advantage is that it is moderately faster when Emacs is running over
> the network.

OK, so let's mention that in NEWS.

> > What is the kind of situations in which these crashes could happen?
> 
> Precisely that described in this bug report: when displays are closed
> and reopened within a short time period.

What kind of user-level situations could cause this?  Is invoking
emacsclient soon after deleting the last visible frame the only one?
And what does "short time period" mean, quantitatively? milliseconds?
seconds? minutes?





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63589; Package emacs. (Mon, 22 May 2023 12:09:01 GMT) Full text and rfc822 format available.

Message #59 received at 63589 <at> debbugs.gnu.org (full text, mbox):

From: Thiago Melo <tmdmelo <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: Po Lu <luangruo <at> yahoo.com>, 63589 <at> debbugs.gnu.org
Subject: Re: bug#63589: [PATCH] 29.0.91; crash after creating graphical frames
 via emacsclient when compiled with cairo-xcb
Date: Mon, 22 May 2023 12:07:36 +0000
> > > What is the kind of situations in which these crashes could happen?
> >
> > Precisely that described in this bug report: when displays are closed
> > and reopened within a short time period.
>
> What kind of user-level situations could cause this?  Is invoking
> emacsclient soon after deleting the last visible frame the only one?
> And what does "short time period" mean, quantitatively? milliseconds?
> seconds? minutes?

Sorry, in my experience it seems that the time interval between
closing the display and opening it again doesn't matter. It seems to
be more about the number of times that the display is closed and then
opened (which is often 3 times for me, for whatever reason).

I'm testing it here again with Xvfb and an automation script, with a
10-minute delay after creating a single graphical frame, and another
10-minute delay after closing it and before creating a new one. I'll
report the results soon.

Also, this bug seems more likely to happen when emacs is built without
a toolkit (which is what I've been testing so far), since the display
is always closed after the last graphical frame is closed. Which made
me realize, after looking at frame.c, that this bug might as well join
the family of Bug#5802, Bug#21509, Bug#23499, and Bug#27816.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63589; Package emacs. (Mon, 22 May 2023 13:14:01 GMT) Full text and rfc822 format available.

Message #62 received at 63589 <at> debbugs.gnu.org (full text, mbox):

From: Thiago Melo <tmdmelo <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: Po Lu <luangruo <at> yahoo.com>, 63589 <at> debbugs.gnu.org
Subject: Re: bug#63589: [PATCH] 29.0.91; crash after creating graphical frames
 via emacsclient when compiled with cairo-xcb
Date: Mon, 22 May 2023 13:12:16 +0000
> > > > What is the kind of situations in which these crashes could happen?
> > >
> > > Precisely that described in this bug report: when displays are closed
> > > and reopened within a short time period.
> >
> > What kind of user-level situations could cause this?  Is invoking
> > emacsclient soon after deleting the last visible frame the only one?
> > And what does "short time period" mean, quantitatively? milliseconds?
> > seconds? minutes?
>
> Sorry, in my experience it seems that the time interval between
> closing the display and opening it again doesn't matter. It seems to
> be more about the number of times that the display is closed and then
> opened (which is often 3 times for me, for whatever reason).
>
> I'm testing it here again with Xvfb and an automation script, with a
> 10-minute delay after creating a single graphical frame, and another
> 10-minute delay after closing it and before creating a new one. I'll
> report the results soon.
>
> Also, this bug seems more likely to happen when emacs is built without
> a toolkit (which is what I've been testing so far), since the display
> is always closed after the last graphical frame is closed. Which made
> me realize, after looking at frame.c, that this bug might as well join
> the family of Bug#5802, Bug#21509, Bug#23499, and Bug#27816.

With 10-minute intervals, I got the X errors previously mentioned by
the 3rd time the display was opened, and then emacs crashed by the 5th
time the display was opened. So, assuming that 10 minutes is close
enough to infinity, we can say that the time interval doesn't matter.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63589; Package emacs. (Mon, 22 May 2023 19:23:01 GMT) Full text and rfc822 format available.

Message #65 received at 63589 <at> debbugs.gnu.org (full text, mbox):

From: Thiago Melo <tmdmelo <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: Po Lu <luangruo <at> yahoo.com>, 63589 <at> debbugs.gnu.org
Subject: Re: bug#63589: [PATCH] 29.0.91; crash after creating graphical frames
 via emacsclient when compiled with cairo-xcb
Date: Mon, 22 May 2023 19:21:36 +0000
> > > > > What is the kind of situations in which these crashes could happen?
> > > >
> > > > Precisely that described in this bug report: when displays are closed
> > > > and reopened within a short time period.
> > >
> > > What kind of user-level situations could cause this?  Is invoking
> > > emacsclient soon after deleting the last visible frame the only one?
> > > And what does "short time period" mean, quantitatively? milliseconds?
> > > seconds? minutes?
> >
> > Sorry, in my experience it seems that the time interval between
> > closing the display and opening it again doesn't matter. It seems to
> > be more about the number of times that the display is closed and then
> > opened (which is often 3 times for me, for whatever reason).
> >
> > I'm testing it here again with Xvfb and an automation script, with a
> > 10-minute delay after creating a single graphical frame, and another
> > 10-minute delay after closing it and before creating a new one. I'll
> > report the results soon.
> >
> > Also, this bug seems more likely to happen when emacs is built without
> > a toolkit (which is what I've been testing so far), since the display
> > is always closed after the last graphical frame is closed. Which made
> > me realize, after looking at frame.c, that this bug might as well join
> > the family of Bug#5802, Bug#21509, Bug#23499, and Bug#27816.
>
> With 10-minute intervals, I got the X errors previously mentioned by
> the 3rd time the display was opened, and then emacs crashed by the 5th
> time the display was opened. So, assuming that 10 minutes is close
> enough to infinity, we can say that the time interval doesn't matter.

So, trying to gather everything into a summary here.

To trigger the bug, all the following conditions must be met:

- Emacs built without a toolkit
- Emacs built with Cairo-XCB
- Emacs started in daemon mode
- The user closes all graphical frames and creates a new one (manually
or programmatically, duration in between doesn't matter, number of
times is not certain)

It goes into the `delete_frame' -> `Fdelete_terminal' ->
`x_delete_terminal' -> `XCloseDisplay' path, where cairo-xcb
references are not destroyed, leading to X errors and emacs crashing.

What else I've tested so far, that didn't trigger the bug:

- Closing the X Server
- xkill'ing graphical frames

These two cases go into the `x_connection_closed' ->
`Fdelete_terminal' -> `x_delete_terminal' path, where `XCloseDisplay'
is not called.

Other builds I've tested, that didn't trigger the bug:

- GTK + Cairo-XCB
- Lucid + Cairo-XCB

These builds don't end up calling XCloseDisplay, since the terminal is not
deleted when the last graphical frame is closed (due to the infamous
longstanding GTK bug, Bug#5802, Bug#21509, Bug#23499, and Bug#27816).

Considering all the above, I propose this smaller (and potentially
temporary) patch:


#+begin_src diff
--- a/src/frame.c    2023-05-22 19:52:25.155145242 +0200
+++ b/src/frame.c    2023-05-22 20:13:41.548566364 +0200
@@ -2206,14 +2206,15 @@
     /* If needed, delete the terminal that this frame was on.
        (This must be done after the frame is killed.)  */
     terminal->reference_count--;
-#if defined (USE_X_TOOLKIT) || defined (USE_GTK)
+#if defined (USE_X_TOOLKIT) || defined (USE_GTK) || defined (USE_CAIRO_XCB)
     /* FIXME: Deleting the terminal crashes emacs because of a GTK
        bug.
        https://lists.gnu.org/r/emacs-devel/2011-10/msg00363.html */

     /* Since a similar behavior was observed on the Lucid and Motif
-       builds (see Bug#5802, Bug#21509, Bug#23499, Bug#27816), we now
-       don't delete the terminal for these builds either.  */
+       builds (see Bug#5802, Bug#21509, Bug#23499, Bug#27816), and builds
+       without a toolkit together with Cairo-XCB support (Bug#63589),
+       we now don't delete the terminal for these builds either.  */
     if (terminal->reference_count == 0
     && (terminal->type == output_x_window
         || terminal->type == output_pgtk))
#+end_src


Caveat: I've tested it, and the errors and crash were gone, but it likely
introduces leaks in this build (which probably happen with
GTK and Lucid builds too anyway, from what I've seen in the wild). No
free lunch.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63589; Package emacs. (Tue, 23 May 2023 00:32:02 GMT) Full text and rfc822 format available.

Message #68 received at 63589 <at> debbugs.gnu.org (full text, mbox):

From: Po Lu <luangruo <at> yahoo.com>
To: Thiago Melo <tmdmelo <at> gmail.com>
Cc: 63589 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>
Subject: Re: bug#63589: [PATCH] 29.0.91; crash after creating graphical
 frames via emacsclient when compiled with cairo-xcb
Date: Tue, 23 May 2023 08:30:49 +0800
Thiago Melo <tmdmelo <at> gmail.com> writes:

>> > > > > What is the kind of situations in which these crashes could happen?
>> > > >
>> > > > Precisely that described in this bug report: when displays are closed
>> > > > and reopened within a short time period.
>> > >
>> > > What kind of user-level situations could cause this?  Is invoking
>> > > emacsclient soon after deleting the last visible frame the only one?
>> > > And what does "short time period" mean, quantitatively? milliseconds?
>> > > seconds? minutes?
>> >
>> > Sorry, in my experience it seems that the time interval between
>> > closing the display and opening it again doesn't matter. It seems to
>> > be more about the number of times that the display is closed and then
>> > opened (which is often 3 times for me, for whatever reason).
>> >
>> > I'm testing it here again with Xvfb and an automation script, with a
>> > 10-minute delay after creating a single graphical frame, and another
>> > 10-minute delay after closing it and before creating a new one. I'll
>> > report the results soon.
>> >
>> > Also, this bug seems more likely to happen when emacs is built without
>> > a toolkit (which is what I've been testing so far), since the display
>> > is always closed after the last graphical frame is closed. Which made
>> > me realize, after looking at frame.c, that this bug might as well join
>> > the family of Bug#5802, Bug#21509, Bug#23499, and Bug#27816.
>>
>> With 10-minute intervals, I got the X errors previously mentioned by
>> the 3rd time the display was opened, and then emacs crashed by the 5th
>> time the display was opened. So, assuming that 10 minutes is close
>> enough to infinity, we can say that the time interval doesn't matter.
>
> So, trying to gather everything into a summary here.
>
> To trigger the bug, all the following conditions must be met:
>
> - Emacs built without a toolkit
> - Emacs built with Cairo-XCB
> - Emacs started in daemon mode
> - The user closes all graphical frames and creates a new one (manually
> or programmatically, duration in between doesn't matter, number of
> times is not certain)
>
> It goes into the `delete_frame' -> `Fdelete_terminal' ->
> `x_delete_terminal' -> `XCloseDisplay' path, where cairo-xcb
> references are not destroyed, leading to X errors and emacs crashing.
>
> What else I've tested so far, that didn't trigger the bug:
>
> - Closing the X Server
> - xkill'ing graphical frames
>
> These two cases go into the `x_connection_closed' ->
> `Fdelete_terminal' -> `x_delete_terminal' path, where `XCloseDisplay'
> is not called.
>
> Other builds I've tested, that didn't trigger the bug:
>
> - GTK + Cairo-XCB
> - Lucid + Cairo-XCB
>
> These builds don't end up calling XCloseDisplay, since the terminal is not
> deleted when the last graphical frame is closed (due to the infamous
> longstanding GTK bug, Bug#5802, Bug#21509, Bug#23499, and Bug#27816).
>
> Considering all the above, I propose this smaller (and potentially
> temporary) patch:
>
> #+begin_src diff
> --- a/src/frame.c    2023-05-22 19:52:25.155145242 +0200
> +++ b/src/frame.c    2023-05-22 20:13:41.548566364 +0200
> @@ -2206,14 +2206,15 @@
>      /* If needed, delete the terminal that this frame was on.
>         (This must be done after the frame is killed.)  */
>      terminal->reference_count--;
> -#if defined (USE_X_TOOLKIT) || defined (USE_GTK)
> +#if defined (USE_X_TOOLKIT) || defined (USE_GTK) || defined (USE_CAIRO_XCB)
>      /* FIXME: Deleting the terminal crashes emacs because of a GTK
>         bug.
>         https://lists.gnu.org/r/emacs-devel/2011-10/msg00363.html */
>
>      /* Since a similar behavior was observed on the Lucid and Motif
> -       builds (see Bug#5802, Bug#21509, Bug#23499, Bug#27816), we now
> -       don't delete the terminal for these builds either.  */
> +       builds (see Bug#5802, Bug#21509, Bug#23499, Bug#27816), and builds
> +       without a toolkit together with Cairo-XCB support (Bug#63589),
> +       we now don't delete the terminal for these builds either.  */
>      if (terminal->reference_count == 0
>      && (terminal->type == output_x_window
>          || terminal->type == output_pgtk))
> #+end_src

We want closing displays (think x-delete-terminal) to still work on such
builds if the user uses it.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63589; Package emacs. (Tue, 23 May 2023 11:38:02 GMT) Full text and rfc822 format available.

Message #71 received at 63589 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Po Lu <luangruo <at> yahoo.com>
Cc: 63589 <at> debbugs.gnu.org, tmdmelo <at> gmail.com
Subject: Re: bug#63589: [PATCH] 29.0.91; crash after creating graphical
 frames via emacsclient when compiled with cairo-xcb
Date: Tue, 23 May 2023 14:37:23 +0300
> From: Po Lu <luangruo <at> yahoo.com>
> Cc: Eli Zaretskii <eliz <at> gnu.org>,  63589 <at> debbugs.gnu.org
> Date: Tue, 23 May 2023 08:30:49 +0800
> 
> We want closing displays (think x-delete-terminal) to still work on such
> builds if the user uses it.

Can't we have a separate "delete terminal" function for when Emacs is
about to exit?  Then it doesn't need the extra logic, AFAIU.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63589; Package emacs. (Tue, 23 May 2023 12:10:02 GMT) Full text and rfc822 format available.

Message #74 received at 63589 <at> debbugs.gnu.org (full text, mbox):

From: Po Lu <luangruo <at> yahoo.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 63589 <at> debbugs.gnu.org, tmdmelo <at> gmail.com
Subject: Re: bug#63589: [PATCH] 29.0.91; crash after creating graphical
 frames via emacsclient when compiled with cairo-xcb
Date: Tue, 23 May 2023 20:08:56 +0800
Eli Zaretskii <eliz <at> gnu.org> writes:

> Can't we have a separate "delete terminal" function for when Emacs is
> about to exit?  Then it doesn't need the extra logic, AFAIU.

When it is about to exit, Emacs simply does so without closing the
display at all, which is TRT to do.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63589; Package emacs. (Tue, 23 May 2023 13:02:02 GMT) Full text and rfc822 format available.

Message #77 received at 63589 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Po Lu <luangruo <at> yahoo.com>
Cc: 63589 <at> debbugs.gnu.org, tmdmelo <at> gmail.com
Subject: Re: bug#63589: [PATCH] 29.0.91; crash after creating graphical
 frames via emacsclient when compiled with cairo-xcb
Date: Tue, 23 May 2023 16:01:08 +0300
> From: Po Lu <luangruo <at> yahoo.com>
> Cc: tmdmelo <at> gmail.com,  63589 <at> debbugs.gnu.org
> Date: Tue, 23 May 2023 20:08:56 +0800
> 
> Eli Zaretskii <eliz <at> gnu.org> writes:
> 
> > Can't we have a separate "delete terminal" function for when Emacs is
> > about to exit?  Then it doesn't need the extra logic, AFAIU.
> 
> When it is about to exit, Emacs simply does so without closing the
> display at all, which is TRT to do.

Then I don't understand your objections to the proposed patch.  Please
elaborate.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63589; Package emacs. (Tue, 23 May 2023 13:19:02 GMT) Full text and rfc822 format available.

Message #80 received at 63589 <at> debbugs.gnu.org (full text, mbox):

From: Po Lu <luangruo <at> yahoo.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 63589 <at> debbugs.gnu.org, tmdmelo <at> gmail.com
Subject: Re: bug#63589: [PATCH] 29.0.91; crash after creating graphical
 frames via emacsclient when compiled with cairo-xcb
Date: Tue, 23 May 2023 21:18:04 +0800
Eli Zaretskii <eliz <at> gnu.org> writes:

> Then I don't understand your objections to the proposed patch.  Please
> elaborate.

Please see the other thread(s), where I explained in detail the two
problems with `gtk_init_check': upon success, it leaves two display
connections open, which is quite fragile, and upon failure, it makes
creating another display impossible, even if a display then becomes
available.

It's the typical misdesign in GTK.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63589; Package emacs. (Tue, 23 May 2023 14:21:02 GMT) Full text and rfc822 format available.

Message #83 received at 63589 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Po Lu <luangruo <at> yahoo.com>
Cc: 63589 <at> debbugs.gnu.org, tmdmelo <at> gmail.com
Subject: Re: bug#63589: [PATCH] 29.0.91; crash after creating graphical
 frames via emacsclient when compiled with cairo-xcb
Date: Tue, 23 May 2023 17:20:48 +0300
> From: Po Lu <luangruo <at> yahoo.com>
> Cc: tmdmelo <at> gmail.com,  63589 <at> debbugs.gnu.org
> Date: Tue, 23 May 2023 21:18:04 +0800
> 
> Eli Zaretskii <eliz <at> gnu.org> writes:
> 
> > Then I don't understand your objections to the proposed patch.  Please
> > elaborate.
> 
> Please see the other thread(s), where I explained in detail the two
> problems with `gtk_init_check': upon success, it leaves two display
> connections open, which is quite fragile, and upon failure, it makes
> creating another display impossible, even if a display then becomes
> available.

I've read all those discussions in real time, and I still don't see
the obvious connection.  So please humor me with a more detailed and
complete explanation of why the last suggested patch somehow causes
extra connections open.  And let me remind you that your objection,
which is what caused my question, was

> We want closing displays (think x-delete-terminal) to still work on such
> builds if the user uses it.

Which seems to be about _closing_ connections, not about opening too
many of them.  It's the leap between the extra connections on the one
hand and closing display not working OTOH that I cannot make.  Please
help me fill the dots.

TIA




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63589; Package emacs. (Wed, 24 May 2023 00:24:01 GMT) Full text and rfc822 format available.

Message #86 received at 63589 <at> debbugs.gnu.org (full text, mbox):

From: Po Lu <luangruo <at> yahoo.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 63589 <at> debbugs.gnu.org, tmdmelo <at> gmail.com
Subject: Re: bug#63589: [PATCH] 29.0.91; crash after creating graphical
 frames via emacsclient when compiled with cairo-xcb
Date: Wed, 24 May 2023 08:22:44 +0800
Eli Zaretskii <eliz <at> gnu.org> writes:

>> From: Po Lu <luangruo <at> yahoo.com>
>> Cc: tmdmelo <at> gmail.com,  63589 <at> debbugs.gnu.org
>> Date: Tue, 23 May 2023 21:18:04 +0800
>> 
>> Eli Zaretskii <eliz <at> gnu.org> writes:
>> 
>> > Then I don't understand your objections to the proposed patch.  Please
>> > elaborate.
>> 
>> Please see the other thread(s), where I explained in detail the two
>> problems with `gtk_init_check': upon success, it leaves two display
>> connections open, which is quite fragile, and upon failure, it makes
>> creating another display impossible, even if a display then becomes
>> available.
>
> I've read all those discussions in real time, and I still don't see
> the obvious connection.  So please humor me with a more detailed and
> complete explanation of why the last suggested patch somehow causes
> extra connections open.  And let me remind you that your objection,
> which is what caused my question, was
>
>> We want closing displays (think x-delete-terminal) to still work on such
>> builds if the user uses it.
>
> Which seems to be about _closing_ connections, not about opening too
> many of them.  It's the leap between the extra connections on the one
> hand and closing display not working OTOH that I cannot make.  Please
> help me fill the dots.

Never mind, I got this thread mixed up with that of bug#63555...
What I was originally trying to explain was why closing displays can
still happen, even on toolkit builds: the user might call
`x-delete-terminal'.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63589; Package emacs. (Wed, 24 May 2023 02:31:02 GMT) Full text and rfc822 format available.

Message #89 received at 63589 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Po Lu <luangruo <at> yahoo.com>
Cc: 63589 <at> debbugs.gnu.org, tmdmelo <at> gmail.com
Subject: Re: bug#63589: [PATCH] 29.0.91; crash after creating graphical
 frames via emacsclient when compiled with cairo-xcb
Date: Wed, 24 May 2023 05:30:39 +0300
> From: Po Lu <luangruo <at> yahoo.com>
> Cc: tmdmelo <at> gmail.com,  63589 <at> debbugs.gnu.org
> Date: Wed, 24 May 2023 08:22:44 +0800
> 
> Never mind, I got this thread mixed up with that of bug#63555...
> What I was originally trying to explain was why closing displays can
> still happen, even on toolkit builds: the user might call
> `x-delete-terminal'.

Which is why I asked whether x-delete-terminal etc. could call a
function that is different from what we call when we exit.  So now I'm
back to my question.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63589; Package emacs. (Wed, 24 May 2023 03:14:02 GMT) Full text and rfc822 format available.

Message #92 received at 63589 <at> debbugs.gnu.org (full text, mbox):

From: Po Lu <luangruo <at> yahoo.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 63589 <at> debbugs.gnu.org, tmdmelo <at> gmail.com
Subject: Re: bug#63589: [PATCH] 29.0.91; crash after creating graphical
 frames via emacsclient when compiled with cairo-xcb
Date: Wed, 24 May 2023 11:13:04 +0800
Eli Zaretskii <eliz <at> gnu.org> writes:

> Which is why I asked whether x-delete-terminal etc. could call a
> function that is different from what we call when we exit.

Why is what Emacs does upon exiting relevant here?
The problem occurs when a display connection is closed while Emacs
wants to stay running.  Cairo keeps a pointer to the xcb connection,
and if by some chance a pointer with the same value is returned the
next time a display connection is opened, it loses.
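
The pointer-reuse failure described here can be pictured with a toy model. The sketch below is not Cairo or Emacs code; every name in it is invented for illustration, and it assumes the allocator hands back the same address for each new connection. It only shows how a cache looked up by address alone returns an entry that belongs to a connection that no longer exists:

```shell
#!/bin/sh
# Toy model only; none of this is Cairo or Emacs code.
# A one-entry cache maps a connection's address to a "device" and is
# never purged when the connection is closed.
cached_addr=""      # address the cached device was created for
cached_gen=""       # identity (generation) of that connection
next_gen=0          # counts connections ever opened
conn_addr=""        # address of the currently open connection
conn_gen=""         # generation of the currently open connection

open_connection () {
    # Assumption: the allocator returns the same address every time,
    # which is the unlucky case described in the message above.
    next_gen=$((next_gen + 1))
    conn_addr=0x1000
    conn_gen=$next_gen
}

close_connection () {
    # The connection goes away, but the cache entry is left behind.
    conn_addr=""
    conn_gen=""
}

get_device () {
    # Look up by address alone; sets $device to "fresh" or "stale".
    if [ "$cached_addr" = "$conn_addr" ]; then
        if [ "$cached_gen" = "$conn_gen" ]; then
            device=fresh
        else
            device=stale    # hit on an entry from a dead connection
        fi
    else
        cached_addr=$conn_addr
        cached_gen=$conn_gen
        device=fresh
    fi
}
```

Running `open_connection; get_device`, then `close_connection`, then `open_connection; get_device` again leaves `$device` as `fresh` the first time and `stale` the second, since the second lookup matches on address while the generation differs.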




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63589; Package emacs. (Wed, 24 May 2023 05:17:01 GMT) Full text and rfc822 format available.

Message #95 received at 63589 <at> debbugs.gnu.org (full text, mbox):

From: Thiago Melo <tmdmelo <at> gmail.com>
To: Po Lu <luangruo <at> yahoo.com>
Cc: 63589 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>
Subject: Re: bug#63589: [PATCH] 29.0.91; crash after creating graphical frames
 via emacsclient when compiled with cairo-xcb
Date: Wed, 24 May 2023 05:15:46 +0000
> > Which is why I asked whether x-delete-terminal etc. could call a
> > function that is different from what we call when we exit.
>
> Why is what Emacs does upon exiting relevant here?
> The problem occurs when a display connection is closed while Emacs
> wants to stay running.  Cairo keeps a pointer to the xcb connection,
> and if by some chance a pointer with the same value is returned the
> next time a display connection is opened, it loses.

Regardless of what Emacs does at exit, you were right to point out
calls to, e.g., the Elisp function `delete-terminal`, Po Lu. It
always ends up calling XCloseDisplay and, in fact, the last patch
doesn't cover it. I've tested it, and it even triggers the bug with
Lucid and GTK builds.

Here's an updated script to trigger the bug in all these cases:

#+begin_src bash
# repeatedly create graphical frames and close them all
for k in $(seq 10); do
    emacsclient -c -n -a /bin/false &&
        sleep 1 &&
        emacsclient -e "(delete-terminal)"
done
#+end_src
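
A variant of the script above that also reports the cycle at which the daemon stops answering may help when comparing builds. This is only a sketch: the function name `stress_display`, the `EMACSCLIENT` override, and the `emacsclient -a /bin/false -e t` liveness probe are assumptions made for illustration and were not tested in this thread.

```shell
#!/bin/sh
# Sketch: open and close the display N times, pausing DELAY seconds after
# each frame is created, and stop at the first cycle where the server is gone.
# EMACSCLIENT can be overridden (an assumption, to ease dry runs).
: "${EMACSCLIENT:=emacsclient}"

stress_display () {
    n=${1:-10}       # number of open/close cycles
    delay=${2:-1}    # seconds to wait after creating the frame
    k=1
    while [ "$k" -le "$n" ]; do
        # Create a graphical frame; -a /bin/false makes a dead server fail.
        "$EMACSCLIENT" -c -n -a /bin/false ||
            { echo "died at cycle $k (open)"; return 1; }
        sleep "$delay"
        # Close the display, as in the script above.
        "$EMACSCLIENT" -e "(delete-terminal)" >/dev/null
        # Probe: evaluating t fails if the daemon is no longer reachable.
        "$EMACSCLIENT" -a /bin/false -e t >/dev/null ||
            { echo "died at cycle $k (close)"; return 1; }
        k=$((k + 1))
    done
    echo "survived $n cycles"
}
```

For example, `stress_display 10 600` would approximate the 10-minute spacing tried earlier in this report, while `stress_display 10 1` matches the one-second loop.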




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63589; Package emacs. (Wed, 24 May 2023 11:02:02 GMT) Full text and rfc822 format available.

Message #98 received at 63589 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Po Lu <luangruo <at> yahoo.com>
Cc: 63589 <at> debbugs.gnu.org, tmdmelo <at> gmail.com
Subject: Re: bug#63589: [PATCH] 29.0.91; crash after creating graphical
 frames via emacsclient when compiled with cairo-xcb
Date: Wed, 24 May 2023 14:01:56 +0300
> From: Po Lu <luangruo <at> yahoo.com>
> Cc: tmdmelo <at> gmail.com,  63589 <at> debbugs.gnu.org
> Date: Wed, 24 May 2023 11:13:04 +0800
> 
> Eli Zaretskii <eliz <at> gnu.org> writes:
> 
> > Which is why I asked whether x-delete-terminal etc. could call a
> > function that is different from what we call when we exit.
> 
> Why is what Emacs does upon exiting relevant here?
> The problem occurs when a display connection is closed while Emacs
> wants to stay running.  Cairo keeps a pointer to the xcb connection,
> and if by some chance a pointer with the same value is returned the
> next time a display connection is opened, it loses.

Maybe I'm jumping to conclusions, sorry.

So let's back up a notch.  There was a suggestion to avoid the call to
XCloseDisplay when the last frame on display is deleted, like we do
for some other toolkits already.  Would that avoid the crashes due to
this issue?  If yes, why did you reject the suggestion?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63589; Package emacs. (Wed, 24 May 2023 11:08:01 GMT) Full text and rfc822 format available.

Message #101 received at 63589 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Thiago Melo <tmdmelo <at> gmail.com>
Cc: luangruo <at> yahoo.com, 63589 <at> debbugs.gnu.org
Subject: Re: bug#63589: [PATCH] 29.0.91; crash after creating graphical frames
 via emacsclient when compiled with cairo-xcb
Date: Wed, 24 May 2023 14:07:46 +0300
> From: Thiago Melo <tmdmelo <at> gmail.com>
> Date: Wed, 24 May 2023 05:15:46 +0000
> Cc: Eli Zaretskii <eliz <at> gnu.org>, 63589 <at> debbugs.gnu.org
> 
> Regardless of what Emacs does at exit, you were right, Po Lu, to point
> out calls to, e.g., the Elisp function `delete-terminal'.  It always
> ends up calling XCloseDisplay and, in fact, the last patch doesn't
> cover it.  I've tested it, and it triggers the bug even with Lucid and
> GTK builds.

What triggers the bug with Lucid and GTK builds? the patch you
proposed or calls to delete-terminal?  If the latter, then this is a
separate issue, and at least the Cairo-xcb build will behave like the
other builds in that scenario.  Right?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63589; Package emacs. (Wed, 24 May 2023 11:56:02 GMT) Full text and rfc822 format available.

Message #104 received at 63589 <at> debbugs.gnu.org (full text, mbox):

From: Thiago Melo <tmdmelo <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: luangruo <at> yahoo.com, 63589 <at> debbugs.gnu.org
Subject: Re: bug#63589: [PATCH] 29.0.91; crash after creating graphical frames
 via emacsclient when compiled with cairo-xcb
Date: Wed, 24 May 2023 11:54:30 +0000
> What triggers the bug with Lucid and GTK builds? the patch you
> proposed or calls to delete-terminal?  If the latter, then this is a
> separate issue, and at least the Cairo-xcb build will behave like the
> other builds in that scenario.  Right?

Sorry for not being clear.  I meant calls to `delete-terminal' when
Emacs is built with Cairo-XCB, regardless of the toolkit.  The scenario
that triggers the bug is similar: launch the Emacs daemon, open one or
more graphical frames, call the Elisp function `delete-terminal' (which
closes all graphical frames and the display), open a new graphical
frame...

My last patch doesn't fix it because it only covers the case where
`(delete-frame)' is called instead.  That case is specific to
toolkitless + Cairo-XCB Emacs, since there deleting the last frame ends
up calling `x_delete_terminal' -> `XCloseDisplay'.  Builds with other
toolkits don't call `x_delete_terminal' at that point, due to the logic
in `delete_frame'.

I hope I was clearer this time.

My opinion is that it's all the same issue, which boils down to
Cairo-XCB requiring more manual memory management than Cairo-XLib by
design or limitation. I think we should really consider the approach
from my first patch, which is ensuring that the Cairo-XCB device is
cleaned up before calling XCloseDisplay. Everything else is a
workaround. If the patch is considered too unsafe or too big, then we
must clarify the specifics of what makes it so, so the matter can be
addressed in a better way.

Here's another patch, similar to the first one, but it only acts in
`x_delete_terminal' and doesn't store global references.  The strategy
is similar to the one used in `ftcrfont_get_default_font_options'.  It
creates a dummy pixmap, then a dummy Cairo XCB surface from it,
extracts the Cairo device from the surface, and then cleans them all
up.

#+begin_src diff
--- a/src/xterm.c    2023-05-24 12:42:14.873824624 +0200
+++ b/src/xterm.c    2023-05-24 13:45:23.798382193 +0200
@@ -30841,6 +30841,30 @@
      closing all the displays.  */
       XrmDestroyDatabase (dpyinfo->rdb);
 #endif
+#ifdef USE_CAIRO_XCB_SURFACE
+      /* Ensure that the cairo device is destroyed before closing
+         connection (Bug#63589).  For that, we create a drawable, an XCB
+         surface for that drawable, and then we get the device reference
+         from there.  */
+
+      Pixmap drawable;
+      cairo_surface_t *surface;
+
+      drawable = XCreatePixmap (dpyinfo->display, dpyinfo->root_window,
+                1, 1, dpyinfo->n_planes);
+      surface = cairo_xcb_surface_create (dpyinfo->xcb_connection, drawable,
+                      dpyinfo->xcb_visual, 1, 1);
+
+      if (cairo_surface_status (surface) == CAIRO_STATUS_SUCCESS)
+    {
+      cairo_device_t *cairo_device;
+      cairo_device = cairo_device_reference (cairo_surface_get_device (surface));
+      cairo_surface_destroy (surface);
+      cairo_device_finish (cairo_device);
+      cairo_device_destroy (cairo_device);
+    }
+      XFreePixmap (dpyinfo->display, drawable);
+#endif
 #ifdef USE_GTK
       xg_display_close (dpyinfo->display);
 #else
#+end_src




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63589; Package emacs. (Wed, 24 May 2023 12:16:02 GMT) Full text and rfc822 format available.

Message #107 received at 63589 <at> debbugs.gnu.org (full text, mbox):

From: Po Lu <luangruo <at> yahoo.com>
To: Thiago Melo <tmdmelo <at> gmail.com>
Cc: 63589 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>
Subject: Re: bug#63589: [PATCH] 29.0.91; crash after creating graphical
 frames via emacsclient when compiled with cairo-xcb
Date: Wed, 24 May 2023 20:15:21 +0800
Thiago Melo <tmdmelo <at> gmail.com> writes:

> Sorry for not being clear. I meant calls to `delete-terminal', when
> Emacs is built with Cairo-XCB, and regardless of the toolkit. It's a
> similar scenario to trigger the bug: launch Emacs daemon, open one or
> more graphical frames, call the elisp function `delete-terminal' (all
> graphical frames and the display are closed because of it), open a new
> graphical frame...
>
> My last patch doesn't fix it because it only works when
> `(delete-frame)' is called instead. This situation is specific for
> toolkitless + Cairo-XCB Emacs, since here it ends up calling
> `x_delete_terminal' -> `XCloseDisplay'. Other toolkits don't call
> `x_delete_terminal' here, due to the logic at `delete_frame'.
>
> I hope I was more clear this time.
>
> My opinion is that it's all the same issue, which boils down to
> Cairo-XCB requiring more manual memory management than Cairo-XLib by
> design or limitation. I think we should really consider the approach
> from my first patch, which is ensuring that the Cairo-XCB device is
> cleaned up before calling XCloseDisplay. Everything else is a
> workaround. If the patch is considered too unsafe or too big, then we
> must clarify the specifics of what makes it so, so the matter can be
> addressed in a better way.

I thought I explained what the problems with trying to fix this in Emacs
are.  The first is: there's a reference leak in Cairo somewhere, since
Emacs never allows displays to be closed without each frame being
destroyed, and destroying each frame will also dereference its Cairo
surface; thus, it's not actually Emacs's problem.

> Here's another patch, similar to the first one, but it only acts at
> `x_delete_terminal', and without storing global references. The
> strategy is similar to the one used at
> `ftcrfont_get_default_font_options'. It creates a dummy pixmap, then a
> dummy cairo xcb surface from it, then it extracts the cairo device
> from the surface, and then cleans up them all.
>
> #+begin_src diff
> --- a/src/xterm.c    2023-05-24 12:42:14.873824624 +0200
> +++ b/src/xterm.c    2023-05-24 13:45:23.798382193 +0200
> @@ -30841,6 +30841,30 @@
>       closing all the displays.  */
>        XrmDestroyDatabase (dpyinfo->rdb);
>  #endif
> +#ifdef USE_CAIRO_XCB_SURFACE
> +      /* Ensure that the cairo device is destroyed before closing
> +         connection (Bug#63589).  For that, we create a drawable, an XCB
> +         surface for that drawable, and then we get the device reference
> +         from there.  */
> +
> +      Pixmap drawable;
> +      cairo_surface_t *surface;
> +
> +      drawable = XCreatePixmap (dpyinfo->display, dpyinfo->root_window,
> +                1, 1, dpyinfo->n_planes);
> +      surface = cairo_xcb_surface_create (dpyinfo->xcb_connection, drawable,
> +                      dpyinfo->xcb_visual, 1, 1);
> +
> +      if (cairo_surface_status (surface) == CAIRO_STATUS_SUCCESS)
> +    {
> +      cairo_device_t *cairo_device;
> +      cairo_device = cairo_device_reference (cairo_surface_get_device (surface));
> +      cairo_surface_destroy (surface);
> +      cairo_device_finish (cairo_device);
> +      cairo_device_destroy (cairo_device);
> +    }
> +      XFreePixmap (dpyinfo->display, drawable);
> +#endif
>  #ifdef USE_GTK
>        xg_display_close (dpyinfo->display);
>  #else
> #+end_src

The other problem occurs when `cairo_xcb_surface_create' creates a
different device from the one that was previously created for the
display.  So you have only destroyed one of several devices, any one of
which may rear its ugly head later.  This is also a bug in Cairo.

BTW, it's not necessary to call XFreePixmap, as all resources created
by the client will be destroyed per the close down mode set earlier.

Thanks.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63589; Package emacs. (Wed, 24 May 2023 14:18:02 GMT) Full text and rfc822 format available.

Message #110 received at 63589 <at> debbugs.gnu.org (full text, mbox):

From: Thiago Melo <tmdmelo <at> gmail.com>
To: Po Lu <luangruo <at> yahoo.com>
Cc: 63589 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>
Subject: Re: bug#63589: [PATCH] 29.0.91; crash after creating graphical frames
 via emacsclient when compiled with cairo-xcb
Date: Wed, 24 May 2023 14:16:20 +0000
[Message part 1 (text/plain, inline)]
> I thought I explained what the problems with trying to fix this in Emacs
> are.  The first is: there's a reference leak in Cairo somewhere, since
> Emacs never allows displays to be closed without each frame being
> destroyed, and destroying each frame will also dereference its Cairo
> surface; thus, it's not actually Emacs's problem.
[...]
> The other problem occurs when `cairo_xcb_surface_create' creates a
> different device from the one that was previously created for the
> display.  So you have only destroyed one of several devices, any one of
> which may rear its ugly head later.  This is also a bug in Cairo.
>
> BTW, it's not necessary to call XFreePixmap, as all resources created
> by the client will be destroyed per the close down mode set earlier.

Welp, I guess all that remains is to bring the issue up on the Cairo
mailing list (again) and wait for the problem to be solved on their
side.  I'm afraid they might just say that Emacs is "holding it
wrong". :(

By the way, I wrote a minimal standalone cairo-xcb C program (see
attached) to trigger this particular bug.  It opens a small window via
xcb, draws something via cairo, destroys the window and closes the
display when any key or mouse button is pressed on it, then recreates
everything again...  It repeats until it crashes (hopefully; at least
it crashes on my system).  Redrawing errors also happen during the
process.  It might be a useful example to bring to the Cairo mailing
list and to debug the root of this issue.

Thanks for everything you taught me, Po Lu.
[cairo-xcb-bug.c (text/x-csrc, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63589; Package emacs. (Wed, 24 May 2023 15:45:02 GMT) Full text and rfc822 format available.

Message #113 received at 63589 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Po Lu <luangruo <at> yahoo.com>
Cc: 63589 <at> debbugs.gnu.org, tmdmelo <at> gmail.com
Subject: Re: bug#63589: [PATCH] 29.0.91; crash after creating graphical
 frames via emacsclient when compiled with cairo-xcb
Date: Wed, 24 May 2023 18:44:51 +0300
> From: Po Lu <luangruo <at> yahoo.com>
> Cc: Eli Zaretskii <eliz <at> gnu.org>,  63589 <at> debbugs.gnu.org
> Date: Wed, 24 May 2023 20:15:21 +0800
> 
> I thought I explained what the problems with trying to fix this in Emacs
> are.  [...]

Please also answer my questions I asked in my previous message.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63589; Package emacs. (Thu, 25 May 2023 00:19:02 GMT) Full text and rfc822 format available.

Message #116 received at 63589 <at> debbugs.gnu.org (full text, mbox):

From: Po Lu <luangruo <at> yahoo.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 63589 <at> debbugs.gnu.org, tmdmelo <at> gmail.com
Subject: Re: bug#63589: [PATCH] 29.0.91; crash after creating graphical
 frames via emacsclient when compiled with cairo-xcb
Date: Thu, 25 May 2023 08:18:22 +0800
Eli Zaretskii <eliz <at> gnu.org> writes:

> Please also answer my questions I asked in my previous message.

Which one?

I lost track of the discussion surrounding this thread, sorry about
that.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63589; Package emacs. (Thu, 25 May 2023 03:38:02 GMT) Full text and rfc822 format available.

Message #119 received at 63589 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Po Lu <luangruo <at> yahoo.com>
Cc: 63589 <at> debbugs.gnu.org, tmdmelo <at> gmail.com
Subject: Re: bug#63589: [PATCH] 29.0.91; crash after creating graphical
 frames via emacsclient when compiled with cairo-xcb
Date: Thu, 25 May 2023 06:38:19 +0300
> From: Po Lu <luangruo <at> yahoo.com>
> Cc: tmdmelo <at> gmail.com,  63589 <at> debbugs.gnu.org
> Date: Thu, 25 May 2023 08:18:22 +0800
> 
> Eli Zaretskii <eliz <at> gnu.org> writes:
> 
> > Please also answer my questions I asked in my previous message.
> 
> Which one?

This one:

  Maybe I'm jumping to conclusions, sorry.

  So let's back up a notch.  There was a suggestion to avoid the call to
  XCloseDisplay when the last frame on display is deleted, like we do
  for some other toolkits already.  Would that avoid the crashes due to
  this issue?  If yes, why did you reject the suggestion?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63589; Package emacs. (Thu, 25 May 2023 06:10:01 GMT) Full text and rfc822 format available.

Message #122 received at 63589 <at> debbugs.gnu.org (full text, mbox):

From: Po Lu <luangruo <at> yahoo.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 63589 <at> debbugs.gnu.org, tmdmelo <at> gmail.com
Subject: Re: bug#63589: [PATCH] 29.0.91; crash after creating graphical
 frames via emacsclient when compiled with cairo-xcb
Date: Thu, 25 May 2023 14:08:59 +0800
Eli Zaretskii <eliz <at> gnu.org> writes:

>   So let's back up a notch.  There was a suggestion to avoid the call to
>   XCloseDisplay when the last frame on display is deleted, like we do
>   for some other toolkits already.  Would that avoid the crashes due to
>   this issue?  If yes, why did you reject the suggestion?

Because it would still lead to crashes when the display connection is
closed by other means.  Thus, the proper solution is simply to disable
cairo-xcb by default.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63589; Package emacs. (Thu, 25 May 2023 07:13:01 GMT) Full text and rfc822 format available.

Message #125 received at 63589 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Po Lu <luangruo <at> yahoo.com>
Cc: 63589 <at> debbugs.gnu.org, tmdmelo <at> gmail.com
Subject: Re: bug#63589: [PATCH] 29.0.91; crash after creating graphical
 frames via emacsclient when compiled with cairo-xcb
Date: Thu, 25 May 2023 10:12:59 +0300
> From: Po Lu <luangruo <at> yahoo.com>
> Cc: tmdmelo <at> gmail.com,  63589 <at> debbugs.gnu.org
> Date: Thu, 25 May 2023 14:08:59 +0800
> 
> Eli Zaretskii <eliz <at> gnu.org> writes:
> 
> >   So let's back up a notch.  There was a suggestion to avoid the call to
> >   XCloseDisplay when the last frame on display is deleted, like we do
> >   for some other toolkits already.  Would that avoid the crashes due to
> >   this issue?  If yes, why did you reject the suggestion?
> 
> Because it would still lead to crashes when the display connection is
> closed by other means.

Which other means are those?  Please be more specific.

> Thus, the proper solution is simply to disable cairo-xcb by default.

We already agreed to do that (why wasn't that change installed, btw?).
I'm trying to establish if there's anything we could do in the
cairo-xcb configuration to make the crashes more rare, or even prevent
them altogether.  Please bear with me.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63589; Package emacs. (Thu, 25 May 2023 10:26:02 GMT) Full text and rfc822 format available.

Message #128 received at 63589 <at> debbugs.gnu.org (full text, mbox):

From: Thiago Melo <tmdmelo <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: Po Lu <luangruo <at> yahoo.com>, 63589 <at> debbugs.gnu.org
Subject: Re: bug#63589: [PATCH] 29.0.91; crash after creating graphical frames
 via emacsclient when compiled with cairo-xcb
Date: Thu, 25 May 2023 10:24:30 +0000
On Thu, May 25, 2023 at 7:12 AM Eli Zaretskii <eliz <at> gnu.org> wrote:
> I'm trying to establish if there's anything we could do in the
> cairo-xcb configuration to make the crashes more rare, or even prevent
> them altogether.

Regarding this, before posting to the Cairo mailing list, I searched
their archives more thoroughly to check whether this issue had already
been addressed.  I found this:

https://lists.cairographics.org/archives/cairo/2018-November/028791.html

Title: cairo_xcb_surface_create() segfaults on second call with
different xcb info

There, Uli Schlachter (libxcb contributor and the main maintainer of
Cairo-XCB nowadays) discusses the issue we're having here, the design
of Cairo-XCB, and how to use it with multiple surfaces and after
reopening the display.  To highlight, Uli says:

| Cairo has to get quite some information from the X11 server. [...]
|
| [...] querying this all the time would be slow. Thus, cairo caches
| this information. Namely, there is an instance of cairo_device_t. This
| instance is kept around even when all surfaces using this device are
| destroyed. [...]
|
| [...] when you call xcb_disconnect(),
| the cache now contains a dangling pointer. The next call to
| xcb_connect() might very well allocate an xcb_connection_t* with the
| same pointer. Thus, you now get a cache hit even though there is a new
| XCB connection. Bad things happen afterwards

Thus, it doesn't matter whether Emacs destroys all cairo-xcb surfaces
before closing the display: by design, the device will always linger
around, together with its whole cache.  It simply doesn't work like
Cairo-XLib at all.
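
To make the failure mode concrete, here is a toy model of a cache keyed
only by the connection's address.  It is a sketch under stated
assumptions, not Cairo's actual code: `toy_connection', `toy_device'
and the `toy_device_*' functions are all hypothetical names, and the
`generation' field merely stands in for whatever per-connection data a
real device would cache.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an xcb_connection_t; not the real type.  */
struct toy_connection { int generation; };

/* Hypothetical stand-in for a cached cairo_device_t.  */
struct toy_device
{
  const struct toy_connection *key; /* address used as the cache key */
  int cached_generation;            /* data "queried from the server" */
  int in_use;
};

#define TOY_CACHE_SIZE 4
static struct toy_device toy_cache[TOY_CACHE_SIZE];

/* Return the cached device for CONN, creating one on a miss.  */
struct toy_device *
toy_device_get (const struct toy_connection *conn)
{
  struct toy_device *free_slot = NULL;

  assert (conn != NULL);
  for (int i = 0; i < TOY_CACHE_SIZE; i++)
    {
      if (toy_cache[i].in_use && toy_cache[i].key == conn)
        return &toy_cache[i];   /* hit: matched on the address alone */
      if (!toy_cache[i].in_use && free_slot == NULL)
        free_slot = &toy_cache[i];
    }
  assert (free_slot != NULL);   /* toy model: the cache never fills up */
  free_slot->key = conn;
  free_slot->cached_generation = conn->generation;
  free_slot->in_use = 1;
  return free_slot;
}

/* Drop the cache entry for CONN.  Models finishing the device before
   the connection goes away.  */
void
toy_device_finish (const struct toy_connection *conn)
{
  for (int i = 0; i < TOY_CACHE_SIZE; i++)
    if (toy_cache[i].in_use && toy_cache[i].key == conn)
      toy_cache[i].in_use = 0;
}

/* Return 1 if the device handed out for CONN describes an older
   connection that happened to live at the same address.  */
int
toy_device_is_stale (const struct toy_connection *conn)
{
  return toy_device_get (conn)->cached_generation != conn->generation;
}
```

If one `struct toy_connection' object is reused for a second
"connection" (change its `generation' without calling
`toy_device_finish' first), `toy_device_is_stale' reports a stale hit;
calling `toy_device_finish' before the reuse avoids it.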

Then, Uli says:

| If you want to keep the device around for later (i.e. have multiple
| surface using the same device), you can save a pointer via:
|
|   cairo_device_t *device = cairo_device_reference(....);
|
| Now, you have to later call cairo_device_destroy() when you no longer
| need the reference, but you get a pointer to the cairo_device_t
| independent of a cairo xcb surface.
|
| Oh and: You have to finish the device before you call xcb_disconnect().

So, any application that uses Cairo-XCB with multiple surfaces and
wants to reopen displays _must_ save a reference to the device and
_must_ finish + destroy it before closing the display.

With this in mind, here's another attempt at improving the initial
patch, this time storing the cairo xcb device for the display in
`x_term_init':

#+begin_src diff
--- a/src/xterm.h    2023-05-25 09:43:50.943793850 +0200
+++ b/src/xterm.h    2023-05-25 11:32:03.701771148 +0200
@@ -883,6 +883,13 @@ struct x_display_info
      clock, or 0 if unknown (if the difference is legitimately 0,
      server_time_monotonic_p will be true).  */
   int_fast64_t server_time_offset;
+
+#if defined USE_XCB && defined USE_CAIRO_XCB
+  /* Cairo device associated with cairo surfaces in this display.
+     Required for proper cleanup before closing display connection
+     in cairo-xcb builds.  */
+  cairo_device_t *cairo_device;
+#endif
 #endif
 };
#+end_src


#+begin_src diff
--- a/src/xterm.c    2023-05-25 09:37:24.811402435 +0200
+++ b/src/xterm.c    2023-05-25 12:18:06.003572028 +0200
@@ -5806,10 +5806,15 @@ x_begin_cr_clip (struct frame *f, GC gc)
       cairo_surface_t *surface;
 #ifdef USE_CAIRO_XCB_SURFACE
       if (FRAME_DISPLAY_INFO (f)->xcb_visual)
+    {
     surface = cairo_xcb_surface_create (FRAME_DISPLAY_INFO (f)->xcb_connection,
                         (xcb_drawable_t) FRAME_X_RAW_DRAWABLE (f),
                         FRAME_DISPLAY_INFO (f)->xcb_visual,
                         width, height);
+    if (cairo_surface_status (surface) == CAIRO_STATUS_SUCCESS)
+      eassert (FRAME_DISPLAY_INFO (f)->cairo_device
+           == cairo_surface_get_device (surface));
+    }
       else
 #endif
     surface = cairo_xlib_surface_create (FRAME_X_DISPLAY (f),
@@ -30504,6 +30509,27 @@ x_term_init (Lisp_Object display_name, c

   unblock_input ();

+#ifdef USE_CAIRO_XCB_SURFACE
+  /* Store reference to the cairo device for this display, to ensure
+     that it is destroyed before closing connection (Bug#63589).
+     For that, we create a drawable, an XCB surface for that drawable,
+     and then we get the device reference from there.  */
+  Pixmap drawable;
+  cairo_surface_t *surface;
+
+  drawable = XCreatePixmap (dpyinfo->display, dpyinfo->root_window,
+                1, 1, dpyinfo->n_planes);
+  surface = cairo_xcb_surface_create (dpyinfo->xcb_connection, drawable,
+                      dpyinfo->xcb_visual, 1, 1);
+
+  if (cairo_surface_status (surface) == CAIRO_STATUS_SUCCESS)
+    {
+      dpyinfo->cairo_device = cairo_device_reference (cairo_surface_get_device (surface));
+      cairo_surface_destroy (surface);
+    }
+  XFreePixmap (dpyinfo->display, drawable);
+#endif
+
 #if defined HAVE_XFIXES && defined USE_XCB
   SAFE_FREE ();
 #endif
@@ -30783,6 +30809,17 @@ x_delete_terminal (struct terminal *term
     xim_close_dpy (dpyinfo);
 #endif

+#ifdef USE_CAIRO_XCB_SURFACE
+  /* Ensure that the cairo device is destroyed before closing
+     connection (Bug#63589).  */
+  if (dpyinfo->cairo_device)
+    {
+      cairo_device_finish (dpyinfo->cairo_device);
+      cairo_device_destroy (dpyinfo->cairo_device);
+      dpyinfo->cairo_device = NULL;
+    }
+#endif
+
   /* Normally, the display is available...  */
   if (dpyinfo->display)
     {
#+end_src




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63589; Package emacs. (Thu, 25 May 2023 10:34:02 GMT) Full text and rfc822 format available.

Message #131 received at 63589 <at> debbugs.gnu.org (full text, mbox):

From: Po Lu <luangruo <at> yahoo.com>
To: Thiago Melo <tmdmelo <at> gmail.com>
Cc: 63589 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>
Subject: Re: bug#63589: [PATCH] 29.0.91; crash after creating graphical
 frames via emacsclient when compiled with cairo-xcb
Date: Thu, 25 May 2023 18:32:50 +0800
Thiago Melo <tmdmelo <at> gmail.com> writes:

> So, any application that uses Cairo-XCB with multiple surfaces and
> wants to reopen displays _must_ save a reference to the device and
> _must_ finish + destroy it before closing the display.
>
> With this, here's another try to improve the initial patch, this time
> storing the cairo xcb device for the display at `x_term_init':
>
> #+begin_src diff
> --- a/src/xterm.h    2023-05-25 09:43:50.943793850 +0200
> +++ b/src/xterm.h    2023-05-25 11:32:03.701771148 +0200
> @@ -883,6 +883,13 @@ struct x_display_info
>       clock, or 0 if unknown (if the difference is legitimately 0,
>       server_time_monotonic_p will be true).  */
>    int_fast64_t server_time_offset;
> +
> +#if defined USE_XCB && defined USE_CAIRO_XCB
> +  /* Cairo device associated with cairo surfaces in this display.
> +     Required for proper cleanup before closing display connection
> +     in cairo-xcb builds.  */
> +  cairo_device_t *cairo_device;
> +#endif
>  #endif
>  };
> #+end_src
>
> #+begin_src diff
> --- a/src/xterm.c    2023-05-25 09:37:24.811402435 +0200
> +++ b/src/xterm.c    2023-05-25 12:18:06.003572028 +0200
> @@ -5806,10 +5806,15 @@ x_begin_cr_clip (struct frame *f, GC gc)
>        cairo_surface_t *surface;
>  #ifdef USE_CAIRO_XCB_SURFACE
>        if (FRAME_DISPLAY_INFO (f)->xcb_visual)
> +    {
>      surface = cairo_xcb_surface_create (FRAME_DISPLAY_INFO (f)->xcb_connection,
>                          (xcb_drawable_t) FRAME_X_RAW_DRAWABLE (f),
>                          FRAME_DISPLAY_INFO (f)->xcb_visual,
>                          width, height);

> +    if (cairo_surface_status (surface) == CAIRO_STATUS_SUCCESS)
> +      eassert (FRAME_DISPLAY_INFO (f)->cairo_device
> +           == cairo_surface_get_device (surface));

Did you build with checking?  Because when I last tried, this assert
triggered with the second frame created.

> +    }
>        else
>  #endif
>      surface = cairo_xlib_surface_create (FRAME_X_DISPLAY (f),
> @@ -30504,6 +30509,27 @@ x_term_init (Lisp_Object display_name, c
>
>    unblock_input ();
>
> +#ifdef USE_CAIRO_XCB_SURFACE
> +  /* Store reference to the cairo device for this display, to ensure
> +     that it is destroyed before closing connection (Bug#63589).
> +     For that, we create a drawable, an XCB surface for that drawable,
> +     and then we get the device reference from there.  */
> +  Pixmap drawable;
> +  cairo_surface_t *surface;
> +
> +  drawable = XCreatePixmap (dpyinfo->display, dpyinfo->root_window,
> +                1, 1, dpyinfo->n_planes);
> +  surface = cairo_xcb_surface_create (dpyinfo->xcb_connection, drawable,
> +                      dpyinfo->xcb_visual, 1, 1);
> +
> +  if (cairo_surface_status (surface) == CAIRO_STATUS_SUCCESS)
> +    {
> +      dpyinfo->cairo_device = cairo_device_reference (cairo_surface_get_device (surface));
> +      cairo_surface_destroy (surface);
> +    }
> +  XFreePixmap (dpyinfo->display, drawable);
> +#endif
> +
>  #if defined HAVE_XFIXES && defined USE_XCB
>    SAFE_FREE ();
>  #endif
> @@ -30783,6 +30809,17 @@ x_delete_terminal (struct terminal *term
>      xim_close_dpy (dpyinfo);
>  #endif
>
> +#ifdef USE_CAIRO_XCB_SURFACE
> +  /* Ensure that the cairo device is destroyed before closing
> +     connection (Bug#63589).  */
> +  if (dpyinfo->cairo_device)
> +    {
> +      cairo_device_finish (dpyinfo->cairo_device);
> +      cairo_device_destroy (dpyinfo->cairo_device);
> +      dpyinfo->cairo_device = NULL;
> +    }
> +#endif

If we are going down this route, I think we should save each distinct
device returned by `cairo_surface_get_device', and delete each of them
upon the terminal being deleted.

As I explained, I saw that function return different devices for the
same XCB connection, which is definitely a problem with Cairo.
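
A rough sketch of what such bookkeeping could look like, assuming
nothing about how Emacs would actually store it: `device_set',
`device_set_add', `device_set_release_all' and `toy_release' are
hypothetical names, and devices are plain `void *' so the example
compiles without Cairo.  In a real build, the release callback would
presumably call `cairo_device_finish' and then `cairo_device_destroy',
and adding a device would take its own reference with
`cairo_device_reference'.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical set of the distinct device pointers seen for one
   display.  Initialize with { 0 }.  */
struct device_set
{
  void **items;
  size_t used, allocated;
};

/* Record DEVICE unless it is already present.  Return 1 if it was
   added, 0 if it was already recorded, -1 if memory ran out.  */
int
device_set_add (struct device_set *set, void *device)
{
  assert (set != NULL && device != NULL);
  for (size_t i = 0; i < set->used; i++)
    if (set->items[i] == device)
      return 0;

  if (set->used == set->allocated)
    {
      size_t n = set->allocated ? 2 * set->allocated : 4;
      void **p = realloc (set->items, n * sizeof *p);
      if (p == NULL)
        return -1;
      set->items = p;
      set->allocated = n;
    }
  set->items[set->used++] = device;
  return 1;
}

/* Call RELEASE once on every recorded device, then empty the set.
   Return the number of devices released.  */
size_t
device_set_release_all (struct device_set *set, void (*release) (void *))
{
  size_t n = set->used;

  assert (set != NULL && release != NULL);
  for (size_t i = 0; i < n; i++)
    release (set->items[i]);
  free (set->items);
  set->items = NULL;
  set->used = set->allocated = 0;
  return n;
}

/* Stand-in release callback for experiments: treats DEVICE as an int
   counter and increments it, so double releases become visible.  */
void
toy_release (void *device)
{
  ++*(int *) device;
}
```

The idea would be to call `device_set_add' wherever a surface is
created and `device_set_release_all' in `x_delete_terminal', before the
display is closed.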

But again, that's a hack.  I would rather just disable this misdesigned
and buggy interface by default.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63589; Package emacs. (Thu, 25 May 2023 10:36:01 GMT) Full text and rfc822 format available.

Message #134 received at 63589 <at> debbugs.gnu.org (full text, mbox):

From: Po Lu <luangruo <at> yahoo.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 63589 <at> debbugs.gnu.org, tmdmelo <at> gmail.com
Subject: Re: bug#63589: [PATCH] 29.0.91; crash after creating graphical
 frames via emacsclient when compiled with cairo-xcb
Date: Thu, 25 May 2023 18:34:43 +0800
Eli Zaretskii <eliz <at> gnu.org> writes:

> Which other means are those?  Please be more specific.

I thought I explained already: delete-terminal, X server disconnects.

> We already agreed to do that (why wasn't that change installed, btw?).

I didn't realize we agreed.

> I'm trying to establish if there's anything we could do in the
> cairo-xcb configuration to make the crashes more rare, or even prevent
> them altogether.  Please bear with me.

I understand.  I've been very preoccupied these past days, which has
made it difficult for me to follow ~3 bug reports at the same time, so
please bear with me also.

Thanks.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63589; Package emacs. (Thu, 25 May 2023 11:34:01 GMT) Full text and rfc822 format available.

Message #137 received at 63589 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Po Lu <luangruo <at> yahoo.com>
Cc: 63589 <at> debbugs.gnu.org, tmdmelo <at> gmail.com
Subject: Re: bug#63589: [PATCH] 29.0.91; crash after creating graphical
 frames via emacsclient when compiled with cairo-xcb
Date: Thu, 25 May 2023 14:33:48 +0300
> From: Po Lu <luangruo <at> yahoo.com>
> Cc: tmdmelo <at> gmail.com,  63589 <at> debbugs.gnu.org
> Date: Thu, 25 May 2023 18:34:43 +0800
> 
> Eli Zaretskii <eliz <at> gnu.org> writes:
> 
> > Which other means are those?  Please be more specific.
> 
> I thought I explained already: delete-terminal, X server disconnects.

What bad things can happen (in the cairo-xcb build) if we don't delete
the terminal in all these cases?

> > We already agreed to do that (why wasn't that change installed, btw?).
> 
> I didn't realize we agreed.

We did.

> > I'm trying to establish if there's anything we could do in the
> > cairo-xcb configuration to make the crashes more rare, or even prevent
> > them altogether.  Please bear with me.
> 
> I understand.  I've been very preoccupied these past days, which has
> made it difficult for me to follow ~3 bug reports at the same time, so
> please bear with me also.

No problem.  As long as the discussion goes on, it can go on slowly,
for all I care.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63589; Package emacs. (Thu, 25 May 2023 14:08:01 GMT) Full text and rfc822 format available.

Message #140 received at 63589 <at> debbugs.gnu.org (full text, mbox):

From: Thiago Melo <tmdmelo <at> gmail.com>
To: Po Lu <luangruo <at> yahoo.com>
Cc: 63589 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>
Subject: Re: bug#63589: [PATCH] 29.0.91; crash after creating graphical frames
 via emacsclient when compiled with cairo-xcb
Date: Thu, 25 May 2023 14:06:24 +0000
[Message part 1 (text/plain, inline)]
On Thu, May 25, 2023 at 10:33 AM Po Lu <luangruo <at> yahoo.com> wrote:
> Did you build with checking?

Yes. Here are the configure options I've been using to test it:

./configure --without-all --with-x-toolkit=no \
    --without-compress-install --without-tree-sitter --without-json \
    --with-cairo --enable-checking='yes,glyphs' \
    --enable-check-lisp-object-type CFLAGS='-O0 -g3'

Let me know if your settings differ in any relevant way.

> Because when I last tried, this assert
> triggered with the second frame created.

I'm not sure if you tested the last patch I sent as it is, or if you
previously did assert tests on your own with the device returned by
`cairo_xcb_surface_create' in `x_begin_cr_clip'.  Assuming it's the
latter, then please pay close attention to this change I made to the
code:


    if (cairo_surface_status (surface) == CAIRO_STATUS_SUCCESS)
      eassert (FRAME_DISPLAY_INFO (f)->cairo_device
           == cairo_surface_get_device (surface));


Notice that, before I do the assert, I first check whether the surface
returned by `cairo_xcb_surface_create' is good.  One thing I observed
when debugging is that, every time a new frame is created, this part
of the code is hit 3 times.  The first time, the surface it returns is
always a bad one, which might even have some random garbage value for
the device.  The other 2 times, it's a proper xcb surface, and both
have the same device in common.  The same goes for subsequent calls:
it's always the same cairo-xcb device.  I see similar behavior when I
make tooltips appear.  I've attached a gdb session log showing it.

> If we are going down this route, I think we should save each distinct
> device returned by `cairo_surface_get_device', and delete each of them
> upon the terminal being deleted.
> As I explained, I saw that function return different devices for the
> same XCB connection, which is definitely a problem with Cairo.

Considering my observation above, it doesn't seem that different
_valid_ devices are being created. But if that is really the case and
I'm missing it, then yes, we make a dynamic list of devices instead.
Consider also that, with the changes I proposed, a device reference is
kept from the beginning, which might influence the results here.
[gdb-session--cairo-xcb-device.org (application/vnd.lotus-organizer, attachment)]
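
A minimal sketch of the "dynamic list of devices" idea discussed
above, assuming each distinct device handle is recorded once and all
of them are released when the terminal is deleted.  The names
`device_list', `device_list_remember' and `device_list_release_all'
are hypothetical and do not come from any patch in this thread.  The
handles are opaque `void *' values so that the sketch compiles without
Cairo; in Emacs the handle would presumably be the result of
`cairo_surface_get_device', and the release callback would wrap
`cairo_device_finish' and `cairo_device_destroy'.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Callback that gives up one device handle (hypothetical).  */
typedef void (*device_release_fn) (void *device);

/* Hypothetical registry of the distinct devices seen on one display.  */
struct device_list
{
  void **items;     /* distinct device handles, in first-seen order */
  size_t count;     /* number of handles stored */
  size_t capacity;  /* allocated slots in ITEMS */
};

/* Record DEVICE unless it is NULL or already present.
   Return 1 if it was added, 0 if it was skipped, and -1 if memory
   could not be allocated.  */
int
device_list_remember (struct device_list *list, void *device)
{
  assert (list != NULL);
  if (device == NULL)
    return 0;
  for (size_t i = 0; i < list->count; i++)
    if (list->items[i] == device)
      return 0;
  if (list->count == list->capacity)
    {
      size_t new_capacity = list->capacity ? list->capacity * 2 : 4;
      void **grown = realloc (list->items, new_capacity * sizeof *grown);
      if (grown == NULL)
        return -1;
      list->items = grown;
      list->capacity = new_capacity;
    }
  list->items[list->count++] = device;
  return 1;
}

/* Release every remembered device exactly once and empty the list.
   Intended to run when the terminal is deleted, before the display
   connection is closed.  Return the number of devices released.  */
size_t
device_list_release_all (struct device_list *list, device_release_fn release)
{
  assert (list != NULL && release != NULL);
  size_t released = list->count;
  for (size_t i = 0; i < list->count; i++)
    release (list->items[i]);
  free (list->items);
  list->items = NULL;
  list->count = list->capacity = 0;
  return released;
}
```

A caller would invoke `device_list_remember' each time a surface is
created successfully, and `device_list_release_all' once per terminal
deletion.  If only one valid device ever appears per connection, as
observed above, the list simply never grows past one entry.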

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63589; Package emacs. (Thu, 25 May 2023 18:18:02 GMT) Full text and rfc822 format available.

Message #143 received at 63589 <at> debbugs.gnu.org (full text, mbox):

From: Thiago Melo <tmdmelo <at> gmail.com>
To: Po Lu <luangruo <at> yahoo.com>
Cc: 63589 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>
Subject: Re: bug#63589: [PATCH] 29.0.91; crash after creating graphical frames
 via emacsclient when compiled with cairo-xcb
Date: Thu, 25 May 2023 18:17:11 +0000
[Message part 1 (text/plain, inline)]
Sorry, my last patch had a misplaced `#if' block in xterm.h, which I
only noticed after trying to build emacs with gtk. I've attached an
updated version here.
[bugfix-63589-v4.patch (text/x-patch, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63589; Package emacs. (Fri, 26 May 2023 00:24:02 GMT) Full text and rfc822 format available.

Message #146 received at 63589 <at> debbugs.gnu.org (full text, mbox):

From: Po Lu <luangruo <at> yahoo.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 63589 <at> debbugs.gnu.org, tmdmelo <at> gmail.com
Subject: Re: bug#63589: [PATCH] 29.0.91; crash after creating graphical
 frames via emacsclient when compiled with cairo-xcb
Date: Fri, 26 May 2023 08:23:19 +0800
Eli Zaretskii <eliz <at> gnu.org> writes:

> What bad things can happen (in the cairo-xcb build) if we don't delete
> the terminal in all these cases?

In the former case, Emacs will never be able to close a display.
In the latter case, the display connection is forcibly deleted, and the
same crash happens again.

> We did.

OK, I will install this soon.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63589; Package emacs. (Fri, 26 May 2023 01:01:02 GMT) Full text and rfc822 format available.

Message #149 received at 63589 <at> debbugs.gnu.org (full text, mbox):

From: Po Lu <luangruo <at> yahoo.com>
To: Thiago Melo <tmdmelo <at> gmail.com>
Cc: 63589 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>
Subject: Re: bug#63589: [PATCH] 29.0.91; crash after creating graphical
 frames via emacsclient when compiled with cairo-xcb
Date: Fri, 26 May 2023 08:59:53 +0800
Thiago Melo <tmdmelo <at> gmail.com> writes:

> I'm not sure if you tested the last patch I sent as it is, or if you
> previously did assert tests on your own with the device returned by
> `cairo_xcb_surface_create' at `x_begin_cr_clip'. Assuming it's the
> latter, then please pay close attention to this change I made to the
> code:
>
>
>     if (cairo_surface_status (surface) == CAIRO_STATUS_SUCCESS)
>       eassert (FRAME_DISPLAY_INFO (f)->cairo_device
>                == cairo_surface_get_device (surface));
>
>
> Notice that, before I do the assert, I first check if the surface
> returned by `cairo_xcb_surface_create' is good. One thing that I
> observed when debugging is that, every time a new frame is created,
> this part of the code is hit 3 times. The first time, the surface it
> returns is always a bad one, which might even have some random garbage
> value for the device. The other 2 times, it's a proper xcb surface,
> and they always have the same device in common. Same thing with
> subsequent calls, it's always the same cairo-xcb device. Similar
> behavior when I make tooltips appear. I've attached a gdb session log
> showing it.

What version of Cairo did you test?

>> If we are going down this route, I think we should save each distinct
>> device returned by `cairo_surface_get_device', and delete each of them
>> upon the terminal being deleted.
>> As I explained, I saw that function return different devices for the
>> same XCB connection, which is definitely a problem with Cairo.
>
> Considering my observation above, it doesn't seem that different
> _valid_ devices are being created. But if that is really the case and
> I'm missing it, then yes, we make a dynamic list of devices instead.
> Consider also that, with the changes I proposed, a device reference is
> kept from the beginning, which might influence the results here.

Or let's just disable this by default, which is really the better
solution until some people get their act together and fix this
misdesign.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63589; Package emacs. (Fri, 26 May 2023 05:08:01 GMT) Full text and rfc822 format available.

Message #152 received at 63589 <at> debbugs.gnu.org (full text, mbox):

From: Thiago Melo <tmdmelo <at> gmail.com>
To: Po Lu <luangruo <at> yahoo.com>
Cc: 63589 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>
Subject: Re: bug#63589: [PATCH] 29.0.91; crash after creating graphical frames
 via emacsclient when compiled with cairo-xcb
Date: Fri, 26 May 2023 05:06:16 +0000
On Fri, May 26, 2023 at 1:00 AM Po Lu <luangruo <at> yahoo.com> wrote:
> What version of Cairo did you test?

1.16.0, on Debian.

> Or let's just disable this by default, which is really the better
> solution until some people get their act together and fix this
> misdesign.

I'm also in agreement about leaving this backend as an opt-in for now.
Like Eli, I just wanted to address the crash itself and what can be
fixed on the Emacs side.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63589; Package emacs. (Fri, 26 May 2023 06:11:02 GMT) Full text and rfc822 format available.

Message #155 received at 63589 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Po Lu <luangruo <at> yahoo.com>
Cc: 63589 <at> debbugs.gnu.org, tmdmelo <at> gmail.com
Subject: Re: bug#63589: [PATCH] 29.0.91; crash after creating graphical
 frames via emacsclient when compiled with cairo-xcb
Date: Fri, 26 May 2023 09:10:49 +0300
> From: Po Lu <luangruo <at> yahoo.com>
> Cc: tmdmelo <at> gmail.com,  63589 <at> debbugs.gnu.org
> Date: Fri, 26 May 2023 08:23:19 +0800
> 
> Eli Zaretskii <eliz <at> gnu.org> writes:
> 
> > What bad things can happen (in the cairo-xcb build) if we don't delete
> > the terminal in all these cases?
> 
> In the former case, Emacs will never be able to close a display.

Why is this bad?  It isn't clean, I agree, but what problems would
this cause to Emacs and the user, and why is this worse than the
current situation where Emacs crashes?

> In the latter case, the display connection is forcibly deleted, and the
> same crash happens again.

But that evidently happens already with other toolkits, doesn't it?
So I guess these forced deletions are very rarely used.

(Btw, I hope I understood correctly what you mean by "former" and
"latter"; if not, please tell explicitly what they are, since the
citations above don't include any two cases to which this could
allude, so I needed to guess.)

> > We did.
> 
> OK, I will install this soon.

Thanks.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63589; Package emacs. (Fri, 26 May 2023 06:14:02 GMT) Full text and rfc822 format available.

Message #158 received at 63589 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Po Lu <luangruo <at> yahoo.com>
Cc: 63589 <at> debbugs.gnu.org, tmdmelo <at> gmail.com
Subject: Re: bug#63589: [PATCH] 29.0.91; crash after creating graphical
 frames via emacsclient when compiled with cairo-xcb
Date: Fri, 26 May 2023 09:14:22 +0300
> From: Po Lu <luangruo <at> yahoo.com>
> Cc: Eli Zaretskii <eliz <at> gnu.org>,  63589 <at> debbugs.gnu.org
> Date: Fri, 26 May 2023 08:59:53 +0800
> 
> Or let's just disable this by default, which is really the better
> solution until some people get their act together and fix this
> misdesign.

Disabling this by default doesn't mean we shouldn't strive for making
this non-default configuration less buggy.  It just lowers the
priority of those bugs, but it doesn't make them go away from our POV.

So let's try to improve the situation with this configuration, even
though we already decided to make it OFF by default.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63589; Package emacs. (Fri, 26 May 2023 08:02:02 GMT) Full text and rfc822 format available.

Message #161 received at 63589 <at> debbugs.gnu.org (full text, mbox):

From: Po Lu <luangruo <at> yahoo.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 63589 <at> debbugs.gnu.org, tmdmelo <at> gmail.com
Subject: Re: bug#63589: [PATCH] 29.0.91; crash after creating graphical
 frames via emacsclient when compiled with cairo-xcb
Date: Fri, 26 May 2023 16:01:01 +0800
Eli Zaretskii <eliz <at> gnu.org> writes:

> Why is this bad?  It isn't clean, I agree, but what problems would
> this cause to Emacs and the user, and why is this worse than the
> current situation where Emacs crashes?

Because if the connection to the other X server becomes very slow, or
abruptly disappears, Emacs could lock up or crash.

> But that evidently happens already with other toolkits, doesn't it?
> So I guess these forced deletions are very rarely used.

Connecting Emacs to multiple displays is already rarely used.  But we've
been hearing people complain about such crashes on other toolkits a lot,
so it is certainly an important situation to consider.

> (Btw, I hope I understood correctly what you mean by "former" and
> "latter"; if not, please tell explicitly what they are, since the
> citations above don't include any two cases to which this could
> allude, so I needed to guess.)

You understood correctly.  I'm sorry I was not sufficiently clear.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63589; Package emacs. (Fri, 26 May 2023 08:35:01 GMT) Full text and rfc822 format available.

Message #164 received at 63589 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Po Lu <luangruo <at> yahoo.com>
Cc: 63589 <at> debbugs.gnu.org, tmdmelo <at> gmail.com
Subject: Re: bug#63589: [PATCH] 29.0.91; crash after creating graphical
 frames via emacsclient when compiled with cairo-xcb
Date: Fri, 26 May 2023 11:34:20 +0300
> From: Po Lu <luangruo <at> yahoo.com>
> Cc: tmdmelo <at> gmail.com,  63589 <at> debbugs.gnu.org
> Date: Fri, 26 May 2023 16:01:01 +0800
> 
> Eli Zaretskii <eliz <at> gnu.org> writes:
> 
> > Why is this bad?  It isn't clean, I agree, but what problems would
> > this cause to Emacs and the user, and why is this worse than the
> > current situation where Emacs crashes?
> 
> Because if the connection to the other X server becomes very slow, or
> abruptly disappears, Emacs could lock up or crash.

Sorry, I don't understand: how is the fact that we don't close the
connection related to other connections' becoming very slow, and why
would that cause us to lock up?

In any case, it sounds like this possibility is more rare than the
situation where the user repeatedly visits files one by one via
emacsclient, each time using "C-x C-c" to finish, which closes the
connection.  So it sounds like not deleting the terminal is an
improvement, isn't it?

> > But that evidently happens already with other toolkits, doesn't it?
> > So I guess these forced deletions are very rarely used.
> 
> Connecting Emacs to multiple displays is already rarely used.  But we've
> been hearing people complain about such crashes on other toolkits a lot,
> so it is certainly an important situation to consider.

I agree.  But if Cairo-XCB behaves like those other toolkits, then we
are not worse in this respect than we already are with those other
toolkits.  So again, this sounds like an improvement.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63589; Package emacs. (Sun, 28 May 2023 03:25:02 GMT) Full text and rfc822 format available.

Message #167 received at 63589 <at> debbugs.gnu.org (full text, mbox):

From: Andrés Ramírez <rrandresf <at> hotmail.com>
To: 63589 <at> debbugs.gnu.org
Subject: 29.0.91; crash after creating graphical frames via emacsclient when
 compiled with cairo-xcb
Date: Sun, 28 May 2023 03:10:40 +0000
Hi.

So this means it now defaults to the cairo-xlib surface.

Does that mean this bug is going to happen again?
--8<---------------cut here---------------start------------->8---
https://debbugs.gnu.org/cgi/bugreport.cgi?bug=57364
--8<---------------cut here---------------end--------------->8---

Best Regards




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63589; Package emacs. (Sun, 28 May 2023 03:49:02 GMT) Full text and rfc822 format available.

Message #170 received at 63589 <at> debbugs.gnu.org (full text, mbox):

From: Andrés Ramírez <rrandresf <at> hotmail.com>
To: 63589 <at> debbugs.gnu.org
Subject: 29.0.91; crash after creating graphical frames via emacsclient when
 compiled with cairo-xcb
Date: Sun, 28 May 2023 03:34:45 +0000
Hi. Thiago.

My cairo version is 1.17.8.

I have tested 
cairo-xcb-bug.c

And in my case, it never crashes.

Best Regards





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63589; Package emacs. (Sun, 28 May 2023 05:56:01 GMT) Full text and rfc822 format available.

Message #173 received at 63589 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Andrés Ramírez <rrandresf <at> hotmail.com>
Cc: 63589 <at> debbugs.gnu.org
Subject: Re: bug#63589: 29.0.91;
 crash after creating graphical frames via emacsclient when compiled
 with cairo-xcb
Date: Sun, 28 May 2023 08:55:48 +0300
> From: Andrés Ramírez <rrandresf <at> hotmail.com>
> Date: Sun, 28 May 2023 03:34:45 +0000
> 
> Hi. Thiago.
> 
> My cairo version is 1.17.8.
> 
> I have tested 
> cairo-xcb-bug.c
> 
> And in my case, it never crashes.

Then you can still configure Emacs to be built with Cairo XCB, and
Bob's your uncle.  The code for XCB support was not removed, we just
made that configuration optional.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63589; Package emacs. (Sun, 28 May 2023 21:25:02 GMT) Full text and rfc822 format available.

Message #176 received at 63589 <at> debbugs.gnu.org (full text, mbox):

From: Thiago Melo <tmdmelo <at> gmail.com>
To: rrandresf <at> hotmail.com
Cc: 63589 <at> debbugs.gnu.org
Subject: bug#63589: 29.0.91; crash after creating graphical frames via
 emacsclient when compiled with cairo-xcb
Date: Sun, 28 May 2023 21:23:33 +0000
[Message part 1 (text/plain, inline)]
Hi Andrés.

Andrés Ramírez <rrandresf <at> hotmail.com> writes:
> My cairo version is 1.17.8.

I did some tests with Emacs + Cairo 1.17.8 as well. I still got the same errors.

While it is good to know that it runs well there, Cairo 1.17 is an
experimental pre-release.  The latest stable version of Cairo at the
moment is 1.16.0, which is the version shipped by Debian based
distros.  Even Debian Unstable packages Cairo 1.16 at the moment.  If
Cairo 1.17 received relevant bug fixes, they should have been
(hopefully) backported to 1.16 either by the Cairo devs or Debian
package maintainers.  If we find out this is not the case, then it
would be nice to report it upstream.

> I have tested
> cairo-xcb-bug.c
>
> And in my case, it never crashes.

Thanks.  In the meantime, I wrote a headless, non-interactive and
slightly improved version of this code.  It should iterate faster and
trigger the bug more reliably.  I've attached it here.  Needless to
say, but please take a careful look at the code before compiling and
running it.  Then, it would be nice if you let us know if it crashes
on you.
[cairo-xcb-bug-2.c (text/x-csrc, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63589; Package emacs. (Mon, 29 May 2023 14:52:01 GMT) Full text and rfc822 format available.

Message #179 received at 63589 <at> debbugs.gnu.org (full text, mbox):

From: andrés ramírez <rrandresf <at> hotmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 63589 <at> debbugs.gnu.org
Subject: Re: bug#63589: 29.0.91; crash after creating graphical frames via
 emacsclient when compiled with cairo-xcb
Date: Mon, 29 May 2023 14:51:03 +0000
Hi. Eli.

>>>>> "Eli" == Eli Zaretskii <eliz <at> gnu.org> writes:


[...]

    Eli> Then you can still configure Emacs to be built with Cairo XCB, and Bob's your uncle.  The
    Eli> code for XCB support was not removed, we just made that configuration optional.

Sure. That solves it in my case.

But I am thinking about the other emacsers who are going to be affected
by bug#57364.  I think I would need to retest that bug with rc3 to
check whether it is still present.  X-forwarding bugs (i.e., multiple
frames on different DISPLAYs) are very difficult to debug, which could
be why there are not many bug reports about this behaviour.  In my
particular case, I connect to a headless machine over a network cable,
so the bug is triggered.  I also think most emacsers who use the
'--daemon' (i.e., server) option are using the Lucid toolkit with Cairo.
That was my reasoning for making you aware of the issue with Cairo and
the xlib surface.

Best Regards




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63589; Package emacs. (Mon, 29 May 2023 15:00:01 GMT) Full text and rfc822 format available.

Message #182 received at 63589 <at> debbugs.gnu.org (full text, mbox):

From: andrés ramírez <rrandresf <at> hotmail.com>
To: Thiago Melo <tmdmelo <at> gmail.com>
Cc: 63589 <at> debbugs.gnu.org
Subject: Re: bug#63589: 29.0.91; crash after creating graphical frames via
 emacsclient when compiled with cairo-xcb
Date: Mon, 29 May 2023 14:58:47 +0000
Hi. Thiago.

>>>>> "Thiago" == Thiago Melo <tmdmelo <at> gmail.com> writes:


[...]


    Thiago> I did some tests with Emacs + Cairo 1.17.8 as well. I still got the same errors.

Not in my case. I have tested cairo-xcb-bug-2.c three times without the crash.

    Thiago> While good to know if it runs well there, Cairo 1.17 is an experimental pre-release.
    Thiago> The latest stable version of Cairo at the moment is 1.16.0, which is the version shipped
    Thiago> by Debian based distros.  Even Debian Unstable packages Cairo 1.16 at the moment.  If
    Thiago> Cairo 1.17 received relevant bug fixes, they should have been (hopefully) backported to
    Thiago> 1.16 either by the Cairo devs or Debian package maintainers.  If we find out this is not
    Thiago> the case, then it would be nice to report it upstream.

That could be the case, but we could also be wrong.

It is odd that the bug is present in your case on Debian.

I am on Arch Linux with up-to-date packages.  I remember that, as part
of the discussion about bug#57364, Po Lu asked me to update Cairo.  At
that time I compiled the git version, but Arch Linux now packages 1.17
by default.  Our distros take opposite approaches to the packages they
publish.

Best Regards





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63589; Package emacs. (Mon, 29 May 2023 15:23:01 GMT) Full text and rfc822 format available.

Message #185 received at 63589 <at> debbugs.gnu.org (full text, mbox):

From: Thiago Melo <tmdmelo <at> gmail.com>
To: andrés ramírez <rrandresf <at> hotmail.com>
Cc: 63589 <at> debbugs.gnu.org
Subject: Re: bug#63589: 29.0.91; crash after creating graphical frames via
 emacsclient when compiled with cairo-xcb
Date: Mon, 29 May 2023 15:21:48 +0000
On Mon, May 29, 2023 at 2:59 PM andrés ramírez <rrandresf <at> hotmail.com> wrote:
> Not in my case. I have tested cairo-xcb-bug-2.c three times without the crash.

After I wrote my last message, I've tested this example with Cairo
1.17 as well. As it is, it doesn't trigger a crash.

Could you try this: at the top of this code, there are the variables
`width' and `height'. Please try setting them to a higher value (e.g.,
64), then compile and run it again. If you do, let me know if you get
an X Error.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63589; Package emacs. (Mon, 29 May 2023 15:39:01 GMT) Full text and rfc822 format available.

Message #188 received at 63589 <at> debbugs.gnu.org (full text, mbox):

From: andrés ramírez <rrandresf <at> hotmail.com>
To: Thiago Melo <tmdmelo <at> gmail.com>
Cc: 63589 <at> debbugs.gnu.org
Subject: Re: bug#63589: 29.0.91; crash after creating graphical frames via
 emacsclient when compiled with cairo-xcb
Date: Mon, 29 May 2023 15:37:56 +0000
Hi. Thiago.

>>>>> "Thiago" == Thiago Melo <tmdmelo <at> gmail.com> writes:


[...]


    Thiago> Could you try this: at the top of this code, there are the variables `width' and
    Thiago> `height'. Please try setting them to a higher value (e.g., 64), then compile and run it
    Thiago> again. If you do, let me know if you get an X Error.


--8<---------------cut here---------------start------------->8---
Press C-c to exit.
Iteration: 2/100000X Error of failed request:  143
  Major opcode of failed request:  139 ()
  Minor opcode of failed request:  10
  Serial number of failed request:  13
  Current serial number in output stream:  23
--8<---------------cut here---------------end--------------->8---

Best Regards




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63589; Package emacs. (Mon, 29 May 2023 16:12:02 GMT) Full text and rfc822 format available.

Message #191 received at 63589 <at> debbugs.gnu.org (full text, mbox):

From: Thiago Melo <tmdmelo <at> gmail.com>
To: andrés ramírez <rrandresf <at> hotmail.com>
Cc: 63589 <at> debbugs.gnu.org
Subject: Re: bug#63589: 29.0.91; crash after creating graphical frames via
 emacsclient when compiled with cairo-xcb
Date: Mon, 29 May 2023 16:10:54 +0000
On Mon, May 29, 2023 at 3:38 PM andrés ramírez <rrandresf <at> hotmail.com> wrote:
> --8<---------------cut here---------------start------------->8---
> Press C-c to exit.
> Iteration: 2/100000X Error of failed request:  143
>   Major opcode of failed request:  139 ()
>   Minor opcode of failed request:  10
>   Serial number of failed request:  13
>   Current serial number in output stream:  23
> --8<---------------cut here---------------end--------------->8---

Now, if you uncomment this line of the code:

//#define USE_CAIRO_DEVICE

Then compile and run it again, you shouldn't get the error anymore. It
enables the cairo device destruction and proper invalidation of the
cairo xcb connection cache at the end, as per what I explained in
previous messages.
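
A toy model of why that invalidation matters, under the assumption
that the library caches one device per connection and looks it up by
the connection's address.  All names here (`connection', `device',
`device_for_connection', `device_finish') are hypothetical stand-ins
and not Cairo or XCB APIs.  If a closed connection's cache entry is
never invalidated and a new connection is later allocated at the same
address, the lookup hands back the stale device.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins: CONNECTION plays the role of an XCB
   connection, DEVICE the per-connection object a library caches.  */
struct connection
{
  int id;
};

struct device
{
  const struct connection *owner;  /* address used as the cache key */
  int owner_id;                    /* id of the connection at creation */
  int finished;                    /* nonzero once invalidated */
};

#define CACHE_SLOTS 8
static struct device cache[CACHE_SLOTS];
static size_t cache_used;

/* Return the cached device for CONN, creating one on a miss.
   The lookup compares addresses only, so it cannot tell a reused
   address apart from the original connection.  */
struct device *
device_for_connection (const struct connection *conn)
{
  for (size_t i = 0; i < cache_used; i++)
    if (!cache[i].finished && cache[i].owner == conn)
      return &cache[i];
  assert (cache_used < CACHE_SLOTS);
  struct device *dev = &cache[cache_used++];
  dev->owner = conn;
  dev->owner_id = conn->id;
  dev->finished = 0;
  return dev;
}

/* Invalidate DEV so that later lookups skip it.  This models the
   effect attributed above to the USE_CAIRO_DEVICE code path.  */
void
device_finish (struct device *dev)
{
  dev->finished = 1;
}
```

In this model, calling `device_finish' before the connection goes away
is what keeps the next connection at the same address from inheriting
a dead device.  With Cairo itself, the corresponding calls would
presumably be `cairo_device_finish' and `cairo_device_destroy' on the
device obtained from `cairo_surface_get_device'.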

The bottom line is, if you're getting these errors in this toy
example, this bug might bite you on Emacs with Cairo XCB at some
point. When is anyone's guess: we're in undefined-behavior land here.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63589; Package emacs. (Mon, 29 May 2023 16:23:01 GMT) Full text and rfc822 format available.

Message #194 received at 63589 <at> debbugs.gnu.org (full text, mbox):

From: andrés ramírez <rrandresf <at> hotmail.com>
To: Thiago Melo <tmdmelo <at> gmail.com>
Cc: 63589 <at> debbugs.gnu.org
Subject: Re: bug#63589: 29.0.91; crash after creating graphical frames via
 emacsclient when compiled with cairo-xcb
Date: Mon, 29 May 2023 16:21:51 +0000
Hi. Thiago.

>>>>> "Thiago" == Thiago Melo <tmdmelo <at> gmail.com> writes:


[...]


    Thiago> The bottom line is, if you're getting these errors in this toy example, this bug might
    Thiago> bite you on Emacs with Cairo XCB at some point. When is anyone's guess: we're in
    Thiago> undefined-behavior land here.

Probably with emacs-29 I would need to pick between bug#63589 and bug#57364.

But I have been on this setup for several months and that issue has not
shown up on my side. So, when needed, I would probably compile emacs-29 with '--with-cairo-xcb'.

Best Regards




This bug report was last modified 2 years and 19 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.