GNU bug report logs -
#58732
installer: finalizers & device destroy segfault
Previous Next
Reported by: Mathieu Othacehe <othacehe <at> gnu.org>
Date: Sun, 23 Oct 2022 09:08:01 UTC
Severity: important
Done: Mathieu Othacehe <othacehe <at> gnu.org>
Bug is archived. No further changes may be made.
Full log
View this message in rfc822 format
[Message part 1 (text/plain, inline)]
Your message dated Thu, 10 Nov 2022 13:29:10 +0100
with message-id <87r0ybyovd.fsf <at> gnu.org>
and subject line Re: bug#58732: installer: finalizers & device destroy segfault
has caused the debbugs.gnu.org bug report #58732,
regarding installer: finalizers & device destroy segfault
to be marked as done.
(If you believe you have received this mail in error, please contact
help-debbugs <at> gnu.org.)
--
58732: https://debbugs.gnu.org/cgi/bugreport.cgi?bug=58732
GNU Bug Tracking System
Contact help-debbugs <at> gnu.org with problems
[Message part 2 (message/rfc822, inline)]
Hello,
I found a segfault in the installer by running those steps:
- Run an automatic partitioning with separate home and no encryption
- In the final configuration page, come back to partitioning
- Remove all partitions but the ESP one, create a new btrfs root
- partition
- Repeat until the crash occurs
Using Josselin's instructions here: https://issues.guix.gnu.org/57513, I
was able to get the following backtrace:
--8<---------------cut here---------------start------------->8---
Reading symbols from /gnu/store/b0ymz7vjfkcvhbci49q5yk1fi0l9lq49-parted-3.5/lib/libparted.so...
(gdb) bt
#0 linux_destroy (dev=0x1dc89e0) at arch/linux.c:1615
#1 0x00007f8941aecd37 in ?? () from /gnu/store/1jgcbdzx2ss6xv59w55g3kr3x4935dfb-guile-3.0.8/lib/libguile-3.0.so.1
#2 0x00007f8941a45e3f in GC_invoke_finalizers () from /gnu/store/2lczkxbdbzh4gk7wh91bzrqrk7h5g1dl-libgc-8.0.4/lib/libgc.so.1
#3 0x00007f8941aed429 in scm_run_finalizers () from /gnu/store/1jgcbdzx2ss6xv59w55g3kr3x4935dfb-guile-3.0.8/lib/libguile-3.0.so.1
#4 0x00007f8941af4482 in ?? () from /gnu/store/1jgcbdzx2ss6xv59w55g3kr3x4935dfb-guile-3.0.8/lib/libguile-3.0.so.1
#5 0x00007f8941ae085a in ?? () from /gnu/store/1jgcbdzx2ss6xv59w55g3kr3x4935dfb-guile-3.0.8/lib/libguile-3.0.so.1
#6 0x00007f8941b6d336 in ?? () from /gnu/store/1jgcbdzx2ss6xv59w55g3kr3x4935dfb-guile-3.0.8/lib/libguile-3.0.so.1
#7 0x00007f8941b7a5e9 in scm_call_n () from /gnu/store/1jgcbdzx2ss6xv59w55g3kr3x4935dfb-guile-3.0.8/lib/libguile-3.0.so.1
#8 0x00007f8941ae209a in scm_call_2 () from /gnu/store/1jgcbdzx2ss6xv59w55g3kr3x4935dfb-guile-3.0.8/lib/libguile-3.0.so.1
#9 0x00007f8941b98752 in ?? () from /gnu/store/1jgcbdzx2ss6xv59w55g3kr3x4935dfb-guile-3.0.8/lib/libguile-3.0.so.1
#10 0x00007f8941b6a88f in scm_c_catch () from /gnu/store/1jgcbdzx2ss6xv59w55g3kr3x4935dfb-guile-3.0.8/lib/libguile-3.0.so.1
#11 0x00007f8941ae2e66 in scm_c_with_continuation_barrier () from /gnu/store/1jgcbdzx2ss6xv59w55g3kr3x4935dfb-guile-3.0.8/lib/libguile-3.0.so.1
#12 0x00007f8941b69b39 in ?? () from /gnu/store/1jgcbdzx2ss6xv59w55g3kr3x4935dfb-guile-3.0.8/lib/libguile-3.0.so.1
#13 0x00007f8941a400ba in GC_call_with_stack_base () from /gnu/store/2lczkxbdbzh4gk7wh91bzrqrk7h5g1dl-libgc-8.0.4/lib/libgc.so.1
#14 0x00007f8941b628b8 in scm_with_guile () from /gnu/store/1jgcbdzx2ss6xv59w55g3kr3x4935dfb-guile-3.0.8/lib/libguile-3.0.so.1
#15 0x00007f8941a16d7e in ?? () from /gnu/store/5h2w4qi9hk1qzzgi1w83220ydslinr4s-glibc-2.33/lib/libpthread.so.0
#16 0x00007f8941614eff in clone () from /gnu/store/5h2w4qi9hk1qzzgi1w83220ydslinr4s-glibc-2.33/lib/libc.so.6
--8<---------------cut here---------------end--------------->8---
linux_destroy is the PedDevice destruction function. The crash occurs
when dereferencing the arch_specific pointer which is ...
--8<---------------cut here---------------start------------->8---
(gdb) p dev
$1 = (PedDevice *) 0x1dc89e0
(gdb) p *dev
$2 = {next = 0x1, model = 0x1645d50 "", path = 0x0, type = PED_DEVICE_UNKNOWN, sector_size = 0, phys_sector_size = 1, length = 23272720, open_count = 0, read_only = 1, external_mode = 0, dirty = 0, boot_dirty = 0, hw_geom = {
cylinders = 0, heads = 2, sectors = 0}, bios_geom = {cylinders = 23259184, heads = 0, sectors = 0}, host = 1, did = 0, arch_specific = 0x0}
(gdb) p dev->arch_specific
$3 = (void *) 0x0
--8<---------------cut here---------------end--------------->8---
null! I guess this has to deal with device pointer finalizers. I'm a bit
disappointed because I thought we had overcome those mistakes.
Thanks,
Mathieu
[Message part 3 (message/rfc822, inline)]
Hey,
> Looking at device.c in Parted, that’s probably the right thing because
> PedDevice objects are kept in a linked list whose head is stored in the
> ‘devices’ global variable of device.c. So you cannot just free them
> asynchronously from a finalizer thread because they might still be
> accessed from other parts of the library. This is the explanation that
> should go in the comment, and it’s clearly a good reason not to free
> those PedDevice objects.
If the finalizer was run synchronously when a device is removed from the
weak hash table then things would be OK. The device would be removed
from the global linked list by _device_register. get_device would malloc
a new structure and so on. However finalizers are not run synchronously
so here we are.
> Now, we could provide bindings for ‘ped_device_destroy’ that users could
> explicitly call if they want to (this would be similar to explicit calls
> to ‘close-port’). We’d arrange to make it idempotent.
Sure.
Thanks for your help on that one. I pushed the proposed patch and updated
Guile-Parted to 0.0.7 in Guix.
Mathieu
This bug report was last modified 2 years and 194 days ago.
Previous Next
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.