GNU bug report logs - #58732
installer: finalizers & device destroy segfault
Reported by: Mathieu Othacehe <othacehe <at> gnu.org>
Date: Sun, 23 Oct 2022 09:08:01 UTC
Severity: important
Done: Mathieu Othacehe <othacehe <at> gnu.org>
Bug is archived. No further changes may be made.
[Message part 1 (text/plain, inline)]
Your bug report
#58732: installer: finalizers & device destroy segfault
which was filed against the guix package, has been closed.
The explanation is attached below, along with your original report.
If you require more details, please reply to 58732 <at> debbugs.gnu.org.
--
58732: https://debbugs.gnu.org/cgi/bugreport.cgi?bug=58732
GNU Bug Tracking System
Contact help-debbugs <at> gnu.org with problems
[Message part 2 (message/rfc822, inline)]
Hey,
> Looking at device.c in Parted, that’s probably the right thing because
> PedDevice objects are kept in a linked list whose head is stored in the
> ‘devices’ global variable of device.c. So you cannot just free them
> asynchronously from a finalizer thread because they might still be
> accessed from other parts of the library. This is the explanation that
> should go in the comment, and it’s clearly a good reason not to free
> those PedDevice objects.
If the finalizer were run synchronously when a device is removed from the
weak hash table, things would be OK: the device would be removed
from the global linked list by _device_unregister, get_device would malloc
a new structure, and so on. However, finalizers are not run synchronously,
so here we are.
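To make the lifetime problem concrete, here is a minimal sketch in C of a
device registry of the kind described above. It is not Parted's code: the
Device type and the device_get, device_register, device_unregister and
device_destroy functions are simplified, hypothetical stand-ins that only
mirror the shape of a global linked list whose lookup returns an
already-registered node.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Simplified, hypothetical stand-in for PedDevice: only what the sketch needs. */
typedef struct Device {
    struct Device *next;
    char *path;
} Device;

/* Head of the global list, playing the role of 'devices' in device.c. */
static Device *devices = NULL;

/* Push a device onto the global list. */
static void device_register(Device *dev) {
    dev->next = devices;
    devices = dev;
}

/* Unlink a device from the global list, if it is present. */
static void device_unregister(Device *dev) {
    for (Device **link = &devices; *link != NULL; link = &(*link)->next) {
        if (*link == dev) {
            *link = dev->next;
            dev->next = NULL;
            return;
        }
    }
}

/* Return the registered device for 'path', allocating one only if absent. */
static Device *device_get(const char *path) {
    assert(path != NULL);
    for (Device *walk = devices; walk != NULL; walk = walk->next)
        if (strcmp(walk->path, path) == 0)
            return walk;            /* same pointer as every earlier lookup */

    Device *dev = malloc(sizeof *dev);
    if (dev == NULL)
        return NULL;
    dev->path = malloc(strlen(path) + 1);
    if (dev->path == NULL) {
        free(dev);
        return NULL;
    }
    strcpy(dev->path, path);
    device_register(dev);
    return dev;
}

/* Unregister and free a device; any pointer to it is dangling afterwards. */
static void device_destroy(Device *dev) {
    device_unregister(dev);
    free(dev->path);
    free(dev);
}
```

In this model, two lookups of the same path return the same pointer until
device_destroy is called. If a wrapper object is collected but its finalizer
only runs later, a lookup made in between hands the old pointer to a new
wrapper, and the late device_destroy then frees memory that the new wrapper
still uses.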
> Now, we could provide bindings for ‘ped_device_destroy’ that users could
> explicitly call if they want to (this would be similar to explicit calls
> to ‘close-port’). We’d arrange to make it idempotent.
Sure.
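For reference, an idempotent explicit destroy could look roughly like the
sketch below. This is only an illustration of the idea, not the Guile-Parted
implementation: Resource, Handle, handle_open, handle_destroy and the
live_resources counter are hypothetical names, and the real binding would
call ped_device_destroy where the sketch calls free.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical native resource standing in for a PedDevice. */
typedef struct Resource {
    int payload;
} Resource;

/* Hypothetical wrapper that a binding would hand out to its users. */
typedef struct Handle {
    Resource *res;              /* NULL once the handle has been destroyed */
} Handle;

/* Number of resources currently allocated, for illustration only. */
static int live_resources = 0;

/* Allocate a resource and wrap it; res is NULL if allocation failed. */
static Handle handle_open(int payload) {
    Handle h = { malloc(sizeof(Resource)) };
    if (h.res != NULL) {
        h.res->payload = payload;
        live_resources++;
    }
    return h;
}

/* Release the resource at most once. Returns true if this call released
   it, false if the handle was already destroyed (or never valid). */
static bool handle_destroy(Handle *h) {
    assert(h != NULL);
    if (h->res == NULL)
        return false;           /* second and later calls are no-ops */
    free(h->res);
    h->res = NULL;              /* mark as destroyed so repeats are safe */
    live_resources--;
    return true;
}
```

Calling handle_destroy twice on the same handle releases the resource once
and does nothing the second time, much like calling 'close-port' on a port
that is already closed. Note that this only protects a single handle: two
handles holding the same pointer would still need shared bookkeeping.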
Thanks for your help on that one. I pushed the proposed patch and updated
Guile-Parted to 0.0.7 in Guix.
Mathieu
[Message part 3 (message/rfc822, inline)]
Hello,
I found a segfault in the installer by running those steps:
- Run an automatic partitioning with separate home and no encryption
- In the final configuration page, come back to partitioning
- Remove all partitions but the ESP one, create a new btrfs root
  partition
- Repeat until the crash occurs
Using Josselin's instructions here: https://issues.guix.gnu.org/57513, I
was able to get the following backtrace:
--8<---------------cut here---------------start------------->8---
Reading symbols from /gnu/store/b0ymz7vjfkcvhbci49q5yk1fi0l9lq49-parted-3.5/lib/libparted.so...
(gdb) bt
#0 linux_destroy (dev=0x1dc89e0) at arch/linux.c:1615
#1 0x00007f8941aecd37 in ?? () from /gnu/store/1jgcbdzx2ss6xv59w55g3kr3x4935dfb-guile-3.0.8/lib/libguile-3.0.so.1
#2 0x00007f8941a45e3f in GC_invoke_finalizers () from /gnu/store/2lczkxbdbzh4gk7wh91bzrqrk7h5g1dl-libgc-8.0.4/lib/libgc.so.1
#3 0x00007f8941aed429 in scm_run_finalizers () from /gnu/store/1jgcbdzx2ss6xv59w55g3kr3x4935dfb-guile-3.0.8/lib/libguile-3.0.so.1
#4 0x00007f8941af4482 in ?? () from /gnu/store/1jgcbdzx2ss6xv59w55g3kr3x4935dfb-guile-3.0.8/lib/libguile-3.0.so.1
#5 0x00007f8941ae085a in ?? () from /gnu/store/1jgcbdzx2ss6xv59w55g3kr3x4935dfb-guile-3.0.8/lib/libguile-3.0.so.1
#6 0x00007f8941b6d336 in ?? () from /gnu/store/1jgcbdzx2ss6xv59w55g3kr3x4935dfb-guile-3.0.8/lib/libguile-3.0.so.1
#7 0x00007f8941b7a5e9 in scm_call_n () from /gnu/store/1jgcbdzx2ss6xv59w55g3kr3x4935dfb-guile-3.0.8/lib/libguile-3.0.so.1
#8 0x00007f8941ae209a in scm_call_2 () from /gnu/store/1jgcbdzx2ss6xv59w55g3kr3x4935dfb-guile-3.0.8/lib/libguile-3.0.so.1
#9 0x00007f8941b98752 in ?? () from /gnu/store/1jgcbdzx2ss6xv59w55g3kr3x4935dfb-guile-3.0.8/lib/libguile-3.0.so.1
#10 0x00007f8941b6a88f in scm_c_catch () from /gnu/store/1jgcbdzx2ss6xv59w55g3kr3x4935dfb-guile-3.0.8/lib/libguile-3.0.so.1
#11 0x00007f8941ae2e66 in scm_c_with_continuation_barrier () from /gnu/store/1jgcbdzx2ss6xv59w55g3kr3x4935dfb-guile-3.0.8/lib/libguile-3.0.so.1
#12 0x00007f8941b69b39 in ?? () from /gnu/store/1jgcbdzx2ss6xv59w55g3kr3x4935dfb-guile-3.0.8/lib/libguile-3.0.so.1
#13 0x00007f8941a400ba in GC_call_with_stack_base () from /gnu/store/2lczkxbdbzh4gk7wh91bzrqrk7h5g1dl-libgc-8.0.4/lib/libgc.so.1
#14 0x00007f8941b628b8 in scm_with_guile () from /gnu/store/1jgcbdzx2ss6xv59w55g3kr3x4935dfb-guile-3.0.8/lib/libguile-3.0.so.1
#15 0x00007f8941a16d7e in ?? () from /gnu/store/5h2w4qi9hk1qzzgi1w83220ydslinr4s-glibc-2.33/lib/libpthread.so.0
#16 0x00007f8941614eff in clone () from /gnu/store/5h2w4qi9hk1qzzgi1w83220ydslinr4s-glibc-2.33/lib/libc.so.6
--8<---------------cut here---------------end--------------->8---
linux_destroy is the PedDevice destruction function. The crash occurs
when dereferencing the arch_specific pointer which is ...
--8<---------------cut here---------------start------------->8---
(gdb) p dev
$1 = (PedDevice *) 0x1dc89e0
(gdb) p *dev
$2 = {next = 0x1, model = 0x1645d50 "", path = 0x0, type = PED_DEVICE_UNKNOWN, sector_size = 0, phys_sector_size = 1, length = 23272720, open_count = 0, read_only = 1, external_mode = 0, dirty = 0, boot_dirty = 0, hw_geom = {
cylinders = 0, heads = 2, sectors = 0}, bios_geom = {cylinders = 23259184, heads = 0, sectors = 0}, host = 1, did = 0, arch_specific = 0x0}
(gdb) p dev->arch_specific
$3 = (void *) 0x0
--8<---------------cut here---------------end--------------->8---
null! I guess this has to do with device pointer finalizers. I'm a bit
disappointed because I thought we had overcome those mistakes.
Thanks,
Mathieu
This bug report was last modified 2 years and 194 days ago.