GNU bug report logs - #42140
26.3; sigsegv when using nss-docker

Package: emacs;

Reported by: Hans van den Bogert <hansbogert <at> gmail.com>

Date: Tue, 30 Jun 2020 15:11:07 UTC

Severity: normal

Found in version 26.3

Done: Eli Zaretskii <eliz <at> gnu.org>

Bug is archived. No further changes may be made.

From: Hans van den Bogert <hansbogert <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 42140 <at> debbugs.gnu.org
Subject: bug#42140: 26.3; sigsegv when using nss-docker
Date: Tue, 30 Jun 2020 22:15:42 +0200
On 6/30/20 5:40 PM, Eli Zaretskii wrote:
> Emacs is not multithreaded.  If you never start any additional Lisp
> threads, only one thread ever runs (not counting GTK threads, but
> those aren't new in Emacs 26).
> 
> The backtrace seems to suggest its a problem in nss-docker, since the
> crash is in its code.  Are you sure this is an Emacs problem?

> Emacs is not multithreaded.
You are right, that was a poor choice of words; "concurrent" seems to be
the proper word. The v26 release notes do mention the change to an
asynchronous network layer:

Release note v26 snippet
--->8---
** The networking code has been reworked so that it's more
asynchronous than it was (when specifying :nowait t in
'make-network-process').  How asynchronous it is varies based on the
capabilities of the system, but on a typical GNU/Linux system the DNS
resolution, the connection, and (for TLS streams) the TLS negotiation
are all done without blocking the main Emacs thread.  To get
asynchronous TLS, the TLS boot parameters have to be passed in (see
the manual for details).
--->8---

> If you never start any additional Lisp
> threads, only one thread ever runs (not counting GTK threads, but
> those aren't new in Emacs 26).

I am an extreme novice with respect to Emacs development, but I have to
disagree. In contrast to v25, I can see this async change in the debug
prints which I added to the `_nss_docker_*_r` functions: the internal
calls made by separate `_nss_docker_gethostbyname2_r` invocations can
interleave.

Further, I think I see 2 threads for 2 name resolutions (is this what you
meant by 'additional Lisp threads'?):

```
Thread 7 (Thread 0x7fffd8ce7b40 (LWP 18899)):
#0  0x00007fffd8acecd5 in _nss_docker_gethostbyname3_r (name=Python Exception <class 'gdb.MemoryError'> Cannot access memory at address 0x7fffd8ccd388:
#1  0x00007fffd8acf518 in _nss_docker_gethostbyname2_r (name=0x2d72768 "orgmode.org", af=10, result=0x7fffd8ce67d0, buffer=0x7fffd8ce6a40 "\377\002", buflen=1024, errnop=0x7fffd8ce7948, herrnop=0x7fffd8ce79ac) at libnss_docker.c:340
#2  0x00007fffebf70f9f in gaih_inet (name=name@entry=0x2d72768 "orgmode.org", service=<optimized out>, req=req@entry=0x2d72738, pai=pai@entry=0x7fffd8ce69c8, naddrs=naddrs@entry=0x7fffd8ce69c4, tmpbuf=tmpbuf@entry=0x7fffd8ce6a30) at ../sysdeps/posix/getaddrinfo.c:873
#3  0x00007fffebf72ce4 in __GI_getaddrinfo (name=<optimized out>, service=<optimized out>, hints=0x2d72738, pai=pai@entry=0x2d72718) at ../sysdeps/posix/getaddrinfo.c:2300
#4  0x00007fffecb5a058 in handle_requests (arg=<optimized out>) at gai_misc.c:317
#5  0x00007fffecd646db in start_thread (arg=0x7fffd8ce7b40) at pthread_create.c:463
#6  0x00007fffebf8c88f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
...
Thread 5 (Thread 0x7fffd8f51b40 (LWP 18897)):
#0  0x00007fffd8acecd5 in _nss_docker_gethostbyname3_r (name=0x2e6f732e312e302d <error: Cannot access memory at address 0x2e6f732e312e302d>, af=2002936162, result=0x6e672d78756e696c, buffer=0x2d34365f3638782f <error: Cannot access memory at address 0x2d34365f3638782f>, buflen=7091318039310988591, errnop=0x312e6f732e312e, herrnop=0x302d77626162696c, ttlp=0x302e6f732e6563, canonp=0x697672657373746e) at libnss_docker.c:72
#1  0x00007fffd8acf518 in _nss_docker_gethostbyname2_r (name=0x338a068 "elpa.gnu.org", af=10, result=0x7fffd8f507d0, buffer=0x7fffd8f50a40 "\377\002", buflen=1024, errnop=0x7fffd8f51948, herrnop=0x7fffd8f519ac) at libnss_docker.c:340
#2  0x00007fffebf70f9f in gaih_inet (name=name@entry=0x338a068 "elpa.gnu.org", service=<optimized out>, req=req@entry=0x338a038, pai=pai@entry=0x7fffd8f509c8, naddrs=naddrs@entry=0x7fffd8f509c4, tmpbuf=tmpbuf@entry=0x7fffd8f50a30) at ../sysdeps/posix/getaddrinfo.c:873
#3  0x00007fffebf72ce4 in __GI_getaddrinfo (name=<optimized out>, service=<optimized out>, hints=0x338a038, pai=pai@entry=0x338a018) at ../sysdeps/posix/getaddrinfo.c:2300
#4  0x00007fffecb5a058 in handle_requests (arg=<optimized out>) at gai_misc.c:317
#5  0x00007fffecd646db in start_thread (arg=0x7fffd8f51b40) at pthread_create.c:463
#6  0x00007fffebf8c88f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
```


If someone could point out where the libc/NSS code is called on the
Emacs side, I can debug this further. To be honest, I'm having
difficulty pinpointing that.
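
For anyone who wants to reproduce this outside Emacs: the
`handle_requests` frames at `gai_misc.c` in both threads appear to come
from glibc's asynchronous resolver interface, `getaddrinfo_a`. Below is
a minimal C sketch, assuming that this is the path by which the NSS
module ends up being called concurrently. It queues two lookups at once
and waits for both to finish. `resolve_concurrently` is a hypothetical
helper name, not something taken from Emacs or nss-docker, and on older
glibc versions the program may need to be linked with `-lanl`.

```c
#define _GNU_SOURCE             /* for getaddrinfo_a, struct gaicb */
#include <assert.h>
#include <netdb.h>
#include <string.h>

/* Hypothetical helper: queue two lookups with getaddrinfo_a() and wait
   until neither is in progress any more.  Returns the number of
   finished requests (2 when both completed, whether or not the names
   actually resolved), or -1 if the requests could not be queued.  */
int
resolve_concurrently (const char *host_a, const char *host_b)
{
  struct gaicb req_a, req_b;
  struct gaicb *reqs[2] = { &req_a, &req_b };
  int done = 0;

  assert (host_a != NULL && host_b != NULL);

  memset (&req_a, 0, sizeof req_a);
  memset (&req_b, 0, sizeof req_b);
  req_a.ar_name = host_a;
  req_b.ar_name = host_b;

  /* GAI_NOWAIT returns immediately; glibc performs the lookups on its
     own helper threads, so the NSS modules may run in parallel.  */
  if (getaddrinfo_a (GAI_NOWAIT, reqs, 2, NULL) != 0)
    return -1;

  /* gai_suspend() wakes up when at least one request completes, so
     loop until none is left in the EAI_INPROGRESS state.  */
  for (;;)
    {
      int pending = 0;
      for (int i = 0; i < 2; i++)
        if (gai_error (reqs[i]) == EAI_INPROGRESS)
          pending++;
      if (pending == 0)
        break;
      gai_suspend ((const struct gaicb *const *) reqs, 2, NULL);
    }

  /* Count finished requests and release any result lists.  */
  for (int i = 0; i < 2; i++)
    {
      if (gai_error (reqs[i]) == 0 && reqs[i]->ar_result != NULL)
        freeaddrinfo (reqs[i]->ar_result);
      done++;
    }
  return done;
}
```

Calling `resolve_concurrently ("orgmode.org", "elpa.gnu.org")` from a
small `main` on a system with nss-docker enabled in
`/etc/nsswitch.conf` might exercise the same
`_nss_docker_gethostbyname2_r` path from two threads, which would help
tell whether the crash depends on Emacs at all.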






This bug report was last modified 5 years and 5 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.