GNU bug report logs - #70342
29.3.50; treesitter and RTLD_GLOBAL


Package: emacs;

Reported by: Michael Lausch <mick.lausch <at> gmail.com>

Date: Thu, 11 Apr 2024 18:01:03 UTC

Severity: normal

Tags: moreinfo, notabug

Found in version 29.3.50

Done: Stefan Kangas <stefankangas <at> gmail.com>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 70342 in the body.
You can then email your comments to 70342 AT debbugs.gnu.org in the normal way.


Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#70342; Package emacs. (Thu, 11 Apr 2024 18:01:04 GMT)

Acknowledgement sent to Michael Lausch <mick.lausch <at> gmail.com>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Thu, 11 Apr 2024 18:01:04 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Michael Lausch <mick.lausch <at> gmail.com>
To: bug-gnu-emacs <at> gnu.org
Subject: 29.3.50; treesitter and RTLD_GLOBAL
Date: Thu, 11 Apr 2024 19:38:52 +0200
When loading a treesitter grammar in GNU/Linux, the dlopen() call is used
with the RTLD_GLOBAL flag set. If you load more than one
treesitter grammar, and both grammars define the same functions, most
probably in the scanner.c file, symbol resolution may use the wrong symbol.
For example, the org and the yaml grammars both define a deserialize()
function in their scanner.c files. This may result in a call from the org
grammar to the deserialize() function defined by yaml. This fails because the
yaml function does different things than the org grammar expects (it
frees a dangling pointer, and therefore Emacs crashes).

A solution can be:
1) use a special call to dlopen without the RTLD_GLOBAL flag, similar to
what the eln loader does.
2) fix all the grammars and make all functions 'static' so that the
functions are not visible outside the compilation unit.
3) something I didn't think about
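
A minimal sketch of option 1, assuming a POSIX dlopen(): leaving out
RTLD_GLOBAL (that is, using RTLD_LOCAL) keeps a grammar's symbols out of
the global lookup scope, so a library loaded later cannot bind to them by
accident. The names grammar_dlopen_flags and open_grammar are hypothetical
helpers for illustration, not Emacs functions.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical helper: choose dlopen() flags for a grammar library.
   A non-zero ISOLATE keeps the library's symbols private (RTLD_LOCAL);
   zero mirrors the RTLD_GLOBAL behaviour described in this report.  */
int
grammar_dlopen_flags (int isolate)
{
  return RTLD_LAZY | (isolate ? RTLD_LOCAL : RTLD_GLOBAL);
}

/* Hypothetical helper: open the shared library at PATH with the flags
   chosen above.  Returns NULL and prints the loader's message on failure.  */
void *
open_grammar (const char *path, int isolate)
{
  void *handle = dlopen (path, grammar_dlopen_flags (isolate));
  if (handle == NULL)
    fprintf (stderr, "dlopen failed: %s\n", dlerror ());
  return handle;
}
```

A caller would still look up the grammar's entry point explicitly with
dlsym() on the returned handle; that lookup works for RTLD_LOCAL libraries
as well.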

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#70342; Package emacs. (Thu, 11 Apr 2024 18:40:01 GMT)

Message #8 received at 70342 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: Michael Lausch <mick.lausch <at> gmail.com>
Cc: 70342 <at> debbugs.gnu.org
Subject: Re: bug#70342: 29.3.50; treesitter and RTLD_GLOBAL
Date: Thu, 11 Apr 2024 21:39:16 +0300
> From: Michael Lausch <mick.lausch <at> gmail.com>
> Date: Thu, 11 Apr 2024 19:38:52 +0200
> 
> When loading a treesitter grammar in GNU/Linux, the dlopen() call is used with the RTLD_GLOBAL flag set. If
> you load more than one treesitter grammar, and both grammars define the same functions, most probably in
> the scanner.c file, symbol resolution may use the wrong symbol.
> For example, the org and the yaml grammars both define a deserialize() function in their scanner.c files. This
> may result in a call from the org grammar to the deserialize() function defined by yaml. This fails because the yaml
> function does different things than the org grammar expects (it frees a dangling pointer, and therefore
> Emacs crashes).
>
> A solution can be:
> 1) use a special call to dlopen without the RTLD_GLOBAL flag, similar to what the eln loader does.
> 2) fix all the grammars and make all functions 'static' so that the functions are not visible outside the
> compilation unit.
> 3) something I didn't think about

If those 'serialize' functions are not needed to be called from
outside of the shared library, the usual way is not to export them,
i.e. to give all symbols except the few that need to be exported the
so-called "hidden visibility".
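
As a rough sketch of what "hidden visibility" can look like in a grammar's
scanner.c, assuming GCC or Clang: helpers are either 'static' or marked
hidden, and only the entry point is exported. The language name "example",
the struct, and the GRAMMAR_* macros are hypothetical; compiling with
-fvisibility=hidden would make hidden the default for unmarked symbols.

```c
#include <stddef.h>

#if defined (__GNUC__) || defined (__clang__)
# define GRAMMAR_HIDDEN __attribute__ ((visibility ("hidden")))
# define GRAMMAR_EXPORT __attribute__ ((visibility ("default")))
#else
# define GRAMMAR_HIDDEN
# define GRAMMAR_EXPORT
#endif

/* Hypothetical scanner state, for illustration only.  */
struct example_scanner
{
  unsigned depth;
};

/* Internal linkage: this name cannot clash with a deserialize()
   defined by another grammar.  */
static void
deserialize (struct example_scanner *s, const char *buffer, unsigned length)
{
  s->depth = length > 0 ? (unsigned char) buffer[0] : 0;
}

/* Usable from the library's other source files, but not exported
   from the shared object.  */
GRAMMAR_HIDDEN unsigned
example_scanner_depth (const struct example_scanner *s)
{
  return s->depth;
}

/* The only symbol here that needs to be visible to the loader.  */
GRAMMAR_EXPORT void
tree_sitter_example_external_scanner_deserialize (void *payload,
                                                  const char *buffer,
                                                  unsigned length)
{
  deserialize (payload, buffer, length);
}
```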




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#70342; Package emacs. (Thu, 11 Apr 2024 18:55:02 GMT)

Message #11 received at 70342 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: Michael Lausch <mick.lausch <at> gmail.com>
Cc: 70342 <at> debbugs.gnu.org
Subject: Re: bug#70342: 29.3.50; treesitter and RTLD_GLOBAL
Date: Thu, 11 Apr 2024 21:54:39 +0300
> From: Michael Lausch <mick.lausch <at> gmail.com>
> Date: Thu, 11 Apr 2024 20:47:50 +0200
> Cc: 70342 <at> debbugs.gnu.org
> 
>  > A solution can be:
>  > 1) use a special call to dlopen without the RTLD_GLOBAL flag, similar to what the eln loader does.
>  > 2) fix all the grammars and make all functions 'static' so that the functions are not visible outside the
>  > compilation unit. 
>  > 3) something i didn't think about
> 
>  If those 'serialize' functions are not needed to be called from
>  outside of the shared library, the usual way is not to export them,
>  i.e. to give all symbols except the few that need to be exported the
>  so-called "hidden visibility".
> 
> I agree that this would be the cleanest way to solve the problem, but that would mean patching all the existing
> grammars, and maybe all the future ones, and pushing the changes to their maintainers.
>
> I started to prepare patches for the yaml and org grammars (those were the ones which triggered the bug for me)
> and I'm going to have them merged upstream.

I understand, but why is this an Emacs problem?  We use RTLD_GLOBAL
for a reason, and the problem of not exposing unnecessary symbols
should be solved by the respective libraries and those who build them.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#70342; Package emacs. (Fri, 12 Apr 2024 04:11:04 GMT)

Message #14 received at 70342 <at> debbugs.gnu.org:

From: Michael Lausch <mick.lausch <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 70342 <at> debbugs.gnu.org
Subject: Re: bug#70342: 29.3.50; treesitter and RTLD_GLOBAL
Date: Thu, 11 Apr 2024 20:47:50 +0200
On Thu, Apr 11, 2024 at 8:39 PM Eli Zaretskii <eliz <at> gnu.org> wrote:

> > From: Michael Lausch <mick.lausch <at> gmail.com>
> > Date: Thu, 11 Apr 2024 19:38:52 +0200
> >
> > When loading a treesitter grammar in GNU/Linux, the dlopen() call is
> > used with the RTLD_GLOBAL flag set. If you load more than one treesitter
> > grammar, and both grammars define the same functions, most probably in
> > the scanner.c file, symbol resolution may use the wrong symbol.
> > For example, the org and the yaml grammars both define a deserialize()
> > function in their scanner.c files. This may result in a call from the org
> > grammar to the deserialize() function defined by yaml. This fails because
> > the yaml function does different things than the org grammar expects (it
> > frees a dangling pointer, and therefore Emacs crashes).
> >
> > A solution can be:
> > 1) use a special call to dlopen without the RTLD_GLOBAL flag, similar to
> > what the eln loader does.
> > 2) fix all the grammars and make all functions 'static' so that the
> > functions are not visible outside the compilation unit.
> > 3) something I didn't think about
>
> If those 'serialize' functions are not needed to be called from
> outside of the shared library, the usual way is not to export them,
> i.e. to give all symbols except the few that need to be exported the
> so-called "hidden visibility".
>

I agree that this would be the cleanest way to solve the problem, but that
would mean patching all the existing grammars, and maybe all the future
ones, and pushing the changes to their maintainers.

I started to prepare patches for the yaml and org grammars (those were the ones
which triggered the bug for me) and I'm going to have them merged upstream.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#70342; Package emacs. (Fri, 12 Apr 2024 04:11:04 GMT)

Message #17 received at 70342 <at> debbugs.gnu.org:

From: Michael Lausch <mick.lausch <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 70342 <at> debbugs.gnu.org
Subject: Re: bug#70342: 29.3.50; treesitter and RTLD_GLOBAL
Date: Thu, 11 Apr 2024 21:04:55 +0200
On Thu, Apr 11, 2024 at 8:54 PM Eli Zaretskii <eliz <at> gnu.org> wrote:

> > From: Michael Lausch <mick.lausch <at> gmail.com>
> > Date: Thu, 11 Apr 2024 20:47:50 +0200
> > Cc: 70342 <at> debbugs.gnu.org
> >
> >  > A solution can be:
> >  > 1) use a special call to dlopen without the RTLD_GLOBAL flag, similar
> >  > to what the eln loader does.
> >  > 2) fix all the grammars and make all functions 'static' so that the
> >  > functions are not visible outside the compilation unit.
> >  > 3) something I didn't think about
> >
> >  If those 'serialize' functions are not needed to be called from
> >  outside of the shared library, the usual way is not to export them,
> >  i.e. to give all symbols except the few that need to be exported the
> >  so-called "hidden visibility".
> >
> > I agree that this would be the cleanest way to solve the problem, but
> > that would mean patching all the existing grammars, and maybe all the
> > future ones, and pushing the changes to their maintainers.
> >
> > I started to prepare patches for the yaml and org grammars (those were the
> > ones which triggered the bug for me) and I'm going to have them merged
> > upstream.
>
> I understand, but why is this an Emacs problem?  We use RTLD_GLOBAL
> for a reason, and the problem of not exposing unnecessary symbols
> should be solved by the respective libraries and those who build them.
>

You are completely right. The thing is that it may take a long time to fix
all the grammars, and in the meantime, whenever someone loads two buggy
grammars in the same Emacs process, it will crash Emacs. And that causes
more bug reports against Emacs, even if it isn't an Emacs problem.

The addition of yet another dlopen() function may mitigate this, but I
think that would lead to the grammars not being fixed, because everything
would then appear to work.

Therefore I created a bug report instead of submitting a patch.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#70342; Package emacs. (Sat, 01 Mar 2025 02:50:02 GMT)

Message #20 received at 70342 <at> debbugs.gnu.org:

From: Stefan Kangas <stefankangas <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: Yuan Fu <casouri <at> gmail.com>, Michael Lausch <mick.lausch <at> gmail.com>,
 70342 <at> debbugs.gnu.org
Subject: Re: bug#70342: 29.3.50; treesitter and RTLD_GLOBAL
Date: Fri, 28 Feb 2025 18:49:50 -0800
Eli Zaretskii <eliz <at> gnu.org> writes:

>> From: Michael Lausch <mick.lausch <at> gmail.com>
>> Date: Thu, 11 Apr 2024 20:47:50 +0200
>> Cc: 70342 <at> debbugs.gnu.org
>>
>>  > A solution can be:
>>  > 1) use a special call to dlopen without the RTLD_GLOBAL flag, similar to what the eln loader does.
>>  > 2) fix all the grammars and make all functions 'static' so that the functions are not visible outside the
>>  > compilation unit.
>>  > 3) something i didn't think about
>>
>>  If those 'serialize' functions are not needed to be called from
>>  outside of the shared library, the usual way is not to export them,
>>  i.e. to give all symbols except the few that need to be exported the
>>  so-called "hidden visibility".
>>
>> I agree that this would be the cleanest way to solve the problem, but that would mean patching all the existing
>> grammars, and maybe all the future ones, and pushing the changes to their maintainers.
>>
>> I started to prepare patches for the yaml and org grammars (those were the ones which triggered the bug for me)
>> and I'm going to have them merged upstream.
>
> I understand, but why is this an Emacs problem?  We use RTLD_GLOBAL
> for a reason, and the problem of not exposing unnecessary symbols
> should be solved by the respective libraries and those who build them.

I'd tend to agree.  Should this therefore be closed?

I'm copying in Yuan also.




Added tag(s) moreinfo and notabug. Request was from Stefan Kangas <stefankangas <at> gmail.com> to control <at> debbugs.gnu.org. (Sat, 01 Mar 2025 02:51:02 GMT)

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#70342; Package emacs. (Sat, 01 Mar 2025 09:25:02 GMT)

Message #25 received at 70342 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: Stefan Kangas <stefankangas <at> gmail.com>
Cc: casouri <at> gmail.com, mick.lausch <at> gmail.com, 70342 <at> debbugs.gnu.org
Subject: Re: bug#70342: 29.3.50; treesitter and RTLD_GLOBAL
Date: Sat, 01 Mar 2025 11:24:45 +0200
> From: Stefan Kangas <stefankangas <at> gmail.com>
> Date: Fri, 28 Feb 2025 18:49:50 -0800
> Cc: Michael Lausch <mick.lausch <at> gmail.com>, 70342 <at> debbugs.gnu.org, 
> 	Yuan Fu <casouri <at> gmail.com>
> 
> Eli Zaretskii <eliz <at> gnu.org> writes:
> 
> >> From: Michael Lausch <mick.lausch <at> gmail.com>
> >> Date: Thu, 11 Apr 2024 20:47:50 +0200
> >> Cc: 70342 <at> debbugs.gnu.org
> >>
> >>  > A solution can be:
> >>  > 1) use a special call to dlopen without the RTLD_GLOBAL flag, similar to what the eln loader does.
> >>  > 2) fix all the grammars and make all functions 'static' so that the functions are not visible outside the
> >>  > compilation unit.
> >>  > 3) something i didn't think about
> >>
> >>  If those 'serialize' functions are not needed to be called from
> >>  outside of the shared library, the usual way is not to export them,
> >>  i.e. to give all symbols except the few that need to be exported the
> >>  so-called "hidden visibility".
> >>
> >> I agree that this would be the cleanest way to solve the problem, but that would mean patching all the existing
> >> grammars, and maybe all the future ones, and pushing the changes to their maintainers.
> >>
> >> I started to prepare patches for the yaml and org grammars (those were the ones which triggered the bug for me)
> >> and I'm going to have them merged upstream.
> >
> > I understand, but why is this an Emacs problem?  We use RTLD_GLOBAL
> > for a reason, and the problem of not exposing unnecessary symbols
> > should be solved by the respective libraries and those who build them.
> 
> I'd tend to agree.  Should this therefore be closed?

I think so, yes.

> I'm copying in Yuan also.





Reply sent to Stefan Kangas <stefankangas <at> gmail.com>:
You have taken responsibility. (Sat, 01 Mar 2025 23:52:02 GMT)

Notification sent to Michael Lausch <mick.lausch <at> gmail.com>:
bug acknowledged by developer. (Sat, 01 Mar 2025 23:52:02 GMT)

Message #30 received at 70342-done <at> debbugs.gnu.org:

From: Stefan Kangas <stefankangas <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: casouri <at> gmail.com, mick.lausch <at> gmail.com, 70342-done <at> debbugs.gnu.org
Subject: Re: bug#70342: 29.3.50; treesitter and RTLD_GLOBAL
Date: Sat, 1 Mar 2025 15:51:17 -0800
Eli Zaretskii <eliz <at> gnu.org> writes:

>> From: Stefan Kangas <stefankangas <at> gmail.com>
>> Date: Fri, 28 Feb 2025 18:49:50 -0800
>> Cc: Michael Lausch <mick.lausch <at> gmail.com>, 70342 <at> debbugs.gnu.org,
>> 	Yuan Fu <casouri <at> gmail.com>
>>
>> I'd tend to agree.  Should this therefore be closed?
>
> I think so, yes.

Thanks, done.  Please reopen if that's wrong for some reason.




Bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sun, 30 Mar 2025 11:24:25 GMT)

This bug report was last modified 82 days ago.



GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.