GNU bug report logs -
#17168
24.3.50; Segfault at mark_object
On 04/06/2014 09:59 AM, Eli Zaretskii wrote:
>> Date: Sun, 06 Apr 2014 09:37:23 -0700
>> From: Daniel Colascione <dancol <at> dancol.org>
>> CC: monnier <at> IRO.UMontreal.CA, dmantipov <at> yandex.ru, 17168 <at> debbugs.gnu.org
>>
>>> Because Richard has been using that machine for years, and I very much
>>> doubt that he changed his usage patterns lately.
>>
>> Richard's not the only one who has seen this crash. Drew's also reported
>> GC crashes in odd, and different, places.
>
> Which seem unrelated, and started much later than Richard reported
> his.
With a bug like this, unpredictable, usage-pattern-dependent behavior is
expected.
>>>>>>> In http://debbugs.gnu.org/cgi/bugreport.cgi?bug=15583#23, Richard
>>>>>>> provided the last good revno (113938) and the first bad one (114268);
>>>>>>> I looked at that range of revisions, and 114156 looks relevant. How
>>>>>>> about if we revert it and see if the problems go away?
>>>>>>
>>>>>> The bug would still be there, and we'd have no way to tell whether your
>>>>>> proposed change actually reduced its occurrence to a tolerable level.
>>>>>> Why would you want to do that instead of just fixing the bug?
>>>>>
>>>>> Because it's simpler,
>>>>
>>>> It's easy to make code that's simple and wrong.
>>>
>>> I didn't suggest any new code.
>>
>> No: you're just suggesting leaving incorrect code in Emacs.
>
> It's not incorrect, AFAIU. It might be less optimal.
The current code isn't just sub-optimal. It's wrong. If you get unlucky
and try to mark a dead symbol, you will crash.
>>>>> and because it just might be that the bug was
>>>>> caused by that other changeset.
>>>>
>>>> How might that changeset in particular have caused the problem reports?
>>>
>>> It is related to calling a function, and is in the same function from
>>> which all the recent crashes started.
>>
>> You haven't identified a causal mechanism. Any recent change could have
>> caused enough of a shift in code generation or stack layout to cause
>> this problem, and because it manifests so seldom, it'd be hard to verify
>> that reverting any particular change "fixed" the problem.
>
> I thought you had a test case. If not, how did you verify that your
> suggested changes do fix the problem?
There is a test. Your proposed change does not cause the test to pass.
Even if it did, I would argue against replacing a real fix with your
change.
>> Also, eval_sub does *everything*. It's no surprise that we saw the
>> crashes there. That's like saying "all crashes are associated with main,
>> this change affects main, and therefore this change is responsible."
>
> The change is related to calling a function whose symbol has certain
> properties. That sounds related to me, not just a random change
> somewhere in eval_sub.
It's a dangling pointer. Slightly changing the way we chase that
dangling pointer won't change the overall result.
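
To make the "mark a dead symbol" failure mode concrete, here is a minimal
toy model in C. It is only a sketch and not Emacs code: `toy_symbol`,
`toy_alloc`, `toy_free`, `toy_mark`, and the fixed-size pool are all
hypothetical names invented for illustration. The sketch assumes that a
reference to a symbol survives somewhere (for example in a C stack slot)
after the collector has reclaimed that symbol. The toy marker checks
liveness and reports the problem; a marker without such a check would go on
to follow whatever garbage the dead slot contains.

```c
#include <assert.h>
#include <stdbool.h>

#define POOL_SIZE 4

/* Toy stand-in for a Lisp symbol; NOT the real Emacs structure. */
struct toy_symbol {
    bool live;      /* false once the slot has been reclaimed */
    bool marked;    /* set by the mark phase */
    int  function;  /* index of another symbol to trace, or -1 for none */
};

static struct toy_symbol pool[POOL_SIZE];

/* Return every slot to the "free" state. */
void toy_reset(void) {
    for (int i = 0; i < POOL_SIZE; i++) {
        pool[i].live = false;
        pool[i].marked = false;
        pool[i].function = -1;
    }
}

/* Allocate the first free slot; returns its index, or -1 if the pool is full. */
int toy_alloc(int function) {
    for (int i = 0; i < POOL_SIZE; i++) {
        if (!pool[i].live) {
            pool[i].live = true;
            pool[i].marked = false;
            pool[i].function = function;
            return i;
        }
    }
    return -1;
}

/* Reclaim a slot. The outgoing reference is overwritten with garbage to
   imitate freed memory whose contents are no longer meaningful. */
void toy_free(int i) {
    assert(i >= 0 && i < POOL_SIZE);
    pool[i].live = false;
    pool[i].marked = false;
    pool[i].function = 9999;
}

bool toy_is_marked(int i) {
    assert(i >= 0 && i < POOL_SIZE);
    return pool[i].marked;
}

/* Mark phase: follow the chain of `function` references starting at i.
   Returns false if the chain reaches an invalid index or a dead slot,
   which is the point where an unchecked marker would misbehave. */
bool toy_mark(int i) {
    while (i != -1) {
        if (i < 0 || i >= POOL_SIZE || !pool[i].live)
            return false;           /* dangling reference detected */
        if (pool[i].marked)
            return true;            /* already traced */
        pool[i].marked = true;
        i = pool[i].function;
    }
    return true;
}

/* Scenario: a root still names a symbol after that symbol was reclaimed. */
bool stale_root_is_detected(void) {
    toy_reset();
    int callee = toy_alloc(-1);
    int caller = toy_alloc(callee);
    int stale_root = caller;        /* copy that outlives the symbol */
    toy_free(caller);               /* collector believed it unreachable */
    return !toy_mark(stale_root);   /* marking through the stale root fails */
}
```

In this model the outcome does not depend on which code path produced the
stale reference: once a root names a reclaimed slot, any marker that visits
it is working with invalid data.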
This bug report was last modified 11 years and 47 days ago.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.