GNU bug report logs -
#18438
24.4.50; assertion failed in bidi.c
Reported by: aidalgol <at> amuri.net
Date: Tue, 9 Sep 2014 21:52:01 UTC
Severity: normal
Tags: moreinfo
Merged with 17817
Found in versions 24.3.91, 24.4.50
Done: Lars Ingebrigtsen <larsi <at> gnus.org>
Bug is archived. No further changes may be made.
To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 18438 in the body.
You can then email your comments to 18438 AT debbugs.gnu.org in the normal way.
Report forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Tue, 09 Sep 2014 21:52:01 GMT)
Acknowledgement sent to aidalgol <at> amuri.net: New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Tue, 09 Sep 2014 21:52:02 GMT)
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
With no apparent pattern, I am getting an "assertion failed" (which, of
course, terminates Emacs):
src/bidi.c:329: Emacs fatal error: assertion failed: UNKNOWN_BT <= type && type <= NEUTRAL_ON
I am running from git master, commit 567e68d.
In GNU Emacs 24.4.50.1 (x86_64-unknown-cygwin)
of 2014-09-09 on AGAULAND-NZCD
Windowing system distributor `Microsoft Corp.', version 6.1.7601
Configured using:
`configure --enable-checking --with-w32 'CFLAGS=-O0 -ggdb''
Configured features:
XPM JPEG TIFF GIF PNG IMAGEMAGICK DBUS NOTIFY ACL GNUTLS LIBXML2 ZLIB
Important settings:
value of $LANG: en_US.UTF-8
locale-coding-system: utf-8-unix
Major mode: Dired by name
Minor modes in effect:
erc-track-mode: t
erc-track-minor-mode: t
erc-spelling-mode: t
erc-services-mode: t
erc-ring-mode: t
erc-networks-mode: t
erc-netsplit-mode: t
erc-menu-mode: t
erc-match-mode: t
erc-log-mode: t
erc-list-mode: t
erc-pcomplete-mode: t
erc-button-mode: t
erc-stamp-mode: t
erc-autojoin-mode: t
erc-autoaway-mode: t
erc-irccontrols-mode: t
erc-noncommands-mode: t
erc-move-to-prompt-mode: t
erc-readonly-mode: t
ido-everywhere: t
savehist-mode: t
display-time-mode: t
display-battery-mode: t
desktop-save-mode: t
tooltip-mode: t
electric-indent-mode: t
mouse-wheel-mode: t
file-name-shadow-mode: t
global-font-lock-mode: t
font-lock-mode: t
blink-cursor-mode: t
auto-composition-mode: t
auto-encryption-mode: t
auto-compression-mode: t
buffer-read-only: t
size-indication-mode: t
column-number-mode: t
line-number-mode: t
Recent input:
y <switch-frame> M-x e r <return> M-p <return> M-p
M-p <return> <return> <return> <lwindow> C-x b f r
e <return> C-x 3 C-x b # C-x b # <return> C-x 0 C-x
3 C-x b <return> <switch-frame> <S-lwindow> M-x C-s
<return>
Recent messages:
Desktop: File "/home/agauland/message-20140910-092339" no longer exists.
Starting new Ispell process aspell with british dictionary...
Wrote /home/agauland/.emacs.d/emacs.desktop.lock
Desktop: 2 frames, 41 buffers restored, 1 failed to restore.
For information about GNU Emacs and the GNU system, type C-h C-a.
Connecting to localhost:7000... ...done
Logging in as 'aidalgol'...
Logging in without password
Logging in as 'aidalgol'... done
Quit
Load-path shadows:
None found.
Features:
(shadow sort mail-extr emacsbug sendmail gnutls network-stream starttls
tls erc-track erc-spelling erc-services erc-ring erc-networks
erc-netsplit erc-menu erc-match erc-log erc-list erc-pcomplete
pcomplete
erc-button erc-fill erc-stamp wid-edit erc-join erc-autoaway
erc-goodies
erc erc-backend erc-compat auth-source eieio byte-opt bytecomp
byte-compile cconv eieio-core gnus-util password-cache thingatpt pp
smex
flyspell ispell rst compile vc-git conf-mode view cc-langs cc-mode
cc-fonts cc-guess cc-menus cc-cmds jka-compr paredit vc-dispatcher
vc-svn hideshow undo-tree diff python json message dired format-spec
rfc822 mml mml-sec mm-decode mm-bodies mm-encode mail-parse rfc2231
rfc2047 rfc2045 ietf-drums mm-util mail-prsvr mailabbrev mail-utils
gmm-utils mailheader browse-kill-ring-autoloads
ido-ubiquitous-autoloads
info magit-autoloads git-rebase-mode-autoloads
git-commit-mode-autoloads
paredit-autoloads smex-autoloads undo-tree-autoloads package server
cc-styles cc-align cc-engine cc-vars cc-defs appt diary-lib
diary-loaddefs cal-menu easymenu calendar cal-loaddefs advice help-fns
ido header-file find-file gtags mu cl-macs edmacro kmacro cl gv comint
ansi-color ring saveplace paren savehist avoid time battery desktop
frameset cl-loaddefs cl-lib cus-start cus-load time-date tooltip
electric uniquify ediff-hook vc-hooks lisp-float-type mwheel
w32-common-fns disp-table w32-win w32-vars tool-bar dnd fontset image
regexp-opt fringe tabulated-list newcomment lisp-mode prog-mode
register
page menu-bar rfn-eshadow timer select scroll-bar mouse jit-lock
font-lock syntax facemenu font-core frame cham georgian utf-8-lang
misc-lang vietnamese tibetan thai tai-viet lao korean japanese hebrew
greek romanian slovak czech european ethiopic indian cyrillic chinese
case-table epa-hook jka-cmpr-hook help simple abbrev minibuffer nadvice
loaddefs button faces cus-face macroexp files text-properties overlay
sha1 md5 base64 format env code-pages mule custom widget
hashtable-print-readable backquote make-network-process dbusbind
gfilenotify w32 multi-tty emacs)
Memory information:
((conses 16 308313 29216)
(symbols 48 34000 0)
(miscs 40 174 713)
(strings 32 76196 4962)
(string-bytes 1 2139625)
(vectors 16 31311)
(vector-slots 8 629615 26163)
(floats 8 155 237)
(intervals 56 11960 548)
(buffers 976 58))
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Wed, 10 Sep 2014 00:17:02 GMT)
Message #8 received at 18438 <at> debbugs.gnu.org (full text, mbox):
> With no apparent pattern, I am getting an "assertion failed" (which, of
> course terminates Emacs):
> src/bidi.c:329: Emacs fatal error: assertion failed: UNKNOWN_BT <= type &&
> type <= NEUTRAL_ON
Can you give a recipe to reproduce the problem, or at least describe in
which circumstance this occurred?
Stefan
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Wed, 10 Sep 2014 13:15:01 GMT)
Message #11 received at 18438 <at> debbugs.gnu.org (full text, mbox):
On 9/9/2014 5:41 PM, aidalgol <at> amuri.net wrote:
> With no apparent pattern, I am getting an "assertion failed" (which, of
> course terminates Emacs):
> src/bidi.c:329: Emacs fatal error: assertion failed: UNKNOWN_BT <= type
> && type <= NEUTRAL_ON
>
> I am running from git master, commit 567e68d.
>
> In GNU Emacs 24.4.50.1 (x86_64-unknown-cygwin)
> of 2014-09-09 on AGAULAND-NZCD
> Windowing system distributor `Microsoft Corp.', version 6.1.7601
> Configured using:
> `configure --enable-checking --with-w32 'CFLAGS=-O0 -ggdb''
This looks like a duplicate of bug#17817, which unfortunately didn't go
anywhere because I lost the gdb session. I haven't seen it again since
then.
Ken
Merged 17817 18438. Request was from Glenn Morris <rgm <at> gnu.org> to control <at> debbugs.gnu.org. (Wed, 10 Sep 2014 15:51:02 GMT)
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Wed, 10 Sep 2014 18:59:02 GMT)
Message #16 received at 18438 <at> debbugs.gnu.org (full text, mbox):
There does not appear to be any pattern to this. Sometimes it happens
when I'm typing in an ERC buffer; sometimes it happens when I'm typing
in a python-mode buffer; sometimes it happens when I'm typing in the
minibuffer. I don't think it has happened when I was *not* sending
keystrokes to Emacs, but nothing else is apparently common between the
situations in which this happens.
On 10/09/14 12:16, Stefan Monnier wrote:
>> With no apparent pattern, I am getting an "assertion failed" (which, of
>> course terminates Emacs):
>> src/bidi.c:329: Emacs fatal error: assertion failed: UNKNOWN_BT <= type &&
>> type <= NEUTRAL_ON
>
> Can you give a recipe to reproduce the problem, or at least describe in
> which circumstance this occurred?
>
>
> Stefan
>
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Wed, 10 Sep 2014 19:09:02 GMT)
Message #19 received at 18438 <at> debbugs.gnu.org (full text, mbox):
> Date: Thu, 11 Sep 2014 06:55:49 +1200
> From: Aidan Gauland <aidalgol <at> amuri.net>
> Cc: 18438 <at> debbugs.gnu.org
>
> There does not appear to be any pattern to this. Sometimes it happens
> when I'm typing in an ERC buffer; sometimes it happens when I'm typing
> in a python-mode buffer; sometimes it happens when I'm typing in the
> minibuffer. I don't think it has happened when I was *not* sending
> keystrokes to Emacs, but nothing else is apparently common between the
> situations in which this happens.
Please run Emacs under GDB, and when this happens again, produce a
full backtrace and post it here. I'd like to see if the symptoms are
similar to what Ken reported in bug #17817.
Thanks.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Thu, 11 Sep 2014 04:32:02 GMT)
Message #22 received at 18438 <at> debbugs.gnu.org (full text, mbox):
On Wed, 10 Sep 2014 22:09:08 +0300, Eli Zaretskii wrote:
>
> Please run Emacs under GDB, and when this happens again, produce a
> full backtrace and post it here. I'd like to see if the symptoms are
> similar to what Ken reported in bug #17817.
I am unable to do this because Emacs hangs when run under GDB on
opening a symbolic link. Emacs hangs with a message in the minibuffer
indicating that it is loading a library that is on my load-path as a
symbolic link; if I replace this symlink with a copy of the file it
links to, Emacs hangs on loading the next library that is installed via
a symlink. If there are no symlinks on load-path, it hangs where it
would normally load my desktop file, which has files open via symlinks
(i.e. there is a symlink to a directory in their path). If I run `gdb
emacs`, gdb gives "Error creating process /usr/local/bin/emacs, (error
5)." when I type the `run` command, but it can run emacs if I specify
the name of the binary to which `emacs` is a symlink
(/usr/local/bin/emacs -> /usr/local/bin/emacs-24.4.50.exe). I am
running GDB 7.8 from the Cygwin package.
I'm surprised Ken did not run into this. Ken, which version of Cygwin
and GDB were you running at the time?
--Aidan
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Thu, 11 Sep 2014 13:19:02 GMT)
Message #25 received at 18438 <at> debbugs.gnu.org (full text, mbox):
On 9/11/2014 12:31 AM, aidalgol <at> amuri.net wrote:
> On Wed, 10 Sep 2014 22:09:08 +0300, Eli Zaretskii wrote:
>>
>> Please run Emacs under GDB, and when this happens again, produce a
>> full backtrace and post it here. I'd like to see if the symptoms are
>> similar to what Ken reported in bug #17817.
>
> I am unable to do this because Emacs hangs when run under GDB on opening
> a symbolic link. Emacs hangs with a message in the minibuffer
> indicating that it is loading a library that is on my load-path as a
> symbolic link; if I replace this symlink with a copy of the file it
> links to, Emacs hangs on loading the next library that is installed via
> a symlink. If there are no symlinks on load-path, it hangs where it
> would normally load my desktop file, which has files open via symlinks
> (i.e. there is a symlink to a directory in their path). If I run `gdb
> emacs`, gdb gives "Error creating process /usr/local/bin/emacs, (error
> 5)." when I type the `run` command, but it can run emacs if I specify
> the name of the binary to which `emacs` is a symlink
> (/usr/local/bin/emacs -> /usr/local/bin/emacs-24.4.50.exe). I am
> running GDB 7.8 from the Cygwin package.
>
> I'm surprised Ken did not run into this. Ken, which version of Cygwin
> and GDB were you running at the time?
I was running whatever was current at the time. But there is a bug in
GDB 7.8 that prevents you from using it to debug a GUI version of emacs.
See https://sourceware.org/bugzilla/show_bug.cgi?id=17247#c34. Until
this is fixed, you need to downgrade to the previous release of GDB.
Ken
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Fri, 12 Sep 2014 01:52:01 GMT)
Message #28 received at 18438 <at> debbugs.gnu.org (full text, mbox):
On Thu, 11 Sep 2014 09:18:14 -0400, Ken Brown wrote:
>
> I was running whatever was current at the time. But there is a bug
> in GDB 7.8 that prevents you from using it to debug a GUI version of
> emacs. See
> https://sourceware.org/bugzilla/show_bug.cgi?id=17247#c34.
> Until this is fixed, you need to downgrade to the previous release of
> GDB.
Ah, OK, thanks. I've downgraded to the previous Cygwin package and it
works fine now.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Fri, 12 Sep 2014 01:56:02 GMT)
Message #31 received at 18438 <at> debbugs.gnu.org (full text, mbox):
On Wed, 10 Sep 2014 22:09:08 +0300, Eli Zaretskii wrote:
>
> Please run Emacs under GDB, and when this happens again, produce a
> full backtrace and post it here. I'd like to see if the symptoms are
> similar to what Ken reported in bug #17817.
It finally happened again; backtrace attached (gzipped because it's
over 100K uncompressed).
[emacs-bidi.backtrace.gz (application/octet-stream, attachment)]
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Fri, 12 Sep 2014 06:01:01 GMT)
Message #34 received at 18438 <at> debbugs.gnu.org (full text, mbox):
> Date: Fri, 12 Sep 2014 13:55:23 +1200
> From: aidalgol <at> amuri.net
> Cc: <monnier <at> iro.umontreal.ca>, <18438 <at> debbugs.gnu.org>
>
> It finally happened again; backtrace attached
Thanks.
This is an entirely different place and condition:
#1 0x00000001005b9a67 in die (msg=0x100a51d98 <DEFAULT_REHASH_SIZE+8416> "!it->bidi_p || (EQ (it->bidi_it.string.lstring, Qnil) && it->bidi_it.string.s == NULL)", file=0x100a4fcc0 <DEFAULT_REHASH_SIZE+8> "xdisp.c", line=8222)
at alloc.c:7160
No locals.
#2 0x00000001004435aa in next_element_from_buffer (it=0x2269b0) at xdisp.c:8220
success_p = true
And it again is a bogus assertion violation: as you see in frame #5,
string = {
lstring = 4306669618,
s = 0x0,
(4306669618 is nil), the condition of the assertion does in fact hold.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Fri, 12 Sep 2014 07:47:02 GMT)
Message #37 received at 18438 <at> debbugs.gnu.org (full text, mbox):
> Date: Fri, 12 Sep 2014 09:00:32 +0300
> From: Eli Zaretskii <eliz <at> gnu.org>
> Cc: 18438 <at> debbugs.gnu.org
>
> > Date: Fri, 12 Sep 2014 13:55:23 +1200
> > From: aidalgol <at> amuri.net
> > Cc: <monnier <at> iro.umontreal.ca>, <18438 <at> debbugs.gnu.org>
> >
> > It finally happened again; backtrace attached
>
> Thanks.
>
> This is an entirely different place and condition:
>
> #1 0x00000001005b9a67 in die (msg=0x100a51d98 <DEFAULT_REHASH_SIZE+8416> "!it->bidi_p || (EQ (it->bidi_it.string.lstring, Qnil) && it->bidi_it.string.s == NULL)", file=0x100a4fcc0 <DEFAULT_REHASH_SIZE+8> "xdisp.c", line=8222)
> at alloc.c:7160
> No locals.
> #2 0x00000001004435aa in next_element_from_buffer (it=0x2269b0) at xdisp.c:8220
> success_p = true
>
> And it again is a bogus assertion violation: as you see in frame #5,
Sorry, that was frame #6, not 5.
>
> string = {
> lstring = 4306669618,
> s = 0x0,
>
> (4306669618 is nil), the condition of the assertion does in fact hold.
If you still have that session inside GDB, could you show the values
that are tested in the assertion in frame #2:
eassert (!it->bidi_p
|| (EQ (it->bidi_it.string.lstring, Qnil)
&& it->bidi_it.string.s == NULL));
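For readers following along, here is a minimal sketch of how that condition evaluates. It is an illustration only, not Emacs code: the struct names and the `lstring_is_nil` flag are hypothetical, heavily simplified stand-ins for the iterator fields named in the assertion.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the fields the assertion inspects.
   The real Emacs structures are far larger; `lstring_is_nil` replaces
   the EQ (..., Qnil) test on a Lisp object. */
struct sketch_string
{
  bool lstring_is_nil;
  const unsigned char *s;
};

struct sketch_it
{
  bool bidi_p;
  struct sketch_string string;
};

/* Same boolean shape as the eassert condition: either bidi reordering
   is off, or the iterator is not iterating over any string. */
static bool
assertion_holds (const struct sketch_it *it)
{
  return !it->bidi_p
         || (it->string.lstring_is_nil && it->string.s == NULL);
}
```

With the values quoted from the backtrace (lstring is nil, s is 0x0), `assertion_holds` returns true whether or not `bidi_p` is set, which is why the reported failure looks bogus.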
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Tue, 16 Sep 2014 01:05:01 GMT)
Message #40 received at 18438 <at> debbugs.gnu.org (full text, mbox):
On Fri, 12 Sep 2014 10:46:08 +0300, Eli Zaretskii wrote:
>
> If you still have that session inside GDB, could you show the values
> that are tested in the assertion in frame #2:
>
> eassert (!it->bidi_p
> || (EQ (it->bidi_it.string.lstring, Qnil)
> && it->bidi_it.string.s == NULL));
I closed the session, but I got another bidi.c assert, this time in a
different place. Is this one more like bug #17817?
[emacs-bidi-02.backtrace.gz (application/octet-stream, attachment)]
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Tue, 16 Sep 2014 02:48:02 GMT)
Message #43 received at 18438 <at> debbugs.gnu.org (full text, mbox):
> Date: Tue, 16 Sep 2014 13:04:12 +1200
> From: aidalgol <at> amuri.net
> Cc: Eli Zaretskii <eliz <at> gnu.org>
>
> I closed the session, but I got another bidi.c assert, this time in a
> different place. Is this one more like bug #17817?
Yes, it is. And like that one, it makes no sense: it clearly shows
that 'type' passed to bidi_check_type is STRONG_L, a valid value.
Here's the relevant part of the backtrace:
> #0 terminate_due_to_signal (sig=6, backtrace_limit=2147483647) at emacs.c:361
> No locals.
> #1 0x00000001005b9a67 in die (msg=0x100a5aad8 <DEFAULT_REHASH_SIZE+64> "UNKNOWN_BT <= type && type <= NEUTRAL_ON", file=0x100a5aad0 <DEFAULT_REHASH_SIZE+56> "bidi.c", line=329) at alloc.c:7160
> No locals.
> #2 0x00000001005010fe in bidi_check_type (type=STRONG_L) at bidi.c:329
> No locals.
> #3 0x0000000100506230 in bidi_level_of_next_char (bidi_it=0x223d08) at bidi.c:2430
> type = STRONG_L
> level = 0
> prev_level = 0
> next_for_neutral = {
> bytepos = 0,
> charpos = -1,
> type = UNKNOWN_BT,
> type_after_w1 = UNKNOWN_BT,
> orig_type = UNKNOWN_BT
> }
> next_char_pos = 1
The only reason I could think of for such assertion violations is some
asynchronously run code that doesn't properly restore registers.
Which is pretty far-fetched, but what else can explain this?
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Tue, 16 Sep 2014 03:00:03 GMT)
Message #46 received at 18438 <at> debbugs.gnu.org (full text, mbox):
On Tue, 16 Sep 2014 05:47:21 +0300, Eli Zaretskii wrote:
>
>> I closed the session, but I got another bidi.c assert, this time in a
>> different place. Is this one more like bug #17817?
>
> Yes, it is. And like that one, it makes no sense: it clearly shows
> that 'type' passed to bidi_check_type is STRONG_L, a valid value.
> Here's the relevant part of the backtrace:
>
>> #0 terminate_due_to_signal (sig=6, backtrace_limit=2147483647) at emacs.c:361
>> No locals.
>> #1 0x00000001005b9a67 in die (msg=0x100a5aad8 <DEFAULT_REHASH_SIZE+64> "UNKNOWN_BT <= type && type <= NEUTRAL_ON", file=0x100a5aad0 <DEFAULT_REHASH_SIZE+56> "bidi.c", line=329) at alloc.c:7160
>> No locals.
>> #2 0x00000001005010fe in bidi_check_type (type=STRONG_L) at bidi.c:329
>> No locals.
>> #3 0x0000000100506230 in bidi_level_of_next_char (bidi_it=0x223d08) at bidi.c:2430
>> type = STRONG_L
>> level = 0
>> prev_level = 0
>> next_for_neutral = {
>> bytepos = 0,
>> charpos = -1,
>> type = UNKNOWN_BT,
>> type_after_w1 = UNKNOWN_BT,
>> orig_type = UNKNOWN_BT
>> }
>> next_char_pos = 1
>
> The only reason I could think of for such assertion violations is some
> asynchronously run code that doesn't properly restore registers.
> Which is pretty far-fetched, but what else can explain this?
I still have this session open, by the way. Do you want anything more
out of it?
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Tue, 16 Sep 2014 14:34:01 GMT)
Message #49 received at 18438 <at> debbugs.gnu.org (full text, mbox):
> Date: Tue, 16 Sep 2014 14:59:32 +1200
> From: aidalgol <at> amuri.net
> Cc: Eli Zaretskii <eliz <at> gnu.org>
>
> > The only reason I could think of for such assertion violations is
> > some asynchronously run code that doesn't properly restore
> > registers. Which is pretty far-fetched, but what else can explain
> > this?
>
> I still have this session open, by the way. Do you want anything more
> out of it?
If you don't mind messing with assembler, it would be interesting to
disassemble bidi_check_type, see in which register it holds the value
when it tests it, and then look at the actual value in that register
in the bidi_check_type's call-stack frame.
TIA
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Tue, 16 Sep 2014 22:43:02 GMT)
Message #52 received at 18438 <at> debbugs.gnu.org (full text, mbox):
On Tue, 16 Sep 2014 17:33:09 +0300, Eli Zaretskii wrote:
> If you don't mind messing with assembler, it would be interesting to
> disassemble bidi_check_type, see in which register it holds the value
> when it tests it, and then look at the actual value in that register
> in the bidi_check_type's call-stack frame.
Sure, but I'm not very familiar with x86 assembly, so I'll just post
the entire disassemble output to start with and someone else will have
to identify the register of interest.
Dump of assembler code for function bidi_check_type:
0x00000001005010c3 <+0>: push %rbp
0x00000001005010c4 <+1>: mov %rsp,%rbp
0x00000001005010c7 <+4>: sub $0x20,%rsp
0x00000001005010cb <+8>: mov %ecx,0x10(%rbp)
0x00000001005010ce <+11>: mov 0x58ab9b(%rip),%rax # 0x100a8bc70 <.refptr.suppress_checking>
0x00000001005010d5 <+18>: movzbl (%rax),%eax
0x00000001005010d8 <+21>: xor $0x1,%eax
0x00000001005010db <+24>: test %al,%al
0x00000001005010dd <+26>: je 0x1005010ff <bidi_check_type+60>
0x00000001005010df <+28>: cmpl $0x17,0x10(%rbp)
0x00000001005010e3 <+32>: jbe 0x1005010ff <bidi_check_type+60>
0x00000001005010e5 <+34>: mov $0x149,%r8d
0x00000001005010eb <+40>: lea 0x5599de(%rip),%rdx # 0x100a5aad0 <DEFAULT_REHASH_SIZE+56>
0x00000001005010f2 <+47>: lea 0x5599df(%rip),%rcx # 0x100a5aad8 <DEFAULT_REHASH_SIZE+64>
0x00000001005010f9 <+54>: callq 0x1005b9a15 <die>
=> 0x00000001005010fe <+59>: nop
0x00000001005010ff <+60>: add $0x20,%rsp
0x0000000100501103 <+64>: pop %rbp
0x0000000100501104 <+65>: retq
End of assembler dump.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Wed, 17 Sep 2014 05:08:01 GMT)
Message #55 received at 18438 <at> debbugs.gnu.org (full text, mbox):
> Date: Wed, 17 Sep 2014 10:42:18 +1200
> From: aidalgol <at> amuri.net
> Cc: Eli Zaretskii <eliz <at> gnu.org>
>
> On Tue, 16 Sep 2014 17:33:09 +0300, Eli Zaretskii wrote:
> > If you don't mind messing with assembler, it would be interesting to
> > disassemble bidi_check_type, see in which register it holds the value
> > when it tests it, and then look at the actual value in that register
> > in the bidi_check_type's call-stack frame.
>
> Sure, but I'm not very familiar with x86 assembly, so I'll just post
> the entire disassemble output to start with and someone else will have
> to identify the register of interest.
>
> Dump of assembler code for function bidi_check_type:
> 0x00000001005010c3 <+0>: push %rbp
> 0x00000001005010c4 <+1>: mov %rsp,%rbp
> 0x00000001005010c7 <+4>: sub $0x20,%rsp
> 0x00000001005010cb <+8>: mov %ecx,0x10(%rbp)
> 0x00000001005010ce <+11>: mov 0x58ab9b(%rip),%rax # 0x100a8bc70 <.refptr.suppress_checking>
> 0x00000001005010d5 <+18>: movzbl (%rax),%eax
> 0x00000001005010d8 <+21>: xor $0x1,%eax
> 0x00000001005010db <+24>: test %al,%al
> 0x00000001005010dd <+26>: je 0x1005010ff <bidi_check_type+60>
> 0x00000001005010df <+28>: cmpl $0x17,0x10(%rbp)
> 0x00000001005010e3 <+32>: jbe 0x1005010ff <bidi_check_type+60>
> 0x00000001005010e5 <+34>: mov $0x149,%r8d
> 0x00000001005010eb <+40>: lea 0x5599de(%rip),%rdx # 0x100a5aad0 <DEFAULT_REHASH_SIZE+56>
> 0x00000001005010f2 <+47>: lea 0x5599df(%rip),%rcx # 0x100a5aad8 <DEFAULT_REHASH_SIZE+64>
> 0x00000001005010f9 <+54>: callq 0x1005b9a15 <die>
> => 0x00000001005010fe <+59>: nop
> 0x00000001005010ff <+60>: add $0x20,%rsp
> 0x0000000100501103 <+64>: pop %rbp
> 0x0000000100501104 <+65>: retq
> End of assembler dump.
My reading of this is:
. the value being tested is originally in ECX
. it is stored in a temporary local variable at RBP+0x10
. then it is compared with 0x17 (decimal 23)
So you have two places to check: the ECX register and the value
pointed to by RBP+0x10.
Thanks.
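The reading above can be illustrated with a small sketch of how a two-sided range check collapses into the single `cmpl $0x17` / `jbe` pair seen in the disassembly. This is an illustration under stated assumptions, not the code from bidi.c: the constant names are hypothetical, and the bounds 0 and 0x17 are inferred from the disassembly rather than copied from the Emacs headers.

```c
#include <stdbool.h>

/* Hypothetical stand-ins: the lower bound 0 and upper bound 0x17 (23)
   are inferred from the disassembly, not taken from the Emacs sources. */
enum { SKETCH_UNKNOWN_BT = 0, SKETCH_NEUTRAL_ON = 0x17 };

/* The range check as it is spelled in the assertion text. */
static bool
in_range_two_sided (int type)
{
  return SKETCH_UNKNOWN_BT <= type && type <= SKETCH_NEUTRAL_ON;
}

/* The same check as one unsigned comparison, which is what "jbe"
   (jump if below or equal, an unsigned condition) implements.
   A negative value converts to a huge unsigned number and so fails. */
static bool
in_range_unsigned (int type)
{
  return (unsigned int) type <= (unsigned int) SKETCH_NEUTRAL_ON;
}
```

Under these assumptions the two functions agree for every input, so any value from 0 through 23 stored at RBP+0x10 should take the `jbe` branch and skip the call to `die`.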
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Thu, 18 Sep 2014 04:56:01 GMT)
Message #58 received at 18438 <at> debbugs.gnu.org (full text, mbox):
On Wed, 17 Sep 2014 08:07:34 +0300, Eli Zaretskii wrote:
>
>> Dump of assembler code for function bidi_check_type:
>> 0x00000001005010c3 <+0>: push %rbp
>> 0x00000001005010c4 <+1>: mov %rsp,%rbp
>> 0x00000001005010c7 <+4>: sub $0x20,%rsp
>> 0x00000001005010cb <+8>: mov %ecx,0x10(%rbp)
>> 0x00000001005010ce <+11>: mov 0x58ab9b(%rip),%rax # 0x100a8bc70 <.refptr.suppress_checking>
>> 0x00000001005010d5 <+18>: movzbl (%rax),%eax
>> 0x00000001005010d8 <+21>: xor $0x1,%eax
>> 0x00000001005010db <+24>: test %al,%al
>> 0x00000001005010dd <+26>: je 0x1005010ff <bidi_check_type+60>
>> 0x00000001005010df <+28>: cmpl $0x17,0x10(%rbp)
>> 0x00000001005010e3 <+32>: jbe 0x1005010ff <bidi_check_type+60>
>> 0x00000001005010e5 <+34>: mov $0x149,%r8d
>> 0x00000001005010eb <+40>: lea 0x5599de(%rip),%rdx # 0x100a5aad0 <DEFAULT_REHASH_SIZE+56>
>> 0x00000001005010f2 <+47>: lea 0x5599df(%rip),%rcx # 0x100a5aad8 <DEFAULT_REHASH_SIZE+64>
>> 0x00000001005010f9 <+54>: callq 0x1005b9a15 <die>
>> => 0x00000001005010fe <+59>: nop
>> 0x00000001005010ff <+60>: add $0x20,%rsp
>> 0x0000000100501103 <+64>: pop %rbp
>> 0x0000000100501104 <+65>: retq
>> End of assembler dump.
>
> My reading of this is:
>
> . the value being tested is originally in ECX
> . it is stored in a temporary local variable at RBP+0x10
> . then it is compared with 0x17 (decimal 23)
>
> So you have two places to check: the ECX register and the value
> pointed to by RBP+0x10.
OK, I've got the first one, but what's the syntax for "value at address
X+Y"?
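As a rough C analogue of "value at address X+Y", the sketch below reads a 32-bit value stored a given number of bytes past a base address. The helper name `read_i32_at` is hypothetical and exists only for illustration. The GDB commands in the comment are standard ways of doing the equivalent from the debugger prompt.

```c
#include <stdint.h>
#include <string.h>

/* Read the 32-bit value stored `offset` bytes past `base`.
   Rough GDB equivalents for the case discussed in this thread:
     x/wx $rbp+0x10                  (examine one word, in hex)
     print *(int *) ($rbp + 0x10)    (cast the address, then dereference)
   memcpy is used here so the read is valid C regardless of alignment. */
static int32_t
read_i32_at (const void *base, size_t offset)
{
  int32_t value;
  memcpy (&value, (const unsigned char *) base + offset, sizeof value);
  return value;
}
```

For example, if a 32-byte buffer holds the value 1 at byte offset 0x10, `read_i32_at (buffer, 0x10)` returns 1, just as the debugger would show the stored copy of the argument at RBP+0x10.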
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Thu, 18 Sep 2014 05:00:02 GMT)
Message #61 received at 18438 <at> debbugs.gnu.org (full text, mbox):
On Wed, 17 Sep 2014 08:07:34 +0300, Eli Zaretskii wrote:
>
>> Dump of assembler code for function bidi_check_type:
>> 0x00000001005010c3 <+0>: push %rbp
>> 0x00000001005010c4 <+1>: mov %rsp,%rbp
>> 0x00000001005010c7 <+4>: sub $0x20,%rsp
>> 0x00000001005010cb <+8>: mov %ecx,0x10(%rbp)
>> 0x00000001005010ce <+11>: mov 0x58ab9b(%rip),%rax # 0x100a8bc70 <.refptr.suppress_checking>
>> 0x00000001005010d5 <+18>: movzbl (%rax),%eax
>> 0x00000001005010d8 <+21>: xor $0x1,%eax
>> 0x00000001005010db <+24>: test %al,%al
>> 0x00000001005010dd <+26>: je 0x1005010ff <bidi_check_type+60>
>> 0x00000001005010df <+28>: cmpl $0x17,0x10(%rbp)
>> 0x00000001005010e3 <+32>: jbe 0x1005010ff <bidi_check_type+60>
>> 0x00000001005010e5 <+34>: mov $0x149,%r8d
>> 0x00000001005010eb <+40>: lea 0x5599de(%rip),%rdx # 0x100a5aad0 <DEFAULT_REHASH_SIZE+56>
>> 0x00000001005010f2 <+47>: lea 0x5599df(%rip),%rcx # 0x100a5aad8 <DEFAULT_REHASH_SIZE+64>
>> 0x00000001005010f9 <+54>: callq 0x1005b9a15 <die>
>> => 0x00000001005010fe <+59>: nop
>> 0x00000001005010ff <+60>: add $0x20,%rsp
>> 0x0000000100501103 <+64>: pop %rbp
>> 0x0000000100501104 <+65>: retq
>> End of assembler dump.
>
> My reading of this is:
>
> . the value being tested is originally in ECX
> . it is stored in a temporary local variable at RBP+0x10
> . then it is compared with 0x17 (decimal 23)
>
> So you have two places to check: the ECX register and the value
> pointed to by RBP+0x10.
Here we are:
(gdb) print $ecx
$3 = 6
(gdb) x $rbp+0x10
0x222e10: 0x00000001
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Thu, 18 Sep 2014 14:43:02 GMT)
Message #64 received at 18438 <at> debbugs.gnu.org (full text, mbox):
> Date: Thu, 18 Sep 2014 16:59:34 +1200
> From: aidalgol <at> amuri.net
> Cc: Eli Zaretskii <eliz <at> gnu.org>
>
> > My reading of this is:
> >
> > . the value being tested is originally in ECX
> > . it is stored in a temporary local variable at RBP+0x10
> > . then it is compared with 0x17 (decimal 23)
> >
> > So you have two places to check: the ECX register and the value
> > pointed to by RBP+0x10.
>
> Here we are:
>
> (gdb) print $ecx
> $3 = 6
> (gdb) x $rbp+0x10
> 0x222e10: 0x00000001
Thanks. Both values are valid, although the second one is probably
the accurate one.
So again, this is a riddle for which I have no clues. Perhaps the
strange backtraces reported in bug #17753, and the discussion Ken
started on the Cygwin list about that, will bring some insight (e.g.,
is it possible that this code also runs in some other thread?).
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Sun, 21 Sep 2014 22:31:02 GMT)
Message #67 received at 18438 <at> debbugs.gnu.org (full text, mbox):
On 9/18/2014 10:42 AM, Eli Zaretskii wrote:
>> Date: Thu, 18 Sep 2014 16:59:34 +1200
>> From: aidalgol <at> amuri.net
>> Cc: Eli Zaretskii <eliz <at> gnu.org>
>>
>>> My reading of this is:
>>>
>>> . the value being tested is originally in ECX
>>> . it is stored in a temporary local variable at RBP+0x10
>>> . then it is compared with 0x17 (decimal 23)
>>>
>>> So you have two places to check: the ECX register and the value
>>> pointed to by RBP+0x10.
>>
>> Here we are:
>>
>> (gdb) print $ecx
>> $3 = 6
>> (gdb) x $rbp+0x10
>> 0x222e10: 0x00000001
>
> Thanks. Both values are valid, although the second one is probably
> the accurate one.
>
> So again, this is a riddle for which I have no clues. Perhaps the
> strange backtraces reported in bug #17753, and the discussion Ken
> started on the Cygwin list about that, will bring some insight (e.g.,
> is it possible that this code also runs in some other thread?).
The other possibility is that the strange backtraces are due to a bug in
gdb that has since been fixed. PR 16155
(https://sourceware.org/bugzilla/show_bug.cgi?id=16155) seems like a
possible candidate for such a bug, but I haven't yet tried to verify
this. OP, could you update to Cygwin's gdb-7.8-2 and see if your
backtraces start to make more sense? The problem with gdb-7.8-1 that I
mentioned earlier has been fixed.
Ken
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Wed, 24 Sep 2014 05:08:02 GMT)
Message #70 received at 18438 <at> debbugs.gnu.org (full text, mbox):
On Sun, 21 Sep 2014 18:30:09 -0400, Ken Brown wrote:
> On 9/18/2014 10:42 AM, Eli Zaretskii wrote:
>> So again, this is a riddle for which I have no clues. Perhaps the
>> strange backtraces reported in bug #17753, and the discussion Ken
>> started on the Cygwin list about that, will bring some insight (e.g.,
>> is it possible that this code also runs in some other thread?).
>
> The other possibility is that the strange backtraces are due to a bug
> in gdb that has since been fixed. PR 16155
> (https://sourceware.org/bugzilla/show_bug.cgi?id=16155) seems like a
> possible candidate for such a bug, but I haven't yet tried to verify
> this. OP, could you update to Cygwin's gdb-7.8-2 and see if your
> backtraces start to make more sense? The problem with gdb-7.8-1 that
> I mentioned earlier has been fixed.
Since your post, I have been running emacs under gdb-7.8-2 as you said,
and regularly pulling from git master (and rebuilding, of course).
Today I pulled 270b6e3 and rebuilt, and it hangs under gdb before even
drawing the frame. It runs fine outside gdb, so I did a clean build and
tried again, but the same happened.
$ gdb -x .gdbinit ./emacs.exe
GNU gdb (GDB) 7.8
Copyright (C) 2014 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
<http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-unknown-cygwin".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./emacs.exe...done.
warning: File "/home/agauland/src/emacs/src/.gdbinit" auto-loading has
been declined by your `auto-load safe-path' set to
"$debugdir:$datadir/auto-load".
To enable execution of this file add
add-auto-load-safe-path /home/agauland/src/emacs/src/.gdbinit
line to your configuration file "/home/agauland/.gdbinit".
To completely disable this security protection add
set auto-load safe-path /
line to your configuration file "/home/agauland/.gdbinit".
For more information about this security protection see the
"Auto-loading safe path" section in the GDB manual. E.g., run from the
shell:
info "(gdb)Auto-loading safe path"
SIGINT is used by the debugger.
Are you sure you want to change it? (y or n) [answered Y; input not
from terminal]
Environment variable "DISPLAY" not defined.
TERM = xterm
Breakpoint 1 at 0x100531bd8: file emacs.c, line 361.
Temporary breakpoint 2 at 0x10055e8a8: file sysdep.c, line 915.
(gdb) run
Starting program: /home/agauland/src/emacs/src/emacs.exe
[New Thread 6856.0x1f38]
[New Thread 6856.0x1240]
[New Thread 6856.0x26c]
[New Thread 6856.0x173c]
[New Thread 6856.0x1f80]
[New Thread 6856.0x1528]
[New Thread 6856.0x15c4]
And then it just sits there. I tried using Kyle McKay's debugbreak.c
<https://cygwin.com/ml/cygwin/2006-06/msg00321.html> to send the emacs
process a DebugBreak, and gdb did not respond (or not visibly, anyway).
I was running a build of d29b3c1 under the same version of GDB, so that
suggests that a change in Emacs may be exposing a different bug in GDB
7.8.
Where shall I go from here? Downgrading GDB is not an option as we
need a backtrace from Emacs run under GDB 7.8.
Regards,
Aidan Gauland
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Wed, 24 Sep 2014 14:08:02 GMT)
Message #73 received at 18438 <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
On 9/24/2014 1:07 AM, aidalgol <at> amuri.net wrote:
> Since your post, I have been running emacs under gdb-7.8-2 as you said,
> and regularly pulling from git master (and rebuilding, of course). Today
> I pulled 270b6e3 and rebuilt, and it hangs under gdb before even drawing
> the frame. It runs fine outside gdb, so I did a clean build and tried
> again, but the same happened.
>
> $ gdb -x .gdbinit ./emacs.exe
> GNU gdb (GDB) 7.8
> Copyright (C) 2014 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later
> <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law. Type "show copying"
> and "show warranty" for details.
> This GDB was configured as "x86_64-unknown-cygwin".
> Type "show configuration" for configuration details.
> For bug reporting instructions, please see:
> <http://www.gnu.org/software/gdb/bugs/>.
> Find the GDB manual and other documentation resources online at:
> <http://www.gnu.org/software/gdb/documentation/>.
> For help, type "help".
> Type "apropos word" to search for commands related to "word"...
> Reading symbols from ./emacs.exe...done.
> warning: File "/home/agauland/src/emacs/src/.gdbinit" auto-loading has
> been declined by your `auto-load safe-path' set to
> "$debugdir:$datadir/auto-load".
> To enable execution of this file add
> add-auto-load-safe-path /home/agauland/src/emacs/src/.gdbinit
> line to your configuration file "/home/agauland/.gdbinit".
> To completely disable this security protection add
> set auto-load safe-path /
> line to your configuration file "/home/agauland/.gdbinit".
> For more information about this security protection see the
> "Auto-loading safe path" section in the GDB manual. E.g., run from the
> shell:
> info "(gdb)Auto-loading safe path"
> SIGINT is used by the debugger.
> Are you sure you want to change it? (y or n) [answered Y; input not from
> terminal]
> Environment variable "DISPLAY" not defined.
> TERM = xterm
> Breakpoint 1 at 0x100531bd8: file emacs.c, line 361.
> Temporary breakpoint 2 at 0x10055e8a8: file sysdep.c, line 915.
> (gdb) run
> Starting program: /home/agauland/src/emacs/src/emacs.exe
> [New Thread 6856.0x1f38]
> [New Thread 6856.0x1240]
> [New Thread 6856.0x26c]
> [New Thread 6856.0x173c]
> [New Thread 6856.0x1f80]
> [New Thread 6856.0x1528]
> [New Thread 6856.0x15c4]
>
> And then it just sits there. I tried using Kyle McKay's debugbreak.c
> <https://cygwin.com/ml/cygwin/2006-06/msg00321.html> to send the emacs
> process a DebugBreak, and gdb did not respond (or not visibly, anyway).
> I was running a build of d29b3c1 under the same version of GDB, so that
> suggests that a change in Emacs may be exposing a different bug in GDB 7.8.
>
> Where shall I go from here? Downgrading GDB is not an option as we need
> a backtrace from Emacs run under GDB 7.8.
It works fine for me with the latest bzr revision (117936).
If you still can't get it to work, maybe you should test the emacs-24
branch instead of the trunk. That's much more stable, and fixing bugs
there is more important than fixing bugs in the trunk right now.
By the way, I just got the same assertion failure in bidi.c, with a
backtrace under gdb 7.8 (attached). The assertion failure still doesn't
make sense because type=STRONG_L. But I see lots of strange values of
"type" in frames 3 and higher. Eli, can you make sense out of this? I
still have the gdb session open.
Ken
[bt.txt.gz (application/gzip, attachment)]
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Wed, 24 Sep 2014 15:02:01 GMT)
Message #76 received at 18438 <at> debbugs.gnu.org (full text, mbox):
> Date: Wed, 24 Sep 2014 10:06:30 -0400
> From: Ken Brown <kbrown <at> cornell.edu>
> CC: Eli Zaretskii <eliz <at> gnu.org>
>
> By the way, I just got the same assertion failure in bidi.c, with a
> backtrace under gdb 7.8 (attached).
What is the backtrace of the other threads?
> The assertion failure still doesn't make sense because
> type=STRONG_L. But I see lots of strange values of "type" in frames
> 3 and higher.
Which ones? The only ones I see are these:
#3 0x00000001005009cc in bidi_level_of_next_char (bidi_it=0x426878)
at /usr/src/debug/emacs-24.3.93-4/src/bidi.c:2325
eob = 811
type = UNKNOWN_BT
level = 0
prev_level = -1
next_for_neutral = {
bytepos = 4353152,
charpos = 4300194024,
type = 4344736, <<<<<<<<<<<<<<<<<<<<<<<<<<<
type_after_w1 = UNKNOWN_BT,
orig_type = 2148292989 <<<<<<<<<<<<<<<<<<<<<<<<<<<
}
which is OK, since the next_for_neutral member doesn't have to be
initialized (the UNKNOWN_BT value in type_after_w1 says it isn't), and
will not be used until it is. Others are in frame #6, like this:
bidi_it = {
bytepos = 4348392,
charpos = 4306733202,
ch = 5482662,
nchars = 25775286310,
ch_len = 25775286790,
type = 4348416,
type_after_w1 = UNKNOWN_BT,
orig_type = 6632775,
but it is from the wrap_it variable, which is not assigned values
unless you have word-wrap turned on in that buffer.
I also see a 'struct it' in redisplay_window (frame #8) with garbled
bidi type values, but that variable is only used under certain
conditions (see line 16178 of xdisp.c), and I have no reason to
believe those conditions were true in this case.
So I don't really see any immediate problems here.
We are left with the riddle.
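The reasoning above relies on a sentinel convention: a saved-state record whose type_after_w1 is UNKNOWN_BT is treated as not yet filled in, so whatever bytes happen to sit in its other fields are never consulted. A minimal sketch of that idea follows. The struct, the helper names, and the 0x5a fill pattern are hypothetical and simplified; they are not the actual definitions from bidi.c or dispextern.h, and whether the real code tests exactly this field is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical, simplified stand-ins; not the real Emacs definitions.  */
typedef enum { UNKNOWN_BT = 0, STRONG_L = 1 } sketch_type_t;

struct sketch_saved_info
{
  long bytepos;
  long charpos;
  int type;                     /* plain int so any bit pattern can be stored */
  sketch_type_t type_after_w1;  /* UNKNOWN_BT marks "not filled in yet" */
};

/* A record counts as set only when the sentinel field has a real type.  */
static bool
saved_info_is_set (const struct sketch_saved_info *info)
{
  assert (info != NULL);
  return info->type_after_w1 != UNKNOWN_BT;
}

/* Imitate stale stack contents: fill every byte with an arbitrary
   pattern, then mark the record as unset via the sentinel field.  */
static struct sketch_saved_info
make_stale_record (void)
{
  struct sketch_saved_info info;
  memset (&info, 0x5a, sizeof info);
  info.type_after_w1 = UNKNOWN_BT;
  return info;
}

/* Return the saved type, or UNKNOWN_BT when the record was never set,
   so the garbage in the other fields is never read by the caller.  */
static int
saved_type_or_unknown (const struct sketch_saved_info *info)
{
  return saved_info_is_set (info) ? info->type : UNKNOWN_BT;
}
```

Under this convention a debugger can show arbitrary numbers in such a record without that indicating a fault, which matches the argument made for next_for_neutral above.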
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Wed, 24 Sep 2014 16:42:02 GMT)
Message #79 received at 18438 <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
[Adding Daniel Colascione to the CC.]
On 9/24/2014 11:01 AM, Eli Zaretskii wrote:
>> Date: Wed, 24 Sep 2014 10:06:30 -0400
>> From: Ken Brown <kbrown <at> cornell.edu>
>> CC: Eli Zaretskii <eliz <at> gnu.org>
>>
>> By the way, I just got the same assertion failure in bidi.c, with a
>> backtrace under gdb 7.8 (attached).
>
> What is the backtrace of the other threads?
Attached.
> We are left with the riddle.
That's too bad.
I have one thought: You've mentioned before the possibility that this
problem is caused by interference from other threads. The one thread
that exists in the Cygwin-w32 build but not in other Cygwin builds is
the one used for the Windows message queue. (Thread 5 in the attached
backtrace). I'm not familiar with the code involving the message queue,
but is it possible that this code is not thread safe in the 64-bit
Cygwin build?
Dan, can you help? In case you don't want to read through the whole bug
report, the gist of the problem is this: In the Cygwin-w32 build of
Emacs with checking enabled (64-bit case only), there are random
assertion failures that make no sense when viewed under gdb. In other
words, the assertions clearly hold according to the information provided
by gdb. Eli has wondered whether code running in a different thread is
somehow the cause of this.
Ken
[bt_all.txt.gz (application/gzip, attachment)]
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Wed, 24 Sep 2014 19:50:01 GMT)
Message #82 received at 18438 <at> debbugs.gnu.org (full text, mbox):
> Date: Wed, 24 Sep 2014 12:40:31 -0400
> From: Ken Brown <kbrown <at> cornell.edu>
> CC: aidalgol <at> amuri.net, 18438 <at> debbugs.gnu.org,
> Daniel Colascione <dancol <at> dancol.org>
>
> > What is the backtrace of the other threads?
>
> Attached.
Nothing stands out.
> I have one thought: You've mentioned before the possibility that this
> problem is caused by interference from other threads. The one thread
> that exists in the Cygwin-w32 build but not in other Cygwin builds is
> the one used for the Windows message queue. (Thread 5 in the attached
> backtrace). I'm not familiar with the code involving the message queue,
> but is it possible that this code is not thread safe in the 64-bit
> Cygwin build?
I will try to look at that code, although I'm not very familiar with
it.
Also note that if there's some problem with that code, I'd expect the
users of the native MinGW64 build to bump into it.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Sun, 28 Sep 2014 23:04:01 GMT)
Message #85 received at 18438 <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
On Wed, 24 Sep 2014 10:06:30 -0400, Ken Brown wrote:
> On 9/24/2014 1:07 AM, aidalgol <at> amuri.net wrote:
>> Since your post, I have been running emacs under gdb-7.8-2 as you said,
>> and regularly pulling from git master (and rebuilding, of course). Today
>> I pulled 270b6e3 and rebuilt, and it hangs under gdb before even drawing
>> the frame. It runs fine outside gdb, so I did a clean build and tried
>> again, but the same happened.
>
> It works fine for me with the latest bzr revision (117936).
>
> If you still can't get it to work, maybe you should test the emacs-24
> branch instead of the trunk. That's much more stable, and fixing bugs
> there is more important than fixing bugs in the trunk right now.
>
I was wrong, it was actually just *very* slow to finish startup. I am
now running from the emacs-24 branch instead. The assert has finally
failed again, and here's the backtrace. Not sure it's of any help since
you have already provided a backtrace from the same assert, but it can't
hurt and might provide some additional information.
--Aidan
[emacs-67c13df-bidi-02.backtrace (text/plain, attachment)]
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Mon, 29 Sep 2014 00:57:02 GMT)
Message #88 received at 18438 <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
And another backtrace, in the hopes that one will eventually provide
some additional information.
[emacs-67c13df-bidi-03.backtrace (text/plain, attachment)]
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Mon, 29 Sep 2014 06:24:02 GMT)
Message #91 received at 18438 <at> debbugs.gnu.org (full text, mbox):
> Date: Mon, 29 Sep 2014 13:56:39 +1300
> From: aidalgol <at> amuri.net
>
> And another backtrace, in the hopes that one will eventually provide
> some additional information.
Unfortunately, it doesn't. It's still the same STRONG_L type that
somehow fails the test:
> #1 0x00000001005b0095 in die (msg=0x100a27c08 <DEFAULT_REHASH_SIZE+64> "UNKNOWN_BT <= type && type <= NEUTRAL_ON", file=0x100a27c00 <DEFAULT_REHASH_SIZE+56> "bidi.c", line=329) at alloc.c:6833
> No locals.
> #2 0x00000001004f9d2e in bidi_check_type (type=STRONG_L) at bidi.c:329
> No locals. ^^^^^^^^^^^^^
Btw, what is the numerical value of STRONG_L in that build?
Anyway, can you build your own Emacs? If so, I might suggest some
simple changes to try to understand what is going on there.
Thanks.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Mon, 29 Sep 2014 15:52:02 GMT)
Message #94 received at 18438 <at> debbugs.gnu.org (full text, mbox):
On 9/29/2014 2:23 AM, Eli Zaretskii wrote:
> Anyway, can you build your own Emacs? If so, I might suggest some
> simple changes to try to understand what is going on there.
He's mentioned earlier in the thread that he builds his own emacs, so
please send your changes. I'll try them too since I've also hit this
assertion violation, though not nearly as often as Aidan.
Ken
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Mon, 29 Sep 2014 17:01:01 GMT)
Message #97 received at 18438 <at> debbugs.gnu.org (full text, mbox):
> Date: Mon, 29 Sep 2014 11:50:51 -0400
> From: Ken Brown <kbrown <at> cornell.edu>
> CC: 18438 <at> debbugs.gnu.org
>
> On 9/29/2014 2:23 AM, Eli Zaretskii wrote:
> > Anyway, can you build your own Emacs? If so, I might suggest some
> > simple changes to try to understand what is going on there.
>
> He's mentioned earlier in the thread that he builds his own emacs
Sorry for my failing memory.
> so please send your changes. I'll try them too since I've also hit
> this assertion violation, though not nearly as often as Aidan.
Thanks.
Let's start by replacing eassert with its equivalent. Please run with
the change below for some time and see if the assertions still happen.
(The purpose is to see whether small changes in the code have drastic
effects on the problem. If they do, it will be hard to know whether
some more serious change solves the problem or simply hides it.)
=== modified file 'src/bidi.c'
--- src/bidi.c 2014-04-06 15:56:01 +0000
+++ src/bidi.c 2014-09-29 16:55:55 +0000
@@ -326,7 +326,8 @@ bidi_get_type (int ch, bidi_dir_t overri
 static void
 bidi_check_type (bidi_type_t type)
 {
-  eassert (UNKNOWN_BT <= type && type <= NEUTRAL_ON);
+  if (!(suppress_checking || (UNKNOWN_BT <= type && type <= NEUTRAL_ON)))
+    die ("UNKNOWN_BT <= type && type <= NEUTRAL_ON", __FILE__, __LINE__);
 }
 
 /* Given a bidi TYPE of a character, return its category. */
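For readers who want to convince themselves that the open-coded check is equivalent to the macro, here is a small self-contained sketch. It assumes eassert has the shape implied by the patch: skip the check when suppress_checking is set, otherwise call die with the stringified condition. The names eassert_sketch and die_stub are hypothetical stand-ins, die_stub counts calls instead of aborting, and the values 0 and 23 for UNKNOWN_BT and NEUTRAL_ON are taken from the range reported later in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed values, taken from the "[0..23]" range reported in this thread.  */
enum { UNKNOWN_BT = 0, NEUTRAL_ON = 23 };

static bool suppress_checking = false;
static int die_calls = 0;

/* Hypothetical stand-in for die(): counts calls instead of aborting.  */
static void
die_stub (const char *msg, const char *file, int line)
{
  (void) msg;
  (void) file;
  (void) line;
  die_calls++;
}

/* Assumed shape of the macro, inferred from the patch, not copied
   from lisp.h.  */
#define eassert_sketch(cond)                            \
  (suppress_checking || (cond)                          \
   ? (void) 0                                           \
   : die_stub (#cond, __FILE__, __LINE__))

/* The check written with the macro.  */
static void
check_with_macro (int type)
{
  eassert_sketch (UNKNOWN_BT <= type && type <= NEUTRAL_ON);
}

/* The same check open-coded, as in the patch.  */
static void
check_open_coded (int type)
{
  if (!(suppress_checking || (UNKNOWN_BT <= type && type <= NEUTRAL_ON)))
    die_stub ("UNKNOWN_BT <= type && type <= NEUTRAL_ON", __FILE__, __LINE__);
}

/* Return how many times die_stub fired for one call of CHECK on TYPE.  */
static int
failures (void (*check) (int), int type)
{
  assert (check != NULL);
  die_calls = 0;
  check (type);
  return die_calls;
}
```

Both forms should agree on every input: failures (check_with_macro, t) equals failures (check_open_coded, t) for any t, which is why the first patch changes only the generated code, not the logic.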
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Mon, 29 Sep 2014 22:29:01 GMT)
Message #100 received at 18438 <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
On Mon, 29 Sep 2014 20:00:16 +0300, Eli Zaretskii wrote:
>
> Let's start by replacing eassert with its equivalent. Please run with
> the change below for some time and see if the assertions still happen.
>
> (The purpose is to see whether small changes in the code have drastic
> effects on the problem. If they do, it will be hard to know whether
> some more serious change solves the problem or simply hides it.)
Yes, they're still happening; I just got one after running with the
patch for only a few hours.
[emacs-6f07e0b-patched-bidi-01.backtrace (text/plain, attachment)]
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Tue, 30 Sep 2014 15:25:02 GMT)
Message #103 received at 18438 <at> debbugs.gnu.org (full text, mbox):
> Date: Tue, 30 Sep 2014 11:28:35 +1300
> From: aidalgol <at> amuri.net
> Cc: Eli Zaretskii <eliz <at> gnu.org>, Ken Brown <kbrown <at> cornell.edu>
>
> On Mon, 29 Sep 2014 20:00:16 +0300, Eli Zaretskii wrote:
> >
> > Let's start by replacing eassert with its equivalent. Please run with
> > the change below for some time and see if the assertions still happen.
> >
> > (The purpose is to see whether small changes in the code have drastic
> > effects on the problem. If they do, it will be hard to know whether
> > some more serious change solves the problem or simply hides it.)
>
> Yes, they're still happening; I just got one after running with the
> patch for only a few hours.
OK, so far so good. How about the one below (which tries to reveal
the face of the beast)?
=== modified file 'src/bidi.c'
--- src/bidi.c 2014-04-06 15:56:01 +0000
+++ src/bidi.c 2014-09-30 15:21:28 +0000
@@ -326,7 +326,12 @@ bidi_get_type (int ch, bidi_dir_t overri
 static void
 bidi_check_type (bidi_type_t type)
 {
-  eassert (UNKNOWN_BT <= type && type <= NEUTRAL_ON);
+  if (!(suppress_checking || (UNKNOWN_BT <= type && type <= NEUTRAL_ON)))
+    {
+      fprintf (stderr, "\r\n%s:%d: bidi type %d is not in [%d..%d]\r\n",
+               __FILE__, __LINE__, type, UNKNOWN_BT, NEUTRAL_ON);
+      emacs_abort ();
+    }
 }
 
 /* Given a bidi TYPE of a character, return its category. */
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Tue, 30 Sep 2014 16:08:01 GMT)
Message #106 received at 18438 <at> debbugs.gnu.org (full text, mbox):
On 9/30/2014 11:24 AM, Eli Zaretskii wrote:
> OK, so far so good. How about the one below (which tries to reveal
> the face of the beast)?
>
> === modified file 'src/bidi.c'
> --- src/bidi.c 2014-04-06 15:56:01 +0000
> +++ src/bidi.c 2014-09-30 15:21:28 +0000
> @@ -326,7 +326,12 @@ bidi_get_type (int ch, bidi_dir_t overri
> static void
> bidi_check_type (bidi_type_t type)
> {
> - eassert (UNKNOWN_BT <= type && type <= NEUTRAL_ON);
> + if (!(suppress_checking || (UNKNOWN_BT <= type && type <= NEUTRAL_ON)))
> + {
> + fprintf (stderr, "\r\n%s:%d: bidi type %d is not in [%d..%d]\r\n",
> + __FILE__, __LINE__, type, UNKNOWN_BT, NEUTRAL_ON);
> + emacs_abort ();
> + }
> }
>
> /* Given a bidi TYPE of a character, return its category. */
Wouldn't it make sense for him to first see if your recent fix of the
"Current trunk aborts with MinGW" problem also fixes the present bug?
We speculated previously that these strange assertion violations might
be a result of the w32_msg stuff not being thread safe.
Ken
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Tue, 30 Sep 2014 16:28:02 GMT)
Message #109 received at 18438 <at> debbugs.gnu.org (full text, mbox):
> Date: Tue, 30 Sep 2014 12:09:32 -0400
> From: Ken Brown <kbrown <at> cornell.edu>
> CC: 18438 <at> debbugs.gnu.org
>
> On 9/30/2014 11:24 AM, Eli Zaretskii wrote:
> > OK, so far so good. How about the one below (which tries to reveal
> > the face of the beast)?
> >
> > === modified file 'src/bidi.c'
> > --- src/bidi.c 2014-04-06 15:56:01 +0000
> > +++ src/bidi.c 2014-09-30 15:21:28 +0000
> > @@ -326,7 +326,12 @@ bidi_get_type (int ch, bidi_dir_t overri
> > static void
> > bidi_check_type (bidi_type_t type)
> > {
> > - eassert (UNKNOWN_BT <= type && type <= NEUTRAL_ON);
> > + if (!(suppress_checking || (UNKNOWN_BT <= type && type <= NEUTRAL_ON)))
> > + {
> > + fprintf (stderr, "\r\n%s:%d: bidi type %d is not in [%d..%d]\r\n",
> > + __FILE__, __LINE__, type, UNKNOWN_BT, NEUTRAL_ON);
> > + emacs_abort ();
> > + }
> > }
> >
> > /* Given a bidi TYPE of a character, return its category. */
>
> Wouldn't it make sense for him to first see if your recent fix of the
> "Current trunk aborts with MinGW" problem also fixes the present bug?
Could be, yes. I actually thought about this possibility, but didn't
mention it because I couldn't come up with a scenario where that bug
could have triggered such strange problems, and only in bidi.c. But
it does no harm to try applying that patch first, and only apply this
one if that doesn't help.
> We speculated previously that these strange assertion violations might
> be a result of the w32_msg stuff not being thread safe.
Yes, but you need memory allocation in the picture to have that, and I
see no such allocation in the sequence of calls we saw in the
backtraces.
Still, "the proof of the pudding is in eating"...
Thanks.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Tue, 30 Sep 2014 23:07:01 GMT)
Message #112 received at 18438 <at> debbugs.gnu.org (full text, mbox):
On Tue, 30 Sep 2014 19:27:37 +0300, Eli Zaretskii wrote:
>> Date: Tue, 30 Sep 2014 12:09:32 -0400
>> From: Ken Brown <kbrown <at> cornell.edu>
>> CC: 18438 <at> debbugs.gnu.org
>>
>> Wouldn't it make sense for him to first see if your recent fix of the
>> "Current trunk aborts with MinGW" problem also fixes the present bug?
>
> Could be, yes. I actually thought about this possibility, but didn't
> mention it because I couldn't come up with a scenario where that bug
> could have triggered such strange problems, and only in bidi.c. But
> it does no harm to try applying that patch first, and only apply this
> one if that doesn't help.
And where is this patch? I don't keep up with the happenings on trunk.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Wed, 01 Oct 2014 02:41:04 GMT)
Message #115 received at 18438 <at> debbugs.gnu.org (full text, mbox):
> Date: Wed, 01 Oct 2014 12:06:06 +1300
> From: aidalgol <at> amuri.net
> Cc: Eli Zaretskii <eliz <at> gnu.org>, Ken Brown <kbrown <at> cornell.edu>
>
> On Tue, 30 Sep 2014 19:27:37 +0300, Eli Zaretskii wrote:
> >> Date: Tue, 30 Sep 2014 12:09:32 -0400
> >> From: Ken Brown <kbrown <at> cornell.edu>
> >> CC: 18438 <at> debbugs.gnu.org
> >>
> >> Wouldn't it make sense for him to first see if your recent fix of the
> >> "Current trunk aborts with MinGW" problem also fixes the present bug?
> >
> > Could be, yes. I actually thought about this possibility, but didn't
> > mention it because I couldn't come up with a scenario where that bug
> > could have triggered such strange problems, and only in bidi.c. But
> > it does no harm to try applying that patch first, and only apply this
> > one if that doesn't help.
>
> And where is this patch? I don't keep up with the happenings on trunk.
Sorry, it's here:
http://lists.gnu.org/archive/html/emacs-devel/2014-09/msg00901.html
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Wed, 01 Oct 2014 02:41:05 GMT)
Message #118 received at 18438 <at> debbugs.gnu.org (full text, mbox):
On 9/30/2014 7:06 PM, aidalgol <at> amuri.net wrote:
> On Tue, 30 Sep 2014 19:27:37 +0300, Eli Zaretskii wrote:
>>> Date: Tue, 30 Sep 2014 12:09:32 -0400
>>> From: Ken Brown <kbrown <at> cornell.edu>
>>> CC: 18438 <at> debbugs.gnu.org
>>>
>>> Wouldn't it make sense for him to first see if your recent fix of the
>>> "Current trunk aborts with MinGW" problem also fixes the present bug?
>>
>> Could be, yes. I actually thought about this possibility, but didn't
>> mention it because I couldn't come up with a scenario where that bug
>> could have triggered such strange problems, and only in bidi.c. But
>> it does no harm to try applying that patch first, and only apply this
>> one if that doesn't help.
>
> And where is this patch? I don't keep up with the happenings on trunk.
It was applied to the emacs-24 branch today, as bzr revision 117524.
Ken
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Wed, 01 Oct 2014 02:59:01 GMT)
Message #121 received at 18438 <at> debbugs.gnu.org (full text, mbox):
On Wed, 01 Oct 2014 05:39:53 +0300, Eli Zaretskii wrote:
>> Date: Wed, 01 Oct 2014 12:06:06 +1300
>> From: aidalgol <at> amuri.net
>> Cc: Eli Zaretskii <eliz <at> gnu.org>, Ken Brown <kbrown <at> cornell.edu>
>>
>> On Tue, 30 Sep 2014 19:27:37 +0300, Eli Zaretskii wrote:
>> >> Date: Tue, 30 Sep 2014 12:09:32 -0400
>> >> From: Ken Brown <kbrown <at> cornell.edu>
>> >> CC: 18438 <at> debbugs.gnu.org
>> >>
>> >> Wouldn't it make sense for him to first see if your recent fix of the
>> >> "Current trunk aborts with MinGW" problem also fixes the present bug?
>> >
>> > Could be, yes. I actually thought about this possibility, but didn't
>> > mention it because I couldn't come up with a scenario where that bug
>> > could have triggered such strange problems, and only in bidi.c. But
>> > it does no harm to try applying that patch first, and only apply this
>> > one if that doesn't help.
>>
>> And where is this patch? I don't keep up with the happenings on trunk.
>
> Sorry, it's here:
>
> http://lists.gnu.org/archive/html/emacs-devel/2014-09/msg00901.html
When I try to apply that to commit 623a1b2, patch says "Reversed (or
previously applied) patch detected!" Has this been merged to the
emacs-24 branch?
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Wed, 01 Oct 2014 14:43:02 GMT)
Message #124 received at 18438 <at> debbugs.gnu.org (full text, mbox):
> Date: Wed, 01 Oct 2014 15:58:08 +1300
> From: aidalgol <at> amuri.net
> Cc: <18438 <at> debbugs.gnu.org>, <kbrown <at> cornell.edu>
>
> On Wed, 01 Oct 2014 05:39:53 +0300, Eli Zaretskii wrote:
> >> Date: Wed, 01 Oct 2014 12:06:06 +1300
> >> From: aidalgol <at> amuri.net
> >> Cc: Eli Zaretskii <eliz <at> gnu.org>, Ken Brown <kbrown <at> cornell.edu>
> >>
> >> On Tue, 30 Sep 2014 19:27:37 +0300, Eli Zaretskii wrote:
> >> >> Date: Tue, 30 Sep 2014 12:09:32 -0400
> >> >> From: Ken Brown <kbrown <at> cornell.edu>
> >> >> CC: 18438 <at> debbugs.gnu.org
> >> >>
> >> >> Wouldn't it make sense for him to first see if your recent fix of the
> >> >> "Current trunk aborts with MinGW" problem also fixes the present bug?
> >> >
> >> > Could be, yes. I actually thought about this possibility, but didn't
> >> > mention it because I couldn't come up with a scenario where that bug
> >> > could have triggered such strange problems, and only in bidi.c. But
> >> > it does no harm to try applying that patch first, and only apply this
> >> > one if that doesn't help.
> >>
> >> And where is this patch? I don't keep up with the happenings on trunk.
> >
> > Sorry, it's here:
> >
> > http://lists.gnu.org/archive/html/emacs-devel/2014-09/msg00901.html
>
> When I try to apply that to commit 623a1b2, patch says "Reversed (or
> previously applied) patch detected!" Has this been merged to the
> emacs-24 branch?
Yes, and in the meantime it was also merged to the trunk. So just
update and rebuild.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Wed, 01 Oct 2014 21:57:01 GMT)
Message #127 received at 18438 <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
On Wed, 01 Oct 2014 17:42:16 +0300, Eli Zaretskii wrote:
>> When I try to apply that to commit 623a1b2, patch says "Reversed (or
>> previously applied) patch detected!" Has this been merged to the
>> emacs-24 branch?
>
> Yes, and in the meantime it was also merged to the trunk. So just
> update and rebuild.
Still asserting with commit e677cce. Rebuilding with the patch you
provided earlier (in message #103
<http://debbugs.gnu.org/cgi/bugreport.cgi?bug=18438#103>).
[emacs-e677cce-bidi-01.backtrace (text/plain, attachment)]
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Wed, 08 Oct 2014 22:21:02 GMT)
Message #130 received at 18438 <at> debbugs.gnu.org (full text, mbox):
On Tue, 30 Sep 2014 18:24:15 +0300, Eli Zaretskii wrote:
>> Date: Tue, 30 Sep 2014 11:28:35 +1300
>> From: aidalgol <at> amuri.net
>> Cc: Eli Zaretskii <eliz <at> gnu.org>, Ken Brown <kbrown <at> cornell.edu>
>>
>> On Mon, 29 Sep 2014 20:00:16 +0300, Eli Zaretskii wrote:
>> >
>> > Let's start by replacing eassert with its equivalent. Please run with
>> > the change below for some time and see if the assertions still happen.
>> >
>> > (The purpose is to see whether small changes in the code have drastic
>> > effects on the problem. If they do, it will be hard to know whether
>> > some more serious change solves the problem or simply hides it.)
>>
>> Yes, they're still happening; I just got one after running with the
>> patch for only a few hours.
>
> OK, so far so good. How about the one below (which tries to reveal
> the face of the beast)?
>
> === modified file 'src/bidi.c'
> --- src/bidi.c 2014-04-06 15:56:01 +0000
> +++ src/bidi.c 2014-09-30 15:21:28 +0000
> @@ -326,7 +326,12 @@ bidi_get_type (int ch, bidi_dir_t overri
> static void
> bidi_check_type (bidi_type_t type)
> {
> - eassert (UNKNOWN_BT <= type && type <= NEUTRAL_ON);
> + if (!(suppress_checking || (UNKNOWN_BT <= type && type <= NEUTRAL_ON)))
> + {
> + fprintf (stderr, "\r\n%s:%d: bidi type %d is not in [%d..%d]\r\n",
> + __FILE__, __LINE__, type, UNKNOWN_BT, NEUTRAL_ON);
> + emacs_abort ();
> + }
> }
>
> /* Given a bidi TYPE of a character, return its category. */
OK, it finally happened. It printed out...
bidi.c:332: bidi type 22 is not in [0..23]
Isn't that a logical impossibility? What the hell is going on?
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Thu, 09 Oct 2014 07:30:02 GMT)
Message #133 received at 18438 <at> debbugs.gnu.org (full text, mbox):
> Date: Thu, 09 Oct 2014 11:20:51 +1300
> From: aidalgol <at> amuri.net
> Cc: <18438 <at> debbugs.gnu.org>, <kbrown <at> cornell.edu>
>
> > --- src/bidi.c 2014-04-06 15:56:01 +0000
> > +++ src/bidi.c 2014-09-30 15:21:28 +0000
> > @@ -326,7 +326,12 @@ bidi_get_type (int ch, bidi_dir_t overri
> > static void
> > bidi_check_type (bidi_type_t type)
> > {
> > - eassert (UNKNOWN_BT <= type && type <= NEUTRAL_ON);
> > + if (!(suppress_checking || (UNKNOWN_BT <= type && type <= NEUTRAL_ON)))
> > + {
> > + fprintf (stderr, "\r\n%s:%d: bidi type %d is not in [%d..%d]\r\n",
> > + __FILE__, __LINE__, type, UNKNOWN_BT, NEUTRAL_ON);
> > + emacs_abort ();
> > + }
> > }
> >
> > /* Given a bidi TYPE of a character, return its category. */
>
> OK, it finally happened. It printed out...
>
> bidi.c:332: bidi type 22 is not in [0..23]
Bidi type 22 is whitespace, most likely the SPC character.
> Isn't that a logical impossibility?
Of course, it is. It always was. This is what this bug is all about.
> What the hell is going on?
That's what we are trying to establish. My working hypothesis is that
some unrelated code, either in another Emacs thread or maybe (less
likely) in the OS bowels preempts this function and doesn't restore
all the registers when it passes control back to the function. Any
other ideas are welcome.
Could you show the disassembly of this function in its new form? I'd
like to see if the value of the bidi type being checked is loaded into
the same register as in the original version.
Also, if you have the backtrace, including from all the other threads,
please post that.
Thanks.
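For reference, the check that fired can be reduced to a few lines. The sketch below is a stand-in, not the Emacs source: the bounds 0 and 23 are taken from the printed message, and the names `LOWEST_TYPE`, `HIGHEST_TYPE` and `type_in_range` are hypothetical. With these bounds the value 22 passes the test, which is why the abort looks impossible.

```c
#include <stdbool.h>

/* Bounds copied from the diagnostic "is not in [0..23]".
   Hypothetical stand-ins for UNKNOWN_BT and NEUTRAL_ON.  */
#define LOWEST_TYPE 0
#define HIGHEST_TYPE 23

/* Same condition as the patched bidi_check_type, leaving out the
   suppress_checking flag: true when TYPE lies inside the bounds.  */
bool
type_in_range (int type)
{
  return LOWEST_TYPE <= type && type <= HIGHEST_TYPE;
}
```

Calling type_in_range (22) returns true, so a correct evaluation of the condition would never reach emacs_abort for that value.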
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Fri, 10 Oct 2014 02:22:01 GMT)
Message #136 received at 18438 <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
Resending to list, because I hit the wrong reply button the first time.
Apologies.
On Thu, 09 Oct 2014 10:29:50 +0300, Eli Zaretskii wrote:
>
>> OK, it finally happened. It printed out...
>>
>> bidi.c:332: bidi type 22 is not in [0..23]
>
> Bidi type 22 is whitespace, most likely the SPC character.
>
>> Isn't that a logical impossibility?
>
> Of course, it is. It always was. This is what this bug is all about.
>
>> What the hell is going on?
>
> That's what we are trying to establish. My working hypothesis is that
> some unrelated code, either in another Emacs thread or maybe (less
> likely) in the OS bowels preempts this function and doesn't restore
> all the registers when it passes control back to the function. Any
> other ideas are welcome.
>
> Could you show the disassembly of this function in its new form? I'd
> like to see if the value of the bidi type being checked is loaded into
> the same register as in the original version.
(gdb) disassemble 'bidi.c'::bidi_check_type
Dump of assembler code for function bidi_check_type:
0x00000001004f9dd3 <+0>: push %rbp
0x00000001004f9dd4 <+1>: mov %rsp,%rbp
0x00000001004f9dd7 <+4>: sub $0x40,%rsp
0x00000001004f9ddb <+8>: mov %ecx,0x10(%rbp)
0x00000001004f9dde <+11>: mov 0x55d8db(%rip),%rax # 0x100a576c0 <.refptr.suppress_checking>
0x00000001004f9de5 <+18>: movzbl (%rax),%eax
0x00000001004f9de8 <+21>: xor $0x1,%eax
0x00000001004f9deb <+24>: test %al,%al
0x00000001004f9ded <+26>: je 0x1004f9e37 <bidi_check_type+100>
0x00000001004f9def <+28>: cmpl $0x17,0x10(%rbp)
0x00000001004f9df3 <+32>: jbe 0x1004f9e37 <bidi_check_type+100>
0x00000001004f9df5 <+34>: callq 0x1006b7d00 <__getreent>
0x00000001004f9dfa <+39>: mov 0x18(%rax),%rax
0x00000001004f9dfe <+43>: movl $0x17,0x30(%rsp)
0x00000001004f9e06 <+51>: movl $0x0,0x28(%rsp)
0x00000001004f9e0e <+59>: mov 0x10(%rbp),%edx
0x00000001004f9e11 <+62>: mov %edx,0x20(%rsp)
0x00000001004f9e15 <+66>: mov $0x14c,%r9d
0x00000001004f9e1b <+72>: lea 0x52edde(%rip),%r8 # 0x100a28c00 <DEFAULT_REHASH_SIZE+56>
0x00000001004f9e22 <+79>: lea 0x52eddf(%rip),%rdx # 0x100a28c08 <DEFAULT_REHASH_SIZE+64>
0x00000001004f9e29 <+86>: mov %rax,%rcx
0x00000001004f9e2c <+89>: callq 0x1006b8080 <fprintf>
0x00000001004f9e31 <+94>: callq 0x100665e15 <emacs_abort>
0x00000001004f9e36 <+99>: nop
0x00000001004f9e37 <+100>: add $0x40,%rsp
0x00000001004f9e3b <+104>: pop %rbp
0x00000001004f9e3c <+105>: retq
End of assembler dump.
> Also, if you have the backtrace, including from all the other threads,
> please post that.
Attached, but the emacs process died while printing the backtrace for
thread 2, and I have no idea why.
Since the first time I sent this (to only Eli, see above), I rebooted,
and got the assert again, and was able to get a backtrace (attached)
from another thread, but it died printing the backtrace of the same
thread as last time.
[backtrace (text/plain, attachment)]
[emacs-b8497de-patched-bidi-01.backtrace (text/plain, attachment)]
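The `cmpl $0x17` / `jbe` pair in the dump above is a single unsigned comparison against 23. One plausible reading, assuming the compiler treats the enumeration as an unsigned 32-bit value, is that the lower-bound test is always true and was dropped, leaving only the upper bound. The sketch below (function names are hypothetical) shows that for an `int` argument the two-sided test and the single unsigned test give the same answer.

```c
#include <stdbool.h>

/* The test as written in the source: both bounds checked.  */
bool
in_range_two_sided (int type)
{
  return 0 <= type && type <= 23;
}

/* The test as it appears in the disassembly: one unsigned compare.
   A negative int converts to a large unsigned value, so it fails
   the upper-bound test just as it would fail the lower-bound one.  */
bool
in_range_unsigned (int type)
{
  return (unsigned int) type <= 23u;
}
```

Either way, a stored value of 22 should take the `jbe` branch and skip the abort.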
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Fri, 10 Oct 2014 07:20:02 GMT)
Message #139 received at 18438 <at> debbugs.gnu.org (full text, mbox):
> Date: Fri, 10 Oct 2014 15:21:53 +1300
> From: aidalgol <at> amuri.net
> Cc: Eli Zaretskii <eliz <at> gnu.org>, <kbrown <at> cornell.edu>
>
> > Could you show the disassembly of this function in its new form? I'd
> > like to see if the value of the bidi type being checked is loaded into
> > the same register as in the original version.
>
> (gdb) disassemble 'bidi.c'::bidi_check_type
> Dump of assembler code for function bidi_check_type:
> 0x00000001004f9dd3 <+0>: push %rbp
> 0x00000001004f9dd4 <+1>: mov %rsp,%rbp
> 0x00000001004f9dd7 <+4>: sub $0x40,%rsp
> 0x00000001004f9ddb <+8>: mov %ecx,0x10(%rbp)
> 0x00000001004f9dde <+11>: mov 0x55d8db(%rip),%rax
> 0x00000001004f9de5 <+18>: movzbl (%rax),%eax
> 0x00000001004f9de8 <+21>: xor $0x1,%eax
> 0x00000001004f9deb <+24>: test %al,%al
> 0x00000001004f9ded <+26>: je 0x1004f9e37 <bidi_check_type+100>
> 0x00000001004f9def <+28>: cmpl $0x17,0x10(%rbp)
> 0x00000001004f9df3 <+32>: jbe 0x1004f9e37
Yes, this is the same arrangement as in the original version: passed
through ECX, then stored in RBP+0x10.
Moreover, the value printed by fprintf is taken from RBP+0x10:
> 0x00000001004f9dfa <+39>: mov 0x18(%rax),%rax
> 0x00000001004f9dfe <+43>: movl $0x17,0x30(%rsp)
> 0x00000001004f9e06 <+51>: movl $0x0,0x28(%rsp)
> 0x00000001004f9e0e <+59>: mov 0x10(%rbp),%edx <<<<<<<<<<<
> 0x00000001004f9e11 <+62>: mov %edx,0x20(%rsp) <<<<<<<<<<<
> 0x00000001004f9e15 <+66>: mov $0x14c,%r9d
> 0x00000001004f9e1b <+72>: lea 0x52edde(%rip),%r8
> 0x00000001004f9e22 <+79>: lea 0x52eddf(%rip),%rdx
> 0x00000001004f9e29 <+86>: mov %rax,%rcx
> 0x00000001004f9e2c <+89>: callq 0x1006b8080 <fprintf>
So now I'm no longer sure that my theory about some other thread
overwriting registers is valid. But what else could cause this?
Hm... can you try the following version instead? I expect it to force
GCC to store the value of 'type' in a 64-bit register, and use a
64-bit compare instruction for it. Please show the resulting
disassembly, so we are sure this trick succeeded.
=== modified file 'src/bidi.c'
--- src/bidi.c 2014-04-06 15:56:01 +0000
+++ src/bidi.c 2014-10-10 07:12:01 +0000
@@ -326,7 +326,14 @@ bidi_get_type (int ch, bidi_dir_t overri
static void
bidi_check_type (bidi_type_t type)
{
- eassert (UNKNOWN_BT <= type && type <= NEUTRAL_ON);
+ volatile ptrdiff_t qtype = type;
+
+ if (!(suppress_checking || (UNKNOWN_BT <= qtype && qtype <= NEUTRAL_ON)))
+ {
+ fprintf (stderr, "\r\n%s:%d: bidi type %d is not in [%d..%d]\r\n",
+ __FILE__, __LINE__, type, UNKNOWN_BT, NEUTRAL_ON);
+ emacs_abort ();
+ }
}
/* Given a bidi TYPE of a character, return its category. */
> > Also, if you have the backtrace, including from all the other threads,
> > please post that.
>
> Attached, but the emacs process died while printing the backtrace for
> thread 2, and I have no idea why.
I do: it's because you started GDB from the src directory, where it
read the .gdbinit file, which causes the "bt" command to call a
function in the Emacs process being debugged.
To work around this, comment out (by prepending a # to every line) the
following few lines in .gdbinit:
define hookpost-backtrace
set $bt = backtrace_top ()
if backtrace_p ($bt)
echo \n
echo Lisp Backtrace:\n
xbacktrace
end
end
Then you will still be able to invoke "xbacktrace" by hand, but it
won't be invoked automatically by "bt".
Thanks.
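The idea behind the patch in this message can be tried in isolation. The following is a minimal sketch, assuming a platform where ptrdiff_t is 64 bits wide; `check_type_wide` is a hypothetical name and the bounds 0 and 23 come from the earlier diagnostic. The `volatile` qualifier makes every use of `qtype` an actual load from memory, so the compiler cannot fold the comparison back into a 32-bit register test.

```c
#include <stdbool.h>
#include <stddef.h>

/* Copy the 32-bit TYPE into a wider volatile local and run the
   range check on that copy, as the proposed patch does.  */
bool
check_type_wide (unsigned int type)
{
  volatile ptrdiff_t qtype = type;

  return 0 <= qtype && qtype <= 23;
}
```

Note that the patch still passes `type`, not `qtype`, to fprintf, so the value printed and the value tested come from different stack slots.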
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Fri, 10 Oct 2014 07:26:02 GMT)
Message #142 received at 18438 <at> debbugs.gnu.org (full text, mbox):
> Date: Fri, 10 Oct 2014 10:19:57 +0300
> From: Eli Zaretskii <eliz <at> gnu.org>
> Cc: 18438 <at> debbugs.gnu.org
>
> To work around this, comment out (by prepending a # to every line) the
> following few lines in .gdbinit:
>
> define hookpost-backtrace
> set $bt = backtrace_top ()
> if backtrace_p ($bt)
> echo \n
> echo Lisp Backtrace:\n
> xbacktrace
> end
> end
Alternatively, you could leave .gdbinit alone, and redefine
hookpost-backtrace manually to an empty function at start of the GDB
session, like this:
(gdb) define hookpost-backtrace
Redefine command "hookpost-backtrace"? (y or n) y
Type commands for definition of "hookpost-backtrace".
End with a line saying just "end".
>end
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Fri, 10 Oct 2014 13:55:02 GMT)
Message #145 received at 18438 <at> debbugs.gnu.org (full text, mbox):
On 10/10/2014 3:19 AM, Eli Zaretskii wrote:
> === modified file 'src/bidi.c'
> --- src/bidi.c 2014-04-06 15:56:01 +0000
> +++ src/bidi.c 2014-10-10 07:12:01 +0000
> @@ -326,7 +326,14 @@ bidi_get_type (int ch, bidi_dir_t overri
> static void
> bidi_check_type (bidi_type_t type)
> {
> - eassert (UNKNOWN_BT <= type && type <= NEUTRAL_ON);
> + volatile ptrdiff_t qtype = type;
> +
> + if (!(suppress_checking || (UNKNOWN_BT <= qtype && qtype <= NEUTRAL_ON)))
> + {
> + fprintf (stderr, "\r\n%s:%d: bidi type %d is not in [%d..%d]\r\n",
> + __FILE__, __LINE__, type, UNKNOWN_BT, NEUTRAL_ON);
> + emacs_abort ();
> + }
> }
>
> /* Given a bidi TYPE of a character, return its category. */
It works:
Dump of assembler code for function bidi_check_type:
0x00000001004fc423 <+0>: push %rbp
0x00000001004fc424 <+1>: mov %rsp,%rbp
0x00000001004fc427 <+4>: sub $0x50,%rsp
0x00000001004fc42b <+8>: mov %ecx,0x10(%rbp)
0x00000001004fc42e <+11>: mov 0x10(%rbp),%eax
0x00000001004fc431 <+14>: mov %rax,-0x8(%rbp)
0x00000001004fc435 <+18>: mov 0x56bfb4(%rip),%rax # 0x100a683f0 <.refptr.suppress_checking>
0x00000001004fc43c <+25>: movzbl (%rax),%eax
0x00000001004fc43f <+28>: xor $0x1,%eax
0x00000001004fc442 <+31>: test %al,%al
0x00000001004fc444 <+33>: je 0x1004fc49b <bidi_check_type+120>
0x00000001004fc446 <+35>: mov -0x8(%rbp),%rax
0x00000001004fc44a <+39>: test %rax,%rax <<<<<<<<<<<<<<<<<<<<<<<<<<<<<
0x00000001004fc44d <+42>: js 0x1004fc459 <bidi_check_type+54>
0x00000001004fc44f <+44>: mov -0x8(%rbp),%rax
0x00000001004fc453 <+48>: cmp $0x17,%rax <<<<<<<<<<<<<<<<<<<<<<<<<<<<<
0x00000001004fc457 <+52>: jle 0x1004fc49b <bidi_check_type+120>
0x00000001004fc459 <+54>: callq 0x1006c6a50 <__getreent>
0x00000001004fc45e <+59>: mov 0x18(%rax),%rax
0x00000001004fc462 <+63>: movl $0x17,0x30(%rsp)
0x00000001004fc46a <+71>: movl $0x0,0x28(%rsp)
0x00000001004fc472 <+79>: mov 0x10(%rbp),%edx
0x00000001004fc475 <+82>: mov %edx,0x20(%rsp)
0x00000001004fc479 <+86>: mov $0x14d,%r9d
0x00000001004fc47f <+92>: lea 0x53b16a(%rip),%r8 # 0x100a375f0 <DEFAULT_REHASH_SIZE+56>
0x00000001004fc486 <+99>: lea 0x53b16b(%rip),%rdx # 0x100a375f8 <DEFAULT_REHASH_SIZE+64>
0x00000001004fc48d <+106>: mov %rax,%rcx
0x00000001004fc490 <+109>: callq 0x1006c6d70 <fprintf>
0x00000001004fc495 <+114>: callq 0x100672ebc <emacs_abort>
0x00000001004fc49a <+119>: nop
0x00000001004fc49b <+120>: add $0x50,%rsp
0x00000001004fc49f <+124>: pop %rbp
0x00000001004fc4a0 <+125>: retq
End of assembler dump.
Do you have a new theory that this is testing?
Ken
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Fri, 10 Oct 2014 15:14:02 GMT)
Message #148 received at 18438 <at> debbugs.gnu.org (full text, mbox):
> Date: Fri, 10 Oct 2014 09:54:18 -0400
> From: Ken Brown <kbrown <at> cornell.edu>
> CC: 18438 <at> debbugs.gnu.org
>
> It works:
Thanks. Although I have a question for the x86_64 experts here:
> 0x00000001004fc42b <+8>: mov %ecx,0x10(%rbp)
> 0x00000001004fc42e <+11>: mov 0x10(%rbp),%eax
> 0x00000001004fc431 <+14>: mov %rax,-0x8(%rbp)
Do these 3 instructions ensure that the upper 32 bits of RAX (and
therefore of the value stored at RBP-0x08) are zeroed out?
> Do you have a new theory that this is testing?
Something vague about the upper 32 bits of the 64-bit registers.
(Yes, I'm desperate.)
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Fri, 10 Oct 2014 17:15:02 GMT)
Message #151 received at 18438 <at> debbugs.gnu.org (full text, mbox):
On 10/10/2014 11:12 AM, Eli Zaretskii wrote:
> Thanks. Although I have a question for the x86_64 experts here:
>
>> 0x00000001004fc42b <+8>: mov %ecx,0x10(%rbp)
>> 0x00000001004fc42e <+11>: mov 0x10(%rbp),%eax
>> 0x00000001004fc431 <+14>: mov %rax,-0x8(%rbp)
>
> Do these 3 instructions ensure that the upper 32 bits of RAX (and
> therefore of the value stored at RBP-0x08) are zeroed out?
According to this, the answer is yes:
http://stackoverflow.com/questions/11177137/why-do-most-x64-instructions-zero-the-upper-part-of-a-32-bit-register
Ken
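At the C level the same property shows up as ordinary unsigned widening: converting a 32-bit unsigned value to a 64-bit type leaves the upper 32 bits zero. The sketch below only illustrates that language-level rule, with the hypothetical helpers `widen` and `upper_half`; it does not by itself prove what a particular instruction sequence does.

```c
#include <stdint.h>

/* Widen a 32-bit unsigned value to 64 bits (zero extension).  */
uint64_t
widen (uint32_t low)
{
  return (uint64_t) low;
}

/* Return the upper 32 bits of a 64-bit value.  */
uint32_t
upper_half (uint64_t v)
{
  return (uint32_t) (v >> 32);
}
```

For any 32-bit input, upper_half (widen (x)) is 0, so a widened bidi type of 22 stays 22 in the 64-bit copy.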
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Sat, 11 Oct 2014 01:58:02 GMT)
Message #154 received at 18438 <at> debbugs.gnu.org (full text, mbox):
On 10/10/2014 11:12 AM, Eli Zaretskii wrote:
> Something vague about the upper 32 bits of the 64-bit registers.
> (Yes, I'm desperate.)
I'm desperate too. Here's another thought: Suppose this really is a
thread-safety issue in some way that we don't understand. Then maybe
the problem is that the test 'type <= 23' is not atomic in the
compilation that Aidan and I have been doing. First 'type' is copied
from ECX to RBP+0x10, then the latter is tested. We could make it
atomic by forcing GCC to directly test ECX <= 23. We can do this by
compiling with -Og instead of -O0. (Aidan and I have both been using -O0.)
The resulting disassembly (based on your earlier patch, in
http://debbugs.gnu.org/cgi/bugreport.cgi?bug=18438#103) is
Dump of assembler code for function bidi_check_type:
0x00000001004ee9db <+0>: push %rbx
0x00000001004ee9dc <+1>: sub $0x40,%rsp
0x00000001004ee9e0 <+5>: mov %ecx,%ebx
0x00000001004ee9e2 <+7>: mov 0x543027(%rip),%rax # 0x100a31a10 <.refptr.suppress_checking>
0x00000001004ee9e9 <+14>: cmpb $0x0,(%rax)
0x00000001004ee9ec <+17>: jne 0x1004eea2f <bidi_check_type+84>
0x00000001004ee9ee <+19>: cmp $0x17,%ecx
0x00000001004ee9f1 <+22>: jbe 0x1004eea2f <bidi_check_type+84>
0x00000001004ee9f3 <+24>: callq 0x10069fd40 <__getreent>
0x00000001004ee9f8 <+29>: mov 0x18(%rax),%rcx
0x00000001004ee9fc <+33>: movl $0x17,0x30(%rsp)
0x00000001004eea04 <+41>: movl $0x0,0x28(%rsp)
0x00000001004eea0c <+49>: mov %ebx,0x20(%rsp)
0x00000001004eea10 <+53>: mov $0x14c,%r9d
0x00000001004eea16 <+59>: lea 0x51e713(%rip),%r8 # 0x100a0d130 <chartab_size+112>
0x00000001004eea1d <+66>: lea 0x51e814(%rip),%rdx # 0x100a0d238 <chartab_size+376>
0x00000001004eea24 <+73>: callq 0x1006a0040 <fprintf>
0x00000001004eea29 <+78>: callq 0x10065227d <emacs_abort>
0x00000001004eea2e <+83>: nop
0x00000001004eea2f <+84>: add $0x40,%rsp
0x00000001004eea33 <+88>: pop %rbx
0x00000001004eea34 <+89>: retq
End of assembler dump.
Do you think this is worth trying (perhaps after Aidan tries your other
suggestion, involving 64-bit registers)?
Ken
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Sat, 11 Oct 2014 07:12:02 GMT)
Message #157 received at 18438 <at> debbugs.gnu.org (full text, mbox):
> Date: Fri, 10 Oct 2014 21:57:24 -0400
> From: Ken Brown <kbrown <at> cornell.edu>
> CC: aidalgol <at> amuri.net, 18438 <at> debbugs.gnu.org
>
> On 10/10/2014 11:12 AM, Eli Zaretskii wrote:
> > Something vague about the upper 32 bits of the 64-bit registers.
> > (Yes, I'm desperate.)
>
> I'm desperate too. Here's another thought: Suppose this really is a
> thread-safety issue in some way that we don't understand. Then maybe
> the problem is that the test 'type <= 23' is not atomic in the
> compilation that Aidan and I have been doing. First 'type' is copied
> from ECX to RBP+0x10, then the latter is tested.
That's true; but note that the value at RBP+0x10 is the one passed to
fprintf (by pushing it on the stack via EDX), and it printed correctly.
> We could make it
> atomic by forcing GCC to directly test ECX <= 23. We can do this by
> compiling with -Og instead of -O0. (Aidan and I have both been using -O0.)
>
> The resulting disassembly (based on your earlier patch, in
> http://debbugs.gnu.org/cgi/bugreport.cgi?bug=18438#103) is
>
> Dump of assembler code for function bidi_check_type:
> 0x00000001004ee9db <+0>: push %rbx
> 0x00000001004ee9dc <+1>: sub $0x40,%rsp
> 0x00000001004ee9e0 <+5>: mov %ecx,%ebx
> 0x00000001004ee9e2 <+7>: mov 0x543027(%rip),%rax # 0x100a31a10 <.refptr.suppress_checking>
> 0x00000001004ee9e9 <+14>: cmpb $0x0,(%rax)
> 0x00000001004ee9ec <+17>: jne 0x1004eea2f <bidi_check_type+84>
> 0x00000001004ee9ee <+19>: cmp $0x17,%ecx
> 0x00000001004ee9f1 <+22>: jbe 0x1004eea2f <bidi_check_type+84>
> 0x00000001004ee9f3 <+24>: callq 0x10069fd40 <__getreent>
> 0x00000001004ee9f8 <+29>: mov 0x18(%rax),%rcx
> 0x00000001004ee9fc <+33>: movl $0x17,0x30(%rsp)
> 0x00000001004eea04 <+41>: movl $0x0,0x28(%rsp)
> 0x00000001004eea0c <+49>: mov %ebx,0x20(%rsp)
> 0x00000001004eea10 <+53>: mov $0x14c,%r9d
> 0x00000001004eea16 <+59>: lea 0x51e713(%rip),%r8 # 0x100a0d130 <chartab_size+112>
> 0x00000001004eea1d <+66>: lea 0x51e814(%rip),%rdx # 0x100a0d238 <chartab_size+376>
> 0x00000001004eea24 <+73>: callq 0x1006a0040 <fprintf>
> 0x00000001004eea29 <+78>: callq 0x10065227d <emacs_abort>
> 0x00000001004eea2e <+83>: nop
> 0x00000001004eea2f <+84>: add $0x40,%rsp
> 0x00000001004eea33 <+88>: pop %rbx
> 0x00000001004eea34 <+89>: retq
> End of assembler dump.
>
> Do you think this is worth trying (perhaps after Aidan tries your other
> suggestion, involving 64-bit registers)?
Holding a value in a register AFAIU actually makes the probability of
a clobber by another thread higher than keeping it on the stack.
But I think any idea is worth trying at this time, certainly including
yours. Thanks.
Btw, note that the above version copies the argument into EBX, which
is then pushed onto the stack (10 instructions later) before calling
fprintf. This is somewhat different from the original code, which
held the value in a temporary variable on the stack instead of in EBX.
Not sure this matters, just mentioning it for the record.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Sat, 11 Oct 2014 13:59:02 GMT)
Message #160 received at 18438 <at> debbugs.gnu.org (full text, mbox):
On 10/11/2014 3:11 AM, Eli Zaretskii wrote:
> But I think any idea is worth trying at this time, certainly including
> yours. Thanks.
OK, here's one more. I just remembered a thread on emacs-devel in
August 2013 involving mysterious crashes on 64-bit Cygwin, and you
finally diagnosed the problem: The stack was too small.
(http://lists.gnu.org/archive/html/emacs-devel/2013-08/msg00515.html)
Aidan, what happens if you give emacs a larger stack, say 8MB? (The
default on Cygwin is 2MB.) You can do that at build time by adding
LDFLAGS=-Wl,--stack,0x800000
to your 'configure' invocation. Or you can do it after the build with
the 'peflags' command:
peflags -x0x800000 emacs.exe
Ken
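The hexadecimal value given to the linker and to peflags is a byte count. As a quick sanity check, the sketch below (with the hypothetical helper `bytes_to_mib`, and reading "MB" as 2^20 bytes) converts it to mebibytes.

```c
#include <stdint.h>

/* Stack size requested in the commands above, in bytes.  */
#define STACK_BYTES_REQUESTED UINT64_C (0x800000)

/* Convert a byte count to whole mebibytes (1 MiB = 1024 * 1024 bytes).  */
uint64_t
bytes_to_mib (uint64_t bytes)
{
  return bytes / (UINT64_C (1024) * 1024);
}
```

bytes_to_mib (STACK_BYTES_REQUESTED) is 8, four times the 2MB default mentioned above (0x200000 bytes).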
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Sat, 11 Oct 2014 14:26:01 GMT)
Message #163 received at 18438 <at> debbugs.gnu.org (full text, mbox):
> Date: Sat, 11 Oct 2014 09:58:07 -0400
> From: Ken Brown <kbrown <at> cornell.edu>
> CC: aidalgol <at> amuri.net, 18438 <at> debbugs.gnu.org
>
> OK, here's one more. I just remembered a thread on emacs-devel in
> August 2013 involving mysterious crashes on 64-bit Cygwin, and you
> finally diagnosed the problem: The stack was too small.
> (http://lists.gnu.org/archive/html/emacs-devel/2013-08/msg00515.html)
>
> Aidan, what happens if you give emacs a larger stack, say 8MB? (The
> default on Cygwin is 2MB.) You can do that at build time by adding
>
> LDFLAGS=-Wl,--stack,0x800000
>
> to your 'configure' invocation. Or you can do it after the build with
> the 'peflags' command:
>
> peflags -x0x800000 emacs.exe
Thanks, this is indeed worth pursuing. The MinGW build already
specifies an 8MB stack at link time, see configure.ac.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Sat, 11 Oct 2014 16:32:02 GMT)
Message #166 received at 18438 <at> debbugs.gnu.org (full text, mbox):
On 10/11/2014 10:24 AM, Eli Zaretskii wrote:
> Thanks, this is indeed worth pursuing. The MinGW build already
> specifies an 8MB stack at link time, see configure.ac.
I'm inclined to go ahead and patch configure.ac to do the same on
Cygwin, whether it turns out to fix this bug or not. After all, we
already know from the August 2013 discussion that the default Cygwin
stack is not big enough. What do you think?
Ken
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Sat, 11 Oct 2014 16:54:02 GMT)
Message #169 received at 18438 <at> debbugs.gnu.org (full text, mbox):
> Date: Sat, 11 Oct 2014 12:33:29 -0400
> From: Ken Brown <kbrown <at> cornell.edu>
> CC: aidalgol <at> amuri.net, 18438 <at> debbugs.gnu.org
>
> I'm inclined to go ahead and patch configure.ac to do the same on
> Cygwin, whether it turns out to fix this bug or not. After all, we
> already know from the August 2013 discussion that the default Cygwin
> stack is not big enough. What do you think?
I think go ahead and do it. The result should be tested with as many
additional threads as possible, because each thread eats some of the
stack space.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Sat, 11 Oct 2014 17:19:02 GMT)
Message #172 received at 18438 <at> debbugs.gnu.org (full text, mbox):
On 10/11/2014 12:52 PM, Eli Zaretskii wrote:
>> Date: Sat, 11 Oct 2014 12:33:29 -0400
>> From: Ken Brown <kbrown <at> cornell.edu>
>> CC: aidalgol <at> amuri.net, 18438 <at> debbugs.gnu.org
>>
>> I'm inclined to go ahead and patch configure.ac to do the same on
>> Cygwin, whether it turns out to fix this bug or not. After all, we
>> already know from the August 2013 discussion that the default Cygwin
>> stack is not big enough. What do you think?
>
> I think go ahead and do it. The result should be tested with as many
> additional threads as possible, because each thread eats some of the
> stack space.
Done as revision 117572 on the release branch.
Ken
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Wed, 15 Oct 2014 00:59:02 GMT)
Message #175 received at 18438 <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
On Fri, 10 Oct 2014 10:19:57 +0300, Eli Zaretskii wrote:
>
>> Attached, but the emacs process died while printing the backtrace for
>> thread 2, and I have no idea why.
>
> I do: it's because you started GDB from the src directory, where it
> read the .gdbinit file, which causes the "bt" command to call a
> function in the Emacs process being debugged.
>
> To work around this, comment out (by prepending a # to every line) the
> following few lines in .gdbinit:
>
> define hookpost-backtrace
> set $bt = backtrace_top ()
> if backtrace_p ($bt)
> echo \n
> echo Lisp Backtrace:\n
> xbacktrace
> end
> end
>
> Then you will still be able to invoke "xbacktrace" by hand, but it
> won't be invoked automatically by "bt".
OK, crashed again, and here's the threads' backtraces.
bidi.c:332: bidi type 1 is not in [0..23]
[emacs-74a217c-patched-assert-bidi (text/plain, attachment)]
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Wed, 15 Oct 2014 05:14:02 GMT)
Message #178 received at 18438 <at> debbugs.gnu.org (full text, mbox):
> Date: Wed, 15 Oct 2014 13:58:20 +1300
> From: aidalgol <at> amuri.net
> Cc: Eli Zaretskii <eliz <at> gnu.org>, <kbrown <at> cornell.edu>
>
> OK, crashed again, and here's the threads' backtraces.
Thanks. Is this with or without the stack size increase to 8MB?
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Wed, 15 Oct 2014 19:30:03 GMT)
Message #181 received at 18438 <at> debbugs.gnu.org (full text, mbox):
On Wed, 15 Oct 2014 08:13:16 +0300, Eli Zaretskii wrote:
>> Date: Wed, 15 Oct 2014 13:58:20 +1300
>> From: aidalgol <at> amuri.net
>> Cc: Eli Zaretskii <eliz <at> gnu.org>, <kbrown <at> cornell.edu>
>>
>> OK, crashed again, and here's the threads' backtraces.
>
> Thanks. Is this with or without the stack size increase to 8MB?
This includes the stack size increase made by Ken's commit, and a
clean build after that change was pulled in.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Thu, 16 Oct 2014 07:28:01 GMT)
Message #184 received at 18438 <at> debbugs.gnu.org (full text, mbox):
> Date: Thu, 16 Oct 2014 08:29:16 +1300
> From: aidalgol <at> amuri.net
> Cc: Eli Zaretskii <eliz <at> gnu.org>, <kbrown <at> cornell.edu>
>
> On Wed, 15 Oct 2014 08:13:16 +0300, Eli Zaretskii wrote:
> >> Date: Wed, 15 Oct 2014 13:58:20 +1300
> >> From: aidalgol <at> amuri.net
> >> Cc: Eli Zaretskii <eliz <at> gnu.org>, <kbrown <at> cornell.edu>
> >>
> >> OK, crashed again, and here's the threads' backtraces.
> >
> > Thanks. Is this with or without the stack size increase to 8MB?
>
> This includes the stack size increase made by Ken's commit, and a
> clean build after that change was pulled in.
I see. So one more theory bites the dust.
Let's try to get a couple more full backtraces like this one, in case
some pattern emerges that could give us some ideas.
Ken, isn't it strange that these crashes are reported by so few
people? Or does it mean that only those people are using the 64-bit
Cygwin-w32 build? Maybe it would be worth asking on the Cygwin list,
and getting us some usage statistics for this build, like how many are
using it, and how much uptime each one can generally report? Then, if
only some of them get the crashes, we could try figuring out what the
differences between their systems are.
Thanks.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Thu, 16 Oct 2014 13:12:01 GMT)
Message #187 received at 18438 <at> debbugs.gnu.org (full text, mbox):
On 10/16/2014 3:27 AM, Eli Zaretskii wrote:
> Let's try to get a couple more full backtraces like this one, in case
> some pattern emerges that could give us some ideas.
I saw some things in Thread 7 (the Windows message queue thread), especially
frame #14, which got me to look at the code for w32_wnd_proc in w32fns.c. The
code is about 1300 lines long, and includes several comments about why it is
thread-safe. Here are a few examples:
Walking the frame list in this thread is safe (as long as
writes of Lisp_Object slots are atomic, which they are on Windows).
It is also safe to use functions that make GDI calls, such as
w32_clear_rect, because these functions must obtain a DC handle
from the frame struct using get_frame_dc which is thread-aware.
The code below does something that one shouldn't do: it
accesses the window object from a separate thread, while the
main (a.k.a. "Lisp") thread runs and can legitimately delete
and even GC it. That is why we are extra careful...
I wonder if something in these 1300 lines is not thread-safe on Cygwin. For
example, I don't know if it's true on Cygwin that "writes of Lisp_Object slots
are atomic".
> Ken, isn't it strange that these crashes are reported by so few
> people? Or does it mean that only those people are using the 64-bit
> Cygwin-w32 build? Maybe it would be worth asking on the Cygwin list,
> and getting us some usage statistics for this build, like how many are
> using it, and how much uptime each one can generally report? Then, if
> only some of them get the crashes, we could try figuring out what the
> differences between their systems are.
That's a good idea. I'll do it right now. Thanks for the suggestion.
Ken
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Thu, 16 Oct 2014 13:39:02 GMT)
Message #190 received at 18438 <at> debbugs.gnu.org (full text, mbox):
> Date: Thu, 16 Oct 2014 09:11:18 -0400
> From: Ken Brown <kbrown <at> cornell.edu>
> CC: 18438 <at> debbugs.gnu.org
>
> On 10/16/2014 3:27 AM, Eli Zaretskii wrote:
> > Let's try to get a couple more full backtraces like this one, in case
> > some pattern emerges that could give us some ideas.
>
> I saw some things in Thread 7 (the Windows message queue thread), especially
> frame #14, which got me to look at the code for w32_wnd_proc in w32fns.c. The
> code is about 1300 lines long, and includes several comments about why it is
> thread-safe. Here are a few examples:
>
> Walking the frame list in this thread is safe (as long as
> writes of Lisp_Object slots are atomic, which they are on Windows).
>
> It is also safe to use functions that make GDI calls, such as
> w32_clear_rect, because these functions must obtain a DC handle
> from the frame struct using get_frame_dc which is thread-aware.
>
> The code below does something that one shouldn't do: it
> accesses the window object from a separate thread, while the
> main (a.k.a. "Lisp") thread runs and can legitimately delete
> and even GC it. That is why we are extra careful...
>
> I wonder if something in these 1300 lines is not thread-safe on Cygwin. For
> example, I don't know if it's true on Cygwin that "writes of Lisp_Object slots
> are atomic".
I will take a look, thanks.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Sun, 19 Oct 2014 14:41:03 GMT)
Message #193 received at 18438 <at> debbugs.gnu.org (full text, mbox):
> Date: Thu, 16 Oct 2014 09:11:18 -0400
> From: Ken Brown <kbrown <at> cornell.edu>
> CC: 18438 <at> debbugs.gnu.org
>
> On 10/16/2014 3:27 AM, Eli Zaretskii wrote:
> > Let's try to get a couple more full backtraces like this one, in case
> > some pattern emerges that could give us some ideas.
>
> I saw some things in Thread 7 (the Windows message queue thread), especially
> frame #14, which got me to look at the code for w32_wnd_proc in w32fns.c. The
> code is about 1300 lines long, and includes several comments about why it is
> thread-safe. Here are a few examples:
>
> Walking the frame list in this thread is safe (as long as
> writes of Lisp_Object slots are atomic, which they are on Windows).
>
> It is also safe to use functions that make GDI calls, such as
> w32_clear_rect, because these functions must obtain a DC handle
> from the frame struct using get_frame_dc which is thread-aware.
>
> The code below does something that one shouldn't do: it
> accesses the window object from a separate thread, while the
> main (a.k.a. "Lisp") thread runs and can legitimately delete
> and even GC it. That is why we are extra careful...
>
> I wonder if something in these 1300 lines is not thread-safe on Cygwin. For
> example, I don't know if it's true on Cygwin that "writes of Lisp_Object slots
> are atomic".
I couldn't find even one "write to Lisp_Object slot" in that function,
so I don't see how this would matter.
Besides, the code that crashes has no relation to any Lisp objects: we
are walking the buffer text there. So even if w32_wnd_proc does do
something that's "verboten" with Lisp objects, I still don't see how
that could change the result of a comparison-and-jump pair of
instructions in mid-flight.
The rest of what the comments in w32_wnd_proc say is correct, but
again unrelated, for the same reasons. In fact, I cannot explain to
myself at all how _any_ code that is not thread-safe could cause such
a phenomenon. I can think of no other explanations for what we see
except some code that somehow modifies the CPU flags between the
compare instruction and the following jump instruction. Otherwise,
how can it be that the value is valid, but Emacs still aborts? Any
other ideas?
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Sun, 19 Oct 2014 15:39:02 GMT)
Message #196 received at 18438 <at> debbugs.gnu.org (full text, mbox):
On 10/19/2014 10:39 AM, Eli Zaretskii wrote:
> The rest of what the comments in w32_wnd_proc say is correct, but
> again unrelated, for the same reasons.
OK, thanks for checking.
> In fact, I cannot explain to
> myself at all how _any_ code that is not thread-safe could cause such
> a phenomenon. I can think of no other explanations for what we see
> except some code that somehow modifies the CPU flags between the
> compare instruction and the following jump instruction. Otherwise,
> how can it be that the value is valid, but Emacs still aborts? Any
> other ideas?
What about your earlier suggestion (from
http://debbugs.gnu.org/cgi/bugreport.cgi?bug=18438#139) to force a 64-bit
compare instruction for 'type', with the latter in a 64-bit register:
=== modified file 'src/bidi.c'
--- src/bidi.c 2014-04-06 15:56:01 +0000
+++ src/bidi.c 2014-10-10 07:12:01 +0000
@@ -326,7 +326,14 @@ bidi_get_type (int ch, bidi_dir_t overri
 static void
 bidi_check_type (bidi_type_t type)
 {
-  eassert (UNKNOWN_BT <= type && type <= NEUTRAL_ON);
+  volatile ptrdiff_t qtype = type;
+
+  if (!(suppress_checking || (UNKNOWN_BT <= qtype && qtype <= NEUTRAL_ON)))
+    {
+      fprintf (stderr, "\r\n%s:%d: bidi type %d is not in [%d..%d]\r\n",
+               __FILE__, __LINE__, type, UNKNOWN_BT, NEUTRAL_ON);
+      emacs_abort ();
+    }
 }

 /* Given a bidi TYPE of a character, return its category. */
Aidan, have you tried this yet?
Ken
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Sun, 19 Oct 2014 18:11:01 GMT)
Message #199 received at 18438 <at> debbugs.gnu.org (full text, mbox):
> Date: Sun, 19 Oct 2014 11:37:54 -0400
> From: Ken Brown <kbrown <at> cornell.edu>
> CC: aidalgol <at> amuri.net, 18438 <at> debbugs.gnu.org
>
> What about your earlier suggestion (from
> http://debbugs.gnu.org/cgi/bugreport.cgi?bug=18438#139) to force a 64-bit
> compare instruction for 'type', with the latter in a 64-bit register:
Surely worth trying, but I'd like first to see one or 2 more detailed
backtraces like the last one, in case some pattern emerges from what
we see there.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Sun, 19 Oct 2014 19:50:02 GMT)
Message #202 received at 18438 <at> debbugs.gnu.org (full text, mbox):
On Sun, 19 Oct 2014 11:37:54 -0400, Ken Brown wrote:
> On 10/19/2014 10:39 AM, Eli Zaretskii wrote:
>> The rest of what the comments in w32_wnd_proc say is correct, but
>> again unrelated, for the same reasons.
>
> OK, thanks for checking.
>
>> In fact, I cannot explain to
>> myself at all how _any_ code that is not thread-safe could cause such
>> a phenomenon. I can think of no other explanations for what we see
>> except some code that somehow modifies the CPU flags between the
>> compare instruction and the following jump instruction. Otherwise,
>> how can it be that the value is valid, but Emacs still aborts? Any
>> other ideas?
>
> What about your earlier suggestion (from
> http://debbugs.gnu.org/cgi/bugreport.cgi?bug=18438#139) to force a
> 64-bit compare instruction for 'type', with the latter in a 64-bit
> register:
>
> === modified file 'src/bidi.c'
> --- src/bidi.c 2014-04-06 15:56:01 +0000
> +++ src/bidi.c 2014-10-10 07:12:01 +0000
> @@ -326,7 +326,14 @@ bidi_get_type (int ch, bidi_dir_t overri
>  static void
>  bidi_check_type (bidi_type_t type)
>  {
> -  eassert (UNKNOWN_BT <= type && type <= NEUTRAL_ON);
> +  volatile ptrdiff_t qtype = type;
> +
> +  if (!(suppress_checking || (UNKNOWN_BT <= qtype && qtype <= NEUTRAL_ON)))
> +    {
> +      fprintf (stderr, "\r\n%s:%d: bidi type %d is not in [%d..%d]\r\n",
> +               __FILE__, __LINE__, type, UNKNOWN_BT, NEUTRAL_ON);
> +      emacs_abort ();
> +    }
>  }
>
> /* Given a bidi TYPE of a character, return its category. */
>
>
> Aidan, have you tried this yet?
Oops! No, I somehow missed this patch when I first read that post. I
think I absent-mindedly mistook it for the same patch as the one in
<http://debbugs.gnu.org/cgi/bugreport.cgi?bug=18438#103>. Applying now
and doing a clean build. Note that this means that my last backtrace
was with the patch in message #103, *NOT* the one in #139 that forces a
64-bit comparison.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Sun, 19 Oct 2014 20:21:02 GMT)
Message #205 received at 18438 <at> debbugs.gnu.org (full text, mbox):
On Sun, 19 Oct 2014 17:39:51 +0300, Eli Zaretskii wrote:
> The rest of what the comments in w32_wnd_proc say is correct, but
> again unrelated, for the same reasons. In fact, I cannot explain to
> myself at all how _any_ code that is not thread-safe could cause such
> a phenomenon. I can think of no other explanations for what we see
> except some code that somehow modifies the CPU flags between the
> compare instruction and the following jump instruction. Otherwise,
> how can it be that the value is valid, but Emacs still aborts? Any
> other ideas?
Not sure whether this is relevant, but I have been getting a recurring
seg. fault in w32xfns.c, but in a different function, and in lisp.h.
(Why is there code complex enough in a header file to warrant asserts
there?) I'll post the backtraces in case they're of some help. (I only
got backtraces for the main thread.) The only patch applied was the one
from message #103
<http://debbugs.gnu.org/cgi/bugreport.cgi?bug=18438#103>.
[SIGSEGV-lisp_h-0a2fe9a-patched (text/plain, attachment)]
[SIGSEGV-lisp_h-b8497de-patched (text/plain, attachment)]
[SIGSEGV-w32xfns_c-af4c73d-patched (text/plain, attachment)]
[SIGSEGV-w32xfns_c-ffb1b3a-patched (text/plain, attachment)]
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Mon, 20 Oct 2014 15:47:02 GMT)
Message #208 received at 18438 <at> debbugs.gnu.org (full text, mbox):
> Date: Mon, 20 Oct 2014 09:20:52 +1300
> From: aidalgol <at> amuri.net
> Cc: Ken Brown <kbrown <at> cornell.edu>, <18438 <at> debbugs.gnu.org>
>
> Not sure whether this is relevant, but I have been getting a recurring
> seg. fault in w32xfns.c, but in a different function, and in lisp.h.
> (Why is there code complex enough in a header file to warrant asserts
> there?)
The assertions are in inline functions that manipulate Lisp objects.
In a binary configured with --enable-checking, each Lisp object is
tested for validity when the C code extracts from it a C pointer to
the corresponding structure.
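To illustrate the kind of check described above, here is a minimal sketch. It is not the actual lisp.h code: it assumes a toy object representation with a 3-bit type tag in the low bits of a pointer, and every name in it (toy_object, toy_xvector, TOY_ENABLE_CHECKING, and so on) is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical miniature of a tagged object: the low 3 bits hold a
   type tag, the remaining bits hold an 8-byte-aligned pointer.  */
typedef uintptr_t toy_object;

enum toy_tag { TOY_TAG_SYMBOL = 0, TOY_TAG_VECTORLIKE = 5, TOY_TAG_MASK = 7 };

/* Forced to 8-byte alignment so the low 3 bits of its address are free.  */
struct toy_vector { _Alignas (8) int64_t size; };

/* Nonzero if OBJ carries the vectorlike tag.  */
static inline int
toy_vectorlikep (toy_object obj)
{
  return (obj & TOY_TAG_MASK) == TOY_TAG_VECTORLIKE;
}

/* Build a tagged object from an aligned pointer.  */
static inline toy_object
toy_make_vectorlike (struct toy_vector *v)
{
  assert (((uintptr_t) v & TOY_TAG_MASK) == 0);
  return (uintptr_t) v | TOY_TAG_VECTORLIKE;
}

/* Extract the C pointer.  In a checking build (an assumed macro here)
   the tag is verified first; otherwise the test compiles to nothing.  */
static inline struct toy_vector *
toy_xvector (toy_object obj)
{
#ifdef TOY_ENABLE_CHECKING
  assert (toy_vectorlikep (obj));
#endif
  return (struct toy_vector *) (obj & ~(uintptr_t) TOY_TAG_MASK);
}
```

Because the accessor is an inline function in a header, the assertion is reported at the header's file and line, which is why a checking build can fail "in lisp.h".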
> #0 0x000000010051ce24 in CHAR_TABLE_REF_ASCII (ct=25778897525, idx=112) at lisp.h:1492
> tbl = 0x0
> val = 2230448
> #1 0x000000010051cefa in CHAR_TABLE_REF (ct=25778897525, idx=112) at lisp.h:1510
> No locals.
> #2 0x0000000100520ea9 in syntax_property_entry (c=112, via_property=true) at syntax.h:96
> No locals.
This is again a nonsensical backtrace. The code near line 1492 of
lisp.h, where it crashes, is this (line 1492 is the last one):
INLINE Lisp_Object
CHAR_TABLE_REF_ASCII (Lisp_Object ct, ptrdiff_t idx)
{
  struct Lisp_Char_Table *tbl = NULL;
  Lisp_Object val;
  do
    {
      tbl = tbl ? XCHAR_TABLE (tbl->parent) : XCHAR_TABLE (ct); <<<<<<<<<
So, if 'tbl' is a NULL pointer, it cannot be dereferenced, right? And
yet the local variables clearly show that 'tbl' _is_ NULL, and it
still is dereferenced (and causes the segfault)!
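The guard in that loop can be reproduced in isolation. The following is a simplified sketch under stated assumptions, not the real char-table code: struct toy_table, the -1 "unset" marker, and toy_table_ref_ascii are all hypothetical stand-ins.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a char-table: an ASCII slot array plus a
   parent table to fall back on.  A value of -1 means "unset here".  */
struct toy_table
{
  int ascii[128];
  struct toy_table *parent;
};

/* Look up IDX in CT, falling back to parents while the slot is unset.
   TBL starts out NULL, so the first iteration must take the CT branch;
   tbl->parent is evaluated only when TBL is non-NULL.  */
static int
toy_table_ref_ascii (struct toy_table *ct, int idx)
{
  struct toy_table *tbl = NULL;
  int val;

  do
    {
      tbl = tbl ? tbl->parent : ct;
      val = tbl->ascii[idx];
    }
  while (val == -1 && tbl->parent != NULL);

  return val;
}
```

In correctly executing code the conditional expression never reads tbl->parent through a NULL tbl, so a fault on that line with tbl = 0x0 cannot be explained by the source logic alone.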
> #0 0x000000010051ceb4 in CHAR_TABLE_REF_ASCII (ct=25787135005, idx=44) at lisp.h:1492
> tbl = 0x0
> val = 2230320
> #1 0x000000010051cf8a in CHAR_TABLE_REF (ct=25787135005, idx=44) at lisp.h:1510
> No locals.
Same here.
> #0 0x0000000100680609 in deselect_palette (f=0x0, hdc=0x0) at w32xfns.c:123
> No locals.
> #1 0x00000001006806d8 in release_frame_dc (f=0x0, hdc=0x0) at w32xfns.c:154
> ret = 0
> #2 0x0000000100683d36 in uniscribe_encode_char (font=0x600764000, c=32) at w32uniscribe.c:585
> context = 0x0
> f = 0x0
> old_font = 0x0
And this is a similar situation, just in a different place (see
bug#18659):
  if (context)
    {
      SelectObject (context, old_font);
      release_frame_dc (f, context); <<<<<<<<<<<<<<<<<<<<<<
    }
As you see, if 'context' is a NULL pointer, release_frame_dc should
NOT be called. And yet the locals in frame #2 above clearly show that
it _is_ NULL, and release_frame_dc _is_ called!
> #0 0x0000000100680609 in deselect_palette (f=0x0, hdc=0x0) at w32xfns.c:123
> No locals.
> #1 0x00000001006806d8 in release_frame_dc (f=0x0, hdc=0x0) at w32xfns.c:154
> ret = 0
> #2 0x0000000100683d36 in uniscribe_encode_char (font=0x600b25360, c=48) at w32uniscribe.c:585
> context = 0x0
> f = 0x0
> old_font = 0x0
> code = 19
Same here.
IOW, these all exhibit the same bug, just in different places in the
Emacs sources.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Mon, 20 Oct 2014 16:53:01 GMT)
Message #211 received at 18438 <at> debbugs.gnu.org (full text, mbox):
> Date: Mon, 20 Oct 2014 18:46:13 +0300
> From: Eli Zaretskii <eliz <at> gnu.org>
> Cc: 18438 <at> debbugs.gnu.org
>
> IOW, these all exhibit the same bug, just in different places in the
> Emacs sources.
One possible cause that can explain all of these is that the CPU flags
are reset between the comparison instruction and the following jump
instruction. For example, if the Zero flag is reset, then a test for
a NULL pointer will misbehave.
Can anyone think of a scenario where this could happen?
Another idea I have is to collect information about the Emacs build
from all the 4 people who experience these problems, and look for
common libraries or versions of libraries used for the build, like
dbus, glib, maybe the Cygwin runtime versions, etc.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Mon, 20 Oct 2014 19:36:01 GMT)
Message #214 received at 18438 <at> debbugs.gnu.org (full text, mbox):
>>>>> Eli Zaretskii <eliz <at> gnu.org> writes:
>>>>> From: Eli Zaretskii <eliz <at> gnu.org>
>> IOW, these all exhibit the same bug, just in different places in the
>> Emacs sources.
> One possible cause that can explain all of these is that the CPU
> flags are reset between the comparison instruction and the following
> jump instruction. For example, if the Zero flag is reset, then a
> test for a NULL pointer will misbehave.
> Can anyone think of a scenario where this could happen?
For me, it’s much easier to imagine a misbehaving compiler here.
Is there a chance of getting a disassembly at the points of
interest?
[…]
--
FSF associate member #7257 http://boycottsystemd.org/ … 3013 B6A0 230E 334A
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Mon, 20 Oct 2014 19:41:01 GMT)
Message #217 received at 18438 <at> debbugs.gnu.org (full text, mbox):
> From: Ivan Shmakov <ivan <at> siamics.net>
> Date: Mon, 20 Oct 2014 19:35:01 +0000
>
> >>>>> Eli Zaretskii <eliz <at> gnu.org> writes:
> >>>>> From: Eli Zaretskii <eliz <at> gnu.org>
>
> >> IOW, these all exhibit the same bug, just in different places in the
> >> Emacs sources.
>
> > One possible cause that can explain all of these is that the CPU
> > flags are reset between the comparison instruction and the following
> > jump instruction. For example, if the Zero flag is reset, then a
> > test for a NULL pointer will misbehave.
>
> > Can anyone think of a scenario where this could happen?
>
> For me, it’s much easier to imagine a misbehaving compiler here.
> Is there a chance of getting a disassembly at the points of
> interest?
It was posted several times in this bug report, look it up.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Mon, 20 Oct 2014 20:04:02 GMT)
Message #220 received at 18438 <at> debbugs.gnu.org (full text, mbox):
>>>>> Eli Zaretskii <eliz <at> gnu.org> writes:
>>>>> From: Ivan Shmakov <ivan <at> siamics.net>
[…]
>> For me, it’s much easier to imagine a misbehaving compiler here. Is
>> there a chance of getting a disassembly at the points of interest?
> It was posted several times in this bug report, look it up.
Indeed, I’ve missed them while reading this thread via the list.
Unfortunately, at a glance, I can’t add anything to the analysis
already done. I’ll take another look at that, but all the
obvious ideas seem to already have been spoken out.
--
FSF associate member #7257 http://boycottsystemd.org/ … 3013 B6A0 230E 334A
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Mon, 20 Oct 2014 21:00:03 GMT)
Message #223 received at 18438 <at> debbugs.gnu.org (full text, mbox):
On 10/20/2014 12:51 PM, Eli Zaretskii wrote:
> Another idea I have is to collect information about the Emacs build
> from all the 4 people who experience these problems, and look for
> common libraries or versions of libraries used for the build, like
> dbus, glib, maybe the Cygwin runtime versions, etc.
Aidan does his own build, and I assume he has the current releases of
all the relevant libraries (and Cygwin DLL) installed. Is that right,
Aidan? The other people who have reported this problem all use my build
of Emacs for the Cygwin distribution.
I'm about to release Emacs-24.4 within the next couple days, so when
people update Emacs, the Cygwin setup program should also update all
libraries. That means everyone will have the same versions unless they
deliberately choose to keep an old version.
Ken
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Tue, 21 Oct 2014 15:43:02 GMT)
Message #226 received at 18438 <at> debbugs.gnu.org (full text, mbox):
> Date: Mon, 20 Oct 2014 16:59:18 -0400
> From: Ken Brown <kbrown <at> cornell.edu>
> CC: aidalgol <at> amuri.net, 18438 <at> debbugs.gnu.org
>
> I'm about to release Emacs-24.4 within the next couple days, so when
> people update Emacs, the Cygwin setup program should also update all
> libraries. That means everyone will have the same versions unless they
> deliberately choose to keep an old version.
If you build Emacs 24.4 with optimizations, please still specify
"--enable-checking", because otherwise the assertions that cause 3/4th
of the problems we have on record will be compiled to nothing, and we
will be left only with the problem in w32uniscribe.c, which is only
compiled in the Cygwin-w32 build.
Thanks.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Tue, 21 Oct 2014 16:19:02 GMT)
Message #229 received at 18438 <at> debbugs.gnu.org (full text, mbox):
On 10/21/2014 11:42 AM, Eli Zaretskii wrote:
>> Date: Mon, 20 Oct 2014 16:59:18 -0400
>> From: Ken Brown <kbrown <at> cornell.edu>
>> CC: aidalgol <at> amuri.net, 18438 <at> debbugs.gnu.org
>>
>> I'm about to release Emacs-24.4 within the next couple days, so when
>> people update Emacs, the Cygwin setup program should also update all
>> libraries. That means everyone will have the same versions unless they
>> deliberately choose to keep an old version.
>
> If you build Emacs 24.4 with optimizations, please still specify
> "--enable-checking", because otherwise the assertions that cause 3/4th
> of the problems we have on record will be compiled to nothing, and we
> will be left only with the problem in w32uniscribe.c, which is only
> compiled in the Cygwin-w32 build.
Yes, I'll do that in the x86_64 builds. I'm also planning to use -Og
instead of the usual -O2 in those builds.
Ken
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Tue, 21 Oct 2014 19:40:01 GMT)
Message #232 received at 18438 <at> debbugs.gnu.org (full text, mbox):
On Mon, 20 Oct 2014 16:59:18 -0400, Ken Brown wrote:
> On 10/20/2014 12:51 PM, Eli Zaretskii wrote:
>> Another idea I have is to collect information about the Emacs build
>> from all the 4 people who experience these problems, and look for
>> common libraries or versions of libraries used for the build, like
>> dbus, glib, maybe the Cygwin runtime versions, etc.
>
> Aidan does his own build, and I assume he has the current releases of
> all the relevant libraries (and Cygwin DLL) installed. Is that right,
> Aidan? The other people who have reported this problem all use my
> build of Emacs for the Cygwin distribution.
Yes, that is correct. Which version of Windows are the other four
people using? (I'm on Windows 7.)
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Tue, 21 Oct 2014 21:13:02 GMT)
Message #235 received at 18438 <at> debbugs.gnu.org (full text, mbox):
On 10/21/2014 3:38 PM, aidalgol <at> amuri.net wrote:
> On Mon, 20 Oct 2014 16:59:18 -0400, Ken Brown wrote:
>> On 10/20/2014 12:51 PM, Eli Zaretskii wrote:
>>> Another idea I have is to collect information about the Emacs build
>>> from all the 4 people who experience these problems, and look for
>>> common libraries or versions of libraries used for the build, like
>>> dbus, glib, maybe the Cygwin runtime versions, etc.
>>
>> Aidan does his own build, and I assume he has the current releases of
>> all the relevant libraries (and Cygwin DLL) installed. Is that right,
>> Aidan? The other people who have reported this problem all use my
>> build of Emacs for the Cygwin distribution.
>
> Yes, that is correct. Which version of Windows are the other four
> people using? (I'm on Windows 7.)
I'm also on Windows 7, as is Markus (bug#17753). I don't know about Jon
(bug#18769). In any case, I'm about to release emacs-24.4 for the
Cygwin distribution, with checking enabled. So we'll probably be
getting some more data points.
Ken
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Tue, 21 Oct 2014 21:59:01 GMT)
Message #238 received at 18438 <at> debbugs.gnu.org (full text, mbox):
On Tue, 21 Oct 2014 17:12:26 -0400, Ken Brown wrote:
> On 10/21/2014 3:38 PM, aidalgol <at> amuri.net wrote:
>> Yes, that is correct. Which version of Windows are the other four
>> people using? (I'm on Windows 7.)
>
> I'm also on Windows 7, as is Markus (bug#17753). I don't know about
> Jon (bug#18769). In any case, I'm about to release emacs-24.4 for the
> Cygwin distribution, with checking enabled. So we'll probably be
> getting some more data points.
I've written a GDB command to print the full backtrace for all threads,
which may be of use to the others experiencing this bug.
[utils-gdb.py (text/x-java, attachment)]
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Tue, 21 Oct 2014 22:23:02 GMT)
Message #241 received at 18438 <at> debbugs.gnu.org (full text, mbox):
On 10/21/2014 5:58 PM, aidalgol <at> amuri.net wrote:
> I've written a GDB command to print the full backtrace for all threads,
> which may be of use to the others experiencing this bug.
Thanks! Where should I put this script so that the command will be enabled?
Ken
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Tue, 21 Oct 2014 22:39:02 GMT)
Message #244 received at 18438 <at> debbugs.gnu.org (full text, mbox):
On Tue, 21 Oct 2014 18:21:55 -0400, Ken Brown wrote:
> On 10/21/2014 5:58 PM, aidalgol <at> amuri.net wrote:
>> I've written a GDB command to print the full backtrace for all threads,
>> which may be of use to the others experiencing this bug.
>
> Thanks! Where should I put this script so that the command will be
> enabled?
I don't know; I just load it with the "source" command. I think the
only way to make it autoloaded is to "source" it in the .gdbinit
file.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Wed, 22 Oct 2014 04:17:01 GMT)
Message #247 received at 18438 <at> debbugs.gnu.org (full text, mbox):
Another full backtrace (of all threads).
alloc.c:2830: Emacs fatal error: assertion failed: VBLOCK_BYTES_MIN <= nbytes && nbytes <= VBLOCK_BYTES_MAX
[assert-VBLOCK_BYTES_MIN-emacs_c-dbaca04-patched (text/plain, attachment)]
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Wed, 22 Oct 2014 15:25:02 GMT)
Message #250 received at 18438 <at> debbugs.gnu.org (full text, mbox):
> Date: Wed, 22 Oct 2014 17:16:36 +1300
> From: aidalgol <at> amuri.net
>
> Another full backtrace (of all threads).
>
> alloc.c:2830: Emacs fatal error: assertion failed: VBLOCK_BYTES_MIN <= nbytes && nbytes <= VBLOCK_BYTES_MAX
And another bogus assertion:
> #1 0x00000001005b0375 in die (msg=0x100a3a368 <STRING_BYTES_MAX+296> "VBLOCK_BYTES_MIN <= nbytes && nbytes <= VBLOCK_BYTES_MAX", file=0x100a3a020 <DEFAULT_REHASH_SIZE+40> "alloc.c", line=2830) at alloc.c:6833
> No locals.
> #2 0x00000001005a8f2d in allocate_vector_from_block (nbytes=112) at alloc.c:2830
> vector = 0x823f50
> block = 0x1
> index = 25769803776
> restbytes = 8536096
> #3 0x00000001005a9a49 in allocate_vectorlike (len=13) at alloc.c:3073
> nbytes = 112
> p = 0x824240
'nbytes' is 112, which is perfectly within valid limits
(VBLOCK_BYTES_MIN is 8 and VBLOCK_BYTES_MAX is around 2000).
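The numbers in frames #2 and #3 are mutually consistent, which can be checked with a small sketch. It assumes a 64-bit build with an 8-byte vector header and 8-byte slots; the toy_ names are hypothetical, and TOY_VBLOCK_BYTES_MAX is only a round placeholder for "around 2000", not the exact constant from alloc.c.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed sizes for a 64-bit build.  */
static const size_t TOY_HEADER_SIZE = 8;
static const size_t TOY_WORD_SIZE = 8;

/* Lower bound as quoted in the thread; the upper bound is a placeholder.  */
static const size_t TOY_VBLOCK_BYTES_MIN = 8;
static const size_t TOY_VBLOCK_BYTES_MAX = 2000;

/* Bytes needed for a vector of LEN slots: header plus LEN words.  */
static size_t
toy_vector_nbytes (size_t len)
{
  return TOY_HEADER_SIZE + len * TOY_WORD_SIZE;
}

/* The range test that the failing assertion expresses.  */
static int
toy_vblock_size_ok (size_t nbytes)
{
  return TOY_VBLOCK_BYTES_MIN <= nbytes && nbytes <= TOY_VBLOCK_BYTES_MAX;
}
```

Under these assumptions a 13-slot vector needs 8 + 13 * 8 = 112 bytes, matching len=13 and nbytes=112 in the backtrace, and 112 passes the range test.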
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Wed, 22 Oct 2014 17:17:01 GMT)
Message #253 received at 18438 <at> debbugs.gnu.org (full text, mbox):
> Date: Wed, 22 Oct 2014 17:16:36 +1300
> From: aidalgol <at> amuri.net
>
> Full backtrace for thread 9
> #0 0x000000007790131a in ntdll!ZwReadFile () from /cygdrive/c/Windows/SYSTEM32/ntdll.dll
> No symbol table info available.
> #1 0x000007fefe051a7a in ReadFile () from /cygdrive/c/Windows/system32/KERNELBASE.dll
> No symbol table info available.
> #2 0x00000000777a0a19 in ReadFile () from /cygdrive/c/Windows/system32/kernel32.dll
> No symbol table info available.
> #3 0x00000001800ed834 in proc_waiter(void*) () from /usr/bin/cygwin1.dll
> No symbol table info available.
> #4 0x0000000180044fc5 in cygthread::callfunc(bool) () from /usr/bin/cygwin1.dll
> No symbol table info available.
> #5 0x000000018004552a in cygthread::stub(void*) () from /usr/bin/cygwin1.dll
> No symbol table info available.
> #6 0x000000018004619b in _cygtls::call2(unsigned int (*)(void*, void*), void*, void*) () from /usr/bin/cygwin1.dll
> No symbol table info available.
> #7 0x00000001800462f4 in _cygtls::call(unsigned int (*)(void*, void*), void*) () from /usr/bin/cygwin1.dll
> No symbol table info available.
> #8 0x00000000777a59ed in KERNEL32!BaseThreadInitThunk () from /cygdrive/c/Windows/system32/kernel32.dll
> No symbol table info available.
> #9 0x00000000778dc541 in ntdll!RtlUserThreadStart () from /cygdrive/c/Windows/SYSTEM32/ntdll.dll
> No symbol table info available.
> #10 0x0000000000000000 in ?? ()
> No symbol table info available.
> Backtrace stopped: previous frame inner to this frame (corrupt stack?)
>
> Full backtrace for thread 10
> #0 0x000000007790131a in ntdll!ZwReadFile () from /cygdrive/c/Windows/SYSTEM32/ntdll.dll
> No symbol table info available.
> #1 0x000007fefe051a7a in ReadFile () from /cygdrive/c/Windows/system32/KERNELBASE.dll
> No symbol table info available.
> #2 0x00000000777a0a19 in ReadFile () from /cygdrive/c/Windows/system32/kernel32.dll
> No symbol table info available.
> #3 0x00000001800ed834 in proc_waiter(void*) () from /usr/bin/cygwin1.dll
> No symbol table info available.
> #4 0x0000000180044fc5 in cygthread::callfunc(bool) () from /usr/bin/cygwin1.dll
> No symbol table info available.
> #5 0x000000018004552a in cygthread::stub(void*) () from /usr/bin/cygwin1.dll
> No symbol table info available.
> #6 0x000000018004619b in _cygtls::call2(unsigned int (*)(void*, void*), void*, void*) () from /usr/bin/cygwin1.dll
> No symbol table info available.
> #7 0x00000001800462f4 in _cygtls::call(unsigned int (*)(void*, void*), void*) () from /usr/bin/cygwin1.dll
> No symbol table info available.
> #8 0x00000000777a59ed in KERNEL32!BaseThreadInitThunk () from /cygdrive/c/Windows/system32/kernel32.dll
> No symbol table info available.
> #9 0x00000000778dc541 in ntdll!RtlUserThreadStart () from /cygdrive/c/Windows/SYSTEM32/ntdll.dll
> No symbol table info available.
> #10 0x0000000000000000 in ?? ()
> No symbol table info available.
> Backtrace stopped: previous frame inner to this frame (corrupt stack?)
These 2 threads seem to be waiting on a pipe to a subprocess. Any
idea which 2 subprocesses were running at the time?
Also, is there any pattern to what Emacs is doing when these assertion
violations and segfaults happen? Is it idle, or are you typing
something, and if the latter, what were you doing immediately prior to
the crash?
Thanks.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Wed, 22 Oct 2014 20:40:02 GMT)
Message #256 received at 18438 <at> debbugs.gnu.org (full text, mbox):
Here's one more random idea. In case this problem is caused by a bug in
one of the libraries that Emacs depends on, I wonder if it would be
useful for Aidan to build emacs so that it depends on only those
libraries that he absolutely needs. (Aidan, you can use cygcheck or ldd
to see which DLLs Emacs pulls in.) If, by some miracle, the crashes and
assertion violations stop, he can add them back in one by one to see
which one is the culprit.
Ken
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Thu, 23 Oct 2014 04:16:02 GMT)
Message #259 received at 18438 <at> debbugs.gnu.org (full text, mbox):
On Wed, 22 Oct 2014 20:16:28 +0300, Eli Zaretskii wrote:
> These 2 threads seem to be waiting on a pipe to a subprocess. Any
> idea which 2 subprocesses were running at the time?
I had ERC running using gnutls-cli, so that's one, but I don't know
what the other one would have been. I *may* have had the Python
interpreter open, but I thought I had killed that before the crash.
> Also, is there any pattern to what Emacs is doing when these assertion
> violations and segfaults happen? Is it idle, or are you typing
> something, and if the latter, what were you doing immediately prior to
> the crash?
No pattern whatsoever; sometimes it's been while typing, sometimes
while idle, and it has crashed with no subprocesses running, although
not as often, as I usually do have ERC connected.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Thu, 23 Oct 2014 20:39:03 GMT)
Message #262 received at 18438 <at> debbugs.gnu.org (full text, mbox):
On Wed, 22 Oct 2014 16:39:42 -0400, Ken Brown wrote:
> Here's one more random idea. In case this problem is caused by a bug
> in one of the libraries that Emacs depends on, I wonder if it would be
> useful for Aidan to build emacs so that it depends on only those
> libraries that he absolutely needs. (Aidan, you can use cygcheck or
> ldd to see which DLLs Emacs pulls in.) If, by some miracle, the
> crashes and assertion violations stop, he can add them back in one by
> one to see which one is the culprit.
OK, last backtrace before moving to a minimal build. Here's the output
of ldd on my current emacs build:
$ ldd emacs.exe
ntdll.dll => /cygdrive/c/Windows/SYSTEM32/ntdll.dll (0x76ff0000)
kernel32.dll => /cygdrive/c/Windows/system32/kernel32.dll (0x76d70000)
KERNELBASE.dll => /cygdrive/c/Windows/system32/KERNELBASE.dll (0x7fefd6b0000)
cygwin1.dll => /usr/bin/cygwin1.dll (0x180040000)
cyggio-2.0-0.dll => /usr/bin/cyggio-2.0-0.dll (0x3fd680000)
cygglib-2.0-0.dll => /usr/bin/cygglib-2.0-0.dll (0x3fd540000)
cygiconv-2.dll => /usr/bin/cygiconv-2.dll (0x3fd0f0000)
cygintl-8.dll => /usr/bin/cygintl-8.dll (0x3fafc0000)
cygpcre-1.dll => /usr/bin/cygpcre-1.dll (0x3fa460000)
cyggmodule-2.0-0.dll => /usr/bin/cyggmodule-2.0-0.dll (0x3fd530000)
cyggobject-2.0-0.dll => /usr/bin/cyggobject-2.0-0.dll (0x3fd300000)
cygffi-6.dll => /usr/bin/cygffi-6.dll (0x3fdd90000)
cygz.dll => /usr/bin/cygz.dll (0x3f9620000)
cygjpeg-8.dll => /usr/bin/cygjpeg-8.dll (0x3fad70000)
cygMagickCore-5.dll => /usr/bin/cygMagickCore-5.dll (0x3fe9a0000)
cyggomp-1.dll => /usr/bin/cyggomp-1.dll (0x3fd2e0000)
cyggcc_s-seh-1.dll => /usr/bin/cyggcc_s-seh-1.dll (0x3fd890000)
cygautotrace-3.dll => /usr/bin/cygautotrace-3.dll (0x3fe460000)
cygming-1.dll => /usr/bin/cygming-1.dll (0x3fa920000)
cygfreetype-6.dll => /usr/bin/cygfreetype-6.dll (0x3fd8b0000)
cygbz2-1.dll => /usr/bin/cygbz2-1.dll (0x3fe400000)
cygpng15-15.dll => /usr/bin/cygpng15-15.dll (0x3fa1c0000)
cyggif-4.dll => /usr/bin/cyggif-4.dll (0x3fd7c0000)
cygX11-6.dll => /usr/bin/cygX11-6.dll (0x3fe750000)
cygxcb-1.dll => /usr/bin/cygxcb-1.dll (0x3f9790000)
cygXau-6.dll => /usr/bin/cygXau-6.dll (0x3fe730000)
cygXdmcp-6.dll => /usr/bin/cygXdmcp-6.dll (0x3fe710000)
cygpstoedit-0.dll => /usr/bin/cygpstoedit-0.dll (0x3f9fa0000)
cyggd-2.dll => /usr/bin/cyggd-2.dll (0x3fd840000)
cygfontconfig-1.dll => /usr/bin/cygfontconfig-1.dll (0x3fda10000)
cygexpat-1.dll => /usr/bin/cygexpat-1.dll (0x3fdcb0000)
cygXpm-4.dll => /usr/bin/cygXpm-4.dll (0x3fe6a0000)
cygstdc++-6.dll => /usr/bin/cygstdc++-6.dll (0x3f9c60000)
cygcairo-2.dll => /usr/bin/cygcairo-2.dll (0x3fe2e0000)
cygGL-1.dll => /usr/bin/cygGL-1.dll (0x3fff50000)
cygglapi-0.dll => /usr/bin/cygglapi-0.dll (0x3fd630000)
cygX11-xcb-1.dll => /usr/bin/cygX11-xcb-1.dll (0x3fe740000)
cygxcb-glx-0.dll => /usr/bin/cygxcb-glx-0.dll (0x3f9420000)
cygpixman-1-0.dll => /usr/bin/cygpixman-1-0.dll (0x3fa060000)
cygxcb-render-0.dll => /usr/bin/cygxcb-render-0.dll (0x3f9780000)
cygxcb-shm-0.dll => /usr/bin/cygxcb-shm-0.dll (0x3f9410000)
cygXext-6.dll => /usr/bin/cygXext-6.dll (0x3fe6f0000)
cygXrender-1.dll => /usr/bin/cygXrender-1.dll (0x3fe660000)
cygfftw3-3.dll => /usr/bin/cygfftw3-3.dll (0x3fdb80000)
cygfpx-1.dll => /usr/bin/cygfpx-1.dll (0x3fd950000)
cyggs-9.dll => /usr/bin/cyggs-9.dll (0x3fc960000)
cygidn-11.dll => /usr/bin/cygidn-11.dll (0x3faf80000)
cyglcms2-2.dll => /usr/bin/cyglcms2-2.dll (0x3fabc0000)
cygpaper-1.dll => /usr/bin/cygpaper-1.dll (0x3fa4b0000)
cygtiff-5.dll => /usr/bin/cygtiff-5.dll (0x3f9870000)
cygjbig-2.dll => /usr/bin/cygjbig-2.dll (0x3fadd0000)
cygXt-6.dll => /usr/bin/cygXt-6.dll (0x3fe600000)
cygICE-6.dll => /usr/bin/cygICE-6.dll (0x3fff30000)
cygSM-6.dll => /usr/bin/cygSM-6.dll (0x3fe870000)
cyguuid-1.dll => /usr/bin/cyguuid-1.dll (0x3f97d0000)
cygjasper-1.dll => /usr/bin/cygjasper-1.dll (0x3fadf0000)
cygltdl-7.dll => /usr/bin/cygltdl-7.dll (0x3fab10000)
cyglzma-5.dll => /usr/bin/cyglzma-5.dll (0x3fac30000)
cygpango-1.0-0.dll => /usr/bin/cygpango-1.0-0.dll (0x3fa510000)
cygthai-0.dll => /usr/bin/cygthai-0.dll (0x3f9900000)
cygdatrie-1.dll => /usr/bin/cygdatrie-1.dll (0x3fe030000)
cygpangocairo-1.0-0.dll => /usr/bin/cygpangocairo-1.0-0.dll (0x3fa4f0000)
cygpangoft2-1.0-0.dll => /usr/bin/cygpangoft2-1.0-0.dll (0x3fa4d0000)
cygharfbuzz-0.dll => /usr/bin/cygharfbuzz-0.dll (0x3fc900000)
cyggraphite2-3.dll => /usr/bin/cyggraphite2-3.dll (0x3fd2b0000)
cygrsvg-2-2.dll => /usr/bin/cygrsvg-2-2.dll (0x3f9f00000)
cygcroco-0.6-3.dll => /usr/bin/cygcroco-0.6-3.dll (0x3fe210000)
cygxml2-2.dll => /usr/bin/cygxml2-2.dll (0x3f9640000)
cyggdk_pixbuf-2.0-0.dll => /usr/bin/cyggdk_pixbuf-2.0-0.dll (0x3fd7d0000)
ADVAPI32.dll => /cygdrive/c/Windows/system32/ADVAPI32.dll (0x7fefdd20000)
msvcrt.dll => /cygdrive/c/Windows/system32/msvcrt.dll (0x7feff070000)
sechost.dll => /cygdrive/c/Windows/SYSTEM32/sechost.dll (0x7fefda30000)
RPCRT4.dll => /cygdrive/c/Windows/system32/RPCRT4.dll (0x7fefdad0000)
GDI32.dll => /cygdrive/c/Windows/system32/GDI32.dll (0x7fefda50000)
USER32.dll => /cygdrive/c/Windows/system32/USER32.dll (0x76c70000)
LPK.dll => /cygdrive/c/Windows/system32/LPK.dll (0x7feff110000)
USP10.dll => /cygdrive/c/Windows/system32/USP10.dll (0x7fefdc00000)
cygMagickWand-5.dll => /usr/bin/cygMagickWand-5.dll (0x3fe880000)
cygncursesw-10.dll => /usr/bin/cygncursesw-10.dll (0x3faa30000)
cygtiff-6.dll => /usr/bin/cygtiff-6.dll (0x3f9800000)
COMCTL32.dll =>
/cygdrive/c/Windows/WinSxS/amd64_microsoft.windows.common-controls_6595b64144ccf1df_6.0.7601.17514_none_fa396087175ac9ac/COMCTL32.dll
(0x7fefbe70000)
SHLWAPI.dll => /cygdrive/c/Windows/system32/SHLWAPI.dll
(0x7fefe070000)
comdlg32.dll => /cygdrive/c/Windows/system32/comdlg32.dll
(0x7fefefd0000)
SHELL32.dll => /cygdrive/c/Windows/system32/SHELL32.dll
(0x7fefe240000)
ole32.dll => /cygdrive/c/Windows/system32/ole32.dll
(0x7fefde60000)
IMM32.DLL => /cygdrive/c/Windows/system32/IMM32.DLL
(0x7fefe210000)
MSCTF.dll => /cygdrive/c/Windows/system32/MSCTF.dll
(0x7fefd820000)
[assert-STRINGP-emacs_c-126aa17 (text/plain, attachment)]
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Thu, 23 Oct 2014 21:55:02 GMT)
Message #265 received at 18438 <at> debbugs.gnu.org (full text, mbox):
On Wed, 22 Oct 2014 16:39:42 -0400, Ken Brown wrote:
> Here's one more random idea. In case this problem is caused by a bug
> in one of the libraries that Emacs depends on, I wonder if it would
> be useful for Aidan to build emacs so that it depends on only those
> libraries that he absolutely needs. (Aidan, you can use cygcheck or
> ldd to see which DLLs Emacs pulls in.) If, by some miracle, the
> crashes and assertion violations stop, he can add them back in one by
> one to see which one is the culprit.
Welp, another one bites the dust. Asserts still happening.
$ ldd src/emacs.exe
ntdll.dll => /cygdrive/c/Windows/SYSTEM32/ntdll.dll (0x76ff0000)
kernel32.dll => /cygdrive/c/Windows/system32/kernel32.dll (0x76d70000)
KERNELBASE.dll => /cygdrive/c/Windows/system32/KERNELBASE.dll (0x7fefd6b0000)
cygwin1.dll => /usr/bin/cygwin1.dll (0x180040000)
cygncursesw-10.dll => /usr/bin/cygncursesw-10.dll (0x3faa30000)
cygxml2-2.dll => /usr/bin/cygxml2-2.dll (0x3f9640000)
cygiconv-2.dll => /usr/bin/cygiconv-2.dll (0x3fd0f0000)
cygz.dll => /usr/bin/cygz.dll (0x3f9620000)
cyggcc_s-seh-1.dll => /usr/bin/cyggcc_s-seh-1.dll (0x3fd890000)
ADVAPI32.dll => /cygdrive/c/Windows/system32/ADVAPI32.dll (0x7fefdd20000)
msvcrt.dll => /cygdrive/c/Windows/system32/msvcrt.dll (0x7feff070000)
sechost.dll => /cygdrive/c/Windows/SYSTEM32/sechost.dll (0x7fefda30000)
RPCRT4.dll => /cygdrive/c/Windows/system32/RPCRT4.dll (0x7fefdad0000)
COMCTL32.dll => /cygdrive/c/Windows/WinSxS/amd64_microsoft.windows.common-controls_6595b64144ccf1df_6.0.7601.17514_none_fa396087175ac9ac/COMCTL32.dll (0x7fefbe70000)
GDI32.dll => /cygdrive/c/Windows/system32/GDI32.dll (0x7fefda50000)
USER32.dll => /cygdrive/c/Windows/system32/USER32.dll (0x76c70000)
LPK.dll => /cygdrive/c/Windows/system32/LPK.dll (0x7feff110000)
USP10.dll => /cygdrive/c/Windows/system32/USP10.dll (0x7fefdc00000)
SHLWAPI.dll => /cygdrive/c/Windows/system32/SHLWAPI.dll (0x7fefe070000)
comdlg32.dll => /cygdrive/c/Windows/system32/comdlg32.dll (0x7fefefd0000)
SHELL32.dll => /cygdrive/c/Windows/system32/SHELL32.dll (0x7fefe240000)
ole32.dll => /cygdrive/c/Windows/system32/ole32.dll (0x7fefde60000)
IMM32.DLL => /cygdrive/c/Windows/system32/IMM32.DLL (0x7fefe210000)
MSCTF.dll => /cygdrive/c/Windows/system32/MSCTF.dll (0x7fefd820000)
[assert-CONSP-emacs_c-126aa17-minimal (text/plain, attachment)]
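Editorial sketch, not part of the original message: Ken's bisection idea depends on knowing which DLLs a trimmed build no longer loads. One way to get that list is to diff two saved `ldd` outputs. The helper names `parse_ldd` and `dropped_dlls` are hypothetical, and the sketch assumes entries look like the listings in this thread (`name => path (address)`, with paths that contain no spaces).

```python
import re

# One ldd entry, e.g. "cygz.dll => /usr/bin/cygz.dll (0x3f9620000)".
ENTRY = re.compile(r"(\S+)\s+=>\s+(\S+)\s+\((0x[0-9a-fA-F]+)\)")


def parse_ldd(text):
    """Return {lowercased DLL name: path} for every entry in ldd output.

    All whitespace is collapsed first, so entries whose load address was
    wrapped onto the next line by a mail client still parse.
    """
    flat = " ".join(text.split())
    return {m.group(1).lower(): m.group(2) for m in ENTRY.finditer(flat)}


def dropped_dlls(full_output, minimal_output):
    """Return the sorted DLL names present in the full build only."""
    return sorted(set(parse_ldd(full_output)) - set(parse_ldd(minimal_output)))
```

A call such as `dropped_dlls(open("ldd-full.txt").read(), open("ldd-minimal.txt").read())` (file names are placeholders) would list the candidates to add back one at a time.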
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Fri, 24 Oct 2014 06:51:02 GMT)
Message #268 received at 18438 <at> debbugs.gnu.org (full text, mbox):
> Date: Fri, 24 Oct 2014 10:54:50 +1300
> From: aidalgol <at> amuri.net
> Cc: Ken Brown <kbrown <at> cornell.edu>, Eli Zaretskii <eliz <at> gnu.org>
>
> Welp, another one bites the dust. Asserts still happening.
Did you make sure this is a bogus assertion violation? In this case:
> #0 terminate_due_to_signal (sig=6, backtrace_limit=2147483647) at emacs.c:351
> No locals.
> #1 0x00000001005b0395 in die (msg=0x100a44842 <DEFAULT_REHASH_SIZE+506> "CONSP (tail)", file=0x100a44650 <DEFAULT_REHASH_SIZE+8> "intervals.c", line=1777) at alloc.c:6833
> No locals.
> #2 0x000000010064151a in lookup_char_property (plist=25793766198, prop=4306944674, textprop=true) at intervals.c:1777
> tail = 4306571314
> fallback = 4306571314
On line 1777 of intervals.c, do the following commands
(gdb) frame 2
(gdb) p tail
(gdb) xtype
say that 'tail' is a Lisp_Cons? If so, then the assertion violation
is indeed bogus.
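Editorial sketch, not part of the original message: the check Eli asks for can be roughly approximated outside gdb by looking at the tag bits of the raw values printed in the backtrace. This assumes the build stores the type tag in the three least significant bits of a Lisp_Object and that the Lisp_Cons tag is 6. Both values are assumptions about lisp.h for this build and are not confirmed anywhere in this thread, so gdb's `xtype` remains the authoritative answer. The names `tag_of` and `looks_like_cons` are hypothetical.

```python
GCTYPEBITS = 3        # assumption: number of low-order tag bits
ASSUMED_CONS_TAG = 6  # assumption: value of Lisp_Cons in this build


def tag_of(word):
    """Return the low-order tag bits of a raw Lisp_Object word."""
    return word & ((1 << GCTYPEBITS) - 1)


def looks_like_cons(word):
    """True if the word carries the assumed Lisp_Cons tag."""
    return tag_of(word) == ASSUMED_CONS_TAG


# Raw values copied from frame #2 of the backtrace quoted above.
for name, word in (("plist", 25793766198), ("tail", 4306571314)):
    print(name, "tag =", tag_of(word), "cons-tagged:", looks_like_cons(word))
```

This only inspects the values gdb reported at the time of the crash; it says nothing about what the running code saw when it evaluated the assertion.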
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Fri, 24 Oct 2014 19:20:02 GMT)
Message #271 received at 18438 <at> debbugs.gnu.org (full text, mbox):
A Cygwin bug was just found (and fixed) that could cause corruption of the flags
register, as Eli suspected:
https://cygwin.com/ml/cygwin/2014-10/msg00397.html
Aidan, could you try the latest Cygwin snapshot and see if that fixes the problem?
Ken
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Fri, 24 Oct 2014 21:20:01 GMT)
Message #274 received at 18438 <at> debbugs.gnu.org (full text, mbox):
On 10/24/2014 3:18 PM, Ken Brown wrote:
> A Cygwin bug was just found (and fixed) that could cause corruption of the flags
> register, as Eli suspected:
>
> https://cygwin.com/ml/cygwin/2014-10/msg00397.html
>
> Aidan, could you try the latest Cygwin snapshot and see if that fixes the problem?
There's no need to use a snapshot anymore. The proposed fix is in the latest
test release, cygwin-1.7.33-0.2.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Wed, 29 Oct 2014 04:31:02 GMT)
Message #277 received at 18438 <at> debbugs.gnu.org (full text, mbox):
On Fri, 24 Oct 2014 17:19:14 -0400, Ken Brown wrote:
> On 10/24/2014 3:18 PM, Ken Brown wrote:
>> A Cygwin bug was just found (and fixed) that could cause corruption
>> of the flags register, as Eli suspected:
>>
>> https://cygwin.com/ml/cygwin/2014-10/msg00397.html
>>
>> Aidan, could you try the latest Cygwin snapshot and see if that
>> fixes the problem?
>
> There's no need to use a snapshot anymore. The proposed fix is in
> the latest test release, cygwin-1.7.33-0.2.
I have had no Emacs crashes since updating to this Cygwin version two
days ago (with the same amount of use per day), so this bug seems to
have been the cause. Let's see if the other users experiencing the
bogus assertions stop getting them after this fix makes it into the
stable release.
(I am still occasionally having Emacs freeze in such a way that it is
unaffected by a DebugBreak (see
<https://cygwin.com/ml/cygwin/2006-06/msg00321.html>), but less often
than the bogus asserts, and likely unrelated.)
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Wed, 29 Oct 2014 12:17:02 GMT)
Message #280 received at 18438 <at> debbugs.gnu.org (full text, mbox):
On 10/29/2014 12:30 AM, aidalgol <at> amuri.net wrote:
> I have had no Emacs crashes since updating to this Cygwin version two days ago
> (with the same amount of use per day), so this bug seems to have been the
> cause. Let's see if the other users experiencing the bogus assertions stop
> getting them after this fix makes it into the stable release.
Jon (bug#18769) has already reported on the Cygwin list that he is no longer
getting bogus assertion failures
(https://cygwin.com/ml/cygwin/2014-10/msg00485.html). So I'm optimistic.
Eli, thanks very much for all your help!
Ken
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Wed, 29 Oct 2014 14:18:01 GMT)
Message #283 received at 18438 <at> debbugs.gnu.org (full text, mbox):
> Date: Wed, 29 Oct 2014 17:30:41 +1300
> From: aidalgol <at> amuri.net
> Cc: Ken Brown <kbrown <at> cornell.edu>, Eli Zaretskii <eliz <at> gnu.org>
>
> (I am still occasionally having Emacs freeze in such a way that it is
> unaffected by a DebugBreak (see
> <https://cygwin.com/ml/cygwin/2006-06/msg00321.html>), but less often
> than the bogus asserts, and likely unrelated.)
I suggest that you report that as a separate bug with all the details
(like when it hangs, and maybe even where).
TIA
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Wed, 29 Oct 2014 14:38:04 GMT)
Message #286 received at 18438 <at> debbugs.gnu.org (full text, mbox):
> Date: Wed, 29 Oct 2014 08:15:41 -0400
> From: Ken Brown <kbrown <at> cornell.edu>
> CC: Eli Zaretskii <eliz <at> gnu.org>
>
> On 10/29/2014 12:30 AM, aidalgol <at> amuri.net wrote:
> > I have had no Emacs crashes since updating to this Cygwin version two days ago
> > (with the same amount of use per day), so this bug seems to have been the
> > cause. Let's see if the other users experiencing the bogus assertions stop
> > getting them after this fix makes it into the stable release.
>
> Jon (bug#18769) has already reported on the Cygwin list that he is no longer
> getting bogus assertion failures
> (https://cygwin.com/ml/cygwin/2014-10/msg00485.html). So I'm optimistic.
So am I, to a degree.
> Eli, thanks very much for all your help!
You are welcome. But the real kudos go to those who found and fixed
the root cause of the problem.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Sat, 26 Dec 2015 15:38:02 GMT)
Message #289 received at 18438 <at> debbugs.gnu.org (full text, mbox):
Ken Brown <kbrown <at> cornell.edu> writes:
> Jon (bug#18769) has already reported on the Cygwin list that he is no
> longer getting bogus assertion failures
> (https://cygwin.com/ml/cygwin/2014-10/msg00485.html). So I'm
> optimistic.
I think the conclusion of the several threads in these bug reports is
that this was a Cygwin bug? So I'm closing these bug reports.
--
(domestic pets only, the antidote for overdose, milk.)
bloggy blog: http://lars.ingebrigtsen.no
bug closed, send any further explanations to 17817 <at> debbugs.gnu.org and Ken Brown <kbrown <at> cornell.edu>
Request was from Lars Ingebrigtsen <larsi <at> gnus.org> to control <at> debbugs.gnu.org. (Sat, 26 Dec 2015 15:38:03 GMT)
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#18438; Package emacs. (Sat, 26 Dec 2015 15:44:03 GMT)
Message #294 received at 18438 <at> debbugs.gnu.org (full text, mbox):
On 12/26/2015 10:37 AM, Lars Ingebrigtsen wrote:
> Ken Brown <kbrown <at> cornell.edu> writes:
>
>> Jon (bug#18769) has already reported on the Cygwin list that he is no
>> longer getting bogus assertion failures
>> (https://cygwin.com/ml/cygwin/2014-10/msg00485.html). So I'm
>> optimistic.
>
> I think the conclusion of the several threads in these bug reports is
> that this was a Cygwin bug? So I'm closing these bug reports.
That's right. Thanks.
Ken
bug archived.
Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sun, 24 Jan 2016 12:24:05 GMT)
This bug report was last modified 9 years and 153 days ago.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.