GNU bug report logs - #21572
24.5; Gud gdb doesn't load source files with utf-8 chars in the file name


Package: emacs

Reported by: Augusto Fraga Giachero <augustofg96 <at> gmail.com>

Date: Sun, 27 Sep 2015 16:16:01 UTC

Severity: normal

Merged with 21940

Found in version 24.5

Fixed in version 25.1

Done: Eli Zaretskii <eliz <at> gnu.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 21572 in the body.
You can then email your comments to 21572 AT debbugs.gnu.org in the normal way.



Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#21572; Package emacs. (Sun, 27 Sep 2015 16:16:01 GMT)

Acknowledgement sent to Augusto Fraga Giachero <augustofg96 <at> gmail.com>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Sun, 27 Sep 2015 16:16:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Augusto Fraga Giachero <augustofg96 <at> gmail.com>
To: bug-gnu-emacs <at> gnu.org
Subject: 24.5; Gud gdb doesn't load source files with utf-8 chars in the file name
Date: Sun, 27 Sep 2015 12:26:55 -0300

I'm having problems debugging a program with gdb. GUD doesn't load
source files that have any non-ASCII (UTF-8) character in their names.
I know that gdb replaces each byte of such characters with a backslash
followed by its octal value; it seems that GUD isn't parsing these
octal sequences.

Here is part of my gdb-source-file-list:

(... "/home/augusto/Projetos/Eletr\303\264nica/ARM/IoControl/src/main.c"
...)

The correct path should be:
/home/augusto/Projetos/Eletrônica/ARM/IoControl/src/main.c

I think it's not hard to fix it, but my knowledge of lisp isn't that
great.
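
To illustrate the escapes described above: each `\ooo` group is one byte written in octal, so `\303\264` is the two-byte UTF-8 encoding of `ô`. The following is a minimal Python sketch of that conversion, not code from GUD or gdb-mi.el. The function name `decode_octal_escapes` is hypothetical, and the sketch assumes that the escaped bytes are UTF-8 and that the string contains no escaped literal backslash directly before digits.

```python
import re

# One or more consecutive \ooo escapes, each limited to a single byte (\000-\377).
_OCTAL_RUN = re.compile(r"(?:\\[0-3][0-7]{2})+")


def decode_octal_escapes(text, encoding="utf-8"):
    """Replace runs of backslash-octal escapes with the characters they encode."""

    def _decode(match):
        # "\303\264" -> ["", "303", "264"] -> bytes 0xC3 0xB4
        codes = match.group(0).split("\\")[1:]
        raw = bytes(int(code, 8) for code in codes)
        # Assumption: the bytes are in `encoding`; undecodable bytes become U+FFFD.
        return raw.decode(encoding, errors="replace")

    return _OCTAL_RUN.sub(_decode, text)


# The entry quoted from gdb-source-file-list in this report.
escaped = r"/home/augusto/Projetos/Eletr\303\264nica/ARM/IoControl/src/main.c"
print(decode_octal_escapes(escaped))
```

Decoding a whole run of escapes at once, rather than one escape at a time, keeps the bytes of a multibyte character together before they are interpreted.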




In GNU Emacs 24.5.1 (x86_64-unknown-linux-gnu, GTK+ Version 3.16.6)
 of 2015-09-09 on foutrelis
Windowing system distributor `The X.Org Foundation', version 11.0.11702000
Configured using:
 `configure --prefix=/usr --sysconfdir=/etc --libexecdir=/usr/lib
 --localstatedir=/var --with-x-toolkit=gtk3 --with-xft
 'CFLAGS=-march=x86-64 -mtune=generic -O2 -pipe -fstack-protector-strong
 --param=ssp-buffer-size=4' CPPFLAGS=-D_FORTIFY_SOURCE=2
 LDFLAGS=-Wl,-O1,--sort-common,--as-needed,-z,relro'

Important settings:
  value of $LANG: pt_BR.UTF-8
  locale-coding-system: utf-8-unix

Major mode: Fundamental

Minor modes in effect:
  global-company-mode: t
  company-mode: t
  yas-global-mode: t
  yas-minor-mode: t
  display-time-mode: t
  tooltip-mode: t
  electric-indent-mode: t
  mouse-wheel-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  blink-cursor-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  line-number-mode: t
  transient-mark-mode: t

Recent messages:
[yas] Loading for `emacs-lisp-mode', just-in-time: (lambda nil (yas--load-directory-1 (quote /home/augusto/.emacs.d/elpa/yasnippet-20150811.1222/snippets/emacs-lisp-mode) (quote emacs-lisp-mode)))!
[yas] Loading compiled snippets from /home/augusto/.emacs.d/elpa/yasnippet-20150811.1222/snippets/emacs-lisp-mode
[yas] Loading for `prog-mode', just-in-time: (lambda nil (yas--load-directory-1 (quote /home/augusto/.emacs.d/elpa/yasnippet-20150811.1222/snippets/prog-mode) (quote prog-mode)))!
[yas] Loading compiled snippets from /home/augusto/.emacs.d/elpa/yasnippet-20150811.1222/snippets/prog-mode
Loading /home/augusto/.emacs.d/elpa/yasnippet-20150811.1222/snippets/prog-mode/.yas-setup...done
For information about GNU Emacs and the GNU system, type C-h C-a.
*message*-20150927-113643 has auto save data; consider M-x recover-this-file
Beginning of buffer
Mark set [2 times]
Making completion list...

Load-path shadows:
None found.

Features:
(shadow sort gnus-util mail-extr emacsbug message idna cl-macs
format-spec rfc822 mml mml-sec mm-decode mm-bodies mm-encode mail-parse
rfc2231 mailabbrev gmm-utils mailheader sendmail rfc2047 rfc2045
ietf-drums mm-util mail-prsvr mail-utils company-files company-oddmuse
company-keywords company-etags etags ring company-gtags
company-dabbrev-code company-dabbrev company-capf company-cmake
company-xcode company-clang company-semantic company-eclim
company-template company-css company-nxml company-bbdb company-irony
irony-completion irony-snippet irony find-func company waher-theme ido
cl-extra yasnippet help-mode cl gv linum-relative advice help-fns linum
picasm picasm-loops picasm-external edmacro kmacro cl-loaddefs cl-lib
info easymenu tex-site package epg-config time time-date tooltip
electric uniquify ediff-hook vc-hooks lisp-float-type mwheel x-win x-dnd
tool-bar dnd fontset image regexp-opt fringe tabulated-list newcomment
lisp-mode prog-mode register page menu-bar rfn-eshadow timer select
scroll-bar mouse jit-lock font-lock syntax facemenu font-core frame cham
georgian utf-8-lang misc-lang vietnamese tibetan thai tai-viet lao
korean japanese hebrew greek romanian slovak czech european ethiopic
indian cyrillic chinese case-table epa-hook jka-cmpr-hook help simple
abbrev minibuffer nadvice loaddefs button faces cus-face macroexp files
text-properties overlay sha1 md5 base64 format env code-pages mule
custom widget hashtable-print-readable backquote make-network-process
dbusbind gfilenotify dynamic-setting system-font-setting
font-render-setting move-toolbar gtk x-toolkit x multi-tty emacs)

Memory information:
((conses 16 154849 3972)
 (symbols 48 24539 0)
 (miscs 40 103 151)
 (strings 32 37552 10436)
 (string-bytes 1 932429)
 (vectors 16 17551)
 (vector-slots 8 519725 3706)
 (floats 8 406 239)
 (intervals 56 257 0)
 (buffers 960 15)
 (heap 1024 34440 1713))




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#21572; Package emacs. (Wed, 30 Sep 2015 17:53:01 GMT)

Message #8 received at 21572 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: Augusto Fraga Giachero <augustofg96 <at> gmail.com>
Cc: 21572 <at> debbugs.gnu.org
Subject: Re: bug#21572: 24.5; Gud gdb doesn't load source files with utf-8 chars in the file name
Date: Wed, 30 Sep 2015 20:52:24 +0300

> From: Augusto Fraga Giachero <augustofg96 <at> gmail.com>
> Date: Sun, 27 Sep 2015 12:26:55 -0300
> 
> I'm having problems debugging a program with gdb. GUD doesn't load
> source files that have any non-ASCII (UTF-8) character in their names.
> I know that gdb replaces each byte of such characters with a backslash
> followed by its octal value; it seems that GUD isn't parsing these
> octal sequences.

They are just ASCII characters, so GUD had no reason to parse them.

> Here is part of my gdb-source-file-list:
> 
> (... "/home/augusto/Projetos/Eletr\303\264nica/ARM/IoControl/src/main.c"
> ...)
> 
> The correct path should be:
> /home/augusto/Projetos/Eletrônica/ARM/IoControl/src/main.c
> 
> I think it's not hard to fix it

Actually, it's not very simple.  GDB outputs octal escapes in every
string, not just in file names, so decoding should be done on a very
low level, where we don't yet know what is a file name and what is
some other string (like a value of some string variable).  We can
decode that if we assume that all the strings output by GDB are
encoded the same way (in your case, probably UTF-8), and keep our
fingers crossed that the communications channel between GDB and Emacs
never breaks a 3-digit sequence due to buffering issues.
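
The buffering concern above can be sketched as follows. This is a hypothetical Python illustration, not the gdb-mi.el code: the class name `OctalStreamDecoder` and the way the sample `fullname=` fragment is split are assumptions made for the example. It assumes every escaped string is UTF-8, holds back an escape that is cut off at the end of a chunk, and uses an incremental decoder so that a multibyte character split across chunks still decodes correctly. It does not handle an escaped literal backslash followed by digits.

```python
import codecs
import re

# A complete escape: backslash plus three octal digits encoding one byte.
_ESCAPE = re.compile(rb"\\([0-3][0-7]{2})")
# An escape cut off at the very end of a chunk: "\", "\3" or "\30".
_PARTIAL_TAIL = re.compile(r"\\[0-7]{0,2}\Z")


class OctalStreamDecoder:
    """Decode backslash-octal escapes from text that arrives in arbitrary chunks."""

    def __init__(self, encoding="utf-8"):
        self._encoding = encoding
        self._pending = ""  # truncated escape carried over to the next chunk
        self._decoder = codecs.getincrementaldecoder(encoding)(errors="replace")

    def feed(self, chunk):
        data = self._pending + chunk
        tail = _PARTIAL_TAIL.search(data)
        if tail:
            # Keep the truncated escape until the rest of it arrives.
            data, self._pending = data[: tail.start()], tail.group(0)
        else:
            self._pending = ""
        # Turn each complete escape into the byte it stands for.
        raw = _ESCAPE.sub(
            lambda m: bytes([int(m.group(1), 8)]), data.encode(self._encoding)
        )
        # The incremental decoder buffers a multibyte character split across chunks.
        return self._decoder.decode(raw)


decoder = OctalStreamDecoder()
# The path from this report, split in the middle of the \303 escape.
pieces = [
    'fullname="/home/augusto/Projetos/Eletr\\30',
    '3\\264nica/ARM/IoControl/src/main.c"',
]
print("".join(decoder.feed(piece) for piece in pieces))
```

Applying the same decoding to every string is what makes the single-encoding assumption necessary: at this level nothing distinguishes a file name from the value of a string variable.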

I have a prototype fix along the above-mentioned lines which I will
commit soon, unless someone has a better idea.  You could then patch
your gdb-mi.el and use it with those source files.

Alternatively, you can invoke GDB via "M-x gud-gdb RET", which doesn't
have this problem in the first place.

Thanks.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#21572; Package emacs. (Thu, 01 Oct 2015 00:38:02 GMT)

Message #11 received at 21572 <at> debbugs.gnu.org:

From: Augusto Fraga <augustofg96 <at> gmail.com>
To: 21572 <at> debbugs.gnu.org
Subject: Fwd: bug#21572: 24.5; Gud gdb doesn't load source files with utf-8 chars in the file name
Date: Wed, 30 Sep 2015 21:37:33 -0300

> From: Eli Zaretskii <eliz <at> gnu.org>
> Date: Wed, 30 Sep 2015 20:52:24 +0300
>
> Actually, it's not very simple.  GDB outputs octal escapes in every
> string, not just in file names, so decoding should be done on a very
> low level, where we don't yet know what is a file name and what is
> some other string (like a value of some string variable).

The only string that needs to be converted back to UTF-8 is the
source file names string (couldn't it be done by hacking the
gdb-get-source-file-list function?).

> We can decode that if we assume that all the strings output by GDB are
> encoded the same way (in your case, probably UTF-8), and keep our
> fingers crossed that the communications channel between GDB and Emacs
> never breaks a 3-digit sequence due to buffering issues.

I think it would be nice if gdb had a flag to disable these octal
sequences for the MI protocol. It would make everything easier.

> I have a prototype fix along the above-mentioned lines which I will
> commit soon, unless someone has a better idea.  You could then patch
> your gdb-mi.el and use it with those source files.

Nice! I'll try it out when you commit.

> Alternatively, you can invoke GDB via "M-x gud-gdb RET", which doesn't
> have this problem in the first place.

Well, but then I wouldn't have a good source debugging interface. In
fact, "M-x gdb RET" doesn't fail; it just doesn't load the buffer for
the source code (it behaves like standard gdb without the TUI).

Thank you!




Reply sent to Eli Zaretskii <eliz <at> gnu.org>:
You have taken responsibility. (Thu, 01 Oct 2015 11:55:02 GMT)

Notification sent to Augusto Fraga Giachero <augustofg96 <at> gmail.com>:
bug acknowledged by developer. (Thu, 01 Oct 2015 11:55:02 GMT)

Message #16 received at 21572-done <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: augustofg96 <at> gmail.com
Cc: 21572-done <at> debbugs.gnu.org
Subject: Re: bug#21572: 24.5; Gud gdb doesn't load source files with utf-8 chars in the file name
Date: Thu, 01 Oct 2015 14:53:52 +0300

> Date: Wed, 30 Sep 2015 20:52:24 +0300
> From: Eli Zaretskii <eliz <at> gnu.org>
> Cc: 21572 <at> debbugs.gnu.org
> 
> I have a prototype fix along the above-mentioned lines which I will
> commit soon, unless someone has a better idea.  You could then patch
> your gdb-mi.el and use it with those source files.

I've pushed those changes now, and I'm marking this bug done.

Thanks.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#21572; Package emacs. (Thu, 01 Oct 2015 12:16:02 GMT)

Message #19 received at 21572 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: Augusto Fraga <augustofg96 <at> gmail.com>
Cc: 21572 <at> debbugs.gnu.org
Subject: Re: bug#21572: Fwd: bug#21572: 24.5; Gud gdb doesn't load source files with utf-8 chars in the file name
Date: Thu, 01 Oct 2015 15:14:57 +0300

> Date: Wed, 30 Sep 2015 21:37:33 -0300
> From: Augusto Fraga <augustofg96 <at> gmail.com>
> 
> > From: Eli Zaretskii <eliz <at> gnu.org>
> > Date: Wed, 30 Sep 2015 20:52:24 +0300
> >
> > Actually, it's not very simple.  GDB outputs octal escapes in every
> > string, not just in file names, so decoding should be done on a very
> > low level, where we don't yet know what is a file name and what is
> > some other string (like a value of some string variable).
> 
> The only string that needs to be converted back to UTF-8 is the
> source file names string

For your use case, yes.  But if your program manipulates non-ASCII
text in its strings (like non-ASCII file names it creates), you will
see the same problem with them.  Why should they be treated any
differently?

> (couldn't it be done by hacking the gdb-get-source-file-list
> function?).

No, because the source file names arrive through other ways as well,
notably when GDB reports a breakpoint being hit.

I actually started with gdb-get-source-file-list, but this wasn't
enough to automatically pop up the source when the program is started.

> > We can decode that if we assume that all the strings output by GDB are
> > encoded the same way (in your case, probably UTF-8), and keep our
> > fingers crossed that the communications channel between GDB and Emacs
> > never breaks a 3-digit sequence due to buffering issues.
> 
> I think it would be nice if gdb had a flag to disable these octal
> sequences for the MI protocol. It would make everything easier.

I agree.  But even if this happens (and I hope it will; I'm talking to
GDB developers about that), Emacs users will not be able to take
advantage of it until their sysadmins upgrade to that newer version
of GDB.  So it makes sense to provide a solution now, with the
existing GDB versions.

> > I have a prototype fix along the above-mentioned lines which I will
> > commit soon, unless someone has a better idea.  You could then patch
> > your gdb-mi.el and use it with those source files.
> 
> Nice! I'll try it out when you commit.

You can do that now, see commits 439f483 and 9c86325.

Note that this decoding is turned off by default, for the reasons I
explained in the doc string of the gdb-mi-decode-strings option and in
the comments to the gdb-mi-decode function.  Set that option to t to
see the feature at work.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#21572; Package emacs. (Thu, 01 Oct 2015 13:01:02 GMT)

Message #22 received at 21572 <at> debbugs.gnu.org:

From: Augusto Fraga Giachero <augustofg96 <at> gmail.com>
To: 21572 <at> debbugs.gnu.org
Subject: Re: bug#21572: Fwd: bug#21572: 24.5; Gud gdb doesn't load source files with utf-8 chars in the file name
Date: Thu, 1 Oct 2015 10:00:48 -0300

> From: Eli Zaretskii <eliz <at> gnu.org>
> Date: Thu, 01 Oct 2015 15:14:57 +0300
>
> You can do that now, see commits 439f483 and 9c86325.
>
> Note that this decoding is turned off by default, for the reasons I
> explained in the doc string of the gdb-mi-decode-strings option and in
> the comments to the gdb-mi-decode function.  Set that option to t to
> see the feature at work.

Thank you! I've tried it and it works well!

It is a nice short-term fix until gdb supports turning off octal
conversion for non-ASCII strings.




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Fri, 30 Oct 2015 11:24:04 GMT)

bug unarchived. Request was from Glenn Morris <rgm <at> gnu.org> to control <at> debbugs.gnu.org. (Wed, 18 Nov 2015 02:02:02 GMT)

Forcibly Merged 21572 21940. Request was from Glenn Morris <rgm <at> gnu.org> to control <at> debbugs.gnu.org. (Wed, 18 Nov 2015 02:02:02 GMT)

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Wed, 16 Dec 2015 12:24:03 GMT)



