GNU bug report logs - #44173
28.0.50; gdb-mi mangles strings with octal escapes
Reported by: Mattias Engdegård <mattiase <at> acm.org>
Date: Fri, 23 Oct 2020 11:51:02 UTC
Severity: normal
Found in version 28.0.50
Done: Mattias Engdegård <mattiase <at> acm.org>
Bug is archived. No further changes may be made.
Message #17 received at 44173 <at> debbugs.gnu.org (full text, mbox):
On 23 Oct 2020, at 15:19, Eli Zaretskii <eliz <at> gnu.org> wrote:
> The basic ambiguity, AFAIR, is what is described last here: a string
> reported by GDB could include literal \nnn sequences, which are not
> non-ASCII characters that GDB/MI converts to octal escapes. The
> information which was which is lost once we receive the GDB/MI output.
So you mean that GDB would produce the value "\303" when it does not stand for a string containing the single byte with octal value 303? When does this occur?
> AFAIU, this bug's root cause is the way we solved the ambiguity, which
> basically assumes one of the possible interpretations should be
> preferred to another, because it is more popular/useful.
Then we disagree. The code doesn't do the right thing if gdb-mi-decode-strings is nil, unless by 'ambiguity' you mean that GDB sometimes inserts a spurious backslash that should be ignored. When gdb-mi-decode-strings is non-nil, it is sometimes wrong as well.
> Let me turn the table and ask you how did you get that string you show
> in the original report?
A program in the C language containing the local declaration
char *s = "\303\266";
produces nonsense in the 'Locals' window when debugged. It doesn't matter what the string means; I would have been happy with gdb/emacs interpreting it as utf-8, latin-1 or just raw bytes presented in octal or hex.
> And what will then happen to non-ASCII strings and file names reported
> by GDB? How will our parser solve that?
The parser can either leave the strings as undecoded unibyte strings -- that is, "\303\266" would be a 2-char string -- or decode them according to gdb-mi-decode-strings, in which case it might become a 1-char multibyte string. In the former case, the code receiving the parse tree could decide what to do with the strings and how to display them, perhaps on a case-by-case basis.
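As a rough illustration of the first option (raw, undecoded bytes), here is a minimal C sketch of turning octal escapes into bytes. It is not the gdb-mi.el code; `mi_decode_octal` is a hypothetical name, and the assumption that a doubled backslash stands for one literal backslash follows C string syntax rather than anything confirmed in this thread.

```c
#include <stddef.h>

/* Hypothetical helper, not part of Emacs or GDB.
   Decode \nnn octal escapes (one to three octal digits) in SRC into
   raw bytes in DST.  A doubled backslash is assumed to mean one
   literal backslash; everything else is copied unchanged.
   Returns the number of bytes written, or (size_t)-1 if DST is too
   small.  The output is not NUL-terminated. */
size_t mi_decode_octal(const char *src, unsigned char *dst, size_t dstlen)
{
    size_t n = 0;

    while (*src != '\0') {
        unsigned int byte;

        if (src[0] == '\\' && src[1] == '\\') {
            /* Escaped backslash: emit a single backslash. */
            byte = '\\';
            src += 2;
        } else if (src[0] == '\\' && src[1] >= '0' && src[1] <= '7') {
            /* Octal escape: accumulate up to three digits. */
            int digits = 0;
            byte = 0;
            src++;                      /* skip the backslash */
            while (digits < 3 && *src >= '0' && *src <= '7') {
                byte = byte * 8 + (unsigned int)(*src - '0');
                src++;
                digits++;
            }
            byte &= 0xFF;               /* keep the result within one byte */
        } else {
            /* Ordinary character: copy as-is. */
            byte = (unsigned char)*src++;
        }

        if (n >= dstlen)
            return (size_t)-1;
        dst[n++] = (unsigned char)byte;
    }
    return n;
}
```

Given the six-character text `\303\266`, this yields the two bytes 0xC3 0xB6. Whether those are then shown as raw bytes, as Latin-1, or decoded as UTF-8 (where they form the single character "ö") is left to the caller, which is the split of responsibilities described above.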
> Do you intend to extend the existing parser or write a new one from
> scratch?
Extending the existing one appears sensible, just replacing the JSON detour.
This bug report was last modified 4 years and 193 days ago.