GNU bug report logs - #30405
26.0.91; Incorrect apostrophe translation in ImageMagick error message
Reported by: Gemini Lasswell <gazally <at> runbox.com>
Date: Fri, 9 Feb 2018 21:15:01 UTC
Severity: normal
Found in versions 26.0.91, 25.1
Done: Eli Zaretskii <eliz <at> gnu.org>
Bug is archived. No further changes may be made.
To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 30405 in the body.
You can then email your comments to 30405 AT debbugs.gnu.org in the normal way.
Report forwarded to bug-gnu-emacs <at> gnu.org: bug#30405; Package emacs. (Fri, 09 Feb 2018 21:15:01 GMT)
Acknowledgement sent to Gemini Lasswell <gazally <at> runbox.com>: New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Fri, 09 Feb 2018 21:15:01 GMT)
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
When I try to resize an image using an Emacs built without
ImageMagick, either with emacs -Q or my full config, the apostrophe in
the error message is not displayed correctly. To reproduce:
C-x C-f path/to/image-file.jpg RET
+
Result: The error "Can^Yt rescale images without ImageMagick support"
appears in the echo area.
When I define this little command and run it:
(defun my-command ()
  (interactive)
  (error "Can't do this"))
then the typographically correct apostrophe appears in the echo area.
In *Messages*, the first error message appears as:
Can\342\200\231t rescale images without ImageMagick support
although now in report-emacs-bug's message composition buffer, I see
the typographically correct apostrophe.
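Editorial note: the two garbled forms quoted above can be related to the curved apostrophe U+2019 with a little byte arithmetic. The Python sketch below is independent of Emacs; the helper names are made up for illustration. It shows that the UTF-8 encoding of U+2019 is the three bytes printed as \342\200\231 in *Messages*, and that the low 8 bits of the code point give the control code written ^Y. Whether Emacs actually produces "^Y" by truncating the character this way is an assumption that the report does not confirm.

```python
# U+2019 RIGHT SINGLE QUOTATION MARK, the "typographically correct apostrophe".
CURVED_APOSTROPHE = "\u2019"


def octal_escapes(data: bytes) -> str:
    """Render each byte as a backslash followed by three octal digits."""
    return "".join(f"\\{byte:03o}" for byte in data)


def caret_notation(code: int) -> str:
    """Render an ASCII control code (0..31) in caret notation, e.g. 25 -> ^Y."""
    if not 0 <= code < 32:
        raise ValueError("not an ASCII control code")
    return "^" + chr(code + 64)


# The UTF-8 bytes of the character, shown the way *Messages* shows raw bytes.
utf8_view = octal_escapes(CURVED_APOSTROPHE.encode("utf-8"))

# Assumption: keeping only the low 8 bits of the code point (0x2019 -> 0x19).
low_byte_view = caret_notation(ord(CURVED_APOSTROPHE) & 0xFF)

print(utf8_view)      # \342\200\231
print(low_byte_view)  # ^Y
```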
In GNU Emacs 26.0.91 (build 1, x86_64-pc-linux-gnu, GTK+ Version 3.22.21)
of 2018-02-09 built on localhost
Windowing system distributor 'The X.Org Foundation', version 11.0.11905000
Recent messages:
For information about GNU Emacs and the GNU system, type C-h C-a.
Type C-c C-c or C-c C-x to view the image as text or hex.
image--get-imagemagick-and-warn: Can’t rescale images without ImageMagick support
Configured using:
'configure
--prefix=/nix/store/y06nnna2nzr4fx1pbigs67hbjm396ijn-emacs-26.0
--with-modules --with-x-toolkit=gtk3 --with-xft'
Configured features:
XPM JPEG TIFF GIF PNG RSVG SOUND DBUS GSETTINGS NOTIFY LIBSELINUX
GNUTLS LIBXML2 FREETYPE XFT ZLIB TOOLKIT_SCROLL_BARS GTK3 X11 MODULES
THREADS
Important settings:
value of $LANG: en_US.UTF-8
locale-coding-system: utf-8-unix
Major mode: Image[jpeg]
Minor modes in effect:
tooltip-mode: t
global-eldoc-mode: t
electric-indent-mode: t
mouse-wheel-mode: t
tool-bar-mode: t
menu-bar-mode: t
file-name-shadow-mode: t
global-font-lock-mode: t
font-lock-mode: t
blink-cursor-mode: t
auto-composition-mode: t
auto-encryption-mode: t
auto-compression-mode: t
line-number-mode: t
transient-mark-mode: t
Load-path shadows:
None found.
Features:
(shadow sort mail-extr emacsbug message rmc puny dired dired-loaddefs
format-spec rfc822 mml mml-sec password-cache epa derived epg
epg-config gnus-util rmail rmail-loaddefs mm-decode mm-bodies
mm-encode mail-parse rfc2231 mailabbrev gmm-utils mailheader sendmail
rfc2047 rfc2045 ietf-drums mm-util mail-prsvr mail-utils map seq
byte-opt gv bytecomp byte-compile cconv cl-loaddefs cl-lib image-mode
easymenu elec-pair time-date mule-util tooltip eldoc electric uniquify
ediff-hook vc-hooks lisp-float-type mwheel term/x-win x-win
term/common-win x-dnd tool-bar dnd fontset image regexp-opt fringe
tabulated-list replace newcomment text-mode elisp-mode lisp-mode
prog-mode register page menu-bar rfn-eshadow isearch timer select
scroll-bar mouse jit-lock font-lock syntax facemenu font-core
term/tty-colors frame cl-generic cham georgian utf-8-lang misc-lang
vietnamese tibetan thai tai-viet lao korean japanese eucjp-ms cp51932
hebrew greek romanian slovak czech european ethiopic indian cyrillic
chinese composite charscript charprop case-table epa-hook
jka-cmpr-hook help simple abbrev obarray minibuffer cl-preloaded
nadvice loaddefs button faces cus-face macroexp files text-properties
overlay sha1 md5 base64 format env code-pages mule custom widget
hashtable-print-readable backquote dbusbind inotify dynamic-setting
system-font-setting font-render-setting move-toolbar gtk x-toolkit x
multi-tty make-network-process emacs)
Memory information:
((conses 16 97112 7073)
(symbols 48 20569 1)
(miscs 40 47 79)
(strings 32 28786 1005)
(string-bytes 1 777748)
(vectors 16 15028)
(vector-slots 8 504094 7678)
(floats 8 58 59)
(intervals 56 209 0)
(buffers 992 12))
bug Marked as found in versions 25.1. Request was from Glenn Morris <rgm <at> gnu.org> to control <at> debbugs.gnu.org. (Fri, 09 Feb 2018 22:46:01 GMT)
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#30405; Package emacs. (Fri, 09 Feb 2018 23:05:02 GMT)
Message #10 received at 30405 <at> debbugs.gnu.org (full text, mbox):
Gemini Lasswell wrote:
> When I try to resize an image using an Emacs built without
> ImageMagick, either with emacs -Q or my full config, the apostrophe in
> the error message is not displayed correctly. To reproduce:
>
> C-x C-f path/to/image-file.jpg RET
> +
>
> Result: The error "Can^Yt rescale images without ImageMagick support"
> appears in the echo area.
Present since 25.1. Perhaps a minimal example is:
emacs -Q
(set-buffer-multibyte nil)
(message "can't")
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#30405; Package emacs. (Sat, 10 Feb 2018 00:11:02 GMT)
Message #13 received at 30405 <at> debbugs.gnu.org (full text, mbox):
Glenn Morris wrote:
> Present since 25.1. Perhaps a minimal example is:
> emacs -Q
> (set-buffer-multibyte nil)
> (message "can't")
Although I might be confusing the return value with the displayed message.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#30405; Package emacs. (Sat, 10 Feb 2018 00:30:02 GMT)
Message #16 received at 30405 <at> debbugs.gnu.org (full text, mbox):
Maybe a better example is:
(defun foo ()
  (interactive)
  (error "can't"))
(set-buffer-multibyte nil)
M-x foo
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#30405; Package emacs. (Sat, 10 Feb 2018 03:08:01 GMT)
Message #19 received at 30405 <at> debbugs.gnu.org (full text, mbox):
Glenn Morris <rgm <at> gnu.org> writes:
> Present since 25.1. Perhaps a minimal example is:
> emacs -Q
> (set-buffer-multibyte nil)
> (message "can't")
The issue with message producing fancy quotes is new in 25.1, although
stepping with the debugger, it looks like the root cause is that the
" *Echo Area 0*" buffer becomes unibyte along with the main buffer. So
the following shows the problem in earlier versions as well:
(set-buffer-multibyte nil)
(message "can\u2019t")
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#30405; Package emacs. (Sat, 10 Feb 2018 08:03:01 GMT)
Message #22 received at 30405 <at> debbugs.gnu.org (full text, mbox):
> From: Glenn Morris <rgm <at> gnu.org>
> Date: Fri, 09 Feb 2018 19:29:03 -0500
> Cc: Paul Eggert <eggert <at> cs.ucla.edu>, 30405 <at> debbugs.gnu.org
>
> Maybe a better example is:
>
> (defun foo ()
> (interactive)
> (error "can't"))
> (set-buffer-multibyte nil)
> M-x foo
I applied (the obvious) band-aid to image.el, so it no longer shows a
garbled error message.
The more general issue should be fixed on master, as it's too late to
make such changes on the release branch. Note also that
substitute-command-keys is affected as well, as can be seen by
evaluating the following:
(progn
  (set-buffer-multibyte nil)
  (substitute-command-keys "can't"))
Basically, anything that produces non-ASCII characters and then shows
that in the echo area while the current buffer is unibyte will hit
this problem. While Lisp programs that produce literal strings can be
told to take care of that when they use unibyte buffers, the cases
discussed in this bug report happen because we convert ASCII strings
to non-ASCII strings under the hood, so the Lisp programs cannot be
held accountable.
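Editorial note: the "under the hood" conversion described above can be modelled outside Emacs. The Python sketch below uses a hypothetical helper, curve_quotes, as a stand-in for the quote translation; it is not Emacs's implementation. It only illustrates the point of the paragraph: the program supplies a pure-ASCII string, and the translated result is no longer ASCII, so it cannot be stored as one byte per character.

```python
def curve_quotes(text: str) -> str:
    """Hypothetical stand-in for quote translation: ` becomes U+2018, ' becomes U+2019."""
    return text.replace("`", "\u2018").replace("'", "\u2019")


# The Lisp program only ever wrote this ASCII string.
source = "Can't rescale images without ImageMagick support"
translated = curve_quotes(source)

source_is_ascii = source.isascii()
translated_is_ascii = translated.isascii()

print(source_is_ascii)      # True
print(translated_is_ascii)  # False
```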
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#30405; Package emacs. (Sat, 10 Feb 2018 08:41:02 GMT)
Message #25 received at 30405 <at> debbugs.gnu.org (full text, mbox):
> From: Noam Postavsky <npostavs <at> users.sourceforge.net>
> Date: Fri, 09 Feb 2018 22:07:06 -0500
> Cc: Gemini Lasswell <gazally <at> runbox.com>, Paul Eggert <eggert <at> cs.ucla.edu>,
> 30405 <at> debbugs.gnu.org
>
> The issue with message producing fancy quotes is new in 25.1, although
> stepping with the debugger, it looks like the root cause is that the
> " *Echo Area 0*" buffer becomes unibyte along with the main buffer. So
> the following shows the problem in earlier versions as well:
>
> (set-buffer-multibyte nil)
> (message "can\u2019t")
This is the intended behavior, not a bug. We make the echo area
buffer unibyte when the message is generated with the current buffer
being unibyte.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#30405; Package emacs. (Sat, 10 Feb 2018 14:16:02 GMT)
Message #28 received at 30405 <at> debbugs.gnu.org (full text, mbox):
On Feb 10 2018, Eli Zaretskii <eliz <at> gnu.org> wrote:
> This is the intended behavior, not a bug. We make the echo area
> buffer unibyte when the message is generated with the current buffer
> being unibyte.
Do we? What I see in set_message_1 is that we set the echo area buffer
multibyteness to the same as the string to be displayed.
Andreas.
--
Andreas Schwab, schwab <at> linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#30405; Package emacs. (Sat, 10 Feb 2018 16:45:01 GMT)
Message #31 received at 30405 <at> debbugs.gnu.org (full text, mbox):
> From: Andreas Schwab <schwab <at> linux-m68k.org>
> Cc: Noam Postavsky <npostavs <at> users.sourceforge.net>, eggert <at> cs.ucla.edu, 30405 <at> debbugs.gnu.org, gazally <at> runbox.com
> Date: Sat, 10 Feb 2018 15:15:46 +0100
>
> What I see in set_message_1 is that we set the echo area buffer
> multibyteness to the same as the string to be displayed.
That function is not called in the use case discussed here (I
initially also thought it was part of the story, but GDB convinced me
otherwise). The function which is relevant here is
setup_echo_area_for_printing, it is called from PRINTPREPARE.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#30405; Package emacs. (Sat, 10 Feb 2018 18:58:02 GMT)
Message #34 received at 30405 <at> debbugs.gnu.org (full text, mbox):
Eli Zaretskii wrote:
> We make the echo area
> buffer unibyte when the message is generated with the current buffer
> being unibyte.
This made sense back in the 1990s when unibyte was commonly used for text.
Nowadays, though, wouldn't it make more sense to keep the echo area multibyte?
The echo area is intended for text, not for binary data.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#30405; Package emacs. (Sat, 10 Feb 2018 21:25:01 GMT)
Message #37 received at 30405 <at> debbugs.gnu.org (full text, mbox):
> Cc: rgm <at> gnu.org, gazally <at> runbox.com, 30405 <at> debbugs.gnu.org
> From: Paul Eggert <eggert <at> cs.ucla.edu>
> Date: Sat, 10 Feb 2018 10:57:28 -0800
>
> Eli Zaretskii wrote:
> > We make the echo area
> > buffer unibyte when the message is generated with the current buffer
> > being unibyte.
>
> This made sense back in the 1990s when unibyte was commonly used for text.
> Nowadays, though, wouldn't it make more sense to keep the echo area multibyte?
> The echo area is intended for text, not for binary data.
I don't see how the date outside could matter here. If you understand
the reason behind the code in question, please describe it, and we can
then discuss whether that reason is still valid in the current
codebase.
I have a guess for why we did that: it's because in Emacs 21 we
displayed raw bytes as Latin-N characters, so non-ASCII text in
unibyte strings needed a unibyte buffer to display it as expected.
But that feature is no longer available, as raw bytes are always
displayed as octal escapes.
The question that bothers me is can a unibyte string inserted or
printed into a multibyte buffer be converted to something that will
display as a non-ASCII character, not as an octal escape.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#30405; Package emacs. (Sat, 10 Feb 2018 21:32:02 GMT)
Message #40 received at 30405 <at> debbugs.gnu.org (full text, mbox):
[[[ To any NSA and FBI agents reading my email: please consider ]]]
[[[ whether defending the US Constitution against all enemies, ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]
> This made sense back in the 1990s when unibyte was commonly used for text.
> Nowadays, though, wouldn't it make more sense to keep the echo area multibyte?
> The echo area is intended for text, not for binary data.
I agree in principle. But I don't know how much work the change would
be.
--
Dr Richard Stallman
President, Free Software Foundation (https://gnu.org, https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)
Skype: No way! See https://stallman.org/skype.html.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#30405; Package emacs. (Sat, 10 Feb 2018 23:35:02 GMT)
Message #43 received at 30405 <at> debbugs.gnu.org (full text, mbox):
Glenn Morris wrote:
> Maybe a better example is:
>
> (defun foo ()
> (interactive)
> (error "can't"))
> (set-buffer-multibyte nil)
> M-x foo
BTW, replace "error" with "message" and the issue does not appear.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#30405; Package emacs. (Sun, 11 Feb 2018 03:40:01 GMT)
Message #46 received at 30405 <at> debbugs.gnu.org (full text, mbox):
> From: Glenn Morris <rgm <at> gnu.org>
> Date: Sat, 10 Feb 2018 18:34:02 -0500
> Cc: Paul Eggert <eggert <at> cs.ucla.edu>, 30405 <at> debbugs.gnu.org
>
> Glenn Morris wrote:
>
> > Maybe a better example is:
> >
> > (defun foo ()
> > (interactive)
> > (error "can't"))
> > (set-buffer-multibyte nil)
> > M-x foo
>
> BTW, replace "error" with "message" and the issue does not appear.
Because 'message' doesn't change the quotes.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#30405; Package emacs. (Sun, 11 Feb 2018 03:42:02 GMT)
Message #49 received at 30405 <at> debbugs.gnu.org (full text, mbox):
> Date: Sun, 11 Feb 2018 05:38:50 +0200
> From: Eli Zaretskii <eliz <at> gnu.org>
> Cc: gazally <at> runbox.com, eggert <at> cs.ucla.edu, 30405 <at> debbugs.gnu.org
>
> > BTW, replace "error" with "message" and the issue does not appear.
>
> Because 'message' doesn't change the quotes.
Oops, ignore me. That's not the reason.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#30405; Package emacs. (Sun, 11 Feb 2018 05:09:02 GMT)
Message #52 received at 30405 <at> debbugs.gnu.org (full text, mbox):
On February 11, 2018 5:41:19 AM GMT+02:00, Eli Zaretskii <eliz <at> gnu.org> wrote:
> > Date: Sun, 11 Feb 2018 05:38:50 +0200
> > From: Eli Zaretskii <eliz <at> gnu.org>
> > Cc: gazally <at> runbox.com, eggert <at> cs.ucla.edu, 30405 <at> debbugs.gnu.org
> >
> > > BTW, replace "error" with "message" and the issue does not appear.
> >
> > Because 'message' doesn't change the quotes.
>
> Oops, ignore me. That's not the reason.
The real reason is that 'message' has a string to display, and so can set up
the echo-area buffer according to multibyteness of that string (it does that
using set_message_1), whereas 'princ' and friends cannot do that. And
'error' calls 'princ'.
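Editorial note: the difference between the two code paths described above can be captured in a toy model. The Python sketch below is not the Emacs C code; Buffer, echo_area_for_message and echo_area_for_printing are hypothetical names, and treating a string as multibyte exactly when it contains a non-ASCII character is a simplification. It only restates the explanation in this message: the 'message' path sees the string first and follows it, while the 'princ' path has no string yet and follows the current buffer.

```python
from dataclasses import dataclass


@dataclass
class Buffer:
    name: str
    multibyte: bool = True


def string_is_multibyte(text: str) -> bool:
    # Simplifying assumption: multibyte exactly when non-ASCII characters occur.
    return not text.isascii()


def echo_area_for_message(text: str, current: Buffer) -> Buffer:
    """'message' path: the string is known up front, so the echo area follows it."""
    return Buffer(" *Echo Area 0*", multibyte=string_is_multibyte(text))


def echo_area_for_printing(current: Buffer) -> Buffer:
    """'princ' path: no string is available yet, so the echo area follows the current buffer."""
    return Buffer(" *Echo Area 0*", multibyte=current.multibyte)


# Scenario from the thread: the current buffer was made unibyte.
unibyte_scratch = Buffer("*scratch*", multibyte=False)
text = "can\u2019t"

via_message = echo_area_for_message(text, unibyte_scratch)
via_error = echo_area_for_printing(unibyte_scratch)

print(via_message.multibyte)  # True
print(via_error.multibyte)    # False
```

In this model only the printing path inherits the unibyte setting, which matches the observation that replacing 'error' with 'message' makes the problem disappear.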
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#30405; Package emacs. (Sun, 11 Feb 2018 17:27:02 GMT)
Message #55 received at 30405 <at> debbugs.gnu.org (full text, mbox):
Eli Zaretskii wrote:
>> Cc: rgm <at> gnu.org, gazally <at> runbox.com, 30405 <at> debbugs.gnu.org
>> From: Paul Eggert <eggert <at> cs.ucla.edu>
>> Date: Sat, 10 Feb 2018 10:57:28 -0800
>>
>> Eli Zaretskii wrote:
>>> We make the echo area
>>> buffer unibyte when the message is generated with the current buffer
>>> being unibyte.
>>
>> This made sense back in the 1990s when unibyte was commonly used for text.
>> Nowadays, though, wouldn't it make more sense to keep the echo area multibyte?
>> The echo area is intended for text, not for binary data.
>
> I don't see how the date outside could matter here.
What I was trying to say is that back in the 1990s it was relatively common for
people to run Emacs mostly in unibyte mode and to edit files in a Latin-1
locale, so it was natural for programmers to expect the echo area to be
consistent with the file being edited. Nowadays we live in a mostly-multibyte
world, where unibyte inside Emacs is intended only for binary data, and so it's
no longer a reasonable design choice to have the echo area (which is intended
for text messages to the user) to be unibyte (which is now intended for binary
data).
> I have a guess for why we did that: it's because in Emacs 21 we
> displayed raw bytes as Latin-N characters, so non-ASCII text in
> unibyte strings needed a unibyte buffer to display it as expected.
> But that feature is no longer available, as raw bytes are always
> displayed as octal escapes.
Sounds plausible.
> The question that bothers me is can a unibyte string inserted or
> printed into a multibyte buffer be converted to something that will
> display as a non-ASCII character, not as an octal escape.
Surely we can arrange for the latter.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#30405; Package emacs. (Sun, 11 Feb 2018 18:05:02 GMT)
Message #58 received at 30405 <at> debbugs.gnu.org (full text, mbox):
> Cc: npostavs <at> users.sourceforge.net, rgm <at> gnu.org, gazally <at> runbox.com,
> 30405 <at> debbugs.gnu.org
> From: Paul Eggert <eggert <at> cs.ucla.edu>
> Date: Sun, 11 Feb 2018 09:26:41 -0800
>
> > The question that bothers me is can a unibyte string inserted or
> > printed into a multibyte buffer be converted to something that will
> > display as a non-ASCII character, not as an octal escape.
>
> Surely we can arrange for the latter.
I think we already do. At least I couldn't find a way to display a
raw byte as a non-ASCII Latin character in a unibyte buffer. If no
one can, we could probably remove that unibyte/multibyte magic in echo
area.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#30405; Package emacs. (Sun, 11 Feb 2018 18:17:01 GMT)
Message #61 received at 30405 <at> debbugs.gnu.org (full text, mbox):
> Date: Sun, 11 Feb 2018 20:04:19 +0200
> From: Eli Zaretskii <eliz <at> gnu.org>
> Cc: npostavs <at> users.sourceforge.net, 30405 <at> debbugs.gnu.org, gazally <at> runbox.com
>
> > Cc: npostavs <at> users.sourceforge.net, rgm <at> gnu.org, gazally <at> runbox.com,
> > 30405 <at> debbugs.gnu.org
> > From: Paul Eggert <eggert <at> cs.ucla.edu>
> > Date: Sun, 11 Feb 2018 09:26:41 -0800
> >
> > > The question that bothers me is can a unibyte string inserted or
> > > printed into a multibyte buffer be converted to something that will
> > > display as a non-ASCII character, not as an octal escape.
> >
> > Surely we can arrange for the latter.
>
> I think we already do. At least I couldn't find a way to display a
> raw byte as a non-ASCII Latin character in a unibyte buffer. If no
> one can, we could probably remove that unibyte/multibyte magic in echo
> area.
Actually, I can:
emacs -Q
M-x set-variable RET unibyte-display-via-language-environment RET t RET
M-: (set-buffer-multibyte nil) RET
C-q 0242 SPC
This should display ¢.
So I think we can get rid of making echo-area buffers unibyte, as long
as we make sure that variable is nil (which it is by default).
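Editorial note: the recipe above types the byte with octal value 242. The short Python sketch below, which is independent of Emacs, checks the arithmetic behind it: octal 242 is 0xA2, and reading that single byte as Latin-1 gives U+00A2 CENT SIGN, while the same byte shown as an octal escape is \242. That Emacs consults Latin-1 here rather than another language environment is assumed from the default setup used in the recipe.

```python
# The byte typed with "C-q 0242" in the recipe.
raw = bytes([0o242])

byte_value_hex = hex(raw[0])
latin1_view = raw.decode("latin-1")              # the byte read as a Latin-1 character
escape_view = "\\" + format(raw[0], "03o")        # the byte shown as an octal escape

print(byte_value_hex)             # 0xa2
print(latin1_view == "\u00a2")    # True
print(escape_view)                # \242
```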
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#30405; Package emacs. (Sun, 11 Feb 2018 20:37:02 GMT)
Message #64 received at 30405 <at> debbugs.gnu.org (full text, mbox):
Eli Zaretskii wrote:
> emacs -Q
> M-x set-variable RET unibyte-display-via-language-environment RET t RET
> M-: (set-buffer-multibyte nil) RET
> C-q 0242 SPC
>
> This should display ¢.
>
> So I think we can get rid of making echo-area buffers unibyte, as long
> as we make sure that variable is nil (which it is by default).
Getting rid of it sounds good, but why do we need to worry about
unibyte-display-via-language-environment? For me, the attached patch does
display that test as ¢, and it fixes the other test cases reported so far in
Bug#30405. And yet this patch works without worrying about
unibyte-display-via-language-environment, even if I run Emacs in a unibyte
locale like en_US.iso885915 (a practice that's no longer common).
For what it's worth, I'm testing on Ubuntu 16.04 and on Fedora 27, built
--without-imagemagick so that I can reproduce the original problem.
[0001-Echo-area-multibyteness-is-independent-of-buffer.patch (text/x-patch, attachment)]
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#30405; Package emacs. (Mon, 12 Feb 2018 18:22:01 GMT)
Message #67 received at 30405 <at> debbugs.gnu.org (full text, mbox):
> Cc: npostavs <at> users.sourceforge.net, 30405 <at> debbugs.gnu.org, gazally <at> runbox.com
> From: Paul Eggert <eggert <at> cs.ucla.edu>
> Date: Sun, 11 Feb 2018 12:36:47 -0800
>
> > emacs -Q
> > M-x set-variable RET unibyte-display-via-language-environment RET t RET
> > M-: (set-buffer-multibyte nil) RET
> > C-q 0242 SPC
> >
> > This should display ¢.
> >
> > So I think we can get rid of making echo-area buffers unibyte, as long
> > as we make sure that variable is nil (which it is by default).
>
> Getting rid of it sounds good, but why do we need to worry about
> unibyte-display-via-language-environment?
Because with your patch the following doesn't work as it did before:
emacs -Q
M-x set-variable RET unibyte-display-via-language-environment RET t RET
M-: (set-buffer-multibyte nil) RET
(defun foo ()
  (interactive)
  (message "cannot ¢")) C-j
M-x foo RET
(To insert ¢, type "C-q 0242".) And the same if you replace 'message'
with 'error'. With your patch, I see an octal escape, not ¢, i.e. the
effect of unibyte-display-via-language-environment is lost.
> And yet this patch works without worrying about
> unibyte-display-via-language-environment, even if I run Emacs in a unibyte
> locale like en_US.iso885915 (a practice that's no longer common).
Whether the locale is unibyte or not is immaterial, because we use
multibyte representation for Latin-N characters as well. What is
important is to be in a unibyte buffer and display a message with
non-ASCII bytes, while unibyte-display-via-language-environment is
non-nil.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#30405; Package emacs. (Mon, 12 Feb 2018 19:35:01 GMT)
Message #70 received at 30405 <at> debbugs.gnu.org (full text, mbox):
On 02/12/2018 10:21 AM, Eli Zaretskii wrote:
>> Getting rid of it sounds good, but why do we need to worry about
>> unibyte-display-via-language-environment?
> Because with your patch the following doesn't work as it did before:
>   emacs -Q
>   M-x set-variable RET unibyte-display-via-language-environment RET t RET
>   M-: (set-buffer-multibyte nil) RET
>   (defun foo ()
>     (interactive)
>     (message "cannot ¢")) C-j
>   M-x foo RET
> (To insert ¢, type "C-q 0242".) And the same if you replace 'message'
> with 'error'. With your patch, I see an octal escape, not ¢, i.e. the
> effect of unibyte-display-via-language-environment is lost.
That's OK, as the effect is intended for unibyte buffers and strings
which are intended for binary data, whereas the echo area is intended to
be text. That is, you're correct that there is a change in behavior
here; the change is an improvement.
The example that you gave is more clearly formulated as follows:
emacs -Q
M-x set-variable RET unibyte-display-via-language-environment RET t RET
(defun foo ()
  (interactive)
  (message "cannot \242")) C-j
M-x foo RET
because the message deliberately contains the binary byte with octal
value 242. It's more appropriate for this example to display "\242" in
the echo area than to display "¢", because the echo area is text, not
binary data.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#30405; Package emacs. (Mon, 12 Feb 2018 20:00:02 GMT)
Message #73 received at 30405 <at> debbugs.gnu.org (full text, mbox):
> Cc: npostavs <at> users.sourceforge.net, 30405 <at> debbugs.gnu.org, gazally <at> runbox.com
> From: Paul Eggert <eggert <at> cs.ucla.edu>
> Date: Mon, 12 Feb 2018 11:34:13 -0800
>
> emacs -Q
> M-x set-variable RET unibyte-display-via-language-environment RET t RET
> (defun foo ()
> (interactive)
> (message "cannot \242")) C-j
> M-x foo RET
>
> because the message deliberately contains the binary byte with octal
> value 242. It's more appropriate for this example to display "\242" in
> the echo area than to display "¢", because the echo area is text, not
> binary data.
I disagree, because unibyte-display-via-language-environment
explicitly requests display of raw bytes as Latin-1 characters, and it
requests that everywhere, including the echo area and whatnot. That's
the whole raison d'être of that feature.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#30405; Package emacs. (Mon, 12 Feb 2018 20:33:01 GMT)
Message #76 received at 30405 <at> debbugs.gnu.org (full text, mbox):
On 02/12/2018 11:59 AM, Eli Zaretskii wrote:
> unibyte-display-via-language-environment
> explicitly requests display of raw bytes as Latin-1 characters, and it
> requests that everywhere, including the echo area and whatnot.
That's not how Emacs works, at least not in my experience. For example,
on current emacs-26:
emacs -Q
M-x set-variable RET unibyte-display-via-language-environment RET t RET
(defun foo ()
  (interactive)
  (message "cannot \xA2\u00A2")) C-j
M-x foo RET
This displays "\242¢", not "¢¢".
No doubt this isn't documented as well as it should be, but from looking
at the source code get_next_display_element it's clear that
unibyte-display-via-language-environment does not simply display every
raw byte as a Latin-1 character; instead, the code also takes context
into account, and if the context is multibyte then
unibyte-display-via-language-environment is ignored. Since the echo
area's context is text and not binary data, the display of raw bytes in
the echo area should be unaffected by
unibyte-display-via-language-environment.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#30405; Package emacs. (Tue, 13 Feb 2018 05:04:03 GMT)
Message #79 received at 30405 <at> debbugs.gnu.org (full text, mbox):
> Cc: npostavs <at> users.sourceforge.net, 30405 <at> debbugs.gnu.org, gazally <at> runbox.com
> From: Paul Eggert <eggert <at> cs.ucla.edu>
> Date: Mon, 12 Feb 2018 12:31:55 -0800
>
> On 02/12/2018 11:59 AM, Eli Zaretskii wrote:
> > unibyte-display-via-language-environment
> > explicitly requests display of raw bytes as Latin-1 characters, and it
> > requests that everywhere, including the echo area and whatnot.
>
> That's not how Emacs works, at least not in my experience. For example,
> on current emacs-26:
>
> emacs -Q
> M-x set-variable RET unibyte-display-via-language-environment RET t RET
> (defun foo ()
> (interactive)
> (message "cannot \xA2\u00A2")) C-j
> M-x foo RET
>
> This displays "\242¢", not "¢¢".
Because you shoot yourself in the foot by passing a multibyte string
to 'message'. In this scenario, you are not supposed to do that, you
are supposed to use only unibyte strings.
Also, did you try the variant with 'error' instead of 'message' (in
which case you need to make *scratch* unibyte before invoking 'foo')?
> Since the echo area's context is text and not binary data, the
> display of raw bytes in the echo area should be unaffected by
> unibyte-display-via-language-environment.
That variable's purpose is to display raw bytes as readable text, so I
definitely disagree in this specific use case.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#30405; Package emacs. (Tue, 13 Feb 2018 17:44:01 GMT)
Message #82 received at 30405 <at> debbugs.gnu.org (full text, mbox):
On 02/12/2018 09:03 PM, Eli Zaretskii wrote:
> In this scenario, you are not supposed to do that, you
> are supposed to use only unibyte strings.
In that case the user should set the echo area to be unibyte, and if
there's not a convenient way to do that then we should supply one. In
the meantime Emacs is messed up, since it just guesses whether the echo
area should be unibyte and (as we've seen) it guesses wrong in common cases.
> Also, did you try the variant with 'error' instead of 'message' (in
> which case you need to make *scratch* unibyte before invoking 'foo')?
In that setup in the emacs-26 branch, (error "\xA2\u00A2") displays "¢¢"
in the echo area and "\242¢" in *Backtrace* and "\300\242\302\242" in
*Messages*, which is bogus. The 'message' variant displays "\242¢" in
all three places; this is much better behavior.
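Editorial note: the string "\300\242\302\242" reported for *Messages* is consistent with the bytes of Emacs's internal multibyte representation being shown one by one. The Python sketch below rests on an assumption about that representation: a raw byte b in the range 0x80..0xFF is stored as the two bytes 0xC0 | ((b >> 6) & 1) and 0x80 | (b & 0x3F), while an ordinary character such as U+00A2 is stored as its UTF-8 encoding. The helper names are hypothetical.

```python
def octal_escapes(data: bytes) -> str:
    """Render each byte as a backslash followed by three octal digits."""
    return "".join(f"\\{byte:03o}" for byte in data)


def internal_raw_byte(byte: int) -> bytes:
    """Two-byte internal form assumed here for a raw byte in the range 0x80..0xFF."""
    if not 0x80 <= byte <= 0xFF:
        raise ValueError("only raw bytes 0x80..0xFF have this form")
    return bytes([0xC0 | ((byte >> 6) & 0x01), 0x80 | (byte & 0x3F)])


raw_part = internal_raw_byte(0xA2)       # the "\xA2" half of the test string
char_part = "\u00a2".encode("utf-8")     # the "\u00A2" half of the test string

messages_view = octal_escapes(raw_part + char_part)
print(messages_view)  # \300\242\302\242
```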
>> Since the echo area's context is text and not binary data, the
>> display of raw bytes in the echo area should be unaffected by
>> unibyte-display-via-language-environment.
> That variable's purpose is to display raw bytes as readable text, so I
> definitely disagree in this specific use case.
The abovementioned test case establishes that the variable does not in
fact always cause Emacs to display raw bytes as readable text. The only
question is whether the documentation is wrong, or the code (or both
:-). I've given a consistent interpretation that the intent of the
variable is to display raw bytes as text when in a unibyte context
(which the echo area is not). I haven't seen an alternative consistent
interpretation that corresponds to the behavior Emacs currently
exhibits (i.e., the sort of behavior that elicited this bug report).
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#30405; Package emacs. (Tue, 13 Feb 2018 19:59:01 GMT)
Message #85 received at 30405 <at> debbugs.gnu.org (full text, mbox):
> Cc: npostavs <at> users.sourceforge.net, 30405 <at> debbugs.gnu.org, gazally <at> runbox.com
> From: Paul Eggert <eggert <at> cs.ucla.edu>
> Date: Tue, 13 Feb 2018 09:43:34 -0800
>
> > That variable's purpose is to display raw bytes as readable text, so I
> > definitely disagree in this specific use case.
>
> The abovementioned test case establishes that the variable does not in
> fact always cause Emacs to display raw bytes as readable text. The only
> question is whether the documentation is wrong, or the code (or both
> :-). I've given a consistent interpretation that the intent of the
> variable is to display raw bytes as text when in a unibyte context
> (which the echo area is not). I haven't seen an alternative consistent
> interpretation that corresponds to the behavior Emacs currently
> exhibits (i.e., the sort of behavior that elicited this bug report).
That's your POV, but not mine. I'm not prepared to lose this
variable, given the dozen lines it takes to support it. Fixing the
original problem without losing the effect of this variable is easy,
so I see no reason to continue arguing.
Reply sent to Eli Zaretskii <eliz <at> gnu.org>: You have taken responsibility. (Sat, 17 Feb 2018 12:30:02 GMT)
Notification sent to Gemini Lasswell <gazally <at> runbox.com>: bug acknowledged by developer. (Sat, 17 Feb 2018 12:30:03 GMT)
Message #90 received at 30405-done <at> debbugs.gnu.org (full text, mbox):
> Date: Tue, 13 Feb 2018 21:58:30 +0200
> From: Eli Zaretskii <eliz <at> gnu.org>
> Cc: gazally <at> runbox.com, 30405 <at> debbugs.gnu.org, npostavs <at> users.sourceforge.net
>
> > Cc: npostavs <at> users.sourceforge.net, 30405 <at> debbugs.gnu.org, gazally <at> runbox.com
>
> Fixing the original problem without losing the effect of this
> variable is easy
Fixed.
I'm marking this bug done now.
bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sun, 18 Mar 2018 11:24:04 GMT)
This bug report was last modified 7 years and 92 days ago.