GNU bug report logs - #12925
24.1; string-make-unibyte instead of string-as-unibyte


Package: emacs;

Reported by: Ethan Glasser-Camp <ethan.glasser.camp <at> gmail.com>

Date: Sun, 18 Nov 2012 17:47:01 UTC

Severity: minor

Found in version 24.1

To reply to this bug, email your comments to 12925 AT debbugs.gnu.org.




Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#12925; Package emacs. (Sun, 18 Nov 2012 17:47:01 GMT)

Acknowledgement sent to Ethan Glasser-Camp <ethan.glasser.camp <at> gmail.com>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Sun, 18 Nov 2012 17:47:01 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Ethan Glasser-Camp <ethan.glasser.camp <at> gmail.com>
To: bug-gnu-emacs <at> gnu.org
Subject: 24.1; string-make-unibyte instead of string-as-unibyte
Date: Sun, 18 Nov 2012 12:45:20 -0500

This is more of a request for information than a bug report.

Consider this code:

(let ((s (string ?\u2019))) ;; RIGHT SINGLE QUOTATION MARK
  (with-temp-buffer
    (set-buffer-multibyte nil)
    (insert s)
    (buffer-string)))

This returns a string containing the single character ^Y. If you
instead swap the insert and set-buffer-multibyte calls:

(let ((s (string ?\u2019))) ;; RIGHT SINGLE QUOTATION MARK
  (with-temp-buffer
    (insert s)
    (set-buffer-multibyte nil)
    (buffer-string)))

This returns "\342\200\231" (the bytes that make up this character in
utf-8).

The first behavior is documented at the info node "(elisp)Converting
Representations" -- every character is truncated to its low 8 bits. The
second behavior is documented in the following node, "(elisp)Selecting a
Representation" -- the same bytes are left in the buffer but they are
interpreted differently.
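
For reference, both results can be reproduced without a buffer. This is
only a minimal sketch, not taken from the Emacs sources. It assumes that
the first behavior amounts to keeping the low 8 bits of the code point,
as the manual describes, and that the bytes seen in the second case
coincide with the UTF-8 encoding of the character (Emacs's internal
representation is UTF-8-based for Unicode characters). The demo-* names
are hypothetical helpers.

```elisp
(defun demo-low-byte (char)
  "Return CHAR truncated to its low 8 bits (hypothetical helper)."
  (logand char #xff))

(defun demo-utf-8-bytes (char)
  "Return the UTF-8 encoding of CHAR as a unibyte string (hypothetical helper)."
  (encode-coding-string (string char) 'utf-8))

(demo-low-byte ?\u2019)     ; 25, i.e. #x19, the control character ^Y
(demo-utf-8-bytes ?\u2019)  ; "\342\200\231"
```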

I believe that the second behavior is easier to explain and sometimes
useful and that the first one is not. So why does it exist? Why does
inserting multibyte text into a unibyte buffer corrupt it like this?


In GNU Emacs 24.1.1 (x86_64-pc-linux-gnu, GTK+ Version 2.24.12)
 of 2012-09-22 on batsu, modified by Debian
Windowing system distributor `The X.Org Foundation', version 11.0.11300000
Configured using:
 `configure '--build' 'x86_64-linux-gnu' '--build' 'x86_64-linux-gnu'
 '--prefix=/usr' '--sharedstatedir=/var/lib' '--libexecdir=/usr/lib'
 '--localstatedir=/var/lib' '--infodir=/usr/share/info'
 '--mandir=/usr/share/man' '--with-pop=yes'
 '--enable-locallisppath=/etc/emacs24:/etc/emacs:/usr/local/share/emacs/24.1/site-lisp:/usr/local/share/emacs/site-lisp:/usr/share/emacs/24.1/site-lisp:/usr/share/emacs/site-lisp'
 '--with-crt-dir=/usr/lib/x86_64-linux-gnu' '--with-x=yes'
 '--with-x-toolkit=gtk' '--with-toolkit-scroll-bars'
 'build_alias=x86_64-linux-gnu' 'CFLAGS=-g -O2 -fstack-protector
 --param=ssp-buffer-size=4 -Wformat -Werror=format-security -Wall -O2'
 'CPPFLAGS=-D_FORTIFY_SOURCE=2''

Important settings:
  value of $LC_ALL: nil
  value of $LC_COLLATE: nil
  value of $LC_CTYPE: nil
  value of $LC_MESSAGES: nil
  value of $LC_MONETARY: nil
  value of $LC_NUMERIC: nil
  value of $LC_TIME: nil
  value of $LANG: en_US.UTF-8
  value of $XMODIFIERS: nil
  locale-coding-system: utf-8-unix
  default enable-multibyte-characters: t

Major mode: Fundamental

Minor modes in effect:
  diff-auto-refine-mode: t
  cua-mode: t
  global-ethan-wspace-mode: t
  ethan-wspace-mode: t
  ethan-wspace-clean-many-nls-eof-mode: t
  ethan-wspace-clean-no-nl-eof-mode: t
  ethan-wspace-clean-eol-mode: t
  ethan-wspace-clean-tabs-mode: t
  shell-dirtrack-mode: t
  recentf-mode: t
  show-paren-mode: t
  global-auto-revert-mode: t
  xterm-mouse-mode: t
  global-undo-tree-mode: t
  undo-tree-mode: t
  sml-modeline-mode: t
  me-minor-mode: t
  tooltip-mode: t
  mouse-wheel-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  blink-cursor-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  size-indication-mode: t
  column-number-mode: t
  line-number-mode: t
  transient-mark-mode: t

Recent input:
i k e SPC u t f - 8 , SPC w e SPC r u n SPC i n t o 
SPC p r o b l e m s SPC w i t h SPC e m a c s ' s SPC 
M I M E S-SPC r o u t i s <backspace> n e s , SPC w 
h i c h SPC f o r c e SPC b u f f e r s SPC t o SPC 
b u SPC <backspace> <backspace> e SPC u n i b y t e 
. M-q SPC <backspace> SPC n o t m u c h - b o <backspace> 
<backspace> g e t - b o d y p a r t i <backspace> - 
i n t e r n a l SPC a l r e a d y SPC M-b M-b M-b M-b 
M-b <return> <return> C-e d o e s SPC t h i s , SPC 
<backspace> <backspace> . SPC B r i n g SPC w i t h 
- - n o t <backspace> <backspace> <backspace> <backspace> 
c u r r e n t - n o t m u c h - s h o w - m e s s g 
e <backspace> <backspace> a g e SPC i n t o SPC l i 
n e . M-q <up> C-e <up> <up> <backspace> M-< S-SPC 
<down> <backspace> <C-S-return> C-_ <up> C-d C-SPC 
<down> <down> <down> <down> <down> <down> <down> <down> 
<down> M-w M-x e m <tab> a c s <tab> - r e <tab> <M-backspace> 
<M-backspace> r e <tab> <backspace> <backspace> b u 
g <tab> <tab> <M-backspace> <M-backspace> m <backspace> 
e m <tab> <backspace> <backspace> r e <tab> p o <tab> 
r <tab> <return>

Recent messages:
Auto-saving...done
Mark set [2 times]
C-?:help M-p:pad M-o:open M-c:close M-b:blank M-s:string M-f:fill M-i:incr M-n:seq
Mark set
byte-code: End of buffer [2 times]
Auto-saving...done
Saving all Org-mode buffers...
(No files need saving)
Saving all Org-mode buffers... done
Making completion list... [7 times]

Load-path shadows:
/home/ethan/.emacs.d/el-get/scratch/el-get hides /home/ethan/.emacs.d/el-get/el-get/el-get
/home/ethan/.emacs.d/el-get/el-get/.dir-locals hides /home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/.dir-locals
/home/ethan/.emacs.d/el-get/el-get/.dir-locals hides /home/ethan/.emacs.d/elhome/site-lisp/upstream/magit.git/.dir-locals
/home/ethan/.emacs.d/el-get/scratch/scratch hides ~/.emacs.d/scratch
/home/ethan/.emacs.d/el-get/el-get/el-get-install hides ~/.emacs.d/el-get-install
/home/ethan/.emacs.d/el-get/browse-kill-ring/browse-kill-ring hides /usr/share/emacs24/site-lisp/emacs-goodies-el/browse-kill-ring
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/contrib/lisp/htmlize hides /usr/share/emacs24/site-lisp/emacs-goodies-el/htmlize
/home/ethan/.emacs.d/el-get/initsplit/initsplit hides /usr/share/emacs24/site-lisp/emacs-goodies-el/initsplit
~/.emacs.d/custom hides /usr/share/emacs/24.1/lisp/custom
/home/ethan/.emacs.d/el-get/package/elpa/css-mode-1.0/css-mode hides /usr/share/emacs/24.1/lisp/textmodes/css-mode
/home/ethan/.emacs.d/el-get/rst-mode/rst hides /usr/share/emacs/24.1/lisp/textmodes/rst
/usr/share/emacs24/site-lisp/dictionaries-common/ispell hides /usr/share/emacs/24.1/lisp/textmodes/ispell
/usr/share/emacs24/site-lisp/dictionaries-common/flyspell hides /usr/share/emacs/24.1/lisp/textmodes/flyspell
/home/ethan/.emacs.d/el-get/package/elpa/ruby-mode-1.1/ruby-mode hides /usr/share/emacs/24.1/lisp/progmodes/ruby-mode
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/org-footnote hides /usr/share/emacs/24.1/lisp/org/org-footnote
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/org-publish hides /usr/share/emacs/24.1/lisp/org/org-publish
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/org-ascii hides /usr/share/emacs/24.1/lisp/org/org-ascii
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/ob-ledger hides /usr/share/emacs/24.1/lisp/org/ob-ledger
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/org-mobile hides /usr/share/emacs/24.1/lisp/org/org-mobile
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/ob-scheme hides /usr/share/emacs/24.1/lisp/org/ob-scheme
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/ob-sqlite hides /usr/share/emacs/24.1/lisp/org/ob-sqlite
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/ob-dot hides /usr/share/emacs/24.1/lisp/org/ob-dot
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/ob-plantuml hides /usr/share/emacs/24.1/lisp/org/ob-plantuml
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/org-mouse hides /usr/share/emacs/24.1/lisp/org/org-mouse
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/org-docbook hides /usr/share/emacs/24.1/lisp/org/org-docbook
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/org-irc hides /usr/share/emacs/24.1/lisp/org/org-irc
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/org-capture hides /usr/share/emacs/24.1/lisp/org/org-capture
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/org-pcomplete hides /usr/share/emacs/24.1/lisp/org/org-pcomplete
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/org-feed hides /usr/share/emacs/24.1/lisp/org/org-feed
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/ob-octave hides /usr/share/emacs/24.1/lisp/org/ob-octave
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/org-exp hides /usr/share/emacs/24.1/lisp/org/org-exp
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/org-html hides /usr/share/emacs/24.1/lisp/org/org-html
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/ob-latex hides /usr/share/emacs/24.1/lisp/org/ob-latex
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/ob-mscgen hides /usr/share/emacs/24.1/lisp/org/ob-mscgen
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/ob-matlab hides /usr/share/emacs/24.1/lisp/org/ob-matlab
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/ob-css hides /usr/share/emacs/24.1/lisp/org/ob-css
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/ob-org hides /usr/share/emacs/24.1/lisp/org/ob-org
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/org-latex hides /usr/share/emacs/24.1/lisp/org/org-latex
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/org-datetree hides /usr/share/emacs/24.1/lisp/org/org-datetree
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/org-compat hides /usr/share/emacs/24.1/lisp/org/org-compat
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/org-mks hides /usr/share/emacs/24.1/lisp/org/org-mks
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/ob-comint hides /usr/share/emacs/24.1/lisp/org/ob-comint
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/ob-maxima hides /usr/share/emacs/24.1/lisp/org/ob-maxima
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/org-special-blocks hides /usr/share/emacs/24.1/lisp/org/org-special-blocks
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/org-wl hides /usr/share/emacs/24.1/lisp/org/org-wl
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/ob-ocaml hides /usr/share/emacs/24.1/lisp/org/ob-ocaml
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/ob-ruby hides /usr/share/emacs/24.1/lisp/org/ob-ruby
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/org-beamer hides /usr/share/emacs/24.1/lisp/org/org-beamer
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/org-protocol hides /usr/share/emacs/24.1/lisp/org/org-protocol
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/org-list hides /usr/share/emacs/24.1/lisp/org/org-list
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/org-bbdb hides /usr/share/emacs/24.1/lisp/org/org-bbdb
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/org-docview hides /usr/share/emacs/24.1/lisp/org/org-docview
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/org-w3m hides /usr/share/emacs/24.1/lisp/org/org-w3m
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/ob-keys hides /usr/share/emacs/24.1/lisp/org/ob-keys
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/ob-R hides /usr/share/emacs/24.1/lisp/org/ob-R
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/org-taskjuggler hides /usr/share/emacs/24.1/lisp/org/org-taskjuggler
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/ob-awk hides /usr/share/emacs/24.1/lisp/org/ob-awk
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/org-entities hides /usr/share/emacs/24.1/lisp/org/org-entities
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/org-agenda hides /usr/share/emacs/24.1/lisp/org/org-agenda
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/ob-table hides /usr/share/emacs/24.1/lisp/org/ob-table
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/ob hides /usr/share/emacs/24.1/lisp/org/ob
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/ob-ditaa hides /usr/share/emacs/24.1/lisp/org/ob-ditaa
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/ob-tangle hides /usr/share/emacs/24.1/lisp/org/ob-tangle
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/org-remember hides /usr/share/emacs/24.1/lisp/org/org-remember
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/org-rmail hides /usr/share/emacs/24.1/lisp/org/org-rmail
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/ob-sql hides /usr/share/emacs/24.1/lisp/org/ob-sql
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/ob-ref hides /usr/share/emacs/24.1/lisp/org/ob-ref
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/org-vm hides /usr/share/emacs/24.1/lisp/org/org-vm
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/org-habit hides /usr/share/emacs/24.1/lisp/org/org-habit
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/ob-lisp hides /usr/share/emacs/24.1/lisp/org/ob-lisp
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/org hides /usr/share/emacs/24.1/lisp/org/org
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/org-faces hides /usr/share/emacs/24.1/lisp/org/org-faces
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/org-inlinetask hides /usr/share/emacs/24.1/lisp/org/org-inlinetask
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/org-colview hides /usr/share/emacs/24.1/lisp/org/org-colview
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/ob-sass hides /usr/share/emacs/24.1/lisp/org/ob-sass
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/org-id hides /usr/share/emacs/24.1/lisp/org/org-id
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/ob-calc hides /usr/share/emacs/24.1/lisp/org/ob-calc
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/org-exp-blocks hides /usr/share/emacs/24.1/lisp/org/org-exp-blocks
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/ob-gnuplot hides /usr/share/emacs/24.1/lisp/org/ob-gnuplot
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/org-mac-message hides /usr/share/emacs/24.1/lisp/org/org-mac-message
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/ob-lob hides /usr/share/emacs/24.1/lisp/org/ob-lob
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/ob-python hides /usr/share/emacs/24.1/lisp/org/ob-python
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/org-archive hides /usr/share/emacs/24.1/lisp/org/org-archive
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/ob-eval hides /usr/share/emacs/24.1/lisp/org/ob-eval
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/org-plot hides /usr/share/emacs/24.1/lisp/org/org-plot
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/org-clock hides /usr/share/emacs/24.1/lisp/org/org-clock
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/org-timer hides /usr/share/emacs/24.1/lisp/org/org-timer
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/ob-exp hides /usr/share/emacs/24.1/lisp/org/ob-exp
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/ob-sh hides /usr/share/emacs/24.1/lisp/org/ob-sh
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/org-info hides /usr/share/emacs/24.1/lisp/org/org-info
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/org-attach hides /usr/share/emacs/24.1/lisp/org/org-attach
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/ob-asymptote hides /usr/share/emacs/24.1/lisp/org/ob-asymptote
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/contrib/babel/langs/ob-fortran hides /usr/share/emacs/24.1/lisp/org/ob-fortran
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/org-icalendar hides /usr/share/emacs/24.1/lisp/org/org-icalendar
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/ob-lilypond hides /usr/share/emacs/24.1/lisp/org/ob-lilypond
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/org-indent hides /usr/share/emacs/24.1/lisp/org/org-indent
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/org-mhe hides /usr/share/emacs/24.1/lisp/org/org-mhe
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/ob-clojure hides /usr/share/emacs/24.1/lisp/org/ob-clojure
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/ob-screen hides /usr/share/emacs/24.1/lisp/org/ob-screen
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/ob-perl hides /usr/share/emacs/24.1/lisp/org/ob-perl
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/org-ctags hides /usr/share/emacs/24.1/lisp/org/org-ctags
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/contrib/lisp/org-odt hides /usr/share/emacs/24.1/lisp/org/org-odt
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/org-crypt hides /usr/share/emacs/24.1/lisp/org/org-crypt
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/org-xoxo hides /usr/share/emacs/24.1/lisp/org/org-xoxo
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/ob-js hides /usr/share/emacs/24.1/lisp/org/ob-js
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/contrib/lisp/org-lparse hides /usr/share/emacs/24.1/lisp/org/org-lparse
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/ob-java hides /usr/share/emacs/24.1/lisp/org/ob-java
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/org-src hides /usr/share/emacs/24.1/lisp/org/org-src
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/ob-C hides /usr/share/emacs/24.1/lisp/org/ob-C
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/org-freemind hides /usr/share/emacs/24.1/lisp/org/org-freemind
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/org-macs hides /usr/share/emacs/24.1/lisp/org/org-macs
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/org-mew hides /usr/share/emacs/24.1/lisp/org/org-mew
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/ob-haskell hides /usr/share/emacs/24.1/lisp/org/ob-haskell
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/org-gnus hides /usr/share/emacs/24.1/lisp/org/org-gnus
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/ob-emacs-lisp hides /usr/share/emacs/24.1/lisp/org/ob-emacs-lisp
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/org-jsinfo hides /usr/share/emacs/24.1/lisp/org/org-jsinfo
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/org-table hides /usr/share/emacs/24.1/lisp/org/org-table
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/contrib/lisp/org-eshell hides /usr/share/emacs/24.1/lisp/org/org-eshell
/home/ethan/.emacs.d/elhome/site-lisp/upstream/org-mode.git/lisp/org-bibtex hides /usr/share/emacs/24.1/lisp/org/org-bibtex
/home/ethan/.emacs.d/el-get/el-get/.dir-locals hides /usr/share/emacs/24.1/lisp/gnus/.dir-locals

Features:
(shadow emacsbug cua-rect hi-lock shr-color color shr browse-url
gnus-art mm-uu mml2015 epg-config gnus-sum nnoo gnus-group gnus-undo
nnmail mail-source gnus-start gnus-spec gnus-int gnus-range gnus-win
gnus gnus-ems nnheader vc-bzr conf-mode dired-aux tramp-cmds face-remap
mailalias ielm sendmail multi-isearch skeleton sh-script sort mail-extr
mule-util notmuch notmuch-message notmuch-maildir-fcc notmuch-hello
notmuch-show notmuch-print notmuch-crypto notmuch-mua rfc2368
notmuch-address notmuch-wash diff-mode coolj notmuch-query goto-addr
icalendar notmuch-tag crm notmuch-lib json message rfc822 mml mailabbrev
mail-utils gmm-utils mailheader mm-view mml-smime mml-sec smime dig
mm-decode mm-bodies mm-encode mail-parse rfc2231 rfc2047 rfc2045
ietf-drums executable image-file org-irc org-capture vc-git flyspell
ispell bibtex diary-lib diary-loaddefs org noutline outline cal-menu
calendar cal-loaddefs ffap hl-line idle-highlight css-mode-autoloads
find-file-in-project-autoloads idle-highlight-autoloads
inf-ruby-autoloads rainbow-mode-autoloads ruby-electric-autoloads
ruby-mode-autoloads log-edit pcvs-util add-log ethan-misc elide-head
cua-base info color-theme ido tramp-cache tramp-sh tramp tramp-compat
shell pcomplete format-spec tramp-loaddefs recentf tree-widget paren
autorevert xt-mouse imenu thingatpt uniquify ethan-el-get .loaddefs
twittering-mode url url-proxy url-privacy url-expand url-methods
url-history url-cookie url-util url-parse auth-source eieio assoc
gnus-util password-cache url-vars mm-util mail-prsvr mailcap xml
yasnippet undo-tree diff rst compile comint ansi-color ring newcomment
whole-line-or-region browse-kill-ring java-mode-indent-annotations iedit
rect paredit edmacro kmacro rainbow-mode windmove byte-code-cache
initsplit byte-opt warnings advice advice-preload cus-edit cus-start
cus-load wid-edit find-func el-get el-get-autoloads el-get-list-packages
el-get-notify help-mode easymenu view el-get-dependencies el-get-build
el-get-status pp el-get-recipes el-get-byte-compile el-get-methods
el-get-fossil el-get-svn el-get-pacman el-get-github-zip
el-get-github-tar el-get-http-zip el-get-http-tar el-get-hg
el-get-git-svn el-get-fink el-get-emacswiki el-get-http
el-get-emacsmirror el-get-github el-get-git el-get-elpa package
tabulated-list el-get-darcs el-get-cvs el-get-bzr el-get-brew
el-get-builtin el-get-apt-get el-get-custom el-get-core autoload
help-fns bytecomp byte-compile cconv macroexp cl dired regexp-opt
emacs-goodies-el emacs-goodies-custom emacs-goodies-loaddefs easy-mmode
time-date tooltip ediff-hook vc-hooks lisp-float-type mwheel x-win x-dnd
tool-bar dnd fontset image fringe lisp-mode register page menu-bar
rfn-eshadow timer select scroll-bar mouse jit-lock font-lock syntax
facemenu font-core frame cham georgian utf-8-lang misc-lang vietnamese
tibetan thai tai-viet lao korean japanese hebrew greek romanian slovak
czech european ethiopic indian cyrillic chinese case-table epa-hook
jka-cmpr-hook help simple abbrev minibuffer loaddefs button faces
cus-face files text-properties overlay sha1 md5 base64 format env
code-pages mule custom widget hashtable-print-readable backquote
make-network-process dbusbind dynamic-setting system-font-setting
font-render-setting move-toolbar gtk x-toolkit x multi-tty emacs)




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#12925; Package emacs. (Mon, 19 Nov 2012 02:29:01 GMT)

Message #8 received at 12925 <at> debbugs.gnu.org:

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Ethan Glasser-Camp <ethan.glasser.camp <at> gmail.com>
Cc: 12925 <at> debbugs.gnu.org
Subject: Re: bug#12925: 24.1; string-make-unibyte instead of string-as-unibyte
Date: Sun, 18 Nov 2012 21:27:17 -0500
> Why does inserting multibyte text into a unibyte buffer corrupt it
> like this?

Because the right thing (i.e. signaling an error) was not backward
compatible with broken code that assumed that chars can be represented
in 8 bits (i.e. code written in the glory days of latin-N, koi-8, ...).

We could/should probably try to do the right thing now, since such
broken code is probably much less common.


        Stefan




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#12925; Package emacs. (Tue, 01 Jun 2021 07:03:02 GMT)

Message #11 received at 12925 <at> debbugs.gnu.org:

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 12925 <at> debbugs.gnu.org, Ethan Glasser-Camp <ethan.glasser.camp <at> gmail.com>
Subject: Re: bug#12925: 24.1; string-make-unibyte instead of string-as-unibyte
Date: Tue, 01 Jun 2021 09:02:13 +0200
Stefan Monnier <monnier <at> iro.umontreal.ca> writes:

>> Why does inserting multibyte text into a unibyte buffer corrupt it
>> like this?
>
> Because the right thing (i.e. signaling an error) was not backward
> compatible with broken code that assumed that chars can be presented
> with 8bit (i.e. code written in the glory days of latin-N, koi-8, ...).
>
> We could/should probably try to do the right thing now, since such
> broken code is probably much less common.

(Now eight years later.)

So the suggestion is to make inserting multibyte strings into a unibyte
buffer signal an error (instead of inserting the lower byte of
characters).

Has anybody experimented with doing this and seeing whether this signals
a lot of errors in daily usage?

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#12925; Package emacs. (Tue, 01 Jun 2021 11:58:02 GMT)

Message #14 received at 12925 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: Lars Ingebrigtsen <larsi <at> gnus.org>
Cc: monnier <at> iro.umontreal.ca, 12925 <at> debbugs.gnu.org,
 ethan.glasser.camp <at> gmail.com
Subject: Re: bug#12925: 24.1; string-make-unibyte instead of string-as-unibyte
Date: Tue, 01 Jun 2021 14:56:39 +0300
> From: Lars Ingebrigtsen <larsi <at> gnus.org>
> Date: Tue, 01 Jun 2021 09:02:13 +0200
> Cc: 12925 <at> debbugs.gnu.org, Ethan Glasser-Camp <ethan.glasser.camp <at> gmail.com>
> 
> Stefan Monnier <monnier <at> iro.umontreal.ca> writes:
> 
> >> Why does inserting multibyte text into a unibyte buffer corrupt it
> >> like this?
> >
> > Because the right thing (i.e. signaling an error) was not backward
> > compatible with broken code that assumed that chars can be presented
> > with 8bit (i.e. code written in the glory days of latin-N, koi-8, ...).
> >
> > We could/should probably try to do the right thing now, since such
> > broken code is probably much less common.
> 
> (Now eight years later.)
> 
> So the suggestion is to make inserting multibyte strings into a unibyte
> buffer signal an error (instead of inserting the lower byte of
> characters).
> 
> Has anybody experimented with doing this and seeing whether this signals
> a lot of errors in daily usage?

Why not make both methods do the same: insert the bytes of the
multibyte text into the unibyte buffer?

Making the buffer unibyte after insertion is a PITA, because it could
be very slow if the text in the buffer is long.  That's why people may
wish to do it the other way around: making an empty buffer unibyte is
a snap.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#12925; Package emacs. (Tue, 01 Jun 2021 13:46:01 GMT)

Message #17 received at 12925 <at> debbugs.gnu.org:

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: Lars Ingebrigtsen <larsi <at> gnus.org>, 12925 <at> debbugs.gnu.org,
 ethan.glasser.camp <at> gmail.com
Subject: Re: bug#12925: 24.1; string-make-unibyte instead of string-as-unibyte
Date: Tue, 01 Jun 2021 09:45:07 -0400
> Why not make both methods do the same: insert the bytes of the
> multibyte text into the unibyte buffer?

AFAIK it's rather unusual to need to insert a text that's multibyte into
a buffer that's unibyte.  And in those cases, the right behavior is not
always the same (sometimes it should convert using something like
locale-coding-system, sometimes it should preserve the actual
byte-sequence used internally, sometimes it should signal an error, ...).

So I think, as much as possible, we should refrain from guessing and
rather request that the coder call `encode-coding-string` or something
like that explicitly to say what they want.
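
As an illustration of that explicit style, here is a minimal sketch.
The helper name demo-unibyte-bytes is hypothetical, and utf-8 is just
one possible choice of coding system; locale-coding-system or another
one may be what a given caller actually wants.

```elisp
(defun demo-unibyte-bytes (string coding-system)
  "Return the bytes of STRING encoded with CODING-SYSTEM, via a unibyte buffer.
Hypothetical helper, for illustration only."
  (with-temp-buffer
    (set-buffer-multibyte nil)   ; cheap, since the buffer is still empty
    ;; Encode explicitly, so `insert' receives a unibyte string and
    ;; has nothing to guess about.
    (insert (encode-coding-string string coding-system))
    (buffer-string)))

(demo-unibyte-bytes (string ?\u2019) 'utf-8)  ; "\342\200\231"
```

Written this way, the result does not depend on how `insert' treats
multibyte text in a unibyte buffer, because no multibyte text ever
reaches it.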

> Making the buffer unibyte after insertion is a PITA, because it could
> be very slow if the text in the buffer is long.

Agreed.  In my book `set-buffer-multibyte` should signal an error if the
buffer is not empty (yes, I know it's not going to happen, but I think
it's the direction we should be headed).


        Stefan





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#12925; Package emacs. (Tue, 01 Jun 2021 14:05:02 GMT)

Message #20 received at 12925 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: larsi <at> gnus.org, 12925 <at> debbugs.gnu.org, ethan.glasser.camp <at> gmail.com
Subject: Re: bug#12925: 24.1; string-make-unibyte instead of string-as-unibyte
Date: Tue, 01 Jun 2021 17:03:53 +0300
> From: Stefan Monnier <monnier <at> iro.umontreal.ca>
> Cc: Lars Ingebrigtsen <larsi <at> gnus.org>,  12925 <at> debbugs.gnu.org,
>   ethan.glasser.camp <at> gmail.com
> Date: Tue, 01 Jun 2021 09:45:07 -0400
> 
> > Why not make both methods do the same: insert the bytes of the
> > multibyte text into the unibyte buffer?
> 
> AFAIK it's rather unusual to need to insert a text that's multibyte into
> a buffer that's unibyte.

Most possibly, people don't know the text is multibyte.  Or don't
care.

> And in those cases, the right behavior is not always the same
> (sometimes it should covert using something like
> locale-coding-system, sometimes it should preserve the actual
> byte-sequence used internally, sometimes it should signal an error,
> ...).

What I mean is: if we think the current behavior is broken, then what
I suggest is at least less broken (and sometimes might just be TRT).
At the very least what I suggest is reversible, whereas neither the
current behavior nor what you suggest is.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#12925; Package emacs. (Tue, 01 Jun 2021 14:26:01 GMT)

Message #23 received at 12925 <at> debbugs.gnu.org:

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: larsi <at> gnus.org, 12925 <at> debbugs.gnu.org, ethan.glasser.camp <at> gmail.com
Subject: Re: bug#12925: 24.1; string-make-unibyte instead of string-as-unibyte
Date: Tue, 01 Jun 2021 10:25:17 -0400
>> > Why not make both methods do the same: insert the bytes of the
>> > multibyte text into the unibyte buffer?
>> AFAIK it's rather unusual to need to insert a text that's multibyte into
>> a buffer that's unibyte.
> Most possibly, people don't know the text is multibyte.
> Or don't care.

If they don't know or don't care, then the best we can do is signal an
error to try and wake them up: they *should* know and they *should*
care, otherwise it's a bit like inserting in "any buffer you like,
I don't care".

>> And in those cases, the right behavior is not always the same
>> (sometimes it should convert using something like
>> locale-coding-system, sometimes it should preserve the actual
>> byte-sequence used internally, sometimes it should signal an error,
>> ...).
> What I mean is: if we think the current behavior is broken, then what
> I suggest is at least less broken (and sometimes might just be TRT).

I doubt it's less broken: sometimes it will be TRT, other times it will
be worse than what we have.

> At the very least what I suggest is reversible, whereas neither the
> current behavior nor what you suggest is.

My point is that we shouldn't even get into the position of having to
make such arbitrary choices: we should signal an error before we
get there.


        Stefan
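
[Illustration, not part of the original message.]  A rough sketch of
the "signal an error" position.  The function name `my-insert-strict'
is hypothetical and is not part of Emacs; the check assumes that raw
bytes in multibyte strings occupy the character codes #x3FFF80 through
#x3FFFFF, so that ASCII and raw bytes are still allowed through.

```elisp
(defun my-insert-strict (string)
  "Insert STRING at point, refusing non-ASCII characters in a unibyte buffer.
Hypothetical helper for illustration; not part of Emacs."
  (when (and (not enable-multibyte-characters) ; current buffer is unibyte
             (multibyte-string-p string))
    (dolist (ch (string-to-list string))
      ;; ASCII is below #x80; raw bytes are assumed to be #x3FFF80..#x3FFFFF.
      ;; Anything in between is a real character with no single-byte form.
      (when (and (>= ch #x80) (< ch #x3FFF80))
        (error "Multibyte character 0x%X inserted into unibyte buffer" ch))))
  (insert string))
```

With this helper, evaluating
(with-temp-buffer (set-buffer-multibyte nil) (my-insert-strict (string ?\u2019)))
would signal an error instead of silently inserting a single byte.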





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#12925; Package emacs. (Tue, 01 Jun 2021 15:27:01 GMT) Full text and rfc822 format available.

Message #26 received at 12925 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: larsi <at> gnus.org, 12925 <at> debbugs.gnu.org, ethan.glasser.camp <at> gmail.com
Subject: Re: bug#12925: 24.1; string-make-unibyte instead of string-as-unibyte
Date: Tue, 01 Jun 2021 18:26:33 +0300
> From: Stefan Monnier <monnier <at> iro.umontreal.ca>
> Cc: larsi <at> gnus.org,  12925 <at> debbugs.gnu.org,  ethan.glasser.camp <at> gmail.com
> Date: Tue, 01 Jun 2021 10:25:17 -0400
> 
> > What I mean is: if we think the current behavior is broken, then what
> > I suggest is at least less broken (and sometimes might just be TRT).
> 
> I doubt it's less broken: sometimes it will be TRT, other times it will
> be worse than what we have.
> 
> > At the very least what I suggest is reversible, whereas neither the
> > current behavior nor what you suggest is.
> 
> My point is that we shouldn't even get into the position of having to
> make such arbitrary choices: we should signal an error before we
> get there.

Well, then we still disagree.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#12925; Package emacs. (Wed, 02 Jun 2021 05:08:01 GMT) Full text and rfc822 format available.

Message #29 received at 12925 <at> debbugs.gnu.org (full text, mbox):

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: monnier <at> iro.umontreal.ca, 12925 <at> debbugs.gnu.org,
 ethan.glasser.camp <at> gmail.com
Subject: Re: bug#12925: 24.1; string-make-unibyte instead of string-as-unibyte
Date: Wed, 02 Jun 2021 07:07:25 +0200
Eli Zaretskii <eliz <at> gnu.org> writes:

> Why not make both methods do the same: insert the bytes of the
> multibyte text into the unibyte buffer?

I think it's still common to have raw bytes in multibyte buffers.
Inserting data from these buffers into unibyte buffers works fine.
(That's the rationale for inserting the "lower byte" in these
situations.)

So I don't think we should change this to insert the multibyte text,
because that'd break stuff.

The question is what to do when inserting multibyte characters in
unibyte buffers, and I think that's always an error (i.e., it's never
what the person who wrote the code wanted to happen).  I think we should
start off by doing a demoted warning thing, and then segue into
signalling an error at a later date.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#12925; Package emacs. (Wed, 02 Jun 2021 12:08:01 GMT) Full text and rfc822 format available.

Message #32 received at 12925 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Lars Ingebrigtsen <larsi <at> gnus.org>
Cc: monnier <at> iro.umontreal.ca, 12925 <at> debbugs.gnu.org,
 ethan.glasser.camp <at> gmail.com
Subject: Re: bug#12925: 24.1; string-make-unibyte instead of string-as-unibyte
Date: Wed, 02 Jun 2021 15:07:24 +0300
> From: Lars Ingebrigtsen <larsi <at> gnus.org>
> Cc: monnier <at> iro.umontreal.ca,  12925 <at> debbugs.gnu.org,
>   ethan.glasser.camp <at> gmail.com
> Date: Wed, 02 Jun 2021 07:07:25 +0200
> 
> Eli Zaretskii <eliz <at> gnu.org> writes:
> 
> > Why not make both methods do the same: insert the bytes of the
> > multibyte text into the unibyte buffer?
> 
> I think it's still common to have raw bytes in multibyte buffers.
> Inserting data from these buffers into unibyte buffers works fine.
> (That's the rationale for inserting the "lower byte" in these
> situations.)
> 
> So I don't think we should change this to insert the multibyte text,
> because that'd break stuff.

And signaling an error won't break stuff?

> The question is what to do when inserting multibyte characters in
> unibyte buffers, and I think that's always an error (i.e., it's never
> what the person who wrote the code wanted to happen).

Now I'm confused: you have just explained above that it should
continue working.  What am I missing?

Please note that I wasn't talking about inserting raw bytes, whether
they come from unibyte or multibyte buffers, I was talking about
inserting multibyte text that represents human-readable characters.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#12925; Package emacs. (Wed, 02 Jun 2021 13:10:02 GMT) Full text and rfc822 format available.

Message #35 received at 12925 <at> debbugs.gnu.org (full text, mbox):

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: monnier <at> iro.umontreal.ca, 12925 <at> debbugs.gnu.org,
 ethan.glasser.camp <at> gmail.com
Subject: Re: bug#12925: 24.1; string-make-unibyte instead of string-as-unibyte
Date: Wed, 02 Jun 2021 15:09:35 +0200
Eli Zaretskii <eliz <at> gnu.org> writes:

> Please note that I wasn't talking about inserting raw bytes, whether
> they come from unibyte or multibyte buffers, I was talking about
> inserting multibyte text that represents human-readable characters.

OK, then we're in violent agreement there.  I was simply pointing out
that we can't change insertion of multibyte text in the simple way you
seemed to be suggesting (i.e., just insert the bytes in the multibyte
string, because a multibyte raw character is represented by several
bytes (is it two or three? I forget)).

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#12925; Package emacs. (Wed, 02 Jun 2021 13:37:01 GMT) Full text and rfc822 format available.

Message #38 received at 12925 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Lars Ingebrigtsen <larsi <at> gnus.org>
Cc: monnier <at> iro.umontreal.ca, 12925 <at> debbugs.gnu.org,
 ethan.glasser.camp <at> gmail.com
Subject: Re: bug#12925: 24.1; string-make-unibyte instead of string-as-unibyte
Date: Wed, 02 Jun 2021 16:36:27 +0300
> From: Lars Ingebrigtsen <larsi <at> gnus.org>
> Cc: monnier <at> iro.umontreal.ca,  12925 <at> debbugs.gnu.org,
>   ethan.glasser.camp <at> gmail.com
> Date: Wed, 02 Jun 2021 15:09:35 +0200
> 
> Eli Zaretskii <eliz <at> gnu.org> writes:
> 
> > Please note that I wasn't talking about inserting raw bytes, whether
> > they come from unibyte or multibyte buffers, I was talking about
> > inserting multibyte text that represents human-readable characters.
> 
> OK, then we're in violent agreement there.  I was simply pointing out
> that we can't change insertion of multibyte text in the simple way you
> seemed to be suggesting (i.e., just insert the bytes in the multibyte
> string

Yes, we need special handling of raw bytes, as usual.

> because a multibyte raw character is represented by several bytes
> (is it two or three? I forget)).

2 or 5.
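
[Illustration, not part of the original message.]  One way to look at
the internal sizes being discussed is to compare `length' (characters)
with `string-bytes' (bytes of the internal representation).  This is
only a probe to run in your own Emacs; no particular result is claimed
here.

```elisp
(let ((raw (string-to-multibyte "\222")) ; one raw byte, held in a multibyte string
      (quote-mark (string ?\u2019)))     ; one ordinary non-ASCII character
  ;; Return (CHARS BYTES CHARS BYTES) for the two strings.
  (list (length raw) (string-bytes raw)
        (length quote-mark) (string-bytes quote-mark)))
```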





This bug report was last modified 4 years and 11 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.