GNU bug report logs - #9318
23.3.50; The first call of encode-coding-region() returns wrong result on Windows

Package: emacs;

Reported by: Kazuhiro Ito <kzhr <at> d1.dion.ne.jp>

Date: Thu, 18 Aug 2011 09:04:02 UTC

Severity: normal

Found in version 23.3.50

Fixed in version 24.0.93

Done: Glenn Morris <rgm <at> gnu.org>

Bug is archived. No further changes may be made.

Report forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org:
bug#9318; Package emacs. (Thu, 18 Aug 2011 09:04:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to Kazuhiro Ito <kzhr <at> d1.dion.ne.jp>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Thu, 18 Aug 2011 09:04:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org:

From: Kazuhiro Ito <kzhr <at> d1.dion.ne.jp>
To: bug-gnu-emacs <at> gnu.org
Subject: 23.3.50;
	The first call of encode-coding-region() returns wrong result on
	Windows
Date: Thu, 18 Aug 2011 18:01:13 +0900

When I start Emacs and evaluate the below code, unexpected result returns.

(let ((func (lambda ()
	      (with-temp-buffer
		(mapc 'insert '(166 25339))
		(encode-coding-region (point-min) (point-max) 'ctext-unix)
		(buffer-string)))))
  (cons (funcall func)
	(funcall func)))
-> ("¦拻^@^@^@^@^@^@^@^@^@^@" . "^[$(D\"C^[$(H*f^[(B") 

car of the result is not constant.  In the worst case, Emacs
crashes.  It doesn't occur on Linux.  If I evaluate the form twice, the
car and cdr of the second result are both correct.  Using
encode-coding-string instead of encode-coding-region avoids the problem.

(let ((func (lambda ()
	      (encode-coding-string
	       (mapconcat 'char-to-string '(166 25339) "")
	       'ctext-unix))))
  (cons (funcall func)
	(funcall func)))
-> ("^[$(D\"C^[$(H*f^[(B" . "^[$(D\"C^[$(H*f^[(B")

Calling encode-coding-string beforehand also avoids the problem.

(let ((func (lambda ()
	      (with-temp-buffer
		(mapc 'insert '(166 25339))
		(encode-coding-region (point-min) (point-max) 'ctext-unix)
		(buffer-string)))))
  (encode-coding-string
   (mapconcat 'char-to-string '(166 25339) "") 'ctext-unix)
  (cons (funcall func)
	(funcall func)))
-> ("^[$(D\"C^[$(H*f^[(B" . "^[$(D\"C^[$(H*f^[(B")

-- 
Kazuhiro Ito

Message #8 received at 9318 <at> debbugs.gnu.org:

From: Andreas Schwab <schwab <at> linux-m68k.org>
To: Kazuhiro Ito <kzhr <at> d1.dion.ne.jp>
Cc: 9318 <at> debbugs.gnu.org
Subject: Re: bug#9318: 23.3.50;
	The first call of encode-coding-region() returns wrong result on
	Windows
Date: Thu, 18 Aug 2011 11:48:36 +0200

Kazuhiro Ito <kzhr <at> d1.dion.ne.jp> writes:

> Calling encode-coding-string beforehand also avoids the problem.

Perhaps something is clobbered by some autoloading?

Andreas.

-- 
Andreas Schwab, schwab <at> linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

Message #11 received at 9318 <at> debbugs.gnu.org:

From: Kazuhiro Ito <kzhr <at> d1.dion.ne.jp>
To: Andreas Schwab <schwab <at> linux-m68k.org>
Cc: 9318 <at> debbugs.gnu.org
Subject: Re: bug#9318: 23.3.50;
	The first call of encode-coding-region() returns wrong result on
	Windows
Date: Fri, 19 Aug 2011 06:33:41 +0900

> Perhaps something is clobbered by some autoloading?

I'm not sure I understand exactly what you mean, but the problem is
reproducible with the precompiled binary (*1) started with the -Q option.

(*1) http://ftp.gnu.org/pub/gnu/emacs/windows/emacs-23.3-bin-i386.zip

-- 
Kazuhiro Ito

Message #14 received at 9318 <at> debbugs.gnu.org:

From: Kazuhiro Ito <kzhr <at> d1.dion.ne.jp>
To: 9318 <at> debbugs.gnu.org
Subject: 23.3.50; The first call of encode-coding-region() returns wrong result
Date: Fri, 19 Aug 2011 22:46:18 +0900

> When I start Emacs and evaluate the below code, unexpected result returns.

> (let ((func (lambda ()
> 	      (with-temp-buffer
> 		(mapc 'insert '(166 25339))
> 		(encode-coding-region (point-min) (point-max) 'ctext-unix)
> 		(buffer-string)))))
>   (cons (funcall func)
> 	(funcall func)))
> -> ("¦拻^@^@^@^@^@^@^@^@^@^@" . "^[$(D\"C^[$(H*f^[(B") 

> car of the result is not constant.

I noticed this problem is not Windows specific.  I confirmed that it
is reproducible in Emacs 23.3.1 (built by pkgsrc) on NetBSD/amd64 via
SSH from a remote host.  But it doesn't occur on openSUSE 11.3.

-- 
Kazuhiro Ito

Message #17 received at 9318 <at> debbugs.gnu.org:

From: Chong Yidong <cyd <at> stupidchicken.com>
To: Kazuhiro Ito <kzhr <at> d1.dion.ne.jp>
Cc: 9318 <at> debbugs.gnu.org
Subject: Re: bug#9318: 23.3.50;
	The first call of encode-coding-region() returns wrong result
Date: Sat, 20 Aug 2011 17:26:04 -0400

Kazuhiro Ito <kzhr <at> d1.dion.ne.jp> writes:

>> When I start Emacs and evaluate the below code, unexpected result returns.
>
>> (let ((func (lambda ()
>> 	      (with-temp-buffer
>> 		(mapc 'insert '(166 25339))
>> 		(encode-coding-region (point-min) (point-max) 'ctext-unix)
>> 		(buffer-string)))))
>>   (cons (funcall func)
>> 	(funcall func)))
>> -> ("¦拻^@^@^@^@^@^@^@^@^@^@" . "^[$(D\"C^[$(H*f^[(B")
>
>> car of the result is not constant.
>
> I noticed this problem is not Windows specific.  I confirmed that it
> is reproducible in Emacs 23.3.1 (built by pkgsrc) on NetBSD/amd64 via
> SSH from a remote host.  But it doesn't occur on openSUSE 11.3.

Could you run Emacs under a debugger, trigger the crash, and provide a
backtrace?  (You will need to have compiled Emacs with debugging
symbols.)

Message #20 received at 9318 <at> debbugs.gnu.org:

From: Kazuhiro Ito <kzhr <at> d1.dion.ne.jp>
To: Chong Yidong <cyd <at> stupidchicken.com>
Cc: 9318 <at> debbugs.gnu.org
Subject: Re: bug#9318: 23.3.50;
	The first call of encode-coding-region() returns wrong result
Date: Sun, 21 Aug 2011 09:17:05 +0900

> >> When I start Emacs and evaluate the below code, unexpected result returns.
> >
> >> (let ((func (lambda ()
> >> 	      (with-temp-buffer
> >> 		(mapc 'insert '(166 25339))
> >> 		(encode-coding-region (point-min) (point-max) 'ctext-unix)
> >> 		(buffer-string)))))
> >>   (cons (funcall func)
> >> 	(funcall func)))
> >> -> ("¦拻^@^@^@^@^@^@^@^@^@^@" . "^[$(D\"C^[$(H*f^[(B")
> >
> > I noticed this problem is not Windows specific.  I confirmed that it
> > is reproducible in Emacs 23.3.1 (built by pkgsrc) on NetBSD/amd64 via
> > SSH from a remote host.  But it doesn't occur on openSUSE 11.3.
> 
> Could you run Emacs under a debugger, trigger the crash, and provide a
> backtrace?  (You will need to have compiled Emacs with debugging
> symbols.)

I built Emacs 23.3 with the "-O0 -g" options on NetBSD 5.1 (amd64), and
started it with the command below (via SSH).

gdb --args emacs -Q --no-splash

Next, I entered the code below and evaluated it with C-x C-e.

(progn
  (goto-char (point-min))
  (insert #x80)
  (insert (make-string 16 ?A))
  (encode-coding-region 1 18 'ctext-unix))

backtrace is below.  Please let me know if you need more information.


Program received signal SIGSEGV, Segmentation fault.
0x0000000000557419 in mark_object (arg=4702111234474983745) at alloc.c:5473
5473            if (STRING_MARKED_P (ptr))
(gdb) bt full
#0  0x0000000000557419 in mark_object (arg=4702111234474983745) at alloc.c:5473
        ptr = (struct Lisp_String *) 0x4141414141414140
        obj = 4702111234474983745
        cdr_count = 0
#1  0x0000000000557320 in mark_char_table (ptr=0x1281800) at alloc.c:5405
        val = 4702111234474983745
        size = 130
        i = 0
#2  0x0000000000557315 in mark_char_table (ptr=0x17f6c00) at alloc.c:5402
        val = 19404805
        size = 34
        i = 14
#3  0x0000000000557315 in mark_char_table (ptr=0x13ea700) at alloc.c:5402
        val = 25127941
        size = 18
        i = 6
#4  0x0000000000557315 in mark_char_table (ptr=0x10ba800) at alloc.c:5402
        val = 20883205
        size = 68
        i = 4
#5  0x0000000000557838 in mark_object (arg=17541125) at alloc.c:5567
        obj = 17541125
        cdr_count = 0
#6  0x0000000000557228 in mark_vectorlike (ptr=0xb16480) at alloc.c:5377
        size = 10
        i = 9
#7  0x0000000000557855 in mark_object (arg=11625605) at alloc.c:5569
        obj = 11625605
        cdr_count = 0
#8  0x0000000000557228 in mark_vectorlike (ptr=0xb56000) at alloc.c:5377
        size = 434
        i = 107
#9  0x0000000000557855 in mark_object (arg=11886597) at alloc.c:5569
        obj = 11886597
        cdr_count = 0
#10 0x00000000005577b0 in mark_object (arg=10786565) at alloc.c:5562
        h = (struct Lisp_Hash_Table *) 0xa49700
        obj = 10786565
        cdr_count = 0
#11 0x00000000005568ff in Fgarbage_collect () at alloc.c:5092
        bind = (struct specbinding *) 0xb96526
        catch = (struct catchtag *) 0x7f7fffffc508
        handler = (struct handler *) 0x10
        stack_top_variable = 0 '\0'
        i = 418
        message_p = 0
        total = {140187732526192, 140187732526008, 140187732526000, 4294967295, 
  12148454, 10960258, 10312685, 68}
        count = 8
        t1 = {
  tv_sec = 1313842937, 
  tv_usec = 498976
}
        t2 = {
  tv_sec = 0, 
  tv_usec = 140187732530104
}
        t3 = {
  tv_sec = 11465618, 
  tv_usec = 0
}
#12 0x0000000000577bb4 in Ffuncall (nargs=2, args=0x7f7fffffc4f0) at eval.c:2965
        fun = 10313885
        original_fun = 10959186
        funcar = 10762338
        numargs = 1
        lisp_numargs = 10950075
        val = 68
        backtrace = {
  next = 0x7f7fffffc9a0, 
  function = 0x7f7fffffc4f8, 
  args = 0x7f7fffffc500, 
  nargs = 1, 
  evalargs = 0 '\0', 
  debug_on_exit = 0 '\0'
}
        internal_args = (Lisp_Object *) 0x7f7fffffc500
        i = 0
#13 0x00000000005ce3c1 in Fbyte_code (bytestr=9300689, vector=9300725, maxdepth=12)
    at bytecode.c:680
        count = 7
        op = 1
        vectorp = (Lisp_Object *) 0x8deb00
        bytestr_length = 18
        stack = {
  pc = 0x96972f ")\207", 
  top = 0x7f7fffffc4f8, 
  bottom = 0x7f7fffffc4f0, 
  byte_string = 9300689, 
  byte_string_start = 0x96971f "\b\203\b", 
  constants = 9300725, 
  next = 0x7f7fffffcb40
}
        top = (Lisp_Object *) 0x7f7fffffc4f0
        result = 10956883
#14 0x00000000005788cc in funcall_lambda (fun=9300621, nargs=1, 
    arg_vector=0x7f7fffffca28) at eval.c:3220
        val = 10762242
        syms_left = 10762242
        next = 18577650
        count = 6
        i = 1
        optional = 0
        rest = 0
#15 0x000000000057821a in Ffuncall (nargs=2, args=0x7f7fffffca20) at eval.c:3077
        fun = 9300621
        original_fun = 18577602
        funcar = 18577842
        numargs = 1
        lisp_numargs = 10956963
        val = 10762242
        backtrace = {
  next = 0x7f7fffffced0, 
  function = 0x7f7fffffca20, 
  args = 0x7f7fffffca28, 
  nargs = 1, 
  evalargs = 0 '\0', 
  debug_on_exit = 0 '\0'
}
        internal_args = (Lisp_Object *) 0xa730a3
        i = 0
#16 0x00000000005ce3c1 in Fbyte_code (bytestr=9301185, vector=9301221, maxdepth=12)
    at bytecode.c:680
        count = 5
        op = 1
        vectorp = (Lisp_Object *) 0x8decf0
        bytestr_length = 31
        stack = {
  pc = 0x969692 "\v)B\211\034A\n=\204\033", 
  top = 0x7f7fffffca28, 
  bottom = 0x7f7fffffca20, 
  byte_string = 9301185, 
  byte_string_start = 0x969685 "\b\204\b", 
  constants = 9301221, 
  next = 0x0
}
        top = (Lisp_Object *) 0x7f7fffffca20
        result = 10762242
#17 0x00000000005788cc in funcall_lambda (fun=9301109, nargs=1, 
    arg_vector=0x7f7fffffcfa8) at eval.c:3220
        val = 140187732528832
        syms_left = 10762242
        next = 18577650
        count = 4
        i = 1
        optional = 0
        rest = 0
#18 0x000000000057821a in Ffuncall (nargs=2, args=0x7f7fffffcfa0) at eval.c:3077
        fun = 9301109
        original_fun = 11438610
        funcar = 5059672
        numargs = 1
        lisp_numargs = 5059670
        val = 10762242
        backtrace = {
  next = 0x7f7fffffd310, 
  function = 0x7f7fffffcfa0, 
  args = 0x7f7fffffcfa8, 
  nargs = 1, 
  evalargs = 0 '\0', 
  debug_on_exit = 0 '\0'
}
        internal_args = (Lisp_Object *) 0xa77993
        i = 0
#19 0x000000000057296b in Fcall_interactively (function=11438610, 
    record_flag=10762242, keys=10790405) at callint.c:869
        val = 4
        args = (Lisp_Object *) 0x7f7fffffcfa0
        visargs = (Lisp_Object *) 0x7f7fffffcf80
        specs = 9301281
        filter_specs = 9301281
        teml = 5734938
        up_event = 10762242
        enable = 10762242
        speccount = 2
        next_event = 2
        prefix_arg = 10762242
        string = (unsigned char *) 0x7f7fffffcfc0 "P"
        tem = (unsigned char *) 0x61652c ""
        varies = (int *) 0x7f7fffffcf60
        i = 2
        j = 1
        count = 1
        foo = 1
        prompt1 = '\0' <repeats 99 times>
        tem1 = 0x0
        arg_from_tty = 0
        gcpro1 = {
  next = 0xa43802, 
  var = 0xa43802, 
  nvars = 0
}
        gcpro2 = {
  next = 0xa53bc2, 
  var = 0xa51c05, 
  nvars = 10828738
}
        gcpro3 = {
  next = 0xa55952, 
  var = 0xa53bc2, 
  nvars = 2
}
        gcpro4 = {
  next = 0xa43802, 
  var = 0xb4a776, 
  nvars = 2
}
        gcpro5 = {
  next = 0xa43802, 
  var = 0xa43802, 
  nvars = 10836306
}
        key_count = 2
        record_then_fail = 0
        save_this_command = 11438610
        save_last_command = 11490098
        save_this_original_command = 11438610
        save_real_this_command = 11438610
#20 0x0000000000577f70 in Ffuncall (nargs=4, args=0x7f7fffffd3b0) at eval.c:3037
        fun = 10312397
        original_fun = 10978002
        funcar = 4294967297
        numargs = 3
        lisp_numargs = 10937344
        val = 315
        backtrace = {
  next = 0x0, 
  function = 0x7f7fffffd3b0, 
  args = 0x7f7fffffd3b8, 
  nargs = 3, 
  evalargs = 0 '\0', 
  debug_on_exit = 0 '\0'
}
        internal_args = (Lisp_Object *) 0x7f7fffffd3b8
        i = 0
#21 0x000000000057795d in call3 (fn=10978002, arg1=11438610, arg2=10762242, 
    arg3=10762242) at eval.c:2857
        ret_ungc_val = 9301109
        gcpro1 = {
  next = 0x8dec75, 
  var = 0xa43802, 
  nvars = 4
}
        args = {10978002, 11438610, 10762242, 10762242}
#22 0x00000000004e4bca in Fcommand_execute (cmd=11438610, record_flag=10762242, 
    keys=10762242, special=10762242) at keyboard.c:10562
        final = 9301109
        tem = 10762242
        prefixarg = 10762242
#23 0x00000000004d564d in command_loop_1 () at keyboard.c:1906
        cmd = 11438610
        lose = 1
        keybuf = {96, 20, 8, 0, 140187732530800, 18451712, 1893, 0, 
  140187732530816, 1983, 18451712, 4294967317, 140187732530800, 6299742, 10656928, 
  216, 10937344, 7378697632079252736, 140187732530864, 9720, 274877896416, 
  140187732531032, 0, 140187732530872, 140187732530384, 0, 10762242, 12348018, 
  8166853, 10762242}
        i = 2
        prev_modiff = 158
        prev_buffer = (struct buffer *) 0xa51c00
        already_adjusted = 0
#24 0x0000000000575049 in internal_condition_case (bfun=0x4d3a17 <command_loop_1>, 
    handlers=10851522, hfun=0x4d34bc <cmd_error>) at eval.c:1492
        val = 10762242
        c = {
  tag = 10762242, 
  val = 10762242, 
  next = 0x7f7fffffd880, 
  gcpro = 0x0, 
  jmp = {2129, 140187732531264, 140187732541408, 140187698962432, 140187696909296, 
    3, 140187732531000, 5722036, 0, 140187732531488, 18636288}, 
  backlist = 0x0, 
  handlerlist = 0x0, 
  lisp_eval_depth = 0, 
  pdlcount = 2, 
  poll_suppress_count = 0, 
  interrupt_input_blocked = 0, 
  byte_stack = 0x0
}
        h = {
  handler = 10851522, 
  var = 10762242, 
  chosen_clause = 0, 
  tag = 0x7f7fffffd790, 
  next = 0x0
}
#25 0x00000000004d389f in command_loop_2 () at keyboard.c:1362
        val = 1
#26 0x0000000000574a0e in internal_catch (tag=10846786, 
    func=0x4d3885 <command_loop_2>, arg=10762242) at eval.c:1228
        c = {
  tag = 10846786, 
  val = 10762242, 
  next = 0x0, 
  gcpro = 0x0, 
  jmp = {2129, 140187732531488, 140187732541408, 140187698962432, 140187696909296, 
    3, 140187732531288, 5720565, 4301358603, 10820608, 11046651}, 
  backlist = 0x0, 
  handlerlist = 0x0, 
  lisp_eval_depth = 0, 
  pdlcount = 2, 
  poll_suppress_count = 0, 
  interrupt_input_blocked = 0, 
  byte_stack = 0x0
}
#27 0x00000000004d3859 in command_loop () at keyboard.c:1341
No locals.
#28 0x00000000004d3004 in recursive_edit_1 () at keyboard.c:956
        count = 1
        val = 5059007
#29 0x00000000004d31a6 in Frecursive_edit () at keyboard.c:1018
        count = 0
        buffer = 10762242
#30 0x00000000004d169a in main (argc=3, argv=0x7f7fffffdb70) at emacs.c:1833
        dummy = 140187730444288
        stack_bottom_variable = 0 '\0'
        do_initial_setlocale = 1
        skip_args = 0
        rlim = {
  rlim_cur = 8720384, 
  rlim_max = 33554432
}
        no_loadup = 0
        junk = 0x0
        dname_arg = 0x0

Lisp Backtrace:
"eval-last-sexp-1" (0xffffca28)
"eval-last-sexp" (0xffffcfa8)
"call-interactively" (0xffffd3b8)


-- 
Kazuhiro Ito

Message #23 received at 9318 <at> debbugs.gnu.org:

From: Kazuhiro Ito <kzhr <at> d1.dion.ne.jp>
To: Chong Yidong <cyd <at> stupidchicken.com>
Cc: 9318 <at> debbugs.gnu.org
Subject: Re: bug#9318: 23.3.50;
	The first call of encode-coding-region() returns wrong result
Date: Wed, 24 Aug 2011 18:37:24 +0900

> I built Emacs 23.3 with the "-O0 -g" options on NetBSD 5.1 (amd64), and
> started it with the command below (via SSH).
> 
> gdb --args emacs -Q --no-splash
> 
> Next, I entered the code below and evaluated it with C-x C-e.
> 
> (progn
>   (goto-char (point-min))
>   (insert #x80)
>   (insert (make-string 16 ?A))
>   (encode-coding-region 1 18 'ctext-unix))
> 
> backtrace is below.  Please let me know if you need more information.
> 
> 
> Program received signal SIGSEGV, Segmentation fault.
> 0x0000000000557419 in mark_object (arg=4702111234474983745) at alloc.c:5473
> 5473            if (STRING_MARKED_P (ptr))

I think relocation of buffer may cause the problem.

The comment for CODING_DECODE_CHAR macro in coding.c says as below.

> /* This wrapper macro is used to preserve validity of pointers into
>    buffer text across calls to decode_char, which could cause
>    relocation of buffers if it loads a charset map, because loading a
>    charset map allocates large structures.  */

encode_coding_iso_2022() uses ENCODE_ISO_CHARACTER macro, which uses
ENCODE_CHAR macro.  ENCODE_CHAR macro calls encode_char() and it may
load a charset map.  If this is the cause of the problem,
encode_coding_emacs_mule() has the same problem.
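
To illustrate the suspected hazard outside of Emacs, here is a minimal C
sketch.  It is not Emacs code: struct text_buf, relocate() and
write_across_relocation() are hypothetical names, and relocate() merely
stands in for a call (such as loading a charset map) that may move the
buffer.  The sketch remembers the write position as an offset and
re-derives the pointer after the move, instead of reusing the stale one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a relocatable text buffer.  */
struct text_buf {
  unsigned char *base;  /* current start address of the text */
  size_t size;          /* number of bytes in the text */
};

/* Hypothetical stand-in for a call that allocates memory and, as a
   side effect, moves the buffer.  The new block is allocated before
   the old one is freed, so the address always changes.
   Returns 0 on success, -1 if allocation fails.  */
static int relocate(struct text_buf *b)
{
  unsigned char *moved = malloc(b->size);
  if (moved == NULL)
    return -1;
  memcpy(moved, b->base, b->size);
  free(b->base);
  b->base = moved;
  return 0;
}

/* Store BYTE at offset POS although the buffer is relocated between
   computing the destination and writing to it.  */
static int write_across_relocation(struct text_buf *b, size_t pos,
                                   unsigned char byte)
{
  unsigned char *dst = b->base + pos;       /* cached raw pointer */
  size_t offset = (size_t)(dst - b->base);  /* position as an offset */

  if (relocate(b) != 0)
    return -1;

  /* dst now points into freed memory and must not be dereferenced;
     rebuild it from the new base address.  */
  dst = b->base + offset;
  *dst = byte;
  return 0;
}
```

Writing through the old value of dst after relocate() would touch freed
memory, which would be consistent with the corrupted objects seen in the
backtrace.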

-- 
Kazuhiro Ito

Message #26 received at 9318 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: Kazuhiro Ito <kzhr <at> d1.dion.ne.jp>
Cc: cyd <at> stupidchicken.com, 9318 <at> debbugs.gnu.org
Subject: Re: bug#9318: 23.3.50;
	The first call of encode-coding-region() returns wrong result
Date: Wed, 24 Aug 2011 15:06:48 +0300

> Date: Wed, 24 Aug 2011 18:37:24 +0900
> From: Kazuhiro Ito <kzhr <at> d1.dion.ne.jp>
> Cc: 9318 <at> debbugs.gnu.org
> 
> > (progn
> >   (goto-char (point-min))
> >   (insert #x80)
> >   (insert (make-string 16 ?A))
> >   (encode-coding-region 1 18 'ctext-unix))
> > 
> > backtrace is below.  Please let me know if you need more information.
> > 
> > 
> > Program received signal SIGSEGV, Segmentation fault.
> > 0x0000000000557419 in mark_object (arg=4702111234474983745) at alloc.c:5473
> > 5473            if (STRING_MARKED_P (ptr))
> 
> I think relocation of buffer may cause the problem.
> 
> The comment for CODING_DECODE_CHAR macro in coding.c says as below.
> 
> > /* This wrapper macro is used to preserve validity of pointers into
> >    buffer text across calls to decode_char, which could cause
> >    relocation of buffers if it loads a charset map, because loading a
> >    charset map allocates large structures.  */
> 
> encode_coding_iso_2022() uses ENCODE_ISO_CHARACTER macro, which uses
> ENCODE_CHAR macro.  ENCODE_CHAR macro calls encode_char() and it may
> load a charset map.

But which pointer(s) in encode_coding_iso_2022 can be altered by
relocation?  Do you actually see any of the pointers used by this
function modified by relocation of some buffer?

Message #29 received at 9318 <at> debbugs.gnu.org:

From: Andreas Schwab <schwab <at> linux-m68k.org>
To: Kazuhiro Ito <kzhr <at> d1.dion.ne.jp>
Cc: Chong Yidong <cyd <at> stupidchicken.com>, 9318 <at> debbugs.gnu.org
Subject: Re: bug#9318: 23.3.50;
	The first call of encode-coding-region() returns wrong result
Date: Wed, 24 Aug 2011 19:59:34 +0200

Kazuhiro Ito <kzhr <at> d1.dion.ne.jp> writes:

> I think relocation of buffer may cause the problem.

Does that help?

diff --git a/src/coding.c b/src/coding.c
index 65c8a76..f34a023 100644
--- a/src/coding.c
+++ b/src/coding.c
@@ -915,8 +915,8 @@ record_conversion_result (struct coding_system *coding,
     }
 }
 
-/* This wrapper macro is used to preserve validity of pointers into
-   buffer text across calls to decode_char, which could cause
+/* These wrapper macros are used to preserve validity of pointers into
+   buffer text across calls to decode_char/encode_char, which could cause
    relocation of buffers if it loads a charset map, because loading a
    charset map allocates large structures.  */
 #define CODING_DECODE_CHAR(coding, src, src_base, src_end, charset, code, c) \
@@ -935,6 +935,21 @@ record_conversion_result (struct coding_system *coding,
 	src_end += offset;						     \
       }									     \
   } while (0)
+#define CODING_ENCODE_CHAR(coding, dst, dst_end, charset, c, code)	\
+  do {									\
+    charset_map_loaded = 0;						\
+    code = ENCODE_CHAR (charset, c);					\
+    if (charset_map_loaded)						\
+      {									\
+	const unsigned char *orig = coding->destination;		\
+	EMACS_INT offset;						\
+									\
+	coding_set_destination (coding);				\
+	offset = coding->destination - orig;				\
+	dst += offset;							\
+	dst_end += offset;						\
+      }									\
+  } while (0)
 
 
 /* If there are at least BYTES length of room at dst, allocate memory
@@ -2652,7 +2667,7 @@ encode_coding_emacs_mule (struct coding_system *coding)
 	    {
 	      charset = CHARSET_FROM_ID (preferred_charset_id);
 	      if (CHAR_CHARSET_P (c, charset))
-		code = ENCODE_CHAR (charset, c);
+		CODING_ENCODE_CHAR (coding, dst, dst_end, charset, c, code);
 	      else
 		charset = char_charset (c, charset_list, &code);
 	    }
@@ -4185,7 +4200,8 @@ decode_coding_iso_2022 (struct coding_system *coding)
 
 #define ENCODE_ISO_CHARACTER(charset, c)				   \
   do {									   \
-    int code = ENCODE_CHAR ((charset), (c));				   \
+    int code;								   \
+    CODING_ENCODE_CHAR (coding, dst, dst_end, charset, c, code);	   \
 									   \
     if (CHARSET_DIMENSION (charset) == 1)				   \
       ENCODE_ISO_CHARACTER_DIMENSION1 ((charset), code);		   \

Andreas.

-- 
Andreas Schwab, schwab <at> linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

Message #32 received at 9318 <at> debbugs.gnu.org:

From: Kazuhiro Ito <kzhr <at> d1.dion.ne.jp>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 9318 <at> debbugs.gnu.org
Subject: Re: bug#9318: 23.3.50;
	The first call of encode-coding-region() returns wrong result
Date: Thu, 25 Aug 2011 18:49:52 +0900

> > > (progn
> > >   (goto-char (point-min))
> > >   (insert #x80)
> > >   (insert (make-string 16 ?A))
> > >   (encode-coding-region 1 18 'ctext-unix))
> > > 
> > > backtrace is below.  Please let me know if you need more information.
> > > 
> > > 
> > > Program received signal SIGSEGV, Segmentation fault.
> > > 0x0000000000557419 in mark_object (arg=4702111234474983745) at alloc.c:5473
> > > 5473            if (STRING_MARKED_P (ptr))
> > 
> > I think relocation of buffer may cause the problem.
> > 
> > The comment for CODING_DECODE_CHAR macro in coding.c says as below.
> > 
> > > /* This wrapper macro is used to preserve validity of pointers into
> > >    buffer text across calls to decode_char, which could cause
> > >    relocation of buffers if it loads a charset map, because loading a
> > >    charset map allocates large structures.  */
> > 
> > encode_coding_iso_2022() uses ENCODE_ISO_CHARACTER macro, which uses
> > ENCODE_CHAR macro.  ENCODE_CHAR macro calls encode_char() and it may
> > load a charset map.
> 
> But which pointer(s) in encode_coding_iso_2022 can be altered by
> relocation?  

encode_coding() sets coding->destination with coding_set_destination()
before calling encode_coding_iso_2022().  I think that at least the correct
value of coding->destination can change inside encode_coding_iso_2022() when
charset maps are loaded.

> Do you actually see any of the pointers used by this
> function modified by relocation of some buffer?

No, because I don't know how to check that.

-- 
Kazuhiro Ito

Message #35 received at 9318 <at> debbugs.gnu.org:

From: Kazuhiro Ito <kzhr <at> d1.dion.ne.jp>
To: Andreas Schwab <schwab <at> linux-m68k.org>
Cc: Eli Zaretskii <eliz <at> gnu.org>, Chong Yidong <cyd <at> stupidchicken.com>,
	9318 <at> debbugs.gnu.org
Subject: Re: bug#9318: 23.3.50;
	The first call of encode-coding-region() returns wrong result
Date: Thu, 25 Aug 2011 18:54:13 +0900

> > I think relocation of buffer may cause the problem.
> 
> Does that help?
> 
> diff --git a/src/coding.c b/src/coding.c
> index 65c8a76..f34a023 100644
> --- a/src/coding.c
> +++ b/src/coding.c
> @@ -915,8 +915,8 @@ record_conversion_result (struct coding_system *coding,
>      }
>  }
> 
> -/* This wrapper macro is used to preserve validity of pointers into
> -   buffer text across calls to decode_char, which could cause
> +/* These wrapper macros are used to preserve validity of pointers into
> +   buffer text across calls to decode_char/encode_char, which could cause
>     relocation of buffers if it loads a charset map, because loading a
>     charset map allocates large structures.  */
>  #define CODING_DECODE_CHAR(coding, src, src_base, src_end, charset, code, c) \
> @@ -935,6 +935,21 @@ record_conversion_result (struct coding_system *coding,
>  	src_end += offset;						     \
>        }									     \
>    } while (0)
> +#define CODING_ENCODE_CHAR(coding, dst, dst_end, charset, c, code)	\
> +  do {									\
> +    charset_map_loaded = 0;						\
> +    code = ENCODE_CHAR (charset, c);					\
> +    if (charset_map_loaded)						\
> +      {									\
> +	const unsigned char *orig = coding->destination;		\
> +	EMACS_INT offset;						\
> +									\
> +	coding_set_destination (coding);				\
> +	offset = coding->destination - orig;				\
> +	dst += offset;							\
> +	dst_end += offset;						\
> +      }									\
> +  } while (0)
> 
> 
>  /* If there are at least BYTES length of room at dst, allocate memory
> @@ -2652,7 +2667,7 @@ encode_coding_emacs_mule (struct coding_system *coding)
>  	    {
>  	      charset = CHARSET_FROM_ID (preferred_charset_id);
>  	      if (CHAR_CHARSET_P (c, charset))
> -		code = ENCODE_CHAR (charset, c);
> +		CODING_ENCODE_CHAR (coding, dst, dst_end, charset, c, code);
>  	      else
>  		charset = char_charset (c, charset_list, &code);
>  	    }
> @@ -4185,7 +4200,8 @@ decode_coding_iso_2022 (struct coding_system *coding)
> 
>  #define ENCODE_ISO_CHARACTER(charset, c)				   \
>    do {									   \
> -    int code = ENCODE_CHAR ((charset), (c));				   \
> +    int code;								   \
> +    CODING_ENCODE_CHAR (coding, dst, dst_end, charset, c, code);	   \
>  									   \
>      if (CHARSET_DIMENSION (charset) == 1)				   \
>        ENCODE_ISO_CHARACTER_DIMENSION1 ((charset), code);		   \

Andreas' patch resolved the problem only partially.  It fixed the problem on
NetBSD with '-O0' in CFLAGS, but not on NetBSD with '-O2' or on Windows.

I confirmed that adding GC protection of coding->dst_object on top of
Andreas' patch fixed the problem on NetBSD with '-O2', but not on
Windows.  I don't know whether this approach is wrong or merely insufficient.

--- src/coding.c	2011-07-01 11:03:55 +0000
+++ src/coding.c	2011-08-24 23:39:49 +0000
@@ -7397,10 +7436,15 @@
       setup_ccl_program (&cclspec.ccl, CODING_CCL_ENCODER (coding));
     }
   do {
+    struct gcpro gcpro1;
+    GCPRO1 (coding->dst_object);
+
     coding_set_source (coding);
     consume_chars (coding, translation_table, max_lookup);
     coding_set_destination (coding);
     (*(coding->encoder)) (coding);
+
+    UNGCPRO;
   } while (coding->consumed_char < coding->src_chars);
 
   if (BUFFERP (coding->dst_object) && coding->produced_char > 0)

-- 
Kazuhiro Ito

Message #38 received at 9318 <at> debbugs.gnu.org:

From: Kazuhiro Ito <kzhr <at> d1.dion.ne.jp>
To: Andreas Schwab <schwab <at> linux-m68k.org>
Cc: Eli Zaretskii <eliz <at> gnu.org>, Chong Yidong <cyd <at> stupidchicken.com>,
	9318 <at> debbugs.gnu.org
Subject: Re: bug#9318: 23.3.50;
	The first call of encode-coding-region() returns wrong result
Date: Fri, 26 Aug 2011 20:41:57 +0900

> > > I think relocation of buffer may cause the problem.
> > 
> > Does that help?
> 
> Andreas' patch resolved the problem only partially.  It fixed the problem on
> NetBSD with '-O0' in CFLAGS, but not on NetBSD with '-O2' or on Windows.
> 
> I confirmed that adding GC protection of coding->dst_object on top of
> Andreas' patch fixed the problem on NetBSD with '-O2', but not on
> Windows.  I don't know whether this approach is wrong or merely insufficient.

I noticed char_charset() could cause relocation of buffers because it
could call encode_char().  I confirmed similar changes to callers of
char_charset() fixed my problem (without the protection of
coding->dst_object).


SUMMARY OF THE PROBLEM:
In encode_coding_XXX(), calling encode_char() could cause relocation
of buffers.  char_charset(), ENCODE_ISO_CHARACTER and ENCODE_CHAR
could also cause relocation because they could call encode_char().
After using any of them, coding->destination, dst and dst_end should be
updated as needed.
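
As a rough illustration of the summary above, here is a toy model.  It is
not the actual coding.c code, and every name in it (toy_buffer,
toy_lookup, toy_encode, toy_relocated) is hypothetical.  toy_lookup stands
in for a call such as char_charset() that may move the buffer storage, and
toy_relocated plays the role of charset_map_loaded.  Unlike the patches in
this thread, the sketch saves the write position as an offset before the
call instead of subtracting old and new base pointers, because standard C
does not define pointer arithmetic across separate allocations.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a relocatable text buffer.  */
struct toy_buffer
{
  unsigned char *base;	/* current start of the storage; may move */
  size_t size;		/* bytes of storage */
};

/* Set when the storage has moved (role of charset_map_loaded).  */
static int toy_relocated;

/* Hypothetical lookup that may move BUF's storage as a side effect,
   the way loading a charset map can.  Returns a made-up code for C.  */
static int
toy_lookup (struct toy_buffer *buf, int c, int force_move)
{
  if (force_move)
    {
      unsigned char *fresh = malloc (buf->size);

      assert (fresh != NULL);
      memcpy (fresh, buf->base, buf->size);
      free (buf->base);
      buf->base = fresh;
      toy_relocated = 1;
    }
  return c + 1;
}

/* Encode N bytes of SRC into BUF, forcing a relocation while handling
   the element at index MOVE_AT.  Returns the number of bytes written.  */
size_t
toy_encode (struct toy_buffer *buf, const char *src, size_t n, size_t move_at)
{
  unsigned char *dst = buf->base;
  unsigned char *dst_end = buf->base + buf->size;

  for (size_t i = 0; i < n && dst < dst_end; i++)
    {
      /* Save the position as an offset while DST is still valid.  */
      size_t used = (size_t) (dst - buf->base);
      int code;

      toy_relocated = 0;
      code = toy_lookup (buf, src[i], i == move_at);
      if (toy_relocated)
	{
	  /* The storage moved: rebuild the cached pointers.  */
	  dst = buf->base + used;
	  dst_end = buf->base + buf->size;
	}
      *dst++ = (unsigned char) code;
    }
  return (size_t) (dst - buf->base);
}
```

Without the block guarded by toy_relocated, the write after the forced
move would go through a pointer into freed storage, which matches the
kind of garbage output and occasional crash reported at the top of this
bug.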

-- 
Kazuhiro Ito

Information forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org:
bug#9318; Package emacs. (Sun, 28 Aug 2011 00:09:01 GMT)

Message #41 received at 9318 <at> debbugs.gnu.org:

From: Kazuhiro Ito <kzhr <at> d1.dion.ne.jp>
To: Andreas Schwab <schwab <at> linux-m68k.org>
Cc: Eli Zaretskii <eliz <at> gnu.org>, Chong Yidong <cyd <at> stupidchicken.com>,
	9318 <at> debbugs.gnu.org
Subject: Re: bug#9318: 23.3.50;
	The first call of encode-coding-region() returns wrong result
Date: Sun, 28 Aug 2011 09:04:49 +0900

> SUMMARY OF THE PROBLEM:
> In encode_coding_XXX(), calling encode_char() could cause relocation
> of buffers.  char_charset(), ENCODE_ISO_CHARACTER and ENCODE_CHAR
> could also cause relocation because they could call encode_char().
> After using any of them, coding->destination, dst and dst_end should be
> updated as needed.

I noticed that the CHAR_CHARSET_P macro slipped through my check.
CHAR_CHARSET_P could also cause relocation of buffers.

-- 
Kazuhiro Ito

Information forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org:
bug#9318; Package emacs. (Tue, 30 Aug 2011 23:35:02 GMT)

Message #44 received at 9318 <at> debbugs.gnu.org:

From: Kazuhiro Ito <kzhr <at> d1.dion.ne.jp>
To: Andreas Schwab <schwab <at> linux-m68k.org>
Cc: Eli Zaretskii <eliz <at> gnu.org>, Chong Yidong <cyd <at> stupidchicken.com>,
	9318 <at> debbugs.gnu.org
Subject: Re: bug#9318: 23.3.50;
	The first call of encode-coding-region() returns wrong result
Date: Wed, 31 Aug 2011 08:30:47 +0900

> > SUMMARY OF THE PROBLEM:
> > In encode_coding_XXX(), calling encode_char() could cause relocation
> > of buffers.  char_charset(), ENCODE_ISO_CHARACTER and ENCODE_CHAR
> > could also cause relocation because they could call encode_char().
> > After using any of them, coding->destination, dst and dst_end should be
> > updated as needed.
> 
> I noticed that the CHAR_CHARSET_P macro slipped through my check.
> CHAR_CHARSET_P could also cause relocation of buffers.

Here is the patch, which includes Andreas' patch.  In my
environment, the problems are fixed.  I think it would be better to
change the interface of encode_designation_at_bol().

=== modified file 'src/coding.c'
--- src/coding.c	2011-05-09 09:59:23 +0000
+++ src/coding.c	2011-08-28 07:33:54 +0000
@@ -1026,6 +1026,54 @@
       }									     \
   } while (0)
 
+#define CODING_ENCODE_CHAR(coding, dst, dst_end, charset, c, code)	\
+  do {									\
+    charset_map_loaded = 0;						\
+    code = ENCODE_CHAR (charset, c);					\
+    if (charset_map_loaded)						\
+      {									\
+	const unsigned char *orig = coding->destination;		\
+	EMACS_INT offset;						\
+									\
+	coding_set_destination (coding);				\
+	offset = coding->destination - orig;				\
+	dst += offset;							\
+	dst_end += offset;						\
+      }									\
+  } while (0)
+
+#define CODING_CHAR_CHARSET(coding, dst, dst_end, c, charset_list, code_return, charset) \
+  do {									\
+    charset_map_loaded = 0;						\
+    charset = char_charset (c, charset_list, code_return);		\
+    if (charset_map_loaded)						\
+      {									\
+	const unsigned char *orig = coding->destination;		\
+	EMACS_INT offset;						\
+									\
+	coding_set_destination (coding);				\
+	offset = coding->destination - orig;				\
+	dst += offset;							\
+	dst_end += offset;						\
+      }									\
+  } while (0)
+
+#define CODING_CHAR_CHARSET_P(coding, dst, dst_end, c, charset, result) \
+  do {									\
+    charset_map_loaded = 0;						\
+    result = CHAR_CHARSET_P(c, charset);				\
+    if (charset_map_loaded)						\
+      {									\
+	const unsigned char *orig = coding->destination;		\
+	EMACS_INT offset;						\
+									\
+	coding_set_destination (coding);				\
+	offset = coding->destination - orig;				\
+	dst += offset;							\
+	dst_end += offset;						\
+      }									\
+  } while (0)
+
 
 /* If there are at least BYTES length of room at dst, allocate memory
    for coding->destination and update dst and dst_end.  We don't have
@@ -2778,14 +2826,19 @@
 
 	  if (preferred_charset_id >= 0)
 	    {
+	      int result;
+
 	      charset = CHARSET_FROM_ID (preferred_charset_id);
-	      if (CHAR_CHARSET_P (c, charset))
+	      CODING_CHAR_CHARSET_P (coding, dst, dst_end, c, charset, result);
+	      if (result)
 		code = ENCODE_CHAR (charset, c);
 	      else
-		charset = char_charset (c, charset_list, &code);
+		CODING_CHAR_CHARSET(coding, dst, dst_end, c, charset_list,
+				    &code, charset);
 	    }
 	  else
-	    charset = char_charset (c, charset_list, &code);
+	    CODING_CHAR_CHARSET(coding, dst, dst_end, c, charset_list,
+				&code, charset);
 	  if (! charset)
 	    {
 	      c = coding->default_char;
@@ -2794,7 +2847,8 @@
 		  EMIT_ONE_ASCII_BYTE (c);
 		  continue;
 		}
-	      charset = char_charset (c, charset_list, &code);
+	      CODING_CHAR_CHARSET(coding, dst, dst_end, c, charset_list,
+				  &code, charset);
 	    }
 	  dimension = CHARSET_DIMENSION (charset);
 	  emacs_mule_id = CHARSET_EMACS_MULE_ID (charset);
@@ -4317,8 +4371,9 @@
 
 #define ENCODE_ISO_CHARACTER(charset, c)				   \
   do {									   \
-    int code = ENCODE_CHAR ((charset),(c));				   \
-									   \
+    int code;								   \
+    CODING_ENCODE_CHAR (coding, dst, dst_end, (charset), (c), code);	   \
+    									   \
     if (CHARSET_DIMENSION (charset) == 1)				   \
       ENCODE_ISO_CHARACTER_DIMENSION1 ((charset), code);		   \
     else								   \
@@ -4476,7 +4531,17 @@
       c = *charbuf++;
       if (c == '\n')
 	break;
+
+      charset_map_loaded = 0;
       charset = char_charset (c, charset_list, NULL);
+      if (charset_map_loaded)
+	{
+	  const unsigned char *orig = coding->destination;
+
+	  coding_set_destination (coding);
+	  dst += coding->destination - orig;
+	}
+
       id = CHARSET_ID (charset);
       reg = CODING_ISO_REQUEST (coding, id);
       if (reg >= 0 && r[reg] < 0)
@@ -4543,6 +4608,12 @@
 
 	  /* We have to produce designation sequences if any now.  */
 	  dst = encode_designation_at_bol (coding, charbuf, charbuf_end, dst);
+	  if (charset_map_loaded)
+	    {
+	      EMACS_INT offset = coding->destination + coding->dst_bytes - dst_end;
+	      dst_end += offset;
+	      dst_prev += offset;
+	    }
 	  bol_designation = 0;
 	  /* We are sure that designation sequences are all ASCII bytes.  */
 	  produced_chars += dst - dst_prev;
@@ -4616,12 +4687,17 @@
 
 	  if (preferred_charset_id >= 0)
 	    {
+	      int result;
+
 	      charset = CHARSET_FROM_ID (preferred_charset_id);
-	      if (! CHAR_CHARSET_P (c, charset))
-		charset = char_charset (c, charset_list, NULL);
+	      CODING_CHAR_CHARSET_P (coding, dst, dst_end, c, charset, result);
+	      if (! result)
+		CODING_CHAR_CHARSET(coding, dst, dst_end, c, charset_list,
+				    NULL, charset);
 	    }
 	  else
-	    charset = char_charset (c, charset_list, NULL);
+	    CODING_CHAR_CHARSET(coding, dst, dst_end, c, charset_list,
+				NULL, charset);
 	  if (!charset)
 	    {
 	      if (coding->mode & CODING_MODE_SAFE_ENCODING)
@@ -4632,7 +4708,8 @@
 	      else
 		{
 		  c = coding->default_char;
-		  charset = char_charset (c, charset_list, NULL);
+		  CODING_CHAR_CHARSET(coding, dst, dst_end, c,
+				      charset_list, NULL, charset);
 		}
 	    }
 	  ENCODE_ISO_CHARACTER (charset, c);
@@ -5064,7 +5141,9 @@
       else
 	{
 	  unsigned code;
-	  struct charset *charset = char_charset (c, charset_list, &code);
+	  struct charset *charset;
+	  CODING_CHAR_CHARSET(coding, dst, dst_end, c, charset_list,
+			      &code, charset);
 
 	  if (!charset)
 	    {
@@ -5076,7 +5155,8 @@
 	      else
 		{
 		  c = coding->default_char;
-		  charset = char_charset (c, charset_list, &code);
+		  CODING_CHAR_CHARSET(coding, dst, dst_end, c,
+				      charset_list, &code, charset);
 		}
 	    }
 	  if (code == CHARSET_INVALID_CODE (charset))
@@ -5153,7 +5233,9 @@
       else
 	{
 	  unsigned code;
-	  struct charset *charset = char_charset (c, charset_list, &code);
+	  struct charset *charset;
+	  CODING_CHAR_CHARSET(coding, dst, dst_end, c, charset_list,
+			      &code, charset);
 
 	  if (! charset)
 	    {
@@ -5165,7 +5247,8 @@
 	      else
 		{
 		  c = coding->default_char;
-		  charset = char_charset (c, charset_list, &code);
+		  CODING_CHAR_CHARSET(coding, dst, dst_end, c,
+				      charset_list, &code, charset);
 		}
 	    }
 	  if (code == CHARSET_INVALID_CODE (charset))
@@ -5747,7 +5831,9 @@
 	}
       else
 	{
-	  charset = char_charset (c, charset_list, &code);
+	  CODING_CHAR_CHARSET(coding, dst, dst_end, c, charset_list,
+			      &code, charset);
+
 	  if (charset)
 	    {
 	      if (CHARSET_DIMENSION (charset) == 1)


-- 
Kazuhiro Ito

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#9318; Package emacs. (Thu, 01 Dec 2011 01:57:02 GMT)

Message #47 received at 9318 <at> debbugs.gnu.org:

From: Kenichi Handa <handa <at> m17n.org>
To: Kazuhiro Ito <kzhr <at> d1.dion.ne.jp>
Cc: cyd <at> stupidchicken.com, schwab <at> linux-m68k.org, 9318 <at> debbugs.gnu.org
Subject: Re: bug#9318: 23.3.50;
	The first call of encode-coding-region() returns wrong result
Date: Thu, 01 Dec 2011 10:56:12 +0900

In article <20110830233131.C74A61E0043 <at> msa101.auone-net.jp>, Kazuhiro Ito <kzhr <at> d1.dion.ne.jp> writes:

> Here is the patch, which includes Andreas' patch.  In my
> environment, the problems are fixed.  I think it would be better to
> change the interface of encode_designation_at_bol().

Oops, sorry, I had vaguely thought that your patch below
had already been applied, but I just noticed that it was not.
I'll commit a slightly modified version including the
improved interface for encode_designation_at_bol soon.

By the way, it would be good if we had a way to suppress
buffer text relocation temporarily.
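
To make that idea concrete, one possible shape is a nesting counter that
defers relocation while a caller holds raw pointers into buffer text.
This is only a sketch under that assumption.  The names
toy_inhibit_relocation, toy_request_relocation and toy_move_count are
hypothetical and are not an Emacs API, and the "relocation" here is just
a counter, not a real memory move.

```c
#include <assert.h>

static int toy_inhibit_depth;	/* > 0 means relocation is suppressed */
static int toy_pending;		/* a move was requested while suppressed */
static int toy_moves;		/* moves actually performed so far */

/* Enter (INHIBIT != 0) or leave (INHIBIT == 0) a region in which the
   text must not move.  Calls nest; a deferred move runs when the
   outermost region is left.  */
void
toy_inhibit_relocation (int inhibit)
{
  if (inhibit)
    toy_inhibit_depth++;
  else
    {
      assert (toy_inhibit_depth > 0);
      if (--toy_inhibit_depth == 0 && toy_pending)
	{
	  toy_pending = 0;
	  toy_moves++;
	}
    }
}

/* Called by the (hypothetical) allocator when it wants to move text.  */
void
toy_request_relocation (void)
{
  if (toy_inhibit_depth > 0)
    toy_pending = 1;	/* defer until the region is left */
  else
    toy_moves++;
}

/* Number of moves performed so far.  */
int
toy_move_count (void)
{
  return toy_moves;
}
```

With such a guard around an encoder loop, dst and dst_end would stay
valid for the whole loop, at the cost of postponing compaction until the
loop is finished.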

---
Kenichi Handa
handa <at> m17n.org

> === modified file 'src/coding.c'
> --- src/coding.c	2011-05-09 09:59:23 +0000
> +++ src/coding.c	2011-08-28 07:33:54 +0000
> @@ -1026,6 +1026,54 @@
>        }									     \
>    } while (0)
 
> +#define CODING_ENCODE_CHAR(coding, dst, dst_end, charset, c, code)	\
> +  do {									\
> +    charset_map_loaded = 0;						\
> +    code = ENCODE_CHAR (charset, c);					\
> +    if (charset_map_loaded)						\
> +      {									\
> +	const unsigned char *orig = coding->destination;		\
> +	EMACS_INT offset;						\
> +									\
> +	coding_set_destination (coding);				\
> +	offset = coding->destination - orig;				\
> +	dst += offset;							\
> +	dst_end += offset;						\
> +      }									\
> +  } while (0)
> +
> +#define CODING_CHAR_CHARSET(coding, dst, dst_end, c, charset_list, code_return, charset) \
> +  do {									\
> +    charset_map_loaded = 0;						\
> +    charset = char_charset (c, charset_list, code_return);		\
> +    if (charset_map_loaded)						\
> +      {									\
> +	const unsigned char *orig = coding->destination;		\
> +	EMACS_INT offset;						\
> +									\
> +	coding_set_destination (coding);				\
> +	offset = coding->destination - orig;				\
> +	dst += offset;							\
> +	dst_end += offset;						\
> +      }									\
> +  } while (0)
> +
> +#define CODING_CHAR_CHARSET_P(coding, dst, dst_end, c, charset, result) \
> +  do {									\
> +    charset_map_loaded = 0;						\
> +    result = CHAR_CHARSET_P(c, charset);				\
> +    if (charset_map_loaded)						\
> +      {									\
> +	const unsigned char *orig = coding->destination;		\
> +	EMACS_INT offset;						\
> +									\
> +	coding_set_destination (coding);				\
> +	offset = coding->destination - orig;				\
> +	dst += offset;							\
> +	dst_end += offset;						\
> +      }									\
> +  } while (0)
> +
 
>  /* If there are at least BYTES length of room at dst, allocate memory
>     for coding->destination and update dst and dst_end.  We don't have
> @@ -2778,14 +2826,19 @@
 
>  	  if (preferred_charset_id >= 0)
>  	    {
> +	      int result;
> +
>  	      charset = CHARSET_FROM_ID (preferred_charset_id);
> -	      if (CHAR_CHARSET_P (c, charset))
> +	      CODING_CHAR_CHARSET_P (coding, dst, dst_end, c, charset, result);
> +	      if (result)
>  		code = ENCODE_CHAR (charset, c);
>  	      else
> -		charset = char_charset (c, charset_list, &code);
> +		CODING_CHAR_CHARSET(coding, dst, dst_end, c, charset_list,
> +				    &code, charset);
>  	    }
>  	  else
> -	    charset = char_charset (c, charset_list, &code);
> +	    CODING_CHAR_CHARSET(coding, dst, dst_end, c, charset_list,
> +				&code, charset);
>  	  if (! charset)
>  	    {
>  	      c = coding->default_char;
> @@ -2794,7 +2847,8 @@
>  		  EMIT_ONE_ASCII_BYTE (c);
>  		  continue;
>  		}
> -	      charset = char_charset (c, charset_list, &code);
> +	      CODING_CHAR_CHARSET(coding, dst, dst_end, c, charset_list,
> +				  &code, charset);
>  	    }
>  	  dimension = CHARSET_DIMENSION (charset);
>  	  emacs_mule_id = CHARSET_EMACS_MULE_ID (charset);
> @@ -4317,8 +4371,9 @@
 
>  #define ENCODE_ISO_CHARACTER(charset, c)				   \
>    do {									   \
> -    int code = ENCODE_CHAR ((charset),(c));				   \
> -									   \
> +    int code;								   \
> +    CODING_ENCODE_CHAR (coding, dst, dst_end, (charset), (c), code);	   \
> +    									   \
>      if (CHARSET_DIMENSION (charset) == 1)				   \
>        ENCODE_ISO_CHARACTER_DIMENSION1 ((charset), code);		   \
>      else								   \
> @@ -4476,7 +4531,17 @@
>        c = *charbuf++;
>        if (c == '\n')
>  	break;
> +
> +      charset_map_loaded = 0;
>        charset = char_charset (c, charset_list, NULL);
> +      if (charset_map_loaded)
> +	{
> +	  const unsigned char *orig = coding->destination;
> +
> +	  coding_set_destination (coding);
> +	  dst += coding->destination - orig;
> +	}
> +
>        id = CHARSET_ID (charset);
>        reg = CODING_ISO_REQUEST (coding, id);
>        if (reg >= 0 && r[reg] < 0)
> @@ -4543,6 +4608,12 @@
 
>  	  /* We have to produce designation sequences if any now.  */
>  	  dst = encode_designation_at_bol (coding, charbuf, charbuf_end, dst);
> +	  if (charset_map_loaded)
> +	    {
> +	      EMACS_INT offset = coding->destination + coding->dst_bytes - dst_end;
> +	      dst_end += offset;
> +	      dst_prev += offset;
> +	    }
>  	  bol_designation = 0;
>  	  /* We are sure that designation sequences are all ASCII bytes.  */
>  	  produced_chars += dst - dst_prev;
> @@ -4616,12 +4687,17 @@
 
>  	  if (preferred_charset_id >= 0)
>  	    {
> +	      int result;
> +
>  	      charset = CHARSET_FROM_ID (preferred_charset_id);
> -	      if (! CHAR_CHARSET_P (c, charset))
> -		charset = char_charset (c, charset_list, NULL);
> +	      CODING_CHAR_CHARSET_P (coding, dst, dst_end, c, charset, result);
> +	      if (! result)
> +		CODING_CHAR_CHARSET(coding, dst, dst_end, c, charset_list,
> +				    NULL, charset);
>  	    }
>  	  else
> -	    charset = char_charset (c, charset_list, NULL);
> +	    CODING_CHAR_CHARSET(coding, dst, dst_end, c, charset_list,
> +				NULL, charset);
>  	  if (!charset)
>  	    {
>  	      if (coding->mode & CODING_MODE_SAFE_ENCODING)
> @@ -4632,7 +4708,8 @@
>  	      else
>  		{
>  		  c = coding->default_char;
> -		  charset = char_charset (c, charset_list, NULL);
> +		  CODING_CHAR_CHARSET(coding, dst, dst_end, c,
> +				      charset_list, NULL, charset);
>  		}
>  	    }
>  	  ENCODE_ISO_CHARACTER (charset, c);
> @@ -5064,7 +5141,9 @@
>        else
>  	{
>  	  unsigned code;
> -	  struct charset *charset = char_charset (c, charset_list, &code);
> +	  struct charset *charset;
> +	  CODING_CHAR_CHARSET(coding, dst, dst_end, c, charset_list,
> +			      &code, charset);
 
>  	  if (!charset)
>  	    {
> @@ -5076,7 +5155,8 @@
>  	      else
>  		{
>  		  c = coding->default_char;
> -		  charset = char_charset (c, charset_list, &code);
> +		  CODING_CHAR_CHARSET(coding, dst, dst_end, c,
> +				      charset_list, &code, charset);
>  		}
>  	    }
>  	  if (code == CHARSET_INVALID_CODE (charset))
> @@ -5153,7 +5233,9 @@
>        else
>  	{
>  	  unsigned code;
> -	  struct charset *charset = char_charset (c, charset_list, &code);
> +	  struct charset *charset;
> +	  CODING_CHAR_CHARSET(coding, dst, dst_end, c, charset_list,
> +			      &code, charset);
 
>  	  if (! charset)
>  	    {
> @@ -5165,7 +5247,8 @@
>  	      else
>  		{
>  		  c = coding->default_char;
> -		  charset = char_charset (c, charset_list, &code);
> +		  CODING_CHAR_CHARSET(coding, dst, dst_end, c,
> +				      charset_list, &code, charset);
>  		}
>  	    }
>  	  if (code == CHARSET_INVALID_CODE (charset))
> @@ -5747,7 +5831,9 @@
>  	}
>        else
>  	{
> -	  charset = char_charset (c, charset_list, &code);
> +	  CODING_CHAR_CHARSET(coding, dst, dst_end, c, charset_list,
> +			      &code, charset);
> +
>  	  if (charset)
>  	    {
>  	      if (CHARSET_DIMENSION (charset) == 1)


> -- 
> Kazuhiro Ito

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#9318; Package emacs. (Mon, 05 Dec 2011 07:11:02 GMT)

Message #50 received at 9318 <at> debbugs.gnu.org:

From: Kenichi Handa <handa <at> m17n.org>
To: 9318 <at> debbugs.gnu.org
Cc: kzhr <at> d1.dion.ne.jp, schwab <at> linux-m68k.org
Subject: Re: bug#9318: 23.3.50;
	The first call of encode-coding-region() returns wrong result
Date: Mon, 05 Dec 2011 16:10:11 +0900

In article <tl7zkfdnjgj.fsf <at> m17n.org>, Kenichi Handa <handa <at> m17n.org> writes:

> In article <20110830233131.C74A61E0043 <at> msa101.auone-net.jp>, Kazuhiro Ito <kzhr <at> d1.dion.ne.jp> writes:
> > Here is the patch, which includes Andreas' patch.  In my
> > environment, the problems are fixed.  I think it would be better to
> > change the interface of encode_designation_at_bol().

> Oops, sorry, I had vaguely thought that your patch below
> had already been applied, but I just noticed that it was not.
> I'll commit a slightly modified version including the
> improved interface for encode_designation_at_bol soon.

I've just installed the following changes.  As I don't have
a cygwin environment now, could you please check whether this
change really fixes the problem?

---
Kenichi Handa
handa <at> m17n.org

2011-12-05  Kenichi Handa  <handa <at> m17n.org>

	* coding.c (encode_designation_at_bol): New args charbuf_end and
	dst.  Return the number of produced bytes.  Callers changed.
	(coding_set_source): Return how many bytes coding->source was
	relocated.
	(coding_set_destination): Return how many bytes
	coding->destination was relocated.
	(CODING_DECODE_CHAR, CODING_ENCODE_CHAR, CODING_CHAR_CHARSET)
	(CODING_CHAR_CHARSET_P): Adjusted for the above changes.

2011-12-05  Kazuhiro Ito  <kzhr <at> d1.dion.ne.jp>  (tiny change)

	* coding.c (CODING_CHAR_CHARSET_P): New macro.
	(encode_coding_emacs_mule, encode_coding_iso_2022): Use the above
	macro (Bug#9318).

2011-12-05  Andreas Schwab  <schwab <at> linux-m68k.org>

	The following changes are to fix Bug#9318.

	* coding.c (CODING_ENCODE_CHAR, CODING_CHAR_CHARSET): New macros.
	(encode_coding_emacs_mule, ENCODE_ISO_CHARACTER)
	(encode_coding_iso_2022, encode_coding_sjis)
	(encode_coding_big5, encode_coding_charset): Use the above macros.


=== modified file 'src/coding.c'
--- src/coding.c	2011-11-07 01:57:07 +0000
+++ src/coding.c	2011-12-05 06:14:46 +0000
@@ -847,16 +847,16 @@
 static void decode_coding_raw_text (struct coding_system *);
 static int encode_coding_raw_text (struct coding_system *);
 
-static void coding_set_source (struct coding_system *);
-static void coding_set_destination (struct coding_system *);
+static EMACS_INT coding_set_source (struct coding_system *);
+static EMACS_INT coding_set_destination (struct coding_system *);
 static void coding_alloc_by_realloc (struct coding_system *, EMACS_INT);
 static void coding_alloc_by_making_gap (struct coding_system *,
                                         EMACS_INT, EMACS_INT);
 static unsigned char *alloc_destination (struct coding_system *,
                                          EMACS_INT, unsigned char *);
 static void setup_iso_safe_charsets (Lisp_Object);
-static unsigned char *encode_designation_at_bol (struct coding_system *,
-                                                 int *, unsigned char *);
+static int encode_designation_at_bol (struct coding_system *,
+				      int *, int *, unsigned char *);
 static int detect_eol (const unsigned char *,
                        EMACS_INT, enum coding_category);
 static Lisp_Object adjust_coding_eol_type (struct coding_system *, int);
@@ -915,27 +915,68 @@
     }
 }
 
-/* This wrapper macro is used to preserve validity of pointers into
-   buffer text across calls to decode_char, which could cause
-   relocation of buffers if it loads a charset map, because loading a
-   charset map allocates large structures.  */
+/* These wrapper macros are used to preserve validity of pointers into
+   buffer text across calls to decode_char, encode_char, etc, which
+   could cause relocation of buffers if it loads a charset map,
+   because loading a charset map allocates large structures.  */
+
 #define CODING_DECODE_CHAR(coding, src, src_base, src_end, charset, code, c) \
   do {									     \
+    EMACS_INT offset;							     \
+									     \
     charset_map_loaded = 0;						     \
     c = DECODE_CHAR (charset, code);					     \
-    if (charset_map_loaded)						     \
+    if (charset_map_loaded						     \
+	&& (offset = coding_set_source (coding)))			     \
       {									     \
-	const unsigned char *orig = coding->source;			     \
-	EMACS_INT offset;						     \
-									     \
-	coding_set_source (coding);					     \
-	offset = coding->source - orig;					     \
 	src += offset;							     \
 	src_base += offset;						     \
 	src_end += offset;						     \
       }									     \
   } while (0)
 
+#define CODING_ENCODE_CHAR(coding, dst, dst_end, charset, c, code)	\
+  do {									\
+    EMACS_INT offset;							\
+									\
+    charset_map_loaded = 0;						\
+    code = ENCODE_CHAR (charset, c);					\
+    if (charset_map_loaded						\
+	&& (offset = coding_set_destination (coding)))			\
+      {									\
+	dst += offset;							\
+	dst_end += offset;						\
+      }									\
+  } while (0)
+
+#define CODING_CHAR_CHARSET(coding, dst, dst_end, c, charset_list, code_return, charset) \
+  do {									\
+    EMACS_INT offset;							\
+									\
+    charset_map_loaded = 0;						\
+    charset = char_charset (c, charset_list, code_return);		\
+    if (charset_map_loaded						\
+	&& (offset = coding_set_destination (coding)))			\
+      {									\
+	dst += offset;							\
+	dst_end += offset;						\
+      }									\
+  } while (0)
+
+#define CODING_CHAR_CHARSET_P(coding, dst, dst_end, c, charset, result)	\
+  do {									\
+    EMACS_INT offset;							\
+									\
+    charset_map_loaded = 0;						\
+    result = CHAR_CHARSET_P (c, charset);				\
+    if (charset_map_loaded						\
+	&& (offset = coding_set_destination (coding)))			\
+      {									\
+	dst += offset;							\
+	dst_end += offset;						\
+      }									\
+  } while (0)
+
 
 /* If there are at least BYTES length of room at dst, allocate memory
    for coding->destination and update dst and dst_end.  We don't have
@@ -1015,9 +1056,14 @@
        | ((p)[-1] & 0x3F))))
 
 
-static void
+/* Update coding->source from coding->src_object, and return how many
+   bytes coding->source was changed.  */
+
+static EMACS_INT
 coding_set_source (struct coding_system *coding)
 {
+  const unsigned char *orig = coding->source;
+
   if (BUFFERP (coding->src_object))
     {
       struct buffer *buf = XBUFFER (coding->src_object);
@@ -1036,11 +1082,18 @@
       /* Otherwise, the source is C string and is never relocated
 	 automatically.  Thus we don't have to update anything.  */
     }
+  return coding->source - orig;
 }
 
-static void
+
+/* Update coding->destination from coding->dst_object, and return how
+   many bytes coding->destination was changed.  */
+
+static EMACS_INT
 coding_set_destination (struct coding_system *coding)
 {
+  const unsigned char *orig = coding->destination;
+
   if (BUFFERP (coding->dst_object))
     {
       if (BUFFERP (coding->src_object) && coding->src_pos < 0)
@@ -1065,6 +1118,7 @@
       /* Otherwise, the destination is C string and is never relocated
 	 automatically.  Thus we don't have to update anything.  */
     }
+  return coding->destination - orig;
 }
 
 
@@ -2650,14 +2704,19 @@
 
 	  if (preferred_charset_id >= 0)
 	    {
+	      int result;
+
 	      charset = CHARSET_FROM_ID (preferred_charset_id);
-	      if (CHAR_CHARSET_P (c, charset))
+	      CODING_CHAR_CHARSET_P (coding, dst, dst_end, c, charset, result);
+	      if (result)
 		code = ENCODE_CHAR (charset, c);
 	      else
-		charset = char_charset (c, charset_list, &code);
+		CODING_CHAR_CHARSET (coding, dst, dst_end, c, charset_list,
+				     &code, charset);
 	    }
 	  else
-	    charset = char_charset (c, charset_list, &code);
+	    CODING_CHAR_CHARSET (coding, dst, dst_end, c, charset_list,
+				 &code, charset);
 	  if (! charset)
 	    {
 	      c = coding->default_char;
@@ -2666,7 +2725,8 @@
 		  EMIT_ONE_ASCII_BYTE (c);
 		  continue;
 		}
-	      charset = char_charset (c, charset_list, &code);
+	      CODING_CHAR_CHARSET (coding, dst, dst_end, c, charset_list,
+				   &code, charset);
 	    }
 	  dimension = CHARSET_DIMENSION (charset);
 	  emacs_mule_id = CHARSET_EMACS_MULE_ID (charset);
@@ -4185,7 +4245,8 @@
 
 #define ENCODE_ISO_CHARACTER(charset, c)				   \
   do {									   \
-    int code = ENCODE_CHAR ((charset), (c));				   \
+    int code;								   \
+    CODING_ENCODE_CHAR (coding, dst, dst_end, (charset), (c), code);	   \
 									   \
     if (CHARSET_DIMENSION (charset) == 1)				   \
       ENCODE_ISO_CHARACTER_DIMENSION1 ((charset), code);		   \
@@ -4283,15 +4344,19 @@
 
 
 /* Produce designation sequences of charsets in the line started from
-   SRC to a place pointed by DST, and return updated DST.
+   CHARBUF to a place pointed by DST, and return the number of
+   produced bytes.  DST should not directly point a buffer text area
+   which may be relocated by char_charset call.
 
    If the current block ends before any end-of-line, we may fail to
    find all the necessary designations.  */
 
-static unsigned char *
-encode_designation_at_bol (struct coding_system *coding, int *charbuf,
+static int
+encode_designation_at_bol (struct coding_system *coding,
+			   int *charbuf, int *charbuf_end,
 			   unsigned char *dst)
 {
+  unsigned char *orig;
   struct charset *charset;
   /* Table of charsets to be designated to each graphic register.  */
   int r[4];
@@ -4309,7 +4374,7 @@
   for (reg = 0; reg < 4; reg++)
     r[reg] = -1;
 
-  while (found < 4)
+  while (charbuf < charbuf_end && found < 4)
     {
       int id;
 
@@ -4334,7 +4399,7 @@
 	  ENCODE_DESIGNATION (CHARSET_FROM_ID (r[reg]), reg, coding);
     }
 
-  return dst;
+  return dst - orig;
 }
 
 /* See the above "GENERAL NOTES on `encode_coding_XXX ()' functions".  */
@@ -4378,13 +4443,26 @@
 
       if (bol_designation)
 	{
-	  unsigned char *dst_prev = dst;
-
 	  /* We have to produce designation sequences if any now.  */
-	  dst = encode_designation_at_bol (coding, charbuf, dst);
-	  bol_designation = 0;
+	  unsigned char desig_buf[16];
+	  int nbytes;
+	  EMACS_INT offset;
+
+	  charset_map_loaded = 0;
+	  nbytes = encode_designation_at_bol (coding, charbuf, charbuf_end,
+					      desig_buf);
+	  if (charset_map_loaded
+	      && (offset = coding_set_destination (coding)))
+	    {
+	      dst += offset;
+	      dst_end += offset;
+	    }
+	  memcpy (dst, desig_buf, nbytes);
+	  dst += nbytes;
 	  /* We are sure that designation sequences are all ASCII bytes.  */
-	  produced_chars += dst - dst_prev;
+	  produced_chars += nbytes;
+	  bol_designation = 0;
+	  ASSURE_DESTINATION (safe_room);
 	}
 
       c = *charbuf++;
@@ -4455,12 +4533,17 @@
 
 	  if (preferred_charset_id >= 0)
 	    {
+	      int result;
+
 	      charset = CHARSET_FROM_ID (preferred_charset_id);
-	      if (! CHAR_CHARSET_P (c, charset))
-		charset = char_charset (c, charset_list, NULL);
+	      CODING_CHAR_CHARSET_P (coding, dst, dst_end, c, charset, result);
+	      if (! result)
+		CODING_CHAR_CHARSET (coding, dst, dst_end, c, charset_list,
+				     NULL, charset);
 	    }
 	  else
-	    charset = char_charset (c, charset_list, NULL);
+	    CODING_CHAR_CHARSET (coding, dst, dst_end, c, charset_list,
+				 NULL, charset);
 	  if (!charset)
 	    {
 	      if (coding->mode & CODING_MODE_SAFE_ENCODING)
@@ -4471,7 +4554,8 @@
 	      else
 		{
 		  c = coding->default_char;
-		  charset = char_charset (c, charset_list, NULL);
+		  CODING_CHAR_CHARSET (coding, dst, dst_end, c,
+				       charset_list, NULL, charset);
 		}
 	    }
 	  ENCODE_ISO_CHARACTER (charset, c);
@@ -4897,7 +4981,9 @@
       else
 	{
 	  unsigned code;
-	  struct charset *charset = char_charset (c, charset_list, &code);
+	  struct charset *charset;
+	  CODING_CHAR_CHARSET (coding, dst, dst_end, c, charset_list,
+			       &code, charset);
 
 	  if (!charset)
 	    {
@@ -4909,7 +4995,8 @@
 	      else
 		{
 		  c = coding->default_char;
-		  charset = char_charset (c, charset_list, &code);
+		  CODING_CHAR_CHARSET (coding, dst, dst_end, c,
+				       charset_list, &code, charset);
 		}
 	    }
 	  if (code == CHARSET_INVALID_CODE (charset))
@@ -4984,7 +5071,9 @@
       else
 	{
 	  unsigned code;
-	  struct charset *charset = char_charset (c, charset_list, &code);
+	  struct charset *charset;
+	  CODING_CHAR_CHARSET (coding, dst, dst_end, c, charset_list,
+			       &code, charset);
 
 	  if (! charset)
 	    {
@@ -4996,7 +5085,8 @@
 	      else
 		{
 		  c = coding->default_char;
-		  charset = char_charset (c, charset_list, &code);
+		  CODING_CHAR_CHARSET (coding, dst, dst_end, c,
+				       charset_list, &code, charset);
 		}
 	    }
 	  if (code == CHARSET_INVALID_CODE (charset))
@@ -5572,7 +5662,9 @@
 	}
       else
 	{
-	  charset = char_charset (c, charset_list, &code);
+	  CODING_CHAR_CHARSET (coding, dst, dst_end, c, charset_list,
+			       &code, charset);
+
 	  if (charset)
 	    {
 	      if (CHARSET_DIMENSION (charset) == 1)
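
The new calling convention for encode_designation_at_bol in the patch
above (write into a small scratch buffer, return a byte count, and let
the caller copy the bytes once the destination pointer is known to be
current) can be sketched on its own.  This is a toy model with
hypothetical names (toy_dest, toy_designation, toy_emit_designation).
"Relocation" is modelled by switching between two fixed arrays, and the
caller derives dst from a saved length rather than from a pointer offset,
which differs from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { TOY_TEXT_SIZE = 32, TOY_DESIG_MAX = 16 };

struct toy_dest
{
  unsigned char store[2][TOY_TEXT_SIZE];	/* two possible homes */
  int current;					/* which home is live */
};

/* Hypothetical stand-in for encode_designation_at_bol: writes an escape
   sequence into SCRATCH (never into the text) and returns its length.
   As a side effect it may move the text to the other store.  */
int
toy_designation (struct toy_dest *dest, int relocate, unsigned char *scratch)
{
  unsigned char *orig = scratch;

  if (relocate)
    {
      int next = !dest->current;

      memcpy (dest->store[next], dest->store[dest->current], TOY_TEXT_SIZE);
      dest->current = next;
    }
  *scratch++ = 0x1b;	/* ESC */
  *scratch++ = '(';
  *scratch++ = 'B';	/* ESC ( B designates ASCII to G0 */
  return (int) (scratch - orig);
}

/* Append a designation after USED bytes of existing output and return
   the new output length.  */
size_t
toy_emit_designation (struct toy_dest *dest, size_t used, int relocate)
{
  unsigned char desig_buf[TOY_DESIG_MAX];
  int nbytes = toy_designation (dest, relocate, desig_buf);
  /* Derive DST only after the call, so it points at the live store.  */
  unsigned char *dst = dest->store[dest->current] + used;

  assert (used + (size_t) nbytes <= TOY_TEXT_SIZE);
  memcpy (dst, desig_buf, (size_t) nbytes);
  return used + (size_t) nbytes;
}
```

Because the producer never receives a pointer into the relocatable text,
it cannot leave the caller with a stale dst.  The caller pays for one
small memcpy instead.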

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#9318; Package emacs. (Mon, 05 Dec 2011 09:13:01 GMT)

Message #53 received at 9318 <at> debbugs.gnu.org:

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Kenichi Handa <handa <at> m17n.org>
Cc: 9318 <at> debbugs.gnu.org
Subject: Re: bug#9318: 23.3.50;
	The first call of encode-coding-region() returns wrong result
Date: Mon, 05 Dec 2011 01:11:53 -0800

That patch (bzr 106613) causes Emacs to use an uninitialized variable;
I found this via static checking with GCC.  I installed the following
further patch, which I think is right and anyway does not introduce a bug --
can you please check it?  Thanks.

* coding.c (encode_designation_at_bol): Don't use uninitialized
local variable (Bug#9318).
=== modified file 'src/coding.c'
--- src/coding.c	2011-12-05 07:03:31 +0000
+++ src/coding.c	2011-12-05 09:00:44 +0000
@@ -4356,7 +4356,7 @@
 			   int *charbuf, int *charbuf_end,
 			   unsigned char *dst)
 {
-  unsigned char *orig;
+  unsigned char *orig = dst;
   struct charset *charset;
   /* Table of charsets to be designated to each graphic register.  */
   int r[4];
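
For readers skimming the one-line fix above: a function that reports
"bytes produced" as dst - orig only works if orig captures dst on entry.
A minimal sketch, with a hypothetical helper name (toy_emit) that is not
part of coding.c:

```c
#include <assert.h>

/* Write N copies of BYTE starting at DST and return how many bytes
   were produced.  */
int
toy_emit (unsigned char *dst, unsigned char byte, int n)
{
  /* Capture the start before DST advances.  Leaving this variable
     uninitialized would make the subtraction below meaningless.  */
  unsigned char *orig = dst;

  assert (n >= 0);
  while (n-- > 0)
    *dst++ = byte;
  return (int) (dst - orig);
}
```

Compilers can often flag the uninitialized variant when warnings are
enabled, which is how the problem was found here according to the
message above.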

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#9318; Package emacs. (Mon, 05 Dec 2011 11:33:02 GMT)

Message #56 received at 9318 <at> debbugs.gnu.org:

From: Kazuhiro Ito <kzhr <at> d1.dion.ne.jp>
To: Kenichi Handa <handa <at> m17n.org>
Cc: schwab <at> linux-m68k.org, 9318 <at> debbugs.gnu.org
Subject: Re: bug#9318: 23.3.50;
	The first call of encode-coding-region() returns wrong result
Date: Mon, 05 Dec 2011 20:31:33 +0900

> In article <tl7zkfdnjgj.fsf <at> m17n.org>, Kenichi Handa <handa <at> m17n.org> writes:
> 
> > In article <20110830233131.C74A61E0043 <at> msa101.auone-net.jp>, Kazuhiro Ito <kzhr <at> d1.dion.ne.jp> writes:
> > > Here is the patch for the code, which contains Andreas' patch.  In my
> > > environment, the problems are fixed.  I think it would be better to
> > > change the interface of encode_designation_at_bol().
> 
> > Oops, sorry, I had vaguely thought that your patch below
> > had already been applied, but I just noticed that it was not.
> > I'll commit a slightly modified version including the
> > improved interface for encode_designation_at_bol soon.
> 
> I've just installed the following changes.  As I don't have
> a Cygwin environment now, could you please check whether this
> change really fixes the problem?

As far as I can confirm, the problems are fixed (except for the issue
Paul pointed out).  Thank you.


Additionally, if you have time, please confirm Bug#8619 and Bug#9389.
-- 
Kazuhiro Ito

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#9318; Package emacs. (Tue, 06 Dec 2011 00:32:01 GMT) Full text and rfc822 format available.

Message #59 received at 9318 <at> debbugs.gnu.org (full text, mbox):

From: Kenichi Handa <handa <at> m17n.org>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: 9318 <at> debbugs.gnu.org
Subject: Re: bug#9318: 23.3.50;
	The first call of encode-coding-region() returns wrong result
Date: Tue, 06 Dec 2011 09:30:33 +0900

In article <4EDC8AD9.3050004 <at> cs.ucla.edu>, Paul Eggert <eggert <at> cs.ucla.edu> writes:

> That patch (bzr 106613) causes Emacs to use an uninitialized variable;
> I found this via static checking with GCC.  I installed the following
> further patch, which I think is right and anyway does not introduce a bug --
> can you please check it?  Thanks.

Oops, my fault.  Yes, your patch is correct.  Thank you.

---
Kenichi Handa
handa <at> m17n.org

> * coding.c (encode_designation_at_bol): Don't use uninitialized
> local variable (Bug#9318).
> === modified file 'src/coding.c'
> --- src/coding.c	2011-12-05 07:03:31 +0000
> +++ src/coding.c	2011-12-05 09:00:44 +0000
> @@ -4356,7 +4356,7 @@
>  			   int *charbuf, int *charbuf_end,
>  			   unsigned char *dst)
>  {
> -  unsigned char *orig;
> +  unsigned char *orig = dst;
>    struct charset *charset;
>    /* Table of charsets to be designated to each graphic register.  */
>    int r[4];

bug marked as fixed in version 24.0.93, send any further explanations to 9318 <at> debbugs.gnu.org and Kazuhiro Ito <kzhr <at> d1.dion.ne.jp> Request was from Glenn Morris <rgm <at> gnu.org> to control <at> debbugs.gnu.org. (Tue, 06 Dec 2011 08:36:02 GMT) Full text and rfc822 format available.

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Tue, 03 Jan 2012 12:24:03 GMT) Full text and rfc822 format available.

This bug report was last modified 13 years and 170 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.
