GNU bug report logs - #25322
[AUCTeX-devel] preview-latex coding system problem with Japanese LaTeX

Package: auctex;

Reported by: Ikumi Keita <ikumi <at> ikumi.que.jp>

Date: Sun, 1 Jan 2017 14:42:01 UTC

Severity: normal

Done: Ikumi Keita <ikumi <at> ikumi.que.jp>

Bug is archived. No further changes may be made.


Report forwarded to bug-auctex <at> gnu.org:
bug#25322; Package auctex. (Sun, 01 Jan 2017 14:42:01 GMT)

Acknowledgement sent to Ikumi Keita <ikumi <at> ikumi.que.jp>:
New bug report received and forwarded. Copy sent to bug-auctex <at> gnu.org. (Sun, 01 Jan 2017 14:42:01 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Ikumi Keita <ikumi <at> ikumi.que.jp>
To: bug-auctex <at> gnu.org
Subject: [AUCTeX-devel] preview-latex coding system problem with Japanese LaTeX
Date: Sun, 01 Jan 2017 23:41:06 +0900
Following Mosè's advice, I'm forwarding this message to the bug tracking
ML.

To those who try to work on this issue:  This was originally posted to
auctex-devel <at> gnu.org.  See also
https://lists.gnu.org/archive/html/auctex-devel/2016-12/msg00058.html
and the thread following it.

----------------------------------------------------------------------
Dear AUCTeX developers,

I have some problems with preview-latex with regard to the coding system
when I use Japanese LaTeX.  Since recent TeXLive releases contain Japanese
LaTeX by default, I suppose that non-Japanese users can reproduce the
problems if sample files are provided.  So I have organized this email
into the following three parts:

A. The problems are described with the attached sample files so
   that anyone can actually experience the situation and examine
   what's going on in detail.
B. The reasons for the problems are explained and tentative fixes
   are proposed in the attached patches.
C. The patches in B. fix the problems only partially.  The remaining
   problem is described, with a call for help.

A. There are two problems.  I will describe them in order.
A-1. How to reproduce:
(1) Start a new emacs session with
env LC_ALL=ja_JP.SJIS emacs &
    and enable preview-latex.
(2) Open the attached file "preview-error-test.tex", which has many
    \section lines.  They are all commented out initially.
(3) Uncomment any one of them and start preview-latex with C-c C-p C-d.
    Answer with n to "Cache preamble?" question.  Then the error or bad
    result described on the next line of the uncommented \section will
    occur, e.g.
Invalid regexp: "Unmatched ( or \\("
(4) Comment out again that \section line, uncomment another \section
    line, and try C-c C-p C-d again.  Another error will come out.
(5) Repeat the procedure described in (4).

Step (3) will not work if your tex distribution lacks the
Japanese LaTeX command binary "platex".  In that case, please check
the following list.
o Be sure to install TeXLive.  Other tex distributions usually lack
  Japanese TeX engines.
o If you (or the package manager you are using) didn't select a scheme
  large enough when installing TeXLive, the Japanese LaTeX suite is not
  present on your machine.
o Japanese TeX was first included in TeXLive several years ago.  Thus if
  your TeXLive is older than that, Japanese LaTeX is not available.
o If your ghostscript is not configured to handle PS files with Japanese
  fonts, the characters in the preview image may be garbled.  However,
  that is not the point I'm speaking of now.  Rather, it is the error in
  the regexp match, which prevents preview-latex from doing its job, that
  I'd like you to look at.

A-2. How to reproduce:
(1) This time, start a new emacs session with another locale
env LC_ALL=ja_JP.eucJP emacs &
    and enable preview-latex.
(2) Open the attached file "preview-error-test2.tex" and type C-c C-p
    C-d.  This time, answer with y, not n, to "Cache preamble?"
    question.
(3) Then the preview image will appear at the wrong position.

This example requires the `platex' binary, too.

B. The reasons and tentative fixes to the problems.
B-1. Shift-JIS encoding problem.
The bad results demonstrated in A-1 are caused by the nature of the
coding system `japanese-shift-jis' (SJIS for short).  SJIS is one of the
major encodings for Japanese text and the standard encoding in the
Japanese edition of Windows for historical reasons.  Basically, SJIS
represents one Japanese character by two bytes.  Examples of such
two-byte sequences are, in hexadecimal form:

8E 82

and

81 5B

While the first byte of the sequence is always 8-bit (MSB on), the
second is not necessarily so.  In the above two examples, the second
byte of the first example (82) is 8-bit, but that of the second example
(5B) is 7-bit (MSB off).  It is this 7-bit byte that causes the problems
in A-1 above.  Unfortunately, this 7-bit byte sometimes coincides with a
regexp meta character (5B, for instance, is "[").  Thus `regexp-quote'
in the function `preview-error-quote' interferes with it.  Roughly
speaking, `preview-error-quote' works along this flow:
1. Encodes the string in the given coding system (i.e., SJIS in this
   example).
2. Replaces texts which begin with "^^" with the corresponding byte.
3. Builds a regular expression, for later use to locate the position
   in the buffer for putting the preview image, guarding the meta
   characters in the original text by `regexp-quote'.
4. Decodes the obtained string back out of the coding system.
However, when `regexp-quote' in step 3 quotes the 7-bit byte in
SJIS, decoding back fails to recover the original character.

The following example illustrates what is going on:
(let* ((s1 (char-to-string (make-char 'japanese-jisx0208 37 63)))
       ;; s1 is multibyte Japanese string.
       ;; Encode s1 in SJIS.
       (s2 (encode-coding-string s1 'shift_jis))
       ;; At this point s2 is "\203^".
       (s3 (regexp-quote s2))
       ;; Now s3 is "\203\\^".
       ;; Then decode back assuming SJIS encoding.
       (s4 (decode-coding-string s3 'shift_jis)))
  (string-equal s1 s4))
=> nil ;; no longer goes back to the original string s1.
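
To see which characters are affected, here is a minimal sketch (not part
of the attached patches) that checks whether the SJIS bytes of a
character are altered by `regexp-quote'.  The helper name
`my-sjis-quote-unsafe-p' is hypothetical, made up only for this
illustration.

```elisp
;; Hypothetical helper, not part of AUCTeX: report whether the Shift-JIS
;; byte sequence of CHAR contains a byte that `regexp-quote' would escape.
(defun my-sjis-quote-unsafe-p (char)
  "Return non-nil if `regexp-quote' alters the SJIS encoding of CHAR."
  (let ((bytes (encode-coding-string (char-to-string char) 'shift_jis)))
    (not (string-equal bytes (regexp-quote bytes)))))

;; 83 5E: the second byte is "^", a regexp meta character.
(my-sjis-quote-unsafe-p (make-char 'japanese-jisx0208 37 63)) ; => t
;; 81 5B: the second byte is "[".
(my-sjis-quote-unsafe-p (make-char 'japanese-jisx0208 33 60)) ; => t
;; 82 A0: both bytes are 8-bit, so nothing gets escaped.
(my-sjis-quote-unsafe-p (make-char 'japanese-jisx0208 36 34)) ; => nil
```

Only characters whose second byte happens to be a meta character are
hit, which is why each \section line in the sample file fails in its
own way.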

The attached patch "preview-latex-fix" is my approach to fixing this
problem.  It avoids handling the encoded string and performs the relevant
operations on the decoded string consistently.  (In addition, it fixes a
problem that `char-to-string' in the original code does not do the
expected job in unicode-based emacs for chars of #x80 through #xFF.  I
changed it to use `byte-to-string' instead when that function is
available.)
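
The idea can be sketched as follows.  This is only an illustration of
the two points above under simplified assumptions, not the code in the
patch itself.

```elisp
;; Sketch only: quote the decoded (multibyte) string, never the SJIS bytes.
(let* ((s1 (char-to-string (make-char 'japanese-jisx0208 37 63)))
       (re (regexp-quote s1)))
  (list (string-equal s1 re)                               ; character untouched
        (string-match-p re (concat "\\section{" s1 "}")))) ; still found
;; => (t 9)

;; Why `byte-to-string' matters for #x80 through #xFF in unicode-based
;; emacs: `char-to-string' yields a multibyte character, not a raw byte.
(list (multibyte-string-p (char-to-string #x83))
      (multibyte-string-p (byte-to-string #x83)))
;; => (t nil)
```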

B-2. preview-latex drops the necessary command option.
The Japanese TeX command sometimes needs the "-kanji" option to know the
coding system of the given TeX file.  In AUCTeX, this requirement is
usually covered by the "%(kanjiopt)" construct in the following lines
quoted from tex-jp.el:

(setq TeX-engine-alist-builtin
      (append TeX-engine-alist-builtin
             '((ptex "pTeX" "ptex %(kanjiopt)" "platex %(kanjiopt)" "eptex")
               (jtex "jTeX" "jtex" "jlatex" nil)
               (uptex "upTeX" "euptex" "uplatex" "euptex"))))

This "%(kanjiopt)" is replaced by a suitable option string like "-kanji
XXX" when necessary.  However, if the answer to the question "Cache
preamble?" is y, preview-latex drops this option, which leads to the
result described in A-2 above.

The reason why the option "-kanji XXX" is missing is that
`TeX-inline-preview-internal' transforms the command line passed to the
OS shell by `(preview-do-replacements command
preview-undump-replacements)' when caching preamble is enabled.  Here
the regular expression in `preview-undump-replacements' is designed to
pick up the very first word of the value of the variable `command',
leaving behind the option "-kanji XXX".

The attached patch "preview-latex-fix2" aims to resolve this problem.
It restores the latex command options provided in the entry which
`(TeX-engine-alist)' returns so that the command will run correctly.
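
To make the mechanism concrete, here is a deliberately simplified
sketch.  The two functions and the "<undump-args>" placeholder are
hypothetical stand-ins; the real regexp in `preview-undump-replacements'
and the real patch are more involved, and "-kanji=euc" is just an
example option value.

```elisp
;; Hypothetical stand-in for a rewrite that keeps only the first word.
(defun my-undump-lossy (command)
  "Rewrite COMMAND keeping only its first word."
  (if (string-match "\\`\\([^ ]+\\)" command)
      (concat (match-string 1 command) " <undump-args>")
    command))

;; Hypothetical variant that also keeps the dash options following it.
(defun my-undump-keeping-options (command)
  "Rewrite COMMAND keeping its first word and the dash options after it."
  (if (string-match "\\`\\([^ ]+\\)\\(\\(?: +-[^ ]+\\)*\\)" command)
      (concat (match-string 1 command) (match-string 2 command)
              " <undump-args>")
    command))

(my-undump-lossy "platex -kanji=euc foo.tex")
;; => "platex <undump-args>"
(my-undump-keeping-options "platex -kanji=euc foo.tex")
;; => "platex -kanji=euc <undump-args>"
```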

C. Call for help
Some problems still remain.  I think we should have an
integrated framework which can serve both preview-latex and
tex-jp.el to determine the suitable process coding system.

The coding systems used to communicate with the Japanese TeX command are
not constant but vary with the environment.  In fact they can only be
determined at run time.  Currently that situation is handled by the
function `japanese-TeX-set-process-coding-system' in tex-jp.el during
the normal runs.  That function is set to the value of
`TeX-after-start-process-function' and called after the TeX process
starts.  In that way, the process coding systems are set to suitable
values under the environment at that point of time.  However, the way
preview-latex handles process coding systems sometimes conflicts with
such setting.  For example, `TeX-inline-preview-internal' overwrites the
process coding system after `japanese-TeX-set-process-coding-system'
does its job.  (Current preview-latex uses the value of
`TeX-japanese-process-output-coding-system', but it is not sufficient to
rely on such constant value.  In fact the default value of
`TeX-japanese-process-output-coding-system' was changed to nil
recently.)  Even my patch "preview-latex-fix" is not sufficient in
this respect.  The coding-system argument supplied to
`decode-coding-string' should not simply be `buffer-file-coding-system'.

I would appreciate it if anyone who has deeper knowledge of AUCTeX could
help to resolve all these coding system issues in preview-latex.

Best regards,
Ikumi Keita

[test-suite.tar.gz (application/x-gzip, attachment)]

Information forwarded to bug-auctex <at> gnu.org:
bug#25322; Package auctex. (Thu, 18 May 2017 13:33:02 GMT)

Message #8 received at 25322 <at> debbugs.gnu.org:

From: Ikumi Keita <ikumi <at> ikumi.que.jp>
To: 25322 <at> debbugs.gnu.org
Cc: auctex-devel <at> gnu.org
Subject: Re: bug#25322: Acknowledgement ([AUCTeX-devel] preview-latex coding
 system problem with Japanese LaTeX)
Date: Thu, 18 May 2017 22:32:16 +0900
Hi all,

I have worked on resolving the remaining problems with respect to the
incompatibility between preview-latex and Japanese LaTeX, and I think
I have managed to sort them out.  Please take a look at the attached patch.

My basic plan, in addition to the part described before in
http://lists.gnu.org/archive/html/auctex-devel/2016-09/msg00101.html
, is
(1) Implement in tex-buf.el a function to adjust the process coding
    system for normal tex documents as well as Japanese tex documents
    and make it the new default value for
    `TeX-after-start-process-function', which was previously used only
    for Japanese TeX by tex-jp.el.  With this change, all tex processes
    invoked within AUCTeX are given suitable coding systems.
(2) Make preview-latex examine the coding system set in (1), and
    use it to decode later if it decides not to decode while receiving
    the output and to store it as a byte sequence in order to work around
    an xemacs bug.  To achieve this, the patch changes the meaning of the
    variable `preview-coding-system'.  Its value used to be the "new"
    coding system, which is potentially `raw-text' preserving the byte
    sequence, but it's now the "original" coding system assigned to the
    process, which decodes the output properly.
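
A rough sketch of the idea in (2), under simplified assumptions: the
variable and function names below are hypothetical and are not the ones
in the patch, and the process output is simulated by encoding a string.

```elisp
;; Hypothetical sketch: keep process output as raw bytes and decode it
;; later with the coding system originally assigned to the process.
(defvar my-original-coding-system 'shift_jis
  "Stand-in for the coding system remembered in step (1).")

(defun my-decode-later (raw-bytes)
  "Decode RAW-BYTES (a unibyte string) with the remembered coding system."
  (decode-coding-string raw-bytes my-original-coding-system))

(let* ((text (char-to-string (make-char 'japanese-jisx0208 37 63)))
       ;; Simulate the bytes the TeX process would emit.
       (raw (encode-coding-string text 'shift_jis)))
  (string-equal text (my-decode-later raw)))
;; => t
```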

Any comments and suggestions are greatly welcome.

> -- 
> 25322: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=25322
> GNU Bug Tracking System
> Contact help-debbugs <at> gnu.org with problems

[preview-latex-coding-system-fix.gz (application/x-gzip, attachment)]

Information forwarded to bug-auctex <at> gnu.org:
bug#25322; Package auctex. (Fri, 19 May 2017 10:41:02 GMT)

Message #11 received at 25322 <at> debbugs.gnu.org:

From: Mosè Giordano <mose <at> gnu.org>
To: Ikumi Keita <ikumi <at> ikumi.que.jp>
Cc: 25322 <at> debbugs.gnu.org, auctex-devel <auctex-devel <at> gnu.org>
Subject: Re: bug#25322: Acknowledgement ([AUCTeX-devel] preview-latex coding
 system problem with Japanese LaTeX)
Date: Fri, 19 May 2017 12:40:04 +0200
Hi Keita,

2017-05-18 15:32 GMT+02:00 Ikumi Keita <ikumi <at> ikumi.que.jp>:
> Hi all,
>
> I have worked on resolving the remaining problems with respect to the
> incompatibility between preview-latex and Japanese LaTeX, and I think
> I have managed to sort them out.  Please take a look at the attached patch.
>
> My basic plan, in addition to the part described before in
> http://lists.gnu.org/archive/html/auctex-devel/2016-09/msg00101.html
> , is
> (1) Implement in tex-buf.el a function to adjust the process coding
>     system for normal tex documents as well as Japanese tex documents
>     and make it the new default value for
>     `TeX-after-start-process-function', which was previously used only
>     for Japanese TeX by tex-jp.el.  With this change, all tex processes
>     invoked within AUCTeX are given suitable coding systems.
> (2) Make preview-latex examine the coding system set in (1), and
>     use it to decode later if it decides not to decode while receiving
>     the output and to store it as a byte sequence in order to work around
>     an xemacs bug.  To achieve this, the patch changes the meaning of the
>     variable `preview-coding-system'.  Its value used to be the "new"
>     coding system, which is potentially `raw-text' preserving the byte
>     sequence, but it's now the "original" coding system assigned to the
>     process, which decodes the output properly.
>
> Any comments and suggestions are greatly welcome.

I only read the patch, didn't actually try it, but I'm confident you
did it ;-)  I didn't see anything clearly wrong.  I have only a
request: do you think it's possible to add a test for this fix?

Bye,
Mosè




Information forwarded to bug-auctex <at> gnu.org:
bug#25322; Package auctex. (Fri, 19 May 2017 12:44:01 GMT)

Message #14 received at 25322 <at> debbugs.gnu.org:

From: Ikumi Keita <ikumi <at> ikumi.que.jp>
To: Mosè Giordano <mose <at> gnu.org>
Cc: 25322 <at> debbugs.gnu.org, auctex-devel <auctex-devel <at> gnu.org>
Subject: Re: bug#25322: preview-latex coding system problem with Japanese LaTeX
Date: Fri, 19 May 2017 21:42:58 +0900
Hi Mosè, thanks for your reply!

>>>>> Mosè Giordano <mose <at> gnu.org> writes:

> I only read the patch, didn't actually try it, but I'm confident you
> did it ;-)  I didn't see anything clearly wrong.  I have only a
> request: do you think it's possible to add a test for this fix?

OK, I'll try.  It may take a week or so.  I'll also wait for possible
comments from others in the meantime.

I also found another issue and made a patch which might be incorporated
as well.
Preview-latex on xemacs on w32 does not work except when
`preview-image-type' is set to `dvipng'.  It fails to call ghostscript
correctly and ghostscript signals an error without generating any image
files.  The attached patch fixes this.  What do you think about this one?

[Background of the patch]
Preview-latex constructs path names like
"circ.prv/tmpkWCzEN/preview.dsc" and tries to pass them to ghostscript.
However, functions of xemacs on w32 use backslash as the path separator
by default when the return value is a file name, and
`preview-ps-quote-filename' is confused by this behavior:

(defun preview-ps-quote-filename (str &optional nonrel)
  "Make a PostScript string from filename STR.
The file name is first made relative unless
NONREL is not NIL."
  (unless nonrel (setq str (file-relative-name str)))
  (let ((index 0))
    (while (setq index (string-match "[\\()]" str index))
      (setq str (replace-match "\\\\\\&" t nil str)
	    index (+ 2 index)))
    (concat "(" str ")")))

Here, `file-relative-name' converts slashes contained in `str' to
backslashes, each of which is doubled by the while loop after it.  As
a result, the path names actually passed to ghostscript become like
"circ.prv\\tmpkWCzEN\\preview.dsc".  That's the reason ghostscript
signals an error.
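
The doubling can be observed on any platform with the following sketch.
It copies the quoting loop above under a hypothetical name,
`my-ps-quote', and leaves out the `file-relative-name' step, so the
backslashes are supplied by hand instead of by w32 xemacs.

```elisp
;; Copy of the quoting loop under a hypothetical name, without the
;; `file-relative-name' step.
(defun my-ps-quote (str)
  "Backslash-escape backslashes and parens in STR and wrap it in parens."
  (let ((index 0))
    (while (setq index (string-match "[\\()]" str index))
      (setq str (replace-match "\\\\\\&" t nil str)
            index (+ 2 index)))
    (concat "(" str ")")))

;; Forward slashes pass through untouched.
(my-ps-quote "circ.prv/tmpkWCzEN/preview.dsc")
;; => "(circ.prv/tmpkWCzEN/preview.dsc)"

;; Each backslash becomes two (shown as four in Lisp read syntax).
(my-ps-quote "circ.prv\\tmpkWCzEN\\preview.dsc")
;; => "(circ.prv\\\\tmpkWCzEN\\\\preview.dsc)"
```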

The behavior of w32 xemacs functions with respect to the path separator
mentioned above is controlled by binding a suitable value to
`directory-sep-char', so the attached patch takes care of that.  With
this change, preview-latex does work as expected, at least for the
xemacs 21.5 binary I found and installed on my machine.

I tried applying the previous patch, and to my delight, it makes
preview-latex cooperate with Japanese LaTeX on w32 xemacs, too.

Regards,
Ikumi Keita

[fix-xemacs-w32-preview (text/x-diff, attachment)]

Information forwarded to bug-auctex <at> gnu.org:
bug#25322; Package auctex. (Fri, 26 May 2017 13:48:01 GMT)

Message #17 received at 25322 <at> debbugs.gnu.org:

From: Ikumi Keita <ikumi <at> ikumi.que.jp>
Cc: 25322 <at> debbugs.gnu.org, Mosè Giordano <mose <at> gnu.org>,
 auctex-devel <auctex-devel <at> gnu.org>
Subject: Re: bug#25322: preview-latex coding system problem with Japanese LaTeX
Date: Fri, 26 May 2017 22:47:46 +0900
Hi Mosè and all,

>> I only read the patch, didn't actually try it, but I'm confident you
>> did it ;-)  I didn't see anything clearly wrong.  I have only a
>> request: do you think it's possible to add a test for this fix?

> OK, I'll try.  Maybe it takes a week or so.  I'll also wait for possible
> comments from others to arrive during that.

I wrote a suite of ERT files and committed them with the patches, since I
heard no objections in the meantime.  I also committed the patch in my
previous message to make preview-latex work with xemacs on w32
systems.

It was too difficult for me to write a fully automated test which
can complete entirely in batch mode, because the functionality of
preview-latex depends on how the images scattered over the buffer "look"
to human eyes.  So some of the tests are marked to be skipped in batch
mode.  Such tests require manual execution instead.
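
For readers unfamiliar with ERT, one common way to skip a test in batch
mode looks like the sketch below.  The test name and body are
hypothetical, and this is not necessarily how the committed tests are
written; `skip-unless' requires Emacs 24.4 or later.

```elisp
(require 'ert)

;; Hypothetical test: only `skip-unless' and `noninteractive' are the
;; point here.
(ert-deftest my-preview-image-position ()
  "Ask a human whether the preview image is where it should be."
  (skip-unless (not noninteractive))   ; skipped under `emacs -batch'
  (should (y-or-n-p "Is the preview image at the right position? ")))
```

Run interactively with M-x ert, the test asks its question; in batch
mode it is reported as skipped rather than failed.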

If someone finds difficulties with this commit, feel free to ask me.

Best regards,
Ikumi Keita




Information forwarded to bug-auctex <at> gnu.org:
bug#25322; Package auctex. (Fri, 26 May 2017 15:35:02 GMT)

Message #20 received at 25322 <at> debbugs.gnu.org:

From: Mosè Giordano <mose <at> gnu.org>
To: Ikumi Keita <ikumi <at> ikumi.que.jp>
Cc: 25322 <at> debbugs.gnu.org, auctex-devel <auctex-devel <at> gnu.org>
Subject: Re: bug#25322: preview-latex coding system problem with Japanese LaTeX
Date: Fri, 26 May 2017 17:33:27 +0200
Hi!

2017-05-26 15:47 GMT+02:00 Ikumi Keita <ikumi <at> ikumi.que.jp>:
> I wrote a suite of ERT files and committed them with the patches, since I
> heard no objections in the meantime.  I also committed the patch in my
> previous message to make preview-latex work with xemacs on w32
> systems.
>
> It was too difficult for me to write a fully automated test which
> can complete entirely in batch mode, because the functionality of
> preview-latex depends on how the images scattered over the buffer "look"
> to human eyes.  So some of the tests are marked to be skipped in batch
> mode.  Such tests require manual execution instead.

I know that tests aren't always easy to write, especially for stuff
that requires interaction.  I don't ask for tests at any cost, but when
it's possible to write them, they're definitely useful to prevent
future failures ;-)

> If someone finds difficulties with this commit, feel free to ask me.

Indeed I have a problem: all non skipped tests in
japanese/preview-latex.el fail for me.  Attached you can find the log,
if it can help you.

Bye,
Mosè
[preview-latex.log (text/x-log, attachment)]

Information forwarded to bug-auctex <at> gnu.org:
bug#25322; Package auctex. (Fri, 26 May 2017 17:12:01 GMT)

Message #23 received at 25322 <at> debbugs.gnu.org:

From: Ikumi Keita <ikumi <at> ikumi.que.jp>
To: Mosè Giordano <mose <at> gnu.org>
Cc: 25322 <at> debbugs.gnu.org, auctex-devel <auctex-devel <at> gnu.org>
Subject: Re: bug#25322: preview-latex coding system problem with Japanese LaTeX
Date: Sat, 27 May 2017 02:11:05 +0900
Hi Mosè, thanks for your response!

>>>>> Mosè Giordano <mose <at> gnu.org> writes:
> Indeed I have a problem: all non skipped tests in
> japanese/preview-latex.el fail for me.  Attached you can find the log,
> if it can help you.

It seems that the latest commit is not yet installed on your box.  I
encountered similar failures with AUCTeX before the update.  My tests only
pass with the updated code.

Bye,
Ikumi Keita




Information forwarded to bug-auctex <at> gnu.org:
bug#25322; Package auctex. (Fri, 26 May 2017 17:17:01 GMT)

Message #26 received at 25322 <at> debbugs.gnu.org:

From: Mosè Giordano <mose <at> gnu.org>
To: Ikumi Keita <ikumi <at> ikumi.que.jp>
Cc: 25322 <at> debbugs.gnu.org, auctex-devel <auctex-devel <at> gnu.org>
Subject: Re: bug#25322: preview-latex coding system problem with Japanese LaTeX
Date: Fri, 26 May 2017 19:15:54 +0200
2017-05-26 19:11 GMT+02:00 Ikumi Keita <ikumi <at> ikumi.que.jp>:
> Hi Mosè, thanks for your response!
>
>>>>>> Mosè Giordano <mose <at> gnu.org> writes:
>> Indeed I have a problem: all non skipped tests in
>> japanese/preview-latex.el fail for me.  Attached you can find the log,
>> if it can help you.
>
> It seems that the latest commit is not yet installed on your box.  I
> encountered similar failures with AUCTeX before the update.  My tests only
> pass with the updated code.

The repo was up-to-date, but I had to run make distclean before
recompiling AUCTeX (previously I had run only make clean, which wasn't
sufficient).  Now the tests pass.  Thanks and sorry for the noise!

Bye,
Mosè




bug closed, send any further explanations to 25322 <at> debbugs.gnu.org and Ikumi Keita <ikumi <at> ikumi.que.jp>. Request was from Ikumi Keita <ikumi <at> ikumi.que.jp> to control <at> debbugs.gnu.org. (Wed, 31 Oct 2018 13:34:01 GMT)

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Thu, 29 Nov 2018 12:24:04 GMT)

This bug report was last modified 6 years and 204 days ago.
