GNU bug report logs -
#5553
23.1.92; Archives with wrong coding system
To reply to this bug, email your comments to 5553 AT debbugs.gnu.org.
Report forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org: bug#5553; Package emacs. (Tue, 09 Feb 2010 21:28:02 GMT)
Acknowledgement sent to Juri Linkov <juri <at> jurta.org>: New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Tue, 09 Feb 2010 21:28:02 GMT)
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
When `archive-mode' is enabled for an archive file with an unknown file
extension, using the rule ("\\(PK00\\)?[P]K\003\004" . archive-mode)
from `magic-fallback-mode-alist', visiting such a file fails with the
args-out-of-range error.
The following patch should fix this bug using the same regexp as in
`magic-fallback-mode-alist' and the same coding system as for archive
file extensions in `auto-coding-alist':
=== modified file 'lisp/international/mule.el'
--- lisp/international/mule.el 2010-02-01 22:57:45 +0000
+++ lisp/international/mule.el 2010-02-09 21:18:51 +0000
@@ -1653,7 +1653,9 @@ (defcustom auto-coding-regexp-alist
     ("\\`\xFE\xFF" . utf-16be-with-signature)
     ("\\`\xFF\xFE" . utf-16le-with-signature)
     ("\\`\xEF\xBB\xBF" . utf-8-with-signature)
-    ("\\`;ELC\024\0\0\0" . emacs-mule))) ; Emacs 20-compiled
+    ("\\`;ELC\024\0\0\0" . emacs-mule) ; Emacs 20-compiled
+    ;; For `archive-mode' in `magic-fallback-mode-alist':
+    ("\\(PK00\\)?[P]K\003\004" . no-conversion-multibyte)))
   "Alist of patterns vs corresponding coding systems.
 Each element looks like (REGEXP . CODING-SYSTEM).
 A file whose first bytes match REGEXP is decoded by CODING-SYSTEM on reading.
--
Juri Linkov
http://www.jurta.org/emacs/
Information forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org: bug#5553; Package emacs. (Tue, 09 Feb 2010 22:26:01 GMT)
Message #8 received at 5553 <at> debbugs.gnu.org (full text, mbox):
> When `archive-mode' is enabled for an archive file with an unknown file
> extension, using the rule ("\\(PK00\\)?[P]K\003\004" . archive-mode)
> from `magic-fallback-mode-alist', visiting such a file fails with the
> args-out-of-range error.
>
> The following patch should fix this bug using the same regexp as in
> `magic-fallback-mode-alist' and the same coding system as for archive
> file extensions in `auto-coding-alist':
The same problem exists also for images. `magic-fallback-mode-alist' contains:
(image-type-auto-detected-p . image-mode)
but visiting an image file with a non-standard file extension
(i.e. not in `auto-mode-alist') doesn't display it as an image.
The following patch fixes this problem, but it seems duplicating
image regexps from `image-type-header-regexps' is too ugly?
=== modified file 'lisp/international/mule.el'
--- lisp/international/mule.el 2010-02-09 05:00:56 +0000
+++ lisp/international/mule.el 2010-02-09 22:16:28 +0000
@@ -1655,7 +1655,14 @@ (defcustom auto-coding-regexp-alist
     ("\\`\xEF\xBB\xBF" . utf-8-with-signature)
     ("\\`;ELC\024\0\0\0" . emacs-mule) ; Emacs 20-compiled
     ;; For `archive-mode' in `magic-fallback-mode-alist':
-    ("\\(PK00\\)?[P]K\003\004" . no-conversion-multibyte)))
+    ("\\(PK00\\)?[P]K\003\004" . no-conversion-multibyte)
+    ;; For `image-mode' in `magic-fallback-mode-alist'
+    ;; (regexps duplicated from `image-type-header-regexps'):
+    ("\\`GIF8[79]a" . no-conversion) ; gif
+    ("\\`\x89PNG\r\n\x1a\n" . no-conversion) ; png
+    ("\\`\\(?:MM\0\\*\\|II\\*\0\\)" . no-conversion) ; tiff
+    ("\\`\xff\xd8" . no-conversion) ; jpeg
+    ))
   "Alist of patterns vs corresponding coding systems.
 Each element looks like (REGEXP . CODING-SYSTEM).
 A file whose first bytes match REGEXP is decoded by CODING-SYSTEM on reading.
--
Juri Linkov
http://www.jurta.org/emacs/
Information forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org: bug#5553; Package emacs. (Tue, 09 Feb 2010 22:36:02 GMT)
Message #11 received at 5553 <at> debbugs.gnu.org (full text, mbox):
> From: Juri Linkov <juri <at> jurta.org>
> Date: Tue, 09 Feb 2010 23:19:27 +0200
> Cc:
>
> When `archive-mode' is enabled for an archive file with an unknown file
> extension, using the rule ("\\(PK00\\)?[P]K\003\004" . archive-mode)
> from `magic-fallback-mode-alist', visiting such a file fails with the
> args-out-of-range error.
>
> The following patch should fix this bug using the same regexp as in
> `magic-fallback-mode-alist' and the same coding system as for archive
> file extensions in `auto-coding-alist':
Thanks, but please provide a self-contained recipe for reproducing the
problem, starting with "emacs -Q".
Information forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org: bug#5553; Package emacs. (Wed, 10 Feb 2010 01:05:01 GMT)
Message #14 received at 5553 <at> debbugs.gnu.org (full text, mbox):
> Thanks, but please provide a self-contained recipe for reproducing the
> problem, starting with "emacs -Q".
AFAICS, it is not reproducible with "emacs -Q", where archives and images
with non-standard file extensions are visited in the proper modes.
The problem appears when using Unicad (http://code.google.com/p/unicad/).
Basically, what it does boils down to the following line:
(add-to-list 'auto-coding-functions 'unicad-universal-charset-detect)
The rest is just statistical guessing of the coding system based solely
on the content of the file, and in case of archives and images, the
guess is incorrect, and `magic-fallback-mode-alist' fails to match
a mode regexp at the beginning of the buffer.
So the question is whether we should complement entries in
`magic-fallback-mode-alist' with the corresponding entries in
`auto-coding-regexp-alist' with the same regexps (like we complement
entries in `auto-mode-alist' with entries in `auto-coding-alist')?
Or every function in `auto-coding-functions' that determines a coding system
should somehow take care of exceptions in `magic-fallback-mode-alist'?
--
Juri Linkov
http://www.jurta.org/emacs/
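Until the question above is settled, the first option can be tried from an init file. This is only a sketch of the approach in the patches, not an agreed fix: it reuses the archive regexp and one image regexp exactly as they appear in the patches, and relies on `auto-coding-regexp-alist' being consulted before `auto-coding-functions', so a content-based guesser such as Unicad should no longer get a say for these files.

```elisp
;; Init-file sketch (an assumption, not an accepted fix for this bug):
;; pin the coding system by file content so that functions in
;; `auto-coding-functions' are never asked about these files.

;; ZIP-style archives: same regexp as the `archive-mode' entry in
;; `magic-fallback-mode-alist', same coding system as in the patch.
(add-to-list 'auto-coding-regexp-alist
             '("\\(PK00\\)?[P]K\003\004" . no-conversion-multibyte))

;; GIF images: regexp as quoted in the second patch.
(add-to-list 'auto-coding-regexp-alist
             '("\\`GIF8[79]a" . no-conversion))
```

The other image types from the second patch can be added the same way. The duplication of regexps that the patch author calls ugly is of course still there; it just moves into the user's configuration.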
Information forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org: bug#5553; Package emacs. (Wed, 10 Feb 2010 20:15:02 GMT)
Message #17 received at 5553 <at> debbugs.gnu.org (full text, mbox):
> So the question is whether we should complement entries in
> `magic-fallback-mode-alist' with the corresponding entries in
> `auto-coding-regexp-alist' with the same regexps (like we complement
> entries in `auto-mode-alist' with entries in `auto-coding-alist')?
> Or every function in `auto-coding-functions' that determines a coding system
> should somehow take care of exceptions in `magic-fallback-mode-alist'?
I think that auto-coding-alist should allow mapping not only file-names
but also major modes to coding-systems. This should hopefully take care
of those issues by mapping image-mode and archive-mode to no-conversion.
Stefan
Information forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org: bug#5553; Package emacs. (Wed, 10 Feb 2010 22:40:03 GMT)
Message #20 received at 5553 <at> debbugs.gnu.org (full text, mbox):
>> So the question is whether we should complement entries in
>> `magic-fallback-mode-alist' with the corresponding entries in
>> `auto-coding-regexp-alist' with the same regexps (like we complement
>> entries in `auto-mode-alist' with entries in `auto-coding-alist')?
>
>> Or every function in `auto-coding-functions' that determines a coding system
>> should somehow take care of exceptions in `magic-fallback-mode-alist'?
>
> I think that auto-coding-alist should allow mapping not only file-names
> but also major modes to coding-systems. This should hopefully take care
> of those issues by mapping image-mode and archive-mode to no-conversion.
I don't understand how this is possible because currently a coding system
should be recognized before mode is chosen:
1. Recognizing Coding Systems
   1.1. coding-system-for-read if non-nil
   1.2. auto-coding-alist matching a filename
   1.3. auto-coding-regexp-alist matching first bytes
   1.4. `-*- coding: -*-' tag
   1.5. auto-coding-functions (e.g. unicad-universal-charset-detect)
   1.6. file-coding-system-alist matching a filename
2. Choosing Modes
   2.1. `-*- mode: -*-' tag
   2.2. interpreter-mode-alist
   2.3. magic-mode-alist
   2.4. auto-mode-alist
   2.5. magic-fallback-mode-alist
--
Juri Linkov
http://www.jurta.org/emacs/
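To see which of the rules in 1.2 to 1.5 wins for a particular file, something like the following may help. It is a minimal sketch: `my-explain-auto-coding' is a hypothetical helper name, not part of Emacs, and it only looks at the first 1024 bytes, so a coding tag in a trailing local variables section is not seen. `find-auto-coding' does not consult `coding-system-for-read' or `file-coding-system-alist', so steps 1.1 and 1.6 are outside what it reports.

```elisp
;; Hypothetical helper (not part of Emacs): report which coding system
;; `find-auto-coding' would choose for FILE and which rule chose it.
(defun my-explain-auto-coding (file)
  "Return (CODING . SOURCE) for FILE as computed by `find-auto-coding'.
SOURCE names the rule that matched, e.g. `auto-coding-regexp-alist'
or `auto-coding-functions'.  Return nil if no rule matched."
  (with-temp-buffer
    ;; Work on raw bytes so the regexps see the file's magic number.
    (set-buffer-multibyte nil)
    ;; Only the head of the file is read; 1024 bytes is an arbitrary choice.
    (insert-file-contents-literally file nil 0 1024)
    (goto-char (point-min))
    (find-auto-coding file (buffer-size))))
```

Evaluating (my-explain-auto-coding "/path/to/archive.xyz") with and without Unicad installed in `auto-coding-functions' should show whether the guess comes from there; the path is a placeholder.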
Information forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org: bug#5553; Package emacs. (Thu, 11 Feb 2010 02:13:02 GMT)
Message #23 received at 5553 <at> debbugs.gnu.org (full text, mbox):
>> I think that auto-coding-alist should allow mapping not only file-names
>> but also major modes to coding-systems. This should hopefully take care
>> of those issues by mapping image-mode and archive-mode to no-conversion.
> I don't understand how this is possible because currently a coding system
> should be recognized before mode is chosen:
This is the reason why my suggestion did not come with a patch ;-)
This said, I don't think it's impossible, but it would require
a reorganization indeed.
Stefan
This bug report was last modified 15 years and 125 days ago.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.