GNU bug report logs -
#12807
24.2; Emacs cannot edit file with funny Unicode characters in the file name on Windows
Reported by: Nils Gösche <cartan <at> cartan.de>
Date: Mon, 5 Nov 2012 21:02:01 UTC
Severity: wishlist
Merged with 7100, 15236
Found in versions 24.0.50, 24.2, 24.3.50
Done: Eli Zaretskii <eliz <at> gnu.org>
Bug is archived. No further changes may be made.
To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 12807 in the body.
You can then email your comments to 12807 AT debbugs.gnu.org in the normal way.
Report forwarded to bug-gnu-emacs <at> gnu.org: bug#12807; Package emacs. (Mon, 05 Nov 2012 21:02:01 GMT)
Acknowledgement sent to Nils Gösche <cartan <at> cartan.de>: New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Mon, 05 Nov 2012 21:02:01 GMT)
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
Dear Sirs,
I keep a bunch of text files on my Windows 7 desktop containing my thoughts
about the solutions of chess problems I am trying to solve. Now, one of these
problems was composed by a Russian. So, I named the file Кузовков_Lösung.txt:
First the name of the Russian composer, then the German word for »solution«.
However, when I tried to edit that file in Emacs, I only got error messages,
probably because of the funny Unicode characters in the file name. (See below
for the exact wording of the messages.)
Another file with only English/German characters in the name,
Thorton_Lösung.txt, does not cause any trouble at all (oh, but it seems I
misspelled the name, actually).
(BTW, Notepad does not have any problems editing the same file. So, it is
not some weird, OS-related problem, either).
Regards,
Nils Gösche
======== End of bug report======
In GNU Emacs 24.2.1 (i386-mingw-nt6.1.7601)
of 2012-08-29 on MARVIN
Windowing system distributor `Microsoft Corp.', version 6.1.7601
Configured using:
`configure --with-gcc (4.6) --cflags
-ID:/devel/emacs/libs/libXpm-3.5.8/include
-ID:/devel/emacs/libs/libXpm-3.5.8/src
-ID:/devel/emacs/libs/libpng-dev_1.4.3-1/include
-ID:/devel/emacs/libs/zlib-dev_1.2.5-2/include
-ID:/devel/emacs/libs/giflib-4.1.4-1/include
-ID:/devel/emacs/libs/jpeg-6b-4/include
-ID:/devel/emacs/libs/tiff-3.8.2-1/include
-ID:/devel/emacs/libs/gnutls-3.0.9/include'
Important settings:
value of $LC_ALL: nil
value of $LC_COLLATE: nil
value of $LC_CTYPE: nil
value of $LC_MESSAGES: en_US
value of $LC_MONETARY: nil
value of $LC_NUMERIC: nil
value of $LC_TIME: nil
value of $LANG: DEU
value of $XMODIFIERS: nil
locale-coding-system: cp1252
default enable-multibyte-characters: t
Major mode: Lisp Interaction
Minor modes in effect:
display-time-mode: t
tooltip-mode: t
mouse-wheel-mode: t
file-name-shadow-mode: t
global-font-lock-mode: t
font-lock-mode: t
blink-cursor-mode: t
auto-composition-mode: t
auto-encryption-mode: t
auto-compression-mode: t
line-number-mode: t
Recent input:
<down-mouse-1> <mouse-1> <down-mouse-1> <mouse-1> C-x
C-f d e s k <tab> K u s o w k o w _ L ö s u n g . t
x t C-g <help-echo> <down-mouse-1> <mouse-1> C-x C-f
K u <backspace> <backspace> d e s k <tab> K u s o w
k o s <backspace> w _ L ö s u n g . t x t <return>
b l a r k <return> C-x C-s C-x k <return> C-z <down-mouse-1>
<mouse-1> <down-mouse-1> <mouse-1> d x f o i p g d
o j g <return> C-x C-s <down-mouse-1> <mouse-1> <down-mouse-1>
<mouse-1> C-x k <return> y e s <return> C-z <down-mouse-1>
<mouse-1> <return> C-x C-s <backspace> C-x C-s C-x
k <return> C-z C-x k <return> C-z <down-mouse-1> <mouse-1>
C-x k <return> C-z C-x C-f d e s k <tab> k <tab> <backspace>
<tab> <tab> <down-mouse-1> <mouse-2> <end> F a r k
. <return> C-x C-s C-x k <return> y e s <return> C-z
M-x M-x C-g M-x r e p o r <tab> <return>
Recent messages:
Wrote c:/Users/cartan/Desktop/Thorton_Lösung.txt
Saving file c:/Users/cartan/Desktop/Thorton_Lösung.txt...
Wrote c:/Users/cartan/Desktop/Thorton_Lösung.txt
(New file) [2 times]
Making completion list...
Mark set
Saving file c:/Users/cartan/Desktop/????????_Lösung.txt...
basic-save-buffer-2: Opening output file: invalid argument, c:/Users/cartan/Desktop/????????_Lösung.txt
completing-read-default: Command attempted to use minibuffer while in minibuffer
Quit
Quit
Load-path shadows:
None found.
Features:
(shadow sort gnus-util mail-extr emacsbug message format-spec rfc822 mml
mml-sec mm-decode mm-bodies mm-encode mail-parse rfc2231 mailabbrev gmm-utils
mailheader sendmail regexp-opt rfc2047 rfc2045 ietf-drums mm-util mail-prsvr
mail-utils help-mode easymenu view eliserv doctor server time cl time-date
tooltip ediff-hook vc-hooks lisp-float-type mwheel dos-w32 disp-table ls-lisp
w32-win w32-vars tool-bar dnd fontset image fringe lisp-mode register page
menu-bar rfn-eshadow timer select scroll-bar mouse jit-lock font-lock syntax
facemenu font-core frame cham georgian utf-8-lang misc-lang vietnamese tibetan
thai tai-viet lao korean japanese hebrew greek romanian slovak czech european
ethiopic indian cyrillic chinese case-table epa-hook jka-cmpr-hook help simple
abbrev minibuffer loaddefs button faces cus-face files text-properties overlay
sha1 md5 base64 format env code-pages mule custom widget
hashtable-print-readable backquote make-network-process multi-tty emacs)
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#12807; Package emacs. (Mon, 05 Nov 2012 21:52:01 GMT)
Message #8 received at 12807 <at> debbugs.gnu.org (full text, mbox):
> From: Nils Gösche <cartan <at> cartan.de>
> Date: Mon, 05 Nov 2012 21:52:05 +0100
>
> I keep a bunch of text files on my Windows 7 desktop containing my thoughts
> about the solutions of chess problems I am trying to solve. Now, one of these
> problems was composed by a Russian. So, I named the file Кузовков_Lösung.txt:
> First the name of the Russian composer, then the German word for »solution«.
> However, when I tried to edit that file in Emacs, I only got error messages,
> probably because of the funny Unicode characters in the file name. (See below
> for the exact wording of the messages.)
>
> Another file with only English/German characters in the name,
> Thorton_Lösung.txt, does not cause any trouble at all (oh, but it seems I
> misspelled the name, actually).
Emacs on Windows currently supports only file names that can be
expressed in the system codepage. So unless someone writes the code
to support the Unicode APIs throughout, this limitation will remain
for some time to come. Volunteers are welcome.
> (BTW, Notepad does not have any problems editing the same file. So, it is
> not some weird, OS-related problem, either).
Yes, but the Explorer and the Notepad are about the only programs that
do. Many others don't. Emacs is one of them.
Sorry.
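The codepage limitation described in this reply can be illustrated with a small sketch. It is not Emacs code: `to_codepage_lossy` is a hypothetical helper, and it only approximates cp1252 by treating ASCII and U+00A0..U+00FF as representable (cp1252 agrees with Latin-1 in that range; its 0x80..0x9F block is ignored here for brevity). Every other character becomes `?`, which is consistent with the `????????_Lösung.txt` shown in the reporter's message log.

```c
#include <assert.h>
#include <stddef.h>

/* Decode one UTF-8 sequence at s into *cp and return the number of
   bytes consumed.  A byte that does not start a valid sequence is
   consumed alone and reported as U+FFFD. */
int decode_utf8(const unsigned char *s, unsigned long *cp)
{
    if (s[0] < 0x80) {
        *cp = s[0];
        return 1;
    }
    if ((s[0] & 0xE0) == 0xC0 && (s[1] & 0xC0) == 0x80) {
        *cp = ((unsigned long)(s[0] & 0x1F) << 6) | (s[1] & 0x3F);
        return 2;
    }
    if ((s[0] & 0xF0) == 0xE0 && (s[1] & 0xC0) == 0x80
        && (s[2] & 0xC0) == 0x80) {
        *cp = ((unsigned long)(s[0] & 0x0F) << 12)
            | ((unsigned long)(s[1] & 0x3F) << 6) | (s[2] & 0x3F);
        return 3;
    }
    if ((s[0] & 0xF8) == 0xF0 && (s[1] & 0xC0) == 0x80
        && (s[2] & 0xC0) == 0x80 && (s[3] & 0xC0) == 0x80) {
        *cp = ((unsigned long)(s[0] & 0x07) << 18)
            | ((unsigned long)(s[1] & 0x3F) << 12)
            | ((unsigned long)(s[2] & 0x3F) << 6) | (s[3] & 0x3F);
        return 4;
    }
    *cp = 0xFFFD;
    return 1;
}

/* Hypothetical helper: convert a UTF-8 file name to a single-byte
   approximation of cp1252, replacing unrepresentable characters
   with '?'.  The output is truncated to fit outsize bytes. */
void to_codepage_lossy(const char *utf8, char *out, size_t outsize)
{
    const unsigned char *s = (const unsigned char *)utf8;
    size_t n = 0;

    assert(outsize > 0);
    while (*s && n + 1 < outsize) {
        unsigned long cp;
        s += decode_utf8(s, &cp);
        if (cp < 0x80 || (cp >= 0xA0 && cp <= 0xFF))
            out[n++] = (char)(unsigned char)cp;  /* same byte in cp1252 */
        else
            out[n++] = '?';                      /* e.g. any Cyrillic letter */
    }
    out[n] = '\0';
}
```

Passing the UTF-8 bytes of the reporter's file name through `to_codepage_lossy` turns the eight Cyrillic letters into eight question marks, while `ö` survives as the single byte 0xF6. The resulting name no longer refers to the file on disk, and `?` is not a valid character in Windows file names.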
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#12807; Package emacs,w32. (Mon, 05 Nov 2012 22:10:01 GMT)
Message #11 received at 12807 <at> debbugs.gnu.org (full text, mbox):
You wrote:
> Emacs on Windows currently supports only file names that can be
> expressed in the system codepage. So unless someone writes the code to
> support the Unicode APIs throughout, this limitation will remain for
> some time to come. Volunteers are welcome.
Subtle hint noted. Ok ok, I'll look into it.
> > (BTW, Notepad does not have any problems editing the same file. So, it
> > is not some weird, OS-related problem, either).
>
> Yes, but the Explorer and the Notepad are about the only programs that
> do. Many others don't. Emacs is one of them.
»About the only« is a bit of an exaggeration ;-) Anything that is written
in C# or Java shouldn't have that problem; or Common Lisp, come to think of
it. But yeah, back in the old days, pretty much nobody felt like using
wchar_t instead of char everywhere in C. I didn't, either, back then. (Not
to mention that in the really old days, wchar_t didn't even exist ;-)
I'll see what I can do.
Regards,
--
Nils Gösche
Don't ask for whom the <Ctrl-G> tolls.
Merged 7100 12807. Request was from Glenn Morris <rgm <at> gnu.org> to control <at> debbugs.gnu.org. (Mon, 05 Nov 2012 22:23:02 GMT)
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#12807; Package emacs,w32. (Tue, 06 Nov 2012 04:02:02 GMT)
Message #16 received at 12807 <at> debbugs.gnu.org (full text, mbox):
> From: Nils Gösche <cartan <at> cartan.de>
> Cc: <12807 <at> debbugs.gnu.org>
> Date: Mon, 5 Nov 2012 23:05:57 +0100
>
> > Yes, but the Explorer and the Notepad are about the only programs that
> > do. Many others don't. Emacs is one of them.
>
> »About the only« is a bit of an exaggeration ;-) Anything that is written
> in C# or Java shouldn't have that problem; or Common Lisp, come to think of
> it. But yeah, back in the old days, pretty much nobody felt like using
> wchar_t instead of char everywhere in C. I didn't, either, back then. (Not
> to mention that in the really old days, wchar_t didn't even exist ;-)
Using wchar_t is not going to solve the whole problem, unfortunately.
The problem is that the mainline Emacs code uses APIs that don't
accept wide characters. Examples include 'stat', 'access', 'open',
'fopen', etc. To fix the problem, we'd need to provide our own
implementation of these APIs that would accept a UTF-8 encoded file
name, then re-encode the file name in UTF-16, and call the Unicode
APIs as part of the implementation. This is a large job.
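The approach outlined in this message (accept a UTF-8 file name, re-encode it as UTF-16, call the wide-character API) might look roughly like the sketch below. It is an illustration only, not the code that went into Emacs: `utf8_to_utf16` and `open_utf8` are hypothetical names, the conversion is hand-rolled so the sketch compiles off Windows too, and the 260-unit buffer is an arbitrary limit. On Windows, `MultiByteToWideChar` with `CP_UTF8` could do the conversion step instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#ifdef _WIN32
#include <io.h>
#include <wchar.h>
#endif

/* Re-encode NUL-terminated UTF-8 as NUL-terminated UTF-16 code units.
   Returns the number of units written (excluding the terminator), or
   -1 if the input is malformed or does not fit in cap units.
   Overlong encodings are not rejected in this sketch. */
int utf8_to_utf16(const char *utf8, uint16_t *out, size_t cap)
{
    const unsigned char *s = (const unsigned char *)utf8;
    size_t n = 0;

    assert(cap > 0);
    while (*s) {
        uint32_t cp;
        int extra;  /* number of continuation bytes expected */

        if (s[0] < 0x80)                { cp = s[0];        extra = 0; }
        else if ((s[0] & 0xE0) == 0xC0) { cp = s[0] & 0x1F; extra = 1; }
        else if ((s[0] & 0xF0) == 0xE0) { cp = s[0] & 0x0F; extra = 2; }
        else if ((s[0] & 0xF8) == 0xF0) { cp = s[0] & 0x07; extra = 3; }
        else return -1;
        s++;
        while (extra-- > 0) {
            if ((*s & 0xC0) != 0x80)
                return -1;              /* also catches a truncated string */
            cp = (cp << 6) | (uint32_t)(*s++ & 0x3F);
        }
        if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return -1;

        if (cp >= 0x10000) {            /* needs a surrogate pair */
            if (n + 2 >= cap)
                return -1;
            cp -= 0x10000;
            out[n++] = (uint16_t)(0xD800 | (cp >> 10));
            out[n++] = (uint16_t)(0xDC00 | (cp & 0x3FF));
        } else {
            if (n + 1 >= cap)
                return -1;
            out[n++] = (uint16_t)cp;
        }
    }
    out[n] = 0;
    return (int)n;
}

#ifdef _WIN32
/* Hypothetical replacement for open(): takes a UTF-8 name and calls
   the wide-character variant _wopen with the UTF-16 form. */
int open_utf8(const char *utf8_name, int oflag, int pmode)
{
    uint16_t wname[260];  /* arbitrary limit chosen for this sketch */

    if (utf8_to_utf16(utf8_name, wname, sizeof wname / sizeof wname[0]) < 0)
        return -1;
    /* wchar_t is a 16-bit unsigned type on Windows. */
    return _wopen((const wchar_t *)wname, oflag, pmode);
}
#endif
```

The same pattern would have to be repeated for each of the other APIs named above ('stat', 'access', 'fopen', and so on), which is part of why the job is described as large.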
bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Fri, 10 Jan 2014 12:24:03 GMT)
This bug report was last modified 11 years and 214 days ago.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.