GNU bug report logs - #4157
[macOS/HFS] dired doesn't decode ls output when it uses different encoding for filename vs date


Package: emacs;

Reported by: Peter Dyballa <Peter_Dyballa <at> Freenet.DE>

Date: Sun, 16 Aug 2009 02:25:05 UTC

Severity: minor

Tags: notabug

Found in versions 27.0.50, 23.1.50

Done: Stefan Kangas <stefan <at> marxist.se>

Bug is archived. No further changes may be made.



Message #45 received at 4157 <at> emacsbugs.donarmstrong.com (full text, mbox):

From: Peter Dyballa <Peter_Dyballa <at> Freenet.DE>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 4157 <at> debbugs.gnu.org
Subject: Re: bug#4157: 23.1.50; faulty character characterisation for ä
Date: Sun, 23 Aug 2009 11:57:37 +0200
On 23.08.2009 at 03:49, Stefan Monnier wrote:

>> In both locales the *file names* are correct and also detected as
>> containing
>
> "correct" doesn't really tell me what you see, but I see what you
> mean.

"Correct" meant that I was seeing what I had typed before in Finder...

>
>> "composed characters," it's a problem with the file's month date.
>> In the
>
> So my guess was right: ls's output uses utf-8 for the filenames, but
> latin-1 for the date, which is why it's difficult for dired to do the
> right thing (it's not impossible, of course, but it's more work and
> dired is currently not set up for that).
>

Here is a little test from a shell (actually a *shell* buffer in NS
Emacs.app with UTF-8 locales):

pete 252 /\ gls -lN zo*
-rw-r--r-- 1 pete admin 281829 20. Mär 1998  zoä€.au
pete 253 /\ ls -lw zo*
-rw-r--r--   1 pete  admin  281829 20 Mär  1998 zoä€.au
pete 254 /\ gls -lN zo* | od -j 32 -t a
0000040    0   .  sp   M   \303   \244   r  sp   1   9   9   8  sp  sp   z   o
0000060    a   \314  88   \342  82   \254   .   a   u  nl
0000072
pete 255 /\ env LC_CTYPE=de_DE.ISO8859-15 LANG=de_DE.ISO8859-15 gls -lN zo* | od -j 32 -t a
0000040    0   .  sp   M   \344   r  sp   1   9   9   8  sp  sp   z   o   a
0000060    \314  88   \342  82   \254   .   a   u  nl
0000071
pete 256 /\ ls -lw zo* | od -j 32 -t a
0000040    2   9  sp   2   0  sp   M   \303   \244   r  sp   1   9   9   8
0000060   sp   z   o   a   \314  88   \342  82   \254   .   a   u  nl
0000075
pete 257 /\ env LC_CTYPE=de_DE.ISO8859-15 LANG=de_DE.ISO8859-15 ls -lw zo* | od -j 32 -t a
0000040    2   9  sp   2   0  sp   M   \344   r  sp  sp   1   9   9   8  sp
0000060    z   o   a   \314  88   \342  82   \254   .   a   u  nl
0000074
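
As an illustration of why this output is awkward for a consumer such as
dired, here is a minimal Python sketch (not Emacs code) that feeds the bytes
from the ISO8859-15 gls dump first to a single decoder and then to a
per-field decoder. The byte strings are transcribed by hand from the dump
above; the function names decode_whole and decode_fields are hypothetical,
and the split into date and name fields is assumed rather than parsed.

```python
# Bytes as seen in the ISO8859-15 run of `gls -lN`:
# the month "Mär" is a single Latin-9 byte (0xE4), while the file name
# is decomposed UTF-8 (a + U+0308, then U+20AC).
date_field = b"20. M\xe4r 1998"
name_field = b"zoa\xcc\x88\xe2\x82\xac.au"
line = date_field + b"  " + name_field


def decode_whole(raw, encoding):
    """Decode the whole line with one codec, replacing undecodable bytes."""
    return raw.decode(encoding, errors="replace")


def decode_fields(date, name):
    """Hypothetical per-field decoding: locale codec for the date,
    UTF-8 for the file name."""
    return date.decode("iso8859-15") + "  " + name.decode("utf-8")


as_utf8 = decode_whole(line, "utf-8")         # month is damaged
as_latin9 = decode_whole(line, "iso8859-15")  # file name is damaged
per_field = decode_fields(date_field, name_field)

print(repr(as_utf8))
print(repr(as_latin9))
print(repr(per_field))
```

Neither single codec yields a clean line: UTF-8 cannot decode the lone 0xE4
in the month, and ISO8859-15 turns the multi-byte file name into unrelated
characters. Only the per-field variant keeps both parts intact.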

So the *ls commands deliver the month in the date precomposed, in the
encoding of their locale, while the file name is always *de*composed UTF-8:

\303 \244    = C3 A4    = LATIN SMALL LETTER A WITH DIAERESIS ä at U+00E4
\314 88      = CC 88    = COMBINING DIAERESIS                 ¨ at U+0308
\342 82 \254 = E2 82 AC = EURO SIGN                           € at U+20AC
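
The byte sequences listed here can be cross-checked with a short Python
sketch that uses the standard unicodedata module; the helper name describe
is hypothetical. It decodes each hex sequence as UTF-8 (the euro sign is
taken as E2 82 AC, matching the \342 82 \254 in the od output), looks up the
code point and character name, and then shows that NFC normalization turns
the decomposed a + U+0308 from the file name into the same bytes the month
uses in the UTF-8 locale.

```python
import unicodedata

# Hex byte sequences taken from the od dumps.
hex_sequences = ["C3 A4", "CC 88", "E2 82 AC"]


def describe(hex_bytes):
    """Decode one UTF-8 sequence and return (code point, Unicode name)."""
    ch = bytes.fromhex(hex_bytes).decode("utf-8")
    return "U+%04X" % ord(ch), unicodedata.name(ch)


described = [describe(h) for h in hex_sequences]
for hex_bytes, (code_point, name) in zip(hex_sequences, described):
    print(hex_bytes, code_point, name)

# File name form: "a" followed by COMBINING DIAERESIS (decomposed, NFD).
decomposed = bytes.fromhex("61 CC 88").decode("utf-8")
# NFC composes the pair into the single code point used in the month.
composed = unicodedata.normalize("NFC", decomposed)

print(len(decomposed), len(composed))
print(composed.encode("utf-8").hex())
```

The two spellings of ä are canonically equivalent but differ byte for byte,
so a plain byte or code-point comparison treats them as different strings
unless both sides are normalized first.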

--
Greetings

  Pete

Bake pizza not war!






This bug report was last modified 5 years and 189 days ago.



GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.