GNU bug report logs - #2940
[macOS] C-s in dired fails to find umlauts in filenames (due to wrong file-name-coding-system)

Package: emacs;

Reported by: Markus Triska <markus.triska <at> gmx.at>

Date: Thu, 9 Apr 2009 16:35:04 UTC

Severity: normal

From: Alp Aker <aker <at> pitt.edu>
To: Glenn Morris <rgm <at> gnu.org>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 2940 <at> debbugs.gnu.org, markus.triska <at> gmx.at
Subject: bug#2940: 23.0.92; C-s in dired fails to find files with umlauts
Date: Sat, 16 Jul 2011 13:38:19 -0400 (EDT)

Glenn Morris wrote:

> IIUC, he's not using a --with-ns build. It's a "normal", gtk build that
> happens to be running on a Mac. So ns-win.el isn't in use.

My mistake; since it was running on Darwin I just assumed an NS build, and 
didn't look at the build info in the original bug report.

Making this the default behavior for non-NS builds running on a Mac is
probably TRT.  It was once possible to use Darwin with UFS, but that
hasn't been true for the last three major versions of the OS, so going
forward it will be vanishingly rare for (eq system-type 'darwin) not to
imply that the file system is a variant of HFS+.  And it's reasonable for
users to expect that Emacs will, out of the box, properly handle file
names on the system it was built on.

OTOH, just adding something like:

  (when (eq system-type 'darwin)
    (require 'ucs-normalize)
    (setq file-name-coding-system 'utf-8-hfs))

to x-win.el might not be the best solution.  The utf-8-hfs coding system 
does both post-read conversion (normalizing to precomposed utf-8) and 
pre-write conversion (normalizing to Apple's variant of decomposed utf-8). 
The latter is unnecessary:  the OS itself will do normalization on any 
filename handed to it.  (Observe that the coding system defined in 
ns-win.el only does post-read conversion.)
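
To make the two directions concrete, here is a minimal sketch using the
string helpers that ucs-normalize.el provides.  It is only an
illustration, not part of the proposed change, and the example-* variable
names are made up for it.  It shows why a search for a precomposed "ü"
misses the decomposed form that HFS+ hands back, and what each
conversion does to the string.

```elisp
(require 'ucs-normalize)

;; Hypothetical names, used only in this illustration.
(defvar example-precomposed "\u00FC"
  "LATIN SMALL LETTER U WITH DIAERESIS as one precomposed character.")
(defvar example-decomposed "u\u0308"
  "The same letter as `u' followed by COMBINING DIAERESIS (U+0308).")

;; The two strings display alike but are different character
;; sequences, so a plain string comparison (or isearch) fails.
(string= example-precomposed example-decomposed)        ; => nil

;; Post-read direction: decomposed -> precomposed.
(string= (ucs-normalize-HFS-NFC-string example-decomposed)
         example-precomposed)                            ; => t

;; Pre-write direction: precomposed -> Apple's decomposed variant.
(string= (ucs-normalize-HFS-NFD-string example-precomposed)
         example-decomposed)                             ; => t
```

Only the post-read direction is needed for searching to work in dired,
since it puts the names in the buffer into the form that typed input
normally has.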

For local operations, the redundant pre-write conversion is harmless. 
But using decomposed utf-8 might cause trouble when dealing with remote 
files.  So it's probably more robust to follow ns-win.el's lead and define 
a coding system that only does post-read conversion.  Thus:

  (when (eq system-type 'darwin)
    (require 'ucs-normalize)
    (define-coding-system 'utf-8-hfs-for-read
      "UTF-8 based coding system for HFS+ file names."
      :coding-type 'utf-8
      :mnemonic ?U
      :charset-list '(unicode)
      :post-read-conversion 'ucs-normalize-hfs-nfd-post-read-conversion)
    (setq file-name-coding-system 'utf-8-hfs-for-read))

would be the addition to make to x-win.el.
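
A rough way to check that a read-only coding system of this kind behaves
as described is sketched below.  It repeats the definition under a
hypothetical name (example-utf-8-hfs-read-only) so that it can be
evaluated on its own, and the example-* helpers are likewise invented
for the illustration.  It assumes that decode-coding-string and
encode-coding-string go through the same conversions as file-name
handling does, which should be checked on the build in question.

```elisp
(require 'ucs-normalize)

;; Same shape as the definition above, under a hypothetical name so the
;; sketch does not interfere with any real coding system.
(define-coding-system 'example-utf-8-hfs-read-only
  "UTF-8 with HFS+ normalization applied on read only (illustration)."
  :coding-type 'utf-8
  :mnemonic ?U
  :charset-list '(unicode)
  :post-read-conversion 'ucs-normalize-hfs-nfd-post-read-conversion)

(defvar example-decomposed-bytes (encode-coding-string "u\u0308" 'utf-8)
  "UTF-8 bytes of a decomposed \"ü\", standing in for a name read from disk.")

(defun example-read-name (bytes)
  "Decode BYTES as a file name would be decoded on read."
  (decode-coding-string bytes 'example-utf-8-hfs-read-only))

(defun example-write-name (name)
  "Encode NAME as a file name would be encoded before being handed to the OS."
  (encode-coding-string name 'example-utf-8-hfs-read-only))

;; Reading should yield the precomposed character.
(string= (example-read-name example-decomposed-bytes) "\u00FC")

;; Writing should leave the name as plain UTF-8, with no decomposition.
(equal (example-write-name "\u00FC")
       (encode-coding-string "\u00FC" 'utf-8))
```

If both forms evaluate to t, the coding system normalizes on read and
leaves names untouched on write, which is the behavior argued for above.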
