From debbugs-submit-bounces@debbugs.gnu.org Mon Dec 14 14:08:39 2015 Received: (at submit) by debbugs.gnu.org; 14 Dec 2015 19:08:40 +0000 Received: from localhost ([127.0.0.1]:51861 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1a8YUB-0001An-Ji for submit@debbugs.gnu.org; Mon, 14 Dec 2015 14:08:39 -0500 Received: from eggs.gnu.org ([208.118.235.92]:44936) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1a8YU9-0001AY-Qq for submit@debbugs.gnu.org; Mon, 14 Dec 2015 14:08:38 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1a8YU3-0003k8-8E for submit@debbugs.gnu.org; Mon, 14 Dec 2015 14:08:32 -0500 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=0.8 required=5.0 tests=BAYES_50,FREEMAIL_FROM, HTML_MESSAGE,T_DKIM_INVALID autolearn=disabled version=3.3.2 Received: from lists.gnu.org ([2001:4830:134:3::11]:56040) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1a8YU3-0003k4-4g for submit@debbugs.gnu.org; Mon, 14 Dec 2015 14:08:31 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:54932) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1a8YU1-0006q8-Pa for bug-gnu-emacs@gnu.org; Mon, 14 Dec 2015 14:08:31 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1a8YU0-0003jP-BM for bug-gnu-emacs@gnu.org; Mon, 14 Dec 2015 14:08:29 -0500 Received: from mail-vk0-x233.google.com ([2607:f8b0:400c:c05::233]:34986) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1a8YU0-0003jD-5l for bug-gnu-emacs@gnu.org; Mon, 14 Dec 2015 14:08:28 -0500 Received: by vkha189 with SMTP id a189so160143327vkh.2 for ; Mon, 14 Dec 2015 11:08:27 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:date:message-id:subject:from:to:content-type; bh=TOcZmc+LbCdqTfpFVJ2ITazkpenE9HhzrXvfb9lqEVQ=; 
b=tn1d15mfukLs1KOTxU+Bsc5y/ZbH/wMXWPOCu0GA8ikY1863l8pDz/S6YE1MQPLZtu deIhXFkzSAyQApM0GWpEOrXEQUrN17aM1dJcBHtS+MRuxy5autq7vS7MmtPnSeVENwC6 B2LBw4yz/dpZvF5KuXW9W0joKN9Dt6SwjEixq1dDneuOVtywHmopSU+w3y1YpRRxuoz2 CvD/2zlBWV83u/XUIwkg5BvzrEM1aqHfZcJ7NJ0gyxFqWfPYt+8/8DkAcRRDOf5GQrRv KKu2bdXQxzFN/QcKR6D8zjywaqk0dFPmg+nai8qO1qh8RrNRWiwv3j5KeZ1Ilgd37jtn 286Q== MIME-Version: 1.0 X-Received: by 10.31.10.199 with SMTP id 190mr26042851vkk.51.1450120107341; Mon, 14 Dec 2015 11:08:27 -0800 (PST) Received: by 10.31.210.133 with HTTP; Mon, 14 Dec 2015 11:08:27 -0800 (PST) Date: Mon, 14 Dec 2015 20:08:27 +0100 Message-ID: Subject: 25.0.50; File name compiletion doesn't work with non-ASCII characters on OS X From: Anders Lindgren To: bug-gnu-emacs@gnu.org Content-Type: multipart/alternative; boundary=001a11440176e13eb40526e06545 X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6.x X-Received-From: 2001:4830:134:3::11 X-Spam-Score: -4.0 (----) X-Debbugs-Envelope-To: submit X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -4.0 (----) --001a11440176e13eb40526e06545 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Hi! File name completion doesn't work with non-ASCII characters on OS X. 
Steps to repeat:

    In an empty directory:
    echo "alpha" > åäöalpha.txt
    echo "beta" > åäöbeta.txt
    Emacs -Q
    C-x C-f
    TAB          # Emacs correctly echoes "åäö"
    TAB          # Emacs incorrectly says "[No match]"

    C-g          # To get out of the previous C-x C-f
    C-x C-f
    å TAB        # Emacs incorrectly says "[No match]"

The OS X file system is a bit bizarre in that it stores filenames using the
decomposed UTF-8 format (for characters in specific ranges), this might
have something to do with this.

Emacs can open the files if the full name is manually specified. Also,
dired seems to work OK as well.

    -- Anders Lindgren

In GNU Emacs 25.0.50.1 (x86_64-apple-darwin10.8.0, NS appkit-1038.36 Version 10.6.8 (Build 10K549))
 of 2015-12-11
Repository revision: 83114ccf77d2a5d59fccbdbda6edefacce1b979e
Windowing system distributor 'Apple', version 10.3.1038
Configured using:
 'configure --without-dbus --with-ns'

Configured features:
ACL LIBXML2 ZLIB TOOLKIT_SCROLL_BARS NS

Important settings:
  value of $LC_CTYPE: UTF-8
  locale-coding-system: utf-8-unix

Major mode: Lisp Interaction

Minor modes in effect:
  tooltip-mode: t
  global-eldoc-mode: t
  electric-indent-mode: t
  mouse-wheel-mode: t
  tool-bar-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  blink-cursor-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  line-number-mode: t
  transient-mark-mode: t

Recent messages:
For information about GNU Emacs and the GNU system, type C-h C-a.
Making completion list...

Load-path shadows:
None found.

Features: (shadow sort gnus-util mail-extr emacsbug message dired format-spec rfc822 mml mml-sec mm-decode mm-bodies mm-encode mail-parse rfc2231 mailabbrev gmm-utils mailheader sendmail rfc2047 rfc2045 ietf-drums mm-util help-fns mail-prsvr mail-utils seq byte-opt gv bytecomp byte-compile cconv cl-extra help-mode easymenu cl-loaddefs pcase cl-lib time-date mule-util tooltip eldoc electric uniquify ediff-hook vc-hooks lisp-float-type mwheel ns-win term/common-win tool-bar dnd fontset image regexp-opt fringe tabulated-list newcomment elisp-mode lisp-mode prog-mode register page menu-bar rfn-eshadow timer select scroll-bar mouse jit-lock font-lock syntax facemenu font-core frame cl-generic cham georgian utf-8-lang misc-lang vietnamese tibetan thai tai-viet lao korean japanese eucjp-ms cp51932 hebrew greek romanian slovak czech european ethiopic indian cyrillic chinese charscript case-table epa-hook jka-cmpr-hook help simple abbrev minibuffer cl-preloaded nadvice loaddefs button faces cus-face macroexp files text-properties overlay sha1 md5 base64 format env code-pages mule custom widget hashtable-print-readable backquote cocoa ns multi-tty make-network-process emacs) Memory information: ((conses 16 86418 9491) (symbols 48 19326 0) (miscs 40 47 155) (strings 32 15120 4410) (string-bytes 1 446648) (vectors 16 11888) (vector-slots 8 415801 4426) (floats 8 150 89) (intervals 56 206 0) (buffers 976 12)) --001a11440176e13eb40526e06545 Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable
--001a11440176e13eb40526e06545-- From debbugs-submit-bounces@debbugs.gnu.org Mon Dec 14 14:20:08 2015 Received: (at 22169) by debbugs.gnu.org; 14 Dec 2015 19:20:08 +0000 Received: from localhost ([127.0.0.1]:51877 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1a8YfI-0001Ri-F8 for submit@debbugs.gnu.org; Mon, 14 Dec 2015 14:20:08 -0500 Received: from eggs.gnu.org ([208.118.235.92]:47949) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1a8YfG-0001RA-SI for 22169@debbugs.gnu.org; Mon, 14 Dec 2015 14:20:07 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1a8Yf7-0006RI-OL for 22169@debbugs.gnu.org; Mon, 14 Dec 2015 14:20:01 -0500 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=-0.0 required=5.0 tests=BAYES_40,T_RP_MATCHES_RCVD autolearn=disabled version=3.3.2 Received: from fencepost.gnu.org ([2001:4830:134:3::e]:37249) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1a8Yf7-0006RE-LK; Mon, 14 Dec 2015 14:19:57 -0500 Received: from 84.94.185.246.cable.012.net.il ([84.94.185.246]:1648 helo=HOME-C4E4A596F7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_128_CBC_SHA1:128) (Exim 4.82) (envelope-from ) id 1a8Yf6-0001xQ-SQ; Mon, 14 Dec 2015 14:19:57 -0500 Date: Mon, 14 Dec 2015 21:20:09 +0200 Message-Id: <83y4cw3kie.fsf@gnu.org> From: Eli Zaretskii To: Anders Lindgren In-reply-to: (message from Anders Lindgren on Mon, 14 Dec 2015 20:08:27 +0100) Subject: Re: bug#22169: 25.0.50; File name compiletion doesn't work with non-ASCII characters on OS X References: MIME-version: 1.0 Content-type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 2001:4830:134:3::e X-Spam-Score: -5.0 (-----) X-Debbugs-Envelope-To: 22169 Cc: 22169@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 
2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: Eli Zaretskii Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -5.0 (-----) > Date: Mon, 14 Dec 2015 20:08:27 +0100 > From: Anders Lindgren > > In an empty directory: > echo "alpha" > åäöalpha.txt > echo "beta" > åäöbeta.txt > Emacs -Q > C-x C-f > TAB # Emacs correctly echoes "åäö" > TAB # Emacs incorrectly says "[No match]" > > C-g # To get out of the previous C-x C-f > C-x C-f > å TAB # Emacs incorrectly says "[No match]" > > The OS X file system is a bit bizarre in that it stores filenames using the > decomposed UTF-8 format (for characters in specific ranges), this might have > something to do with this. This shouldn't cause any trouble, since Emacs encodes file names before passing them to readdir, and decodes the results. The encoding and decoding process should take care of the decomposition and composition. What is the value of file-name-coding-system on that system? 
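The round trip described above (decompose on the way to the file system, recompose on the way back) can be sketched outside Emacs. This is a minimal illustration in Python, not Emacs Lisp, and the helper names `encode_file_name` and `decode_file_name` are hypothetical. It assumes plain Unicode NFD and NFC as stand-ins for what the file-name coding system does; the OS X file system reportedly uses a variant of NFD that skips some character ranges, which this sketch ignores.

```python
import unicodedata

# Hypothetical helper names; these are NOT Emacs functions.
# Assumption: plain NFD/NFC approximate the file-name coding system's behaviour.

def encode_file_name(name: str) -> bytes:
    """Decompose the name (NFD) and encode it as UTF-8 before it reaches the file system."""
    return unicodedata.normalize("NFD", name).encode("utf-8")

def decode_file_name(raw: bytes) -> str:
    """Decode UTF-8 bytes from a directory listing and recompose them (NFC) for display."""
    return unicodedata.normalize("NFC", raw.decode("utf-8"))

typed = "\u00e5\u00e4\u00f6alpha.txt"   # "åäöalpha.txt" as the user types it (composed)
on_disk = encode_file_name(typed)       # decomposed bytes, as the file system might store them
shown = decode_file_name(on_disk)       # what the user should see again

# Character counts differ between the composed and the decomposed form.
print(len(typed), len(on_disk.decode("utf-8")), len(shown))
```

If every path into and out of the file system goes through such a pair, the user only ever sees composed names. The sketch says nothing about where in Emacs the reported failure occurs.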
From debbugs-submit-bounces@debbugs.gnu.org Mon Dec 14 15:50:18 2015 Received: (at submit) by debbugs.gnu.org; 14 Dec 2015 20:50:18 +0000 Received: from localhost ([127.0.0.1]:51936 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1a8a4Y-0003a8-Ew for submit@debbugs.gnu.org; Mon, 14 Dec 2015 15:50:18 -0500 Received: from eggs.gnu.org ([208.118.235.92]:47524) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1a8a4W-0003Zv-2a for submit@debbugs.gnu.org; Mon, 14 Dec 2015 15:50:16 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1a8a4Q-0005Xh-5J for submit@debbugs.gnu.org; Mon, 14 Dec 2015 15:50:10 -0500 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=0.8 required=5.0 tests=BAYES_50,FREEMAIL_FROM autolearn=disabled version=3.3.2 Received: from lists.gnu.org ([2001:4830:134:3::11]:51922) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1a8a4Q-0005Xd-2x for submit@debbugs.gnu.org; Mon, 14 Dec 2015 15:50:10 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:57524) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1a8a4P-0000KK-DD for bug-gnu-emacs@gnu.org; Mon, 14 Dec 2015 15:50:10 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1a8a4L-0005Wq-AR for bug-gnu-emacs@gnu.org; Mon, 14 Dec 2015 15:50:09 -0500 Received: from plane.gmane.org ([80.91.229.3]:41921) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1a8a4L-0005WT-44 for bug-gnu-emacs@gnu.org; Mon, 14 Dec 2015 15:50:05 -0500 Received: from list by plane.gmane.org with local (Exim 4.69) (envelope-from ) id 1a8a4E-0004iF-Aq for bug-gnu-emacs@gnu.org; Mon, 14 Dec 2015 21:49:58 +0100 Received: from c-68-39-146-59.hsd1.in.comcast.net ([68.39.146.59]) by main.gmane.org with esmtp (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Mon, 14 Dec 2015 21:49:58 +0100 Received: from random832 
by c-68-39-146-59.hsd1.in.comcast.net with local (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Mon, 14 Dec 2015 21:49:58 +0100 X-Injected-Via-Gmane: http://gmane.org/ To: bug-gnu-emacs@gnu.org From: Random832 Subject: Re: bug#22169: 25.0.50; File name compiletion doesn't work with non-ASCII characters on OS X Date: Mon, 14 Dec 2015 20:49:50 +0000 (UTC) Lines: 13 Message-ID: References: X-Complaints-To: usenet@ger.gmane.org X-Gmane-NNTP-Posting-Host: c-68-39-146-59.hsd1.in.comcast.net User-Agent: slrn/pre1.0.3-7 (Linux) X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6.x X-Received-From: 2001:4830:134:3::11 X-Spam-Score: -4.1 (----) X-Debbugs-Envelope-To: submit X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -4.1 (----) On 2015-12-14, Anders Lindgren wrote: > The OS X file system is a bit bizarre in that it stores filenames using the > decomposed UTF-8 format (for characters in specific ranges), this might > have something to do with this. Can you confirm whether: A) the contents of the minibuffer (or e.g. a dired buffer) are in decomposed format (i.e. can you backspace the accent marks separately with DEL)? B) the filenames are returned in decomposed format, normal UTF-8, or some other format from readdir()? C) Does installing the utf-8m encoding from Carbon Emacs solve the problem? 
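Questions A and B above come down to which normalization form a given string is in. A rough way to check that, again sketched in Python with hypothetical helper names, is to compare the string with its own NFC and NFD forms. The second half shows one way a form mismatch could lead to a "[No match]" result: a composed prefix never matches a decomposed name unless both sides are normalized first. This is an illustration of the general mechanism, not a diagnosis of the Emacs completion code.

```python
import unicodedata

# Hypothetical helpers for illustration only.

def normalization_form(name: str) -> str:
    """Classify a string as 'ascii', 'NFC', 'NFD', 'both', or 'mixed'."""
    if all(ord(ch) < 128 for ch in name):
        return "ascii"
    is_nfc = unicodedata.normalize("NFC", name) == name
    is_nfd = unicodedata.normalize("NFD", name) == name
    if is_nfc and is_nfd:
        return "both"    # non-ASCII characters that have no decomposition
    if is_nfc:
        return "NFC"
    if is_nfd:
        return "NFD"
    return "mixed"

def naive_matches(prefix: str, names: list) -> list:
    """Prefix match on the raw strings, with no normalization."""
    return [n for n in names if n.startswith(prefix)]

def normalized_matches(prefix: str, names: list) -> list:
    """Prefix match after bringing both sides to NFC."""
    wanted = unicodedata.normalize("NFC", prefix)
    return [n for n in names
            if unicodedata.normalize("NFC", n).startswith(wanted)]

# Names as a decomposing file system might return them.
listing = [unicodedata.normalize("NFD", n)
           for n in ("\u00e5\u00e4\u00f6alpha.txt", "\u00e5\u00e4\u00f6beta.txt")]

print(normalization_form(listing[0]))
print(len(naive_matches("\u00e5", listing)), len(normalized_matches("\u00e5", listing)))
```

The same comparison can be run on real directory entries by passing the result of `os.listdir()` to `normalization_form`.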
From debbugs-submit-bounces@debbugs.gnu.org Mon Dec 14 16:09:33 2015 Received: (at 22169) by debbugs.gnu.org; 14 Dec 2015 21:09:33 +0000 Received: from localhost ([127.0.0.1]:51943 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1a8aNB-00041A-1O for submit@debbugs.gnu.org; Mon, 14 Dec 2015 16:09:33 -0500 Received: from eggs.gnu.org ([208.118.235.92]:51997) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1a8aN8-00040y-NN for 22169@debbugs.gnu.org; Mon, 14 Dec 2015 16:09:30 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1a8aN0-0001ZQ-4v for 22169@debbugs.gnu.org; Mon, 14 Dec 2015 16:09:25 -0500 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00,T_RP_MATCHES_RCVD autolearn=disabled version=3.3.2 Received: from fencepost.gnu.org ([2001:4830:134:3::e]:38899) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1a8aN0-0001ZM-1c; Mon, 14 Dec 2015 16:09:22 -0500 Received: from 84.94.185.246.cable.012.net.il ([84.94.185.246]:1784 helo=HOME-C4E4A596F7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_128_CBC_SHA1:128) (Exim 4.82) (envelope-from ) id 1a8aMz-0000kz-C4; Mon, 14 Dec 2015 16:09:21 -0500 Date: Mon, 14 Dec 2015 23:09:34 +0200 Message-Id: <83twnk3fg1.fsf@gnu.org> From: Eli Zaretskii To: andlind@gmail.com In-reply-to: <83y4cw3kie.fsf@gnu.org> (message from Eli Zaretskii on Mon, 14 Dec 2015 21:20:09 +0200) Subject: Re: bug#22169: 25.0.50; File name compiletion doesn't work with non-ASCII characters on OS X References: <83y4cw3kie.fsf@gnu.org> X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 2001:4830:134:3::e X-Spam-Score: -5.0 (-----) X-Debbugs-Envelope-To: 22169 Cc: 22169@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: 
List-Post: List-Help: List-Subscribe: , Reply-To: Eli Zaretskii Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -5.0 (-----) > Date: Mon, 14 Dec 2015 21:20:09 +0200 > From: Eli Zaretskii > Cc: 22169@debbugs.gnu.org > > What is the value of file-name-coding-system on that system? And if that is nil, as it probably should, what is the value of default-file-name-coding-system? From debbugs-submit-bounces@debbugs.gnu.org Mon Dec 14 17:07:29 2015 Received: (at 22169) by debbugs.gnu.org; 14 Dec 2015 22:07:29 +0000 Received: from localhost ([127.0.0.1]:51961 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1a8bHE-0005Mn-RN for submit@debbugs.gnu.org; Mon, 14 Dec 2015 17:07:29 -0500 Received: from mail-vk0-f49.google.com ([209.85.213.49]:32819) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1a8bHC-0005Ma-U4 for 22169@debbugs.gnu.org; Mon, 14 Dec 2015 17:07:27 -0500 Received: by vkca188 with SMTP id a188so164621298vkc.0 for <22169@debbugs.gnu.org>; Mon, 14 Dec 2015 14:07:21 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=PqNeO0iMoHlRSGXinXaEg7114C8hmuD9CUJ1zYk6r8c=; b=bbfuTtnS+cfp6W150ICIuXcdIUnhYPVS+oYEDXG9mxaUB3pPbkI6SYcjcTsM2tNNxp R81eL/4MWbEoIQE8yY16J5ebKLviaP2yVF532vLdSmyC8c4Pz2Ma+eXUe5DKMzXp4ZJh +ZVEWAZeQ5nRRn/OWUyQMQnpLwMLOal22dbfO0WKGg25al6jMH8xphzzSCzFgPSxnJg+ FP820JDhwJ7o65tEucgpQCO0x+xhZrtqV6PaHXzeSC5Knn139EaJtIC/jLvnmJpgVtoZ v+CmiuV03ODL5au6FYY8Ajfde78p/OUUi59hoZ6Hp7peFJq9J5IlGN94F6Ie2Vog6PvI qunA== MIME-Version: 1.0 X-Received: by 10.31.54.134 with SMTP id d128mr12229127vka.26.1450130841558; Mon, 14 Dec 2015 14:07:21 -0800 (PST) Received: by 10.31.210.133 with HTTP; Mon, 14 Dec 2015 14:07:21 -0800 (PST) In-Reply-To: <83twnk3fg1.fsf@gnu.org> References: <83y4cw3kie.fsf@gnu.org> <83twnk3fg1.fsf@gnu.org> Date: Mon, 14 Dec 2015 23:07:21 +0100 
Message-ID: Subject: Re: bug#22169: 25.0.50; File name compiletion doesn't work with non-ASCII characters on OS X From: Anders Lindgren To: Eli Zaretskii Content-Type: multipart/alternative; boundary=001a11438ee8b068f50526e2e5fa X-Spam-Score: 0.4 (/) X-Debbugs-Envelope-To: 22169 Cc: 22169@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: 0.4 (/) --001a11438ee8b068f50526e2e5fa Content-Type: text/plain; charset=UTF-8 > > > What is the value of file-name-coding-system on that system? > file-name-coding-system utf-8-nfd > And if that is nil, as it probably should, what is the value of > default-file-name-coding-system? > default-file-name-coding-system utf-8 -- Anders --001a11438ee8b068f50526e2e5fa Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable
--001a11438ee8b068f50526e2e5fa-- From debbugs-submit-bounces@debbugs.gnu.org Mon Dec 14 17:41:26 2015 Received: (at 22169) by debbugs.gnu.org; 14 Dec 2015 22:41:26 +0000 Received: from localhost ([127.0.0.1]:51983 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1a8bo6-0006BF-CL for submit@debbugs.gnu.org; Mon, 14 Dec 2015 17:41:26 -0500 Received: from mail-vk0-f49.google.com ([209.85.213.49]:35154) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1a8bo5-0006B2-BR for 22169@debbugs.gnu.org; Mon, 14 Dec 2015 17:41:25 -0500 Received: by vkha189 with SMTP id a189so163972312vkh.2 for <22169@debbugs.gnu.org>; Mon, 14 Dec 2015 14:41:20 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:date:message-id:subject:from:to:content-type; bh=mZKe2qHQAWvv0fq9f0DF0Nx/qskcFek6EWw6GNWLuYU=; b=rMYSFTDqXmhYgrCLms+jqJodCs97gRVwVp/CJDSCY9XqGFZLiUpQu/t+FG5/u74ZeN kx1rcOs2MApSM4wwZwhAY8uXCLKIJYuzLorERnGOOALLoVcbssR2wCPl4sIL/bVZYz4h +4KhPMpfFR15pZfnOjWq9fdhnnn/wdh2HVYFTXcisu0uIVNKXpJUh42truV/ZvUDWcsA N/oiqFGxnnIOs5Sk7hQw0Bfn/t320jU4GTcBDkfgJPL5e4wjeAFhmcVHwNPL9Oxp9VYR zHg0nnCDc1et+3Q3lYFA3RofTX8ZnZW8XXd8zzIlu0qz9N33JQHMEqIdShk1Yzrb98Ok sq9Q== MIME-Version: 1.0 X-Received: by 10.31.10.199 with SMTP id 190mr26799789vkk.51.1450132879883; Mon, 14 Dec 2015 14:41:19 -0800 (PST) Received: by 10.31.210.133 with HTTP; Mon, 14 Dec 2015 14:41:19 -0800 (PST) Date: Mon, 14 Dec 2015 23:41:19 +0100 Message-ID: Subject: bug#22169: 25.0.50; File name compiletion doesn't work with non-ASCII ch From: Anders Lindgren To: random832@fastmail.com, 22169@debbugs.gnu.org Content-Type: multipart/alternative; boundary=001a114401762ec2b50526e35f37 X-Spam-Score: -0.7 (/) X-Debbugs-Envelope-To: 22169 X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org 
Sender: "Debbugs-submit" X-Spam-Score: -0.7 (/) --001a114401762ec2b50526e35f37 Content-Type: text/plain; charset=UTF-8 Hi! > Can you confirm whether: > >A) the contents of the minibuffer (or e.g. a dired buffer) are > in decomposed format (i.e. can you backspace the accent marks > separately with DEL)? When I backspace I delete a full character. >B) the filenames are returned in decomposed format, normal > UTF-8, or some other format from readdir()? The low-level C function? I'll see if I find some time in the next couple of days. >C) Does installing the utf-8m encoding from Carbon Emacs solve > the problem? Carbon Emacs? The one that YAMAMOTO Mitsuharu is maintaining? I haven't got it installed, but I can see if I can do it in the next couple of days. -- Anders --001a114401762ec2b50526e35f37 Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable
--001a114401762ec2b50526e35f37-- From debbugs-submit-bounces@debbugs.gnu.org Mon Dec 14 22:42:24 2015 Received: (at 22169) by debbugs.gnu.org; 15 Dec 2015 03:42:24 +0000 Received: from localhost ([127.0.0.1]:52073 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1a8gVM-0004e3-4z for submit@debbugs.gnu.org; Mon, 14 Dec 2015 22:42:24 -0500 Received: from eggs.gnu.org ([208.118.235.92]:48060) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1a8gVK-0004dr-QI for 22169@debbugs.gnu.org; Mon, 14 Dec 2015 22:42:22 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1a8gVA-0007b5-TR for 22169@debbugs.gnu.org; Mon, 14 Dec 2015 22:42:17 -0500 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00,T_RP_MATCHES_RCVD autolearn=disabled version=3.3.2 Received: from fencepost.gnu.org ([2001:4830:134:3::e]:44059) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1a8gVA-0007b1-QB; Mon, 14 Dec 2015 22:42:12 -0500 Received: from 84.94.185.246.cable.012.net.il ([84.94.185.246]:2176 helo=HOME-C4E4A596F7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_128_CBC_SHA1:128) (Exim 4.82) (envelope-from ) id 1a8gVA-0006Oh-4d; Mon, 14 Dec 2015 22:42:12 -0500 Date: Tue, 15 Dec 2015 05:42:26 +0200 Message-Id: <83oads2x99.fsf@gnu.org> From: Eli Zaretskii To: Anders Lindgren In-reply-to: (message from Anders Lindgren on Mon, 14 Dec 2015 23:07:21 +0100) Subject: Re: bug#22169: 25.0.50; File name compiletion doesn't work with non-ASCII characters on OS X References: <83y4cw3kie.fsf@gnu.org> <83twnk3fg1.fsf@gnu.org> X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 2001:4830:134:3::e X-Spam-Score: -5.0 (-----) X-Debbugs-Envelope-To: 22169 Cc: 22169@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: Eli Zaretskii Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -5.0 (-----) > Date: Mon, 14 Dec 2015 23:07:21 +0100 > From: Anders Lindgren > Cc: 22169@debbugs.gnu.org > > > What is the value of file-name-coding-system on that system? > > > file-name-coding-system > utf-8-nfd Where is that defined? Can you try using utf-8-hfs? (You might need to load ucs-normalize first.) From debbugs-submit-bounces@debbugs.gnu.org Tue Dec 15 00:12:54 2015 Received: (at 22169) by debbugs.gnu.org; 15 Dec 2015 05:12:54 +0000 Received: from localhost ([127.0.0.1]:52111 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1a8huw-0006fp-EF for submit@debbugs.gnu.org; Tue, 15 Dec 2015 00:12:54 -0500 Received: from mail-vk0-f42.google.com ([209.85.213.42]:36215) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1a8huv-0006fZ-04 for 22169@debbugs.gnu.org; Tue, 15 Dec 2015 00:12:53 -0500 Received: by vkay187 with SMTP id y187so168282061vka.3 for <22169@debbugs.gnu.org>; Mon, 14 Dec 2015 21:12:47 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=JhQQ8jKQwYU0a+txR3Z0TzzKzF4WJR5+Y826ynqzzd0=; b=medEFV1AYSL8+VNxaCS+Jb0FJpaxthaskzl/Ut5KGhjkqZsbgmxsMYpbJ8sK6NxlQx SJmdOmgqWJEZtuI9j1OdO2nYh5URz3s02Fbn8oGRfUWLnDusufnTnvHMsMP0D0sB+Srq vJs63ajrcpz7SzCzjosH+LgiLxNoSpYDL+W999tpNRQrGYnxx8iNOgQcUNPGXF8b6LK+ iegR7fNyEXwjHzm5aew4GU31XZcJH0HPof9nFXnoAH+nbB7Vf9oFT5Tlki2iph0eOB5L IdZPBq2iu2tpCpJlKrgP5SMJmLKUeSNx8fheXUdfreqiC4a1snWGJtoCEBSB/+0Gt4md GUqA== MIME-Version: 1.0 X-Received: by 10.31.138.20 with SMTP id m20mr27969960vkd.70.1450156367264; Mon, 14 Dec 2015 21:12:47 -0800 (PST) Received: by 10.31.210.133 with HTTP; Mon, 14 Dec 2015 21:12:47 -0800 (PST) In-Reply-To: <83oads2x99.fsf@gnu.org> References: 
<83y4cw3kie.fsf@gnu.org> <83twnk3fg1.fsf@gnu.org> <83oads2x99.fsf@gnu.org> Date: Tue, 15 Dec 2015 06:12:47 +0100 Message-ID: Subject: Re: bug#22169: 25.0.50; File name compiletion doesn't work with non-ASCII characters on OS X From: Anders Lindgren To: Eli Zaretskii Content-Type: multipart/alternative; boundary=001a1144f92223c2050526e8d75f X-Spam-Score: -0.7 (/) X-Debbugs-Envelope-To: 22169 Cc: 22169@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.7 (/) --001a1144f92223c2050526e8d75f Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable

> > file-name-coding-system
> > utf-8-nfd
>
> Where is that defined?

In ns-win.el

> Can you try using utf-8-hfs?  (You might need to load ucs-normalize
> first.)

It behaves like the original, as far as I can tell.

I tried setting it to nil. This made completion work. However, the letters
are presented in decomposed form, so that pressing backspace first converts
"å" to "a", a second backspace deletes the "a" -- this is not how we would
like to present file names to users.

    -- Anders

--001a1144f92223c2050526e8d75f Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable
--001a1144f92223c2050526e8d75f-- From debbugs-submit-bounces@debbugs.gnu.org Tue Dec 15 04:31:04 2015 Received: (at 22169) by debbugs.gnu.org; 15 Dec 2015 09:31:04 +0000 Received: from localhost ([127.0.0.1]:52199 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1a8lwm-0004CP-Ef for submit@debbugs.gnu.org; Tue, 15 Dec 2015 04:31:04 -0500 Received: from mx2.suse.de ([195.135.220.15]:58812) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1a8lwk-0004C7-Dj for 22169@debbugs.gnu.org; Tue, 15 Dec 2015 04:31:02 -0500 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay1.suse.de (charybdis-ext.suse.de [195.135.220.254]) by mx2.suse.de (Postfix) with ESMTP id 4DD93AC72; Tue, 15 Dec 2015 09:31:01 +0000 (UTC) From: Andreas Schwab To: Anders Lindgren Subject: Re: bug#22169: 25.0.50; File name compiletion doesn't work with non-ASCII characters on OS X References: <83y4cw3kie.fsf@gnu.org> <83twnk3fg1.fsf@gnu.org> <83oads2x99.fsf@gnu.org> X-Yow: Did you move a lot of KOREAN STEAK KNIVES this trip, Dingy? Date: Tue, 15 Dec 2015 10:31:00 +0100 In-Reply-To: (Anders Lindgren's message of "Tue, 15 Dec 2015 06:12:47 +0100") Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.5 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-Spam-Score: -0.7 (/) X-Debbugs-Envelope-To: 22169 Cc: Eli Zaretskii , 22169@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.7 (/) Anders Lindgren writes: > I tried setting it to nil. This made completion work. 
However, the letters > are presented in decomposed form, so that pressing backspace first converts > "å" to "a", a second backspace deletes the "a" -- this is not how we would > like to present file names to users. That's how composed characters work in Emacs. Andreas. -- Andreas Schwab, SUSE Labs, schwab@suse.de GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE 1748 E4D4 88E3 0EEA B9D7 "And now for something completely different." From debbugs-submit-bounces@debbugs.gnu.org Tue Dec 15 05:21:25 2015 Received: (at 22169) by debbugs.gnu.org; 15 Dec 2015 10:21:25 +0000 Received: from localhost ([127.0.0.1]:52221 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1a8mjV-0005OA-67 for submit@debbugs.gnu.org; Tue, 15 Dec 2015 05:21:25 -0500 Received: from mail-qk0-f181.google.com ([209.85.220.181]:33189) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1a8mjT-0005Nx-41 for 22169@debbugs.gnu.org; Tue, 15 Dec 2015 05:21:23 -0500 Received: by mail-qk0-f181.google.com with SMTP id k189so5406634qkc.0 for <22169@debbugs.gnu.org>; Tue, 15 Dec 2015 02:21:23 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=qvjydR43EmfAjquKEmHjto4/ppNITKgbEVgbB+0SMVU=; b=LVnP5pArNmaD2R/id2FYqcbwnyrVZVBmfIw08TAbDcyai+H6vAKkvz8KSO4gfTZ3a0 KjjtnoJzJrNG2kVXlL8qqDbSIXi+cQo12j+VZS3a5e7fTnH386nU9TGEuLGnV5yBFSFA N8G3ctdLZ5w7PvgLhcQn/M5yriuhecuBCcUhYL5XREojw4O+XDAttQcQ4BxKcBkVVG9U 3GYKGIW00qog7oQjOoBZ6VxPLiCt9mKg9Qw8w2TXeNzzwblAjbOQoDnzQxjjVjyFBGxy 0DmYsnInz5jhIgUPm2hdiI+KHHRftl+Uhs2fI4bEvg854shDL5AObkE7uH3WddyX4BIY +bNA== MIME-Version: 1.0 X-Received: by 10.13.227.130 with SMTP id m124mr22148512ywe.215.1450174877541; Tue, 15 Dec 2015 02:21:17 -0800 (PST) Received: by 10.37.88.69 with HTTP; Tue, 15 Dec 2015 02:21:17 -0800 (PST) In-Reply-To: References: <83y4cw3kie.fsf@gnu.org> <83twnk3fg1.fsf@gnu.org> <83oads2x99.fsf@gnu.org> Date: 
Tue, 15 Dec 2015 11:21:17 +0100 Message-ID: Subject: Re: bug#22169: 25.0.50; File name compiletion doesn't work with non-ASCII characters on OS X From: Anders Lindgren To: Andreas Schwab Content-Type: multipart/alternative; boundary=94eb2c0773ec702ae50526ed2650 X-Spam-Score: -0.7 (/) X-Debbugs-Envelope-To: 22169 Cc: Eli Zaretskii , 22169@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.7 (/) --94eb2c0773ec702ae50526ed2650 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable

> > I tried setting it to nil. This made completion work. However, the letters
> > are presented in decomposed form, so that pressing backspace first converts
> > "å" to "a", a second backspace deletes the "a" -- this is not how we would
> > like to present file names to users.
>
> That's how composed characters work in Emacs.

Andreas,

The OS X file system *stores* filenames in a decomposed manner, that is true. However, they should be presented to the user as normal (composed) characters. If `file-name-coding-system' has the original value, they are. However, the problem is that the completion mechanism fails to handle this case, which is a bug and it should be fixed.

    -- Anders

--94eb2c0773ec702ae50526ed2650 Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable
--94eb2c0773ec702ae50526ed2650-- From debbugs-submit-bounces@debbugs.gnu.org Tue Dec 15 10:58:07 2015 Received: (at 22169) by debbugs.gnu.org; 15 Dec 2015 15:58:07 +0000 Received: from localhost ([127.0.0.1]:52808 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1a8rzL-0006Lg-93 for submit@debbugs.gnu.org; Tue, 15 Dec 2015 10:58:07 -0500 Received: from eggs.gnu.org ([208.118.235.92]:51445) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1a8rzK-0006LD-BP for 22169@debbugs.gnu.org; Tue, 15 Dec 2015 10:58:06 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1a8rzA-0005q5-A4 for 22169@debbugs.gnu.org; Tue, 15 Dec 2015 10:58:01 -0500 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=-0.0 required=5.0 tests=BAYES_20,T_RP_MATCHES_RCVD autolearn=disabled version=3.3.2 Received: from fencepost.gnu.org ([2001:4830:134:3::e]:56162) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1a8rzA-0005q1-88; Tue, 15 Dec 2015 10:57:56 -0500 Received: from 84.94.185.246.cable.012.net.il ([84.94.185.246]:2876 helo=HOME-C4E4A596F7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_128_CBC_SHA1:128) (Exim 4.82) (envelope-from ) id 1a8rz9-00077Z-Ja; Tue, 15 Dec 2015 10:57:56 -0500 Date: Tue, 15 Dec 2015 17:58:10 +0200 Message-Id: <83io3z3drh.fsf@gnu.org> From: Eli Zaretskii To: Anders Lindgren In-reply-to: (message from Anders Lindgren on Tue, 15 Dec 2015 06:12:47 +0100) Subject: Re: bug#22169: 25.0.50; File name compiletion doesn't work with non-ASCII characters on OS X References: <83y4cw3kie.fsf@gnu.org> <83twnk3fg1.fsf@gnu.org> <83oads2x99.fsf@gnu.org> MIME-version: 1.0 Content-type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 2001:4830:134:3::e X-Spam-Score: -5.0 (-----) X-Debbugs-Envelope-To: 22169 Cc: 
22169@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: Eli Zaretskii Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -5.0 (-----)

> Date: Tue, 15 Dec 2015 06:12:47 +0100 > From: Anders Lindgren > Cc: 22169@debbugs.gnu.org > > > file-name-coding-system > > utf-8-nfd > > Where is that defined? > > In ns-win.el

I think we should remove that, and leave behind an alias that uses utf-8-hfs, which is provided by Emacs. There's no reason to maintain 2 identical definitions.

> I tried setting it to nil. This made completion work. However, the letters are > presented in decomposed form, so that pressing backspace first converts "å" to > "a", a second backspace deletes the "a" -- this is not how we would like to > present file names to users.

When you set file-name-coding-system to nil, Emacs uses default-file-name-coding-system, which is utf-8, so it doesn't compose/decompose characters, and that's why you see what you see. IOW, using nil is a step backward.

What does this return:

  M-: (file-name-all-completions "åäö" "/that/empty/directory/") RET

Also, what is your value of completion-ignore-case?
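
As an illustration of the composed versus decomposed distinction discussed above, here is a minimal Python sketch. It runs outside Emacs and only uses Python's standard `unicodedata` module; the variable names are made up for this example. It also assumes plain Unicode NFD, which for "å" should give the same result as the HFS+ variant of decomposition, although the two are not identical in general.

```python
import unicodedata

# "å" as a single precomposed code point (the NFC form users expect to see).
composed = "\u00e5"

# The canonically decomposed (NFD) form: "a" followed by a combining ring.
decomposed = unicodedata.normalize("NFD", composed)


def code_points(text):
    """Return the code points of TEXT as U+XXXX strings."""
    return [f"U+{ord(ch):04X}" for ch in text]


nfc_points = code_points(composed)
nfd_points = code_points(decomposed)

# Removing the last code point of the decomposed form drops only the
# combining mark, which is analogous to the "first backspace turns å
# into a" effect reported in this thread.
after_one_backspace = decomposed[:-1]

# Both forms are valid UTF-8, but they are different byte sequences,
# so a plain utf-8 coding system passes either one through unchanged.
nfc_bytes = composed.encode("utf-8")
nfd_bytes = decomposed.encode("utf-8")

if __name__ == "__main__":
    print(nfc_points, nfc_bytes)
    print(nfd_points, nfd_bytes)
    print(repr(after_one_backspace))
```

The point of the sketch is only that a coding system which neither composes nor decomposes hands the decomposed sequence to the editor as two separate characters.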
From debbugs-submit-bounces@debbugs.gnu.org Tue Dec 15 11:11:48 2015 Received: (at 22169) by debbugs.gnu.org; 15 Dec 2015 16:11:48 +0000 Received: from localhost ([127.0.0.1]:52840 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1a8sCX-0006gP-3F for submit@debbugs.gnu.org; Tue, 15 Dec 2015 11:11:48 -0500 Received: from eggs.gnu.org ([208.118.235.92]:55545) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1a8sCR-0006g8-HZ for 22169@debbugs.gnu.org; Tue, 15 Dec 2015 11:11:43 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1a8sCI-0000w1-5T for 22169@debbugs.gnu.org; Tue, 15 Dec 2015 11:11:34 -0500 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=-0.0 required=5.0 tests=BAYES_40,T_RP_MATCHES_RCVD autolearn=disabled version=3.3.2 Received: from fencepost.gnu.org ([2001:4830:134:3::e]:56387) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1a8sCI-0000vx-2w; Tue, 15 Dec 2015 11:11:30 -0500 Received: from 84.94.185.246.cable.012.net.il ([84.94.185.246]:2890 helo=HOME-C4E4A596F7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_128_CBC_SHA1:128) (Exim 4.82) (envelope-from ) id 1a8sCH-0001xj-DI; Tue, 15 Dec 2015 11:11:29 -0500 Date: Tue, 15 Dec 2015 18:11:45 +0200 Message-Id: <83egen3d4u.fsf@gnu.org> From: Eli Zaretskii To: Anders Lindgren In-reply-to: (message from Anders Lindgren on Tue, 15 Dec 2015 11:21:17 +0100) Subject: Re: bug#22169: 25.0.50; File name compiletion doesn't work with non-ASCII characters on OS X References: <83y4cw3kie.fsf@gnu.org> <83twnk3fg1.fsf@gnu.org> <83oads2x99.fsf@gnu.org> MIME-version: 1.0 Content-type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 2001:4830:134:3::e X-Spam-Score: -5.0 (-----) X-Debbugs-Envelope-To: 22169 Cc: schwab@suse.de, 22169@debbugs.gnu.org 
X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: Eli Zaretskii Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -5.0 (-----) > Date: Tue, 15 Dec 2015 11:21:17 +0100 > From: Anders Lindgren > Cc: Eli Zaretskii , 22169@debbugs.gnu.org > > > I tried setting it to nil. This made completion work. However, the > letters > > are presented in decomposed form, so that pressing backspace first > converts > > "å" to "a", a second backspace deletes the "a" -- this is not how we > would > > like to present file names to users. > > That's how composed characters work in Emacs. > > The OS X file system *stores* filenames in a decomposed manner, that is true. > However, they should be presented to the user as normal (composed) characters. > If `file-name-coding-system' has the original value, they are. However, the > problem is that the completion mechanism fails to handle this case, which is a > bug and it should be fixed. Andreas just pointed out that when character composition happens at display time, the behavior you observe is normal, that's all. The issue of why completion doesn't work still exists. 
From debbugs-submit-bounces@debbugs.gnu.org Tue Dec 15 14:16:09 2015 Received: (at 22169) by debbugs.gnu.org; 15 Dec 2015 19:16:09 +0000 Received: from localhost ([127.0.0.1]:52924 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1a8v4z-0002N4-AI for submit@debbugs.gnu.org; Tue, 15 Dec 2015 14:16:09 -0500 Received: from mail-vk0-f52.google.com ([209.85.213.52]:36140) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1a8v4y-0002Ms-Lh for 22169@debbugs.gnu.org; Tue, 15 Dec 2015 14:16:08 -0500 Received: by mail-vk0-f52.google.com with SMTP id y187so12126189vka.3 for <22169@debbugs.gnu.org>; Tue, 15 Dec 2015 11:16:08 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=StkTRIyapFUE44yxilJWDN7pIhugG7bVDo4RcU6oVj4=; b=IFbqpuXBCoA1rQvgp78k7Lzyfk5Uj8Dz4iU9YLJzAupD1xuT0g5Vsl4/pwoq8IPj0B ZCu7c9veVqdas4XEcM+D60Oi2Dx8VzqQMrmV9MJf9SIp3yySvqki2/tsWrzEjVy8Wzp7 wSE1Rwc1/2yvhzFx3nDWNZaa7Npe6bZ+PZIedgv4glALvTI3uiltRwI2XIIAjDEuUrTX mtN/vc+HPMP5fP9Eku0jdonuGXNc8PTpLj8vze0Ok5LHV3FVUhjMWjwv0EfNx8kSpeG9 HSArJROfHRceubVWgiVas+Inz1kZFkApD/n2RYXrhbUPbT5hZhZKUotcZRABkLnQWN66 AA5w== MIME-Version: 1.0 X-Received: by 10.31.138.20 with SMTP id m20mr30618022vkd.70.1450206963208; Tue, 15 Dec 2015 11:16:03 -0800 (PST) Received: by 10.31.210.133 with HTTP; Tue, 15 Dec 2015 11:16:03 -0800 (PST) In-Reply-To: <83io3z3drh.fsf@gnu.org> References: <83y4cw3kie.fsf@gnu.org> <83twnk3fg1.fsf@gnu.org> <83oads2x99.fsf@gnu.org> <83io3z3drh.fsf@gnu.org> Date: Tue, 15 Dec 2015 20:16:03 +0100 Message-ID: Subject: Re: bug#22169: 25.0.50; File name compiletion doesn't work with non-ASCII characters on OS X From: Anders Lindgren To: Eli Zaretskii Content-Type: multipart/alternative; boundary=001a1144f922e498250526f49e2e X-Spam-Score: -0.7 (/) X-Debbugs-Envelope-To: 22169 Cc: 22169@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org 
X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.7 (/) --001a1144f922e498250526f49e2e Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable

> I think we should remove that, and leave behind an alias that uses
> utf-8-hfs, which is provided by Emacs. There's no reason to maintain
> 2 identical definitions.

Sounds reasonable. The implementation is vastly different, so getting rid of one is definitely an improvement.

> When you set file-name-coding-system to nil, Emacs uses
> default-file-name-coding-system, which is utf-8, so it doesn't
> compose/decompose characters, and that's why you see what you see.
> IOW, using nil is a step backward.

I couldn't agree more!

> What does this return:
>
>   M-: (file-name-all-completions "åäö" "/that/empty/directory/") RET

It returns nil.

> Also, what is your value of completion-ignore-case?

It's nil.

Just out of curiosity -- how does `file-name-all-completions' work? Is the FILE argument encoded to decomposed form, is the file list converted to composed form, or is this handled by the comparison functions?

    -- Anders

--001a1144f922e498250526f49e2e Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable
--001a1144f922e498250526f49e2e-- From debbugs-submit-bounces@debbugs.gnu.org Tue Dec 15 14:56:34 2015 Received: (at 22169) by debbugs.gnu.org; 15 Dec 2015 19:56:34 +0000 Received: from localhost ([127.0.0.1]:52940 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1a8vi6-0003GY-MQ for submit@debbugs.gnu.org; Tue, 15 Dec 2015 14:56:34 -0500 Received: from eggs.gnu.org ([208.118.235.92]:41869) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1a8vi4-0003GL-Rq for 22169@debbugs.gnu.org; Tue, 15 Dec 2015 14:56:33 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1a8vhu-00048f-Ix for 22169@debbugs.gnu.org; Tue, 15 Dec 2015 14:56:27 -0500 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=0.8 required=5.0 tests=BAYES_50,T_RP_MATCHES_RCVD autolearn=disabled version=3.3.2 Received: from fencepost.gnu.org ([2001:4830:134:3::e]:59819) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1a8vhu-00048b-GC; Tue, 15 Dec 2015 14:56:22 -0500 Received: from 84.94.185.246.cable.012.net.il ([84.94.185.246]:3254 helo=HOME-C4E4A596F7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_128_CBC_SHA1:128) (Exim 4.82) (envelope-from ) id 1a8vht-00052y-JA; Tue, 15 Dec 2015 14:56:22 -0500 Date: Tue, 15 Dec 2015 21:56:37 +0200 Message-Id: <831tan32q2.fsf@gnu.org> From: Eli Zaretskii To: Anders Lindgren In-reply-to: (message from Anders Lindgren on Tue, 15 Dec 2015 20:16:03 +0100) Subject: Re: bug#22169: 25.0.50; File name compiletion doesn't work with non-ASCII characters on OS X References: <83y4cw3kie.fsf@gnu.org> <83twnk3fg1.fsf@gnu.org> <83oads2x99.fsf@gnu.org> <83io3z3drh.fsf@gnu.org> MIME-version: 1.0 Content-type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 2001:4830:134:3::e X-Spam-Score: -5.0 (-----) X-Debbugs-Envelope-To: 
22169 Cc: 22169@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: Eli Zaretskii Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -5.0 (-----)

> Date: Tue, 15 Dec 2015 20:16:03 +0100
> From: Anders Lindgren
> Cc: 22169@debbugs.gnu.org
>
> I think we should remove that, and leave behind an alias that uses
> utf-8-hfs, which is provided by Emacs. There's no reason to maintain
> 2 identical definitions.
>
> Sounds reasonable. The implementation is vastly different, so getting rid of
> one is definitely an improvement.

Can you write a patch to that effect, for emacs-25 branch?

> What does this return:
>
> M-: (file-name-all-completions "åäö" "/that/empty/directory/") RET
>
> It returns nil.

So this is the heart of the problem. I assume that if you do the same with an ASCII first argument, the result is non-nil, yes?

Then the next step is to step with a debugger through file_name_completion, and see why this returns nil instead of a list of files that begin with those characters.

> Just out of curiosity -- how does `file-name-all-completions' work? Is the FILE
> argument encoded to decomposed form, is the file list converted to composed
> form, or is this handled by the comparison functions?

See dired.c:file_name_completion. In a nutshell, we do this:

  . encode the file argument
  . encode the directory argument and pass it to opendir
  . loop calling readdir, and for each file name it returns:
    . if the file name begins with the same characters as the encoded
      file argument, then:
      . decode the file name
      . cons the decoded name onto the list to be returned

The above is for file-name-all-completions; for file-name-completion the last step is more complicated, but we should understand the file-name-all-completions case first.

When you step through the code, please pay attention to the encoded file names.
My guess is that somehow the call to scmp around line 500 fails, or maybe we don't count characters correctly in this case. Thanks. From debbugs-submit-bounces@debbugs.gnu.org Tue Dec 15 15:06:01 2015 Received: (at 22169) by debbugs.gnu.org; 15 Dec 2015 20:06:02 +0000 Received: from localhost ([127.0.0.1]:52947 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1a8vrF-0003Ug-MP for submit@debbugs.gnu.org; Tue, 15 Dec 2015 15:06:01 -0500 Received: from mail-vk0-f41.google.com ([209.85.213.41]:34333) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1a8vrE-0003UV-FJ for 22169@debbugs.gnu.org; Tue, 15 Dec 2015 15:06:00 -0500 Received: by mail-vk0-f41.google.com with SMTP id j66so13037983vkg.1 for <22169@debbugs.gnu.org>; Tue, 15 Dec 2015 12:06:00 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=htHsiY1cUC1UbSxqrQ8ZTllO0Rode6WJINGDPXvqNzg=; b=p9TBAEddRKZNGnjBwzwouvc07BIkbaP2WTKm9oDK8ulc5uddkHjTvwj5ND5QRubUEl FJvPlVgeJ4qRoNKAB2rc+4rlJB0+hhKHNjFSKzbGOKhWKLWpcCDi6/n4P+iyfR2JWFqW Y2wC1CFKM0gbsKN1uSAt/DSkebxKxD4K13qi3ly3U2BHC++u+5Wls5ia++IQoG12zePb bCWVJIeJxWG0t9bxJIJDKr0X59oNYwXGbTHgSPgCQBN6AiFN+ocZ8dPjUpqnvd572ynQ qQhPNFRVRIGSBtKHtEKame+CRjhJlTSuQGu5sNRF7xeqLTyiHHovAIypVBFCOYl9ayDn 5O1Q== MIME-Version: 1.0 X-Received: by 10.31.152.207 with SMTP id a198mr30620350vke.68.1450209955227; Tue, 15 Dec 2015 12:05:55 -0800 (PST) Received: by 10.31.210.133 with HTTP; Tue, 15 Dec 2015 12:05:55 -0800 (PST) In-Reply-To: <831tan32q2.fsf@gnu.org> References: <83y4cw3kie.fsf@gnu.org> <83twnk3fg1.fsf@gnu.org> <83oads2x99.fsf@gnu.org> <83io3z3drh.fsf@gnu.org> <831tan32q2.fsf@gnu.org> Date: Tue, 15 Dec 2015 21:05:55 +0100 Message-ID: Subject: Re: bug#22169: 25.0.50; File name compiletion doesn't work with non-ASCII characters on OS X From: Anders Lindgren To: Eli Zaretskii Content-Type: multipart/alternative; 
boundary=001a113d39e23b3a520526f5517d X-Spam-Score: -0.7 (/) X-Debbugs-Envelope-To: 22169 Cc: 22169@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.7 (/) --001a113d39e23b3a520526f5517d Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable

Hi,

> Can you write a patch to that effect, for emacs-25 branch?

We have to find the cause of the problem first. But once we do that, this should be straightforward.

> > What does this return:
> >
> > M-: (file-name-all-completions "åäö" "/that/empty/directory/") RET
> >
> > It returns nil.
>
> So this is the heart of the problem. I assume that if you do the same
> with an ASCII first argument, the result is non-nil, yes?

Yes.

> Then the next step is to step with a debugger through
> file_name_completion, and see why this returns nil instead of a list
> of files that begin with those characters.

Auhm, I'll see what I can do. I'm a family father and have very, very limited time, but I can see if I can find a time slot for it.

    -- Anders

--001a113d39e23b3a520526f5517d Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable
--001a113d39e23b3a520526f5517d-- From debbugs-submit-bounces@debbugs.gnu.org Tue Dec 15 16:54:06 2015 Received: (at submit) by debbugs.gnu.org; 15 Dec 2015 21:54:06 +0000 Received: from localhost ([127.0.0.1]:52998 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1a8xXq-0005ut-6P for submit@debbugs.gnu.org; Tue, 15 Dec 2015 16:54:06 -0500 Received: from eggs.gnu.org ([208.118.235.92]:37466) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1a8xXn-0005uO-OX for submit@debbugs.gnu.org; Tue, 15 Dec 2015 16:54:04 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1a8xXh-0005jR-KE for submit@debbugs.gnu.org; Tue, 15 Dec 2015 16:53:58 -0500 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=0.8 required=5.0 tests=BAYES_50,FREEMAIL_FROM autolearn=disabled version=3.3.2 Received: from lists.gnu.org ([2001:4830:134:3::11]:37641) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1a8xXh-0005jN-IQ for submit@debbugs.gnu.org; Tue, 15 Dec 2015 16:53:57 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:47470) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1a8xXg-0001Md-Bn for bug-gnu-emacs@gnu.org; Tue, 15 Dec 2015 16:53:57 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1a8xXd-0005j4-4I for bug-gnu-emacs@gnu.org; Tue, 15 Dec 2015 16:53:56 -0500 Received: from plane.gmane.org ([80.91.229.3]:41777) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1a8xXc-0005ir-To for bug-gnu-emacs@gnu.org; Tue, 15 Dec 2015 16:53:53 -0500 Received: from list by plane.gmane.org with local (Exim 4.69) (envelope-from ) id 1a8xXa-0003tX-6K for bug-gnu-emacs@gnu.org; Tue, 15 Dec 2015 22:53:50 +0100 Received: from c-68-39-146-59.hsd1.in.comcast.net ([68.39.146.59]) by main.gmane.org with esmtp (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Tue, 15 Dec 2015 
22:53:50 +0100 Received: from random832 by c-68-39-146-59.hsd1.in.comcast.net with local (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Tue, 15 Dec 2015 22:53:50 +0100 X-Injected-Via-Gmane: http://gmane.org/ To: bug-gnu-emacs@gnu.org From: Random832 Subject: Re: bug#22169: 25.0.50; File name compiletion doesn't work with non-ASCII characters on OS X Date: Tue, 15 Dec 2015 16:53:37 -0500 Lines: 38 Message-ID: <87d1u74bvi.fsf@fastmail.com> References: <83y4cw3kie.fsf@gnu.org> <83twnk3fg1.fsf@gnu.org> <83oads2x99.fsf@gnu.org> <83io3z3drh.fsf@gnu.org> <831tan32q2.fsf@gnu.org> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-Complaints-To: usenet@ger.gmane.org X-Gmane-NNTP-Posting-Host: c-68-39-146-59.hsd1.in.comcast.net User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.5 (gnu/linux) Cancel-Lock: sha1:5jZsbzwabUK+HddtVcn6JGpecHk= X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6.x X-Received-From: 2001:4830:134:3::11 X-Spam-Score: -4.1 (----) X-Debbugs-Envelope-To: submit X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -4.1 (----) Eli Zaretskii writes: > . encode the file argument > . encode the directory argument and pass it to opendir > . loop calling readdir, and for each file name it returns: > . if the file name begins with the same characters as the encoded > file argument, then: > . decode the file name > . cons the decoded name onto the list to be returned My guess from the symptoms is that utf-8-nfd doesn't actually bother to make any attempt to convert to decomposed form when encoding, since in *most* cases e.g. for opening a file, the underlying filesystem will take care of this automatically. 
This is backed up by the fact that, looking at the code, it apparently has a post-read-conversion but no matching pre-write-conversion. Anders Lindgren writes: > I tried setting it to nil. This made completion work. However, > the letters are presented in decomposed form, so that pressing > backspace first converts "å" to "a", a second backspace > deletes the "a" -- this is not how we would like to present > file names to users. Why? That _is_, for better or worse, the filename on the disk. On a non-OSX system, someone might actually have such a filename, distinct from the composed one. For that matter, what happens if an OSX user saves or opens a file on a non-HFS filesystem? Can Emacs handle the concept of different filesystems having different encodings? Ultimately, this isn't really an encoding - it is a destructive folding operation performed by the filesystem (the same as if, say, some filesystem stored filenames in all uppercase), and we've decided, for some reason, that we want the filenames back in what we've judged to be more likely to be the original form. 
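
The loop quoted above (encode the prefix, compare it against the raw names returned by readdir, decode the matches) can be sketched in Python to show what the guess in this message would look like in practice. This is an illustration of the hypothesized failure mode only, not the confirmed cause of the bug and not a transcription of dired.c. All function names are hypothetical, the directory contents are simulated as a list of byte strings, and plain NFD/NFC stand in for the HFS+ specific normalization.

```python
import unicodedata


def encode_plain(name):
    """Encode NAME as UTF-8 without decomposing it first (assumed faulty encoder)."""
    return name.encode("utf-8")


def encode_decomposing(name):
    """Decompose NAME to NFD, then encode it, as the file system stores it."""
    return unicodedata.normalize("NFD", name).encode("utf-8")


def decode_composing(raw):
    """Decode a raw directory entry and recompose it for presentation."""
    return unicodedata.normalize("NFC", raw.decode("utf-8"))


def all_completions(prefix, raw_entries, encode):
    """Return decoded names whose raw bytes start with the encoded PREFIX."""
    encoded_prefix = encode(prefix)
    return [
        decode_composing(raw)
        for raw in raw_entries
        if raw.startswith(encoded_prefix)
    ]


# Simulated readdir output: names stored in decomposed form.
raw_entries = [
    unicodedata.normalize("NFD", name).encode("utf-8")
    for name in ("\u00e5\u00e4\u00f6.txt", "abc.txt")
]

typed_prefix = "\u00e5\u00e4\u00f6"  # "åäö" as typed, in composed form

# Composed prefix bytes never match decomposed entry bytes.
without_decomposition = all_completions(typed_prefix, raw_entries, encode_plain)

# Decomposing the prefix before the byte comparison finds the file.
with_decomposition = all_completions(typed_prefix, raw_entries, encode_decomposing)

# ASCII prefixes are identical in both forms, so they match either way.
ascii_prefix = all_completions("ab", raw_entries, encode_plain)

if __name__ == "__main__":
    print(without_decomposition)
    print(with_decomposition)
    print(ascii_prefix)
```

Under these assumptions the sketch is consistent with the reported symptoms: an ASCII prefix completes, while a composed non-ASCII prefix finds nothing unless the encoder decomposes it first.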
From debbugs-submit-bounces@debbugs.gnu.org Tue Dec 15 22:32:52 2015 Received: (at 22169) by debbugs.gnu.org; 16 Dec 2015 03:32:52 +0000 Received: from localhost ([127.0.0.1]:53096 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1a92pg-0004yd-99 for submit@debbugs.gnu.org; Tue, 15 Dec 2015 22:32:52 -0500 Received: from eggs.gnu.org ([208.118.235.92]:45357) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1a92pe-0004yQ-S4 for 22169@debbugs.gnu.org; Tue, 15 Dec 2015 22:32:51 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1a92pW-000233-Kc for 22169@debbugs.gnu.org; Tue, 15 Dec 2015 22:32:45 -0500 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00,T_RP_MATCHES_RCVD autolearn=disabled version=3.3.2 Received: from fencepost.gnu.org ([2001:4830:134:3::e]:56298) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1a92pW-00022z-FN; Tue, 15 Dec 2015 22:32:42 -0500 Received: from 84.94.185.246.cable.012.net.il ([84.94.185.246]:3724 helo=HOME-C4E4A596F7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_128_CBC_SHA1:128) (Exim 4.82) (envelope-from ) id 1a92pU-0000vf-9T; Tue, 15 Dec 2015 22:32:41 -0500 Date: Wed, 16 Dec 2015 05:32:56 +0200 Message-Id: <83zixb1313.fsf@gnu.org> From: Eli Zaretskii To: Random832 In-reply-to: <87d1u74bvi.fsf@fastmail.com> (message from Random832 on Tue, 15 Dec 2015 16:53:37 -0500) Subject: Re: bug#22169: 25.0.50; File name compiletion doesn't work with non-ASCII characters on OS X References: <83y4cw3kie.fsf@gnu.org> <83twnk3fg1.fsf@gnu.org> <83oads2x99.fsf@gnu.org> <83io3z3drh.fsf@gnu.org> <831tan32q2.fsf@gnu.org> <87d1u74bvi.fsf@fastmail.com> X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 2001:4830:134:3::e X-Spam-Score: -5.0 (-----) X-Debbugs-Envelope-To: 22169 Cc: 22169@debbugs.gnu.org X-BeenThere: 
debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: Eli Zaretskii Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -5.0 (-----) > From: Random832 > Date: Tue, 15 Dec 2015 16:53:37 -0500 > > Eli Zaretskii writes: > > . encode the file argument > > . encode the directory argument and pass it to opendir > > . loop calling readdir, and for each file name it returns: > > . if the file name begins with the same characters as the encoded > > file argument, then: > > . decode the file name > > . cons the decoded name onto the list to be returned > > My guess from the symptoms is that utf-8-nfd doesn't actually > bother to make any attempt to convert to decomposed form when > encoding, since in *most* cases e.g. for opening a file, the > underlying filesystem will take care of this automatically. > > This is backed up by the fact that, looking at the code, it > apparently has a post-read-conversion but no matching > pre-write-conversion. I certainly see a pre-write-conversion function in ucs-normalize.el: ucs-normalize-hfs-nfd-pre-write-conversion which calls ucs-normalize-HFS-NFD-region. So I'm not sure I understand what you are saying. 
From debbugs-submit-bounces@debbugs.gnu.org Wed Dec 16 00:06:14 2015 Received: (at submit) by debbugs.gnu.org; 16 Dec 2015 05:06:14 +0000 Received: from localhost ([127.0.0.1]:53127 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1a94I2-00072v-Kk for submit@debbugs.gnu.org; Wed, 16 Dec 2015 00:06:14 -0500 Received: from eggs.gnu.org ([208.118.235.92]:41442) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1a94I0-00072i-9I for submit@debbugs.gnu.org; Wed, 16 Dec 2015 00:06:12 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1a94Ht-00024v-Uy for submit@debbugs.gnu.org; Wed, 16 Dec 2015 00:06:06 -0500 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=0.0 required=5.0 tests=BAYES_40,FREEMAIL_FROM autolearn=disabled version=3.3.2 Received: from lists.gnu.org ([2001:4830:134:3::11]:59782) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1a94Ht-00024q-Rh for submit@debbugs.gnu.org; Wed, 16 Dec 2015 00:06:05 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:60643) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1a94Ht-0001Rn-0c for bug-gnu-emacs@gnu.org; Wed, 16 Dec 2015 00:06:05 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1a94Hp-00024R-Mi for bug-gnu-emacs@gnu.org; Wed, 16 Dec 2015 00:06:04 -0500 Received: from plane.gmane.org ([80.91.229.3]:59732) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1a94Hp-00024N-G4 for bug-gnu-emacs@gnu.org; Wed, 16 Dec 2015 00:06:01 -0500 Received: from list by plane.gmane.org with local (Exim 4.69) (envelope-from ) id 1a94Hn-0001rl-N1 for bug-gnu-emacs@gnu.org; Wed, 16 Dec 2015 06:05:59 +0100 Received: from c-68-39-146-59.hsd1.in.comcast.net ([68.39.146.59]) by main.gmane.org with esmtp (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Wed, 16 Dec 2015 06:05:59 +0100 Received: from random832 
by c-68-39-146-59.hsd1.in.comcast.net with local (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Wed, 16 Dec 2015 06:05:59 +0100 X-Injected-Via-Gmane: http://gmane.org/ To: bug-gnu-emacs@gnu.org From: Random832 Subject: Re: bug#22169: 25.0.50; File name compiletion doesn't work with non-ASCII characters on OS X Date: Wed, 16 Dec 2015 00:05:40 -0500 Lines: 31 Message-ID: References: <83y4cw3kie.fsf@gnu.org> <83twnk3fg1.fsf@gnu.org> <83oads2x99.fsf@gnu.org> <83io3z3drh.fsf@gnu.org> <831tan32q2.fsf@gnu.org> <87d1u74bvi.fsf@fastmail.com> <83zixb1313.fsf@gnu.org> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-Complaints-To: usenet@ger.gmane.org X-Gmane-NNTP-Posting-Host: c-68-39-146-59.hsd1.in.comcast.net User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.5 (darwin) Cancel-Lock: sha1:QyBTg6q8lYUTuW7D8pA6jvuATv0= X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6.x X-Received-From: 2001:4830:134:3::11 X-Spam-Score: -4.1 (----) X-Debbugs-Envelope-To: submit X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -4.1 (----) Eli Zaretskii writes: > I certainly see a pre-write-conversion function in ucs-normalize.el: > ucs-normalize-hfs-nfd-pre-write-conversion which calls > ucs-normalize-HFS-NFD-region. So I'm not sure I understand what you > are saying. I was talking about the "utf-8-nfd" encoding in ns-win.el. I'd missed the statement that the problem had been reproduced with utf-8-hfs. I can't actually reproduce the problem myself with utf-8-hfs. I thought I had it once, but only immediately after switching from utf-8-nfd (maybe the bad completion result from utf-8-nfd was in some kind of cache?), and now I can't even reproduce that. 
Otherwise, it seems to fix the problem. Anders, can you try this again from a clean emacs -Q session, and in particular load ucs-normalize and set the coding system to utf-8-hfs _before_ attempting any completion? -- Incidentally, I do get one other bit of bizarre behavior associated with this. If I have multiple files that start with the same base letter and different (or no) accents, pressing TAB _deletes_ that letter. E.g. files: à1 á2 a3. C-x C-f a TAB, deletes the "a". I'd expect it to either offer all three filenames, or just a3. Why exactly does completion do matching with encoded prefix against raw filenames, rather than with unicode prefix against decoded filenames, anyway? From debbugs-submit-bounces@debbugs.gnu.org Wed Dec 16 05:17:39 2015 Received: (at 22169) by debbugs.gnu.org; 16 Dec 2015 10:17:39 +0000 Received: from localhost ([127.0.0.1]:53220 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1a999O-0005mf-S7 for submit@debbugs.gnu.org; Wed, 16 Dec 2015 05:17:39 -0500 Received: from eggs.gnu.org ([208.118.235.92]:42911) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1a999N-0005mR-W5 for 22169@debbugs.gnu.org; Wed, 16 Dec 2015 05:17:38 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1a999D-0002oy-Jd for 22169@debbugs.gnu.org; Wed, 16 Dec 2015 05:17:32 -0500 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=0.8 required=5.0 tests=BAYES_50,T_RP_MATCHES_RCVD autolearn=disabled version=3.3.2 Received: from fencepost.gnu.org ([2001:4830:134:3::e]:33789) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1a999D-0002ou-G1; Wed, 16 Dec 2015 05:17:27 -0500 Received: from 84.94.185.246.cable.012.net.il ([84.94.185.246]:4166 helo=HOME-C4E4A596F7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_128_CBC_SHA1:128) (Exim 4.82) (envelope-from ) id 1a999C-0002b4-Sm; Wed, 16 Dec 2015 05:17:27 -0500 
Date: Wed, 16 Dec 2015 12:17:44 +0200 Message-Id: <83wpse1yuv.fsf@gnu.org> From: Eli Zaretskii To: Random832 In-reply-to: (message from Random832 on Wed, 16 Dec 2015 00:05:40 -0500) Subject: Re: bug#22169: 25.0.50; File name compiletion doesn't work with non-ASCII characters on OS X References: <83y4cw3kie.fsf@gnu.org> <83twnk3fg1.fsf@gnu.org> <83oads2x99.fsf@gnu.org> <83io3z3drh.fsf@gnu.org> <831tan32q2.fsf@gnu.org> <87d1u74bvi.fsf@fastmail.com> <83zixb1313.fsf@gnu.org> MIME-version: 1.0 Content-type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 2001:4830:134:3::e X-Spam-Score: -5.0 (-----) X-Debbugs-Envelope-To: 22169 Cc: 22169@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: Eli Zaretskii Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -5.0 (-----) > From: Random832 > Date: Wed, 16 Dec 2015 00:05:40 -0500 > > Anders, can you try this again from a clean emacs -Q session, and in > particular load ucs-normalize and set the coding system to utf-8-hfs > _before_ attempting any completion? I certainly hope so, thanks for testing. > Incidentally, I do get one other bit of bizarre behavior > associated with this. If I have multiple files that start with > the same base letter and different (or no) accents, pressing TAB > _deletes_ that letter. E.g. files: à1 á2 a3. C-x C-f a TAB, > deletes the "a". I guess some code is not ready to cope with a list of candidate completions some of which don't match the string-to-complete. Can you spot which code causes the deletion, and whether that is somehow related to file-name-all-completions returning all the 3 file names in this case? > I'd expect it to either offer all three filenames, or just a3. 
It's not really clear what is correct behavior in this case. On other platforms Emacs will return only a3, but HFS+ stores decomposed characters precisely to allow all 3 to match. So I think we should at least cause Emacs return only a3, and ideally also support the other behavior as an option. Btw, why is completion-ignore-case nil on HFS+? I understand it's a case-insensitive file system, isn't it? > Why exactly does completion do matching with encoded prefix > against raw filenames, rather than with unicode prefix against > decoded filenames, anyway? Performance: we don't want to decode every file name that readdir returns. From debbugs-submit-bounces@debbugs.gnu.org Wed Dec 16 11:01:44 2015 Received: (at submit) by debbugs.gnu.org; 16 Dec 2015 16:01:44 +0000 Received: from localhost ([127.0.0.1]:53727 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1a9EWK-0006tq-KF for submit@debbugs.gnu.org; Wed, 16 Dec 2015 11:01:44 -0500 Received: from eggs.gnu.org ([208.118.235.92]:36046) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1a9EWG-0006tb-1S for submit@debbugs.gnu.org; Wed, 16 Dec 2015 11:01:39 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1a9EW8-0001Mf-5F for submit@debbugs.gnu.org; Wed, 16 Dec 2015 11:01:30 -0500 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=0.8 required=5.0 tests=BAYES_50,FREEMAIL_FROM autolearn=disabled version=3.3.2 Received: from lists.gnu.org ([2001:4830:134:3::11]:38071) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1a9EW8-0001MZ-3S for submit@debbugs.gnu.org; Wed, 16 Dec 2015 11:01:28 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:55206) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1a9EW3-0001YW-UU for bug-gnu-emacs@gnu.org; Wed, 16 Dec 2015 11:01:28 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) 
(envelope-from ) id 1a9EVy-0001Hy-0G for bug-gnu-emacs@gnu.org; Wed, 16 Dec 2015 11:01:23 -0500 Received: from plane.gmane.org ([80.91.229.3]:38096) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1a9EVx-0001Ho-PU for bug-gnu-emacs@gnu.org; Wed, 16 Dec 2015 11:01:17 -0500 Received: from list by plane.gmane.org with local (Exim 4.69) (envelope-from ) id 1a9EVw-0005RM-GD for bug-gnu-emacs@gnu.org; Wed, 16 Dec 2015 17:01:16 +0100 Received: from c-68-39-146-59.hsd1.in.comcast.net ([68.39.146.59]) by main.gmane.org with esmtp (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Wed, 16 Dec 2015 17:01:16 +0100 Received: from random832 by c-68-39-146-59.hsd1.in.comcast.net with local (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Wed, 16 Dec 2015 17:01:16 +0100 X-Injected-Via-Gmane: http://gmane.org/ To: bug-gnu-emacs@gnu.org From: Random832 Subject: Re: bug#22169: 25.0.50; File name compiletion doesn't work with non-ASCII characters on OS X Date: Wed, 16 Dec 2015 11:00:57 -0500 Lines: 55 Message-ID: <874mfimlhi.fsf@fastmail.com> References: <83y4cw3kie.fsf@gnu.org> <83twnk3fg1.fsf@gnu.org> <83oads2x99.fsf@gnu.org> <83io3z3drh.fsf@gnu.org> <831tan32q2.fsf@gnu.org> <87d1u74bvi.fsf@fastmail.com> <83zixb1313.fsf@gnu.org> <83wpse1yuv.fsf@gnu.org> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-Complaints-To: usenet@ger.gmane.org X-Gmane-NNTP-Posting-Host: c-68-39-146-59.hsd1.in.comcast.net User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.5 (gnu/linux) Cancel-Lock: sha1:AGEd1RkuYZjDg6UexGdaUC7h/6U= X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. 
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6.x X-Received-From: 2001:4830:134:3::11 X-Spam-Score: -4.1 (----) X-Debbugs-Envelope-To: submit X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -4.1 (----) Eli Zaretskii writes: > I guess some code is not ready to cope with a list of candidate > completions some of which don't match the string-to-complete. Can you > spot which code causes the deletion, and whether that is somehow > related to file-name-all-completions returning all the 3 file names in > this case? It's almost certainly related to that. I couldn't follow all the details about how the completion code works, but it looks like the entire design of completion-pcm--merge-completions is based around finding a common prefix and suffix in the returned strings irrespective of the originally entered text. >> I'd expect it to either offer all three filenames, or just a3. > > It's not really clear what is correct behavior in this case. On other > platforms Emacs will return only a3, but HFS+ stores decomposed > characters precisely to allow all 3 to match. So I think we should > at least cause Emacs return only a3, and ideally also support the > other behavior as an option. I'm not aware of any published rationale for the decision to store decomposed characters. (In my testing I did notice that zsh and bash handle globbing differently - all of the files match a* in bash but not zsh.) I think maybe lax matching as an option would be better than blindly doing comparisons based on the decomposed form. With letters with multiple diacritics, for example, the naïve behavior would mean that one of the one-diacritic forms would match and the other would not. 
If users really want that behavior they can after all just set the
file system encoding to utf-8 instead of utf-8-hfs.

> Btw, why is completion-ignore-case nil on HFS+?  I understand it's a
> case-insensitive file system, isn't it?

No idea. (IIRC In principle it's an option that can be disabled,
though it's case-insensitive by default)

I also feel like I should ask what provisions Emacs has for
filesystem-specific case folding - NTFS and HFS both have their
own algorithms which are different from each other and may both
be different from general-purpose case matching algorithms.

>> Why exactly does completion do matching with encoded prefix
>> against raw filenames, rather than with unicode prefix against
>> decoded filenames, anyway?
>
> Performance: we don't want to decode every file name that readdir
> returns.

I'm not sure there's a way around it if we want to be 100%
correct and consistent, given the existence of parts of the
completion system that do work with the strings in Unicode.
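For readers following along, the two behaviors discussed above (all of à1, á2 and a3 matching "a", versus only a3) can be reproduced outside Emacs. The following is only a rough sketch in Python, not Emacs code: it assumes that standard Unicode NFD is a close enough stand-in for the HFS+ decomposition, and the helper name prefix_matches is made up for illustration. It also shows the multi-diacritic case, where only one of the two one-diacritic prefixes matches.

```python
import unicodedata


def prefix_matches(prefix, names, form):
    """Return the names that start with prefix after normalizing both to form.

    Hypothetical helper for illustration only; form is "NFD" or "NFC".
    """
    normalized_prefix = unicodedata.normalize(form, prefix)
    return [
        name
        for name in names
        if unicodedata.normalize(form, name).startswith(normalized_prefix)
    ]


# The three files from the example: a-grave + "1", a-acute + "2", plain "a3".
names = ["\u00e01", "\u00e12", "a3"]

# Comparing decomposed forms: every name begins with a bare "a".
decomposed_hits = prefix_matches("a", names, "NFD")

# Comparing precomposed forms: only the unaccented name begins with "a".
composed_hits = prefix_matches("a", names, "NFC")

# A letter with two diacritics: u + diaeresis + macron (U+01D6).
# Its NFD form is u, U+0308, U+0304, so "u + diaeresis" is a prefix
# of it while "u + macron" is not.
double_accent = ["\u01d6"]
diaeresis_hits = prefix_matches("\u00fc", double_accent, "NFD")
macron_hits = prefix_matches("\u016b", double_accent, "NFD")
```

Here decomposed_hits holds all three names and composed_hits holds only a3, while the diaeresis prefix finds the doubly accented name and the macron prefix does not, which is the asymmetry that makes purely decomposition-based matching look arbitrary.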
From debbugs-submit-bounces@debbugs.gnu.org Wed Dec 16 12:21:57 2015 Received: (at 22169) by debbugs.gnu.org; 16 Dec 2015 17:21:57 +0000 Received: from localhost ([127.0.0.1]:53791 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1a9Fm1-0001vO-BZ for submit@debbugs.gnu.org; Wed, 16 Dec 2015 12:21:57 -0500 Received: from eggs.gnu.org ([208.118.235.92]:57241) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1a9Fm0-0001vC-CR for 22169@debbugs.gnu.org; Wed, 16 Dec 2015 12:21:56 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1a9Flq-00055a-8Z for 22169@debbugs.gnu.org; Wed, 16 Dec 2015 12:21:51 -0500 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=-0.5 required=5.0 tests=BAYES_05,T_RP_MATCHES_RCVD autolearn=disabled version=3.3.2 Received: from fencepost.gnu.org ([2001:4830:134:3::e]:40498) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1a9Flq-00055W-2m; Wed, 16 Dec 2015 12:21:46 -0500 Received: from 84.94.185.246.cable.012.net.il ([84.94.185.246]:4908 helo=HOME-C4E4A596F7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_128_CBC_SHA1:128) (Exim 4.82) (envelope-from ) id 1a9Flp-0003hy-EF; Wed, 16 Dec 2015 12:21:45 -0500 Date: Wed, 16 Dec 2015 19:22:03 +0200 Message-Id: <831tam1f7o.fsf@gnu.org> From: Eli Zaretskii To: Random832 In-reply-to: <874mfimlhi.fsf@fastmail.com> (message from Random832 on Wed, 16 Dec 2015 11:00:57 -0500) Subject: Re: bug#22169: 25.0.50; File name compiletion doesn't work with non-ASCII characters on OS X References: <83y4cw3kie.fsf@gnu.org> <83twnk3fg1.fsf@gnu.org> <83oads2x99.fsf@gnu.org> <83io3z3drh.fsf@gnu.org> <831tan32q2.fsf@gnu.org> <87d1u74bvi.fsf@fastmail.com> <83zixb1313.fsf@gnu.org> <83wpse1yuv.fsf@gnu.org> <874mfimlhi.fsf@fastmail.com> X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 2001:4830:134:3::e X-Spam-Score: -5.0 
(-----) X-Debbugs-Envelope-To: 22169 Cc: 22169@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: Eli Zaretskii Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -5.0 (-----) > From: Random832 > Date: Wed, 16 Dec 2015 11:00:57 -0500 > > > It's not really clear what is correct behavior in this case. On other > > platforms Emacs will return only a3, but HFS+ stores decomposed > > characters precisely to allow all 3 to match. So I think we should > > at least cause Emacs return only a3, and ideally also support the > > other behavior as an option. > > I'm not aware of any published rationale for the decision to > store decomposed characters. It cannot be anything other than the desire to support lax matches. > I think maybe lax matching as an option would be better than > blindly doing comparisons based on the decomposed form. It could be, if we had the lax matching implemented in C. But we currently only emulate that with complex regexps, and I think it's not a good idea to call that from dired.c. > I'm not sure there's a way around it if we want to be 100% > correct and consistent, given the existence of parts of the > completion system that do work with the strings in Unicode. I could come up with a patch if someone's interested to try it. I just want to hear first about the details of what happens in file_name_completion that causes file-name-all-completions return nil in the OP's case. There's got to be something that I'm missing here. 
From debbugs-submit-bounces@debbugs.gnu.org Wed Dec 16 13:19:44 2015 Received: (at submit) by debbugs.gnu.org; 16 Dec 2015 18:19:45 +0000 Received: from localhost ([127.0.0.1]:53815 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1a9Gfw-0003EU-Lx for submit@debbugs.gnu.org; Wed, 16 Dec 2015 13:19:44 -0500 Received: from eggs.gnu.org ([208.118.235.92]:42191) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1a9Gfv-0003EH-3f for submit@debbugs.gnu.org; Wed, 16 Dec 2015 13:19:43 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1a9Gfp-0002Co-1k for submit@debbugs.gnu.org; Wed, 16 Dec 2015 13:19:37 -0500 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=0.0 required=5.0 tests=BAYES_40,FREEMAIL_FROM autolearn=disabled version=3.3.2 Received: from lists.gnu.org ([2001:4830:134:3::11]:36835) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1a9Gfo-0002Cj-Uk for submit@debbugs.gnu.org; Wed, 16 Dec 2015 13:19:36 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:33158) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1a9Gfn-0003ee-IC for bug-gnu-emacs@gnu.org; Wed, 16 Dec 2015 13:19:36 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1a9Gfk-0002AR-Bd for bug-gnu-emacs@gnu.org; Wed, 16 Dec 2015 13:19:35 -0500 Received: from plane.gmane.org ([80.91.229.3]:49001) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1a9Gfk-0002AB-5F for bug-gnu-emacs@gnu.org; Wed, 16 Dec 2015 13:19:32 -0500 Received: from list by plane.gmane.org with local (Exim 4.69) (envelope-from ) id 1a9Gfi-0000V4-J9 for bug-gnu-emacs@gnu.org; Wed, 16 Dec 2015 19:19:30 +0100 Received: from c-68-39-146-59.hsd1.in.comcast.net ([68.39.146.59]) by main.gmane.org with esmtp (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Wed, 16 Dec 2015 19:19:30 +0100 Received: from random832 
by c-68-39-146-59.hsd1.in.comcast.net with local (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Wed, 16 Dec 2015 19:19:30 +0100 X-Injected-Via-Gmane: http://gmane.org/ To: bug-gnu-emacs@gnu.org From: Random832 Subject: Re: bug#22169: 25.0.50; File name compiletion doesn't work with non-ASCII characters on OS X Date: Wed, 16 Dec 2015 13:19:20 -0500 Lines: 38 Message-ID: <8760zyl0if.fsf@fastmail.com> References: <83y4cw3kie.fsf@gnu.org> <83twnk3fg1.fsf@gnu.org> <83oads2x99.fsf@gnu.org> <83io3z3drh.fsf@gnu.org> <831tan32q2.fsf@gnu.org> <87d1u74bvi.fsf@fastmail.com> <83zixb1313.fsf@gnu.org> <83wpse1yuv.fsf@gnu.org> <874mfimlhi.fsf@fastmail.com> <831tam1f7o.fsf@gnu.org> Mime-Version: 1.0 Content-Type: text/plain X-Complaints-To: usenet@ger.gmane.org X-Gmane-NNTP-Posting-Host: c-68-39-146-59.hsd1.in.comcast.net User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.5 (gnu/linux) Cancel-Lock: sha1:PyVvoAgYLPEv5/cY5OjH2Ek3stk= X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6.x X-Received-From: 2001:4830:134:3::11 X-Spam-Score: -4.1 (----) X-Debbugs-Envelope-To: submit X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -4.1 (----) Eli Zaretskii writes: >> I'm not aware of any published rationale for the decision to >> store decomposed characters. > > It cannot be anything other than the desire to support lax matches. Maybe. I half suspect it was just to make their case mapping table (which doesn't include entries for the precomposed characters) smaller. >> I think maybe lax matching as an option would be better than >> blindly doing comparisons based on the decomposed form. > > It could be, if we had the lax matching implemented in C. 
But we > currently only emulate that with complex regexps, and I think it's not > a good idea to call that from dired.c. Whether that ever gets implemented or not, what I meant to suggest is that a half-baked lax matching that only works for a small subset of situations and only on one platform is not a feature worth having at all. And if people really do want it they can have it today by setting the encoding to utf-8 and dealing with the backspacing weirdness. AFAICT the rationale for renormalizing filenames to NFC was that combining characters couldn't be *displayed* on Carbon Emacs, rather than there being anything especially undesirable about the backspacing behavior. > I could come up with a patch if someone's interested to try it. I > just want to hear first about the details of what happens in > file_name_completion that causes file-name-all-completions return nil > in the OP's case. There's got to be something that I'm missing here. Like I said, ns-win's utf-8-nfd doesn't normalize on encode. I've since confirmed this with encode-coding-string. I haven't been able to confirm that ucs-normalize's utf-8-hfs exhibits the problem behavior. 
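The failure mode described here (a coding system that decodes NFD to NFC but does not normalize on encode) can be modelled in a few lines. This is a hedged sketch in Python rather than the actual Emacs code path: the function names are invented for illustration, standard NFD stands in for the HFS+ variant, and the byte-wise prefix test only approximates matching an encoded prefix against raw directory entries.

```python
import unicodedata


def encode_without_normalizing(text):
    """Model of an encoder that passes precomposed text straight through."""
    return text.encode("utf-8")


def encode_with_nfd(text):
    """Model of an encoder that decomposes before encoding."""
    return unicodedata.normalize("NFD", text).encode("utf-8")


def complete(encoded_prefix, raw_names):
    """Byte-wise prefix match against raw (undecoded) file names."""
    return [name for name in raw_names if name.startswith(encoded_prefix)]


# What the user types in the minibuffer: precomposed a-ring, a-diaeresis,
# o-diaeresis.
typed = "\u00e5\u00e4\u00f6"

# What the directory listing hands back: the same name, stored decomposed.
on_disk = [unicodedata.normalize("NFD", typed + ".txt").encode("utf-8")]

# Without normalization the composed bytes never match the decomposed ones.
broken = complete(encode_without_normalizing(typed), on_disk)

# With normalization on encode the prefix lines up with the raw names.
fixed = complete(encode_with_nfd(typed), on_disk)
```

In this model broken is empty, matching the nil result reported earlier in the thread, and fixed contains the file, which is consistent with the observation that an encoder which does normalize avoids the problem.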
From debbugs-submit-bounces@debbugs.gnu.org Wed Dec 16 13:51:40 2015 Received: (at 22169) by debbugs.gnu.org; 16 Dec 2015 18:51:40 +0000 Received: from localhost ([127.0.0.1]:53833 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1a9HAq-0003xV-Kh for submit@debbugs.gnu.org; Wed, 16 Dec 2015 13:51:40 -0500 Received: from eggs.gnu.org ([208.118.235.92]:49811) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1a9HAp-0003xH-3f for 22169@debbugs.gnu.org; Wed, 16 Dec 2015 13:51:39 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1a9HAg-0001ut-5E for 22169@debbugs.gnu.org; Wed, 16 Dec 2015 13:51:34 -0500 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=-0.0 required=5.0 tests=BAYES_20,T_RP_MATCHES_RCVD autolearn=disabled version=3.3.2 Received: from fencepost.gnu.org ([2001:4830:134:3::e]:41843) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1a9HAg-0001up-2i; Wed, 16 Dec 2015 13:51:30 -0500 Received: from 84.94.185.246.cable.012.net.il ([84.94.185.246]:4948 helo=HOME-C4E4A596F7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_128_CBC_SHA1:128) (Exim 4.82) (envelope-from ) id 1a9HAf-0005rT-AC; Wed, 16 Dec 2015 13:51:29 -0500 Date: Wed, 16 Dec 2015 20:51:47 +0200 Message-Id: <83vb7yz0os.fsf@gnu.org> From: Eli Zaretskii To: Random832 In-reply-to: <8760zyl0if.fsf@fastmail.com> (message from Random832 on Wed, 16 Dec 2015 13:19:20 -0500) Subject: Re: bug#22169: 25.0.50; File name compiletion doesn't work with non-ASCII characters on OS X References: <83y4cw3kie.fsf@gnu.org> <83twnk3fg1.fsf@gnu.org> <83oads2x99.fsf@gnu.org> <83io3z3drh.fsf@gnu.org> <831tan32q2.fsf@gnu.org> <87d1u74bvi.fsf@fastmail.com> <83zixb1313.fsf@gnu.org> <83wpse1yuv.fsf@gnu.org> <874mfimlhi.fsf@fastmail.com> <831tam1f7o.fsf@gnu.org> <8760zyl0if.fsf@fastmail.com> X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] 
X-Received-From: 2001:4830:134:3::e X-Spam-Score: -5.0 (-----) X-Debbugs-Envelope-To: 22169 Cc: 22169@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: Eli Zaretskii Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -5.0 (-----) > From: Random832 > Date: Wed, 16 Dec 2015 13:19:20 -0500 > > Eli Zaretskii writes: > >> I'm not aware of any published rationale for the decision to > >> store decomposed characters. > > > > It cannot be anything other than the desire to support lax matches. > > Maybe. I half suspect it was just to make their case mapping > table (which doesn't include entries for the precomposed > characters) smaller. Only if they force decomposition in contexts that have nothing to do with file names. Otherwise, they will have to have those large case tables anyway, for other kinds of text, right? > AFAICT the rationale for renormalizing filenames to NFC was that > combining characters couldn't be *displayed* on Carbon Emacs, > rather than there being anything especially undesirable about > the backspacing behavior. It is generally easier and more convenient to have precomposed characters, yes. It's not an accident that no other filesystem does this kind of decomposition; Windows filesystems actually compose the characters, AFAIK. > > I could come up with a patch if someone's interested to try it. I > > just want to hear first about the details of what happens in > > file_name_completion that causes file-name-all-completions return nil > > in the OP's case. There's got to be something that I'm missing here. > > Like I said, ns-win's utf-8-nfd doesn't normalize on encode. > I've since confirmed this with encode-coding-string. I haven't > been able to confirm that ucs-normalize's utf-8-hfs exhibits the > problem behavior. Let's hope you are right. 
From debbugs-submit-bounces@debbugs.gnu.org Thu Dec 17 17:01:09 2015 Received: (at 22169) by debbugs.gnu.org; 17 Dec 2015 22:01:09 +0000 Received: from localhost ([127.0.0.1]:54811 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1a9gbk-0002UE-Uv for submit@debbugs.gnu.org; Thu, 17 Dec 2015 17:01:09 -0500 Received: from mail-vk0-f45.google.com ([209.85.213.45]:34761) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1a9gbj-0002Tp-42 for 22169@debbugs.gnu.org; Thu, 17 Dec 2015 17:01:07 -0500 Received: by mail-vk0-f45.google.com with SMTP id j66so54884978vkg.1 for <22169@debbugs.gnu.org>; Thu, 17 Dec 2015 14:01:07 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=tG+XEexpzyYkmCXBJdjwr61FrRJndpYCHW823hk/uVQ=; b=wIZnORYHJvl+o/3Iwb2FUzc2rL/vd1QaHuW/F3gzP0fBKq2ldTr3H4HamYXvm10Xda RZOIiM5EprPGrOtAx9kPWeoX13lBvbLKi6jz/lr/YDKOxQcsVyw9RNmO2SYPFbOzoby6 i4lDzRd7Tr+PXGPSsR5ngpCUwx34I/EWOiDZF0T2JbvceoFE77L+v212lydp8E34AgBe Xc0Xg0MV8heENa0yVC5PX6m0RvvkLIE9PmE241SIs20n+0FFh1GaH368FL9WI/+I226s iE4FQ9ygemy7xYzTYFLbUpPhtEDhTO0iJg0VpYujt+JLIuyyT064d309UaQaQzkT7z/7 pf4w== MIME-Version: 1.0 X-Received: by 10.31.138.20 with SMTP id m20mr88225vkd.70.1450389661235; Thu, 17 Dec 2015 14:01:01 -0800 (PST) Received: by 10.31.210.133 with HTTP; Thu, 17 Dec 2015 14:01:01 -0800 (PST) In-Reply-To: References: <83y4cw3kie.fsf@gnu.org> <83twnk3fg1.fsf@gnu.org> <83oads2x99.fsf@gnu.org> <83io3z3drh.fsf@gnu.org> <831tan32q2.fsf@gnu.org> Date: Thu, 17 Dec 2015 23:01:01 +0100 Message-ID: Subject: Re: bug#22169: 25.0.50; File name compiletion doesn't work with non-ASCII characters on OS X From: Anders Lindgren To: Eli Zaretskii Content-Type: multipart/mixed; boundary=001a1144f9228d29a205271f2885 X-Spam-Score: -0.7 (/) X-Debbugs-Envelope-To: 22169 Cc: 22169@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org 
X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.7 (/) --001a1144f9228d29a205271f2885 Content-Type: multipart/alternative; boundary=001a1144f9228d299e05271f2883 --001a1144f9228d299e05271f2883 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Hi! I think I have solved this. The current coding system defined in ns-win.el didn't work because it only provided a decode but no encode functions. After revisiting the "hfs" encoder, I managed to get it to work, this time. Below is a patch where I have dropped the old encoder and use the new instead. The only thing noteworthy is that `ucs-normalize' is loaded by loadup (when ns is used) and thus included in the dumped Emacs (if I understand correctly). Unless anybody objects, I'll push it in a couple of days. -- Anders On Tue, Dec 15, 2015 at 9:05 PM, Anders Lindgren wrote: > Hi, > > >> Can you write a patch to that effect, for emacs-25 branch? >> > > We have the find the cause of the problem first. But once we do that, thi= s > should be straight forward. > > > > What does this return: >> > >> > M-: (file-name-all-completion "=C3=A5=C3=A4=C3=B6" "/that/empty/di= rectory/") RET >> > >> > It returns nil. >> >> So this is the heart of the problem. I assume that if you do the same >> with an ASCII first argument, the result is non-nil, yes? >> > > Yes. > > > >> Then the next step is to step with a debugger through >> file_name_completion, and see why this returns nil instead of a list >> of files that begin. >> > > Auhm, I'll see what I can do. I'm a family father and have very, very, > limited time, but I can see in I can find a time slot for it. > > -- Anders > > --001a1144f9228d299e05271f2883 Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable
--001a1144f9228d299e05271f2883-- --001a1144f9228d29a205271f2885 Content-Type: text/plain; charset=US-ASCII; name="coding.diff" Content-Disposition: attachment; filename="coding.diff" Content-Transfer-Encoding: base64 X-Attachment-Id: f_iiasj1mk0 ZGlmZiAtLWdpdCBhL2xpc3AvbG9hZHVwLmVsIGIvbGlzcC9sb2FkdXAuZWwKaW5kZXggZjBjYWE4 Yi4uZGRhNDMzZSAxMDA2NDQKLS0tIGEvbGlzcC9sb2FkdXAuZWwKKysrIGIvbGlzcC9sb2FkdXAu ZWwKQEAgLTI3Niw2ICsyNzYsNyBAQAogKGlmIChmZWF0dXJlcCAnbnMpCiAgICAgKHByb2duCiAg ICAgICAobG9hZCAidGVybS9jb21tb24td2luIikKKyAgICAgIChsb2FkICJpbnRlcm5hdGlvbmFs L3Vjcy1ub3JtYWxpemUiKQogICAgICAgKGxvYWQgInRlcm0vbnMtd2luIikpKQogKGlmIChmYm91 bmRwICd4LWNyZWF0ZS1mcmFtZSkKICAgICA7OyBEbyBpdCBhZnRlciBsb2FkaW5nIHRlcm0vZm9v LXdpbi5lbCBzaW5jZSB0aGUgdmFsdWUgb2YgdGhlCmRpZmYgLS1naXQgYS9saXNwL3Rlcm0vbnMt d2luLmVsIGIvbGlzcC90ZXJtL25zLXdpbi5lbAppbmRleCAwYjNlM2JkLi45YmQ1OWZjIDEwMDY0 NAotLS0gYS9saXNwL3Rlcm0vbnMtd2luLmVsCisrKyBiL2xpc3AvdGVybS9ucy13aW4uZWwKQEAg LTUxLDYgKzUxLDcgQEAKIChyZXF1aXJlICdtZW51LWJhcikKIChyZXF1aXJlICdmb250c2V0KQog KHJlcXVpcmUgJ2RuZCkKKyhyZXF1aXJlICd1Y3Mtbm9ybWFsaXplKQogCiAoZGVmZ3JvdXAgbnMg bmlsCiAgICJHTlVzdGVwL01hYyBPUyBYIHNwZWNpZmljIGZlYXR1cmVzLiIKQEAgLTMzNywyOSAr MzM4LDEyIEBAIG5zLWRlbGV0ZS13b3JraW5nLXRleHQKICAgKHNldHEgbnMtd29ya2luZy1vdmVy bGF5IG5pbCkpCiAKIAotKGRlY2xhcmUtZnVuY3Rpb24gbnMtY29udmVydC11dGY4LW5mZC10by1u ZmMgIm5zZm5zLm0iIChzdHIpKQotCi07Ozs7IE9TIFggZmlsZSBzeXN0ZW0gVW5pY29kZSBVVEYt OCBORkQgKGRlY29tcG9zZWQgZm9ybSkgc3VwcG9ydAotOzsgTGlzcCBjb2RlIGJhc2VkIG9uIHV0 Zi04bS5lbCwgYnkgU2VpamkgWmVuaXRhbmksIEVpamkgSG9uam9oLCBhbmQKLTs7IENhcnN0ZW4g Qm9ybWFubi4KKzs7IE9TIFggZmlsZSBzeXN0ZW0gVW5pY29kZSBVVEYtOCBORkQgKGRlY29tcG9z ZWQgZm9ybSkgc3VwcG9ydC4KICh3aGVuIChlcSBzeXN0ZW0tdHlwZSAnZGFyd2luKQotICAoZGVm dW4gbnMtdXRmOC1uZmQtcG9zdC1yZWFkLWNvbnZlcnNpb24gKGxlbmd0aCkKLSAgICAiQ2FsbHMg YG5zLWNvbnZlcnQtdXRmOC1uZmQtdG8tbmZjJyB0byBjb21wb3NlIGNoYXIgc2VxdWVuY2VzLiIK LSAgICAoc2F2ZS1leGN1cnNpb24KLSAgICAgIChzYXZlLXJlc3RyaWN0aW9uCi0gICAgICAgIChu 
YXJyb3ctdG8tcmVnaW9uIChwb2ludCkgKCsgKHBvaW50KSBsZW5ndGgpKQotICAgICAgICAobGV0 ICgoc3RyIChidWZmZXItc3RyaW5nKSkpCi0gICAgICAgICAgKGRlbGV0ZS1yZWdpb24gKHBvaW50 LW1pbikgKHBvaW50LW1heCkpCi0gICAgICAgICAgKGluc2VydCAobnMtY29udmVydC11dGY4LW5m ZC10by1uZmMgc3RyKSkKLSAgICAgICAgICAoLSAocG9pbnQtbWF4KSAocG9pbnQtbWluKSkpKSkp Ci0KLSAgKGRlZmluZS1jb2Rpbmctc3lzdGVtICd1dGYtOC1uZmQKLSAgICAiVVRGLTggTkZEIChk ZWNvbXBvc2VkKSBlbmNvZGluZy4iCi0gICAgOmNvZGluZy10eXBlICd1dGYtOAotICAgIDptbmVt b25pYyA/VQotICAgIDpjaGFyc2V0LWxpc3QgJyh1bmljb2RlKQotICAgIDpwb3N0LXJlYWQtY29u dmVyc2lvbiAnbnMtdXRmOC1uZmQtcG9zdC1yZWFkLWNvbnZlcnNpb24pCi0gIChzZXQtZmlsZS1u YW1lLWNvZGluZy1zeXN0ZW0gJ3V0Zi04LW5mZCkpCisgIDs7IFVzZWQgcHJpb3IgdG8gRW1hY3Mg MjUuCisgIChkZWZpbmUtY29kaW5nLXN5c3RlbS1hbGlhcyAndXRmLTgtbmZkICd1dGYtOC1oZnMp CisKKyAgKHNldC1maWxlLW5hbWUtY29kaW5nLXN5c3RlbSAndXRmLTgtaGZzKSkKIAogOzs7OyBJ bnRlci1hcHAgY29tbXVuaWNhdGlvbnMgc3VwcG9ydC4KIApkaWZmIC0tZ2l0IGEvc3JjL25zZm5z Lm0gYi9zcmMvbnNmbnMubQppbmRleCBlZGMwMmU4Li41ZmE2OGMwIDEwMDY0NAotLS0gYS9zcmMv bnNmbnMubQorKysgYi9zcmMvbnNmbnMubQpAQCAtMjA5OSwzOSArMjA5OSw2IEBAIHRoZXJlIHdh cyBubyByZXN1bHQuICAqLykKIH0KIAogCi1ERUZVTiAoIm5zLWNvbnZlcnQtdXRmOC1uZmQtdG8t bmZjIiwgRm5zX2NvbnZlcnRfdXRmOF9uZmRfdG9fbmZjLAotICAgICAgIFNuc19jb252ZXJ0X3V0 ZjhfbmZkX3RvX25mYywgMSwgMSwgMCwKLSAgICAgICBkb2M6IC8qIFJldHVybiBhbiBORkMgc3Ry aW5nIHRoYXQgbWF0Y2hlcyB0aGUgVVRGLTggTkZEIHN0cmluZyBTVFIuICAqLykKLSAgICAgKExp c3BfT2JqZWN0IHN0cikKLXsKLS8qIFRPRE86IElmIEdOVXN0ZXAgZXZlciBpbXBsZW1lbnRzIHBy ZWNvbXBvc2VkU3RyaW5nV2l0aENhbm9uaWNhbE1hcHBpbmcsCi0gICAgICAgICByZW1vdmUgdGhp cy4gKi8KLSAgTlNTdHJpbmcgKnV0ZlN0cjsKLSAgTGlzcF9PYmplY3QgcmV0ID0gUW5pbDsKLSAg TlNBdXRvcmVsZWFzZVBvb2wgKnBvb2w7Ci0KLSAgQ0hFQ0tfU1RSSU5HIChzdHIpOwotICBwb29s ID0gW1tOU0F1dG9yZWxlYXNlUG9vbCBhbGxvY10gaW5pdF07Ci0gIHV0ZlN0ciA9IFtOU1N0cmlu ZyBzdHJpbmdXaXRoVVRGOFN0cmluZzogU1NEQVRBIChzdHIpXTsKLSNpZmRlZiBOU19JTVBMX0NP Q09BCi0gIGlmICh1dGZTdHIpCi0gICAgdXRmU3RyID0gW3V0ZlN0ciBwcmVjb21wb3NlZFN0cmlu 
Z1dpdGhDYW5vbmljYWxNYXBwaW5nXTsKLSNlbmRpZgotICBpZiAodXRmU3RyKQotICAgIHsKLSAg ICAgIGNvbnN0IGNoYXIgKmNzdHIgPSBbdXRmU3RyIFVURjhTdHJpbmddOwotICAgICAgaWYgKGNz dHIpCi0gICAgICAgIHJldCA9IGJ1aWxkX3N0cmluZyAoY3N0cik7Ci0gICAgfQotCi0gIFtwb29s IHJlbGVhc2VdOwotICBpZiAoTklMUCAocmV0KSkKLSAgICBlcnJvciAoIkludmFsaWQgVVRGLTgi KTsKLQotICByZXR1cm4gcmV0OwotfQotCi0KICNpZmRlZiBOU19JTVBMX0NPQ09BCiAKIC8qIENv bXBpbGUgYW5kIGV4ZWN1dGUgdGhlIEFwcGxlU2NyaXB0IFNDUklQVCBhbmQgcmV0dXJuIHRoZSBl cnJvcgpAQCAtMzIwNyw3ICszMTc0LDYgQEAgYmUgdXNlZCBhcyB0aGUgaW1hZ2Ugb2YgdGhlIGlj b24gcmVwcmVzZW50aW5nIHRoZSBmcmFtZS4gICovKTsKICAgZGVmc3ViciAoJlNuc19lbWFjc19p bmZvX3BhbmVsKTsKICAgZGVmc3ViciAoJlNuc19saXN0X3NlcnZpY2VzKTsKICAgZGVmc3ViciAo JlNuc19wZXJmb3JtX3NlcnZpY2UpOwotICBkZWZzdWJyICgmU25zX2NvbnZlcnRfdXRmOF9uZmRf dG9fbmZjKTsKICAgZGVmc3ViciAoJlNuc19wb3B1cF9mb250X3BhbmVsKTsKICAgZGVmc3ViciAo JlNuc19wb3B1cF9jb2xvcl9wYW5lbCk7CiAK --001a1144f9228d29a205271f2885-- From debbugs-submit-bounces@debbugs.gnu.org Thu Dec 17 21:46:53 2015 Received: (at submit) by debbugs.gnu.org; 18 Dec 2015 02:46:53 +0000 Received: from localhost ([127.0.0.1]:54901 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1a9l4G-0000aA-Ru for submit@debbugs.gnu.org; Thu, 17 Dec 2015 21:46:53 -0500 Received: from eggs.gnu.org ([208.118.235.92]:46835) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1a9l4F-0000Zy-DO for submit@debbugs.gnu.org; Thu, 17 Dec 2015 21:46:51 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1a9l49-0000aB-LC for submit@debbugs.gnu.org; Thu, 17 Dec 2015 21:46:46 -0500 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=0.8 required=5.0 tests=BAYES_50,FREEMAIL_FROM autolearn=disabled version=3.3.2 Received: from lists.gnu.org ([2001:4830:134:3::11]:44989) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1a9l49-0000a7-HZ for submit@debbugs.gnu.org; Thu, 17 Dec 2015 
21:46:45 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:37802) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1a9l48-0003nv-N1 for bug-gnu-emacs@gnu.org; Thu, 17 Dec 2015 21:46:45 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1a9l44-0000ZY-K4 for bug-gnu-emacs@gnu.org; Thu, 17 Dec 2015 21:46:44 -0500 Received: from plane.gmane.org ([80.91.229.3]:48040) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1a9l44-0000ZU-DH for bug-gnu-emacs@gnu.org; Thu, 17 Dec 2015 21:46:40 -0500 Received: from list by plane.gmane.org with local (Exim 4.69) (envelope-from ) id 1a9l3w-0000nO-Rc for bug-gnu-emacs@gnu.org; Fri, 18 Dec 2015 03:46:33 +0100 Received: from c-68-39-146-59.hsd1.in.comcast.net ([68.39.146.59]) by main.gmane.org with esmtp (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Fri, 18 Dec 2015 03:46:32 +0100 Received: from random832 by c-68-39-146-59.hsd1.in.comcast.net with local (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Fri, 18 Dec 2015 03:46:32 +0100 X-Injected-Via-Gmane: http://gmane.org/ To: bug-gnu-emacs@gnu.org From: Random832 Subject: Re: bug#22169: 25.0.50; File name compiletion doesn't work with non-ASCII characters on OS X Date: Thu, 17 Dec 2015 21:46:15 -0500 Lines: 33 Message-ID: References: <83y4cw3kie.fsf@gnu.org> <83twnk3fg1.fsf@gnu.org> <83oads2x99.fsf@gnu.org> <83io3z3drh.fsf@gnu.org> <831tan32q2.fsf@gnu.org> Mime-Version: 1.0 Content-Type: text/plain X-Complaints-To: usenet@ger.gmane.org X-Gmane-NNTP-Posting-Host: c-68-39-146-59.hsd1.in.comcast.net User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.5 (darwin) Cancel-Lock: sha1:E4KmY3fpb+KEmmS3PNhETKanJBo= X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. 
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6.x X-Received-From: 2001:4830:134:3::11 X-Spam-Score: -4.1 (----) X-Debbugs-Envelope-To: submit X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -4.1 (----)

Anders Lindgren writes:

> Below is a patch where I have dropped the old encoder and use the new
> instead. The only thing noteworthy is that `ucs-normalize' is loaded
> by loadup (when ns is used) and thus included in the dumped Emacs (if
> I understand correctly). Unless anybody objects, I'll push it in a
> couple of days.

Out of sheer morbid curiosity, I decided to see what happens if I create a filesystem with both NFC and NFD characters. (For thoroughness, I tested both colliding and non-colliding names, on FAT32 and NFS.)

On a FAT32 volume, Linux creates all of them fine, obviously. OSX completely fails to do anything meaningful with the files that are in NFD on disk: they are returned by readdir, but cannot be opened or statted (opening one that has a name collision with an NFC file opens the NFC file). To my understanding the same behavior would be present for SMB and UDF volumes. The filenames are normalized to NFD when returned by readdir, but only the filenames that are normalized to NFC on disk are accessible.

On NFS, the story is a bit more interesting. OSX does not perform any normalization on filenames on an NFS share. After being bitten by a similar bug in zsh's globbing, I was able to determine that Emacs is able to open and save files in both formats with utf-8-nfd (since encoding passes values through unchanged), but _not_ with utf-8-hfs.

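To make the notion of colliding names concrete: the same visible file name can be stored as two different code point sequences, one composed (NFC) and one decomposed (NFD). The following is a minimal sketch using Python's standard `unicodedata` module, purely for illustration. It is not what Emacs or OS X run internally, and `same_file_name` is a hypothetical helper introduced here.

```python
import unicodedata

# The same visible name in two normalization forms.
# NFC: each accented letter is a single precomposed code point.
nfc_name = unicodedata.normalize("NFC", "\u00e5\u00e4\u00f6.txt")
# NFD: each accented letter is a base letter plus a combining mark,
# roughly the form the thread says readdir returns on OS X.
nfd_name = unicodedata.normalize("NFD", "\u00e5\u00e4\u00f6.txt")


def same_file_name(a: str, b: str) -> bool:
    """Hypothetical helper: compare two names after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# The raw strings differ, both as code points and as UTF-8 bytes ...
print(nfc_name == nfd_name)                 # False
print(len(nfc_name.encode("utf-8")))        # 10
print(len(nfd_name.encode("utf-8")))        # 13
# ... but they denote the same name once normalized.
print(same_file_name(nfc_name, nfd_name))   # True
```

A file system that stores names byte for byte can therefore hold both forms side by side, while one that normalizes treats them as the same entry.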
Arguably, for the rare users who use NFS or other filesystems and work with characters whose representations differ, they can simply use the utf-8 encoding and be explicit about what filenames they want. It is something to be aware of, though. From debbugs-submit-bounces@debbugs.gnu.org Fri Dec 18 01:29:25 2015 Received: (at 22169) by debbugs.gnu.org; 18 Dec 2015 06:29:25 +0000 Received: from localhost ([127.0.0.1]:55049 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1a9oXc-0007S4-Sx for submit@debbugs.gnu.org; Fri, 18 Dec 2015 01:29:25 -0500 Received: from mail-vk0-f50.google.com ([209.85.213.50]:35903) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1a9oXb-0007Rs-7K for 22169@debbugs.gnu.org; Fri, 18 Dec 2015 01:29:23 -0500 Received: by mail-vk0-f50.google.com with SMTP id f2so21390574vkb.3 for <22169@debbugs.gnu.org>; Thu, 17 Dec 2015 22:29:23 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=lRsZyLGcyEnfVe5EsEYCdeP6sfvDZSVYOiLsp8EphYI=; b=QLBn6RrOM+a3OIfAVS57OONNMsyvMfMc1pWufq83fDZv/MHBy0F8yFRuj4O6MIVWl+ 07g0mwO/3A//e+v7jh3h0JxQ/3FLMd3FjhkDkN3ezszypEMOOQk621w5/MS0HYeUEutL E+NGUsYYf5K5kT4dh8zyJiu/vUc0cEefUu4keNsvKZDRawoB5IT8b4268V83GSVSAk79 UtURAJ05OBIONXQTOxk22HLAJhWkDZ+hHL5Ihk0ZmCywcUDgELGXYlW/NKDEt+T+4wBf iENBjQifrekuXsKLNqAvZUwbgHMQAi1IfZanLSZDtyLf/hbaAFR9pFO9xapL2xtxSea/ 03Ew== MIME-Version: 1.0 X-Received: by 10.31.10.199 with SMTP id 190mr1171826vkk.51.1450420157705; Thu, 17 Dec 2015 22:29:17 -0800 (PST) Received: by 10.31.210.133 with HTTP; Thu, 17 Dec 2015 22:29:17 -0800 (PST) In-Reply-To: References: <83y4cw3kie.fsf@gnu.org> <83twnk3fg1.fsf@gnu.org> <83oads2x99.fsf@gnu.org> <83io3z3drh.fsf@gnu.org> <831tan32q2.fsf@gnu.org> Date: Fri, 18 Dec 2015 07:29:17 +0100 Message-ID: Subject: Re: bug#22169: 25.0.50; File name compiletion doesn't work with non-ASCII characters on OS X 
From: Anders Lindgren To: 22169@debbugs.gnu.org Content-Type: multipart/alternative; boundary=001a114401764673c0052726422f X-Spam-Score: -0.7 (/) X-Debbugs-Envelope-To: 22169 Cc: random832@fastmail.com, Eli Zaretskii X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.7 (/)

--001a114401764673c0052726422f Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable

Hi!

I just realized that I missed parts of the ongoing discussion -- I was under the impression that I as OP should be CC'ed, but apparently I wasn't.

After reading through Random832's comments, I also see the problem with "åäö" and "aao" not being handled correctly. Typing "a TAB" makes Emacs delete the "a", which seems very confusing. Typing "å TAB" or "aa TAB" works, though. (Here `(file-name-all-completions "a" ".")' returns `("åäöfirst.txt" "aaosecond.txt")'.)

In other words, Emacs is in better shape with my patch than it was before, but there is still some work to be done.

When it comes to "lax" matching -- I really don't think we should use it for file names. I don't want to match "å" when I type "a" etc.

HFS+ file systems are case sensitive (It's possible this can be disabled, but if so it's very rarely used). However, many OS X desktop applications work hard to make this invisible to users. I think that we should keep `read-file-name-completion-ignore-case' as it is, as this corresponds to how files really are stored.

After giving this some thought, it feels like the file name matching should be done on decoded strings (so that an "a" doesn't match the "a" in a decomposed "å"). However, this is a major change and needs to be discussed further.

    -- Anders

On Thu, Dec 17, 2015 at 11:01 PM, Anders Lindgren wrote:
>
> Hi!
>
> I think I have solved this.
>
> The current coding system defined in ns-win.el didn't work because it only provided a decode but no encode functions.
>
> After revisiting the "hfs" encoder, I managed to get it to work, this time.
>
> Below is a patch where I have dropped the old encoder and use the new instead. The only thing noteworthy is that `ucs-normalize' is loaded by loadup (when ns is used) and thus included in the dumped Emacs (if I understand correctly). Unless anybody objects, I'll push it in a couple of days.
>
> -- Anders
>
> On Tue, Dec 15, 2015 at 9:05 PM, Anders Lindgren wrote:
>>
>> Hi,
>>
>>> Can you write a patch to that effect, for emacs-25 branch?
>>
>> We have to find the cause of the problem first. But once we do that, this should be straightforward.
>>
>>> > What does this return:
>>> >
>>> > M-: (file-name-all-completion "åäö" "/that/empty/directory/") RET
>>> >
>>> > It returns nil.
>>>
>>> So this is the heart of the problem. I assume that if you do the same
>>> with an ASCII first argument, the result is non-nil, yes?
>>
>> Yes.
>>
>>> Then the next step is to step with a debugger through
>>> file_name_completion, and see why this returns nil instead of a list
>>> of files that begin.
>>
>> Auhm, I'll see what I can do. I'm a family father and have very, very limited time, but I can see if I can find a time slot for it.
>>
>> -- Anders
>>
>

--001a114401764673c0052726422f Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable
--001a114401764673c0052726422f-- From debbugs-submit-bounces@debbugs.gnu.org Fri Dec 18 02:07:31 2015 Received: (at 22169) by debbugs.gnu.org; 18 Dec 2015 07:07:31 +0000 Received: from localhost ([127.0.0.1]:55069 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1a9p8U-0008OP-Qv for submit@debbugs.gnu.org; Fri, 18 Dec 2015 02:07:31 -0500 Received: from eggs.gnu.org ([208.118.235.92]:35834) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1a9p8U-0008OE-5B for 22169@debbugs.gnu.org; Fri, 18 Dec 2015 02:07:30 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1a9p8N-0006QD-Vu for 22169@debbugs.gnu.org; Fri, 18 Dec 2015 02:07:24 -0500 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=-0.0 required=5.0 tests=BAYES_40,T_RP_MATCHES_RCVD autolearn=disabled version=3.3.2 Received: from fencepost.gnu.org ([2001:4830:134:3::e]:50657) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1a9p8H-0006ML-3Z; Fri, 18 Dec 2015 02:07:17 -0500 Received: from 84.94.185.246.cable.012.net.il ([84.94.185.246]:1688 helo=HOME-C4E4A596F7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_128_CBC_SHA1:128) (Exim 4.82) (envelope-from ) id 1a9p8G-0006qY-BM; Fri, 18 Dec 2015 02:07:16 -0500 Date: Fri, 18 Dec 2015 09:07:39 +0200 Message-Id: <83r3ikxmis.fsf@gnu.org> From: Eli Zaretskii To: Anders Lindgren In-reply-to: (message from Anders Lindgren on Fri, 18 Dec 2015 07:29:17 +0100) Subject: Re: bug#22169: 25.0.50; File name compiletion doesn't work with non-ASCII characters on OS X References: <83y4cw3kie.fsf@gnu.org> <83twnk3fg1.fsf@gnu.org> <83oads2x99.fsf@gnu.org> <83io3z3drh.fsf@gnu.org> <831tan32q2.fsf@gnu.org> MIME-version: 1.0 Content-type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 2001:4830:134:3::e X-Spam-Score: -5.0 
(-----) X-Debbugs-Envelope-To: 22169 Cc: random832@fastmail.com, 22169@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: Eli Zaretskii Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -5.0 (-----)

> Date: Fri, 18 Dec 2015 07:29:17 +0100
> From: Anders Lindgren
> Cc: Eli Zaretskii , random832@fastmail.com
>
> After reading through Random832's comments, I also see the problem with "åäö"
> and "aao" not being handled correctly. Typing "a TAB" makes Emacs delete the
> "a", which seems very confusing. Typing "å TAB" or "aa TAB" works, though.
> (Here `(file-name-all-completions "a" ".")' returns `("åäöfirst.txt"
> "aaosecond.txt")'.)
>
> In other words, Emacs is in better shape with my patch than it was before, but there
> is still some work to be done.
>
> When it comes to "lax" matching -- I really don't think we should use it for
> file names. I don't want to match "å" when I type "a" etc.

I have an idea for a change that could solve this. I will post it in a day or two and ask you to try it.

> HFS+ file systems are case sensitive (It's possible this can be disabled, but
> if so it's very rarely used). However, many OS X desktop applications work hard
> to make this invisible to users. I think that we should keep
> `read-file-name-completion-ignore-case' as it is, as this corresponds to how
> files really are stored.

If that's what OS X users expect, fine with me.

> After giving this some thought, it feels like the file name matching should be
> done on decoded strings (so that an "a" doesn't match the "a" in a decomposed
> "å"). However, this is a major change and needs to be discussed further.

I rather think it's a non-starter, at least for Emacs 25.1. It probably means users of all systems will be punished by slower directory searches, on behalf of one peculiar filesystem.
Unless there's some clever idea that avoids decoding each file name returned by readdir, that is. Thanks. From debbugs-submit-bounces@debbugs.gnu.org Fri Dec 18 02:25:20 2015 Received: (at 22169) by debbugs.gnu.org; 18 Dec 2015 07:25:20 +0000 Received: from localhost ([127.0.0.1]:55096 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1a9pPk-0000Os-MG for submit@debbugs.gnu.org; Fri, 18 Dec 2015 02:25:20 -0500 Received: from eggs.gnu.org ([208.118.235.92]:39669) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1a9pPi-0000Of-Tu for 22169@debbugs.gnu.org; Fri, 18 Dec 2015 02:25:19 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1a9pPa-00027X-EJ for 22169@debbugs.gnu.org; Fri, 18 Dec 2015 02:25:13 -0500 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00,T_RP_MATCHES_RCVD autolearn=disabled version=3.3.2 Received: from fencepost.gnu.org ([2001:4830:134:3::e]:50903) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1a9pPa-00027T-B7; Fri, 18 Dec 2015 02:25:10 -0500 Received: from 84.94.185.246.cable.012.net.il ([84.94.185.246]:1699 helo=HOME-C4E4A596F7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_128_CBC_SHA1:128) (Exim 4.82) (envelope-from ) id 1a9pPZ-0005sh-6p; Fri, 18 Dec 2015 02:25:09 -0500 Date: Fri, 18 Dec 2015 09:25:32 +0200 Message-Id: <83h9jgxloz.fsf@gnu.org> From: Eli Zaretskii To: Anders Lindgren In-reply-to: (message from Anders Lindgren on Thu, 17 Dec 2015 23:01:01 +0100) Subject: Re: bug#22169: 25.0.50; File name compiletion doesn't work with non-ASCII characters on OS X References: <83y4cw3kie.fsf@gnu.org> <83twnk3fg1.fsf@gnu.org> <83oads2x99.fsf@gnu.org> <83io3z3drh.fsf@gnu.org> <831tan32q2.fsf@gnu.org> X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 2001:4830:134:3::e X-Spam-Score: -5.0 (-----) 
X-Debbugs-Envelope-To: 22169 Cc: 22169@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: Eli Zaretskii Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -5.0 (-----)

> Date: Thu, 17 Dec 2015 23:01:01 +0100
> From: Anders Lindgren
> Cc: 22169@debbugs.gnu.org
>
> Below is a patch where I have dropped the old encoder and use the new instead.
> The only thing noteworthy is that `ucs-normalize' is loaded by loadup (when ns
> is used) and thus included in the dumped Emacs (if I understand correctly).
> Unless anybody objects, I'll push it in a couple of days.

Looks good to me, with one comment:

> diff --git a/lisp/loadup.el b/lisp/loadup.el
> index f0caa8b..dda433e 100644
> --- a/lisp/loadup.el
> +++ b/lisp/loadup.el
> @@ -276,6 +276,7 @@
>  (if (featurep 'ns)
>      (progn
>        (load "term/common-win")
> +      (load "international/ucs-normalize")
>        (load "term/ns-win")))
>  (if (fboundp 'x-create-frame)
>      ;; Do it after loading term/foo-win.el since the value of the
> diff --git a/lisp/term/ns-win.el b/lisp/term/ns-win.el
> index 0b3e3bd..9bd59fc 100644
> --- a/lisp/term/ns-win.el
> +++ b/lisp/term/ns-win.el
> @@ -51,6 +51,7 @@
>  (require 'menu-bar)
>  (require 'fontset)
>  (require 'dnd)
> +(require 'ucs-normalize)

Why do you need the 'require' if loadup will unconditionally load ucs-normalize?

Thanks.

From debbugs-submit-bounces@debbugs.gnu.org Fri Dec 18 03:38:16 2015 Received: (at 22169) by debbugs.gnu.org; 18 Dec 2015 08:38:16 +0000 Received: from localhost ([127.0.0.1]:55108 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1a9qYK-00024s-EP for submit@debbugs.gnu.org; Fri, 18 Dec 2015 03:38:16 -0500 Received: from mail-vk0-f52.google.com ([209.85.213.52]:36315) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1a9qYI-00024g-MW for 22169@debbugs.gnu.org; Fri, 18 Dec 2015 03:38:15 -0500 Received: by mail-vk0-f52.google.com with SMTP id f2so22819892vkb.3 for <22169@debbugs.gnu.org>; Fri, 18 Dec 2015 00:38:14 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=eDmTxLW8aPtDn+NeU4DZPk4jBfk7bkQ6ZdwCK4Fwp6A=; b=FRqAfSpZLMx+i9cDIjPanX4vdFK5U0uqqDp4wDzp/yYeV0dEgks2mB6DYH5z7X9Cfc /dcvAb/S8esjUmQtIwD5aAmFqwWamcsJ+spiDSbGSf7YK3NSb4Ym4QptvAtiylpQe+fU UTfMglQUUAO9fDR2/96UKxQF0MQyZRMAXhBztz22pZaTBc46HctnO9Qxj2yiPCjZk7R6 WsPpdEpmQFtIHGcrI0+zEZH3+5x/dQPHjTlB6VxlQSgwIXQqvgRmkfm+mnyVrGWjM7rd TuorkhVbZfHVO+WvATQEeVGvG9WPBAUofTRLCZ9IBV1z3GwMXegtp+gsgOaaVyl77I8j /F/g== MIME-Version: 1.0 X-Received: by 10.31.58.74 with SMTP id h71mr1469408vka.149.1450427889081; Fri, 18 Dec 2015 00:38:09 -0800 (PST) Received: by 10.31.210.133 with HTTP; Fri, 18 Dec 2015 00:38:08 -0800 (PST) In-Reply-To: <83h9jgxloz.fsf@gnu.org> References: <83y4cw3kie.fsf@gnu.org> <83twnk3fg1.fsf@gnu.org> <83oads2x99.fsf@gnu.org> <83io3z3drh.fsf@gnu.org> <831tan32q2.fsf@gnu.org> <83h9jgxloz.fsf@gnu.org> Date: Fri, 18 Dec 2015 09:38:08 +0100 Message-ID: Subject: Re: bug#22169: 25.0.50; File name compiletion doesn't work with non-ASCII characters on OS X From: Anders Lindgren To: Eli Zaretskii Content-Type: multipart/alternative; boundary=001a114405c619e3360527280ff7 X-Spam-Score: -0.7 (/) X-Debbugs-Envelope-To: 22169 Cc: 22169@debbugs.gnu.org 
X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.7 (/)

--001a114405c619e3360527280ff7 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable

> > Below is a patch where I have dropped the old encoder and use the new instead.
> > The only thing noteworthy is that `ucs-normalize' is loaded by loadup (when ns
> > is used) and thus included in the dumped Emacs (if I understand correctly).
> > Unless anybody objects, I'll push it in a couple of days.
>
> Looks good to me, with one comment:
>
> > diff --git a/lisp/loadup.el b/lisp/loadup.el
> > index f0caa8b..dda433e 100644
> > --- a/lisp/loadup.el
> > +++ b/lisp/loadup.el
> > @@ -276,6 +276,7 @@
> >  (if (featurep 'ns)
> >      (progn
> >        (load "term/common-win")
> > +      (load "international/ucs-normalize")
> >        (load "term/ns-win")))
> >  (if (fboundp 'x-create-frame)
> >      ;; Do it after loading term/foo-win.el since the value of the
> > diff --git a/lisp/term/ns-win.el b/lisp/term/ns-win.el
> > index 0b3e3bd..9bd59fc 100644
> > --- a/lisp/term/ns-win.el
> > +++ b/lisp/term/ns-win.el
> > @@ -51,6 +51,7 @@
> >  (require 'menu-bar)
> >  (require 'fontset)
> >  (require 'dnd)
> > +(require 'ucs-normalize)
>
> Why do you need the 'require' if loadup will unconditionally load
> ucs-normalize?

I was just trying to follow the pattern in ns-win.el, there are a number of requires at the beginning, after a comment saying ";; Documentation-purposes only: actually loaded in loadup.el."

I can easily drop the line, if you think it's better.

> > After giving this some thought, it feels like the file name matching should be
> > done on decoded strings (so that an "a" doesn't match the "a" in a decomposed
> > "å"). However, this is a major change and needs to be discussed further.
>
> I rather think it's a non-starter, at least for Emacs 25.1. It
> probably means users of all systems will be punished by slower
> directory searches, on behalf of one peculiar filesystem. Unless
> there's some clever idea that avoids decoding each file name returned
> by readdir, that is.

The eternal question of correctness versus speed...

My gut feeling is that the time it takes to decode the file names is dwarfed by the time it takes to read the file list from the hard disk (this needs to be verified, of course). In addition, for systems like Linux, encoding and decoding are no-ops (as both the source and destination are UTF-8), so there won't be a penalty there.

I agree that this is not a project for Emacs 25.1 -- however, I think that we should at least explore this for future versions. I suggest that we push the current patch (after dropping the `require' line), close the current issue, and post a new bug report suggesting performing the completion on decoded strings.

    -- Anders

--001a114405c619e3360527280ff7 Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable
--001a114405c619e3360527280ff7-- From debbugs-submit-bounces@debbugs.gnu.org Fri Dec 18 04:15:20 2015 Received: (at 22169) by debbugs.gnu.org; 18 Dec 2015 09:15:20 +0000 Received: from localhost ([127.0.0.1]:55124 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1a9r8C-00033N-02 for submit@debbugs.gnu.org; Fri, 18 Dec 2015 04:15:20 -0500 Received: from eggs.gnu.org ([208.118.235.92]:37142) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1a9r8A-00033A-9i for 22169@debbugs.gnu.org; Fri, 18 Dec 2015 04:15:18 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1a9r81-0002kf-No for 22169@debbugs.gnu.org; Fri, 18 Dec 2015 04:15:12 -0500 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=0.8 required=5.0 tests=BAYES_50,T_RP_MATCHES_RCVD autolearn=disabled version=3.3.2 Received: from fencepost.gnu.org ([2001:4830:134:3::e]:52819) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1a9r81-0002ka-JQ; Fri, 18 Dec 2015 04:15:09 -0500 Received: from 84.94.185.246.cable.012.net.il ([84.94.185.246]:1722 helo=HOME-C4E4A596F7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_128_CBC_SHA1:128) (Exim 4.82) (envelope-from ) id 1a9r7y-0001Qg-Df; Fri, 18 Dec 2015 04:15:09 -0500 Date: Fri, 18 Dec 2015 11:15:29 +0200 Message-Id: <837fkcxglq.fsf@gnu.org> From: Eli Zaretskii To: Anders Lindgren In-reply-to: (message from Anders Lindgren on Fri, 18 Dec 2015 09:38:08 +0100) Subject: Re: bug#22169: 25.0.50; File name compiletion doesn't work with non-ASCII characters on OS X References: <83y4cw3kie.fsf@gnu.org> <83twnk3fg1.fsf@gnu.org> <83oads2x99.fsf@gnu.org> <83io3z3drh.fsf@gnu.org> <831tan32q2.fsf@gnu.org> <83h9jgxloz.fsf@gnu.org> X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 2001:4830:134:3::e X-Spam-Score: -5.0 (-----) X-Debbugs-Envelope-To: 22169 Cc: 22169@debbugs.gnu.org 
X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: Eli Zaretskii Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -5.0 (-----)

> Date: Fri, 18 Dec 2015 09:38:08 +0100
> From: Anders Lindgren
> Cc: 22169@debbugs.gnu.org
>
> I was just trying to follow the pattern in ns-win.el, there are a number of
> requires at the beginning, after a comment saying ";; Documentation-purposes
> only: actually loaded in loadup.el."
>
> I can easily drop the line, if you think it's better.

I see other files do the same, so I'm probably missing something here.
Let's leave that as you wrote it.

> > I rather think it's a non-starter, at least for Emacs 25.1. It
> > probably means users of all systems will be punished by slower
> > directory searches, on behalf of one peculiar filesystem. Unless
> > there's some clever idea that avoids decoding each file name returned
> > by readdir, that is.
>
> The eternal question of correctness versus speed...

No, it's correctness on one platform vs speed on all the rest.

> My gut feeling is that the time it takes to decode the file names is dwarfed by
> the time it takes to read the file list from the harddisk (this needs to be
> verified, of course).

I think you should time this. My gut feeling is the other way around,
for several reasons:

  . reading file entries in a directory is essentially a system call,
    that is usually highly optimized code
  . modern OSes cache this stuff, so you can do that without ever
    hitting the disk
  . many modern machines have SSDs (mine does), where disk drive
    accesses, even when they are needed, are very fast
  . by contrast, decoding a non-trivial encoding might take many CPU
    cycles, especially in the utf-8-hfs case, where we call Lisp as
    part of that

Nevertheless, my gut feeling could also be false. We should time that.
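A rough way to take such a measurement is sketched here. It was not posted in this thread and is only an illustration under stated assumptions: `count_utf8_codepoints` and `time_dir_scan` are hypothetical names, and the "decoding" step is a trivial stand-in (counting UTF-8 code points), far cheaper than what Emacs does for utf-8-hfs, so it can only give a lower bound on the decoding cost.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <dirent.h>
#include <stddef.h>
#include <time.h>

/* Stand-in for "decoding" (an assumption, not Emacs's decoder): count
   the code points in a UTF-8 byte string by skipping continuation
   bytes, which have the bit pattern 10xxxxxx.  */
static size_t
count_utf8_codepoints (const char *s)
{
  size_t n = 0;
  for (; *s; s++)
    if (((unsigned char) *s & 0xC0) != 0x80)
      n++;
  return n;
}

/* Scan DIR once with opendir/readdir.  If DECODE is nonzero, also run
   the stand-in decoder on every entry name.  Store the number of
   entries in *ENTRIES and return the elapsed wall-clock seconds, or
   -1.0 if DIR cannot be opened.  */
static double
time_dir_scan (const char *dir, int decode, long *entries)
{
  struct timespec t0, t1;
  volatile size_t sink = 0;   /* keeps the decode loop from being elided */
  long n = 0;

  assert (entries != NULL);
  clock_gettime (CLOCK_MONOTONIC, &t0);
  DIR *d = opendir (dir);
  if (!d)
    return -1.0;
  for (struct dirent *e; (e = readdir (d)) != NULL; n++)
    if (decode)
      sink += count_utf8_codepoints (e->d_name);
  closedir (d);
  clock_gettime (CLOCK_MONOTONIC, &t1);

  *entries = n;
  return (double) (t1.tv_sec - t0.tv_sec)
         + (double) (t1.tv_nsec - t0.tv_nsec) / 1e9;
}
```

Calling `time_dir_scan` on a large directory with `decode` set to 0 and then to 1, and repeating each call so that both a cold and a warm cache are observed, would give a first impression of how the two costs compare. A meaningful answer for Emacs would need the real decoder in place of the stand-in.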
> In addition, for systems like Linux, encoding and decoding are
> no-ops (as both the source and destination are UTF-8), so there won't
> be a penalty there.

Yes, but only in UTF-8 locales. I won't be surprised to learn that
most of the Far East uses something else, even on GNU/Linux. And then
there are Windows volumes mounted via NFS and suchlike.

> I agree that this is not a project for Emacs 25.1 -- however, I think that we
> should at least explore this for future versions. I suggest that we push the current
> patch (after dropping the `require' line), close the current issue, and post a
> new bug report suggesting performing the completion on decoded strings.

I have a simpler idea for fixing the issue without decoding every file
in a directory. Please wait for a couple of days.

(There's no need for another bug report, we could continue solving the
left-over problem as part of this one.)

From debbugs-submit-bounces@debbugs.gnu.org Fri Dec 18 10:42:54 2015 Received: (at submit) by debbugs.gnu.org; 18 Dec 2015 15:42:54 +0000 Received: from localhost ([127.0.0.1]:55840 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1a9xBG-0005YG-2z for submit@debbugs.gnu.org; Fri, 18 Dec 2015 10:42:54 -0500 Received: from eggs.gnu.org ([208.118.235.92]:49424) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1a9xBE-0005Y2-K4 for submit@debbugs.gnu.org; Fri, 18 Dec 2015 10:42:53 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1a9xB8-0004hm-BV for submit@debbugs.gnu.org; Fri, 18 Dec 2015 10:42:47 -0500 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=0.8 required=5.0 tests=BAYES_50,FREEMAIL_FROM autolearn=disabled version=3.3.2 Received: from lists.gnu.org ([2001:4830:134:3::11]:43217) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1a9xB8-0004hi-87 for submit@debbugs.gnu.org; Fri, 18 Dec 2015 10:42:46 -0500 Received: 
from eggs.gnu.org ([2001:4830:134:3::10]:40392) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1a9xB7-0007YK-Cp for bug-gnu-emacs@gnu.org; Fri, 18 Dec 2015 10:42:46 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1a9xB3-0004gP-8N for bug-gnu-emacs@gnu.org; Fri, 18 Dec 2015 10:42:45 -0500 Received: from plane.gmane.org ([80.91.229.3]:33949) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1a9xB3-0004gA-1m for bug-gnu-emacs@gnu.org; Fri, 18 Dec 2015 10:42:41 -0500 Received: from list by plane.gmane.org with local (Exim 4.69) (envelope-from ) id 1a9xB1-0000yD-7b for bug-gnu-emacs@gnu.org; Fri, 18 Dec 2015 16:42:39 +0100 Received: from c-68-39-146-59.hsd1.in.comcast.net ([68.39.146.59]) by main.gmane.org with esmtp (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Fri, 18 Dec 2015 16:42:39 +0100 Received: from random832 by c-68-39-146-59.hsd1.in.comcast.net with local (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Fri, 18 Dec 2015 16:42:39 +0100 X-Injected-Via-Gmane: http://gmane.org/ To: bug-gnu-emacs@gnu.org From: Random832 Subject: Re: bug#22169: 25.0.50; File name compiletion doesn't work with non-ASCII characters on OS X Date: Fri, 18 Dec 2015 10:42:31 -0500 Lines: 34 Message-ID: <87h9jfpxug.fsf@fastmail.com> References: <83y4cw3kie.fsf@gnu.org> <83twnk3fg1.fsf@gnu.org> <83oads2x99.fsf@gnu.org> <83io3z3drh.fsf@gnu.org> <831tan32q2.fsf@gnu.org> <83h9jgxloz.fsf@gnu.org> <837fkcxglq.fsf@gnu.org> Mime-Version: 1.0 Content-Type: text/plain X-Complaints-To: usenet@ger.gmane.org X-Gmane-NNTP-Posting-Host: c-68-39-146-59.hsd1.in.comcast.net User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.5 (gnu/linux) Cancel-Lock: sha1:UcTgLGQJ1bad6Cg6himgkms2hAM= X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. 
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6.x X-Received-From: 2001:4830:134:3::11 X-Spam-Score: -4.1 (----) X-Debbugs-Envelope-To: submit X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -4.1 (----)

Eli Zaretskii writes:
> . modern OSes cache this stuff, so you can do that without ever
>   hitting the disk

*ever*? Surely at least once.

> . many modern machines have SSDs (mine does), where disk drive
>   accesses, even when they are needed, are very fast

They're fast, yes, but my own gut feeling is that they're not
actually fast *enough* not to be the bottleneck.

> . by contrast, decoding a non-trivial encoding might take many CPU
>   cycles, especially in the utf-8-hfs case, where we call Lisp as
>   part of that

I don't know why "especially in the utf-8-hfs case" - the current
code is no more correct for utf-8-hfs on Linux than for utf-8-hfs
on OSX.

> Yes, but only in UTF-8 locales. I won't be surprised to learn that
> most of the Far East uses something else, even on GNU/Linux. And then
> there are Windows volumes mounted via NFS and suchlike.

I think most people who do this (I should think it would be
SMB/CIFS rather than NFS - if it's really NFS then I suppose the
translation has to happen on the Windows side) have the filesystem
translated to UTF-8 [etc] for them by the kernel. There are
mount options "iocharset" and "codepage" (the latter for the
filesystem's coding system on 8-bit filesystems), to take care
of this. Working with multiple different directories with
different filename encoding systems is a pathological case, and
one which as far as I know Emacs makes no attempt to deal with
(except by the user switching manually).
From debbugs-submit-bounces@debbugs.gnu.org Fri Dec 18 11:13:07 2015 Received: (at submit) by debbugs.gnu.org; 18 Dec 2015 16:13:07 +0000 Received: from localhost ([127.0.0.1]:55853 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1a9xeU-0006Ec-VI for submit@debbugs.gnu.org; Fri, 18 Dec 2015 11:13:07 -0500 Received: from eggs.gnu.org ([208.118.235.92]:57220) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1a9xeT-0006E5-Sm for submit@debbugs.gnu.org; Fri, 18 Dec 2015 11:13:06 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1a9xeL-0003oZ-LB for submit@debbugs.gnu.org; Fri, 18 Dec 2015 11:13:00 -0500 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=0.8 required=5.0 tests=BAYES_50,FREEMAIL_FROM autolearn=disabled version=3.3.2 Received: from lists.gnu.org ([2001:4830:134:3::11]:55641) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1a9xeL-0003oU-Ig for submit@debbugs.gnu.org; Fri, 18 Dec 2015 11:12:57 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:35233) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1a9ww6-0003iZ-6t for bug-gnu-emacs@gnu.org; Fri, 18 Dec 2015 10:27:15 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1a9ww1-0000Pf-7Y for bug-gnu-emacs@gnu.org; Fri, 18 Dec 2015 10:27:14 -0500 Received: from plane.gmane.org ([80.91.229.3]:32770) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1a9ww1-0000PG-0o for bug-gnu-emacs@gnu.org; Fri, 18 Dec 2015 10:27:09 -0500 Received: from list by plane.gmane.org with local (Exim 4.69) (envelope-from ) id 1a9wvx-00082c-O0 for bug-gnu-emacs@gnu.org; Fri, 18 Dec 2015 16:27:05 +0100 Received: from c-68-39-146-59.hsd1.in.comcast.net ([68.39.146.59]) by main.gmane.org with esmtp (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Fri, 18 Dec 2015 16:27:05 +0100 Received: from random832 
by c-68-39-146-59.hsd1.in.comcast.net with local (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Fri, 18 Dec 2015 16:27:05 +0100 X-Injected-Via-Gmane: http://gmane.org/ To: bug-gnu-emacs@gnu.org From: Random832 Subject: Re: bug#22169: 25.0.50; File name compiletion doesn't work with non-ASCII characters on OS X Date: Fri, 18 Dec 2015 10:26:49 -0500 Lines: 36 Message-ID: <87lh8rpykm.fsf@fastmail.com> References: <83y4cw3kie.fsf@gnu.org> <83twnk3fg1.fsf@gnu.org> <83oads2x99.fsf@gnu.org> <83io3z3drh.fsf@gnu.org> <831tan32q2.fsf@gnu.org> <83r3ikxmis.fsf@gnu.org> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-Complaints-To: usenet@ger.gmane.org X-Gmane-NNTP-Posting-Host: c-68-39-146-59.hsd1.in.comcast.net User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.5 (gnu/linux) Cancel-Lock: sha1:YSOfkfpTB09Z2neaUqiMN4lHD2s= X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6.x X-Received-From: 2001:4830:134:3::11 X-Spam-Score: -4.1 (----) X-Debbugs-Envelope-To: submit X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -4.1 (----)

Eli Zaretskii writes:
> I rather think it's a non-starter, at least for Emacs 25.1. It
> probably means users of all systems will be punished by slower
> directory searches,

How much slower do you suppose it would be? Especially for utf-8,
which I assume is fast anyway (it doesn't even seem to reject
excessively high codepoints... I'm not _entirely_ sure utf-8 is
not actually identical to emacs-internal, does anyone know any
concrete differences?)

Sometimes features, and correctness, have a performance cost.
If performance is the end-all and be-all priority, let's just
abolish all encodings and assume all filenames are in
emacs-internal. What if it's only a 1% slowdown? 5%? 10%?
Or would an absolute measure be more appropriate - i.e. define
how much time it's acceptable for it to take (on some standard
directory and CPU).

> No, it's correctness on one platform vs speed on all the rest.

Strictly speaking, it's correctness for one encoding.

In trying to come up with another example, I noticed that in an
EUC-JP locale, typing "*修"TAB ("*\275\244") doesn't actually
match "文字化け" ["\312\270\273\372\262\275\244\261"] as I had
expected it to. I guess matching with embedded stars goes
through a different code path?

Is there a way to simply enable doing this for normal completion
when the file system encoding is utf-8-hfs? Or to add post-filtering
[only return a filename if it matches both the existing way *and*
the decoded string matches], only enabled by default on utf-8-hfs?
Or is even the time spent checking a boolean variable too much
of a performance penalty?
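The post-filtering idea in the message above can be sketched in plain C. This is an illustration only, not code from Emacs: `has_prefix` and `keep_completion` are hypothetical names, and the byte strings assume that the file system stores "å" in decomposed form ("a" followed by U+030A, bytes 61 CC 8A) while the decoded name uses the precomposed character U+00E5 (bytes C3 A5).

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* True if the byte string S begins with the byte string PREFIX.  */
static bool
has_prefix (const char *s, const char *prefix)
{
  assert (s != NULL && prefix != NULL);
  return strncmp (s, prefix, strlen (prefix)) == 0;
}

/* Post-filter (hypothetical helper): keep a directory entry only if
   the typed text matches in both forms.  TYPED_ENCODED is compared
   with the name as stored on disk, TYPED_DECODED with the name after
   decoding.  An entry that passes only the first test is a false
   match and is dropped.  */
static bool
keep_completion (const char *encoded_name, const char *decoded_name,
                 const char *typed_encoded, const char *typed_decoded)
{
  return has_prefix (encoded_name, typed_encoded)
         && has_prefix (decoded_name, typed_decoded);
}
```

Under these assumptions, typing "a" byte-matches the stored name "a" + U+030A + ".txt", which is the false match discussed in this thread, but it does not match the decoded name that starts with U+00E5, so the entry is rejected. Typing "å" matches in both forms and the entry is kept. The second comparison only runs for entries that already passed the first one.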
From debbugs-submit-bounces@debbugs.gnu.org Fri Dec 18 12:06:13 2015 Received: (at 22169) by debbugs.gnu.org; 18 Dec 2015 17:06:13 +0000 Received: from localhost ([127.0.0.1]:55924 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1a9yTs-00034L-Pe for submit@debbugs.gnu.org; Fri, 18 Dec 2015 12:06:12 -0500 Received: from eggs.gnu.org ([208.118.235.92]:47268) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1a9yTr-000348-Iw for 22169@debbugs.gnu.org; Fri, 18 Dec 2015 12:06:11 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1a9yTj-0002hi-5R for 22169@debbugs.gnu.org; Fri, 18 Dec 2015 12:06:06 -0500 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=0.8 required=5.0 tests=BAYES_50,T_RP_MATCHES_RCVD autolearn=disabled version=3.3.2 Received: from fencepost.gnu.org ([2001:4830:134:3::e]:44137) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1a9yTj-0002he-2X; Fri, 18 Dec 2015 12:06:03 -0500 Received: from 84.94.185.246.cable.012.net.il ([84.94.185.246]:1964 helo=HOME-C4E4A596F7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_128_CBC_SHA1:128) (Exim 4.82) (envelope-from ) id 1a9yTi-0008G7-7t; Fri, 18 Dec 2015 12:06:02 -0500 Date: Fri, 18 Dec 2015 19:06:25 +0200 Message-Id: <83k2obwusu.fsf@gnu.org> From: Eli Zaretskii To: Random832 In-reply-to: <87lh8rpykm.fsf@fastmail.com> (message from Random832 on Fri, 18 Dec 2015 10:26:49 -0500) Subject: Re: bug#22169: 25.0.50; File name compiletion doesn't work with non-ASCII characters on OS X References: <83y4cw3kie.fsf@gnu.org> <83twnk3fg1.fsf@gnu.org> <83oads2x99.fsf@gnu.org> <83io3z3drh.fsf@gnu.org> <831tan32q2.fsf@gnu.org> <83r3ikxmis.fsf@gnu.org> <87lh8rpykm.fsf@fastmail.com> MIME-version: 1.0 Content-type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 
2001:4830:134:3::e X-Spam-Score: -5.0 (-----) X-Debbugs-Envelope-To: 22169 Cc: 22169@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: Eli Zaretskii Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -5.0 (-----)

> From: Random832
> Date: Fri, 18 Dec 2015 10:26:49 -0500
>
> Eli Zaretskii writes:
> > I rather think it's a non-starter, at least for Emacs 25.1. It
> > probably means users of all systems will be punished by slower
> > directory searches,
>
> How much slower do you suppose it would be?

I don't know. That's why I suggested timing it.

> I'm not _entirely_ sure utf-8 is not actually identical to
> emacs-internal

It isn't.

> does anyone know any concrete differences?

Raw bytes and characters from Far-Eastern charsets that are not
unified get non-trivial conversions.

> Sometimes features, and correctness, have a performance cost. If
> performance is the end-all and be-all priority, let's just
> abolish all encodings and assume all filenames are in
> emacs-internal.

We need to know the cost before we make the decision. It could be
that you are right and the cost is negligible, then the decision will
be very easy. But it also could be not so easy.

> In trying to come up with another example, I noticed that in an
> EUC-JP locale, typing "*修"TAB ("*\275\244") doesn't actually
> match "文字化け" ["\312\270\273\372\262\275\244\261"] as I had
> expected it to. I guess matching with embedded stars goes
> through a different code path?

I'm not familiar enough with all the high-level tricks above the
dired.c primitives that were introduced lately. I hope someone else
will answer that. (You could try figuring that out by looking at the
calls to file_name_completion that are done by Lisp.)
> Is there a way to simply enable doing this for normal completion
> when the file system encoding is utf-8-hfs? Or to add post-filtering
> [only return a filename if it matches both the existing way *and*
> the decoded string matches], only enabled by default on utf-8-hfs?

The latter is what I had in mind, yes.

> Or is even the time spent checking a boolean variable too much
> of a performance penalty?

No, I don't think so.

From debbugs-submit-bounces@debbugs.gnu.org Sun Dec 20 12:56:02 2015 Received: (at 22169) by debbugs.gnu.org; 20 Dec 2015 17:56:02 +0000 Received: from localhost ([127.0.0.1]:58195 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1aAiDC-000158-0v for submit@debbugs.gnu.org; Sun, 20 Dec 2015 12:56:02 -0500 Received: from eggs.gnu.org ([208.118.235.92]:59452) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1aAiD9-00014q-P8 for 22169@debbugs.gnu.org; Sun, 20 Dec 2015 12:56:00 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1aAiD0-0005oN-Fq for 22169@debbugs.gnu.org; Sun, 20 Dec 2015 12:55:54 -0500 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00,RP_MATCHES_RCVD autolearn=disabled version=3.3.2 Received: from fencepost.gnu.org ([2001:4830:134:3::e]:42290) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1aAiD0-0005oI-Cf; Sun, 20 Dec 2015 12:55:50 -0500 Received: from 84.94.185.246.cable.012.net.il ([84.94.185.246]:4597 helo=HOME-C4E4A596F7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_128_CBC_SHA1:128) (Exim 4.82) (envelope-from ) id 1aAiCz-0003z4-KU; Sun, 20 Dec 2015 12:55:50 -0500 Date: Sun, 20 Dec 2015 19:56:17 +0200 Message-Id: <83fuyxt35q.fsf@gnu.org> From: Eli Zaretskii To: andlind@gmail.com, random832@fastmail.com In-reply-to: <83r3ikxmis.fsf@gnu.org> (message from Eli Zaretskii on Fri, 18 Dec 2015 09:07:39 +0200) Subject: Re: 
bug#22169: 25.0.50; File name compiletion doesn't work with non-ASCII characters on OS X References: <83y4cw3kie.fsf@gnu.org> <83twnk3fg1.fsf@gnu.org> <83oads2x99.fsf@gnu.org> <83io3z3drh.fsf@gnu.org> <831tan32q2.fsf@gnu.org> <83r3ikxmis.fsf@gnu.org> MIME-version: 1.0 Content-type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 2001:4830:134:3::e X-Spam-Score: -5.0 (-----) X-Debbugs-Envelope-To: 22169 Cc: 22169@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: Eli Zaretskii Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -5.0 (-----)

> Date: Fri, 18 Dec 2015 09:07:39 +0200
> From: Eli Zaretskii
> Cc: random832@fastmail.com, 22169@debbugs.gnu.org
>
> > After reading through Random832's comments, I also see the problem with "åäö"
> > and "aao" not being handled correctly. Typing "a TAB" makes Emacs delete the
> > "a", which seems very confusing. Typing "å TAB" or "aa TAB" works, though.
> > (Here `(file-name-all-completions "a" ".")' returns `("åäöfirst.txt"
> > "aaosecond.txt")'.)
> >
> > In other words, Emacs is in better shape with my patch than it was before, but there
> > is still some work to be done.
> >
> > When it comes to "lax" matching -- I really don't think we should use it for
> > file names. I don't want to match "å" when I type "a" etc.
>
> I have an idea for a change that could solve this. I will post it in
> a day or two and ask you to try it.

Could you please try the patch below, and see if it avoids the "lax"
matches and the confusing effect of deleting "a" in the scenario above
on OS X?

(This is not the full patch, since we need to add this code only for
some file-name encodings, such as utf-8-hfs. If this works for you, I
will add the missing bits.
If it doesn't work, please tell where I
goofed.)

Thanks.

diff --git a/src/dired.c b/src/dired.c
index 84bf247..4ff85f1 100644
--- a/src/dired.c
+++ b/src/dired.c
@@ -641,16 +641,30 @@ file_name_completion (Lisp_Object file, Lisp_Object dirname, bool all_flag,

       matchcount += matchcount <= 1;

+      Lisp_Object zero = make_number (0);
       if (all_flag)
-       bestmatch = Fcons (name, bestmatch);
+       {
+         Lisp_Object cmp1
+           = Fcompare_strings (name, zero, make_number (SCHARS (name)),
+                               file, zero, make_number (SCHARS (file)),
+                               completion_ignore_case ? Qt : Qnil);
+         if (EQ (cmp1, Qt) || XINT (cmp1) != -1)
+           bestmatch = Fcons (name, bestmatch);
+       }
       else if (NILP (bestmatch))
        {
-         bestmatch = name;
-         bestmatchsize = SCHARS (name);
+         Lisp_Object cmp2
+           = Fcompare_strings (name, zero, make_number (SCHARS (name)),
+                               file, zero, make_number (SCHARS (file)),
+                               completion_ignore_case ? Qt : Qnil);
+         if (EQ (cmp2, Qt) || XINT (cmp2) != -1)
+           {
+             bestmatch = name;
+             bestmatchsize = SCHARS (name);
+           }
        }
       else
        {
-         Lisp_Object zero = make_number (0);
          /* FIXME: This is a copy of the code in Ftry_completion.  */
          ptrdiff_t compare = min (bestmatchsize, SCHARS (name));
          Lisp_Object cmp

From debbugs-submit-bounces@debbugs.gnu.org Sun Dec 20 14:16:37 2015 Received: (at 22169) by debbugs.gnu.org; 20 Dec 2015 19:16:37 +0000 Received: from localhost ([127.0.0.1]:58218 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1aAjTB-0002zJ-1Q for submit@debbugs.gnu.org; Sun, 20 Dec 2015 14:16:37 -0500 Received: from mail-vk0-f51.google.com ([209.85.213.51]:32837) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1aAjT9-0002z4-Gl for 22169@debbugs.gnu.org; Sun, 20 Dec 2015 14:16:35 -0500 Received: by mail-vk0-f51.google.com with SMTP id a188so91620210vkc.0 for <22169@debbugs.gnu.org>; Sun, 20 Dec 2015 11:16:35 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=VwYOt3C/fbWgoMJGIrPvqUsRuaAgIJkk3so34oE0CWg=; b=PJAETFN5/f4Ypu7FLYHCWMHWJFhNoAAuO/0GaS3qQsmGWuqr3WX/QukejbE4GRv5RG oyuiZ1UFEgBcTXCv6dNTi9n+KTHnLxvvjKf1DmrikoXpALj2Qvz4c6rvo4Lw+s0SpHFg v9DN7g/EczJx6vnvzng+q/P6jrQ28xfSocDVA9AL5XvMeNALV8OOhXrmydwMIoFvtuzm YyWxdM2byyh5rDKVIf7LSGUWBr067UwHk0/wtSntMhmAJsNmM9SDVbszAS+K1vYa+GTD V30rxbjjQLvjqO33P4AGLq6Me5fR8DjB9alL7MRJaAmypxGyuQFuoe+H/3qEqNHK4dMY YEbQ== MIME-Version: 1.0 X-Received: by 10.31.58.74 with SMTP id h71mr9873741vka.149.1450638989966; Sun, 20 Dec 2015 11:16:29 -0800 (PST) Received: by 10.31.210.133 with HTTP; Sun, 20 Dec 2015 11:16:29 -0800 (PST) In-Reply-To: <83fuyxt35q.fsf@gnu.org> References: <83y4cw3kie.fsf@gnu.org> <83twnk3fg1.fsf@gnu.org> <83oads2x99.fsf@gnu.org> <83io3z3drh.fsf@gnu.org> <831tan32q2.fsf@gnu.org> <83r3ikxmis.fsf@gnu.org> <83fuyxt35q.fsf@gnu.org> Date: Sun, 20 Dec 2015 20:16:29 +0100 Message-ID: Subject: Re: bug#22169: 25.0.50; File name compiletion doesn't work with non-ASCII characters on OS X From: Anders Lindgren To: Eli Zaretskii Content-Type: multipart/alternative; 
boundary=001a114405c6b1c46105275935e2 X-Spam-Score: 0.3 (/) X-Debbugs-Envelope-To: 22169 Cc: random832@fastmail.com, 22169@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.7 (/) --001a114405c6b1c46105275935e2 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable

Hi!

Unfortunately, it still doesn't work; the "a" is still deleted. You can see
what happens here:

(file-name-all-completions "" ".")
("åäö.txt" "aao.txt" "../" "./")

(file-name-all-completions "a" ".")
("åäö.txt" "aao.txt")                     <= Incorrect result

(file-name-all-completions "å" ".")
("åäö.txt")

I gave this some thought. Would the following work?

 - For each match of the current system (using encoded comparison), after
   the decoding of the entry, perform a second comparison with the decoded
   (original) version of "file" (when not empty).

There is no extra decoding involved, as the number of entries decoded is the
same as before (even if some entries will be rejected now). The extra
comparison is only performed if "file" is not empty, so it will not affect
normal directory retrieval, only completion operations.

Concretely, in the example above, completing "a" will find both entries, which
are then decoded. However, the second comparison will reject "åäö.txt".

    -- Anders

On Sun, Dec 20, 2015 at 6:56 PM, Eli Zaretskii wrote:

> > Date: Fri, 18 Dec 2015 09:07:39 +0200
> > From: Eli Zaretskii
> > Cc: random832@fastmail.com, 22169@debbugs.gnu.org
> >
> > > After reading through Random832's comments, I also see the problem with "åäö"
> > > and "aao" not being handled correctly. Typing "a TAB" makes Emacs delete the
> > > "a", which seems very confusing. Typing "å TAB" or "aa TAB" works, though.
> > > (Here `(file-name-all-completions "a" ".")' returns `("åäöfirst.txt"
> > > "aaosecond.txt")'.)
> > >
> > > In other words, Emacs is in better shape with my patch than it was before, but there
> > > is still some work to be done.
> > >
> > > When it comes to "lax" matching -- I really don't think we should use it for
> > > file names. I don't want to match "å" when I type "a" etc.
> >
> > I have an idea for a change that could solve this. I will post it in
> > a day or two and ask you to try it.
>
> Could you please try the patch below, and see if it avoids the "lax"
> matches and the confusing effect of deleting "a" in the scenario above
> on OS X?
>
> (This is not the full patch, since we need to add this code only for
> some file-name encodings, such as utf-8-hfs. If this works for you, I
> will add the missing bits. If it doesn't work, please tell where I
> goofed.)
>
> Thanks.
>
> diff --git a/src/dired.c b/src/dired.c
> index 84bf247..4ff85f1 100644
> --- a/src/dired.c
> +++ b/src/dired.c
> @@ -641,16 +641,30 @@ file_name_completion (Lisp_Object file, Lisp_Object dirname, bool all_flag,
>
>        matchcount += matchcount <= 1;
>
> +      Lisp_Object zero = make_number (0);
>        if (all_flag)
> -       bestmatch = Fcons (name, bestmatch);
> +       {
> +         Lisp_Object cmp1
> +           = Fcompare_strings (name, zero, make_number (SCHARS (name)),
> +                               file, zero, make_number (SCHARS (file)),
> +                               completion_ignore_case ? Qt : Qnil);
> +         if (EQ (cmp1, Qt) || XINT (cmp1) != -1)
> +           bestmatch = Fcons (name, bestmatch);
> +       }
>        else if (NILP (bestmatch))
>         {
> -         bestmatch = name;
> -         bestmatchsize = SCHARS (name);
> +         Lisp_Object cmp2
> +           = Fcompare_strings (name, zero, make_number (SCHARS (name)),
> +                               file, zero, make_number (SCHARS (file)),
> +                               completion_ignore_case ? Qt : Qnil);
> +         if (EQ (cmp2, Qt) || XINT (cmp2) != -1)
> +           {
> +             bestmatch = name;
> +             bestmatchsize = SCHARS (name);
> +           }
>         }
>        else
>         {
> -         Lisp_Object zero = make_number (0);
>           /* FIXME: This is a copy of the code in Ftry_completion.  */
>           ptrdiff_t compare = min (bestmatchsize, SCHARS (name));
>           Lisp_Object cmp
>
--001a114405c6b1c46105275935e2 Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable
--001a114405c6b1c46105275935e2-- From debbugs-submit-bounces@debbugs.gnu.org Sun Dec 20 14:38:48 2015 Received: (at 22169) by debbugs.gnu.org; 20 Dec 2015 19:38:48 +0000 Received: from localhost ([127.0.0.1]:58224 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1aAjoe-0003VG-4d for submit@debbugs.gnu.org; Sun, 20 Dec 2015 14:38:48 -0500 Received: from eggs.gnu.org ([208.118.235.92]:55199) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1aAjoc-0003V4-TE for 22169@debbugs.gnu.org; Sun, 20 Dec 2015 14:38:47 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1aAjoU-0003Zg-DD for 22169@debbugs.gnu.org; Sun, 20 Dec 2015 14:38:41 -0500 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=-0.0 required=5.0 tests=BAYES_40,RP_MATCHES_RCVD autolearn=disabled version=3.3.2 Received: from fencepost.gnu.org ([2001:4830:134:3::e]:43746) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1aAjoU-0003Zc-9V; Sun, 20 Dec 2015 14:38:38 -0500 Received: from 84.94.185.246.cable.012.net.il ([84.94.185.246]:4707 helo=HOME-C4E4A596F7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_128_CBC_SHA1:128) (Exim 4.82) (envelope-from ) id 1aAjoT-00064X-3H; Sun, 20 Dec 2015 14:38:37 -0500 Date: Sun, 20 Dec 2015 21:39:06 +0200 Message-Id: <8337uwucyt.fsf@gnu.org> From: Eli Zaretskii To: Anders Lindgren In-reply-to: (message from Anders Lindgren on Sun, 20 Dec 2015 20:16:29 +0100) Subject: Re: bug#22169: 25.0.50; File name compiletion doesn't work with non-ASCII characters on OS X References: <83y4cw3kie.fsf@gnu.org> <83twnk3fg1.fsf@gnu.org> <83oads2x99.fsf@gnu.org> <83io3z3drh.fsf@gnu.org> <831tan32q2.fsf@gnu.org> <83r3ikxmis.fsf@gnu.org> <83fuyxt35q.fsf@gnu.org> MIME-version: 1.0 Content-type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] 
X-Received-From: 2001:4830:134:3::e X-Spam-Score: -5.0 (-----) X-Debbugs-Envelope-To: 22169 Cc: random832@fastmail.com, 22169@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: Eli Zaretskii Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -5.0 (-----)

> Date: Sun, 20 Dec 2015 20:16:29 +0100
> From: Anders Lindgren
> Cc: random832@fastmail.com, 22169@debbugs.gnu.org
>
> Unfortunately, it still doesn't work, the "a" is still deleted. You can see
> what happens here:
>
> (file-name-all-completions "" ".")
> ("åäö.txt" "aao.txt" "../" "./")
>
> (file-name-all-completions "a" ".")
> ("åäö.txt" "aao.txt") <= Incorrect result

So something's wrong with the patch I wrote, because it was supposed
to reject "åäö.txt" in the last case.  Can you see why it didn't?

> I gave this a bit of thinking, would the following work:
>
> - For each match of the current system (using encoded comparison), after the
> decoding of the entry, perform a second comparison with the decoded (original)
> version of "file" (when not empty).
>
> There is no extra decoding included, as the number of entries decoded is the
> same as before (even if some entries will be rejected now). The extra
> comparison is only performed if "file" is not empty, so it will not affect
> normal directory retrieval, only when performing a completion operation.
>
> Concretely, in the example above, completing "a" will find both entries which
> are decoded. However, the second comparison will reject "åäö.txt".

That's exactly what my patch was supposed to do -- it makes a second
comparison right before adding a candidate to the result.  If you can
see why it isn't working, we can take it from there.

Thanks.

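
One plausible reading of why the first patch let "åäö.txt" through, sketched in standalone C rather than Emacs internals. Everything here is an assumption made for illustration: `compare_chars` is a hypothetical helper that only mimics the documented return convention of `compare-strings` (equal, or a signed number whose magnitude is one more than the count of matching leading characters), `CMP_EQUAL` stands in for `Qt`, and the decoded name is assumed to hold precomposed characters. Under those assumptions, comparing the whole name against the whole FILE gives +1 for "åäö.txt" versus "a", which a `!= -1` test does not reject, whereas comparing only the first `SCHARS (file)` characters and requiring equality does reject it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for Qt: the two compared ranges are equal.  */
#define CMP_EQUAL 0

/* Decoded sample names as code points (assumed precomposed).  */
static const uint32_t NAME_ARING[] = { 0xE5, 0xE4, 0xF6, '.', 't', 'x', 't' };
static const uint32_t NAME_PLAIN[] = { 'a', 'a', 'o', '.', 't', 'x', 't' };
static const uint32_t FILE_A[] = { 'a' };

/* Hypothetical helper mimicking the compare-strings convention:
   CMP_EQUAL when equal, otherwise +/-(matching leading chars + 1),
   negative when S1 sorts before S2.  */
int
compare_chars (const uint32_t *s1, size_t len1,
               const uint32_t *s2, size_t len2)
{
  size_t i = 0;
  while (i < len1 && i < len2)
    {
      if (s1[i] != s2[i])
        return s1[i] < s2[i] ? -(int) (i + 1) : (int) (i + 1);
      i++;
    }
  if (len1 == len2)
    return CMP_EQUAL;
  return len1 < len2 ? -(int) (i + 1) : (int) (i + 1);
}

/* Shape of the test in the first patch: whole NAME against whole FILE.  */
bool
first_attempt_accepts (const uint32_t *name, size_t name_len,
                       const uint32_t *file, size_t file_len)
{
  int cmp = compare_chars (name, name_len, file, file_len);
  return cmp == CMP_EQUAL || cmp != -1;
}

/* Shape of the later test: only the first FILE_LEN characters of NAME,
   and only equality counts.  As in the follow-up patch, the check is
   skipped when FILE is longer than NAME.  */
bool
prefix_check_accepts (const uint32_t *name, size_t name_len,
                      const uint32_t *file, size_t file_len)
{
  if (file_len > name_len)
    return true;
  return compare_chars (name, file_len, file, file_len) == CMP_EQUAL;
}

void
run_compare_examples (void)
{
  assert (first_attempt_accepts (NAME_ARING, 7, FILE_A, 1));
  assert (!prefix_check_accepts (NAME_ARING, 7, FILE_A, 1));
  assert (prefix_check_accepts (NAME_PLAIN, 7, FILE_A, 1));
}
```

Calling `run_compare_examples ()` exercises both variants on the two sample names from the report.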
From debbugs-submit-bounces@debbugs.gnu.org Sun Dec 20 17:00:48 2015 Received: (at 22169) by debbugs.gnu.org; 20 Dec 2015 22:00:48 +0000 Received: from localhost ([127.0.0.1]:58284 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1aAm23-0008TX-Um for submit@debbugs.gnu.org; Sun, 20 Dec 2015 17:00:48 -0500 Received: from mail-vk0-f44.google.com ([209.85.213.44]:35491) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1aAm22-0008TJ-T5 for 22169@debbugs.gnu.org; Sun, 20 Dec 2015 17:00:47 -0500 Received: by mail-vk0-f44.google.com with SMTP id a189so91611658vkh.2 for <22169@debbugs.gnu.org>; Sun, 20 Dec 2015 14:00:46 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=JdwlyFiwzttuzW3lkWP7gGjacQd8pnr6KUwK/Cz5jxs=; b=A+Cr34H4DqTYR/4aOff0wH/mxGrULDWYhCs6CzKeQIpU0Nbwmg4ZnK5svUZEyG0mBm 5Gmpi8i3/v0/zWAs4JMV1+iZADFE0or/3hBWfF14c1YRE7NizUKOC5XVxGVPo+NGiu44 TEf4Tb6tNlZeUZwcJ5fy0DmHOJvn8bor50hi9lRLQLfApZQwqnqcQzmVJYal+6yjT1Ar MuLCQbkTMZ40+bGHBH98vj6kY8skbrWpS4KEnH1g36q6vAPDzI6r+MN0OvRi+S7hwZDi tZKKY2X9ICDWLdpvPVZRZM+kDLN2ZoMtKZGB1V9FgZKsT2iYgYBr4rzIJzGgMKI9BUuD pAfQ== MIME-Version: 1.0 X-Received: by 10.31.138.20 with SMTP id m20mr10379679vkd.70.1450648841200; Sun, 20 Dec 2015 14:00:41 -0800 (PST) Received: by 10.31.210.133 with HTTP; Sun, 20 Dec 2015 14:00:40 -0800 (PST) In-Reply-To: <8337uwucyt.fsf@gnu.org> References: <83y4cw3kie.fsf@gnu.org> <83twnk3fg1.fsf@gnu.org> <83oads2x99.fsf@gnu.org> <83io3z3drh.fsf@gnu.org> <831tan32q2.fsf@gnu.org> <83r3ikxmis.fsf@gnu.org> <83fuyxt35q.fsf@gnu.org> <8337uwucyt.fsf@gnu.org> Date: Sun, 20 Dec 2015 23:00:40 +0100 Message-ID: Subject: Re: bug#22169: 25.0.50; File name compiletion doesn't work with non-ASCII characters on OS X From: Anders Lindgren To: Eli Zaretskii Content-Type: multipart/mixed; boundary=001a1144f922dfbcc905275b80a9 X-Spam-Score: 0.3 (/) 
X-Debbugs-Envelope-To: 22169 Cc: random832@fastmail.com, 22169@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.7 (/)

--001a1144f922dfbcc905275b80a9
Content-Type: multipart/alternative; boundary=001a1144f922dfbcb905275b80a7

--001a1144f922dfbcb905275b80a7
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

Hi!

I managed to get the attached patch to work (when used in conjunction with
my previous patch).

I've tested:

* C-x C-f a TAB
* (file-name-all-completions "a" ".")

    -- Anders

On Sun, Dec 20, 2015 at 8:39 PM, Eli Zaretskii wrote:

> > Date: Sun, 20 Dec 2015 20:16:29 +0100
> > From: Anders Lindgren
> > Cc: random832@fastmail.com, 22169@debbugs.gnu.org
> >
> > Unfortunately, it still doesn't work, the "a" is still deleted. You can see
> > what happens here:
> >
> > (file-name-all-completions "" ".")
> > ("åäö.txt" "aao.txt" "../" "./")
> >
> > (file-name-all-completions "a" ".")
> > ("åäö.txt" "aao.txt") <= Incorrect result
>
> So something's wrong with the patch I wrote, because it was supposed
> to reject "åäö.txt" in the last case.  Can you see why it didn't?
>
> > I gave this a bit of thinking, would the following work:
> >
> > - For each match of the current system (using encoded comparison), after the
> > decoding of the entry, perform a second comparison with the decoded (original)
> > version of "file" (when not empty).
> >
> > There is no extra decoding included, as the number of entries decoded is the
> > same as before (even if some entries will be rejected now). The extra
> > comparison is only performed if "file" is not empty, so it will not affect
> > normal directory retrieval, only when performing a completion operation.
> >
> > Concretely, in the example above, completing "a" will find both entries which
> > are decoded. However, the second comparison will reject "åäö.txt".
>
> That's exactly what my patch was supposed to do -- it makes a second
> comparison right before adding a candidate to the result.  If you can
> see why it isn't working, we can take it from there.
>
> Thanks.
>

--001a1144f922dfbcb905275b80a7
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
Hi!

I managed to get the attached patch to work (when used in conjunction with my previous patch).

I've tested:

* C-x C-f a TAB
* (file-name-all-completions "a" ".")

    -- Anders


On Sun, Dec 20, 2015 at 8:39 PM, Eli Zaretskii <eliz@gnu.org> wrote:
> Date: Sun, 20 Dec 2015 20:16:29 +0100
> From: Anders Lindgren <andlind@gmail.com>
> Cc: random832@fastmail.com, 22169@debbugs.gnu.org
>
> Unfortunately, it still doesn't work, the "a" is still deleted. You can see
> what happens here:
>
> (file-name-all-completions "" ".")
> ("åäö.txt" "aao.txt" "../" "./")
>
> (file-name-all-completions "a" ".")
> ("åäö.txt" "aao.txt") <= Incorrect result

So something's wrong with the patch I wrote, because it was supposed
to reject "åäö.txt" in the last case.  Can you see why it didn't?

> I gave this a bit of thinking, would the following work:
>
> - For each match of the current system (using encoded comparison), after the
> decoding of the entry, perform a second comparison with the decoded (original)
> version of "file" (when not empty).
>
> There is no extra decoding included, as the number of entries decoded is the
> same as before (even if some entries will be rejected now). The extra
> comparison is only performed if "file" is not empty, so it will not affect
> normal directory retrieval, only when performing a completion operation.
>
> Concretely, in the example above, completing "a" will find both entries which
> are decoded. However, the second comparison will reject "åäö.txt".

That's exactly what my patch was supposed to do -- it makes a second
comparison right before adding a candidate to the result.  If you can
see why it isn't working, we can take it from there.

Thanks.
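
The two-stage check described in the quoted text can be illustrated with a small standalone C sketch. It is not Emacs code: the function names are hypothetical, and it assumes the file system returns names in decomposed UTF-8, so that "å" is stored as `a` followed by U+030A (bytes 0xCC 0x8A), while the decoded string holds the precomposed U+00E5. With those assumptions, a byte-wise prefix test on the encoded name accepts "a" as a prefix of "å.txt", and a character-wise test on the decoded name rejects it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Encoded (on-disk) form, assumed decomposed: 'a' + U+030A, then ".txt".
   The literal is split so the hex escape cannot absorb later characters.  */
static const char ENCODED_ARING_TXT[] = "a\xCC\x8A" ".txt";

/* Decoded form, assumed precomposed: U+00E5 followed by ".txt".  */
static const uint32_t DECODED_ARING_TXT[] = { 0x00E5, '.', 't', 'x', 't' };
static const uint32_t DECODED_A[] = { 'a' };

/* First stage: byte-wise prefix test on encoded names.  */
bool
encoded_prefix_match (const char *prefix, const char *name)
{
  return strncmp (prefix, name, strlen (prefix)) == 0;
}

/* Second stage: character-wise prefix test on decoded names.  */
bool
decoded_prefix_match (const uint32_t *prefix, size_t prefix_len,
                      const uint32_t *name, size_t name_len)
{
  if (prefix_len > name_len)
    return false;
  for (size_t i = 0; i < prefix_len; i++)
    if (prefix[i] != name[i])
      return false;
  return true;
}

void
run_prefix_examples (void)
{
  /* The encoded comparison produces the false positive...  */
  assert (encoded_prefix_match ("a", ENCODED_ARING_TXT));
  /* ...and the decoded comparison filters it out again.  */
  assert (!decoded_prefix_match (DECODED_A, 1, DECODED_ARING_TXT, 5));
}
```

The second stage only runs on entries that already passed the first, which is why the thread expects no extra decoding work from it.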

--001a1144f922dfbcb905275b80a7-- --001a1144f922dfbcc905275b80a9 Content-Type: text/plain; charset=US-ASCII; name="coding2.diff" Content-Disposition: attachment; filename="coding2.diff" Content-Transfer-Encoding: base64 X-Attachment-Id: f_iif2v2v90 ZGlmZiAtLWdpdCBhL3NyYy9kaXJlZC5jIGIvc3JjL2RpcmVkLmMKaW5kZXggODRiZjI0Ny4uYTJm Mzg4YyAxMDA2NDQKLS0tIGEvc3JjL2RpcmVkLmMKKysrIGIvc3JjL2RpcmVkLmMKQEAgLTYzNyw2 ICs2MzcsMjAgQEAgZmlsZV9uYW1lX2NvbXBsZXRpb24gKExpc3BfT2JqZWN0IGZpbGUsIExpc3Bf T2JqZWN0IGRpcm5hbWUsIGJvb2wgYWxsX2ZsYWcsCiAgICAgICBpZiAoIU5JTFAgKHByZWRpY2F0 ZSkgJiYgTklMUCAoY2FsbDEgKHByZWRpY2F0ZSwgbmFtZSkpKQogCWNvbnRpbnVlOwogCisgICAg ICAvKiBSZWplY3QgZW50cmllcyB3aGVyZSB0aGUgZW5jb2RlZCBzdHJpbmdzIG1hdGNoZWQgYnV0 IHRoZQorICAgICAgICAgZGVjb2RlZCBkb2Vzbid0LiAgQ29uY3JldGVseSwgImEiIHNob3VsZCBu b3QgbWF0Y2ggImEtcmluZyIKKyAgICAgICAgIG9uIGZpbGUgc3lzdGVtIGVuY29kZWQgdXNpbmcg VVRGLTggZGVjb21wb3NlZCBjaGFyYWN0ZXJzLiAqLworICAgICAgTGlzcF9PYmplY3QgemVybyA9 IG1ha2VfbnVtYmVyICgwKTsKKyAgICAgIGlmIChTQ0hBUlMgKGZpbGUpIDw9IFNDSEFSUyAobmFt ZSkpCisgICAgICAgIHsKKyAgICAgICAgICBMaXNwX09iamVjdCBjbXAKKyAgICAgICAgICAgID0g RmNvbXBhcmVfc3RyaW5ncyAobmFtZSwgemVybywgbWFrZV9udW1iZXIgKFNDSEFSUyAoZmlsZSkp LAorICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICBmaWxlLCB6ZXJvLCBtYWtlX251bWJl ciAoU0NIQVJTIChmaWxlKSksCisgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgIGNvbXBs ZXRpb25faWdub3JlX2Nhc2UgPyBRdCA6IFFuaWwpOworICAgICAgICAgIGlmICghRVEgKGNtcCwg UXQpKQorICAgICAgICAgICAgY29udGludWU7CisgICAgICAgIH0KKwogICAgICAgLyogU3VpdGFi bHkgcmVjb3JkIHRoaXMgbWF0Y2guICAqLwogCiAgICAgICBtYXRjaGNvdW50ICs9IG1hdGNoY291 bnQgPD0gMTsKQEAgLTY1MCw3ICs2NjQsNiBAQCBmaWxlX25hbWVfY29tcGxldGlvbiAoTGlzcF9P YmplY3QgZmlsZSwgTGlzcF9PYmplY3QgZGlybmFtZSwgYm9vbCBhbGxfZmxhZywKIAl9CiAgICAg ICBlbHNlCiAJewotCSAgTGlzcF9PYmplY3QgemVybyA9IG1ha2VfbnVtYmVyICgwKTsKIAkgIC8q IEZJWE1FOiBUaGlzIGlzIGEgY29weSBvZiB0aGUgY29kZSBpbiBGdHJ5X2NvbXBsZXRpb24uICAq LwogCSAgcHRyZGlmZl90IGNvbXBhcmUgPSBtaW4gKGJlc3RtYXRjaHNpemUsIFNDSEFSUyAobmFt ZSkpOwogCSAgTGlzcF9PYmplY3QgY21wCg== 
--001a1144f922dfbcc905275b80a9-- From debbugs-submit-bounces@debbugs.gnu.org Sun Dec 20 22:38:47 2015 Received: (at 22169) by debbugs.gnu.org; 21 Dec 2015 03:38:47 +0000 Received: from localhost ([127.0.0.1]:58461 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1aArJ8-0008Ez-WE for submit@debbugs.gnu.org; Sun, 20 Dec 2015 22:38:47 -0500 Received: from eggs.gnu.org ([208.118.235.92]:36002) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1aArJ7-0008En-C2 for 22169@debbugs.gnu.org; Sun, 20 Dec 2015 22:38:45 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1aArIx-0002rV-Lj for 22169@debbugs.gnu.org; Sun, 20 Dec 2015 22:38:40 -0500 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=-0.0 required=5.0 tests=BAYES_20,RP_MATCHES_RCVD autolearn=disabled version=3.3.2 Received: from fencepost.gnu.org ([2001:4830:134:3::e]:51443) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1aArIx-0002rR-Ia; Sun, 20 Dec 2015 22:38:35 -0500 Received: from 84.94.185.246.cable.012.net.il ([84.94.185.246]:4785 helo=HOME-C4E4A596F7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_128_CBC_SHA1:128) (Exim 4.82) (envelope-from ) id 1aArIu-0005BJ-32; Sun, 20 Dec 2015 22:38:35 -0500 Date: Mon, 21 Dec 2015 05:39:02 +0200 Message-Id: <83wps8sc6h.fsf@gnu.org> From: Eli Zaretskii To: Anders Lindgren In-reply-to: (message from Anders Lindgren on Sun, 20 Dec 2015 23:00:40 +0100) Subject: Re: bug#22169: 25.0.50; File name compiletion doesn't work with non-ASCII characters on OS X References: <83y4cw3kie.fsf@gnu.org> <83twnk3fg1.fsf@gnu.org> <83oads2x99.fsf@gnu.org> <83io3z3drh.fsf@gnu.org> <831tan32q2.fsf@gnu.org> <83r3ikxmis.fsf@gnu.org> <83fuyxt35q.fsf@gnu.org> <8337uwucyt.fsf@gnu.org> X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 2001:4830:134:3::e X-Spam-Score: -5.0 (-----) 
X-Debbugs-Envelope-To: 22169 Cc: random832@fastmail.com, 22169@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: Eli Zaretskii Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -5.0 (-----)

> Date: Sun, 20 Dec 2015 23:00:40 +0100
> From: Anders Lindgren
> Cc: random832@fastmail.com, 22169@debbugs.gnu.org
>
> I managed to get the attached patch to work (when used in conjunction with my
> previous patch).
>
> I've tested:
>
> * C-x C-f a TAB
> * (file-name-all-completions "a" ".")

OK, thanks.

The next step is to arrange for this to happen only with those values of
file-name-coding-system that require it.  My idea is to put a special
property on the coding-system's symbol, and check that inside
file_name_completion (outside of the loop).  Can you add this, or do
you want me to suggest a patch along these lines for you to test?
Note that the property check will have to be done on
file-name-coding-system if it is non-nil, otherwise on
default-file-name-coding-system (if that is non-nil).

Thanks.

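
A minimal sketch of the gating idea from the message above, written as standalone C rather than against Emacs internals. `struct coding_system` and its `needs_decoded_check` flag are hypothetical stand-ins for a coding-system symbol and the proposed property, and a NULL pointer plays the role of nil. The point is only the selection order (explicit file-name coding system first, default otherwise) and that the answer is computed once, before the directory loop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a coding system carrying the proposed property.  */
struct coding_system
{
  const char *name;
  bool needs_decoded_check;
};

static const struct coding_system UTF_8_HFS = { "utf-8-hfs", true };
static const struct coding_system UTF_8 = { "utf-8", false };

/* Use the explicit file-name coding system when set, else the default
   one.  Either argument may be NULL, standing in for nil.  */
const struct coding_system *
effective_file_name_coding (const struct coding_system *file_name_cs,
                            const struct coding_system *default_cs)
{
  return file_name_cs != NULL ? file_name_cs : default_cs;
}

/* Meant to be evaluated once, outside the loop over directory entries.  */
bool
decoded_check_required (const struct coding_system *file_name_cs,
                        const struct coding_system *default_cs)
{
  const struct coding_system *cs
    = effective_file_name_coding (file_name_cs, default_cs);
  return cs != NULL && cs->needs_decoded_check;
}

void
run_gating_examples (void)
{
  /* Nothing set explicitly: the default decides.  */
  assert (decoded_check_required (NULL, &UTF_8_HFS));
  /* An explicit setting takes precedence over the default.  */
  assert (!decoded_check_required (&UTF_8, &UTF_8_HFS));
  /* Both unset: no extra comparison.  */
  assert (!decoded_check_required (NULL, NULL));
}
```

Keeping the flag in a local boolean means the per-entry cost for other coding systems is a single branch.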
From debbugs-submit-bounces@debbugs.gnu.org Mon Dec 21 01:53:01 2015 Received: (at 22169) by debbugs.gnu.org; 21 Dec 2015 06:53:01 +0000 Received: from localhost ([127.0.0.1]:58508 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1aAuL6-0004PC-TU for submit@debbugs.gnu.org; Mon, 21 Dec 2015 01:53:01 -0500 Received: from mail-vk0-f49.google.com ([209.85.213.49]:35568) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1aAuL5-0004P0-9F for 22169@debbugs.gnu.org; Mon, 21 Dec 2015 01:52:59 -0500 Received: by mail-vk0-f49.google.com with SMTP id a189so95736664vkh.2 for <22169@debbugs.gnu.org>; Sun, 20 Dec 2015 22:52:59 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=nf8dD55Xjevb8ltEiTH3lzGfZqYsBl/AmfKVLtmpPvY=; b=01ZMFHvvOSY9p7+LWcHaJkuTqO3XIHtwxKUt7f2y/scR9j9feTk9KxV6vxnzR8bQBB GG/eocKd98n/ehagc428XjOVCl4r0ortc2/qsoKvM3vYFikwMRB13ecYslXOjEOfzuXZ p4qmXoO6eeV87to3skics4frZTTJNiywT9+AAn3aVGDGEAsSqAC85BHlf6UJtIQyQE5z +8OPKFP3caIBssr+NA8tFg0uKYIJNc0uCDNrviaj5bP9kAFo9jVA0zejhSeGaGf0bA4a ToX9py9RRR/POBCOdnbJ8TTalvhlvrG09uPbqLMFVd/GT1Nn2Jkfo/+8hmCujUWkimIj Eghg== MIME-Version: 1.0 X-Received: by 10.31.138.20 with SMTP id m20mr11640614vkd.70.1450680773619; Sun, 20 Dec 2015 22:52:53 -0800 (PST) Received: by 10.31.210.133 with HTTP; Sun, 20 Dec 2015 22:52:53 -0800 (PST) In-Reply-To: <83wps8sc6h.fsf@gnu.org> References: <83y4cw3kie.fsf@gnu.org> <83twnk3fg1.fsf@gnu.org> <83oads2x99.fsf@gnu.org> <83io3z3drh.fsf@gnu.org> <831tan32q2.fsf@gnu.org> <83r3ikxmis.fsf@gnu.org> <83fuyxt35q.fsf@gnu.org> <8337uwucyt.fsf@gnu.org> <83wps8sc6h.fsf@gnu.org> Date: Mon, 21 Dec 2015 07:52:53 +0100 Message-ID: Subject: Re: bug#22169: 25.0.50; File name compiletion doesn't work with non-ASCII characters on OS X From: Anders Lindgren To: Eli Zaretskii Content-Type: multipart/alternative; boundary=001a1144f92231b80b052762f0f4 
X-Spam-Score: 0.3 (/) X-Debbugs-Envelope-To: 22169 Cc: random832@fastmail.com, 22169@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.7 (/)

--001a1144f92231b80b052762f0f4
Content-Type: text/plain; charset=UTF-8

Hi!

I did some simple measurements with and without this patch. I ran
`(file-name-all-completions "x" "src")' on the Emacs src directory. The
timing values were almost identical (varying between 0.001012 and 0.001080).

The way I see it, the patch doesn't do any harm in any coding system, and
it is fast. Hence, I don't really see that it's worth the effort to make
this code conditional.

However, please write a patch for this if you still think it's necessary.
I can test it here to make sure it works under OS X.

    -- Anders

On Mon, Dec 21, 2015 at 4:39 AM, Eli Zaretskii wrote:

> > Date: Sun, 20 Dec 2015 23:00:40 +0100
> > From: Anders Lindgren
> > Cc: random832@fastmail.com, 22169@debbugs.gnu.org
> >
> > I managed to get the attached patch to work (when used in conjunction with my
> > previous patch).
> >
> > I've tested:
> >
> > * C-x C-f a TAB
> > * (file-name-all-completions "a" ".")
>
> OK, thanks.
>
> The next step is to arrange for this to happen only with those values of
> file-name-coding-system that require it.  My idea is to put a special
> property on the coding-system's symbol, and check that inside
> file_name_completion (outside of the loop).  Can you add this, or do
> you want me to suggest a patch along these lines for you to test?
> Note that the property check will have to be done on
> file-name-coding-system if it is non-nil, otherwise on
> default-file-name-coding-system (if that is non-nil).
>
> Thanks.
>

--001a1144f92231b80b052762f0f4
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
Hi!

I did some simple measurements with and without this patch. I ran `(file-name-all-completions "x" "src")' on the Emacs src directory. The timing values were almost identical (varying between 0.001012 and 0.001080).

The way I see it, the patch doesn't do any harm in any coding system, and it is fast. Hence, I don't really see that it's worth the effort to make this code conditional.

However, please write a patch for this if you still think it's necessary. I can test it here to make sure it works under OS X.

    -- Anders


On Mon, Dec 21, 2015 at 4:39 AM, Eli Zaretskii <eliz@gnu.org> wrote:
> Date: Sun, 20 Dec 2015 23:00:40 +0100
> From: Anders Lindgren <andlind@gmail.com>
> Cc: random832@fastmail.com, 22169@debbugs.gnu.org
>
> I managed to get the attached patch to work (when used in conjunction with my
> previous patch).
>
> I've tested:
>
> * C-x C-f a TAB
> * (file-name-all-completions "a" ".")

OK, thanks.

The next step is to arrange for this to happen only with those values of
file-name-coding-system that require it.  My idea is to put a special
property on the coding-system's symbol, and check that inside
file_name_completion (outside of the loop).  Can you add this, or do
you want me to suggest a patch along these lines for you to test?
Note that the property check will have to be done on
file-name-coding-system if it is non-nil, otherwise on
default-file-name-coding-system (if that is non-nil).

Thanks.

--001a1144f92231b80b052762f0f4-- From debbugs-submit-bounces@debbugs.gnu.org Mon Dec 21 11:09:00 2015 Received: (at 22169) by debbugs.gnu.org; 21 Dec 2015 16:09:00 +0000 Received: from localhost ([127.0.0.1]:59024 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1aB31A-0002FD-6I for submit@debbugs.gnu.org; Mon, 21 Dec 2015 11:09:00 -0500 Received: from eggs.gnu.org ([208.118.235.92]:43418) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1aB318-0002Ey-RZ for 22169@debbugs.gnu.org; Mon, 21 Dec 2015 11:08:59 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1aB30z-0007wo-EF for 22169@debbugs.gnu.org; Mon, 21 Dec 2015 11:08:53 -0500 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=0.8 required=5.0 tests=BAYES_50,RP_MATCHES_RCVD autolearn=disabled version=3.3.2 Received: from fencepost.gnu.org ([2001:4830:134:3::e]:36702) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1aB30z-0007wk-Az; Mon, 21 Dec 2015 11:08:49 -0500 Received: from 84.94.185.246.cable.012.net.il ([84.94.185.246]:4962 helo=HOME-C4E4A596F7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_128_CBC_SHA1:128) (Exim 4.82) (envelope-from ) id 1aB30w-0002XP-HQ; Mon, 21 Dec 2015 11:08:47 -0500 Date: Mon, 21 Dec 2015 18:09:18 +0200 Message-Id: <83fuyvss0h.fsf@gnu.org> From: Eli Zaretskii To: Anders Lindgren In-reply-to: (message from Anders Lindgren on Mon, 21 Dec 2015 07:52:53 +0100) Subject: Re: bug#22169: 25.0.50; File name compiletion doesn't work with non-ASCII characters on OS X References: <83y4cw3kie.fsf@gnu.org> <83twnk3fg1.fsf@gnu.org> <83oads2x99.fsf@gnu.org> <83io3z3drh.fsf@gnu.org> <831tan32q2.fsf@gnu.org> <83r3ikxmis.fsf@gnu.org> <83fuyxt35q.fsf@gnu.org> <8337uwucyt.fsf@gnu.org> <83wps8sc6h.fsf@gnu.org> X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 2001:4830:134:3::e X-Spam-Score: 
-5.0 (-----) X-Debbugs-Envelope-To: 22169 Cc: random832@fastmail.com, 22169@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: Eli Zaretskii Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -5.0 (-----)

> Date: Mon, 21 Dec 2015 07:52:53 +0100
> From: Anders Lindgren
> Cc: random832@fastmail.com, 22169@debbugs.gnu.org
>
> I did some simple measurements with and without this patch. I ran
> `(file-name-all-completions "x" "src")' on the Emacs src directory. The timing
> values were almost identical (varying between 0.001012 and 0.001080).

You should try it on a larger directory, preferably one that has many
files with non-ASCII file names.

> The way I see it, the patch doesn't do any harm in any coding system, and it is
> fast. Hence, I don't really see that it's worth the effort to make this code
> conditional.

I'm surprised to hear that.  Did you look at the implementation of
Fcompare_strings?  It's highly non-trivial.  What's more, if the user
sets completion-ignore-case non-nil, Fcompare_strings will call
Fupcase on each character, which is another non-trivial function; if
you are particularly unlucky, Fupcase can even GC (if it needs to set
up the case-table), which will definitely take several hundreds of
milliseconds if not longer.

And that's today; what if tomorrow someone comes and adds to
Fcompare_strings something that makes it even more complex and slow?

I've learned long ago not to call any non-trivial API unless I really
need it.  You can never know what complexity hides in there.  Besides,
it simply looks bad in the code to do processing that is unnecessary.

> However, please write a patch for this if you still think it's necessary. I
> can test it here to make sure it works under OS X.

Attached (relative to the current emacs-25 branch).

Please note that the patch below attempts to solve a couple of
additional subtle aspects of this:

  . it doesn't force the extra comparison for unibyte strings (which
    include ASCII strings and unibyte non-ASCII strings), since the
    issue doesn't exist then, and ENCODE_FILE/DECODE_FILE are no-ops

  . it forces the FILE argument to have all of its characters
    precomposed, since if the caller passes us a file name with
    decomposed characters, we risk rejecting them in the code we are
    adding

Please see that these indeed do their job correctly, as I could only
test the code very superficially.

Thanks.

diff --git a/lisp/international/ucs-normalize.el b/lisp/international/ucs-normalize.el
index 8839b00..6f2fb28 100644
--- a/lisp/international/ucs-normalize.el
+++ b/lisp/international/ucs-normalize.el
@@ -627,6 +627,10 @@ 'utf-8-hfs
   :pre-write-conversion 'ucs-normalize-hfs-nfd-pre-write-conversion
   )
 
+;; This is tested in dired.c:file_name_completion in order to reject
+;; false positives due to comparison of encoded file names.
+(coding-system-put 'utf-8-hfs 'decomposed-characters 't)
+
 (provide 'ucs-normalize)
 
 ;; Local Variables:
diff --git a/src/dired.c b/src/dired.c
index 84bf247..d5628d5 100644
--- a/src/dired.c
+++ b/src/dired.c
@@ -467,6 +467,7 @@ file_name_completion (Lisp_Object file, Lisp_Object dirname, bool all_flag,
      well as "." and "..".  Until shown otherwise, assume we can't exclude
      anything.  */
   bool includeall = 1;
+  bool check_decoded = false;
   ptrdiff_t count = SPECPDL_INDEX ();
 
   elt = Qnil;
@@ -485,6 +486,28 @@ file_name_completion (Lisp_Object file, Lisp_Object dirname, bool all_flag,
      on the encoded file name.  */
   encoded_file = ENCODE_FILE (file);
   encoded_dir = ENCODE_FILE (Fdirectory_file_name (dirname));
+  if (STRING_MULTIBYTE (file))
+    {
+      Lisp_Object file_encoding = Vfile_name_coding_system;
+
+      if (NILP (Vfile_name_coding_system))
+        file_encoding = Vdefault_file_name_coding_system;
+      /* If the file-name encoding decomposes characters, as we do for
+         HFS+ filesystems, we need to make an additional comparison of
+         decoded names in order to filter false positives, such as "a"
+         falsely matching "a-ring".  */
+      if (!NILP (file_encoding)
+          && !NILP (Fplist_get (Fcoding_system_plist (file_encoding),
+                                Qdecomposed_characters)))
+        {
+          check_decoded = true;
+          /* Recompute FILE to make sure any decomposed characters in
+             it are re-composed by the post-read-conversion.
+             Otherwise, any decomposed characters will be rejected by
+             the additional check below.  */
+          file = DECODE_FILE (encoded_file);
+        }
+    }
   int fd;
   DIR *d = open_directory (encoded_dir, &fd);
   record_unwind_protect_ptr (directory_files_internal_unwind, d);
@@ -637,6 +660,21 @@ file_name_completion (Lisp_Object file, Lisp_Object dirname, bool all_flag,
       if (!NILP (predicate) && NILP (call1 (predicate, name)))
         continue;
 
+      /* Reject entries where the encoded strings match, but the
+         decoded don't.  For example, "a" should not match "a-ring" on
+         file systems that store decomposed characters.  */
+      Lisp_Object zero = make_number (0);
+      Lisp_Object compare;
+      Lisp_Object cmp;
+      if (check_decoded && SCHARS (file) <= SCHARS (name))
+        {
+          compare = make_number (SCHARS (file));
+          cmp = Fcompare_strings (name, zero, compare, file, zero, compare,
+                                  completion_ignore_case ? Qt : Qnil);
+          if (!EQ (cmp, Qt))
+            continue;
+        }
+
       /* Suitably record this match.  */
 
       matchcount += matchcount <= 1;
@@ -650,15 +688,13 @@ file_name_completion (Lisp_Object file, Lisp_Object dirname, bool all_flag,
         }
       else
         {
-          Lisp_Object zero = make_number (0);
           /* FIXME: This is a copy of the code in Ftry_completion.  */
-          ptrdiff_t compare = min (bestmatchsize, SCHARS (name));
-          Lisp_Object cmp
-            = Fcompare_strings (bestmatch, zero,
-                                make_number (compare),
-                                name, zero,
-                                make_number (compare),
-                                completion_ignore_case ? Qt : Qnil);
+          compare = min (bestmatchsize, SCHARS (name));
+          cmp = Fcompare_strings (bestmatch, zero,
+                                  make_number (compare),
+                                  name, zero,
+                                  make_number (compare),
+                                  completion_ignore_case ? Qt : Qnil);
           ptrdiff_t matchsize = EQ (cmp, Qt) ? compare : eabs (XINT (cmp)) - 1;
 
           if (completion_ignore_case)
@@ -1007,6 +1043,7 @@ syms_of_dired (void)
   DEFSYM (Qfile_attributes, "file-attributes");
   DEFSYM (Qfile_attributes_lessp, "file-attributes-lessp");
   DEFSYM (Qdefault_directory, "default-directory");
+  DEFSYM (Qdecomposed_characters, "decomposed-characters");
 
   defsubr (&Sdirectory_files);
   defsubr (&Sdirectory_files_and_attributes);

From debbugs-submit-bounces@debbugs.gnu.org Mon Dec 21 17:03:46 2015 Received: (at 22169) by debbugs.gnu.org; 21 Dec 2015 22:03:47 +0000 Received: from localhost ([127.0.0.1]:59177 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1aB8YU-000269-94 for submit@debbugs.gnu.org; Mon, 21 Dec 2015 17:03:46 -0500 Received: from mail-vk0-f44.google.com ([209.85.213.44]:34269) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1aB8YS-00025x-Gy for 22169@debbugs.gnu.org; Mon, 21 Dec 2015 17:03:45 -0500 Received: by mail-vk0-f44.google.com with SMTP id j66so108234082vkg.1 for <22169@debbugs.gnu.org>; Mon, 21 Dec 2015 14:03:44 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=J3DWQ4MseYQ6qoowEPoIUh3p5LjhZPoQiT9uM7y4854=; b=BALCQkB7Mi+75fj5rT8Qa01gvgcp14QsvGCEF6sc6Y6RJymoj7t+pZTLy7OHib6gS7 khvJwxZ4ALHvAhUCCNBDckuZ7y8RDqm9YPzMhsM00g1sVlR4UNfuFKR5nY6KyCnlxtep 31aBhyQkAi/Dc9RLL3wWhV6c1hhOa250FGGbbyEbjTFv5rcMfHtQv1v//qmmEI68boDL
9q9D0QcwA6dRKXMr6u9+ciHtvjA28zAtmSX/Uch7X//a+Zs9i/KUVthTSooyk55/4fhg KKPJ5bZSz4bpGVL2KN8g9ny07SXX2I0g8dap2UCzDlhgmTYU4xxII49KSpTsij2UeEwB VTew== MIME-Version: 1.0 X-Received: by 10.31.10.199 with SMTP id 190mr14127617vkk.51.1450735419039; Mon, 21 Dec 2015 14:03:39 -0800 (PST) Received: by 10.31.210.133 with HTTP; Mon, 21 Dec 2015 14:03:38 -0800 (PST) In-Reply-To: <83fuyvss0h.fsf@gnu.org> References: <83y4cw3kie.fsf@gnu.org> <83twnk3fg1.fsf@gnu.org> <83oads2x99.fsf@gnu.org> <83io3z3drh.fsf@gnu.org> <831tan32q2.fsf@gnu.org> <83r3ikxmis.fsf@gnu.org> <83fuyxt35q.fsf@gnu.org> <8337uwucyt.fsf@gnu.org> <83wps8sc6h.fsf@gnu.org> <83fuyvss0h.fsf@gnu.org> Date: Mon, 21 Dec 2015 23:03:38 +0100 Message-ID: Subject: Re: bug#22169: 25.0.50; File name compiletion doesn't work with non-ASCII characters on OS X From: Anders Lindgren To: Eli Zaretskii Content-Type: multipart/alternative; boundary=001a1144017650a55005276fa9de X-Spam-Score: 0.3 (/) X-Debbugs-Envelope-To: 22169 Cc: random832@fastmail.com, 22169@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.7 (/)

--001a1144017650a55005276fa9de
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

Hi!

I just tested your latest patch. Unfortunately, it doesn't work properly.

When pressing TAB, it expands the characters correctly. However,
`file-name-all-completions' doesn't work:

(file-name-all-completions "a" ".")
("åäöfirst.txt" "aaosecond.txt")

I haven't had time to see what actually happens in the code, though.
However, the "if (STRING_MULTIBYTE (file))" looks suspicious as the decoded
value needs to be checked even for strings like "a". (However, I don't
really know what STRING_MULTIBYTE does.)

    -- Anders

On Mon, Dec 21, 2015 at 5:09 PM, Eli Zaretskii wrote:

> > Date: Mon, 21 Dec 2015 07:52:53 +0100
> > From: Anders Lindgren
> > Cc: random832@fastmail.com, 22169@debbugs.gnu.org
> >
> > I did some simple measurements with and without this patch. I ran
> > `(file-name-all-completions "x" "src")' on the Emacs src directory. The timing
> > values were almost identical (varying between 0.001012 and 0.001080).
>
> You should try it on a larger directory, preferably one that has many
> files with non-ASCII file names.
>
> > The way I see it, the patch doesn't do any harm in any coding system, and it is
> > fast. Hence, I don't really see that it's worth the effort to make this code
> > conditional.
>
> I'm surprised to hear that.  Did you look at the implementation of
> Fcompare_strings?  It's highly non-trivial.  What's more, if the user
> sets completion-ignore-case non-nil, Fcompare_strings will call
> Fupcase on each character, which is another non-trivial function; if
> you are particularly unlucky, Fupcase can even GC (if it needs to set
> up the case-table), which will definitely take several hundreds of
> milliseconds if not longer.
>
> And that's today; what if tomorrow someone comes and adds to
> Fcompare_strings something that makes it even more complex and slow?
>
> I've learned long ago not to call any non-trivial API unless I really
> need it.  You can never know what complexity hides in there.  Besides,
> it simply looks bad in the code to do processing that is unnecessary.
>
> > However, please write a patch for this if you still think it's necessary. I
> > can test it here to make sure it works under OS X.
>
> Attached (relative to the current emacs-25 branch).
>
> Please note that the patch below attempts to solve a couple of
> additional subtle aspects of this:
>
>   . it doesn't force the extra comparison for unibyte strings (which
>     include ASCII strings and unibyte non-ASCII strings), since the
>     issue doesn't exist then, and ENCODE_FILE/DECODE_FILE are no-ops
>
>   . it forces the FILE argument to have all of its characters
>     precomposed, since if the caller passes us a file name with
>     decomposed characters, we risk rejecting them in the code we are
>     adding
>
> Please see that these indeed do their job correctly, as I could only
> test the code very superficially.
>
> Thanks.
>
> diff --git a/lisp/international/ucs-normalize.el b/lisp/international/ucs-normalize.el
> index 8839b00..6f2fb28 100644
> --- a/lisp/international/ucs-normalize.el
> +++ b/lisp/international/ucs-normalize.el
> @@ -627,6 +627,10 @@ 'utf-8-hfs
>    :pre-write-conversion 'ucs-normalize-hfs-nfd-pre-write-conversion
>    )
>
> +;; This is tested in dired.c:file_name_completion in order to reject
> +;; false positives due to comparison of encoded file names.
> +(coding-system-put 'utf-8-hfs 'decomposed-characters 't)
> +
>  (provide 'ucs-normalize)
>
>  ;; Local Variables:
> diff --git a/src/dired.c b/src/dired.c
> index 84bf247..d5628d5 100644
> --- a/src/dired.c
> +++ b/src/dired.c
> @@ -467,6 +467,7 @@ file_name_completion (Lisp_Object file, Lisp_Object dirname, bool all_flag,
>       well as "." and "..".  Until shown otherwise, assume we can't exclude
>       anything.  */
>    bool includeall = 1;
> +  bool check_decoded = false;
>    ptrdiff_t count = SPECPDL_INDEX ();
>
>    elt = Qnil;
> @@ -485,6 +486,28 @@ file_name_completion (Lisp_Object file, Lisp_Object dirname, bool all_flag,
>       on the encoded file name.  */
>    encoded_file = ENCODE_FILE (file);
>    encoded_dir = ENCODE_FILE (Fdirectory_file_name (dirname));
> +  if (STRING_MULTIBYTE (file))
> +    {
> +      Lisp_Object file_encoding = Vfile_name_coding_system;
> +
> +      if (NILP (Vfile_name_coding_system))
> +        file_encoding = Vdefault_file_name_coding_system;
> +      /* If the file-name encoding decomposes characters, as we do for
> +         HFS+ filesystems, we need to make an additional comparison of
> +         decoded names in order to filter false positives, such as "a"
> +         falsely matching "a-ring".  */
> +      if (!NILP (file_encoding)
> +          && !NILP (Fplist_get (Fcoding_system_plist (file_encoding),
> +                                Qdecomposed_characters)))
> +        {
> +          check_decoded = true;
> +          /* Recompute FILE to make sure any decomposed characters in
> +             it are re-composed by the post-read-conversion.
> +             Otherwise, any decomposed characters will be rejected by
> +             the additional check below.  */
> +          file = DECODE_FILE (encoded_file);
> +        }
> +    }
>    int fd;
>    DIR *d = open_directory (encoded_dir, &fd);
>    record_unwind_protect_ptr (directory_files_internal_unwind, d);
> @@ -637,6 +660,21 @@ file_name_completion (Lisp_Object file, Lisp_Object dirname, bool all_flag,
>        if (!NILP (predicate) && NILP (call1 (predicate, name)))
>          continue;
>
> +      /* Reject entries where the encoded strings match, but the
> +         decoded don't.  For example, "a" should not match "a-ring" on
> +         file systems that store decomposed characters. */
> +      Lisp_Object zero = make_number (0);
> +      Lisp_Object compare;
> +      Lisp_Object cmp;
> +      if (check_decoded && SCHARS (file) <= SCHARS (name))
> +        {
> +          compare = make_number (SCHARS (file));
> +          cmp = Fcompare_strings (name, zero, compare, file, zero, compare,
> +                                  completion_ignore_case ? Qt : Qnil);
> +          if (!EQ (cmp, Qt))
> +            continue;
> +        }
> +
>        /* Suitably record this match.  */
>
>        matchcount += matchcount <= 1;
> @@ -650,15 +688,13 @@ file_name_completion (Lisp_Object file, Lisp_Object dirname, bool all_flag,
>          }
>        else
>          {
> -          Lisp_Object zero = make_number (0);
>            /* FIXME: This is a copy of the code in Ftry_completion.  */
> -          ptrdiff_t compare = min (bestmatchsize, SCHARS (name));
> -          Lisp_Object cmp
> -            = Fcompare_strings (bestmatch, zero,
> -                                make_number (compare),
> -                                name, zero,
> -                                make_number (compare),
> -                                completion_ignore_case ? Qt : Qnil);
> +          compare = min (bestmatchsize, SCHARS (name));
> +          cmp = Fcompare_strings (bestmatch, zero,
> +                                  make_number (compare),
> +                                  name, zero,
> +                                  make_number (compare),
> +                                  completion_ignore_case ? Qt : Qnil);
>            ptrdiff_t matchsize = EQ (cmp, Qt) ? compare : eabs (XINT (cmp)) - 1;
>
>            if (completion_ignore_case)
> @@ -1007,6 +1043,7 @@ syms_of_dired (void)
>    DEFSYM (Qfile_attributes, "file-attributes");
>    DEFSYM (Qfile_attributes_lessp, "file-attributes-lessp");
>    DEFSYM (Qdefault_directory, "default-directory");
> +  DEFSYM (Qdecomposed_characters, "decomposed-characters");
>
>    defsubr (&Sdirectory_files);
>    defsubr (&Sdirectory_files_and_attributes);
>

--001a1144017650a55005276fa9de
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
--001a1144017650a55005276fa9de--

From debbugs-submit-bounces@debbugs.gnu.org Mon Dec 21 22:36:47 2015
Received: (at 22169) by debbugs.gnu.org; 22 Dec 2015 03:36:47 +0000
Received: from localhost ([127.0.0.1]:59303 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1aBDkl-0001ya-Lt for submit@debbugs.gnu.org; Mon, 21 Dec 2015 22:36:47 -0500
Received: from eggs.gnu.org ([208.118.235.92]:35508) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1aBDkj-0001yO-UM for 22169@debbugs.gnu.org; Mon, 21 Dec 2015 22:36:46 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1aBDka-0001DB-7P for 22169@debbugs.gnu.org; Mon, 21 Dec 2015 22:36:40 -0500
X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org
X-Spam-Level:
X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00,RP_MATCHES_RCVD autolearn=disabled version=3.3.2
Received: from fencepost.gnu.org ([2001:4830:134:3::e]:47726) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1aBDka-0001D7-4T; Mon, 21 Dec 2015 22:36:36 -0500
Received: from 84.94.185.246.cable.012.net.il ([84.94.185.246]:1436 helo=HOME-C4E4A596F7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_128_CBC_SHA1:128) (Exim 4.82) (envelope-from ) id 1aBDkZ-0008VH-Bw; Mon, 21 Dec 2015 22:36:35 -0500
Date: Tue, 22 Dec 2015 05:37:08 +0200
Message-Id: <83poxzqhln.fsf@gnu.org>
From: Eli Zaretskii
To: Anders Lindgren
In-reply-to: (message from Anders Lindgren on Mon, 21 Dec 2015 23:03:38 +0100)
Subject: Re: bug#22169: 25.0.50; File name compiletion doesn't work with non-ASCII characters on OS X
References: <83y4cw3kie.fsf@gnu.org> <83twnk3fg1.fsf@gnu.org> <83oads2x99.fsf@gnu.org> <83io3z3drh.fsf@gnu.org> <831tan32q2.fsf@gnu.org> <83r3ikxmis.fsf@gnu.org> <83fuyxt35q.fsf@gnu.org> <8337uwucyt.fsf@gnu.org> <83wps8sc6h.fsf@gnu.org> <83fuyvss0h.fsf@gnu.org>
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic]
X-Received-From: 2001:4830:134:3::e
X-Spam-Score: -5.0 (-----)
X-Debbugs-Envelope-To: 22169
Cc: random832@fastmail.com, 22169@debbugs.gnu.org
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Reply-To: Eli Zaretskii
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
Sender: "Debbugs-submit"
X-Spam-Score: -5.0 (-----)

> Date: Mon, 21 Dec 2015 23:03:38 +0100
> From: Anders Lindgren
> Cc: random832@fastmail.com, 22169@debbugs.gnu.org
>
> I haven't had time to see what actually happens in the code, though. However,
> the "  if (STRING_MULTIBYTE (file))" looks suspicious as the decoded value needs
> to be checked even for strings like "a". (However, I don't really know what
> STRING_MULTIBYTE does.)

Does removing the STRING_MULTIBYTE test make it work?

From debbugs-submit-bounces@debbugs.gnu.org Tue Dec 22 00:42:55 2015
Received: (at 22169) by debbugs.gnu.org; 22 Dec 2015 05:42:55 +0000
Received: from localhost ([127.0.0.1]:59364 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1aBFio-0005ES-Sm for submit@debbugs.gnu.org; Tue, 22 Dec 2015 00:42:55 -0500
Received: from mail-vk0-f42.google.com ([209.85.213.42]:36032) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1aBFim-0005ED-WB for 22169@debbugs.gnu.org; Tue, 22 Dec 2015 00:42:53 -0500
Received: by mail-vk0-f42.google.com with SMTP id f2so73791079vkb.3 for <22169@debbugs.gnu.org>; Mon, 21 Dec 2015 21:42:52 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=w5yAeegl4HNyhbTlzSo/QRZJvP7LRN5NAj0eaEz2+rM=; b=KulAShH9K2W61dNpG1yXcN8ss3AhR0YDe+MPj0u0aw6nCqv4ScksuJYHfdFsU1YR7J jbTgFriW6VW6HJ7NwUSLXlUYfJoRW3B288cNJVHg5pFGBop9ZtGySTI6JEH6Pi7T4M4J jOHnwTH/nO34oXFDFc+gX/ct9slB3GntxgzSEQiRWbSWMYN68KF3KAmLJ3wFHoTHq2F1 GsZxtLYqihjvcfEHp8SrW1XwM/yFUBr4UZr9BKqGxTZigajdXwRgIog4E8J6wOHyARag QDojYxtFp1egiQCo1NmkGrM8nCyF+PA3x1ooUvlCVI8vqnaqvWzqHESEGMGfgDm/s0VU u+Nw==
MIME-Version: 1.0
X-Received: by 10.31.152.207 with SMTP id a198mr14710709vke.68.1450762967313; Mon, 21 Dec 2015 21:42:47 -0800 (PST)
Received: by 10.31.210.133 with HTTP; Mon, 21 Dec 2015 21:42:47 -0800 (PST)
In-Reply-To: <83poxzqhln.fsf@gnu.org>
References: <83y4cw3kie.fsf@gnu.org> <83twnk3fg1.fsf@gnu.org> <83oads2x99.fsf@gnu.org> <83io3z3drh.fsf@gnu.org> <831tan32q2.fsf@gnu.org> <83r3ikxmis.fsf@gnu.org> <83fuyxt35q.fsf@gnu.org> <8337uwucyt.fsf@gnu.org> <83wps8sc6h.fsf@gnu.org> <83fuyvss0h.fsf@gnu.org> <83poxzqhln.fsf@gnu.org>
Date: Tue, 22 Dec 2015 06:42:47 +0100
Message-ID:
Subject: Re: bug#22169: 25.0.50; File name compiletion doesn't work with non-ASCII characters on OS X
From: Anders Lindgren
To: Eli Zaretskii
Content-Type: multipart/alternative; boundary=001a113d39e251f4ed0527761363
X-Spam-Score: -0.7 (/)
X-Debbugs-Envelope-To: 22169
Cc: random832@fastmail.com, 22169@debbugs.gnu.org
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
Sender: "Debbugs-submit"
X-Spam-Score: -0.7 (/)

--001a113d39e251f4ed0527761363
Content-Type: text/plain; charset=UTF-8

Hi!

> > I haven't had time to see what actually happens in the code, though. However,
> > the "  if (STRING_MULTIBYTE (file))" looks suspicious as the decoded value needs
> > to be checked even for strings like "a". (However, I don't really know what
> > STRING_MULTIBYTE does.)
>
> Does removing the STRING_MULTIBYTE test make it work?

Yes, it does:

(file-name-all-completions "a" ".")
("aaosecond.txt")

    -- Anders

--001a113d39e251f4ed0527761363
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
--001a113d39e251f4ed0527761363--

From debbugs-submit-bounces@debbugs.gnu.org Tue Dec 22 12:09:56 2015
Received: (at 22169) by debbugs.gnu.org; 22 Dec 2015 17:09:56 +0000
Received: from localhost ([127.0.0.1]:60139 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1aBQRg-0006VT-DI for submit@debbugs.gnu.org; Tue, 22 Dec 2015 12:09:56 -0500
Received: from eggs.gnu.org ([208.118.235.92]:47837) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1aBQRe-0006VG-Pm for 22169@debbugs.gnu.org; Tue, 22 Dec 2015 12:09:55 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1aBQRV-0005E8-H3 for 22169@debbugs.gnu.org; Tue, 22 Dec 2015 12:09:49 -0500
X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org
X-Spam-Level:
X-Spam-Status: No, score=-0.0 required=5.0 tests=BAYES_20,RP_MATCHES_RCVD autolearn=disabled version=3.3.2
Received: from fencepost.gnu.org ([2001:4830:134:3::e]:34002) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1aBQRV-0005E4-EA; Tue, 22 Dec 2015 12:09:45 -0500
Received: from 84.94.185.246.cable.012.net.il ([84.94.185.246]:1598 helo=HOME-C4E4A596F7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_128_CBC_SHA1:128) (Exim 4.82) (envelope-from ) id 1aBQRU-0006Pb-P3; Tue, 22 Dec 2015 12:09:45 -0500
Date: Tue, 22 Dec 2015 19:10:19 +0200
Message-Id: <838u4mquis.fsf@gnu.org>
From: Eli Zaretskii
To: Anders Lindgren
In-reply-to: (message from Anders Lindgren on Tue, 22 Dec 2015 06:42:47 +0100)
Subject: Re: bug#22169: 25.0.50; File name compiletion doesn't work with non-ASCII characters on OS X
References: <83y4cw3kie.fsf@gnu.org> <83twnk3fg1.fsf@gnu.org> <83oads2x99.fsf@gnu.org> <83io3z3drh.fsf@gnu.org> <831tan32q2.fsf@gnu.org> <83r3ikxmis.fsf@gnu.org> <83fuyxt35q.fsf@gnu.org> <8337uwucyt.fsf@gnu.org> <83wps8sc6h.fsf@gnu.org> <83fuyvss0h.fsf@gnu.org> <83poxzqhln.fsf@gnu.org>
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic]
X-Received-From: 2001:4830:134:3::e
X-Spam-Score: -5.0 (-----)
X-Debbugs-Envelope-To: 22169
Cc: random832@fastmail.com, 22169@debbugs.gnu.org
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Reply-To: Eli Zaretskii
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
Sender: "Debbugs-submit"
X-Spam-Score: -5.0 (-----)

> Date: Tue, 22 Dec 2015 06:42:47 +0100
> From: Anders Lindgren
> Cc: random832@fastmail.com, 22169@debbugs.gnu.org
>
> > > I haven't had time to see what actually happens in the code, though. However,
> > > the "  if (STRING_MULTIBYTE (file))" looks suspicious as the decoded value needs
> > > to be checked even for strings like "a". (However, I don't really know what
> > > STRING_MULTIBYTE does.)
> >
> > Does removing the STRING_MULTIBYTE test make it work?
>
> Yes, it does:
>
> (file-name-all-completions "a" ".")
> ("aaosecond.txt")

Then please try the final patch below (again against the current
emacs-25 branch), I hope I didn't goof this time.

Thanks.

diff --git a/lisp/international/ucs-normalize.el b/lisp/international/ucs-normalize.el
index 8839b00..6f2fb28 100644
--- a/lisp/international/ucs-normalize.el
+++ b/lisp/international/ucs-normalize.el
@@ -627,6 +627,10 @@ 'utf-8-hfs
   :pre-write-conversion 'ucs-normalize-hfs-nfd-pre-write-conversion
   )

+;; This is tested in dired.c:file_name_completion in order to reject
+;; false positives due to comparison of encoded file names.
+(coding-system-put 'utf-8-hfs 'decomposed-characters 't)
+
 (provide 'ucs-normalize)

 ;; Local Variables:
diff --git a/src/dired.c b/src/dired.c
index 84bf247..89bd908 100644
--- a/src/dired.c
+++ b/src/dired.c
@@ -467,6 +467,7 @@ file_name_completion (Lisp_Object file, Lisp_Object dirname, bool all_flag,
      well as "." and "..".  Until shown otherwise, assume we can't exclude
      anything.  */
   bool includeall = 1;
+  bool check_decoded = false;
   ptrdiff_t count = SPECPDL_INDEX ();

   elt = Qnil;
@@ -485,6 +486,28 @@ file_name_completion (Lisp_Object file, Lisp_Object dirname, bool all_flag,
      on the encoded file name.  */
   encoded_file = ENCODE_FILE (file);
   encoded_dir = ENCODE_FILE (Fdirectory_file_name (dirname));
+
+  Lisp_Object file_encoding = Vfile_name_coding_system;
+  if (NILP (Vfile_name_coding_system))
+    file_encoding = Vdefault_file_name_coding_system;
+  /* If the file-name encoding decomposes characters, as we do for
+     HFS+ filesystems, we need to make an additional comparison of
+     decoded names in order to filter false positives, such as "a"
+     falsely matching "a-ring".  */
+  if (!NILP (file_encoding)
+      && !NILP (Fplist_get (Fcoding_system_plist (file_encoding),
+                            Qdecomposed_characters)))
+    {
+      check_decoded = true;
+      if (STRING_MULTIBYTE (file))
+        {
+          /* Recompute FILE to make sure any decomposed characters in
+             it are re-composed by the post-read-conversion.
+             Otherwise, any decomposed characters will be rejected by
+             the additional check below.  */
+          file = DECODE_FILE (encoded_file);
+        }
+    }
   int fd;
   DIR *d = open_directory (encoded_dir, &fd);
   record_unwind_protect_ptr (directory_files_internal_unwind, d);
@@ -637,6 +660,21 @@ file_name_completion (Lisp_Object file, Lisp_Object dirname, bool all_flag,
       if (!NILP (predicate) && NILP (call1 (predicate, name)))
         continue;

+      /* Reject entries where the encoded strings match, but the
+         decoded don't.  For example, "a" should not match "a-ring" on
+         file systems that store decomposed characters. */
+      Lisp_Object zero = make_number (0);
+      Lisp_Object compare;
+      Lisp_Object cmp;
+      if (check_decoded && SCHARS (file) <= SCHARS (name))
+        {
+          compare = make_number (SCHARS (file));
+          cmp = Fcompare_strings (name, zero, compare, file, zero, compare,
+                                  completion_ignore_case ? Qt : Qnil);
+          if (!EQ (cmp, Qt))
+            continue;
+        }
+
       /* Suitably record this match.  */

       matchcount += matchcount <= 1;
@@ -650,15 +688,13 @@ file_name_completion (Lisp_Object file, Lisp_Object dirname, bool all_flag,
         }
       else
         {
-          Lisp_Object zero = make_number (0);
           /* FIXME: This is a copy of the code in Ftry_completion.  */
-          ptrdiff_t compare = min (bestmatchsize, SCHARS (name));
-          Lisp_Object cmp
-            = Fcompare_strings (bestmatch, zero,
-                                make_number (compare),
-                                name, zero,
-                                make_number (compare),
-                                completion_ignore_case ? Qt : Qnil);
+          compare = min (bestmatchsize, SCHARS (name));
+          cmp = Fcompare_strings (bestmatch, zero,
+                                  make_number (compare),
+                                  name, zero,
+                                  make_number (compare),
+                                  completion_ignore_case ? Qt : Qnil);
           ptrdiff_t matchsize = EQ (cmp, Qt) ? compare : eabs (XINT (cmp)) - 1;

           if (completion_ignore_case)
@@ -1007,6 +1043,7 @@ syms_of_dired (void)
   DEFSYM (Qfile_attributes, "file-attributes");
   DEFSYM (Qfile_attributes_lessp, "file-attributes-lessp");
   DEFSYM (Qdefault_directory, "default-directory");
+  DEFSYM (Qdecomposed_characters, "decomposed-characters");

   defsubr (&Sdirectory_files);
   defsubr (&Sdirectory_files_and_attributes);

From debbugs-submit-bounces@debbugs.gnu.org Tue Dec 22 17:29:24 2015
Received: (at 22169) by debbugs.gnu.org; 22 Dec 2015 22:29:24 +0000
Received: from localhost ([127.0.0.1]:60226 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1aBVQp-0005PX-G6 for submit@debbugs.gnu.org; Tue, 22 Dec 2015 17:29:23 -0500
Received: from mail-vk0-f44.google.com ([209.85.213.44]:33941) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1aBVQn-0005PK-I9 for 22169@debbugs.gnu.org; Tue, 22 Dec 2015 17:29:22 -0500
Received: by mail-vk0-f44.google.com with SMTP id j66so125205248vkg.1 for <22169@debbugs.gnu.org>; Tue, 22 Dec 2015 14:29:21 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=bi4v9u2ThZzv3MIqPM0Tx7abTDfEZ7xDR7BZ6bAPL+Q=; b=M+W0tfUMbFEKIbLMJxpgp8lSpUzltLL1U2HgLAbrRS4gXlKDWEizmXkCxLeMGa93L4 vHLh7rXHtNsOOcPofWaJt/fBSGYR70zRf0O0dQNzXX3uLMvCKxJb/WlsYxVfdI0xIYYx qT6xrB9cGUjxpLIIsGQVvnFieT2BPpetFQTi7cQYmC9U1YEo4YYOxZXx3VhdXOIoQc3d YlYnPhyMHCOYwh9BoYeu+RXr1duykH6ggVeYEMzQFIQpet58xJpjCNx01L2JiK9NsEdw 1KCHCKM5LPHm8MTc2w4Qfx9dW2mJmcRRIE2FTzkE5wy1tfqrCGV/+nqzXkEwfNQE9vzi scTg==
MIME-Version: 1.0
X-Received: by 10.31.10.199 with SMTP id 190mr18044364vkk.51.1450823356161; Tue, 22 Dec 2015 14:29:16 -0800 (PST)
Received: by 10.31.210.133 with HTTP; Tue, 22 Dec 2015 14:29:16 -0800 (PST)
In-Reply-To: <838u4mquis.fsf@gnu.org>
References: <83y4cw3kie.fsf@gnu.org> <83twnk3fg1.fsf@gnu.org> <83oads2x99.fsf@gnu.org> <83io3z3drh.fsf@gnu.org> <831tan32q2.fsf@gnu.org> <83r3ikxmis.fsf@gnu.org> <83fuyxt35q.fsf@gnu.org> <8337uwucyt.fsf@gnu.org> <83wps8sc6h.fsf@gnu.org> <83fuyvss0h.fsf@gnu.org> <83poxzqhln.fsf@gnu.org> <838u4mquis.fsf@gnu.org>
Date: Tue, 22 Dec 2015 23:29:16 +0100
Message-ID:
Subject: Re: bug#22169: 25.0.50; File name compiletion doesn't work with non-ASCII characters on OS X
From: Anders Lindgren
To: Eli Zaretskii
Content-Type: multipart/alternative; boundary=001a11440176c6a3890527842298
X-Spam-Score: 0.3 (/)
X-Debbugs-Envelope-To: 22169
Cc: random832@fastmail.com, 22169@debbugs.gnu.org
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
Sender: "Debbugs-submit"
X-Spam-Score: -0.7 (/)

--001a11440176c6a3890527842298
Content-Type: text/plain; charset=UTF-8

Hi!

I just tried this and I can confirm that it works. Thanks for your hard
work!

I will push my patch for using the utf-8-hfs coding system (most likely)
tomorrow.

    -- Anders

On Tue, Dec 22, 2015 at 6:10 PM, Eli Zaretskii wrote:

> > Date: Tue, 22 Dec 2015 06:42:47 +0100
> > From: Anders Lindgren
> > Cc: random832@fastmail.com, 22169@debbugs.gnu.org
> >
> > > I haven't had time to see what actually happens in the code, though. However,
> > > the " if (STRING_MULTIBYTE (file))" looks suspicious as the decoded value needs
> > > to be checked even for strings like "a". (However, I don't really know what
> > > STRING_MULTIBYTE does.)
> >
> > Does removing the STRING_MULTIBYTE test make it work?
> >
> > Yes, it does:
> >
> > (file-name-all-completions "a" ".")
> > ("aaosecond.txt")
>
> Then please try the final patch below (again against the current
> emacs-25 branch), I hope I didn't goof this time.
>
> Thanks.
>
> diff --git a/lisp/international/ucs-normalize.el b/lisp/international/ucs-normalize.el
> index 8839b00..6f2fb28 100644
> --- a/lisp/international/ucs-normalize.el
> +++ b/lisp/international/ucs-normalize.el
> @@ -627,6 +627,10 @@ 'utf-8-hfs
>    :pre-write-conversion 'ucs-normalize-hfs-nfd-pre-write-conversion
>    )
>
> +;; This is tested in dired.c:file_name_completion in order to reject
> +;; false positives due to comparison of encoded file names.
> +(coding-system-put 'utf-8-hfs 'decomposed-characters 't)
> +
>  (provide 'ucs-normalize)
>
>  ;; Local Variables:
> diff --git a/src/dired.c b/src/dired.c
> index 84bf247..89bd908 100644
> --- a/src/dired.c
> +++ b/src/dired.c
> @@ -467,6 +467,7 @@ file_name_completion (Lisp_Object file, Lisp_Object dirname, bool all_flag,
>       well as "." and "..".  Until shown otherwise, assume we can't exclude
>       anything.  */
>    bool includeall = 1;
> +  bool check_decoded = false;
>    ptrdiff_t count = SPECPDL_INDEX ();
>
>    elt = Qnil;
> @@ -485,6 +486,28 @@ file_name_completion (Lisp_Object file, Lisp_Object dirname, bool all_flag,
>       on the encoded file name.  */
>    encoded_file = ENCODE_FILE (file);
>    encoded_dir = ENCODE_FILE (Fdirectory_file_name (dirname));
> +
> +  Lisp_Object file_encoding = Vfile_name_coding_system;
> +  if (NILP (Vfile_name_coding_system))
> +    file_encoding = Vdefault_file_name_coding_system;
> +  /* If the file-name encoding decomposes characters, as we do for
> +     HFS+ filesystems, we need to make an additional comparison of
> +     decoded names in order to filter false positives, such as "a"
> +     falsely matching "a-ring".  */
> +  if (!NILP (file_encoding)
> +      && !NILP (Fplist_get (Fcoding_system_plist (file_encoding),
> +                            Qdecomposed_characters)))
> +    {
> +      check_decoded = true;
> +      if (STRING_MULTIBYTE (file))
> +        {
> +          /* Recompute FILE to make sure any decomposed characters in
> +             it are re-composed by the post-read-conversion.
> +             Otherwise, any decomposed characters will be rejected by
> +             the additional check below.  */
> +          file = DECODE_FILE (encoded_file);
> +        }
> +    }
>    int fd;
>    DIR *d = open_directory (encoded_dir, &fd);
>    record_unwind_protect_ptr (directory_files_internal_unwind, d);
> @@ -637,6 +660,21 @@ file_name_completion (Lisp_Object file, Lisp_Object dirname, bool all_flag,
>        if (!NILP (predicate) && NILP (call1 (predicate, name)))
>          continue;
>
> +      /* Reject entries where the encoded strings match, but the
> +         decoded don't.  For example, "a" should not match "a-ring" on
> +         file systems that store decomposed characters. */
> +      Lisp_Object zero = make_number (0);
> +      Lisp_Object compare;
> +      Lisp_Object cmp;
> +      if (check_decoded && SCHARS (file) <= SCHARS (name))
> +        {
> +          compare = make_number (SCHARS (file));
> +          cmp = Fcompare_strings (name, zero, compare, file, zero, compare,
> +                                  completion_ignore_case ? Qt : Qnil);
> +          if (!EQ (cmp, Qt))
> +            continue;
> +        }
> +
>        /* Suitably record this match.  */
>
>        matchcount += matchcount <= 1;
> @@ -650,15 +688,13 @@ file_name_completion (Lisp_Object file, Lisp_Object dirname, bool all_flag,
>          }
>        else
>          {
> -          Lisp_Object zero = make_number (0);
>            /* FIXME: This is a copy of the code in Ftry_completion.  */
> -          ptrdiff_t compare = min (bestmatchsize, SCHARS (name));
> -          Lisp_Object cmp
> -            = Fcompare_strings (bestmatch, zero,
> -                                make_number (compare),
> -                                name, zero,
> -                                make_number (compare),
> -                                completion_ignore_case ? Qt : Qnil);
> +          compare = min (bestmatchsize, SCHARS (name));
> +          cmp = Fcompare_strings (bestmatch, zero,
> +                                  make_number (compare),
> +                                  name, zero,
> +                                  make_number (compare),
> +                                  completion_ignore_case ? Qt : Qnil);
>            ptrdiff_t matchsize = EQ (cmp, Qt) ? compare : eabs (XINT (cmp)) - 1;
>
>            if (completion_ignore_case)
> @@ -1007,6 +1043,7 @@ syms_of_dired (void)
>    DEFSYM (Qfile_attributes, "file-attributes");
>    DEFSYM (Qfile_attributes_lessp, "file-attributes-lessp");
>    DEFSYM (Qdefault_directory, "default-directory");
> +  DEFSYM (Qdecomposed_characters, "decomposed-characters");
>
>    defsubr (&Sdirectory_files);
>    defsubr (&Sdirectory_files_and_attributes);
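
For readers following the thread, here is a minimal sketch in plain C, outside Emacs, of why the patch adds a second comparison on decoded names. It is an illustration under stated assumptions, not the code in dired.c: the names byte_prefix_p, toy_recompose and completion_match_p are hypothetical, and toy_recompose is a toy stand-in for the coding system's post-read conversion that only knows the single sequence "a" + U+030A (a-ring).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* UTF-8 bytes of a-ring: decomposed (U+0061 U+030A), as a file system
   that stores decomposed names would hold it, and precomposed (U+00E5).  */
#define A_RING_DECOMPOSED "a\xCC\x8A"
#define A_RING_COMPOSED   "\xC3\xA5"

/* Return true if the raw bytes of PREFIX start the raw bytes of NAME.
   This models the comparison done on encoded file names.  */
bool
byte_prefix_p (const char *prefix, const char *name)
{
  return strncmp (prefix, name, strlen (prefix)) == 0;
}

/* Hypothetical toy "decoder": copy SRC into DST (capacity CAP bytes),
   replacing each decomposed a-ring with the precomposed character.
   Real normalization handles far more than this one sequence.  */
void
toy_recompose (const char *src, char *dst, size_t cap)
{
  size_t out = 0;
  assert (cap > 0);
  /* Keep room for a two-byte character plus the terminating NUL.  */
  while (*src != '\0' && out + 2 < cap)
    {
      if (strncmp (src, A_RING_DECOMPOSED, 3) == 0)
        {
          memcpy (dst + out, A_RING_COMPOSED, 2);
          out += 2;
          src += 3;
        }
      else
        dst[out++] = *src++;
    }
  dst[out] = '\0';
}

/* Accept ENTRY as a completion of INPUT only if the encoded bytes match
   and the match still holds after both strings are recomposed.  */
bool
completion_match_p (const char *input, const char *entry)
{
  char in[256], en[256];
  if (!byte_prefix_p (input, entry))
    return false;
  toy_recompose (input, in, sizeof in);
  toy_recompose (entry, en, sizeof en);
  return byte_prefix_p (in, en);
}
```

With these definitions, byte_prefix_p ("a", A_RING_DECOMPOSED "second.txt") is true, which is the false positive described in the patch comment, while completion_match_p rejects that entry and still accepts "aaosecond.txt" for the input "a".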

--001a11440176c6a3890527842298-- From debbugs-submit-bounces@debbugs.gnu.org Tue Dec 22 22:36:49 2015 Received: (at 22169) by debbugs.gnu.org; 23 Dec 2015 03:36:49 +0000 Received: from localhost ([127.0.0.1]:60305 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1aBaEL-00046Z-Bt for submit@debbugs.gnu.org; Tue, 22 Dec 2015 22:36:49 -0500 Received: from eggs.gnu.org ([208.118.235.92]:54889) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1aBaEJ-00046N-Rc for 22169@debbugs.gnu.org; Tue, 22 Dec 2015 22:36:48 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1aBaEA-0000Kc-Tf for 22169@debbugs.gnu.org; Tue, 22 Dec 2015 22:36:42 -0500 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00,RP_MATCHES_RCVD autolearn=disabled version=3.3.2 Received: from fencepost.gnu.org ([2001:4830:134:3::e]:42284) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1aBaEA-0000KR-Pi; Tue, 22 Dec 2015 22:36:38 -0500 Received: from 84.94.185.246.cable.012.net.il ([84.94.185.246]:1978 helo=HOME-C4E4A596F7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_128_CBC_SHA1:128) (Exim 4.82) (envelope-from ) id 1aBaEA-0006Tc-4P; Tue, 22 Dec 2015 22:36:38 -0500 Date: Wed, 23 Dec 2015 05:37:14 +0200 Message-Id: <83si2tq1hx.fsf@gnu.org> From: Eli Zaretskii To: Anders Lindgren In-reply-to: (message from Anders Lindgren on Tue, 22 Dec 2015 23:29:16 +0100) Subject: Re: bug#22169: 25.0.50; File name compiletion doesn't work with non-ASCII characters on OS X References: <83y4cw3kie.fsf@gnu.org> <83twnk3fg1.fsf@gnu.org> <83oads2x99.fsf@gnu.org> <83io3z3drh.fsf@gnu.org> <831tan32q2.fsf@gnu.org> <83r3ikxmis.fsf@gnu.org> <83fuyxt35q.fsf@gnu.org> <8337uwucyt.fsf@gnu.org> <83wps8sc6h.fsf@gnu.org> <83fuyvss0h.fsf@gnu.org> <83poxzqhln.fsf@gnu.org> <838u4mquis.fsf@gnu.org> X-detected-operating-system: by eggs.gnu.org: 
GNU/Linux 2.2.x-3.x [generic] X-Received-From: 2001:4830:134:3::e X-Spam-Score: -5.0 (-----) X-Debbugs-Envelope-To: 22169 Cc: random832@fastmail.com, 22169@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: Eli Zaretskii Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -5.0 (-----) > Date: Tue, 22 Dec 2015 23:29:16 +0100 > From: Anders Lindgren > Cc: random832@fastmail.com, 22169@debbugs.gnu.org > > I just tried this and I can confirm that it works. Thanks for testing. > I will push my patch for using the utf-8-hfs coding system (most likely) > tomorrow. OK, I will then push this change afterwards. From debbugs-submit-bounces@debbugs.gnu.org Wed Dec 23 01:17:39 2015 Received: (at 22169) by debbugs.gnu.org; 23 Dec 2015 06:17:40 +0000 Received: from localhost ([127.0.0.1]:60349 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1aBcjz-0007ua-Og for submit@debbugs.gnu.org; Wed, 23 Dec 2015 01:17:39 -0500 Received: from mail-vk0-f43.google.com ([209.85.213.43]:33697) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1aBcjx-0007uJ-RP for 22169@debbugs.gnu.org; Wed, 23 Dec 2015 01:17:38 -0500 Received: by mail-vk0-f43.google.com with SMTP id a188so130289682vkc.0 for <22169@debbugs.gnu.org>; Tue, 22 Dec 2015 22:17:37 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=VRQzL3i1o9FKQgg5w5RWHsWJAfZkox3t5xYaV6Chchk=; b=YmsAS+fDAyOAP/AdvIwIxcqNO1e6AzYAja+b2+wSTGhWleGBJvhWBuLsaw9aJ2p4fj LpbZymr1nTttKBYOl4mNEbh9NnT/3AJ35hM8zj9wL209S8Yymhu4sNwpH1IfxVanMVh1 JIDlKatvIUVlPZ3x89k+QnesLDFVXm9Y2jg1pyakBVDQhknlntd8fF3nABJlbfnujoeB WfUofkInEW/rQQRf32DO8SyGeDQyYRGSHsd6O+iTgS0QTLlH8sV2/anPcgHyLPpom2CX 
YAfR3+CIUzHqjTVWdsu2f35f152xZN2NJry5feWf4+t5LnLOCIuQjJHifd0V9q52G8UO 7PWQ== MIME-Version: 1.0 X-Received: by 10.31.138.20 with SMTP id m20mr18758556vkd.70.1450851452384; Tue, 22 Dec 2015 22:17:32 -0800 (PST) Received: by 10.31.210.133 with HTTP; Tue, 22 Dec 2015 22:17:32 -0800 (PST) In-Reply-To: <83si2tq1hx.fsf@gnu.org> References: <83y4cw3kie.fsf@gnu.org> <83twnk3fg1.fsf@gnu.org> <83oads2x99.fsf@gnu.org> <83io3z3drh.fsf@gnu.org> <831tan32q2.fsf@gnu.org> <83r3ikxmis.fsf@gnu.org> <83fuyxt35q.fsf@gnu.org> <8337uwucyt.fsf@gnu.org> <83wps8sc6h.fsf@gnu.org> <83fuyvss0h.fsf@gnu.org> <83poxzqhln.fsf@gnu.org> <838u4mquis.fsf@gnu.org> <83si2tq1hx.fsf@gnu.org> Date: Wed, 23 Dec 2015 07:17:32 +0100 Message-ID: Subject: Re: bug#22169: 25.0.50; File name compiletion doesn't work with non-ASCII characters on OS X From: Anders Lindgren To: Eli Zaretskii Content-Type: multipart/alternative; boundary=001a1144f92270fe1905278aadd8 X-Spam-Score: -0.7 (/) X-Debbugs-Envelope-To: 22169 Cc: random832@fastmail.com, 22169@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.7 (/) --001a1144f92270fe1905278aadd8 Content-Type: text/plain; charset=UTF-8 > > > I will push my patch for using the utf-8-hfs coding system (most likely) > > tomorrow. > > OK, I will then push this change afterwards. > Done! -- Anders --001a1144f92270fe1905278aadd8 Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable
--001a1144f92270fe1905278aadd8-- From debbugs-submit-bounces@debbugs.gnu.org Wed Dec 23 12:36:00 2015 Received: (at 22169-done) by debbugs.gnu.org; 23 Dec 2015 17:36:00 +0000 Received: from localhost ([127.0.0.1]:32927 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1aBnKS-0008QR-0H for submit@debbugs.gnu.org; Wed, 23 Dec 2015 12:36:00 -0500 Received: from eggs.gnu.org ([208.118.235.92]:49651) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1aBnKQ-0008QF-Tp for 22169-done@debbugs.gnu.org; Wed, 23 Dec 2015 12:35:59 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1aBnKI-0007d1-I2 for 22169-done@debbugs.gnu.org; Wed, 23 Dec 2015 12:35:53 -0500 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00,RP_MATCHES_RCVD autolearn=disabled version=3.3.2 Received: from fencepost.gnu.org ([2001:4830:134:3::e]:59125) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1aBnKI-0007cx-E4; Wed, 23 Dec 2015 12:35:50 -0500 Received: from 84.94.185.246.cable.012.net.il ([84.94.185.246]:2306 helo=HOME-C4E4A596F7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_128_CBC_SHA1:128) (Exim 4.82) (envelope-from ) id 1aBnKH-0000ys-LE; Wed, 23 Dec 2015 12:35:50 -0500 Date: Wed, 23 Dec 2015 19:36:26 +0200 Message-Id: <838u4loyn9.fsf@gnu.org> From: Eli Zaretskii To: Anders Lindgren In-reply-to: (message from Anders Lindgren on Wed, 23 Dec 2015 07:17:32 +0100) Subject: Re: bug#22169: 25.0.50; File name compiletion doesn't work with non-ASCII characters on OS X References: <83y4cw3kie.fsf@gnu.org> <83twnk3fg1.fsf@gnu.org> <83oads2x99.fsf@gnu.org> <83io3z3drh.fsf@gnu.org> <831tan32q2.fsf@gnu.org> <83r3ikxmis.fsf@gnu.org> <83fuyxt35q.fsf@gnu.org> <8337uwucyt.fsf@gnu.org> <83wps8sc6h.fsf@gnu.org> <83fuyvss0h.fsf@gnu.org> <83poxzqhln.fsf@gnu.org> <838u4mquis.fsf@gnu.org> <83si2tq1hx.fsf@gnu.org> 
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 2001:4830:134:3::e X-Spam-Score: -5.0 (-----) X-Debbugs-Envelope-To: 22169-done Cc: random832@fastmail.com, 22169-done@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: Eli Zaretskii Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -5.0 (-----) > Date: Wed, 23 Dec 2015 07:17:32 +0100 > From: Anders Lindgren > Cc: random832@fastmail.com, 22169@debbugs.gnu.org > > > I will push my patch for using the utf-8-hfs coding system (most likely) > > tomorrow. > > OK, I will then push this change afterwards. > > Done! Thanks, I pushed my part, and I'm marking this done. From debbugs-submit-bounces@debbugs.gnu.org Thu Dec 24 14:23:21 2015 Received: (at 22169-done) by debbugs.gnu.org; 24 Dec 2015 19:23:21 +0000 Received: from localhost ([127.0.0.1]:33843 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1aCBTt-0003rK-2V for submit@debbugs.gnu.org; Thu, 24 Dec 2015 14:23:21 -0500 Received: from mail-vk0-f54.google.com ([209.85.213.54]:36395) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1aCBTr-0003r3-A9 for 22169-done@debbugs.gnu.org; Thu, 24 Dec 2015 14:23:19 -0500 Received: by mail-vk0-f54.google.com with SMTP id f2so112143662vkb.3 for <22169-done@debbugs.gnu.org>; Thu, 24 Dec 2015 11:23:19 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=wUPSGyMzy900WPpKWGACJnb2dLCejrugAXNUynojWGY=; b=rLEWsGWZApKpgq4aUqQXqBzg4sFbHAcJmmdZtCL9ZfLaGHP7RsfXKbOhttKhN0H6Bk 3q7QUMWzqWmCtv2QQLI2omV11qh+1NxKBDuVK7HMVeCFEfPwGj55DEbY1xSx+uFB4R8b v6wUIYEG+RMi5WNhPiUCC4kOcDVwYRPwdqZIGQF+RWsxOlzsdO0xtnzgP+beNUntdRnW 
LaUULDlOiHCxm3o3g2KEy2+ZGd/FQ3QRCDDO9k4BTWic1t3i1CQV5lzbouvUhQzntKQ1 UDcbP1L/7/kQTZkXeDSPk6pHJPw78i33CtZ7ckO/iQJhfjwGD7UlvGiR5faOW9zTaqiS etsg== MIME-Version: 1.0 X-Received: by 10.31.146.66 with SMTP id u63mr21740110vkd.31.1450984993709; Thu, 24 Dec 2015 11:23:13 -0800 (PST) Received: by 10.31.210.133 with HTTP; Thu, 24 Dec 2015 11:23:13 -0800 (PST) In-Reply-To: <838u4loyn9.fsf@gnu.org> References: <83y4cw3kie.fsf@gnu.org> <83twnk3fg1.fsf@gnu.org> <83oads2x99.fsf@gnu.org> <83io3z3drh.fsf@gnu.org> <831tan32q2.fsf@gnu.org> <83r3ikxmis.fsf@gnu.org> <83fuyxt35q.fsf@gnu.org> <8337uwucyt.fsf@gnu.org> <83wps8sc6h.fsf@gnu.org> <83fuyvss0h.fsf@gnu.org> <83poxzqhln.fsf@gnu.org> <838u4mquis.fsf@gnu.org> <83si2tq1hx.fsf@gnu.org> <838u4loyn9.fsf@gnu.org> Date: Thu, 24 Dec 2015 20:23:13 +0100 Message-ID: Subject: Re: bug#22169: 25.0.50; File name compiletion doesn't work with non-ASCII characters on OS X From: Anders Lindgren To: Eli Zaretskii Content-Type: multipart/alternative; boundary=001a1143a94c1fe32f0527a9c5e0 X-Spam-Score: 0.3 (/) X-Debbugs-Envelope-To: 22169-done Cc: random832@fastmail.com, 22169-done@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.7 (/) --001a1143a94c1fe32f0527a9c5e0 Content-Type: text/plain; charset=UTF-8 Hi! Eli, can you push or send me your fix to ucs-normalize. I'd like to verify that my patch doesn't cause further problems before I re-publish it. Also, I was curious about one thing. My intention was to include ucs-normalize when bootstrapping for NextStep only -- why did this cause a problem when building on other systems? 
    -- Anders

On Wed, Dec 23, 2015 at 6:36 PM, Eli Zaretskii wrote:

> > Date: Wed, 23 Dec 2015 07:17:32 +0100
> > From: Anders Lindgren
> > Cc: random832@fastmail.com, 22169@debbugs.gnu.org
> >
> > > I will push my patch for using the utf-8-hfs coding system (most likely)
> > > tomorrow.
> >
> > OK, I will then push this change afterwards.
> >
> > Done!
>
> Thanks, I pushed my part, and I'm marking this done.

--001a1143a94c1fe32f0527a9c5e0-- From debbugs-submit-bounces@debbugs.gnu.org Thu Dec 24 14:33:49 2015 Received: (at 22169-done) by debbugs.gnu.org; 24 Dec 2015 19:33:49 +0000 Received: from localhost ([127.0.0.1]:33847 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1aCBe1-00046q-2j for submit@debbugs.gnu.org; Thu, 24 Dec 2015 14:33:49 -0500 Received: from mail-vk0-f44.google.com ([209.85.213.44]:33903) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1aCBdy-00046c-S0 for 22169-done@debbugs.gnu.org; Thu, 24 Dec 2015 14:33:47 -0500 Received: by mail-vk0-f44.google.com with SMTP id j66so150803123vkg.1 for <22169-done@debbugs.gnu.org>; Thu, 24 Dec 2015 11:33:46 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=1bdqLFg/JGg5Nsgz8DekoBeKMuy8/3sPbO4I+SvYzHI=; b=S4K/P+Lk23Ko3AYT0JzyXD5Hs4qv7O+Fh0oMIksumoJ/CLXmLdzijqXjZxY12JDlZk EZrWhv9iklEie+qUU0rtVgPll1b6C/U4fKEys9ANIz44AMWmJ33sSXONA1erB+IQvoOR 6cmp0X3po+HTfkv+ATpCoUYQafCYy0yLN7FVZ7OxNFkyhx8+7SSSi09wowb60J1QvjPS d27w6LuLSYl8fEujqs1B33BT79OSo9qvxN0L9j9UohL1qlDhxLz80miT8jcpUFcJlVX/ Ye3fpvM/bwO51q8tUKxr3nO4Tcw/jZ8e1d3Fom12JN9vV/3t8DbGjY80yEDLXKxaQ74J aIGQ== MIME-Version: 1.0 X-Received: by 10.31.54.134 with SMTP id d128mr22351234vka.26.1450985621265; Thu, 24 Dec 2015 11:33:41 -0800 (PST) Received: by 10.31.210.133 with HTTP; Thu, 24 Dec 2015 11:33:41 -0800 (PST) In-Reply-To: References: <83y4cw3kie.fsf@gnu.org> <83twnk3fg1.fsf@gnu.org> <83oads2x99.fsf@gnu.org> <83io3z3drh.fsf@gnu.org> <831tan32q2.fsf@gnu.org> <83r3ikxmis.fsf@gnu.org> <83fuyxt35q.fsf@gnu.org> <8337uwucyt.fsf@gnu.org> <83wps8sc6h.fsf@gnu.org> <83fuyvss0h.fsf@gnu.org> <83poxzqhln.fsf@gnu.org> <838u4mquis.fsf@gnu.org> <83si2tq1hx.fsf@gnu.org> <838u4loyn9.fsf@gnu.org> Date: Thu, 24 Dec 2015 20:33:41 +0100 Message-ID: Subject: Re: bug#22169: 25.0.50; File name compiletion doesn't work 
with non-ASCII characters on OS X From: Anders Lindgren To: Eli Zaretskii Content-Type: multipart/alternative; boundary=001a11438ee887a32d0527a9ea16 X-Spam-Score: 0.3 (/) X-Debbugs-Envelope-To: 22169-done Cc: 22169-done@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.7 (/) --001a11438ee887a32d0527a9ea16 Content-Type: text/plain; charset=UTF-8

Eli, I just saw your other letter in the other thread. I reply here instead:

> I'm saying that it's not a disaster that a bug like that stays unfixed
> for a day or two.  (It wasn't meant to be a criticism on your behalf.)

Good to know. I was a bit stressed about causing trouble for others and not
being able to look into the problem. (When it comes to core development, I
still feel like a beginner.) Just so you know, I didn't take it as criticism.

> Not sure what you are saying.  If you mean you want to wait for me to
> push my changes, then part of them need your patch, because they
> modify some of the code you added.
>
> I can send you my changes, so you could try them before both parts are
> installed, if you want.

Please send the patch to me so I can test here. Alternatively, if you feel
confident it will work, feel free to push the full patch.

    -- Anders

On Thu, Dec 24, 2015 at 8:23 PM, Anders Lindgren wrote:

> Hi!
>
> Eli, can you push or send me your fix to ucs-normalize. I'd like to verify
> that my patch doesn't cause further problems before I re-publish it.
>
> Also, I was curious about one thing. My intention was to include
> ucs-normalize when bootstrapping for NextStep only -- why did this cause a
> problem when building on other systems?
>
>     -- Anders
>
> On Wed, Dec 23, 2015 at 6:36 PM, Eli Zaretskii wrote:
>
>> > Date: Wed, 23 Dec 2015 07:17:32 +0100
>> > From: Anders Lindgren
>> > Cc: random832@fastmail.com, 22169@debbugs.gnu.org
>> >
>> > > I will push my patch for using the utf-8-hfs coding system (most likely)
>> > > tomorrow.
>> >
>> > OK, I will then push this change afterwards.
>> >
>> > Done!
>>
>> Thanks, I pushed my part, and I'm marking this done.

--001a11438ee887a32d0527a9ea16-- From debbugs-submit-bounces@debbugs.gnu.org Thu Dec 24 14:41:57 2015 Received: (at 22169) by debbugs.gnu.org; 24 Dec 2015 19:41:57 +0000 Received: from localhost ([127.0.0.1]:33862 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1aCBlt-0004J9-C2 for submit@debbugs.gnu.org; Thu, 24 Dec 2015 14:41:57 -0500 Received: from eggs.gnu.org ([208.118.235.92]:55693) by debbugs.gnu.org with esmtp (Exim 4.84) (envelope-from ) id 1aCBls-0004Iw-8C for 22169@debbugs.gnu.org; Thu, 24 Dec 2015 14:41:56 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1aCBlj-00027R-VK for 22169@debbugs.gnu.org; Thu, 24 Dec 2015 14:41:51 -0500 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00,RP_MATCHES_RCVD autolearn=disabled version=3.3.2 Received: from fencepost.gnu.org ([2001:4830:134:3::e]:56877) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1aCBlj-00027L-SD; Thu, 24 Dec 2015 14:41:47 -0500 Received: from 84.94.185.246.cable.012.net.il ([84.94.185.246]:3207 helo=HOME-C4E4A596F7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_128_CBC_SHA1:128) (Exim 4.82) (envelope-from ) id 1aCBlj-0006SM-27; Thu, 24 Dec 2015 14:41:47 -0500 Date: Thu, 24 Dec 2015 21:42:27 +0200 Message-Id: <83r3ibmy58.fsf@gnu.org> From: Eli Zaretskii To: Anders Lindgren In-reply-to: (message from Anders Lindgren on Thu, 24 Dec 2015 20:23:13 +0100) Subject: Re: bug#22169: 25.0.50; File name compiletion doesn't work with non-ASCII characters on OS X References: <83y4cw3kie.fsf@gnu.org> <83twnk3fg1.fsf@gnu.org> <83oads2x99.fsf@gnu.org> <83io3z3drh.fsf@gnu.org> <831tan32q2.fsf@gnu.org> <83r3ikxmis.fsf@gnu.org> <83fuyxt35q.fsf@gnu.org> <8337uwucyt.fsf@gnu.org> <83wps8sc6h.fsf@gnu.org> <83fuyvss0h.fsf@gnu.org> <83poxzqhln.fsf@gnu.org> <838u4mquis.fsf@gnu.org> <83si2tq1hx.fsf@gnu.org> <838u4loyn9.fsf@gnu.org> 
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 2001:4830:134:3::e X-Spam-Score: -5.0 (-----) X-Debbugs-Envelope-To: 22169 Cc: random832@fastmail.com, 22169@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: Eli Zaretskii Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -5.0 (-----)

> Date: Thu, 24 Dec 2015 20:23:13 +0100
> From: Anders Lindgren
> Cc: random832@fastmail.com, 22169-done@debbugs.gnu.org
>
> Eli, can you push or send me your fix to ucs-normalize. I'd like to verify that
> my patch doesn't cause further problems before I re-publish it.

Will do, in a separate email.

> Also, I was curious about one thing. My intention was to include ucs-normalize
> when bootstrapping for NextStep only -- why did this cause a problem when
> building on other systems?

See src/lisp.mk and the rules in src/Makefile.in which produce that
file.  Every file that is preloaded on _any_ platform automatically
gets added to the list of Lisp files that are compiled by
bootstrap-emacs (as opposed to by emacs) before emacs is dumped after
loading those files in byte-compiled form.  Those files are compiled
and scanned for doc strings on all platforms, even if some of them are
not preloaded on a particular platform.

From unknown Sun Jun 22 17:15:25 2025 Received: (at fakecontrol) by fakecontrolmessage; To: internal_control@debbugs.gnu.org From: Debbugs Internal Request Subject: Internal Control Message-Id: bug archived. Date: Fri, 22 Jan 2016 12:24:05 +0000 User-Agent: Fakemail v42.6.9

# This is a fake control message.
#
# The action:
# bug archived.
thanks
# This fakemail brought to you by your local debbugs
# administrator