GNU bug report logs - #59275
Unexpected return value of `string-collate-lessp' on Mac


Package: emacs;

Reported by: Ihor Radchenko <yantar92 <at> posteo.net>

Date: Tue, 15 Nov 2022 04:08:02 UTC

Severity: normal

Done: Eli Zaretskii <eliz <at> gnu.org>

Bug is archived. No further changes may be made.



Message #55 received at 59275-done <at> debbugs.gnu.org:

From: Ihor Radchenko <yantar92 <at> posteo.net>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 59275-done <at> debbugs.gnu.org
Subject: Re: bug#59275: Unexpected return value of `string-collate-lessp' on
 Mac
Date: Sat, 26 Nov 2022 02:03:43 +0000

Eli Zaretskii <eliz <at> gnu.org> writes:

>> We use string collation for
>> 
>> 1. Sorting bibliographies
>> 2. Sorting lists
>> 3. Sorting table lines
>> 4. Sorting tags
>> 5. Sorting headings
>> 6. Sorting entries in agendas
>> 7. As a criterion for agenda/tag filtering when comparison operator is
>>    used on string property values (11.3.3 Matching tags and properties)
>> 
>> 1-6 should follow the locale.
>
> I think only 1 and 6 are firmly in that category.  For the others it depends
> on whether the results of the sorting are immediately displayed, or used for
> further processing.  In the former case, using string-collate-lessp is
> semi-okay ("semi" because producing different results in different locales
> can still confuse users); in the latter case it is wrong, IMO, because you
> will cause unexpected results.

1-6 are for interactive use.

As Maxim pointed out in
https://orgmode.org/list/tlle59$pl3$1 <at> ciao.gmane.io,
`string-collate-lessp' generally yields better results for human
consumption:

"		 (setq lst '("semana" "señor" "sepia"))
		 (sort lst #'string-lessp) ;         => ("semana" "sepia" "señor")
		 (sort lst #'string-collate-lessp) ; => ("semana" "señor" "sepia")
"
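A variant of the quoted snippet, as a rough sketch only: `sort' is
destructive on lists, so each call below works on a fresh copy.  The
locale name "es_ES.UTF-8" is an assumption for illustration; it may not
be installed on a given system, and the collated result depends on the
platform, so no expected output is shown for it.

```elisp
;; Sketch: compare codepoint ordering with locale-aware ordering.
(let ((words (list "semana" "señor" "sepia")))
  (list
   ;; Codepoint order: "ñ" (U+00F1) sorts after "p".
   (sort (copy-sequence words) #'string-lessp)
   ;; Collation order; "es_ES.UTF-8" is an assumed locale name and the
   ;; result is platform-dependent.  The fourth argument is IGNORE-CASE.
   (sort (copy-sequence words)
         (lambda (a b)
           (string-collate-lessp a b "es_ES.UTF-8" t)))))
```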

In the same thread, we also discussed what Org can do about macOS and
other systems that do not implement string collation.

We concluded that a better fallback when collation is not available
would be using downcase+string-lessp when `string-collate-lessp' is
called with non-nil IGNORE-CASE argument.

Would it be acceptable for Emacs to change the fallback behavior of
`string-collate-lessp' to:

1. If string collation is not available and IGNORE-CASE is nil, fall back
   to `string-lessp';
2. If string collation is not available and IGNORE-CASE is non-nil,
   use `downcase' + `string-lessp'.

This will not compromise consistency and will yield slightly better
fallback results.

I also do not think that it will be backwards-incompatible. If the call
to `string-collate-lessp' explicitly requests ignoring case, `downcase'
is more expected than bare `string-lessp' that _does not_ ignore case.
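To make the proposal concrete, here is a minimal sketch of the two
fallback branches written in Lisp.  The function name
`my/collate-lessp-fallback' is hypothetical and only for illustration;
it is not how Emacs implements `string-collate-lessp', and it covers
only the case where collation is unavailable.

```elisp
;; Hypothetical helper, not part of Emacs: models the proposed fallback
;; for systems without string collation.
(defun my/collate-lessp-fallback (s1 s2 &optional ignore-case)
  "Return non-nil if S1 sorts before S2 under the proposed fallback.
With IGNORE-CASE non-nil, downcase both strings before comparing."
  (if ignore-case
      (string-lessp (downcase s1) (downcase s2))
    (string-lessp s1 s2)))

(my/collate-lessp-fallback "a" "B")    ; => nil ("a" is 97, "B" is 66)
(my/collate-lessp-fallback "a" "B" t)  ; => t   (compares "a" with "b")
```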

WDYT?

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>




This bug report was last modified 2 years and 176 days ago.


