GNU bug report logs - #20258
24.5; format-time-string miscounting of multibyte characters


Package: emacs

Reported by: Gunnar Horrigmo <gunnar.horrigmo <at> usit.uio.no>

Date: Sat, 4 Apr 2015 15:36:01 UTC

Severity: minor

Tags: fixed, patch

Found in version 24.5

Fixed in version 27.1

Done: Lars Ingebrigtsen <larsi <at> gnus.org>

Bug is archived. No further changes may be made.



Message #22 received at 20258 <at> debbugs.gnu.org (full text, mbox):

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: 20258 <at> debbugs.gnu.org
Cc: stefan <at> marxist.se, gunnar.horrigmo <at> usit.uio.no
Subject: Re: bug#20258: 24.5; format-time-string miscounting of multibyte characters
Date: Mon, 30 Sep 2019 05:09:08 +0200

Stefan Kangas <stefan <at> marxist.se> writes:

>>> As the subject says, format-time-string miscounts multibyte characters.
>>> Simple example with nb_NO.utf8 locale, where ø is two bytes:
>>>
>>> (format-time-string "%6a" (date-to-time "Sat Apr  4 16:14:40 2015"))
>>> "  lø."
>>>
>>> (length (format-time-string "%6a" (date-to-time "Sat Apr  4 16:14:40 2015")))
>>> 5
>>
>> 'length' counts characters, not bytes.  If you need to count bytes,
>> use 'string-bytes' instead:
>>
>>   (string-bytes "  lø.") => 6
>
> I can see no bug here, only a misunderstanding about the length
> function.  I'm therefore closing this bug.  If that's incorrect, please
> reopen this bug report.

But the issue here is that "%6a" should give you a string that's six
characters long, I think?  Admittedly the doc string is vague here:

---
A field width N is an unsigned decimal integer with a leading digit nonzero.
%NX is like %X, but takes up at least N positions.
---

But the natural interpretation of "positions" isn't bytes, I think, and
if it is, then the doc string should say so.

(let ((system-time-locale "nb_NO.UTF-8"))
  (format-time-string "%6a" (date-to-time "Sat Apr  4 16:14:40 2015")))
=> "  lø."

(if you have that locale in /etc/locale.gen.)

But I seem to remember from previous discussions that this quirk is in
the C strftime function?  And Emacs just calls it?  I haven't checked.
But this means that you can't use format-time-string to line stuff up,
but have to use `format':

(let ((system-time-locale "nb_NO.UTF-8"))
  (format "%6s" (format-time-string "%a" (date-to-time "Sat Apr  4 16:14:40 2015"))))
=> "   lø."

So I think what WIDTH means should be said explicitly in the doc string.
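
The `format' workaround above could be wrapped up roughly like this.
This is only a sketch: `my-format-time-padded' is a hypothetical helper
name, not something in Emacs, and it assumes that left-padding with
spaces is what's wanted.  The three calls at the end show the different
ways of counting the string from the original report.

```elisp
;; Sketch only: `my-format-time-padded' is a hypothetical helper,
;; not part of Emacs.
(defun my-format-time-padded (width format-string &optional time)
  "Format TIME with FORMAT-STRING, left-padded with spaces to WIDTH.
The padding is done by `format' rather than by a %N field width
inside FORMAT-STRING, so multibyte characters are not counted as
bytes."
  (format (format "%%%ds" width)   ; builds e.g. "%6s"
          (format-time-string format-string time)))

;; The string from the original report, as a literal:
(length "  lø.")        ; => 5  (characters)
(string-bytes "  lø.")  ; => 6  (bytes; ø takes two)
(string-width "  lø.")  ; => 5  (display columns)
```

With the nb_NO.UTF-8 locale bound as in the examples above,
(my-format-time-padded 6 "%a" (date-to-time "Sat Apr  4 16:14:40 2015"))
should give the same result as the explicit `format' call.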

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




This bug report was last modified 5 years and 234 days ago.
