GNU bug report logs - #30116
[PATCH] `substitute' crashes when file contains NUL characters (core-updates)


Package: guix

Reported by: Maxim Cournoyer <maxim.cournoyer <at> gmail.com>

Date: Mon, 15 Jan 2018 01:29:02 UTC

Severity: normal

Tags: patch

Done: Maxim Cournoyer <maxim.cournoyer <at> gmail.com>

Bug is archived. No further changes may be made.


From: Maxim Cournoyer <maxim.cournoyer <at> gmail.com>
To: ludo <at> gnu.org (Ludovic Courtès)
Cc: Mark H Weaver <mhw <at> netris.org>, 30116 <at> debbugs.gnu.org
Subject: bug#30116: [PATCH] `substitute' crashes when file contains NUL characters (core-updates)
Date: Thu, 25 Jan 2018 00:11:26 -0500

ludo <at> gnu.org (Ludovic Courtès) writes:

> Maxim Cournoyer <maxim.cournoyer <at> gmail.com> writes:
>
>> In the `patch-el-files' phase of the emacs-build-system, we find the
>> following snippet:
>>
>>     (with-directory-excursion el-dir
>>       ;; Some old '.el' files (e.g., tex-buf.el in AUCTeX) are still encoded
>>       ;; with the "ISO-8859-1" locale.
>>       (unless (false-if-exception (substitute-cmd))
>>         (with-fluids ((%default-port-encoding "ISO-8859-1"))
>>           (substitute-cmd))))
>>
>> If an exception is raised while processing the file, the substitution
>> is retried with the file opened using the "ISO-8859-1" encoding. Now,
>> this resolves to a call to `open-file', whose documentation says:
>>
>> ‘b’
>>           Use binary mode, ensuring that each byte in the file will be
>>           read as one Scheme character.
>>
>>           To provide this property, the file will be opened with the
>>           8-bit character encoding "ISO-8859-1", ignoring the default
>>           port encoding.  *Note Ports::, for more information on port
>>           encodings.
>>
>> So, by opening a file whose encoding is unknown as an ISO-8859-1 file,
>> we are doing the same as if we had passed the 'binary option. Could this
>> explain why we end up with NUL characters where we were expecting text?
>
> That could be the reason.  Guile provides a way to honor Emacs-style
> ‘encoding’ declarations, and ‘call-with-input-file’ does that if we pass
> #:guess-encoding #t (info "(guile) Character Encoding of Source Files").
>
> Did the faulty file have such a declaration?

Sadly, it doesn't. And even if it did, I don't think it would be very
robust to expect every misbehaving file we might encounter to include
one!
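
For illustration, here is a minimal Guile sketch of both points, assuming
Guile 2.2 or later. The scratch file names and the two helper procedures
are hypothetical and not part of Guix. It first checks for an Emacs-style
coding declaration with `file-encoding', then reads a declaration-free
file as ISO-8859-1, where every byte, NUL included, comes back as one
character:

```scheme
(use-modules (ice-9 binary-ports)    ; put-bytevector
             (ice-9 textual-ports)   ; get-string-all
             (rnrs bytevectors))     ; u8-list->bytevector

;; Hypothetical scratch files, used only for this illustration.
(define declared "/tmp/bug30116-declared.el")
(define undeclared "/tmp/bug30116-undeclared.el")

;; One file carries an Emacs-style coding declaration...
(call-with-output-file declared
  (lambda (port)
    (display ";; -*- coding: iso-8859-1 -*-\n" port)))

;; ...the other holds only the bytes "a", NUL, 0xE9, newline.
(call-with-output-file undeclared
  (lambda (port)
    (put-bytevector port (u8-list->bytevector '(97 0 #xE9 10))))
  #:binary #t)

;; Hypothetical helper: return the encoding declared near the top of
;; FILE, or #f when there is no declaration.
(define (declared-encoding file)
  (call-with-input-file file file-encoding))

;; Hypothetical helper: read all of FILE as ISO-8859-1 text.
(define (read-as-latin-1 file)
  (call-with-input-file file get-string-all #:encoding "ISO-8859-1"))

(define text (read-as-latin-1 undeclared))

(display (declared-encoding declared))   (newline)
(display (declared-encoding undeclared)) (newline) ; #f: nothing to guess from
(display (string-length text))           (newline) ; 4: one character per byte
(display (string-index text #\nul))      (newline) ; 1: the NUL byte survived
```

So the ISO-8859-1 fallback never fails at decoding time; it just hands
any NUL bytes on to whatever processes the string next.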

So I think we should apply my v2 patch to core-updates for now (see my
previous reply on this thread), until we have our substitute routine
implemented using srfi-115!

Thanks,

Maxim




This bug report was last modified 4 years and 193 days ago.
