GNU bug report logs - #34469
26.1; EWW stops rendering web page on null byte
Reported by: Lukasz Pawelczyk <l.pawelczyk <at> samsung.com>
Date: Wed, 13 Feb 2019 15:57:02 UTC
Severity: normal
Tags: fixed
Found in version 26.1
Fixed in version 27.1
Done: Robert Pluim <rpluim <at> gmail.com>
Bug is archived. No further changes may be made.
Eli Zaretskii <eliz <at> gnu.org> writes:
>> From: Robert Pluim <rpluim <at> gmail.com>
>> Date: Tue, 19 Feb 2019 11:06:37 +0100
>> Cc: 34469 <at> debbugs.gnu.org, Nicholas Drozd <nicholasdrozd <at> gmail.com>
>>
>> Glenn Morris <rgm <at> gnu.org> writes:
>>
>> > Perhaps eww-display-html should replace null bytes (with whatever the
>> > html standard says is appropriate) before calling
>> > libxml-parse-html-region. It already replaces CRLF.
>>
>> Chrome at least just strips the null byte completely.
>>
>> There is apparently a class of attacks that uses the null character
>> for nefarious purposes, so how about something like this:
>>
>> diff --git a/lisp/net/eww.el b/lisp/net/eww.el
>> index 1cc4557ce1..9b57bc43e4 100644
>> --- a/lisp/net/eww.el
>> +++ b/lisp/net/eww.el
>> @@ -448,8 +448,8 @@ eww-display-html
>> (decode-coding-region (point) (point-max) encode)
>> (coding-system-error nil))
>> (save-excursion
>> - ;; Remove CRLF before parsing.
>> - (while (re-search-forward "\r$" nil t)
>> + ;; Remove CRLF and NULL before parsing.
>> + (while (re-search-forward "\r$\\|\000" nil t)
>> (replace-match "" t t)))
>
> It is un-Emacsy, IMO, to remove content without a trace. (CR is
> different: we simply convert text to Unix LF-only EOL format.) So I'd
> suggest to replace with "^@" or "\000" or "NUL" or something to that
> effect. Even U+FFFD would be better than removing.
>
Since this is all due to a C-ism in the handling of content, Iʼd vote
for "\0", although this is inside Emacs, so perhaps "^@" is best.

> (We could get fancy and have a defcustom for those who do want the
> null bytes removed.)

I really donʼt think this is something that needs to be configurable.
Robert
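
For illustration only, here is a minimal sketch of the "replace rather
than remove" variant discussed above. It is not the patch from this
thread. The helper name my-eww-sanitize-region is hypothetical, and
using "^@" as the visible stand-in for NUL is just one of the options
floated here. CR at end of line is still dropped silently, as before.

```elisp
;; Hypothetical helper, not part of eww.el.  The function name and the
;; choice of "^@" as the replacement text are assumptions.
(defun my-eww-sanitize-region (start end)
  "Drop CR at end of line and turn NUL into \"^@\" between START and END."
  (save-excursion
    (save-restriction
      (narrow-to-region start end)
      (goto-char (point-min))
      ;; Group 1 matches a CR at end of line; the other branch matches NUL.
      (while (re-search-forward "\\(\r$\\)\\|\000" nil t)
        ;; CR is removed; NUL is replaced so the reader can see it was there.
        (replace-match (if (match-beginning 1) "" "^@") t t)))))
```

Something like this could be called on the decoded buffer before
handing it to libxml-parse-html-region, for example
(my-eww-sanitize-region (point) (point-max)).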
This bug report was last modified 6 years and 80 days ago.