From unknown Sat Jun 14 00:06:45 2025
MIME-Version: 1.0
X-Mailer: MIME-tools 5.509 (Entity 5.509)
Content-Type: text/plain; charset=utf-8
From: bug#34469 <34469@debbugs.gnu.org>
To: bug#34469 <34469@debbugs.gnu.org>
Subject: Status: 26.1; EWW stops renderring web page on null byte
Reply-To: bug#34469 <34469@debbugs.gnu.org>
Date: Sat, 14 Jun 2025 07:06:45 +0000

retitle 34469 26.1; EWW stops renderring web page on null byte
reassign 34469 emacs
submitter 34469 Lukasz Pawelczyk
severity 34469 normal
tag 34469 fixed
thanks

From debbugs-submit-bounces@debbugs.gnu.org Wed Feb 13 10:56:28 2019
Subject: 26.1; EWW stops renderring web page on null byte
From: Lukasz Pawelczyk
To: bug-gnu-emacs@gnu.org
Date: Wed, 13 Feb 2019 13:27:16 +0100
User-Agent: Evolution 3.30.4 (3.30.4-1.fc29)
Content-Type: text/plain; charset="utf-8"
X-BeenThere: debbugs-submit@debbugs.gnu.org
Sender: "Debbugs-submit"

As in the topic. See this page:
http://blog.eduardofleury.com/archives/2007/09/13

There is a string with a null byte at the beginning. Firefox renders the
page past this point. EWW stops on:

sock.bind(“

In GNU Emacs 26.1 (build 1, x86_64-redhat-linux-gnu, GTK+ Version 3.23.2)
 of 2018-08-13 built on buildvm-13.phx2.fedoraproject.org
Windowing system distributor 'Fedora Project', version 11.0.12003000

Recent messages:
For information about GNU Emacs and the GNU system, type C-h C-a.
Contacting host: blog.eduardofleury.com:80
scroll-up-command: End of buffer [2 times]

Configured using:
 'configure --build=x86_64-redhat-linux-gnu
 --host=x86_64-redhat-linux-gnu --program-prefix=
 --disable-dependency-tracking --prefix=/usr --exec-prefix=/usr
 --bindir=/usr/bin --sbindir=/usr/sbin --sysconfdir=/etc
 --datadir=/usr/share --includedir=/usr/include --libdir=/usr/lib64
 --libexecdir=/usr/libexec --localstatedir=/var
 --sharedstatedir=/var/lib --mandir=/usr/share/man
 --infodir=/usr/share/info --with-dbus --with-gif --with-jpeg --with-png
 --with-rsvg --with-tiff --with-xft --with-xpm --with-x-toolkit=gtk3
 --with-gpm=no --with-xwidgets --with-modules
 build_alias=x86_64-redhat-linux-gnu host_alias=x86_64-redhat-linux-gnu
 'CFLAGS=-DMAIL_USE_LOCKF -O2 -g -pipe -Wall -Werror=format-security
 -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -fexceptions
 -fstack-protector-strong -grecord-gcc-switches
 -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1
 -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -mtune=generic
 -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection'
 LDFLAGS=-Wl,-z,relro
 PKG_CONFIG_PATH=:/usr/lib64/pkgconfig:/usr/share/pkgconfig'

Configured features:
XPM JPEG TIFF GIF PNG RSVG IMAGEMAGICK SOUND DBUS GSETTINGS NOTIFY ACL
LIBSELINUX GNUTLS LIBXML2 FREETYPE M17N_FLT LIBOTF XFT ZLIB
TOOLKIT_SCROLL_BARS GTK3 X11 MODULES THREADS XWIDGETS LCMS2

Important settings:
  value of $LC_COLLATE: C
  value of $LC_CTYPE: pl_PL.UTF-8
  value of $LC_MONETARY: en_US.UTF-8
  value of $LC_NUMERIC: en_US.UTF-8
  value of $LC_TIME: en_US.UTF-8
  value of $LANG: C
  value of $XMODIFIERS: @im=ibus
  locale-coding-system: utf-8-unix

Major mode: eww

Minor modes in effect:
  tooltip-mode: t
  global-eldoc-mode: t
  electric-indent-mode: t
  mouse-wheel-mode: t
  tool-bar-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  blink-cursor-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  buffer-read-only: t
  line-number-mode: t
  transient-mark-mode: t

Load-path shadows:
None found.

Features:
(shadow sort mail-extr emacsbug message dired dired-loaddefs rfc822 mml
mml-sec epa derived epg epg-config mm-decode mm-bodies mm-encode
mailabbrev gmm-utils mailheader sendmail cl-extra help-mode
network-stream starttls url-http tls gnutls mail-parse rfc2231 url-gw
nsm rmc url-cache url-auth eww easymenu puny mm-url gnus nnheader
gnus-util rmail rmail-loaddefs rfc2047 rfc2045 ietf-drums mail-utils
wid-edit mm-util mail-prsvr url-queue url url-proxy url-privacy
url-expand url-methods url-history url-cookie url-domsuf url-util
url-parse auth-source cl-seq eieio eieio-core cl-macs eieio-loaddefs
password-cache url-vars mailcap shr svg xml seq byte-opt gv bytecomp
byte-compile cconv dom browse-url format-spec cl-loaddefs cl-lib
elec-pair time-date mule-util tooltip eldoc electric uniquify ediff-hook
vc-hooks lisp-float-type mwheel term/x-win x-win term/common-win x-dnd
tool-bar dnd fontset image regexp-opt fringe tabulated-list replace
newcomment text-mode elisp-mode lisp-mode prog-mode register page
menu-bar rfn-eshadow isearch timer select scroll-bar mouse jit-lock
font-lock syntax facemenu font-core term/tty-colors frame cl-generic
cham georgian utf-8-lang misc-lang vietnamese tibetan thai tai-viet lao
korean japanese eucjp-ms cp51932 hebrew greek romanian slovak czech
european ethiopic indian cyrillic chinese composite charscript charprop
case-table epa-hook jka-cmpr-hook help simple abbrev obarray minibuffer
cl-preloaded nadvice loaddefs button faces cus-face macroexp files
text-properties overlay sha1 md5 base64 format env code-pages mule
custom widget hashtable-print-readable backquote dbusbind inotify lcms2
dynamic-setting system-font-setting font-render-setting xwidget-internal
move-toolbar gtk x-toolkit x multi-tty make-network-process emacs)

Memory information:
((conses 16 137138 10359)
 (symbols 48 23803 2)
 (miscs 40 59 148)
 (strings 32 40308 1635)
 (string-bytes 1 1174212)
 (vectors 16 17956)
 (vector-slots 8 544601 12850)
 (floats 8 73 241)
 (intervals 56 3447 0)
 (buffers 992 12))

-- 
Lukasz Pawelczyk
Samsung R&D Institute Poland
Samsung Electronics

From debbugs-submit-bounces@debbugs.gnu.org Wed Feb 13 23:46:13 2019
From: Nicholas Drozd
Date: Wed, 13 Feb 2019 22:44:50 -0600
Subject: bug#34469: 26.1; EWW stops renderring web page on null byte
To: l.pawelczyk@samsung.com, bug-gnu-emacs@gnu.org
Content-Type: text/plain; charset="UTF-8"

This looks like a problem with libxml-parse-html-region (or maybe even
lower than that, I have no idea). Put the following in a buffer

sock.bind(“\0MyBindName”)

and execute

(libxml-parse-html-region (point-min) (point-max))

This returns

(html nil (body nil (p nil "sock.bind(“")))

From debbugs-submit-bounces@debbugs.gnu.org Thu Feb 14 14:14:43 2019
Date: Thu, 14 Feb 2019 21:14:12 +0200
Message-Id: <83ef8anpx7.fsf@gnu.org>
From: Eli Zaretskii
To: Nicholas Drozd
Cc: 34469@debbugs.gnu.org, l.pawelczyk@samsung.com
In-reply-to: (message from Nicholas Drozd on Wed, 13 Feb 2019 22:44:50 -0600)
Subject: Re: bug#34469: 26.1; EWW stops renderring web page on null byte

> From: Nicholas Drozd
> Date: Wed, 13 Feb 2019 22:44:50 -0600
>
> This looks like a problem with libxml-parse-html-region (or maybe even
> lower than that, I have no idea).

libxml-parse-html-region calls parse_region, which passes a C string to
libxml functions.  So there can be no embedded null bytes.

Does libxml have facilities to deal with such cases?  If not, maybe
this should be taken up with libxml developers.

From debbugs-submit-bounces@debbugs.gnu.org Sat Feb 16 13:13:44 2019
From: Nicholas Drozd
Date: Sat, 16 Feb 2019 12:13:03 -0600
Subject: bug#34469: 26.1; EWW stops renderring web page on null byte
To: bug-gnu-emacs@gnu.org, Eli Zaretskii
Content-Type: text/plain; charset="UTF-8"

This is a known issue with libxml, or at least it was at some point.
Here's a thread from 2008:

https://mail.gnome.org/archives/xml/2008-August/msg00008.html

From debbugs-submit-bounces@debbugs.gnu.org Mon Feb 18 20:12:51 2019
From: Glenn Morris
To: Nicholas Drozd
Cc: eliz@gnu.org, 34469@debbugs.gnu.org
Subject: Re: bug#34469: 26.1; EWW stops renderring web page on null byte
In-Reply-To: (Nicholas Drozd's message of "Sat, 16 Feb 2019 12:13:03 -0600")
Message-ID: <02sgwk1sza.fsf@fencepost.gnu.org>
User-Agent: Gnus (www.gnus.org), GNU Emacs (www.gnu.org/software/emacs/)
Content-Type: text/plain

Perhaps eww-display-html should replace null bytes (with whatever the
html standard says is appropriate) before calling
libxml-parse-html-region. It already replaces CRLF.

From debbugs-submit-bounces@debbugs.gnu.org Tue Feb 19 05:06:50 2019
From: Robert Pluim
To: Glenn Morris
Cc: 34469@debbugs.gnu.org, Nicholas Drozd
Subject: Re: bug#34469: 26.1; EWW stops renderring web page on null byte
References: <02sgwk1sza.fsf@fencepost.gnu.org>
Date: Tue, 19 Feb 2019 11:06:37 +0100
In-Reply-To: <02sgwk1sza.fsf@fencepost.gnu.org> (Glenn Morris's message of "Mon, 18 Feb 2019 20:12:41 -0500")
Content-Type: text/plain

Glenn Morris writes:

> Perhaps eww-display-html should replace null bytes (with whatever the
> html standard says is appropriate) before calling
> libxml-parse-html-region. It already replaces CRLF.

Chrome at least just strips the null byte completely.

There is apparently a class of attacks that uses the null character
for nefarious purposes, so how about something like this:

diff --git a/lisp/net/eww.el b/lisp/net/eww.el
index 1cc4557ce1..9b57bc43e4 100644
--- a/lisp/net/eww.el
+++ b/lisp/net/eww.el
@@ -448,8 +448,8 @@ eww-display-html
                        (decode-coding-region (point) (point-max) encode)
                      (coding-system-error nil))
                    (save-excursion
-                     ;; Remove CRLF before parsing.
-                     (while (re-search-forward "\r$" nil t)
+                     ;; Remove CRLF and NULL before parsing.
+                     (while (re-search-forward "\r$\\|\000" nil t)
                        (replace-match "" t t)))
                    (libxml-parse-html-region (point) (point-max))))))
         (source (and (null document)

From debbugs-submit-bounces@debbugs.gnu.org Tue Feb 19 11:31:09 2019
Date: Tue, 19 Feb 2019 18:30:48 +0200
Message-Id: <83mumrivuv.fsf@gnu.org>
From: Eli Zaretskii
To: Robert Pluim
Cc: rgm@gnu.org, 34469@debbugs.gnu.org, nicholasdrozd@gmail.com
In-reply-to: (message from Robert Pluim on Tue, 19 Feb 2019 11:06:37 +0100)
Subject: Re: bug#34469: 26.1; EWW stops renderring web page on null byte
References: <02sgwk1sza.fsf@fencepost.gnu.org>

> From: Robert Pluim
> Date: Tue, 19 Feb 2019 11:06:37 +0100
> Cc: 34469@debbugs.gnu.org, Nicholas Drozd
>
> Glenn Morris writes:
>
> > Perhaps eww-display-html should replace null bytes (with whatever the
> > html standard says is appropriate) before calling
> > libxml-parse-html-region. It already replaces CRLF.
>
> Chrome at least just strips the null byte completely.
>
> There is apparently a class of attacks that uses the null character
> for nefarious purposes, so how about something like this:
>
> diff --git a/lisp/net/eww.el b/lisp/net/eww.el
> index 1cc4557ce1..9b57bc43e4 100644
> --- a/lisp/net/eww.el
> +++ b/lisp/net/eww.el
> @@ -448,8 +448,8 @@ eww-display-html
>                         (decode-coding-region (point) (point-max) encode)
>                       (coding-system-error nil))
>                     (save-excursion
> -                     ;; Remove CRLF before parsing.
> -                     (while (re-search-forward "\r$" nil t)
> +                     ;; Remove CRLF and NULL before parsing.
> +                     (while (re-search-forward "\r$\\|\000" nil t)
>                         (replace-match "" t t)))

It is un-Emacsy, IMO, to remove content without a trace.  (CR is
different: we simply convert text to Unix LF-only EOL format.)  So I'd
suggest to replace with "^@" or "\000" or "NUL" or something to that
effect.  Even U+FFFD would be better than removing.

(We could get fancy and have a defcustom for those who do want the
null bytes removed.)

Thanks.

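
[Editorial sketch, not part of the original thread.] The truncation and the strip-versus-mark trade-off discussed up to this point can be modelled outside Emacs. The Python below is only an illustration under stated assumptions: it is not Emacs or libxml code, ctypes.create_string_buffer merely stands in for any consumer of a NUL-terminated C string, the sample text is modelled on the snippet quoted earlier in the thread, and strip_nul and mark_nul are hypothetical helper names.

```python
import ctypes

# Sample modelled on the snippet from the thread: a NUL right after the opening quote.
page_text = "sock.bind(\u201c\0MyBindName\u201d)"
page_bytes = page_text.encode("utf-8")

# ctypes stands in for a C-string consumer: .value stops at the first NUL byte,
# while .raw still holds every byte that was copied into the buffer.
buf = ctypes.create_string_buffer(page_bytes)
c_string_view = buf.value.decode("utf-8")


def strip_nul(text: str) -> str:
    """Hypothetical helper: drop NUL characters silently."""
    return text.replace("\0", "")


def mark_nul(text: str, marker: str = "^@") -> str:
    """Hypothetical helper: keep a visible trace of each NUL character."""
    return text.replace("\0", marker)


if __name__ == "__main__":
    # ascii() keeps the output printable on any terminal encoding.
    print("C-string view:", ascii(c_string_view))
    print("stripped:     ", ascii(strip_nul(page_text)))
    for marker in ("^@", "\\000", "NUL", "\ufffd"):
        print("marked:       ", ascii(mark_nul(page_text, marker)))
```

In this model the C-string view ends at the opening quote, which matches the parse result reported earlier, while both helpers preserve the text after the NUL. Only mark_nul leaves evidence that a character was there.
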
From debbugs-submit-bounces@debbugs.gnu.org Tue Feb 19 12:37:37 2019 Received: (at 34469) by debbugs.gnu.org; 19 Feb 2019 17:37:37 +0000 Received: from localhost ([127.0.0.1]:55504 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1gw9Kq-0005WJ-Tp for submit@debbugs.gnu.org; Tue, 19 Feb 2019 12:37:37 -0500 Received: from mail-wr1-f52.google.com ([209.85.221.52]:33631) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1gw9Kp-0005W6-Aa for 34469@debbugs.gnu.org; Tue, 19 Feb 2019 12:37:35 -0500 Received: by mail-wr1-f52.google.com with SMTP id i12so22999829wrw.0 for <34469@debbugs.gnu.org>; Tue, 19 Feb 2019 09:37:35 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:references:mail-copies-to:gmane-reply-to-list :date:in-reply-to:message-id:mime-version:content-transfer-encoding; bh=t0nOa1Nx1FNaspwTPSG5os9Q7D7iSHPGa5fd5zj2mz0=; b=HebFMQlO0a2rvR5PGyXNet8Lj3t6qHj3AnJVXlpgbudDhxhDaFgSQDwW5ewEnnbDU7 ppm/ESNKxQTiaR/04Y/cO+yaQrY2/r8o41r5mDHuzE0kGzTs1C/NWC3PBazFJt7gp0Bi VN+a/TyK2S4P2XJ7z6vbm1H1CO2InsXUCl3cjszWj2hSwFLxV8JUkQhbFMJmmmTo5fMC sTYZD+ugpc0Tbs0Sw8In29r1XaYCFFQlz1xKZEgYZSJJ3mte5zChLHF0w6LR6X3hfCfA e5fX2zCpCpWd5tEPX3NzRG6pNQYha+EiirQUXPURayIbnuavkTWu0G99qj/rFxTEKNYy jfOg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:references:mail-copies-to :gmane-reply-to-list:date:in-reply-to:message-id:mime-version :content-transfer-encoding; bh=t0nOa1Nx1FNaspwTPSG5os9Q7D7iSHPGa5fd5zj2mz0=; b=dpKfgM0ndrs21YdRvumRP/ZUxogkAwyuLK3UIUS0gHhYCE7XHWVZiKlGhKGoPSYRGO 4HGvScgLwzb+BpY9RHUxM6t90uPqK9UgyiEc9R//Us4EdwaVj3guGVXV35lGLuGOKHJR 4gPaAgZ0DYT1EXtGkFAiicxYIaZJfhx+AXKxUIeab+j++ujaR8J0Z2W6hMXkwRf6uRGf STdU4c355OaiVX+xbsILAo7ohbjfEsv+wKJnM72PahcO7Sv0O/+ch8lHwX8SqyAOUfBR JknQDF074BEQMnw8S8Z9e5L0eSfwDo3EHusIdFz/f98PtGN+OXPtqXDELDT1VOz0I24Q igQQ== X-Gm-Message-State: 
AHQUAuY+IpyXi/nSBlCPiBBqpf3girdaYY/v5pspK4FKQHeQmJau4yse GD77rpxPK15z3TDwRaTzV8E= X-Google-Smtp-Source: AHgI3Ia3lcreH7aC7cQLC348TrsH01jHg1fG+nxjoW4q4mX3TQhVcp/QYihbocCpVrXCCyau9ucRbw== X-Received: by 2002:a5d:668b:: with SMTP id l11mr20989096wru.116.1550597849098; Tue, 19 Feb 2019 09:37:29 -0800 (PST) Received: from rpluim-mac ([2a01:e34:ecfc:a860:c571:c640:baa2:65db]) by smtp.gmail.com with ESMTPSA id y139sm4227154wmd.22.2019.02.19.09.37.27 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Tue, 19 Feb 2019 09:37:27 -0800 (PST) From: Robert Pluim To: Eli Zaretskii Subject: Re: bug#34469: 26.1; EWW stops renderring web page on null byte References: <02sgwk1sza.fsf@fencepost.gnu.org> <83mumrivuv.fsf@gnu.org> X-Debbugs-No-Ack: yes Mail-Copies-To: never Gmane-Reply-To-List: yes Date: Tue, 19 Feb 2019 18:37:26 +0100 In-Reply-To: <83mumrivuv.fsf@gnu.org> (Eli Zaretskii's message of "Tue, 19 Feb 2019 18:30:48 +0200") Message-ID: MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Spam-Score: 0.0 (/) X-Debbugs-Envelope-To: 34469 Cc: 34469@debbugs.gnu.org, nicholasdrozd@gmail.com X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-) Eli Zaretskii writes: >> From: Robert Pluim >> Date: Tue, 19 Feb 2019 11:06:37 +0100 >> Cc: 34469@debbugs.gnu.org, Nicholas Drozd >> >> Glenn Morris writes: >> >> > Perhaps eww-display-html should replace null bytes (with whatever the >> > html standard says is appropriate) before calling >> > libxml-parse-html-region. It already replaces CRLF. >> >> Chrome at least just strips the null byte completely.
>> >> There is apparently a class of attacks that uses the null character >> for nefarious purposes, so how about something like this: >> >> diff --git a/lisp/net/eww.el b/lisp/net/eww.el >> index 1cc4557ce1..9b57bc43e4 100644 >> --- a/lisp/net/eww.el >> +++ b/lisp/net/eww.el >> @@ -448,8 +448,8 @@ eww-display-html >> (decode-coding-region (point) (point-max) encode) >> (coding-system-error nil)) >> (save-excursion >> - ;; Remove CRLF before parsing. >> - (while (re-search-forward "\r$" nil t) >> + ;; Remove CRLF and NULL before parsing. >> + (while (re-search-forward "\r$\\|\000" nil t) >> (replace-match "" t t))) > > It is un-Emacsy, IMO, to remove content without a trace. (CR is > different: we simply convert text to Unix LF-only EOL format.) So I'd > suggest to replace with "^@" or "\000" or "NUL" or something to that > effect. Even U+FFFD would be better than removing. > Since this is all due to a C-ism in the handling of content, Iʼd vote for "\0", although this is inside Emacs, so perhaps "^@" is best. > (We could get fancy and have a defcustom for those who do want the > null bytes removed.) I really donʼt think this is something that needs to be configurable.
Robert From debbugs-submit-bounces@debbugs.gnu.org Tue Feb 19 13:11:32 2019 Received: (at 34469) by debbugs.gnu.org; 19 Feb 2019 18:11:32 +0000 Received: from localhost ([127.0.0.1]:55516 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1gw9rg-0006JF-AL for submit@debbugs.gnu.org; Tue, 19 Feb 2019 13:11:32 -0500 Received: from eggs.gnu.org ([209.51.188.92]:34676) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1gw9re-0006J1-1F for 34469@debbugs.gnu.org; Tue, 19 Feb 2019 13:11:30 -0500 Received: from fencepost.gnu.org ([2001:470:142:3::e]:46958) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gw9rY-00033d-Th; Tue, 19 Feb 2019 13:11:24 -0500 Received: from [176.228.60.248] (port=4309 helo=home-c4e4a596f7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1gw9rP-0005Mt-E6; Tue, 19 Feb 2019 13:11:17 -0500 Date: Tue, 19 Feb 2019 20:11:17 +0200 Message-Id: <83bm37ir7e.fsf@gnu.org> From: Eli Zaretskii To: Robert Pluim In-reply-to: (message from Robert Pluim on Tue, 19 Feb 2019 18:37:26 +0100) Subject: Re: bug#34469: 26.1; EWW stops renderring web page on null byte References: <02sgwk1sza.fsf@fencepost.gnu.org> <83mumrivuv.fsf@gnu.org> MIME-version: 1.0 Content-type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Spam-Score: 0.0 (/) X-Debbugs-Envelope-To: 34469 Cc: 34469@debbugs.gnu.org, nicholasdrozd@gmail.com X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-) > From: Robert Pluim > Cc: 34469@debbugs.gnu.org, nicholasdrozd@gmail.com > Date: Tue, 19 Feb 2019 18:37:26 +0100 > > Since this is all due to a C-ism in the handling of content, Iʼd vote > for "\0", 
although this is inside Emacs, so perhaps "^@" is best. Either is fine with me. > > (We could get fancy and have a defcustom for those who do want the > > null bytes removed.) > > I really donʼt think this is something that needs to be configurable. Neither do I. From debbugs-submit-bounces@debbugs.gnu.org Wed Feb 20 13:49:00 2019 Received: (at 34469) by debbugs.gnu.org; 20 Feb 2019 18:49:00 +0000 Received: from localhost ([127.0.0.1]:58451 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1gwWvT-00012z-TY for submit@debbugs.gnu.org; Wed, 20 Feb 2019 13:49:00 -0500 Received: from mail-wr1-f52.google.com ([209.85.221.52]:42416) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1gwWvS-00012m-Du for 34469@debbugs.gnu.org; Wed, 20 Feb 2019 13:48:58 -0500 Received: by mail-wr1-f52.google.com with SMTP id r5so13886399wrg.9 for <34469@debbugs.gnu.org>; Wed, 20 Feb 2019 10:48:58 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:references:mail-copies-to:gmane-reply-to-list :date:message-id:mime-version:content-transfer-encoding; bh=62Da0mtVJqRj8ZLRv5Ucpr1YynQh4sFXFW0SmEDM1Aw=; b=oZ2Xoa6eTZ30kybDyua/OE6sSOSsBViHS5CBL36RkX++s+JapqILUoJbyquRVmM2p0 gpXqLuW/gcLRk1zd08h9M1W/iwlz1bsDNe7aQzvn9Heax4JOAJBEvZcGOmWl9ztexGRq kiJOCghCGgLghz1FrsxwcUQ/e3YAPpQXsmyiMGWqFdyUZJXotsrkL5WhX0Og5YcFU7Sw F04ET7cTmkxEzgKZwGsU9X/FsKeD1f0snzHG4zHmq19yaCfPkkMQEp+lmeVC4dpR9bOS 3Hk0519y7PBAIW/ylH5gtSPOQOGcO79rslAR/kU9sQ1asCj6g69rb6P4CoNU5ji/5dJw Iy4w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:references:mail-copies-to :gmane-reply-to-list:date:message-id:mime-version :content-transfer-encoding; bh=62Da0mtVJqRj8ZLRv5Ucpr1YynQh4sFXFW0SmEDM1Aw=; b=TH0GKou3WrzKaMuH4WVm+qjpldpgiGXgUPzQP/s2GMjbcOrZxJLI8kUG5clbug5Zdf 4eEWTSz9u0T84G8UkJ+settD/Dqy4sHcLc6w9/dyS+ufVmdU3hmP+quv2OulIIyfaMbO 
630xg5rjzquLEqqOVuQNYqqGxFGpBAsMUevtNKF/5mpWBX/KLRKB3HNgdPk07MVCla2S +1u7jR7kZdczdQOpVSGvsxNouD3QjEI7lEEJ9l51fwBACCxopmvOI6thw57qWaEtwLIc XUCRR9iEmLPM1Z94lapYpJdw92R9KyEVKTyk2lJWQLWLNfBQdN35mtEj6ARZBH7VOfkz WzZA== X-Gm-Message-State: AHQUAuZpaT7yJFhUyDjQH3HXokTUnWsU1fNNb+dB7u+NY7ilejHv9NLb +e0fF9yrFxjKBrWS7m6tCeM= X-Google-Smtp-Source: AHgI3IaobBMor6P48qIuTyS7+/XOMuiKHLEfYq4sRVEA/8uqbEA219zPU8nq1IdO0yiFEjlOazTxWQ== X-Received: by 2002:adf:efc4:: with SMTP id i4mr28158235wrp.42.1550688532191; Wed, 20 Feb 2019 10:48:52 -0800 (PST) Received: from rpluim-mac ([2a01:e34:ecfc:a860:dda:cfc2:7168:6ad8]) by smtp.gmail.com with ESMTPSA id t9sm15633431wrx.73.2019.02.20.10.48.50 (version=TLS1_3 cipher=AEAD-AES256-GCM-SHA384 bits=256/256); Wed, 20 Feb 2019 10:48:51 -0800 (PST) From: Robert Pluim To: Eli Zaretskii Subject: Re: bug#34469: 26.1; EWW stops renderring web page on null byte References: <02sgwk1sza.fsf@fencepost.gnu.org> <83mumrivuv.fsf@gnu.org> <83bm37ir7e.fsf@gnu.org> X-Debbugs-No-Ack: yes Mail-Copies-To: never Gmane-Reply-To-List: yes Date: Wed, 20 Feb 2019 19:48:50 +0100 Message-ID: MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Spam-Score: 1.0 (+) X-Debbugs-Envelope-To: 34469 Cc: 34469@debbugs.gnu.org, nicholasdrozd@gmail.com X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-) Eli Zaretskii writes: >> From: Robert Pluim >> Cc: 34469@debbugs.gnu.org, nicholasdrozd@gmail.com >> Date: Tue, 19 Feb 2019 18:37:26 +0100 >> >> Since this is all due to a C-ism in the handling of content, Iʼd vote >> for "\0", although this is inside Emacs, so perhaps "^@" is best. > > Either is fine with me. Since the web page that triggered this was showing C code, Iʼve gone for the "\0" option.
2019-02-20 Robert Pluim * lisp/net/eww.el (eww-display-html): Replace NULL characters with "\0", as libxml can't handle embedded NULLs. diff --git i/lisp/net/eww.el w/lisp/net/eww.el index 555b3bd591..06075b1ebd 100644 --- i/lisp/net/eww.el +++ w/lisp/net/eww.el @@ -462,10 +462,12 @@ eww-display-html (condition-case nil (decode-coding-region (point) (point-max) encode) (coding-system-error nil)) - (save-excursion - ;; Remove CRLF before parsing. - (while (re-search-forward "\r$" nil t) - (replace-match "" t t))) + (save-excursion + ;; Remove CRLF and NULL before parsing. + (while (re-search-forward "\\(\r$\\)\\|\\(\000\\)" nil t) + (replace-match (if (match-beginning 1) + "" + "\\0") t t))) (libxml-parse-html-region (point) (point-max)))))) (source (and (null document) (buffer-substring (point) (point-max))))) From debbugs-submit-bounces@debbugs.gnu.org Wed Feb 27 06:31:58 2019 Received: (at 34469) by debbugs.gnu.org; 27 Feb 2019 11:31:58 +0000 Received: from localhost ([127.0.0.1]:53578 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1gyxRM-0001OM-HQ for submit@debbugs.gnu.org; Wed, 27 Feb 2019 06:31:58 -0500 Received: from mail-wr1-f47.google.com ([209.85.221.47]:34784) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1gyxRK-0001O8-PP for 34469@debbugs.gnu.org; Wed, 27 Feb 2019 06:31:55 -0500 Received: by mail-wr1-f47.google.com with SMTP id f14so17550471wrg.1 for <34469@debbugs.gnu.org>; Wed, 27 Feb 2019 03:31:54 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:references:mail-copies-to:gmane-reply-to-list :date:in-reply-to:message-id:mime-version; bh=61ipvJze+SVnY1VnCDqvZ26C8lHPRVwL74U3/V3FIHs=; b=H4Pf8La/iiME7y3mqF7S0DyL5Hlh1/d1ei5yd4JOUQ168HbyArDcAWIEObrV3wS16/ nYCRXe0kkCh1BgOpogCBVxMRMdnQagyi4hI8H6hDu0tuZsnyuKtM98JofnpQniD/skfB uCO360UlhW7VtYKXLrdPdJwY8lD+v1NYBB1fd4osDB8SWuSVDbKSz6K14vBLBwQZ2nzG 
uh763FWjbgC1Aro64fvL411nhP+gumucNpXgTWROPaBqGHSGIb87yfhB0QkrGjiwT55+ m+hnkxn0AP28BxsQDdo0gZ2EuQ23AjxmNCVNkrusCcVtwD7sgcwajjOh7oePA4VOiz9A Ittg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:references:mail-copies-to :gmane-reply-to-list:date:in-reply-to:message-id:mime-version; bh=61ipvJze+SVnY1VnCDqvZ26C8lHPRVwL74U3/V3FIHs=; b=NHnbqCda5jGukK78Fhy122tz80x4iHZERRMHxxbXQ3oYoV0VMwOVSjS9lSEfGM5Kw0 uJXbZfVkX1qRtmsHRBJ503ot+qTXhp725+uhZigz4EcIM0X4LNWl7XyDvq/E9nHTmLYs mNBSrdJfs/TwO+EbbMRDlTlIWfW2XRYCQSySQCEq2+tX/+mDF1PJYpVRQH3XEor7+bBp PVBjltY0zDqcScPGLQjGeLYdkAY4iMXJHCrAaaHVww6RTiaOYU5Bn4c2sF07IWTF6itQ qQrkUwvXpOJdjq9vaZDC99Ux3Jye+3U4VkneXFFGfOV8VpdP2tZ8UgrD/KIjNSoRj+Rg bGQA== X-Gm-Message-State: APjAAAU3pcq+eJBdVwp8e8uRzeAwyR58TD3LBPWnMIW/jid5XOwFLQ5k Xc/blWuOmcmfFezNe2qa9K0= X-Google-Smtp-Source: APXvYqyfvs3gIwg7Rv5xp6TcW3c5XctuQ87q7PhtwVwh6JOGWhdd0wzWXtwfLqovinbIbuGX6KXKpQ== X-Received: by 2002:a5d:574b:: with SMTP id q11mr1949674wrw.41.1551267108683; Wed, 27 Feb 2019 03:31:48 -0800 (PST) Received: from rpluim-mac ([149.5.228.1]) by smtp.gmail.com with ESMTPSA id n14sm18129054wrx.24.2019.02.27.03.31.47 (version=TLS1_3 cipher=AEAD-AES256-GCM-SHA384 bits=256/256); Wed, 27 Feb 2019 03:31:47 -0800 (PST) From: Robert Pluim To: Eli Zaretskii Subject: Re: bug#34469: 26.1; EWW stops renderring web page on null byte References: <02sgwk1sza.fsf@fencepost.gnu.org> <83mumrivuv.fsf@gnu.org> <83bm37ir7e.fsf@gnu.org> X-Debbugs-No-Ack: yes Mail-Copies-To: never Gmane-Reply-To-List: yes Date: Wed, 27 Feb 2019 12:31:45 +0100 In-Reply-To: (Robert Pluim's message of "Wed, 20 Feb 2019 19:48:50 +0100") Message-ID: MIME-Version: 1.0 Content-Type: text/plain X-Spam-Score: 0.0 (/) X-Debbugs-Envelope-To: 34469 Cc: 34469@debbugs.gnu.org, nicholasdrozd@gmail.com X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: 
List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-) Robert Pluim writes: Ping! Eli, release or master? > 2019-02-20 Robert Pluim > > * lisp/net/eww.el (eww-display-html): Replace NULL characters with > "\0", as libxml can't handle embedded NULLs. > diff --git i/lisp/net/eww.el w/lisp/net/eww.el > index 555b3bd591..06075b1ebd 100644 > --- i/lisp/net/eww.el > +++ w/lisp/net/eww.el > @@ -462,10 +462,12 @@ eww-display-html > (condition-case nil > (decode-coding-region (point) (point-max) encode) > (coding-system-error nil)) > - (save-excursion > - ;; Remove CRLF before parsing. > - (while (re-search-forward "\r$" nil t) > - (replace-match "" t t))) > + (save-excursion > + ;; Remove CRLF and NULL before parsing. > + (while (re-search-forward "\\(\r$\\)\\|\\(\000\\)" nil t) > + (replace-match (if (match-beginning 1) > + "" > + "\\0") t t))) > (libxml-parse-html-region (point) (point-max)))))) > (source (and (null document) > (buffer-substring (point) (point-max))))) From debbugs-submit-bounces@debbugs.gnu.org Wed Feb 27 10:56:12 2019 Received: (at 34469) by debbugs.gnu.org; 27 Feb 2019 15:56:13 +0000 Received: from localhost ([127.0.0.1]:54174 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1gz1Z6-0001Qe-Kp for submit@debbugs.gnu.org; Wed, 27 Feb 2019 10:56:12 -0500 Received: from eggs.gnu.org ([209.51.188.92]:44111) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1gz1Z4-0001QR-Ua for 34469@debbugs.gnu.org; Wed, 27 Feb 2019 10:56:11 -0500 Received: from fencepost.gnu.org ([2001:470:142:3::e]:51392) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gz1Yq-0004FT-DY; Wed, 27 Feb 2019 10:56:00 -0500 Received: from [176.228.60.248] (port=3846 helo=home-c4e4a596f7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1gz1Yb-0004PT-4h; Wed, 27 Feb 2019 10:55:44 -0500 Date: Wed, 27 Feb 2019 17:55:56 +0200 
Message-Id: <83imx5kyyb.fsf@gnu.org> From: Eli Zaretskii To: Robert Pluim In-reply-to: (message from Robert Pluim on Wed, 27 Feb 2019 12:31:45 +0100) Subject: Re: bug#34469: 26.1; EWW stops renderring web page on null byte References: <02sgwk1sza.fsf@fencepost.gnu.org> <83mumrivuv.fsf@gnu.org> <83bm37ir7e.fsf@gnu.org> X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Spam-Score: 0.0 (/) X-Debbugs-Envelope-To: 34469 Cc: 34469@debbugs.gnu.org, nicholasdrozd@gmail.com X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-) > From: Robert Pluim > Cc: 34469@debbugs.gnu.org, nicholasdrozd@gmail.com > Date: Wed, 27 Feb 2019 12:31:45 +0100 > > Robert Pluim writes: > > Ping! > > Eli, release or master? Master, please. From debbugs-submit-bounces@debbugs.gnu.org Wed Feb 27 11:21:50 2019 Received: (at 34469) by debbugs.gnu.org; 27 Feb 2019 16:21:50 +0000 Received: from localhost ([127.0.0.1]:54202 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1gz1xt-00022X-Vi for submit@debbugs.gnu.org; Wed, 27 Feb 2019 11:21:50 -0500 Received: from mail-wr1-f48.google.com ([209.85.221.48]:41665) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1gz1xp-00022C-U9; Wed, 27 Feb 2019 11:21:46 -0500 Received: by mail-wr1-f48.google.com with SMTP id n2so18628320wrw.8; Wed, 27 Feb 2019 08:21:45 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:references:mail-copies-to:gmane-reply-to-list :date:message-id:mime-version; bh=xjAGH1YwEqaM6RtU5fi/ruHbsICGZOUaQ0Ky4tiUu+Y=; b=sGZwAS4mztGu7in3ErdvoP9aphxqabcATwUyguhsIlAACUbJsFTrRKnRcqWTqGIKnj ZDRSBm2kJSXWTO7889jG4eOwWtqTTv0Ri75Zkc7YW+JPigJNWcSJOZAcMs45ZFTHIryC 
nUSWcdUpPEiMsCEeqGzxaLArbtlPXpXq0L2M22WWnikD7cAO5xrPd7vm9YFaBd+qMAkI B7PaQCt0e85O+ASJ/O/VI5xfJunAsI3S7/iHf4rynJXsaJTMff2Mc4ieXD7Ux7rXPvko ssdP+ztbXsN6rAcniHSPO0XFf7VmLY408ZkRyWoCS8uArIg3ZopzIiU2QOZip2Zr8QsK uTHw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:references:mail-copies-to :gmane-reply-to-list:date:message-id:mime-version; bh=xjAGH1YwEqaM6RtU5fi/ruHbsICGZOUaQ0Ky4tiUu+Y=; b=m8kHxr+igMCchkEoTX66PCG88mzEvoJW8+9Tqp5KqpoZ3GN38VnbsOVJy2t2gd9IUB jc2yCx8R1CdY2aY9NS/kWgILSGIGwqI8Y8slkf6Pwgu/n1S2JJuG/Y8aJNuX/9jLisBd oXkTwjUjfaaxyzTwz9UjoJzAQMttmRvlcbIDjHilmDznbZ9LA3NlPYo7kCU3sd62W1sR yO81lG5lds4NzcnS2SNKUZsnEyedmIqv7XKSnjBBD096xXUOavdTzbX4mVveWBBMB5vT rLSafM3gBBVbuynbUXLiWXbab1p8G9a/6zRlZ4M+4UajQZE+kXvFgLFtvD3Kcy8RInuN CDxg== X-Gm-Message-State: APjAAAWTOeHtLd45C1h/HLVwKVN9A6heOLmZGDIDZFGaST7KiyUD0mdj d8qT0QChtEagtHvCVSdPUEUol22yJnI= X-Google-Smtp-Source: APXvYqzldgSCWYvTJp9xFiqGw/81fVC1+Y6Vt0V7q9h1MdFGp74DDBpzl6nUladl1p62ojH5Lbb8lQ== X-Received: by 2002:adf:e90b:: with SMTP id f11mr3117096wrm.36.1551284498780; Wed, 27 Feb 2019 08:21:38 -0800 (PST) Received: from rpluim-mac ([149.5.228.1]) by smtp.gmail.com with ESMTPSA id w4sm25685522wrk.85.2019.02.27.08.21.37 (version=TLS1_3 cipher=AEAD-AES256-GCM-SHA384 bits=256/256); Wed, 27 Feb 2019 08:21:37 -0800 (PST) From: Robert Pluim To: Eli Zaretskii Subject: Re: bug#34469: 26.1; EWW stops renderring web page on null byte References: <02sgwk1sza.fsf@fencepost.gnu.org> <83mumrivuv.fsf@gnu.org> <83bm37ir7e.fsf@gnu.org> <83imx5kyyb.fsf@gnu.org> X-Debbugs-No-Ack: yes Mail-Copies-To: never Gmane-Reply-To-List: yes Date: Wed, 27 Feb 2019 17:21:36 +0100 Message-ID: MIME-Version: 1.0 Content-Type: text/plain X-Spam-Score: 1.0 (+) X-Debbugs-Envelope-To: 34469 Cc: 34469@debbugs.gnu.org, nicholasdrozd@gmail.com X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: 
List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-) tags 34469 fixed close 34469 27.1 quit Eli Zaretskii writes: >> From: Robert Pluim >> Cc: 34469@debbugs.gnu.org, nicholasdrozd@gmail.com >> Date: Wed, 27 Feb 2019 12:31:45 +0100 >> >> Robert Pluim writes: >> >> Ping! >> >> Eli, release or master? > > Master, please. Done as d07f3aae48 Closing. Robert From debbugs-submit-bounces@debbugs.gnu.org Wed Feb 27 20:53:02 2019 Received: (at 34469) by debbugs.gnu.org; 28 Feb 2019 01:53:02 +0000 Received: from localhost ([127.0.0.1]:54506 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1gzAsg-0003iH-0q for submit@debbugs.gnu.org; Wed, 27 Feb 2019 20:53:02 -0500 Received: from zimbra.cs.ucla.edu ([131.179.128.68]:38024) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1gzAsd-0003ht-QL for 34469@debbugs.gnu.org; Wed, 27 Feb 2019 20:53:00 -0500 Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id 1908C161461; Wed, 27 Feb 2019 17:52:54 -0800 (PST) Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10032) with ESMTP id 8xAAEIFbJbAk; Wed, 27 Feb 2019 17:52:53 -0800 (PST) Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id 30EC9161464; Wed, 27 Feb 2019 17:52:53 -0800 (PST) X-Virus-Scanned: amavisd-new at zimbra.cs.ucla.edu Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id DbteghPDkLJF; Wed, 27 Feb 2019 17:52:53 -0800 (PST) Received: from Penguin.CS.UCLA.EDU (Penguin.CS.UCLA.EDU [131.179.64.200]) by zimbra.cs.ucla.edu (Postfix) with ESMTPSA id 0BC671613AD; Wed, 27 Feb 2019 17:52:53 -0800 (PST) To: Robert Pluim From: Paul Eggert Subject: 26.1; EWW stops renderring web page on null byte Openpgp: preference=signencrypt Autocrypt: 
addr=eggert@cs.ucla.edu; prefer-encrypt=mutual; keydata= xsFNBEyAcmQBEADAAyH2xoTu7ppG5D3a8FMZEon74dCvc4+q1XA2J2tBy2pwaTqfhpxxdGA9 Jj50UJ3PD4bSUEgN8tLZ0san47l5XTAFLi2456ciSl5m8sKaHlGdt9XmAAtmXqeZVIYX/UFS 96fDzf4xhEmm/y7LbYEPQdUdxu47xA5KhTYp5bltF3WYDz1Ygd7gx07Auwp7iw7eNvnoDTAl KAl8KYDZzbDNCQGEbpY3efZIvPdeI+FWQN4W+kghy+P6au6PrIIhYraeua7XDdb2LS1en3Ss mE3QjqfRqI/A2ue8JMwsvXe/WK38Ezs6x74iTaqI3AFH6ilAhDqpMnd/msSESNFt76DiO1ZK QMr9amVPknjfPmJISqdhgB1DlEdw34sROf6V8mZw0xfqT6PKE46LcFefzs0kbg4GORf8vjG2 Sf1tk5eU8MBiyN/bZ03bKNjNYMpODDQQwuP84kYLkX2wBxxMAhBxwbDVZudzxDZJ1C2VXujC OJVxq2kljBM9ETYuUGqd75AW2LXrLw6+MuIsHFAYAgRr7+KcwDgBAfwhPBYX34nSSiHlmLC+ KaHLeCLF5ZI2vKm3HEeCTtlOg7xZEONgwzL+fdKo+D6SoC8RRxJKs8a3sVfI4t6CnrQzvJbB n6gxdgCu5i29J1QCYrCYvql2UyFPAK+do99/1jOXT4m2836j1wARAQABzSBQYXVsIEVnZ2Vy dCA8ZWdnZXJ0QGNzLnVjbGEuZWR1PsLBfgQTAQIAKAUCTIByZAIbAwUJEswDAAYLCQgHAwIG FQgCCQoLBBYCAwECHgECF4AACgkQ7ZfpDmKqfjRRGw/+Ij03dhYfYl/gXVRiuzV1gGrbHk+t nfrI/C7fAeoFzQ5tVgVinShaPkZo0HTPf18x6IDEdAiO8Mqo1yp0CtHmzGMCJ50o4Grgfjlr 6g/+vtEOKbhleszN2XpJvpwM2QgGvn/laTLUu8PH9aRWTs7qJJZKKKAb4sxYc92FehPu6FOD 0dDiyhlDAq4lOV2mdBpzQbiojoZzQLMQwjpgCTK2572eK9EOEQySUThXrSIz6ASenp4NYTFH s9tuJQvXk9gZDdPSl3bp+47dGxlxEWLpBIM7zIONw4ks4azgT8nvDZxA5IZHtvqBlJLBObYY 0Le61Wp0y3TlBDh2qdK8eYL426W4scEMSuig5gb8OAtQiBW6k2sGUxxeiv8ovWu8YAZgKJfu oWI+uRnMEddruY8JsoM54KaKvZikkKs2bg1ndtLVzHpJ6qFZC7QVjeHUh6/BmgvdjWPZYFTt N+KA9CWX3GQKKgN3uu988yznD7LnB98T4EUH1HA/GnfBqMV1gpzTvPc4qVQinCmIkEFp83zl +G5fCjJJ3W7ivzCnYo4KhKLpFUm97okTKR2LW3xZzEW4cLSWO387MTK3CzDOx5qe6s4a91Zu ZM/j/TQdTLDaqNn83kA4Hq48UHXYxcIh+Nd8k/3w6lFuoK0wrOFiywjLx+0ur5jmmbecBGHc 1xdhAFHOwU0ETIByZAEQAKaF678T9wyH4wjTrV1Pz3cDEoSnV/0ZUrOT37p1dcGyj/IXq1x6 70HRVahAmk0sZpYc25PF9D5GPYHFWlNjuPU96rDndXB3hedmBRhLdC4bAXjI4DV+bmdVe+q/ IMnlZRaVlm9EiMCVAR6w13sReu7qXkW9r3RwY2AzXskp/tAe4BRKr1Zmbvi2nbnQ6epEC42r Rbx0B1EhjbIQZ5JHGk24iPT7LdBgnNmos5wYjzwNlkMQD5T0Ydzhk7J+UxwA5m46mOhRDC2r FV/A0gm5TLy8DXjv/Esc4gYnYai6SQqnUEVh5LuV8YCJBnijs+Tiw71x1icmn6xGI45EugJO 
gec+rLypYgpVp4x0HI5T88qBRYCkxH3Kg8Qo+EWNA9A4LRQ9DX8njona0gf0s03tocK8kBN6 6UoqqPtHBnc4eMgBymCflK12eKfd2YYxnyg9cZazWA5VslvTxpm76hbg5oiAEH/Vg/8MxHyA nPhfrgwyPrmJEcVBafdspJnYQxBYNco2LFPIhlOvWh8r4at+s+M3Lb26oUTczlgdW1Sf3SDA 77BMRnF0FQyE+7AzV79MBN4ykiqaezQxtaF1Fy/tvkhffSo8u+dwG0EgJh+te38gTcISVr0G IPplLz6YhjrbHrPRF1CN5UuL9DBGjxuN35RLNVEfta6RUFlR6NctTjvrABEBAAHCwWUEGAEC AA8FAkyAcmQCGwwFCRLMAwAACgkQ7ZfpDmKqfjSrHA/+KzAKvTxRhA9MWNLxIyJ7S5uJ16gs T3oCjZrBKGEhKMOGX4O0GA6VOEryO7QRCCYah3oxSG38IAnNeiwJXgU9Bzkk85UGbPEd7HGF /VSeHCQwWou6jqUDTSDvn9YhNTdG0KXPM74aC+xr2Zow1O2mhXihgWKD0Dw+0LYPnUOsQ0KO FxHXXYHmRrS1OZPU59BLvc+TRhIhafSHKLwbXK+6ckkxBx6h8z5ccpG0Qs4bFhdFYnFrEieD LoGmnE2YLhdV6swJ9VNCS6pLiEohT3fm7aXm15tZOIyzMZhHRSAPblXxQ0ZSWjq8oRrcYNFx c4W1URpAkBCOYJoXvQfD5L3lqAl8TCqDUzYxhH/tJhbDdHrqHH767jaDaTB1+Talp/2AMKwc XNOdiklGxbmHVG6YGl6g8Lrbsu9NZEI4yLlHzuikthJWgz+3vZhVGyNlt+HNIoF6CjDL2omu 5cEq4RDHM44QqPk6l7O0pUvN1mT4B+S1b08RKpqm/ff015E37HNV/piIvJlxGAYz8PSfuGCB 1thMYqlmgdhd9/BabGFbGGYHA6U4/T5zqU+f6xHy1SsAQZ1MSKlLwekBIT+4/cLRGqCHjnV0 q5H/T6a7t5mPkbzSrOLSo4puj+IToNjYyYIDBWzhlA19avOa+rvUjmHtD3sFN7cXWtkGoi8b uNcby4U= Organization: UCLA Computer Science Department Message-ID: Date: Wed, 27 Feb 2019 17:52:52 -0800 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.5.1 MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="------------C185871BD1884D25D2D727B8" Content-Language: en-US X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 34469 Cc: Glenn Morris , Eli Zaretskii , 34469@debbugs.gnu.org, Lukasz Pawelczyk , Nicholas Drozd X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) This is a multi-part message in MIME format. 
--------------C185871BD1884D25D2D727B8 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit Thanks for fixing that bug. However, replacing NUL with \0 sounds iffy. Even if we assume that a web page contains C-like code, the replacement would mishandle a NUL followed by an octal digit, since the replacement would look like \07 which would be interpreted as a BEL character, not as a NULL followed by a digit 7. And web pages do not typically contain C code, so the replacement \0 might cause other trouble. Instead, it sounds better to replace NUL with the four-character sequence "&#0;", as this is a standard HTML way to represent a NUL character. I installed the attached patch to do this. In my little tests with this patch, libxml2 typically handled &#0; by discarding it and continuing to parse, which is better than ignoring the rest of the input. In some cases libxml2 handles &#0; by discarding later input up to a delimiter; although this is bad, it's a libxml2 bug that attackers can exploit independently of what Emacs does with NUL, since attackers can simply use &#0;. --------------C185871BD1884D25D2D727B8 Content-Type: text/x-patch; name="0001-Escape-HTML-NUL-as-0-in-eww.patch" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="0001-Escape-HTML-NUL-as-0-in-eww.patch" >From f7c4d5ce2399fc86b130fd55d3da2c313403f638 Mon Sep 17 00:00:00 2001 From: Paul Eggert Date: Wed, 27 Feb 2019 14:35:51 -0800 Subject: [PATCH] Escape HTML NUL as &#0; in eww * lisp/net/eww.el (eww-display-html): Escape NUL as &#0; as this is more appropriate for HTML. --- lisp/net/eww.el | 8 +++----- 1 file changed, 3 insertions(+), 5 deletions(-) diff --git a/lisp/net/eww.el b/lisp/net/eww.el index 3ec6c1cfd3..3e9334532c 100644 --- a/lisp/net/eww.el +++ b/lisp/net/eww.el @@ -471,11 +471,9 @@ eww-display-html (decode-coding-region (point) (point-max) encode) (coding-system-error nil)) (save-excursion - ;; Remove CRLF and NULL before parsing.
- (while (re-search-forward "\\(\r$\\)\\|\\(\000\\)" nil t) - (replace-match (if (match-beginning 1) - "" - "\\0") t t))) + ;; Remove CRLF and replace NUL with &#0; before parsing. + (while (re-search-forward "\\(\r$\\)\\|\0" nil t) + (replace-match (if (match-beginning 1) "" "&#0;") t t))) (libxml-parse-html-region (point) (point-max)))))) (source (and (null document) (buffer-substring (point) (point-max))))) -- 2.20.1 --------------C185871BD1884D25D2D727B8-- From debbugs-submit-bounces@debbugs.gnu.org Thu Feb 28 03:46:57 2019 Received: (at 34469) by debbugs.gnu.org; 28 Feb 2019 08:46:57 +0000 Received: from localhost ([127.0.0.1]:54764 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1gzHLF-00012u-Bb for submit@debbugs.gnu.org; Thu, 28 Feb 2019 03:46:57 -0500 Received: from mail-wm1-f45.google.com ([209.85.128.45]:50839) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1gzHLD-00012g-IE for 34469@debbugs.gnu.org; Thu, 28 Feb 2019 03:46:56 -0500 Received: by mail-wm1-f45.google.com with SMTP id x7so8325917wmj.0 for <34469@debbugs.gnu.org>; Thu, 28 Feb 2019 00:46:55 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:references:date:in-reply-to:message-id :mime-version; bh=B7WPdguReHAAdR9TWcTn7/BLDQ3zWixTCGIMIrOgaGQ=; b=VQCEOp5sQqzggz2kXBOnK/3ohPXmJwdiS3HQIoLNvW2n8RhjJfEmuob9LS1P/ekyXL nadTNqc8QesUrl1QIkVS5dLlaCsf+wOM42gaJmiifMLE5vZ4tG8BUxU49dwUmt7TvS+i KJFAU+udkr3XOFmF1cc5QiJV2zM05eC/mwew8P1b4ZhytlSm7zC2J/VpJ8DSos0S5+Py S2eEoAZpqSBvIDxiFu9lZ9F5mc0Z9CVis2GKjhBkmRPrWM3zvyyqqwycE9Qr0yOoEFMO zeosDwCFx6VHSquMIhcD6paYWU9DRDBVZkYUlBPTNQVvPm/MnO+5bu6unrsZGitR2Qrd FhUA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:references:date:in-reply-to :message-id:mime-version; bh=B7WPdguReHAAdR9TWcTn7/BLDQ3zWixTCGIMIrOgaGQ=; b=LQgTRmwCTgDpZLi0Hi2fLMJZKAqbAd59nqlCCV8T4+mSyZ25OpNCl5qQfK1uAGd1wV
kTkcQwmKsbkvsWEWV+3ptdkG+WNDSLBmPOI18AEDMZzqeohlCwyDW3TpJqgTStaI8HNI LuuJ394w/JOGbhcwcOWY9ZQ/FDGVjtJewBMRAZccn5N/BJ5i9H9R1d7yYRFURFW2iF8E oZ0UVZHTuLOkrd7poqlu2Kp9dkZSde/BU11UeWuPO6viqsgtdnCIa+KTvRSC43C+S7Rf pDVeIyMRjXq2uuburirJ54DeM+XjHdk9XQRV9FulB2Rm4MbFFngLVym/6+zj+2xwTtZ7 c0/w== X-Gm-Message-State: AHQUAub74DZhtwfM4Dvwq9lOIWJl42/zilhJsuGUY3zAjLlzxYXdQW6/ Qn97gZxm9JuK3jNhfyA73hc= X-Google-Smtp-Source: AHgI3IbRoDf75F79opj/JK3iwbarSi1FYp4qTDWQD39+Rf2l10sMzC20zsie7oXknGAajBlbjWtfSQ== X-Received: by 2002:a1c:7415:: with SMTP id p21mr2170718wmc.31.1551343609310; Thu, 28 Feb 2019 00:46:49 -0800 (PST) Received: from rpluim-mac ([2a01:e34:ecfc:a860:c3a:aa7:7f6a:ec3a]) by smtp.gmail.com with ESMTPSA id k6sm26700067wrq.82.2019.02.28.00.46.47 (version=TLS1_3 cipher=AEAD-AES256-GCM-SHA384 bits=256/256); Thu, 28 Feb 2019 00:46:48 -0800 (PST) From: Robert Pluim To: Paul Eggert Subject: Re: 26.1; EWW stops renderring web page on null byte References: Date: Thu, 28 Feb 2019 09:46:46 +0100 In-Reply-To: (Paul Eggert's message of "Wed, 27 Feb 2019 17:52:52 -0800") Message-ID: MIME-Version: 1.0 Content-Type: text/plain X-Spam-Score: 0.0 (/) X-Debbugs-Envelope-To: 34469 Cc: Glenn Morris , Eli Zaretskii , 34469@debbugs.gnu.org, Lukasz Pawelczyk , Nicholas Drozd X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-) Paul Eggert writes: > Thanks for fixing that bug. However, replacing NUL with \0 sounds iffy. > Even if we assume that a web page contains C-like code, the replacement > would mishandle a NUL followed by an octal digit, since the replacement > would look like \07 which would be interpreted as a BEL character, not > as a NULL followed by a digit 7. And web pages do not typically contain > C code, so the replacement \0 might cause other trouble. 
> In my sample of 1 website, 100% of them contained C code :-) > Instead, it sounds better to replace NUL with the four-character > sequence "&#0;", as this is a standard HTML way to represent a NUL > character. I installed the attached patch to do this. > OK by me. Robert From unknown Sat Jun 14 00:06:45 2025 Received: (at fakecontrol) by fakecontrolmessage; To: internal_control@debbugs.gnu.org From: Debbugs Internal Request Subject: Internal Control Message-Id: bug archived. Date: Thu, 28 Mar 2019 11:24:04 +0000 User-Agent: Fakemail v42.6.9 # This is a fake control message. # # The action: # bug archived. thanks # This fakemail brought to you by your local debbugs # administrator
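For reference, the replacement discussed in this thread can be tried in isolation. What follows is only a minimal sketch, not the code in eww.el: the function name my-eww-escape-nul is hypothetical, and it assumes the HTML text is already in the current buffer. It deletes the CR of each CRLF and rewrites each NUL as the literal four characters &#0; so that the text no longer contains an embedded NUL when it is handed to a parser.

```elisp
;; Hypothetical helper for illustration only; this name is not part of eww.el.
(defun my-eww-escape-nul ()
  "Delete CR at end of line and replace each NUL with \"&#0;\" in the current buffer."
  (save-excursion
    (goto-char (point-min))
    ;; Group 1 matches a CR at end of line; the other alternative is a NUL.
    (while (re-search-forward "\\(\r$\\)\\|\0" nil t)
      ;; The final t makes the replacement literal text.
      (replace-match (if (match-beginning 1) "" "&#0;") t t))))

;; Usage: a NUL followed by the digit 7 stays unambiguous after escaping.
(with-temp-buffer
  (insert "x\0007\r\ny")
  (my-eww-escape-nul)
  (buffer-string))
;; => "x&#0;7\ny"
```

With the earlier "\0" replacement the same input would have produced the text \07, which is the ambiguity Paul Eggert describes above.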