Package: emacs;
Reported by: Nacho Barrientos <nacho.barrientos <at> cern.ch>
Date: Sun, 17 Apr 2022 10:51:01 UTC
Severity: normal
Found in version 28.1
Fixed in version 29.1
Done: Lars Ingebrigtsen <larsi <at> gnus.org>
Bug is archived. No further changes may be made.
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
From: Nacho Barrientos <nacho.barrientos <at> cern.ch> To: <bug-gnu-emacs <at> gnu.org> Subject: 28.1; url-http.el: chunked response: wait for the last CRLF Date: Sun, 17 Apr 2022 10:02:18 +0200
[Message part 1 (text/plain, inline)]
Hi, This is a bug report for url.el, more specifically for url-http.el in the context of processing responses with Transfer-Encoding set to "chunked". As per [0], the last chunk of 0 bytes is always accompanied by a last CRLF that signals the end of the message: chunked-body = *chunk last-chunk trailer-part CRLF ^ this one chunk = chunk-size [ chunk-ext ] CRLF chunk-data CRLF chunk-size = 1*HEXDIG last-chunk = 1*("0") [ chunk-ext ] CRLF chunk-data = 1*OCTET ; a sequence of chunk-size octets `url-http-chunked-encoding-after-change-function' is able to process (and remove) that terminator IF AVAILABLE in the buffer when processing the response, however it won't wait for it if it's not yet there. In other words: | Bottom of the response buffer | Bottom of the full response | | (visible to url-http) | (to be delivered to Emacs) | | ------------------------------+-----------------------------| | 0\r\n | 0\r\n | | | \r\n | If the last chunk is processed when the bottom of the response buffer is as above (note that the whole response has not yet been delivered to Emacs), url-http will call the user callback without waiting for the final terminator to be read from the socket. This is normally not an issue when doing one-shot requests, but it's problematic when the connection is reused immediately. As there are 2 bytes from the request N that have not been dealt with, they'll be considered as part of the response of the request N+1. On top, it turns out that when processing the headers of request N+1, `url-http-wait-for-headers-change-function' will consider the request a "headerless malformed response" (due to the leading \r\n) delivering it broken to the caller. I'm attaching a patch that I've got applied locally that prevents the problem from happening by implementing an extra state in which `url-http-chunked-encoding-after-change-function` properly waits for the very last element of the message. Unfortunately my copyright papers are stuck on the FSF for months now so I'm unable to submit a patch directly myself so feel free to use the attached diff as a PoC to illustrate the nature of the problem and a possible fix. A more experienced Elisp developer will surely find a more elegant solution. Please note that the patch sits on top of the emacs-28 branch. For additional context, this bug was found when debugging magit/ghub (see [1] for details). Hope this helps. [0] https://datatracker.ietf.org/doc/html/rfc7230#section-4.1 [1] https://github.com/magit/ghub/issues/81
[0001-url-http.el-chunked-response-wait-for-the-last-CRLF.patch (text/x-patch, inline)]
From a11454602f7c4af3af5729b9e504130c660319f3 Mon Sep 17 00:00:00 2001 From: Nacho Barrientos <nacho.barrientos <at> cern.ch> Date: Sat, 16 Apr 2022 18:00:39 +0200 Subject: [PATCH] url-http.el: chunked response: wait for the last CRLF. As per [0], the last chunk of 0 bytes is always accompanied by a last CRLF that signals the end of the message: chunked-body = *chunk last-chunk trailer-part CRLF ^ this one chunk = chunk-size [ chunk-ext ] CRLF chunk-data CRLF chunk-size = 1*HEXDIG last-chunk = 1*("0") [ chunk-ext ] CRLF chunk-data = 1*OCTET ; a sequence of chunk-size octets `url-http-chunked-encoding-after-change-function' is able to process (and remove) that terminator IF AVAILABLE in the buffer when processing the response, however it won't wait for it if it's not yet there. In other words: | Bottom of the response buffer | Bottom of the full response | | (visible to url-http) | (to be delivered to Emacs) | | ------------------------------+-----------------------------| | 0\r\n | 0\r\n | | | \r\n | If the last chunk is processed when the bottom of the response buffer is as above (note that the whole response has not yet been delivered to Emacs), url-http will call the user callback without waiting for the final terminator to be read from the socket. This is normally not an issue when doing one-shot requests, but it's problematic when the connection is reused immediately. As there are 2 bytes from the request N that have not been dealt with, they'll be considered as part of the response of the request N+1. On top, it turns out that when processing the headers of request N+1, `url-http-wait-for-headers-change-function' will consider the request a "headerless malformed response" delivering it broken to the caller. The proposed fix implements a state in which `url-http-chunked-encoding-after-change-function` properly waits for the very last element of the message preventing the problem explained above from happening. For additional context, this bug was found when debugging magit/ghub (see [1] for details). [0] https://datatracker.ietf.org/doc/html/rfc7230#section-4.1 [1] https://github.com/magit/ghub/issues/81 --- lisp/url/url-http.el | 27 ++++++++++++++++++++------- 1 file changed, 20 insertions(+), 7 deletions(-) diff --git a/lisp/url/url-http.el b/lisp/url/url-http.el index 16c3a6a1e6..dfdc03cc6c 100644 --- a/lisp/url/url-http.el +++ b/lisp/url/url-http.el @@ -36,6 +36,7 @@ (defvar url-current-object) (defvar url-http-after-change-function) (defvar url-http-chunked-counter) +(defvar url-http-chunked-last-crlf-missing nil) (defvar url-http-chunked-length) (defvar url-http-chunked-start) (defvar url-http-connection-opened) @@ -1068,7 +1069,16 @@ the callback to be triggered." Cannot give a sophisticated percentage, but we need a different function to look for the special 0-length chunk that signifies the end of the document." - (save-excursion + (if url-http-chunked-last-crlf-missing + (progn + (goto-char url-http-chunked-last-crlf-missing) + (if (not (looking-at "\r\n")) + (url-http-debug "Still spinning for the terminator of last chunk...") + (url-http-debug "Saw the last CRLF.") + (delete-region (match-beginning 0) (match-end 0)) + (if (url-http-parse-headers) + (url-http-activate-callback)))) + (save-excursion (goto-char st) (let ((read-next-chunk t) (case-fold-search t) @@ -1145,13 +1155,16 @@ the end of the document." (url-display-percentage nil nil) ;; Every chunk, even the last 0-length one, is ;; terminated by CRLF. Skip it. - (when (looking-at "\r?\n") + (if (not (looking-at "\r?\n")) + (progn + (url-http-debug "Spinning for the terminator of last chunk...") + (setq-local url-http-chunked-last-crlf-missing (point))) (url-http-debug "Removing terminator of last chunk") - (delete-region (match-beginning 0) (match-end 0))) - (if (re-search-forward "^\r?\n" nil t) - (url-http-debug "Saw end of trailers...")) - (if (url-http-parse-headers) - (url-http-activate-callback)))))))))) + (delete-region (match-beginning 0) (match-end 0)) + (if (re-search-forward "^\r?\n" nil t) + (url-http-debug "Saw end of trailers...")) + (if (url-http-parse-headers) + (url-http-activate-callback)))))))))))) (defun url-http-wait-for-headers-change-function (_st nd _length) ;; This will wait for the headers to arrive and then splice in the -- 2.35.3
[Message part 3 (text/plain, inline)]
In GNU Emacs 28.1 (build 2, x86_64-pc-linux-gnu, GTK+ Version 3.24.33, cairo version 1.17.6) of 2022-04-09 built on frederik Windowing system distributor 'The X.Org Foundation', version 11.0.12101003 System Description: Arch Linux Configured using: 'configure --with-native-compilation --without-compress-install --sysconfdir=/etc --prefix=/usr --libexecdir=/usr/lib --localstatedir=/var --with-cairo --with-harfbuzz --with-modules --with-wide-int --with-x-toolkit=gtk3 --with-xft 'CFLAGS=-march=x86-64 -mtune=generic -O2 -pipe -fno-plt -fexceptions -Wp,-D_FORTIFY_SOURCE=2 -Wformat -Werror=format-security -fstack-clash-protection -fcf-protection -g -ffile-prefix-map=/build/emacs/src=/usr/src/debug -flto=auto' 'LDFLAGS=-Wl,-O1,--sort-common,--as-needed,-z,relro,-z,now -flto=auto'' Configured features: ACL CAIRO DBUS FREETYPE GIF GLIB GMP GNUTLS GPM GSETTINGS HARFBUZZ JPEG JSON LCMS2 LIBOTF LIBSYSTEMD LIBXML2 M17N_FLT MODULES NATIVE_COMP NOTIFY INOTIFY PDUMPER PNG RSVG SECCOMP SOUND THREADS TIFF TOOLKIT_SCROLL_BARS X11 XDBE XIM XPM GTK3 ZLIB Important settings: value of $LANG: en_US.UTF-8 locale-coding-system: utf-8-unix Major mode: ELisp/l Minor modes in effect: recentf-mode: t bug-reference-prog-mode: t global-git-commit-mode: t magit-auto-revert-mode: t goto-address-prog-mode: t erc-services-mode: t erc-networks-mode: t erc-spelling-mode: t flyspell-mode: t display-time-mode: t engine-mode: t mu4e-marker-icons-mode: t outline-minor-mode: t beginend-global-mode: t beginend-prog-mode: t auto-revert-mode: t editorconfig-mode: t csv-field-index-mode: t doom-modeline-mode: t yas-global-mode: t yas-minor-mode: t all-the-icons-ivy-rich-mode: t ivy-rich-project-root-cache-mode: t ivy-rich-mode: t ivy-posframe-mode: t ivy-mode: t shell-dirtrack-mode: t whole-line-or-region-global-mode: t whole-line-or-region-local-mode: t global-whitespace-mode: t override-global-mode: t global-eldoc-mode: t eldoc-mode: t show-paren-mode: t electric-indent-mode: t mouse-wheel-mode: t file-name-shadow-mode: t global-font-lock-mode: t font-lock-mode: t blink-cursor-mode: t window-divider-mode: t auto-composition-mode: t auto-encryption-mode: t auto-compression-mode: t line-number-mode: t auto-fill-function: yas--auto-fill transient-mark-mode: t hs-minor-mode: t Load-path shadows: /home/nacho/.emacs.d/elpa/transient-20220413.2332/transient hides /usr/share/emacs/28.1/lisp/transient Features: (shadow magit-patch magit-subtree magit-gitignore magit-ediff ediff bicycle crux pulse expand-region yaml-mode-expansions text-mode-expansions cc-mode-expansions the-org-mode-expansions ruby-mode-expansions js-mode-expansions web-mode-expansions er-basic-expansions expand-region-core expand-region-custom ielm rg rg-info-hack rg-menu rg-ibuffer rg-result wgrep-rg wgrep rg-history rg-header ibuf-ext ibuffer ibuffer-loaddefs grep cus-edit smiley gnus-cite qp mm-archive mail-extr cl-print debug cus-start dired-aux emacsbug mwim hippie-exp pcase counsel recentf tree-widget swiper network-stream url-cache secrets forge-list forge-commands forge-semi forge-bitbucket buck forge-gogs gogs forge-gitea gtea forge-gitlab glab forge-github ghub-graphql treepy gsexp ghub gnutls forge-notify forge-revnote forge-pullreq forge-issue forge-topic yaml bug-reference forge-post markdown-mode forge-repo forge forge-core forge-db closql emacsql-sqlite emacsql emacsql-compiler url-http url-auth url-gw nsm magit-extras magit-bookmark magit-submodule magit-obsolete magit-blame magit-stash magit-reflog magit-bisect magit-push magit-pull magit-fetch magit-clone magit-remote magit-commit magit-sequence magit-notes magit-worktree magit-tag magit-merge magit-branch magit-reset magit-files magit-refs magit-status magit magit-repos magit-apply magit-wip magit-log which-func magit-diff smerge-mode diff git-commit log-edit pcvs-util add-log magit-core magit-autorevert magit-margin magit-transient magit-process with-editor magit-mode transient magit-git magit-base magit-section crm em-unix em-term term ehelp em-script em-prompt em-hist em-pred em-glob em-cmpl em-basic em-banner em-alias hideshow goto-addr erc-services erc-join erc-networks erc-track erc-match erc-spelling erc-goodies erc erc-backend erc-loaddefs quail flyspell ispell org-element ol-eww ol-rmail ol-mhe ol-irc ol-info ol-gnus nnselect gnus-search eieio-opt speedbar ezimage dframe ol-docview ol-bibtex ol-bbdb ol-w3m ol-doi org-link-doi org-agenda appt diary-lib diary-loaddefs notifications org-tree-slide org-timer org-clock org-capture org-refile org-protocol desktop-environment time exwm-systemtray xcb-systemtray xcb-xembed exwm exwm-input xcb-keysyms xcb-xkb exwm-manage exwm-floating xcb-cursor xcb-render exwm-layout exwm-workspace exwm-core xcb-ewmh xcb-icccm xcb xcb-xproto xcb-types xcb-debug engine-mode eww xdg mm-url eradio elfeed-show elfeed-search vc-mtn vc-hg vc-git diff-mode vc-bzr vc-src vc-sccs vc-svn vc-cvs vc-rcs vc vc-dispatcher editorconfig-core editorconfig-core-handle editorconfig-fnmatch elfeed-csv elfeed elfeed-curl elfeed-log elfeed-db elfeed-lib avl-tree url-queue xml-query mu4e-marker-icons cus-load mu4e mu4e-org mu4e-main mu4e-view mu4e-view-gnus gnus-art mm-uu mml2015 mm-view mml-smime smime dig gnus-sum gnus-group gnus-undo gnus-start gnus-dbus dbus gnus-cloud nnimap nnmail mail-source utf7 netrc nnoo gnus-spec gnus-int gnus-range gnus-win gnus nnheader wid-edit mu4e-view-common mu4e-headers mu4e-compose mu4e-context mu4e-draft mu4e-actions ido rfc2368 smtpmail sendmail mu4e-mark mu4e-proc mu4e-utils doc-view jka-compr image-mode exif mu4e-lists mu4e-message shr kinsoku svg xml flow-fill org ob ob-tangle ob-ref ob-lob ob-table ob-exp org-macro org-footnote org-src ob-comint org-pcomplete org-list org-faces org-entities outline-minor-faces noutline outline org-version ob-emacs-lisp ob-core ob-eval org-table oc-basic bibtex ol org-keys oc org-compat org-macs org-loaddefs cal-menu calendar cal-loaddefs mule-util hl-line mu4e-vars message rmc puny rfc822 mml mml-sec epa derived epg rfc6068 epg-config gnus-util rmail rmail-loaddefs mm-decode mm-bodies mm-encode mailabbrev gmm-utils mu4e-meta helpful trace edebug backtrace info-look help-fns radix-tree elisp-refs beginend git-modes gitignore-mode gitconfig-mode conf-mode gitattributes-mode ediff-merg ediff-mult ediff-wind ediff-diff ediff-help ediff-init ediff-util git-link eshell-prompt-extras em-dirs esh-var esh-mode em-ls eshell-bookmark bookmark pp eshell package-lint let-alist finder find-func lisp-mnt mail-parse rfc2231 rfc2047 rfc2045 mm-util ietf-drums mail-prsvr powerthesaurus request mailheader autorevert filenotify mail-utils dom virtualenvwrapper gitignore-templates editorconfig sqlformat reformatter sql view jq-mode csv-mode sort web-mode advice disp-table sh-script executable archive-rpm bindat archive-cpio rpm-spec-mode yaml-mode json-mode json-snatcher js imenu cc-mode cc-fonts cc-guess cc-menus cc-cmds cc-styles cc-align cc-engine cc-vars cc-defs go-dlv gud go-mode find-file ffap thingatpt etags fileloop generator xref project rspec-mode compile text-property-search puppet-mode align ruby-mode smie doom-modeline doom-modeline-segments doom-modeline-env doom-modeline-core shrink-path f dash doom-themes-ext-visual-bell face-remap doom-tomorrow-night-theme doom-themes doom-themes-base yasnippet-snippets yasnippet amx comp comp-cstr warnings s all-the-icons-ivy-rich ivy-rich ivy-posframe posframe all-the-icons-ivy ivy delsel ivy-faces ivy-overlay colir color all-the-icons all-the-icons-faces data-material data-weathericons data-octicons data-fileicons data-faicons data-alltheicons dired dired-loaddefs pinentry tramp-cmds em-tramp esh-cmd esh-ext esh-opt esh-module esh-groups esh-proc esh-io esh-arg esh-util tramp-sh tramp tramp-loaddefs trampver tramp-integration files-x tramp-compat shell pcomplete comint ansi-color ring parse-time iso8601 time-date ls-lisp format-spec whole-line-or-region whitespace cl-extra help-mode use-package use-package-ensure use-package-delight use-package-diminish use-package-bind-key bind-key easy-mmode use-package-core finder-inf edmacro kmacro avoid server rx info package browse-url url url-proxy url-privacy url-expand url-methods url-history url-cookie url-domsuf url-util mailcap url-handlers url-parse auth-source cl-seq eieio eieio-core cl-macs eieio-loaddefs password-cache json subr-x map url-vars seq byte-opt gv bytecomp byte-compile cconv cl-loaddefs cl-lib iso-transl tooltip eldoc paren electric uniquify ediff-hook vc-hooks lisp-float-type elisp-mode mwheel term/x-win x-win term/common-win x-dnd tool-bar dnd fontset image regexp-opt fringe tabulated-list replace newcomment text-mode lisp-mode prog-mode register page tab-bar menu-bar rfn-eshadow isearch easymenu timer select scroll-bar mouse jit-lock font-lock syntax font-core term/tty-colors frame minibuffer cl-generic cham georgian utf-8-lang misc-lang vietnamese tibetan thai tai-viet lao korean japanese eucjp-ms cp51932 hebrew greek romanian slovak czech european ethiopic indian cyrillic chinese composite emoji-zwj charscript charprop case-table epa-hook jka-cmpr-hook help simple abbrev obarray cl-preloaded nadvice button loaddefs faces cus-face macroexp files window text-properties overlay sha1 md5 base64 format env code-pages mule custom widget hashtable-print-readable backquote threads dbusbind inotify lcms2 dynamic-setting system-font-setting font-render-setting cairo move-toolbar gtk x-toolkit x multi-tty make-network-process native-compile emacs) Memory information: ((conses 16 1609065 256669) (symbols 48 89566 7) (strings 32 380963 47984) (string-bytes 1 15303133) (vectors 16 156871) (vector-slots 8 3659229 445952) (floats 8 1948 1629) (intervals 56 33966 19519) (buffers 992 308)) -- bye Nacho http://cern.ch/nacho
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.