GNU bug report logs - #19431
24.4; Bad handling of RFC2047 encoded headers by 'mail-extract-address-components'

Package: emacs;

Reported by: Enrico Scholz <enrico.scholz <at> sigma-chemnitz.de>

Date: Mon, 22 Dec 2014 17:58:01 UTC

Severity: normal

Tags: fixed

Found in version 24.4

Done: Lars Ingebrigtsen <larsi <at> gnus.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 19431 in the body.
You can then email your comments to 19431 AT debbugs.gnu.org in the normal way.


Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#19431; Package emacs. (Mon, 22 Dec 2014 17:58:01 GMT)

Acknowledgement sent to Enrico Scholz <enrico.scholz <at> sigma-chemnitz.de>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Mon, 22 Dec 2014 17:58:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Enrico Scholz <enrico.scholz <at> sigma-chemnitz.de>
To: bug-gnu-emacs <at> gnu.org
Subject: 24.4; Bad handling of RFC2047 encoded headers by
 'mail-extract-address-components'
Date: Mon, 22 Dec 2014 16:10:56 +0100

Hi,

the emacs email framework fails on email addresses containing umlauts.
E.g. in the following example

--- {{{ snip ---
; libraries providing the two functions used below
(require 'mail-extr)
(require 'rfc2047)
; set a (nearly) real-world To: address; the umlaut '=C3=A4' encoding
; was replaced by '=61=65'
(let* ((address "=?utf-8?Q?B=61=65r=2C_Klaus?= <test@example.com>")
       (decoded (rfc2047-decode-string address)))
  ; show output with encoded umlauts and non-RFC2047 header
  (print (mail-extract-address-components "\"Baer, Klaus\" <test@example.com>"))
  (print address t)
  (print decoded t)
  ; previous prints were just for debugging purposes; now, the real
  ; functions will be called...
  (print (mail-extract-address-components address))
  (print (mail-extract-address-components decoded)))
--- }}} snip ---

neither of the last two calls produces the expected split:

| ("Klaus Baer" "test@example.com")            <--- this is expected
| 
| "=?utf-8?Q?B=61=65r=2C_Klaus?= <test@example.com>"
| 
| "Baer, Klaus <test@example.com>"
| 
| ("utf" "test@example.com")   <-- BAD (working on undecoded string)
| 
| (nil "Baer")                 <-- BAD (working on decoded string)
| (nil "Baer")


Unfortunately, such RFC2047 encoded addresses are very common in Germany,
so e.g. BBDB (which works on the 'decoded' string) fails in very many
cases.



Enrico




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#19431; Package emacs. (Sun, 15 Apr 2018 17:52:02 GMT)

Message #8 received at 19431 <at> debbugs.gnu.org:

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Enrico Scholz <enrico.scholz <at> sigma-chemnitz.de>
Cc: 19431 <at> debbugs.gnu.org
Subject: Re: bug#19431: 24.4; Bad handling of RFC2047 encoded headers by
 'mail-extract-address-components'
Date: Sun, 15 Apr 2018 19:51:13 +0200

Enrico Scholz <enrico.scholz <at> sigma-chemnitz.de> writes:

> the emacs email framework fails on email addresses containing umlauts.
> E.g. in the following example
>
> --- {{{ snip ---
> ; set a (nearly) real-world To: address; the umlaut '=C3=A4' encoding
> ; was replaced by '=61=65'
> (let* ((address "=?utf-8?Q?B=61=65r=2C_Klaus?= <test@example.com>")
>        (decoded (rfc2047-decode-string address)))
>   ; show output with encoded umlauts and non-RFC2047 header
>   (print (mail-extract-address-components "\"Baer, Klaus\" <test@example.com>"))
>   (print address t)
>   (print decoded t)
>   ; previous prints were just for debugging purposes; now, the real
>   ; functions will be called...
>   (print (mail-extract-address-components address))
>   (print (mail-extract-address-components decoded)))

Yes, that's a very confusing and not very useful function.  I've now
updated the doc string to point to `mail-header-parse-address', which is
the function that should be used to parse address headers, and does the
right thing also on German addresses.

I don't think it's worth trying to fix the mess that is
`mail-extract-address-components'.
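
A minimal sketch of the approach suggested above: split the raw header
first, and only then decode the display name, so that a comma hidden
inside the RFC2047 encoded word cannot be mistaken for an address
separator.  The helper name `my-parse-encoded-address' is hypothetical
and not part of Emacs; the sketch assumes that
`mail-header-parse-address' returns a (MAILBOX . DISPLAY-NAME) pair and
leaves the display name encoded unless asked otherwise.

```elisp
(require 'mail-parse)   ; provides `mail-header-parse-address'
(require 'rfc2047)      ; provides `rfc2047-decode-string'

;; Hypothetical helper, for illustration only.
(defun my-parse-encoded-address (header)
  "Return (DISPLAY-NAME MAILBOX) for the single address in HEADER.
HEADER must still be RFC2047 encoded.  It is split into mailbox and
display name first; only the display name is decoded afterwards."
  (let* ((parsed (mail-header-parse-address header))
         (mailbox (car parsed))
         (name (cdr parsed)))
    ;; NAME is nil when the header carries no display name.
    (list (and name (rfc2047-decode-string name))
          mailbox)))

;; Example call with a real umlaut in the encoded word:
(my-parse-encoded-address
 "=?utf-8?Q?B=C3=A4r=2C_Klaus?= <test@example.com>")
```

The result keeps the (NAME ADDRESS) order that
`mail-extract-address-components' uses, but unlike that function it
does not rewrite "Last, First" into "First Last".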

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




Added tag(s) fixed. Request was from Lars Ingebrigtsen <larsi <at> gnus.org> to control <at> debbugs.gnu.org. (Sun, 15 Apr 2018 17:52:02 GMT)

Bug closed; send any further explanations to 19431 <at> debbugs.gnu.org and Enrico Scholz <enrico.scholz <at> sigma-chemnitz.de>. Request was from Lars Ingebrigtsen <larsi <at> gnus.org> to control <at> debbugs.gnu.org. (Sun, 15 Apr 2018 17:52:02 GMT)

Bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Mon, 14 May 2018 11:24:04 GMT)

This bug report was last modified 7 years and 96 days ago.

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.