GNU bug report logs - #46342
28.0.50; socks-send-command munges IP address bytes to UTF-8


Package: emacs;

Reported by: "J.P." <jp <at> neverwas.me>

Date: Sat, 6 Feb 2021 11:47:01 UTC

Severity: normal

Tags: fixed, patch

Found in version 28.0.50

Fixed in version 28.1

Done: "J.P." <jp <at> neverwas.me>

Bug is archived. No further changes may be made.


From: "J.P." <jp <at> neverwas.me>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 46342 <at> debbugs.gnu.org
Subject: bug#46342: 28.0.50; socks-send-command munges IP address bytes to UTF-8
Date: Wed, 10 Feb 2021 05:16:58 -0800

Eli Zaretskii <eliz <at> gnu.org> writes:

> what kind of string can this ADDRESS be? My reading of RFC 1928 is
> that it normally is an IP address, in which case encoding is not
> relevant, as it's an ASCII string. But it can also be a domain, right?

This patch only affects IP addresses, but I'm happy to look into the
domain name form as well.
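
For illustration only, here is a rough Python sketch (not the Emacs code) of the kind of corruption the bug title describes: if the raw octets of an IPv4 address are treated as characters and then encoded as UTF-8, every octet at or above 0x80 turns into two bytes. The address 192.168.1.1 is an arbitrary example, and the Latin-1 decode is only a stand-in for whatever conversion happens inside Emacs.

```python
import socket

# Hypothetical example address; any octet >= 0x80 shows the effect.
raw = socket.inet_aton("192.168.1.1")      # 4 raw octets, as SOCKS5 expects

# Stand-in for the unwanted conversion: bytes -> characters -> UTF-8.
munged = raw.decode("latin-1").encode("utf-8")

print(raw.hex(" "))     # c0 a8 01 01
print(munged.hex(" "))  # c3 80 c2 a8 01 01
```

The four-octet address field grows to six bytes, so a proxy reading a fixed-length IPv4 address would see the wrong destination.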

> If so, what form can this domain take? If the domain has non-ASCII
> characters, shouldn't it be hex-encoded, or run through IDNA? I mean,
> are non-ASCII characters in that place at all allowed?

At first glance, both tor and ssh appear to call getaddrinfo() on the
remote end without accounting for the sender's locale or passing any
special IDN-related flags. But I'm still looking into these.

For now, if we're allowing anecdotal caveman logic, I'd wager the answer
is ASCII only. Here's why:

It seems feeding tor and ssh the hostname for Яндекс.рф (Yandex) as the
UTF-8 encoded byte string

  \xd0\xaf\xd0\xbd\xd0\xb4\xd0\xb5\xd0\xba\xd1\x81.\xd1\x80\xd1\x84

results in failure both when forwarding via CONNECT and when resolving
via tor's nonstandard RESOLVE command. (This is direct, no Emacs.)

However, passing the Punycode-encoded "xn--d1acpjx3f.xn--p1ai" works as
intended, forwarding to (or, in the case of RESOLVE, producing) an IP
from a Yandex-registered A record (for me, 77.88.55.66).
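
As a rough sketch of the two forms compared above, the following Python builds a SOCKS5 CONNECT request (RFC 1928, address type 0x03) and encodes the hostname both as raw UTF-8 and as its ASCII-compatible form. The helper name socks5_connect_request and the port 80 are made up for illustration. Python's built-in "idna" codec implements the older IDNA 2003 rules, so it only stands in for libidn2 here.

```python
import struct

def socks5_connect_request(host: bytes, port: int) -> bytes:
    """Build a SOCKS5 CONNECT request with a domain-name address (ATYP 0x03)."""
    if not 0 < len(host) <= 255:
        raise ValueError("domain name must fit in one length octet")
    # VER=5, CMD=1 (CONNECT), RSV=0, ATYP=3, then length-prefixed name, port.
    return b"\x05\x01\x00\x03" + bytes([len(host)]) + host + struct.pack("!H", port)

name = "Яндекс.рф"
utf8_host = name.encode("utf-8")   # the form tor and ssh rejected above
ace_host = name.encode("idna")     # ASCII-only "xn--" form

print(utf8_host)
print(ace_host)
print(socks5_connect_request(ace_host, 80).hex(" "))
```

Either byte string fits the length-prefixed field on the wire, which is why the choice of encoding has to be made by the sender rather than enforced by the protocol framing.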

To try this at home (on separate ttys):

  $ ssh -TND 4711 my.sshd
  # tcpdump -i lo -nnX "port 4711"
  $ curl --verbose --proxy socks5h://localhost:4711 Яндекс.рф

Here's a trace for curl's actual call to the hostname conversion
function idn2_lookup_ul() [1], which is provided by GNU libidn2 [2].
It's hard to see without context, but this happens before any connection
is established (tcpdump will confirm this).

#0  Curl_idnconvert_hostname at lib/url.c:1566
#1  create_conn at lib/url.c:3583
#2  Curl_connect at lib/url.c:4027
#3  multi_runsingle at lib/multi.c:1671
#4  curl_multi_perform at lib/multi.c:2412
#5  easy_transfer at lib/easy.c:606
#6  easy_perform at lib/easy.c:696
#7  curl_easy_perform at lib/easy.c:715
#8  serial_transfers at src/tool_operate.c:2327
#9  run_all_transfers at src/tool_operate.c:2505
#10 operate at src/tool_operate.c:2621
#11 main at src/tool_main.c:277

On my machine, curl was configured to pass these flags to idn2_lookup_ul [3]:

  /* IDN2_NFC_INPUT: Normalize input string using normalization form C.
     IDN2_NONTRANSITIONAL: Perform Unicode TR46 non-transitional
     processing. */
  int flags = IDN2_NFC_INPUT | IDN2_NONTRANSITIONAL;
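
To make the first flag concrete: normalization form C composes a base letter plus a combining mark into a single precomposed code point where one exists. This is a small Python sketch using the standard unicodedata module, not libidn2 itself, and the sample string is arbitrary.

```python
import unicodedata

decomposed = "\u0438\u0306"          # и followed by COMBINING BREVE
composed = unicodedata.normalize("NFC", decomposed)

print(len(decomposed), len(composed))   # 2 1
print(composed == "\u0439")             # True: й as one code point
```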

Apparently there are two IDNA standards: 2003 and 2008 [4]. Curl uses
the latter, but I'm not sure which, if any, puny.el favors. In the case
of Yandex,

  (puny-encode-domain "Яндекс.рф")

produces "xn--d1acpjx9e.xn--p1ai", which tor and ssh both reject (though
it's very possible I'm missing something). Anyway, passing the
libidn2-provided version shown earlier to socks-send-command works fine.
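
One place where the two IDNA standards visibly diverge is the German sharp s, which appears in the TR46 table cited as [4]. As a hedged illustration, Python's built-in "idna" codec follows the 2003 rules and maps "ß" to "ss", whereas the 2008 rules with non-transitional processing keep the character and yield a distinct "xn--" label. This says nothing about which rules puny.el follows.

```python
# Python's stdlib "idna" codec implements IDNA 2003 (RFC 3490).
idna2003 = "faß.de".encode("idna")
print(idna2003)   # b'fass.de'

# Under IDNA 2008 with non-transitional processing, TR46 lists the
# result for the same input as "xn--fa-hia.de" (not computed here).
```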

[1] https://github.com/curl/curl/blob/ec5d9b44a2e837fc7b82d1c60d5fae3f851620dc/lib/url.c#L1559
[2] https://www.gnu.org/software/libidn/libidn2/reference/libidn2-idn2.html#idn2-lookup-ul
[3] https://www.gnu.org/software/libidn/libidn2/reference/libidn2-idn2.html#idn2-flags
[4] https://www.unicode.org/reports/tr46/#Table_Example_Processing




This bug report was last modified 4 years and 86 days ago.
