GNU bug report logs - #66698
I think hex decoding with basenc -d --base16 should be case-insensitive

Package: coreutils

Reported by: Niels Möller <nisse <at> lysator.liu.se>

Date: Mon, 23 Oct 2023 09:39:02 UTC

Severity: normal

Done: Pádraig Brady <P <at> draigBrady.com>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 66698 in the body.
You can then email your comments to 66698 AT debbugs.gnu.org in the normal way.


Report forwarded to bug-coreutils <at> gnu.org:
bug#66698; Package coreutils. (Mon, 23 Oct 2023 09:39:02 GMT)

Acknowledgement sent to Niels Möller <nisse <at> lysator.liu.se>:
New bug report received and forwarded. Copy sent to bug-coreutils <at> gnu.org. (Mon, 23 Oct 2023 09:39:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Niels Möller <nisse <at> lysator.liu.se>
To: bug-coreutils <at> gnu.org
Subject: I think hex decoding with basenc -d --base16 should be
 case-insensitive
Date: Mon, 23 Oct 2023 11:37:36 +0200
Hi,

the docs for basenc --base16 say "hex encoding (RFC4648 section 8)".
The referenced section in that RFC says 

  Essentially, Base 16 encoding is the standard case-insensitive hex
  encoding and may be referred to as "base16" or "hex".

I think it would be both more useful, and consistent with docs, if
basenc -d --base16 accepted either upper- or lowercase hex digits.

Current behavior, with basenc (GNU coreutils) 9.1:

  $ echo 666F6F0A |basenc --base16 -d
  foo
  $ echo 666F6f0A |basenc --base16 -d
  fobasenc: invalid input

I think both inputs should give the same output, "foo\n", at least by
default. Possibly configurable with options like --strict, --upper,
--lower, etc (--upper/--lower would be useful also for the --base16
encoding, i.e., no -d).

Regards,
/Niels

-- 
Niels Möller. PGP key CB4962D070D77D7FCB8BA36271D8F1FF368C6677.
Internet email is subject to wholesale government surveillance.
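
The behavior requested above can be illustrated with a minimal sketch of a case-insensitive hex decoder. This is not the coreutils implementation; the names hex_value and hex_decode are hypothetical and exist only for this illustration. The sketch assumes the input length is known and that the caller supplies an output buffer of at least len / 2 bytes.

```c
#include <stddef.h>

/* Hypothetical helper: map one hex digit to its value, accepting
   both upper and lower case.  Return -1 for any other character.
   Explicit ranges avoid any dependence on <ctype.h>.  */
static int
hex_value (char c)
{
  if ('0' <= c && c <= '9')
    return c - '0';
  if ('A' <= c && c <= 'F')
    return c - 'A' + 10;
  if ('a' <= c && c <= 'f')
    return c - 'a' + 10;
  return -1;
}

/* Hypothetical helper: decode LEN hex digits from IN into OUT, which
   must have room for LEN / 2 bytes.  Return the number of bytes
   written, or -1 if LEN is odd or a non-hex character is seen.  */
static long
hex_decode (const char *in, size_t len, unsigned char *out)
{
  if (len % 2 != 0)
    return -1;
  for (size_t i = 0; i < len; i += 2)
    {
      int hi = hex_value (in[i]);
      int lo = hex_value (in[i + 1]);
      if (hi < 0 || lo < 0)
        return -1;
      out[i / 2] = (unsigned char) (hi * 16 + lo);
    }
  return (long) (len / 2);
}
```

With this sketch, both "666F6F0A" and "666F6f0A" decode to the same four bytes, "foo\n", which is the outcome the report asks for.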





Reply sent to Pádraig Brady <P <at> draigBrady.com>:
You have taken responsibility. (Mon, 23 Oct 2023 12:02:02 GMT)

Notification sent to Niels Möller <nisse <at> lysator.liu.se>:
bug acknowledged by developer. (Mon, 23 Oct 2023 12:02:02 GMT)

Message #10 received at 66698-done <at> debbugs.gnu.org (full text, mbox):

From: Pádraig Brady <P <at> draigBrady.com>
To: Niels Möller <nisse <at> lysator.liu.se>,
 66698-done <at> debbugs.gnu.org
Subject: Re: bug#66698: I think hex decoding with basenc -d --base16 should be
 case-insensitive
Date: Mon, 23 Oct 2023 13:01:17 +0100
On 23/10/2023 10:37, Niels Möller wrote:
> Hi,
> 
> the docs for basenc --base16 say "hex encoding (RFC4648 section 8)".
> The referenced section in that RFC says
> 
>    Essentially, Base 16 encoding is the standard case-insensitive hex
>    encoding and may be referred to as "base16" or "hex".
> 
> I think it would be both more useful, and consistent with docs, if
> basenc -d --base16 accepted either upper- or lowercase hex digits.
> 
> Current behavior, with basenc (GNU coreutils) 9.1:
> 
>    $ echo 666F6F0A |basenc --base16 -d
>    foo
>    $ echo 666F6f0A |basenc --base16 -d
>    fobasenc: invalid input
> 
> I think both inputs should give the same output, "foo\n", at least by
> default. Possibly configurable with options like --strict, --upper,
> --lower, etc (--upper/--lower would be useful also for the --base16
> encoding, i.e., no -d).

Agreed.
Will apply the attached later.
Marking this as done.

thanks,
Pádraig
[basenc-lower-hex.patch (text/x-patch, attachment)]

Information forwarded to bug-coreutils <at> gnu.org:
bug#66698; Package coreutils. (Mon, 23 Oct 2023 12:52:02 GMT)

Message #13 received at 66698-done <at> debbugs.gnu.org (full text, mbox):

From: Niels Möller <nisse <at> lysator.liu.se>
To: Pádraig Brady <P <at> draigBrady.com>
Cc: 66698-done <at> debbugs.gnu.org
Subject: Re: bug#66698: I think hex decoding with basenc -d --base16 should
 be case-insensitive
Date: Mon, 23 Oct 2023 14:50:27 +0200
Pádraig Brady <P <at> draigBrady.com> writes:

> Will apply the attached later.
> Marking this as done.

Thanks! It would make some sense to me to also have options
--upper/--lower; on encoding, they would specify case of the output, on
decoding, they would reject the other case (with default being to accept
either). But less important than fixing the default behavior.

> +  basenc --base16 -d no supports lower case hexadecimal characters.
> +  Previously an error was given for lower case hex digits.

s/ no / now /

Regards,
/Niels

-- 
Niels Möller. PGP key CB4962D070D77D7FCB8BA36271D8F1FF368C6677.
Internet email is subject to wholesale government surveillance.




Information forwarded to bug-coreutils <at> gnu.org:
bug#66698; Package coreutils. (Mon, 23 Oct 2023 13:09:01 GMT)

Message #16 received at 66698 <at> debbugs.gnu.org (full text, mbox):

From: Pádraig Brady <P <at> draigBrady.com>
To: Niels Möller <nisse <at> lysator.liu.se>
Cc: 66698 <at> debbugs.gnu.org
Subject: Re: bug#66698: I think hex decoding with basenc -d --base16 should be
 case-insensitive
Date: Mon, 23 Oct 2023 14:08:17 +0100
On 23/10/2023 13:50, Niels Möller wrote:
> Pádraig Brady <P <at> draigBrady.com> writes:
> 
>> Will apply the attached later.
>> Marking this as done.
> 
> Thanks! It would make some sense to me to also have options
> --upper/--lower; on encoding, they would specify case of the output, on
> decoding, they would reject the other case (with default being to accept
> either). But less important than fixing the default behavior.

I was thinking `tr '[:lower:]' '[:upper:]'` would suffice for that when encoding.
When decoding I don't see much need for the strictness, but that could
also be enforced easily by prefiltering with something like `tr 'A-F' x`

The same argument could be made of course for not needing this patch at all,
by prefiltering through tr.  However the default operation should be the
most common requirement (and also the RFC documented operation in this case).
A similar case I hit very frequently is pasting hex into bc, and it's
very annoying to have to convert to uppercase before doing this.

>> +  basenc --base16 -d no supports lower case hexadecimal characters.
>> +  Previously an error was given for lower case hex digits.
> 
> s/ no / now /

Thanks, pushed.

Pádraig.





Information forwarded to bug-coreutils <at> gnu.org:
bug#66698; Package coreutils. (Wed, 25 Oct 2023 01:32:01 GMT)

Message #19 received at 66698 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Pádraig Brady <P <at> draigBrady.com>,
 Niels Möller <nisse <at> lysator.liu.se>
Cc: 66698 <at> debbugs.gnu.org
Subject: Re: bug#66698: I think hex decoding with basenc -d --base16 should be
 case-insensitive
Date: Tue, 24 Oct 2023 18:30:39 -0700
On 10/23/23 06:08, Pádraig Brady wrote:
> However the default operation should be the
> most common requirement (and also the RFC documented operation in this 
> case).
> A similar case I hit very frequently is pasting hex into bc, and it's
> very annoying to have to convert to uppercase before doing this.

Doesn't the isbase16 function also need updating?

Also, shouldn't we do something similar for base 32, for consistency?




Information forwarded to bug-coreutils <at> gnu.org:
bug#66698; Package coreutils. (Wed, 25 Oct 2023 13:29:02 GMT)

Message #22 received at 66698 <at> debbugs.gnu.org (full text, mbox):

From: Pádraig Brady <P <at> draigBrady.com>
To: Paul Eggert <eggert <at> cs.ucla.edu>, Niels Möller
 <nisse <at> lysator.liu.se>
Cc: 66698 <at> debbugs.gnu.org
Subject: Re: bug#66698: I think hex decoding with basenc -d --base16 should be
 case-insensitive
Date: Wed, 25 Oct 2023 14:27:41 +0100
On 25/10/2023 02:30, Paul Eggert wrote:
> On 10/23/23 06:08, Pádraig Brady wrote:
>> However the default operation should be the
>> most common requirement (and also the RFC documented operation in this
>> case).
>> A similar case I hit very frequently is pasting hex into bc, and it's
>> very annoying to have to convert to uppercase before doing this.
> 
> Doesn't the isbase16 function also need updating?

Good spot. I'm adding this, and also a test
for this --ignore-garbage case.

diff --git a/src/basenc.c b/src/basenc.c
 isbase16 (char ch)
 {
-  return ('0' <= ch && ch <= '9') || ('A' <= ch && ch <= 'F');
+  return isxdigit (to_uchar (ch));
 }
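
To show how the updated check interacts with --ignore-garbage, here is a small self-contained sketch. It is only an illustration: to_uchar below is a local stand-in for the helper of the same name used in the diff, and filter_base16 is a hypothetical function, not code from basenc.c. The conversion to unsigned char matters because passing a negative char value to isxdigit has undefined behavior.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Local stand-in for the to_uchar helper used in the diff: convert a
   char to unsigned char so the value is valid for <ctype.h> calls.  */
static unsigned char
to_uchar (char ch)
{
  return ch;
}

/* Same test as the patched isbase16: accept 0-9, A-F and a-f.  */
static bool
isbase16 (char ch)
{
  return isxdigit (to_uchar (ch)) != 0;
}

/* Hypothetical helper: copy only the base16 characters of the
   NUL-terminated string IN to OUT (which must be at least as large
   as IN, including the terminator), roughly what an ignore-garbage
   pass would do before decoding.  Return the number of characters
   kept.  */
static size_t
filter_base16 (const char *in, char *out)
{
  size_t n = 0;
  for (; *in; in++)
    if (isbase16 (*in))
      out[n++] = *in;
  out[n] = '\0';
  return n;
}
```

Under the old check, lower-case digits would have been discarded as garbage by such a filter; with the isxdigit-based check they are kept.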


> Also, shouldn't we do something similar for base 32, for consistency?

I was wondering about that.

I previously checked the RFC which didn't mention lower case for base32.
But thinking about it more we probably should allow lower case for base32.

This is also related to the base64 padding change I think, in that we
might add a --strict option to only accept canonical (upper case base32) inputs.
That would also only accept canonical padding, and canonical encoding
of the trailing bits. For example would reject 'SGVsbG9='. See:
https://eprint.iacr.org/2022/361.pdf

thanks,
Pádraig.
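
As a rough illustration of the "canonical encoding of the trailing bits" point above, the following sketch checks whether a padded base64 string leaves its unused trailing bits at zero. It is an assumption-laden example rather than anything from coreutils: b64_value and b64_is_canonical are hypothetical names, and the input is assumed to be a single NUL-terminated string with no embedded whitespace.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: value of a standard base64 alphabet
   character, or -1 if C is not in the alphabet.  */
static int
b64_value (char c)
{
  static const char alphabet[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
  const char *p = c ? strchr (alphabet, c) : NULL;
  return p ? (int) (p - alphabet) : -1;
}

/* Hypothetical helper: return true if S is canonical padded base64,
   meaning its length is a multiple of 4, it uses only alphabet
   characters followed by at most two '=' at the end, and the bits of
   the last data character that do not map to output bytes are 0.  */
static bool
b64_is_canonical (const char *s)
{
  size_t len = strlen (s);
  if (len % 4 != 0)
    return false;
  if (len == 0)
    return true;

  size_t pad = 0;
  if (s[len - 1] == '=')
    pad = (s[len - 2] == '=') ? 2 : 1;

  size_t ndata = len - pad;
  for (size_t i = 0; i < ndata; i++)
    if (b64_value (s[i]) < 0)
      return false;

  int last = b64_value (s[ndata - 1]);
  if (pad == 1)
    return (last & 0x03) == 0;  /* 2 unused trailing bits */
  if (pad == 2)
    return (last & 0x0F) == 0;  /* 4 unused trailing bits */
  return true;
}
```

With this check, 'SGVsbG8=' (the canonical encoding of "Hello") is accepted, while 'SGVsbG9=' is rejected because its last data character carries non-zero unused bits.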




Information forwarded to bug-coreutils <at> gnu.org:
bug#66698; Package coreutils. (Wed, 25 Oct 2023 22:13:02 GMT)

Message #25 received at 66698 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Pádraig Brady <P <at> draigBrady.com>,
 Niels Möller <nisse <at> lysator.liu.se>
Cc: 66698 <at> debbugs.gnu.org
Subject: Re: bug#66698: I think hex decoding with basenc -d --base16 should be
 case-insensitive
Date: Wed, 25 Oct 2023 15:11:43 -0700
On 10/25/23 06:27, Pádraig Brady wrote:
> But thinking about it more we probably should allow lower case for base32.

Yes, I'm thinking the same.
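
For illustration only, accepting lower case for base32 could look like the sketch below. It is not the coreutils or gnulib code; base32_value and base32_decode_group are hypothetical names, the alphabet is the RFC 4648 "A-Z2-7" one, and the sketch assumes an ASCII-compatible character set and handles only a full, unpadded group of 8 characters.

```c
#include <stdbool.h>

/* Hypothetical helper: value (0-31) of a character from the RFC 4648
   base32 alphabet, accepting lower-case letters too; -1 otherwise.
   Assumes letters are contiguous, as in ASCII.  */
static int
base32_value (char c)
{
  if ('A' <= c && c <= 'Z')
    return c - 'A';
  if ('a' <= c && c <= 'z')
    return c - 'a';
  if ('2' <= c && c <= '7')
    return c - '2' + 26;
  return -1;
}

/* Hypothetical helper: decode one full group of 8 base32 characters
   into 5 bytes.  Return false if any character is invalid.  */
static bool
base32_decode_group (const char in[8], unsigned char out[5])
{
  unsigned long long bits = 0;
  for (int i = 0; i < 8; i++)
    {
      int v = base32_value (in[i]);
      if (v < 0)
        return false;
      bits = (bits << 5) | (unsigned int) v;
    }
  /* The 40 accumulated bits are the 5 output bytes, most significant
     byte first.  */
  for (int i = 4; i >= 0; i--)
    {
      out[i] = (unsigned char) (bits & 0xFF);
      bits >>= 8;
    }
  return true;
}
```

Here "MZXW6YTB" and "mzxw6ytb" both decode to "fooba", matching the RFC 4648 test vector for the upper-case form.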

While thinking (:-) I couldn't resist improving performance a bit by 
installing the attached. This also fixes an unlikely bug when isxdigit 
behavior is oddball.
[0001-build-update-gnulib-submodule-to-latest.patch (text/x-patch, attachment)]
[0002-basenc-tweak-checks-to-use-unsigned-char.patch (text/x-patch, attachment)]
[0003-basenc-fix-unlikely-locale-issue-tune.patch (text/x-patch, attachment)]

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Thu, 23 Nov 2023 12:24:11 GMT)
