GNU bug report logs - #35256
Bug report for -W argument (maximum width) - minor and not dangerous

Previous Next

Package: diffutils;

Reported by: alec <at> unifiedmathematics.com

Date: Sat, 13 Apr 2019 15:33:02 UTC

Severity: normal

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.

Full log


View this message in rfc822 format

From: help-debbugs <at> gnu.org (GNU bug Tracking System)
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: tracker <at> debbugs.gnu.org
Subject: bug#35256: closed (Bug report for -W argument (maximum width) -
 minor and not dangerous)
Date: Tue, 27 Aug 2019 23:24:02 +0000
[Message part 1 (text/plain, inline)]
Your message dated Tue, 27 Aug 2019 16:23:08 -0700
with message-id <14605a16-f1fb-fa44-7314-2092cf44ba75 <at> cs.ucla.edu>
and subject line Re: [bug-diffutils] bug#35256: Bug report for -W argument (maximum width) - minor and not dangerous
has caused the debbugs.gnu.org bug report #35256,
regarding Bug report for -W argument (maximum width) - minor and not dangerous
to be marked as done.

(If you believe you have received this mail in error, please contact
help-debbugs <at> gnu.org.)


-- 
35256: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=35256
GNU Bug Tracking System
Contact help-debbugs <at> gnu.org with problems
[Message part 2 (message/rfc822, inline)]
From: alec <at> unifiedmathematics.com
To: bug-diffutils <at> gnu.org
Subject: Bug report for -W argument (maximum width) - minor and not dangerous
Date: Sat, 13 Apr 2019 12:29:45 +0100
[Message part 3 (text/plain, inline)]
Hello there.

I was hoping to view a side-by-side diff of something and, perhaps
unfairly, was hoping for a setting where diff would choose a width
such that there were no truncations and I would use less with no
wrapping to inspect the results.

My first attempt was "-W 0" (a width of 0 has no "legit" meaning
afterall) - error, so I tried -1. This leads to a weird situation
where it seems to just output loads of tabs - while it'll probably
still terminate eventually the behaviour is unreasonable.

To try this yourself run something like:

diff -y ./maps ./task/4974/maps -W -1 

from /proc/XXXX where XXXX is some PID for a program with threads (eg
firefox) and the 4974 is any task that isn't XXXX

ADDENDUM: The -1 isn't important, 9999999999999 also illustrates the
problem - END ADDENDUM

Looking at the code (in the 3.7 tarball, src/diff.c modified on 18th
of December 2018) notice:

Line 284:
  uintmax_t numval;
Line 525:
    case 'W':
      numval = strtoumax (optarg, &numend, 10);
      if (! (0 < numval && numval <= SIZE_MAX) || *numend)
        try_help ("invalid width '%s'", optarg);
      if (width != numval)
        {
          if (width)
        fatal ("conflicting width options");
          width = numval;
        }
      break;

For convenience:
      uintmax_t strtoumax(const char *nptr, char **endptr, int
base);
and it may set errno, my man page doesn't say whether this -1
behaviour is "okay" however it probably is, unsigned afterall, this
means that numval is going to be a really really big value. 

ABUSE POTENTIAL:

Just basic DOS (denial-of-service) stuff, a CPU usage spike comes from
diff itself and it seems to output a lot of tabs (a good 275mib / sec
on my machine) and will probably do so for a good few years before
anything else comes out, a testament to the robustness of diff is that
it did this, and its memory usage didn't start ballooning.

I know diff is used by A LOT of other programs, some of which are
web-accessible (eg mediawiki uses diff - and will by default if it
finds it), many of my projects use it too. It is not a big stretch to
imagine someone has a web-service out there which allows side-by-side
format, and not much of a further leap to assume that someone might
have an input box for width which exposes -W, guarded only by a regex
of the form ^[1-9][0-9]* (which yes, wont allow -1 but will allow
9999999999999)

You could bring a server to its knees pretty quickly using just diff's
CPU usage and a few tabs using this - that's not even considering
whether or not the system hypothesised here doesn't have trouble with
memory from a convenient get_line() function first.

While not really diff's fault or problem, a potential solution
detailed below would fix it and not cause any problems for those with
legit (?) needs for really wide diffs 

SUGGESTIONS:
Humans are limiting here, improvements and the growth of computers
wont really affect the maximum width so putting a limit in place is
reasonable. I make no claim there is a "maximum useful width" so being
able to override will ensure my half-assed musings on such a limit
wont cause any problems in the future. 

I'd go with something like
#define REASONABLE_LIMIT 1000

Add a check that numval is <= get_reasonable_specified_width_limit()
after the existing checks, if not output an error in the form of:

"You probably don't want to do that, see [wherever], if you do specify
--we-have-evolved-cylindical-lenses-now or set the environmental
variable GNU_DIFF_REASONABLE_LIMIT to a new limit, using 0 for none"

Lastly, for what it's worth from a perfect stranger:

I'm very impressed that diff didn't start consuming huge amounts of
memory, and a little saddened that it is impressive! 

Thanks very much for diff and your work on it, you have no idea how
many things it underpins! 

[Message part 4 (text/html, inline)]
[Message part 5 (message/rfc822, inline)]
From: Paul Eggert <eggert <at> cs.ucla.edu>
To: alec <at> unifiedmathematics.com
Cc: 35256-done <at> debbugs.gnu.org
Subject: Re: [bug-diffutils] bug#35256: Bug report for -W argument (maximum
 width) - minor and not dangerous
Date: Tue, 27 Aug 2019 16:23:08 -0700
[Message part 6 (text/plain, inline)]
alec <at> unifiedmathematics.com wrote:

> I know diff is used by A LOT of other programs, some of which are
> web-accessible

I'm afraid that ship sailed a while ago: if you let a remote attacker specify an 
arbitrary option to GNU diff there is lots of other trouble you can get into. 
For example, the -I option lets the attacker specify a regular expression that 
can cause diff to undergo exponential complexity. The general wisdom nowadays is 
to not expose command-line operands to attackers.

As for putting in a limit, the GNU Coding Standards say to not impose arbitrary 
limits. In some cases there are good reasons to impose a limit anyway but this 
one doesn't seem to rise to that level.

You do raise a good point that 'diff' shouldn't treat negative inputs as if they 
were large positive inputs, so I installed the attached patch.

Thanks for reporting the problem; your bug report was a pleasure to read.
[0001-diff-don-t-mistreat-N-in-arg-as-a-large-number.patch (text/x-patch, attachment)]

This bug report was last modified 5 years and 327 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.