From unknown Sat Aug 16 23:43:44 2025
X-Loop: help-debbugs@gnu.org
Subject: bug#35256: Bug report for -W argument (maximum width) - minor and not dangerous
Resent-From: alec@unifiedmathematics.com
Resent-CC: bug-diffutils@gnu.org
Resent-Date: Sat, 13 Apr 2019 15:33:02 +0000
X-GNU-PR-Message: report 35256
X-GNU-PR-Package: diffutils
To: 35256@debbugs.gnu.org
X-Debbugs-Original-To: bug-diffutils@gnu.org
Message-Id: <8bbb0482c31d85f7bc37b78824e27c31d51ad479@webmail.unifiedmathematics.com>
From: alec@unifiedmathematics.com
Date: Sat, 13 Apr 2019 12:29:45 +0100
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="=_1f51722c62452a7e66e7845f48aeedad"

--=_1f51722c62452a7e66e7845f48aeedad
Content-Type: text/plain; charset=UTF-8

Hello there.

I was hoping to view a side-by-side diff of something and, perhaps unfairly, was hoping for a setting where diff would choose a width such that there were no truncations and I would use less with no wrapping to inspect the results.

My first attempt was "-W 0" (a width of 0 has no "legit" meaning after all), which gave an error, so I tried -1. This leads to a weird situation where it seems to just output loads of tabs. While it will probably still terminate eventually, the behaviour is unreasonable.

To try this yourself, run something like:

diff -y ./maps ./task/4974/maps -W -1

from /proc/XXXX, where XXXX is some PID for a program with threads (e.g. firefox) and 4974 is any task that isn't XXXX.

ADDENDUM: The -1 isn't important; 9999999999999 also illustrates the problem - END ADDENDUM

Looking at the code (in the 3.7 tarball, src/diff.c modified on 18th of December 2018), notice:

Line 284:
  uintmax_t numval;
Line 525:
    case 'W':
      numval = strtoumax (optarg, &numend, 10);
      if (! (0 < numval && numval <= SIZE_MAX) || *numend)
        try_help ("invalid width '%s'", optarg);
      if (width != numval)
        {
          if (width)
            fatal ("conflicting width options");
          width = numval;
        }
      break;

For convenience:
     uintmax_t strtoumax(const char *nptr, char **endptr, int base);
and it may set errno. My man page doesn't say whether this -1 behaviour is "okay"; however, it probably is (unsigned after all), which means that numval is going to be a really, really big value.
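
For illustration, here is a minimal C sketch of why -1 slips through. It relies only on the standard strtoumax behaviour (a leading minus sign is accepted and the result is negated in the unsigned return type). The names parse_width_unsigned and old_width_check_accepts are hypothetical helpers, not functions in diffutils.

```c
#include <inttypes.h>   /* strtoumax, uintmax_t */
#include <stdbool.h>
#include <stdint.h>     /* SIZE_MAX, UINTMAX_MAX */

/* Hypothetical helper: convert ARG the way the 3.7 code does.  */
uintmax_t
parse_width_unsigned (char const *arg)
{
  char *numend;
  return strtoumax (arg, &numend, 10);
}

/* Hypothetical helper: true if the 3.7-style test would accept ARG
   as a width.  */
bool
old_width_check_accepts (char const *arg)
{
  char *numend;
  uintmax_t numval = strtoumax (arg, &numend, 10);
  return (0 < numval && numval <= SIZE_MAX) && !*numend;
}
```

On a system where SIZE_MAX equals UINTMAX_MAX (typical for 64-bit), parse_width_unsigned ("-1") yields UINTMAX_MAX and old_width_check_accepts ("-1") is true, while "0" is still rejected by the 0 < numval test.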

ABUSE POTENTIAL:

Just basic DoS (denial-of-service) stuff: a CPU usage spike comes from diff itself, and it seems to output a lot of tabs (a good 275 MiB/s on my machine) and will probably do so for a good few years before anything else comes out. It is a testament to the robustness of diff that it did this and its memory usage didn't start ballooning.

I know diff is used by A LOT of other programs, some of which are web-accessible (e.g. MediaWiki uses diff, and will by default if it finds it); many of my projects use it too. It is not a big stretch to imagine someone has a web service out there which allows side-by-side format, and not much of a further leap to assume that someone might have an input box for width which exposes -W, guarded only by a regex of the form ^[1-9][0-9]* (which yes, won't allow -1 but will allow 9999999999999).

You could bring a server to its knees pretty quickly using just diff's CPU usage and a few tabs using this. That's not even considering whether the system hypothesised here would have trouble with memory from a convenient get_line() function first.
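
To make the hypothetical concrete, the sketch below contrasts that regex guard with a bounded numeric check that such a front end could use instead. It assumes a POSIX system (regcomp and regexec from <regex.h>). FRONTEND_WIDTH_MAX and both function names are made up for illustration and are not part of diff or of any real web service.

```c
#include <errno.h>
#include <inttypes.h>   /* strtoumax, uintmax_t */
#include <regex.h>      /* POSIX regcomp, regexec, regfree */
#include <stdbool.h>
#include <stddef.h>     /* NULL */

/* Assumed cap for a hypothetical front end; not a diffutils limit.  */
enum { FRONTEND_WIDTH_MAX = 1000 };

/* The guard described above: ^[1-9][0-9]*
   As written it is not anchored at the end, and it puts no bound
   on the magnitude of the number.  */
bool
regex_guard_accepts (char const *s)
{
  regex_t re;
  if (regcomp (&re, "^[1-9][0-9]*", REG_EXTENDED | REG_NOSUB) != 0)
    return false;
  bool ok = regexec (&re, s, 0, NULL, 0) == 0;
  regfree (&re);
  return ok;
}

/* A stricter check: digits only, no leading zero or sign, and a
   value no larger than FRONTEND_WIDTH_MAX.  */
bool
bounded_guard_accepts (char const *s)
{
  if (! ('1' <= s[0] && s[0] <= '9'))
    return false;
  char *end;
  errno = 0;
  uintmax_t value = strtoumax (s, &end, 10);
  return !errno && !*end && value <= FRONTEND_WIDTH_MAX;
}
```

With these definitions, regex_guard_accepts ("9999999999999") is true while bounded_guard_accepts ("9999999999999") is false; both reject "-1".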

While not really diff's fault or problem, a potential solution detailed below would fix it and not cause any problems for those with legit (?) needs for really wide diffs.



SUGGESTIONS:
Humans are the limiting factor here: improvements and the growth of computers won't really affect the maximum width, so putting a limit in place is reasonable. I make no claim there is a "maximum useful width", so being able to override it will ensure my half-assed musings on such a limit won't cause any problems in the future.
I'd go with something like
#define REASONABLE_LIMIT 1000

Add a check that numval is <= get_reasonable_specified_width_limit() after the existing checks; if not, output an error in the form of:

"You probably don't want to do that, see [wherever]; if you do, specify --we-have-evolved-cylindical-lenses-now or set the environment variable GNU_DIFF_REASONABLE_LIMIT to a new limit, using 0 for none"
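
A rough sketch of that check might look like the following. REASONABLE_LIMIT, get_reasonable_specified_width_limit and GNU_DIFF_REASONABLE_LIMIT are the names proposed above; limit_from_setting and width_within_limit are hypothetical helpers, and none of this exists in diffutils.

```c
#include <errno.h>
#include <inttypes.h>   /* strtoumax, uintmax_t */
#include <stdbool.h>
#include <stdlib.h>     /* getenv */

#define REASONABLE_LIMIT 1000

/* Hypothetical helper: turn the environment setting into a limit.
   A missing, empty or malformed setting falls back to
   REASONABLE_LIMIT; 0 means "no limit".  */
uintmax_t
limit_from_setting (char const *setting)
{
  if (setting && '0' <= setting[0] && setting[0] <= '9')
    {
      char *end;
      errno = 0;
      uintmax_t limit = strtoumax (setting, &end, 10);
      if (!errno && !*end)
        return limit;
    }
  return REASONABLE_LIMIT;
}

uintmax_t
get_reasonable_specified_width_limit (void)
{
  return limit_from_setting (getenv ("GNU_DIFF_REASONABLE_LIMIT"));
}

/* True if NUMVAL passes the proposed extra check under LIMIT.  */
bool
width_within_limit (uintmax_t numval, uintmax_t limit)
{
  return limit == 0 || numval <= limit;
}
```

The option parser would then reject a width when width_within_limit (numval, get_reasonable_specified_width_limit ()) is false.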

Lastly, for what it's worth from a perfect stranger:

I'm very impressed that diff didn't start consuming huge amounts of memory, and a little saddened that it is impressive!

Thanks very much for diff and your work on it, you have no idea how many things it underpins!
--=_1f51722c62452a7e66e7845f48aeedad--

From unknown Sat Aug 16 23:43:44 2025
MIME-Version: 1.0
X-Loop: help-debbugs@gnu.org
From: help-debbugs@gnu.org (GNU bug Tracking System)
To: alec@unifiedmathematics.com
Subject: bug#35256: closed (Re: [bug-diffutils] bug#35256: Bug report for -W argument (maximum width) - minor and not dangerous)
References: <14605a16-f1fb-fa44-7314-2092cf44ba75@cs.ucla.edu> <8bbb0482c31d85f7bc37b78824e27c31d51ad479@webmail.unifiedmathematics.com>
X-Gnu-PR-Message: they-closed 35256
X-Gnu-PR-Package: diffutils
Reply-To: 35256@debbugs.gnu.org
Date: Tue, 27 Aug 2019 23:24:02 +0000
Content-Type: multipart/mixed; boundary="----------=_1566948242-17633-1"

This is a multi-part message in MIME format...

------------=_1566948242-17633-1
Content-Disposition: inline
Content-Type: text/plain; charset="utf-8"

Your bug report

#35256: Bug report for -W argument (maximum width) - minor and not dangerous

which was filed against the diffutils package, has been closed.

The explanation is attached below, along with your original report.
If you require more details, please reply to 35256@debbugs.gnu.org.

-- 
35256: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=35256
GNU Bug Tracking System
Contact help-debbugs@gnu.org with problems

------------=_1566948242-17633-1
Content-Type: message/rfc822
Content-Disposition: inline
Content-Transfer-Encoding: 7bit

Subject: Re: [bug-diffutils] bug#35256: Bug report for -W argument (maximum width) - minor and not dangerous
To: alec@unifiedmathematics.com
References: <8bbb0482c31d85f7bc37b78824e27c31d51ad479@webmail.unifiedmathematics.com>
From: Paul Eggert
Organization: UCLA Computer Science Department
Message-ID: <14605a16-f1fb-fa44-7314-2092cf44ba75@cs.ucla.edu>
Date: Tue, 27 Aug 2019 16:23:08 -0700
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.8.0
MIME-Version: 1.0
In-Reply-To: <8bbb0482c31d85f7bc37b78824e27c31d51ad479@webmail.unifiedmathematics.com>
Content-Type: multipart/mixed; boundary="------------FB16B5F0F5D1BCC95D18486C"
Content-Language: en-US
Cc: 35256-done@debbugs.gnu.org

This is a multi-part message in MIME format.
--------------FB16B5F0F5D1BCC95D18486C
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit

alec@unifiedmathematics.com wrote:

> I know diff is used by A LOT of other programs, some of which are
> web-accessible

I'm afraid that ship sailed a while ago: if you let a remote attacker
specify an arbitrary option to GNU diff there is lots of other trouble
you can get into. For example, the -I option lets the attacker specify
a regular expression that can cause diff to undergo exponential
complexity. The general wisdom nowadays is to not expose command-line
operands to attackers.

As for putting in a limit, the GNU Coding Standards say to not impose
arbitrary limits. In some cases there are good reasons to impose a
limit anyway but this one doesn't seem to rise to that level.

You do raise a good point that 'diff' shouldn't treat negative inputs
as if they were large positive inputs, so I installed the attached
patch. Thanks for reporting the problem; your bug report was a
pleasure to read.
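
As a rough illustration of what the attached patch changes for -W, the sketch below applies the same width test to a value parsed with strtoimax instead of strtoumax. The wrapper patched_width_check_accepts is a hypothetical name, not a function in diffutils; only the condition follows src/diff.c, with a cast added to keep the comparison unsigned.

```c
#include <inttypes.h>   /* strtoimax, intmax_t, uintmax_t */
#include <stdbool.h>
#include <stdint.h>     /* SIZE_MAX */

/* Hypothetical wrapper around the patched -W test.  */
bool
patched_width_check_accepts (char const *arg)
{
  char *numend;
  intmax_t numval = strtoimax (arg, &numend, 10);
  /* "-1" now stays negative, so it fails the 0 < numval test.
     The cast is safe because numval is known to be positive when
     the right-hand side of && is evaluated.  */
  return (0 < numval && (uintmax_t) numval <= SIZE_MAX) && !*numend;
}
```

Large positive values such as 9999999999999 are still accepted wherever they fit in size_t, which is consistent with not imposing an arbitrary limit.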
--------------FB16B5F0F5D1BCC95D18486C
Content-Type: text/x-patch; name="0001-diff-don-t-mistreat-N-in-arg-as-a-large-number.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename*0="0001-diff-don-t-mistreat-N-in-arg-as-a-large-number.patch"

>From 8d26b1403e8607811ccebdfe2822f2dad42a36d3 Mon Sep 17 00:00:00 2001
From: Paul Eggert
Date: Tue, 27 Aug 2019 16:14:15 -0700
Subject: [PATCH] diff: don’t mistreat -N in arg as a large number
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Problem reported by alec (Bug#35256).
* NEWS: Mention the fix.
* bootstrap.conf (gnulib_modules): Use strtoimax and xstrtoimax,
not strtoumax and strtoumax.
* src/cmp.c (bytes): Now signed, with -1 representing no limit.
All uses changed.
* src/cmp.c (specify_ignore_initial, main):
* src/diff.c (main):
* src/ifdef.c (format_group):
* src/sdiff.c (interact):
Use strtoimax, not strtoumax.
---
 NEWS           |  4 ++++
 bootstrap.conf |  4 ++--
 src/cmp.c      | 27 ++++++++++++++-------------
 src/diff.c     | 14 +++++++-------
 src/ifdef.c    |  6 +++---
 src/sdiff.c    | 10 +++++-----
 6 files changed, 35 insertions(+), 30 deletions(-)

diff --git a/NEWS b/NEWS
index 5c1ae5f..3ecd111 100644
--- a/NEWS
+++ b/NEWS
@@ -19,6 +19,10 @@ GNU diffutils NEWS -*- outline -*-
   that was intended for stdin, stdout, or stderr.
   [bug#33965 present since "the beginning"]
 
+  cmp, diff and sdiff no longer treat negative command-line
+  option-arguments as if they were large positive numbers.
+  [bug#35256 introduced in 2.8]
+
 
 * Noteworthy changes in release 3.7 (2018-12-31) [stable]
 
diff --git a/bootstrap.conf b/bootstrap.conf
index 7d5ea62..1a20900 100644
--- a/bootstrap.conf
+++ b/bootstrap.conf
@@ -70,7 +70,7 @@ stdint
 strcase
 strftime
 strptime
-strtoumax
+strtoimax
 sys_wait
 system-quote
 unistd
@@ -85,7 +85,7 @@ xalloc
 xfreopen
 xreadlink
 xstdopen
-xstrtoumax
+xstrtoimax
 xvasprintf
 '
 
diff --git a/src/cmp.c b/src/cmp.c
index ce2bdb5..16e8869 100644
--- a/src/cmp.c
+++ b/src/cmp.c
@@ -75,8 +75,8 @@ static size_t buf_size;
 /* Initial prefix to ignore for each file. */
 static off_t ignore_initial[2];
 
-/* Number of bytes to compare. */
-static uintmax_t bytes = UINTMAX_MAX;
+/* Number of bytes to compare, or -1 if there is no limit. */
+static intmax_t bytes = -1;
 
 /* Output format. */
 static enum comparison_type
@@ -129,12 +129,12 @@ static char const valid_suffixes[] = "kKMGTPEZY0";
 static void
 specify_ignore_initial (int f, char **argptr, char delimiter)
 {
-  uintmax_t val;
+  intmax_t val;
   char const *arg = *argptr;
-  strtol_error e = xstrtoumax (arg, argptr, 0, &val, valid_suffixes);
-  if (! (e == LONGINT_OK
-         || (e == LONGINT_INVALID_SUFFIX_CHAR && **argptr == delimiter))
-      || TYPE_MAXIMUM (off_t) < val)
+  strtol_error e = xstrtoimax (arg, argptr, 0, &val, valid_suffixes);
+  if (! ((e == LONGINT_OK
+          || (e == LONGINT_INVALID_SUFFIX_CHAR && **argptr == delimiter))
+         && 0 <= val && val <= TYPE_MAXIMUM (off_t)))
     try_help ("invalid --ignore-initial value '%s'", arg);
   if (ignore_initial[f] < val)
     ignore_initial[f] = val;
@@ -237,10 +237,11 @@ main (int argc, char **argv)
 
       case 'n':
         {
-          uintmax_t n;
-          if (xstrtoumax (optarg, 0, 0, &n, valid_suffixes) != LONGINT_OK)
+          intmax_t n;
+          if (xstrtoimax (optarg, 0, 0, &n, valid_suffixes) != LONGINT_OK
+              || n < 0)
             try_help ("invalid --bytes value '%s'", optarg);
-          if (n < bytes)
+          if (! (0 <= bytes && bytes < n))
             bytes = n;
         }
         break;
@@ -341,7 +342,7 @@ main (int argc, char **argv)
             s0 = 0;
           if (s1 < 0)
             s1 = 0;
-          if (s0 != s1 && MIN (s0, s1) < bytes)
+          if (s0 != s1 && (bytes < 0 || MIN (s0, s1) < bytes))
             exit (EXIT_FAILURE);
         }
 
@@ -379,7 +380,7 @@ cmp (void)
   bool at_line_start = true;
   off_t line_number = 1; /* Line number (1...) of difference. */
   off_t byte_number = 1; /* Byte number (1...) of difference. */
-  uintmax_t remaining = bytes; /* Remaining number of bytes to compare. */
+  intmax_t remaining = bytes; /* Remaining bytes to compare, or -1. */
   size_t read0, read1; /* Number of bytes read from each file. */
   size_t first_diff; /* Offset (0...) in buffers of 1st diff. */
   size_t smaller; /* The lesser of 'read0' and 'read1'. */
@@ -433,7 +434,7 @@ cmp (void)
         {
           size_t bytes_to_read = buf_size;
 
-          if (remaining != UINTMAX_MAX)
+          if (0 <= remaining)
             {
               if (remaining < bytes_to_read)
                 bytes_to_read = remaining;
diff --git a/src/diff.c b/src/diff.c
index e9c2b11..c545642 100644
--- a/src/diff.c
+++ b/src/diff.c
@@ -282,7 +282,7 @@ main (int argc, char **argv)
   bool show_c_function = false;
   char const *from_file = NULL;
   char const *to_file = NULL;
-  uintmax_t numval;
+  intmax_t numval;
   char *numend;
 
   /* Do our initializations. */
@@ -350,8 +350,8 @@ main (int argc, char **argv)
           {
             if (optarg)
               {
-                numval = strtoumax (optarg, &numend, 10);
-                if (*numend)
+                numval = strtoimax (optarg, &numend, 10);
+                if (*numend || numval < 0)
                   try_help ("invalid context length '%s'", optarg);
                 if (CONTEXT_MAX < numval)
                   numval = CONTEXT_MAX;
@@ -525,7 +525,7 @@ main (int argc, char **argv)
           break;
 
         case 'W':
-          numval = strtoumax (optarg, &numend, 10);
+          numval = strtoimax (optarg, &numend, 10);
           if (! (0 < numval && numval <= SIZE_MAX) || *numend)
             try_help ("invalid width '%s'", optarg);
           if (width != numval)
@@ -554,8 +554,8 @@ main (int argc, char **argv)
           return EXIT_SUCCESS;
 
         case HORIZON_LINES_OPTION:
-          numval = strtoumax (optarg, &numend, 10);
-          if (*numend)
+          numval = strtoimax (optarg, &numend, 10);
+          if (*numend || numval < 0)
             try_help ("invalid horizon length '%s'", optarg);
           horizon_lines = MAX (horizon_lines, MIN (numval, LIN_MAX));
           break;
@@ -609,7 +609,7 @@ main (int argc, char **argv)
           break;
 
         case TABSIZE_OPTION:
-          numval = strtoumax (optarg, &numend, 10);
+          numval = strtoimax (optarg, &numend, 10);
           if (! (0 < numval && numval <= SIZE_MAX - GUTTER_WIDTH_MINIMUM)
               || *numend)
             try_help ("invalid tabsize '%s'", optarg);
diff --git a/src/ifdef.c b/src/ifdef.c
index 65f1745..43f1f86 100644
--- a/src/ifdef.c
+++ b/src/ifdef.c
@@ -135,7 +135,7 @@ format_group (register FILE *out, char const *format, char endchar,
           /* Print if-then-else format e.g. '%(n=1?thenpart:elsepart)'. */
           {
             int i;
-            uintmax_t value[2];
+            intmax_t value[2];
             FILE *thenout, *elseout;
 
             for (i = 0; i < 2; i++)
@@ -144,7 +144,7 @@ format_group (register FILE *out, char const *format, char endchar,
                   {
                     char *fend;
                     errno = 0;
-                    value[i] = strtoumax (f, &fend, 10);
+                    value[i] = strtoimax (f, &fend, 10);
                     if (errno)
                       goto bad_format;
                     f = fend;
@@ -152,7 +152,7 @@ format_group (register FILE *out, char const *format, char endchar,
                 else
                   {
                     value[i] = groups_letter_value (groups, *f);
-                    if (value[i] == -1)
+                    if (value[i] < 0)
                       goto bad_format;
                     f++;
                   }
diff --git a/src/sdiff.c b/src/sdiff.c
index 2ef83da..a61f4e7 100644
--- a/src/sdiff.c
+++ b/src/sdiff.c
@@ -1098,15 +1098,15 @@ interact (struct line_filter *diff,
           else
             {
               char *numend;
-              uintmax_t val;
+              intmax_t val;
               lin llen, rlen, lenmax;
               errno = 0;
-              val = strtoumax (diff_help + 1, &numend, 10);
-              if (LIN_MAX < val || errno || *numend != ',')
+              val = strtoimax (diff_help + 1, &numend, 10);
+              if (! (0 <= val && val <= LIN_MAX) || errno || *numend != ',')
                 fatal (diff_help);
               llen = val;
-              val = strtoumax (numend + 1, &numend, 10);
-              if (LIN_MAX < val || errno || *numend)
+              val = strtoimax (numend + 1, &numend, 10);
+              if (! (0 <= val && val <= LIN_MAX) || errno || *numend)
                 fatal (diff_help);
               rlen = val;
 
-- 
2.17.1

--------------FB16B5F0F5D1BCC95D18486C--

------------=_1566948242-17633-1
Content-Type: message/rfc822
Content-Disposition: inline
Content-Transfer-Encoding: 7bit

X-Sender-Id: dreamhost|x-authsender|alec@unifiedmathematics.com Received: from relay.mailchannels.net (localhost [127.0.0.1]) by relay.mailchannels.net (Postfix) with ESMTP id 25ADD140BAC for ; Sat, 13 Apr 2019 11:29:49 +0000 (UTC) Received: from pdx1-sub0-mail-a87.g.dreamhost.com (100-96-7-60.trex.outbound.svc.cluster.local [100.96.7.60]) (Authenticated sender: dreamhost) by relay.mailchannels.net (Postfix) with ESMTPA id 7451D140B56 for ; Sat, 13 Apr 2019 11:29:48 +0000 (UTC) X-Sender-Id: dreamhost|x-authsender|alec@unifiedmathematics.com Received: from pdx1-sub0-mail-a87.g.dreamhost.com ([TEMPUNAVAIL]. [64.90.62.162]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384) by 0.0.0.0:2500 (trex/5.17.2); Sat, 13 Apr 2019 11:29:49 +0000 X-MC-Relay: Neutral X-MailChannels-SenderId: dreamhost|x-authsender|alec@unifiedmathematics.com X-MailChannels-Auth-Id: dreamhost X-Tank-Robust: 11b7a0f1552ca0b6_1555154988881_3795193484 X-MC-Loop-Signature: 1555154988881:2244149984 X-MC-Ingress-Time: 1555154988881 Received: from pdx1-sub0-mail-a87.g.dreamhost.com (localhost [127.0.0.1]) by pdx1-sub0-mail-a87.g.dreamhost.com (Postfix) with ESMTP id 138977FD87 for ; Sat, 13 Apr 2019 04:29:46 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=unifiedmathematics.com; h= message-id:from:to:subject:date:content-type:mime-version; s= unifiedmathematics.com; bh=L/5J5Zi9OX5QYeH7wmjq1CzdIFw=; b=osx8O CRGmEdLzutoApYl4PW57nnKV58L0bY54KylzY+ijomYqqdXrIahGpL55XS77pbdT eIVYHjmmpaGX3dgG34hy5p+CAYfIHVX/rFmvJRjCipk2bvInCfZLEcaLPobkFRor T7ktBD8W6Wtl0HeCEU8bGGGNtBd5lO7dMcxaAE= Received: from localhost (ip-66-33-200-4.dreamhost.com [66.33.200.4]) (using TLSv1 with cipher ECDHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) (Authenticated sender: alec@unifiedmathematics.com) by pdx1-sub0-mail-a87.g.dreamhost.com (Postfix) with ESMTPSA id E53D07FD78 for ; Sat, 13 Apr 2019 04:29:45 -0700 (PDT) Message-Id: <8bbb0482c31d85f7bc37b78824e27c31d51ad479@webmail.unifiedmathematics.com> 
X-DH-BACKEND: pdx1-sub0-mail-a87 From: alec@unifiedmathematics.com To: bug-diffutils@gnu.org X-Mailer: Atmail 7.8.0.2 X-Originating-IP: 10.35.42.211 Subject: Bug report for -W argument (maximum width) - minor and not dangerous Date: Sat, 13 Apr 2019 12:29:45 +0100 Content-Type: multipart/alternative; boundary="=_1f51722c62452a7e66e7845f48aeedad" MIME-Version: 1.0 X-VR-OUT-STATUS: OK X-VR-OUT-SCORE: 0 X-VR-OUT-SPAMCAUSE: gggruggvucftvghtrhhoucdtuddrgeduuddrvdehgdegudcutefuodetggdotefrodftvfcurfhrohhfihhlvgemucggtfgfnhhsuhgsshgtrhhisggvpdfftffgtefojffquffvnecuuegrihhlohhuthemuceftddtnecunecujfgurhepkffhvffoihfuffgtggesrgdtreerredtjeenucfhrhhomheprghlvggtsehunhhifhhivggumhgrthhhvghmrghtihgtshdrtghomhenucfkphepieeirdeffedrvddttddrgedpuddtrdefhedrgedvrddvuddunecurfgrrhgrmhepmhhouggvpehsmhhtphdphhgvlhhopehlohgtrghlhhhoshhtpdhinhgvthepieeirdeffedrvddttddrgedprhgvthhurhhnqdhprghthheprghlvggtsehunhhifhhivggumhgrthhhvghmrghtihgtshdrtghomhdpmhgrihhlfhhrohhmpegrlhgvtgesuhhnihhfihgvughmrghthhgvmhgrthhitghsrdgtohhmpdhnrhgtphhtthhopegsuhhgqdguihhffhhuthhilhhssehgnhhurdhorhhgnecuvehluhhsthgvrhfuihiivgeptd X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 23.83.214.18 X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6.x X-Spam-Score: -1.4 (-) X-Debbugs-Envelope-To: submit X-Mailman-Approved-At: Sat, 13 Apr 2019 11:32:58 -0400 X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -2.4 (--) --=_1f51722c62452a7e66e7845f48aeedad Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable

Hello there.

I was hoping to view a side-by-side diff of something and, perhaps
unfairly, was hoping for a setting where diff would choose a width
such that there were no truncations, so that I could use less with no
wrapping to inspect the results.

My first attempt was "-W 0" (a width of 0 has no "legit" meaning
after all), which gave an error, so I tried -1. This leads to a weird
situation where diff seems to just output loads of tabs. It will
probably still terminate eventually, but the behaviour is unreasonable.

To try this yourself, run something like:

    diff -y ./maps ./task/4974/maps -W -1

from /proc/XXXX, where XXXX is some PID for a program with threads
(e.g. firefox) and 4974 is any task that isn't XXXX.

ADDENDUM: The -1 isn't important, 9999999999999 also illustrates the
problem - END ADDENDUM

Looking at the code (in the 3.7 tarball, src/diff.c modified on 18th
of December 2018) notice:

Line 284:
      uintmax_t numval;
Line 525:
    case 'W':
      numval = strtoumax (optarg, &numend, 10);
      if (! (0 < numval && numval <= SIZE_MAX) || *numend)
        try_help ("invalid width '%s'", optarg);
      if (width != numval)
        {
          if (width)
            fatal ("conflicting width options");
          width = numval;
        }
      break;

For convenience:

      uintmax_t strtoumax(const char *nptr, char **endptr, int base);

and it may set errno. My man page doesn't say whether this -1
behaviour is "okay"; it probably is, since the result is unsigned
after all. Either way, it means that numval is going to be a really,
really big value.

ABUSE POTENTIAL:

Just basic DoS (denial-of-service) stuff. A CPU usage spike comes from
diff itself, and it seems to output a lot of tabs (a good 275 MiB/s
on my machine) and will probably do so for a good few years before
anything else comes out. It is a testament to the robustness of diff
that it did this and its memory usage didn't start ballooning.

I know diff is used by A LOT of other programs, some of which are
web-accessible (e.g. MediaWiki uses diff, and will by default if it
finds it); many of my projects use it too. It is not a big stretch to
imagine someone has a web service out there which allows side-by-side
format, and not much of a further leap to assume that someone might
have an input box for width which exposes -W, guarded only by a regex
of the form ^[1-9][0-9]* (which yes, won't allow -1 but will allow
9999999999999).

You could bring a server to its knees pretty quickly using just diff's
CPU usage and a few tabs this way. That's not even considering
whether the system hypothesised here would first run into memory
trouble from a convenient get_line() function.

While this is not really diff's fault or problem, the potential
solution detailed below would fix it and not cause any problems for
those with legit (?) needs for really wide diffs.

SUGGESTIONS:

Humans are the limiting factor here: improvements and the growth of
computers won't really affect the maximum width, so putting a limit in
place is reasonable. I make no claim that there is a "maximum useful
width", so being able to override the limit will ensure my half-assed
musings on it won't cause any problems in the future.

I'd go with something like

    #define REASONABLE_LIMIT 1000

Add a check that numval is <= get_reasonable_specified_width_limit()
after the existing checks, and if it is not, output an error of the
form:

"You probably don't want to do that, see [wherever], if you do specify
--we-have-evolved-cylindical-lenses-now or set the environmental
variable GNU_DIFF_REASONABLE_LIMIT to a new limit, using 0 for none"

Lastly, for what it's worth from a perfect stranger:

I'm very impressed that diff didn't start consuming huge amounts of
memory, and a little saddened that it is impressive!

Thanks very much for diff and your work on it, you have no idea how
many things it underpins!

--=_1f51722c62452a7e66e7845f48aeedad Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Hello there.

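
[Editorial sketch, not part of the original report.] The wraparound described in the report can be shown in a few lines of C, together with one possible shape for the suggested upper bound. strtoumax accepts a leading minus sign and negates the value in the unsigned return type, so "-1" comes back as UINTMAX_MAX, which can still satisfy a "numval <= SIZE_MAX" test on platforms where size_t is as wide as uintmax_t. The function name parse_width and its limit parameter are assumptions made for illustration; REASONABLE_LIMIT is only the reporter's proposed value. None of this is actual diffutils code.

```c
#include <ctype.h>
#include <errno.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cap taken from the report's suggestion; not a diffutils macro. */
#define REASONABLE_LIMIT 1000

/* Parse a column width roughly as the quoted 'W' case does, but also
   reject a leading minus sign, out-of-range input, and anything above
   LIMIT (a LIMIT of 0 means "no cap").  On success, store the value in
   *WIDTH and return true.  */
bool
parse_width (const char *arg, uintmax_t limit, size_t *width)
{
  char *end;
  uintmax_t n;

  /* strtoumax would skip this whitespace and then silently negate a
     "-1" in the unsigned type, so check for the sign up front.  */
  while (isspace ((unsigned char) *arg))
    arg++;
  if (*arg == '-')
    return false;

  errno = 0;
  n = strtoumax (arg, &end, 10);
  if (end == arg || *end != '\0' || errno == ERANGE)
    return false;               /* no digits, trailing junk, or overflow */
  if (n == 0 || n > SIZE_MAX)
    return false;               /* same range test as the quoted code */
  if (limit != 0 && n > limit)
    return false;               /* the extra "reasonable" bound */

  *width = (size_t) n;
  return true;
}
```

With these assumptions, parse_width ("-1", REASONABLE_LIMIT, &w) and parse_width ("9999999999999", REASONABLE_LIMIT, &w) both fail, while passing a limit of 0 keeps very wide output available to anyone who asks for it.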
--=_1f51722c62452a7e66e7845f48aeedad-- ------------=_1566948242-17633-1-- From unknown Sat Aug 16 23:43:44 2025 X-Loop: help-debbugs@gnu.org Subject: bug#35256: [bug-diffutils] bug#35256: Bug report for -W argument (maximum width) - minor and not dangerous Resent-From: Assaf Gordon Original-Sender: "Debbugs-submit" Resent-CC: bug-diffutils@gnu.org Resent-Date: Wed, 28 Aug 2019 00:57:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 35256 X-GNU-PR-Package: diffutils X-GNU-PR-Keywords: To: 35256@debbugs.gnu.org, alec@unifiedmathematics.com Cc: eggert@cs.ucla.edu Received: via spool by 35256-submit@debbugs.gnu.org id=B35256.156695381927638 (code B ref 35256); Wed, 28 Aug 2019 00:57:01 +0000 Received: (at 35256) by debbugs.gnu.org; 28 Aug 2019 00:56:59 +0000 Received: from localhost ([127.0.0.1]:49639 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1i2mGh-0007Bh-Aq for submit@debbugs.gnu.org; Tue, 27 Aug 2019 20:56:59 -0400 Received: from mail-pl1-f193.google.com ([209.85.214.193]:43972) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1i2mGd-0007BQ-Cf for 35256@debbugs.gnu.org; Tue, 27 Aug 2019 20:56:57 -0400 Received: by mail-pl1-f193.google.com with SMTP id 4so360851pld.10 for <35256@debbugs.gnu.org>; Tue, 27 Aug 2019 17:56:55 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=subject:to:references:from:cc:message-id:date:user-agent :mime-version:in-reply-to:content-language:content-transfer-encoding; bh=EqJbYVLTHH0Id7xDpIRkEWcMj8tAnfxj1juEgIFVWWI=; b=F0ZzMgMSCrz/U/SDlY/kIXB/TPSv36XFj9gbbPUS5GC8iP6bBeSRHmA1IbhxtLGexx oY2L2xoUEzbLp+xiX1dRHTiNeB+wV/edxeKNvAjUmzw5wUIX/6MiQYWIYcOMusqjpf2x Z/oCmXYrJ686Y4FdFASFU9h0pU7jr/ULDZQn+CySz36Lwec4xuvUiiTTyjjE9IlryLQ4 JhMalyIjYjZawnZ2w5qZNxMwjlosT9gSW8wHG38ftIDxCyM9VbX4i0SVUUMSoM2ZyyDX 8yzn4YSfj0+UURD/4JpBuEfzp+NscBJVrcswxCzSeyzp6rT6FmJQlwhixr2n6Z2pGZR1 vyEQ== X-Google-DKIM-Signature: v=1; 
a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:subject:to:references:from:cc:message-id:date :user-agent:mime-version:in-reply-to:content-language :content-transfer-encoding; bh=EqJbYVLTHH0Id7xDpIRkEWcMj8tAnfxj1juEgIFVWWI=; b=GmUdokwDsglEd5E3E0jL7DXLfa/WXuchs2RzZMoE0zQQyUYqarXYuJKVwUwzMmRZnP L1DsTmghZqreAlH2B0ZRjMXilvEgW1Kfp9yuswFJQ6ES9OAYbrY/mwWhgxhUAz5Ln5Fl d3KtaMy4bprnVmpQ+IfE86KtNoogS6qD8xf5IZiC4EAKflJ7kej5BblWv4/jyTW8yEF0 AJMVZkZW4npAZkqjjDb6TeYTMHMcBvF9VkDv1tFSpO9+vRKqk+NZ6DzWtbhzBGrqaDel YFHWYM3j458j4ndF7itdxkZ1BToWBCAeGt3ZD5YsZYilVaIQsF7FWLN2R5XDFGstU8uW xsUg== X-Gm-Message-State: APjAAAVc0JxinoaflHd7lTyONq3ZtovdhTiJdG9oj6ZRE1aPZ/8zgU4g XWiPzVqW4qx3RBf5HZYxMyo= X-Google-Smtp-Source: APXvYqxBn8Rw1GQPLI5BNWLWyoprwicKX4zQoZlioke6FeIz1dElVcF91HAIAHFi+fkt1ukXySxl4A== X-Received: by 2002:a17:902:8696:: with SMTP id g22mr1757442plo.122.1566953809484; Tue, 27 Aug 2019 17:56:49 -0700 (PDT) Received: from tomato.moose.housegordon.com (moose.housegordon.com. 
[184.68.105.38]) by smtp.googlemail.com with ESMTPSA id ev3sm6625095pjb.3.2019.08.27.17.56.47 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Tue, 27 Aug 2019 17:56:48 -0700 (PDT) References: <8bbb0482c31d85f7bc37b78824e27c31d51ad479@webmail.unifiedmathematics.com> <14605a16-f1fb-fa44-7314-2092cf44ba75@cs.ucla.edu> From: Assaf Gordon Message-ID: <73d7c2bc-bc70-764c-886d-272abc04bf4a@gmail.com> Date: Tue, 27 Aug 2019 18:56:46 -0600 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.8.0 MIME-Version: 1.0 In-Reply-To: <14605a16-f1fb-fa44-7314-2092cf44ba75@cs.ucla.edu> Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 7bit X-Spam-Score: 0.0 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-)

Hello,

Slightly off-topic, but potentially helpful:

On 2019-08-27 5:23 p.m., Paul Eggert wrote:
> alec@unifiedmathematics.com wrote:
>
>> I know diff is used by A LOT of other programs, some of which are
>> web-accessible
>
> [...] if you let a remote attacker
> specify an arbitrary option to GNU diff there is lots of other trouble
> you can get into.
> [....] The general wisdom nowadays is to not expose command-line
> operands to attackers.

While generally true, sometimes there's no way around it
(or perhaps it is even the goal).

An easy way to restrict resources is to execute a simple wrapper
shell script that uses 'timeout', 'prlimit' and 'setpriv' for
additional restrictions.

For example:

    timeout 10s \
      setpriv --no-new-privs \
        prlimit --cpu=3 --data=50000000 --nproc=1 \
          diff [ARGS]

will limit the "diff" process to running 10 seconds (of wall time),
consume up to 3 seconds of CPU time, use up to 50MB of memory,
and limit to a single process (so it can't execute child processes).
The "setpriv" ensures it can't gain new privileges.

"prlimit" has more options (e.g. "--fsize" to limit file sizes so it
won't fill the drive, and "--nofile" to limit the number of open files).

These should work on any modern GNU/Linux system ("timeout" is from
coreutils, "setpriv" and "prlimit" are from util-linux).

None of the above is perfect, but they add a quick layer of additional
restrictions (and they don't require additional privileges to use).

To take it a step further, you can use containers and tools such as
"bubblewrap" and "firejail" to isolate a process from the network,
from the filesystem, and even from other processes.

Hope this helps,
 -assaf
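
[Editorial sketch, not part of the original message.] When diff is launched from a C program rather than a shell, the CPU and data limits from the wrapper above could be applied with the POSIX setrlimit call before exec. The helper names cap_limit and run_diff_limited are hypothetical, the values 3 and 50000000 simply mirror the prlimit example, and this sketch does not reproduce what "timeout" (wall-clock limit) or "setpriv --no-new-privs" do.

```c
#include <sys/resource.h>
#include <unistd.h>

/* Set both the soft and hard limit of RESOURCE to VALUE, clamped so an
   unprivileged process never tries to raise its hard limit.
   Return 0 on success, -1 on failure.  */
int
cap_limit (int resource, rlim_t value)
{
  struct rlimit rl;

  if (getrlimit (resource, &rl) != 0)
    return -1;
  if (rl.rlim_max != RLIM_INFINITY && value > rl.rlim_max)
    value = rl.rlim_max;
  rl.rlim_cur = value;
  rl.rlim_max = value;
  return setrlimit (resource, &rl);
}

/* Apply the limits to the current process, then replace it with diff.
   ARGV must be a NULL-terminated argument vector with argv[0] = "diff".
   Meant to be called in a freshly forked child; it returns (with -1)
   only if a limit could not be set or the exec failed.  */
int
run_diff_limited (char *const argv[])
{
  if (cap_limit (RLIMIT_CPU, 3) != 0            /* seconds of CPU time */
      || cap_limit (RLIMIT_DATA, 50000000) != 0) /* bytes of data segment */
    return -1;
  execvp ("diff", argv);
  return -1;
}
```

Because the limits are lowered in the calling process itself and a lowered hard limit cannot be raised again without privileges, a design like this would normally fork first and call run_diff_limited only in the child.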