From unknown Mon Aug 18 14:23:19 2025 Content-Disposition: inline Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-Mailer: MIME-tools 5.509 (Entity 5.509) Content-Type: text/plain; charset=utf-8 From: bug#18402 <18402@debbugs.gnu.org> To: bug#18402 <18402@debbugs.gnu.org> Subject: Status: Wrong output for single character files without newline Reply-To: bug#18402 <18402@debbugs.gnu.org> Date: Mon, 18 Aug 2025 21:23:19 +0000 retitle 18402 Wrong output for single character files without newline reassign 18402 diffutils submitter 18402 Eric Blake severity 18402 normal thanks From debbugs-submit-bounces@debbugs.gnu.org Wed Sep 03 17:04:19 2014 Received: (at submit) by debbugs.gnu.org; 3 Sep 2014 21:04:19 +0000 Received: from localhost ([127.0.0.1]:58164 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1XPHj0-0005xO-Gb for submit@debbugs.gnu.org; Wed, 03 Sep 2014 17:04:19 -0400 Received: from eggs.gnu.org ([208.118.235.92]:46237) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1XPHix-0005xA-Hj for submit@debbugs.gnu.org; Wed, 03 Sep 2014 17:04:16 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1XPHiq-0007SN-SY for submit@debbugs.gnu.org; Wed, 03 Sep 2014 17:04:10 -0400 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=-0.5 required=5.0 tests=BAYES_05 autolearn=disabled version=3.3.2 Received: from lists.gnu.org ([2001:4830:134:3::11]:47632) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1XPHiq-0007SH-Px for submit@debbugs.gnu.org; Wed, 03 Sep 2014 17:04:08 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:47932) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1XPHip-0002pH-GE for bug-diffutils@gnu.org; Wed, 03 Sep 2014 17:04:08 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1XPHik-0007P8-EW for 
bug-diffutils@gnu.org; Wed, 03 Sep 2014 17:04:07 -0400 Received: from mx1.redhat.com ([209.132.183.28]:9468) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1XPHiU-0007Jz-Me; Wed, 03 Sep 2014 17:03:46 -0400 Received: from int-mx13.intmail.prod.int.phx2.redhat.com (int-mx13.intmail.prod.int.phx2.redhat.com [10.5.11.26]) by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id s83L3jE5012169 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=FAIL); Wed, 3 Sep 2014 17:03:45 -0400 Received: from [10.3.113.52] (ovpn-113-52.phx2.redhat.com [10.3.113.52]) by int-mx13.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id s83L3ivc021533; Wed, 3 Sep 2014 17:03:44 -0400 Message-ID: <54078230.4040803@redhat.com> Date: Wed, 03 Sep 2014 15:03:44 -0600 From: Eric Blake Organization: Red Hat, Inc. User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.7.0 MIME-Version: 1.0 To: Navin Kabra , bug-gnu-utils@gnu.org, bug-diffutils@gnu.org Subject: Re: Wrong output for single character files without newline References: <86egvtc8iw.fsf@smriti.com> In-Reply-To: <86egvtc8iw.fsf@smriti.com> OpenPGP: url=http://people.redhat.com/eblake/eblake.gpg Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="2UJtjoSoQUG1cCq601WHvWIRpwL0hxa1I" X-Scanned-By: MIMEDefang 2.68 on 10.5.11.26 X-detected-operating-system: by eggs.gnu.org: GNU/Linux 3.x X-detected-operating-system: by eggs.gnu.org: Error: Malformed IPv6 address (bad octet value). 
X-Received-From: 2001:4830:134:3::11 X-Spam-Score: -5.0 (-----) X-Debbugs-Envelope-To: submit X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -5.0 (-----)

This is an OpenPGP/MIME signed message (RFC 4880 and 3156)
--2UJtjoSoQUG1cCq601WHvWIRpwL0hxa1I Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable

[adding bug-diffutils, as requested by diff --help]

On 09/03/2014 04:17 AM, Navin Kabra wrote:
> Consider this:
>
> echo -n a > /tmp/a
> echo -n b > /tmp/b
> diff -B /tmp/a /tmp/b

'echo -n' is non-portable. Please get used to using 'printf' instead.

>
> Clearly, the two files are different, yet, diff seems to think that the
> files are identical. I've managed to reproduce this problem on Ubuntu
> 14.04 with diffutils 3.3, on CloudLinux 5.10 with diffutils 2.8.1, and
> also Ubuntu 10.04 with diffutils 2.8.1.
>
> If I don't use the -B option, the problem goes away. If the files do end
> with a newline, the problem goes away. If the files contain more than 1
> character, the problem goes away. If combined with *some* of the other
> options (e.g. -e or -y) the problem goes away.

Actually, I couldn't reproduce -y making the problem go away:

$ ./src/diff -By <(printf a) <(printf b)
a b
$ echo $?
0

Thanks for the extensive analysis; I can confirm that this bug is still present in the latest diffutils.git sources, although I have not personally hunted for the culprit line of code yet.

--=20 Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org --2UJtjoSoQUG1cCq601WHvWIRpwL0hxa1I Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 Comment: Public key at http://people.redhat.com/eblake/eblake.gpg iQEcBAEBCAAGBQJUB4IwAAoJEKeha0olJ0NqbKUH/i7SDwj+OrV9M6ZEQVg/iV4F TPvRHi/OIdWAHv47nVNTBb8carS+BZASE7sEOMB0XPCXipIm0xu0KfjJaW5i/Y52 RuQCN00UQF7DXMIdK4X3hLo4JDJpi5+mRrZSOElUMfbjuJhiXHDl+bJVv/Tk5UNN v/koIEOt8cTbQ2HeLAZ+9M5PH8wyA09HWmrHOV14LLZTAzEaDJbyEgMMhyseti8t Vzjw+U1UzzRqBEebPIIhlWeJuEGUTrcaezY/8cK6RtVdPq9fRX0K6Qwzl/gaex7D fGCIILyMwhpfaEyUZKpoBMBN0EWBrvnmkbWAFcfFCR31UdLoV73a+hmXcdXS68I= =gj0M -----END PGP SIGNATURE----- --2UJtjoSoQUG1cCq601WHvWIRpwL0hxa1I-- From debbugs-submit-bounces@debbugs.gnu.org Wed Sep 03 19:06:11 2014 Received: (at 18402-done) by debbugs.gnu.org; 3 Sep 2014 23:06:11 +0000 Received: from localhost ([127.0.0.1]:58202 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1XPJcw-0001gA-Ch for submit@debbugs.gnu.org; Wed, 03 Sep 2014 19:06:11 -0400 Received: from smtp.cs.ucla.edu ([131.179.128.62]:47260) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1XPJcs-0001fb-PQ for 18402-done@debbugs.gnu.org; Wed, 03 Sep 2014 19:06:08 -0400 Received: from localhost (localhost.localdomain [127.0.0.1]) by smtp.cs.ucla.edu (Postfix) with ESMTP id 97028A60002; Wed, 3 Sep 2014 16:06:00 -0700 (PDT) X-Virus-Scanned: amavisd-new at smtp.cs.ucla.edu Received: from smtp.cs.ucla.edu ([127.0.0.1]) by localhost (smtp.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id xv5nDwwSXyBm; Wed, 3 Sep 2014 16:05:56 -0700 (PDT) Received: from [192.168.1.9] (pool-71-177-17-123.lsanca.dsl-w.verizon.net [71.177.17.123]) by smtp.cs.ucla.edu (Postfix) with ESMTPSA id 3EEFC39E8014; Wed, 3 Sep 2014 16:05:56 -0700 (PDT) 
Message-ID: <54079ED3.8050402@cs.ucla.edu> Date: Wed, 03 Sep 2014 16:05:55 -0700 From: Paul Eggert Organization: UCLA Computer Science Department User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.0 MIME-Version: 1.0 To: Eric Blake , navin@smriti.com, Matt Johnson , 18402-done@debbugs.gnu.org Subject: Re: [bug-diffutils] bug#18402: Wrong output for single character files without newline References: <86egvtc8iw.fsf@smriti.com> <54078230.4040803@redhat.com> In-Reply-To: <54078230.4040803@redhat.com> Content-Type: multipart/mixed; boundary="------------090500050109020003030902" X-Spam-Score: -4.0 (----) X-Debbugs-Envelope-To: 18402-done X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -4.0 (----) This is a multi-part message in MIME format. --------------090500050109020003030902 Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit Thanks for reporting that. I installed the attached 3 patches; patch #2 should fix the bug. 
--------------090500050109020003030902 Content-Type: text/plain; charset=UTF-8; name="0001-diff-fix-performance-bug-with-prefix-computation.patch" Content-Transfer-Encoding: base64 Content-Disposition: attachment; filename*0="0001-diff-fix-performance-bug-with-prefix-computation.patch" RnJvbSA3YmRkNjQ3OWNlNDNkNmI0NTgwM2ZkMGJjNGIzNjMzNzA5NzVjZWFiIE1vbiBTZXAg MTcgMDA6MDA6MDAgMjAwMQpGcm9tOiBQYXVsIEVnZ2VydCA8ZWdnZXJ0QGNzLnVjbGEuZWR1 PgpEYXRlOiBXZWQsIDMgU2VwIDIwMTQgMTU6MjU6MjEgLTA3MDAKU3ViamVjdDogW1BBVENI IDEvM10gZGlmZjogZml4IHBlcmZvcm1hbmNlIGJ1ZyB3aXRoIHByZWZpeCBjb21wdXRhdGlv bgoKKiBzcmMvaW8uYyAoZmluZF9pZGVudGljYWxfZW5kcyk6IEZpeCBwZXJmb3JtYW5jZSBi dWc6CnRoZSB0ZXN0IGZvciB3aGVuIHRoZSBwcmVmaXggd2FzIG5lZWRlZCBtZXNzZWQgdXAg YnkKdGhlIDIwMDItMDItMjggaW50ZWdlci1vdmVyZmxvdyBmaXhlcywgY2F1c2luZyBwZXJm b3JtYW5jZSB0byBiZQp3b3JzZSB0aGFuIGl0IG5lZWRlZCB0byBiZS4KLS0tCiBzcmMvaW8u YyB8IDggKysrKystLS0KIDEgZmlsZSBjaGFuZ2VkLCA1IGluc2VydGlvbnMoKyksIDMgZGVs ZXRpb25zKC0pCgpkaWZmIC0tZ2l0IGEvc3JjL2lvLmMgYi9zcmMvaW8uYwppbmRleCAwNWE4 OThjLi4xYThiOTM2IDEwMDY0NAotLS0gYS9zcmMvaW8uYworKysgYi9zcmMvaW8uYwpAQCAt NTM4LDYgKzUzOCw3IEBAIGZpbmRfaWRlbnRpY2FsX2VuZHMgKHN0cnVjdCBmaWxlX2RhdGEg ZmlsZXZlY1tdKQogICBsaW4gaSwgbGluZXM7CiAgIHNpemVfdCBuMCwgbjE7CiAgIGxpbiBh bGxvY19saW5lczAsIGFsbG9jX2xpbmVzMTsKKyAgYm9vbCBwcmVmaXhfbmVlZGVkOwogICBs aW4gYnVmZmVyZWRfcHJlZml4LCBwcmVmaXhfY291bnQsIHByZWZpeF9tYXNrOwogICBsaW4g bWlkZGxlX2d1ZXNzLCBzdWZmaXhfZ3Vlc3M7CiAKQEAgLTY4NywxMiArNjg4LDEzIEBAIGZp bmRfaWRlbnRpY2FsX2VuZHMgKHN0cnVjdCBmaWxlX2RhdGEgZmlsZXZlY1tdKQogICBwcmVm aXhfbWFzayA9IHByZWZpeF9jb3VudCAtIDE7CiAgIGxpbmVzID0gMDsKICAgbGluYnVmMCA9 IHhtYWxsb2MgKGFsbG9jX2xpbmVzMCAqIHNpemVvZiAqbGluYnVmMCk7CisgIHByZWZpeF9u ZWVkZWQgPSAhIChub19kaWZmX21lYW5zX25vX291dHB1dAorCQkgICAgICYmIGZpbGV2ZWNb MF0ucHJlZml4X2VuZCA9PSBwMAorCQkgICAgICYmIGZpbGV2ZWNbMV0ucHJlZml4X2VuZCA9 PSBwMSk7CiAgIHAwID0gYnVmZmVyMDsKIAogICAvKiBJZiB0aGUgcHJlZml4IGlzIG5lZWRl ZCwgZmluZCB0aGUgcHJlZml4IGxpbmVzLiAgKi8KLSAgaWYgKCEgKG5vX2RpZmZfbWVhbnNf 
bm9fb3V0cHV0Ci0JICYmIGZpbGV2ZWNbMF0ucHJlZml4X2VuZCA9PSBwMAotCSAmJiBmaWxl dmVjWzFdLnByZWZpeF9lbmQgPT0gcDEpKQorICBpZiAocHJlZml4X25lZWRlZCkKICAgICB7 CiAgICAgICBlbmQwID0gZmlsZXZlY1swXS5wcmVmaXhfZW5kOwogICAgICAgd2hpbGUgKHAw ICE9IGVuZDApCi0tIAoxLjkuMwoK --------------090500050109020003030902 Content-Type: text/plain; charset=UTF-8; name="0002-diff-fix-bug-with-diff-B-and-incomplete-lines.patch" Content-Transfer-Encoding: base64 Content-Disposition: attachment; filename="0002-diff-fix-bug-with-diff-B-and-incomplete-lines.patch" RnJvbSBkMmZkOWQ0NjgzZWY2MGMyNTlhM2I0MjZmNzFjZWYxYjg5ZmYzODNkIE1vbiBTZXAg MTcgMDA6MDA6MDAgMjAwMQpGcm9tOiBQYXVsIEVnZ2VydCA8ZWdnZXJ0QGNzLnVjbGEuZWR1 PgpEYXRlOiBXZWQsIDMgU2VwIDIwMTQgMTU6NTg6MDMgLTA3MDAKU3ViamVjdDogW1BBVENI IDIvM10gZGlmZjogZml4IGJ1ZyB3aXRoIGRpZmYgLUIgYW5kIGluY29tcGxldGUgbGluZXMK ClJlcG9ydGVkIGJ5IE5hdmluIEthYnJhIHZpYSBFcmljIEJsYWtlIGluOgpodHRwOi8vYnVn cy5nbnUub3JnLzE4NDAyCiogc3JjL3V0aWwuYyAoYW5hbHl6ZV9odW5rKTogRG9uJ3QgbWlz aGFuZGxlIGluY29tcGxldGUKbGluZXMgYXQgZW5kIG9mIGZpbGUuCiogdGVzdHMvbm8tbmV3 bGluZS1hdC1lb2Y6IFRlc3QgZm9yIHRoZSBidWcuCi0tLQogc3JjL3V0aWwuYyAgICAgICAg ICAgICAgfCA2ICsrKystLQogdGVzdHMvbm8tbmV3bGluZS1hdC1lb2YgfCA2ICsrKysrKwog MiBmaWxlcyBjaGFuZ2VkLCAxMCBpbnNlcnRpb25zKCspLCAyIGRlbGV0aW9ucygtKQoKZGlm ZiAtLWdpdCBhL3NyYy91dGlsLmMgYi9zcmMvdXRpbC5jCmluZGV4IDAxNjA1N2QuLjQ0Y2U2 MWYgMTAwNjQ0Ci0tLSBhL3NyYy91dGlsLmMKKysrIGIvc3JjL3V0aWwuYwpAQCAtODE3LDcg KzgxNyw4IEBAIGFuYWx5emVfaHVuayAoc3RydWN0IGNoYW5nZSAqaHVuaywKICAgICAgIGZv ciAoaSA9IG5leHQtPmxpbmUwOyBpIDw9IGwwICYmIHRyaXZpYWw7IGkrKykKIAl7CiAJICBj aGFyIGNvbnN0ICpsaW5lID0gbGluYnVmMFtpXTsKLQkgIGNoYXIgY29uc3QgKm5ld2xpbmUg PSBsaW5idWYwW2kgKyAxXSAtIDE7CisJICBjaGFyIGNvbnN0ICpsYXN0Ynl0ZSA9IGxpbmJ1 ZjBbaSArIDFdIC0gMTsKKwkgIGNoYXIgY29uc3QgKm5ld2xpbmUgPSBsYXN0Ynl0ZSArICgq bGFzdGJ5dGUgIT0gJ1xuJyk7CiAJICBzaXplX3QgbGVuID0gbmV3bGluZSAtIGxpbmU7CiAJ ICBjaGFyIGNvbnN0ICpwID0gbGluZTsKIAkgIGlmIChza2lwX3doaXRlX3NwYWNlKQpAQCAt ODM3LDcgKzgzOCw4IEBAIGFuYWx5emVfaHVuayAoc3RydWN0IGNoYW5nZSAqaHVuaywKICAg 
ICAgIGZvciAoaSA9IG5leHQtPmxpbmUxOyBpIDw9IGwxICYmIHRyaXZpYWw7IGkrKykKIAl7 CiAJICBjaGFyIGNvbnN0ICpsaW5lID0gbGluYnVmMVtpXTsKLQkgIGNoYXIgY29uc3QgKm5l d2xpbmUgPSBsaW5idWYxW2kgKyAxXSAtIDE7CisJICBjaGFyIGNvbnN0ICpsYXN0Ynl0ZSA9 IGxpbmJ1ZjFbaSArIDFdIC0gMTsKKwkgIGNoYXIgY29uc3QgKm5ld2xpbmUgPSBsYXN0Ynl0 ZSArICgqbGFzdGJ5dGUgIT0gJ1xuJyk7CiAJICBzaXplX3QgbGVuID0gbmV3bGluZSAtIGxp bmU7CiAJICBjaGFyIGNvbnN0ICpwID0gbGluZTsKIAkgIGlmIChza2lwX3doaXRlX3NwYWNl KQpkaWZmIC0tZ2l0IGEvdGVzdHMvbm8tbmV3bGluZS1hdC1lb2YgYi90ZXN0cy9uby1uZXds aW5lLWF0LWVvZgppbmRleCAxNGQ1ZjQ5Li5mNTAzNzE4IDEwMDc1NQotLS0gYS90ZXN0cy9u by1uZXdsaW5lLWF0LWVvZgorKysgYi90ZXN0cy9uby1uZXdsaW5lLWF0LWVvZgpAQCAtNTAs NCArNTAsMTAgQEAgY29tcGFyZSBleHAyIG91dCB8fCBmYWlsPTEKICMgZXhwZWN0IGVtcHR5 IHN0ZGVycgogY29tcGFyZSAvZGV2L251bGwgZXJyIHx8IGZhaWw9MQogCisjIFRlc3QgZm9y IEJ1ZyMxODQwMi4KK3ByaW50ZiBhID4gYQorcHJpbnRmIGIgPiBiCitkaWZmIC1CIGEgYiA+ IG91dCAyPmVycgordGVzdCAkPyA9IDEgfHwgZmFpbD0xCisKIEV4aXQgJGZhaWwKLS0gCjEu OS4zCgo= --------------090500050109020003030902 Content-Type: text/plain; charset=UTF-8; name="0003-doc-mention-diff-B-fix-in-NEWS.patch" Content-Transfer-Encoding: base64 Content-Disposition: attachment; filename="0003-doc-mention-diff-B-fix-in-NEWS.patch" RnJvbSBkZjNhZjI5NjI3YTkyNDk1YTc0MGRhMTNjYjhiYjBkNGZjYzFiZjg0IE1vbiBTZXAg MTcgMDA6MDA6MDAgMjAwMQpGcm9tOiBQYXVsIEVnZ2VydCA8ZWdnZXJ0QGNzLnVjbGEuZWR1 PgpEYXRlOiBXZWQsIDMgU2VwIDIwMTQgMTY6MDI6MzUgLTA3MDAKU3ViamVjdDogW1BBVENI IDMvM10gZG9jOiBtZW50aW9uIGRpZmYgLUIgZml4IGluIE5FV1MKCi0tLQogTkVXUyB8IDMg KysrCiAxIGZpbGUgY2hhbmdlZCwgMyBpbnNlcnRpb25zKCspCgpkaWZmIC0tZ2l0IGEvTkVX UyBiL05FV1MKaW5kZXggNThhN2NiYi4uOWYzODhkZCAxMDA2NDQKLS0tIGEvTkVXUworKysg Yi9ORVdTCkBAIC0xMyw2ICsxMyw5IEBAIEdOVSBkaWZmdXRpbHMgTkVXUyAgICAgICAgICAg ICAgICAgICAgICAgICAgICAgICAgICAgIC0qLSBvdXRsaW5lIC0qLQogICBjb25zaWRlciB0 d28gQXNpYW4gZmlsZSBuYW1lcyB0byBiZSB0aGUgc2FtZSBtZXJlbHkgYmVjYXVzZSB0aGV5 CiAgIGNvbnRhaW4gbm8gRW5nbGlzaCBjaGFyYWN0ZXJzLgogCisgIGRpZmYgLUIgbm8gbG9u 
Z2VyIGdlbmVyYXRlcyBpbmNvcnJlY3Qgb3V0cHV0IGlmIHRoZSB0d28gaW5wdXRzCisgIGVh Y2ggZW5kIHdpdGggYSBvbmUtYnl0ZSBpbmNvbXBsZXRlIGxpbmUuCisKICoqIFBlcmZvcm1h bmNlIGNoYW5nZXMKIAogICBkaWZmJ3MgZGVmYXVsdCBhbGdvcml0aG0gaGFzIGJlZW4gYWRq dXN0ZWQgdG8gb3V0cHV0IGhpZ2hlci1xdWFsaXR5Ci0tIAoxLjkuMwoK --------------090500050109020003030902-- From debbugs-submit-bounces@debbugs.gnu.org Wed Sep 03 19:20:36 2014 Received: (at 18402) by debbugs.gnu.org; 3 Sep 2014 23:20:36 +0000 Received: from localhost ([127.0.0.1]:58213 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1XPJqu-00021X-GO for submit@debbugs.gnu.org; Wed, 03 Sep 2014 19:20:36 -0400 Received: from mail-wg0-f51.google.com ([74.125.82.51]:59304) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1XPJqq-00021A-5q; Wed, 03 Sep 2014 19:20:32 -0400 Received: by mail-wg0-f51.google.com with SMTP id l18so9287432wgh.22 for ; Wed, 03 Sep 2014 16:20:26 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:from:date:message-id :subject:to:cc:content-type; bh=4qUZdNxgaaR30Z9KT/J72Uw844zxwZxLsnd0zjoCOv8=; b=s+RIkisFZtiKPc68gvdBkBheKhuu51SJ1nJWcuAO/3z8i4Zb0+o+tUjBZtPBMQCn8d eHb4a92+uM+kOnKfyb56jVmkNBJZiUjcKxKPHZIvQZYz3FgcVJ80sxGcnHseDG6Y+Q2t kZ0wLttjEd1BEw8QAwk2cbbVTxcooJpBCNSCn3ByYF6kaBUwPU7Q9cIym1wzhJMJZM8D JjAqVL0pBRZ0I3CUhc8j4D+kJnL0WyDJpYL773RqW/iQf92MAgg4RkvbLCL+GJFaeEPa vuL3BYNjuW24OnXdHf+d9S3kPJkAhKP0lSTRQykGHn4hJdkF0ValseEC0dEZ1/P6JtDR k9dw== X-Received: by 10.194.78.100 with SMTP id a4mr735323wjx.106.1409786426139; Wed, 03 Sep 2014 16:20:26 -0700 (PDT) MIME-Version: 1.0 Received: by 10.194.41.202 with HTTP; Wed, 3 Sep 2014 16:20:06 -0700 (PDT) In-Reply-To: <54079ED3.8050402@cs.ucla.edu> References: <86egvtc8iw.fsf@smriti.com> <54078230.4040803@redhat.com> <54079ED3.8050402@cs.ucla.edu> From: Jim Meyering Date: Wed, 3 Sep 2014 16:20:06 -0700 X-Google-Sender-Auth: TYERwHRyXtmC5wSDU_9eRU2EIBI Message-ID: Subject: 
Re: [bug-diffutils] bug#18402: bug#18402: Wrong output for single character files without newline To: 18402@debbugs.gnu.org, Paul Eggert , Eric Blake Content-Type: text/plain; charset=ISO-8859-1 X-Spam-Score: -0.7 (/) X-Debbugs-Envelope-To: 18402 Cc: 18402-done@debbugs.gnu.org, Matt Johnson , navin@smriti.com X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.7 (/) On Wed, Sep 3, 2014 at 4:05 PM, Paul Eggert wrote: > Thanks for reporting that. I installed the attached 3 patches; patch #2 > should fix the bug. Thanks for all the patches. Regarding the performance fix, can you give performance deltas on moderate or pathologically affected inputs? It'd be great to include actual inputs (or a recipe for creating them) so we have a hope of avoiding such regressions in the future. From debbugs-submit-bounces@debbugs.gnu.org Wed Sep 03 20:20:27 2014 Received: (at 18402) by debbugs.gnu.org; 4 Sep 2014 00:20:27 +0000 Received: from localhost ([127.0.0.1]:58225 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1XPKmo-0003QM-0d for submit@debbugs.gnu.org; Wed, 03 Sep 2014 20:20:27 -0400 Received: from smtp.cs.ucla.edu ([131.179.128.62]:50237) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1XPKml-0003Q7-G7 for 18402@debbugs.gnu.org; Wed, 03 Sep 2014 20:20:24 -0400 Received: from localhost (localhost.localdomain [127.0.0.1]) by smtp.cs.ucla.edu (Postfix) with ESMTP id 631F6A60002; Wed, 3 Sep 2014 17:20:17 -0700 (PDT) X-Virus-Scanned: amavisd-new at smtp.cs.ucla.edu Received: from smtp.cs.ucla.edu ([127.0.0.1]) by localhost (smtp.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id JfbHo3XsEASh; Wed, 3 Sep 2014 17:20:08 -0700 (PDT) Received: from [192.168.1.9] (pool-71-177-17-123.lsanca.dsl-w.verizon.net 
[71.177.17.123]) by smtp.cs.ucla.edu (Postfix) with ESMTPSA id C6A4439E8014; Wed, 3 Sep 2014 17:20:08 -0700 (PDT) Message-ID: <5407B038.6050804@cs.ucla.edu> Date: Wed, 03 Sep 2014 17:20:08 -0700 From: Paul Eggert Organization: UCLA Computer Science Department User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.0 MIME-Version: 1.0 To: Jim Meyering , 18402@debbugs.gnu.org Subject: Re: [bug-diffutils] bug#18402: bug#18402: Wrong output for single character files without newline References: <86egvtc8iw.fsf@smriti.com> <54078230.4040803@redhat.com> <54079ED3.8050402@cs.ucla.edu> In-Reply-To: Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Score: -4.0 (----) X-Debbugs-Envelope-To: 18402 X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -4.0 (----)

Jim Meyering wrote:
> can you give performance deltas on moderate
> or pathologically affected inputs?

Maybe something like this:

diff --horizon-lines=100000000000000000000 gnulib/ChangeLog /tmp/ChangeLog

where the two files are copies. The bug fix sped up performance about 5x on my platform, which is Fedora 20 x86-64, AMD Phenom II X4 910e.

From debbugs-submit-bounces@debbugs.gnu.org Thu Sep 04 01:12:39 2014 Received: (at 18402) by debbugs.gnu.org; 4 Sep 2014 05:12:40 +0000 Received: from localhost ([127.0.0.1]:58326 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1XPPLb-0002EI-25 for submit@debbugs.gnu.org; Thu, 04 Sep 2014 01:12:39 -0400 Received: from mail-wi0-f180.google.com ([209.85.212.180]:57290) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1XPPLS-0002Dw-1t for 18402@debbugs.gnu.org; Thu, 04 Sep 2014 01:12:34 -0400 Received: by mail-wi0-f180.google.com with SMTP id ex7so371247wid.1 for <18402@debbugs.gnu.org>; Wed, 03 Sep 2014 22:12:24 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:from:date:message-id :subject:to:cc:content-type; bh=ud66GFfTR+ZQTgw6HNfZHnRL094f3BDInF7nWi0TVRo=; b=x11pjx/dMcsw8dIjdQAMeBK3pTPxJLGR3sXX8RzTl610sxVO2aLqQgk9/+e8MBTThV PrFEE52p8z25kScVk6hgYX2W/ae44hRMBIlYvc1gMAUXvs9Kr06ouiTfFir/rLDyA+f3 Y9K/oKxYg/v/Mf43fZHymd3pn2HDZoII41iS//8EuAfR0zKaqY291dO525D4nZ1fzvVa 2RRvnyt480i8+Esma94s3xkaBDDd+H2JJAq6ev2Kx3FJeJNKePVIXrdYFW5N6LlChRfi V5Wr6+f4vlRRSTiDPk4nZFAx8S8csaHuN3EAEIDT7MAfxELpnL2KK7zHOPrprH4pGHKF 7Eag== X-Received: by 10.194.78.100 with SMTP id a4mr2666521wjx.106.1409807544235; Wed, 03 Sep 2014 22:12:24 -0700 (PDT) MIME-Version: 1.0 Received: by 10.194.41.202 with HTTP; Wed, 3 Sep 2014 22:12:04 -0700 (PDT) In-Reply-To: <5407B038.6050804@cs.ucla.edu> References: <86egvtc8iw.fsf@smriti.com> <54078230.4040803@redhat.com> <54079ED3.8050402@cs.ucla.edu> <5407B038.6050804@cs.ucla.edu> From: Jim Meyering Date: Wed, 3 Sep 2014 22:12:04 -0700 X-Google-Sender-Auth: imuj7Uw48bgV1-NqApkt9XDqsIE Message-ID: Subject: Re: [bug-diffutils] bug#18402: bug#18402: Wrong output for single character files without newline To: Paul Eggert Content-Type: text/plain; charset=ISO-8859-1 X-Spam-Score: -0.7 (/) X-Debbugs-Envelope-To: 18402 Cc: 18402 
<18402@debbugs.gnu.org> X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.7 (/)

On Wed, Sep 3, 2014 at 5:20 PM, Paul Eggert wrote:
> Jim Meyering wrote:
>>
>> can you give performance deltas on moderate
>> or pathologically affected inputs?
>
>
> Maybe something like this:
>
> diff --horizon-lines=100000000000000000000 gnulib/ChangeLog /tmp/ChangeLog
>
> where the two files are copies. The bug fix sped up performance about 5x on
> my platform, which is Fedora 20 x86-64, AMD Phenom II X4 910e.

Thanks for the details.
I tried to reproduce using two copies of gnulib/ChangeLog,
but saw identical times for before/after runs.
I also tried with two copies of the output of "seq 9999999" on a
tmpfs file system, with the same result: no discernible difference.
I tried both on an AMD FX(tm)-4100 and an Intel(R) Core(TM) i7-4770S.

Here are the commands I ran:

seq 9999999 > /t/1 && cp /t/1 /t/2
env time src/diff --horizon-lines=100000000000000000000 /t/[12]

Then I took the best of five elapsed times and compared.
Here's the minimum time on the faster system, both with and without the patch:

$ env time src/diff --horizon-lines=100000000000000000000 /t/[12]
1.94user 0.34system 0:02.29elapsed 99%CPU (0avgtext+0avgdata 1112960maxresident)k
0inputs+0outputs (0major+404031minor)pagefaults 0swaps

From debbugs-submit-bounces@debbugs.gnu.org Thu Sep 04 20:34:34 2014 Received: (at 18402) by debbugs.gnu.org; 5 Sep 2014 00:34:35 +0000 Received: from localhost ([127.0.0.1]:59154 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1XPhU2-0000j9-7E for submit@debbugs.gnu.org; Thu, 04 Sep 2014 20:34:34 -0400 Received: from smtp.cs.ucla.edu ([131.179.128.62]:48867) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1XPhTz-0000is-9t for 18402@debbugs.gnu.org; Thu, 04 Sep 2014 20:34:32 -0400 Received: from localhost (localhost.localdomain [127.0.0.1]) by smtp.cs.ucla.edu (Postfix) with ESMTP id 0055F39E8017; Thu, 4 Sep 2014 17:34:24 -0700 (PDT) X-Virus-Scanned: amavisd-new at smtp.cs.ucla.edu Received: from smtp.cs.ucla.edu ([127.0.0.1]) by localhost (smtp.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id M4G5-AOqh7P1; Thu, 4 Sep 2014 17:34:16 -0700 (PDT) Received: from [192.168.1.9] (pool-71-177-17-123.lsanca.dsl-w.verizon.net [71.177.17.123]) by smtp.cs.ucla.edu (Postfix) with ESMTPSA id 365B539E8014; Thu, 4 Sep 2014 17:34:16 -0700 (PDT) Message-ID: <54090507.90103@cs.ucla.edu> Date: Thu, 04 Sep 2014 17:34:15 -0700 From: Paul Eggert Organization: UCLA Computer Science Department User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.0 MIME-Version: 1.0 To: Jim Meyering Subject: Re: [bug-diffutils] bug#18402: bug#18402: Wrong output for single character files without newline References: <86egvtc8iw.fsf@smriti.com> <54078230.4040803@redhat.com> <54079ED3.8050402@cs.ucla.edu> <5407B038.6050804@cs.ucla.edu> In-Reply-To: Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit
X-Spam-Score: -4.0 (----) X-Debbugs-Envelope-To: 18402 Cc: 18402 <18402@debbugs.gnu.org> X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -4.0 (----)

Jim Meyering wrote:
> I also tried with two copies of the output of "seq 9999999" on a
> tmpfs file system, with the same result: no discernible difference.

There is something weird going on, as I can't reproduce my earlier results. Perhaps I built one version of 'diff' without optimization and the other one with it, by accident. Sorry about sending you on a wild goose chase.

I'm still seeing a significant performance improvement due to the change, though not as dramatic as what I earlier reported. Here's the benchmark:

$ seq 100000000 >0
$ cp 0 1
$ time ./diff-old 0 1

real 0m2.540s
user 0m1.055s
sys 0m1.464s
$ time ./diff-new 0 1

real 0m1.734s
user 0m0.256s
sys 0m1.463s

where 'diff-old' and 'diff-new' are the old (b6e691277288c4e8d53b1d2577137d265008d13e) and current (df3af29627a92495a740da13cb8bb0d4fcc1bf84) versions of diffutils, both compiled with plain 'configure; make' on the same Fedora 20 x86-64 platform I mentioned earlier. This is on an ext4 file system that is built atop a mirrored hard-disk subsystem, and the locale is en_US.utf8 (dunno if any of this matters).

This benchmark is dominated by system CPU time, so the new version is only about 45% faster than the old if one looks at real time, but it's still clearly a win, as the user CPU time is about 4x lower.

From unknown Mon Aug 18 14:23:19 2025 Received: (at fakecontrol) by fakecontrolmessage; To: internal_control@debbugs.gnu.org From: Debbugs Internal Request Subject: Internal Control Message-Id: bug archived. Date: Fri, 03 Oct 2014 11:24:03 +0000 User-Agent: Fakemail v42.6.9 # This is a fake control message. # # The action: # bug archived.
thanks # This fakemail brought to you by your local debbugs # administrator
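
For readers who do not want to decode the base64 attachment: patch 2/3 changes how analyze_hunk in src/util.c locates the end of each line's content. The old code took the byte before the start of the next line and assumed it was the newline; the new code first checks whether that last byte really is '\n'. The sketch below isolates just that arithmetic. The function names and the (line, nbytes) calling convention are hypothetical, chosen for illustration; in diffutils the two pointers come from linbuf[i] and linbuf[i + 1]. Presumably the resulting length is what lets -B decide whether a changed line counts as blank, which would explain why a one-byte file with no trailing newline was treated as empty.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative helpers only; these names do not exist in diffutils.
   A line starts at `line` and occupies `nbytes` bytes, so the next line
   would start at line + nbytes (compare linbuf[i] and linbuf[i + 1]).  */

/* Logic before the fix: the last byte is always assumed to be the
   newline, so it is never counted as content.  */
size_t
content_len_before_fix (char const *line, size_t nbytes)
{
  assert (nbytes > 0);          /* need at least one byte to inspect */
  char const *newline = line + nbytes - 1;
  return (size_t) (newline - line);
}

/* Logic after the fix: step past the last byte when it is not a
   newline, so an incomplete final line keeps all of its bytes.  */
size_t
content_len_after_fix (char const *line, size_t nbytes)
{
  assert (nbytes > 0);
  char const *lastbyte = line + nbytes - 1;
  char const *newline = lastbyte + (*lastbyte != '\n');
  return (size_t) (newline - line);
}
```

For the reported input, a file holding the single byte "a", the old computation yields a content length of 0 and the new one yields 1; for a complete line such as "a\n" both yield 1.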