I've attached an archive with 2 files "file1" and "file2"; "file2"
is "file1" with some lines removed, so that a diff should report
only removed lines.

Here are some tests done under Debian/sid (x86_64) with diff 3.7
(Debian package diffutils 1:3.7-3).

First, for the reference, the size of the initial diff:

$ diff -u file1 file2 | wc -l
22319

But this diff reports added lines, though "file2" has only removed
lines compared to "file1".

──────────────────────────────────────────────────────────────────
$ diff -u file1 file2 | grep -C16 'mark :37950'
-commit refs/heads/master
-#legacy-id 9122
-mark :37948
-committer Vincent Lefèvre 1404215412 +0000
-data 55
-[tests/trandom_deviate.c] Correction (fprintf format).
-from :37946
-M 100644 :37947 tests/trandom_deviate.c
-
-blob
-mark :37949
-data 15
-Blob at :37949
-
-commit refs/heads/misc
-#legacy-id 9123
-mark :37950
-committer Vincent Lefèvre 1404216001 +0000
-data 23
-[www/pub.html] Update.
-from :37941
-M 100644 :37949 www/pub.html
+data 55
+[tests/trandom_deviate.c] Correction (fprintf format).
+from :37946
+M 100644 :37947 tests/trandom_deviate.c
 
 blob
 mark :37951
@@ -9910,21 +467,6 @@
 M 100644 :38018 src/round_raw_generic.c
 
 blob
──────────────────────────────────────────────────────────────────

In particular, one can see:

-data 55
-[tests/trandom_deviate.c] Correction (fprintf format).
-from :37946
-M 100644 :37947 tests/trandom_deviate.c

and

+data 55
+[tests/trandom_deviate.c] Correction (fprintf format).
+from :37946
+M 100644 :37947 tests/trandom_deviate.c

while these lines should have been regarded as unmodified.

This problem disappears if I shorten "file2" a bit (these lines are
at the very beginning in "file2", so that such a change of behavior
is surprising):

$ head -n 129410 file2 > file3
$ diff -u file1 file3 | grep '^\+'
+++ file3	2020-11-24 11:58:17.922462693 +0100

So, now, no added lines reported. This is fine.
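
Since grep '^\+' also matches the "+++" header line of the unified
diff, a small helper that skips the two header lines gives the number
of added lines directly. This is only a sketch: the function name
added_lines is hypothetical (it is not part of diffutils), and it
assumes the "diff -u" output starts with the usual "---" and "+++"
header lines.

```shell
#!/bin/sh
# Count the lines that "diff -u" reports as added between two files.
# Hypothetical helper for illustration; not part of diffutils.
added_lines() {
    # The first two lines of unified output are the "---" and "+++"
    # file headers, so "tail -n +3" drops them before counting.
    # grep -c prints 0 but exits with status 1 when nothing matches,
    # hence the "|| true" to keep the overall status at 0.
    diff -u -- "$1" "$2" | tail -n +3 | grep -c '^+' || true
}
```

With this, "added_lines file1 file3" is expected to print 0 whenever
diff reports only removed lines, without having to subtract 1 for
the header.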

And here's what diff now gives around these lines:

──────────────────────────────────────────────────────────────────
$ diff -u file1 file3 | grep -C16 'mark :37950'
-commit refs/heads/master
-#legacy-id 9122
-mark :37948
-committer Vincent Lefèvre 1404215412 +0000
 data 55
 [tests/trandom_deviate.c] Correction (fprintf format).
 from :37946
 M 100644 :37947 tests/trandom_deviate.c
 
 blob
-mark :37949
-data 15
-Blob at :37949
-
-commit refs/heads/misc
-#legacy-id 9123
-mark :37950
-committer Vincent Lefèvre 1404216001 +0000
-data 23
-[www/pub.html] Update.
-from :37941
-M 100644 :37949 www/pub.html
-
-blob
 mark :37951
 data 15
 Blob at :37951
@@ -9910,21 +467,6 @@
 M 100644 :38018 src/round_raw_generic.c
 
 blob
-mark :38020
-data 15
──────────────────────────────────────────────────────────────────

This is now OK, but stranger things happen when I reduce "file2"
even more:

$ head -n 120200 file2 > file4
$ diff -u file1 file4 | grep -c '^\+'
7
$ diff -u file1 file4 | wc -l
31251

So, with "file2" reduced to 120200 lines, 7 − 1 = 6 added lines are
reported (though this new file has only removed lines). This is
incorrect, but if I remove 100 more lines at the end, this is much
worse, with 81120 added lines reported, and a huge diff:

$ head -n 120100 file2 > file5
$ diff -u file1 file5 | grep -c '^\+'
81121
$ diff -u file1 file5 | wc -l
231111

-- 
Vincent Lefèvre - Web:
100% accessible validated (X)HTML - Blog:
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)