GNU bug report logs - #22108
diff wrapper script for very large files, low memory


Package: diffutils;

Reported by: Taco van Dijk <taco <at> waag.org>

Date: Mon, 7 Dec 2015 16:17:02 UTC

Severity: normal

To reply to this bug, email your comments to 22108 AT debbugs.gnu.org.



Report forwarded to bug-diffutils <at> gnu.org:
bug#22108; Package diffutils. (Mon, 07 Dec 2015 16:17:02 GMT)

Acknowledgement sent to Taco van Dijk <taco <at> waag.org>:
New bug report received and forwarded. Copy sent to bug-diffutils <at> gnu.org. (Mon, 07 Dec 2015 16:17:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Taco van Dijk <taco <at> waag.org>
To: bug-diffutils <at> gnu.org
Subject: diff wrapper script for very large files, low memory
Date: Mon, 7 Dec 2015 12:45:50 +0100 (CET)

Hi,

For our current project we faced the following problem:
when comparing two large files (2 x 4+ GB) that exceed the RAM of the machine,
the machine would become unresponsive.

To solve this problem we have built a solution based on xxhash that might be worth sharing.

For anyone interested, you can find it here:

https://github.com/waagsociety/hashed-diff

Kind regards,

Taco van Dijk & Lodewijk Loos
Waag Society

-- 
PGP: 82EDF574 




Information forwarded to bug-diffutils <at> gnu.org:
bug#22108; Package diffutils. (Mon, 02 May 2016 02:01:02 GMT)

Message #8 received at 22108 <at> debbugs.gnu.org:

From: Jim Meyering <jim <at> meyering.net>
To: 22108 <at> debbugs.gnu.org, taco <at> waag.org
Subject: Re: diff wrapper script for very large files, low memory
Date: Sun, 1 May 2016 19:00:06 -0700

tags 22108 wishlist
done

Thanks for the suggestion and pointer.
FYI, your problem is very similar to that described at http://bugs.gnu.org/21665

I'm marking this auto-created issue as "wishlist".
A combination of this approach and using mmap may be profitable when
input files are too large for available RAM.
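
For readers following the thread, here is a rough sketch of how hashing and
mmap could be combined. It is not the code from the linked hashed-diff
repository, whose details are not described in this report. It assumes that
each line is reduced to a short fixed-size digest and that the digests, not
the lines, are compared. Python's hashlib.blake2b stands in for xxhash, which
is not in the standard library, and the names line_hashes and changed_ranges
are made up for illustration.

```python
import difflib
import hashlib
import mmap
import os


def line_hashes(path):
    """Return one 8-byte digest per line, reading the file through mmap."""
    digests = []
    # mmap cannot map an empty file, so handle that case up front.
    if os.path.getsize(path) == 0:
        return digests
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # readline() walks the mapping line by line; the kernel pages
            # data in and out, so the whole file need not sit in RAM.
            for line in iter(mm.readline, b""):
                # Stand-in for xxhash (assumption): an 8-byte BLAKE2b digest.
                digests.append(hashlib.blake2b(line, digest_size=8).digest())
    return digests


def changed_ranges(path_a, path_b):
    """Compare two files by their line digests.

    Returns the difflib opcodes that are not 'equal', as tuples
    (tag, a_start, a_end, b_start, b_end) of 0-based line ranges.
    """
    a = line_hashes(path_a)
    b = line_hashes(path_b)
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    return [op for op in matcher.get_opcodes() if op[0] != "equal"]
```

Calling changed_ranges("old.txt", "new.txt") (hypothetical file names) gives
the line ranges that differ; the actual text can then be read back from just
those ranges. Two caveats: the digest lists still grow with the number of
lines, and two different lines can in principle share a digest, so a careful
tool would re-check the ranges reported as equal or use a wider digest.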




This bug report was last modified 9 years and 45 days ago.


