GNU bug report logs - #69535
Problem with copying an EXTREMELY large file - cmp finds a mismatch

Package: coreutils

Reported by: Brian <b_lists <at> patandbrian.org>

Date: Mon, 4 Mar 2024 04:27:02 UTC

Severity: normal

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Brian <b_lists <at> patandbrian.org>
To: bug-coreutils <at> gnu.org
Subject: Problem with copying an EXTREMELY large file - cmp finds a mismatch
Date: Sun, 3 Mar 2024 15:04:51 -0500

I don't know whether the problem I've found is with cp or with cmp, so 
I don't know whether to address this report to coreutils or diffutils. 
If you think I've guessed wrong, please tell me so.

I am trying to make a backup copy of a very large (40-gigabyte) data 
file - yes, I have plenty of disk space! :) It's a binary file (200-byte 
fixed-length records, to be precise), not a text file. I have 
downloaded, compiled and used the latest versions of cp and cmp, and 
the problem persists. My system is a 16-core AMD Ryzen desktop running 
Linux Mint 21.3.

The steps to reproduce the problem are simple, provided you have the 
data file!

I have a folder called original in the data directory. From a terminal 
prompt, I run

cp data.dat original

This apparently completes correctly; at least, no error messages are seen.

I then run

cmp -l data.dat original/data.dat

and I get roughly 100 differing bytes. Across three copy-and-compare 
attempts, the addresses of these differences vary, but they always form 
a single block of contiguous locations, always towards the end of the 
file (the last time, they were in the 35,000,000,000s).
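
A minimal sketch of how the `cmp -l` output described above could be condensed into contiguous ranges. It assumes the usual `cmp -l` line format (a 1-based decimal byte offset in the first column, followed by the two differing byte values) and the 200-byte record length mentioned earlier. The function name `summarize_diffs` is hypothetical and is not part of coreutils or diffutils.

```shell
#!/bin/sh
# Hypothetical helper: collapse `cmp -l` output into contiguous byte ranges.
# Column 1 of each input line is assumed to be a 1-based decimal byte offset.
# The optional first argument is the record length in bytes (default 200).
summarize_diffs() {
    awk -v reclen="${1:-200}" '
        # Print one range, plus the 0-based records it touches.
        # %.0f is used so offsets above 2^31 print correctly in any awk.
        function report(first, last) {
            printf "bytes %.0f-%.0f (%.0f bytes), records %.0f-%.0f\n",
                first, last, last - first + 1,
                int((first - 1) / reclen), int((last - 1) / reclen)
        }
        NR == 1        { first = $1; last = $1; next }   # start the first range
        $1 == last + 1 { last = $1; next }               # extend the current range
                       { report(first, last); first = $1; last = $1 }
        END            { if (NR > 0) report(first, last) }
    '
}
```

It could be used as `cmp -l data.dat original/data.dat | summarize_diffs 200`. A single output line would confirm that the mismatches form one contiguous block, and the record numbers show which fixed-length records are affected.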

I have run an fsck on the drive (a 14 TB Seagate connected to one of 
the motherboard SATA ports) and no problems were found.

Any advice, please? I'm close to the limits of my debugging knowledge.

Please note that I have absolutely zero knowledge of the C language or 
its derivatives. I'm a (retired) scientist turned database programmer, 
I know Pascal, FORTRAN and SQL, and that's about it.


Thanks,

Brian.





