GNU bug report logs - #18316
[PATCH] warn on too large file copies

Previous Next

Package: coreutils;

Reported by: adamjsho <at> gmail.com

Date: Fri, 22 Aug 2014 15:58:02 UTC

Severity: normal

Tags: fixed, patch

Done: Assaf Gordon <assafgordon <at> gmail.com>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 18316 in the body.
You can then email your comments to 18316 AT debbugs.gnu.org in the normal way.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox


Report forwarded to bug-coreutils <at> gnu.org:
bug#18316; Package coreutils. (Fri, 22 Aug 2014 15:58:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to adamjsho <at> gmail.com:
New bug report received and forwarded. Copy sent to bug-coreutils <at> gnu.org. (Fri, 22 Aug 2014 15:58:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Adam <adamjsho <at> gmail.com>
To: bug-coreutils <at> gnu.org
Subject: [PATCH] warn on too large file copies
Date: Fri, 22 Aug 2014 16:56:13 +0100
[Message part 1 (text/plain, inline)]
When files are being copied, they can hit the following loop:

  while (max_n_read)
    { 
      ...
      // return true if file finishes copying
      // return false if error occurs     
      // copy some file
      // deduct bytes copied from max_n_read
      ...
    }

  return true;

If max_n_read reaches 0, the copy does not return a warning or a failure
but instead returns that the copy completed successfully.

I believe when copying files as large as these, if the maximum transfer
were to be reached then the copy should not return false (as this will
delete the failed destination file), but instead produce a warning and
return true. 

I have attached a patch against the git master. Please review and
consider it's inclusion.

I apologise if there is an issue with this report as it is my first time
submitting a patch to coreutils. 

Thanks,
Adam Shore
[file_too_large_warning.patch (text/x-patch, attachment)]

Information forwarded to bug-coreutils <at> gnu.org:
bug#18316; Package coreutils. (Fri, 22 Aug 2014 16:37:01 GMT) Full text and rfc822 format available.

Message #8 received at 18316 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: adamjsho <at> gmail.com, 18316 <at> debbugs.gnu.org
Subject: Re: bug#18316: [PATCH] warn on too large file copies
Date: Fri, 22 Aug 2014 09:36:11 -0700
Adam wrote:
> I believe when copying files as large as these, if the maximum transfer
> were to be reached then the copy should not return false (as this will
> delete the failed destination file), but instead produce a warning and
> return true.

Sorry, I don't see the bug.  max_n_nread is normally UINTMAX_MAX, which 
is effectively infinity, so it shouldn't be an issue then.  When 
max_n_read is less than UINTMAX_MAX, we're copying one extent of a 
sparse file, and it's not an error to read less than the extent we 
expect to find, as that can happen when the file changed as we read it.

Perhaps cp should optionally warn if the input file changes while we 
read it, but that's a bigger change and one that should be done elsewhere.




Information forwarded to bug-coreutils <at> gnu.org:
bug#18316; Package coreutils. (Fri, 22 Aug 2014 16:43:01 GMT) Full text and rfc822 format available.

Message #11 received at 18316 <at> debbugs.gnu.org (full text, mbox):

From: Pádraig Brady <P <at> draigBrady.com>
To: adamjsho <at> gmail.com
Cc: 18316 <at> debbugs.gnu.org
Subject: Re: bug#18316: [PATCH] warn on too large file copies
Date: Fri, 22 Aug 2014 17:41:53 +0100
On 08/22/2014 04:56 PM, Adam wrote:
> When files are being copied, they can hit the following loop:
> 
>   while (max_n_read)
>     { 
>       ...
>       // return true if file finishes copying
>       // return false if error occurs     
>       // copy some file
>       // deduct bytes copied from max_n_read
>       ...
>     }
> 
>   return true;
> 
> If max_n_read reaches 0, the copy does not return a warning or a failure
> but instead returns that the copy completed successfully.
> 
> I believe when copying files as large as these, if the maximum transfer
> were to be reached then the copy should not return false (as this will
> delete the failed destination file), but instead produce a warning and
> return true. 
> 
> I have attached a patch against the git master. Please review and
> consider it's inclusion.
> 
> I apologise if there is an issue with this report as it is my first time
> submitting a patch to coreutils. 

I think that would issue a false warning when using extent_copy(),
which is used on sparse files on many file systems.

In the other case UINTMAX_MAX is passed and so should be large
enough for any regular file, though I suppose this is a
theoretical concern if cp was being used between devices for example.
I.E. it would stop without any error after about 19EB on
both 32 and 64 bit platforms. That's the case you're trying to
avoid here right, or is there a more practical concern?

thanks,
Pádraig.





Information forwarded to bug-coreutils <at> gnu.org:
bug#18316; Package coreutils. (Fri, 22 Aug 2014 17:02:02 GMT) Full text and rfc822 format available.

Message #14 received at 18316 <at> debbugs.gnu.org (full text, mbox):

From: Adam <adamjsho <at> gmail.com>
To: 18316 <at> debbugs.gnu.org
Subject: Re: bug#18316: [PATCH] warn on too large file copies
Date: Fri, 22 Aug 2014 18:01:02 +0100
> In the other case UINTMAX_MAX is passed and so should be large
> enough for any regular file, though I suppose this is a
> theoretical concern if cp was being used between devices for example.
> I.E. it would stop without any error after about 19EB on
> both 32 and 64 bit platforms. That's the case you're trying to
> avoid here right, or is there a more practical concern?

Yes, this is what I meant. I know it's a rare scenario however I thought
I would still mention it as I can not see harm in adding a check in. 

Apologies for any confusion.

Thanks,
Adam





Information forwarded to bug-coreutils <at> gnu.org:
bug#18316; Package coreutils. (Fri, 22 Aug 2014 18:10:02 GMT) Full text and rfc822 format available.

Message #17 received at 18316 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: adamjsho <at> gmail.com, 18316 <at> debbugs.gnu.org
Subject: Re: bug#18316: [PATCH] warn on too large file copies
Date: Fri, 22 Aug 2014 11:09:39 -0700
Adam wrote:
> I know it's a rare scenario

The next time you copy a file containing 19 exabytes let us know.  :-)

If we're going to fix this, I suggest removing the limit entirely; 
that's better than generating a warning if the limit is gone past.  I 
expect there are plenty of places in the code that stop working after 
2**63 bytes, and I'd focus on them first.




Information forwarded to bug-coreutils <at> gnu.org:
bug#18316; Package coreutils. (Fri, 22 Aug 2014 19:13:02 GMT) Full text and rfc822 format available.

Message #20 received at 18316 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Pádraig Brady <P <at> draigBrady.com>, 
 adamjsho <at> gmail.com
Cc: 18316 <at> debbugs.gnu.org
Subject: Re: bug#18316: [PATCH] warn on too large file copies
Date: Fri, 22 Aug 2014 12:11:56 -0700
[Message part 1 (text/plain, inline)]
That reminds me, we should be avoiding types like uint64_t unless 
they're really needed.  The fiemap code is bound tightly to the Linux 
kernel so I guess uint64_t is OK there, but the portable code should use 
higher-level types like off_t.  I installed the attached patch.  No 
doubt further cleanups like this could be made but one thing at a time.
[0001-maint-avoid-int64_t-and-similar-types-unless-they-re.patch (text/plain, attachment)]

Information forwarded to bug-coreutils <at> gnu.org:
bug#18316; Package coreutils. (Fri, 22 Aug 2014 19:39:02 GMT) Full text and rfc822 format available.

Message #23 received at 18316 <at> debbugs.gnu.org (full text, mbox):

From: Eric Blake <eblake <at> redhat.com>
To: Paul Eggert <eggert <at> cs.ucla.edu>, adamjsho <at> gmail.com, 18316 <at> debbugs.gnu.org
Subject: Re: bug#18316: [PATCH] warn on too large file copies
Date: Fri, 22 Aug 2014 13:38:04 -0600
[Message part 1 (text/plain, inline)]
On 08/22/2014 12:09 PM, Paul Eggert wrote:
> Adam wrote:
>> I know it's a rare scenario
> 
> The next time you copy a file containing 19 exabytes let us know.  :-)

By the way, even at a lightning-fast rate of a gigabyte a second, such a
copy wouldn't complete in your lifetime.

> 
> If we're going to fix this, I suggest removing the limit entirely;
> that's better than generating a warning if the limit is gone past.  I
> expect there are plenty of places in the code that stop working after
> 2**63 bytes, and I'd focus on them first.

The size of 2**63 is so huge that it is as good as unlimited, for
anything we can do in a finite lifetime, and we don't have to bend over
backwards trying to treat it as an artificial limitation.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org

[signature.asc (application/pgp-signature, attachment)]

Information forwarded to bug-coreutils <at> gnu.org:
bug#18316; Package coreutils. (Fri, 22 Aug 2014 23:39:01 GMT) Full text and rfc822 format available.

Message #26 received at 18316 <at> debbugs.gnu.org (full text, mbox):

From: Pádraig Brady <P <at> draigBrady.com>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: adamjsho <at> gmail.com, 18316 <at> debbugs.gnu.org
Subject: Re: bug#18316: [PATCH] warn on too large file copies
Date: Sat, 23 Aug 2014 00:38:16 +0100
On 08/22/2014 08:11 PM, Paul Eggert wrote:
> From 46418ecec04ca61647d6750a0a0fe862f0ae69c1 Mon Sep 17 00:00:00 2001
> From: Paul Eggert <eggert <at> cs.ucla.edu>
> Date: Fri, 22 Aug 2014 12:07:11 -0700
> Subject: [PATCH] maint: avoid int64_t and similar types unless they're needed
> 
> C11 doesn't require them, even POSIX doesn't strictly require the
> 64-bit versions, and it makes the code a bit clearer if they're
> used only when needed.
> * src/copy.c (write_zeros, extent_copy):
> * src/extent-scan.h (struct extent_info.ext_length):
> Use off_t, not uint64_t, for a value derived from a file offset.
> * src/extent-scan.h (struct extent_info.ext_flags)
> Prefer plain unsigned int to uint32_t where either will do.
> (struct extent_scan.ei_count):
> Use size_t, not uint32_t, for a value bounded by SIZE_MAX.
> * src/factor.c (MAGIC64, MAGIC63, MAGIC65):
> Remove unnecessary casts to uint64_t.
> ---
>  src/copy.c        | 10 +++++-----
>  src/extent-scan.h |  8 ++++----
>  src/factor.c      |  6 +++---
>  3 files changed, 12 insertions(+), 12 deletions(-)
> 
> diff --git a/src/copy.c b/src/copy.c
> index 26d5bdd..a9561c6 100644
> --- a/src/copy.c
> +++ b/src/copy.c
> @@ -251,7 +251,7 @@ clone_file (int dest_fd, int src_fd)
>  /* Write N_BYTES zero bytes to file descriptor FD.  Return true if successful.
>     Upon write failure, set errno and return false.  */
>  static bool
> -write_zeros (int fd, uint64_t n_bytes)
> +write_zeros (int fd, off_t n_bytes)
>  {
>    static char *zeros;
>    static size_t nz = IO_BUFSIZE;
> @@ -272,7 +272,7 @@ write_zeros (int fd, uint64_t n_bytes)
>  
>    while (n_bytes)
>      {
> -      uint64_t n = MIN (nz, n_bytes);
> +      size_t n = MIN (nz, n_bytes);
>        if ((full_write (fd, zeros, n)) != n)
>          return false;
>        n_bytes -= n;

The above are fine and good.

> @@ -296,7 +296,7 @@ extent_copy (int src_fd, int dest_fd, char *buf, size_t buf_size,
>  {
>    struct extent_scan scan;
>    off_t last_ext_start = 0;
> -  uint64_t last_ext_len = 0;
> +  off_t last_ext_len = 0;
>  
>    /* Keep track of the output position.
>       We may need this at the end, for a final ftruncate.  */
> @@ -330,8 +330,8 @@ extent_copy (int src_fd, int dest_fd, char *buf, size_t buf_size,
>        for (i = 0; i < scan.ei_count || empty_extent; i++)
>          {
>            off_t ext_start;
> -          uint64_t ext_len;
> -          uint64_t hole_size;
> +          off_t ext_len;
> +          off_t hole_size;
>
>            if (i < scan.ei_count)
>              {
> diff --git a/src/extent-scan.h b/src/extent-scan.h
> index fa80034..73202a9 100644
> --- a/src/extent-scan.h
> +++ b/src/extent-scan.h
> @@ -26,10 +26,10 @@ struct extent_info
>    off_t ext_logical;
>  
>    /* Extent length.  */
> -  uint64_t ext_length;
> +  off_t ext_length;

The off_t changes assume we compile with -D_FILE_OFFSET_BITS=64 on 32 bit,
which we do, but it's a bit coupled.

>  
>    /* Extent flags, use it for FIEMAP only, or set it to zero.  */
> -  uint32_t ext_flags;
> +  unsigned int ext_flags;
>  };
>  
>  /* Structure used to reserve extent scan information per file.  */
> @@ -42,10 +42,10 @@ struct extent_scan
>    off_t scan_start;
>  
>    /* Flags to use for scan.  */
> -  uint32_t fm_flags;
> +  unsigned int fm_flags;

We check a few low bit flags which is OK though a bit coupled,
and we also merge extents if the flags are the same which could
cause invalid merging on 32 bit if the flags ever expanded to 64
bits which they could theoretically do as there is reserved space for
that in the fiemap structures.  I suppose we could protect against
that unlikely possibility with:

diff --git a/src/extent-scan.c b/src/extent-scan.c
index 805997a..3af8f99 100644
--- a/src/extent-scan.c
+++ b/src/extent-scan.c
@@ -140,6 +140,8 @@ extent_scan_read (struct extent_scan *scan)
           assert (fm_extents[i].fe_logical
                   <= OFF_T_MAX - fm_extents[i].fe_length);

+          verify (sizeof last_ei->ext_flags >= sizeof fm_extents->fe_flags);
+
           if (si && last_ei->ext_flags
               == (fm_extents[i].fe_flags & ~FIEMAP_EXTENT_LAST)
               && (last_ei->ext_logical + last_ei->ext_length

>    /* How many extent info returned for a scan.  */
> -  uint32_t ei_count;
> +  size_t ei_count;

We verify ei_count within SIZE_MAX anyway so this is good.

>    /* If true, fall back to a normal copy, either set by the
>       failure of ioctl(2) for FIEMAP or lseek(2) with SEEK_DATA.  */
> diff --git a/src/factor.c b/src/factor.c
> index 63924d5..f7beaeb 100644
> --- a/src/factor.c
> +++ b/src/factor.c
> @@ -1811,9 +1811,9 @@ isqrt2 (uintmax_t nh, uintmax_t nl)
>  }
>  
>  /* MAGIC[N] has a bit i set iff i is a quadratic residue mod N. */
> -#define MAGIC64 ((uint64_t) 0x0202021202030213ULL)
> -#define MAGIC63 ((uint64_t) 0x0402483012450293ULL)
> -#define MAGIC65 ((uint64_t) 0x218a019866014613ULL)
> +#define MAGIC64 0x0202021202030213ULL
> +#define MAGIC63 0x0402483012450293ULL
> +#define MAGIC65 0x218a019866014613ULL

+1

In summary all the changes look good,
though I might push the above amendment for defensive reasons.

cheers,
Pádraig.




Information forwarded to bug-coreutils <at> gnu.org:
bug#18316; Package coreutils. (Sat, 23 Aug 2014 00:06:02 GMT) Full text and rfc822 format available.

Message #29 received at 18316 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Pádraig Brady <P <at> draigBrady.com>
Cc: adamjsho <at> gmail.com, 18316 <at> debbugs.gnu.org
Subject: Re: bug#18316: [PATCH] warn on too large file copies
Date: Fri, 22 Aug 2014 17:05:07 -0700
Pádraig Brady wrote:
> +          verify (sizeof last_ei->ext_flags >= sizeof fm_extents->fe_flags);

Sure, that looks good.

We will always use large-file compilation, so I wouldn't worry about the 
other thing.




Added tag(s) fixed. Request was from Assaf Gordon <assafgordon <at> gmail.com> to control <at> debbugs.gnu.org. (Wed, 10 Oct 2018 15:54:03 GMT) Full text and rfc822 format available.

bug closed, send any further explanations to 18316 <at> debbugs.gnu.org and adamjsho <at> gmail.com Request was from Assaf Gordon <assafgordon <at> gmail.com> to control <at> debbugs.gnu.org. (Wed, 10 Oct 2018 15:54:05 GMT) Full text and rfc822 format available.

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Thu, 08 Nov 2018 12:24:05 GMT) Full text and rfc822 format available.

This bug report was last modified 6 years and 309 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.