GNU bug report logs - #25814
Windows same_file macro is not reliable

Previous Next

Package: diffutils;

Reported by: Kees Dekker <Kees.Dekker <at> infor.com>

Date: Mon, 20 Feb 2017 16:02:01 UTC

Severity: normal

To reply to this bug, email your comments to 25814 AT debbugs.gnu.org.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox


Report forwarded to bug-diffutils <at> gnu.org:
bug#25814; Package diffutils. (Mon, 20 Feb 2017 16:02:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to Kees Dekker <Kees.Dekker <at> infor.com>:
New bug report received and forwarded. Copy sent to bug-diffutils <at> gnu.org. (Mon, 20 Feb 2017 16:02:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Kees Dekker <Kees.Dekker <at> infor.com>
To: "bug-diffutils <at> gnu.org" <bug-diffutils <at> gnu.org>
Subject: Windows same_file macro is not reliable
Date: Mon, 20 Feb 2017 16:01:09 +0000
[Message part 1 (text/plain, inline)]
Hi,

On file systems that do not support inodes (e.g. NTFS, because not everything is POSIX), the same_file() macro (in system.h) is incorrect as st_ino (and probably st_dev) are meaningless. See also https://msdn.microsoft.com/en- us/library/14h5k7ff.aspx<https://msdn.microsoft.com/en-%20us/library/14h5k7ff.aspx>. I would suggest to add an #ifdef _WIN32 macro that let return same_file() 0.

The same applies to same_file_attributes macro. The st_uid and st_gid fields are never set to a useful value on Windows (see MSDN URL as mentioned before).

The resulting suggested code would be:

#if _WIN32
# define same_file(s, t) 0
#else
# define same_file(s, t) \
    ((((s)->st_ino == (t)->st_ino) && ((s)->st_dev == (t)->st_dev)) \
     || same_special_file (s, t))
#endif

and

#ifndef same_file_attributes
#if _WIN32
# define same_file_attributes(s, t) 0
#else
# define same_file_attributes(s, t) \
   ((s)->st_mode == (t)->st_mode \
    && (s)->st_nlink == (t)->st_nlink \
    && (s)->st_uid == (t)->st_uid \
    && (s)->st_gid == (t)->st_gid \
    && (s)->st_size == (t)->st_size \
    && (s)->st_mtime == (t)->st_mtime \
    && (s)->st_ctime == (t)->st_ctime)
#endif
#endif

Since I'm unfamiliar with git, please check whether this change is feasible.

Regards,
Kees
[Message part 2 (text/html, inline)]

Information forwarded to bug-diffutils <at> gnu.org:
bug#25814; Package diffutils. (Mon, 20 Feb 2017 16:54:02 GMT) Full text and rfc822 format available.

Message #8 received at 25814 <at> debbugs.gnu.org (full text, mbox):

From: Eric Blake <eblake <at> redhat.com>
To: Kees Dekker <Kees.Dekker <at> infor.com>, 25814 <at> debbugs.gnu.org
Subject: Re: [bug-diffutils] bug#25814: Windows same_file macro is not reliable
Date: Mon, 20 Feb 2017 10:53:51 -0600
[Message part 1 (text/plain, inline)]
On 02/20/2017 10:01 AM, Kees Dekker wrote:
> Hi,
> 
> On file systems that do not support inodes (e.g. NTFS,

NTFS supports inodes; Cygwin uses them.  Just because Window's native
stat() is broken does not mean NTFS is broken.

> because not everything is POSIX), the same_file() macro (in system.h) is incorrect as st_ino (and probably st_dev) are meaningless.

Rather, instead of ignoring inode, it would be nicer (but indeed a more
complex task) to have gnulib work around window's broken stat() to
provide a version that works instead.  It only matters for file systems
that support hard links (like NTFS); on file systems like FAT that lack
hard links, hard-coding that inodes don't work is okay, but on NTFS
where hard-links are supported, treating same_file() as always returning
0 gives wrong results.

At any rate, this is an issue that needs to be resolved in gnulib, as
more than just diffutils is affected by it.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org

[signature.asc (application/pgp-signature, attachment)]

Information forwarded to bug-diffutils <at> gnu.org:
bug#25814; Package diffutils. (Tue, 21 Feb 2017 09:22:02 GMT) Full text and rfc822 format available.

Message #11 received at 25814 <at> debbugs.gnu.org (full text, mbox):

From: Kees Dekker <Kees.Dekker <at> infor.com>
To: Eric Blake <eblake <at> redhat.com>, "25814 <at> debbugs.gnu.org"
 <25814 <at> debbugs.gnu.org>
Subject: RE: [bug-diffutils] bug#25814: Windows same_file macro is not reliable
Date: Tue, 21 Feb 2017 09:21:29 +0000
> NTFS supports inodes; Cygwin uses them.  Just because Window's native
> stat() is broken does not mean NTFS is broken.

NTFS is not broken, but according to MSDN, it does not support inodes. The fact that Cygwin supports is, is more how Cygwin implemented fstat/struct fstat (more in the *NIX way). Since Cygwin is not an option for us (everything is native Windows/Visual Studio), it does not help me in saying 'Cygwin is a solution'. I know that the GNU guys like Cygwin, but on Windows, it is beyond reality to expect that Cygwin on Windows is a synonym to native Windows. It is not.

>> because not everything is POSIX), the same_file() macro (in system.h) is incorrect as st_ino (and probably st_dev) are meaningless.

>Rather, instead of ignoring inode, it would be nicer (but indeed a more
>complex task) to have gnulib work around window's broken stat() to
>provide a version that works instead.  It only matters for file systems
>that support hard links (like NTFS); on file systems like FAT that lack
>hard links, hard-coding that inodes don't work is okay, but on NTFS
>where hard-links are supported, treating same_file() as always returning
>0 gives wrong results.

>At any rate, this is an issue that needs to be resolved in gnulib, as
>more than just diffutils is affected by it.

I don't know how Cygwin implements stat(), but maybe they use something like GetFileInformationByHandle() (see: https://msdn.microsoft.com/nl-nl/library/windows/desktop/aa363788(v=vs.85).aspx). Anyhow, it would be a good idea to improve gnulib for support (if you like to corset Windows in a POSIX form), but the fix in diff is now very simple. Just rewrite the macros as I described before. For the longer term, it may be worth to wait for a gnulib fix, but that is (for me for now) too long to wait. I'm not understanding why returning 0 for same_file causes problems. It just results in an actual compare. That is some inefficiency, but not erroneous isn't it?

Regards,
Kees




Information forwarded to bug-diffutils <at> gnu.org:
bug#25814; Package diffutils. (Tue, 21 Feb 2017 14:19:02 GMT) Full text and rfc822 format available.

Message #14 received at 25814 <at> debbugs.gnu.org (full text, mbox):

From: Eric Blake <eblake <at> redhat.com>
To: Kees Dekker <Kees.Dekker <at> infor.com>,
 "25814 <at> debbugs.gnu.org" <25814 <at> debbugs.gnu.org>
Subject: Re: [bug-diffutils] bug#25814: Windows same_file macro is not reliable
Date: Tue, 21 Feb 2017 08:18:46 -0600
[Message part 1 (text/plain, inline)]
On 02/21/2017 03:21 AM, Kees Dekker wrote:
>> NTFS supports inodes; Cygwin uses them.  Just because Window's native
>> stat() is broken does not mean NTFS is broken.
> 
> NTFS is not broken, but according to MSDN, it does not support inodes. The fact that Cygwin supports is, is more how Cygwin implemented fstat/struct fstat (more in the *NIX way). Since Cygwin is not an option for us (everything is native Windows/Visual Studio), it does not help me in saying 'Cygwin is a solution'. I know that the GNU guys like Cygwin, but on Windows, it is beyond reality to expect that Cygwin on Windows is a synonym to native Windows. It is not.

[Please configure your mailer to wrap long lines]

I didn't say you had to use Cygwin, but was rather pointing out that
Cygwin is able to implement stat() on top of NTFS in a manner that
exposes NTFS' native inodes, so the native inodes exist and are
available through native Windows API (how else would cygwin be able to
get it).  And therefore it should be possible to write a gnulib stat()
replacement for mingw that uses the same tricks as cygwin uses to get at
the raw NTFS inode information.

>> At any rate, this is an issue that needs to be resolved in gnulib, as
>> more than just diffutils is affected by it.
> 
> I don't know how Cygwin implements stat(),

It's open source, so you can find out by reading the source code.

https://cygwin.com/git/gitweb.cgi?p=newlib-cygwin.git;a=blob;f=winsup/cygwin/fhandler_disk_file.cc#l220

inline ino_t
path_conv::get_ino_by_handle (HANDLE hdl)
{
  IO_STATUS_BLOCK io;
  FILE_INTERNAL_INFORMATION fii;

  if (NT_SUCCESS (NtQueryInformationFile (hdl, &io, &fii, sizeof fii,
                                          FileInternalInformation))
      && isgood_inode (fii.IndexNumber.QuadPart))
    return fii.IndexNumber.QuadPart;
  return 0;
}

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org

[signature.asc (application/pgp-signature, attachment)]

Information forwarded to bug-diffutils <at> gnu.org:
bug#25814; Package diffutils. (Tue, 21 Feb 2017 16:32:02 GMT) Full text and rfc822 format available.

Message #17 received at 25814 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Kees Dekker <Kees.Dekker <at> infor.com>, Eric Blake <eblake <at> redhat.com>,
 "25814 <at> debbugs.gnu.org" <25814 <at> debbugs.gnu.org>
Subject: Re: [bug-diffutils] bug#25814: bug#25814: Windows same_file macro is
 not reliable
Date: Tue, 21 Feb 2017 08:30:53 -0800
On 02/21/2017 01:21 AM, Kees Dekker wrote:
> the fix in diff is now very simple

That may be, but it's bad software engineering practice to apply a bunch 
of painful-to-maintain "simple" changes to diffutils and other programs 
when the real problem is elsewhere. Let's fix the real problem. Surely 
it is a simple fix to Gnulib.





Information forwarded to bug-diffutils <at> gnu.org:
bug#25814; Package diffutils. (Tue, 21 Feb 2017 16:33:02 GMT) Full text and rfc822 format available.

Message #20 received at 25814 <at> debbugs.gnu.org (full text, mbox):

From: Kees Dekker <Kees.Dekker <at> infor.com>
To: Paul Eggert <eggert <at> cs.ucla.edu>, Eric Blake <eblake <at> redhat.com>,
 "25814 <at> debbugs.gnu.org" <25814 <at> debbugs.gnu.org>
Subject: RE: [bug-diffutils] bug#25814: bug#25814: Windows same_file macro
 is not reliable
Date: Tue, 21 Feb 2017 16:32:26 +0000
>That may be, but it's bad software engineering practice to apply a bunch 
>of painful-to-maintain "simple" changes to diffutils and other programs 
>when the real problem is elsewhere. Let's fix the real problem. Surely 
>it is a simple fix to Gnulib.

I'm unfortunately lacking of time. Do I need to do something else to put this on a TODO list of Gnulib?


Information forwarded to bug-diffutils <at> gnu.org:
bug#25814; Package diffutils. (Tue, 21 Feb 2017 16:58:02 GMT) Full text and rfc822 format available.

Message #23 received at 25814 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Kees Dekker <Kees.Dekker <at> infor.com>, Eric Blake <eblake <at> redhat.com>,
 "25814 <at> debbugs.gnu.org" <25814 <at> debbugs.gnu.org>
Subject: Re: [bug-diffutils] bug#25814: bug#25814: Windows same_file macro is
 not reliable
Date: Tue, 21 Feb 2017 08:57:33 -0800
On 02/21/2017 08:32 AM, Kees Dekker wrote:
> Do I need to do something else to put this on a TODO list of Gnulib?

I suggest sending email to bug-gnulib <at> gnu.org. It should be 
self-contained, i.e., assume readers are unfamiliar with this thread.





This bug report was last modified 8 years and 115 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.