GNU bug report logs - #25146: grep unusable on mingw - SAME_INODE woes
> grep snapshot:
> http://meyering.net/grep/grep-ss.tar.xz 1.4 MB
> http://meyering.net/grep/grep-ss.tar.xz.sig
> http://meyering.net/grep/grep-2.26.39-ae3f.tar.xz
This release, built for mingw, is hardly usable:
- 33 out of 107 tests fail,
- A simple "grep.exe o xx > yy" fails with the error
    grep.exe: input file 'xx' is also the output
More details:
- This happens both in a Cygwin mintty.exe window and in a cmd.exe window.
- It's the same for 32-bit mingw builds and 64-bit mingw builds
(recipe: http://git.savannah.gnu.org/gitweb/?p=gperf.git;a=blob_plain;f=README.windows;hb=HEAD )
- The error is signalled in grep.c:1874.
At this point, 'st' (of type 'struct _stat64') contains
    { st_dev = 0, st_ino = 0,
      st_mode = 0x81B6 = _S_IFREG | _S_IREAD | _S_IWRITE | 0x36,
      st_nlink = 1,
      st_uid = 0, st_gid = 0, st_rdev = 0, st_size = 4,
      st_atime = 1481099615, st_mtime = 1481099615, st_ctime = 1481099615 }
Obviously, such a struct cannot reliably distinguish two different regular files.
In other words, SAME_INODE cannot work.
- So, how do you determine identity of files in Windows?
http://stackoverflow.com/questions/562701/best-way-to-determine-if-two-path-reference-to-same-file-in-windows
But even this is wrong: a BY_HANDLE_FILE_INFORMATION is not sufficient,
because it contains only a 64-bit file identifier (nFileIndexHigh and
nFileIndexLow). See https://msdn.microsoft.com/en-us/library/windows/desktop/aa363788(v=vs.85).aspx
The best approach is to use GetFileInformationByHandleEx to produce a
FILE_ID_INFO, which holds a 128-bit file ID together with the volume
serial number.
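
To illustrate the SAME_INODE point above, here is a minimal sketch. The
'struct mini_stat' type and the function name are made up for this
illustration; only the macro body mirrors the comparison in gnulib's
same-inode.h. It assumes that a second, different file reports the same
st_dev = 0 and st_ino = 0 as the dump of 'xx' shown above.

```c
#include <stdbool.h>

/* Simplified stand-in for the two identity fields of 'struct _stat64'.
   Hypothetical type, for illustration only.  */
struct mini_stat
{
  unsigned int st_dev;
  unsigned short st_ino;
};

/* The same field-by-field comparison that SAME_INODE performs.  */
#define SAME_INODE(a, b) \
  ((a).st_ino == (b).st_ino && (a).st_dev == (b).st_dev)

/* Returns true if two distinct files whose stat results both carry
   st_dev = 0 and st_ino = 0 are (wrongly) taken to be the same file.  */
bool
zeroed_stats_look_identical (void)
{
  struct mini_stat xx = { 0, 0 };  /* values observed for 'xx' */
  struct mini_stat yy = { 0, 0 };  /* assumed: 'yy' reports the same */

  return SAME_INODE (xx, yy);
}
```

The function returns true, which matches the bogus "input file 'xx' is also
the output" diagnostic.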
Find attached a proof-of-concept patch. (Really rough - needs
-D_WIN32_WINNT=_WIN32_WINNT_WIN8, and lacks good error handling.)
With it, I get:
$ ./grep.exe o xx > yy
$ ./grep.exe o xx > xx
grep.exe: input file 'xx' is also the output
That is, now the detection of identical regular files works.
How can we go forward from here? I would propose a gnulib module that defines
a data structure that combines a 'struct stat' with the FILE_ID_INFO for native
Windows, and rebase the 'same-inode' module on it.
The other approach, to override mingw's 'struct stat' and stat/fstat/lstat()
functions, would imply a performance hit to all stat calls, even those that
don't want to access the st_ino field.
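
The first proposal, a data structure that carries the FILE_ID_INFO next to
the 'struct stat', could be sketched as follows. The names 'stat_with_id'
and 'same_file_identity' are hypothetical and do not exist in gnulib; the
sketch only shows the shape of the idea, and how the structure gets filled
in is left open.

```c
#include <stdbool.h>
#include <string.h>
#include <sys/stat.h>

#if defined _WIN32 && !defined __CYGWIN__
# include <windows.h>
#endif

/* Hypothetical type: a 'struct stat' plus, on native Windows only,
   the FILE_ID_INFO of the same file.  */
struct stat_with_id
{
  struct stat st;
#if defined _WIN32 && !defined __CYGWIN__
  FILE_ID_INFO id;
#endif
};

/* What a SAME_INODE rebased on this type could boil down to.  */
bool
same_file_identity (const struct stat_with_id *a,
                    const struct stat_with_id *b)
{
#if defined _WIN32 && !defined __CYGWIN__
  /* st_dev/st_ino are unreliable here; compare the Windows file ID.  */
  return (a->id.VolumeSerialNumber == b->id.VolumeSerialNumber
          && memcmp (&a->id.FileId, &b->id.FileId,
                     sizeof a->id.FileId) == 0);
#else
  return a->st.st_ino == b->st.st_ino && a->st.st_dev == b->st.st_dev;
#endif
}
```

With this layout, only callers that need file identity pay for the extra
GetFileInformationByHandleEx call; plain stat calls stay as they are.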
Bruno
[grep-same-inode-fix.diff (text/x-patch, attachment)]
This bug report was last modified 8 years and 180 days ago.