GNU bug report logs - #21084
rm appears to no longer be POSIX compliant (as of 2013 edition) re: deleting empty dirs and files under <path>/.

Package: coreutils;

Reported by: Linda Walsh <coreutils <at> tlinx.org>

Date: Sat, 18 Jul 2015 07:57:01 UTC

Severity: normal

Tags: notabug

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 21084 in the body.
You can then email your comments to 21084 AT debbugs.gnu.org in the normal way.

Report forwarded to bug-coreutils <at> gnu.org:
bug#21084; Package coreutils. (Sat, 18 Jul 2015 07:57:02 GMT)

Acknowledgement sent to Linda Walsh <coreutils <at> tlinx.org>:
New bug report received and forwarded. Copy sent to bug-coreutils <at> gnu.org. (Sat, 18 Jul 2015 07:57:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Linda Walsh <coreutils <at> tlinx.org>
To: bug-coreutils <at> gnu.org
Subject: rm appears to no longer be POSIX compliant (as of 2013 edition) re:
 deleting empty dirs and files under <path>/.
Date: Sat, 18 Jul 2015 00:55:46 -0700
In looking at the 2013 specification for rm
(http://pubs.opengroup.org/onlinepubs/9699919799/utilities/rm.html),

it no longer says to stop processing if the path basename equals
"." or "..". 

It says that the entries "." and ".." shall not be removed.  It
also says rm <empty dir> shall behave like "rmdir" -- i.e. it will
delete empty directories.

But in the case of foo/. it would be expected to process child inodes
before processing the directory itself.

But step 4 on that page says that rm should remove empty directories
without requiring other special switches.







Information forwarded to bug-coreutils <at> gnu.org:
bug#21084; Package coreutils. (Sat, 18 Jul 2015 08:44:01 GMT)

Message #8 received at 21084 <at> debbugs.gnu.org:

From: Andreas Schwab <schwab <at> linux-m68k.org>
To: Linda Walsh <coreutils <at> tlinx.org>
Cc: 21084 <at> debbugs.gnu.org
Subject: Re: bug#21084: rm appears to no longer be POSIX compliant (as of 2013
 edition) re: deleting empty dirs and files under <path>/.
Date: Sat, 18 Jul 2015 10:43:28 +0200
Linda Walsh <coreutils <at> tlinx.org> writes:

> In looking at the 2013 specification for rm
> (http://pubs.opengroup.org/onlinepubs/9699919799/utilities/rm.html),
>
> it no longer says to stop processing if the path basename equals
> "." or "..". 

"If either of the files dot or dot-dot are specified as the basename
portion of an operand (that is, the final pathname component) or if an
operand resolves to the root directory, rm shall write a diagnostic
message to standard error and do nothing more with such operands."
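
The effect of this rule can be checked in a scratch directory. The following is a minimal sketch, assuming an rm that implements the quoted POSIX text (GNU coreutils does); the directory and file names are made up for illustration.

```shell
# Work in a throwaway directory so nothing real is at risk.
sandbox=$(mktemp -d)
mkdir "$sandbox/foo"
touch "$sandbox/foo/file"

# The operand's final component is ".", so rm should only print a
# diagnostic and leave the contents of foo alone.
status=0
rm -r "$sandbox/foo/." 2>/dev/null || status=$?

# Record what survived before cleaning up.
if [ -e "$sandbox/foo/file" ]; then survived=yes; else survived=no; fi
echo "exit status: $status, file survived: $survived"

rm -r "$sandbox"
```

The check is made on the operand's name before any traversal starts, which is why the file inside foo is never touched.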

Andreas.

-- 
Andreas Schwab, schwab <at> linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."




Information forwarded to bug-coreutils <at> gnu.org:
bug#21084; Package coreutils. (Sat, 18 Jul 2015 08:47:01 GMT)

Message #11 received at 21084 <at> debbugs.gnu.org:

From: Andreas Schwab <schwab <at> linux-m68k.org>
To: Linda Walsh <coreutils <at> tlinx.org>
Cc: 21084 <at> debbugs.gnu.org
Subject: Re: bug#21084: rm appears to no longer be POSIX compliant (as of 2013
 edition) re: deleting empty dirs and files under <path>/.
Date: Sat, 18 Jul 2015 10:46:13 +0200
Linda Walsh <coreutils <at> tlinx.org> writes:

> But step 4 on that page says that rm should remove empty directories
> without requiring other special switches.

Please read step 2a.

Andreas.

-- 
Andreas Schwab, schwab <at> linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."




Added tag(s) notabug. Request was from Paul Eggert <eggert <at> cs.ucla.edu> to control <at> debbugs.gnu.org. (Sat, 18 Jul 2015 15:48:02 GMT)

bug closed, send any further explanations to 21084 <at> debbugs.gnu.org and Linda Walsh <coreutils <at> tlinx.org> Request was from Paul Eggert <eggert <at> cs.ucla.edu> to control <at> debbugs.gnu.org. (Sat, 18 Jul 2015 15:48:02 GMT)

Information forwarded to bug-coreutils <at> gnu.org:
bug#21084; Package coreutils. (Sat, 18 Jul 2015 22:12:02 GMT)

Message #18 received at 21084 <at> debbugs.gnu.org:

From: Linda Walsh <coreutils <at> tlinx.org>
To: Andreas Schwab <schwab <at> linux-m68k.org>
Cc: 21084 <at> debbugs.gnu.org
Subject: Re: bug#21084: rm appears to no longer be POSIX compliant (as of
 2013 edition) re: deleting empty dirs and files under <path>/.
Date: Sat, 18 Jul 2015 15:11:03 -0700
reopen 21084
thanks

Andreas Schwab wrote:
> Linda Walsh <coreutils <at> tlinx.org> writes:
> 
>> In looking at the 2013 specification for rm
>> (http://pubs.opengroup.org/onlinepubs/9699919799/utilities/rm.html),
>>
>> it no longer says to stop processing if the path basename equals
>> "." or "..". 
> 
> "If either of the files dot or dot-dot are specified as the basename
> portion of an operand (that is, the final pathname component) or if an
> operand resolves to the root directory, rm shall write a diagnostic
> message to standard error and do nothing more with such operands."
----
	I'll grant it also says you can't remove "/".

	So a special flag "--use_depth_first_inspection" that says not to look at
a "basename" until its children have been processed wouldn't be any more
out of place than special flags to handle "/" processing, right?

	The fact that they put the ".", ".." and "/" together, outside of
the 1-4 processing leads one to the idea that they should be treated similarly,
no?






Information forwarded to bug-coreutils <at> gnu.org:
bug#21084; Package coreutils. (Fri, 31 Jul 2015 23:58:02 GMT)

Message #21 received at 21084 <at> debbugs.gnu.org:

From: Linda Walsh <coreutils <at> tlinx.org>
To: Andreas Schwab <schwab <at> linux-m68k.org>
Cc: 21084 <at> debbugs.gnu.org
Subject: Re: bug#21084: rm appears to no longer be POSIX compliant (as of
 2013	edition) re: deleting empty dirs and files under <path>/.
Date: Fri, 31 Jul 2015 16:56:54 -0700
Linda Walsh wrote:
> Andreas Schwab wrote:
>> "If either of the files dot or dot-dot are specified as the basename
>> portion of an operand (that is, the final pathname component) or if an
>> operand resolves to the root directory, rm shall write a diagnostic
>> message to standard error and do nothing more with such operands."
> ----
>     I'll grant it also says you can't remove "/",
>
>     So a special flag "--use_depth_first_inspection" that says not to 
> look at
> a "basename" until it's children have been processed wouldn't be any 
> more out of place than special flags to handle "/" processing, right?
>
>     The fact that they put the ".", ".." and "/" together, outside of
> the 1-4 processing leads one to the idea that they should be treated 
> similarly,
> no?
---
   Since there is no opposition to this, I presume, all you need now
is a patch?

I.e. - POSIX now demands that "/", "." and ".." all be ignored in a
basename, yet some smart GNU folks decided that keeping a non-default
option to override the new dumbed-down restrictions would best serve
the community.

So I might reason that they would be equally smart and/or use similar logic
to allow a non-default option to remove the dumb-down on the "." path.

NOTE: I have no issue with NOT _attempting_ a delete on "." after doing
the designed depth-first traversal.  Applying the POSIX restriction
on not attempting to delete "." makes perfect sense to me, since
I know that doing so can give inconsistent and "undefined" behavior
depending on the OS.  But "." is also a semantic placeholder that lets
one reference a starting point for some action.  Imagine
using 'find' if '.' were banned as a starting point:

 > find '' -type f
 find: ‘’: No such file or directory

*cheers*










Information forwarded to bug-coreutils <at> gnu.org:
bug#21084; Package coreutils. (Sat, 01 Aug 2015 01:30:05 GMT)

Message #24 received at 21084 <at> debbugs.gnu.org:

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Linda Walsh <coreutils <at> tlinx.org>, Andreas Schwab <schwab <at> linux-m68k.org>
Cc: 21084 <at> debbugs.gnu.org
Subject: Re: bug#21084: rm appears to no longer be POSIX compliant (as of
 2013	edition) re: deleting empty dirs and files under <path>/.
Date: Fri, 31 Jul 2015 18:29:48 -0700
Linda Walsh wrote:
>     Since there is no opposition to this, I presume, all you need now
> is a patch?

My impression is that hardly anybody cares about this corner case.

How about the following idea instead?  We could have --no-preserve-root also 
skip the special treatment for '.' and '..'.  That way, we shouldn't need to add 
an option.




Information forwarded to bug-coreutils <at> gnu.org:
bug#21084; Package coreutils. (Sat, 01 Aug 2015 03:14:01 GMT)

Message #27 received at 21084 <at> debbugs.gnu.org:

From: Linda Walsh <coreutils <at> tlinx.org>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: Andreas Schwab <schwab <at> linux-m68k.org>, 21084 <at> debbugs.gnu.org
Subject: Re: bug#21084: rm appears to no longer be POSIX compliant (as of
 2013	edition) re: deleting empty dirs and files under <path>/.
Date: Fri, 31 Jul 2015 20:13:20 -0700

Paul Eggert wrote:
> Linda Walsh wrote:
>>     Since there is no opposition to this, I presume, all you need now
>> is a patch?
> 
> My impression is that hardly anybody cares about this corner case.
> 
> How about the following idea instead?  We could have --no-preserve-root 
> also skip the special treatment for '.' and '..'.  That way, we 
> shouldn't need to add an option.
---

Though I've never had a problem with doing something like
'rm -fr /', I'd prefer not to chance it -- I can't believe 
this is a corner case -- not given the putrid hate spewed
at me by some BSD supporter who pushed through the mandatory restrictions.

Instead of 'rm -fr /', I prefer dd if=/dev/sda2 of=/dev/sda ...wait, where was
my new partition?   ARG!

Another issue I haven't raised yet, because the "." one is more important
to me, is the horrible execution of --one-file-system.

Since "rm -fr --one-file-system foo/." was removed and the suggested
replacement was to use '*'... gee.. so you mean under "foo/" you had
bind mounts to your root, /usr and /home partitions? But --one-file-system
didn't catch them, because they were all presented to 'rm' as cmd-line
args -- and the man page legalese says "when removing a hierarchy
recursively, skip any directory that is on a file system different from
that of the corresponding command line argument".

So rm now goes ahead and deletes all the files you used to be safe
from deleting when dir/. was allowed.  Example on my system... snapshot dir:

Filesystem                          Size  Used Avail Use% Mounted on
/dev/Data/Home-2015.04.22-03.07.02  6.8G  5.5G  1.3G  81% /home/.snapdir/@GMT-2015.04.22-03.07.02
/dev/Data/Home-2015.04.30-03.07.02  1.1G  913M  186M  84% /home/.snapdir/@GMT-2015.04.30-03.07.02
/dev/Data/Home-2015.05.17-13.11.21  762M  647M  115M  85% /home/.snapdir/@GMT-2015.05.17-13.11.21
/dev/Data/Home-2015.05.18-00.40.55  1.2G  981M  193M  84% /home/.snapdir/@GMT-2015.05.18-00.40.55
/dev/Data/Home-2015.05.18-13.05.04  1.7G  1.4G  287M  83% /home/.snapdir/@GMT-2015.05.18-13.05.04
/dev/Data/Home-2015.05.19-04.08.02  1.2G  957M  189M  84% /home/.snapdir/@GMT-2015.05.19-04.08.02
/dev/Data/Home-2015.05.20-04.08.02  922M  774M  149M  84% /home/.snapdir/@GMT-2015.05.20-04.08.02
/dev/Data/Home-2015.05.21-04.08.03  802M  676M  126M  85% /home/.snapdir/@GMT-2015.05.21-04.08.03
/dev/Data/Home-2015.05.22-04.08.02  2.3G  1.9G  421M  82% /home/.snapdir/@GMT-2015.05.22-04.08.02
/dev/Data/Home-2015.05.23-04.08.02  4.5G  3.7G  874M  81% /home/.snapdir/@GMT-2015.05.23-04.08.02
/dev/Data/Home-2015.05.24-04.08.04  7.2G  5.8G  1.4G  81% /home/.snapdir/@GMT-2015.05.24-04.08.04
/dev/Data/Home-2015.05.26-03.39.31  1.3G  1.1G  218M  84% /home/.snapdir/@GMT-2015.05.26-03.39.31
/dev/Data/Home-2015.05.27-04.08.05  5.4G  4.4G  1.1G  82% /home/.snapdir/@GMT-2015.05.27-04.08.05
/dev/Data/Home-2015.06.01-14.19.28  4.1G  3.3G  779M  82% /home/.snapdir/@GMT-2015.06.01-14.19.28
/dev/Data/Home                      1.5T  1.1T  494G  68% /home
/dev/Data/Home-2015.06.02-12.53.34  1.5T  1.1T  502G  68% /home/.snapdir/@GMT-2015.06.02-12.53.34

That's the type of harm caused by removing "cd snapshots && rm -fr --one-file-system ."
or "rm -fr --one-file-system snapshots/."
and telling people it's not needed in rm because the shell's '*' will expand it.  I could
safely use the now-disabled forms in that dir -- sometimes junk builds up there, for example
when something gets copied into a directory whose corresponding partition isn't mounted.

With "snapshots" becoming more "in vogue" -- in another decade or so, 
POSIX will require banning wildcard usage from a shell (if shell-access
hasn't been disabled before that, of course... ;^).

Besides.. with my suggested change, rm would only need 1 new switch, not 2 like '/' did ;-),
though I admit to wanting to add "-x" as well (find, rsync, and maybe others have such a
switch meaning "stay on 1 device", as in -xdev).  And if I had my druthers, using -x would
NOT use the fact that cmdline args were on different filesystems as an excuse to do more
than operate on "one-file-system"... (would it be that hard to check the device IDs of
the cmd-line args before starting a recursive delete based off them?)...

Eh...like I said, for me, just the special option to allow "." or dir/. is far more
important.
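
The device-ID check asked about above could be approximated outside rm. What follows is only a sketch under stated assumptions: the function name clear_dir_one_fs is hypothetical, `stat -c %d` is the GNU stat spelling for printing a device number, and `--one-file-system` is the GNU rm option discussed in this thread. A bind mount of the same file system reports the same device number, so a check like this would not notice it.

```shell
# Hypothetical helper: empty a directory, but only touch entries whose
# device number matches that of the directory itself.
# Assumes GNU stat (-c %d prints the device number) and GNU rm.
clear_dir_one_fs() {
    dir=$1
    top_dev=$(stat -c %d -- "$dir") || return 1
    # The three patterns together match every name except "." and "..".
    for entry in "$dir"/* "$dir"/.[!.]* "$dir"/..?*; do
        # An unmatched pattern stays literal; skip it.
        [ -e "$entry" ] || [ -L "$entry" ] || continue
        entry_dev=$(stat -c %d -- "$entry") || continue
        if [ "$entry_dev" = "$top_dev" ]; then
            rm -fr --one-file-system -- "$entry"
        else
            printf 'skipping %s: on another file system\n' "$entry" >&2
        fi
    done
}

# Demonstration in a scratch directory (everything here is on one device).
sandbox=$(mktemp -d)
mkdir -p "$sandbox/snapshots/sub"
touch "$sandbox/snapshots/plain" "$sandbox/snapshots/.hidden" "$sandbox/snapshots/sub/deep"
clear_dir_one_fs "$sandbox/snapshots"
remaining=$(ls -A "$sandbox/snapshots")
if [ -d "$sandbox/snapshots" ]; then kept_dir=yes; else kept_dir=no; fi
rm -r "$sandbox"
```

Because the comparison is made against the directory being emptied rather than against each operand, a mount point sitting directly under it is skipped instead of becoming its own reference file system.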









Information forwarded to bug-coreutils <at> gnu.org:
bug#21084; Package coreutils. (Sat, 01 Aug 2015 21:18:01 GMT)

Message #30 received at 21084 <at> debbugs.gnu.org:

From: Bernhard Voelker <mail <at> bernhard-voelker.de>
To: Linda Walsh <coreutils <at> tlinx.org>, Paul Eggert <eggert <at> cs.ucla.edu>
Cc: Andreas Schwab <schwab <at> linux-m68k.org>, 21084 <at> debbugs.gnu.org
Subject: Re: bug#21084: rm appears to no longer be POSIX compliant (as of 2013
 edition) re: deleting empty dirs and files under <path>/.
Date: Sat, 1 Aug 2015 23:17:09 +0200
On 08/01/2015 05:13 AM, Linda Walsh wrote:
> [...] for me, just the special option to allow "." or dir/. is far more
> important.

You're discussing several aspects at once - '.' as operand, stopping
the removal at file system boundaries, etc. - but this all sounds to me
as if you simply wanted this one, right?

  $ find '.' -mindepth 1 -xdev -delete

Have a nice day,
Berny






Information forwarded to bug-coreutils <at> gnu.org:
bug#21084; Package coreutils. (Sun, 02 Aug 2015 08:03:02 GMT)

Message #33 received at 21084 <at> debbugs.gnu.org:

From: Linda Walsh <coreutils <at> tlinx.org>
To: Bernhard Voelker <mail <at> bernhard-voelker.de>
Cc: Paul Eggert <eggert <at> cs.ucla.edu>, Andreas Schwab <schwab <at> linux-m68k.org>,
 21084 <at> debbugs.gnu.org
Subject: Re: bug#21084: rm appears to no longer be POSIX compliant (as of
 2013 edition) re: deleting empty dirs and files under <path>/.
Date: Sun, 02 Aug 2015 01:02:05 -0700

Bernhard Voelker wrote:
> On 08/01/2015 05:13 AM, Linda Walsh wrote:
>> [...] for me, just the special option to allow "." or dir/. is far more
>> important.
> 
> You're discussing several aspects at once - '.' as operand, stopping
> the removal at file system boundaries, etc. - but this all sounds to me
> as if you simply wanted this one, right?
> 
>   $ find '.' -mindepth 1 -xdev -delete
---
(1) Tsk, tsk.... how could I possibly want that one when there
is no '-delete' arg in the POSIX version of find.  What were
you thinking?!   ;)

(2) You aren't really showing my example but more of
my "rmx -r ." case, rmx being an alias that adds --one-file-system,
with the 'x' chosen to be compatible with most other tools I have
come across with such an option -- AND my 'rmx -r .' only
takes 1/4th the space.  I design shortcuts to typing because
I can only type at about 1/2 to 1/3 the speed I did before I got
a moderately nasty case of RSI -- and a boss who pushed me
to type more in hopes that I'd quit, as he didn't have
anything substantial to get rid of me -- even when he wrote
a negative review, the HR person who knew the facts told him
he had to rewrite it -- he never did.

BTW, you can make your example 2 characters shorter by removing those
single quotes: they aren't needed.  But it is still 4x the typing
for something I used a lot.  But that is also part of the
problem.  I have a lot of scripts -- many use the standard
rm semantics and don't expect a crippled version.  Just finding
all the places it was used would be a royal pain.

(3) You aren't handling legacy SVr4 systems.  The change voids the
standard functionality (strictly depth-first operation: you
don't examine anything about the current object until all of
its children are gone), and this is especially true of the command
line arguments.

(4) The POSIX change made cmdline behavior inconsistent.  The
change says to process `some' newly created "restriction rules"
(for their 3 restricted paths) **before** processing contents.
However, with the 1-fs (--one-file-system) switch, the current 'rm'
does not inspect the initial paths to verify that they are on the
same filesystem -- it acts on those arguments only on the basis of
their type: if file, delete; if directory, descend.  It does not pay
attention to their names even long enough to do a 'stat' to check
that they are really all on 1 filesystem.

It looks bad for rm to suddenly not fit in with other core utils
which can process "." as a __Semantic__ place-holder.  For an example
of that, we have to look into the 'arcane' 'cp' program. :)

If you do this:
 > cp -a dir dir1   #expected behavior: want to update maybe
 > diff -r dir dir1
 > cp -au dir dir1
 > diff -r dir dir1
 Only in dir1: dir

You can do the same example adding a slash at the end to 
force both paths to be considered dirs, but that does the
same thing..

if you try wildcards:

 > cp -aux dir/* dir1

It might work -- or might not, for the same reason that the
-1fs switch in rm doesn't really work as expected.
***NOTE***: I haven't tested that example, but if it does the
"right thing" then the behaviors would be different between
it and rm.

The safe way to do the update (note 'cp' also
has -x, as did find):

 > cp -aux dir/. dir1/.
 > touch dir/newfile
 > cp -vaux dir/. dir1/. 
 ‘dir/./newfile’ -> ‘dir1/././newfile’
 > diff -r dir dir1

 >> Only the dot worked. <<
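
The difference between the two cp forms above can be reproduced in a scratch directory. This is a sketch assuming a cp that accepts -a and copies the contents of "dir/." into an existing target, as GNU cp does in the transcript above; the extra directory names are invented for the demonstration.

```shell
sandbox=$(mktemp -d)
mkdir "$sandbox/dir"
touch "$sandbox/dir/oldfile"

# Naming the directory itself: the first copy creates dir1, the second
# nests a fresh copy inside it as dir1/dir.
cp -a "$sandbox/dir" "$sandbox/dir1"
cp -a "$sandbox/dir" "$sandbox/dir1"
if [ -d "$sandbox/dir1/dir" ]; then nested=yes; else nested=no; fi

# Naming dir/. instead: the contents are merged into the existing target.
cp -a "$sandbox/dir" "$sandbox/dir2"
touch "$sandbox/dir/newfile"
cp -a "$sandbox/dir/." "$sandbox/dir2/."
if [ -e "$sandbox/dir2/newfile" ] && [ ! -e "$sandbox/dir2/dir" ]; then
    merged=yes
else
    merged=no
fi

echo "nested copy: $nested, merged update: $merged"
rm -r "$sandbox"
```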

In experimenting with a few utils, I found some similar
and some different behaviors.

  > du -hd 1 -x dir
  > du -hd1 -x dir/.   ## both the same output.


rsync also has a -x for staying on 1 fs and has semantics not too
far off cp, and it also allows the "/." semantics (which have been in
unix since the earliest rm versions I saw in the mid 80's).

It seems obvious that someone got the votes to push through a
bad feature: one that was default on, and allowed for no
alternatives.  That is not the way GNU should handle
their SW -- a posix-compat mode, yes, but if Chet took every non-posix
feature and behavior out of bash -- it would set the shell
back 20 years.  But that's only pointing out the prevalence of
using "." as meaning "start here" across many other tools.

(5) -- the example you gave above doesn't handle a very important
case:

It doesn't handle "-f" -- first I do a find to show that I own all
the files below ".". (! -user law).

  > find . -mindepth 1 -xdev ! -user law|wc -l  #do I own all the files?
  0

Now I run your command, changed so that only stderr goes through
the pipe, where grep looks only for the words "Permission denied"
and wc -l counts them: 832 errors that rm -fr . handles but that are
not handled by your find example.


 > find '.' -mindepth 1 -xdev -delete 2>&1 1>/dev/null|grep Permission\ denied|wc -l
 832 

AFAIK, find, by itself, has no way to remove all of the items under a
tree even if you own them all.  Whereas rm -fr . did.
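
The redirection order in the command above is what sends only stderr into the pipe. A small sketch makes that visible; the function emit is a made-up stand-in for the find command, not part of the original test.

```shell
# A stand-in command that writes one line to each stream.
emit() {
    echo "stdout line"
    echo "stderr line" >&2
}

# Redirections apply left to right: 2>&1 points stderr at the pipe
# (or capture), then 1>/dev/null discards stdout only.
captured=$(emit 2>&1 1>/dev/null)
count=$(emit 2>&1 1>/dev/null | grep -c 'stderr')
echo "captured: $captured (matching lines: $count)"
```

Writing the two redirections the other way round (`1>/dev/null 2>&1`) would send both streams to /dev/null and leave nothing to count.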
--- 


I hope my writing was sufficiently clear.  I have tried to
be clear about why I think rm w/'.' should have its own
switch.  (As for combining it with '..' -- I can think of
no one who needs, nor any reason to enable, that functionality.)
Even though I am careful, I want to keep the protection against
doing a bad thing in '/' while getting the 20+ year previous
behavior for '.' back; thus the need for a separate
switch.

Cheers!
Linda







Information forwarded to bug-coreutils <at> gnu.org:
bug#21084; Package coreutils. (Sun, 02 Aug 2015 08:16:01 GMT)

Message #36 received at 21084 <at> debbugs.gnu.org:

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Linda Walsh <coreutils <at> tlinx.org>, 
 Bernhard Voelker <mail <at> bernhard-voelker.de>
Cc: Andreas Schwab <schwab <at> linux-m68k.org>, 21084 <at> debbugs.gnu.org
Subject: Re: bug#21084: rm appears to no longer be POSIX compliant (as of
 2013 edition) re: deleting empty dirs and files under <path>/.
Date: Sun, 02 Aug 2015 01:15:07 -0700
Linda Walsh wrote:
> find, by itself, has no way to remove all of the items under a
> tree even if you own them all.

That's not a problem.  Have 'find' call 'rm'.  Something like this, say:

find . ! -name . -prune -exec rm -fr {} +

So there's no need to change 'rm'.
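
For readers who want to see what that command does before trusting it with real data, here is a sketch run against a scratch tree. The directory layout is invented; the find and rm invocation is the one quoted above.

```shell
sandbox=$(mktemp -d)
mkdir -p "$sandbox/tree/sub/deeper"
touch "$sandbox/tree/file" "$sandbox/tree/.hidden" "$sandbox/tree/sub/deeper/leaf"

# Run from inside the directory, as in the message. "." itself fails the
# "! -name ." test and is only descended into; each entry directly under it
# is pruned (so find does not descend) and handed to rm in batches by "+".
(cd "$sandbox/tree" && find . ! -name . -prune -exec rm -fr {} +)

remaining=$(ls -A "$sandbox/tree")
if [ -d "$sandbox/tree" ]; then kept_dir=yes; else kept_dir=no; fi
echo "directory kept: $kept_dir, entries left: ${remaining:-none}"
rm -r "$sandbox"
```

Unlike a shell `*`, find also hands over names that begin with a dot, so hidden entries are removed as well.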




Information forwarded to bug-coreutils <at> gnu.org:
bug#21084; Package coreutils. (Sun, 02 Aug 2015 20:32:02 GMT)

Message #39 received at 21084 <at> debbugs.gnu.org:

From: Bernhard Voelker <mail <at> bernhard-voelker.de>
To: Paul Eggert <eggert <at> cs.ucla.edu>, Linda Walsh <coreutils <at> tlinx.org>
Cc: Andreas Schwab <schwab <at> linux-m68k.org>, 21084 <at> debbugs.gnu.org
Subject: Re: bug#21084: rm appears to no longer be POSIX compliant (as of 2013
 edition) re: deleting empty dirs and files under <path>/.
Date: Sun, 2 Aug 2015 22:31:16 +0200
On 08/02/2015 10:15 AM, Paul Eggert wrote:
> Linda Walsh wrote:
>> find, by itself, has no way to remove all of the items under a
>> tree even if you own them all.
> 
> That's not a problem.  Have 'find' call 'rm'.  Something like this, say:
> 
> find . ! -name . -prune -exec rm -fr {} +
> 
> So there's no need to change 'rm'.

+1

Adding additional code to find out if the file to remove is still on the
same file system would add bloat, and would open another can of worms:
corner cases, races and a big performance penalty.  E.g. one might blindly
assume that only directories are mount points, but in reality also a
regular file can be 'over-mounted'.

Have a nice day,
Berny






Information forwarded to bug-coreutils <at> gnu.org:
bug#21084; Package coreutils. (Mon, 03 Aug 2015 01:33:02 GMT)

Message #42 received at 21084 <at> debbugs.gnu.org:

From: Linda Walsh <coreutils <at> tlinx.org>
To: Bernhard Voelker <mail <at> bernhard-voelker.de>
Cc: Paul Eggert <eggert <at> cs.ucla.edu>, Andreas Schwab <schwab <at> linux-m68k.org>,
 21084 <at> debbugs.gnu.org
Subject: Re: bug#21084: rm appears to no longer be POSIX compliant (as of
 2013 edition) re: deleting empty dirs and files under <path>/.
Date: Sun, 02 Aug 2015 18:32:08 -0700
[Message part 1 (text/plain, inline)]

Bernhard Voelker wrote:
> On 08/02/2015 10:15 AM, Paul Eggert wrote:
>> Linda Walsh wrote:
>>> find, by itself, has no way to remove all of the items under a
>>> tree even if you own them all.
>> That's not a problem.  Have 'find' call 'rm'.  Something like this, say:
>>
>> find . ! -name . -prune -exec rm -fr {} +
>>
>> So there's no need to change 'rm'.
> 
> +1
> 
> Adding additional code to find out if the file to remove is still on the
> same file system would add bloat, and would open another can of worms:
> corner cases, races and a big performance penalty

Um...

The code to find out if the file to remove is on the same file system
is already in "rm".

-3 for attempting to create strawmen.
[treescan-aio.pl (text/plain, inline)]
#!/usr/bin/perl

eval 'exec /usr/bin/perl  -S $0 ${1+"$@"}'
    if 0; # not running under some shell

# inspired by treescan by Jamie Lokier <jamie <at> imbolc.ucc.ie>
# about 40% faster than the original version (on my fs and raid :)

use strict;
use Getopt::Long;
use Time::HiRes ();
use IO::AIO;

our $VERSION = $IO::AIO::VERSION;

Getopt::Long::Configure ("bundling", "no_ignore_case", "require_order", "auto_help", "auto_version");

my ($opt_silent, $opt_print0, $opt_stat, $opt_nodirs,
    $opt_nofiles, $opt_grep, $opt_progress);

GetOptions
   "quiet|q"    => \$opt_silent,
   "print0|0"   => \$opt_print0,
   "stat|s"     => \$opt_stat,
   "dirs|d"     => \$opt_nofiles,
   "files|f"    => \$opt_nodirs,
   "grep|g=s"   => \$opt_grep,
   "progress|p" => \$opt_progress,
   or die "Usage: try $0 --help";

@ARGV = "." unless @ARGV;

$opt_grep &&= qr{$opt_grep}s;

my ($n_dirs, $n_files, $n_stats) = (0, 0, 0);
my ($n_last, $n_start) = (Time::HiRes::time) x 2;

sub printfn {
   my ($prefix, $files, $suffix) = @_;

   if ($opt_grep) {
      @$files = grep "$prefix$_" =~ $opt_grep, @$files;
   }
   
   if ($opt_print0) {
      print map "$prefix$_$suffix\0", @$files;
   } elsif (!$opt_silent) {
      print map "$prefix$_$suffix\n", @$files;
   }
}

sub scan {
   my ($path) = @_;

   $path .= "/";

   IO::AIO::poll_cb;

   if ($opt_progress and $n_last + 1 < Time::HiRes::time) {
      $n_last = Time::HiRes::time;
      my $d = $n_last - $n_start;
      printf STDERR "\r%d dirs (%g/s) %d files (%g/s) %d stats (%g/s)       ",
             $n_dirs, $n_dirs / $d,
             $n_files, $n_files / $d,
             $n_stats, $n_stats / $d
         if $opt_progress;
   }

   aioreq_pri (-1);
   ++$n_dirs;
   aio_scandir $path, 8, sub {
      my ($dirs, $files) = @_
         or warn "$path: $!\n";

      printfn "", [$path]   unless $opt_nodirs;
      printfn $path, $files unless $opt_nofiles;

      $n_files += @$files;

      if ($opt_stat) {
         aio_wd $path, sub {
            my $wd = shift;

            aio_lstat [$wd, $_] for @$files;
            $n_stats += @$files;
         };
      }

      &scan ("$path$_") for @$dirs;
   };
}

IO::AIO::max_outstanding 100; # two fds per directory, so limit accordingly
IO::AIO::min_parallel 20;

for my $seed (@ARGV) {
   $seed =~ s/\/+$//;
   aio_lstat "$seed/.", sub {
      if ($_[0]) {
         print STDERR "$seed: $!\n";
      } elsif (-d _) {
         scan $seed;
      } else {
         printfn "", $seed, "/";
      }
   };
}

IO::AIO::flush;


Information forwarded to bug-coreutils <at> gnu.org:
bug#21084; Package coreutils. (Mon, 03 Aug 2015 01:38:01 GMT)

Message #45 received at 21084 <at> debbugs.gnu.org:

From: Linda Walsh <coreutils <at> tlinx.org>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: Bernhard Voelker <mail <at> bernhard-voelker.de>,
 Andreas Schwab <schwab <at> linux-m68k.org>, 21084 <at> debbugs.gnu.org
Subject: Re: bug#21084: rm appears to no longer be POSIX compliant (as of
 2013	edition) re: deleting empty dirs and files under <path>/.
Date: Sun, 02 Aug 2015 18:37:43 -0700

Paul Eggert wrote:
> Linda Walsh wrote:
>> find, by itself, has no way to remove all of the items under a
>> tree even if you own them all.
> 
> That's not a problem.  Have 'find' call 'rm'.  Something like this, say:
> 
> find . ! -name . -prune -exec rm -fr {} +
> 
> So there's no need to change 'rm'.
----
Bernhard is worried about performance.  Do you know how long it would take
for find to call rm half a million times?
Um....

> time rm -fr . 
183.23sec 0.69usr 36.25sys (20.16% cpu)
> time find . ! -name . -prune -exec rm -fr {} +
219.58sec 0.87usr 40.81sys (18.98% cpu) -- about 36 seconds (~20%) longer

So you've already slowed things down -- and those times were just for my home 
directory...!   (non-critical data was used for these tests (copies
of my home directory that existed on different partitions))

But you also didn't address points (3), (4) or (5)..

-.5






Information forwarded to bug-coreutils <at> gnu.org:
bug#21084; Package coreutils. (Mon, 03 Aug 2015 05:24:02 GMT)

Message #48 received at 21084 <at> debbugs.gnu.org:

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Linda Walsh <coreutils <at> tlinx.org>
Cc: 21084 <at> debbugs.gnu.org
Subject: Re: bug#21084: rm appears to no longer be POSIX compliant (as of
 2013	edition) re: deleting empty dirs and files under <path>/.
Date: Sun, 02 Aug 2015 22:23:46 -0700
Linda Walsh wrote:
>
>> time rm -fr .
> 183.23sec 0.69usr 36.25sys (20.16% cpu)
>> time find . ! -name . -prune -exec rm -fr {} +
> 219.58sec 0.87usr 40.81sys (18.98% cpu) -- about 36 seconds (~20%) longer

Benchmarks like this are often suspect since a lot of it depends on factors that 
are hard to reproduce.  That being said, when I tried a similar benchmark on my 
machine, the 'find' solution was over 30% faster.  In any event the minor 
performance improvements we're talking about would not be a compelling argument 
for adding UI complexity to 'rm', even if the 'rm' approach were uniformly faster.

> But you also didn't address points (3), (4) or (5)..

They aren't a problem either.  As I mentioned, the "find" approach conforms to 
POSIX and so is quite portable; that covers (3).  If you don't want to cross 
file system boundaries, add the POSIX-required -xdev option to 'find' and the 
GNU extension --one-file-system argument to 'rm'; that covers (4).  And the 
example already uses rm's -f option; that covers (5).





Information forwarded to bug-coreutils <at> gnu.org:
bug#21084; Package coreutils. (Mon, 03 Aug 2015 05:39:02 GMT)

Message #51 received at 21084 <at> debbugs.gnu.org:

From: Linda Walsh <coreutils <at> tlinx.org>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: Bernhard Voelker <mail <at> bernhard-voelker.de>,
 Andreas Schwab <schwab <at> linux-m68k.org>, 21084 <at> debbugs.gnu.org
Subject: Re: bug#21084: rm appears to no longer be POSIX compliant (as of
 2013	edition) re: deleting empty dirs and files under <path>/.
Date: Sun, 02 Aug 2015 22:38:18 -0700

Paul Eggert wrote:
> Linda Walsh wrote:
>> find, by itself, has no way to remove all of the items under a
>> tree even if you own them all.
> 
> That's not a problem.  Have 'find' call 'rm'.  Something like this, say:
> 
> find . ! -name . -prune -exec rm -fr {} +
> 
> So there's no need to change 'rm'.
----
Actually, I was thinking about this -- if adding a switch to
provide the original behavior it had for 20+ years is so stressful
to you guys, then really, you should revert the change -- think about
it -- it would *simplify* the code -- which means Bernhard should be
happy as it would imply less bloat.

For those who wanted protection, they could have **easily** put a
shell wrapper-function around 'rm' to provide that functionality --
there was no need to change rm -- the code they added breaks the
original behavior and, Bernhard, has already opened a can of
worms, corner cases, races and a sizable performance penalty.
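
A wrapper of the kind described could look roughly like the following. This is a sketch, not anything proposed in the thread: the name safe_rm is hypothetical, and treating every argument that starts with "-" as an option is a simplification.

```shell
# Hypothetical wrapper; the name safe_rm is an illustration only.
safe_rm() {
    for operand in "$@"; do
        case $operand in
            -*) continue ;;   # simplification: treat as an option, let rm judge it
        esac
        # Drop trailing slashes so "dir/./" is treated like "dir/.".
        trimmed=$operand
        while [ "$trimmed" != "${trimmed%/}" ]; do
            trimmed=${trimmed%/}
        done
        # An empty last component means the operand was "/" (or empty).
        case ${trimmed##*/} in
            . | .. | '')
                printf 'safe_rm: refusing operand: %s\n' "$operand" >&2
                return 1
                ;;
        esac
    done
    command rm "$@"
}

# Demonstration in a scratch directory.
sandbox=$(mktemp -d)
mkdir "$sandbox/dir"
touch "$sandbox/dir/keep" "$sandbox/plain"
refused=0
safe_rm -r "$sandbox/dir/." 2>/dev/null || refused=$?
safe_rm "$sandbox/plain"
if [ -e "$sandbox/dir/keep" ]; then kept=yes; else kept=no; fi
if [ -e "$sandbox/plain" ]; then plain_left=yes; else plain_left=no; fi
rm -r "$sandbox"
```

The wrapper refuses the whole command as soon as one operand is rejected, which is stricter than the POSIX text quoted earlier in the thread (rm there skips only the offending operand).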

All the things that you accuse function restoration of
have already been done.  I assert that this change makes problems
MORE likely to happen as people try the easiest workarounds they
can find.  In my 26 years of unix-related experience, not once
have I heard of anyone doing rm -fr dir/. by accident.
You say hardly anybody cares about the corner case -- that is because
the problem it is solving is one that was hardly ever a problem.

How about this: (not my favorite beverage), BUT switch the code to 
allow the above (dir/.) but not a lone ".".  I'd leave the .. 
restriction in for dir/.. and .., since there is no legitimate
use for it: just delete 'dir'.  But there is a legitimate
use for "dir/." -- since it lets users avoid the bug
of using '*' and deleting multiple file systems while they thought
that "one-file-system" meant "1 file system[period]", not 1
filesystem per command line argument.  That's just a lame definition
that someone thought up to justify the new bug that was being introduced.

So, if any of you are serious about making the tools more robust and
less prone to accidents: only enable "." if it
is preceded by a directory component (not likely to be typed by
accident, but useful to keep things on 1 device).  A lone ".",
on the other hand, I could see being as easily mistyped as rm -fr /,
as the two are right next to each other.

Oh, in addition to that, I *am* asking for compatibility in "-x"
being the short version of "--one-file-system", as in 'cp', 'rsync',
and 'find' (though find makes it short for -xdev, it still uses -x
as the short form).

I hope you don't have your eyes shut so tight that you can't
see that the removal of dir/. has brought about needs for 
workarounds with the easiest one using shell expansion.

You realize that with bash shell expansion and globstar turned on,
the "one-file-system" switch is useless.  Not allowing
dir/. breaks "--one-file-system" in all cases, as rm -fr ** would list
all files on the cmd line (hadn't thought of that till just
now).  This is not a corner case, as ** will list all the 
files and directories on the command line.
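
The globstar point can be illustrated with a short sketch.  It assumes a bash
new enough to support the globstar option (4.0 or later) is installed as
'bash'; the helper name count_globstar_operands is hypothetical.

```shell
# Hypothetical helper: print how many operands a bare ** would place on the
# command line when expanded inside directory $1.  In an empty directory the
# pattern stays unexpanded, so the count is 1.
count_globstar_operands() {
    ( CDPATH= cd -- "$1" && bash -O globstar -c 'set -- **; echo "$#"' )
}
```

In a tree holding a, sub/b and sub/deep/c, the expansion yields five operands
(a, sub, sub/b, sub/deep, sub/deep/c).  Since GNU rm documents
--one-file-system as comparing against the file system of the corresponding
command line argument, every file and directory listed this way is checked
only against itself.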






Information forwarded to bug-coreutils <at> gnu.org:
bug#21084; Package coreutils. (Mon, 03 Aug 2015 06:10:02 GMT) Full text and rfc822 format available.

Message #54 received at 21084 <at> debbugs.gnu.org (full text, mbox):

From: Linda Walsh <coreutils <at> tlinx.org>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: 21084 <at> debbugs.gnu.org
Subject: Re: bug#21084: rm appears to no longer be POSIX compliant (as of
 2013	edition) re: deleting empty dirs and files under <path>/.
Date: Sun, 02 Aug 2015 23:09:10 -0700

Paul Eggert wrote:
> Linda Walsh wrote:
>>
>>> time rm -fr .
>> 183.23sec 0.69usr 36.25sys (20.16% cpu)
>>> time find . ! -name . -prune -exec rm -fr {} +
>> 219.58sec 0.87usr 40.81sys (18.98% cpu) -- about 36 seconds (~20%) longer
> 
> Benchmarks like this are often suspect since a lot of it depends on 
> factors that are hard to reproduce.  That being said, when I tried a 
> similar benchmark on my machine, the 'find' solution was over 30% 
---
Did you run them on separate partitions over the same file-structure?

Neither rm nor find had a hot or even warm cache, as I mounted the file
systems just for this test.

You can use the same partitions/files, if you use dropcaches:
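#!/bin/bash

# Drop the page cache, dentries and inodes (needs root via sudo).
function dropcaches () {
 echo -n "3"|sudo dd status=none of=/proc/sys/vm/drop_caches
}

# Run (and time) the function only when executed, not when sourced.
if [[ ${BASH_SOURCE[0]} == "$0" ]]; then
 time dropcaches
fi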
-------------
If you run it, it runs the function; if you source it, 
you have to insert 'time' manually later -- which I
usually do, as I like to know how long things take. 

I have had it take over 60 seconds on my system,
though more often under 10.




> faster.  In any event the minor performance improvements we're talking 
> about would not be a compelling argument for adding UI complexity to 
> 'rm', even if the 'rm' approach was uniformly faster
---
	I was addressing Bernhard's explicitly stated concerns.
They were not my concerns.  


> 
>> But you also didn't address points (3), (4) or (5)..
> 
> They aren't a problem either.  As I mentioned, the "find" approach 
> conforms to POSIX and so is quite portable; that covers (3).
---
	You can't believe that. People with older systems don't 
always keep up-to-date with the latest versions -- and likely
wrote most of their maint-scripts under the original POSIX 
charter.  They won't know until they are bitten and complain
a lot louder than I'm comfortable with.

	It doesn't solve '4', since it's about users wanting 
similar behaviors not only in other packages, but within the
same package.  It's a broken wart that is not restricted in 
other utils -- and causes you to have to defend "one-file-system"
no longer being usable because the users should have known.

It would handle (5), _probably_.

But allowing dir/. would not cause the same problems a complete 
ban has, nor is it a likely candidate for abuse or accident.



Can you think of anything I've suggested that you've 
been supportive on, yet I know I 




Information forwarded to bug-coreutils <at> gnu.org:
bug#21084; Package coreutils. (Mon, 03 Aug 2015 06:30:03 GMT) Full text and rfc822 format available.

Message #57 received at 21084 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Linda Walsh <coreutils <at> tlinx.org>
Cc: 21084 <at> debbugs.gnu.org
Subject: Re: bug#21084: rm appears to no longer be POSIX compliant (as of
 2013	edition) re: deleting empty dirs and files under <path>/.
Date: Sun, 02 Aug 2015 23:29:04 -0700
Linda Walsh wrote:
> People with older systems don't always keep up-to-date with the latest versions

People that don't keep up-to-date can't rely on any changes that we would make 
to 'rm'.  Besides, we don't care about SVR4 systems so old that they're no 
longer supported.

I ran my little test on the same file system.

I'm afraid I'm not persuaded.




Information forwarded to bug-coreutils <at> gnu.org:
bug#21084; Package coreutils. (Mon, 03 Aug 2015 19:41:02 GMT) Full text and rfc822 format available.

Message #60 received at 21084 <at> debbugs.gnu.org (full text, mbox):

From: Linda Walsh <coreutils <at> tlinx.org>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: 21084 <at> debbugs.gnu.org
Subject: Re: bug#21084: rm appears to no longer be POSIX compliant (as of
 2013	edition) re: deleting empty dirs and files under <path>/.
Date: Mon, 03 Aug 2015 12:40:04 -0700

Paul Eggert wrote:
> Linda Walsh wrote:
>>
>>> time rm -fr .
>> 183.23sec 0.69usr 36.25sys (20.16% cpu)
>>> time find . ! -name . -prune -exec rm -fr {} +
>> 219.58sec 0.87usr 40.81sys (18.98% cpu) -- about 36 seconds (~20%) longer
> 
> Benchmarks like this are often suspect since a lot of it depends on 
> factors that are hard to reproduce.  That being said, when I tried a 
> similar benchmark on my machine, the 'find' solution was over 30% 
> faster. 
---
	Nearly impossible, except for leaving out that find DOES have
some multi-cpu capability in its system scan.  If rm had the same,
then the addition of 500,000+ calls to an external process can't take
'zero time'.

> In any event the minor performance improvements we're talking 
> about would not be a compelling argument for adding UI complexity to 
> 'rm', even if the 'rm' approach was uniformly faster
> 
>> But you also didn't address points (3), (4) or (5)..
> 
> They aren't a problem either.  As I mentioned, the "find" approach 
> conforms to POSIX and so is quite portable; that covers (3).  
----
	You claim it adheres to POSIX, but there is no single POSIX --
what version are you talking about?  The version of POSIX from 2003 or 
earlier wouldn't have problems.

	POSIX was supposed to describe what was actually implemented
in the systems out there so people could move to a common base to
provide API compatibility.   Adding descriptions of the base commands
and the arguments supported was to help write shell scripts that
would be portable.  Removing functionality in a backwards incompatible
way is anything but "helping portability".  (3) is not maintained.

> If you 
> don't want to cross file system boundaries, add the POSIX-required -xdev 
> option to 'find' and the GNU extension --one-file-system argument to 
> 'rm'; that covers (4). 
----
	Not really -- we are talking about the 'rm' command.  Not rewriting
scripts and humans to use new commands.  

	Answer this:  How does disabling functionality make something more
portable?  Gnu coreutils have tons of things that enable new functionality, 
that are not portable unless you assume all platforms will have the new
functionality.  But **removing** capabilities from programs can never 
provide backwards compat -- rm -fr dir/. was there for 30 years, and now
you think removing that feature makes it portable?  I'm sorry, your logic
is not logical.

	If you want to use the safety reason as an overriding reason then
I can see banning . .. and / (even though gnu went for a workaround on
/).  But safety wouldn't be an excuse for removing rm -fr "this_is_a_dir/.".
I've never even heard of '.' being a problem and it is supported in the rest
of coreutils (except rmdir -- where dirname_this_is/. should also be allowed).

> People that don't keep up-to-date can't rely on any changes that we 
> would make to 'rm'. 

	I keep up-to-date -- that's why it bit me.  But I still haven't
upgraded my perl beyond 5.16 because too many of my scripts break due to
them installing or removing various supported features.  I'm still working
on that -- but it's a lot of scripts.

> I ran my little test on the same file system.

Did you at least drop the caches between runs (i.e.,
by echoing '3' to /proc/sys/vm/drop_caches)?  

> I'm afraid I'm not persuaded.

You really think when people find they can't do: 

> cp -axf /usr/. /usr2/.    #... no wanted that in /usr3
> mkdir /usr3 && cp -alxf /usr2/. /usr3/.  ... ESPACE...!
> rm -fxr /usr2/. /usr3/.   ## except this will fail...
> cp -axf /usr/. /usr3/.

They'll instantly think of find? -- Where else besides
rmdir is dir/. banned?






Information forwarded to bug-coreutils <at> gnu.org:
bug#21084; Package coreutils. (Mon, 03 Aug 2015 20:46:02 GMT) Full text and rfc822 format available.

Message #63 received at 21084 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Linda Walsh <coreutils <at> tlinx.org>
Cc: 21084 <at> debbugs.gnu.org
Subject: Re: bug#21084: rm appears to no longer be POSIX compliant (as of 2013
 edition) re: deleting empty dirs and files under <path>/.
Date: Mon, 3 Aug 2015 13:45:28 -0700
On 08/03/2015 12:40 PM, Linda Walsh wrote:

>  there is no single POSIX --
> what version are you talking about

The current version, POSIX.1-2013.  The code also works in 
POSIX.1-2004.  At this point older POSIX releases are mostly just 
historical curiosities; application developers shouldn't need to worry 
about them.

> Did you at least drop the caches between runs (i.e.,
> by echoing '3' to /proc/sys/vm/drop_caches)?

No, I didn't mess with the caches.  I'm reasonably sure everything was 
cached.




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Tue, 01 Sep 2015 11:24:03 GMT) Full text and rfc822 format available.

This bug report was last modified 9 years and 297 days ago.



GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.