GNU bug report logs -
#60989
[PATCH] rm: fail on duplicate input if force not enabled
Previous Next
Full log
Message #11 received at 60989 <at> debbugs.gnu.org (full text, mbox):
On 21/01/2023 15:51, Łukasz Sroka wrote:
> On 21/01/2023 15:53, Pádraig Brady <P <at> draigbrady.com> wrote:
>>
>> On 21/01/2023 13:05, Łukasz Sroka wrote:
>>> When the input files contain duplicates, then the rm fails. Because
>>> duplicates occur most often when the * is used and the shell unwraps it.
>>> There is a very common scenario when a user accidentally enters space
>>> after a filename, or enters space instead of forward slash.
>>> Example:
>>>
>>> rm prefix_ *
>>>
>>> The user intended to remove all files with a `prefix_` but removed all
>>> of the files in cwd.
>>> The program quits immediately when a duplicate is detected, to prevent
>>> pressing `y` because user expected a prompt regarding removing multiple
>>> files.
>>> The force option disables this function to enable scripts to work
>>> without modifying them.
>>>
>>> ```
>>> diff --git a/src/rm.c b/src/rm.c
>>> index 354e2b0df..e4f9949f0 100644
>>> --- a/src/rm.c
>>> +++ b/src/rm.c
>>> @@ -123,6 +123,16 @@ diagnose_leading_hyphen (int argc, char **argv)
>>> }
>>> }
>>>
>>> +static bool
>>> +find_duplicates (int n_files, char **files)
>>> +{
>>> + for (int l = 0; l < n_files-1; l++)
>>> + for (int r = l+1; r < n_files; r++)
>>> + if (strcmp(files[l], files[r]) == 0)
>>> + return true;
>>> + return false;
>>> +}
>>> +
>>> void
>>> usage (int status)
>>> {
>>> @@ -211,6 +221,7 @@ main (int argc, char **argv)
>>> bool preserve_root = true;
>>> struct rm_options x;
>>> bool prompt_once = false;
>>> + bool force_rm = false;
>>> int c;
>>>
>>> initialize_main (&argc, &argv);
>>> @@ -238,6 +249,7 @@ main (int argc, char **argv)
>>> x.interactive = RMI_NEVER;
>>> x.ignore_missing_files = true;
>>> prompt_once = false;
>>> + force_rm = true;
>>> break;
>>>
>>> case 'i':
>>> @@ -352,6 +364,17 @@ main (int argc, char **argv)
>>> uintmax_t n_files = argc - optind;
>>> char **file = argv + optind;
>>>
>>> + if (!force_rm && find_duplicates(n_files, file))
>>> + {
>>> + /* Because usually when the input files are duplicated it means
>>> that the user
>>> + sumbitted both a directory and an * as separate arguments,
>>> probably by accident */
>>> + fprintf (stderr,
>>> + "%s: input contains duplicates, most likely you've put "
>>> + "both * and a file from the same directory.\n",
>>> + program_name);
>>> + return EXIT_FAILURE;
>>> + }
>>> +
>>> if (prompt_once && (x.recursive || 3 < n_files))
>>> {
>>> fprintf (stderr,
>>> ```
>>
>> An interesting proposal.
>> The main protection would be for `dir/ *` rather than `file_prefix_ *`.
>> The former would be unusual for a user to type, while the latter more usual, but wouldn't trigger the protection AFAICS.
>> This ads O(N^2) on each interaction, so if it was to be included probably only enabled with --interactive.
>>
>> cheers,
>> Pádraig
>
> Yeah, true. Implemented it quickly and focused on the `dir/ *`
> scanerio at first, because that has happened to me -.-
> It was because I tried to do something quickly while editing command
> from backsearch and ended up with `rm tmp/ *` instead of `rm tmp/*`.
> To make the prefix_ protection it won't be possible without checking
> if a "prefix" exits on disk (eg. file and file1), which would be more
> time complex.
> I don't like the --interactive approach, as when you're trying to do
> something quickly, you probably won't opt for doing it interactively
> and you're probably either disabled or ignoring shell warnings.
True, but at a high level -I can be seen as
an unintrusive "be more careful" option,
and so is often enabled by default with an alias.
Red Hat flavored systems even enable the more intrusive
-i option by default for the root user for example.
> The more I think about this problem, the more I tilt towards
> implementing it in shell. If you'd've written "rm prefix_ *" and had
> 10k files there, it would be so much easier to detect it on the shell
> side than checking every filename with the other and triggering
> syscalls for every one...
> Either way if you have other solutions in mind, please update me, as
> it wouldn't require to configure shell in dockers or remote machines,
> thus making the solution more available.
A shell option like `shopt -s uniqueglob` would indeed be more general.
There are various other globbing options, like the similar "failglob"
which fails to execute the command if a glob doesn't match anything.
cheers,
Pádraig
This bug report was last modified 2 years and 207 days ago.
Previous Next
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.