Package: coreutils;
Reported by: Łukasz Sroka <sroka.dev <at> gmail.com>
Date: Sat, 21 Jan 2023 14:29:02 UTC
Severity: normal
Tags: patch
View this message in rfc822 format
From: Pádraig Brady <P <at> draigBrady.com> To: Łukasz Sroka <sroka.dev <at> gmail.com> Cc: 60989 <at> debbugs.gnu.org Subject: bug#60989: [PATCH] rm: fail on duplicate input if force not enabled Date: Sat, 21 Jan 2023 18:51:51 +0000
On 21/01/2023 15:51, Łukasz Sroka wrote: > On 21/01/2023 15:53, Pádraig Brady <P <at> draigbrady.com> wrote: >> >> On 21/01/2023 13:05, Łukasz Sroka wrote: >>> When the input files contain duplicates, then the rm fails. Because >>> duplicates occur most often when the * is used and the shell unwraps it. >>> There is a very common scenario when a user accidentally enters space >>> after a filename, or enters space instead of forward slash. >>> Example: >>> >>> rm prefix_ * >>> >>> The user intended to remove all files with a `prefix_` but removed all >>> of the files in cwd. >>> The program quits immediately when a duplicate is detected, to prevent >>> pressing `y` because user expected a prompt regarding removing multiple >>> files. >>> The force option disables this function to enable scripts to work >>> without modifying them. >>> >>> ``` >>> diff --git a/src/rm.c b/src/rm.c >>> index 354e2b0df..e4f9949f0 100644 >>> --- a/src/rm.c >>> +++ b/src/rm.c >>> @@ -123,6 +123,16 @@ diagnose_leading_hyphen (int argc, char **argv) >>> } >>> } >>> >>> +static bool >>> +find_duplicates (int n_files, char **files) >>> +{ >>> + for (int l = 0; l < n_files-1; l++) >>> + for (int r = l+1; r < n_files; r++) >>> + if (strcmp(files[l], files[r]) == 0) >>> + return true; >>> + return false; >>> +} >>> + >>> void >>> usage (int status) >>> { >>> @@ -211,6 +221,7 @@ main (int argc, char **argv) >>> bool preserve_root = true; >>> struct rm_options x; >>> bool prompt_once = false; >>> + bool force_rm = false; >>> int c; >>> >>> initialize_main (&argc, &argv); >>> @@ -238,6 +249,7 @@ main (int argc, char **argv) >>> x.interactive = RMI_NEVER; >>> x.ignore_missing_files = true; >>> prompt_once = false; >>> + force_rm = true; >>> break; >>> >>> case 'i': >>> @@ -352,6 +364,17 @@ main (int argc, char **argv) >>> uintmax_t n_files = argc - optind; >>> char **file = argv + optind; >>> >>> + if (!force_rm && find_duplicates(n_files, file)) >>> + { >>> + /* Because usually when the input files are duplicated it means >>> that the user >>> + sumbitted both a directory and an * as separate arguments, >>> probably by accident */ >>> + fprintf (stderr, >>> + "%s: input contains duplicates, most likely you've put " >>> + "both * and a file from the same directory.\n", >>> + program_name); >>> + return EXIT_FAILURE; >>> + } >>> + >>> if (prompt_once && (x.recursive || 3 < n_files)) >>> { >>> fprintf (stderr, >>> ``` >> >> An interesting proposal. >> The main protection would be for `dir/ *` rather than `file_prefix_ *`. >> The former would be unusual for a user to type, while the latter more usual, but wouldn't trigger the protection AFAICS. >> This ads O(N^2) on each interaction, so if it was to be included probably only enabled with --interactive. >> >> cheers, >> Pádraig > > Yeah, true. Implemented it quickly and focused on the `dir/ *` > scanerio at first, because that has happened to me -.- > It was because I tried to do something quickly while editing command > from backsearch and ended up with `rm tmp/ *` instead of `rm tmp/*`. > To make the prefix_ protection it won't be possible without checking > if a "prefix" exits on disk (eg. file and file1), which would be more > time complex. > I don't like the --interactive approach, as when you're trying to do > something quickly, you probably won't opt for doing it interactively > and you're probably either disabled or ignoring shell warnings. True, but at a high level -I can be seen as an unintrusive "be more careful" option, and so is often enabled by default with an alias. Red Hat flavored systems even enable the more intrusive -i option by default for the root user for example. > The more I think about this problem, the more I tilt towards > implementing it in shell. If you'd've written "rm prefix_ *" and had > 10k files there, it would be so much easier to detect it on the shell > side than checking every filename with the other and triggering > syscalls for every one... > Either way if you have other solutions in mind, please update me, as > it wouldn't require to configure shell in dockers or remote machines, > thus making the solution more available. A shell option like `shopt -s uniqueglob` would indeed be more general. There are various other globbing options, like the similar "failglob" which fails to execute the command if a glob doesn't match anything. cheers, Pádraig
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.