Package: emacs;
Reported by: taylanbayirli <at> gmail.com (Taylan Ulrich Bayırlı/Kammer)
Date: Sun, 18 Oct 2015 12:37:02 UTC
Severity: normal
Done: Paul Eggert <eggert <at> cs.ucla.edu>
Bug is archived. No further changes may be made.
To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 21702 in the body.
You can then email your comments to 21702 AT debbugs.gnu.org in the normal way.
Toggle the display of automated, internal messages from the tracker.
View this report as an mbox folder, status mbox, maintainer mbox
bug-gnu-emacs <at> gnu.org
:bug#21702
; Package emacs
.
(Sun, 18 Oct 2015 12:37:02 GMT) Full text and rfc822 format available.taylanbayirli <at> gmail.com (Taylan Ulrich Bayırlı/Kammer)
:bug-gnu-emacs <at> gnu.org
.
(Sun, 18 Oct 2015 12:37:02 GMT) Full text and rfc822 format available.Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
From: taylanbayirli <at> gmail.com (Taylan Ulrich Bayırlı/Kammer) To: bug-gnu-emacs <at> gnu.org Subject: shell-quote-argument semantics and safety Date: Sun, 18 Oct 2015 14:36:03 +0200
[Message part 1 (text/plain, inline)]
The documentation of shell-quote-argument only says Quote ARGUMENT for passing as argument to an inferior shell. It's unclear for which shells this is supposed to work. In a recent thread in emacs-devel, it has been demonstrated that if the result is passed to csh, it can allow an attacker to execute an arbitrary shell command, although without arguments: (let ((argument (read untrusted-source))) (assert (stringp argument)) (call-process "csh" nil t nil "-c" (concat "echo " (shell-quote-argument argument)))) ;; If untrusted-source gives us "\nevil-command\n", we get: ;; evil-command: Command not found. The function should clearly document 1) for which shells will the quoting work absolutely, i.e. lead to the given string to appear *verbatim* in an element of the ARGV of the called command, 2) optionally, for which shells will the quoting at least prevent code injection, 3) optionally, for which shells and character sets for ARGUMENT will the quoting work absolutely, 4) optionally, for which shells and character sets for ARGUMENT will the quoting at least prevent code injection, 5) optionally, for which shells will the quoting work at all even if it provides no clear semantics, such that one can at least use it with data coming from trusted sources (e.g. other parts of Emacs's source code, or the user sitting in front of Emacs), where it's the user's/programmer's responsibility to stick to values for ARGUMENT that are intuitively known to be unproblematic even if the character set isn't well-defined. Currently #5 seems to be implied for all shells, for lack of further documentation. Possibly, the function was never meant to be used with untrusted data, but there's no warning against doing so either. I stress-tested the strategy it uses for POSIX shells with the following horrible hack; the results are positive, i.e. the strategy seems to meet the criteria #1 above for POSIX shells. for i in {0..999} do dd if=/dev/urandom of=/dev/stdout bs=1K count=1 2>/dev/null | tr -d '\000' > randomfile # NULL bytes in ARGV are impossible emacs -q --batch --eval \ "(with-temp-buffer (insert-file-contents-literally \"randomfile\") (let ((data (replace-regexp-in-string \"\\n\" \"'\\n'\" (replace-regexp-in-string \"[^-0-9a-zA-Z_./\\n]\" \"\\\\\\\\\\\\&\" (buffer-substring (point-min) (point-max)))))) (erase-buffer) (insert \"printf %s \") (insert data) (write-region (point-min) (point-max) \"commandfile\")))" sh - < commandfile > output # tested with bash, dash, and ksh diff randomfile output || exit done There's also wording in POSIX which seems to guarantee the safety of the strategy: http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_02_01 "A <backslash> that is not quoted shall preserve the literal value of the following character, with the exception of a <newline>. [...]" For now, here's a trivial patch improving the docstring. If anyone is confident in the safety of the function for shells other than those conforming to POSIX sh, feel free to change the docstring accordingly.
[0001-lisp-subr.el-shell-quote-argument-Improve-documentat.patch (text/x-diff, inline)]
From dedcb603da981dcab8f576dea2f36d58fd2ddcfa Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Taylan=20Ulrich=20Bay=C4=B1rl=C4=B1/Kammer?= <taylanbayirli <at> gmail.com> Date: Sun, 18 Oct 2015 14:23:35 +0200 Subject: [PATCH] * lisp/subr.el (shell-quote-argument): Improve documentation. --- lisp/subr.el | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/lisp/subr.el b/lisp/subr.el index e176907..940ebe6 100644 --- a/lisp/subr.el +++ b/lisp/subr.el @@ -2711,7 +2711,11 @@ Note: :data and :device are currently not supported on Windows." (declare-function w32-shell-dos-semantics "w32-fns" nil) (defun shell-quote-argument (argument) - "Quote ARGUMENT for passing as argument to an inferior shell." + "Quote ARGUMENT for passing as argument to an inferior shell. + +This is safe for shells conforming to POSIX sh. No guarantees +regarding code injection are made for other shells, but csh, +MS-DOS and Windows NT are supported for simple cases as well." (cond ((eq system-type 'ms-dos) ;; Quote using double quotes, but escape any existing quotes in -- 2.5.0
bug-gnu-emacs <at> gnu.org
:bug#21702
; Package emacs
.
(Sun, 18 Oct 2015 15:27:02 GMT) Full text and rfc822 format available.Message #8 received at 21702 <at> debbugs.gnu.org (full text, mbox):
From: taylanbayirli <at> gmail.com (Taylan Ulrich Bayırlı/Kammer) To: 21702 <at> debbugs.gnu.org Subject: Re: bug#21702: shell-quote-argument semantics and safety Date: Sun, 18 Oct 2015 17:26:19 +0200
On the development list it has been pointed out that the Info manual contains more verbose documentation on this function, although it doesn't clarify the semantics much either. === snip === Lisp programs sometimes need to run a shell and give it a command that contains file names that were specified by the user. These programs ought to be able to support any valid file name. But the shell gives special treatment to certain characters, and if these characters occur in the file name, they will confuse the shell. To handle these characters, use the function ‘shell-quote-argument’: -- Function: shell-quote-argument argument This function returns a string that represents, in shell syntax, an argument whose actual contents are ARGUMENT. It should work reliably to concatenate the return value into a shell command and then pass it to a shell for execution. Precisely what this function does depends on your operating system. The function is designed to work with the syntax of your system’s standard shell; if you use an unusual shell, you will need to redefine this function. ;; This example shows the behavior on GNU and Unix systems. (shell-quote-argument "foo > bar") ⇒ "foo\\ \\>\\ bar" ;; This example shows the behavior on MS-DOS and MS-Windows. (shell-quote-argument "foo > bar") ⇒ "\"foo > bar\"" Here’s an example of using ‘shell-quote-argument’ to construct a shell command: (concat "diff -c " (shell-quote-argument oldfile) " " (shell-quote-argument newfile)) === /snip === I'm not sure if that needs change, given the change to the docstring, which counts as the more authoritative documentation of the precise semantics if I'm not mistaken. Taylan
bug-gnu-emacs <at> gnu.org
:bug#21702
; Package emacs
.
(Sun, 18 Oct 2015 17:17:01 GMT) Full text and rfc822 format available.Message #11 received at 21702 <at> debbugs.gnu.org (full text, mbox):
From: Eli Zaretskii <eliz <at> gnu.org> To: taylanbayirli <at> gmail.com (Taylan Ulrich Bayırlı/Kammer) Cc: 21702 <at> debbugs.gnu.org Subject: Re: bug#21702: shell-quote-argument semantics and safety Date: Sun, 18 Oct 2015 20:16:54 +0300
> From: taylanbayirli <at> gmail.com (Taylan Ulrich > Bayırlı/Kammer) > Date: Sun, 18 Oct 2015 14:36:03 +0200 > > The documentation of shell-quote-argument only says > > Quote ARGUMENT for passing as argument to an inferior shell. > > It's unclear for which shells this is supposed to work. I fixed the doc string to clarify that this function works correctly with the system's standard shell. > In a recent thread in emacs-devel, it has been demonstrated that if > the result is passed to csh, it can allow an attacker to execute an > arbitrary shell command As I understand it, csh is not the standard shell on Posix systems, so the fixed doc string now says not to expect it to work with csh. > The function should clearly document > > 1) for which shells will the quoting work absolutely, i.e. lead to > the given string to appear *verbatim* in an element of the ARGV of > the called command, > > 2) optionally, for which shells will the quoting at least prevent > code injection, > > 3) optionally, for which shells and character sets for ARGUMENT will > the quoting work absolutely, > > 4) optionally, for which shells and character sets for ARGUMENT will > the quoting at least prevent code injection, > > 5) optionally, for which shells will the quoting work at all even if > it provides no clear semantics, such that one can at least use it > with data coming from trusted sources (e.g. other parts of Emacs's > source code, or the user sitting in front of Emacs), where it's the > user's/programmer's responsibility to stick to values for ARGUMENT > that are intuitively known to be unproblematic even if the character > set isn't well-defined. > > Currently #5 seems to be implied for all shells, for lack of further > documentation. Possibly, the function was never meant to be used with > untrusted data, but there's no warning against doing so either. I thin 1) is now covered, and the rest are optional. In particular, our way to provide better safety is not by documenting APIs that could be maliciously abused, but by marking the related variables as unsafe unless they have special predefined values. So I don't think we should extend this particular doc string with safety information. > (defun shell-quote-argument (argument) > - "Quote ARGUMENT for passing as argument to an inferior shell." > + "Quote ARGUMENT for passing as argument to an inferior shell. > + > +This is safe for shells conforming to POSIX sh. No guarantees > +regarding code injection are made for other shells, but csh, > +MS-DOS and Windows NT are supported for simple cases as well." Thanks, but I think this is no longer needed, after the change I made.
bug-gnu-emacs <at> gnu.org
:bug#21702
; Package emacs
.
(Sun, 18 Oct 2015 19:13:02 GMT) Full text and rfc822 format available.Message #14 received at 21702 <at> debbugs.gnu.org (full text, mbox):
From: taylanbayirli <at> gmail.com (Taylan Ulrich Bayırlı/Kammer) To: Eli Zaretskii <eliz <at> gnu.org> Cc: 21702 <at> debbugs.gnu.org Subject: Re: bug#21702: shell-quote-argument semantics and safety Date: Sun, 18 Oct 2015 21:12:34 +0200
Eli Zaretskii <eliz <at> gnu.org> writes: >> From: taylanbayirli <at> gmail.com (Taylan Ulrich >> Bayırlı/Kammer) >> Date: Sun, 18 Oct 2015 14:36:03 +0200 >> >> The documentation of shell-quote-argument only says >> >> Quote ARGUMENT for passing as argument to an inferior shell. >> >> It's unclear for which shells this is supposed to work. > > I fixed the doc string to clarify that this function works correctly > with the system's standard shell. > >> In a recent thread in emacs-devel, it has been demonstrated that if >> the result is passed to csh, it can allow an attacker to execute an >> arbitrary shell command > > As I understand it, csh is not the standard shell on Posix systems, so > the fixed doc string now says not to expect it to work with csh. > >> The function should clearly document >> >> 1) for which shells will the quoting work absolutely, i.e. lead to >> the given string to appear *verbatim* in an element of the ARGV of >> the called command, >> >> 2) optionally, for which shells will the quoting at least prevent >> code injection, >> >> 3) optionally, for which shells and character sets for ARGUMENT will >> the quoting work absolutely, >> >> 4) optionally, for which shells and character sets for ARGUMENT will >> the quoting at least prevent code injection, >> >> 5) optionally, for which shells will the quoting work at all even if >> it provides no clear semantics, such that one can at least use it >> with data coming from trusted sources (e.g. other parts of Emacs's >> source code, or the user sitting in front of Emacs), where it's the >> user's/programmer's responsibility to stick to values for ARGUMENT >> that are intuitively known to be unproblematic even if the character >> set isn't well-defined. >> >> Currently #5 seems to be implied for all shells, for lack of further >> documentation. Possibly, the function was never meant to be used with >> untrusted data, but there's no warning against doing so either. > > I thin 1) is now covered, and the rest are optional. In particular, > our way to provide better safety is not by documenting APIs that could > be maliciously abused, but by marking the related variables as unsafe > unless they have special predefined values. So I don't think we > should extend this particular doc string with safety information. Hm, there seems to be a fundamental difference in mindset here in how one might use Elisp. I'd like to point out that (in the most extreme cases) people have actually been writing web servers and other such programs in Elisp for which one would normally use a general-purpose language. That is, "APIs that could be maliciously abused" is not the right way to look at it. It's not about the Elisp programmer abusing the API, it's about a malicious data source exploiting a (potential) flaw in an Elisp function, which Elisp programmers have relied on and thus made their programs vulnerable to code injection. That's why I was being so careful with regard to the safety guarantees of the "shell-quasiquote" package I contributed. I would like people to be able to use that as part of a general-purpose Elisp language, and so being safe against code injection is an absolute must. They might after all use it as part of a network-facing service. Actually that might also apply when using e.g. TRAMP, which also communicates with remote hosts and is a normal part of Emacs. I've been told it receives file names from remote hosts and passes them through shell-quote-argument before giving them to a shell. So maybe my concerns apply there as well. Given that, "I think 1) is now covered" is not very relieving to hear. It amounts to "I think this is safe against code injection" which is rather alarming to hear. Either it's very confidently known to be safe and so one may use it for network-facing code, or it's not confidently known to be safe and so one shouldn't use it for network-facing code. This should be documented clearly especially so that users who aren't very aware of injection attacks won't nonchalantly use the function for their network-facing code (when the function isn't known to be safe for this), but also so that users who are aware of such issues know they can use the function and don't instead invent their own thing (when it is known to be safe). Does that make sense? Taylan
bug-gnu-emacs <at> gnu.org
:bug#21702
; Package emacs
.
(Sun, 18 Oct 2015 19:49:01 GMT) Full text and rfc822 format available.Message #17 received at 21702 <at> debbugs.gnu.org (full text, mbox):
From: Eli Zaretskii <eliz <at> gnu.org> To: taylanbayirli <at> gmail.com (Taylan Ulrich Bayırlı/Kammer) Cc: 21702 <at> debbugs.gnu.org Subject: Re: bug#21702: shell-quote-argument semantics and safety Date: Sun, 18 Oct 2015 22:48:15 +0300
> From: taylanbayirli <at> gmail.com (Taylan Ulrich Bayırlı/Kammer) > Cc: 21702 <at> debbugs.gnu.org > Date: Sun, 18 Oct 2015 21:12:34 +0200 > > I'd like to point out that (in the most extreme cases) people have > actually been writing web servers and other such programs in Elisp for > which one would normally use a general-purpose language. > > That is, "APIs that could be maliciously abused" is not the right way to > look at it. It's not about the Elisp programmer abusing the API, it's > about a malicious data source exploiting a (potential) flaw in an Elisp > function, which Elisp programmers have relied on and thus made their > programs vulnerable to code injection. > > > That's why I was being so careful with regard to the safety guarantees > of the "shell-quasiquote" package I contributed. I would like people to > be able to use that as part of a general-purpose Elisp language, and so > being safe against code injection is an absolute must. They might after > all use it as part of a network-facing service. > > > Actually that might also apply when using e.g. TRAMP, which also > communicates with remote hosts and is a normal part of Emacs. I've been > told it receives file names from remote hosts and passes them through > shell-quote-argument before giving them to a shell. So maybe my > concerns apply there as well. > > > Given that, "I think 1) is now covered" is not very relieving to hear. Item 1 was this: > >> The function should clearly document > >> > >> 1) for which shells will the quoting work absolutely, i.e. lead to > >> the given string to appear *verbatim* in an element of the ARGV of > >> the called command, There's nothing about safety here, only about correctness. That is the aspect that I think is now covered, as the doc string now says for which shells one can have correct results. > It amounts to "I think this is safe against code injection" which is > rather alarming to hear. Either it's very confidently known to be safe > and so one may use it for network-facing code, or it's not confidently > known to be safe and so one shouldn't use it for network-facing code. > This should be documented clearly especially so that users who aren't > very aware of injection attacks won't nonchalantly use the function for > their network-facing code (when the function isn't known to be safe for > this), but also so that users who are aware of such issues know they can > use the function and don't instead invent their own thing (when it is > known to be safe). > > Does that make sense? Maybe it does, but only if we start documenting these aspects project-wide. It makes little sense to me to do that for a single API, and not an important one at that. But that's me.
bug-gnu-emacs <at> gnu.org
:bug#21702
; Package emacs
.
(Mon, 19 Oct 2015 07:35:02 GMT) Full text and rfc822 format available.Message #20 received at 21702 <at> debbugs.gnu.org (full text, mbox):
From: taylanbayirli <at> gmail.com (Taylan Ulrich Bayırlı/Kammer) To: Eli Zaretskii <eliz <at> gnu.org> Cc: 21702 <at> debbugs.gnu.org Subject: Re: bug#21702: shell-quote-argument semantics and safety Date: Mon, 19 Oct 2015 09:34:15 +0200
Eli Zaretskii <eliz <at> gnu.org> writes: >> From: taylanbayirli <at> gmail.com (Taylan Ulrich Bayırlı/Kammer) >> Cc: 21702 <at> debbugs.gnu.org >> Date: Sun, 18 Oct 2015 21:12:34 +0200 >> >> I'd like to point out that (in the most extreme cases) people have >> actually been writing web servers and other such programs in Elisp for >> which one would normally use a general-purpose language. >> >> That is, "APIs that could be maliciously abused" is not the right way to >> look at it. It's not about the Elisp programmer abusing the API, it's >> about a malicious data source exploiting a (potential) flaw in an Elisp >> function, which Elisp programmers have relied on and thus made their >> programs vulnerable to code injection. >> >> >> That's why I was being so careful with regard to the safety guarantees >> of the "shell-quasiquote" package I contributed. I would like people to >> be able to use that as part of a general-purpose Elisp language, and so >> being safe against code injection is an absolute must. They might after >> all use it as part of a network-facing service. >> >> >> Actually that might also apply when using e.g. TRAMP, which also >> communicates with remote hosts and is a normal part of Emacs. I've been >> told it receives file names from remote hosts and passes them through >> shell-quote-argument before giving them to a shell. So maybe my >> concerns apply there as well. >> >> >> Given that, "I think 1) is now covered" is not very relieving to hear. > > Item 1 was this: > >> >> The function should clearly document >> >> >> >> 1) for which shells will the quoting work absolutely, i.e. lead to >> >> the given string to appear *verbatim* in an element of the ARGV of >> >> the called command, > > There's nothing about safety here, only about correctness. That is > the aspect that I think is now covered, as the doc string now says for > which shells one can have correct results. Usually it's indeed correctness that protects against injection attacks. A quoting mechanism that's correct is automatically safe. Another way to make it safe would be to error when the given string contains characters outside of a limited character set. Either way, the safeness should be documented clearly, either implicitly through a clear documentation of the correctness, or explicitly. In your patch, correctness is implied, but the complexity of the problem domain (and thus the function itself) and the importance of possible repercussions of an incorrect implementation leave clearer documentation to be desired. While any function is really implied to be correct by its existence, any function is also implied to very possibly contain bugs, as is natural for software. In many cases these bugs are unimportant. In this case not. I would propose something along the lines of: It is guaranteed that ARGUMENT will be parsed as a single token by shells X, Y, and Z, as long as it is separated from other text via a delimiter in the syntax of the respective shell. (That's even better than the patch I proposed, which didn't mention the problem of delimiting.) I think it's also important to provide some explicit enumeration of shells for which the function is safe, because the systems Emacs supports may change over time, and there is no guarantee that a change to this function will not entail bugs. If we add wording like the above, then any programmer who sits down to expand the function's semantics to another shell will be forced to think very hard about what they're doing; otherwise they might try to do a "good enough" job but not make sure that all edge-cases are handled. "Designed to work with" is after all not an absolute claim of correctness and absence of bugs. >> It amounts to "I think this is safe against code injection" which is >> rather alarming to hear. Either it's very confidently known to be safe >> and so one may use it for network-facing code, or it's not confidently >> known to be safe and so one shouldn't use it for network-facing code. >> This should be documented clearly especially so that users who aren't >> very aware of injection attacks won't nonchalantly use the function for >> their network-facing code (when the function isn't known to be safe for >> this), but also so that users who are aware of such issues know they can >> use the function and don't instead invent their own thing (when it is >> known to be safe). >> >> Does that make sense? > > Maybe it does, but only if we start documenting these aspects > project-wide. It makes little sense to me to do that for a single > API, and not an important one at that. But that's me. This is an API which if its implementation is imperfect will result in programs prone to code injection attacks when these programs face untrusted input sources. Why do you say it's not an important one? Taylan
bug-gnu-emacs <at> gnu.org
:bug#21702
; Package emacs
.
(Mon, 19 Oct 2015 07:49:02 GMT) Full text and rfc822 format available.Message #23 received at 21702 <at> debbugs.gnu.org (full text, mbox):
From: Eli Zaretskii <eliz <at> gnu.org> To: taylanbayirli <at> gmail.com (Taylan Ulrich Bayırlı/Kammer) Cc: 21702 <at> debbugs.gnu.org Subject: Re: bug#21702: shell-quote-argument semantics and safety Date: Mon, 19 Oct 2015 10:47:28 +0300
> From: taylanbayirli <at> gmail.com (Taylan Ulrich Bayırlı/Kammer) > Cc: 21702 <at> debbugs.gnu.org > Date: Mon, 19 Oct 2015 09:34:15 +0200 > > > Item 1 was this: > > > >> >> The function should clearly document > >> >> > >> >> 1) for which shells will the quoting work absolutely, i.e. lead to > >> >> the given string to appear *verbatim* in an element of the ARGV of > >> >> the called command, > > > > There's nothing about safety here, only about correctness. That is > > the aspect that I think is now covered, as the doc string now says for > > which shells one can have correct results. > > Usually it's indeed correctness that protects against injection attacks. > A quoting mechanism that's correct is automatically safe. And that is the current situation, AFAIU. > Another way to make it safe would be to error when the given string > contains characters outside of a limited character set. What limited set would you suggest that will not make the function useless in real-life scenarios? In any case, I think quoting is better than rejecting, as it supports more use cases. > Either way, the safeness should be documented clearly, either implicitly > through a clear documentation of the correctness, or explicitly. Like I said, this convention should be adopted project-wide. Doing so only in a few doc strings, let alone one, will only confuse, because the user will not know whether the lack of such documentation means the API is safe or unsafe. > I would propose something along the lines of: > > It is guaranteed that ARGUMENT will be parsed as a single token by > shells X, Y, and Z, as long as it is separated from other text via a > delimiter in the syntax of the respective shell. I don't think we want to mention specific shells explicitly, because maintaining such a list would be a burden. The standard shell of each OS is well defined and known to the users of the respective systems. Moreover, Emacs by default uses that shell automatically. > >> Does that make sense? > > > > Maybe it does, but only if we start documenting these aspects > > project-wide. It makes little sense to me to do that for a single > > API, and not an important one at that. But that's me. > > This is an API which if its implementation is imperfect will result in > programs prone to code injection attacks when these programs face > untrusted input sources. Why do you say it's not an important one? Because there are many much more important ones that can do much more harm more easily. In particular, a shell command doesn't need to be quoted to be harmful or malicious.
bug-gnu-emacs <at> gnu.org
:bug#21702
; Package emacs
.
(Mon, 19 Oct 2015 09:23:02 GMT) Full text and rfc822 format available.Message #26 received at 21702 <at> debbugs.gnu.org (full text, mbox):
From: taylanbayirli <at> gmail.com (Taylan Ulrich Bayırlı/Kammer) To: Eli Zaretskii <eliz <at> gnu.org> Cc: 21702 <at> debbugs.gnu.org Subject: Re: bug#21702: shell-quote-argument semantics and safety Date: Mon, 19 Oct 2015 11:22:16 +0200
[Message part 1 (text/plain, inline)]
Eli Zaretskii <eliz <at> gnu.org> writes: >> From: taylanbayirli <at> gmail.com (Taylan Ulrich Bayırlı/Kammer) >> Cc: 21702 <at> debbugs.gnu.org >> Date: Mon, 19 Oct 2015 09:34:15 +0200 >> >> > Item 1 was this: >> > >> >> >> The function should clearly document >> >> >> >> >> >> 1) for which shells will the quoting work absolutely, i.e. lead to >> >> >> the given string to appear *verbatim* in an element of the ARGV of >> >> >> the called command, >> > >> > There's nothing about safety here, only about correctness. That is >> > the aspect that I think is now covered, as the doc string now says for >> > which shells one can have correct results. >> >> Usually it's indeed correctness that protects against injection attacks. >> A quoting mechanism that's correct is automatically safe. > > And that is the current situation, AFAIU. > >> Another way to make it safe would be to error when the given string >> contains characters outside of a limited character set. > > What limited set would you suggest that will not make the function > useless in real-life scenarios? > > In any case, I think quoting is better than rejecting, as it supports > more use cases. > >> Either way, the safeness should be documented clearly, either implicitly >> through a clear documentation of the correctness, or explicitly. > > Like I said, this convention should be adopted project-wide. Doing so > only in a few doc strings, let alone one, will only confuse, because > the user will not know whether the lack of such documentation means > the API is safe or unsafe. Yes, it should be done for every function for which the concerns I've explained apply. So let's start from this one. >> I would propose something along the lines of: >> >> It is guaranteed that ARGUMENT will be parsed as a single token by >> shells X, Y, and Z, as long as it is separated from other text via a >> delimiter in the syntax of the respective shell. > > I don't think we want to mention specific shells explicitly, because > maintaining such a list would be a burden. The standard shell of each > OS is well defined and known to the users of the respective systems. > Moreover, Emacs by default uses that shell automatically. For instance: POSIX sh, MS-DOS, and Windows NT, is not a long list. (I don't really know what shells MS-DOS and Windows NT use; a more precise naming would be good.) The payoff of the small burden is having clear safety guarantees. >> >> Does that make sense? >> > >> > Maybe it does, but only if we start documenting these aspects >> > project-wide. It makes little sense to me to do that for a single >> > API, and not an important one at that. But that's me. >> >> This is an API which if its implementation is imperfect will result in >> programs prone to code injection attacks when these programs face >> untrusted input sources. Why do you say it's not an important one? > > Because there are many much more important ones that can do much more > harm more easily. In particular, a shell command doesn't need to be > quoted to be harmful or malicious. There being other important cases, does not make this a less important case. It is exactly as important as I've already said. I don't understand what "a shell command doesn't need to be quoted to be harmful" is supposed to mean; quoting is what *makes* the arguments harmless, by ensuring they cleanly end up in the ARGV of a called command instead of causing arbitrary behavior of the shell. Here's a patch doing an improvement to the documentation like the one I proposed. Of course, if you have verified that shells other than POSIX sh are fully safe, feel free to improve the docstring accordingly. Taylan
[0001-lisp-subr.el-shell-quote-argument-Improve-documentat.patch (text/x-diff, inline)]
From bb746be5638a17c99e1647ecc178e3b9d97e4ba3 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Taylan=20Ulrich=20Bay=C4=B1rl=C4=B1/Kammer?= <taylanbayirli <at> gmail.com> Date: Sun, 18 Oct 2015 14:23:35 +0200 Subject: [PATCH] * lisp/subr.el (shell-quote-argument): Improve documentation. --- lisp/subr.el | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/lisp/subr.el b/lisp/subr.el index c903ee3..e55647b 100644 --- a/lisp/subr.el +++ b/lisp/subr.el @@ -2713,8 +2713,14 @@ Note: :data and :device are currently not supported on Windows." (defun shell-quote-argument (argument) "Quote ARGUMENT for passing as argument to an inferior shell. -This function is designed to work with the syntax of your system's -standard shell, and might produce incorrect results with unusual shells." +This is safe for shells conforming to POSIX sh. No safety +guarantees are made for other shells, but the standard MS-DOS and +Windows NT shells are supported as well. + +Being safe in this context means that as long as the result is +surrounded by delimiters in the syntax of the respective shell, +it's guaranteed that it will be parsed as one token and that the +value of the token will be exactly ARGUMENT." (cond ((eq system-type 'ms-dos) ;; Quote using double quotes, but escape any existing quotes in -- 2.5.0
bug-gnu-emacs <at> gnu.org
:bug#21702
; Package emacs
.
(Mon, 19 Oct 2015 09:33:02 GMT) Full text and rfc822 format available.Message #29 received at 21702 <at> debbugs.gnu.org (full text, mbox):
From: Eli Zaretskii <eliz <at> gnu.org> To: taylanbayirli <at> gmail.com (Taylan Ulrich Bayırlı/Kammer) Cc: 21702 <at> debbugs.gnu.org Subject: Re: bug#21702: shell-quote-argument semantics and safety Date: Mon, 19 Oct 2015 12:32:22 +0300
> From: taylanbayirli <at> gmail.com (Taylan Ulrich Bayırlı/Kammer) > Cc: 21702 <at> debbugs.gnu.org > Date: Mon, 19 Oct 2015 11:22:16 +0200 > > > Like I said, this convention should be adopted project-wide. Doing so > > only in a few doc strings, let alone one, will only confuse, because > > the user will not know whether the lack of such documentation means > > the API is safe or unsafe. > > Yes, it should be done for every function for which the concerns I've > explained apply. So let's start from this one. Before we start, we need a _decision_ to do that everywhere. Then we could start doing that piecemeal. Before the decision is made, there's no reason to make any such changes. > >> I would propose something along the lines of: > >> > >> It is guaranteed that ARGUMENT will be parsed as a single token by > >> shells X, Y, and Z, as long as it is separated from other text via a > >> delimiter in the syntax of the respective shell. > > > > I don't think we want to mention specific shells explicitly, because > > maintaining such a list would be a burden. The standard shell of each > > OS is well defined and known to the users of the respective systems. > > Moreover, Emacs by default uses that shell automatically. > > For instance: POSIX sh, MS-DOS, and Windows NT, is not a long list. This list doesn't name shells on DOS and Windows (there are several good candidates). As for Posix, is it only sh? What about Bash? what about zsh? You see, the moment you come up with a list such as above, people will start complaining that their favorite shell is not in the list, and the list will grow. Then we will discover that some shells are not really compatible after all, etc. etc. It's a maintenance burden we had better avoided. Saying "the standard shell" avoids all that nicely, because it refers to a single well-known shell. > I don't understand what "a shell command doesn't need to be quoted to be > harmful" is supposed to mean Something like this: rm -rf /* > Here's a patch doing an improvement to the documentation like the one I > proposed. Of course, if you have verified that shells other than POSIX > sh are fully safe, feel free to improve the docstring accordingly. Thanks. However, like I said, I don't think this change would be correct, or needed.
bug-gnu-emacs <at> gnu.org
:bug#21702
; Package emacs
.
(Mon, 19 Oct 2015 09:51:02 GMT) Full text and rfc822 format available.Message #32 received at 21702 <at> debbugs.gnu.org (full text, mbox):
From: taylanbayirli <at> gmail.com (Taylan Ulrich Bayırlı/Kammer) To: Eli Zaretskii <eliz <at> gnu.org> Cc: 21702 <at> debbugs.gnu.org Subject: Re: bug#21702: shell-quote-argument semantics and safety Date: Mon, 19 Oct 2015 11:50:23 +0200
Eli Zaretskii <eliz <at> gnu.org> writes: >> From: taylanbayirli <at> gmail.com (Taylan Ulrich Bayırlı/Kammer) >> Cc: 21702 <at> debbugs.gnu.org >> Date: Mon, 19 Oct 2015 11:22:16 +0200 >> >> > Like I said, this convention should be adopted project-wide. Doing so >> > only in a few doc strings, let alone one, will only confuse, because >> > the user will not know whether the lack of such documentation means >> > the API is safe or unsafe. >> >> Yes, it should be done for every function for which the concerns I've >> explained apply. So let's start from this one. > > Before we start, we need a _decision_ to do that everywhere. Then we > could start doing that piecemeal. Before the decision is made, > there's no reason to make any such changes. Given all the reasons I listed, I would expect that decision to be obvious. >> >> I would propose something along the lines of: >> >> >> >> It is guaranteed that ARGUMENT will be parsed as a single token by >> >> shells X, Y, and Z, as long as it is separated from other text via a >> >> delimiter in the syntax of the respective shell. >> > >> > I don't think we want to mention specific shells explicitly, because >> > maintaining such a list would be a burden. The standard shell of each >> > OS is well defined and known to the users of the respective systems. >> > Moreover, Emacs by default uses that shell automatically. >> >> For instance: POSIX sh, MS-DOS, and Windows NT, is not a long list. > > This list doesn't name shells on DOS and Windows (there are several > good candidates). As for Posix, is it only sh? What about Bash? what > about zsh? > > You see, the moment you come up with a list such as above, people will > start complaining that their favorite shell is not in the list, and > the list will grow. Then we will discover that some shells are not > really compatible after all, etc. etc. It's a maintenance burden we > had better avoided. > > Saying "the standard shell" avoids all that nicely, because it refers > to a single well-known shell. Dash, Bash and (AFAIK all versions of) ksh are POSIX sh compliant. Zsh not unless when requested IIRC; in any case "POSIX sh" is well-defined. My latest patch says "standard shells of MS-DOS and Windows NT." Feel free to improve that if necessary. >> I don't understand what "a shell command doesn't need to be quoted to be >> harmful" is supposed to mean > > Something like this: > > rm -rf /* What are you trying to say? Of course an arbitrary shell command can do anything. The whole point of shell-quote-argument is to prevent a string which is meant purely as an argument to a command to become equivalent in power to an arbitrary shell command. >> Here's a patch doing an improvement to the documentation like the one I >> proposed. Of course, if you have verified that shells other than POSIX >> sh are fully safe, feel free to improve the docstring accordingly. > > Thanks. However, like I said, I don't think this change would be > correct, or needed. I've explained the need for the change, and it is correct. I don't understand why you're trying to make everything so difficult. If for reasons unclear to me you absolutely refuse to accept these improvements to shell-quote-argument's documentation, I will just continue not using the function, because it cannot be trusted. Taylan
bug-gnu-emacs <at> gnu.org
:bug#21702
; Package emacs
.
(Mon, 19 Oct 2015 10:20:02 GMT) Full text and rfc822 format available.Message #35 received at 21702 <at> debbugs.gnu.org (full text, mbox):
From: Eli Zaretskii <eliz <at> gnu.org> To: taylanbayirli <at> gmail.com (Taylan Ulrich Bayırlı/Kammer) Cc: 21702 <at> debbugs.gnu.org Subject: Re: bug#21702: shell-quote-argument semantics and safety Date: Mon, 19 Oct 2015 13:19:26 +0300
> From: taylanbayirli <at> gmail.com (Taylan Ulrich Bayırlı/Kammer) > Cc: 21702 <at> debbugs.gnu.org > Date: Mon, 19 Oct 2015 11:50:23 +0200 > > I don't understand why you're trying to make everything so difficult. I don't. We just disagree, that's all. I modified the doc string to add the missing information about the shells for which the function was designed. I don't think we should add anything else, for the reasons I pointed out already. > If for reasons unclear to me you absolutely refuse to accept these > improvements to shell-quote-argument's documentation, I will just > continue not using the function, because it cannot be trusted. How can documentation make a function more trustworthy? And what does that have to do with this bug report? This bug report is about the documentation of shell-quote-argument, not whether it is safe and should or should not be used. I think this bug should be closed now.
bug-gnu-emacs <at> gnu.org
:bug#21702
; Package emacs
.
(Mon, 19 Oct 2015 10:26:01 GMT) Full text and rfc822 format available.Message #38 received at 21702 <at> debbugs.gnu.org (full text, mbox):
From: taylanbayirli <at> gmail.com (Taylan Ulrich Bayırlı/Kammer) To: Eli Zaretskii <eliz <at> gnu.org> Cc: 21702 <at> debbugs.gnu.org Subject: Re: bug#21702: shell-quote-argument semantics and safety Date: Mon, 19 Oct 2015 12:25:17 +0200
Eli Zaretskii <eliz <at> gnu.org> writes: >> From: taylanbayirli <at> gmail.com (Taylan Ulrich Bayırlı/Kammer) >> Cc: 21702 <at> debbugs.gnu.org >> Date: Mon, 19 Oct 2015 11:50:23 +0200 >> >> I don't understand why you're trying to make everything so difficult. > > I don't. We just disagree, that's all. > > I modified the doc string to add the missing information about the > shells for which the function was designed. I don't think we should > add anything else, for the reasons I pointed out already. > >> If for reasons unclear to me you absolutely refuse to accept these >> improvements to shell-quote-argument's documentation, I will just >> continue not using the function, because it cannot be trusted. > > How can documentation make a function more trustworthy? And what does > that have to do with this bug report? This bug report is about the > documentation of shell-quote-argument, not whether it is safe and > should or should not be used. > > I think this bug should be closed now. I don't want to repeat myself for the dozenth time so do as you wish, I'll simply continue not using the function nonchalantly and recommend others to do the same. Taylan
Paul Eggert <eggert <at> cs.ucla.edu>
:taylanbayirli <at> gmail.com (Taylan Ulrich Bayırlı/Kammer)
:Message #43 received at 21702-done <at> debbugs.gnu.org (full text, mbox):
From: Paul Eggert <eggert <at> cs.ucla.edu> To: 21702-done <at> debbugs.gnu.org Cc: Taylan Ulrich Bayırlı/Kammer <taylanbayirli <at> gmail.com> Subject: Re: shell-quote-argument semantics and safety Date: Wed, 21 Oct 2015 20:49:37 -0700
I installed a patch to the Emacs manual that attempts to address the documentation problem, and am boldly closing the bug. The bug report can be reopened if more work is needed re shell-quote-argument's documentation. For the patch, please see: http://git.savannah.gnu.org/cgit/emacs.git/commit/?id=f373e812d95e1822833f88db024e011a769998b4
Debbugs Internal Request <help-debbugs <at> gnu.org>
to internal_control <at> debbugs.gnu.org
.
(Thu, 19 Nov 2015 12:24:03 GMT) Full text and rfc822 format available.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.