GNU bug report logs - #15779
timeout: Child gets SIGTTOU when run from a shell script
To reply to this bug, email your comments to 15779 AT debbugs.gnu.org.
Report forwarded to bug-coreutils <at> gnu.org: bug#15779; Package coreutils. (Fri, 01 Nov 2013 12:19:02 GMT)
Acknowledgement sent to "Richard W.M. Jones" <rjones <at> redhat.com>: New bug report received and forwarded. Copy sent to bug-coreutils <at> gnu.org. (Fri, 01 Nov 2013 12:19:02 GMT)
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
On Fri, Nov 01, 2013 at 11:53:44AM +0000, Pádraig Brady wrote:
> Probably something to do with job control
> If you `set -m` first in the script,
> then the test binary doesn't hang.
Unfortunately set -m isn't quite right for the script
we are interested in fixing:
https://github.com/libguestfs/libguestfs/blob/master/run.in#L217
> I did a version of timeout once that put the child in its own group,
> and noted in that version that I needed to leave SIGTTOU as IGN
> as tcsetpgrp(0, getpid()) caused SIGTTOU to be sent and I wasn't sure why.
That sounds like a very similar situation to this one. The
tty_check_change function is called all over the place, so just about
any ioctl or flush or tc* function could cause SIGTTOU to be sent.
> Maybe we should be resetting SIGTT{OU,IN} to what it was
> previously set to, rather than SIG_DFL?
The attached patch does *not* work .. don't know if you can see
any obvious mistakes.
Yesterday I looked through the strace -f output between the two cases
and came to the conclusion that it's not about signal handlers at all.
(What it's about is a mystery ...)
Rich.
--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
virt-df lists disk usage of guests without needing to install any
software inside the virtual machine. Supports Linux and Windows.
http://people.redhat.com/~rjones/virt-df/
[timeout.patch2 (text/plain, attachment)]
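When comparing the two cases discussed in this message, it can help to check whether a given process really has SIGTTOU ignored. The following is a minimal sketch and not part of the original thread. It assumes Linux, where /proc/PID/status exposes a SigIgn bitmask, and it assumes SIGTTOU is signal 22, which holds on x86 and ARM but not on every architecture. The name sigttou_ignored is hypothetical.

```shell
# Sketch only: assumes Linux /proc and that SIGTTOU is signal 22
# (true on x86 and ARM; a few architectures number it differently).
# sigttou_ignored is a hypothetical helper name.
sigttou_ignored() {
    pid=$1
    # SigIgn is a hexadecimal bitmask; bit (N - 1) is set when signal N is ignored.
    mask=$(sed -n 's/^SigIgn:[[:space:]]*//p' "/proc/$pid/status" 2>/dev/null)
    [ -n "$mask" ] || return 2    # process gone or /proc unavailable
    # Succeed (exit status 0) only if bit 21, i.e. signal 22, is set.
    [ $(( (0x$mask >> 21) & 1 )) -eq 1 ]
}
```

For example, `sigttou_ignored "$pid" && echo ignored` reports on a running child. Ignored signals stay ignored across fork and exec, so a `trap '' TTOU` in a parent script should be visible in its children unless something in between resets the disposition.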
Information forwarded to bug-coreutils <at> gnu.org: bug#15779; Package coreutils. (Fri, 01 Nov 2013 13:49:02 GMT)
Message #8 received at 15779 <at> debbugs.gnu.org (full text, mbox):
On 11/01/2013 12:17 PM, Richard W.M. Jones wrote:
> On Fri, Nov 01, 2013 at 11:53:44AM +0000, Pádraig Brady wrote:
>> Probably something to do with job control
>> If you `set -m` first in the script,
>> then the test binary doesn't hang.
>
> Unfortunately set -m isn't quite right for the script
> we are interested in fixing:
>
> https://github.com/libguestfs/libguestfs/blob/master/run.in#L217
>
>> I did a version of timeout once that put the child in its own group,
>> and noted in that version that I needed to leave SIGTTOU as IGN
>> as tcsetpgrp(0, getpid()) caused SIGTTOU to be sent and I wasn't sure why.
>
> That sounds like a very similar situation to this one. The
> tty_check_change function is called all over the place, so just about
> any ioctl or flush or tc* function could cause SIGTTOU to be sent.
>
>> Maybe we should be resetting SIGTT{OU,IN} to what it was
>> previously set to, rather than SIG_DFL?
>
> The attached patch does *not* work .. don't know if you can see
> any obvious mistakes.
>
> Yesterday I looked through the strace -f output between the two cases
> and came to the conclusion that it's not about signal handlers at all.
> (What it's about is a mystery ...)
>
> Rich.
The initial messages to bug-coreutils have disappeared somewhere in the ether.
Anyway the context is that if a process being controlled by timeout
gets a SIGTTOU, then it will be stopped. Note timeout handles this case
by killing the process when the timeout occurs, but that doesn't help
as the process will be stalled for the duration of the timeout.
Details at: https://bugzilla.redhat.com/1025269
The workaround was to remove the calls to set SIGTTOU to SIG_DFL
just before the child is exec'd, however I'm not sure about that
since we want to treat monitored timeout jobs like background
processes, and if you run the problematic test child as a standard
shell background job, it's stopped also:
$ /tmp/test&
[2]+ Stopped /tmp/test
The proposed workaround mentioned in the quoted text above is ineffective.
I think that, since the child wants to "interact" with the tty,
adding the --foreground option to the timeout command is appropriate here?
The caveat is that children of the monitored command will not be timed out.
thanks,
Pádraig.
p.s. I wonder whether cgroups, when available, would
give more powerful "process tree" timeout control functionality.
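The stopped state described in this message can be observed directly while a child appears hung. The following is a minimal sketch and not part of the original thread. It assumes Linux, where the state letter appears in /proc/PID/stat after the parenthesised command name. The name proc_state is hypothetical.

```shell
# Sketch only: assumes Linux /proc. proc_state is a hypothetical helper name.
# Prints the one-letter state of a process, e.g. R, S, or T (stopped).
proc_state() {
    # Drop everything through the last ") " so that a command name containing
    # spaces or parentheses cannot shift the fields; the state is then first.
    sed 's/^.*) //' "/proc/$1/stat" 2>/dev/null | cut -d' ' -f1
}
```

Running `proc_state PID` against the monitored command prints T if a stop signal such as SIGTTOU has stopped it, which distinguishes a stopped child from one that is merely slow.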
Forcibly Merged 15779 15781. Request was from Pádraig Brady <P <at> draigBrady.com> to control <at> debbugs.gnu.org. (Fri, 01 Nov 2013 13:55:02 GMT)
Information forwarded to bug-coreutils <at> gnu.org: bug#15779; Package coreutils. (Fri, 01 Nov 2013 14:40:02 GMT)
Message #13 received at 15779 <at> debbugs.gnu.org (full text, mbox):
On Fri, Nov 01, 2013 at 01:48:45PM +0000, Pádraig Brady wrote:
> The initial messages to bug-coreutils have disappeared somewhere in
> the ether. Anyway the context is that if a process being controlled
> by timeout gets a SIGTTOU, then it will be stopped.
I don't think this is strictly accurate.
The issue I have is that timeout behaves differently when run
directly from the command line than when run from the command line
via an intermediate shell script.
interactive -> timeout -> command # OK
interactive -> bash script -> timeout -> command # SIGTTOU sent
SIGTTOU is sent in the latter case, causing the process under test to
hang (i.e. T state) until the timeout happens.
This makes timeout useless for our purpose, which is to make a test
time out if it runs for longer than 4 hours.
> I think that since the child wants to "interact" with the tty, that
> adding the --foreground option to the timeout command is appropriate
> here? The caveat is that children of the monitored command will not
> be timed out.
Annoyingly, for reasons not well understood, this bug only manifests
itself on RHEL 6. Although my minimal test case mentioned in
https://bugzilla.redhat.com/show_bug.cgi?id=1025269 reproduces the
problem even upstream, the original bug we are trying to avoid only
happens on RHEL 6 (no idea why that is; possibly qemu doesn't play
funny games with the tty upstream?)
'timeout' in RHEL 6 doesn't have --foreground. Therefore I have
disabled timeout completely on RHEL 6 builds of libguestfs.
Rich.
--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Fedora Windows cross-compiler. Compile Windows programs, test, and
build Windows installers. Over 100 libraries supported.
http://fedoraproject.org/wiki/MinGW
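The two options raised in this thread, using --foreground where timeout supports it and running without a timeout where it does not, could be combined in a small wrapper. The following is a minimal sketch and not the code used by libguestfs. The name run_with_timeout is hypothetical, and the probe only checks whether the installed timeout accepts --foreground.

```shell
# Sketch only: run_with_timeout is a hypothetical helper name.
# Usage: run_with_timeout DURATION COMMAND [ARG...]
run_with_timeout() {
    duration=$1
    shift
    # Probe: an older timeout rejects the unknown option and exits non-zero,
    # as does a system with no timeout command at all.
    if timeout --foreground 1 true >/dev/null 2>&1; then
        timeout --foreground "$duration" "$@"
    else
        # No --foreground support: run the command without any time limit.
        "$@"
    fi
}
```

A call such as `run_with_timeout 4h ./some-test` (the test name is a placeholder) would then apply the 4 hour limit where possible. As noted earlier in the thread, with --foreground the children of the monitored command are not timed out.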
This bug report was last modified 11 years and 288 days ago.