On Thu, Sep 21, 2017 at 1:20 AM, Pádraig Brady <P@draigbrady.com> wrote:
On 18/09/17 18:07, Jack Howarth wrote:
> On Mon, Sep 18, 2017 at 7:40 PM, Jim Meyering <jim@meyering.net> wrote:
>
>> On Mon, Sep 18, 2017 at 4:26 PM, Jack Howarth
>> <howarth.mailing.lists@gmail.com> wrote:
>>> On Mon, Sep 18, 2017 at 5:08 PM, Jim Meyering <jim@meyering.net> wrote:
>> ...
>>>> Is there any chance your failing test was via a python2 framework? I'm
>>>> asking (on Pádraig's behalf) because there is a known problem whereby
>>>> SIGPIPE is mishandled in that case, and that might explain this
>>>> failure, since the data-generation phase relies on SIGPIPE killing
>>>> this test's "yes" command.
>>>
>>> I doubt it as the hang doesn't happen under 10.13 when run on a JHFS
>>> formatted volume.
>>
>> How did you run the tests?
>>
>
> Actually, I forgot to mention that the coreutils test suite hang only
> occurred on the APFS volumes when coreutils was built against the gettext
> and libiconv from fink. A build outside of fink, which wasn't built against
> those packages, didn't show the hang in the coreutils test suite. The fink
> gettext and libiconv packages that I am using are those from...
>
> https://sourceforge.net/p/fink/package-submissions/4955/
>
> and
>
> https://sourceforge.net/p/fink/package-submissions/5004/
>
> which are both patched for the format string strictness in High Sierra. I
> found that using --disable-nls in configuring coreutils was insufficient to
> suppress the test suite hang which I assume is due to the presence of...
>
> #define HAVE_LIBINTL_H 1
>
> in the generated ./lib/config.h
>
> despite the presence of...
>
> /* #undef HAVE_DCGETTEXT */
> /* #undef HAVE_GETTEXT */
>
> when --disable-nls is used, so it still could be a Unicode-related change in
> APFS, no?
>       Jack

The libintl bit reminded me of https://lists.gnu.org/archive/html/bug-gnulib/2014-10/msg00014.html
I.e., on OS X, enabling those libs creates implicit threads, I think.
Perhaps that's interfering with SIGPIPE handling, so that only the implicit
thread gets the signal and the main yes(1) thread is never killed.
However, the yes(1) call is also protected by a timeout(1) call.
Perhaps timeout(1) is a silent no-op. We should support OS X through DYLD_INSERT_LIBRARIES,
but perhaps there is something preventing that on your system?
But then the timeout tests would fail. Could you check the timeout tests with:

  make SUBDIRS=. TESTS=tests/misc/filter.sh check
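
A rough way to check the SIGPIPE theory outside the test suite is to run
yes(1) into a reader that exits at once and look at the status yes(1) leaves
behind. This is only a sketch, not part of the coreutils test suite: the
function name yes_pipe_status is made up for illustration, and the value 141
assumes a shell that reports 128 plus the signal number, with SIGPIPE being
signal 13. If the hang discussed here is present, this command may hang too.

```shell
# Print the exit status of yes(1) after its reader has gone away.
# Assumes a POSIX sh with yes(1) and head(1) on PATH.
yes_pipe_status() {
  # fd 3 carries the status out past the pipeline. head exits after one
  # line, so a later write by yes hits a pipe with no reader.
  { { yes 2>/dev/null; echo $? >&3; } | head -n 1 >/dev/null; } 3>&1
}

status=$(yes_pipe_status)
case $status in
  141) echo "yes was killed by SIGPIPE (128 + 13)" ;;
  1)   echo "yes got a write error instead; SIGPIPE is probably ignored" ;;
  *)   echo "unexpected status: $status" ;;
esac
```

A status of 1 would point at SIGPIPE being ignored in the environment that
launched the tests, which is the python2 case mentioned earlier in the thread.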

In any case we should protect calls to timeout(1) to ensure it's supported.
The attached does that at least.
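
For illustration only, the kind of guard meant here could look like the
sketch below. It is not the attached patch: the helper name timeout_works is
hypothetical, and the check assumes the GNU timeout(1) convention of exiting
with status 124 when the time limit expires.

```shell
# Hypothetical helper: succeed only if timeout(1) is present and
# actually enforces its limit.
timeout_works() {
  # No timeout(1) on PATH: nothing to rely on.
  command -v timeout >/dev/null 2>&1 || return 1
  # A working timeout must pass a quick command through unchanged...
  timeout 10 true || return 1
  # ...and must stop an overlong one, exiting with status 124.
  timeout 1 sleep 10
  test $? = 124
}

if timeout_works; then
  echo "timeout(1) usable"
else
  echo "timeout(1) unusable; skip tests that rely on it"
fi
```

A silent no-op timeout(1) would let the sleep run to completion and return 0,
so the final status comparison fails and the dependent tests can be skipped
rather than left to hang.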

cheers,
Pádraig.

Pádraig,
     The hang on APFS volumes doesn't seem to be related to CoreFoundation threading. I repeated the steps that I used to track down a similar issue in make 4.0/4.1, rebuilding both libiconv and coreutils with --disable-nls so that neither is linked against CoreFoundation, and the test suite hang still occurs. Also, for the stock build, adding your proposed timeout changes doesn't eliminate the hang in the test suite either.
            Jack