GNU bug report logs -
#22533
Non-determinism in python-3 ".pyc" bytecode
Reported by: Leo Famulari <leo <at> famulari.name>
Date: Tue, 2 Feb 2016 05:17:02 UTC
Severity: important
Done: Ricardo Wurmus <rekado <at> elephly.net>
Bug is archived. No further changes may be made.
Message #70 received at 22533 <at> debbugs.gnu.org (full text, mbox):
Ricardo Wurmus <rekado <at> elephly.net> writes:
> I have applied this patch locally:
>
> diff --git a/gnu/packages/python.scm b/gnu/packages/python.scm
> index 5f701701a..0d1ecc3c6 100644
> --- a/gnu/packages/python.scm
> +++ b/gnu/packages/python.scm
> @@ -359,8 +359,42 @@ data types.")
>                       "Lib/ctypes/test/test_win32.py" ; fails on aarch64
>                       "Lib/test/test_fcntl.py")) ; fails on aarch64
>                     #t))))
> -    (arguments (substitute-keyword-arguments (package-arguments python-2)
> -                 ((#:tests? _) #t)))
> +    (arguments
> +     (substitute-keyword-arguments (package-arguments python-2)
> +       ((#:tests? _) #t)
> +       ((#:phases phases)
> +        `(modify-phases ,phases
> +           (add-after 'unpack 'patch-timestamp-for-pyc-files
> +             (lambda _
> +               ;; We set DETERMINISTIC_BUILD to only override the mtime when
> +               ;; building with Guix, lest we break auto-compilation in
> +               ;; environments.
> +               (setenv "DETERMINISTIC_BUILD" "1")
> +               (substitute* "Lib/py_compile.py"
> +                 (("source_stats\\['mtime'\\]")
> +                  "(1 if 'DETERMINISTIC_BUILD' in os.environ else source_stats['mtime'])"))
> +
> +               ;; Use deterministic hashes for strings, bytes, and datetime
> +               ;; objects.
> +               (setenv "PYTHONHASHSEED" "0")
> +
> +               ;; Reset mtime when validating bytecode header.
> +               (substitute* "Lib/importlib/_bootstrap_external.py"
> +                 (("source_mtime = int\\(source_stats\\['mtime'\\]\\)")
> +                  "source_mtime = 1"))
> +               #t))
> +           (add-after 'unpack 'disable-timestamp-tests
> +             (lambda _
> +               (substitute* "Lib/test/test_importlib/source/test_file_loader.py"
> +                 (("test_bad_marshal")
> +                  "disable_test_bad_marshal")
> +                 (("test_no_marshal")
> +                  "disable_test_no_marshal")
> +                 (("test_non_code_marshal")
> +                  "disable_test_non_code_marshal"))
> +               #t))
> +           (add-before 'check 'allow-non-deterministic-compilation
> +             (lambda _ (unsetenv "DETERMINISTIC_BUILD") #t))))))
>      (native-search-paths
>       (list (search-path-specification
>              (variable "PYTHONPATH")
>
> It allows me to build python-six and python-sip reproducibly. It does
> not fix problems with Python 2, and I haven’t yet tested if it causes
> any new problems.
>
> It’s a little worrying that I had to disable three more tests that I
> think shouldn’t have failed.
Wow, nice work! I can't tell what's going on with the tests; they do
some bytecode manipulation. Maybe they do not expect the low
timestamp somehow?
https://github.com/python/cpython/blob/374c6e178a7599aae46c857b17c6c8bc19dfe4c2/Lib/test/test_importlib/source/test_file_loader.py#L457-L484
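For reference, here is a minimal Python sketch of where the timestamp sits in a compiled file. It assumes Python 3.7 or later, where the ".pyc" header is 16 bytes (magic, flags, source mtime, source size); on the older interpreters discussed in this thread the header is 12 bytes and the mtime follows the magic number directly. The helper names compile_with_mtime and header_mtime are hypothetical and not part of CPython or of the patch.

```python
import os
import py_compile
import struct
import tempfile


def compile_with_mtime(source_text, mtime):
    """Compile source_text after forcing the source file's mtime; return the .pyc bytes."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "example.py")
        pyc = os.path.join(tmp, "example.pyc")
        with open(src, "w") as f:
            f.write(source_text)
        # Pin the source timestamp, as a build system might do.
        os.utime(src, (mtime, mtime))
        py_compile.compile(
            src,
            cfile=pyc,
            dfile="example.py",  # keep the temporary path out of the code object
            doraise=True,
            # Ask for a timestamp-based .pyc explicitly, so that a
            # SOURCE_DATE_EPOCH in the environment does not switch
            # py_compile to hash-based invalidation.
            invalidation_mode=py_compile.PycInvalidationMode.TIMESTAMP,
        )
        with open(pyc, "rb") as f:
            return f.read()


def header_mtime(pyc_bytes):
    """Return the source mtime stored in a timestamp-based .pyc (3.7+ layout assumed)."""
    # Layout: magic (4 bytes), flags (4), mtime (4), source size (4).
    return struct.unpack("<I", pyc_bytes[8:12])[0]
```

Two builds of the same source that differ only in the source file's mtime therefore differ in these four header bytes, which is the field the patch forces to 1.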
I guess we'll do at least one 'core-updates' before 3.7 is released, so
it makes sense to include this. It should also give us some experience
that might be relevant for 2.7, since it probably won't get the upstream
reproducibility patch that relies on 3.7 features.
The only remark I have is: is introducing a new variable necessary?
SOURCE_DATE_EPOCH implies that the user wants a deterministic build;
the upstream patch doesn't actually honor it outside of making the
hashing method deterministic. So, I think it might be enough to just
test for SOURCE_DATE_EPOCH instead of DETERMINISTIC_BUILD. The former
is also already set in the build environment.
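As a rough sketch of that suggestion, the expression substituted into Lib/py_compile.py could test for SOURCE_DATE_EPOCH instead. The function below is a hypothetical stand-alone illustration of that logic (pyc_mtime is an invented name, and the environ parameter exists only to make it easy to try out); it is not the code that was committed.

```python
import os


def pyc_mtime(source_stats, environ=os.environ):
    """Return the mtime to record in the .pyc header.

    Mirrors the patched expression, but keyed on SOURCE_DATE_EPOCH rather
    than on a new DETERMINISTIC_BUILD variable (an assumption for
    illustration only).
    """
    if "SOURCE_DATE_EPOCH" in environ:
        # Any fixed value works; the patch in this thread uses 1.
        return 1
    return source_stats["mtime"]
```

For example, pyc_mtime({"mtime": 1454390222}, environ={"SOURCE_DATE_EPOCH": "1"}) returns 1, while the same call with an empty environ returns the real mtime.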
However, I just noticed that you unset DETERMINISTIC_BUILD before the
'check' phase. Did it break more things?
I suppose we'll have to set PYTHONHASHSEED somewhere in
python-build-system as well. Did you check if that makes a difference
for numpy? Perhaps it's enough to set it if we add an auto-compilation
step?
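To illustrate what PYTHONHASHSEED changes, here is a small Python sketch that hashes a string in a fresh interpreter with a chosen seed. The helper name hash_in_fresh_interpreter is hypothetical; the sketch only shows that a fixed seed gives the same string hash on every run, and makes no claim about numpy.

```python
import os
import subprocess
import sys


def hash_in_fresh_interpreter(text, seed):
    """Return hash(text) as computed by a new Python process with PYTHONHASHSEED=seed."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(hash(sys.argv[1]))", text],
        env=env,
        check=True,
        capture_output=True,
        text=True,
    )
    return int(result.stdout)
```

With the seed set to 0, hash randomization is disabled, so repeated calls such as hash_in_fresh_interpreter("guix", 0) return the same value every time. Without a fixed seed, string hashes normally vary between interpreter runs.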
This bug report was last modified 6 years and 106 days ago.