From debbugs-submit-bounces@debbugs.gnu.org Mon Feb 16 12:15:07 2015 Received: (at submit) by debbugs.gnu.org; 16 Feb 2015 17:15:07 +0000 Received: from localhost ([127.0.0.1]:45554 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1YNPGE-0005JN-Gz for submit@debbugs.gnu.org; Mon, 16 Feb 2015 12:15:07 -0500 Received: from eggs.gnu.org ([208.118.235.92]:45382) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1YNPGB-0005Ia-4a for submit@debbugs.gnu.org; Mon, 16 Feb 2015 12:15:04 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1YNPG4-00024t-FE for submit@debbugs.gnu.org; Mon, 16 Feb 2015 12:14:57 -0500 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=0.8 required=5.0 tests=BAYES_50,T_RP_MATCHES_RCVD autolearn=disabled version=3.3.2 Received: from lists.gnu.org ([2001:4830:134:3::11]:48924) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YNPG4-00024n-CL for submit@debbugs.gnu.org; Mon, 16 Feb 2015 12:14:56 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:47073) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YNPG3-0007it-1J for bug-guile@gnu.org; Mon, 16 Feb 2015 12:14:56 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1YNPFw-00021O-PX for bug-guile@gnu.org; Mon, 16 Feb 2015 12:14:54 -0500 Received: from fencepost.gnu.org ([2001:4830:134:3::e]:46463) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YNPFw-000214-GT for bug-guile@gnu.org; Mon, 16 Feb 2015 12:14:48 -0500 Received: from localhost ([127.0.0.1]:53638 helo=lola) by fencepost.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YNPFv-0002lX-SI for bug-guile@gnu.org; Mon, 16 Feb 2015 12:14:48 -0500 Received: by lola (Postfix, from userid 1000) id EC853E0D40; Mon, 16 Feb 2015 17:41:36 +0100 (CET) From: David Kastrup To: bug-guile@gnu.org Subject: Smob's mark_smob has become unreliable in Guile 2.x Date: Mon, 16 Feb 2015 17:41:36 +0100 Message-ID: <87twyln70f.fsf@fencepost.gnu.org> MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="=-=-=" X-detected-operating-system: by eggs.gnu.org: Error: Malformed IPv6 address (bad octet value). X-detected-operating-system: by eggs.gnu.org: Error: Malformed IPv6 address (bad octet value). X-Received-From: 2001:4830:134:3::11 X-Spam-Score: -5.0 (-----) X-Debbugs-Envelope-To: submit X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -5.0 (-----) --=-=-= Content-Type: text/plain This precludes porting LilyPond to Guile 2 since LilyPond relies extensively on the mark_smob functionality. The symptom likely occurs when an object is collected and consequently its free_smob function gets called. This free_smob function essentially is responsible for freeing all resources claimed by the smob, rendering the underlying data moot. If mark_smob continues to get called, its interpretation of the now dead data is not likely to lead to happy results. Debugging is tricky since the gc phases tend to occur asynchronously. If I start the accompanying program with ./test 2 20000 40 I tend to get a core dump perhaps every fourth call. Running repeatedly in a debugger does not produce core dumps (perhaps one needs to reload the debugger for each try?), so one needs to do ulimit -c unlimited ./test 2 20000 40 and, after the problem triggers gdb ./test core for post-mortem debugging. A traceback looks like (gdb) bt #0 0x00000000 in ?? () #1 0x0804aee0 in ?? () #2 0xb761b2da in ?? () from /usr/lib/libguile-2.0.so.22 #3 0xb72751f8 in GC_mark_from () from /usr/lib/i386-linux-gnu/libgc.so.1 #4 0xb72766ca in GC_mark_some () from /usr/lib/i386-linux-gnu/libgc.so.1 #5 0xb726cd16 in GC_stopped_mark () from /usr/lib/i386-linux-gnu/libgc.so.1 #6 0xb726d730 in GC_try_to_collect_inner () from /usr/lib/i386-linux-gnu/libgc.so.1 #7 0xb726e12a in GC_collect_or_expand () from /usr/lib/i386-linux-gnu/libgc.so.1 #8 0xb726e2b1 in GC_allocobj () from /usr/lib/i386-linux-gnu/libgc.so.1 #9 0xb72731a4 in GC_generic_malloc_inner () from /usr/lib/i386-linux-gnu/libgc.so.1 #10 0xb72732be in GC_generic_malloc () from /usr/lib/i386-linux-gnu/libgc.so.1 #11 0xb761b81e in scm_i_new_smob () from /usr/lib/libguile-2.0.so.22 #12 0x0804985a in std::vector >::_M_insert_aux(__gnu_cxx::__normal_iterator > >, void (* const&)()) () #13 0x0804a6fc in __gnu_cxx::new_allocator::allocate(unsigned int, void const*) () #14 0x0804a03d in void std::_Destroy(void (**)(), void (**)(), std::allocator&) () #15 0x08049a73 in void std::_Destroy(Family**, Family**, std::allocator&) () #16 0x0804934d in scm_puts () #17 0x0804938f in Scm_init::Scm_init(void (*)()) () #18 0x0804938f in Scm_init::Scm_init(void (*)()) () #19 0x0804938f in Scm_init::Scm_init(void (*)()) () #20 0x0804938f in Scm_init::Scm_init(void (*)()) () #21 0x0804938f in Scm_init::Scm_init(void (*)()) () #22 0x0804938f in Scm_init::Scm_init(void (*)()) () #23 0x0804938f in Scm_init::Scm_init(void (*)()) () #24 0x0804938f in Scm_init::Scm_init(void (*)()) () #25 0x0804938f in Scm_init::Scm_init(void (*)()) () #26 0x0804938f in Scm_init::Scm_init(void (*)()) () #27 0x0804938f in Scm_init::Scm_init(void (*)()) () #28 0x0804938f in Scm_init::Scm_init(void (*)()) () #29 0x0804938f in Scm_init::Scm_init(void (*)()) () #30 0x080495e7 in std::vector >::operator[](unsigned int) () #31 0xb75b7dfd in ?? () from /usr/lib/libguile-2.0.so.22 #32 0xb76418e7 in ?? () from /usr/lib/libguile-2.0.so.22 #33 0xb761afb9 in ?? () from /usr/lib/libguile-2.0.so.22 #34 0xb7659f20 in ?? () from /usr/lib/libguile-2.0.so.22 #35 0xb765a539 in ?? () from /usr/lib/libguile-2.0.so.22 #36 0xb75c24f3 in scm_call_4 () from /usr/lib/libguile-2.0.so.22 #37 0xb7641acf in scm_catch_with_pre_unwind_handler () from /usr/lib/libguile-2.0.so.22 #38 0xb7641bd4 in scm_c_catch () from /usr/lib/libguile-2.0.so.22 #39 0xb75b85d1 in ?? () from /usr/lib/libguile-2.0.so.22 #40 0xb75b86d3 in scm_c_with_continuation_barrier () from /usr/lib/libguile-2.0.so.22 #41 0xb763ef7e in ?? () from /usr/lib/libguile-2.0.so.22 #42 0xb72782c1 in GC_call_with_stack_base () from /usr/lib/i386-linux-gnu/libgc.so.1 #43 0xb763f3e6 in scm_with_guile () from /usr/lib/libguile-2.0.so.22 #44 0x080496c5 in void __gnu_cxx::__alloc_traits >::construct(std::allocator&, void (**)(), void (* const&)()) () #45 0xb72cfa83 in __libc_start_main (main=0x4, argc=134517216, argv=0x0, init=0x8049201 , fini=0x804967e >::~_Vector_base()+40>, rtld_fini=0x4, stack_end=0xbfca7aa4) at libc-start.c:287 #46 0xb7750000 in ?? () from /lib/ld-linux.so.2 Backtrace stopped: previous frame inner to this frame (corrupt stack?) The frames listed as #17 0x0804938f in Scm_init::Scm_init(void (*)()) () #18 ... appear to be a fluke in address/symbol correlation and do not seem to have an actual relation to the listed function when looking at the code. I append all the requisite source files producing the test binary when running "make". I can send my binary (when desired) which was compiled using 2.0.11 on a i586 platform. However, I would expect the problem to reproduce in some kind of manner on other systems. The compiler was gcc version 4.9.1 (Ubuntu 4.9.1-16ubuntu6), Target: i686-linux-gnu. The headers are somewhat cooked-down variants (to remove dependencies on other parts of LilyPond) of the Smob organizing classes used in LilyPond. If one wants to compile for Guile v1 for comparison, one probably needs typedef void * scm_t_func; or so somewhere. --=-=-= Content-Type: application/x-gtar-compressed Content-Disposition: attachment; filename=guile-bug.tgz Content-Transfer-Encoding: base64 H4sIAFkd4lQAA+xce1NbR5bPv6NP0cFVjoSFQIBxApgswdihCoMX8HpTmZTSklrSHe5r7gNZyea7 7++c031v6wEk2Uxmd8dUKiDd7tPn/epzPS6D0Gz0y/HmW31rRvjw2R/+s4Wfvb1d+t198XzL/80/ 3b2tz7rbuy+2dnZfvNjG9/iwt/OZ2vrjUVn+KfNCZ0p9NtS3D6577Pn/0Z/C5MW+yqOkn3cGA/tH gb/oAb5p/GX87JnaGI+HffXjmJVlkMSjYKwGSZTi449uZQ1kfl0YxLc/qo2E1zX+2fR++pn/GVf2 78T3x5/xiP1393a6Yv/dvb3t7R22/+d7n+z/z/jZXG8odTMJckW+X+F3qrNCJSN1HoSzd0k8bKti YtSbi/cqKvMAjmGWmtwUhck6Dew9SdJZFownhWqetNT21tbzjY3tre5z9a2ONz6YWF0EJp6Us1wd TnQ8NfG/fcx3dRh24vCI9rtj6OhRZozKk1Ex1ZnZV7OkVAMdq8wMg7zIgn5ZAMNC6Xi4mWQqSobB aAYQ+KqMhyZjRIFXlBP+Dus3JjaZDtW7sh8C/fNgYOLcKA1K6Zt8YoaqT2Bow2vC4NpioF4ngKuL IInbygR4nqk7k+X4rHbcERZeWyUZYDR1QWhnKklpWwu4zlSoi3pnZ5HomjZ8jBnmJElByASwQNo0 CEPVN6rMzagM29iNterD2c23l+9v1PHFd+rD8dXV8cXNdwdYW0wSPDV3RiAFURoGAAxyMh0XM2AN AG9Pr06+xY7jb87Oz26+A+rq9dnNxen1tXp9eaWO1bvjq5uzk/fnx1fq3furd5fXpx2lrg0hZbD/ AbaOWDDg3dAUOghzJvc7CDIHYuFQTfSdgUAHJrgDWhpRJJ09Li3A0GESj5nAinvA6Wyk4qRoqxy4 HU6KIt3f3JxOp51xXHaSbLwZyv5886jTWN9sNJ4E8SAsh0atibubTNYajbskGDbSLCnMoOjR96p5 ffKW41lb0V/rrcbPQCEfRL3xoOdWJv2/4Zdq0rrWQeMXC6iMfz2oeu0SsD/J/hf9/2Tyx5/xsP/f 3t3ae+H8/87O9gvy/8+3tj/5/z/j5w/2/92vvvrqk///5P//9/r/ETRlpK7fXn5z3fv228YTfApi U39Rh4jDMOize+xMjrxvSVrx2P9G57nJClrVgIUAy1hHJk/1AMpcDA8aDTaya/KvitTqzfuz89Nc RWYAiwjyiOg3H9MkK2jzSfPZs5aSeJCrImHWXA8mJiI2TJMsJMJvJmbGwIbJoIxM7GkPg1eRjksd gv3YdBZ71nyCes5BJ+uCaqUhcB16RoTSrciSkBBjaF/kZB8wfJXP8sJEZIJqrLO+HtPiMAQ0soqa pP4M6MBgAqZplCWgMiYCADKHepKMiSOqr6E1g5B4KMoCykAWkVZME/E2vIkWW7x5Xbejrkm9jRSe vCMAH0DEkNUwl6eO1jC4NSouoz7scJ+8RhSVhe57K9jiLEugl2w31qCILJAKPCIDAuOxoJpgL9Q8 Bj84ujvvgHXanS+xH9bLnI0MIaBGZcwcI72w+UHfDHtsDc0WA18ibpAZXbCzgnFlwV3FWFnJ6cbh CXGS9O+IgRz7WJC3SeJwZp0TnPDMk3lmgnFcmSJJHfsXZJxk+3BJOaiBboGjIxAVRBw2mHlMd5xk EZ5Cz0jczFso7IcJxRe9SCq7/bLGxNM88TVLGLDusVNkX0LoRDq7lWzL8RVixPKhGpZkrMtg2KVD aLoskggefqDudBaINtDJRBCpO58DD8NnAUx9khzQZmecaY4NNT4GLhAGENGJ1ippPdyBHtxy+qfj 2KopUQwVwqc4iTeglYh2kLPokw5LmAXkGM+AFASemb+XgaWJ/Sy1YULz0cOsb4BIAMggglDPy0D0 nE6bU0tWkQ8TxAoOtaxfCyKS6M9/9mpZwbmANFZFGxHE5wjd4iMI65w9POQZ6gygojIsAnd4pQTE GfIvs8AgQFRrBKSzzbHEBZxNJJm/f02e0ejBRCXEe6Zku4NsRLhRW82ccbOEIPZgCI8ZFLMOeZu8 WgQYAwTKmSiAW+QkCPZF1rJNOOohhSf9jQw8BPRgOgmADLZpa9NpQt4ocx7cJvhQD2zuqBWY6nCq kSaxHRAmc5awksPWXdJSRKWSrUOEPPCgz4XeHeigSeEAsXUdlME/6zD4iU4PGW1xUBx4iFGVOrIq qiZ7UazrnV6etwSIaDYOQjbWI+rAliXwHmh69hpYmo+akGxLFsn4sgLv7/uf1L7kaQMOF4f+oyPi v6umBHhPvXTYHdCDeawOltYedjrTieRsyJHigsjOwUbT6Rxh8S+Emx4V4piEx+w6SCK+FoxYd1k5 5lgfkE7ZGo/sndz1G+uKTipX1LEpuF1I/svKrE/GlUHolC0l8cDM6RKlsAUFPZIV24FqZmQUZO4t mzQoZLCkk4J9mSFVDTqms8h1tZ6CH7GZzn1LHEOOXOARMY6YoJrpxpFHe5sXMG/xoKpomd0cxgjr +uvMAIc4l6wGSjFnF22xPCpBgkEJpwHl4uzPRuUp8glEeCTv/zP0fSwr/BurjPL+oLsi2p4cn5+f XbxRZxc3p1evj09OLcwIArVKLO6CzYsSMT+rYZvcAKvqvsHHFn+BVPI2V//Jkc/xb8G91FkdVQiK UvotFYyogJho4fY0owyawprTN8tV5CZDqoe06idJiPyGIYBZw4B1EfZB4EOkbUgoCUfCsJdWODqc nhR07pMR7bDKZzdREvr23fnp21NUHTdnlxfOCXAwovxVc6KoJRVsW4cbhsmUvq6YJ75J8BbLuguG depbJKkAsPpkWUpoIm7q0OGspRnTJ/aSvZKkj5m933TUlaVG11m3cpyRkoPjOR8zTIwEWVtGsPq6 JKQth37NWYRAMfRVAF9qxNcgkEkyX2cunHJwmtY3Lo2ZUozuw8JFtjakcXgb2eQtT8lzwMNz3kKk 3B8oN7xsgZxXha/NgBGPXYeKFtIaUgOKEDVHHBYuvyF+/Db2MBgoaFTCQmmVTYUWEyHRJJsLCf4a oof+30cCamrUNDOSYWZSMEcUDESMuExXQKhfjseiW1ma5Ob34M6+uCzICbsq/YnnE0RsjKirGAjz vOCEkwMKS6uX9mh97/sflBSXCC0SiRJyF/QFhV0+ci2c7Y+zpP/1Wsembs4GJJ8dIN0aYgclSoA5 rPXKUQ5uDm7ZjSWS+I8D6hYwJ+ucF6ToOxTwnDhCJCIE0EBV9OYmSkob2oWtTvonLoes0+nMEUxx yNU/yYiA1Hk2ZDYOkz7KhnxAzY8mJcYEq7KIiU5TamokCfdTmByAsMoRzKEDn+6+4Hqcuwa6ZgVY W9lWh6BcmRGKTgpUxPIpV6CS6jjt9jIsa3JWWSkVBwRmQIsgRxrb8jKTUF3xgTJDhF9OYQMAQ7JB 2o3zuW2s9HBIAbVHiPcIU9Xk75vr+NDiv1sUpEQ5r+1KpD/iG6mirb6ktAi5i/+NB0tZYHj+M2vE iqNp3QEnQL/gTOSaKTP90J5epiY7cqhQLKAqXhrbImsCV/T6VAxxrCj0uHdQP67wqo71HjKqNdYt 75EzDwLJJuPDJKTUOtAnBUdx4QfSvKZWghWVuxQ/4RyKHpKM4tDuP+INPeoH9V4d3xyrZt6yrKjy OGI2ZJ7VWuPIIH1xBNclgWiDr3Ts8wWKp1qktpL3AyGUS1qqnToOwEzICwex65gICEnKcWy386AI RDGEqx7rHUtqSYnyOMHW+uRYDY5mZoy0CVl0WiDrtMxLW6sEsrBUkj6shGe8g1ZZbt5MEuofgr4+ pzsU3XoUB3vsqVhaMRwEdVKQjVXY2WaYAOGUbaIzmBqytZ8kcYsTdRsnUzAQWbfuUz/H9UaNbZ2R E7IQuP/NKR5nFHjwRc6+FAdd3pksC8i/0sNIkiZnHbKf4rDnGCH9HmTYq1tTOSfEmiFvDE1K38aF bPazIef4IGxb/CK9JR0YDEzOCSTCVh704U8sIwUG4V81wEUGzjtfJ9yP5o/10SoRqowtOgRMn1qL MbM6MxwCLdV+rlF5VOsmveMESlN8qWtjRzq/5YUW44p3Ni60XCsFUTxe4Kekh8NgxEZX1FsdEps2 Je7wRtn9Nsk5gOsyLKqFuW2cblDNBLDEmsrpQNdtusYWPkjGMdWvwCrIyFNmxHwO7GSajnVx3ZUq 6BlcQ4bgYFUqLJKxYbpYr1KwxzFzgaGDSv0hXy75O7ad4GVsi16xegy/EaVJSNkKmVnrgE5/n7tG E2DTsRUkz1eAxF7B9y/eraVY6dwxfhbNOTQvANvn8jF+SP7FflmQmNZ9WPWOBZwZ6OrNwkw/YXIF TC2FZT10wV8O9lwu+wlV5VyuJpIGG1I/SrdJ4QzAZNrTcguDgcMBTA2Ezo6AhMrZkovu3Aouixyo Afx7Frq2ZDjtXfAX3Lgu0zScLeq7jX1EuFVOvnWj3My5QolhR7b4oUuRKOeTKP+UVqWtFS2EMgSh U+40WLVl4omWebR8LuhikQviHqRvERRmgSsLDtHnjvV6cc08mSqTpSVgslh0UYvlb1QsAHBUIrbm SZkNqjgKDxZp9rOaXOyUO859Q1yDqIdSLHN7jVAYsskCqnOc4C8SY0Ohhru4I6S/FLjzGRiYRNbq xWOocSL1tHjVqQXByFGAqrL8n0yWELpr3ESGJNcAfhgMmEIvWotKklnMKfhLteUU//y73qvTk/Pj q1NJTt5dXZ5QeOHqfYXee0KR1B8wkJuKWlUtbM5NOX+o82PQ9MqTN/cPEim4Kb5wvknVUsEZCtW0 cjfad93owt6ccKs0CgZJKMbDCZGVlmYueUjaiLVwQWJbDVkyqPHgwCRgrHVVOTlT3k+GM7qPtV26 CoFchWZElxQjJeJ36sfepzqRtnO6PuTOgqtDuSAPk6lQPq3bcwARB9C3wF4cTPn+iotYK9xFmsgQ AweDrNRTwBnROExsYGClh3E6J+nqNslt+y6KjAKotMuAxnwfWLHERhQmWa4ng8ImLtxrc0HT3YSu 0rPm1em/t9Xlu5u2+o/jq7Y6vnpzfnZ901Irfv66SqkrKfbyYBxrakOScjcJcOvw8MvWfzWbAI8/ d/EnzkDs8iG5SzLRBHu8NQxSspSU1eL++uzCNxHxgnkiUw1MbmjEdQeWZyIINl64GEoVArnPAxuu r/f3l0+uMwzq3bgEg5XaXRnWhbf9Az6wYMWqsgo7eEBZzjJ/rOfa6K70EjVK7CJ+Pcs3unAodc1o t1F7C6j1HqqYqrLo3dXpq7OT45tTGVeylQWCtq2TVkjtPpjVkaidv+YTvrm8PO/dqP36w+tFqK6u eAjZBcC0ZX9/uTbE033hn1S61FHw7l0fqXzre9f5W4q5YPzzMrfvT7cqMoaG3OgSNKJhvpxySVpF +ZZj14qLZVGP6hBXtdnOOX9srguSUhSLOtWlMbmM1vx5yyjOI5h6jQTOsh8cj6OH3KfHLl79yDid v/7xPsXDYqrLUT7EXnFwj4Ev9KrbGT5QvmecLJfoRsBCeyq596vE6yB+rtQFNznTGUXdz+f7Cde2 k7lfHyydiPcX4tJewcIWUZAVdBVHBfrBP1PTKjlB2xYvA91RNWUvH1UbXwsf1L85ORE2vpL5p8/r UbWtrZ4usrW1kjAf1qJWPgrtEXQXOLaCXw/w98Anz37+ZUGLK+OHoixhYq1zcTi2GAzWGk8M0tWR 2lyvRqTU+uan1zj+v/wszv8W/4AXQB6e/+12d7vb9v2vF1vbO3s0/7u32/00//tn/MCuN9Y3+H4X vxuf3gX5NAv8rzkLfHNysjAMTN9QQXAiF312Bo203qWZdfeCWsgfCJ83JydtklQgnaoG9xc4qtrp H/eahbTjZGinAmg+pjpmnZIJr1xFZE4Nad2wwo+MpqY/3yfZVzClkOeRgaqPRmOZndXvu9SDy2S+ QTxKju5NnZFCNJYztVXNZmo8yLstNr1oVkWXpEj0vLVx5E8e8Esz95yMEnbFySt7xnR0287cZcWK 5nH6axDzG9gCJ30QwdWsuedajM/ngiD+QjqElLhfnd68v7roXZx+II2zN0Ns69KRcO3YkhpfAoA7 PdXlm+uY0Uwg9Xe4iWvnOLn5RUOetoM6re92dBzDmw3M3PbcNhzenGzY0UO6ceybEVkwORF3mREm AxnW8SbFQ3ct664rcjuIV1UvtoqqiG3mbf9eUrhdvRhVsdHO61LB1LOYNtM21zXwFE2wtu1dB3cG EDn4LiWCy3NZisR8adPQweKAqDarLuXHprDXrL/DGKxC+2om1F+/Oz05e31G9N+vSlKlrYA9X7fN gd/67bq5fHmziK/rt8zxq21HkkY0eedx6fdYbl2Dpqsueap349KyyFVz7cnhmq8Y8u0KcS8vWjuq dlr6ug8yTAx1Bd73XV8zql4D5d4WU3WL5l75+yN1O31QreoJgGW6qst+e8dwH2PcpMYKCNXQw0u+ Ob0fiL26ugcJO8hx32auj5e32kEX0ZgKDFDhoDa0vrfVoa9tTQyVfpcFUcA3rUMTaURKvhibm5Dn CF6N39LNUUT9YtlvwlzGg+zU3jQhd1y9OGKHlNzrEFPD9hJRzhdoe6cMvfCPuzEfKQRX7/FoewPD s2i64PEu4y4+qiuLPC+RZgI+jwTQ0BKYgGTd0LQVtcLHs84iYzxVyss+65L3FTKZYY+jC88rkAKu oRDb2X2+9+LLr9ZE6+RNp+bnlfK03JS1UyUer0aq0pMvabZrtcFuOZHczE16ck5Y0KAtEj83JeD6 616+Rb67UU/hsH4VMt6g+TZ9xQSEP9tBRsFDHRUSM0/o9kKZbpRpEs0OJdD0ghm2eLqVofJw6z7L yV4kaH4hYmZLlZFcOfLijYoYXWh7qrwTQ9u/kFFDzT2aAPVLfRvs3wdSaomllAIae6U3TOyVDl/i 2XGCglPtADJ8av1SHUk+f6mePhBpWtJ5ghhR2IkUuRSrWv1AYhGmE+UluAfzceVNztovkxOsCbDu gm7KiCx775RvFMmGezFAbsHILOlttJQFCqGP6XwZJHEaX8ibYdQdVD26qe+Bczxr50Zp5MUxFFGJ u/kl+/ESZy0TepG1ZMnE5UGQA3l3nSborr7Jy23WVdphXBq91HDj3utddGNqSbUDYfAZt1V1VvDU E2VR9PofFRJWcHP9T8Qcl7Wut5qto0qsVfLRglwbcgH32M77M5jWCuHLAPCy8BdqgCrEVRs5sq/a uZjDywiJp6suT1mtqfbpClT5SbrqSLfHnWS/nrtqx2lbAvRny0jOY8t+5t4e6El52Bvzl6uAtFUX no3+a6y6Gl36aUqIJngte1Nmu7nKHinvhd5z2MX783O7/pd5ylZe/h1VFM6xTdO0ic+1e3H30V08 6P5dD6J0pL584LxHtu62nm59HP2+k58qbOUCTzrc/6It7br/a/8Zp3/AGY/8+z87Wzs70v+lfwli a5f6vzvdT/3fP+WnbgYFCVIzoyP/zfY7fuP23vfa5fGhnS23Q+pHqpojP/gts+2cyVcrOmmZT3o8 uGKH0VHnrP7nWyS3e62jAG5s/m1B+fLIdV8W80kK3vIKaj2DTXHfwpKqwX2Q2zP76WnLv2y9CzJq IizOafJ24ZGyqNAl5m0wzN3U5CAp42LhIB4CSwod5m1e0880ikriAPgpi/b3H1lclUUL14SUyvCe 8OXWgQoPZTn+evZsPvZRLAkpYgjwVuMvfSjHrQtOjprYjg44dJqyfCPc6LY2BXi7JkD2Ev2+eOPq Sbz4/qBENkieGhqO9KWGS0VUwLUsfh3KIVRJExwVOPLm3mqiNd8HP8y9zegX19WLpECAmhoOASc0 q7H0hnWEk7sHvxEX7Hr2UlVY+LrgWldlVP3DQmqdEscw0UNrPev67k5w4EnWdXymkSr6sL7ewsN5 fEjgwWFtYBVCz54FglD1DOgIGqTQt6s6eQyRqs2XqIaSoKnvvt/5oZoFFoF7z7r1M1EQ79n2DzLb u4QpwHvccoo5oKz58FCtnRWuSlqjzwH9D5E8rNSMmFHrptPMbqWPrdXa5h0RUnlBY4d8ghO/a+zc tmqZ+YfPNehIb2goVUjTg3YtLKs/MDQ9oOxzd4FQlKCMxftcj82+6nCEVtZkgeKROmSa8DtwvMiP /hqvOQ6Yj9Qi6XqUkfpT5dH77/bq0AYAGASi6CrICvafD9IEg6mseW+CLy65e/txZlEZnbPbf18T AAAAAAAAAADwUP2Xi44AeAAA --=-=-= Content-Type: text/plain -- David Kastrup --=-=-=-- From debbugs-submit-bounces@debbugs.gnu.org Mon Feb 16 13:13:18 2015 Received: (at 19883) by debbugs.gnu.org; 16 Feb 2015 18:13:18 +0000 Received: from localhost ([127.0.0.1]:45571 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1YNQAX-0006ky-1D for submit@debbugs.gnu.org; Mon, 16 Feb 2015 13:13:17 -0500 Received: from fencepost.gnu.org ([208.118.235.10]:47350 ident=Debian-exim) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1YNQAT-0006ko-U5 for 19883@debbugs.gnu.org; Mon, 16 Feb 2015 13:13:14 -0500 Received: from localhost ([127.0.0.1]:54657 helo=lola) by fencepost.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YNQAT-0000Pj-5N for 19883@debbugs.gnu.org; Mon, 16 Feb 2015 13:13:13 -0500 Received: by lola (Postfix, from userid 1000) id C3699E0D40; Mon, 16 Feb 2015 19:13:12 +0100 (CET) From: David Kastrup To: 19883@debbugs.gnu.org Subject: Correction for backtrace Date: Mon, 16 Feb 2015 19:13:12 +0100 Message-ID: <873865n2rr.fsf@fencepost.gnu.org> MIME-Version: 1.0 Content-Type: text/plain X-Spam-Score: -5.0 (-----) X-Debbugs-Envelope-To: 19883 X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -5.0 (-----) This is embarrassing: I used the wrong executable in connection with the core dump. With the matching executable, the coredump makes a lot more sense: #0 0x00000000 in ?? () #1 0x0804aee0 in Smob_base::mark_trampoline (arg=0x9fbb000) at smobs.tcc:34 #2 0xb761b2da in ?? () from /usr/lib/libguile-2.0.so.22 #3 0xb72751f8 in GC_mark_from () from /usr/lib/i386-linux-gnu/libgc.so.1 #4 0xb72766ca in GC_mark_some () from /usr/lib/i386-linux-gnu/libgc.so.1 #5 0xb726cd16 in GC_stopped_mark () from /usr/lib/i386-linux-gnu/libgc.so.1 #6 0xb726d730 in GC_try_to_collect_inner () from /usr/lib/i386-linux-gnu/libgc.so.1 #7 0xb726e12a in GC_collect_or_expand () from /usr/lib/i386-linux-gnu/libgc.so.1 #8 0xb726e2b1 in GC_allocobj () from /usr/lib/i386-linux-gnu/libgc.so.1 #9 0xb72731a4 in GC_generic_malloc_inner () from /usr/lib/i386-linux-gnu/libgc.so.1 #10 0xb72732be in GC_generic_malloc () from /usr/lib/i386-linux-gnu/libgc.so.1 #11 0xb761b81e in scm_i_new_smob () from /usr/lib/libguile-2.0.so.22 #12 0x0804985a in scm_new_smob (tc=7039, data=158987224) at /usr/include/guile/2.0/libguile/smob.h:91 #13 0x0804a6fc in Smob_base::register_ptr (p=0x979f3d8) at smobs.tcc:53 #14 0x0804a03d in Smob::unprotected_smobify_self (this=0x979f3dc) at smobs.hh:273 #15 0x08049a73 in Smob::smobify_self (this=0x979f3dc) at smobs.hh:286 #16 0x0804934d in Family::Family (this=0x979f3d8, totals=0, branch=2) at test.cc:30 #17 0x0804938f in Family::Family (this=0x98b7c98, totals=3, branch=2) at test.cc:35 #18 0x0804938f in Family::Family (this=0x979e598, totals=8, branch=2) at test.cc:35 #19 0x0804938f in Family::Family (this=0x98b65c8, totals=17, branch=2) at test.cc:35 #20 0x0804938f in Family::Family (this=0x9934d78, totals=37, branch=2) at test.cc:35 #21 0x0804938f in Family::Family (this=0x99377c8, totals=76, branch=2) at test.cc:35 #22 0x0804938f in Family::Family (this=0x96921e0, totals=154, branch=2) at test.cc:35 #23 0x0804938f in Family::Family (this=0xa0c3998, totals=311, branch=2) at test.cc:35 #24 0x0804938f in Family::Family (this=0x9d0cc88, totals=623, branch=2) at test.cc:35 #25 0x0804938f in Family::Family (this=0x9b8b0a8, totals=1248, branch=2) at test.cc:35 #26 0x0804938f in Family::Family (this=0x9e84808, totals=2498, branch=2) at test.cc:35 #27 0x0804938f in Family::Family (this=0x9bf3d48, totals=4998, branch=2) at test.cc:35 #28 0x0804938f in Family::Family (this=0x9d05de8, totals=9998, branch=2) at test.cc:35 #29 0x0804938f in Family::Family (this=0x9b8dcb8, totals=19999, branch=2) at test.cc:35 #30 0x080495e7 in workload (avv=0xbfca7aa4) at test.cc:71 #31 0xb75b7dfd in ?? () from /usr/lib/libguile-2.0.so.22 #32 0xb76418e7 in ?? () from /usr/lib/libguile-2.0.so.22 #33 0xb761afb9 in ?? () from /usr/lib/libguile-2.0.so.22 #34 0xb7659f20 in ?? () from /usr/lib/libguile-2.0.so.22 #35 0xb765a539 in ?? () from /usr/lib/libguile-2.0.so.22 #36 0xb75c24f3 in scm_call_4 () from /usr/lib/libguile-2.0.so.22 #37 0xb7641acf in scm_catch_with_pre_unwind_handler () from /usr/lib/libguile-2.0.so.22 #38 0xb7641bd4 in scm_c_catch () from /usr/lib/libguile-2.0.so.22 #39 0xb75b85d1 in ?? () from /usr/lib/libguile-2.0.so.22 #40 0xb75b86d3 in scm_c_with_continuation_barrier () from /usr/lib/libguile-2.0.so.22 #41 0xb763ef7e in ?? () from /usr/lib/libguile-2.0.so.22 #42 0xb72782c1 in GC_call_with_stack_base () from /usr/lib/i386-linux-gnu/libgc.so.1 #43 0xb763f3e6 in scm_with_guile () from /usr/lib/libguile-2.0.so.22 #44 0x080496c5 in main (ac=4, av=0xbfca7aa4) at test.cc:85 (gdb) So the core indeed occurs when trying to call scm_gc_mark on a smob no longer (or not yet?) associated with a valid structure in memory, in a garbage collection apparently triggered during normal allocation of smob data. Sorry for the nonsensical core dump previously. -- David Kastrup From debbugs-submit-bounces@debbugs.gnu.org Wed Feb 25 17:16:21 2015 Received: (at 19883) by debbugs.gnu.org; 25 Feb 2015 22:16:21 +0000 Received: from localhost ([127.0.0.1]:58089 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1YQkFh-0005sI-6q for submit@debbugs.gnu.org; Wed, 25 Feb 2015 17:16:21 -0500 Received: from fencepost.gnu.org ([208.118.235.10]:52710 ident=Debian-exim) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1YQkFe-0005s8-TL for 19883@debbugs.gnu.org; Wed, 25 Feb 2015 17:16:19 -0500 Received: from reverse-83.fdn.fr ([80.67.176.83]:60249 helo=pluto) by fencepost.gnu.org with esmtpsa (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1YQkFe-00010c-2j; Wed, 25 Feb 2015 17:16:18 -0500 From: ludo@gnu.org (Ludovic =?utf-8?Q?Court=C3=A8s?=) To: David Kastrup Subject: Re: bug#19883: Correction for backtrace References: <87twyln70f.fsf@fencepost.gnu.org> <873865n2rr.fsf@fencepost.gnu.org> Date: Wed, 25 Feb 2015 23:16:15 +0100 In-Reply-To: <873865n2rr.fsf@fencepost.gnu.org> (David Kastrup's message of "Mon, 16 Feb 2015 19:13:12 +0100") Message-ID: <87twy91vtc.fsf@gnu.org> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.4 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Spam-Score: -5.0 (-----) X-Debbugs-Envelope-To: 19883 Cc: 19883@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -5.0 (-----) David Kastrup skribis: > This is embarrassing: I used the wrong executable in connection with the > core dump. With the matching executable, the coredump makes a lot more > sense: > > #0 0x00000000 in ?? () > #1 0x0804aee0 in Smob_base::mark_trampoline (arg=3D0x9fbb000) > at smobs.tcc:34 > #2 0xb761b2da in ?? () from /usr/lib/libguile-2.0.so.22 > #3 0xb72751f8 in GC_mark_from () from /usr/lib/i386-linux-gnu/libgc.so.1 Could you try commenting out all the SMOB mark functions in LilyPond? This doesn=E2=80=99t fix the bug, of course, but it=E2=80=99s probably a go= od workaround: user-provided mark functions are not needed in Guile 2.0 since libgc scans the whole heap for live pointers. Thanks, Ludo=E2=80=99. From debbugs-submit-bounces@debbugs.gnu.org Wed Feb 25 18:17:30 2015 Received: (at 19883) by debbugs.gnu.org; 25 Feb 2015 23:17:30 +0000 Received: from localhost ([127.0.0.1]:58109 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1YQlCs-0007H8-6C for submit@debbugs.gnu.org; Wed, 25 Feb 2015 18:17:30 -0500 Received: from fencepost.gnu.org ([208.118.235.10]:54141 ident=Debian-exim) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1YQlCq-0007H1-MU for 19883@debbugs.gnu.org; Wed, 25 Feb 2015 18:17:29 -0500 Received: from localhost ([127.0.0.1]:33214 helo=lola) by fencepost.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YQlCp-00017g-Pv; Wed, 25 Feb 2015 18:17:28 -0500 Received: by lola (Postfix, from userid 1000) id 62F6DE09A7; Thu, 26 Feb 2015 00:17:27 +0100 (CET) From: David Kastrup To: ludo@gnu.org (Ludovic =?iso-8859-1?Q?Court=E8s?=) Subject: Re: bug#19883: Correction for backtrace References: <87twyln70f.fsf@fencepost.gnu.org> <873865n2rr.fsf@fencepost.gnu.org> <87twy91vtc.fsf@gnu.org> Date: Thu, 26 Feb 2015 00:17:27 +0100 In-Reply-To: <87twy91vtc.fsf@gnu.org> ("Ludovic =?iso-8859-1?Q?Court=E8s?= =?iso-8859-1?Q?=22's?= message of "Wed, 25 Feb 2015 23:16:15 +0100") Message-ID: <878ufld1iw.fsf@fencepost.gnu.org> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.0.50 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Spam-Score: -5.0 (-----) X-Debbugs-Envelope-To: 19883 Cc: 19883@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -5.0 (-----) ludo@gnu.org (Ludovic Court=C3=A8s) writes: > David Kastrup skribis: > >> This is embarrassing: I used the wrong executable in connection with the >> core dump. With the matching executable, the coredump makes a lot more >> sense: >> >> #0 0x00000000 in ?? () >> #1 0x0804aee0 in Smob_base::mark_trampoline (arg=3D0x9fbb000) >> at smobs.tcc:34 >> #2 0xb761b2da in ?? () from /usr/lib/libguile-2.0.so.22 >> #3 0xb72751f8 in GC_mark_from () from /usr/lib/i386-linux-gnu/libgc.so.1 > > Could you try commenting out all the SMOB mark functions in LilyPond? > > This doesn=E2=80=99t fix the bug, of course, but it=E2=80=99s probably a = good > workaround: user-provided mark functions are not needed in Guile 2.0 > since libgc scans the whole heap for live pointers. Even the test program crashes at the end (when `count' is called in order to traverse the created hierarchy) when you disable the setting of the mark function in the init method in smobs.tcc. A pointer to a C++ structure does not appear to protect the corresponding SMOB data and free_smob calls the delete operator which calls destructors and clobbers the memory area. Program received signal SIGSEGV, Segmentation fault. 0x08049b0a in std::vector >::size ( this=3D0x1b8b) at /usr/include/c++/4.9/bits/stl_vector.h:655 655 { return size_type(this->_M_impl._M_finish - this->_M_impl._M_sta= rt); } (gdb) bt #0 0x08049b0a in std::vector >::size ( this=3D0x1b8b) at /usr/include/c++/4.9/bits/stl_vector.h:655 #1 0x08049498 in Family::count (this=3D0x1b7f) at test.cc:53 #2 0x0804947c in Family::count (this=3D0x834f350) at test.cc:54 #3 0x0804947c in Family::count (this=3D0x8297d40) at test.cc:54 #4 0x0804947c in Family::count (this=3D0x828a9f8) at test.cc:54 #5 0x0804947c in Family::count (this=3D0x817d768) at test.cc:54 #6 0x0804947c in Family::count (this=3D0x828d588) at test.cc:54 #7 0x0804947c in Family::count (this=3D0x83298b8) at test.cc:54 #8 0x0804947c in Family::count (this=3D0x817fe58) at test.cc:54 #9 0x080495df in workload (avv=3D0xbffff074) at test.cc:73 #10 0xb7e66dfd in ?? () from /usr/lib/libguile-2.0.so.22 #11 0xb7ef08e7 in ?? () from /usr/lib/libguile-2.0.so.22 #12 0xb7ec9fb9 in ?? () from /usr/lib/libguile-2.0.so.22 #13 0xb7f08f20 in ?? () from /usr/lib/libguile-2.0.so.22 #14 0xb7f09539 in ?? () from /usr/lib/libguile-2.0.so.22 #15 0xb7e714f3 in scm_call_4 () from /usr/lib/libguile-2.0.so.22 #16 0xb7ef0acf in scm_catch_with_pre_unwind_handler () from /usr/lib/libguile-2.0.so.22 #17 0xb7ef0bd4 in scm_c_catch () from /usr/lib/libguile-2.0.so.22 #18 0xb7e675d1 in ?? () from /usr/lib/libguile-2.0.so.22 #19 0xb7e676d3 in scm_c_with_continuation_barrier () from /usr/lib/libguile-2.0.so.22 #20 0xb7eedf7e in ?? () from /usr/lib/libguile-2.0.so.22 #21 0xb7b272c1 in GC_call_with_stack_base () from /usr/lib/i386-linux-gnu/libgc.so.1 #22 0xb7eee3e6 in scm_with_guile () from /usr/lib/libguile-2.0.so.22 #23 0x08049685 in main (ac=3D4, av=3D0xbffff074) at test.cc:85 --=20 David Kastrup From debbugs-submit-bounces@debbugs.gnu.org Thu Feb 26 06:35:42 2015 Received: (at 19883) by debbugs.gnu.org; 26 Feb 2015 11:35:42 +0000 Received: from localhost ([127.0.0.1]:58267 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1YQwjE-0007pk-Ap for submit@debbugs.gnu.org; Thu, 26 Feb 2015 06:35:40 -0500 Received: from fencepost.gnu.org ([208.118.235.10]:44327 ident=Debian-exim) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1YQwjB-0007pb-Pg for 19883@debbugs.gnu.org; Thu, 26 Feb 2015 06:35:38 -0500 Received: from pluto.bordeaux.inria.fr ([193.50.110.57]:51291 helo=pluto) by fencepost.gnu.org with esmtpsa (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1YQwjA-0005nm-KF; Thu, 26 Feb 2015 06:35:37 -0500 From: ludo@gnu.org (Ludovic =?utf-8?Q?Court=C3=A8s?=) To: David Kastrup Subject: Re: bug#19883: Correction for backtrace References: <87twyln70f.fsf@fencepost.gnu.org> <873865n2rr.fsf@fencepost.gnu.org> <87twy91vtc.fsf@gnu.org> <878ufld1iw.fsf@fencepost.gnu.org> X-URL: http://www.fdn.fr/~lcourtes/ X-Revolutionary-Date: 8 =?utf-8?Q?Vent=C3=B4se?= an 223 de la =?utf-8?Q?R?= =?utf-8?Q?=C3=A9volution?= X-PGP-Key-ID: 0xEA52ECF4 X-PGP-Key: http://www.fdn.fr/~lcourtes/ludovic.asc X-PGP-Fingerprint: 83C4 F8E5 10A3 3B4C 5BEA D15D 77DD 95E2 EA52 ECF4 X-OS: x86_64-unknown-linux-gnu Date: Thu, 26 Feb 2015 12:35:32 +0100 In-Reply-To: <878ufld1iw.fsf@fencepost.gnu.org> (David Kastrup's message of "Thu, 26 Feb 2015 00:17:27 +0100") Message-ID: <87k2z4dhx7.fsf@gnu.org> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.4 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Spam-Score: -5.0 (-----) X-Debbugs-Envelope-To: 19883 Cc: 19883@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -5.0 (-----) David Kastrup skribis: > ludo@gnu.org (Ludovic Court=C3=A8s) writes: > >> David Kastrup skribis: >> >>> This is embarrassing: I used the wrong executable in connection with the >>> core dump. With the matching executable, the coredump makes a lot more >>> sense: >>> >>> #0 0x00000000 in ?? () >>> #1 0x0804aee0 in Smob_base::mark_trampoline (arg=3D0x9fbb000) >>> at smobs.tcc:34 >>> #2 0xb761b2da in ?? () from /usr/lib/libguile-2.0.so.22 >>> #3 0xb72751f8 in GC_mark_from () from /usr/lib/i386-linux-gnu/libgc.so= .1 >> >> Could you try commenting out all the SMOB mark functions in LilyPond? >> >> This doesn=E2=80=99t fix the bug, of course, but it=E2=80=99s probably a= good >> workaround: user-provided mark functions are not needed in Guile 2.0 >> since libgc scans the whole heap for live pointers. > > Even the test program crashes at the end (when `count' is called in > order to traverse the created hierarchy) when you disable the setting of > the mark function in the init method in smobs.tcc. Could you add debugging symbols for libguile? I don=E2=80=99t understand h= ow =E2=80=98count=E2=80=99 gets called. Do you know if this is a use-after-free error? Perhaps setting MALLOC_CHECK_=3D1 would give a hint. If this is the case, Andy had the idea of turning on topological finalization in the GC. This may help for this particular case, but I vaguely recall that this breaks other finalizer-related things. (I would check by myself, but ISTR that building LilyPond =E2=80=9Con one= =E2=80=99s own=E2=80=9D is not recommended. What would you suggest? A Guix recipe would be sweet.) > A pointer to a C++ structure does not appear to protect the > corresponding SMOB data and free_smob calls the delete operator which > calls destructors and clobbers the memory area. Oh, I was mistaken in my previous message. GC scans the stack and the GC-managed heap (stuff allocated with GC_MALLOC/scm_gc_malloc et al.), but it does *not* scan the malloc/new heap. So indeed, C++ objects that hold references to =E2=80=98SCM=E2=80=99 object= s, such as instances of =E2=80=98Smob=E2=80=99, must either have a mark function, o= r they must be allocated with scm_gc_malloc. Would it be possible to add a =E2=80=98new=E2=80=99 operator to =E2=80=98Sm= ob=E2=80=99 that uses =E2=80=98scm_gc_malloc=E2=80=99, and a =E2=80=98delete=E2=80=99 operator th= at uses =E2=80=98scm_gc_free=E2=80=99? Thanks, Ludo=E2=80=99. From debbugs-submit-bounces@debbugs.gnu.org Thu Feb 26 07:32:42 2015 Received: (at 19883) by debbugs.gnu.org; 26 Feb 2015 12:32:42 +0000 Received: from localhost ([127.0.0.1]:58287 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1YQxcP-0002BM-VV for submit@debbugs.gnu.org; Thu, 26 Feb 2015 07:32:42 -0500 Received: from fencepost.gnu.org ([208.118.235.10]:45947 ident=Debian-exim) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1YQxcM-0002BD-Qe for 19883@debbugs.gnu.org; Thu, 26 Feb 2015 07:32:39 -0500 Received: from localhost ([127.0.0.1]:53252 helo=lola) by fencepost.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YQxcM-0004eQ-58; Thu, 26 Feb 2015 07:32:38 -0500 Received: by lola (Postfix, from userid 1000) id 5CBFADF676; Thu, 26 Feb 2015 13:32:00 +0100 (CET) From: David Kastrup To: ludo@gnu.org (Ludovic =?iso-8859-1?Q?Court=E8s?=) Subject: Re: bug#19883: Correction for backtrace References: <87twyln70f.fsf@fencepost.gnu.org> <873865n2rr.fsf@fencepost.gnu.org> <87twy91vtc.fsf@gnu.org> <878ufld1iw.fsf@fencepost.gnu.org> <87k2z4dhx7.fsf@gnu.org> Date: Thu, 26 Feb 2015 13:32:00 +0100 In-Reply-To: <87k2z4dhx7.fsf@gnu.org> ("Ludovic =?iso-8859-1?Q?Court=E8s?= =?iso-8859-1?Q?=22's?= message of "Thu, 26 Feb 2015 12:35:32 +0100") Message-ID: <874mq8dfb3.fsf@fencepost.gnu.org> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.0.50 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Spam-Score: -5.0 (-----) X-Debbugs-Envelope-To: 19883 Cc: 19883@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -5.0 (-----) ludo@gnu.org (Ludovic Court=C3=A8s) writes: > David Kastrup skribis: > >> ludo@gnu.org (Ludovic Court=C3=A8s) writes: >> >>> David Kastrup skribis: >>> >>>> This is embarrassing: I used the wrong executable in connection with t= he >>>> core dump. With the matching executable, the coredump makes a lot more >>>> sense: >>>> >>>> #0 0x00000000 in ?? () >>>> #1 0x0804aee0 in Smob_base::mark_trampoline (arg=3D0x9fbb000) >>>> at smobs.tcc:34 >>>> #2 0xb761b2da in ?? () from /usr/lib/libguile-2.0.so.22 >>>> #3 0xb72751f8 in GC_mark_from () from /usr/lib/i386-linux-gnu/libgc.s= o.1 >>> >>> Could you try commenting out all the SMOB mark functions in LilyPond? >>> >>> This doesn=E2=80=99t fix the bug, of course, but it=E2=80=99s probably = a good >>> workaround: user-provided mark functions are not needed in Guile 2.0 >>> since libgc scans the whole heap for live pointers. >> >> Even the test program crashes at the end (when `count' is called in >> order to traverse the created hierarchy) when you disable the setting of >> the mark function in the init method in smobs.tcc. > > Could you add debugging symbols for libguile? I don=E2=80=99t understand= how > =E2=80=98count=E2=80=99 gets called. Figure me surprised. Here is the recursive walk: int Family::count () { int sum =3D 1; for (int i =3D 0; i < kids.size (); i++) sum +=3D kids[i]->count (); return sum; } and here is the starting call in workload(): cout << "last has " << Family::unsmob (k)->count () << endl; > Do you know if this is a use-after-free error? Sure. Nothing else would clobber the kids[] array to contain bad pointers. > If this is the case, Andy had the idea of turning on topological > finalization in the GC. This may help for this particular case, but I > vaguely recall that this breaks other finalizer-related things. I don't see why. Topological finalization might help with mark-after-free. But why would it help if there is not even any mark call involved? This is clearly use-after-free. > (I would check by myself, but ISTR that building LilyPond =E2=80=9Con one= =E2=80=99s > own=E2=80=9D is not recommended. What would you suggest? A Guix recipe = would > be sweet.) Is there a reason you are not using the test program provided with this bug report? There is no real point in experimenting with LilyPond's complexity when a simple test program using its memory management classes already crashes. LilyPond's GUILEv2 branch is currently out of order again since 2.0.11 changed encoding mechanisms _again_ in an incompatible manner (what GUILE calls "stable" is anything but). It is becoming harder and harder to work around GUILE's attempts of wresting encoding control from the application, while GUILE has no byte-transparent decoding of UTF-8, does not support strings encoded in UTF-8, and (as of 2.0.11 or 2.0.10) supports _only_ string ports redecoded to UTF-8. So dealing with memory-mapped UTF-8 encoded files which are multiplexed between reading by GUILE and reading by an UTF-8 decoding parser has again been thwarted. While I try figuring out how to repair the damage this time, testing with LilyPond itself is hard to interpret since a number of problems are not related to the memory management. As long as this simple test program can show the memory management related crashes, I don't see the point in throwing people at LilyPond: that has not delivered any results the last several times I tried it. >> A pointer to a C++ structure does not appear to protect the >> corresponding SMOB data and free_smob calls the delete operator which >> calls destructors and clobbers the memory area. > > Oh, I was mistaken in my previous message. GC scans the stack and the > GC-managed heap (stuff allocated with GC_MALLOC/scm_gc_malloc et al.), > but it does *not* scan the malloc/new heap. > > So indeed, C++ objects that hold references to =E2=80=98SCM=E2=80=99 obje= cts, such as > instances of =E2=80=98Smob=E2=80=99, must either have a mark function,= or they must > be allocated with scm_gc_malloc. > > Would it be possible to add a =E2=80=98new=E2=80=99 operator to =E2=80=98= Smob=E2=80=99 that uses > =E2=80=98scm_gc_malloc=E2=80=99, and a =E2=80=98delete=E2=80=99 operator = that uses =E2=80=98scm_gc_free=E2=80=99? It would not help since many of the references are stored in STL containers (like std::vector ) which have their data allocated/deallocated separately from the memory area of the structure itself. Frankly, I don't get the current strategy of GUILE: basically any use of scm_set_smob_mark will result in a function that can be called with garbage from a smob that has already been deallocated via the function registered with scm_set_smob_free. GUILEv2 developers have resisted fixing this bug for years by trying to stop people from using scm_set_smob_mark and instead telling people to have their entire heap scanned by a conservative garbage collector. For an application like LilyPond which can easily have the heap cover more than half of the available address space and run for half an hour (when generating docs) processing independent files with large individual memory requirements, this strategy will have both considerable performance impacts as well as bleed enough randomly retained memory to run the application into the ground eventually. In my current work on fixing the encoding stuff again I have patched my code to deal with the mark-after-free errors in the free and mark trampolines myself. I need to find a solution for the encoding mess before I can actually indulge in more testing of this workaround. However, due to the intransparency of GUILE's implementation and the multithreaded collector, I have no guarantees that my work on the respective trampolines will reliably prevent all mark-after-free errors. This is something that needs to get fixed in GUILE. It does not make sense to provide a mark callback mechanism that can be called with garbage in GUILE's free store. When GUILE releases/collects memory, it does not make sense to leave the SMOB cells in a state indistinguishable from from valid data. Apart from causing crashes in mark functions, this makes work much harder for the conservative garbage collector. --=20 David Kastrup From debbugs-submit-bounces@debbugs.gnu.org Thu Feb 26 09:03:28 2015 Received: (at 19883) by debbugs.gnu.org; 26 Feb 2015 14:03:28 +0000 Received: from localhost ([127.0.0.1]:58356 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1YQz2G-0004W1-2a for submit@debbugs.gnu.org; Thu, 26 Feb 2015 09:03:28 -0500 Received: from fencepost.gnu.org ([208.118.235.10]:48667 ident=Debian-exim) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1YQz2D-0004Vq-Pi for 19883@debbugs.gnu.org; Thu, 26 Feb 2015 09:03:26 -0500 Received: from pluto.bordeaux.inria.fr ([193.50.110.57]:51386 helo=pluto) by fencepost.gnu.org with esmtpsa (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1YQz2C-0001Dd-Us; Thu, 26 Feb 2015 09:03:25 -0500 From: ludo@gnu.org (Ludovic =?utf-8?Q?Court=C3=A8s?=) To: David Kastrup Subject: Re: bug#19883: Correction for backtrace References: <87twyln70f.fsf@fencepost.gnu.org> <873865n2rr.fsf@fencepost.gnu.org> <87twy91vtc.fsf@gnu.org> <878ufld1iw.fsf@fencepost.gnu.org> <87k2z4dhx7.fsf@gnu.org> <874mq8dfb3.fsf@fencepost.gnu.org> X-URL: http://www.fdn.fr/~lcourtes/ X-Revolutionary-Date: 8 =?utf-8?Q?Vent=C3=B4se?= an 223 de la =?utf-8?Q?R?= =?utf-8?Q?=C3=A9volution?= X-PGP-Key-ID: 0xEA52ECF4 X-PGP-Key: http://www.fdn.fr/~lcourtes/ludovic.asc X-PGP-Fingerprint: 83C4 F8E5 10A3 3B4C 5BEA D15D 77DD 95E2 EA52 ECF4 X-OS: x86_64-unknown-linux-gnu Date: Thu, 26 Feb 2015 15:03:22 +0100 In-Reply-To: <874mq8dfb3.fsf@fencepost.gnu.org> (David Kastrup's message of "Thu, 26 Feb 2015 13:32:00 +0100") Message-ID: <87fv9sahxx.fsf@gnu.org> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.4 (gnu/linux) MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="=-=-=" X-Spam-Score: -5.0 (-----) X-Debbugs-Envelope-To: 19883 Cc: 19883@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -5.0 (-----) --=-=-= Content-Type: text/plain David Kastrup skribis: > Is there a reason you are not using the test program provided with this > bug report? Sorry, I had overlooked it. Please assume good faith. With this patch: --=-=-= Content-Type: text/x-patch Content-Disposition: inline diff --git a/smobs.tcc b/smobs.tcc index c8b9dfe..b1d61b7 100644 --- a/smobs.tcc +++ b/smobs.tcc @@ -132,9 +132,6 @@ void Smob_base::init () // types in order to be able to compare them. The other comparisons // are for static member functions and thus are ordinary function // pointers which work without those contortions. - if (static_cast(&Super::mark_smob) != - static_cast(&Smob_base::mark_smob)) - scm_set_smob_mark (smob_tag_, Super::mark_trampoline); scm_set_smob_print (smob_tag_, Super::print_trampoline); if (&Super::equal_p != &Smob_base::equal_p) scm_set_smob_equalp (smob_tag_, Super::equal_p); --=-=-= Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable =E2=80=9C./test 1000 1000 10000=E2=80=9D works fine. However, it=E2=80=99s not representative since the C++ objects in this test= do not hold =E2=80=98SCM=E2=80=99 references, AFAICS. > LilyPond's GUILEv2 branch is currently out of order again since 2.0.11 > changed encoding mechanisms _again_ Please submit a different bug report. >>> A pointer to a C++ structure does not appear to protect the >>> corresponding SMOB data and free_smob calls the delete operator which >>> calls destructors and clobbers the memory area. >> >> Oh, I was mistaken in my previous message. GC scans the stack and the >> GC-managed heap (stuff allocated with GC_MALLOC/scm_gc_malloc et al.), >> but it does *not* scan the malloc/new heap. >> >> So indeed, C++ objects that hold references to =E2=80=98SCM=E2=80=99 obj= ects, such as >> instances of =E2=80=98Smob=E2=80=99, must either have a mark function= , or they must >> be allocated with scm_gc_malloc. >> >> Would it be possible to add a =E2=80=98new=E2=80=99 operator to =E2=80= =98Smob=E2=80=99 that uses >> =E2=80=98scm_gc_malloc=E2=80=99, and a =E2=80=98delete=E2=80=99 operator= that uses =E2=80=98scm_gc_free=E2=80=99? > > It would not help since many of the references are stored in STL > containers (like std::vector ) which have their data > allocated/deallocated separately from the memory area of the structure > itself. Oh, OK. Still, I don=E2=80=99t think this is a problem because each C++ ob= ject has a corresponding SMOB, and said SMOB is GC-protected; thus the C++ object itself is also GC-protected until the SMOB is unprotected. Here=E2=80=99s the patch I=E2=80=99ve ended up with: --=-=-= Content-Type: text/x-patch Content-Disposition: inline diff --git a/smobs.hh b/smobs.hh index 3701280..a41a645 100644 --- a/smobs.hh +++ b/smobs.hh @@ -263,6 +263,20 @@ private: protected: Smob () : self_scm_ (SCM_UNDEFINED), protection_cons_ (SCM_EOL) { }; public: + static void *operator new (std::size_t size) + { + /* This C++ object is referred to by the corresponding SMOB, which is + itself GC-protected. Thus, no need to protect the C++ object. */ + return scm_gc_malloc (size, "lily-smob"); + } + + static void operator delete (void *thing) + { + /* Nothing to do: the GC will reclaim memory for THING when it deems + appropriate. */ + // printf ("delete %p\n", thing); + } + static size_t free_smob (SCM obj) { delete Smob_base::unregister_ptr (obj); diff --git a/smobs.tcc b/smobs.tcc index c8b9dfe..0eb0c30 100644 --- a/smobs.tcc +++ b/smobs.tcc @@ -31,7 +31,7 @@ template SCM Smob_base::mark_trampoline (SCM arg) { - return (Super::unsmob (arg))->mark_smob (); + abort (); } template @@ -60,14 +60,14 @@ template SCM Smob_base::mark_smob () { - return SCM_UNSPECIFIED; + abort (); } template size_t Smob_base::free_smob (SCM) { - return 0; + abort (); } template @@ -125,16 +125,11 @@ void Smob_base::init () // While that's not a consideration for type_p_name_, it's easier // doing it like the rest. - if (&Super::free_smob != &Smob_base::free_smob) - scm_set_smob_free (smob_tag_, Super::free_smob); // Old GCC versions get their type lattice for pointers-to-members // tangled up to a degree where we need to typecast _both_ covariant // types in order to be able to compare them. The other comparisons // are for static member functions and thus are ordinary function // pointers which work without those contortions. - if (static_cast(&Super::mark_smob) != - static_cast(&Smob_base::mark_smob)) - scm_set_smob_mark (smob_tag_, Super::mark_trampoline); scm_set_smob_print (smob_tag_, Super::print_trampoline); if (&Super::equal_p != &Smob_base::equal_p) scm_set_smob_equalp (smob_tag_, Super::equal_p); diff --git a/test.cc b/test.cc index b5ab476..6bbd435 100644 --- a/test.cc +++ b/test.cc @@ -41,9 +41,7 @@ Family::Family (int totals, int branch) SCM Family::mark_smob () { - for (int i = 0; i < kids.size (); i++) - scm_gc_mark (kids[i]->self_scm ()); - return SCM_EOL; + abort (); } int @@ -51,7 +49,11 @@ Family::count () { int sum = 1; for (int i = 0; i < kids.size (); i++) - sum += kids[i]->count (); + { + sum += kids[i]->count (); + if ((i % 100) == 0) + scm_gc (); + } return sum; } --=-=-= Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable It works on this test case, and, assuming I correctly understand LilyPond=E2=80=99s use case, it may be a viable solution for LilyPond. Cou= ld you give it a try and report back? > Frankly, I don't get the current strategy of GUILE: basically any use of > scm_set_smob_mark will result in a function that can be called with > garbage from a smob that has already been deallocated via the function > registered with scm_set_smob_free. That=E2=80=99s not true. For example, the GnuTLS bindings use the exact sa= me code for 1.8 and 2.0 without any problems; it=E2=80=99s surely a simpler ex= ample than LilyPond, of course, but still. ~~~~~~~ In the interest of the Guile and LilyPond projects, I would like to improve our communication. Let me explain what my personal expectations are. First, like many others, I=E2=80=99m a volunteer with limited bandwidth. Thus, I really pre= fer bug reports that are as concise as possible and to-the-point; I can=E2=80= =99t afford to read so much. Second, I don=E2=80=99t want to mix issues: this bug report is about a spec= ific GC issue, not about an encoding issue. It=E2=80=99s better for me (and eve= ryone I guess) when a bug report remains focused. Third, we must fix these LilyPond vs. Guile 2 issues. You and others have spent a lot of time on this already. I think it would help to get everyone involved on both sides. Thus, could you Cc: this bug report to the LilyPond developer list, or the corresponding LilyPond bug report? This is really important to me. Please don=E2=80=99t take it personally or anything. I=E2=80=99m really tr= ying hard to see how we can improve the collaboration between the two projects. Thanks, Ludo=E2=80=99. --=-=-=-- From debbugs-submit-bounces@debbugs.gnu.org Thu Feb 26 10:30:34 2015 Received: (at 19883) by debbugs.gnu.org; 26 Feb 2015 15:30:34 +0000 Received: from localhost ([127.0.0.1]:58844 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1YR0OX-0006kt-L7 for submit@debbugs.gnu.org; Thu, 26 Feb 2015 10:30:34 -0500 Received: from fencepost.gnu.org ([208.118.235.10]:51552 ident=Debian-exim) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1YR0OU-0006kk-El for 19883@debbugs.gnu.org; Thu, 26 Feb 2015 10:30:31 -0500 Received: from localhost ([127.0.0.1]:58856 helo=lola) by fencepost.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YR0OT-0003Zl-6F; Thu, 26 Feb 2015 10:30:29 -0500 Received: by lola (Postfix, from userid 1000) id 93259E0575; Thu, 26 Feb 2015 16:30:28 +0100 (CET) From: David Kastrup To: ludo@gnu.org (Ludovic =?iso-8859-1?Q?Court=E8s?=) Subject: Re: bug#19883: Correction for backtrace References: <87twyln70f.fsf@fencepost.gnu.org> <873865n2rr.fsf@fencepost.gnu.org> <87twy91vtc.fsf@gnu.org> <878ufld1iw.fsf@fencepost.gnu.org> <87k2z4dhx7.fsf@gnu.org> <874mq8dfb3.fsf@fencepost.gnu.org> <87fv9sahxx.fsf@gnu.org> Date: Thu, 26 Feb 2015 16:30:28 +0100 In-Reply-To: <87fv9sahxx.fsf@gnu.org> ("Ludovic =?iso-8859-1?Q?Court=E8s?= =?iso-8859-1?Q?=22's?= message of "Thu, 26 Feb 2015 15:03:22 +0100") Message-ID: <87wq34bsh7.fsf@fencepost.gnu.org> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.0.50 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Spam-Score: -5.0 (-----) X-Debbugs-Envelope-To: 19883 Cc: 19883@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -5.0 (-----) ludo@gnu.org (Ludovic Court=C3=A8s) writes: > David Kastrup skribis: > >> Is there a reason you are not using the test program provided with this >> bug report? > > Sorry, I had overlooked it. Please assume good faith. > > With this patch: > > diff --git a/smobs.tcc b/smobs.tcc > index c8b9dfe..b1d61b7 100644 > --- a/smobs.tcc > +++ b/smobs.tcc > @@ -132,9 +132,6 @@ void Smob_base::init () > // types in order to be able to compare them. The other comparisons > // are for static member functions and thus are ordinary function > // pointers which work without those contortions. > - if (static_cast(&Super::mark_smob) !=3D > - static_cast(&Smob_base::mark_smob)) > - scm_set_smob_mark (smob_tag_, Super::mark_trampoline); > scm_set_smob_print (smob_tag_, Super::print_trampoline); > if (&Super::equal_p !=3D &Smob_base::equal_p) > scm_set_smob_equalp (smob_tag_, Super::equal_p); > > > =E2=80=9C./test 1000 1000 10000=E2=80=9D works fine. That's not really surprising: this specifies a "fork factor" of 1000 and 1000 points of data which means that the data structures are nested only one level deep. Try ./test 2 2000 200 which will cause a nesting of about 10 levels, making it much more likely for collection/allocation problems to strike. > However, it=E2=80=99s not representative since the C++ objects in this te= st do > not hold =E2=80=98SCM=E2=80=99 references, AFAICS. Why isn't it representative? That's _very_ much the situation in LilyPond's code base. >> LilyPond's GUILEv2 branch is currently out of order again since >> 2.0.11 changed encoding mechanisms _again_ > > Please submit a different bug report. There were separate bug reports. This is considered a feature. >> It would not help since many of the references are stored in STL >> containers (like std::vector ) which have their data >> allocated/deallocated separately from the memory area of the >> structure itself. > > Oh, OK. Still, I don=E2=80=99t think this is a problem because each C++ > object has a corresponding SMOB, and said SMOB is GC-protected; thus > the C++ object itself is also GC-protected until the SMOB is > unprotected. The code given in test.cc is representative for LilyPond: most of the C++ objects refer to other C++ objects via pointers, and the protection of SMOB and C++ objects is managed through the mark callbacks. Complex C++ objects contain their own SCM value as part of the Smob base class. Simple C++ objects (derived from Simple_smob) don't and are only optionally managed by GUILE. The test program currently does not demonstrate use of Simple_smob: at the current point of time getting the crashes of Smob under control is the first target. > Here=E2=80=99s the patch I=E2=80=99ve ended up with: > > diff --git a/smobs.hh b/smobs.hh > index 3701280..a41a645 100644 > --- a/smobs.hh > +++ b/smobs.hh > @@ -263,6 +263,20 @@ private: > protected: > Smob () : self_scm_ (SCM_UNDEFINED), protection_cons_ (SCM_EOL) { }; > public: > + static void *operator new (std::size_t size) > + { > + /* This C++ object is referred to by the corresponding SMOB, which is > + itself GC-protected. Thus, no need to protect the C++ object. */ > + return scm_gc_malloc (size, "lily-smob"); > + } > + > + static void operator delete (void *thing) > + { > + /* Nothing to do: the GC will reclaim memory for THING when it deems > + appropriate. */ > + // printf ("delete %p\n", thing); > + } > + As I stated: this will not help with STL containers which are extensively used in pretty much every data structure of LilyPond. >> Frankly, I don't get the current strategy of GUILE: basically any use >> of scm_set_smob_mark will result in a function that can be called >> with garbage from a smob that has already been deallocated via the >> function registered with scm_set_smob_free. > > That=E2=80=99s not true. Shrug. I posted the backtrace and the example program. > Let me explain what my personal expectations are. First, like many > others, I=E2=80=99m a volunteer with limited bandwidth. Thus, I really p= refer > bug reports that are as concise as possible and to-the-point; I can=E2=80= =99t > afford to read so much. That's what I consider the primary reason that every offer to help with LilyPond so far has gone dead the moment I provided a branch and recipe to work on. LilyPond is a large and complex application. That's why I=C2=A0went to the pain of providing standalone code demonstrating the memo= ry allocation and management problems so that this could be worked on independently from other problems. > Second, I don=E2=80=99t want to mix issues: this bug report is about a > specific GC issue, not about an encoding issue. It=E2=80=99s better for = me > (and everyone I guess) when a bug report remains focused. Which is why I provided a separate test program that does not break whenever the "stable" Guile version changes its encoding APIs again. Several weeks ago I promised RMS to update the GUILEv2 branch to the current version of LilyPond as he got a renewed promise of cooperation. It turned out that in the mean time Ubuntu had updated Guilev2 from 2.0.9 (or something) to 2.0.11, and the encoding issues I got under control for 2.0.9 were back with a vengeance where a speedy solution was not in sight. So I spent the time to create test code _only_ involving the memory management issues. Now the complaint appears to be that the code is not representative for LilyPond. It most certainly is. > Third, we must fix these LilyPond vs. Guile 2 issues. You and others > have spent a lot of time on this already. In the last two years, nobody except myself spent any time on GUILEv2/LilyPond. Previous to that, Ian Hulin spent time on reorganizing our code base and startup code in order not to get the byte compilation, module system and other interpreter/compiler differences of GUILEv2 mess up the basic operation, mostly by disabling a lot of stuff. There will be a lot of followup work to get code management and performance into reasonable shape once it works at all. When it is not possible to run the test suites without random crashes, there is no reasonably effective way for people to work on those aspects however. > I think it would help to get everyone involved on both sides. Thus, > could you Cc: this bug report to the LilyPond developer list, or the > corresponding LilyPond bug report? This is really important to me. Shrug. I'll put a link to this bug report to a suitable LilyPond issue. It will not likely trigger a response or other feedback, however. Nobody but myself is working on GUILEv2 migration. The current C++ template class implementation/abstraction of the GUILE memory management has been written by me. No other currently active developer has had either significant involvement with the current code, or the functionally similar previous code that used C preprocessor macros instead of language features to do essentially the same. > Please don=E2=80=99t take it personally or anything. I=E2=80=99m really = trying hard > to see how we can improve the collaboration between the two projects. For better or worse, LilyPond has no other developer to offer than myself for that task. There is nobody else available who even has had exposure to GUILEv2 or even GUILE1 internals. --=20 David Kastrup From debbugs-submit-bounces@debbugs.gnu.org Thu Feb 26 13:15:36 2015 Received: (at 19883) by debbugs.gnu.org; 26 Feb 2015 18:15:36 +0000 Received: from localhost ([127.0.0.1]:58933 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1YR2yF-0002AW-MT for submit@debbugs.gnu.org; Thu, 26 Feb 2015 13:15:36 -0500 Received: from fencepost.gnu.org ([208.118.235.10]:57932 ident=Debian-exim) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1YR2yE-0002AO-4T for 19883@debbugs.gnu.org; Thu, 26 Feb 2015 13:15:34 -0500 Received: from reverse-83.fdn.fr ([80.67.176.83]:48780 helo=pluto) by fencepost.gnu.org with esmtpsa (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1YR2yD-0003XE-BD; Thu, 26 Feb 2015 13:15:33 -0500 From: ludo@gnu.org (Ludovic =?utf-8?Q?Court=C3=A8s?=) To: David Kastrup Subject: Re: bug#19883: Correction for backtrace References: <87twyln70f.fsf@fencepost.gnu.org> <873865n2rr.fsf@fencepost.gnu.org> <87twy91vtc.fsf@gnu.org> <878ufld1iw.fsf@fencepost.gnu.org> <87k2z4dhx7.fsf@gnu.org> <874mq8dfb3.fsf@fencepost.gnu.org> <87fv9sahxx.fsf@gnu.org> <87wq34bsh7.fsf@fencepost.gnu.org> X-URL: http://www.fdn.fr/~lcourtes/ X-Revolutionary-Date: 8 =?utf-8?Q?Vent=C3=B4se?= an 223 de la =?utf-8?Q?R?= =?utf-8?Q?=C3=A9volution?= X-PGP-Key-ID: 0xEA52ECF4 X-PGP-Key: http://www.fdn.fr/~lcourtes/ludovic.asc X-PGP-Fingerprint: 83C4 F8E5 10A3 3B4C 5BEA D15D 77DD 95E2 EA52 ECF4 X-OS: x86_64-unknown-linux-gnu Date: Thu, 26 Feb 2015 19:15:30 +0100 In-Reply-To: <87wq34bsh7.fsf@fencepost.gnu.org> (David Kastrup's message of "Thu, 26 Feb 2015 16:30:28 +0100") Message-ID: <87wq34zghp.fsf@gnu.org> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.4 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Spam-Score: -5.0 (-----) X-Debbugs-Envelope-To: 19883 Cc: 19883@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -5.0 (-----) David Kastrup skribis: > ludo@gnu.org (Ludovic Court=C3=A8s) writes: > >> David Kastrup skribis: [...] >>> It would not help since many of the references are stored in STL >>> containers (like std::vector ) which have their data >>> allocated/deallocated separately from the memory area of the >>> structure itself. >> >> Oh, OK. Still, I don=E2=80=99t think this is a problem because each C++ >> object has a corresponding SMOB, and said SMOB is GC-protected; thus >> the C++ object itself is also GC-protected until the SMOB is >> unprotected. > > The code given in test.cc is representative for LilyPond: most of the > C++ objects refer to other C++ objects via pointers, and the protection > of SMOB and C++ objects is managed through the mark callbacks. Complex > C++ objects contain their own SCM value as part of the Smob base class. > Simple C++ objects (derived from Simple_smob) don't and are only > optionally managed by GUILE. I don=E2=80=99t think this contradicts what I wrote above, does it? >> Here=E2=80=99s the patch I=E2=80=99ve ended up with: >> >> diff --git a/smobs.hh b/smobs.hh >> index 3701280..a41a645 100644 >> --- a/smobs.hh >> +++ b/smobs.hh >> @@ -263,6 +263,20 @@ private: >> protected: >> Smob () : self_scm_ (SCM_UNDEFINED), protection_cons_ (SCM_EOL) { }; >> public: >> + static void *operator new (std::size_t size) >> + { >> + /* This C++ object is referred to by the corresponding SMOB, which = is >> + itself GC-protected. Thus, no need to protect the C++ object. = */ >> + return scm_gc_malloc (size, "lily-smob"); >> + } >> + >> + static void operator delete (void *thing) >> + { >> + /* Nothing to do: the GC will reclaim memory for THING when it deems >> + appropriate. */ >> + // printf ("delete %p\n", thing); >> + } >> + > > As I stated: this will not help with STL containers which are > extensively used in pretty much every data structure of LilyPond. Sorry, I don=E2=80=99t understand how it doesn=E2=80=99t help. It would be a problem is =E2=80=98Smob=E2=80=99 objects could be copied, th= us ending up in non-GC-scanned storage, but IIUC they cannot be copied because their copy constructor is private. What am I missing? At any rate, I don=E2=80=99t see any failure with the test program. >> I think it would help to get everyone involved on both sides. Thus, >> could you Cc: this bug report to the LilyPond developer list, or the >> corresponding LilyPond bug report? This is really important to me. > > Shrug. I'll put a link to this bug report to a suitable LilyPond issue. Thank you. Though I want other LilyPond developers to get involved, and I=E2=80=99m afraid it would be easy for them to just ignore a side bug repo= rt. It=E2=80=99s a vital task for LilyPond, it cannot be a one-person side-proj= ect on the LilyPond side. Thanks in advance, Ludo=E2=80=99. From debbugs-submit-bounces@debbugs.gnu.org Thu Feb 26 14:25:26 2015 Received: (at 19883) by debbugs.gnu.org; 26 Feb 2015 19:25:27 +0000 Received: from localhost ([127.0.0.1]:58953 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1YR43q-0003lj-Ba for submit@debbugs.gnu.org; Thu, 26 Feb 2015 14:25:26 -0500 Received: from fencepost.gnu.org ([208.118.235.10]:60202 ident=Debian-exim) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1YR43o-0003lb-HV for 19883@debbugs.gnu.org; Thu, 26 Feb 2015 14:25:25 -0500 Received: from localhost ([127.0.0.1]:39276 helo=lola) by fencepost.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YR43n-0008Gr-Df; Thu, 26 Feb 2015 14:25:23 -0500 Received: by lola (Postfix, from userid 1000) id F18DFE0575; Thu, 26 Feb 2015 20:25:22 +0100 (CET) From: David Kastrup To: ludo@gnu.org (Ludovic =?iso-8859-1?Q?Court=E8s?=) Subject: Re: bug#19883: Correction for backtrace References: <87twyln70f.fsf@fencepost.gnu.org> <873865n2rr.fsf@fencepost.gnu.org> <87twy91vtc.fsf@gnu.org> <878ufld1iw.fsf@fencepost.gnu.org> <87k2z4dhx7.fsf@gnu.org> <874mq8dfb3.fsf@fencepost.gnu.org> <87fv9sahxx.fsf@gnu.org> <87wq34bsh7.fsf@fencepost.gnu.org> <87wq34zghp.fsf@gnu.org> Date: Thu, 26 Feb 2015 20:25:22 +0100 In-Reply-To: <87wq34zghp.fsf@gnu.org> ("Ludovic =?iso-8859-1?Q?Court=E8s?= =?iso-8859-1?Q?=22's?= message of "Thu, 26 Feb 2015 19:15:30 +0100") Message-ID: <87lhjkbhlp.fsf@fencepost.gnu.org> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.0.50 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Spam-Score: -5.0 (-----) X-Debbugs-Envelope-To: 19883 Cc: 19883@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -5.0 (-----) ludo@gnu.org (Ludovic Court=C3=A8s) writes: > David Kastrup skribis: > >> Shrug. I'll put a link to this bug report to a suitable LilyPond issue. > > Thank you. Though I want other LilyPond developers to get involved, and > I=E2=80=99m afraid it would be easy for them to just ignore a side bug re= port. > > It=E2=80=99s a vital task for LilyPond, it cannot be a one-person side-pr= oject > on the LilyPond side. Three years ago in August there was a meeting of developers at my house, a lot of information was passed around and in a concerted effort LilyPond-2.16.0 was released. Since then all developments would properly be labelled a one-person project or side-project. Now if you take a look at GUILE=C2=A02.1 development, in particular all commits that are _not_ merges from the 2.0 branch, you'll find that quite more than 90% have come from the same person. LilyPond development is quite more diverse, but the various changes in the last years have invariably been one-person projects. Of the three large GUILE-based applications Gnucash, TeXmacs, and LilyPond, the only successful migration to GUILEv2 so far has been Gnucash, and its integration with GUILE is much smaller than that of LilyPond. The one-person projects concerning finding and/or fixing language and compiler problems or restructuring the related code areas have been exclusively mine. That is not necessarily a good match since my productivity drops to near zero when I lose interest with a problem, and without anybody else keeping the ball rolling, it can easily stay there. But I have no resources I could call upon to change that. --=20 David Kastrup From debbugs-submit-bounces@debbugs.gnu.org Fri Feb 27 04:55:11 2015 Received: (at 19883) by debbugs.gnu.org; 27 Feb 2015 09:55:11 +0000 Received: from localhost ([127.0.0.1]:59256 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1YRHdW-000490-UY for submit@debbugs.gnu.org; Fri, 27 Feb 2015 04:55:11 -0500 Received: from fencepost.gnu.org ([208.118.235.10]:47046 ident=Debian-exim) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1YRHdU-00048s-KP for 19883@debbugs.gnu.org; Fri, 27 Feb 2015 04:55:09 -0500 Received: from reverse-83.fdn.fr ([80.67.176.83]:51045 helo=pluto) by fencepost.gnu.org with esmtpsa (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1YRHdT-0001oZ-JG; Fri, 27 Feb 2015 04:55:07 -0500 From: ludo@gnu.org (Ludovic =?utf-8?Q?Court=C3=A8s?=) To: David Kastrup Subject: Re: bug#19883: Correction for backtrace References: <87twyln70f.fsf@fencepost.gnu.org> <873865n2rr.fsf@fencepost.gnu.org> <87twy91vtc.fsf@gnu.org> <878ufld1iw.fsf@fencepost.gnu.org> <87k2z4dhx7.fsf@gnu.org> <874mq8dfb3.fsf@fencepost.gnu.org> <87fv9sahxx.fsf@gnu.org> <87wq34bsh7.fsf@fencepost.gnu.org> <87wq34zghp.fsf@gnu.org> <87lhjkbhlp.fsf@fencepost.gnu.org> X-URL: http://www.fdn.fr/~lcourtes/ X-Revolutionary-Date: 9 =?utf-8?Q?Vent=C3=B4se?= an 223 de la =?utf-8?Q?R?= =?utf-8?Q?=C3=A9volution?= X-PGP-Key-ID: 0xEA52ECF4 X-PGP-Key: http://www.fdn.fr/~lcourtes/ludovic.asc X-PGP-Fingerprint: 83C4 F8E5 10A3 3B4C 5BEA D15D 77DD 95E2 EA52 ECF4 X-OS: x86_64-unknown-linux-gnu Date: Fri, 27 Feb 2015 10:55:04 +0100 In-Reply-To: <87lhjkbhlp.fsf@fencepost.gnu.org> (David Kastrup's message of "Thu, 26 Feb 2015 20:25:22 +0100") Message-ID: <874mq7vfuv.fsf@gnu.org> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.4 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Spam-Score: -5.0 (-----) X-Debbugs-Envelope-To: 19883 Cc: 19883@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -5.0 (-----) David Kastrup skribis: > Three years ago in August there was a meeting of developers at my house, > a lot of information was passed around and in a concerted effort > LilyPond-2.16.0 was released. > > Since then all developments would properly be labelled a one-person > project or side-project. OK. I viewed LilyPond as a project with more people involved. > Now if you take a look at GUILE=C2=A02.1 development, in particular all > commits that are _not_ merges from the 2.0 branch, you'll find that > quite more than 90% have come from the same person. > > LilyPond development is quite more diverse, but the various changes in > the last years have invariably been one-person projects. > > Of the three large GUILE-based applications Gnucash, TeXmacs, and > LilyPond, the only successful migration to GUILEv2 so far has been > Gnucash, and its integration with GUILE is much smaller than that of > LilyPond. > > The one-person projects concerning finding and/or fixing language and > compiler problems or restructuring the related code areas have been > exclusively mine. That is not necessarily a good match since my > productivity drops to near zero when I lose interest with a problem, and > without anybody else keeping the ball rolling, it can easily stay there. Sure. There=E2=80=99s quite a number of names showing up in the lilypond-d= evel archive though. We need them to feel concerned about this. Ludo=E2=80=99. From debbugs-submit-bounces@debbugs.gnu.org Fri Feb 27 05:18:19 2015 Received: (at 19883) by debbugs.gnu.org; 27 Feb 2015 10:18:19 +0000 Received: from localhost ([127.0.0.1]:59260 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1YRHzv-0004gl-6j for submit@debbugs.gnu.org; Fri, 27 Feb 2015 05:18:19 -0500 Received: from fencepost.gnu.org ([208.118.235.10]:48483 ident=Debian-exim) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1YRHzt-0004gd-C4 for 19883@debbugs.gnu.org; Fri, 27 Feb 2015 05:18:18 -0500 Received: from localhost ([127.0.0.1]:55789 helo=lola) by fencepost.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YRHzs-0001sT-D7; Fri, 27 Feb 2015 05:18:16 -0500 Received: by lola (Postfix, from userid 1000) id 7A2B9E0BED; Fri, 27 Feb 2015 11:18:15 +0100 (CET) From: David Kastrup To: ludo@gnu.org (Ludovic =?iso-8859-1?Q?Court=E8s?=) Subject: Re: bug#19883: Correction for backtrace References: <87twyln70f.fsf@fencepost.gnu.org> <873865n2rr.fsf@fencepost.gnu.org> <87twy91vtc.fsf@gnu.org> <878ufld1iw.fsf@fencepost.gnu.org> <87k2z4dhx7.fsf@gnu.org> <874mq8dfb3.fsf@fencepost.gnu.org> <87fv9sahxx.fsf@gnu.org> <87wq34bsh7.fsf@fencepost.gnu.org> <87wq34zghp.fsf@gnu.org> <87lhjkbhlp.fsf@fencepost.gnu.org> <874mq7vfuv.fsf@gnu.org> Date: Fri, 27 Feb 2015 11:18:15 +0100 In-Reply-To: <874mq7vfuv.fsf@gnu.org> ("Ludovic =?iso-8859-1?Q?Court=E8s?= =?iso-8859-1?Q?=22's?= message of "Fri, 27 Feb 2015 10:55:04 +0100") Message-ID: <8761anbqu0.fsf@fencepost.gnu.org> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.0.50 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Spam-Score: -5.0 (-----) X-Debbugs-Envelope-To: 19883 Cc: 19883@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -5.0 (-----) ludo@gnu.org (Ludovic Court=C3=A8s) writes: > David Kastrup skribis: > >> Three years ago in August there was a meeting of developers at my house, >> a lot of information was passed around and in a concerted effort >> LilyPond-2.16.0 was released. >> >> Since then all developments would properly be labelled a one-person >> project or side-project. > > OK. I viewed LilyPond as a project with more people involved. There are a number of them involved, but with regard to coding, everybody is working on his own projects. The largest amount of "teamwork" is probably done in the translation team (which translates documentation and web pages to several different languages) and with documentation writers. There are also people compiling the releases and running the test suites. And there is feedback on the bug trackers. > Sure. There=E2=80=99s quite a number of names showing up in the > lilypond-devel archive though. We need them to feel concerned about > this. The only concern you'll get is people piping up and saying they cannot believe that not more progress has been made. Which is not all that dissimilar to what I see with GUILE and other projects even when sending ready-made patches. In this particular case however, the respective people do not have the low-level skills to actually contribute significantly. Which is the reason I try to condense the ugly stuff into well-encapsulated and controlled places instead of keeping them distributed across the code base. The memory management stuff is of that "nobody else treads here" kind. Now the encoding stuff is wildly enfuriating in its pointless erection and shifting around of road blocks for people needing to actually work with strings and buffers in their original encoding, but would possibly be suitable as material to fight with to more C programmers. --=20 David Kastrup From debbugs-submit-bounces@debbugs.gnu.org Sun Mar 01 01:51:30 2015 Received: (at 19883) by debbugs.gnu.org; 1 Mar 2015 06:51:30 +0000 Received: from localhost ([127.0.0.1]:32812 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1YRxiq-0008Vv-PW for submit@debbugs.gnu.org; Sun, 01 Mar 2015 01:51:30 -0500 Received: from world.peace.net ([50.252.239.5]:60185 ident=hope7) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1YRxik-0008Ve-LR for 19883@debbugs.gnu.org; Sun, 01 Mar 2015 01:51:26 -0500 Received: from [10.1.10.78] (helo=jojen) by world.peace.net with esmtpsa (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.72) (envelope-from ) id 1YRxiV-0002Ps-Tc; Sun, 01 Mar 2015 01:51:08 -0500 From: Mark H Weaver To: David Kastrup Subject: Re: bug#19883: Smob's mark_smob has become unreliable in Guile 2.x References: <87twyln70f.fsf@fencepost.gnu.org> Date: Sun, 01 Mar 2015 01:51:28 -0500 In-Reply-To: <87twyln70f.fsf@fencepost.gnu.org> (David Kastrup's message of "Mon, 16 Feb 2015 17:41:36 +0100") Message-ID: <87zj7x5hxr.fsf@netris.org> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.4 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain X-Spam-Score: 0.0 (/) X-Debbugs-Envelope-To: 19883 Cc: 19883@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: 0.0 (/) Hi David, Thank you for the minimal test case. This is enormously helpful. David Kastrup writes: > The symptom likely occurs when an object is collected and consequently > its free_smob function gets called. After some investigation, I believe I've confirmed that this is exactly what's happening. I added a field to the Smob class that is set to 12345678 when the object is created (and 21345678 at a later point), and then 87654321 when free_smob is called. I then observed that the segfault happens when the smob_mark function is called on a smob whose tag field is valid, but whose corresponding C++ object has 87654321 in the special field, indicating that its smob_free method was already called. This happened with GC_MARKERS=1. I believe this is a bug in BDWGC. gc_mark.h warns: --8<---------------cut here---------------start------------->8--- /* WARNING: Such a mark procedure may be invoked on an unused object */ /* residing on a free list. Such objects are cleared, except for a */ /* free list link field in the first word. Thus mark procedures may */ /* not count on the presence of a type descriptor, and must handle this */ /* case correctly somehow. */ --8<---------------cut here---------------end--------------->8--- and protect Guile users from this issue by checking that the tag on the smob cell (first word) is correct before calling the user-defined smob_mark function. However, in this case, the mark procedure was called on a smob cell whose first word was the correct tag and whose second word was non-zero and clearly a pointer to an already deleted instance of the C++ Smob class. I think the next step is to remove libguile from this test program, and turn it into a minimal demonstration of this apparent BDWGC bug, for submission to the BDWGC developers. FYI, here are the changes I made to the test program and the results I observed. First, some changes that I think might be needed for GC safety: --8<---------------cut here---------------start------------->8--- --- smobs.hh.ORIG 2015-02-15 13:35:03.000000000 -0500 +++ smobs.hh 2015-02-28 23:42:06.346803668 -0500 @@ -270,8 +271,9 @@ } SCM unprotected_smobify_self () { - self_scm_ = Smob_base::register_ptr (static_cast (this)); - return self_scm_; + SCM s = Smob_base::register_ptr (static_cast (this)); + self_scm_ = s; + return s; } void protect () { @@ -279,12 +281,12 @@ } SCM unprotect () { - unprotect_smob (self_scm_, &protection_cons_); - return self_scm_; + SCM s = self_scm_; + unprotect_smob (s, &protection_cons_); + return s; } void smobify_self () { - self_scm_ = unprotected_smobify_self (); - protect (); + protect_smob (unprotected_smobify_self (), &protection_cons_); } SCM self_scm () const { return self_scm_; } }; --8<---------------cut here---------------end--------------->8--- The changes to 'unprotected_smobify_self' and 'smobify_self" are probably only needed for a multithreaded program. The idea is to try to ensure that the smob is protected from GC by being on the stack (and not merely in the malloc/free heap) until it is protected. The change to 'unprotect' is along the same lines, and might be needed even for single-threaded programs. I don't know that 'scm_gc_unprotect_object' (called from 'unprotect_smob') will guarantee not to free the object passed to it if there are no other visible references to it. However, none of these changes fixed the problem here. BTW, one other thing to keep in mind: it will never be enough if the only GC-visible reference to an object is the C++ Smob object. There must always be a GC-visible reference to the SCM value. --8<---------------cut here---------------start------------->8--- --- smobs.hh.ORIG 2015-02-15 13:35:03.000000000 -0500 +++ smobs.hh 2015-02-28 23:42:06.346803668 -0500 @@ -263,6 +263,7 @@ protected: Smob () : self_scm_ (SCM_UNDEFINED), protection_cons_ (SCM_EOL) { }; public: + volatile int mhw_valid; static size_t free_smob (SCM obj) { delete Smob_base::unregister_ptr (obj); --- smobs.tcc.ORIG 2015-02-16 08:29:02.000000000 -0500 +++ smobs.tcc 2015-02-28 23:30:19.563298917 -0500 @@ -50,6 +50,7 @@ // announce the memory as being GC-controlled before even // allocating the controlling smob. SCM s = SCM_UNDEFINED; + p->mhw_valid = 12345678; SCM_NEWSMOB (s, smob_tag (), p); scm_gc_register_collectable_memory (p, sizeof (*p), smob_name_.c_str ()); return s; @@ -94,6 +95,9 @@ Smob_base::unregister_ptr (SCM obj) { Super *p = Super::unchecked_unsmob (obj); + if (p->mhw_valid != 12345678 && p->mhw_valid != 21345678) + abort (); + p->mhw_valid = 87654321; scm_gc_unregister_collectable_memory (p, sizeof (*p), smob_name_.c_str ()); return p; } --- test.cc.ORIG 2015-02-16 10:33:24.000000000 -0500 +++ test.cc 2015-02-28 23:28:34.690778883 -0500 @@ -28,6 +28,7 @@ Family::Family (int totals, int branch) { smobify_self (); + mhw_valid = 21345678; for (int l=0; l= totals) --8<---------------cut here---------------end--------------->8--- Here's what I observed: --8<---------------cut here---------------start------------->8--- mhw@jojen:~/guile-bug$ GC_MARKERS=1 ./test 2 20000 40 Iteration 0 Iteration 1 Iteration 2 Iteration 3 Segmentation fault (core dumped) mhw@jojen:~/guile-bug$ gdb ./test core GNU gdb (GDB) 7.8.2 Copyright (C) 2014 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details. This GDB was configured as "i686-pc-linux-gnu". Type "show configuration" for configuration details. For bug reporting instructions, please see: . Find the GDB manual and other documentation resources online at: . For help, type "help". Type "apropos word" to search for commands related to "word"... Reading symbols from ./test...done. [New LWP 30663] warning: Could not load shared library symbols for linux-gate.so.1. Do you need "set solib-search-path" or "set sysroot"? [Thread debugging using libthread_db enabled] Using host libthread_db library "/gnu/store/3g20rdmnavpblsmgppyl8jhg67nidhjk-glibc-2.20/lib/libthread_db.so.1". Core was generated by `./test 2 20000 40'. Program terminated with signal SIGSEGV, Segmentation fault. #0 0x0a2abde3 in ?? () warning: File "/gnu/store/7a5b7fqnxzbwhx7k2bhbc4i1nliq53rl-gcc-4.9.2-lib/lib/libstdc++.so.6.0.20-gdb.py" auto-loading has been declined by your `auto-load safe-path' set to "$debugdir:$datadir/auto-load:/gnu/store/3lhr8q28q6f59774di9av7ncy09jd55d-guile-2.0.11/lib/libguile-2.0.so.22.7.2-gdb.scm". To enable execution of this file add add-auto-load-safe-path /gnu/store/7a5b7fqnxzbwhx7k2bhbc4i1nliq53rl-gcc-4.9.2-lib/lib/libstdc++.so.6.0.20-gdb.py line to your configuration file "/home/mhw/.gdbinit". To completely disable this security protection add set auto-load safe-path / line to your configuration file "/home/mhw/.gdbinit". For more information about this security protection see the "Auto-loading safe path" section in the GDB manual. E.g., run from the shell: info "(gdb)Auto-loading safe path" (gdb) bt #0 0x0a2abde3 in ?? () #1 0xb766dc4a in smob_mark (addr=0xa15f6e0, mark_stack_ptr=0x9e5a320, mark_stack_limit=0x9e62000, env=0) at smob.c:335 #2 0xb758664a in GC_mark_from (mark_stack_top=0x9e5a320, mark_stack=0x9e5a000, mark_stack_limit=0x9e62000) at mark.c:737 #3 0xb7587c07 in GC_mark_some (cold_gc_frame=0xbf874ed8 "pVZ\267M\332W\267\330N\207\277") at mark.c:386 #4 0xb757da4d in GC_stopped_mark (stop_func=stop_func@entry=0xb757d5b0 ) at alloc.c:637 #5 0xb757e4a9 in GC_try_to_collect_inner (stop_func=0xb757d5b0 ) at alloc.c:456 #6 0xb757ee00 in GC_collect_or_expand (needed_blocks=needed_blocks@entry=1, ignore_off_page=ignore_off_page@entry=0, retry=retry@entry=0) at alloc.c:1266 #7 0xb757effc in GC_allocobj (gran=gran@entry=3, kind=kind@entry=1) at alloc.c:1355 #8 0xb758440b in GC_generic_malloc_inner (lb=lb@entry=24, k=k@entry=1) at malloc.c:133 #9 0xb7581e8a in GC_register_finalizer_inner (obj=obj@entry=0xa005720, fn=fn@entry=0xb766d970 , cd=cd@entry=0x0, ofn=ofn@entry=0xbf875098, ocd=ocd@entry=0xbf87509c, mp=mp@entry=0xb75811c0 ) at finalize.c:524 #10 0xb75821e5 in GC_register_finalizer_no_order (obj=obj@entry=0xa005720, fn=fn@entry=0xb766d970 , cd=cd@entry=0x0, ofn=ofn@entry=0xbf875098, ocd=ocd@entry=0xbf87509c) at finalize.c:575 #11 0xb761bfbb in scm_i_set_finalizer (obj=obj@entry=0xa005720, proc=proc@entry=0xb766d970 , data=data@entry=0x0) at finalizers.c:44 #12 0xb766e1f8 in scm_i_new_smob (tc=7039, data=171214120) at smob.c:416 #13 0x08049b12 in scm_new_smob (tc=7039, data=171214120) at /gnu/store/3lhr8q28q6f59774di9av7ncy09jd55d-guile-2.0.11/include/guile/2.0/libguile/smob.h:91 #14 0x0804a8dc in Smob_base::register_ptr (p=0xa348528) at smobs.tcc:54 #15 0x0804a281 in Smob::unprotected_smobify_self (this=0xa34852c) at smobs.hh:274 #16 0x08049d0a in Smob::smobify_self (this=0xa34852c) at smobs.hh:289 #17 0x0804960c in Family::Family (this=0xa348528, totals=8, branch=2) at test.cc:30 #18 0x08049658 in Family::Family (this=0xa2a1638, totals=17, branch=2) at test.cc:36 #19 0x08049658 in Family::Family (this=0xa2a1758, totals=37, branch=2) at test.cc:36 #20 0x08049658 in Family::Family (this=0xa349478, totals=76, branch=2) at test.cc:36 #21 0x08049658 in Family::Family (this=0xa349008, totals=154, branch=2) at test.cc:36 #22 0x08049658 in Family::Family (this=0xa347318, totals=310, branch=2) at test.cc:36 #23 0x08049658 in Family::Family (this=0xa29d938, totals=623, branch=2) at test.cc:36 #24 0x08049658 in Family::Family (this=0xa2998b8, totals=1248, branch=2) at test.cc:36 #25 0x08049658 in Family::Family (this=0xa27f818, totals=2498, branch=2) at test.cc:36 #26 0x08049658 in Family::Family (this=0xa343608, totals=4998, branch=2) at test.cc:36 #27 0x08049658 in Family::Family (this=0xa1e2a88, totals=9998, branch=2) at test.cc:36 #28 0x08049658 in Family::Family (this=0x9f9a3b8, totals=19999, branch=2) at test.cc:36 #29 0x0804989f in workload (avv=0xbf8759f4) at test.cc:72 #30 0xb7608480 in c_body (d=0xbf875884) at continuations.c:517 #31 0xb76955de in apply_catch_closure (clo=#, args=()) at throw.c:140 #32 0xb766d8cf in apply_1 (smob=#, a=()) at smob.c:142 #33 0xb76ac662 in vm_regular_engine (vm=#, program=#, argv=, nargs=) at vm-i-system.c:858 #34 0xb76ae621 in scm_c_vm_run (vm=#, program=program@entry=#, argv=argv@entry=0xbf875790, nargs=nargs@entry=4) at vm.c:768 #35 0xb76135b4 in scm_call_4 (proc=#, arg1=arg1@entry=#t, arg2=arg2@entry=#, arg3=arg3@entry=, arg4=arg4@entry=) at eval.c:507 #36 0xb76957ee in scm_catch_with_pre_unwind_handler (key=key@entry=#t, thunk=thunk@entry=#, handler=handler@entry=, pre_unwind_handler=) at throw.c:73 #37 0xb7695920 in scm_c_catch (tag=tag@entry=#t, body=body@entry=0xb7608470 , body_data=body_data@entry=0xbf875884, handler=handler@entry=0xb7608930 , handler_data=handler_data@entry=0xbf875884, pre_unwind_handler=pre_unwind_handler@entry=0xb7608630 , pre_unwind_handler_data=pre_unwind_handler_data@entry=0x9ee2020) at throw.c:207 #38 0xb7608cd7 in scm_i_with_continuation_barrier (body=body@entry=0xb7608470 , body_data=body_data@entry=0xbf875884, handler=handler@entry=0xb7608930 , handler_data=handler_data@entry=0xbf875884, pre_unwind_handler=pre_unwind_handler@entry=0xb7608630 , pre_unwind_handler_data=0x9ee2020) at continuations.c:455 #39 0xb7608da6 in scm_c_with_continuation_barrier (func=0x80497a4 , data=0xbf8759f4) at continuations.c:551 #40 0xb7692b32 in with_guile_and_parent (base=0xbf8758ec, data=0xbf875914) at threads.c:906 #41 0xb7589ac8 in GC_call_with_stack_base (fn=fn@entry=0xb7692ae0 , arg=arg@entry=0xbf875914) at misc.c:1840 #42 0xb7692fcf in scm_i_with_guile_and_parent (parent=, data=0xbf8759f4, func=0x80497a4 ) at threads.c:949 #43 scm_with_guile (func=0x80497a4 , data=0xbf8759f4) at threads.c:955 #44 0x0804997d in main (ac=4, av=0xbf8759f4) at test.cc:86 (gdb) dir /home/mhw/gc-7.4.2 Source directories searched: /home/mhw/gc-7.4.2:$cdir:$cwd (gdb) dir /home/mhw/guile-2.0.11/libguile Source directories searched: /home/mhw/guile-2.0.11/libguile:/home/mhw/gc-7.4.2:$cdir:$cwd (gdb) up #1 0xb766dc4a in smob_mark (addr=0xa15f6e0, mark_stack_ptr=0x9e5a320, mark_stack_limit=0x9e62000, env=0) at smob.c:335 warning: Source file is more recent than executable. 335 obj = scm_smobs[smobnum].mark (cell); (gdb) list 330 scm_i_pthread_setspecific (current_mark_stack_pointer, mark_stack_ptr); 331 scm_i_pthread_setspecific (current_mark_stack_limit, mark_stack_limit); 332 333 /* Invoke the SMOB's mark procedure, which will in turn invoke 334 `scm_gc_mark', which may modify `current_mark_stack_pointer'. */ 335 obj = scm_smobs[smobnum].mark (cell); 336 337 mark_stack_ptr = scm_i_pthread_getspecific (current_mark_stack_pointer); 338 339 if (SCM_NIMP (obj)) (gdb) p *(Family *)0x9f9a3b8 $1 = {> = {> = {static smob_tag_ = 7039, static scm_init_ = {}, static smob_name_ = {static npos = , _M_dataplus = {> = {<__gnu_cxx::new_allocator> = {}, }, _M_p = 0x9ea1fd4 "Family"}}, static type_p_name_ = , static smob_proc = , static smob_proc_signature_ = }, self_scm_ = #, protection_cons_ = (), mhw_valid = 21345678}, _vptr.Family = 0x804b668 , kids = { >> = { _M_impl = {> = {<__gnu_cxx::new_allocator> = {}, }, _M_start = 0x9f9a788, _M_finish = 0x9f9a78c, _M_end_of_storage = 0x9f9a78c}}, }} (gdb) p ((Family *)0x9f9a3b8)->mhw_valid $2 = 21345678 (gdb) p ((Family *)0xa348528)->mhw_valid $3 = 12345678 (gdb) p ((Family *)0xa2a1638)->mhw_valid $4 = 21345678 (gdb) x/2xw addr 0xa15f6e0: 0x00001b7f 0x0a352c58 (gdb) p *(Family *)0x0a352c58 $5 = {> = {> = {static smob_tag_ = 7039, static scm_init_ = {}, static smob_name_ = {static npos = , _M_dataplus = {> = {<__gnu_cxx::new_allocator> = {}, }, _M_p = 0x9ea1fd4 "Family"}}, static type_p_name_ = , static smob_proc = , static smob_proc_signature_ = }, self_scm_ = #, protection_cons_ = (), mhw_valid = 87654321}, _vptr.Family = 0xa202ca8, kids = { >> = { _M_impl = {> = {<__gnu_cxx::new_allocator> = {}, }, _M_start = 0xa352cf8, _M_finish = 0xa352d00, _M_end_of_storage = 0xa352d00}}, }} (gdb) p/x 7039 $6 = 0x1b7f (gdb) --8<---------------cut here---------------end--------------->8--- Regards, Mark From debbugs-submit-bounces@debbugs.gnu.org Sun Mar 01 05:10:11 2015 Received: (at 19883) by debbugs.gnu.org; 1 Mar 2015 10:10:11 +0000 Received: from localhost ([127.0.0.1]:32891 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1YS0p8-0004jh-Ig for submit@debbugs.gnu.org; Sun, 01 Mar 2015 05:10:11 -0500 Received: from fencepost.gnu.org ([208.118.235.10]:56771 ident=Debian-exim) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1YS0p5-0004jY-DV for 19883@debbugs.gnu.org; Sun, 01 Mar 2015 05:10:08 -0500 Received: from localhost ([127.0.0.1]:35845 helo=lola) by fencepost.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YS0p3-0005ge-5W; Sun, 01 Mar 2015 05:10:06 -0500 Received: by lola (Postfix, from userid 1000) id D4F10E0460; Sun, 1 Mar 2015 11:09:46 +0100 (CET) From: David Kastrup To: Mark H Weaver Subject: Re: bug#19883: Smob's mark_smob has become unreliable in Guile 2.x References: <87twyln70f.fsf@fencepost.gnu.org> <87zj7x5hxr.fsf@netris.org> Date: Sun, 01 Mar 2015 11:09:46 +0100 In-Reply-To: <87zj7x5hxr.fsf@netris.org> (Mark H. Weaver's message of "Sun, 01 Mar 2015 01:51:28 -0500") Message-ID: <87385p9ggl.fsf@fencepost.gnu.org> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.0.50 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain X-Spam-Score: -5.0 (-----) X-Debbugs-Envelope-To: 19883 Cc: 19883@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -5.0 (-----) Mark H Weaver writes: > Hi David, > > Thank you for the minimal test case. This is enormously helpful. > > David Kastrup writes: > >> The symptom likely occurs when an object is collected and consequently >> its free_smob function gets called. > > After some investigation, I believe I've confirmed that this is exactly > what's happening. > > I added a field to the Smob class that is set to 12345678 when the > object is created (and 21345678 at a later point), and then 87654321 > when free_smob is called. I then observed that the segfault happens > when the smob_mark function is called on a smob whose tag field is > valid, but whose corresponding C++ object has 87654321 in the special > field, indicating that its smob_free method was already called. This > happened with GC_MARKERS=1. There are a few remedies for the test case that do not necessarily apply to the full-blown code. I was able to avoid the crashes here by declaring the mark_smob function non-virtual. Declaring it virtual was actually more an oversight than anything else. Not routing the scm_mark call through a pointer fetched from the no longer existent virtual function table will stop the crashes in the example code. It's not an option for LilyPond proper, however, since the type lattice is complex enough that routing some of the marking process through a virtual member function is not really avoidable. Another more likely effective patch (which already made it into LilyPond's code base) _clears_ the SCM data (where the pointer to the C++ structure is stored) in unregister_ptr. Since all calls to the member functions mark_smob (and print_smob accordingly) are routed through the same mark_trampoline, letting this trampoline check for a non-null pointer avoids the mark-after-free situation in the general case. However, this is still a probabilistic fix. For one thing, there might be race conditions inside of GUILE. For another, the basic premise is that calling the mark function with garbage must not result in crashes. Since LilyPond relies on C++ data structures and containers for its operation and those are also involved in the mark walks, "can be called on arbitrary garbage" (of which already freed smobs are just one special case, so catering for them is not sufficient) is not an option that can be implemented with reasonable amount of effort in the mark_smob routines of LilyPond. However, from the GUILE side the mark_smob calls originate from objects detected to be SCM cells, and GUILE should be able to manage areas filled with SCM cells such that it can _reliably_ distinguish allocated from unallocated cells and there is no such thing as random garbage. When that is the case, it may be sufficient to clear out the tag field in the course of calling the free_smob hook, avoiding any subsequent mark calls to occur on a cell that has lost its type consistency anyway. Again, this might involve race conditions: no idea. Because of potential race conditions, a reliable fix needs to be in GUILE or BDWGC and cannot be done at the application end. The best we will be able to achieve there is "crashes almost never randomly". Which is what we should get in LilyPond with (already committed to the code base as . > I believe this is a bug in BDWGC. gc_mark.h warns: > > /* WARNING: Such a mark procedure may be invoked on an unused object */ > /* residing on a free list. Such objects are cleared, except for a */ > /* free list link field in the first word. Thus mark procedures may */ > /* not count on the presence of a type descriptor, and must handle this */ > /* case correctly somehow. */ > > and protect Guile users from this issue by checking that the tag on the > smob cell (first word) is correct before calling the user-defined > smob_mark function. However, in this case, the mark procedure was > called on a smob cell whose first word was the correct tag and whose > second word was non-zero and clearly a pointer to an already deleted > instance of the C++ Smob class. > > I think the next step is to remove libguile from this test program, and > turn it into a minimal demonstration of this apparent BDWGC bug, for > submission to the BDWGC developers. I don't know about the interplay of GUILE and BDWGC. It may be that the general C freestore operation that BDWGC provides cannot prevent the "can be called on garbage" effect and functions need to be robust against that. However, the specific "garbage" here resulted directly from marking an already freed smob, and with a frequency that is far above random. When that happens, it is extremely unlikely that memory is being either successfully retained or freed. And that sounds like a bad idea in BDWGC itself. > FYI, here are the changes I made to the test program and the results I > observed. > > First, some changes that I think might be needed for GC safety: > > --- smobs.hh.ORIG 2015-02-15 13:35:03.000000000 -0500 > +++ smobs.hh 2015-02-28 23:42:06.346803668 -0500 > @@ -270,8 +271,9 @@ > } > SCM unprotected_smobify_self () > { > - self_scm_ = Smob_base::register_ptr (static_cast (this)); > - return self_scm_; > + SCM s = Smob_base::register_ptr (static_cast (this)); > + self_scm_ = s; > + return s; Don't see the intended difference here. It's conceivable that with several "volatile" declarations the compiler may generate different code that would protect against self_scm_ being modified before the function returns. Is that supposed to help against non-atomic changes of self_scm_? > @@ -279,12 +281,12 @@ > } > SCM unprotect () > { > - unprotect_smob (self_scm_, &protection_cons_); > - return self_scm_; > + SCM s = self_scm_; > + unprotect_smob (s, &protection_cons_); > + return s; > } I see how this one would be advisable in multi-threaded applications. > void smobify_self () { > - self_scm_ = unprotected_smobify_self (); > - protect (); > + protect_smob (unprotected_smobify_self (), &protection_cons_); > } > SCM self_scm () const { return self_scm_; } > }; > > > The changes to 'unprotected_smobify_self' and 'smobify_self" are > probably only needed for a multithreaded program. The idea is to try to > ensure that the smob is protected from GC by being on the stack (and not > merely in the malloc/free heap) until it is protected. > > The change to 'unprotect' is along the same lines, and might be needed > even for single-threaded programs. I don't know that > 'scm_gc_unprotect_object' (called from 'unprotect_smob') will guarantee > not to free the object passed to it if there are no other visible > references to it. > > However, none of these changes fixed the problem here. I'll put them in as there is no sane reason not to. However, they appear to mostly concern low-probability races whereas we still have an elephant in the room. > BTW, one other thing to keep in mind: it will never be enough if the > only GC-visible reference to an object is the C++ Smob object. There > must always be a GC-visible reference to the SCM value. That's why we need the mark functions in LilyPond. It would be a really large PITA to have to route everything through SCM since there is a really large corpus of code and non-SCM data structures involved. > (gdb) bt > #0 0x0a2abde3 in ?? () > #1 0xb766dc4a in smob_mark (addr=0xa15f6e0, mark_stack_ptr=0x9e5a320, mark_stack_limit=0x9e62000, env=0) at smob.c:335 > #2 0xb758664a in GC_mark_from (mark_stack_top=0x9e5a320, mark_stack=0x9e5a000, mark_stack_limit=0x9e62000) at mark.c:737 > #3 0xb7587c07 in GC_mark_some (cold_gc_frame=0xbf874ed8 "pVZ\267M\332W\267\330N\207\277") at mark.c:386 > #4 0xb757da4d in GC_stopped_mark (stop_func=stop_func@entry=0xb757d5b0 ) at alloc.c:637 > #5 0xb757e4a9 in GC_try_to_collect_inner (stop_func=0xb757d5b0 ) at alloc.c:456 > #6 0xb757ee00 in GC_collect_or_expand (needed_blocks=needed_blocks@entry=1, ignore_off_page=ignore_off_page@entry=0, > retry=retry@entry=0) at alloc.c:1266 > #7 0xb757effc in GC_allocobj (gran=gran@entry=3, kind=kind@entry=1) at alloc.c:1355 > #8 0xb758440b in GC_generic_malloc_inner (lb=lb@entry=24, k=k@entry=1) at malloc.c:133 > #9 0xb7581e8a in GC_register_finalizer_inner (obj=obj@entry=0xa005720, fn=fn@entry=0xb766d970 , > cd=cd@entry=0x0, ofn=ofn@entry=0xbf875098, ocd=ocd@entry=0xbf87509c, > mp=mp@entry=0xb75811c0 ) at finalize.c:524 > #10 0xb75821e5 in GC_register_finalizer_no_order (obj=obj@entry=0xa005720, fn=fn@entry=0xb766d970 , > cd=cd@entry=0x0, ofn=ofn@entry=0xbf875098, ocd=ocd@entry=0xbf87509c) at finalize.c:575 > #11 0xb761bfbb in scm_i_set_finalizer (obj=obj@entry=0xa005720, proc=proc@entry=0xb766d970 , > data=data@entry=0x0) at finalizers.c:44 > #12 0xb766e1f8 in scm_i_new_smob (tc=7039, data=171214120) at smob.c:416 > #13 0x08049b12 in scm_new_smob (tc=7039, data=171214120) > at /gnu/store/3lhr8q28q6f59774di9av7ncy09jd55d-guile-2.0.11/include/guile/2.0/libguile/smob.h:91 > #14 0x0804a8dc in Smob_base::register_ptr (p=0xa348528) at smobs.tcc:54 > #15 0x0804a281 in Smob::unprotected_smobify_self (this=0xa34852c) at smobs.hh:274 > #16 0x08049d0a in Smob::smobify_self (this=0xa34852c) at smobs.hh:289 That's when smobify_self tries to create a smob pointing to the C++-allocated data structure. This calls scm_new_smob, and in the course of that, garbage is apparently collected. And smob_mark is called on an already freed C++ structure with a bad virtual table, so when mark_trampoline tries routing the virtual table call to mark_smob, it crashes. With the already checked-in changes and your suggested changes, we'll likely get the mark-after-free crashes under control. They still seem to warrant fixing at the BDWGC level as they, crashes aside, appear to cause memory to be in some half-released state good for neither retaining nor reallocation. But the mark-on-general-garbage problem seems sort of system-immanent to conservative garbage collection while making the management of C++ data structures (often relying on some internal consistency outside of user control) extremely awkward. Fixing that at the SCM level seems both desirable and doable and will likely require GUILE to not let garbage (rather than recognizably unallocated cells) into its managed cell store. -- David Kastrup From debbugs-submit-bounces@debbugs.gnu.org Sun Mar 01 10:00:29 2015 Received: (at 19883) by debbugs.gnu.org; 1 Mar 2015 15:00:29 +0000 Received: from localhost ([127.0.0.1]:33418 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1YS5M5-0004hA-Gk for submit@debbugs.gnu.org; Sun, 01 Mar 2015 10:00:29 -0500 Received: from fencepost.gnu.org ([208.118.235.10]:33647 ident=Debian-exim) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1YS5M2-0004gz-8f for 19883@debbugs.gnu.org; Sun, 01 Mar 2015 10:00:27 -0500 Received: from localhost ([127.0.0.1]:40954 helo=lola) by fencepost.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YS5M1-0007zt-FA; Sun, 01 Mar 2015 10:00:25 -0500 Received: by lola (Postfix, from userid 1000) id DD679E0460; Sun, 1 Mar 2015 16:00:24 +0100 (CET) From: David Kastrup To: Mark H Weaver Subject: Re: bug#19883: Smob's mark_smob has become unreliable in Guile 2.x References: <87twyln70f.fsf@fencepost.gnu.org> <87zj7x5hxr.fsf@netris.org> Date: Sun, 01 Mar 2015 16:00:24 +0100 In-Reply-To: <87zj7x5hxr.fsf@netris.org> (Mark H. Weaver's message of "Sun, 01 Mar 2015 01:51:28 -0500") Message-ID: <87lhjg9307.fsf@fencepost.gnu.org> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.0.50 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain X-Spam-Score: -5.0 (-----) X-Debbugs-Envelope-To: 19883 Cc: 19883@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -5.0 (-----) Mark H Weaver writes: [...] The race-condition related patch suggestions have been submitted to LilyPond's tracker. Tracker issue: 4304 (http://code.google.com/p/lilypond/issues/detail?id=4304) Rietveld issue: 210810043 (https://codereview.appspot.com/210810043) Issue description: Smob code changes mostly relevant for multithreading These changes have been suggested by Mark H Weaver in -- David Kastrup From debbugs-submit-bounces@debbugs.gnu.org Sun Mar 01 12:02:39 2015 Received: (at 19883) by debbugs.gnu.org; 1 Mar 2015 17:02:39 +0000 Received: from localhost ([127.0.0.1]:33495 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1YS7GI-0007gN-S2 for submit@debbugs.gnu.org; Sun, 01 Mar 2015 12:02:39 -0500 Received: from world.peace.net ([50.252.239.5]:60624 ident=hope8) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1YS7GC-0007gA-I6 for 19883@debbugs.gnu.org; Sun, 01 Mar 2015 12:02:36 -0500 Received: from [10.1.10.78] (helo=jojen) by world.peace.net with esmtpsa (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.72) (envelope-from ) id 1YS7G0-0004fC-Ch; Sun, 01 Mar 2015 12:02:20 -0500 From: Mark H Weaver To: David Kastrup Subject: Re: bug#19883: Smob's mark_smob has become unreliable in Guile 2.x References: <87twyln70f.fsf@fencepost.gnu.org> <87zj7x5hxr.fsf@netris.org> <87385p9ggl.fsf@fencepost.gnu.org> Date: Sun, 01 Mar 2015 12:02:41 -0500 In-Reply-To: <87385p9ggl.fsf@fencepost.gnu.org> (David Kastrup's message of "Sun, 01 Mar 2015 11:09:46 +0100") Message-ID: <87vbik647i.fsf@netris.org> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.4 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain X-Spam-Score: 0.0 (/) X-Debbugs-Envelope-To: 19883 Cc: 19883@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: 0.0 (/) David Kastrup writes: > Since LilyPond relies on C++ data structures and containers for its > operation and those are also involved in the mark walks, "can be called > on arbitrary garbage" Who are you quoting here? > But the mark-on-general-garbage problem seems sort of system-immanent to > conservative garbage collection [...] Marking arbitrary garbage is certainly not inherent (is that the word you meant to use?) to conservative garbage collection. There may be bugs, but BDWGC is designed to prevent this. See the "Allocation" and "Mark phase" sections of . In particular, from the "Allocation" section: The collector uses a two level allocator. A large block is defined to be one larger than half of HBLKSIZE, which is a power of 2, typically on the order of the page size. Large block sizes are rounded up to the next multiple of HBLKSIZE and then allocated by GC_allochblk. Recent versions of the collector use an approximate best fit algorithm by keeping free lists for several large block sizes. The actual implementation of GC_allochblk is significantly complicated by black-listing issues (see below). Small blocks are allocated in chunks of size HBLKSIZE. Each chunk is dedicated to only one object size and kind. >From the "Mark phase" section: We determine whether a candidate pointer is actually the address of a heap block. This is done in the following steps: The candidate pointer is checked against rough heap bounds. These heap bounds are maintained such that all actual heap objects fall between them. In order to facilitate black-listing (see below) we also include address regions that the heap is likely to expand into. Most non-pointers fail this initial test. The candidate pointer is divided into two pieces; the most significant bits identify a HBLKSIZE-sized page in the address space, and the least significant bits specify an offset within that page. (A hardware page may actually consist of multiple such pages. HBLKSIZE is usually the page size divided by a small power of two.) The page address part of the candidate pointer is looked up in a table. Each table entry contains either 0, indicating that the page is not part of the garbage collected heap, a small integer n, indicating that the page is part of large object, starting at least n pages back, or a pointer to a descriptor for the page. In the first case, the candidate pointer i not a true pointer and can be safely ignored. In the last two cases, we can obtain a descriptor for the page containing the beginning of the object. The starting address of the referenced object is computed. The page descriptor contains the size of the object(s) in that page, the object kind, and the necessary mark bits for those objects. The size information can be used to map the candidate pointer to the object starting address. To accelerate this process, the page header also contains a pointer to a precomputed map of page offsets to displacements from the beginning of an object. The use of this map avoids a potentially slow integer remainder operation in computing the object start address. The mark bit for the target object is checked and set. If the object was previously unmarked, the object is pushed on the mark stack. The descriptor is read from the page descriptor. (This is computed from information GC_obj_kinds when the page is first allocated.) In Guile, there is a dedicated "kind" for SMOBs that have user-defined mark functions. This means that they are segregrated into their own HBLKSIZE-sized blocks, and so BDWGC is able to reliably determine whether our general SMOB marker ('smob_mark' in smob.c) should be called, without relying on some unreliable tag inside the block itself. So, while it is true that a conservative GC cannot avoid possibly retaining some garbage that should be freed, it is possible to reliably avoid calling mark functions on arbitrary garbage, and BDWGC is designed to do this. BTW, a few more points to mention: I vaguely recall that you may have wondered in the past if Boehm GC "parallel marking" means that it is marking while the mutator (user code) is running. That is not the case. BDWGC first stops all user threads (on POSIX I believe this is done by sending signals to all threads and keeping them in the signal handlers during GC) and then runs multiple of its own marker threads in parallel. I guess this means that user-specified mark functions may be called in multiple threads. However, this can be avoided by setting the GC_MARKERS environment variable to "1", and I set this while running your test code to eliminate this possible complication for now. Another document worth looking at is: http://www.hboehm.info/gc/finalization.html > However, from the GUILE side the mark_smob calls originate from objects > detected to be SCM cells, and GUILE should be able to manage areas > filled with SCM cells such that it can _reliably_ distinguish allocated > from unallocated cells and there is no such thing as random garbage. As described above, BDWGC already handles this, and so far I haven't seen evidence of 'smob_mark' being called on random garbage. However, it seems that 'smob_mark' is being called on objects that have already been finalized. > When that is the case, it may be sufficient to clear out the tag field > in the course of calling the free_smob hook, avoiding any subsequent > mark calls to occur on a cell that has lost its type consistency anyway. > Again, this might involve race conditions: no idea. Yes, this sounds like a work-around worth considering. I'll think about whether it can be done reliably in Guile and/or LilyPond. It occurs to me that during times of heavy allocation, it might be possible for a GC to be triggered while a finalizer is being run. Since the object being finalized would be on the stack, it would be considered live by the GC, even if most or all of the finalizer had already been run. It warrants investigation. Also, while reading the GC design documentation, a particular warning raised some alarms. From "Finalization" in gcdescr.html: Any objects remaining unmarked at the end of this process are added to a queue of objects whose finalizers can be run. Depending on collector configuration, finalizers are dequeued and run either implicitly during allocation calls, or explicitly in response to a user request. (Note that the former is unfortunately both the default and not generally safe. If finalizers perform synchronization, it may result in deadlocks. Nontrivial finalizers generally need to perform synchronization, and thus require a different collector configuration.) and from finalization.html: Finalization code may be run anyplace an allocation or other call to the collector takes place. In multithreaded programs, finalizers have to obey the normal locking conventions to ensure safety. Code run directly from finalizers should not acquire locks that may be held during allocation. This restriction can be easily circumvented by registering a finalizer which enqueues the real action for execution in a separate thread. In single-threaded code, it is also often easiest to have finalizers queue actions, which are then explicitly run during an explicit call by the user's program. So it may be that we need to be careful of this. Here's more recommended reading on the difficulties of implementing finalizers properly: http://www.hpl.hp.com/techreports/2002/HPL-2002-335.html Finalizers are a huge can of worms. I suspect this is why Ludovic was encouraging you to avoid them entirely, but I agree that we need to find a way to make them work for when they are needed. > Because of potential race conditions, a reliable fix needs to be in > GUILE or BDWGC and cannot be done at the application end. The best we > will be able to achieve there is "crashes almost never randomly". I'm not sure this is true. It will require more thought. >> First, some changes that I think might be needed for GC safety: >> >> --- smobs.hh.ORIG 2015-02-15 13:35:03.000000000 -0500 >> +++ smobs.hh 2015-02-28 23:42:06.346803668 -0500 >> @@ -270,8 +271,9 @@ >> } >> SCM unprotected_smobify_self () >> { >> - self_scm_ = Smob_base::register_ptr (static_cast (this)); >> - return self_scm_; >> + SCM s = Smob_base::register_ptr (static_cast (this)); >> + self_scm_ = s; >> + return s; > > Don't see the intended difference here. It's conceivable that with > several "volatile" declarations the compiler may generate different code > that would protect against self_scm_ being modified before the function > returns. Is that supposed to help against non-atomic changes of > self_scm_? No, the idea is to ensure the existence of a GC-visible copy of the SCM value until 'scm_gc_protect_object' is called. For objects allocated in the malloc/free heap, self_scm_ is not visible to the GC. >> BTW, one other thing to keep in mind: it will never be enough if the >> only GC-visible reference to an object is the C++ Smob object. There >> must always be a GC-visible reference to the SCM value. > > That's why we need the mark functions in LilyPond. The mark functions can ensure that in your Family class, the children will be marked even though the parent->child pointers point directly to the C++ object. However, whenever the root of the tree is not protected by 'scm_gc_protect_object', the SCM value of that root must be visible to the GC, in order to cause the user-specified 'mark' function to be called in the first place. It should be noted that for any particular C++ class, another alternative is to arrange for instances of that class to be allocated using 'scm_gc_malloc', as Ludovic suggested. If you do that, then it would suffice to have a GC-visible pointer to instances of that class, which might make your life easier. Regards, Mark From debbugs-submit-bounces@debbugs.gnu.org Sun Mar 01 13:38:19 2015 Received: (at 19883) by debbugs.gnu.org; 1 Mar 2015 18:38:19 +0000 Received: from localhost ([127.0.0.1]:33523 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1YS8kt-0001TT-2v for submit@debbugs.gnu.org; Sun, 01 Mar 2015 13:38:19 -0500 Received: from fencepost.gnu.org ([208.118.235.10]:36608 ident=Debian-exim) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1YS8kr-0001TM-Oh for 19883@debbugs.gnu.org; Sun, 01 Mar 2015 13:38:18 -0500 Received: from localhost ([127.0.0.1]:43915 helo=lola) by fencepost.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YS8kq-0005XD-OD; Sun, 01 Mar 2015 13:38:17 -0500 Received: by lola (Postfix, from userid 1000) id 454E0E05DB; Sun, 1 Mar 2015 19:38:16 +0100 (CET) From: David Kastrup To: Mark H Weaver Subject: Re: bug#19883: Smob's mark_smob has become unreliable in Guile 2.x References: <87twyln70f.fsf@fencepost.gnu.org> <87zj7x5hxr.fsf@netris.org> <87385p9ggl.fsf@fencepost.gnu.org> <87vbik647i.fsf@netris.org> Date: Sun, 01 Mar 2015 19:38:16 +0100 Message-ID: <877fv08sx3.fsf@fencepost.gnu.org> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.0.50 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain X-Spam-Score: -5.0 (-----) X-Debbugs-Envelope-To: 19883 Cc: 19883@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -5.0 (-----) Mark H Weaver writes: > Finalizers are a huge can of worms. I suspect this is why Ludovic was > encouraging you to avoid them entirely, but I agree that we need to > find a way to make them work for when they are needed. Again, LilyPond does not really have an option to avoid finalizers (free_smob action): the C++ class instances associated with smobs contain STL containers, and the storage for those is allocated separately on the heap. Running the destructor of the C++ class instance (which is the basic finalization action done when calling the delete operator on the C++ class instance) will then also destruct the STL container, freeing all memory associated with _that_. We don't have control over the storage layout of the STL templates. While one could allocate them using a separate allocation template parameter, that may be a C++11 feature. And it would really complicate the code. Moving all of the smobification stuff into a few template classes has actually locked up our release process for almost a year: 2.19.15 was released in September, 2.19.16 is currently being uploaded. The release-building crosscompiling environment had to be updated to a newer GCC version to accept the templated code, and that GCC reason had new dependencies and needed to be bootstrapped with a C++ rather than a C compiler. And the person who has run the release process the last few years was not actually a programmer. > The mark functions can ensure that in your Family class, the children > will be marked even though the parent->child pointers point directly > to the C++ object. However, whenever the root of the tree is not > protected by 'scm_gc_protect_object', the SCM value of that root must > be visible to the GC, in order to cause the user-specified 'mark' > function to be called in the first place. > > It should be noted that for any particular C++ class, another > alternative is to arrange for instances of that class to be allocated > using 'scm_gc_malloc', as Ludovic suggested. If you do that, then it > would suffice to have a GC-visible pointer to instances of that class, > which might make your life easier. Again: a lot of smobified C++ structures contain STL containers that have their separate allocation that can easily have pointers to C++ structures associated with smobs again that need to be marked. At any rate: since LilyPond can eat a significant ratio of the entire memory space on 32bit machines, I consider it prudent not to have basically all of that scanned by the conservative garbage collector since that would significantly increase the number of false positives when marking. And we are talking about an advertised GUILE feature after all. With regard to "making our life easier": it's a sunk cost since all the work has obviously been done for GUILEv1 already. -- David Kastrup From debbugs-submit-bounces@debbugs.gnu.org Thu Jun 23 05:51:00 2016 Received: (at 19883) by debbugs.gnu.org; 23 Jun 2016 09:51:00 +0000 Received: from localhost ([127.0.0.1]:51797 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1bG1Hn-0001Uo-Vf for submit@debbugs.gnu.org; Thu, 23 Jun 2016 05:51:00 -0400 Received: from pb-sasl2.pobox.com ([64.147.108.67]:64208 helo=sasl.smtp.pobox.com) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1bG1Hm-0001Ug-D7 for 19883@debbugs.gnu.org; Thu, 23 Jun 2016 05:50:58 -0400 Received: from sasl.smtp.pobox.com (unknown [127.0.0.1]) by pb-sasl2.pobox.com (Postfix) with ESMTP id 7397F21847; Thu, 23 Jun 2016 05:50:56 -0400 (EDT) DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=pobox.com; h=from:to:cc :subject:references:date:in-reply-to:message-id:mime-version :content-type; s=sasl; bh=GeetUq9arqIU385mGxhl3fcvl8s=; b=hYowya 2QmQHVPg8SxB/RtYFSG6E/CSGbe9Ud+bu7Jg2Y2uiNSOz9SqeIQdt5owm5q9Qnj9 +qvba4zIMLAybxk0AFLKRalFdThsSaw2XhnRWf+FsRQ4tu8W6j/NxTSypCrJVihF A6J36zLZwu7RtvOCMYEeoOIS5KYApxSeNDxwE= DomainKey-Signature: a=rsa-sha1; c=nofws; d=pobox.com; h=from:to:cc :subject:references:date:in-reply-to:message-id:mime-version :content-type; q=dns; s=sasl; b=tVh9Ba49RWxaCOWzpSVS/8O5tNem0HwP wmyFfrFU4CAJLrfvPxCxXFJBQHoP2wV7KzDYSeRVOZjzxVNmk18qbF62uVsGNEGX SY99ksaH8jfU+NMihnnRXWIAAx5g7Z69evFKRqzM6xOEcRujvR2bsA94uJRrWzQa 9AF7wGRND3Q= Received: from pb-sasl2.nyi.icgroup.com (unknown [127.0.0.1]) by pb-sasl2.pobox.com (Postfix) with ESMTP id 6B1A521846; Thu, 23 Jun 2016 05:50:56 -0400 (EDT) Received: from clucks (unknown [88.160.190.192]) (using TLSv1 with cipher ECDHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by pb-sasl2.pobox.com (Postfix) with ESMTPSA id 8129B21845; Thu, 23 Jun 2016 05:50:55 -0400 (EDT) From: Andy Wingo To: David Kastrup Subject: Re: bug#19883: Correction for backtrace References: <87twyln70f.fsf@fencepost.gnu.org> <873865n2rr.fsf@fencepost.gnu.org> <87twy91vtc.fsf@gnu.org> <878ufld1iw.fsf@fencepost.gnu.org> <87k2z4dhx7.fsf@gnu.org> <874mq8dfb3.fsf@fencepost.gnu.org> <87fv9sahxx.fsf@gnu.org> <87wq34bsh7.fsf@fencepost.gnu.org> Date: Thu, 23 Jun 2016 11:50:48 +0200 In-Reply-To: <87wq34bsh7.fsf@fencepost.gnu.org> (David Kastrup's message of "Thu, 26 Feb 2015 16:30:28 +0100") Message-ID: <87fus4p72f.fsf@pobox.com> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.5 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain X-Pobox-Relay-ID: FAF45C48-3927-11E6-88B6-28A6F1301B6D-02397024!pb-sasl2.pobox.com X-Spam-Score: -1.4 (-) X-Debbugs-Envelope-To: 19883 Cc: Ludovic =?utf-8?Q?Court=C3=A8s?= , 19883@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.4 (-) On Thu 26 Feb 2015 16:30, David Kastrup writes: > Try ./test 2 2000 200 I can reproduce the crash with your test case, thanks :) The patch below fixes the bug for me. WDYT Ludovic? Andy commit db30120fc3a1727d8f221cbb014314f2babf841e Author: Andy Wingo Date: Thu Jun 23 11:47:42 2016 +0200 Fix race between SMOB marking and finalization * libguile/smob.c (clear_smobnum): New helper. (finalize_smob): Re-set the smobnum to the "finalized smob" type before finalizing. Fixes #19883. (scm_smob_prehistory): Pre-register a "finalized smob" type, which has no mark procedure. diff --git a/libguile/smob.c b/libguile/smob.c index 6a97caa..43ea613 100644 --- a/libguile/smob.c +++ b/libguile/smob.c @@ -372,20 +372,43 @@ scm_gc_mark (SCM o) } +static void* +clear_smobnum (void *ptr) +{ + SCM smob; + scm_t_bits smobnum; + + smob = SCM_PACK_POINTER (ptr); + + smobnum = SCM_SMOBNUM (smob); + /* Frob the object's type in place, re-setting it to be the "finalized + smob" type. This will prevent other routines from accessing its + internals in a way that assumes that the smob data is valid. This + is notably the case for SMOB's own "mark" procedure, if any; as the + finalizer runs without the alloc lock, it's possible for a GC to + occur while it's running, in which case the object is alive and yet + its data is invalid. */ + SCM_SET_SMOB_DATA_0 (smob, SCM_SMOB_DATA_0 (smob) & ~(scm_t_bits) 0xff00); + + return (void *) smobnum; +} + /* Finalize SMOB by calling its SMOB type's free function, if any. */ static void finalize_smob (void *ptr, void *data) { SCM smob; + scm_t_bits smobnum; size_t (* free_smob) (SCM); smob = SCM_PACK_POINTER (ptr); + smobnum = (scm_t_bits) GC_call_with_alloc_lock (clear_smobnum, ptr); + #if 0 - printf ("finalizing SMOB %p (smobnum: %u)\n", - ptr, SCM_SMOBNUM (smob)); + printf ("finalizing SMOB %p (smobnum: %u)\n", ptr, smobnum); #endif - free_smob = scm_smobs[SCM_SMOBNUM (smob)].free; + free_smob = scm_smobs[smobnum].free; if (free_smob) free_smob (smob); } @@ -460,6 +483,7 @@ void scm_smob_prehistory () { long i; + scm_t_bits finalized_smob_tc16; scm_i_pthread_key_create (¤t_mark_stack_pointer, NULL); scm_i_pthread_key_create (¤t_mark_stack_limit, NULL); @@ -483,6 +507,9 @@ scm_smob_prehistory () scm_smobs[i].apply = 0; scm_smobs[i].apply_trampoline = SCM_BOOL_F; } + + finalized_smob_tc16 = scm_make_smob_type ("finalized smob", 0); + if (SCM_TC2SMOBNUM (finalized_smob_tc16) != 0) abort (); } /* From debbugs-submit-bounces@debbugs.gnu.org Thu Jun 23 06:36:48 2016 Received: (at 19883-done) by debbugs.gnu.org; 23 Jun 2016 10:36:48 +0000 Received: from localhost ([127.0.0.1]:51822 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1bG208-0002dn-Ar for submit@debbugs.gnu.org; Thu, 23 Jun 2016 06:36:48 -0400 Received: from pb-sasl2.pobox.com ([64.147.108.67]:55557 helo=sasl.smtp.pobox.com) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1bG205-0002de-Vq for 19883-done@debbugs.gnu.org; Thu, 23 Jun 2016 06:36:46 -0400 Received: from sasl.smtp.pobox.com (unknown [127.0.0.1]) by pb-sasl2.pobox.com (Postfix) with ESMTP id 18A6F21E98; Thu, 23 Jun 2016 06:36:43 -0400 (EDT) DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=pobox.com; h=from:to:cc :subject:references:date:in-reply-to:message-id:mime-version :content-type; s=sasl; bh=MGi0j+5NvrqOAKMu/dweohHr9d8=; b=gJilzZ Loc+MK5nGlO2Sit5hO0TtpYEWQ8i2RoUcaciTTk8mKEd5WXpdln66CE5Mw4m2+hU v74KGKejq3yyD6GfjMGa7aQu1naBANz7QrJo0UO4VCO7RPzPZTyf2NFPAhjS6192 y3nyKBSxpjsz0g8sCnVm6X1QR5egn30psVgx4= DomainKey-Signature: a=rsa-sha1; c=nofws; d=pobox.com; h=from:to:cc :subject:references:date:in-reply-to:message-id:mime-version :content-type; q=dns; s=sasl; b=vA4l92uGjJubE3EsQ2echO7n9bCJDVuu apgvRDBH8itQcP7zzSKnrzZ9Bb4+B76JWgQRP7N0meH9I+8+9hi7U5hBQ77SFAVk ufZEX1+fZMIcLDIRO3DP/4k/lsslOHDaTHahHSS7qtS3+1vfJk9hKdJYEmv8hnmz bcYStRzKNrc= Received: from pb-sasl2.nyi.icgroup.com (unknown [127.0.0.1]) by pb-sasl2.pobox.com (Postfix) with ESMTP id 10D5421E97; Thu, 23 Jun 2016 06:36:43 -0400 (EDT) Received: from clucks (unknown [88.160.190.192]) (using TLSv1 with cipher ECDHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by pb-sasl2.pobox.com (Postfix) with ESMTPSA id 14BB021E95; Thu, 23 Jun 2016 06:36:41 -0400 (EDT) From: Andy Wingo To: Ludovic =?utf-8?Q?Court=C3=A8s?= Subject: Re: bug#19883: Correction for backtrace References: <87twyln70f.fsf@fencepost.gnu.org> <873865n2rr.fsf@fencepost.gnu.org> <87twy91vtc.fsf@gnu.org> <878ufld1iw.fsf@fencepost.gnu.org> <87k2z4dhx7.fsf@gnu.org> <874mq8dfb3.fsf@fencepost.gnu.org> <87fv9sahxx.fsf@gnu.org> <87wq34bsh7.fsf@fencepost.gnu.org> <87fus4p72f.fsf@pobox.com> Date: Thu, 23 Jun 2016 12:36:34 +0200 In-Reply-To: <87fus4p72f.fsf@pobox.com> (Andy Wingo's message of "Thu, 23 Jun 2016 11:50:48 +0200") Message-ID: <87bn2sp4y5.fsf@pobox.com> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.5 (gnu/linux) MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="=-=-=" X-Pobox-Relay-ID: 60073636-392E-11E6-AB61-28A6F1301B6D-02397024!pb-sasl2.pobox.com X-Spam-Score: -1.4 (-) X-Debbugs-Envelope-To: 19883-done Cc: 19883-done@debbugs.gnu.org, David Kastrup X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.4 (-) --=-=-= Content-Type: text/plain On Thu 23 Jun 2016 11:50, Andy Wingo writes: > On Thu 26 Feb 2015 16:30, David Kastrup writes: > >> Try ./test 2 2000 200 > > I can reproduce the crash with your test case, thanks :) The patch below > fixes the bug for me. WDYT Ludovic? Here's a patch with a test case. I'm going to apply as it seems to be obviously the right thing and the test case does reproduce what I think is the bug (racing mark and finalize procedures, even if it's only happening in one thread, finalizers and mark procedures do introduce concurrency). We trigger the concurrency in a simple way, via allocation in the finalizer. The patch does fix the original test. GC could happen due to another thread of course. I'm actually not sure where the concurrency is coming from in David's test though :/ I'm very interested in any feedback you might have! Andy --=-=-= Content-Type: text/plain Content-Disposition: inline; filename=0001-Fix-race-between-SMOB-marking-and-finalization.patch >From 8dff3af087c6eaa83ae0d72aa8b22aef5c65d65d Mon Sep 17 00:00:00 2001 From: Andy Wingo Date: Thu, 23 Jun 2016 11:47:42 +0200 Subject: [PATCH] Fix race between SMOB marking and finalization * libguile/smob.c (clear_smobnum): New helper. (finalize_smob): Re-set the smobnum to the "finalized smob" type before finalizing. Fixes #19883. (scm_smob_prehistory): Pre-register a "finalized smob" type, which has no mark procedure. * test-suite/standalone/test-smob-mark-race.c: New file. * test-suite/standalone/Makefile.am: Arrange to build and run the new test. --- libguile/smob.c | 33 +++++++++++++-- test-suite/standalone/Makefile.am | 6 +++ test-suite/standalone/test-smob-mark-race.c | 65 +++++++++++++++++++++++++++++ 3 files changed, 101 insertions(+), 3 deletions(-) create mode 100644 test-suite/standalone/test-smob-mark-race.c diff --git a/libguile/smob.c b/libguile/smob.c index 90849a8..ed9d91a 100644 --- a/libguile/smob.c +++ b/libguile/smob.c @@ -374,20 +374,43 @@ scm_gc_mark (SCM o) } +static void* +clear_smobnum (void *ptr) +{ + SCM smob; + scm_t_bits smobnum; + + smob = PTR2SCM (ptr); + + smobnum = SCM_SMOBNUM (smob); + /* Frob the object's type in place, re-setting it to be the "finalized + smob" type. This will prevent other routines from accessing its + internals in a way that assumes that the smob data is valid. This + is notably the case for SMOB's own "mark" procedure, if any; as the + finalizer runs without the alloc lock, it's possible for a GC to + occur while it's running, in which case the object is alive and yet + its data is invalid. */ + SCM_SET_SMOB_DATA_0 (smob, SCM_SMOB_DATA_0 (smob) & ~(scm_t_bits) 0xff00); + + return (void *) smobnum; +} + /* Finalize SMOB by calling its SMOB type's free function, if any. */ static void finalize_smob (void *ptr, void *data) { SCM smob; + scm_t_bits smobnum; size_t (* free_smob) (SCM); smob = PTR2SCM (ptr); + smobnum = (scm_t_bits) GC_call_with_alloc_lock (clear_smobnum, ptr); + #if 0 - printf ("finalizing SMOB %p (smobnum: %u)\n", - ptr, SCM_SMOBNUM (smob)); + printf ("finalizing SMOB %p (smobnum: %u)\n", ptr, smobnum); #endif - free_smob = scm_smobs[SCM_SMOBNUM (smob)].free; + free_smob = scm_smobs[smobnum].free; if (free_smob) free_smob (smob); } @@ -470,6 +493,7 @@ void scm_smob_prehistory () { long i; + scm_t_bits finalized_smob_tc16; scm_i_pthread_key_create (¤t_mark_stack_pointer, NULL); scm_i_pthread_key_create (¤t_mark_stack_limit, NULL); @@ -493,6 +517,9 @@ scm_smob_prehistory () scm_smobs[i].apply = 0; scm_smobs[i].apply_trampoline_objcode = SCM_BOOL_F; } + + finalized_smob_tc16 = scm_make_smob_type ("finalized smob", 0); + if (SCM_TC2SMOBNUM (finalized_smob_tc16) != 0) abort (); } /* diff --git a/test-suite/standalone/Makefile.am b/test-suite/standalone/Makefile.am index 712418a..aec7418 100644 --- a/test-suite/standalone/Makefile.am +++ b/test-suite/standalone/Makefile.am @@ -275,4 +275,10 @@ test_smob_mark_LDADD = $(LIBGUILE_LDADD) check_PROGRAMS += test-smob-mark TESTS += test-smob-mark +test_smob_mark_race_SOURCES = test-smob-mark-race.c +test_smob_mark_race_CFLAGS = ${test_cflags} +test_smob_mark_race_LDADD = $(LIBGUILE_LDADD) +check_PROGRAMS += test-smob-mark-race +TESTS += test-smob-mark-race + EXTRA_DIST += ${check_SCRIPTS} diff --git a/test-suite/standalone/test-smob-mark-race.c b/test-suite/standalone/test-smob-mark-race.c new file mode 100644 index 0000000..eca0325 --- /dev/null +++ b/test-suite/standalone/test-smob-mark-race.c @@ -0,0 +1,65 @@ +/* Copyright (C) 2016 Free Software Foundation, Inc. + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public License + * as published by the Free Software Foundation; either version 3 of + * the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, but + * WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA + * 02110-1301 USA + */ + +#if HAVE_CONFIG_H +#include +#endif + +#undef NDEBUG + +#include +#include + +static SCM +mark_smob (SCM smob) +{ + assert (SCM_SMOB_DATA (smob) == 1); + return SCM_BOOL_F; +} + +static size_t +finalize_smob (SCM smob) +{ + assert (SCM_SMOB_DATA (smob) == 1); + SCM_SET_SMOB_DATA (smob, 0); + /* Allocate a bit in the hopes of triggering a new GC, making the + marker race with the finalizer. */ + scm_cons (SCM_BOOL_F, SCM_BOOL_F); + return 0; +} + +static void +tests (void *data, int argc, char **argv) +{ + scm_t_bits tc16; + int i; + + tc16 = scm_make_smob_type ("smob with finalizer", 0); + scm_set_smob_mark (tc16, mark_smob); + scm_set_smob_free (tc16, finalize_smob); + + for (i = 0; i < 1000 * 1000; i++) + scm_new_smob (tc16, 1); +} + +int +main (int argc, char *argv[]) +{ + scm_boot_guile (argc, argv, tests, NULL); + return 0; +} -- 2.8.3 --=-=-=-- From debbugs-submit-bounces@debbugs.gnu.org Thu Jun 23 09:12:49 2016 Received: (at 19883) by debbugs.gnu.org; 23 Jun 2016 13:12:49 +0000 Received: from localhost ([127.0.0.1]:51930 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1bG4R6-0001BR-TQ for submit@debbugs.gnu.org; Thu, 23 Jun 2016 09:12:49 -0400 Received: from eggs.gnu.org ([208.118.235.92]:46548) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1bG4R4-0001BE-Uy for 19883@debbugs.gnu.org; Thu, 23 Jun 2016 09:12:47 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1bG4Qv-0004kN-Ud for 19883@debbugs.gnu.org; Thu, 23 Jun 2016 09:12:41 -0400 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=-3.3 required=5.0 tests=BAYES_00,RP_MATCHES_RCVD autolearn=disabled version=3.3.2 Received: from fencepost.gnu.org ([2001:4830:134:3::e]:35264) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1bG4Qv-0004k6-Rd; Thu, 23 Jun 2016 09:12:37 -0400 Received: from pluto.bordeaux.inria.fr ([193.50.110.57]:52120 helo=pluto) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_128_CBC_SHA1:128) (Exim 4.82) (envelope-from ) id 1bG4Qu-0002Ay-Mi; Thu, 23 Jun 2016 09:12:37 -0400 From: ludo@gnu.org (Ludovic =?utf-8?Q?Court=C3=A8s?=) To: Andy Wingo Subject: Re: bug#19883: Correction for backtrace References: <87twyln70f.fsf@fencepost.gnu.org> <873865n2rr.fsf@fencepost.gnu.org> <87twy91vtc.fsf@gnu.org> <878ufld1iw.fsf@fencepost.gnu.org> <87k2z4dhx7.fsf@gnu.org> <874mq8dfb3.fsf@fencepost.gnu.org> <87fv9sahxx.fsf@gnu.org> <87wq34bsh7.fsf@fencepost.gnu.org> <87fus4p72f.fsf@pobox.com> X-URL: http://www.fdn.fr/~lcourtes/ X-Revolutionary-Date: 6 Messidor an 224 de la =?utf-8?Q?R=C3=A9volution?= X-PGP-Key-ID: 0x090B11993D9AEBB5 X-PGP-Key: http://www.fdn.fr/~lcourtes/ludovic.asc X-PGP-Fingerprint: 3CE4 6455 8A84 FDC6 9DB4 0CFB 090B 1199 3D9A EBB5 X-OS: x86_64-unknown-linux-gnu Date: Thu, 23 Jun 2016 15:12:34 +0200 In-Reply-To: <87fus4p72f.fsf@pobox.com> (Andy Wingo's message of "Thu, 23 Jun 2016 11:50:48 +0200") Message-ID: <87a8icoxq5.fsf@gnu.org> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.5 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 2001:4830:134:3::e X-Spam-Score: -6.4 (------) X-Debbugs-Envelope-To: 19883 Cc: 19883@debbugs.gnu.org, David Kastrup X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -6.4 (------) Andy Wingo skribis: > On Thu 26 Feb 2015 16:30, David Kastrup writes: > >> Try ./test 2 2000 200 > > I can reproduce the crash with your test case, thanks :) The patch below > fixes the bug for me. WDYT Ludovic? > > Andy > > commit db30120fc3a1727d8f221cbb014314f2babf841e > Author: Andy Wingo > Date: Thu Jun 23 11:47:42 2016 +0200 > > Fix race between SMOB marking and finalization >=20=20=20=20=20 > * libguile/smob.c (clear_smobnum): New helper. > (finalize_smob): Re-set the smobnum to the "finalized smob" type > before finalizing. Fixes #19883. > (scm_smob_prehistory): Pre-register a "finalized smob" type, which = has > no mark procedure. This LGTM, nice hack! Do you think the test case could be added to the test suite somehow? Thank you, Ludo=E2=80=99. From debbugs-submit-bounces@debbugs.gnu.org Thu Jun 23 09:13:21 2016 Received: (at 19883-done) by debbugs.gnu.org; 23 Jun 2016 13:13:21 +0000 Received: from localhost ([127.0.0.1]:51934 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1bG4Rd-0001Ca-5e for submit@debbugs.gnu.org; Thu, 23 Jun 2016 09:13:21 -0400 Received: from eggs.gnu.org ([208.118.235.92]:46674) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1bG4Rb-0001CM-Lv for 19883-done@debbugs.gnu.org; Thu, 23 Jun 2016 09:13:19 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1bG4RT-0004zm-BT for 19883-done@debbugs.gnu.org; Thu, 23 Jun 2016 09:13:14 -0400 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=-3.3 required=5.0 tests=BAYES_00,RP_MATCHES_RCVD autolearn=disabled version=3.3.2 Received: from fencepost.gnu.org ([2001:4830:134:3::e]:35288) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1bG4RT-0004yz-8e; Thu, 23 Jun 2016 09:13:11 -0400 Received: from pluto.bordeaux.inria.fr ([193.50.110.57]:52122 helo=pluto) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_128_CBC_SHA1:128) (Exim 4.82) (envelope-from ) id 1bG4RR-0002GC-CJ; Thu, 23 Jun 2016 09:13:09 -0400 From: ludo@gnu.org (Ludovic =?utf-8?Q?Court=C3=A8s?=) To: Andy Wingo Subject: Re: bug#19883: Correction for backtrace References: <87twyln70f.fsf@fencepost.gnu.org> <873865n2rr.fsf@fencepost.gnu.org> <87twy91vtc.fsf@gnu.org> <878ufld1iw.fsf@fencepost.gnu.org> <87k2z4dhx7.fsf@gnu.org> <874mq8dfb3.fsf@fencepost.gnu.org> <87fv9sahxx.fsf@gnu.org> <87wq34bsh7.fsf@fencepost.gnu.org> <87fus4p72f.fsf@pobox.com> <87bn2sp4y5.fsf@pobox.com> X-URL: http://www.fdn.fr/~lcourtes/ X-Revolutionary-Date: 6 Messidor an 224 de la =?utf-8?Q?R=C3=A9volution?= X-PGP-Key-ID: 0x090B11993D9AEBB5 X-PGP-Key: http://www.fdn.fr/~lcourtes/ludovic.asc X-PGP-Fingerprint: 3CE4 6455 8A84 FDC6 9DB4 0CFB 090B 1199 3D9A EBB5 X-OS: x86_64-unknown-linux-gnu Date: Thu, 23 Jun 2016 15:13:07 +0200 In-Reply-To: <87bn2sp4y5.fsf@pobox.com> (Andy Wingo's message of "Thu, 23 Jun 2016 12:36:34 +0200") Message-ID: <8760t0oxp8.fsf@gnu.org> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.5 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 2001:4830:134:3::e X-Spam-Score: -6.4 (------) X-Debbugs-Envelope-To: 19883-done Cc: 19883-done@debbugs.gnu.org, David Kastrup X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -6.4 (------) Andy Wingo skribis: > On Thu 23 Jun 2016 11:50, Andy Wingo writes: > >> On Thu 26 Feb 2015 16:30, David Kastrup writes: >> >>> Try ./test 2 2000 200 >> >> I can reproduce the crash with your test case, thanks :) The patch below >> fixes the bug for me. WDYT Ludovic? > > Here's a patch with a test case. Perfect, you can ignore my previous message. :-) Ludo=E2=80=99. From unknown Fri Sep 05 18:58:02 2025 Received: (at fakecontrol) by fakecontrolmessage; To: internal_control@debbugs.gnu.org From: Debbugs Internal Request Subject: Internal Control Message-Id: bug archived. Date: Fri, 22 Jul 2016 11:24:05 +0000 User-Agent: Fakemail v42.6.9 # This is a fake control message. # # The action: # bug archived. thanks # This fakemail brought to you by your local debbugs # administrator From debbugs-submit-bounces@debbugs.gnu.org Sun Oct 09 09:41:19 2016 Received: (at control) by debbugs.gnu.org; 9 Oct 2016 13:41:19 +0000 Received: from localhost ([127.0.0.1]:49413 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1btELv-0000bY-E9 for submit@debbugs.gnu.org; Sun, 09 Oct 2016 09:41:19 -0400 Received: from mail3-relais-sop.national.inria.fr ([192.134.164.104]:56157) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1btELu-0000bH-CV for control@debbugs.gnu.org; Sun, 09 Oct 2016 09:41:18 -0400 X-IronPort-AV: E=Sophos;i="5.31,466,1473112800"; d="scan'208";a="196141996" Received: from reverse-83.fdn.fr (HELO pluto) ([80.67.176.83]) by mail3-relais-sop.national.inria.fr with ESMTP/TLS/AES256-GCM-SHA384; 09 Oct 2016 15:41:08 +0200 From: ludovic.courtes@inria.fr (Ludovic =?utf-8?Q?Court=C3=A8s?=) To: control@debbugs.gnu.org X-URL: http://www.fdn.fr/~lcourtes/ X-Revolutionary-Date: 18 =?utf-8?Q?Vend=C3=A9miaire?= an 225 de la =?utf-8?Q?R=C3=A9volution?= X-PGP-Key-ID: 0x090B11993D9AEBB5 X-PGP-Key: http://www.fdn.fr/~lcourtes/ludovic.asc X-PGP-Fingerprint: 3CE4 6455 8A84 FDC6 9DB4 0CFB 090B 1199 3D9A EBB5 X-OS: x86_64-unknown-linux-gnu Date: Sun, 09 Oct 2016 15:41:04 +0200 Message-ID: <87a8edhc7j.fsf@inria.fr> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.1 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain X-Spam-Score: -5.8 (-----) X-Debbugs-Envelope-To: control X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -5.8 (-----) unarchive 19883 thanks From debbugs-submit-bounces@debbugs.gnu.org Sun Oct 09 14:08:40 2016 Received: (at 19883) by debbugs.gnu.org; 9 Oct 2016 18:08:40 +0000 Received: from localhost ([127.0.0.1]:50216 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1btIWd-0000Te-M1 for submit@debbugs.gnu.org; Sun, 09 Oct 2016 14:08:40 -0400 Received: from mail-lf0-f43.google.com ([209.85.215.43]:32891) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1bt8uA-0007PF-ER for 19883@debbugs.gnu.org; Sun, 09 Oct 2016 03:52:19 -0400 Received: by mail-lf0-f43.google.com with SMTP id x79so68011777lff.0 for <19883@debbugs.gnu.org>; Sun, 09 Oct 2016 00:52:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=from:to:cc:subject:references:date:message-id:user-agent :mime-version; bh=ibR0wfxJ4FlM3S6zlx6I7EQvjXLKXiJqYoz3mR2iW/o=; b=s3hsPL9T6Ru9AGTQWSOtaNDF5LJ9d3ZsQ5QkVy4JKM1BVkSxyWMCvJoLPbvi1gYfun 0tUOaIAuhVet+7ZQiZ8YYSK4LFGFh+ZjcERCXeIJy7NjFJqg6x6O43oW4IHx311AU4v4 A1jjUG7P57QLM3aIFyOuB8xpmHCWvHEpjwacqSAYqImgG4grMdES4HkRDydObULm+fab SNihr0FBQX4uPqmgSoC/AwP368Xbg5C0Z2tRMmihqRHRvC0LqPl6YV1NSdvbL8+lW4Aa gs2+5C2inS9I9Hq2MAY8VQ+y8j2yYWAvSt+Nr7cH1VH3iXjM+XJhB8Zt0AJvMomtO0p8 nATA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:from:to:cc:subject:references:date:message-id :user-agent:mime-version; bh=ibR0wfxJ4FlM3S6zlx6I7EQvjXLKXiJqYoz3mR2iW/o=; b=WdgXSeyrd1i/pumGe0iCFGY9El96DAwGLgLRFjSL3dOBUO4YFdfwKMKXQGgIZ6r6ar wNhnMLnB7+baSTagDjYR4I+qkdygGenrRwIM3ENnKLAEWcw1KxeM1NM4jyyV3+w9ePu9 khWHFCk7PG1osELonxqZFeEMXNlyUfCj2QVPulpetYsw89JsFqoS3/ZGMnRGiyD4TD4m h3s06voK7dpDWwO2KEAS8td0QaxDkxukiAi5nDo6CfbZ5nrOX2VrnZJ1pJ1UfkwuJ6Wz lrOgj1LczkJD8dYQvfl6aCMgKvkFelD2TWyjiE5fQX6EdB+dBc92uYwIm6goLDepkL8h df1Q== X-Gm-Message-State: AA6/9RkAlar3K5Re+c8Gye02JwzpxVktKLpei3Kdbd96RDZZ1ROvzBj420hE9r4t4OIpyg== X-Received: by 10.46.0.152 with SMTP id e24mr10955064lji.21.1475999532063; Sun, 09 Oct 2016 00:52:12 -0700 (PDT) Received: from elephant.savannah (93-120-179-209.dynamic.mts-nn.ru. [93.120.179.209]) by smtp.gmail.com with ESMTPSA id r20sm4994657lfr.19.2016.10.09.00.52.10 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sun, 09 Oct 2016 00:52:10 -0700 (PDT) From: Artyom Poptsov To: ludo@gnu.org (Ludovic =?utf-8?Q?Court=C3=A8s?=) Subject: Re: The =?utf-8?B?4oCcZmluYWxpemVk4oCd?= SMOB type References: <87mvieh7mu.fsf@gnu.org> Date: Sun, 09 Oct 2016 10:51:25 +0300 Message-ID: <87vax2klj6.fsf@elephant.savannah> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.5 (gnu/linux) MIME-Version: 1.0 Content-Type: multipart/signed; boundary="=-=-="; micalg=pgp-sha512; protocol="application/pgp-signature" X-Spam-Score: 0.5 (/) X-Debbugs-Envelope-To: 19883 X-Mailman-Approved-At: Sun, 09 Oct 2016 14:08:38 -0400 Cc: 19883@debbugs.gnu.org, guile-devel X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: 0.5 (/) --=-=-= Content-Type: text/plain Content-Transfer-Encoding: quoted-printable Hello Ludovic, thanks for pointing that out. IIRC, I saw this kind of errors during testing but I thought I fixed them. What Guile-SSH version do you use? Here's 'free_session' procedure from Guile-SSH version 0.10.0: =2D-8<---------------cut here---------------start------------->8--- size_t free_session (SCM session_smob) { if (! SCM_SMOB_PREDICATE (session_tag, session_smob)) { _ssh_log (SSH_LOG_FUNCTIONS, "free_session", "%s", "already freed"); return 0; } struct session_data *data =3D _scm_to_session_data (session_smob); [...] return 0; } =2D-8<---------------cut here---------------end--------------->8--- As you can see, there's additional smob check before accessing the smob's data. Please let me know if the problem manifest itself in the latest Guile-SSH version. Thanks! =2D Artyom =2D-=20 Artyom V. Poptsov ; GPG Key: 0898A02F Home page: http://poptsov-artyom.narod.ru/ --=-=-= Content-Type: application/pgp-signature; name="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQEcBAEBCgAGBQJX+fcEAAoJEOnFotkImKAvoMsH+wQlXUXHEwbdCAlgbicv3IZe 2mj+zhZKi6cROgMhbXe7yoR6ywNQKRNEopeESQJ3EP3uS7TWDtr0WpABPxkeULpm taPLuV8+pkVyZiMBlJSM63RYyryS7/Rvg0g4nB3666RP09qpUoTUE6rtiLlCuR3Y +U5xx7zGYmVfL/0VkH+894+sz5uQW+BbetvUSjCeRsar8jA6d/aGmNEMDb7xlyok WqfJo4nI1H8c5G90FD9U4lLrceCBGFnMIJ4RENhogmEBiMlY3l+SoiR8QtsEHlE9 LzH5gDJNsyt8ZbByjl4sP3UZKgs8l5hl/WWFxwZ8dXcBmI9ANYwmCxIEv3h2AOw= =e13y -----END PGP SIGNATURE----- --=-=-=-- From debbugs-submit-bounces@debbugs.gnu.org Sun Oct 09 18:29:28 2016 Received: (at 19883) by debbugs.gnu.org; 9 Oct 2016 22:29:28 +0000 Received: from localhost ([127.0.0.1]:50289 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1btMb1-0000BK-AZ for submit@debbugs.gnu.org; Sun, 09 Oct 2016 18:29:28 -0400 Received: from mail-lf0-f48.google.com ([209.85.215.48]:36434) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1btJLd-0001oT-SZ for 19883@debbugs.gnu.org; Sun, 09 Oct 2016 15:01:22 -0400 Received: by mail-lf0-f48.google.com with SMTP id b75so86993965lfg.3 for <19883@debbugs.gnu.org>; Sun, 09 Oct 2016 12:01:21 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=from:to:cc:subject:references:date:in-reply-to:message-id :user-agent:mime-version; bh=Lb6V5moWhK/gaaWz134ld4Vfihqf7STYU+NoV+wxW4M=; b=MDutQG++IKkkvxwFw21u+S4y29GmgLxJA5tm2b/oCc8RH+ZktYzQDH7sEyF7h2U2Tg 5zT1EvMkDHTr0SgCMH7lz1PjVtIDdXW5G/St1Ht/+ZHmDuppIfbvnK12sDyyLkU0MBKr LCLNVXj73mDyH9kbuAwy2aWY5KXASky3erCENYP+KaaEQmSltDZ4iRX7FU0glWvJeXLx 5VvA+bpxfK29AC6PoFX1UR36VJowaoSb0WGkxrJCmIfUATv/Gbb1WI9cz0H+rSKZRYKC 9zQIxYqfW9s2CAapvFpOIq51g+hzCm0Q20pFpS9tTUA8Iz0dKPZiC7AFasN7rytJkkK7 eIhw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:from:to:cc:subject:references:date:in-reply-to :message-id:user-agent:mime-version; bh=Lb6V5moWhK/gaaWz134ld4Vfihqf7STYU+NoV+wxW4M=; b=dNDABoz6tjSK3YV+9+2YJu5QfGkBRbVzA92Oj6cJkUUZSRQ1vN+IA39c91zpZWn/+S DzlstEZcTnADQLSktAD31v3Zy8I9yxNXRO85QnIJgIXiooXo5c0YkDXqoHmMOgbjE55W gYd8UzW/C0Ry35pabBppMsj16uSxs2Fff1+SxS62WpiyKsGTusfg3+WAsEUsQ1eFSDWv w7i1jeU19CDKW6PRLu+tQZux6RGIBOqcJ3eqtiFI9eRP+Gb9BEeQ13FvaXar2O5Vjp04 Dx9Jc2/Z+hZq/yPQh+6MxXLsXItBl++cgEUB0R0N0eNI82Ve6kmL0vbEpjuCPAIBCHNk uWgg== X-Gm-Message-State: AA6/9RmkF0SEkhD3rNFZPnn+Bf9aS+IMJgyYFWvwOnaFKDrxEpNyrPxjoV0y6HCKEp9t0w== X-Received: by 10.25.150.6 with SMTP id y6mr10299544lfd.160.1476039675726; Sun, 09 Oct 2016 12:01:15 -0700 (PDT) Received: from elephant.savannah (93-120-179-209.dynamic.mts-nn.ru. [93.120.179.209]) by smtp.gmail.com with ESMTPSA id v62sm1541459lfa.44.2016.10.09.12.01.14 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sun, 09 Oct 2016 12:01:14 -0700 (PDT) From: Artyom Poptsov To: ludo@gnu.org (Ludovic =?utf-8?Q?Court=C3=A8s?=) Subject: Re: The =?utf-8?B?4oCcZmluYWxpemVk4oCd?= SMOB type References: <87mvieh7mu.fsf@gnu.org> <87vax2klj6.fsf@elephant.savannah> <87fuo5hc88.fsf@gnu.org> Date: Sun, 09 Oct 2016 22:00:36 +0300 In-Reply-To: <87fuo5hc88.fsf@gnu.org> ("Ludovic \=\?utf-8\?Q\?Court\=C3\=A8s\=22'\?\= \=\?utf-8\?Q\?s\?\= message of "Sun, 09 Oct 2016 15:40:39 +0200") Message-ID: <87h98ll54b.fsf@elephant.savannah> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.5 (gnu/linux) MIME-Version: 1.0 Content-Type: multipart/signed; boundary="=-=-="; micalg=pgp-sha512; protocol="application/pgp-signature" X-Spam-Score: -0.2 (/) X-Debbugs-Envelope-To: 19883 X-Mailman-Approved-At: Sun, 09 Oct 2016 18:29:25 -0400 Cc: 19883@debbugs.gnu.org, David Kastrup , guile-devel X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.2 (/) --=-=-= Content-Type: text/plain Content-Transfer-Encoding: quoted-printable Hello Ludovic and David, thanks for all the comments on the smob freeing callbacks. Apparently I misread the documentation so the bug was introduced. I've changed the callback procedures as was suggested; the patches included in Guile-SSH version 0.10.1 released a few moments ago. =2D Artyom =2D-=20 Artyom V. Poptsov ; GPG Key: 0898A02F Home page: http://poptsov-artyom.narod.ru/ --=-=-= Content-Type: application/pgp-signature; name="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQEcBAEBCgAGBQJX+pPUAAoJEOnFotkImKAvEocH/1KKFEZOxhuntMkglSms/Tqx Y5kuuGXXXXUtuuRj3DvDzGCU+D/c7NZqRA1lWij1Une9PV2KRf7Y6cm4NdqsrGoo W11U0wPgJzUWKCF2/DD6wdfXPfRDVM8+bM9v2tb19+8BFLtlkgH/EJjeciljgIq1 wNIESpI9q1BKMEXrqJ+VUD9Q2otmzjKkiY8Vck+lMEkk2NdWheRQr+RqYqr5vL0M 6HVeUV74NebpZ5pBwv+KnCs/tQqQWKu9C1+i0GkV4IahhSijub2yYIhlctUeHL8+ L9UIok4ntHFh8QNo18KBAKik/0YhOZqE789fIhZZLq34/R0nfKJQRfUDZmJDlrc= =tz0i -----END PGP SIGNATURE----- --=-=-=-- From unknown Fri Sep 05 18:58:02 2025 Received: (at fakecontrol) by fakecontrolmessage; To: internal_control@debbugs.gnu.org From: Debbugs Internal Request Subject: Internal Control Message-Id: bug archived. Date: Mon, 07 Nov 2016 12:24:04 +0000 User-Agent: Fakemail v42.6.9 # This is a fake control message. # # The action: # bug archived. thanks # This fakemail brought to you by your local debbugs # administrator