GNU bug report logs - #52338
Crawler bots are downloading substitutes

Package: guix

Reported by: Leo Famulari <leo <at> famulari.name>

Date: Mon, 6 Dec 2021 21:22:01 UTC

Severity: normal

Done: Mathieu Othacehe <othacehe <at> gnu.org>

Bug is archived. No further changes may be made.


Message #8 received at 52338 <at> debbugs.gnu.org:

From: Leo Famulari <leo <at> famulari.name>
To: 52338 <at> debbugs.gnu.org
Subject: [maintenance] hydra: berlin: Create robots.txt.
Date: Mon,  6 Dec 2021 17:18:10 -0500

I tested that `guix system build` succeeds with this change, but I
would like a review of whether the resulting Nginx configuration is
correct and whether this is the right path to disallow. It generates an
Nginx location block like this:

------
      location /robots.txt {
        add_header  Content-Type  text/plain;
        return 200 "User-agent: *
Disallow: /nar/
";
      }
------

* hydra/nginx/berlin.scm (berlin-locations): Add a robots.txt Nginx location.
---
 hydra/nginx/berlin.scm | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/hydra/nginx/berlin.scm b/hydra/nginx/berlin.scm
index 1f4b0be..3bb2129 100644
--- a/hydra/nginx/berlin.scm
+++ b/hydra/nginx/berlin.scm
@@ -174,7 +174,14 @@ PUBLISH-URL."
            (nginx-location-configuration
             (uri "/berlin.guixsd.org-export.pub")
             (body
-             (list "root /var/www/guix;"))))))
+             (list "root /var/www/guix;")))
+
+           (nginx-location-configuration
+             (uri "/robots.txt")
+             (body
+               (list
+                 "add_header  Content-Type  text/plain;"
+                 "return 200 \"User-agent: *\nDisallow: /nar/\n\";"))))))
 
 (define guix.gnu.org-redirect-locations
   (list
-- 
2.34.0
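
To help answer whether `/nar/` is the right path to disallow, here is a minimal sketch that feeds the robots.txt body from the patch to Python's standard `urllib.robotparser` and asks which paths a compliant crawler may fetch. The bot name and the sample paths are made up for illustration, and real crawlers may interpret rules differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# Body returned by the proposed Nginx location, copied from the patch.
ROBOTS_TXT = "User-agent: *\nDisallow: /nar/\n"


def is_allowed(path, user_agent="ExampleBot"):
    """Return True if USER_AGENT may fetch PATH under ROBOTS_TXT.

    "ExampleBot" is a hypothetical crawler name; the rule applies to
    every agent because the file uses "User-agent: *".
    """
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    # Hypothetical paths, chosen only to exercise the prefix rule.
    for path in ("/nar/gzip/example-item", "/nar", "/example.narinfo"):
        print(path, is_allowed(path))
```

With this parser the rule is a plain prefix match, so anything under `/nar/` is refused, while `/nar` without the trailing slash and paths outside that prefix stay fetchable.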





This bug report was last modified 3 years and 210 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.