GNU bug report logs - #37207
nginx serving files from the store returns Last-Modified = Epoch


Package: guix;

Reported by: Ludovic Courtès <ludo <at> gnu.org>

Date: Wed, 28 Aug 2019 09:53:02 UTC

Severity: normal

Merged with 39051

To reply to this bug, email your comments to 37207 AT debbugs.gnu.org.



Report forwarded to bug-guix <at> gnu.org:
bug#37207; Package guix. (Wed, 28 Aug 2019 09:53:03 GMT) Full text and rfc822 format available.

Acknowledgement sent to Ludovic Courtès <ludo <at> gnu.org>:
New bug report received and forwarded. Copy sent to bug-guix <at> gnu.org. (Wed, 28 Aug 2019 09:53:03 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Ludovic Courtès <ludo <at> gnu.org>
To: bug-Guix <at> gnu.org
Subject: guix.gnu.org returns Last-Modified = Epoch
Date: Wed, 28 Aug 2019 11:52:36 +0200
Hello Guix,

Since the use of the ‘static-web-site’ service, which puts web site
files in the store, nginx returns a ‘Last-Modified’ header that can
trick clients into caching things forever:

--8<---------------cut here---------------start------------->8---
$ wget --debug -O /dev/null   https://guix.gnu.org/packages.json 2>&1 | grep Last
Last-Modified: Thu, 01 Jan 1970 00:00:01 GMT
--8<---------------cut here---------------end--------------->8---

We should tell nginx not to emit ‘Last-Modified’, or to take the
state from the /srv/guix.gnu.org symlink, if possible.

Ludo’.




Information forwarded to bug-guix <at> gnu.org:
bug#37207; Package guix. (Wed, 28 Aug 2019 10:42:02 GMT) Full text and rfc822 format available.

Message #8 received at 37207 <at> debbugs.gnu.org (full text, mbox):

From: Gábor Boskovits <boskovits <at> gmail.com>
To: 37207 <at> debbugs.gnu.org
Subject: guix.gnu.org Last Modified at epoch
Date: Wed, 28 Aug 2019 12:40:37 +0200
[Message part 1 (text/plain, inline)]
Hello,

Suppressing the Last-Modified header is just an
add_header Last-Modified "";
away.

Getting the info from the symlink seems to be much trickier; I would do it with
either embedded Perl or embedded Lua. I am not sure we should bother
with it, though. Wdyt?

Information forwarded to bug-guix <at> gnu.org:
bug#37207; Package guix. (Wed, 28 Aug 2019 14:38:02 GMT) Full text and rfc822 format available.

Message #11 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Tobias Geerinckx-Rice <me <at> tobias.gr>
To: bug-guix <at> gnu.org
Cc: 37207 <at> debbugs.gnu.org
Subject: Re: bug#37207: guix.gnu.org Last Modified at epoch
Date: Wed, 28 Aug 2019 16:37:15 +0200
[Message part 1 (text/plain, inline)]
Gábor, Ludo',

Gábor Boskovits wrote:
> Suppressing the last modified header is just an
> add_header Last-Modified "";
> away.

You'll also need:

# Don't honour client If-Modified-Since constraints.
if_modified_since off;
# Nginx's etags are hashes of file timestamp & file length.
etag off;

Turning these off will of course prevent all caching.  I don't 
know if that would add measurable load to guix.gnu.org (it would 
be more problematic if we used a CDN, but it might still make a 
difference).

Nix does something both interesting and icky — as always: patch[0] 
nginx to look up the realpath() instead, so clients can still 
cache using If-None-Match.
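
To make the realpath() idea concrete, here is a rough Python sketch. It is
not the Nix patch: the helper name store_etag, the default store directory
and the "<hash>-<name>" layout of store items are assumptions made for
illustration only. The point is that the hash part of the resolved store
item changes whenever the content changes, so it can serve as a validator
even though every file in the store has the same mtime.

```python
import os


def store_etag(path, store_dir="/gnu/store"):
    """Return a quoted ETag taken from the store item that PATH resolves to.

    Hypothetical helper: returns None when PATH does not resolve to
    something below STORE_DIR.
    """
    real = os.path.realpath(path)        # follows symlinks such as /srv/guix.gnu.org
    store = os.path.realpath(store_dir)
    if os.path.commonpath([real, store]) != store:
        return None
    # First component below the store, assumed to look like "<hash>-<name>".
    item = os.path.relpath(real, store).split(os.sep)[0]
    if item in (".", ""):
        return None
    return '"%s"' % item.split("-", 1)[0]
```

A client could then revalidate with If-None-Match against that value; the
cost is one path resolution per request, which is the expense discussed
later in this thread.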

Kind regards,

T G-R

[0]: https://github.com/NixOS/nixpkgs/pull/48337
[signature.asc (application/pgp-signature, inline)]

Information forwarded to bug-guix <at> gnu.org:
bug#37207; Package guix. (Wed, 28 Aug 2019 14:38:03 GMT) Full text and rfc822 format available.

Information forwarded to bug-guix <at> gnu.org:
bug#37207; Package guix. (Wed, 28 Aug 2019 15:06:02 GMT) Full text and rfc822 format available.

Message #17 received at 37207 <at> debbugs.gnu.org (full text, mbox):

From: Danny Milosavljevic <dannym <at> scratchpost.org>
To: Gábor Boskovits <boskovits <at> gmail.com>
Cc: 37207 <at> debbugs.gnu.org
Subject: Re: bug#37207: guix.gnu.org Last Modified at epoch
Date: Wed, 28 Aug 2019 17:05:30 +0200
[Message part 1 (text/plain, inline)]
Hi Gabor,

On Wed, 28 Aug 2019 12:40:37 +0200
Gábor Boskovits <boskovits <at> gmail.com> wrote:

> Suppressing the last modified header is just an
> add_header Last-Modified "";
> away.
>
> To get the info from the symlink seems to be much trickier, I would do it with
> either embedded perl or embedded lua. I am not sure if we should bother
> with it, though. Wdyt?

Since we already emit ETag, I don't think we need to bother with Last-Modified.

Why is the ETag so short, though?

>wget --debug -O /dev/null   https://guix.gnu.org/packages.json 2>&1 | grep -i etag
>ETag: "1-2f38b1"

[Message part 2 (application/pgp-signature, inline)]

Information forwarded to bug-guix <at> gnu.org:
bug#37207; Package guix. (Wed, 28 Aug 2019 19:01:02 GMT) Full text and rfc822 format available.

Message #20 received at 37207 <at> debbugs.gnu.org (full text, mbox):

From: Gábor Boskovits <boskovits <at> gmail.com>
To: Danny Milosavljevic <dannym <at> scratchpost.org>
Cc: 37207 <at> debbugs.gnu.org
Subject: Re: bug#37207: guix.gnu.org Last Modified at epoch
Date: Wed, 28 Aug 2019 20:59:44 +0200
[Message part 1 (text/plain, inline)]
Hello Danny,

Danny Milosavljevic <dannym <at> scratchpost.org> wrote (on Wed, 28 Aug 2019,
17:05):

> Hi Gabor,
>
> On Wed, 28 Aug 2019 12:40:37 +0200
> Gábor Boskovits <boskovits <at> gmail.com> wrote:
>
> > Suppressing the last modified header is just an
> > add_header Last-Modified "";
> > away.
> >
> > To get the info from the symlink seems to be much trickier, I would do it
> > with either embedded perl or embedded lua. I am not sure if we should bother
> > with it, though. Wdyt?
>
> Since we already emit ETag, I don't think we need to bother with
> Last-Modified.
>
> Why is the ETag so short, though?
>
>
The ETag we emit is also bad. Nginx calculates it from the mtime and the
content length, so in our case it's effectively just the content length.
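
As a small illustration of why the value is so short, here is a Python
sketch that assumes nginx formats a static file's ETag as the hexadecimal
mtime, a dash, and the hexadecimal size. The function names are made up
for this example.

```python
def nginx_style_etag(mtime, size):
    """Build an ETag shaped like '"<mtime in hex>-<size in hex>"'."""
    return '"%x-%x"' % (mtime, size)


def parse_etag(etag):
    """Split such an ETag back into (mtime, size) as integers."""
    mtime_hex, size_hex = etag.strip('"').split("-")
    return int(mtime_hex, 16), int(size_hex, 16)


# The value reported for packages.json decodes to mtime 1 and 3094705 bytes.
print(parse_etag('"1-2f38b1"'))
```

With every store file at mtime 1, two versions of a file that happen to
have the same length would get the same ETag under this scheme.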


> >wget --debug -O /dev/null   https://guix.gnu.org/packages.json 2>&1 | grep -i etag
> >ETag: "1-2f38b1"
>
>
Best regards,
g_bor

-- 
OpenPGP Key Fingerprint: 7988:3B9F:7D6A:4DBF:3719:0367:2506:A96C:CF63:0B21

Information forwarded to bug-guix <at> gnu.org:
bug#37207; Package guix. (Wed, 28 Aug 2019 19:43:02 GMT) Full text and rfc822 format available.

Message #23 received at 37207 <at> debbugs.gnu.org (full text, mbox):

From: Gábor Boskovits <boskovits <at> gmail.com>
To: Tobias Geerinckx-Rice <me <at> tobias.gr>
Cc: 37207 <at> debbugs.gnu.org
Subject: Re: bug#37207: guix.gnu.org Last Modified at epoch
Date: Wed, 28 Aug 2019 21:42:20 +0200
[Message part 1 (text/plain, inline)]
Hello Tobias,

Tobias Geerinckx-Rice via Bug reports for GNU Guix <bug-guix <at> gnu.org>
wrote (on Wed, 28 Aug 2019, 16:38):

> Gábor, Ludo',
>
> Gábor Boskovits wrote:
> > Suppressing the last modified header is just an
> > add_header Last-Modified "";
> > away.
>
> You'll also need:
>
> # Don't honour client If-Modified-Since constraints.
> if_modified_since off;
> # Nginx's etags are hashes of file timestamp & file length.
> etag off;
>
>
You really have a point here.

Based on my research, I came up with the following:

We need
etag off;

We should create a file with the git last-modification time of the files,
updated when there is a new commit in the repo => Last-Modified.
We should create a file with some hash of the files, updated when there is
a new commit in the repo => ETag.
We could restrict these operations to the files modified since the last
checkout.

Retrieve these with embedded Perl.
Wdyt?
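
To illustrate the "some hash of the files" part only, here is a Python
sketch. The function name etag_manifest, the choice of SHA-256 and the
truncation to 16 hex digits are assumptions, not something settled in this
thread. It builds a table of per-file validators for a checkout.

```python
import hashlib
import os


def etag_manifest(root):
    """Map each file below ROOT (by relative path) to a content-based ETag."""
    manifest = {}
    for directory, _subdirs, files in os.walk(root):
        for name in sorted(files):
            full = os.path.join(directory, name)
            with open(full, "rb") as stream:
                digest = hashlib.sha256(stream.read()).hexdigest()
            # Truncated for brevity; the length is an arbitrary choice.
            manifest[os.path.relpath(full, root)] = '"%s"' % digest[:16]
    return manifest
```

Because the value depends only on content, an unchanged file keeps its
validator across site updates, which a site-wide timestamp cannot offer.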


> Turning these off will of course prevent all caching.  I don't
> know if that would add measurable load to guix.gnu.org (it would
> be more problematic if we used a CDN, but it might still make a
> difference).
>
> Nix does something both interesting and icky — as always: patch[0]
> nginx to look up the realpath() instead, so clients can still
> cache using If-None-Match.
>
> Kind regards,
>
> T G-R
>
> [0]: https://github.com/NixOS/nixpkgs/pull/48337
>

Best regards,
g_bor

-- 
OpenPGP Key Fingerprint: 7988:3B9F:7D6A:4DBF:3719:0367:2506:A96C:CF63:0B21

Information forwarded to bug-guix <at> gnu.org:
bug#37207; Package guix. (Wed, 28 Aug 2019 20:33:02 GMT) Full text and rfc822 format available.

Message #26 received at 37207 <at> debbugs.gnu.org (full text, mbox):

From: Ludovic Courtès <ludo <at> gnu.org>
To: Gábor Boskovits <boskovits <at> gmail.com>
Cc: 37207 <at> debbugs.gnu.org, Tobias Geerinckx-Rice <me <at> tobias.gr>
Subject: Re: bug#37207: guix.gnu.org Last Modified at epoch
Date: Wed, 28 Aug 2019 22:32:42 +0200
Hello,

Gábor Boskovits <boskovits <at> gmail.com> wrote:

> we should create a file with the git last modification time of the files,
> updated when there is a new commit in the repo => last-modified
> we should create a file with some hash of the files, updated when there is
> a new commit in the repo => etag
> we could restrict these operations to the files modified since the last
> checkout.
>
> Retrieve these with embedded perl.
> Wdyt?

What would the config look like?  AFAICS our ‘nginx’ package doesn’t
embed Perl, and I think it’s better this way.  :-)  Can we do that with
pure nginx directives?

We create /srv/guix.gnu.org (as a symlink) with the correct mtime¹.  If
we can tell nginx to use it as the ‘Last-Modified’ date, that’s perfect.
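
For reference, this is the value being asked for, written as a Python
sketch rather than an nginx configuration. The function names are made up
for this example. It reads the mtime of the symlink itself (not of its
target in the store) and formats it as an HTTP date.

```python
import os
from email.utils import formatdate


def http_date(seconds):
    """Format a Unix timestamp the way HTTP date headers expect."""
    return formatdate(seconds, usegmt=True)


def last_modified_of_link(path):
    """Use lstat so the symlink's own mtime is read, not its target's."""
    return http_date(os.lstat(path).st_mtime)


# The store's mtime of 1 yields exactly the header from the original report.
print(http_date(1))
```

On the server, last_modified_of_link("/srv/guix.gnu.org") would give the
date of the last site update, assuming the symlink is recreated on each
deployment as described above.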

Ludo’.

¹ https://git.savannah.gnu.org/cgit/guix/maintenance.git/tree/hydra/berlin.scm#n212




Information forwarded to bug-guix <at> gnu.org:
bug#37207; Package guix. (Thu, 29 Aug 2019 06:13:02 GMT) Full text and rfc822 format available.

Message #29 received at 37207 <at> debbugs.gnu.org (full text, mbox):

From: Gábor Boskovits <boskovits <at> gmail.com>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: 37207 <at> debbugs.gnu.org, Tobias Geerinckx-Rice <me <at> tobias.gr>
Subject: Re: bug#37207: guix.gnu.org Last Modified at epoch
Date: Thu, 29 Aug 2019 08:11:46 +0200
[Message part 1 (text/plain, inline)]
Hello Ludo,

Ludovic Courtès <ludo <at> gnu.org> wrote (on Wed, 28 Aug 2019, 22:32):

> Hello,
>
> Gábor Boskovits <boskovits <at> gmail.com> wrote:
>
> > we should create a file with the git last modification time of the files,
> > updated when there is a new commit in the repo => last-modified
> > we should create a file with some hash of the files, updated when there is
> > a new commit in the repo => etag
> > we could restrict these operations to the files modified since the last
> > checkout.
> >
> > Retrieve these with embedded perl.
> > Wdyt?
>
> What would the config look like?  AFAICS our ‘nginx’ package doesn’t
> embed Perl, and I think it’s better this way.  :-)  Can we do that with
> pure nginx directives?
>
> We create /srv/guix.gnu.org (as a symlink) with the correct mtime¹.  If
> we can tell nginx to use it as the ‘Last-Modified’ date, that’s perfect.
>
>
I was thinking about this. Yes, we can solve that with pure nginx. There is
an issue, however:
it invalidates all cached entries on update, so files that were not modified
will also need to be downloaded again.

The easiest way to do that would be to simply generate an nginx config
snippet at a configurable location,
setting up the mtime and etag variables, and include that from the main
config.

If this would be ok, then I will have a look at implementing this.
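
A minimal sketch of such a generator, in Python for illustration. The
function name render_snippet and the exact directives emitted are
assumptions; in particular, whether add_header is the right way to set
these two headers alongside "etag off;" would need checking against the
nginx documentation. The values are site-wide, which is the source of the
cache invalidation caveat above.

```python
from email.utils import formatdate

TEMPLATE = (
    "# Generated on each site update; do not edit by hand.\n"
    "etag off;\n"
    'add_header Last-Modified "{date}";\n'
    "add_header ETag '\"{etag}\"';\n"
)


def render_snippet(mtime, etag):
    """Render nginx directives carrying one Last-Modified and one ETag."""
    return TEMPLATE.format(date=formatdate(mtime, usegmt=True), etag=etag)
```

The output would be written to the configurable location and pulled in
with an include directive from the main configuration.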

> Ludo’.
>
> ¹ https://git.savannah.gnu.org/cgit/guix/maintenance.git/tree/hydra/berlin.scm#n212
>

Best regards,
g_bor

-- 
OpenPGP Key Fingerprint: 7988:3B9F:7D6A:4DBF:3719:0367:2506:A96C:CF63:0B21

Information forwarded to bug-guix <at> gnu.org:
bug#37207; Package guix. (Thu, 29 Aug 2019 12:41:02 GMT) Full text and rfc822 format available.

Message #32 received at 37207 <at> debbugs.gnu.org (full text, mbox):

From: Ludovic Courtès <ludo <at> gnu.org>
To: Gábor Boskovits <boskovits <at> gmail.com>
Cc: 37207 <at> debbugs.gnu.org, Tobias Geerinckx-Rice <me <at> tobias.gr>
Subject: Re: bug#37207: guix.gnu.org Last Modified at epoch
Date: Thu, 29 Aug 2019 14:40:12 +0200
Hi Gábor,

Gábor Boskovits <boskovits <at> gmail.com> wrote:

> Ludovic Courtès <ludo <at> gnu.org> wrote (on Wed, 28 Aug 2019, 22:32):
>
>> Hello,
>>
>> Gábor Boskovits <boskovits <at> gmail.com> wrote:
>>
>> > we should create a file with the git last modification time of the files,
>> > updated when there is a new commit in the repo => last-modified
>> > we should create a file with some hash of the files, updated when there is
>> > a new commit in the repo => etag
>> > we could restrict these operations to the files modified since the last
>> > checkout.
>> >
>> > Retrieve these with embedded perl.
>> > Wdyt?
>>
>> What would the config look like?  AFAICS our ‘nginx’ package doesn’t
>> embed Perl, and I think it’s better this way.  :-)  Can we do that with
>> pure nginx directives?
>>
>> We create /srv/guix.gnu.org (as a symlink) with the correct mtime¹.  If
>> we can tell nginx to use it as the ‘Last-Modified’ date, that’s perfect.
>>
>>
> I was thinking about this. Yes, we can solve that with pure nginx. There is
> an issue however.
> It invalidates all cached entries on update, so files not modified will
> also need to be downloaded again.
>
> The easiest way to do that would be to simply generate an nginx config
> snippet at a configurable location,
> setting up the mtime and etags variable, and include that from the main
> config.
>
> If this would be ok, then I will have a look at implementing this.

I’m not sure I fully understand, but yes, if you could send a prototype
as a diff against maintenance.git, that’d be great!

Thank you,
Ludo’.




Information forwarded to bug-guix <at> gnu.org:
bug#37207; Package guix. (Thu, 05 Sep 2019 20:48:01 GMT) Full text and rfc822 format available.

Message #35 received at 37207 <at> debbugs.gnu.org (full text, mbox):

From: Ludovic Courtès <ludo <at> gnu.org>
To: Gábor Boskovits <boskovits <at> gmail.com>
Cc: 37207 <at> debbugs.gnu.org, Danny Milosavljevic <dannym <at> scratchpost.org>,
 Tobias Geerinckx-Rice <me <at> tobias.gr>
Subject: Re: bug#37207: guix.gnu.org Last Modified at epoch
Date: Thu, 05 Sep 2019 22:47:36 +0200
Hello!

Did one of you have a chance to come up with a trick to emit the right
‘Last-Modified’?  We seemed to be close to having something.  :-)

Thanks,
Ludo’.




Information forwarded to bug-guix <at> gnu.org:
bug#37207; Package guix. (Thu, 26 Sep 2019 08:40:02 GMT) Full text and rfc822 format available.

Message #38 received at 37207 <at> debbugs.gnu.org (full text, mbox):

From: Ludovic Courtès <ludo <at> gnu.org>
To: Tobias Geerinckx-Rice <me <at> tobias.gr>
Cc: 37207 <at> debbugs.gnu.org
Subject: Re: bug#37207: guix.gnu.org Last Modified at epoch
Date: Thu, 26 Sep 2019 10:39:45 +0200
Hi Tobias,

Tobias Geerinckx-Rice <me <at> tobias.gr> wrote:

> Turning these off will of course prevent all caching.  I don't know if
> that would add measurable load to guix.gnu.org (it would be more
> problematic if we used a CDN, but it might still make a difference).
>
> Nix does something both interesting and icky — as always: patch[0]
> nginx to look up the realpath() instead, so clients can still cache
> using If-None-Match.

> [0]: https://github.com/NixOS/nixpkgs/pull/48337

(See
<https://raw.githubusercontent.com/NixOS/nixpkgs/9bc23f31d29138f09db6af52708a9b8b64deec64/pkgs/servers/http/nginx/nix-etag-1.15.4.patch>.)

I had overlooked this patch but it looks like the right approach
overall.  Calling ‘realpath’ each time seems a bit expensive as it
creates an ‘lstat’ storm, but I can’t think of a better solution.

I also found this post whose main interest is in showing how to write a
plugin to generate custom etags:

  https://mikewest.org/2008/11/generating-etags-for-static-content-using-nginx/

Thoughts?

Ludo’.




Merged 37207 39051. Request was from Ludovic Courtès <ludo <at> gnu.org> to control <at> debbugs.gnu.org. (Sat, 11 Jan 2020 21:24:02 GMT) Full text and rfc822 format available.

Changed bug title to 'nginx serving files from the store returns Last-Modified = Epoch' from 'guix.gnu.org returns Last-Modified = Epoch' Request was from Ludovic Courtès <ludo <at> gnu.org> to control <at> debbugs.gnu.org. (Sat, 11 Jan 2020 21:27:01 GMT) Full text and rfc822 format available.

Information forwarded to bug-guix <at> gnu.org:
bug#37207; Package guix. (Thu, 26 Mar 2020 23:07:02 GMT) Full text and rfc822 format available.

Message #45 received at 37207 <at> debbugs.gnu.org (full text, mbox):

From: Vincent Legoll <vincent.legoll <at> gmail.com>
To: 37207 <at> debbugs.gnu.org
Subject: Re: nginx serving files from the store returns Last-Modified = Epoch
Date: Fri, 27 Mar 2020 00:06:01 +0100
This bug prevents Repology [1] from showing
the latest versions of packages in Guix,
as it relies on 'Last-Modified' for
https://guix.gnu.org/packages.json
changing in a meaningful way...

[1] https://repology.org/

-- 
Vincent Legoll




Information forwarded to bug-guix <at> gnu.org:
bug#37207; Package guix. (Thu, 26 Mar 2020 23:31:01 GMT) Full text and rfc822 format available.

Message #48 received at 37207 <at> debbugs.gnu.org (full text, mbox):

From: Vincent Legoll <vincent.legoll <at> gmail.com>
To: 37207 <at> debbugs.gnu.org
Subject: Repology
Date: Fri, 27 Mar 2020 00:30:23 +0100
It also paints us as a fairly outdated distro,
despite our efforts to keep pace and
update to the latest versions of packages.

We may even get into the top ten, which
may give us a bit of attention and attract
some distrohoppers^Wusers.

-- 
Vincent Legoll




Information forwarded to bug-guix <at> gnu.org:
bug#37207; Package guix. (Mon, 30 Mar 2020 02:36:26 GMT) Full text and rfc822 format available.

Message #51 received at 37207 <at> debbugs.gnu.org (full text, mbox):

From: Gábor Boskovits <boskovits <at> gmail.com>
To: Vincent Legoll <vincent.legoll <at> gmail.com>
Cc: 37207 <at> debbugs.gnu.org
Subject: Re: bug#37207: nginx serving files from the store returns
 Last-Modified = Epoch
Date: Sun, 29 Mar 2020 11:50:21 +0200
Hello Vincent,

Vincent Legoll <vincent.legoll <at> gmail.com> wrote (on Fri, 27 Mar 2020,
0:07):
>
> This bug prevents Repology [1] from showing
> the latest versions of packages in guix,
> as it relies on 'Last-Modified' for:
> https://guix.gnu.org/packages.json
> changing in a meaningful way...
>

Does it also use ETags, or just Last-Modified?

I ask because we already have a bug similar to this one, and it would
be interesting to know whether
meaningful ETag generation would be enough, or whether we still have to
deal with Last-Modified.

> [1] https://repology.org/
>
> --
> Vincent Legoll
>
>
>

Best regards,
g_bor
-- 
OpenPGP Key Fingerprint: 7988:3B9F:7D6A:4DBF:3719:0367:2506:A96C:CF63:0B21




Information forwarded to bug-guix <at> gnu.org:
bug#37207; Package guix. (Mon, 30 Mar 2020 11:54:02 GMT) Full text and rfc822 format available.

Message #54 received at 37207 <at> debbugs.gnu.org (full text, mbox):

From: Vincent Legoll <vincent.legoll <at> gmail.com>
To: Gábor Boskovits <boskovits <at> gmail.com>
Cc: 37207 <at> debbugs.gnu.org
Subject: Re: bug#37207: nginx serving files from the store returns
 Last-Modified = Epoch
Date: Mon, 30 Mar 2020 13:53:35 +0200
Hello,

On Sun, Mar 29, 2020 at 11:50 AM Gábor Boskovits <boskovits <at> gmail.com> wrote:
> Does it also use etags, or just last-modified?

From the email exchange I had with the maintainer of the site,
I think it only uses last-modified.

> I ask this because we already have bug similar to this, and it would
> be interesting to know if
> it would be enough to have a meaningful etags generation, or we
> still have to deal with last-modified.

Are ETags easier for us to handle?

-- 
Vincent Legoll




Information forwarded to bug-guix <at> gnu.org:
bug#37207; Package guix. (Sat, 09 May 2020 22:08:02 GMT) Full text and rfc822 format available.

Message #57 received at 37207 <at> debbugs.gnu.org (full text, mbox):

From: Christopher Baines <mail <at> cbaines.net>
To: 37207 <at> debbugs.gnu.org
Cc: ludo <at> gnu.org
Subject: Re: bug#37207: guix.gnu.org returns Last-Modified = Epoch
Date: Sat, 09 May 2020 23:07:41 +0100
[Message part 1 (text/plain, inline)]
Ludovic Courtès <ludo <at> gnu.org> writes:

> Since the use of the ‘static-web-site’ service, which puts web site
> files in the store, nginx returns a ‘Last-Modified’ header that can
> trick clients into caching things forever:
>
> --8<---------------cut here---------------start------------->8---
> $ wget --debug -O /dev/null   https://guix.gnu.org/packages.json 2>&1 | grep Last
> Last-Modified: Thu, 01 Jan 1970 00:00:01 GMT
> --8<---------------cut here---------------end--------------->8---
>
> We should tell nginx not to emit ‘Last-Modified’, or to take the
> state from the /srv/guix.gnu.org symlink, if possible.

I ended up looking at this again in relation to Repology [1].

1: https://github.com/repology/repology-updater/issues/218#issuecomment-525905704

Going back to that comment, given that the Last-Modified header (and the
ETag) is wrong, it's probably sensible to remove them. That might even
fix the issue with Repology fetching the packages.json file.

Alternatively (or in addition), we could run a really simple Guile web
server that just serves the packages.json file with the right
Last-Modified value, and have nginx proxy requests to that server. This
would be pretty easy to set up, I believe, and would allow providing a
correct value.

Thoughts?

Chris
[signature.asc (application/pgp-signature, inline)]

Information forwarded to bug-guix <at> gnu.org:
bug#37207; Package guix. (Sun, 10 May 2020 10:12:02 GMT) Full text and rfc822 format available.

Message #60 received at 37207 <at> debbugs.gnu.org (full text, mbox):

From: Ludovic Courtès <ludo <at> gnu.org>
To: Christopher Baines <mail <at> cbaines.net>
Cc: 37207 <at> debbugs.gnu.org
Subject: Re: bug#37207: guix.gnu.org returns Last-Modified = Epoch
Date: Sun, 10 May 2020 12:11:16 +0200
Howdy!

Christopher Baines <mail <at> cbaines.net> wrote:

> Ludovic Courtès <ludo <at> gnu.org> writes:
>
>> Since the use of the ‘static-web-site’ service, which puts web site
>> files in the store, nginx returns a ‘Last-Modified’ header that can
>> trick clients into caching things forever:
>>
>> --8<---------------cut here---------------start------------->8---
>> $ wget --debug -O /dev/null   https://guix.gnu.org/packages.json 2>&1 | grep Last
>> Last-Modified: Thu, 01 Jan 1970 00:00:01 GMT
>> --8<---------------cut here---------------end--------------->8---
>>
>> We should tell nginx not to emit ‘Last-Modified’, or to take the
>> state from the /srv/guix.gnu.org symlink, if possible.
>
> I ended up looking at this again in relation to Repology [1].
>
> 1: https://github.com/repology/repology-updater/issues/218#issuecomment-525905704
>
> Going back to that comment, given that the Last-Modified header (and the
> ETag) is wrong, it's probably sensible to remove them. That might even
> fix the issue with Repology fetching the packages.json file.
>
> Alternatively (or in addition), we could run a really simple Guile web
> server that just serves the packages.json file with the right
> Last-Modified value, and have NGinx proxy requests to that server. This
> would be pretty easy to setup I believe, and would allow providing a
> correct value.
>
> Thoughts?

I think it wouldn’t really help because the Last-Modified issue is
pervasive.  It shows for instance when accessing the web site: one often
has to force the browser to reload pages to get the latest version.

So I’m all for one of the solutions that were proposed earlier.

WDYT?

Ludo’.




Information forwarded to bug-guix <at> gnu.org:
bug#37207; Package guix. (Mon, 11 May 2020 10:33:02 GMT) Full text and rfc822 format available.

Message #63 received at 37207 <at> debbugs.gnu.org (full text, mbox):

From: Christopher Baines <mail <at> cbaines.net>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: 37207 <at> debbugs.gnu.org
Subject: Re: bug#37207: guix.gnu.org returns Last-Modified = Epoch
Date: Mon, 11 May 2020 11:32:19 +0100
[Message part 1 (text/plain, inline)]
Ludovic Courtès <ludo <at> gnu.org> writes:

> Howdy!
>
> Christopher Baines <mail <at> cbaines.net> wrote:
>
>> Ludovic Courtès <ludo <at> gnu.org> writes:
>>
>>> Since the use of the ‘static-web-site’ service, which puts web site
>>> files in the store, nginx returns a ‘Last-Modified’ header that can
>>> trick clients into caching things forever:
>>>
>>> --8<---------------cut here---------------start------------->8---
>>> $ wget --debug -O /dev/null   https://guix.gnu.org/packages.json 2>&1 | grep Last
>>> Last-Modified: Thu, 01 Jan 1970 00:00:01 GMT
>>> --8<---------------cut here---------------end--------------->8---
>>>
>>> We should tell nginx not to emit ‘Last-Modified’, or to take the
>>> state from the /srv/guix.gnu.org symlink, if possible.
>>
>> I ended up looking at this again in relation to Repology [1].
>>
>> 1: https://github.com/repology/repology-updater/issues/218#issuecomment-525905704
>>
>> Going back to that comment, given that the Last-Modified header (and the
>> ETag) is wrong, it's probably sensible to remove them. That might even
>> fix the issue with Repology fetching the packages.json file.
>>
>> Alternatively (or in addition), we could run a really simple Guile web
>> server that just serves the packages.json file with the right
>> Last-Modified value, and have NGinx proxy requests to that server. This
>> would be pretty easy to setup I believe, and would allow providing a
>> correct value.
>>
>> Thoughts?
>
> I think it wouldn’t really help because the Last-Modified issue is
> pervasive.  It shows for instance when accessing the web site: one often
> has to force the browser to reload pages to get the latest version.
>
> So I’m all for one of the solutions that were proposed earlier.
>
> WDYT?

So I think removing the Last-Modified header from the responses will fix
the issue with the Repology fetcher: it will stop thinking it has
already fetched the file (since the file was last modified in 1970), and
will instead always process the file.

Removing the Last-Modified header, and maybe the ETag as well, from
responses should avoid the issue with web browsers using a cached
version of the page when they probably shouldn't.

I realise that what I described, using a Guile web server to serve the
packages.json file, wouldn't help with other pages (unless they're served
that way as well, which is a possibility), but that was just an optimisation
over removing the header entirely, as having the Last-Modified header with a
correct value would help the Repology fetcher cache the file.

Does that make sense? It still seems to me that a small change to the
nginx config (I think these lines somewhere in the config would do it
[1]) would help with the Repology fetcher issue, and the issue you
describe with web browsers.

1:

add_header Last-Modified "";
if_modified_since off;
etag off;
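
To spell out why a constant 1970 timestamp keeps clients on stale copies,
here is a Python model of the server-side comparison. It is a
simplification, not nginx's code, and Repology's fetcher itself is not
shown in this thread. The mode names mirror the values of nginx's
if_modified_since directive; the function name is made up.

```python
from email.utils import parsedate_to_datetime


def not_modified(if_modified_since, file_mtime, mode="exact"):
    """Return True when the server would answer 304 Not Modified.

    "off" ignores the request header, "exact" requires an exact match,
    and "before" accepts any file time at or before the client's date.
    """
    if mode == "off" or if_modified_since is None:
        return False
    client_time = int(parsedate_to_datetime(if_modified_since).timestamp())
    if mode == "exact":
        return file_mtime == client_time
    return file_mtime <= client_time


# Every store file has mtime 1, so a client that cached an old version still
# matches after the content has changed, unless the check is switched off.
epoch = "Thu, 01 Jan 1970 00:00:01 GMT"
print(not_modified(epoch, 1), not_modified(epoch, 1, mode="off"))
```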
[signature.asc (application/pgp-signature, inline)]

Information forwarded to bug-guix <at> gnu.org:
bug#37207; Package guix. (Mon, 11 May 2020 12:48:02 GMT) Full text and rfc822 format available.

Message #66 received at 37207 <at> debbugs.gnu.org (full text, mbox):

From: Ludovic Courtès <ludo <at> gnu.org>
To: Christopher Baines <mail <at> cbaines.net>
Cc: 37207 <at> debbugs.gnu.org
Subject: Re: bug#37207: guix.gnu.org returns Last-Modified = Epoch
Date: Mon, 11 May 2020 14:47:34 +0200
Hi,

Christopher Baines <mail <at> cbaines.net> wrote:

> So I think removing the Last-Modified header from the responses will fix
> the issue with the Repology fetcher (as it will stop thinking it's
> already fetched the file, since it was last modified in 1970), instead it
> will just always process the file.
>
> Removing the Last-Modified header, and maybe the ETag as well from
> responses should avoid the issue with web browsers using a cached
> version of the page when they probably shouldn't.

It would prevent client-side caching altogether.  So perhaps we can do
that as a stopgap (and it has the advantage of requiring only a tiny
config change).

Eventually, it’d be nice to have something better, like one of the ETag
patches discussed upthread.

Thanks,
Ludo’.




Information forwarded to bug-guix <at> gnu.org:
bug#37207; Package guix. (Fri, 15 May 2020 21:14:01 GMT) Full text and rfc822 format available.

Message #69 received at 37207 <at> debbugs.gnu.org (full text, mbox):

From: anadon via web <issues.guix.gnu.org <at> elephly.net>
To: 37207 <at> debbugs.gnu.org
Subject: nginx serving files from the store returns Last-Modified = Epoch
Date: Fri, 15 May 2020 23:12:51 +0200
Any movement on this?





Information forwarded to bug-guix <at> gnu.org:
bug#37207; Package guix. (Mon, 25 May 2020 08:21:02 GMT) Full text and rfc822 format available.

Message #72 received at 37207 <at> debbugs.gnu.org (full text, mbox):

From: Christopher Baines <mail <at> cbaines.net>
To: 37207 <at> debbugs.gnu.org
Subject: [PATCH] nginx: berlin: Work around Last-Modified issues for
 guix.gnu.org.
Date: Mon, 25 May 2020 09:20:47 +0100
* hydra/nginx/berlin.scm (%berlin-servers): Add some config to the
nginx-server-configurations for guix.gnu.org.
---
 hydra/nginx/berlin.scm | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/hydra/nginx/berlin.scm b/hydra/nginx/berlin.scm
index 303fd35..8c90eb1 100644
--- a/hydra/nginx/berlin.scm
+++ b/hydra/nginx/berlin.scm
@@ -514,6 +514,13 @@ PUBLISH-URL."
     (locations guix.gnu.org-locations)
     (raw-content
      (list
+      ;; TODO This works around NGinx using the epoch for the
+      ;; Last-Modified date, as well as the etag.
+      ;; See http://issues.guix.info/issue/37207
+      "add_header Last-Modified \"\";"
+      "if_modified_since off;"
+      "etag off;"
+
       "access_log /var/log/nginx/guix-info.access.log;")))
 
    (nginx-server-configuration
@@ -634,6 +641,13 @@ PUBLISH-URL."
      (append
       %tls-settings
       (list
+       ;; TODO This works around NGinx using the epoch for the
+       ;; Last-Modified date, as well as the etag.
+       ;; See http://issues.guix.info/issue/37207
+       "add_header Last-Modified \"\";"
+       "if_modified_since off;"
+       "etag off;"
+
        "access_log /var/log/nginx/guix-gnu-org.https.access.log;"))))
 
    (nginx-server-configuration
-- 
2.26.2





Information forwarded to bug-guix <at> gnu.org:
bug#37207; Package guix. (Mon, 25 May 2020 08:26:02 GMT) Full text and rfc822 format available.

Message #75 received at 37207 <at> debbugs.gnu.org (full text, mbox):

From: Christopher Baines <mail <at> cbaines.net>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: 37207 <at> debbugs.gnu.org
Subject: Re: bug#37207: guix.gnu.org returns Last-Modified = Epoch
Date: Mon, 25 May 2020 09:24:59 +0100
[Message part 1 (text/plain, inline)]
Ludovic Courtès <ludo <at> gnu.org> writes:

> Hi,
>
> Christopher Baines <mail <at> cbaines.net> wrote:
>
>> So I think removing the Last-Modified header from the responses will fix
>> the issue with the Repology fetcher (as it will stop thinking it's
>> already fetched the file, since it was last modified in 1970), instead it
>> will just always process the file.
>>
>> Removing the Last-Modified header, and maybe the ETag as well from
>> responses should avoid the issue with web browsers using a cached
>> version of the page when they probably shouldn't.
>
> It would prevent client-side caching altogether.  So perhaps we can do
> that as a stopgap (and it has the advantage of requiring only a tiny
> config change).

Great, I've finally got around to sending a patch for this now.
[signature.asc (application/pgp-signature, inline)]

This bug report was last modified 5 years and 17 days ago.



GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.