GNU bug report logs -
#42162
gforge.inria.fr to be taken off-line in Dec. 2020
Hi zimoun,
zimoun <zimon.toutoune <at> gmail.com> writes:
> One question is how this database scales?
>
> For example, a quick back-of-the-envelope estimate leads to ~1.2GB of
> metadata for ~14k packages and then an increase of ~700MB per year,
> both with Ludo’s code [1].
>
> [1] <http://issues.guix.gnu.org/issue/42162#11>
It’s a good question. A good part of the size comes from the
representation rather than the data. Compression helps a lot here. I
have a database of 3,912 packages. It’s 295M uncompressed (which is a
little better than your estimate). If I pass each file through Lzip,
it shrinks down to 60M. That’s more like 15.5K per package, which is
almost an order of magnitude smaller than the estimate you used
(120K). I think that makes the numbers rather pleasant, but it comes at
the expense of easy storage in Git.
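
As a rough check of the arithmetic above, here is a minimal sketch in
Python. It assumes that “M” means MiB and “K” means KiB (the message
does not say), and the projection to ~14k packages assumes the average
compressed size per package stays the same, which may not hold.

```python
# Figures quoted in the message. Units are an assumption: M = MiB, K = KiB.
PACKAGES = 3912          # packages in the database
UNCOMPRESSED_M = 295     # total size, uncompressed
LZIP_M = 60              # total size after passing each file through Lzip
ESTIMATE_K = 120         # per-package estimate cited from the earlier message
TARGET_PACKAGES = 14000  # "~14k packages" from the quoted question


def per_package_k(total_m: float, packages: int) -> float:
    """Average size per package in K, given a total in M."""
    return total_m * 1024 / packages


raw_k = per_package_k(UNCOMPRESSED_M, PACKAGES)
lzip_k = per_package_k(LZIP_M, PACKAGES)

# How much smaller the compressed figure is than the cited estimate.
ratio_vs_estimate = ESTIMATE_K / lzip_k

# Hypothetical projection: same average compressed size at ~14k packages.
projected_m = lzip_k * TARGET_PACKAGES / 1024

print(f"uncompressed: {raw_k:.1f}K per package")
print(f"lzip:         {lzip_k:.1f}K per package")
print(f"estimate / lzip: {ratio_vs_estimate:.1f}x")
print(f"projected at {TARGET_PACKAGES} packages: {projected_m:.0f}M")
```

With these assumptions the compressed figure lands between 15K and 16K
per package, in line with the 15.5K quoted above.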
> As mentioned [2], should this service be part of SWH (download cooking
> task)? Or project side?
>
> [2] <https://forge.softwareheritage.org/T2430#47486>
It would be interesting to just have SWH absorb the project. Since
other distros already know how to produce a “sources.json” and how to
query the SWH archive, it would mean that they benefit for free (and so
would Guix, for that matter). I’m open to that, but right now having
the freedom to experiment is important.
-- Tim
This bug report was last modified 2 years and 287 days ago.