GNU bug report logs - #6524
du now uses less than half as much memory, sometimes

Previous Next

Package: coreutils;

Reported by: Jim Meyering <jim <at> meyering.net>

Date: Sun, 27 Jun 2010 22:10:02 UTC

Severity: normal

Done: Jim Meyering <jim <at> meyering.net>

Bug is archived. No further changes may be made.

Full log


View this message in rfc822 format

From: Eric Blake <eblake <at> redhat.com>
To: Paul Eggert <eggert <at> CS.UCLA.EDU>
Cc: bug-gnulib <at> gnu.org, 6524 <at> debbugs.gnu.org, Bruno Haible <bruno <at> clisp.org>, José Marchesi <jemarch <at> gnu.org>
Subject: bug#6524: RFC: modules for generic unordered sets and mappings
Date: Thu, 01 Jul 2010 21:26:38 -0600
[Message part 1 (text/plain, inline)]
On 07/01/2010 06:44 PM, Paul Eggert wrote:
> I've just gone through "du", and it's fresh in my
> mind, so I thought I'd run through what I needed there, to see how
> well it maps to the proposal.
> 
> I needed three kinds of hash tables.
> 
> * Table A is indexed by device number (a dev_t value) and the value is a
>   secondary hash table.  Typically the number of devices is relatively
>   small.
> 
> * Table B is indexed by inode number (an ino_t value) and has no value.
>   All that matters is whether the key is present.  So it is a set,
>   not a general mapping.  Often there are many, many inode numbers.
> 
> * The gnulib hash package uses void * keys, but having a void * value
>   that exists only to point to an ino_t value wastes memory.  So, if
>   an inode number is small (less than M, say), it maps to itself.  If
>   it is large, it is a key into a third table (table C) whose values
>   are mapped inode numbers.  (Values in table C are always M or
>   greater.)  The mapped inode number (whether taken from table C, or
>   used directly) is cast to void * and then used as the index into
>   table B.  In practice, most inode numbers are less than M so this is
>   a storage win.

We need to be careful on cygwin.

$ ls -i | head
 1125899907178954 ABOUT-NLS
28147497671067760 AUTHORS
 1970324837308495 COPYING
 9851624184873962 ChangeLog
 4785074604197847 ChangeLog-2005
 1970324837180710 ChangeLog-2006
 1970324837078885 ChangeLog-2007
 2251799813904421 ChangeLog-2008
  281474977732635 GNUmakefile
 1125899907781258 GNUmakefile~

sizeof(void*) == sizeof(size_t) == 4, but sizeof(ino_t) == 8, and most
inodes are quite randomly dispersed but definitely larger than 4 bytes.
 Does your scheme work well at mapping cygwin's 8-byte inodes into
4-byte pointers?

-- 
Eric Blake   eblake <at> redhat.com    +1-801-349-2682
Libvirt virtualization library http://libvirt.org

[signature.asc (application/pgp-signature, attachment)]

This bug report was last modified 14 years and 99 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.