Package: emacs;
Reported by: Eric Abrahamsen <eric <at> ericabrahamsen.net>
Date: Thu, 6 Dec 2018 22:40:02 UTC
Severity: wishlist
Tags: fixed
Found in version 27.0.50
Fixed in version 27.1
Done: Lars Ingebrigtsen <larsi <at> gnus.org>
Bug is archived. No further changes may be made.
Message #86 received at submit <at> debbugs.gnu.org (full text, mbox):
From: Eric Abrahamsen <eric <at> ericabrahamsen.net> To: bug-gnu-emacs <at> gnu.org Subject: Re: bug#33653: 27.0.50; Change Gnus obarrays-as-hash-tables into real hash tables Date: Tue, 26 Mar 2019 12:58:32 -0700
Andy Moreton <andrewjmoreton <at> gmail.com> writes: > On Mon 25 Mar 2019, Eric Abrahamsen wrote: > >> Andy Moreton <andrewjmoreton <at> gmail.com> writes: >>> Other notes from reading the code: >>> >>> 1) In `gnus-gnus-to-quick-newsrc-format' you ignore the contents of >>> `gnus-newsrc-alist' when saving "newsrc.eld", and replace it with the >>> details from `gnus-newsrc-hashtb'. Why ? The rest of the gnus code >>> appears to treat `gnus-newsrc-alist' as the single source of truth, >>> with the hash tables being used only for faster access to it. >> >> Eventually I would like to reduce the number of data structures so that >> groups are held in `gnus-newsrc-hashtb', and ordering is kept in >> `gnus-group-list', and that's it. `gnus-newsrc-alist' would only be used >> when persisting to disk. My next proposed change (once I've recovered my >> confidence) is to turn groups into actual objects, in which case the >> alist would really just be a kind of serialization format. > > ok, but if the hash table and the alist reference the same lisp objects > then the existing setup is not so bad. There's nothing wrong with the existing setup here! I'm just laying the groundwork for the next round of changes. >> The hash table ought to be in sync with the rest of the data structure >> -- if it isn't, that's another bug. >> >>> 2) In `gnus-gnus-to-quick-newsrc-format' you dropped the code to remove >>> the dummy group from `gnus-newsrc-alist'. Why ? This internal dummy >>> group is now saved in "newsrc.eld", which is not needed. >> >> This was an error. (Though in my case, I've had the dummy group in my >> newsrc.eld for months, and it hasn't done any harm. I don't know why >> it's necessary.) > > I agree that it's harmless, it just seemed to be an unnecessary change. > I expect it is there to ensure that the alist is never empty to avoid > the need to check that everywhere. Yes, for sure, and I didn't mean to leave it in there, that's just a bug. >>> 3) The format of the entries in `gnus-newsrc-hashtb' has changed, >>> removing the second element. Why? >> >> Because the old `gnus-gethash' call returned a slice of >> `gnus-newsrc-alist', where the second element was actually the group >> *before* the group you wanted, and the third element was the cdr of >> `gnus-newsrc-alist', starting with the group you wanted. This was >> undocumented, and took a bit to figure out. Now, the gethash call just >> gives you the group. Ideally, in the next set of changes, it will give >> you an object. > > ok, so it sounds like the old code was trying hard to use the same lisp > objects in both the hash table and the alist. That avoids alloacting > twice the storage, and ensures that the hash table and the alist stay in > sync. Yes, I'm looking forward to when the groups are actual objects, . But in the meantime, the lisp objects are still the same, I haven't doubled storage (at least `eq' returns t, so I assume that's what that means). >>> 4) You changed several hash tale sizesfrom 4096 to 4000, and 1024 to >>> 1000. Why? >> >> My understanding is that using a prime number is significant when it >> comes to vector access, but that the hash table implementation is >> higher-level, where a prime number is no longer significant. If that's >> incorrect I would like to know! > > None of these numbers are prime. The old numbers are powers of two, > which are reasonable for allocation sizes on a binary machine. Again, > not a terribly important change, but not really needed either. Yes, sorry, powers of two. It's true it's a do-nothing change, I guess I've been erring on the side of trying to make the code more self-explanatory, less "odd". Probably this change didn't need to be made, but it seems about as un-risky as a code change could be. >>> Your patch contains several logical changes that would be easier to >>> understand (and bisect) as a series of patches with one logical change >>> in each patch: >>> - code layout changes >>> - add missing doc strings and code comments >>> - change hash table implementation >>> - change format of `gnus-newsrc-hashtb' entries >>> - change usage of `gnus-group-change-level' >>> - change coding of group names >>> While it can take extra work to split things up, the end result is much >>> easier to understand. >> >> In principle I agree with this completely. In practice I found it >> extraordinarily difficult to touch one part of Gnus without running into >> knock-on repercussions. > > Agreed. I find that this sort of work usually goes in two phases: some > exploratory programming to decide on the path forward and the eventual > goal, followed by rearranging the changes into a logical set of > (bisectable) patches that get to that goal in small, self-contained > steps. The second half is extra work, but worth it to make it easier for > other people to review the patches (and to simplify any archaeology > needed by a later maintainer). > > If changing gnus is hard, then perhaps writing new tests to document > what you have discovered about the code will help to ensure that later > changes do not change the observed behaviour. The previous commit did add some new Gnus tests -- perhaps the first! I'm planning more, once Gnus group are actual objects. Everything's so interconnected now (and data-reliant) that it's very hard to write effective tests. >> The ultimate goal of the changes I have in mind for Gnus is to address >> exactly this: to make it more modular, to improve isolation of code >> paths, and to reduce the number of semi-redundant data structures. But >> the process is evidently even messier than I thought. I held back >> another commit to group name encoding in an attempt to keep things >> simple, but that seems to have made things even worse. > > A worthwhile goal - please do not get discouraged. Thank you, Eric
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.