trip logs / gnuvola


Trip Log 2022-01-01 h11 -- Indicies Style Upgrade Part 10

In this episode of the Indices Style Upgrade series we finish up the analysis of patch 7: “Order tags most-impactful first” that we started in the previous trip log.  (Feel free to grab the accompanying tarball and follow along.) 

As you may recall, patch 7 has two hunks, and we looked at most of hunk 1 previously (lines 31-38).  Let's look at the rest of it, now, and see how it fits w/ hunk 2.  Here's the excerpt: 

26	+(define (most-impactful-first tag+count+latest)
27	+
28	+  (define (more-impactful a b)
29	+    (let-values (((a-tag a-count a-latest) (tag+count+latest a))
30	+                 ((b-tag b-count b-latest) (tag+count+latest b)))
[...]
38	+      ...))
39	+
40	+  ;; rv
41	+  (lambda (ls)
42	+    (sort ls more-impactful)))

It appears that the aspects used in the heart of the comparison algorithm — namely the “count”, the “latest”, and the “tag” — are derived from the respective A and B items by the ‘tag+count+latest’ procedure.  That procedure, passed in as the unique arg to ‘most-impactful-first’, returns multiple values, which is why we need the ‘let-values’ form to assign them to the convenient local variables ‘a-tag’, ‘a-count’, and so on (lines 29-30).  So, in the service of ‘sort’ (line 42), our comparison procedure ‘more-impactful’ (lines 28-38) basically has two steps: derive the aspects from the respective A and B items, and do the comparison based on those aspects. 

Right, so where does ‘tag+count+latest’ come from?  It's passed in from the caller of ‘most-impactful-first’ (line 26).  This is where hunk 2 makes its appearance.  Here it is, along w/ the change description part at the beginning of the patch, which does indeed connect the two hunks (line 11): 

 4	Subject: [PATCH 7/8] tl-mkindex: Order tags most-impactful first.
[...]
 9	* sub/tl-mkindex (most-impactful-first): New proc.
10	(lexicographically-ordered): Delete proc.
11	(consult-db): Use ‘most-impactful-first’.
[...]
46	@@ -257,7 +272,17 @@ (define (consult-db upath-ignored where-clause)
47	   (define (one tag upath title)
48	     (cons tag (youngest-first (map cons upath title))))
49
50	-  (lexicographically-ordered
51	+  ((most-impactful-first
52	+    (lambda (x)
53	+      ;; NB: When ‘one’ changes, this must change accordingly, as well.
54	+      (let ((refs (cdr x)))
55	+        (values
56	+         ;; tag
57	+         (car x)
58	+         ;; count
59	+         (length refs)
60	+         ;; latest ref title
61	+         (cdar refs)))))
62	    (apply map one (cols<-tuples-result
63	                    (fetch-tuples)))))
64

By the way, the description of the removal (line 10) and the actual removal of the call (line 50) correspond, so at least we know there is some basic sanity in the system. 

We see that the argument to ‘most-impactful-first’ is an anonymous function (‘lambda’ form, lines 52-61) which returns the three aspects of its arg ‘x’ (line 52), as three values (lines 55-61).  Although they are only comments (lines 56, 58, 60), they still help us match these values to the local variables assigned by the ‘let-values’ form (line 29).  So, that checks out. 

Lastly, remember that ‘most-impactful-first’ returns a thunk (lines 41-42), and THAT is the procedure that needs to be called to actually do the sorting.  Thus, we see the thunk call represented by the extra pair of parentheses surrounding the ‘most-impactful-first’ call (line 51 open, line 61 close).  So the call sequence is as follows:

By now, you might be impressed at the interplay of control and data, or you might be aghast and horrified.  In any case, the question of “Why such complexity?” is a good one.  Apart from the honest but vacuous answer of “It seemed like a good idea at the time”, we can take the comment on line 53 as a clue.  It reminds the reader (programmer, future self) that there is a coupling between the data structure composed by procedure ‘one’ (lines 47-48) and the aspect-returning anonymous function: what ‘one’ packs, the anonymous function must unpack. 

The clue we can derive is this: Because such a coupling is a maintenance burden (it's easy to forget to modify both parts when modifying anything), placing the coupled portions close to each other in the source code (along w/ a salient comment) can help reduce potential problems by literally keeping the portions on the same page.  “Out of sight, out of mind” is a real danger to program maintenance! 

After all, like Perlis sez, “Don't have good ideas if you aren't willing to be responsible for them.” 


Copyright (C) 2022 Thien-Thi Nguyen