
Trip Log 2017-08-23 h07 -- Indices Style Upgrade Part 3

We continue the “indices style upgrade” trip (see accompanying tarball to follow along) now, pondering facets of the universal constant, in the process of drawing a conclusion on the merit (if any) of patch 1. 

First off, to demystify the intro paragraph (which I write wearing a marketing hat, as that paragraph is excerpted into the site Atom feed and thus needs to be alluring), the universal constant is “change”, and the facets of change in question are “motion” and “gain/loss”.  Actually, these are not new topics; we've discovered that patch 1 involves moving a fancy verb from one location in the source code to another (motion), and we've seen numbers for delta and footprint (gain/loss).  What's new is how these concepts relate, and how they play into our (eventual) valuation of the patch. 

In patch 1, let's turn our attention from the initial portion, the commit message, to the rest of the file, starting at line 13.  BTW, if you use GNU Emacs (you should!), working w/ line numbers is easy.  After you ‘C-x C-f’ (or ‘M-x find-file’) the patch file, you can either turn on Line Number mode (via command ‘M-x line-number-mode’) to see the “current line” (where the cursor is) displayed and updated dynamically, or you can have Emacs report the current line on demand via command ‘M-x what-line’.  (Personally, I bind ‘M-l’ to ‘what-line’ and use ‘M-l’ every so often; YMMV.)  If you don't use Emacs, you can use ‘cat -n FILENAME’ in a shell to see line numbers, e.g.: 

$ cat -n 0001-tl-mkindex-int-Use-sort-car-more.patch
     1  From 805241d75f191f815a279074c2dfd78f34e2b869 Mon Sep 17 00:00:00 2001
     2  From: Thien-Thi Nguyen <>
     3  Date: Mon, 7 Aug 2017 19:47:07 +0200
     4  Subject: [PATCH 1/8] =?UTF-8?q?[tl-mkindex=20int]=20Use=20=E2=80=98sort/ca?=
     5   =?UTF-8?q?r<=E2=80=99=20more.?=
     6  MIME-Version: 1.0
     7  Content-Type: text/plain; charset=UTF-8
     8  Content-Transfer-Encoding: 8bit
    10  * sub/tl-mkindex (sort/car<): New proc, moved to top-level from...
    11  (consult-db sort/car<):
    12  (interesting-elements): Use ‘sort/car<’.
    13  ---
    14   sub/tl-mkindex | 14 ++++++--------
    15   1 file changed, 6 insertions(+), 8 deletions(-)
    17  diff --git a/sub/tl-mkindex b/sub/tl-mkindex

Anyway, as mentioned previously, lines 14 and 15 show the footprint and gain/loss in terms of lines (the delta was hand-computed).  As this portion of the patch is machine-generated, we need not (but nonetheless can choose to) verify that yes, those counts reflect the respective summation of plus (‘+’) and minus (‘-’) characters in the left margin, from line 17 onwards. 

“But ttn, how shamelessly you lie!  I count 7 ‘+’ (lines 20, 25-29, 52) and 10 ‘-’ (lines 19, 37-40, 48-51, 56), not 6 and 8.” 

Yes, you are right.  I failed to outline the patch's structure from line 17 onwards: a “header” (lines 17-20), three “hunks” (lines 21-32, 33-43, 44-55), and a “footer” (lines 56-58); and furthermore explain that only the left-margin plus and minus chars in the hunks count (i.e., we can disregard lines 19, 20, and 56).  I did this (lie by omitting important detail) partly to exercise a rhetorical device, but mostly to encourage you to question authority (both me and the patch).  For the gain/loss in terms of lines of code is incidental to the gain/loss in terms of trust (in both me — or whomever you might query to explain an unfamiliar topic — and the patch). 
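If you'd rather let a machine do the tallying, here is a minimal Python sketch of the "only hunks count" rule.  The tiny patch in SAMPLE is made up for illustration (it is not patch 1), and the names naive_count and count_changes are hypothetical.  The sketch relies on the line counts in each line that starts with ‘@@’ to know where a hunk ends, and it does not handle every corner of the format (e.g., "\ No newline at end of file" markers).

```python
import re

# A made-up miniature patch (NOT patch 1), just to show the structure:
# header, one hunk, footer.
SAMPLE = """\
diff --git a/f b/f
index 1111111..2222222 100644
--- a/f
+++ b/f
@@ -1,3 +1,3 @@
 alpha
-beta
+BETA
 gamma
-- 
2.11.0
"""

HUNK_START = re.compile(r"^@@ -\d+(?:,(\d+))? \+\d+(?:,(\d+))? @@")


def naive_count(patch):
    """Count every line with '+' or '-' in the left margin."""
    lines = patch.splitlines()
    plus = sum(1 for ln in lines if ln.startswith("+"))
    minus = sum(1 for ln in lines if ln.startswith("-"))
    return plus, minus


def count_changes(patch):
    """Count '+' and '-' lines inside hunks only."""
    plus = minus = 0
    old_left = new_left = 0  # lines still expected in the current hunk
    for ln in patch.splitlines():
        m = HUNK_START.match(ln)
        if m:
            # An omitted count means 1.
            old_left = int(m.group(1) or 1)
            new_left = int(m.group(2) or 1)
        elif old_left > 0 or new_left > 0:
            if ln.startswith("+"):
                plus += 1
                new_left -= 1
            elif ln.startswith("-"):
                minus += 1
                old_left -= 1
            else:  # context line: present before and after
                old_left -= 1
                new_left -= 1
    return plus, minus


print(naive_count(SAMPLE))    # header and footer lines inflate the tally
print(count_changes(SAMPLE))  # hunk lines only
```

On this sample the naive tally is (2, 3) and the hunk-only tally is (1, 1); the difference is exactly the ‘---’, ‘+++’ and ‘-- ’ lines outside the hunk.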

[Insert diatribe here about long-standing (and difficult to discern and uproot) PTB subversion of communications systems (like the Internet) through putatively peer-reviewed software development practices.  Sez HAL: “This sort of thing has cropped up before, and it has always been due to human error.”] 

Of course, how you go about wielding your mistrust is key; doing it wrong “greatly weakens the ability to cooperate with others” (allegedly).  Probably a good idea to temper the adversarial stance w/ a little charity if you can “afford” it, i.e., avoid “You lie!” and prefer “What am I missing?” if possible.  Yes, the middle way — always an option... 

Alright, so measuring gain/loss of trust is the name of the game.  Well, the patch gained some trust through commit message cross-checks (see preceding trip log), so let's try for more by cross-checking the commit message against the hunks.  (I called these collectively “the actual patch” in the preceding trip log's last paragraph — apologies for the imprecision.) 

Each hunk has the same structure: one “at-at-line” followed by one or more insertion lines (‘+’ at beginning of line), deletion lines (‘-’ at bol) or context lines (space at bol).  Context lines are present both before and after the patch is applied (they don't change).  The at-at-line also has structure:

@@ -K,L +N,M @@ TEXT

K and N are starting line numbers (in the old and new file, respectively), and L and M are the matching line counts; none of these is interesting here.  TEXT consistently (in all patches in this patch set) has the form: ‘(define (PROCEDURE-NAME ...’.  (This is not always the case.  Generally, TEXT can be anything, or even missing.) 
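For the curious, the numbers do allow one more cross-check.  Reading K and N as starting line numbers and L and M as line counts (the usual unified-diff convention), each hunk's new start should equal its old start plus the net growth of the hunks before it.  A minimal Python sketch, using the three at-at-lines of patch 1; the name parse_at_at is hypothetical, and the regular expression assumes both counts are present:

```python
import re

AT_AT = re.compile(r"^@@ -(\d+),(\d+) \+(\d+),(\d+) @@ ?(.*)$")

# The three at-at-lines of patch 1 (lines 21, 33 and 44 of the patch).
HEADERS = [
    "@@ -209,6 +209,11 @@ (define (~listed url txt)",
    "@@ -229,10 +234,6 @@ (define (consult-db upath-ignored where-clause)",
    "@@ -276,10 +277,7 @@ (define (interesting-elements)",
]


def parse_at_at(line):
    """Return (K, L, N, M, TEXT) for one at-at-line."""
    match = AT_AT.match(line)
    if match is None:
        raise ValueError("not an at-at-line: %r" % line)
    numbers = tuple(int(g) for g in match.groups()[:4])
    return numbers + (match.group(5),)


shift = 0  # net lines added by the hunks seen so far
for header in HEADERS:
    old_start, old_count, new_start, new_count, text = parse_at_at(header)
    # The new start is the old start plus the growth from earlier hunks.
    assert new_start == old_start + shift
    shift += new_count - old_count
    print(old_start, old_count, new_start, new_count, text)

print(shift)  # net change in lines over the whole patch
```

The final shift is -2, which agrees with 6 insertions minus 8 deletions from line 15 of the patch.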

Generalities out of the way, let's look at the first hunk (lines 21-32).  For your convenience, I excerpt it here, w/ line numbers prefixed: 

    21  @@ -209,6 +209,11 @@ (define (~listed url txt)
    22                                     (list txt)))))
    23               url txt)))
    25  +(define (sort/car< ls)
    26  +  (sort ls (lambda (a b)
    27  +             (string<? (car a)
    28  +                       (car b)))))
    29  +
    30   (define (consult-db upath-ignored where-clause)
    32     (define query

It shows five insertion lines and no deletion lines; the hunk is purely “additive”.  The PROCEDURE-NAME ‘~listed’ (line 21) doesn't ring any bells — first time we've encountered it — but ‘sort/car<’ (line 25) does.  That identifier appears in the patch title, and indeed on every line of the commit message.  Hmm, which line (if any) of the commit message corresponds to this hunk? 

Well, in the addition lines text, there are the same number of open-parens as close-parens; moreover, they are properly nested.  This tells us that the addition lines constitute a complete “form”.  If you are using Emacs, you can verify this by moving the cursor to the first open-paren on line 25 and typing ‘C-M-f’ to move forward, and ‘C-M-b’ to move backward, over the form.  The form is not indented; there is no space between the left margin and the first open-paren.  This tells us that it might be at top-level. 
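Non-Emacs users can approximate the paren check w/ a few lines of Python.  This is a naive sketch (the name is_complete_form is hypothetical): it counts every paren it sees, so parens inside strings, comments or character literals would fool it.  None of those occur in the added lines, which are copied here from lines 25-28 of the patch.

```python
# The added form, from lines 25-28 of the patch (margin '+' removed).
ADDED = """\
(define (sort/car< ls)
  (sort ls (lambda (a b)
             (string<? (car a)
                       (car b)))))
"""


def is_complete_form(text):
    """True if text is exactly one balanced, properly nested form."""
    stripped = text.strip()
    if not stripped.startswith("("):
        return False
    depth = 0
    last = len(stripped) - 1
    for i, ch in enumerate(stripped):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False  # a close-paren w/o a matching open-paren
            if depth == 0 and i != last:
                return False  # the first form ended before the text did
    return depth == 0


print(is_complete_form(ADDED))
```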

Why only “might be” and not “is”?  Because formatting (which includes indentation) has no bearing on the meaning of programs written in Scheme (the programming language used here), so sometimes programmers pay less attention to that aspect and introduce misleading (albeit non-damaging) changes.  At this time, we are still building trust in this patch (and its author); vigilance is still indicated. 

Overall, the clues give us a high probability that this hunk corresponds to the first line in the commit message (line 10 in the patch).  We put that thought (let's call it ‘hunk-1-is-line-1’) on the “probable” shelf and now turn to the second hunk (lines 33-43), excerpted here: 

    33  @@ -229,10 +234,6 @@ (define (consult-db upath-ignored where-clause)
    34                           (map type-objectifier
    35                                '(text *text
    36                                       *text))))))
    37  -    (define (sort/car< ls)
    38  -      (sort ls (lambda (a b)
    39  -                 (string<? (car a)
    40  -                           (car b)))))
    42       (define (one tag upath title)
    43         (cons tag (sort/car< (map cons upath title))))

This hunk shows no insertion lines and four deletion lines; it is purely “subtractive”.  The changed text is identical to that in the first hunk, a complete form.  Unlike the first, however, this form is indented (by four columns, to be precise), which might be a clue that it is “inside” another form.  Corroborating this suspicion is the PROCEDURE-NAME ‘consult-db’ (line 33), which in the second line of the commit message (line 11 in the patch) appears as a top-level identifier.  Cool: correspondence clinched! 

Surfing on this wave of reasoning, we note:

and happily “move” (yuk yuk) ‘hunk-1-is-line-1’ from the “probable” shelf to the “certain” floor.  (The floor is usually underfoot and — di solito — “s”table. :-D) 

All this motion is characterized by the pairing of purely subtractive and purely additive changes involving identical forms.  Moving ‘sort/car<’ does not change it.  Moving ‘hunk-1-is-line-1’ does not change it.  The essential nature of these changes is solely in location. 
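The claim "identical forms" can be checked mechanically, too.  Since formatting has no bearing on the meaning of a Scheme program, one rough approach is to collapse all runs of whitespace and compare what remains.  A minimal Python sketch, w/ the two forms copied from hunks 1 and 2; the name normalize is hypothetical, and the approach would wrongly collapse whitespace inside string literals (there are none here):

```python
# Added form: lines 25-28 of the patch (margin '+' removed).
ADDED = """\
(define (sort/car< ls)
  (sort ls (lambda (a b)
             (string<? (car a)
                       (car b)))))
"""

# Removed form: lines 37-40 of the patch (margin '-' removed).
REMOVED = """\
    (define (sort/car< ls)
      (sort ls (lambda (a b)
                 (string<? (car a)
                           (car b)))))
"""


def normalize(form):
    """Collapse every run of whitespace (incl. newlines) to one space."""
    return " ".join(form.split())


print(normalize(ADDED) == normalize(REMOVED))
```

If the comparison holds, the only difference between the two forms is indentation, i.e., location.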

So, returning to the game (measuring gain/loss of trust), is “move” the best verb for the job?  Why not “translate” (def. 5), “displace” (def. 1), or even “hoist” (def. 1)?  As the author of the patch, I cannot give an impartial answer.  I chose “move” because those others are too specific — and potentially misleading, in the case of “hoist” (def. 4) — to my ear.  Plus, “move” is short and sweet, not windy and full of frills (like this trip log).  If these reasons resonate w/ you, tally a gain in trust.  If not, tally a loss.  Up to you. 

“But ttn, why the cold shoulder?  Can we get back to patch 1 particulars?” 

Yeah, tone fail.  I'm trying to convey a “tough love” vibe but (like real life) am quite the klutz w/ nuanced (non-explicit) expression.  Sorry — I'm working on it...  :-( 

Back to patch 1 particulars, then.  By process of elimination, and presuming a 1-to-1 correspondence, the third hunk (lines 44-55) must correspond w/ line 3 of the commit message (line 12 in the patch).  Indeed, that its PROCEDURE-NAME ‘interesting-elements’ appears in the commit message ‘ID’ position is a strong confirmation.  Here's the excerpt: 

    44  @@ -276,10 +277,7 @@ (define (interesting-elements)
    46     (let-values (((leaf? tag refs) (context)))
    47       (let* ((distinct-refs (delete-duplicates
    48  -                           (sort (apply append refs)
    49  -                                 (lambda (a b)
    50  -                                   (string<? (car a)
    51  -                                             (car b))))))
    52  +                           (sort/car< (apply append refs))))
    53              (url (map car distinct-refs)))
    55         (define (leaf-entries)

This hunk is neither purely additive nor purely subtractive; there are insertions as well as deletions, all involving the lines that follow the ‘distinct-refs’ bit (line 47).  If you are using Emacs, you can open patch 1 (via ‘C-x C-f’) and type ‘C-c C-d’ (aka ‘M-x diff-unified->context’) to transform this hunk to something like: 

*** 276,285 ****

    (let-values (((leaf? tag refs) (context)))
      (let* ((distinct-refs (delete-duplicates
!                            (sort (apply append refs)
!                                  (lambda (a b)
!                                    (string<? (car a)
!                                              (car b))))))
             (url (map car distinct-refs)))

        (define (leaf-entries)
--- 277,283 ----

    (let-values (((leaf? tag refs) (context)))
      (let* ((distinct-refs (delete-duplicates
!                            (sort/car< (apply append refs))))
             (url (map car distinct-refs)))

        (define (leaf-entries)

In this format, the context lines are duplicated into “before” and “after” sections, and the insertion and deletion lines are represented together (undistinguished) as “changed lines” (‘!’ at bol).  This is easier on the eyes for many. 
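To show the marker logic (and only that), here is a rough Python sketch of the idea.  It is not how Emacs does it, the name to_context is hypothetical, the sample hunk is made up, and the range lines (‘*** K,L ****’ and ‘--- N,M ----’) are left out.  A run of deletions paired w/ insertions becomes “changed lines”; an unpaired run keeps its ‘-’ or ‘+’.

```python
def to_context(body):
    """Split the body lines of ONE unified hunk (no at-at-line) into
    the (before, after) line lists of a context-format hunk."""
    before, after = [], []
    group = []  # consecutive '-'/'+' lines awaiting a marker decision

    def flush():
        dels = [ln[1:] for ln in group if ln.startswith("-")]
        adds = [ln[1:] for ln in group if ln.startswith("+")]
        # Deletions paired with insertions become "changed" lines.
        changed = bool(dels) and bool(adds)
        before.extend(("! " if changed else "- ") + t for t in dels)
        after.extend(("! " if changed else "+ ") + t for t in adds)
        del group[:]

    for ln in body:
        if ln.startswith(("-", "+")):
            group.append(ln)
        else:
            flush()
            text = ln[1:]  # drop the margin space of a context line
            before.append("  " + text)
            after.append("  " + text)
    flush()
    return before, after


# A made-up hunk body (NOT from patch 1).
HUNK = [" alpha", "-beta", "-gamma", "+BETA-GAMMA", " delta", "+epsilon"]

before, after = to_context(HUNK)
print("\n".join(before))
print("\n".join(after))
```

In the sample, ‘beta’ and ‘gamma’ (before) and ‘BETA-GAMMA’ (after) get the ‘!’ marker, while the lone insertion ‘epsilon’ keeps its ‘+’.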

At this point, further cross-checks require understanding the Scheme programming language, which is definitely out of scope for this trip log.  So, I'll simply list the remaining observations and conclusions now, trying to phrase things as much as possible from a non-programmer pov, and you can trust me or not.  If not, no problem; in that case I suggest you find someone you trust to take a look and voice an opinion.  (I have been a programmer for a long time; the non-programmer pov I adopt is at best a reconstructed memory, a hologram of patchily inconsistent fuzziness.)  First, the observations: 

Now the conclusions, including a bit of reasoning: 

Wow, that was grueling.  If you're still here, kudos!  All cross-checks complete (tally a gain in trust, woo hoo!), let's shift focus from the “what” to the “why”, which is the traditional vector for all sorts of interesting (and trust-wobbling) insights. 

Some of the hifalutin reasons I've already mentioned: portfolio padding (to impress potential employers), joy of writing for humans, joy of writing for computers.  These are (opportunistically) wrapped around the core reason for patch 1: to change a small piece of the code from WET to DRY (i.e., to eliminate a redundancy). 

Per se, that reason is an excellent one, fundamental on both technical and aesthetic scales, and thus you might be inclined to tally a gain in trust.  But wait!  What's the larger context here?  This is code that I wrote, and I have a lot of experience.  Why didn't all that experience impede me from introducing that redundancy in the first place?  Hmmm!  (Tally a loss in trust.) 

To save time, I'll lump all the excuses (which are detailed and somewhat technical) into one: I forgot.  Whether you tally this as a gain or loss is (again) up to you.  In times past, I would have considered such lumping a clear loss, but nowadays, the knife edge is less sharp and oriented differently. 

Lastly, wrt the patina of genius (per Perlis) from the negative delta, can redundancy be considered complexity?  I think in this case, yes, but only ever so slightly.  Personally, the satisfaction I feel for patch 1 derives mostly from a sense of redemption through remediation, like adjusting an off-kilter picture frame back to true vertical.  There's no genius here, really. 

That's it.  I hope you tallied a sufficient gain in trust to join me again in delving into patches 2, 3, and so on.  I hope to expand on “I forgot”, as well (if I don't forget :-D). 

Copyright (C) 2017 Thien-Thi Nguyen