Tyree’s Code: John Henry… For Spreadsheets?
In Case you just stumbled upon this: See this Blog
This blog post is a technical breakdown of the following Google Sheets formula:
=SUM(INDIRECT( "'DataSet'!" & ADDRESS(2, MATCH(E1, DataSet!$1:$1, 0)) & ":" & ADDRESS(ROWS(DataSet!R:R), MATCH(E1, DataSet!$1:$1, 0))))
I used ChatGPT to summarize this Google Sheets code. This code, itself, was created by ChatGPT at the behest of a friend of mine. You can get a full explanation as well as context as to why I’m having ChatGPT explain this formula here: John Henry… For Spreadsheets.
Without further ado…
Here is ChatGPT’s explanation of my friend’s Formula
Step 1: MATCH(E1, DataSet!$1:$1, 0)
This part searches for the value in cell E1 within row 1 of the DataSet sheet (typically the header row). Here’s how it works:
E1: This is the value you’re trying to find in the headers.
DataSet!$1:$1: The range is the entire first row of the DataSet sheet, where each cell is a header.
0: The match type of 0 means it looks for an exact match.
This MATCH function returns the column number where the value in E1 is found within the header row.
Step 2: ADDRESS(2, MATCH(...))
The ADDRESS function generates a cell reference as a text string based on row and column numbers. Here’s how it’s used here:
Row argument (2): This specifies the start row, which in this case is row 2.
Column argument: This is provided by MATCH(E1, DataSet!$1:$1, 0), which returns the column number of the matching header.
This results in a cell reference like "B2" (if E1 matched with column B in the headers). This becomes the start cell for the range we want to sum.
Step 3: ROWS(DataSet!R:R)
This part counts the number of rows in column R on the DataSet sheet:
Since R:R refers to the entire column, ROWS(DataSet!R:R) returns the total number of rows in the sheet, regardless of content.
This value is then used as the last row in the range for the SUM calculation, setting an end point for the row range.
Step 4: ADDRESS(ROWS(DataSet!R:R), MATCH(...))
This ADDRESS function uses the output from ROWS(DataSet!R:R) as the row number, paired again with the MATCH output for the column number:
This will give something like "B1000" if ROWS(DataSet!R:R) is 1000 and MATCH found column B.
This cell reference represents the end cell for the range.
Step 5: INDIRECT("'DataSet'!" & ADDRESS(...) & ":" & ADDRESS(...))
The INDIRECT function constructs a range from two cell references (start and end) created by ADDRESS:
The two ADDRESS functions are combined with ":" to create a range like "B2:B1000".
INDIRECT("'DataSet'!" & "B2:B1000") converts this text into an actual reference that Excel can use to evaluate and return a range of values from B2 to B1000 in the DataSet sheet.
Step 6: SUM(...)
Finally, the SUM function calculates the total of all values in the constructed range from INDIRECT:
If your dataset is large, this can get slow because INDIRECT recalculates every time a cell in the spreadsheet is changed, affecting performance.
Summary of What It Does
Finds the column corresponding to the header in E1.
Creates a range from the 2nd row in that column down to the last row.
Sums up all the values in that range.
In short, this formula dynamically identifies and sums a vertical column of values in the DataSet sheet, where the column is chosen based on the header specified in E1.