I love that you are doing this test. However, as it purports to be a test of "English-to-SQL", your hardest question (Q9) seems ungrammatical:
> Show order lines, revenue, units sold, revenue per unit (total revenue ÷ total units sold), average list price per product in the subcategory, gross profit, and margin percentage for each product subcategory.
In particular, the clause "in the subcategory, gross profit, and margin percentage for each product subcategory" is ambiguous, and I wonder whether more models would pass if the English were rephrased to be grammatical and unambiguous.
(It's also notable that Claude Opus 4.6 and Sonnet 4.6 both "missed" this one.)