Great example! Thanks for sharing. I tested the same prompt on several GPT-4 model versions, and they consistently got the answer wrong. When I followed up with "use algebra/formulas to confirm your answer", the model corrected its initial mistake, but I'm surprised it doesn't do that out of the box.