On my M1 Max MacBook Pro, the GGUF version bartowski/google_gemma-3-27b-it-qat-GGUF is 15.6 GB and runs at 17 tok/sec, whereas mlx-community/gemma-3-27b-it-qat-4bit is 16.8 GB and runs at 15 tok/sec. Note that both of these are the new QAT 4-bit quants.
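
For a rough sense of the gap, here is a quick sketch of the relative differences. It uses only the four figures quoted above, which are single informal measurements on one machine; `pct_diff` is a hypothetical helper name, not part of any library.

```python
def pct_diff(a: float, b: float) -> float:
    """Percentage by which a differs from b, relative to b."""
    return (a - b) / b * 100.0

# Figures copied from the comment above (one informal run each).
gguf = {"size_gb": 15.6, "tok_per_sec": 17.0}
mlx = {"size_gb": 16.8, "tok_per_sec": 15.0}

size_diff = pct_diff(gguf["size_gb"], mlx["size_gb"])
speed_diff = pct_diff(gguf["tok_per_sec"], mlx["tok_per_sec"])

print(f"GGUF size vs MLX:  {size_diff:+.1f}%")
print(f"GGUF speed vs MLX: {speed_diff:+.1f}%")
```

So on these numbers the GGUF file is roughly 7% smaller and generates roughly 13% faster, though a difference of 2 tok/sec could easily be within run-to-run noise.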