I wrote a mini guide on running Gemma 3 at https://docs.unsloth.ai/basics/tutori...

vessenes · on March 12, 2025

Daniel, as always, thanks for these. I had good results with your Q4_K_M quant on Mac / llama.cpp. However, on Linux/A100/ollama, something is very wrong with your Q8_0 quant: the Python code it writes has indentation errors, missing close parens, and quite a lot else that's bad. I ran both with your suggested command lines, but of course it could have been some mistake I made. I'm testing the bf16 on the A100 now to make sure it's not a hardware issue, but my gut says there's a model or ollama sampling problem here.

EDIT: 27b size

tarruda · on March 12, 2025

Thanks for this, but I'm still unable to reproduce the results from Google AI studio.

I tried your version, and when I ask it to create a Tetris game in Python, the resulting file has syntax errors. I see strange things like a space in the middle of a variable name/reference, or weird spacing in the code output.
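For anyone who wants to compare quants on this, a quick way to flag this kind of breakage is to run each generated file through Python's own parser. This is only a minimal sketch: `check_python_source` is a name made up for illustration, and it catches syntax-level damage (split identifiers, unclosed parens, bad indentation), not logic bugs.

```python
import ast
from typing import Optional


def check_python_source(source: str) -> Optional[str]:
    """Return None if the source parses, else a short note on the first syntax error.

    Hypothetical helper for illustration; it does not run the code.
    """
    try:
        ast.parse(source)
    except SyntaxError as err:  # IndentationError is a subclass of SyntaxError
        return f"line {err.lineno}: {err.msg}"
    return None


if __name__ == "__main__":
    # A space in the middle of a variable name, like the output described above.
    print(check_python_source("score board = 0\n"))
    # A missing close paren.
    print(check_python_source("print((1, 2)\n"))
```

Running the same prompt a few times per quant and counting how many outputs fail this check would give a rough comparison, assuming the sampling settings are held fixed.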

ac29 · on March 12, 2025

Some models are more sensitive to quantization than others; presumably AI Studio is running the full 16-bit model.

Maybe try the 8-bit quant if you have the hardware for it: ollama run hf.co/unsloth/gemma-3-27b-it-GGUF:Q8_0

tarruda · on March 12, 2025

I tested the full fp16 gguf

svachalek · on March 12, 2025

This seems worse than the official Ollama build. First question I tried:

>>> who is president

The বর্তমানpresident of the United States is Джо Байден (JoeBiden).
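That answer mixes three scripts in one English sentence. A rough way to flag outputs like this automatically, as a sketch only: `scripts_used` is a hypothetical helper, and taking the first word of each letter's Unicode name is a crude stand-in for real script detection.

```python
import unicodedata


def scripts_used(text: str) -> set:
    """Return the set of script labels (first word of the Unicode name) for letters in text.

    Hypothetical helper for illustration; combining marks and digits are ignored.
    """
    scripts = set()
    for ch in text:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            if name:
                scripts.add(name.split()[0])
    return scripts


if __name__ == "__main__":
    reply = "The বর্তমানpresident of the United States is Джо Байден (JoeBiden)."
    print(sorted(scripts_used(reply)))
```

An English prompt whose reply contains more than one script is a reasonable signal that something is off in the tokenizer, template, or sampling setup, though legitimate multilingual answers would also trip it.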