
Based on the current DeepSeek website, I suspect it's not going to be great: their current model (V3.4? V4-mini?) often forgets or changes facts explicitly mentioned in the conversation, which R1 never did. It's better than R1 at math and coding, but nearly unusable for deep conversation. I suspect they pushed MLA or linear attention too far, or are quantizing a lot more than before.

