
Isn't a key selling point of the latest, hottest model that's on the front page of Hacker News multiple times right now, the fact that it fits on consumer-grade GPUs? Surely some of the interesting ideas it's spawning right now are people doing transfer learning on GPUs that don't end in "100", don't you think?


For what it's worth, Stable Diffusion was trained on 32 x 8 x A100 GPUs (256 in total).


You know there's a huge difference between training the original model and transfer learning to apply it to a new use case, right? Saying people are years behind if they think their work is only worth something with 8 A100 pods is pretty ignorant of how most applications get built. Not everyone's trying to design novel model architectures, nor should they.



