Hacker News

What impact does Google's Tensor Processing Unit have on the answer to this question?


Well, AFAIK, it isn't for sale; you have to go through their cloud offering. So it has no bearing on which GPU to buy.


I mean it can have an effect if the answer turns out to be: don't get a GPU at all, and use the cloud service instead.


AFAIK AMD has not announced a Tensor Processing Unit or any specific tensor-acceleration capability. Nvidia's Volta [0] does include a dedicated Tensor Core optimization.

[0] https://www.nvidia.com/en-us/data-center/volta-gpu-architect...


'Tensor processing' isn't some special kind of compute. The TPUs (which were badly named by Google) are just purpose-built ASICs that operate at lower precision and therefore draw much less power than GPUs, making them cost-effective for inference, which Google does more of than anyone on earth.

'Tensor Processing Unit' has become something of a phrase used to confuse (trick?) people into thinking it's a new type of processing, but it's not. If a GPU says it has a Tensor Processing Unit, that just means it can operate at lower precision. But remember that the GPU's overall power consumption at full utilization doesn't really change compared to running higher-precision units, so you're actually missing out on the cost-effectiveness that lower precision is supposed to buy.
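To make the "lower precision" point concrete, here is a rough numpy sketch of the arithmetic scheme low-precision inference hardware uses: quantize float32 activations and weights to int8, multiply in int8 with int32 accumulation (the cheap part in silicon), then rescale. This is an illustration of the idea only, not hardware code; the per-tensor scaling scheme is one common choice, not a description of any particular chip.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8)).astype(np.float32)   # activations
w = rng.standard_normal((8, 3)).astype(np.float32)   # weights

def quantize(a):
    """Map float32 values to int8 using a single per-tensor scale."""
    scale = np.abs(a).max() / 127.0
    q = np.clip(np.round(a / scale), -127, 127).astype(np.int8)
    return q, scale

xq, xs = quantize(x)
wq, ws = quantize(w)

# int8 multiply, int32 accumulate: the operation that is cheap in hardware
acc = xq.astype(np.int32) @ wq.astype(np.int32)

# rescale back to float to compare against the full-precision result
approx = acc.astype(np.float32) * (xs * ws)
exact = x @ w
err = np.max(np.abs(approx - exact))  # small quantization error
```

The accuracy loss is usually tolerable for inference, which is the trade the parent comment describes: you give up precision you didn't need for a large cut in power per multiply-accumulate.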

If anything, the 'Tensor Processing Units' just take up unnecessary die space when included alongside higher-precision units, because they're bad for training compared to higher-precision compute units.


That's not true. Previous GPUs, like Pascal, already have lower-precision instructions for fp16 and int8. The tensor cores perform a 4x4 matrix multiply-and-accumulate in one clock cycle through dedicated hardware. They're physically distinct parts of the die.
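The operation described above is a fused D = A x B + C on 4x4 tiles, with fp16 multiplicands and a higher-precision (fp32) accumulator. A numpy sketch of the arithmetic only, assuming Volta-style mixed precision; the hardware does this per core per cycle, which numpy obviously does not:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4)).astype(np.float16)  # fp16 multiplicand
B = rng.standard_normal((4, 4)).astype(np.float16)  # fp16 multiplicand
C = rng.standard_normal((4, 4)).astype(np.float32)  # accumulator stays fp32

# One tensor-core step: multiply the fp16 tiles, accumulate into fp32.
D = A.astype(np.float32) @ B.astype(np.float32) + C
```

Keeping the accumulator in fp32 is what makes the scheme usable for training as well as inference: the rounding error of many fp16 products doesn't compound in the running sum.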


But they're not the new paradigm that the 'marketing talk' often makes them out to be. Also, what are you referring to by 'TPU': Google's TPU or Nvidia's Tensor Core? Because AFAIK, there are no public details about the TPU's processing pipeline other than that one paper about datacenter use.

I realize they are physical parts of the die, which is why my last sentence was about them taking up unnecessary die space. Why? Because that die space would be better spent on higher-precision FP units, which are useful for training, and training matters more than inference for most people in this thread.


TPU is what Nvidia and Google both call their low-precision matrix-multiply hardware. Your assessment of how the hardware could better be used is just that: your assessment. They did this because they realized a large share of their professional users want it and are willing to pay for it. They'll likely offer other versions of the card without the tensor units, so you'll have that option as well. It's not a new paradigm, but it's absolutely new to do this type of computation in a single cycle at these clock rates.





