You know there's a huge difference between training the original model and transfer learning to apply it to a new use case, right? Saying people are years behind if they think there work is only worth something with 8 A100 pods is pretty ignorant of how most applications get built. Not everyone's trying to design novel model architectures, nor should they.