Fair enough. I would expect ARM support in Lambda quite soon, maybe at the next re:Invent, maybe later in 2021. Graviton2-based instances just demonstrate such superior performance. AWS Inferentia, on the other hand, hasn't gained much traction so far. As far as I know, there is a fundamental issue with supporting serverless GPU: unlike a regular CPU, the initialization cycle is just too long. So for our serverless inference strategy we adopted a Lambda-first approach, to see how far we can get before rolling out the heavy guns of SageMaker endpoints. The trick is to make the transfer seamless for developers. For us, it's serverless as long as the developer doesn't think about servers; the pay-as-you-use model is a cost optimization concern.
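To make that Lambda-to-SageMaker transfer seamless, the rough idea is to hide the backend behind one predict() interface, so promoting a model to a heavier backend never touches caller code. This is a hypothetical sketch, not our actual implementation; the backend names and the stub functions (standing in for boto3's `lambda.invoke` / `sagemaker-runtime.invoke_endpoint`) are illustrative:

```python
class InferenceClient:
    """Routes predict() calls to whichever backend currently serves the
    model, so application code never changes when a model moves from
    Lambda to a SageMaker endpoint. (Illustrative sketch.)"""

    def __init__(self, backends, default="lambda"):
        self._backends = backends  # name -> callable(payload) -> dict
        self._active = default

    def promote(self, backend_name):
        # Flip the model to a heavier backend without touching callers.
        if backend_name not in self._backends:
            raise ValueError(f"unknown backend: {backend_name}")
        self._active = backend_name

    def predict(self, payload):
        return self._backends[self._active](payload)


# Stubs standing in for real boto3 invocations (hypothetical):
def lambda_backend(payload):
    return {"backend": "lambda", "result": sum(payload["x"])}

def sagemaker_backend(payload):
    return {"backend": "sagemaker", "result": sum(payload["x"])}


client = InferenceClient(
    {"lambda": lambda_backend, "sagemaker": sagemaker_backend}
)
print(client.predict({"x": [1, 2, 3]}))  # served by the Lambda stub
client.promote("sagemaker")
print(client.predict({"x": [1, 2, 3]}))  # same call, now the SageMaker stub
```

The developer-facing call site stays identical before and after promotion; only the routing table changes.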