Fair enough. I would expect ARM support in Lambda quite soon — maybe at the next re:Invent, maybe later in 2021. Graviton2-based instances simply demonstrate superior price/performance. AWS Inferentia, however, has not gained much traction so far. As far as I know, there is a fundamental issue with supporting serverless GPU: unlike a regular CPU, the initialization cycle is just too long. Therefore, as our serverless inference strategy, we adopted a Lambda-first approach to see how far we could go before bringing in the heavy guns of SageMaker endpoints. The trick is to make the transfer seamless for developers. For us, it's serverless as long as the developer does not think about servers. The pay-as-you-use model is a cost optimization issue.
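To illustrate what "seamless for developers" could mean in practice, here is a minimal sketch (all names and the `INFERENCE_BACKEND` environment variable are my own assumptions, not the actual implementation): callers always use the same `predict()` interface, and only configuration decides whether the model runs in-process in Lambda or behind a SageMaker endpoint.

```python
# Hypothetical sketch of a backend-agnostic inference interface.
# Developers call predict(); the backend (in-process model vs. a
# SageMaker endpoint) is chosen by configuration, so a later move
# from Lambda-first to SageMaker stays invisible to callers.
import json
import os
from abc import ABC, abstractmethod


class InferenceBackend(ABC):
    @abstractmethod
    def predict(self, features):
        ...


class LambdaLocalBackend(InferenceBackend):
    """Model loaded inside the Lambda process (the 'Lambda-first' stage)."""

    def __init__(self, model):
        self._model = model

    def predict(self, features):
        return self._model(features)


class SageMakerBackend(InferenceBackend):
    """Same interface, but forwards to a SageMaker endpoint via boto3."""

    def __init__(self, endpoint_name):
        import boto3  # imported lazily, only when this backend is selected
        self._client = boto3.client("sagemaker-runtime")
        self._endpoint = endpoint_name

    def predict(self, features):
        resp = self._client.invoke_endpoint(
            EndpointName=self._endpoint,
            ContentType="application/json",
            Body=json.dumps(features),
        )
        return json.loads(resp["Body"].read())


def get_backend(model=None):
    """Pick the backend from configuration; calling code never changes."""
    if os.environ.get("INFERENCE_BACKEND") == "sagemaker":
        return SageMakerBackend(os.environ["SAGEMAKER_ENDPOINT"])
    return LambdaLocalBackend(model)
```

The point of the sketch is the swap: flipping one environment variable retargets inference to SageMaker without touching any caller, which is what keeps the migration invisible to developers.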

Software technologist/architect; connecting dots across multiple disciplines; C-level mentoring
