KPIs for Evaluating Cloud Platforms for ML Inference
- Performance: How well does the platform optimize and accelerate inference workloads?
- Scalability: Can the platform easily scale to meet changing demands?
- Cost: How cost-effective is the platform for inference workloads?
- Ease of use: How easy is it to deploy and manage inference models on the platform?
- Integration: How well does the platform integrate with other ML tools and services?
- Security: How secure is the platform for deploying and running inference models?
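Two of these KPIs, performance and cost, reduce to simple arithmetic once you have measurements. The sketch below assumes you have collected per-request latencies in milliseconds and know the instance's hourly price. The function names and all sample numbers are hypothetical and are not taken from any provider's pricing or benchmarks.

```python
import math

def percentile(samples, pct):
    """Nearest-rank percentile of a list of numbers (pct in 0..100)."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    # Nearest-rank method: the smallest value with at least pct% of samples at or below it.
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]

def cost_per_1k_requests(hourly_price_usd, requests_per_second):
    """Cost of serving 1,000 requests at a sustained throughput."""
    requests_per_hour = requests_per_second * 3600
    return hourly_price_usd / requests_per_hour * 1000

# Made-up measurements for illustration only.
latencies_ms = [42, 38, 51, 47, 120, 40, 39, 44, 43, 45]

p50 = percentile(latencies_ms, 50)
p95 = percentile(latencies_ms, 95)
cost = cost_per_1k_requests(hourly_price_usd=3.60, requests_per_second=50)

print(f"p50={p50} ms, p95={p95} ms, cost per 1k requests=${cost:.4f}")
```

Tail latency (p95 or p99) is usually more informative than the mean for inference workloads, because a single slow request can dominate the user-visible experience.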
Cloud Platform Support for Top 5 Inference Servers for Generative AI
| Inference server | AWS | Azure | GCP |
| --- | --- | --- | --- |
| NVIDIA Triton | ✅ Supports Triton Inference Server on AWS Marketplace. Offers pre-built AMIs and managed services for easy deployment. | ✅ Supports Triton Inference Server on Azure Marketplace. Offers VMs and Azure Machine Learning for deployment. | ✅ Supports Triton Inference Server on GCP Marketplace. Offers VMs and managed services for deployment. |
| TensorRT | ✅ Offers NVIDIA Deep Learning AMI with TensorRT pre-installed. Provides managed services for deploying TensorRT models. | ✅ Offers NVIDIA Deep Learning VM with TensorRT pre-installed. Provides managed services for deploying TensorRT models. | ✅ Offers NVIDIA Deep Learning VM with TensorRT pre-installed. Provides managed services for deploying TensorRT models. |
| Apache MXNet Serving | ✅ Supports MXNet Serving through Amazon SageMaker Neo. Offers managed services for deploying MXNet models. | ✅ Supports MXNet Serving through Azure Machine Learning. Offers managed services for deploying MXNet models. | ✅ Supports MXNet Serving through AI Platform Prediction. Offers managed services for deploying MXNet models. |
| ONNX Runtime | ✅ Supports ONNX Runtime through Amazon SageMaker Neo. Offers managed services for deploying ONNX models. | ✅ Supports ONNX Runtime through Azure Machine Learning. Offers managed services for deploying ONNX models. | ✅ Supports ONNX Runtime through AI Platform Prediction. Offers managed services for deploying ONNX models. |
| Milvus | ✅ Offers Milvus as a managed service on AWS Marketplace. | ✅ Offers Milvus as a managed service on Azure Marketplace. | ✅ Offers Milvus as a managed service on GCP Marketplace. |
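Whichever cloud hosts it, Triton exposes the same HTTP inference API (the KServe v2 protocol), which helps keep client code portable across platforms. The sketch below builds such a request with the Python standard library only. The host, model name, input tensor name, and shape are hypothetical placeholders, and the request is constructed but not sent.

```python
import json
import urllib.request

def build_infer_request(base_url, model_name, input_name, data, datatype="FP32"):
    """Build a KServe v2 style inference request for one 1 x N input tensor."""
    body = {
        "inputs": [
            {
                "name": input_name,        # must match the model's configured input name
                "shape": [1, len(data)],   # batch of one row
                "datatype": datatype,
                "data": data,              # flattened, row-major values
            }
        ]
    }
    url = f"{base_url}/v2/models/{model_name}/infer"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Hypothetical endpoint and model; 8000 is Triton's default HTTP port.
req = build_infer_request("http://localhost:8000", "my_model", "INPUT0", [0.1, 0.2, 0.3])
print(req.get_method(), req.full_url)
# To actually send it against a running server: urllib.request.urlopen(req)
```

Only `base_url` would need to change when moving the same model between providers, assuming the endpoint is exposed without a provider-specific gateway or authentication layer in front of it.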
Additional Considerations
- Vendor lock-in: Consider how easy it is to switch between cloud platforms if needed.
- Compliance: Ensure the platform meets your industry’s compliance requirements.
- Support: Choose a platform with a good track record of customer support.
By carefully considering these factors, you can choose the cloud platform that best meets your needs for deploying and running generative AI models.
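One way to make this comparison explicit is a weighted decision matrix over the six KPIs listed earlier. The sketch below is illustrative only: the platform names, the 1 to 5 scores, and the weights are all hypothetical and should be replaced with your own assessments.

```python
def weighted_score(scores, weights):
    """Weighted average of KPI scores; weights need not sum to 1."""
    missing = set(weights) - set(scores)
    if missing:
        raise KeyError(f"missing scores for: {sorted(missing)}")
    total_weight = sum(weights.values())
    return sum(scores[kpi] * w for kpi, w in weights.items()) / total_weight

# Hypothetical priorities: higher weight means the KPI matters more to you.
weights = {
    "performance": 3, "scalability": 2, "cost": 3,
    "ease_of_use": 1, "integration": 1, "security": 2,
}

# Hypothetical 1-5 scores for two unnamed candidate platforms.
platforms = {
    "platform_a": {"performance": 4, "scalability": 5, "cost": 3,
                   "ease_of_use": 4, "integration": 4, "security": 5},
    "platform_b": {"performance": 5, "scalability": 4, "cost": 2,
                   "ease_of_use": 3, "integration": 5, "security": 4},
}

# Rank candidates from best to worst weighted score.
ranked = sorted(platforms, key=lambda name: weighted_score(platforms[name], weights), reverse=True)
for name in ranked:
    print(f"{name}: {weighted_score(platforms[name], weights):.2f}")
```

The ranking is only as good as the scores fed into it, so it works best as a way to document trade-offs rather than as an automatic decision.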