AWS · aws@amazon.com
Introducing GPU Health Monitoring and Auto Repair for Amazon ECS Managed Instances
Amazon Elastic Container Service (Amazon ECS) now offers NVIDIA GPU health monitoring and auto repair functionality for Amazon ECS Managed Instances. The new capability automatically detects critical NVIDIA GPU hardware failures and replaces impaired instances, helping customers improve the availability and reliability of their GPU-accelerated containerized workloads. Running GPU-accelerated workl