Amazon EC2 Inf2 Instances for Low-Cost, High-Performance Generative AI Inference

2 years ago aws.amazon.com

Summary: This is a summary of an article originally published by the AWS DevOps Blog.

Innovations in deep learning (DL), especially the rapid growth of large language models (LLMs), have taken the industry by storm. DL models have grown from millions to billions of parameters and are demonstrating exciting new capabilities.

Compared to Amazon EC2 Inf1 instances, Inf2 instances deliver up to 4x higher throughput and up to 10x lower latency.

Since the underlying AWS Inferentia2 chips are purpose-built for DL workloads, Inf2 instances offer up to 50 percent better performance per watt than other comparable Amazon EC2 instances.

With just a few lines of code changes, we compiled and ran a PyTorch model on an Amazon EC2 Inf2 instance.
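
The summary does not reproduce the code from the original article, so the following is only a minimal sketch of what such a change might look like, assuming the AWS Neuron SDK's `torch_neuronx` package is installed on the Inf2 instance. The helper name `compile_for_inf2`, the output file name, and the small stand-in model are hypothetical and chosen purely for illustration.

```python
def compile_for_inf2(model, example_input, output_path="model_neuron.pt"):
    """Compile a PyTorch model for Inferentia2 and save the compiled artifact.

    Hypothetical helper for illustration; not taken from the original article.
    """
    # Imported lazily so the file can be imported on a machine that does not
    # have PyTorch or the Neuron SDK installed.
    import torch
    import torch_neuronx  # shipped with the AWS Neuron SDK

    model.eval()
    # Ahead-of-time compilation of the model for the Neuron cores,
    # driven by one example input that fixes the input shape.
    neuron_model = torch_neuronx.trace(model, example_input)
    # The traced result is a TorchScript module, so torch.jit can save it.
    torch.jit.save(neuron_model, output_path)
    return neuron_model


if __name__ == "__main__":
    import torch

    # Stand-in model (an assumption); replace with the model you want to serve.
    model = torch.nn.Sequential(
        torch.nn.Linear(128, 64),
        torch.nn.ReLU(),
        torch.nn.Linear(64, 10),
    )
    example = torch.rand(1, 128)

    neuron_model = compile_for_inf2(model, example)
    with torch.no_grad():
        print(neuron_model(example).shape)
```

The only Inf2-specific lines are the `torch_neuronx` import and the `trace` call. The rest is ordinary PyTorch, which is consistent with the "few lines of code changes" described above. The saved file can later be reloaded with `torch.jit.load` on an instance that has the Neuron runtime.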
