Selecting Ideal EC2 Instances for GPU Workloads on AWS

Agam Jain

Apr 3, 2025

10 mins

Choosing the right EC2 pricing model for your AI/ML workloads can make or break your cloud budget. Machine learning tasks, whether training large models or serving real-time predictions, often require significant computing resources, and the right pricing model can cut costs substantially while still meeting performance needs. In this article, we'll walk through the EC2 pricing models, look at GPU availability and pricing under each, and finish with a concise table mapping common use cases to a recommended model.

EC2 Pricing Models on AWS

1. On-Demand Instances

On-Demand Instances let you pay only for the computing capacity you use, without long-term commitments.

  • Pricing Model:
    • Pay-per-use, billed hourly or by the second.
    • Highest cost among EC2 pricing options.
  • Ideal Use Cases:
    • Unpredictable or short-lived workloads.
    • Spiky real-time inference traffic.
    • Experimental or ad-hoc AI/ML tasks.
  • Benefits & Considerations:
    • Pros: Flexible, no upfront payment, no long-term commitment.
    • Cons: Higher costs compared to other models.
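
As a rough illustration of how per-second billing changes the bill for a short job, here is a minimal sketch. The function name is hypothetical, the 60-second minimum is an assumption that applies to Linux instances billed per second, and the $16.29/hr rate is the g5.48xlarge On-Demand figure quoted later in this article.

```python
import math


def on_demand_cost(hourly_rate: float, runtime_seconds: float,
                   per_second: bool = True) -> float:
    """Estimate the On-Demand charge for one run of a single instance."""
    if per_second:
        # Assumption: per-second billing with a 60-second minimum charge.
        billed_seconds = max(60, math.ceil(runtime_seconds))
        return hourly_rate * billed_seconds / 3600
    # Hourly billing: any partial hour is rounded up to a full hour.
    billed_hours = max(1, math.ceil(runtime_seconds / 3600))
    return hourly_rate * billed_hours


if __name__ == "__main__":
    rate = 16.29  # USD/hr, g5.48xlarge On-Demand (from the table in this article)
    runtime = 37 * 60  # a 37-minute experiment
    print(f"per-second: ${on_demand_cost(rate, runtime):.2f}")
    print(f"hourly:     ${on_demand_cost(rate, runtime, per_second=False):.2f}")
```

For short, ad-hoc experiments the difference between the two billing granularities can be a large share of the total charge, which is part of why On-Demand suits this kind of work.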

2. Spot Instances

Spot Instances allow you to utilize unused EC2 capacity at a significantly discounted rate.

  • Pricing Model:
    • Up to 90% cheaper than On-Demand prices.
    • Prices fluctuate based on supply and demand.
  • Ideal Use Cases:
    • Fault-tolerant training jobs.
    • Batch processing and non-critical offline inference tasks.
  • Benefits & Considerations:
    • Pros: Major cost savings, ideal for non-critical workloads.
    • Cons: AWS can reclaim instances at short notice (a 2-minute warning), so jobs need to checkpoint or tolerate restarts.
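
One way a training job might react to the 2-minute warning is to poll the instance metadata service for a Spot interruption notice and checkpoint when one appears. The sketch below is an illustration under assumptions, not a definitive implementation: the function names are hypothetical, it assumes IMDSv2 is reachable from the instance, and it assumes the notice is a small JSON document with `action` and `time` fields.

```python
import json
import urllib.error
import urllib.request
from datetime import datetime, timezone
from typing import Optional

IMDS = "http://169.254.169.254/latest"


def seconds_until_interruption(body: str, now: datetime) -> Optional[float]:
    """Parse an instance-action notice; return seconds left, or None if not an interruption."""
    notice = json.loads(body)
    if notice.get("action") not in ("terminate", "stop", "hibernate"):
        return None
    when = datetime.strptime(notice["time"], "%Y-%m-%dT%H:%M:%SZ")
    return (when.replace(tzinfo=timezone.utc) - now).total_seconds()


def fetch_instance_action(timeout: float = 1.0) -> Optional[str]:
    """Return the raw notice, or None when no interruption is scheduled (HTTP 404).

    Only meaningful when run on an EC2 instance.
    """
    token_req = urllib.request.Request(
        f"{IMDS}/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": "60"},
    )
    try:
        token = urllib.request.urlopen(token_req, timeout=timeout).read().decode()
        req = urllib.request.Request(
            f"{IMDS}/meta-data/spot/instance-action",
            headers={"X-aws-ec2-metadata-token": token},
        )
        return urllib.request.urlopen(req, timeout=timeout).read().decode()
    except urllib.error.HTTPError as err:
        if err.code == 404:  # no interruption notice yet
            return None
        raise
```

A training loop could call `fetch_instance_action()` every few seconds and save a checkpoint as soon as `seconds_until_interruption(...)` returns a value, leaving the remaining time for a clean shutdown.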

3. Reserved Instances (RIs)

Reserved Instances involve committing to specific instance configurations for a set period (1 or 3 years) at discounted rates.

  • Pricing Model:
    • Up to 72% discount compared to On-Demand prices.
    • Requires a 1- or 3-year commitment, with No Upfront, Partial Upfront, or All Upfront payment options.
  • Ideal Use Cases:
    • Long-term, predictable workloads.
    • Steady-state real-time inference services.
  • Benefits & Considerations:
    • Pros: Cost-effective for predictable, long-term usage; reserved capacity when the RI is scoped to a specific Availability Zone.
    • Cons: Reduced flexibility, risk of paying for unused capacity.
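
The "risk of paying for unused capacity" can be made concrete with a break-even calculation: a reservation is billed for every hour of the term, while On-Demand is billed only for hours actually used. The sketch below uses hypothetical function names and, as example inputs, the g5.48xlarge figures quoted later in this article ($11.97/hr reserved, $16.29/hr On-Demand).

```python
def break_even_utilization(reserved_hourly: float, on_demand_hourly: float) -> float:
    """Fraction of the term the instance must run for the reservation to pay off."""
    if on_demand_hourly <= 0:
        raise ValueError("on_demand_hourly must be positive")
    return reserved_hourly / on_demand_hourly


def cheaper_option(reserved_hourly: float, on_demand_hourly: float,
                   expected_utilization: float) -> str:
    """Compare effective cost per term-hour at the expected utilization (0.0 to 1.0)."""
    # On-Demand only bills used hours; the reservation bills every hour of the term.
    on_demand_effective = on_demand_hourly * expected_utilization
    return "reserved" if reserved_hourly < on_demand_effective else "on-demand"


if __name__ == "__main__":
    ri, od = 11.97, 16.29  # USD/hr, g5.48xlarge figures from this article
    print(f"break-even utilization: {break_even_utilization(ri, od):.1%}")
    print(cheaper_option(ri, od, expected_utilization=0.9))
    print(cheaper_option(ri, od, expected_utilization=0.5))
```

With these inputs the reservation only wins if the instance is busy for roughly three quarters of the term or more, which is why Reserved Instances fit steady-state services rather than bursty ones.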

4. Capacity Blocks for ML

Capacity Blocks for ML enable reservation of GPU instances in advance for specific short-term periods.

  • Pricing Model:
    • Short-term reservations without long-term commitments.
    • Pay only for the reserved duration, charged upfront when the block is booked.
  • Ideal Use Cases:
    • Short-term predictable workloads (3-6 months).
    • Critical AI/ML projects needing guaranteed GPU availability.
  • Benefits & Considerations:
    • Pros: Guaranteed GPU instance availability without long-term commitment.
    • Cons: Less cost-effective for extended periods compared to Reserved Instances.
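
Because a Capacity Block is billed for its whole reserved window, the total is simply rate × hours × instances. The helper below is a hypothetical sketch; the $31.46/hr rate is the effective p5.48xlarge figure quoted in the table in this article, and the 14-day window is only an example.

```python
def capacity_block_cost(effective_hourly_rate: float, days: int,
                        instances: int = 1) -> float:
    """Total charge for reserving `instances` instances for `days` full days."""
    if days <= 0 or instances <= 0:
        raise ValueError("days and instances must be positive")
    return effective_hourly_rate * 24 * days * instances


if __name__ == "__main__":
    # Example: one p5.48xlarge (8x H100) for a 14-day training run.
    print(f"${capacity_block_cost(31.46, days=14):,.2f}")
```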

GPU Availability and Pricing Across EC2 Pricing Models

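The table below summarizes six GPU offerings on AWS in the us-east-1 region, with a representative EC2 instance type for each and the hourly price under each available purchase model. Prices are for the whole instance, not per GPU, and figures marked "~", "est." or "approx." are estimates.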

| GPU (Instance Type) | On-Demand | Capacity Blocks (ML) | Spot | Reserved (1yr) |
| --- | --- | --- | --- | --- |
| NVIDIA H200 (8× H200, p5e.48xlarge) | N/A | Yes, via Capacity Blocks for ML; effective $34.61/hr in us-east-1 | Yes (Spot capacity is expected; pricing TBD, typically 70–80% off On-Demand) | Yes (1-year or 3-year Savings Plans available; significant discounts) |
| NVIDIA H100 (8× H100, p5.48xlarge) | N/A | Yes, via Capacity Blocks; effective $31.46/hr | ~$22.78/hr (approx., varies by AZ) | Yes (Standard RIs/Savings Plans, e.g. ~30% off On-Demand; 1-year No Upfront roughly $65–70/hr) |
| NVIDIA L40S (8× L40S, g6e.48xlarge) | $30.13/hr | N/A (not offered via Capacity Blocks) | ~$9.0/hr (est., up to ~70% off On-Demand) | Yes (1-year RI ~$21/hr est.) |
| NVIDIA A10G (8× A10G, g5.48xlarge) | $16.29/hr | N/A (no Capacity Blocks) | ~$5.0/hr (typical discount ~70%) | $11.97/hr (1-year, No Upfront) |
| NVIDIA L4 (8× L4, g6.48xlarge) | $13.35/hr | N/A (no Capacity Blocks) | $3.09/hr (avg. in N. Virginia) | ~$8.70/hr (1-year, No Upfront, effective) |
| NVIDIA T4 (4× T4, g4dn.12xlarge) | $3.91/hr | N/A (no Capacity Blocks) | ~$1.17/hr (est., ~70% off) | $2.65/hr (1-year, Partial Upfront) |
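
Since the instance types above carry different GPU counts, it can help to normalize the prices per GPU before comparing them. The following sketch does that for the four rows that list an On-Demand price and also derives the Spot discount; the `Row` type and function names are hypothetical, and the numbers are copied from the table, including its estimated Spot figures.

```python
from typing import NamedTuple


class Row(NamedTuple):
    gpus: int         # GPUs per instance
    on_demand: float  # USD per instance-hour
    spot: float       # USD per instance-hour (several are estimates in the table)


TABLE = {
    "g6e.48xlarge (L40S)": Row(8, 30.13, 9.0),
    "g5.48xlarge (A10G)": Row(8, 16.29, 5.0),
    "g6.48xlarge (L4)": Row(8, 13.35, 3.09),
    "g4dn.12xlarge (T4)": Row(4, 3.91, 1.17),
}


def per_gpu_hourly(instance_price: float, gpus: int) -> float:
    """Instance price divided evenly across its GPUs."""
    return instance_price / gpus


def spot_savings(row: Row) -> float:
    """Spot discount relative to On-Demand, as a fraction between 0 and 1."""
    return 1 - row.spot / row.on_demand


if __name__ == "__main__":
    for name, row in TABLE.items():
        print(f"{name}: ${per_gpu_hourly(row.on_demand, row.gpus):.2f}/GPU-hr "
              f"On-Demand, Spot saves {spot_savings(row):.0%}")
```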

Recommended EC2 Pricing Models for Different GPU Workloads

| AI/ML Use Case | Recommended EC2 Pricing Model | Remarks |
| --- | --- | --- |
| Real-time inference with long-term predictable usage | Reserved Instances | Significant cost savings (up to 72%) for steady-state workloads requiring consistent, long-term capacity. |
| Real-time inference with short-term predictable usage (e.g., 3–6 months) or planned training runs | Capacity Blocks for ML | Reservation of GPU instances for specific short periods, ensuring availability without long-term commitment. |
| Real-time inference with spiky or unpredictable traffic | On-Demand Instances | Flexibility to handle variable workloads, ideal for sudden demand spikes without prior commitment. |
| Offline inference, batch processing, or job queues | Combination of On-Demand and Spot Instances | Balance cost and availability, using low-cost Spot Instances for non-urgent tasks and On-Demand Instances for urgent processing. |
| Fault-tolerant training jobs or non-critical workloads | Spot Instances | Access spare EC2 capacity at greatly reduced prices (up to 90% discount), ideal for workloads tolerant to interruptions. |
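
The mapping in this table can also be written down as a small decision function, which is handy for tagging workloads in a planning script. This is only a sketch of the table's logic: the function name, the workload labels, and the 12-month cutoff between "short-term" and "long-term" are assumptions made for illustration, not AWS rules.

```python
def recommend_pricing_model(workload: str, predictable: bool = False,
                            horizon_months: float = 0.0,
                            fault_tolerant: bool = False) -> str:
    """Map a workload description to the pricing model suggested in the table.

    workload: "realtime", "batch", or "training" (hypothetical labels).
    """
    if workload == "training":
        # Interruptible training goes to Spot; planned runs reserve capacity.
        return "Spot Instances" if fault_tolerant else "Capacity Blocks for ML"
    if workload == "batch":
        return "On-Demand + Spot Instances"
    if workload == "realtime":
        if not predictable:
            return "On-Demand Instances"
        # Assumption: 12 months (the shortest RI term) separates long from short term.
        if horizon_months >= 12:
            return "Reserved Instances"
        return "Capacity Blocks for ML"
    raise ValueError(f"unknown workload type: {workload!r}")


if __name__ == "__main__":
    print(recommend_pricing_model("realtime", predictable=True, horizon_months=24))
    print(recommend_pricing_model("realtime", predictable=True, horizon_months=4))
    print(recommend_pricing_model("training", fault_tolerant=True))
```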

Conclusion

Selecting the right Amazon EC2 pricing model is crucial for balancing performance and cost efficiency for GPU workloads. Each model has its sweet spot:

  1. On-Demand for flexibility
  2. Spot for extreme cost savings on fault-tolerant workloads
  3. Reserved for long-term efficiency
  4. Capacity Blocks for bridging the gap when you need guaranteed short-term GPU power

Understanding these options helps you optimize spending, so you pay only for the capacity you need while maintaining robust performance.

Deploy in minutes, scale in seconds

Get started for free or contact us to get a custom demo tailored to your needs.

© 2024. All rights reserved.