ML training without the AWS bill shock

GPU prices are all over the place. Some platforms charge 10x others for the same hardware. Here's what ML teams are actually using and what it really costs.

Current GPU prices

The same H100 rents for anywhere from $2.49/hr to $4.00/hr depending on the platform. Spot instances are 50-70% cheaper but can disappear mid-training. Here's what things actually cost:

NVIDIA H100 (80GB)
The gold standard for training. HBM3 memory with up to 3.35TB/s of bandwidth (SXM).
Lambda Labs: $2.49/hr
RunPod: $2.69/hr
AWS: $4.00/hr
GCP: $3.74/hr
NVIDIA A100 (80GB)
Still the workhorse. Great for fine-tuning and medium runs.
Lambda: $1.29/hr
RunPod spot: $0.89/hr
Modal: $1.10/hr
RTX 4090 (24GB)
A consumer GPU, surprisingly capable for experimentation.
RunPod: $0.44/hr
Vast.ai: $0.29/hr
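To make the spread concrete, here's a quick back-of-envelope comparison in plain Python. The prices are hard-coded from the list above (a snapshot; they change often), and the 100 GPU-hour run is just an illustrative workload:

```python
# On-demand H100 prices from the list above (USD per GPU-hour).
H100_PRICES = {
    "Lambda Labs": 2.49,
    "RunPod": 2.69,
    "GCP": 3.74,
    "AWS": 4.00,
}

def run_cost(rate_per_hr: float, gpu_hours: float) -> float:
    """Total cost of a run at a given hourly rate."""
    return rate_per_hr * gpu_hours

# Cheapest to priciest for a hypothetical 100 GPU-hour training run:
for provider, rate in sorted(H100_PRICES.items(), key=lambda kv: kv[1]):
    print(f"{provider:12s} ${run_cost(rate, 100):7.2f}")
```

Same hardware, same 100 H100-hours: $249 on Lambda Labs versus $400 on AWS, before spot discounts even enter the picture.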
💡 Pro tip

Spot instances are 50-70% cheaper but can be terminated anytime. Use them for fault-tolerant training with checkpointing. On-demand for anything you can't restart.
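The checkpoint-and-resume pattern is simple enough to sketch in a few lines. This is a minimal stand-in (a JSON file instead of real model state, and a fake "train step"), but the shape is the same whether you're saving optimizer state with torch.save or anything else:

```python
import json
import os
import tempfile

def save_checkpoint(path, step, state):
    # Write to a temp file and rename so a preemption mid-write
    # can't leave a corrupt checkpoint behind (os.replace is atomic).
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump({"step": step, "state": state}, f)
    os.replace(tmp, path)

def load_checkpoint(path):
    # Resume from the last saved step, or start fresh.
    if os.path.exists(path):
        with open(path) as f:
            ckpt = json.load(f)
        return ckpt["step"], ckpt["state"]
    return 0, {"loss": None}

def train(path, total_steps, stop_after=None):
    start, state = load_checkpoint(path)
    for step in range(start, total_steps):
        state["loss"] = 1.0 / (step + 1)   # stand-in for a real train step
        save_checkpoint(path, step + 1, state)
        if stop_after is not None and step + 1 == stop_after:
            return step + 1                # simulate a spot preemption
    return total_steps

ckpt = os.path.join(tempfile.mkdtemp(), "ckpt.json")
train(ckpt, total_steps=10, stop_after=4)   # "preempted" after step 4
done = train(ckpt, total_steps=10)          # resumes at step 4, finishes
print(done)  # 10
```

The second call picks up at step 4 instead of step 0, which is exactly what makes spot pricing usable: a preemption costs you at most the work since the last checkpoint, not the whole run.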

Where to train

Modal
Fan favorite

The serverless GPU platform ML Twitter loves. Write Python, request a GPU in the function decorator (e.g. @app.function(gpu="A100")), and it runs in the cloud. Cold starts are ~1-2 seconds, and you get $30/month in free credits.

"Modal changed how I think about GPU compute. I went from spinning up EC2 instances to just running a Python script." — ML engineer at a Series A startup
A100: $1.10/hr · H100: $2.95/hr · $30/mo free
Lambda Labs
Best H100 price

If you need raw GPU hours at the best price, Lambda is hard to beat. H100s at $2.49/hr are the cheapest we've found with reliable availability. They also sell physical hardware.

H100: $2.49/hr · A100: $1.29/hr · 8xH100: $19.92/hr

Best for: serious training runs needing hours of uninterrupted compute.

RunPod
Cheapest spot

Spot instance marketplace, with A100s often under $1/hr. Instances can be preempted and availability varies, so it's best for batch jobs where you can checkpoint and resume.

"I run fine-tuning jobs on RunPod spot overnight. If preempted, checkpoint saves and restarts. Full LoRA fine-tune costs about $20." — Open source contributor
A100 spot: $0.89/hr · 4090: $0.44/hr
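The $20 figure in the quote above is plausible arithmetic: $20 at $0.89/hr is roughly 22.5 GPU-hours. Here's a tiny estimator (all parameters hypothetical) that also budgets for the work lost to preemptions, assuming you checkpoint on a fixed interval:

```python
def spot_cost(rate_per_hr, job_hours, preemptions=0, ckpt_interval_hr=0.5):
    """Estimated spot-job cost: base compute plus redone work.
    Checkpointing every ckpt_interval_hr means each preemption loses
    at most one interval of work (half an interval on average)."""
    lost_hours = preemptions * ckpt_interval_hr / 2  # average lost work
    return rate_per_hr * (job_hours + lost_hours)

# ~22.5h LoRA job at RunPod's $0.89/hr spot rate, two preemptions:
print(round(spot_cost(0.89, 22.5, preemptions=2), 2))  # 20.47
```

Even with a couple of preemptions, the overhead is cents as long as checkpoints are frequent; the real risk with spot is availability, not cost.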
Hugging Face
The hub

Not just a model hub anymore. Inference endpoints, AutoTrain for no-code fine-tuning, Spaces for demos. If you're working with open models, you're probably here anyway.

500K+ models · AutoTrain: no code · Inference: $0.06/hr+

Other platforms

Replicate
Pay per second · API-first

Run and fine-tune models via API. Great for deploying open models without managing infra.

Together AI
Inference + fine-tuning

Fast inference for open models. Often the cheapest way to run Llama, Mixtral, etc.

Vast.ai
GPU marketplace

Peer-to-peer GPU rental. Cheapest option, reliability varies. Good for experiments.

Anyscale
Ray-based · Enterprise

From the creators of Ray. Good for distributed training at scale. Enterprise-focused.


What to use

Experimenting? Modal free tier or RunPod spot. Vast.ai for absolute cheapest.

Fine-tuning? Modal or RunPod. Hugging Face AutoTrain for no-code.

Serious training? Lambda Labs for best H100 prices.

Production inference? Replicate or Together AI APIs. Modal for more control.

Enterprise? AWS/GCP if already there. More expensive but ecosystem benefits.
