GPU Hardware Specifications
Complete technical specifications, benchmarks, and comparisons for data center and consumer GPUs
🚀 Enterprise GPUs
NVIDIA B200 (Latest)
Blackwell Architecture • 2025
- Memory: 192GB HBM3e
- Bandwidth: 8.0 TB/s
- FP8 PFLOPS: 4.5/9.0 (dense/sparse)
- TDP: 1000W
- Interconnect: NVLink 5.0
Cloud Price: $5.89 - $7.99/hr
NVIDIA H200 (New)
Hopper Architecture • 2024
- Memory: 141GB HBM3e
- Bandwidth: 4.8 TB/s
- FP8 PFLOPS: 1.98/3.96 (dense/sparse)
- TDP: 700W
- Interconnect: NVLink 4.0
Cloud Price: $3.58 - $10.60/hr
NVIDIA H100
Hopper Architecture • 2022
- Memory: 80GB HBM3
- Bandwidth: 3.35 TB/s
- FP16 TFLOPS: 989
- TDP: 700W
- Interconnect: NVLink 4.0
Cloud Price: $2.29 - $4.99/hr
NVIDIA A100
Ampere Architecture • 2020
- Memory: 40/80GB HBM2e
- Bandwidth: 2.0 TB/s
- FP16 TFLOPS: 312
- TDP: 400W
- Interconnect: NVLink 3.0
Cloud Price: $1.65 - $3.67/hr
NVIDIA V100
Volta Architecture • 2017
- Memory: 16/32GB HBM2
- Bandwidth: 900 GB/s
- FP16 TFLOPS: 125
- TDP: 300W
- Interconnect: NVLink 2.0
Cloud Price: $1.11 - $2.30/hr
AMD MI355X (2025)
CDNA 4 • June 2025
- Memory: 288GB HBM3e
- Bandwidth: 8.0 TB/s
- FP8 TFLOPS: 2,615
- TDP: 1400W
- Process: 3nm N3P
Cloud Price: $4.99/hr
AMD MI325X
CDNA 3 • 2024
- Memory: 256GB HBM3e
- Bandwidth: 6.0 TB/s
- FP16 TFLOPS: 1,307
- TDP: 1000W
- Interconnect: Infinity Fabric
Cloud Price: $2.25 - $2.49/hr
Intel Gaudi 3
5nm Process • 2024
- Memory: 128GB HBM2e
- Bandwidth: 3.7 TB/s
- FP8 PFLOPS: 1.835
- TDP: 600W
- Network: 24× 200GbE
Cloud Price: $1.99/hr
Google TPU v7 (Preview)
Ironwood • 2025
- Memory: 192GB HBM
- Bandwidth: 7.2 TB/s
- FP8 PFLOPS: 4.6
- BF16 TFLOPS: 2,300
- Interconnect: 1.2 Tbps ICI
Cloud Price: $8.50/hr (preview)
AWS Trainium3 (Dec 2025)
3nm Process • Dec 2025
- Memory: 144GB HBM3e
- Bandwidth: 4.9 TB/s
- FP8 PFLOPS: 2.52
- Performance: 4.4× vs Trn2
- Efficiency: 4× perf/watt
Cloud Price: $3.85/hr
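The compute and bandwidth figures above can be combined into a rough FLOPs-per-byte ratio, which indicates how much on-chip data reuse a workload needs before a chip becomes compute-bound rather than memory-bandwidth-bound. A minimal Python sketch using only the card values listed above (dense figures where a dense/sparse pair is given; the dense/sparse basis of each vendor's headline number varies, so treat the ratios as approximate):

```python
# Rough compute-to-bandwidth ratio (FLOPs per byte) for the enterprise
# accelerators above, using each card's listed FP8 figure. A higher
# ratio means the chip needs more data reuse (larger batches/tiles)
# to stay compute-bound instead of bandwidth-bound.

specs = {
    # name: (FP8 PFLOPS as listed above, memory bandwidth in TB/s)
    "B200":    (4.5,   8.0),
    "H200":    (1.98,  4.8),
    "MI355X":  (2.615, 8.0),
    "Gaudi 3": (1.835, 3.7),
    "TPU v7":  (4.6,   7.2),
}

for name, (pflops, tbps) in specs.items():
    flops_per_byte = (pflops * 1e15) / (tbps * 1e12)
    print(f"{name:>8}: ~{flops_per_byte:,.0f} FLOPs/byte")
```

By this measure the MI355X is comparatively bandwidth-rich (~327 FLOPs/byte) while the B200 (~563) and TPU v7 (~639) lean harder on data reuse.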
🎮 Consumer/Prosumer GPUs
RTX 4090 (Flagship)
Ada Lovelace • 2022
- Memory: 24GB GDDR6X
- Bandwidth: 1.01 TB/s
- FP16 TFLOPS: 82.6
- TDP: 450W
- CUDA Cores: 16,384
Cloud Price: $0.65 - $0.79/hr
RTX A6000
Ampere Workstation • 2020
- Memory: 48GB GDDR6
- Bandwidth: 768 GB/s
- FP16 TFLOPS: 77.0
- TDP: 300W
- CUDA Cores: 10,752
Cloud Price: $1.28 - $1.89/hr
RTX 3090 (Legacy)
Ampere Gaming • 2020
- Memory: 24GB GDDR6X
- Bandwidth: 936 GB/s
- FP16 TFLOPS: 71.0
- TDP: 350W
- CUDA Cores: 10,496
Cloud Price: $0.44 - $0.69/hr
A40
Ampere Professional • 2020
- Memory: 48GB GDDR6
- Bandwidth: 696 GB/s
- FP16 TFLOPS: 74.7
- TDP: 300W
- CUDA Cores: 10,752
Cloud Price: $1.28 - $1.65/hr
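For consumer and workstation cards, the binding constraint is usually memory capacity rather than throughput. A back-of-the-envelope sketch of whether a model's weights fit in a single card's VRAM, using the capacities listed above (the 20% headroom factor and the weights-only simplification are assumptions, not from the specs; real usage adds KV cache, activations, and for training, optimizer state):

```python
# Back-of-the-envelope check: do a model's raw weights fit in a single
# GPU's memory? Weights only -- treat "fits" as "might fit for
# inference", not a training budget.

BYTES_PER_PARAM = {"fp16": 2.0, "fp8": 1.0, "int4": 0.5}

def weight_gb(params_b: float, dtype: str) -> float:
    """Gigabytes needed for the raw weights of a params_b-billion-parameter model."""
    return params_b * BYTES_PER_PARAM[dtype]

gpus = {"RTX 3090/4090": 24, "A40/RTX A6000": 48, "H100": 80, "H200": 141, "MI355X": 288}

for model_b in (7, 13, 70):
    for dtype in ("fp16", "int4"):
        need = weight_gb(model_b, dtype)
        # Assumption: require ~20% headroom over the raw weight size.
        fits = [g for g, mem in gpus.items() if mem >= need * 1.2]
        print(f"{model_b}B @ {dtype}: {need:.0f} GB weights -> fits on: {fits or 'no single GPU here'}")
```

This is why a quantized 13B model is comfortable on a 24GB RTX 4090 while a 70B model at fp16 needs an H200-class or larger part, matching the selection guide below.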
📊 Performance Benchmarks
| GPU Model | LLaMA 7B Training | Stable Diffusion | BERT Fine-tune | Inference (tok/s) |
|---|---|---|---|---|
| B200 192GB | 1.5 hours | 28 img/s | 15 min | 6,200 |
| H200 141GB | 2.5 hours | 18 img/s | 25 min | 4,100 |
| H100 80GB | 4 hours | 12 img/s | 45 min | 2,850 |
| MI325X 256GB | 10 hours | 10 img/s | 70 min | 1,450 |
| A100 80GB | 12 hours | 8 img/s | 90 min | 1,200 |
| A100 40GB | 14 hours | 7 img/s | 100 min | 1,100 |
| RTX 4090 | 18 hours | 10 img/s | 120 min | 850 |
| V100 32GB | 22 hours | 5 img/s | 150 min | 650 |
| RTX 3090 | 26 hours | 6 img/s | 180 min | 500 |
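Hourly price alone is a poor guide to cost; multiplying the benchmark times above by each card's cloud price range gives a cost per training run. A quick sketch using only numbers from this page (illustrative; actual costs depend on provider, region, and utilization):

```python
# Illustrative cost math: LLaMA 7B training hours from the benchmark
# table times the low/high hourly prices from the spec cards above.

runs = {
    # gpu: (training hours, $/hr low, $/hr high)
    "B200":      (1.5,  5.89, 7.99),
    "H200":      (2.5,  3.58, 10.60),
    "H100":      (4.0,  2.29, 4.99),
    "A100 80GB": (12.0, 1.65, 3.67),
    "RTX 4090":  (18.0, 0.65, 0.79),
}

for gpu, (hours, lo, hi) in runs.items():
    print(f"{gpu:>10}: LLaMA 7B run costs ${hours * lo:,.2f} - ${hours * hi:,.2f}")
```

Note the inversion: the B200's roughly $9-12 per run undercuts the RTX 4090's roughly $12-14, despite an hourly rate about nine times higher.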
📚 2026 Hardware Selection Guide
- Frontier Models (>100B): B200 192GB, MI355X 288GB, or multiple H200s
- Large Language Models (30-100B): H200 141GB, MI325X 256GB, or H100 80GB clusters
- Medium Models (7-30B): H100 80GB, A100 80GB, or MI300X 192GB
- Small Models (<7B): RTX 4090, A100 40GB, or Gaudi 3
- Image Generation: RTX 4090 (best value), H100 for production scale
- Inference at Scale: Groq LPU, AWS Trainium3, or Intel Gaudi 3
- Cost-Optimized Training: MI325X (best GB/$), Intel Gaudi 3 (lowest $/hr)
- Research/Development: RTX 4090 or cloud spot instances
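The size tiers above can be codified directly. A hypothetical helper whose thresholds and picks are transcribed from the guide, nothing more; it is a starting point, not a substitute for profiling your own workload:

```python
# Encode the 2026 selection guide's model-size tiers as a lookup.
# Tiers and suggestions are taken verbatim from the list above.

def suggest_gpus(model_params_b: float) -> list[str]:
    """Return the guide's suggested hardware for a model of the given size (billions of parameters)."""
    if model_params_b > 100:
        return ["B200 192GB", "MI355X 288GB", "multiple H200s"]
    if model_params_b >= 30:
        return ["H200 141GB", "MI325X 256GB", "H100 80GB clusters"]
    if model_params_b >= 7:
        return ["H100 80GB", "A100 80GB", "MI300X 192GB"]
    return ["RTX 4090", "A100 40GB", "Gaudi 3"]

print(suggest_gpus(70))  # -> ['H200 141GB', 'MI325X 256GB', 'H100 80GB clusters']
```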