Now supporting JAX and multi-GPU training

Train models on GPUs.
Skip the DevOps.

Pick a GPU, point us at your code, and start training. We handle provisioning, metrics, and checkpoints so you never touch a cloud console again.

terminal
$
Successfully installed convexlabs-0.5.0
$
Detected: PyTorch project
GPU provisioned (1x A10G, 24 GB)
Metrics streaming to dashboard
convexlabs.app/runs/r_8f3k

// how it works

From code to training in 3 steps

No Dockerfiles. No Kubernetes. No cloud console. Just your code and a GPU.

01

Install the CLI

One pip install. Add a few log calls to your training script. The CLI auto-detects your framework, repo, and entry point.

train.py
import convexlabs

run = convexlabs.init()

for epoch in range(10):
    loss = train(model, data)
    run.log(loss=loss, epoch=epoch)

run.finish()
02

Pick a GPU and launch

Choose from T4 to multi-GPU A10G clusters. We provision the instance, clone your repo, pull your datasets, and start training. You never touch AWS.

terminal
$ convexlabs launch

◆ ConvexLabs
  Framework   PyTorch
  Entry       train.py
  Repo        you/model (main)

? Select GPU  ▸ A10G (24GB)
? Launch? (Y/n)  y

✓ Run created: model-20260331
✓ Instance provisioning...
03

Watch metrics stream live

Loss, accuracy, learning rate: every metric you log appears on your dashboard in real time. Checkpoints save automatically. Share results with your team in one click.

dashboard
epoch 8    loss: 0.034    acc: 96.8%
epoch 9    loss: 0.028    acc: 97.4%
epoch 10   loss: 0.021    acc: 98.1%
✓ Best checkpoint saved (98.1%)

// integration

15 lines. That's the integration.

Add a few API calls to your training script. We handle provisioning, monitoring, and checkpointing. Your code never sees our credentials.

import convexlabs

run = convexlabs.init()

# Track the best accuracy so far; it must exist before the first comparison
best_acc = 0.0

for epoch in range(10):
    loss = train(model, dataloader)
    acc = evaluate(model, val_loader)

    # Metrics show up live in your dashboard
    run.log(loss=loss, accuracy=acc, epoch=epoch)

    # Save a checkpoint only when validation accuracy improves
    if acc > best_acc:
        run.save_checkpoint("./best_model.pt")
        best_acc = acc

run.finish()
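
Want to try the best-checkpoint pattern before your spot opens up? Here is a rough sketch that runs with no ConvexLabs install at all. `StubRun` is a hypothetical stand-in made up for illustration (it only records calls in memory and is not the real client), and `fake_epoch` invents loss and accuracy numbers in place of real training.

```python
import random


class StubRun:
    """Hypothetical stand-in for a run object; records calls in memory only."""

    def __init__(self):
        self.logged = []
        self.checkpoints = []
        self.finished = False

    def log(self, **metrics):
        self.logged.append(metrics)

    def save_checkpoint(self, path):
        self.checkpoints.append(path)

    def finish(self):
        self.finished = True


def fake_epoch(epoch, rng):
    """Made-up metrics: loss drifts down, accuracy drifts up, with some noise."""
    loss = 1.0 / (epoch + 1) + rng.uniform(0.0, 0.05)
    acc = min(1.0, 0.5 + 0.05 * epoch + rng.uniform(-0.03, 0.03))
    return loss, acc


def train_with_best_checkpoint(run, epochs=10, seed=0):
    rng = random.Random(seed)
    best_acc = float("-inf")  # defined before the loop so the first epoch always wins
    for epoch in range(epochs):
        loss, acc = fake_epoch(epoch, rng)
        run.log(loss=loss, accuracy=acc, epoch=epoch)
        if acc > best_acc:  # checkpoint only on improvement
            run.save_checkpoint("./best_model.pt")
            best_acc = acc
    run.finish()
    return best_acc


run = StubRun()
best = train_with_best_checkpoint(run)
print(f"best accuracy: {best:.3f}, checkpoints written: {len(run.checkpoints)}")
```

The one detail worth copying: initialise the best score before the loop, so the very first epoch has something to beat.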

// features

Everything you need. Nothing you don't.

No 47-page docs to read. No vendor lock-in. Just a clean API and a dashboard that doesn't suck.

Real-Time Metrics

Loss curves, accuracy, custom metrics. All streaming live to your dashboard. Zoom, pan, compare across runs.

training/loss: 2.8

GPU Tiers

From CPU micro for testing to 4x A10G for distributed training. Pick your instance, we handle the rest.

CPU Micro: t3a.small
Small GPU: 1x T4
Medium GPU: 1x A10G
Large GPU: 4x A10G
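
If you like to think of tiers as data, the list above fits in a small lookup table. This is only a sketch: the tier names and hardware come from the list, while the `Tier` class and the `smallest_tier` helper are hypothetical and not part of any ConvexLabs API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    hardware: str
    gpu_count: int


# Names and hardware are taken from the tier list; ordered smallest first.
TIERS = [
    Tier("CPU Micro", "t3a.small", 0),
    Tier("Small GPU", "T4", 1),
    Tier("Medium GPU", "A10G", 1),
    Tier("Large GPU", "A10G", 4),
]


def smallest_tier(min_gpus):
    """Return the first (smallest) tier with at least min_gpus GPUs."""
    for tier in TIERS:
        if tier.gpu_count >= min_gpus:
            return tier
    raise ValueError(f"no tier offers {min_gpus} GPUs")


print(smallest_tier(0).name)
print(smallest_tier(2).name)
```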

Hyperparameter Sweeps

Grid or random search strategies. Define your sweep space, we launch and track every run automatically.
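
For a feel of what the two strategies mean, here is a minimal plain-Python sketch. It is not the ConvexLabs sweep API: `grid_search`, `random_search`, and the example `space` are hypothetical names chosen for illustration. Grid search enumerates every combination; random search draws a fixed number of configs, sampling each value independently.

```python
import itertools
import random


def grid_search(space):
    """Yield every combination in the sweep space (Cartesian product)."""
    keys = list(space)
    for values in itertools.product(*(space[k] for k in keys)):
        yield dict(zip(keys, values))


def random_search(space, n_trials, seed=0):
    """Yield n_trials configs, picking each value uniformly at random."""
    rng = random.Random(seed)  # seeded so a sweep can be reproduced
    for _ in range(n_trials):
        yield {key: rng.choice(values) for key, values in space.items()}


# Hypothetical sweep space, for illustration only.
space = {"lr": [1e-4, 3e-4, 1e-3], "batch_size": [32, 64]}

grid_configs = list(grid_search(space))
random_configs = list(random_search(space, n_trials=4))
print(len(grid_configs), "grid runs;", len(random_configs), "random runs")
```

Grid cost multiplies with every parameter you add, which is why random search with a fixed budget is often the practical choice for larger spaces.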

Dataset Management

Upload datasets once. They're auto-downloaded to EC2 before training starts. No manual S3 wrangling.

Secure by Default

Per-run tokens, time-limited, hashed. Your code never sees our DB credentials or AWS keys.
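
To make "per-run, time-limited, hashed" concrete, here is one way such a scheme could look, using only the Python standard library. Treat it as an assumption-laden sketch and not a description of our implementation: the one-hour lifetime, the in-memory `store`, and the function names are all hypothetical.

```python
import hashlib
import hmac
import secrets
import time

TOKEN_TTL_SECONDS = 3600  # assumed lifetime, for illustration only


def issue_token(run_id, store, now=None):
    """Create a token for one run; keep only its SHA-256 hash and expiry."""
    now = time.time() if now is None else now
    token = secrets.token_urlsafe(32)
    store[run_id] = {
        "hash": hashlib.sha256(token.encode()).hexdigest(),
        "expires_at": now + TOKEN_TTL_SECONDS,
    }
    return token  # handed to the run once; never stored in plain text


def verify_token(run_id, token, store, now=None):
    """Accept a token only for its own run and only before it expires."""
    now = time.time() if now is None else now
    record = store.get(run_id)
    if record is None or now >= record["expires_at"]:
        return False
    candidate = hashlib.sha256(token.encode()).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(candidate, record["hash"])


store = {}
token = issue_token("r_8f3k", store, now=0)
print(verify_token("r_8f3k", token, store, now=10))    # inside the lifetime
print(verify_token("r_8f3k", token, store, now=7200))  # after the lifetime
```

Because only the hash is stored, a leaked copy of the store does not hand out working tokens.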

Team Workspaces

Owner, admin, member, viewer roles. Invite via email, share runs and experiments across your ML team.
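
Four ordered roles map naturally onto a simple "minimum role" check. The sketch below shows the idea only: the role names come from the list above, but the actions and the permission table are hypothetical, and the real permission matrix may differ.

```python
from enum import IntEnum


class Role(IntEnum):
    VIEWER = 1
    MEMBER = 2
    ADMIN = 3
    OWNER = 4


# Hypothetical minimum role per action, for illustration only.
REQUIRED_ROLE = {
    "view_run": Role.VIEWER,
    "launch_run": Role.MEMBER,
    "invite_member": Role.ADMIN,
    "delete_workspace": Role.OWNER,
}


def can(role, action):
    """True if the role is at or above the minimum role for the action."""
    return role >= REQUIRED_ROLE[action]


print(can(Role.VIEWER, "view_run"))
print(can(Role.MEMBER, "invite_member"))
```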

Bring your own framework. We're runtime-agnostic.

PyTorch
TensorFlow
JAX
Custom

Get early access

We're onboarding teams in batches. Join the waitlist and be the first to train on ConvexLabs when we launch.

No spam. We'll only email you when your spot is ready.

Ready to stop fighting infrastructure?

Join the waitlist and be the first to train on ConvexLabs when we launch. Early access includes free GPU credits.