Credits and usage

How Ertas bills training runs by GPU-minute, how refunds work, how to read the Run panel's duration versus the billable portion, and how to keep your monthly spend in check.

Ertas bills training in credits. Each credit corresponds to a fixed amount of training time on a given GPU tier; the rate differs by tier (A10G burns credits faster than T4). The Training Confirm dialog estimates the cost of a run before you press play, and the Run panel tracks actual usage as it accrues.

For the canonical credit allocations, plan prices, and overage rates, see the pricing page. This page covers the rules and mechanics of how credits are spent.

What you pay for

You pay for active GPU time (training and, if enabled, GGUF conversion), and nothing else. Specifically:

Queued time is free.
Setup time before training starts (GPU provisioning, model download to the GPU node, tokenizer load) is free.
Training time is billed. The clock starts when the first gradient step runs.
GGUF conversion at the end of training is billed (it uses the same GPU).
Cancelled runs are billed only for the GPU minutes used before cancellation, rounded down to the last whole minute.

A consequence of this: the duration the Run panel shows under each run is total elapsed wall-clock time, which includes queue and provisioning time. The credits charged correspond only to the billed portion (training plus GGUF conversion), which is often much shorter. It is normal to see a run with "133 min" elapsed and only a few credits charged, because most of those 133 minutes were spent queued, waiting for a free GPU.
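The split between elapsed and billable time can be sketched as below. This is an illustration only: the `RunPhases` type, its field names, and the per-phase minutes are invented for the example and are not part of any Ertas API.

```python
from dataclasses import dataclass


@dataclass
class RunPhases:
    """Minutes spent in each phase of a run (hypothetical field names)."""
    queued: float
    provisioning: float      # GPU provisioning, model download, tokenizer load
    training: float          # clock starts at the first gradient step
    gguf_conversion: float   # runs on the same GPU, so it is billed


def elapsed_minutes(p: RunPhases) -> float:
    """Wall-clock duration, as shown under each run in the Run panel."""
    return p.queued + p.provisioning + p.training + p.gguf_conversion


def billable_minutes(p: RunPhases) -> float:
    """Only training and GGUF conversion count toward credits."""
    return p.training + p.gguf_conversion


# Invented numbers chosen to add up to the "133 min" example above.
run = RunPhases(queued=100, provisioning=6, training=15, gguf_conversion=12)
print(elapsed_minutes(run), billable_minutes(run))
```

In this made-up run, 106 of the 133 elapsed minutes are free, so the charge reflects well under a quarter of the displayed duration.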

Credit rate by GPU tier

GPU tier	VRAM	Approximate rate
T4	16 GB	About 1 credit per 19 to 20 minutes of training
A10G	24 GB	About 1 credit per 8 minutes of training

A10G charges roughly 2.4 times the T4 rate. The Training Confirm dialog always shows the precise estimate for your selected tier before you confirm, so you do not need to do this math yourself.

A few concrete examples to ground intuition:

Tier	Training time	Approximate credits
T4	13 min	~0.7
A10G	34 min	~4.2
A10G	41 min	~5.1
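The figures in the examples table follow from dividing training minutes by the approximate minutes-per-credit for the tier. A minimal sketch, assuming 19.5 minutes per credit for T4 (the midpoint of the quoted 19 to 20 range) and 8 for A10G; the Training Confirm dialog remains the source of precise numbers.

```python
# Approximate minutes of training per credit, read off the rate table.
# 19.5 for T4 is an assumed midpoint of the quoted "19 to 20 minutes".
MINUTES_PER_CREDIT = {"T4": 19.5, "A10G": 8.0}


def approx_credits(tier: str, training_minutes: float) -> float:
    """Rough credit cost for a given amount of billed training time."""
    return training_minutes / MINUTES_PER_CREDIT[tier]


for tier, minutes in [("T4", 13), ("A10G", 34), ("A10G", 41)]:
    print(f"{tier} {minutes} min -> ~{approx_credits(tier, minutes):.1f} credits")

# Ratio of the two per-minute rates, for comparison with "roughly 2.4 times".
print(round(MINUTES_PER_CREDIT["T4"] / MINUTES_PER_CREDIT["A10G"], 2))
```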

Actual run duration depends on the base model, the dataset size, and the training config (Max Steps, batch size, context length). A 200-step T4 fine-tune on an 8B base typically lands at 1 to 2 credits including GGUF conversion. A 30 to 60 minute A10G run on a 13B base typically lands at 4 to 8 credits.

Estimating a run before you press play

The Training Confirm dialog gives you four numbers:

Total steps: derived from (rows * epochs / effective_batch_size), or your configured Max Steps if it is non-zero.
Estimated runtime: based on calibrated throughput numbers for the chosen base model and GPU tier.
Estimated cost: runtime times the per-minute credit rate.
Your current balance: how many credits you have available.

If your balance is below the estimated cost, the dialog blocks the run and surfaces an upgrade prompt. There is no way to start a run that would exceed your balance.
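The four numbers in the dialog, and the balance check, can be approximated as follows. This is a sketch under stated assumptions, not the estimator Ertas runs: the function names are hypothetical, `seconds_per_step` stands in for the calibrated throughput figures, and rounding the step count up is a guess (the page only gives the ratio).

```python
import math


def total_steps(rows: int, epochs: int, effective_batch_size: int,
                max_steps: int = 0) -> int:
    """A non-zero Max Steps overrides the value derived from the dataset."""
    if max_steps > 0:
        return max_steps
    # Rounding up is an assumption; the docs only state the ratio.
    return math.ceil(rows * epochs / effective_batch_size)


def estimate_run(steps: int, seconds_per_step: float,
                 credits_per_minute: float, balance: float) -> dict:
    """Hypothetical reconstruction of the Training Confirm numbers."""
    runtime_minutes = steps * seconds_per_step / 60
    cost = runtime_minutes * credits_per_minute
    return {
        "total_steps": steps,
        "runtime_minutes": runtime_minutes,
        "estimated_cost": cost,
        "balance": balance,
        # The run is blocked when the balance is below the estimate.
        "can_start": balance >= cost,
    }


steps = total_steps(rows=480, epochs=3, effective_batch_size=16)
dialog = estimate_run(steps, seconds_per_step=2.0,
                      credits_per_minute=1 / 8, balance=5.0)
print(dialog)
```

The throughput value of 2.0 seconds per step is made up for the example; real throughput depends on the base model and GPU tier.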

The estimate is deliberately conservative. The pipeline always reserves at least the estimated credits before a run can start, so the system can guarantee the run has enough budget to complete. In practice, actual usage is typically 70 to 80% of the estimate for normal-sized datasets, and the unused portion is returned to your balance when the run finishes.
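The reserve-then-return behaviour described above amounts to simple bookkeeping. The sketch below is illustrative: `settle_run` is a hypothetical name, and since the page does not describe what happens if actual usage exceeds the estimate, the example simply returns nothing in that case.

```python
def settle_run(balance: float, estimated: float, actual: float) -> float:
    """Return the balance after a run finishes.

    The estimated credits are reserved up front; whatever the run did
    not use is returned once it completes.
    """
    if balance < estimated:
        raise ValueError("balance is below the estimate; the run cannot start")
    balance_after_reserve = balance - estimated
    # Assumption: nothing is returned if usage reaches or exceeds the estimate.
    unused = max(estimated - actual, 0.0)
    return balance_after_reserve + unused


# A run estimated at 2 credits that used 75% of its estimate.
print(settle_run(balance=10.0, estimated=2.0, actual=1.5))
```

While the run is in flight, the available balance is lower by the full estimate, which is worth remembering when queuing several runs back to back.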

For very small datasets (under a few hundred rows), the estimator surfaces a low confidence label. Small datasets have higher runtime variance because the first few training steps dominate wall-clock and the throughput measurements that drive the estimate have less time to stabilise. Two real examples on the same Llama 3.2 1B model with 500 max steps on T4:

480-row dataset: 25 minutes, 1.25 credits actual.
25-row dataset: the same 500 steps, but only about 0.67 credits actual. Runtime can vary substantially between runs even when the step count is identical.

Either way, the estimate is a ceiling, not a target.

Watching usage live

While a run is training, the Run panel shows live credit accrual on the run card. Spend is sampled every few seconds server-side, but the displayed value refreshes at a slower cadence, so the number tends to step up in noticeable increments rather than tick smoothly. If you cancel a run, billing rounds down to the last whole minute of training, so a run cancelled at 12 min 45 sec is billed for 12 minutes.
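The cancellation rule is integer division on training seconds. A minimal sketch with a hypothetical function name; it counts only time after the first gradient step, since queue and provisioning are free.

```python
def billed_minutes_on_cancel(training_seconds: int) -> int:
    """Round down to the last whole minute of training."""
    return training_seconds // 60


# A run cancelled at 12 min 45 sec of training.
print(billed_minutes_on_cancel(12 * 60 + 45))
```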

The Billing settings tab (in the user settings menu) shows historical usage across the account: credits used per day, per project, and per GPU tier. Useful for spotting cost spikes when a teammate has been running A10G experiments overnight.

Refunds for failed runs

If a run fails on a base model from the verified catalog, Ertas refunds the credits automatically. The refund posts within a few minutes of the failure being recorded. You will see it as a positive entry in the Billing tab, and your balance reflects it as soon as it posts.

There are two exceptions:

Cancellations do not produce refunds. You pay for the GPU time you used before stopping.
Failures on unverified Hugging Face models are not refunded. When you add an HF model that the validator flags amber, you check a consent box that explicitly waives the refund for that run.
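The refund rule and its two exceptions reduce to a small decision function. The names below (`Outcome`, `refund_owed`) are hypothetical and only model the policy as stated here; manually handled edge cases are out of scope.

```python
from enum import Enum


class Outcome(Enum):
    COMPLETED = "completed"
    FAILED = "failed"
    CANCELLED = "cancelled"


def refund_owed(outcome: Outcome, verified_catalog_model: bool,
                credits_charged: float) -> float:
    """Credits refunded automatically under the stated policy."""
    if outcome is Outcome.FAILED and verified_catalog_model:
        return credits_charged
    # Cancellations, and failures on unverified Hugging Face models
    # (where the refund was waived via the consent box), get nothing.
    return 0.0


print(refund_owed(Outcome.FAILED, True, 1.4))
print(refund_owed(Outcome.CANCELLED, True, 1.4))
print(refund_owed(Outcome.FAILED, False, 1.4))
```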

If you believe a refund is owed and you do not see it within an hour, contact support from the Billing tab. Edge cases (rare backend failures) are handled manually.

Plans and credit allocations

Every plan refreshes its credit allocation on a fixed cadence:

Free: 5 credits refreshed daily, with a 30-credit monthly ceiling. Enough for the Quickstart and a handful of small T4 experiments per day.
Builder, Pro, Business: a fixed monthly credit grant that refills each billing cycle. Credit rollover and overage rates vary by plan.

The pricing page lists the current credit grant, rollover policy, and overage rates for each plan. We avoid restating them here because pricing changes more often than docs do.

A10G is available on paid plans only. If you are on the Free plan and try to add an A10G-only model from the picker, Ertas surfaces an upgrade prompt that takes you to Billing.

Capping your monthly spend (coming soon)

Billing caps are not yet a feature. Two controls are on the roadmap:

Monthly soft cap: when usage crosses this number, Ertas would email you a heads-up. New runs would continue to be allowed.
Monthly hard cap: when usage crosses this number, Ertas would refuse to start new runs until the next cycle or until you raise the cap. Active runs would never be killed mid-flight; they would finish, get billed, and only then would the lockout take effect.
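Because neither cap has shipped, the following is purely a model of the planned behaviour as described above, not an existing setting or API. The function name is hypothetical, and treating "crosses" as strictly greater than the cap is an assumption.

```python
def cap_decision(usage: float, soft_cap: float, hard_cap: float) -> tuple:
    """Return (allow_new_runs, send_heads_up_email) for a usage level.

    Only the start of new runs is gated; a run already in flight would
    finish and be billed before any lockout applies.
    """
    allow_new_runs = usage <= hard_cap   # assumption: "crosses" means exceeds
    send_heads_up_email = usage > soft_cap
    return allow_new_runs, send_heads_up_email


for usage in (40, 60, 120):
    print(usage, cap_decision(usage, soft_cap=50, hard_cap=100))
```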

In the meantime, the practical guardrail is to keep an eye on the live credit accrual in the Run panel and stick to predictable Max-Steps-based runs while iterating. If you anticipate a long sweep, do one small calibration run first and scale its cost by the number and length of the runs in the sweep.
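Scaling a calibration run up to a sweep can be done with a few lines of arithmetic. This sketch assumes cost grows linearly with step count, which is only a rough approximation given the runtime variance on short runs noted earlier; the function name and numbers are illustrative.

```python
def extrapolate_sweep_cost(calibration_credits: float, calibration_steps: int,
                           steps_per_run: int, num_runs: int) -> float:
    """Project sweep cost from one small step-based calibration run.

    Assumes credits scale linearly with steps and that every run in the
    sweep uses the same base model, GPU tier, and batch settings.
    """
    credits_per_step = calibration_credits / calibration_steps
    return credits_per_step * steps_per_run * num_runs


# Calibration: 100 steps cost 0.5 credits. Sweep: 6 runs of 500 steps each.
projected = extrapolate_sweep_cost(0.5, 100, steps_per_run=500, num_runs=6)
print(f"~{projected:.1f} credits")
```

Compare the projection against your balance before starting, and add headroom if any run in the sweep includes GGUF conversion.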

Tips for spending less

A few patterns that compound:

Iterate on small experiments first. A 100-step run at rank 8 catches most templating bugs and tells you whether the dataset is even close to working. Once that is clean, scale up.
Use step-based training during iteration. Predictable runtime and cost beat epoch-based training when you are comparing variants.
Turn off GGUF conversion while you are still iterating. The export step adds roughly 12 minutes (and credits). Once you have a config you are happy with, run one final pass with conversion on. If you want the final GGUF without paying for it at all, download the LoRA from Hub, merge it into the base model's safetensors locally, and run llama.cpp's convert_hf_to_gguf.py on your own machine. Slower in wall-clock, free in credits.
Train smaller bases on T4. A well-tuned 3B on T4 is a fraction of the credit cost of an 8B on A10G, and is often better on a narrow task.
Cancel obvious failures early. If loss plateaus or shoots up after 30 steps, the rest of the run will not save it. Cancel and adjust.

The biggest cost savings come from data quality, not from squeezing pennies on hyperparameters. See Dataset quality.

What's next

Handling failures

Triaging failed runs and getting credit refunds.

Storage

Dataset and artifact storage quotas.

Pricing

Current plan grants, prices, and overage rates.