Service Token Units (STUs)
One simple unit for every AI workload.
STUs unify usage across LLM tokens, speech-to-text, text-to-speech, and embeddings, so you can plan, compare, and scale with a single meter. They are priced for high-throughput, low-latency inference on SCX.ai's sovereign, energy-efficient AI factory.
What is an STU?
An STU (Service Token Unit) is our common billing unit across AI workloads. Instead of juggling separate meters for text tokens, audio minutes, and vectors, you consume STUs. Each workload has a published conversion—so you can mix models and modalities while keeping your budget predictable.
STUs also power clear plan limits: every plan includes a pool of STUs. If you need more, you add top-ups. If you prefer not to top up, we throttle your requests (we don't silently overcharge).
How STUs map to workloads
These are the current public conversions on SCX.ai. (We review them as models evolve to keep pricing fair and simple.)
| Workload (band/type) | Unit measured | STU per unit | What 1 STU buys |
|---|---|---|---|
| LLM – Band-L (efficient/light models) | 1M text tokens | 1.00 STU | 1.0M tokens |
| LLM – Band-S (standard/70B-class) | 1M text tokens | 2.00 STU | 0.5M tokens |
| LLM – Band-P (premium/large or MoE) | 1M text tokens | 8.00 STU | 0.125M tokens |
| STT – Realtime | 1 hour audio | 0.90 STU | ~1.11 h |
| STT – Batch (Standard) | 1 hour audio | 0.40 STU | 2.50 h |
| STT – Batch (Turbo) | 1 hour audio | 0.10 STU | 10.0 h |
| TTS – Standard | 1M characters | 0.25 STU | 4.0M chars |
| TTS – Neural | 1M characters | 1.60 STU | 0.625M chars |
| TTS – Expressive | 1M characters | 5.00 STU | 0.20M chars |
| Embeddings – Small | 1M tokens | 0.033 STU | ~30.3M tokens |
| Embeddings – Large | 1M tokens | 0.217 STU | ~4.61M tokens |
Band-L: small/efficient models for high-QPS apps.
Band-S: ~70B-class general models (balanced quality/cost).
Band-P: larger or advanced models (e.g., MoE or frontier-class). Band-P is available on Growth (with unlock) and Enterprise.
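To make the conversions concrete, here is a minimal Python sketch that turns a usage figure into STUs using the rates in the table above. The dictionary keys and the function names `stu_cost` and `units_per_stu` are hypothetical labels chosen for this illustration; they are not official SCX.ai identifiers or part of any SCX.ai API.

```python
from decimal import Decimal

# STU per unit, copied from the conversion table above.
# Units: 1M tokens (LLM, embeddings), 1 hour of audio (STT), 1M characters (TTS).
# The keys are hypothetical labels for this sketch, not official identifiers.
STU_PER_UNIT = {
    "llm_band_l": Decimal("1.00"),
    "llm_band_s": Decimal("2.00"),
    "llm_band_p": Decimal("8.00"),
    "stt_realtime": Decimal("0.90"),
    "stt_batch_standard": Decimal("0.40"),
    "stt_batch_turbo": Decimal("0.10"),
    "tts_standard": Decimal("0.25"),
    "tts_neural": Decimal("1.60"),
    "tts_expressive": Decimal("5.00"),
    "embeddings_small": Decimal("0.033"),
    "embeddings_large": Decimal("0.217"),
}


def stu_cost(workload, units):
    """Return the STUs consumed by `units` of a workload (units as in the table)."""
    return STU_PER_UNIT[workload] * Decimal(str(units))


def units_per_stu(workload):
    """Return how many units 1 STU buys (the last column of the table)."""
    return Decimal(1) / STU_PER_UNIT[workload]


if __name__ == "__main__":
    print(stu_cost("llm_band_s", 5))           # 5M Band-S tokens
    print(units_per_stu("stt_batch_standard"))  # hours of audio per STU
```

`Decimal` is used instead of floats so that small rates such as 0.033 add up exactly when many line items are summed.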
Why STUs?
Simplify your AI billing and planning
Quick example
50M Band-L tokens, 5M Band-S tokens, 100h STT Batch, 2M TTS chars, 20M embeddings
Recommendation: Starter includes 400 STU → plenty of headroom for this workload.
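The arithmetic behind this example can be sketched as follows. The example does not say which STT Batch, TTS, or Embeddings tier is meant, so this sketch assumes STT Batch (Standard), TTS Standard, and Embeddings Small; those tier choices and the variable names are assumptions made for illustration only.

```python
from decimal import Decimal

# (description, quantity in table units, STU per unit)
# Tier choices for STT, TTS and embeddings are ASSUMPTIONS; the example
# above does not specify them.
line_items = [
    ("50M Band-L tokens", Decimal("50"), Decimal("1.00")),
    ("5M Band-S tokens", Decimal("5"), Decimal("2.00")),
    ("100h STT Batch (Standard assumed)", Decimal("100"), Decimal("0.40")),
    ("2M TTS chars (Standard assumed)", Decimal("2"), Decimal("0.25")),
    ("20M embedding tokens (Small assumed)", Decimal("20"), Decimal("0.033")),
]

STARTER_INCLUDED_STU = Decimal("400")  # from the recommendation above

total = sum(quantity * rate for _, quantity, rate in line_items)
headroom = STARTER_INCLUDED_STU - total

for name, quantity, rate in line_items:
    print(f"{name}: {quantity * rate:.2f} STU")
print(f"Total: {total:.2f} STU")           # Total: 101.16 STU
print(f"Headroom on Starter: {headroom:.2f} STU")  # Headroom on Starter: 298.84 STU
```

Under these assumptions the workload uses about 101 STU of the 400 included. Swapping in the costliest tiers in the table for the same modalities (TTS Expressive and Embeddings Large) would raise the total to 114.34 STU, which is still well inside the Starter pool.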
Included STUs by plan
Choose the plan that fits your workload
Add Top-Ups anytime (100 / 1,000 / 3,000 STU sizes) or enable Reserved Throughput (Growth) to raise concurrency and tokens-per-minute.
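As a rough illustration of sizing a top-up, the sketch below picks a combination of the three pack sizes that covers a shortfall. Pack prices are not listed here, so it simply buys the smallest total number of STUs that covers the shortfall, using as few packs as possible; `plan_top_ups` is a hypothetical helper, not an SCX.ai feature.

```python
import math

# Top-up pack sizes in STU, from the text above, largest first.
TOP_UP_SIZES = (3000, 1000, 100)


def plan_top_ups(shortfall_stu):
    """Return {pack_size: count} covering the shortfall with the least extra STU.

    Assumption: the goal is to minimise STUs purchased, since pack prices
    are not known in this sketch.
    """
    if shortfall_stu <= 0:
        return {}
    smallest = min(TOP_UP_SIZES)
    # Round the shortfall up to a whole number of the smallest pack.
    remaining = math.ceil(shortfall_stu / smallest) * smallest
    plan = {}
    # Each size is a multiple of the next smaller one, so taking the
    # largest packs first yields the fewest packs for this total.
    for size in TOP_UP_SIZES:
        count, remaining = divmod(remaining, size)
        if count:
            plan[size] = count
    return plan


if __name__ == "__main__":
    print(plan_top_ups(1250))  # {1000: 1, 100: 3}
```

If larger packs turn out to be cheaper per STU, a cost-based choice could differ from this one, for example buying a single 1,000 STU pack instead of nine 100 STU packs.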
How throttling works
No automatic overage charges.
If you run out of STUs and don't top up, we throttle requests to your plan's safe baseline rate.
You can remove throttling instantly by purchasing a top-up or reducing traffic.
Frequently Asked Questions
Ready to Calculate Your Usage?
Use our calculator to see how many STUs your workload needs and get a personalised plan recommendation.