Engineering Blog
Achieving 1,200+ Tokens/Sec: Optimising Inference Pipelines
Deep dive into our inference optimisation techniques that deliver consistent sub-100ms latency at scale.
The Inference Challenge
Serving large language models at scale requires balancing three competing objectives:
- Throughput: Tokens generated per second across all requests
- Latency: Time to first token and inter-token delay
- Cost: Compute resources per token
Optimising one often degrades another. This post details how we achieve 1,200+ tokens/sec while maintaining sub-100ms latency.
Architecture Overview
Our inference pipeline consists of:
- Request Router: Intelligent load balancing with latency-aware scheduling
- Batch Aggregator: Dynamic batching with deadline-aware grouping
- Inference Engine: Optimised model execution with continuous batching
- KV Cache Manager: Efficient attention state management
- Response Streamer: Token-by-token delivery with minimal overhead
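To make the flow concrete, here is a minimal sketch of a request passing through these five stages. Every name below (InferenceRequest, aggregator, engine, streamer, current_latency_ms) is an illustrative stand-in, not our actual serving API.

```python
import time
from dataclasses import dataclass, field

# Illustrative stand-ins only: the replica, aggregator, engine and streamer
# interfaces assumed below are sketch-level, not our serving API.
@dataclass
class InferenceRequest:
    request_id: str
    prompt: str
    max_tokens: int
    deadline_ms: float = 100.0                      # latency budget used by the scheduler
    arrival: float = field(default_factory=time.monotonic)

def serve(request: InferenceRequest, replicas, streamer) -> None:
    replica = min(replicas, key=lambda r: r.current_latency_ms)  # 1. latency-aware routing
    batch = replica.aggregator.add(request)                      # 2. deadline-aware batching
    for token in replica.engine.generate(batch):                 # 3/4. continuous batching over the KV cache
        streamer.send(request.request_id, token)                 # 5. stream each token as it is produced
```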
Key Optimisations
1. Continuous Batching
Traditional batching waits for a batch to fill before processing. Continuous batching:
- Processes requests as they arrive
- Adds new requests to in-flight batches
- Removes completed requests immediately
Result: 40% higher throughput with lower average latency.
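The scheduling loop behind this is simple to sketch. The version below assumes an engine whose step() call advances every in-flight request by one token and returns a mapping of request id to new token; the class and attribute names are illustrative, not our implementation.

```python
import collections

# Sketch of a continuous-batching scheduler. Assumes engine.step() advances every
# in-flight request by one token and returns {request id: token}.
class ContinuousBatcher:
    def __init__(self, engine, max_batch_size: int = 32):
        self.engine = engine
        self.max_batch_size = max_batch_size
        self.queue = collections.deque()    # waiting requests
        self.in_flight = {}                 # request id -> request

    def submit(self, request) -> None:
        self.queue.append(request)

    def step(self) -> None:
        # Admit arrivals into the running batch -- no waiting for a batch to fill.
        while self.queue and len(self.in_flight) < self.max_batch_size:
            request = self.queue.popleft()
            self.in_flight[request.id] = request
        if not self.in_flight:
            return
        # One decode iteration for every in-flight request.
        for req_id, token in self.engine.step(list(self.in_flight.values())).items():
            request = self.in_flight[req_id]
            request.output.append(token)
            # Completed requests leave immediately, freeing their slot for the next arrival.
            if token == request.eos_token or len(request.output) >= request.max_tokens:
                del self.in_flight[req_id]
```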
2. PagedAttention for KV Cache
Attention key-value caches grow linearly with sequence length. PagedAttention:
- Allocates memory in fixed-size blocks
- Enables memory sharing across requests
- Reduces memory fragmentation
Result: 2-4× more concurrent requests per GPU.
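A simplified view of the bookkeeping, ignoring the GPU tensors themselves, looks like this; the block size and the class name PagedKVCache are illustrative rather than taken from any specific engine.

```python
# Sketch of block-level KV cache bookkeeping in the spirit of PagedAttention.
# The actual GPU tensors are omitted; block size and class name are illustrative.
class PagedKVCache:
    def __init__(self, num_blocks: int, block_size: int = 16):
        self.block_size = block_size
        self.free_blocks = list(range(num_blocks))      # indices into a pre-allocated pool
        self.block_tables: dict[str, list[int]] = {}    # request id -> ordered block ids
        self.lengths: dict[str, int] = {}               # request id -> tokens stored

    def reserve(self, request_id: str, num_tokens: int) -> list[int]:
        """Allocate just enough fixed-size blocks to hold num_tokens more tokens."""
        table = self.block_tables.setdefault(request_id, [])
        total = self.lengths.get(request_id, 0) + num_tokens
        needed = -(-total // self.block_size) - len(table)       # ceiling division
        if needed > len(self.free_blocks):
            raise MemoryError("KV cache exhausted; caller should queue or preempt")
        table.extend(self.free_blocks.pop() for _ in range(needed))
        self.lengths[request_id] = total
        return table

    def release(self, request_id: str) -> None:
        """Return a finished request's blocks to the free pool for immediate reuse."""
        self.free_blocks.extend(self.block_tables.pop(request_id, []))
        self.lengths.pop(request_id, None)
```

Because blocks are fixed-size and returned to a shared pool the moment a request finishes, fragmentation stays low and identical blocks can be shared between requests.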
3. Speculative Decoding
For predictable text patterns, speculative decoding:
- Drafts multiple candidate tokens cheaply (e.g. with a smaller draft model)
- Verifies candidates with the full model
- Accepts valid candidates, discards incorrect ones
Result: 2-3× speedup for structured outputs (JSON, code).
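A greedy version of the draft-then-verify loop can be sketched as follows. The draft_model and target_model callables are assumptions for the example, and the acceptance rule shown is simple greedy agreement rather than the full rejection-sampling scheme and batched verification used by production engines.

```python
# Greedy sketch of the draft-then-verify loop. draft_model and target_model are
# assumed callables mapping a list of token ids to the next token id.
def speculative_decode(draft_model, target_model, prompt: list[int],
                       max_new_tokens: int, k: int = 4) -> list[int]:
    tokens = list(prompt)
    while len(tokens) - len(prompt) < max_new_tokens:
        # 1. Draft k candidate tokens cheaply.
        context, draft = list(tokens), []
        for _ in range(k):
            nxt = draft_model(context)
            draft.append(nxt)
            context.append(nxt)
        # 2. Verify candidates with the full model; keep the agreeing prefix.
        accepted = []
        for candidate in draft:
            expected = target_model(tokens + accepted)
            if expected == candidate:
                accepted.append(candidate)       # candidate confirmed
            else:
                accepted.append(expected)        # fall back to the full model's token
                break
        tokens.extend(accepted)
    return tokens[:len(prompt) + max_new_tokens]
```

The speedup comes from the acceptance rate: structured outputs such as JSON or code are highly predictable, so most drafted tokens are accepted and several tokens land per full-model pass.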
4. Quantisation Without Quality Loss
INT8 and FP8 quantisation reduces memory and compute requirements:
- Weight quantisation: 50% memory reduction
- Activation quantisation: Additional compute savings
- Calibration: Maintains output quality within 0.1% of FP16
Result: 2× throughput on the same hardware.
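As a rough, engine-agnostic illustration of weight quantisation with max-abs calibration (symmetric, per output channel), here is a numpy-only sketch; real deployments use engine-specific kernels, and FP8 additionally requires supporting hardware.

```python
import numpy as np

# Rough sketch of symmetric per-channel INT8 weight quantisation with max-abs
# calibration. Numpy-only for illustration; not a production kernel.
def quantize_weights(w: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Quantise FP32/FP16 weights of shape [out, in] to INT8 plus per-channel scales."""
    scales = np.abs(w).max(axis=1, keepdims=True) / 127.0   # calibration: max-abs per output channel
    scales = np.where(scales == 0, 1.0, scales)             # guard against all-zero rows
    q = np.clip(np.round(w / scales), -127, 127).astype(np.int8)
    return q, scales.astype(np.float32)

def dequantize(q: np.ndarray, scales: np.ndarray) -> np.ndarray:
    return q.astype(np.float32) * scales

# Quick check that calibration keeps the reconstruction error small.
w = np.random.randn(1024, 1024).astype(np.float32)
q, s = quantize_weights(w)
max_err = np.abs(dequantize(q, s) - w).max()   # bounded by half a quantisation step per channel
```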
5. Prefix Caching
Many requests share common prefixes (system prompts, few-shot examples). Prefix caching:
- Pre-computes attention states for common prefixes
- Shares cached states across requests
- Reduces redundant computation
Result: 30-50% latency reduction for repeated patterns.
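Conceptually, the cache keys on the prefix's token ids and stores the prefilled attention state once. The sketch below uses an assumed compute_kv callable in place of the engine's real prefill; the class name and hit/miss counters are illustrative.

```python
# Sketch of a prefix cache: the attention state for a shared prefix is prefilled
# once and reused. compute_kv is an assumed stand-in for the engine's prefill.
class PrefixCache:
    def __init__(self, compute_kv):
        self.compute_kv = compute_kv     # callable: tuple of token ids -> KV state
        self.cache = {}
        self.hits = 0
        self.misses = 0

    def prefill(self, prompt_tokens: list[int], prefix_len: int):
        """Return (cached prefix state, remaining tokens that still need prefill)."""
        key = tuple(prompt_tokens[:prefix_len])      # e.g. system prompt + few-shot examples
        if key in self.cache:
            self.hits += 1
        else:
            self.misses += 1
            self.cache[key] = self.compute_kv(key)   # computed once, shared across requests
        return self.cache[key], prompt_tokens[prefix_len:]
```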
Latency Breakdown
For a typical 100-token generation, the end-to-end latency budget breaks down as follows:
| Stage | Time |
|---|---|
| Request routing | 2ms |
| Batch scheduling | 3ms |
| Prefill (first token) | 25ms |
| Generation (99 tokens) | 65ms |
| Response streaming | 5ms |
| Total | ~100ms |
Monitoring and Tuning
We instrument every stage with:
- P50/P95/P99 latency tracking
- Queue depth monitoring
- Cache hit rates
- Batch size distribution
Automated tuning adjusts:
- Batch timeout thresholds
- Maximum batch sizes
- Memory allocation strategies
- Request routing weights
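Instrumentation of this kind can be as simple as recording per-stage latency samples and reading off percentiles. In production this lives in a metrics system rather than in-process lists; the class name and the simple percentile maths below are illustrative.

```python
import statistics
from collections import defaultdict

# Sketch of per-stage latency instrumentation feeding P50/P95/P99 dashboards.
class StageMetrics:
    def __init__(self):
        self.samples = defaultdict(list)     # stage name -> latency samples in ms

    def record(self, stage: str, latency_ms: float) -> None:
        self.samples[stage].append(latency_ms)

    def percentiles(self, stage: str) -> dict[str, float]:
        # statistics.quantiles needs at least two samples per stage.
        cuts = statistics.quantiles(self.samples[stage], n=100)
        return {"p50": cuts[49], "p95": cuts[94], "p99": cuts[98]}

metrics = StageMetrics()
for sample in (24.1, 26.8, 31.5):
    metrics.record("prefill", sample)
print(metrics.percentiles("prefill"))
```

The same signals (queue depth, cache hit rate, batch size distribution) feed the automated tuning loop that adjusts timeouts, batch sizes, and routing weights.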
Hardware Considerations
Optimisations must match hardware capabilities:
Memory Bandwidth
LLM decoding is memory-bandwidth-bound: each generated token requires streaming the model weights and KV cache from memory. We prioritise:
- High-bandwidth memory (HBM)
- Optimal tensor layouts
- Minimal data movement
Compute Precision
Mixed precision execution:
- FP16 for most operations
- INT8 for compatible layers
- FP32 only where numerically necessary
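A minimal PyTorch sketch of this policy, assuming a CUDA device; the tiny model is a placeholder for illustration, and INT8 paths (which need dedicated kernels) are omitted.

```python
import torch

# Mixed-precision sketch: FP16 by default under autocast, FP32 where numerics demand it.
model = torch.nn.Sequential(
    torch.nn.Linear(4096, 4096),
    torch.nn.GELU(),
    torch.nn.Linear(4096, 4096),
).cuda()

x = torch.randn(8, 4096, device="cuda")
with torch.inference_mode(), torch.autocast(device_type="cuda", dtype=torch.float16):
    hidden = model(x)                                  # matmuls run in FP16 under autocast
    # Numerically sensitive ops can opt back into FP32, e.g. the final softmax:
    probs = torch.softmax(hidden.float(), dim=-1)
```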
Scaling Horizontally
Beyond single-node optimisation:
- Model parallelism: Split large models across devices
- Pipeline parallelism: Split layers into stages across devices, overlapping computation with communication
- Replica load balancing: Distribute requests across instances
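Replica selection can be as simple as scoring instances by queue depth and recent tail latency. The weighting below is illustrative, not our production policy; the per-replica stats would come from the monitoring pipeline described above.

```python
from dataclasses import dataclass

# Sketch of latency-aware replica selection with an illustrative scoring rule.
@dataclass
class Replica:
    name: str
    queue_depth: int         # requests currently waiting on this instance
    p95_latency_ms: float    # recent tail latency

def pick_replica(replicas: list[Replica]) -> Replica:
    # Lower score wins: penalise deep queues and high recent tail latency.
    return min(replicas, key=lambda r: r.queue_depth * 10 + r.p95_latency_ms)

replicas = [Replica("gpu-a", 3, 80.0), Replica("gpu-b", 1, 95.0), Replica("gpu-c", 5, 60.0)]
target = pick_replica(replicas)   # -> gpu-b with this weighting
```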
Conclusion
Achieving 1,200+ tokens/sec with consistent sub-100ms latency requires optimisation at every layer—from request routing to memory management. The techniques described here compound: each improvement enables further gains.
For technical discussions about inference optimisation, reach out to info@scx.ai.