// on-prem ai

Same command core, swap the model

For operators whose legal teams block cloud AI entirely: run Narya Command and WAIS against self-hosted LLMs on infrastructure you control. The strategy layer is designed for this. Workflows and tool interfaces don't change when the LLM behind them does.
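
As a rough illustration of what "workflows don't change when the LLM does" can look like in code, here is a minimal sketch of a provider interface. Every name in it (`ChatProvider`, `CloudProvider`, `SelfHostedProvider`, `summarize_alarm`, the endpoint URL) is hypothetical and is not taken from the Narya codebase. The stand-in providers only format strings, so no real inference API is called.

```python
from dataclasses import dataclass
from typing import Protocol


class ChatProvider(Protocol):
    """Hypothetical interface: anything that turns a prompt into a reply."""

    def complete(self, prompt: str) -> str: ...


@dataclass
class CloudProvider:
    """Stand-in for a managed cloud backend (no network call is made here)."""
    model: str

    def complete(self, prompt: str) -> str:
        return f"[cloud:{self.model}] {prompt}"


@dataclass
class SelfHostedProvider:
    """Stand-in for a self-hosted open-weight model on your own network."""
    endpoint: str

    def complete(self, prompt: str) -> str:
        return f"[local:{self.endpoint}] {prompt}"


def summarize_alarm(provider: ChatProvider, alarm_text: str) -> str:
    """A workflow step that depends only on the interface, not the backend."""
    return provider.complete(f"Summarize this alarm: {alarm_text}")


if __name__ == "__main__":
    alarm = "Pump 3 high vibration"
    # The workflow call is identical; only the provider object differs.
    print(summarize_alarm(CloudProvider(model="managed-model"), alarm))
    print(summarize_alarm(SelfHostedProvider(endpoint="http://10.0.0.5:8000"), alarm))
```

The point of the sketch is that `summarize_alarm` never names a vendor, so moving from cloud to on-prem becomes a configuration change rather than a workflow rewrite.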

Talk to us about on-prem · See the architecture

Cloud-friendly by default

By default, Anor uses managed cloud inference for chat and for the semantic-recall steps inside WAIS. One vendor relationship, one invoice: API costs are included in your maintenance.

On-prem when policy demands it

For air-gapped OT environments: same command core, same WAIS schema, same write safety. SCADA context never leaves your network. With no API costs to pass through, the product license becomes the only revenue line.

// hardware tiers

Three deployment tiers

Hardware planning is part of the conversation. We've tested against multiple open-weight stacks and can recommend a tier that matches your throughput, latency, and operations posture.

Tier 1

Lightweight — workstation-class

Modest CPU + RAM · no GPU required

Run smaller open-weight models locally. Works for narrow tasks, query answering, and lower-volume use. Lowest infrastructure cost.

Tier 2

Production — single GPU

Workstation-class GPU · standard system RAM

Capable open-weight models running on a single GPU. Good for production single-site use, with predictable latency.

Tier 3

Frontier-class on-prem

Dedicated inference appliance · large unified memory

Frontier-class on-prem inference for customers who want full capability without any cloud dependency. Hardware sizing is part of the rollout conversation.

// strict-OT mode

Graceful degradation when no cloud calls are allowed

External retrieval calls disabled. Semantic recall and reranking turn off cleanly. Structured and lexical retrieval still work.
Same WAIS schema and API. Storage and retrieval architecture are identical regardless of OT policy. Strict-OT just controls whether AI calls leave the network.
Local-model fallback under continuous evaluation. Where local models don't yet meet the quality bar for SCADA-domain knowledge, we degrade to structured/lexical search. The provider abstraction is in place to swap a local model in when one earns its place.
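
The degradation behavior described above can be sketched roughly as follows. This is an illustrative model only: `RetrievalPolicy`, `lexical_search`, `cloud_rerank`, and `retrieve` are assumed names, the lexical scorer is a simple term-overlap count, and the "cloud" reranker is a placeholder that reverses the list so the two paths are visibly different. Structured retrieval is omitted for brevity.

```python
from dataclasses import dataclass
from typing import Callable, Optional

Doc = str
Reranker = Callable[[str, list[Doc]], list[Doc]]


@dataclass
class RetrievalPolicy:
    strict_ot: bool  # True: no AI call may leave the network
    local_reranker: Optional[Reranker] = None  # set once a local model qualifies


def lexical_search(query: str, corpus: list[Doc]) -> list[Doc]:
    """Rank documents by shared query terms; no model is involved."""
    terms = set(query.lower().split())
    scored = [(len(terms & set(doc.lower().split())), doc) for doc in corpus]
    # sorted() is stable, so ties keep their original corpus order.
    ranked = sorted(scored, key=lambda pair: -pair[0])
    return [doc for score, doc in ranked if score > 0]


def cloud_rerank(query: str, docs: list[Doc]) -> list[Doc]:
    """Placeholder for an external reranking call; it only reverses the list."""
    return list(reversed(docs))


def retrieve(query: str, corpus: list[Doc], policy: RetrievalPolicy) -> list[Doc]:
    hits = lexical_search(query, corpus)
    if not policy.strict_ot:
        return cloud_rerank(query, hits)           # default: external call allowed
    if policy.local_reranker is not None:
        return policy.local_reranker(query, hits)  # local model swapped in
    return hits                                    # degraded: lexical results only


if __name__ == "__main__":
    corpus = ["pump 3 vibration alarm", "valve 7 stuck", "pump 1 pressure low"]
    print(retrieve("pump alarm", corpus, RetrievalPolicy(strict_ot=True)))
```

Note that the caller's code path and the return type are the same in all three branches, which mirrors the "same WAIS schema and API" claim: the policy only decides which ranking step runs.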

// offline licensing

Air-gapped license activation

1. Customer requests license with machine ID.
2. Narya generates a signed license file.
3. Customer places file in config directory.
4. Application validates signature locally — no network needed.
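
The four steps above could be sketched as follows. This is not Narya's actual license format or signing scheme, which the page does not describe. To stay within the Python standard library, the sketch substitutes an HMAC-SHA256 tag for a real digital signature; a production scheme would more plausibly use an asymmetric signature so that only a public key ships with the application. The field names (`machine_id`, `edition`) and function names are hypothetical.

```python
import hashlib
import hmac
import json

# ASSUMPTION: a shared secret stands in for the vendor's signing key.
# A real deployment would likely verify with a public key instead.
VENDOR_KEY = b"demo-key-not-for-production"


def issue_license(machine_id: str, edition: str, key: bytes = VENDOR_KEY) -> str:
    """Vendor side (step 2): serialize the payload and attach a signature."""
    payload = json.dumps({"machine_id": machine_id, "edition": edition}, sort_keys=True)
    sig = hmac.new(key, payload.encode(), hashlib.sha256).hexdigest()
    return json.dumps({"payload": payload, "signature": sig})


def validate_license(license_text: str, machine_id: str, key: bytes = VENDOR_KEY) -> bool:
    """Application side (step 4): verify locally, with no network access."""
    try:
        doc = json.loads(license_text)
        payload, sig = doc["payload"], doc["signature"]
        expected = hmac.new(key, payload.encode(), hashlib.sha256).hexdigest()
        # Constant-time comparison avoids leaking how many characters matched.
        if not hmac.compare_digest(expected, sig):
            return False
        # The license must be bound to the machine it is being used on.
        return json.loads(payload)["machine_id"] == machine_id
    except (ValueError, KeyError, TypeError, AttributeError):
        return False  # malformed file: treat as invalid rather than crash


if __name__ == "__main__":
    license_text = issue_license("MACHINE-001", "onprem")  # steps 1 and 2
    # Step 3 would write license_text into the config directory.
    print(validate_license(license_text, "MACHINE-001"))
```

Binding the payload to the machine ID is what makes a copied license file fail validation on a different host, and checking the signature before parsing the payload keeps tampered files from being trusted.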

See the offline activation flow →

// faq

Models & data — the questions your legal team will ask

// next step

Plan an on-prem deployment

Hardware sizing, model selection, deployment shape, license model. We'll walk through it with you and align validation to your environment.

Talk to sales · See pricing

Models & data — the questions your legal team will ask

What if we don't want SCADA data sent to a third-party AI provider?

Do you support self-hosted models?

Is self-hosted inference cheaper than API pricing?

Will Anor work as well with a self-hosted model?

What about retrieval components that use cloud APIs by default?

How does offline licensing work?

What changes between cloud and on-prem deployments?
