// on-prem ai

Same command core, swap the model

For operators whose legal teams block cloud AI entirely. Run Narya Command and WAIS against self-hosted LLMs on infrastructure you control. The strategy layer is designed for this — workflows and tool interfaces don't change when the LLM behind them does.

Cloud-friendly by default

By default, Narya Command uses managed cloud inference for chat and for the semantic-recall steps inside WAIS. One vendor relationship, one invoice: API costs are covered by your maintenance fee.

On-prem when policy demands it

For air-gapped OT environments: same command core, same WAIS schema, same write safety. SCADA context never leaves your network. With no cloud inference to bill for, the product license becomes the only revenue line.
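
To illustrate how a workflow can stay unchanged while the model behind it is swapped, here is a minimal Python sketch. It is not Narya's actual code: the `ChatProvider` interface, both provider classes, the `summarize_alarm` workflow, the endpoint URL, and the model path are all hypothetical names chosen for the example, and the providers return placeholder strings instead of running inference.

```python
from dataclasses import dataclass
from typing import Protocol


class ChatProvider(Protocol):
    """Hypothetical interface; the real provider abstraction is not shown in the source."""

    def complete(self, prompt: str) -> str: ...


@dataclass
class CloudProvider:
    endpoint: str  # placeholder URL for managed cloud inference

    def complete(self, prompt: str) -> str:
        # A real implementation would call the vendor API here.
        return f"[cloud:{self.endpoint}] {prompt}"


@dataclass
class LocalProvider:
    model_path: str  # placeholder path to a self-hosted open-weight model

    def complete(self, prompt: str) -> str:
        # A real implementation would run local inference here.
        return f"[local:{self.model_path}] {prompt}"


def summarize_alarm(provider: ChatProvider, alarm_text: str) -> str:
    """Workflow code depends only on the interface, never on the backend."""
    return provider.complete(f"Summarize this alarm: {alarm_text}")


# Same workflow call, two different backends.
cloud_answer = summarize_alarm(
    CloudProvider("https://inference.example"), "Pump 3 high vibration"
)
local_answer = summarize_alarm(
    LocalProvider("/models/example.gguf"), "Pump 3 high vibration"
)
```

The point of the design is that `summarize_alarm` never needs to know which deployment mode is active; only the object passed in changes.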

// hardware tiers

Three deployment tiers

Hardware planning is part of the conversation. We've tested against multiple open-weight stacks and can recommend a tier that matches your throughput, latency, and operations posture.

Tier 1

Lightweight — workstation-class

Modest CPU + RAM · no GPU required

Run smaller open-weight models locally. Works for narrow tasks, query answering, and lower-volume use. Lowest infrastructure cost.

Tier 2

Production — single GPU

Workstation-class GPU · standard system RAM

Capable open-weight models running on a single GPU. Good for production single-site use, with predictable latency.

Tier 3

Frontier-class on-prem

Dedicated inference appliance · large unified memory

Frontier-class on-prem inference for customers who want full capability without any cloud dependency. Hardware sizing is part of the rollout conversation.

strict-OT mode

Graceful degradation when no cloud calls are allowed

  • External retrieval calls disabled. Semantic recall and reranking turn off cleanly. Structured and lexical retrieval still work.
  • Same WAIS schema and API. Storage and retrieval architecture are identical regardless of OT policy. Strict-OT just controls whether AI calls leave the network.
  • Local-model fallback under continuous evaluation. Where local models don't yet meet the quality bar for SCADA-domain knowledge, we degrade to structured/lexical search. The provider abstraction is in place to swap a local model in when one earns its place.
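
The degradation described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the WAIS implementation: the `Doc` record, the `retrieve` function, the `strict_ot` flag, and the toy word-overlap lexical match are all hypothetical, and `fake_semantic_recall` merely stands in for an external AI call.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    doc_id: str
    tag: str   # structured field, e.g. an asset tag (hypothetical)
    text: str


def structured_search(docs, tag):
    """Exact match on a structured field."""
    return [d for d in docs if d.tag == tag]


def lexical_search(docs, query):
    """Toy lexical match: keep docs sharing at least one word with the query."""
    terms = set(query.lower().split())
    return [d for d in docs if terms & set(d.text.lower().split())]


def retrieve(docs, query, tag=None, strict_ot=False, semantic_recall=None):
    """Return (results, stages_used). The semantic stage is skipped in strict-OT mode."""
    results, stages = [], []

    def merge(found):
        for d in found:
            if d not in results:
                results.append(d)

    if tag is not None:
        merge(structured_search(docs, tag))
        stages.append("structured")
    merge(lexical_search(docs, query))
    stages.append("lexical")
    # Only this stage would leave the network, so it is the only one gated.
    if not strict_ot and semantic_recall is not None:
        merge(semantic_recall(docs, query))
        stages.append("semantic")
    return results, stages


def fake_semantic_recall(docs, query):
    # Stand-in for an external model call; pretends "bearing" is related.
    return [d for d in docs if "bearing" in d.text]


DOCS = [
    Doc("a", "PUMP-3", "high vibration alarm on pump"),
    Doc("b", "VALVE-7", "valve stuck closed"),
    Doc("c", "PUMP-3", "bearing wear report"),
]

normal = retrieve(DOCS, "vibration alarm", semantic_recall=fake_semantic_recall)
strict = retrieve(DOCS, "vibration alarm", strict_ot=True,
                  semantic_recall=fake_semantic_recall)
```

In this sketch the caller's code and the data model are identical in both modes; `strict_ot=True` only removes the one stage that would make an external call, so `strict` returns fewer results but never fails.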

offline licensing

Air-gapped license activation

  1. Customer requests license with machine ID.
  2. Narya generates a signed license file.
  3. Customer places file in config directory.
  4. Application validates signature locally; no network needed.
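
The four steps can be sketched in Python as below. The source does not say which signature scheme or file format is used, so everything here is an assumption: a production scheme would most likely be asymmetric, with the application holding only a public key. To stay within the Python standard library, this sketch substitutes an HMAC-SHA256 tag for the signature, so it shows the flow (bind the license to a machine ID, verify locally, no network), not the real cryptography. The names `issue_license`, `validate_license`, `DEMO_KEY`, and the JSON layout are hypothetical.

```python
import hashlib
import hmac
import json

# Stand-in secret for this sketch only. A real scheme would likely embed a
# vendor public key and verify an asymmetric signature instead.
DEMO_KEY = b"demo-key-not-for-production"


def _payload_bytes(payload: dict) -> bytes:
    # Canonical JSON so issuer and verifier hash identical bytes.
    return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")


def issue_license(machine_id: str, key: bytes = DEMO_KEY) -> str:
    """Vendor side (step 2): produce the license file contents."""
    payload = {"machine_id": machine_id, "product": "example-product"}
    tag = hmac.new(key, _payload_bytes(payload), hashlib.sha256).hexdigest()
    return json.dumps({"payload": payload, "signature": tag})


def validate_license(file_text: str, local_machine_id: str,
                     key: bytes = DEMO_KEY) -> bool:
    """Application side (step 4): a purely local check, no network access."""
    try:
        doc = json.loads(file_text)
        payload, signature = doc["payload"], doc["signature"]
    except (ValueError, KeyError, TypeError):
        return False
    if not isinstance(payload, dict) or not isinstance(signature, str):
        return False
    expected = hmac.new(key, _payload_bytes(payload), hashlib.sha256).hexdigest()
    # Constant-time comparison of the recomputed tag with the stored one.
    if not hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8")):
        return False
    # The license must also be bound to this specific machine.
    return payload.get("machine_id") == local_machine_id


license_text = issue_license("MACHINE-123")
is_valid_here = validate_license(license_text, "MACHINE-123")
is_valid_elsewhere = validate_license(license_text, "MACHINE-999")
```

Editing the machine ID inside the file does not help an attacker in this sketch, because the tag no longer matches the modified payload and validation returns `False`.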

See the offline activation flow →

// faq

Models & data — the questions your legal team will ask

// next step

Plan an on-prem deployment

Hardware sizing, model selection, deployment shape, license model. We'll walk through it with you and align validation to your environment.