BYOK & local models
Bring your own provider keys, or route to Ollama and vLLM on your own VPS / GPU node — without changing client code.
Nyuro is a bring-your-own-keys gateway. You supply the provider credentials and the infrastructure; Nyuro provides the routing, governance, and observability on top. Sensitive workloads can stay entirely on your perimeter.
Bring your own provider keys
Configure provider credentials on the gateway and they are used for outbound calls on your behalf. Keys are stored encrypted at rest:
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...Provider keys live only on the gateway side and are never returned to clients.
Client applications authenticate with a Nyuro key (neu_live_…) and never see
the upstream provider credentials.
Route to your own VPS
Point the gateway at Ollama or vLLM running on your own machines. Those models
join the catalog and are reachable by name or via strategy:local — with no
change to client code:
# Ollama on your VPS, over HTTPS:
OLLAMA_BASE_URL=https://ollama.yourdomain.com:11434
# vLLM on a GPU node:
VLLM_BASE_URL=https://gpu1.yourdomain.com:8001/v1# Same client, same key, same shape:
client.chat.completions.create(
model="llama3.1", # lives on your VPS
# model="strategy:local", # any local model the router picks
messages=[{"role": "user", "content": "Keep this on-prem, please."}],
)Why this matters
Data residency
Route privacy-sensitive prompts to strategy:local so they never leave your
network, while everything else uses the best cloud model.
Predictable cost
Self-hosted models have no per-token meter. Mix them with cloud models and let budgets cap the rest.
No client churn
Move a model between cloud and your VPS by flipping a base URL on the gateway. Your applications never notice.
One integration
Local and cloud models share the same OpenAI-compatible API, the same keys, and the same observability.
Observability
Every request logs its provider, model, tokens, cost, latency, fallback, and cache status — surfaced live in the dashboard and on response headers.
Migrating from OpenRouter
Move off OpenRouter in five minutes — same OpenAI SDK, a base-URL swap, and you keep the fallback-array pattern you already use.