router.Nyuro.ai

Routing & strategies

How Nyuro picks a model — auto classification, strategy hints, fallback chains, and the transparency headers that tell you what happened.

Routing is the core of the gateway. You express intent; Nyuro resolves that into an ordered list of candidate models, tries them in order, and tells you exactly what it did.

Strategies

Pass a strategy: directive when you care about an outcome more than a specific model:

StrategyPicks
strategy:costThe cheapest model that can handle the request
strategy:qualityThe strongest reasoner available
strategy:latencyThe fastest time-to-first-token
strategy:localOnly models running on your own infrastructure
client.chat.completions.create(
    model="strategy:quality",
    messages=[{"role": "user", "content": "Draft a precise refund-policy clause."}],
)

Auto classification

With model="auto", a dependency-free heuristic classifier inspects the prompt — code fences, reasoning keywords, length, math symbols — and maps it to a task class, with no extra LLM call on the hot path:

  • code → qwen2.5-coder
  • reasoning → claude-3-5-sonnet
  • summary → gpt-4o-mini
  • long context → gpt-4o
  • general chat → gpt-4o-mini

Fallback chains

Send a models array (plural) instead of a single model. The gateway tries each entry in order and falls through on a provider error. Each entry resolves through the same router, so you can mix concrete aliases with auto, strategy:cost, or industry:legal.

import httpx, os

resp = httpx.post(
    "https://api.nyuro.ai/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['NYURO_API_KEY']}"},
    json={
        "models": ["gpt-4o", "claude-3-5-sonnet", "strategy:cost"],
        "messages": [{"role": "user", "content": "Summarize transformer attention in 2 lines."}],
    },
)
print(resp.headers.get("X-Nyuro-Model"), resp.headers.get("X-Nyuro-Fallback-Used"))
curl -D - https://api.nyuro.ai/v1/chat/completions \
  -H "Authorization: Bearer neu_live_…" \
  -H "Content-Type: application/json" \
  -d '{"models": ["gpt-4o", "gpt-4o-mini"], "messages": [{"role": "user", "content": "Hi"}]}'

Sorting candidates

Append sort:price or max_price:<usd> semantics through strategies to bias the candidate ordering toward cheaper or faster models across the resolved chain.

Transparency headers

Every response tells you what the router decided:

HeaderMeaning
X-Nyuro-ModelThe model that actually answered
X-Nyuro-Route-ReasonHuman-readable reason for the choice
X-Nyuro-Fallback-Usedtrue if the primary candidate was skipped
X-Nyuro-CandidatesThe comma-separated candidate chain that was resolved
X-Nyuro-Model: qwen2.5-coder
X-Nyuro-Route-Reason: detected code/engineering cues → code specialist
X-Nyuro-Fallback-Used: false
X-Nyuro-Candidates: qwen2.5-coder,claude-3-5-sonnet,gpt-4o

These headers make routing auditable — pair them with Observability to see the decision for every historical request.

On this page