AI Gateway
One URL in front of every model call.
Point your OpenAI or Azure client, your Anthropic SDK, or your Bedrock or Gemini call at the gateway and add one header. Every prompt and completion across every team is scored, policy-checked, spend-metered, and audited, with no change to your application logic.
Drop-in · OpenAI, Anthropic, Bedrock, Gemini, Vertex · decision before the model runs
from openai import OpenAI

client = OpenAI(
    base_url="https://app.axiorank.com/api/proxy/v1",
    api_key="sk-...",  # forwarded, never stored
    default_headers={"X-AxioRank-Key": "axr_live_..."},
)
Drop-in
No rewrite. Just a new address.
Your apps already speak the provider APIs. The gateway speaks them too, so adoption is a base URL plus one header at the platform layer. The provider key keeps riding in the request, forwarded for that one call and never stored.
Change the URL
Set base_url to the gateway and add X-AxioRank-Key. Nothing else in your code moves.
Decide before the model
The prompt is checked first. A denied call never reaches the provider, so no tokens are spent on a risky prompt.
Cost on every call
Token usage is captured from every response, so spend shows up in the dashboard the moment you point at the gateway.
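As a rough illustration of metering spend from token usage, the sketch below turns the usage object of an OpenAI-style chat completion (prompt_tokens and completion_tokens) into a cost figure. The Price class, the call_cost helper, and the per-million-token rates are hypothetical placeholders chosen for the example; they are not the gateway's metering logic or any provider's real prices.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Price:
    """Hypothetical price card, in dollars per one million tokens."""
    input_per_million: float
    output_per_million: float


def call_cost(usage: dict, price: Price) -> float:
    """Estimate the cost of one call from an OpenAI-style usage object."""
    prompt_tokens = usage.get("prompt_tokens", 0)
    completion_tokens = usage.get("completion_tokens", 0)
    total = (
        prompt_tokens * price.input_per_million
        + completion_tokens * price.output_per_million
    )
    return total / 1_000_000


# Example usage with made-up rates (not real provider pricing).
example_usage = {"prompt_tokens": 1000, "completion_tokens": 500, "total_tokens": 1500}
example_price = Price(input_per_million=2.0, output_per_million=8.0)
example_cost = call_cost(example_usage, example_price)
```

Summing call_cost per team or per key would give the kind of running total a spend dashboard shows.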
Live
Watch it decide.
Pick a provider and a scenario, then flip governance on. The detection here is the real browser-safe engine, and the deny, hold, redact, and allow ladder mirrors the gateway's.
Provider
base_url = "https://app.axiorank.com/api/proxy/v1"
headers = { "X-AxioRank-Key": "axr_live_..." }
model = "gpt-4o" // chat.completions
Scenario
Prompt
Summarize our Q3 revenue report for the board.
Coverage
Every provider, one gateway.
The same governance, spend, and audit across the APIs your teams already call, streaming and non-streaming alike.
OpenAI Chat Completions
The default. Point base_url at the gateway and call chat.completions as usual.
OpenAI Responses
The newer surface, governed the same way through the one base URL.
Anthropic Messages
Your Anthropic key keeps riding in x-api-key, forwarded and never stored.
Amazon Bedrock
Converse and ConverseStream. The gateway re-signs with SigV4 from your AWS credentials.
Google Gemini
The native generateContent and streamGenerateContent API, model and method in the path.
Google Vertex
Through its OpenAI-compatible endpoint, with your project and location.
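To make the "provider key keeps riding in the request" idea concrete, here is a minimal sketch of the headers a raw HTTP client might send through the gateway for two of the providers above. The gateway_headers helper is hypothetical. The Authorization: Bearer and x-api-key / anthropic-version headers are the providers' own conventions, and X-AxioRank-Key comes from the snippet at the top of this page. Gateway paths for providers other than OpenAI are not stated here, so the sketch builds headers only.

```python
def gateway_headers(provider: str, provider_key: str, gateway_key: str) -> dict:
    """Build request headers: the provider's own auth plus the gateway key.

    Hypothetical helper for illustration; the official SDKs set the
    provider headers for you when you pass api_key.
    """
    headers = {
        "Content-Type": "application/json",
        "X-AxioRank-Key": gateway_key,  # the one header the gateway adds
    }
    if provider == "openai":
        # OpenAI-style APIs carry the provider key as a bearer token.
        headers["Authorization"] = f"Bearer {provider_key}"
    elif provider == "anthropic":
        # The Anthropic Messages API carries the key in x-api-key
        # and expects an API version header.
        headers["x-api-key"] = provider_key
        headers["anthropic-version"] = "2023-06-01"
    else:
        raise ValueError(f"unsupported provider in this sketch: {provider}")
    return headers


openai_headers = gateway_headers("openai", "sk-...", "axr_live_...")
anthropic_headers = gateway_headers("anthropic", "sk-ant-...", "axr_live_...")
```

In both cases the gateway key is the only addition; the provider credential stays where the provider's API already expects it.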
The decision
Four outcomes, on the prompt and the answer.
The gateway runs the same deterministic ladder it runs for tool calls. The prompt is governed before the model is called; the completion is governed before it reaches your app.
Allow
Clean on both sides. The provider response is returned unchanged, and usage is metered.
Deny
A live secret or a blocked destination in the prompt returns a 403 and the model is never called.
Hold
A risky prompt returns a 409 with an approval ID, so a human clears it before the model runs.
Redact
Secrets and PII in the answer are masked in place, with every other field of the response preserved.
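On the client side, the four outcomes above collapse into three cases to handle: a successful response (allow or redact), a 403 (deny), and a 409 (hold). The sketch below shows one way an application might branch on them. The GatewayOutcome type, the interpret_response helper, and the approval_id field name are assumptions made for illustration; the actual response body shape is not specified here, and a 403 or 409 could in principle also originate from the provider itself.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GatewayOutcome:
    kind: str  # "proceed", "denied", "held", or "error"
    approval_id: Optional[str] = None


def interpret_response(status: int, body: dict) -> GatewayOutcome:
    """Map an HTTP status from the gateway to an application-level outcome."""
    if status == 403:
        # Deny: the model was never called, so there is nothing to retry as-is.
        return GatewayOutcome("denied")
    if status == 409:
        # Hold: a human must approve first. "approval_id" is an assumed field name.
        return GatewayOutcome("held", approval_id=body.get("approval_id"))
    if 200 <= status < 300:
        # Allow or redact: assumed to arrive as a normal provider response.
        return GatewayOutcome("proceed")
    return GatewayOutcome("error")


held = interpret_response(409, {"approval_id": "example-123"})
```

A caller would typically surface "denied" to the user, park "held" requests until the approval clears, and treat "proceed" exactly like a direct provider response.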
Operate
Stream it, and carry one key.
The gateway keeps the developer experience intact while it governs in the middle.
Streaming, governed
Server-sent events stream straight through when governance is off. When it is on, the gateway buffers, governs, and re-emits in the provider's native streaming format, with tool calls preserved.
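The buffer, govern, and re-emit flow can be sketched for an OpenAI-style chat completion stream, where each event is a "data: {json}" line with text in choices[0].delta.content and the stream ends with "data: [DONE]". This is a simplified illustration, not the gateway's implementation: the SECRET pattern is a hypothetical stand-in for a real detector, and the sketch re-emits all text in a single chunk rather than preserving the original chunk boundaries.

```python
import json
import re

# Hypothetical pattern standing in for a real secret detector.
SECRET = re.compile(r"sk-[A-Za-z0-9]{8,}")


def govern_stream(sse_lines):
    """Buffer an OpenAI-style SSE stream, mask secrets, and re-emit it."""
    chunks = []
    for line in sse_lines:
        if not line.startswith("data: "):
            continue  # skip blank separators and comments
        payload = line[len("data: "):].strip()
        if payload == "[DONE]":
            break
        chunks.append(json.loads(payload))

    def delta_of(chunk):
        # Usage-only chunks may carry an empty choices list.
        return chunk["choices"][0].get("delta", {}) if chunk.get("choices") else {}

    # Join the text deltas so a secret split across chunks is still detected.
    text = "".join(delta_of(c).get("content") or "" for c in chunks)
    masked = SECRET.sub("[REDACTED]", text)

    out = []
    text_emitted = False
    for c in chunks:
        delta = delta_of(c)
        if delta.get("content"):
            if text_emitted:
                continue  # all text was re-emitted in the first content chunk
            delta["content"] = masked
            text_emitted = True
        # Chunks without text (role, tool calls, finish reason) pass through.
        out.append("data: " + json.dumps(c))
    out.append("data: [DONE]")
    return out
```

Buffering is what makes the check reliable: in the example a key split across two chunks would slip past a per-chunk scan, but it is caught once the text is joined. The cost is latency, which is why passing events straight through when governance is off is the natural default.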
Govern every model call by changing one URL.
Point a client at the gateway, add one header, and get deny, hold, redact, spend, and audit across every provider your teams use.