AI Service V1

OpenAI-compatible gateway for GPT-5.4, Claude, Mistral, Llama, and DeepSeek with usage tracking

AI Service V1 is an OpenAI-compatible proxy for major cloud LLM providers. You send chat completions in the standard messages format; we forward the request to the provider, log token usage, and deduct cost from your API balance.

Base URL. https://api.koveh.com/ai/

Provider-specific paths are listed below. The legacy unified path https://api.koveh.com/ai/v1/ still works and routes to the OpenAI implementation.

Authentication

Every request must carry the key in the X-API-Key header, not in an Authorization: Bearer header. Request an API key via API Access & Authentication.

X-API-Key: YOUR_API_KEY

How Koveh uses these models

Across koveh.com and internal automations we default to the GPT-5.4 family on the OpenAI path unless a task needs another provider.

Use case	Typical model	Why
High-volume classification, digests, lightweight chat	`gpt-5.4-nano`	Cheapest GPT-5.4-class tier; enough for routing, tagging, short summaries
General product chat, agents, coding helpers	`gpt-5.4-mini`	Default balance of quality, speed, and cost
Complex reasoning, premium bots, hard analysis	`gpt-5.4`	Strongest GPT-5.4 tier when latency and cost are secondary
Long-context schema / SQL assistance (platform)	`gpt-5-mini`	Sampled table context with strict row/char caps
Mail classification (Mail Processor v2)	`gpt-5.4-nano` with `gpt-5.4-mini` fallback	Fast first pass; fallback on low confidence
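
The nano-first, mini-on-low-confidence pattern in the last row can be sketched roughly as follows. This is only an illustration: the `classify` callable, its `(label, confidence)` return shape, and the `0.8` threshold are hypothetical and are not taken from Mail Processor v2.

```python
from typing import Callable, Tuple

# Hypothetical callable: takes (model, text) and returns (label, confidence in [0, 1]).
Classifier = Callable[[str, str], Tuple[str, float]]


def classify_with_fallback(
    text: str,
    classify: Classifier,
    primary: str = "gpt-5.4-nano",
    fallback: str = "gpt-5.4-mini",
    min_confidence: float = 0.8,  # assumed threshold, tune for your data
) -> Tuple[str, str]:
    """Return (label, model_used), retrying on the fallback model when unsure."""
    label, confidence = classify(primary, text)
    if confidence >= min_confidence:
        return label, primary
    # Low confidence on the cheap tier: ask the larger model once.
    label, _ = classify(fallback, text)
    return label, fallback
```

Most inputs stay on the cheaper tier, and only the uncertain ones pay for a second call.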

Legacy OpenAI ids such as gpt-4o and gpt-4o-mini still work on the wire, but we recommend passing gpt-5.4-* explicitly in new integrations. On koveh.com, older ids are remapped internally (for example gpt-4o → gpt-5.4-mini).

For self-hosted text generation and embeddings on our infrastructure, see Qwen3 AI API — it is a separate service, not part of AI Service V1.

Supported providers

Each provider has its own path prefix. The request body follows the OpenAI Chat Completions shape unless noted.

Provider	Path	Recommended default	Also common
OpenAI	`/ai/openai/`	`gpt-5.4-mini`	`gpt-5.4-nano`, `gpt-5.4`, `gpt-4o` (legacy)
Anthropic	`/ai/claude/`	`claude-3-5-sonnet-20241022`	`claude-3-haiku-20240307`
Mistral	`/ai/mistral/`	`mistral-large-latest`	`mistral-nemo-instruct-2407`
Together (Llama)	`/ai/llama/`	`meta-llama/Llama-3.1-70b-instruct`	`meta-llama/Llama-3.3-70B-Instruct`
DeepSeek	`/ai/deepseek/`	`deepseek-chat`	`deepseek-r1`, `deepseek-r1-0528`

The gateway does not hard-limit model names: any id accepted by the upstream provider can be sent in the model field. Costs are calculated from actual token usage for that model.
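
As a rough client-side sketch, the path prefixes from the table can be kept in one mapping so the endpoint URL is built in a single place. The dictionary keys (`"openai"`, `"anthropic"`, and so on) and the helper name are illustrative choices, not identifiers defined by the gateway.

```python
BASE_URL = "https://api.koveh.com/ai/"

# Path prefixes copied from the provider table; the keys are our own labels.
PROVIDER_PATHS = {
    "openai": "openai/",
    "anthropic": "claude/",
    "mistral": "mistral/",
    "together": "llama/",
    "deepseek": "deepseek/",
}


def chat_completions_url(provider: str) -> str:
    """Build the chat completions URL for one of the listed providers."""
    try:
        prefix = PROVIDER_PATHS[provider.lower()]
    except KeyError:
        raise ValueError(f"unknown provider: {provider!r}") from None
    return f"{BASE_URL}{prefix}chat/completions"
```

For example, `chat_completions_url("anthropic")` yields the `/ai/claude/chat/completions` endpoint.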

Quick start

Non-streaming chat (OpenAI path, recommended model):

curl -s https://api.koveh.com/ai/openai/chat/completions \
  -H "Content-Type: application/json" \
  -H "X-API-Key: YOUR_API_KEY" \
  -d '{
    "model": "gpt-5.4-mini",
    "messages": [
      {"role": "system", "content": "You are a concise assistant."},
      {"role": "user", "content": "Summarize what a data warehouse staging layer does in two sentences."}
    ],
    "stream": false
  }'
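
The same call can be made from Python using only the standard library. This is a minimal sketch: the function names are our own, and error handling (HTTP errors, retries) is left out.

```python
import json
import urllib.request


def build_chat_request(
    api_key: str,
    model: str,
    messages: list,
    url: str = "https://api.koveh.com/ai/openai/chat/completions",
    stream: bool = False,
) -> urllib.request.Request:
    """Prepare a POST request that mirrors the curl example above."""
    body = json.dumps({"model": model, "messages": messages, "stream": stream})
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        # urllib normalizes header capitalization; HTTP header names are case-insensitive.
        headers={"Content-Type": "application/json", "X-API-Key": api_key},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send a non-streaming request and decode the JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Usage would look like `send(build_chat_request("YOUR_API_KEY", "gpt-5.4-mini", [{"role": "user", "content": "Hello!"}]))`.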

Streaming. Set "stream": true. The response is a Server-Sent Events stream: each chunk arrives as a line of the form data: {"content": "..."}, and the stream ends with a final data: [DONE].
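
A small parser for that stream might look like the sketch below. It assumes only the two line shapes described above (a JSON object with a `content` key, and the `[DONE]` terminator) and ignores every other line; the function name is illustrative.

```python
import json
from typing import Iterable, Iterator


def iter_stream_content(lines: Iterable[bytes]) -> Iterator[str]:
    """Yield the text pieces from an SSE byte-line stream until [DONE]."""
    for raw in lines:
        line = raw.decode("utf-8").strip()
        if not line.startswith("data:"):
            continue  # blank separators and any non-data SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return  # end of stream; anything after it is ignored
        text = json.loads(payload).get("content")
        if text:
            yield text
```

With `urllib.request.urlopen`, the response object can be iterated line by line, so it could be passed to `iter_stream_content` directly.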

Claude example — same JSON shape; use /ai/claude/chat/completions and a Claude model id:

curl -s https://api.koveh.com/ai/claude/chat/completions \
  -H "Content-Type: application/json" \
  -H "X-API-Key: YOUR_API_KEY" \
  -d '{
    "model": "claude-3-5-sonnet-20241022",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

Endpoints

1. Chat completion (per provider)

POST <provider-path>/chat/completions

Examples:

POST https://api.koveh.com/ai/openai/chat/completions
POST https://api.koveh.com/ai/claude/chat/completions
POST https://api.koveh.com/ai/deepseek/chat/completions

Request body (OpenAI-compatible):

{
  "model": "gpt-5.4-mini",
  "messages": [
    {"role": "system", "content": "Optional system prompt"},
    {"role": "user", "content": "Hello!"}
  ],
  "stream": false,
  "max_tokens": 1024,
  "temperature": 0.7
}

Field	Required	Notes
`model`	yes	Provider model id (see tables above)
`messages`	yes	Array of `{role, content}` — `system`, `user`, `assistant`
`stream`	no	Default `false`; `true` for SSE streaming
`max_tokens`	no	Cap on completion tokens
`temperature`	no	Sampling temperature; provider defaults apply if omitted

Additional OpenAI parameters supported by the upstream API (for example response_format, reasoning_effort on GPT-5.4-class models) are forwarded as-is when the provider accepts them.
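
One way to assemble the body on the client is sketched below: required fields are checked, optional fields are left out when unset so provider defaults apply, and any extra keyword arguments are passed through untouched. The helper name and the non-empty `messages` check are our assumptions, not gateway rules.

```python
from typing import Any, Dict, List, Optional

ALLOWED_ROLES = {"system", "user", "assistant"}


def build_payload(
    model: str,
    messages: List[Dict[str, str]],
    stream: bool = False,
    max_tokens: Optional[int] = None,
    temperature: Optional[float] = None,
    **extra: Any,  # e.g. response_format, reasoning_effort
) -> Dict[str, Any]:
    """Build a chat completions request body as a plain dict."""
    if not model:
        raise ValueError("model is required")
    if not messages:
        raise ValueError("messages must be a non-empty list")
    for message in messages:
        if message.get("role") not in ALLOWED_ROLES:
            raise ValueError(f"unsupported role: {message.get('role')!r}")
    payload: Dict[str, Any] = {"model": model, "messages": messages, "stream": stream}
    # Omit optional fields entirely so the provider's defaults are used.
    if max_tokens is not None:
        payload["max_tokens"] = max_tokens
    if temperature is not None:
        payload["temperature"] = temperature
    payload.update(extra)
    return payload
```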

Response. A standard OpenAI-style chat.completion object containing choices, usage (prompt, completion, and total token counts), and the resolved model name.
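
Reading the useful parts out of such a response could look like this. The field names follow the standard OpenAI chat.completion layout (`choices[0].message.content`, `usage.prompt_tokens`, and so on); the helper itself and the flat result shape are illustrative.

```python
def summarize_completion(response: dict) -> dict:
    """Extract the reply text, resolved model, and token counts."""
    first_choice = response["choices"][0]
    usage = response.get("usage", {})
    return {
        "model": response.get("model"),
        "text": first_choice["message"]["content"],
        "prompt_tokens": usage.get("prompt_tokens", 0),
        "completion_tokens": usage.get("completion_tokens", 0),
        "total_tokens": usage.get("total_tokens", 0),
    }
```

Logging the returned `model` alongside the token counts makes it easier to reconcile your own records when the resolved model differs from the id you sent.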

2. Unified completion (legacy)

POST https://api.koveh.com/ai/v1/chat/completions

Proxies to the OpenAI implementation for backward compatibility. Prefer /ai/openai/chat/completions for new code.

Choosing a model

OpenAI (most integrations).

gpt-5.4-nano — classification, extraction, short replies, high QPS.
gpt-5.4-mini — default for chat, tools, and agents; best everyday choice.
gpt-5.4 — multi-step reasoning, difficult coding, or when mini quality is not enough.

Other providers.

DeepSeek — cost-effective general chat (deepseek-chat) and reasoning (deepseek-r1*).
Claude — long instructions and careful prose; Sonnet for balance, Haiku for speed.
Llama (Together) — open-weight models without a direct OpenAI account.
Mistral — European-hosted alternative with an OpenAI-like chat API.

Model pricing and context windows are stored in our api_models catalog and applied when calculating balance deductions.

Cost tracking

Every completed request is logged with provider, model, token counts, duration, and USD cost. Charges are deducted from the balance on your API user account (api_users.usd_available).

Failed requests are logged with status=error and typically incur no token charge.
Request/response payloads are stored for audit (api_request_data, transaction tables).
For balance and access questions, contact studio@koveh.com.
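
To budget ahead of time, the per-request charge can be approximated on the client from the token counts in the response. The sketch below assumes prices are quoted in USD per million tokens with separate input and output rates; that unit, the function name, and the prices in the usage example are hypothetical, and the real figures come from the api_models catalog.

```python
from decimal import Decimal


def estimate_cost_usd(
    prompt_tokens: int,
    completion_tokens: int,
    input_price_per_million: Decimal,   # assumed unit: USD per 1M prompt tokens
    output_price_per_million: Decimal,  # assumed unit: USD per 1M completion tokens
) -> Decimal:
    """Approximate the USD cost of one request from its token usage."""
    million = Decimal(1_000_000)
    total = (
        Decimal(prompt_tokens) * input_price_per_million
        + Decimal(completion_tokens) * output_price_per_million
    )
    return total / million
```

With made-up rates of 1.00 and 4.00 USD per million tokens, a request using 1,000 prompt tokens and 500 completion tokens comes to (1000 × 1.00 + 500 × 4.00) / 1,000,000 = 0.003 USD. `Decimal` is used here to avoid floating-point drift when many small charges are summed.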

Service	When to use
Qwen3 AI API	Self-hosted Qwen3 chat and embeddings on Koveh infrastructure
API Access & Authentication	How to obtain and use API keys
Koveh Services Overview	Full list of platform APIs
