Koveh API

AI Service V1

OpenAI-compatible gateway for GPT-5.4, Claude, Mistral, Llama, and DeepSeek with usage tracking

AI Service V1 is an OpenAI-compatible proxy for major cloud LLM providers. You send chat completions in the standard messages format; we forward the request to the provider, log token usage, and deduct cost from your API balance.

Base URL. https://api.koveh.com/ai/

Provider-specific paths are listed below. The legacy unified path https://api.koveh.com/ai/v1/ still works and routes to the OpenAI implementation.

Authentication

All requests require the X-API-Key header rather than an Authorization: Bearer token. Request an API key via API Access & Authentication.

X-API-Key: YOUR_API_KEY
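
As a minimal sketch, the header can be built in Python as shown here. The helper name build_headers is illustrative and not part of the Koveh API; only the X-API-Key header name comes from this page.

```python
from typing import Dict


def build_headers(api_key: str) -> Dict[str, str]:
    """Return request headers with the key in X-API-Key (no Authorization header)."""
    if not api_key:
        raise ValueError("api_key must be a non-empty string")
    return {
        "Content-Type": "application/json",
        "X-API-Key": api_key,
    }
```

Keeping header construction in one place makes it harder to fall back to the Bearer scheme by habit when porting code written for other OpenAI-compatible endpoints.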

How Koveh uses these models

Across koveh.com and internal automations we default to the GPT-5.4 family on the OpenAI path unless a task needs another provider.

| Use case | Typical model | Why |
|---|---|---|
| High-volume classification, digests, lightweight chat | gpt-5.4-nano | Cheapest GPT-5.4-class tier; enough for routing, tagging, short summaries |
| General product chat, agents, coding helpers | gpt-5.4-mini | Default balance of quality, speed, and cost |
| Complex reasoning, premium bots, hard analysis | gpt-5.4 | Strongest GPT-5.4 tier when latency and cost are secondary |
| Long-context schema / SQL assistance (platform) | gpt-5-mini | Sampled table context with strict row/char caps |
| Mail classification (Mail Processor v2) | gpt-5.4-nano with gpt-5.4-mini fallback | Fast first pass; fallback on low confidence |
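
The nano-then-mini fallback in the last row could look roughly like the sketch below. It is not the Mail Processor v2 implementation: the classify callable, its (label, confidence) return shape, and the 0.8 threshold are all assumptions made for illustration.

```python
from typing import Callable, Tuple

# Hypothetical callable: (model_id, text) -> (label, confidence in [0, 1]).
Classifier = Callable[[str, str], Tuple[str, float]]


def classify_with_fallback(
    classify: Classifier,
    text: str,
    threshold: float = 0.8,          # assumed cut-off, tune for your data
    primary: str = "gpt-5.4-nano",
    fallback: str = "gpt-5.4-mini",
) -> Tuple[str, str]:
    """Return (label, model_used); retry on the fallback model when confidence is low."""
    label, confidence = classify(primary, text)
    if confidence >= threshold:
        return label, primary
    # Low confidence on the cheap tier: ask the larger model once.
    label, _ = classify(fallback, text)
    return label, fallback
```

Passing the classifier in as a callable keeps the routing logic testable without network access.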

Legacy OpenAI ids such as gpt-4o and gpt-4o-mini still work on the wire, but we recommend passing gpt-5.4-* explicitly in new integrations. On koveh.com, older ids are remapped internally (for example gpt-4o → gpt-5.4-mini).

For self-hosted text generation and embeddings on our infrastructure, see Qwen3 AI API — it is a separate service, not part of AI Service V1.


Supported providers

Each provider has its own path prefix. The request body follows the OpenAI Chat Completions shape unless noted.

| Provider | Path | Recommended default | Also common |
|---|---|---|---|
| OpenAI | /ai/openai/ | gpt-5.4-mini | gpt-5.4-nano, gpt-5.4, gpt-4o (legacy) |
| Anthropic | /ai/claude/ | claude-3-5-sonnet-20241022 | claude-3-haiku-20240307 |
| Mistral | /ai/mistral/ | mistral-large-latest | mistral-nemo-instruct-2407 |
| Together (Llama) | /ai/llama/ | meta-llama/Llama-3.1-70b-instruct | meta-llama/Llama-3.3-70B-Instruct |
| DeepSeek | /ai/deepseek/ | deepseek-chat | deepseek-r1, deepseek-r1-0528 |

The gateway does not hard-limit model names: any id accepted by the upstream provider can be sent in the model field. Costs are calculated from actual token usage for that model.
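
One way to keep the provider prefixes in a single place is a small lookup, sketched below. The dictionary keys and the helper name chat_completions_url are illustrative choices; the paths themselves are the ones listed on this page.

```python
BASE_URL = "https://api.koveh.com/ai/"

# Path prefix per provider, as listed in the providers table.
PROVIDER_PREFIXES = {
    "openai": "openai/",
    "claude": "claude/",
    "mistral": "mistral/",
    "llama": "llama/",
    "deepseek": "deepseek/",
}


def chat_completions_url(provider: str) -> str:
    """Build the chat completions URL for a provider key such as 'claude'."""
    try:
        prefix = PROVIDER_PREFIXES[provider]
    except KeyError:
        raise ValueError(f"unknown provider: {provider!r}") from None
    return f"{BASE_URL}{prefix}chat/completions"
```

The lookup only validates the provider; the model id is deliberately left unchecked, since the gateway passes any upstream-accepted id through.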


Quick start

Non-streaming chat (OpenAI path, recommended model):

curl -s https://api.koveh.com/ai/openai/chat/completions \
  -H "Content-Type: application/json" \
  -H "X-API-Key: YOUR_API_KEY" \
  -d '{
    "model": "gpt-5.4-mini",
    "messages": [
      {"role": "system", "content": "You are a concise assistant."},
      {"role": "user", "content": "Summarize what a data warehouse staging layer does in two sentences."}
    ],
    "stream": false
  }'

Streaming. Set "stream": true in the request body. The response is a Server-Sent Events stream: each event is a data: {"content": "..."} chunk, and the stream ends with data: [DONE].
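
A streamed response in that shape could be consumed along the lines of the sketch below. It assumes each event arrives as a single data: line containing either a JSON object with a content field or the literal [DONE]; the function name iter_stream_content is illustrative.

```python
import json
from typing import Iterable, Iterator


def iter_stream_content(lines: Iterable[str]) -> Iterator[str]:
    """Yield the text of each content chunk until the [DONE] sentinel."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return  # end of stream; ignore anything after it
        content = json.loads(payload).get("content")
        if content:
            yield content
```

The function accepts any iterable of decoded text lines, so it can be fed from an HTTP response body or from a recorded transcript when testing.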

Claude example — same JSON shape; use /ai/claude/chat/completions and a Claude model id:

curl -s https://api.koveh.com/ai/claude/chat/completions \
  -H "Content-Type: application/json" \
  -H "X-API-Key: YOUR_API_KEY" \
  -d '{
    "model": "claude-3-5-sonnet-20241022",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
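
The same calls can be made from Python using only the standard library. This is a sketch rather than an official client: build_chat_request and send are illustrative names, and the 60-second timeout is an arbitrary assumption.

```python
import json
import urllib.request


def build_chat_request(url: str, api_key: str, body: dict) -> urllib.request.Request:
    """Prepare a POST request with a JSON body and the X-API-Key header."""
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-API-Key": api_key},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 60.0) -> dict:
    """Send a non-streaming request and decode the JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

For example, build_chat_request("https://api.koveh.com/ai/claude/chat/completions", "YOUR_API_KEY", {"model": "claude-3-5-sonnet-20241022", "messages": [{"role": "user", "content": "Hello!"}]}) prepares the Claude call above, and send(...) executes it.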

Endpoints

1. Chat completion (per provider)

POST <provider-path>/chat/completions

Examples:

  • POST https://api.koveh.com/ai/openai/chat/completions
  • POST https://api.koveh.com/ai/claude/chat/completions
  • POST https://api.koveh.com/ai/deepseek/chat/completions

Request body (OpenAI-compatible):

{
  "model": "gpt-5.4-mini",
  "messages": [
    {"role": "system", "content": "Optional system prompt"},
    {"role": "user", "content": "Hello!"}
  ],
  "stream": false,
  "max_tokens": 1024,
  "temperature": 0.7
}

| Field | Required | Notes |
|---|---|---|
| model | yes | Provider model id (see tables above) |
| messages | yes | Array of {role, content} objects; role is system, user, or assistant |
| stream | no | Default false; true for SSE streaming |
| max_tokens | no | Cap on completion tokens |
| temperature | no | Sampling temperature; provider defaults apply if omitted |

Additional OpenAI parameters supported by the upstream API (for example response_format, reasoning_effort on GPT-5.4-class models) are forwarded as-is when the provider accepts them.
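
A request body with optional and pass-through fields might be assembled as in the following sketch. The helper build_body is illustrative; it omits unset optional fields so provider defaults apply, and it copies any extra keyword arguments into the body unchanged on the assumption that the upstream provider accepts them.

```python
from typing import Any, Dict, List, Optional

ALLOWED_ROLES = {"system", "user", "assistant"}


def build_body(
    model: str,
    messages: List[Dict[str, str]],
    stream: bool = False,
    max_tokens: Optional[int] = None,
    temperature: Optional[float] = None,
    **extra: Any,
) -> Dict[str, Any]:
    """Build a chat completions body, leaving out optional fields that are unset."""
    for message in messages:
        if message.get("role") not in ALLOWED_ROLES:
            raise ValueError(f"unsupported role: {message.get('role')!r}")
    body: Dict[str, Any] = {"model": model, "messages": messages, "stream": stream}
    if max_tokens is not None:
        body["max_tokens"] = max_tokens
    if temperature is not None:
        body["temperature"] = temperature
    body.update(extra)  # e.g. response_format, forwarded as-is
    return body
```

For instance, build_body("gpt-5.4-mini", msgs, response_format={"type": "json_object"}) adds response_format to the body without the helper needing to know about it.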

Response. Standard OpenAI-style chat.completion object with choices, usage (prompt / completion / total tokens), and the resolved model name.
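
Assuming the gateway returns the usual OpenAI field names (choices[0].message.content and usage.prompt_tokens / completion_tokens / total_tokens), the reply text and token counts could be pulled out as sketched here. The helper name summarize_completion is illustrative.

```python
from typing import Any, Dict


def summarize_completion(response: Dict[str, Any]) -> Dict[str, Any]:
    """Extract the first choice's text, the resolved model, and token usage."""
    usage = response.get("usage", {})
    return {
        "model": response.get("model"),
        "text": response["choices"][0]["message"]["content"],
        "prompt_tokens": usage.get("prompt_tokens"),
        "completion_tokens": usage.get("completion_tokens"),
        "total_tokens": usage.get("total_tokens"),
    }
```

Token counts come back as None if the usage block is absent, which is easier to spot in logs than a silent zero.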

2. Unified completion (legacy)

POST https://api.koveh.com/ai/v1/chat/completions

Proxies to the OpenAI implementation for backward compatibility. Prefer /ai/openai/chat/completions for new code.


Choosing a model

OpenAI (most integrations).

  • gpt-5.4-nano — classification, extraction, short replies, high QPS.
  • gpt-5.4-mini — default for chat, tools, and agents; best everyday choice.
  • gpt-5.4 — multi-step reasoning, difficult coding, or when mini quality is not enough.

Other providers.

  • DeepSeek — cost-effective general chat (deepseek-chat) and reasoning (deepseek-r1*).
  • Claude — long instructions and careful prose; Sonnet for balance, Haiku for speed.
  • Llama (Together) — open-weight models without a direct OpenAI account.
  • Mistral — European-hosted alternative with OpenAI-like chat API.

Model pricing and context windows are stored in our api_models catalog and applied when calculating balance deductions.


Cost tracking

Every completed request is logged with provider, model, token counts, duration, and USD cost. Charges are deducted from the balance on your API user account (api_users.usd_available).

  • Failed requests are logged with status=error and typically incur no token charge.
  • Request/response payloads are stored for audit (api_request_data, transaction tables).
  • For balance and access questions, contact studio@koveh.com.
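
To sanity-check a charge on the client side, the cost of one request can be estimated from its token counts. The sketch below assumes separate input and output prices quoted in USD per million tokens; that pricing structure and the example prices are assumptions, not values from the api_models catalog.

```python
from decimal import Decimal

MILLION = Decimal(1_000_000)


def estimate_cost_usd(
    prompt_tokens: int,
    completion_tokens: int,
    input_price_per_million: Decimal,   # hypothetical price, USD per 1M prompt tokens
    output_price_per_million: Decimal,  # hypothetical price, USD per 1M completion tokens
) -> Decimal:
    """Estimate the USD cost of one request from its usage block."""
    if prompt_tokens < 0 or completion_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        Decimal(prompt_tokens) * input_price_per_million
        + Decimal(completion_tokens) * output_price_per_million
    )
    return total / MILLION
```

Decimal is used instead of float so that small per-request amounts do not pick up rounding noise when summed over many requests. The deduction applied to your balance remains the authoritative figure.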

| Service | When to use |
|---|---|
| Qwen3 AI API | Self-hosted Qwen3 chat and embeddings on Koveh infrastructure |
| API Access & Authentication | How to obtain and use API keys |
| Koveh Services Overview | Full list of platform APIs |
