Koveh API

Qwen3 AI API

Qwen3-0.6B LLM with thinking capabilities and semantic embeddings

The Qwen3 API provides access to the Qwen3-0.6B model for text generation and to the specialized Qwen3-Embedding-0.6B model for 1024-dimensional text embeddings.

Base URL

https://api.koveh.com/qwen3/

Authentication

All requests require a bearer token in the Authorization header.

Authorization: Bearer <YOUR_API_KEY>
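A minimal sketch of building the header in Python. The function name auth_headers, the KOVEH_API_KEY environment variable, and the Content-Type header are assumptions made for illustration; only the Authorization format comes from this page.

```python
import os


def auth_headers(api_key: str) -> dict[str, str]:
    """Build request headers carrying the bearer token."""
    if not api_key:
        raise ValueError("API key must not be empty")
    return {
        "Authorization": f"Bearer {api_key}",
        # Assumption: JSON request bodies are sent with this content type.
        "Content-Type": "application/json",
    }


if __name__ == "__main__":
    # KOVEH_API_KEY is a hypothetical variable name chosen for this sketch.
    print(auth_headers(os.environ.get("KOVEH_API_KEY", "demo-key")))
```

Reading the key from the environment keeps it out of source control.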

Endpoints

1. Chat Completion

POST /chat

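Generates a response to a list of messages. Supports long context (up to 32k tokens) and an optional "thinking" mode, controlled by enable_thinking, which returns the model's reasoning separately in thinking_content.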

Request Body:

{
  "messages": [
    {"role": "user", "content": "Explain quantum entanglement like I'm five."}
  ],
  "enable_thinking": true,
  "max_new_tokens": 2048,
  "temperature": 0.6
}

Response:

{
  "thinking_content": "<detailed internal reasoning>",
  "content": "Quantum entanglement is like having a pair of magic socks...",
  "model": "Qwen3-0.6B",
  "session_id": "uuid-string"
}
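The request and response above could be handled from Python roughly as follows. This is a sketch using only the standard library: the helper names (build_chat_request, parse_chat_response, chat) and the 60-second timeout are hypothetical, and treating thinking_content as optional is an assumption, since this page does not say what is returned when thinking is disabled.

```python
import json
import urllib.request

BASE_URL = "https://api.koveh.com/qwen3/"


def build_chat_request(api_key, prompt, enable_thinking=True,
                       max_new_tokens=2048, temperature=0.6):
    """Build (but do not send) a POST /chat request for a single user message."""
    payload = {
        "messages": [{"role": "user", "content": prompt}],
        "enable_thinking": enable_thinking,
        "max_new_tokens": max_new_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        BASE_URL + "chat",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",  # assumption: JSON body
        },
        method="POST",
    )


def parse_chat_response(raw):
    """Return (content, thinking_content) from a raw JSON response body."""
    body = json.loads(raw)
    # Assumption: thinking_content may be missing when thinking is disabled.
    return body["content"], body.get("thinking_content")


def chat(api_key, prompt):
    """Send the request and return the parsed answer (performs a network call)."""
    request = build_chat_request(api_key, prompt)
    with urllib.request.urlopen(request, timeout=60) as response:
        return parse_chat_response(response.read())
```

Keeping request construction and response parsing separate from the network call makes both easy to check without contacting the service.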

2. Create Embeddings

POST /embeddings

Generates one vector embedding for each of the provided texts.

Request Body:

{
  "texts": ["Hello world", "Artificial Intelligence"]
}

Response:

{
  "embeddings": [[...], [...]],
  "model": "Qwen3-Embedding-0.6B",
  "dimension": 1024
}
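A small sketch of preparing the request body and sanity-checking the response in Python. The helper names are hypothetical, and the checks (one vector per input text, each vector as long as the reported dimension) are assumptions inferred from the example above rather than documented guarantees.

```python
import json


def build_embeddings_payload(texts):
    """Encode the request body for POST /embeddings."""
    texts = list(texts)
    if not texts:
        raise ValueError("at least one text is required")
    return json.dumps({"texts": texts}).encode("utf-8")


def parse_embeddings_response(raw, expected_count):
    """Return the list of vectors after checking count and dimension."""
    body = json.loads(raw)
    vectors = body["embeddings"]
    dimension = body["dimension"]
    # Assumption: the service returns exactly one vector per input text.
    if len(vectors) != expected_count:
        raise ValueError(f"expected {expected_count} vectors, got {len(vectors)}")
    # Assumption: every vector has the length reported in "dimension".
    for index, vector in enumerate(vectors):
        if len(vector) != dimension:
            raise ValueError(f"vector {index} has length {len(vector)}, not {dimension}")
    return vectors
```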

3. Find Similar Questions

POST /chat/similar

Finds similar questions in your chat history using vector similarity.
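This page does not document the request body for this endpoint or the similarity metric the service uses. As an illustration of the general idea only, the sketch below ranks stored questions against a query vector by cosine similarity on the client side. Cosine similarity is an assumption here (it is a common choice for embedding search), and the function names and the (question, vector) history layout are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def rank_similar(query_vector, history, top_k=3):
    """Return the top_k (question, score) pairs, most similar first.

    history is a list of (question, vector) pairs.
    """
    scored = [(question, cosine_similarity(query_vector, vector))
              for question, vector in history]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]
```

In practice the vectors would come from the embeddings endpoint; the server-side implementation may differ.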


History & Stats

  • GET /chat/history: Retrieve past conversations.
  • GET /chat/sessions: List active chat sessions.
  • GET /chat/stats: View usage metrics.
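The three read-only endpoints listed above can be addressed with one small helper. This is a sketch: the names READ_ENDPOINTS and build_get_request are hypothetical, and the response formats are not described on this page, so the example stops at building the request.

```python
import urllib.request

BASE_URL = "https://api.koveh.com/qwen3/"

# Short names (chosen for this sketch) mapped to the documented paths.
READ_ENDPOINTS = {
    "history": "chat/history",
    "sessions": "chat/sessions",
    "stats": "chat/stats",
}


def build_get_request(api_key, name):
    """Build (but do not send) an authenticated GET request."""
    if name not in READ_ENDPOINTS:
        raise KeyError(f"unknown endpoint: {name}")
    return urllib.request.Request(
        BASE_URL + READ_ENDPOINTS[name],
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )
```

Passing the returned object to urllib.request.urlopen would perform the call.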
