Qwen3 AI API
Qwen3-0.6B LLM with thinking capabilities and semantic embeddings
The Qwen3 API provides access to the Qwen3-0.6B model for text generation and the specialized Qwen3-Embedding-0.6B model for high-dimensional text embeddings.
Base URL
https://api.koveh.com/qwen3/
Authentication
All requests require a Bearer Token in the Authorization header.
Authorization: Bearer <YOUR_API_KEY>
Endpoints
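Since every endpoint expects the same Authorization header, a small helper can attach it in one place. The following is a minimal sketch using only the Python standard library. The helper name `build_request` is hypothetical, and sending the body as JSON with a `Content-Type: application/json` header is an assumption, since this reference does not state the expected content type.

```python
import json
import urllib.request

BASE_URL = "https://api.koveh.com/qwen3/"


def build_request(path, api_key, payload=None):
    """Return an authenticated Request: POST if a payload is given, else GET.

    Hypothetical helper; the JSON content type is an assumption.
    """
    url = BASE_URL + path.lstrip("/")
    headers = {"Authorization": f"Bearer {api_key}"}
    data = None
    if payload is not None:
        # Serialize the body and declare it as JSON (assumed content type).
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    method = "POST" if payload is not None else "GET"
    return urllib.request.Request(url, data=data, headers=headers, method=method)
```

The returned object can be passed to `urllib.request.urlopen` to send the request. Building it separately makes it possible to inspect the URL and headers without touching the network.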
1. Chat Completion
POST /chat
Generates a response to a list of messages. Supports long context (up to 32k tokens) and an optional "thinking" mode, enabled with the enable_thinking field.
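As a rough sketch, the request and response fields documented for this endpoint could be handled with two small functions. The function names are hypothetical. Treating `thinking_content` as possibly absent when thinking is disabled is an assumption, because this reference only shows a response with thinking enabled.

```python
import json


def build_chat_payload(prompt, enable_thinking=True, max_new_tokens=2048, temperature=0.6):
    """Build a /chat request body with a single user message."""
    return {
        "messages": [{"role": "user", "content": prompt}],
        "enable_thinking": enable_thinking,
        "max_new_tokens": max_new_tokens,
        "temperature": temperature,
    }


def split_chat_response(raw_json):
    """Return (thinking, answer) from a /chat response body.

    Assumption: thinking_content may be missing when thinking is off,
    so it falls back to an empty string instead of raising.
    """
    body = json.loads(raw_json)
    return body.get("thinking_content", ""), body["content"]
```

Keeping the reasoning text separate from the answer makes it easy to log the former while showing only the latter to end users.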
Request Body:
{
  "messages": [
    {"role": "user", "content": "Explain quantum entanglement like I'm five."}
  ],
  "enable_thinking": true,
  "max_new_tokens": 2048,
  "temperature": 0.6
}
Response:
{
  "thinking_content": "<detailed internal reasoning>",
  "content": "Quantum entanglement is like having a pair of magic socks...",
  "model": "Qwen3-0.6B",
  "session_id": "uuid-string"
}
2. Create Embeddings
POST /embeddings
Generates vector embeddings for the provided texts.
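A client will usually want to pair each input text with its vector and confirm that the response is well formed. The sketch below does that against the response shape documented for this endpoint. The function name is hypothetical, and it assumes that embeddings come back in the same order as the submitted texts, which this reference does not state explicitly.

```python
def pair_texts_with_embeddings(texts, response):
    """Return a list of (text, vector) pairs after basic sanity checks.

    Assumption: response["embeddings"][i] corresponds to texts[i].
    """
    vectors = response["embeddings"]
    if len(vectors) != len(texts):
        raise ValueError(f"expected {len(texts)} embeddings, got {len(vectors)}")
    dimension = response["dimension"]
    for index, vector in enumerate(vectors):
        # Every vector should have the length reported in "dimension".
        if len(vector) != dimension:
            raise ValueError(f"embedding {index} has length {len(vector)}, expected {dimension}")
    return list(zip(texts, vectors))
```

A list of pairs is returned rather than a dictionary so that duplicate input texts are preserved.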
Request Body:
{
  "texts": ["Hello world", "Artificial Intelligence"]
}
Response:
{
  "embeddings": [[...], [...]],
  "model": "Qwen3-Embedding-0.6B",
  "dimension": 1024
}
3. Semantic Search
POST /chat/similar
Finds similar questions in your chat history using vector similarity.
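To illustrate what ranking by vector similarity means, here is a small local sketch that scores stored questions against a query vector. It is not the service's implementation: the metric used by the API is not documented here, and cosine similarity is assumed only because it is a common choice for embedding search. The function names and the toy two-dimensional vectors in the usage note are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vector, history, top_k=3):
    """Return the top_k (score, text) pairs from history, highest score first.

    history is a list of (text, vector) pairs.
    """
    scored = [(cosine_similarity(query_vector, vector), text) for text, vector in history]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]
```

For example, with the query vector `[1, 0]` and a history of `("a", [1, 0])`, `("b", [0, 1])`, and `("c", [1, 1])`, the ranking places "a" first, "c" second, and "b" last.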
History & Stats
GET /chat/history: Retrieve past conversations.
GET /chat/sessions: List active chat sessions.
GET /chat/stats: View usage metrics.
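The three read-only endpoints take no request body, so a single lookup table is enough to address them. The sketch below prepares the GET requests with the standard library. The names `READ_ENDPOINTS` and `build_read_request` are hypothetical, and the shape of each response is not documented in this reference, so parsing is left to the caller.

```python
import urllib.request

BASE_URL = "https://api.koveh.com/qwen3/"

# Short names mapped to the documented GET paths.
READ_ENDPOINTS = {
    "history": "chat/history",
    "sessions": "chat/sessions",
    "stats": "chat/stats",
}


def build_read_request(name, api_key):
    """Return an authenticated GET Request for one of the read-only endpoints."""
    if name not in READ_ENDPOINTS:
        raise KeyError(f"unknown endpoint name: {name!r}")
    url = BASE_URL + READ_ENDPOINTS[name]
    headers = {"Authorization": f"Bearer {api_key}"}
    return urllib.request.Request(url, headers=headers, method="GET")
```

Passing the result to `urllib.request.urlopen` sends the request; nothing is sent when the request object is merely constructed.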