API Reference

KoboiLLM API is a proxy server compatible with the OpenAI API format. It gives access to 100+ LLM models through the same set of endpoints.

Base URL: https://lite.koboillm.com

Authentication: Send your API key in one of two headers, either Authorization: Bearer YOUR_API_KEY or x-litellm-api-key: YOUR_API_KEY
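The two header forms can be sketched as a small helper. The function name build_auth_headers and its use_litellm_header flag are hypothetical and only illustrate the choice between the two headers described here.

```python
def build_auth_headers(api_key: str, use_litellm_header: bool = False) -> dict[str, str]:
    """Return JSON request headers carrying the API key in one of the two documented forms."""
    if not api_key:
        raise ValueError("api_key must be a non-empty string")
    headers = {"Content-Type": "application/json"}
    if use_litellm_header:
        # Alternative form: the key is sent as-is, without the "Bearer " prefix.
        headers["x-litellm-api-key"] = api_key
    else:
        # Default form: standard bearer token.
        headers["Authorization"] = f"Bearer {api_key}"
    return headers
```

Only one of the two headers is set per request in this sketch, which keeps it clear which credential the server sees.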


Chat Completions

Endpoint for conversations with an AI model.

POST /v1/chat/completions

Creates a chat completion request following the OpenAI API format.

Request Body:

Parameter          Type           Required  Description
model              string         Yes       Model ID (e.g. openai/gpt-4o-mini, anthropic/claude-4-5-sonnet)
messages           array          Yes       Array of conversation messages
temperature        number         No        Sampling temperature (0-2), default 1
max_tokens         integer        No        Maximum number of tokens in the response
stream             boolean        No        Enable streaming response
top_p              number         No        Nucleus sampling parameter
frequency_penalty  number         No        Frequency penalty (-2.0 to 2.0)
presence_penalty   number         No        Presence penalty (-2.0 to 2.0)
stop               string/array   No        Sequence(s) that stop generation
tools              array          No        List of tools for function calling
tool_choice        string/object  No        Controls tool selection
response_format    object         No        Output format (e.g. { "type": "json_object" })
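As a rough illustration of the parameters listed above, the following sketch assembles a request body and checks the documented numeric ranges on the client side. The helper name build_chat_body is hypothetical, and whether the server enforces the same ranges is an assumption.

```python
from typing import Any

# Numeric ranges taken from the parameter list above.
_RANGES = {
    "temperature": (0.0, 2.0),
    "frequency_penalty": (-2.0, 2.0),
    "presence_penalty": (-2.0, 2.0),
}


def build_chat_body(model: str, messages: list[dict[str, Any]], **options: Any) -> dict[str, Any]:
    """Build a /v1/chat/completions body, omitting unset optional parameters."""
    if not model:
        raise ValueError("model is required")
    if not messages:
        raise ValueError("messages must contain at least one message")
    body: dict[str, Any] = {"model": model, "messages": messages}
    for name, value in options.items():
        if value is None:
            continue  # leave the parameter out so the server default applies
        if name in _RANGES:
            low, high = _RANGES[name]
            if not low <= value <= high:
                raise ValueError(f"{name} must be between {low} and {high}")
        body[name] = value
    return body
```

Leaving out parameters that are None, rather than sending null, avoids overriding server-side defaults such as temperature 1.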

Example Request:

curl -X POST https://lite.koboillm.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "openai/gpt-4o-mini",
    "messages": [
      {"role": "user", "content": "Hello, how are you?"}
    ]
  }'

Example Response:

{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "openai/gpt-4o-mini",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "Hello! I'm doing well, thank you for asking!"
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 15,
    "total_tokens": 25
  }
}
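A minimal sketch of reading the fields from a response shaped like the example above. The helper name extract_reply is hypothetical, and the sketch assumes a non-streaming response with at least one choice.

```python
from typing import Any


def extract_reply(response: dict[str, Any]) -> tuple[str, str, int]:
    """Return (assistant text, finish_reason, total_tokens) from a chat completion response."""
    choice = response["choices"][0]  # the example response carries a single choice at index 0
    text = choice["message"]["content"]
    finish_reason = choice["finish_reason"]
    total_tokens = response["usage"]["total_tokens"]
    return text, finish_reason, total_tokens
```

Checking finish_reason alongside the text helps distinguish a completed answer ("stop") from other outcomes before the text is used.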

Completions (Legacy)

POST /v1/completions

Legacy endpoint for text completion.

Request Body:

Parameter    Type          Required  Description
model        string        Yes       Model ID
prompt       string/array  Yes       Prompt text
max_tokens   integer       No        Maximum number of tokens
temperature  number        No        Sampling temperature
stream       boolean       No        Enable streaming

Embeddings

POST /v1/embeddings

Creates vector embeddings from text.

Request Body:

Parameter  Type          Required  Description
model      string        Yes       Embedding model (e.g. openai/text-embedding-ada-002)
input      string/array  Yes       Text to embed

Example Request:

curl -X POST https://lite.koboillm.com/v1/embeddings \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "openai/text-embedding-ada-002",
    "input": "Hello world"
  }'
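The same request can be sketched with the Python standard library only. The helper names make_embeddings_request and send are hypothetical. Because the response shape for this endpoint is not shown here, send simply returns the parsed JSON.

```python
import json
import urllib.request
from typing import Any, Union

BASE_URL = "https://lite.koboillm.com"


def make_embeddings_request(api_key: str, model: str, text: Union[str, list[str]]) -> urllib.request.Request:
    """Build (but do not send) a POST /v1/embeddings request."""
    body = json.dumps({"model": model, "input": text}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/v1/embeddings",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> Any:
    """Send the request and return the decoded JSON body."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)
```

Splitting construction from sending makes the request easy to inspect or log before any network call is made.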

Models

GET /v1/models

Returns the list of available models.

Response:

{
  "object": "list",
  "data": [
    {
      "id": "openai/gpt-4o-mini",
      "object": "model",
      "created": 1677652288,
      "owned_by": "openai"
    }
  ]
}
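A short sketch of pulling model IDs out of a response shaped like the one above, for example to check that a model exists before calling it. The helper name list_model_ids is hypothetical.

```python
from typing import Any, Optional


def list_model_ids(models_response: dict[str, Any], owned_by: Optional[str] = None) -> list[str]:
    """Return model IDs from a /v1/models response, optionally filtered by owner."""
    return [
        model["id"]
        for model in models_response.get("data", [])
        if owned_by is None or model.get("owned_by") == owned_by
    ]
```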

GET /v1/models/{model_id}

Returns information about a specific model.


Audio

Text-to-Speech

POST /v1/audio/speech

Converts text to audio.

Parameter  Type    Required  Description
model      string  Yes       TTS model (e.g. openai/tts-1)
input      string  Yes       Text to convert to audio
voice      string  Yes       Voice (e.g. alloy, echo, fable)
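A minimal sketch of building this request with the standard library, checking that all three required parameters are present. The helper name make_speech_request is hypothetical. The format of the returned audio is not described here, so the sketch stops at constructing the request.

```python
import json
import urllib.request


def make_speech_request(
    api_key: str,
    model: str,
    text: str,
    voice: str,
    base_url: str = "https://lite.koboillm.com",
) -> urllib.request.Request:
    """Build (but do not send) a POST /v1/audio/speech request."""
    # model, input, and voice are all marked as required for this endpoint.
    for name, value in (("model", model), ("input", text), ("voice", voice)):
        if not value:
            raise ValueError(f"{name} is required")
    body = json.dumps({"model": model, "input": text, "voice": voice}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/v1/audio/speech",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
```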

Speech-to-Text

POST /v1/audio/transcriptions

Converts audio to text.

Parameter  Type    Required  Description
file       file    Yes       Audio file
model      string  Yes       Transcription model (e.g. openai/whisper-1)

Moderations

POST /v1/moderations

Checks content for policy compliance.

Request Body:

Parameter  Type          Required  Description
input      string/array  Yes       Text to moderate
model      string        No        Moderation model

Assistants API

GET /v1/assistants

Returns the list of assistants.

POST /v1/assistants

Creates a new assistant.

DELETE /v1/assistants/{assistant_id}

Deletes an assistant.


Threads

POST /v1/threads

Creates a new thread for a conversation.


Images

POST /v1/images/generations

Creates an image from a text prompt.

Parameter  Type     Required  Description
prompt     string   Yes       Image description
model      string   No        Model (e.g. openai/dall-e-3)
n          integer  No        Number of images (default: 1)
size       string   No        Size (e.g. 1024x1024)
quality    string   No        Quality (standard or hd)
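The parameters above can be assembled into a request body roughly as follows. The helper name build_image_body is hypothetical, and the client-side checks only mirror the values listed here; the server may accept other values depending on the model.

```python
from typing import Any, Optional


def build_image_body(
    prompt: str,
    model: Optional[str] = None,
    n: int = 1,
    size: Optional[str] = None,
    quality: Optional[str] = None,
) -> dict[str, Any]:
    """Build a /v1/images/generations body, omitting unset optional parameters."""
    if not prompt:
        raise ValueError("prompt is required")
    if n < 1:
        raise ValueError("n must be at least 1")
    if quality is not None and quality not in ("standard", "hd"):
        raise ValueError("quality must be 'standard' or 'hd'")
    body: dict[str, Any] = {"prompt": prompt, "n": n}
    for name, value in (("model", model), ("size", size), ("quality", quality)):
        if value is not None:
            body[name] = value
    return body
```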

Batch Operations

GET /v1/batches

Returns the list of batch jobs.

POST /v1/batches

Creates a new batch job.

GET /v1/batches/{batch_id}

Returns the status of a batch job.


Fine-tuning

GET /v1/fine_tuning/jobs

Returns the list of fine-tuning jobs.

POST /v1/fine_tuning/jobs

Creates a new fine-tuning job.


Vector Stores

POST /v1/vector_stores

Creates a new vector store.

GET /v1/vector_stores

Returns the list of vector stores.

GET /v1/vector_stores/{vector_store_id}

Returns information about a vector store.

DELETE /v1/vector_stores/{vector_store_id}

Deletes a vector store.


Key Management

GET /key/info

Returns information about the current API key.

POST /key/generate

Creates a new API key (requires admin access).

POST /key/delete

Deletes an API key.


User Management

GET /user/info

Returns information about a user.

POST /user/new

Creates a new user.


Team Management

GET /team/list

Returns the list of teams.

POST /team/new

Creates a new team.

POST /team/update

Updates team settings.


Usage & Spend

GET /spend/keys

Returns spending per API key.

GET /spend/users

Returns spending per user.

GET /spend/tags

Returns spending per tag.


Health & Status

GET /health

Health check endpoint.

GET /health/liveliness

Liveliness probe.

GET /health/readiness

Readiness probe.


Error Responses

The API returns errors in a standard format:

{
  "error": {
    "message": "Error description",
    "type": "invalid_request_error",
    "code": "invalid_api_key"
  }
}
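One way to surface this error format in client code is to wrap it in an exception. The names ApiError and parse_error are hypothetical, and the fallback for bodies that are not in the standard format is an assumption about how a client might want to behave.

```python
import json
from typing import Optional


class ApiError(Exception):
    """Carries the HTTP status plus the fields of the standard error object."""

    def __init__(self, status: int, message: str, error_type: Optional[str] = None, code: Optional[str] = None):
        super().__init__(f"{status} {error_type or 'error'}: {message}")
        self.status = status
        self.message = message
        self.error_type = error_type
        self.code = code


def parse_error(status: int, body: str) -> ApiError:
    """Turn an error response body into an ApiError, tolerating non-JSON bodies."""
    try:
        err = json.loads(body).get("error", {})
    except (ValueError, AttributeError):
        err = {}  # body was not JSON, or not a JSON object
    if not isinstance(err, dict):
        err = {"message": str(err)}
    return ApiError(
        status,
        err.get("message") or body,  # fall back to the raw body when no message is present
        err.get("type"),
        err.get("code"),
    )
```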

HTTP Status Codes:

Code  Description
400   Bad Request - Invalid parameters
401   Authentication Error - Invalid API key
403   Permission Denied - Insufficient permissions
404   Not Found - Resource doesn't exist
408   Timeout - Request timed out
422   Unprocessable Entity - Validation error
429   Rate Limit Error - Too many requests
500   Internal Server Error
503   API Connection Error
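A common client-side pattern for these status codes is to retry transient failures with exponential backoff. The sketch below treats 408, 429, 500, and 503 as retryable, which is an assumption and not a documented policy. The name call_with_retries is hypothetical, and send stands for any callable that performs one request and returns (status, body).

```python
import time
from typing import Callable, Tuple

# Assumption: these codes indicate transient conditions worth retrying.
RETRYABLE = {408, 429, 500, 503}


def call_with_retries(
    send: Callable[[], Tuple[int, str]],
    max_retries: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call send() until it returns a non-retryable status or retries run out."""
    attempt = 0
    while True:
        status, body = send()
        if status not in RETRYABLE or attempt >= max_retries:
            return status, body
        # Delay doubles on each attempt: base_delay, 2 * base_delay, 4 * base_delay, ...
        sleep(base_delay * (2 ** attempt))
        attempt += 1
```

Client errors such as 400, 401, 403, 404, and 422 are returned immediately, since repeating the same request would not change the outcome.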

Rate Limits

Rate limits are applied per API key and are configurable:

  • tpm_limit: Tokens per minute
  • rpm_limit: Requests per minute
  • max_parallel_requests: Concurrent requests

Response headers report the current rate limit status:

x-ratelimit-limit-requests: 1000
x-ratelimit-remaining-requests: 999
x-ratelimit-limit-tokens: 100000
x-ratelimit-remaining-tokens: 99999
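These headers can be read into integers so a client can slow down before hitting a limit. The helper name parse_rate_limit and the keys of the returned dictionary are hypothetical.

```python
from typing import Mapping


def parse_rate_limit(headers: Mapping[str, str]) -> dict[str, int]:
    """Extract the x-ratelimit-* headers into a dict such as {"remaining_requests": 999}."""
    # HTTP header names are case-insensitive, so normalise before looking them up.
    lowered = {name.lower(): value for name, value in headers.items()}
    result: dict[str, int] = {}
    for kind in ("requests", "tokens"):
        for field in ("limit", "remaining"):
            raw = lowered.get(f"x-ratelimit-{field}-{kind}")
            if raw is not None:
                result[f"{field}_{kind}"] = int(raw)
    return result
```

Headers that are absent are simply left out of the result, so callers can tell "not reported" apart from a value of zero.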

SDK & Libraries

Because the API is compatible with the OpenAI format, you can use the official OpenAI libraries:

Python:

from openai import OpenAI
 
client = OpenAI(
    base_url="https://lite.koboillm.com/v1",
    api_key="YOUR_API_KEY"
)
 
response = client.chat.completions.create(
    model="openai/gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello!"}]
)

JavaScript/Node.js:

import OpenAI from 'openai';
 
const client = new OpenAI({
  baseURL: 'https://lite.koboillm.com/v1',
  apiKey: 'YOUR_API_KEY'
});
 
const response = await client.chat.completions.create({
  model: 'openai/gpt-4o-mini',
  messages: [{ role: 'user', content: 'Hello!' }]
});

See Also