Quick Start¶
Document version: 1.1.0 | Last updated: 2026-05-21 | Applies to: v1.8.0+ | Git commit: 61384b4a | Author: Lincoln
This guide will help you make your first API call to JAiRouter and understand the basic concepts.
🎯 Learning Objectives¶
After completing this guide, you will be able to:
- ✅ Start JAiRouter service
- ✅ Configure your first AI model service
- ✅ Send API requests and receive responses
- ✅ Experience load balancing and rate limiting
- ✅ Use the key generation tool for secure configuration
Prerequisites¶
- JAiRouter is installed and running (see Installation Guide)
- At least one AI model service is configured and accessible
🗝️ Step 0: Generate Secure Keys (v1.8.0+ Recommended)¶
v1.8.0+ provides a key generation tool that automatically generates secure JWT keys and admin passwords.
Option 1: Use Docker to Run Key Generation Tool (Recommended)¶
# Generate JWT key (Base64 encoded)
docker run --rm sodlinken/jairouter:latest java -jar /app/modelrouter.jar --generate-key
# Generate admin password
docker run --rm sodlinken/jairouter:latest java -jar /app/modelrouter.jar --generate-password
Option 2: Use System Commands (No Docker Required)¶
# Generate Base64 encoded JWT key (at least 32 bytes)
openssl rand -base64 32
# Generate random password (16 characters, alphanumeric)
openssl rand -base64 24 | tr -dc 'A-Za-z0-9' | head -c 16
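If neither Docker nor `openssl` is available, the same two values can be produced with Python's standard library. This is a minimal sketch that mirrors the commands above (32 random bytes, Base64 encoded, and a 16-character alphanumeric password); the function names are illustrative and are not part of JAiRouter.

```python
import base64
import secrets
import string


def generate_jwt_secret(num_bytes: int = 32) -> str:
    """Return num_bytes of cryptographically random data, Base64 encoded."""
    return base64.b64encode(secrets.token_bytes(num_bytes)).decode("ascii")


def generate_password(length: int = 16) -> str:
    """Return a random alphanumeric password of the given length."""
    alphabet = string.ascii_letters + string.digits
    return "".join(secrets.choice(alphabet) for _ in range(length))


if __name__ == "__main__":
    # Print the values in a form that is easy to copy into environment variables.
    print("JWT_SECRET=" + generate_jwt_secret())
    print("INITIAL_ADMIN_PASSWORD=" + generate_password())
```

The `secrets` module is used instead of `random` because it draws from the operating system's cryptographic random source.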
Set Environment Variables¶
# Set JWT key (use the generated key)
export JWT_SECRET="your-base64-encoded-secret"
# Set admin password (use the generated password; single quotes keep characters such as ! and # literal)
export INITIAL_ADMIN_PASSWORD='your-generated-password'
💡 Tip: v1.8.0+ automatically checks key strength on startup to ensure production security.
Your First API Call¶
1. Chat Completion¶
Make a chat completion request using the OpenAI-compatible API:
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen2.5:7b",
    "messages": [
      {
        "role": "user",
        "content": "Hello, how are you?"
      }
    ],
    "max_tokens": 100
  }'
2. Text Embeddings¶
Generate text embeddings:
curl -X POST http://localhost:8080/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{
    "model": "text-embedding-ada-002",
    "input": "Hello world"
  }'
3. Text-to-Speech¶
Generate speech from text:
curl -X POST http://localhost:8080/v1/audio/speech \
  -H "Content-Type: application/json" \
  -d '{
    "model": "tts-1",
    "input": "Hello, this is a test.",
    "voice": "alloy"
  }' \
  --output speech.mp3
Understanding the Response¶
A typical chat completion response looks like:
{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "qwen2.5:7b",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! I'm doing well, thank you for asking. How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 9,
    "completion_tokens": 20,
    "total_tokens": 29
  }
}
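The fields most clients care about are the assistant's text under `choices[0].message.content` and the token counts under `usage`. The sketch below reads both from an abridged copy of the sample response; the helper names `extract_reply` and `token_usage` are illustrative only.

```python
import json

# Abridged copy of the sample chat completion response.
SAMPLE_RESPONSE = '''
{
  "id": "chatcmpl-123",
  "model": "qwen2.5:7b",
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "Hello! I'm doing well, thank you for asking. How can I help you today?"},
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 9, "completion_tokens": 20, "total_tokens": 29}
}
'''


def extract_reply(response: dict) -> str:
    """Return the text of the first choice in a chat completion response."""
    return response["choices"][0]["message"]["content"]


def token_usage(response: dict) -> tuple:
    """Return (prompt_tokens, completion_tokens, total_tokens)."""
    usage = response["usage"]
    return usage["prompt_tokens"], usage["completion_tokens"], usage["total_tokens"]


parsed = json.loads(SAMPLE_RESPONSE)
print(extract_reply(parsed))
print(token_usage(parsed))
```

A `finish_reason` of `stop` means the model ended the reply on its own rather than being cut off by `max_tokens`.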
Load Balancing in Action¶
If you have multiple instances configured, JAiRouter will automatically distribute requests:
model:
  services:
    chat:
      load-balance:
        type: round-robin
      instances:
        - name: "qwen2.5:7b"
          baseUrl: "http://server1:11434"
          weight: 2
        - name: "qwen2.5:7b"
          baseUrl: "http://server2:11434"
          weight: 1
With this configuration, each instance's share of traffic is its weight divided by the total weight (2 + 1 = 3):
- Server1 will receive ~67% of requests (weight 2)
- Server2 will receive ~33% of requests (weight 1)
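To see where the 67% / 33% split comes from, the sketch below simulates a simple weighted rotation over the two instances. It is an illustration of the arithmetic only, not JAiRouter's actual load balancer, and the function name `weighted_cycle` is hypothetical.

```python
from collections import Counter
from itertools import cycle


def weighted_cycle(instances):
    """Yield base URLs forever, repeating each one `weight` times per round."""
    expanded = [url for url, weight in instances for _ in range(weight)]
    return cycle(expanded)


# (baseUrl, weight) pairs taken from the configuration above.
instances = [
    ("http://server1:11434", 2),
    ("http://server2:11434", 1),
]

picker = weighted_cycle(instances)
counts = Counter(next(picker) for _ in range(300))

for url, hits in counts.items():
    print(f"{url}: {hits} requests ({hits / 300:.0%})")
```

Over 300 simulated requests, server1 is chosen 200 times and server2 100 times, which matches the 2:1 weights.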
Monitoring Your Requests¶
1. Check Service Health¶
2. View Metrics¶
3. Check Instance Status¶
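The three checks above can be scripted with a small client. This is a sketch under stated assumptions: `/actuator/health` and `/actuator/metrics` are the conventional Spring Boot Actuator paths, and `/api/config/instance/type/chat` is a hypothetical instance-status path. None of the three is confirmed by this guide, so verify them against the API Reference for your version.

```python
import json
import urllib.request

# ASSUMED paths, not confirmed by this guide. Adjust to match your deployment.
ASSUMED_PATHS = {
    "health": "/actuator/health",
    "metrics": "/actuator/metrics",
    "instances": "/api/config/instance/type/chat",
}


def monitoring_urls(base_url: str = "http://localhost:8080") -> dict:
    """Build the full URL for each monitoring check."""
    base = base_url.rstrip("/")
    return {name: base + path for name, path in ASSUMED_PATHS.items()}


def fetch_json(url: str, timeout: float = 5.0) -> dict:
    """GET a URL and decode the JSON body."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    for name, url in monitoring_urls().items():
        try:
            print(name, fetch_json(url))
        except OSError as exc:  # covers connection errors and HTTP errors
            print(name, "unavailable:", exc)
```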
Rate Limiting¶
JAiRouter includes built-in rate limiting. If you exceed the configured limits, you'll receive a 429 Too Many Requests response:
{
  "error": {
    "message": "Rate limit exceeded",
    "type": "rate_limit_exceeded",
    "code": "rate_limit_exceeded"
  }
}
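A common client-side response to a 429 is to retry with exponential backoff. The sketch below is one way to do that; `call_with_backoff` and its parameters are illustrative, and `send` stands for any function that performs the request and returns `(status_code, body)`.

```python
import time
from typing import Callable, Tuple


def call_with_backoff(
    send: Callable[[], Tuple[int, dict]],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, dict]:
    """Call send() until it returns a non-429 status or attempts run out.

    The wait doubles after each 429: base_delay, 2 * base_delay, and so on.
    """
    status, body = send()
    for attempt in range(1, max_attempts):
        if status != 429:
            break
        sleep(base_delay * (2 ** (attempt - 1)))
        status, body = send()
    return status, body
```

Passing `sleep` as a parameter keeps the function easy to test without real delays. If the server sends a `Retry-After` header, honoring it is usually preferable to a fixed schedule.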
Error Handling¶
JAiRouter provides consistent error responses:
Service Unavailable (503)¶
{
  "error": {
    "message": "All service instances are unavailable",
    "type": "service_unavailable",
    "code": "no_available_instances"
  }
}
Circuit Breaker Open (503)¶
{
  "error": {
    "message": "Circuit breaker is open",
    "type": "circuit_breaker_open",
    "code": "circuit_breaker_open"
  }
}
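Because both 503 cases share a status code, a client has to look at `error.code` to tell them apart. The sketch below maps the error codes shown in this guide to a suggested client action; the action labels are hypothetical and exist only for illustration.

```python
def classify_error(status: int, body: dict) -> str:
    """Map a JAiRouter error response to a suggested client action."""
    code = body.get("error", {}).get("code", "")
    if status == 429 or code == "rate_limit_exceeded":
        return "slow_down"          # back off and retry later
    if code == "circuit_breaker_open":
        return "wait_for_recovery"  # the router is shielding a failing backend
    if code == "no_available_instances":
        return "check_backends"     # no configured instance is reachable
    return "unknown"


example = {
    "error": {
        "message": "Circuit breaker is open",
        "type": "circuit_breaker_open",
        "code": "circuit_breaker_open",
    }
}
print(classify_error(503, example))
```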
Using with OpenAI SDK¶
JAiRouter is compatible with OpenAI SDKs. Simply change the base URL:
Python¶
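from openai import OpenAI

# Point the SDK at JAiRouter instead of the default OpenAI endpoint
client = OpenAI(
    base_url="http://localhost:8080/v1",
    api_key="not-needed"  # JAiRouter doesn't require API keys by default
)

response = client.chat.completions.create(
    model="qwen2.5:7b",
    messages=[
        {"role": "user", "content": "Hello!"}
    ]
)

# Print the assistant's reply
print(response.choices[0].message.content)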
Node.js¶
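import OpenAI from 'openai';

// Point the SDK at JAiRouter instead of the default OpenAI endpoint
const openai = new OpenAI({
  baseURL: 'http://localhost:8080/v1',
  apiKey: 'not-needed'
});

const response = await openai.chat.completions.create({
  model: 'qwen2.5:7b',
  messages: [{ role: 'user', content: 'Hello!' }],
});

// Print the assistant's reply
console.log(response.choices[0].message.content);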
Next Steps¶
Now that you've made your first API calls, learn more about:
- Configuration - Detailed configuration options
- API Reference - Complete API documentation
- Deployment - Production deployment guides
Common Use Cases¶
1. A/B Testing Models¶
Configure multiple models and use weights to split traffic:
model:
  services:
    chat:
      instances:
        - name: "model-a"
          baseUrl: "http://server1:11434"
          weight: 1
        - name: "model-b"
          baseUrl: "http://server2:11434"
          weight: 1
2. Fallback Strategy¶
Enable the circuit breaker and return a default fallback message when the service keeps failing:
model:
  services:
    chat:
      circuit-breaker:
        enabled: true
        failure-threshold: 5
      fallback:
        type: default
        message: "Service temporarily unavailable"
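The interaction between `failure-threshold` and the fallback message can be illustrated with a small circuit breaker. This is a simplified sketch of the general pattern, not JAiRouter's implementation: it assumes consecutive failures are counted, and it omits the half-open recovery state that production circuit breakers normally include.

```python
class CircuitBreaker:
    """Open after `failure_threshold` consecutive failures, then serve the fallback."""

    def __init__(self, failure_threshold=5,
                 fallback_message="Service temporarily unavailable"):
        self.failure_threshold = failure_threshold
        self.fallback_message = fallback_message
        self.failures = 0

    @property
    def is_open(self):
        return self.failures >= self.failure_threshold

    def call(self, func):
        # While open, skip the backend entirely and answer with the fallback.
        if self.is_open:
            return self.fallback_message
        try:
            result = func()
        except Exception:
            self.failures += 1
            return self.fallback_message
        self.failures = 0  # a success resets the consecutive-failure count
        return result
```

With a threshold of 5, the first five failing calls still reach the backend; from the sixth call on, the breaker answers immediately with the fallback message.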
3. Geographic Distribution¶
Route requests to the nearest server based on IP: