Frequently Asked Questions (FAQ)

Document version: 1.0.0
Last updated: 2025-08-19
Git commit: c1aa5b0f
Author: Lincoln

This document collects the questions users ask most often about JAiRouter, together with their answers.

Basic Questions

Q1: What is JAiRouter?

A: JAiRouter is a routing and load-balancing gateway for AI model services, built on Spring Boot. It exposes a single OpenAI-compatible API in front of multiple backend AI services, including GPUStack, Ollama, VLLM, and OpenAI. Key features include load balancing, rate limiting, circuit breaking, health checks, and dynamic configuration management.

Q2: Which AI services does JAiRouter support?

A: Currently supported AI services:

- GPUStack: GPU cluster management platform
- Ollama: Local large language model runtime
- VLLM: High-performance LLM inference engine
- Xinference: Model inference service framework
- LocalAI: Local AI model service
- OpenAI: Official OpenAI API

Support for Anthropic Claude, Google Gemini, Cohere, and other services is planned.

Q3: What distinguishes JAiRouter from other API gateways?

A: JAiRouter is designed specifically for AI model services:

- AI-Specific: Optimized for the particular needs of AI model services
- OpenAI Compatible: Exposes the standard OpenAI API format
- Intelligent Routing: Supports multiple load balancing strategies
- Model-Aware: Takes the characteristics and capabilities of different AI models into account
- Cost Optimization: Supports cost-based routing strategies
- Easy to Use: Simple configuration aimed at AI developers

Q4: Is JAiRouter free?

A: Yes, JAiRouter is a completely open-source and free project under the MIT License. You can freely use, modify, and distribute it. We are also considering providing enterprise-level commercial support services.

Installation and Deployment

Q5: What are the system requirements for JAiRouter?

A: System requirements:

- Java: Version 17 or higher
- Memory: 512MB minimum, 2GB or more recommended
- CPU: 1 core minimum, 2 or more recommended
- Storage: At least 1GB of available space
- Operating System: Linux, Windows, or macOS

Q6: How do I get started quickly with JAiRouter?

A: The simplest way is to use Docker:

# Pull the image
docker pull sodlinken/jairouter:latest

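# Start the service (the container name is the one used by the docker logs and docker restart commands in this FAQ)
docker run -d --name jairouter -p 8080:8080 \
  -v "$(pwd)/config:/app/config" \
  sodlinken/jairouter:latest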

Or build from source:

# Clone the project
git clone https://github.com/Lincoln-cn/JAiRouter.git
cd JAiRouter

# Build and run
./mvnw clean package
java -jar target/model-router-*.jar

Q7: How do I configure backend AI services?

A: Define the service instances in application.yml:

model:
  services:
    chat:
      instances:
        - name: "ollama-llama2"
          baseUrl: "http://localhost:11434"
          path: "/v1/chat/completions"
          weight: 1
        - name: "openai-gpt4"
          baseUrl: "https://api.openai.com"
          path: "/v1/chat/completions"
          weight: 2

Q8: What deployment methods are supported?

A: JAiRouter supports several deployment methods:

- Direct Run: Run the Java JAR package directly
- Docker Container: Single-container deployment
- Docker Compose: Multi-container orchestration
- Kubernetes: Cloud-native deployment (planned)
- Cloud Platforms: Deployment on AWS, Azure, GCP, and other cloud platforms

Feature Usage

Q9: How do I configure load balancing strategies?

A: Specify the load balancing type in the configuration file:

model:
  services:
    chat:
      load-balance:
        type: round-robin  # Options: random, round-robin, least-connections, ip-hash

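Characteristics of each strategy:

- random: Picks an instance at random; simple and efficient
- round-robin: Cycles through the instances in order, spreading requests evenly
- least-connections: Picks the instance with the fewest active connections; suited to long-lived connections
- ip-hash: Hashes the client IP so that the same client keeps reaching the same instance (session persistence)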
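
To make the difference between the four strategies concrete, here is a minimal Python sketch of how each one might pick an instance. It is illustrative only and is not JAiRouter's implementation; the `Instance` and `Balancer` names and the `active` counter are assumptions made for this example.

```python
import hashlib
import itertools
import random
from dataclasses import dataclass


@dataclass
class Instance:
    name: str
    active: int = 0  # in-flight requests; hypothetical counter kept by the caller


class Balancer:
    """Illustrative selection logic only; not JAiRouter's actual code."""

    def __init__(self, instances):
        self.instances = list(instances)
        self._cycle = itertools.cycle(self.instances)

    def pick_random(self):
        # random: any instance, uniformly
        return random.choice(self.instances)

    def pick_round_robin(self):
        # round-robin: next instance in a fixed rotation
        return next(self._cycle)

    def pick_least_connections(self):
        # least-connections: instance with the fewest in-flight requests
        return min(self.instances, key=lambda inst: inst.active)

    def pick_ip_hash(self, client_ip):
        # ip-hash: a stable digest maps the same IP to the same instance
        digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.instances)
        return self.instances[index]
```

A stable digest is used for the IP hash because Python's built-in `hash()` for strings is salted per process, which would break persistence across restarts.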

Q10: How do I set up rate limiting?

A: Configure the rate limiting parameters:

model:
  services:
    chat:
      rate-limit:
        type: token-bucket
        capacity: 100        # Bucket capacity
        refill-rate: 10      # Refill rate (per second)
        client-ip-enable: true  # Apply a separate limit to each client IP

Supported rate limiting algorithms:

- token-bucket: Token bucket; allows bursts up to the bucket capacity
- leaky-bucket: Leaky bucket; smooths traffic to a steady rate
- sliding-window: Sliding window; precise control over a time window
- warm-up: Warm-up; raises the permitted rate gradually
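
As an illustration of how `capacity` and `refill-rate` interact in a token bucket, the following Python sketch implements the textbook algorithm. It is an assumption-laden sketch rather than JAiRouter's code: the `TokenBucket` class, its method names, and the injectable `clock` parameter are hypothetical.

```python
import time


class TokenBucket:
    """Textbook token bucket; a sketch, not JAiRouter's implementation."""

    def __init__(self, capacity, refill_rate, clock=time.monotonic):
        self.capacity = capacity        # maximum burst size
        self.refill_rate = refill_rate  # tokens added per second
        self._clock = clock
        self._tokens = float(capacity)  # start with a full bucket
        self._last = clock()

    def try_acquire(self, tokens=1):
        # Top up the bucket for the time elapsed since the last call,
        # never exceeding the configured capacity.
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        self._tokens = min(self.capacity, self._tokens + elapsed * self.refill_rate)
        if self._tokens >= tokens:
            self._tokens -= tokens
            return True
        return False
```

With `capacity=100` and `refill_rate=10`, a client can send a burst of 100 requests at once and is then held to roughly 10 requests per second. Per-client limiting, as enabled by `client-ip-enable`, could be modelled by keeping one bucket per client IP in a dictionary.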

Q11: How does the circuit breaker work?

A: Circuit breaker configuration example:

model:
  services:
    chat:
      circuit-breaker:
        failure-threshold: 5      # Failures that trip the breaker to OPEN
        recovery-timeout: 30s     # Time spent OPEN before moving to HALF_OPEN
        success-threshold: 3      # Successes in HALF_OPEN needed to return to CLOSED

The circuit breaker has three states:

- CLOSED: Normal state; requests pass through to the backend
- OPEN: Tripped state; requests are rejected immediately with an error
- HALF_OPEN: Recovery state; trial requests are let through to test whether the backend has recovered
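
The state transitions can be sketched as a small state machine. The Python below follows the common circuit breaker pattern and assumes the three configuration values mean what their names suggest; the `CircuitBreaker` class and its methods are hypothetical and may differ from how JAiRouter counts failures internally.

```python
import time


class CircuitBreaker:
    """Generic circuit breaker pattern; a sketch, not JAiRouter's code."""

    def __init__(self, failure_threshold=5, recovery_timeout=30.0,
                 success_threshold=3, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.recovery_timeout = recovery_timeout  # seconds
        self.success_threshold = success_threshold
        self._clock = clock
        self.state = "CLOSED"
        self._failures = 0
        self._successes = 0
        self._opened_at = 0.0

    def allow_request(self):
        if self.state == "OPEN":
            # After the recovery timeout, let trial requests through.
            if self._clock() - self._opened_at >= self.recovery_timeout:
                self.state = "HALF_OPEN"
                self._successes = 0
                return True
            return False
        return True

    def record_success(self):
        if self.state == "HALF_OPEN":
            self._successes += 1
            if self._successes >= self.success_threshold:
                self.state = "CLOSED"
                self._failures = 0
        elif self.state == "CLOSED":
            self._failures = 0  # a success resets the failure streak

    def record_failure(self):
        if self.state == "HALF_OPEN":
            self._trip()  # a failed trial reopens the breaker
        elif self.state == "CLOSED":
            self._failures += 1
            if self._failures >= self.failure_threshold:
                self._trip()

    def _trip(self):
        self.state = "OPEN"
        self._opened_at = self._clock()
        self._failures = 0
```

A caller would check `allow_request()` before forwarding a request and then report the outcome with `record_success()` or `record_failure()`.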

Q12: How do I update configuration dynamically?

A: Use the REST API:

# Add instance
curl -X POST http://localhost:8080/api/config/instance/add/chat \
  -H "Content-Type: application/json" \
  -d '{
    "name": "new-model",
    "baseUrl": "http://new-server:9090",
    "path": "/v1/chat/completions",
    "weight": 1
  }'

# Update instance
curl -X PUT http://localhost:8080/api/config/instance/update/chat \
  -H "Content-Type: application/json" \
  -d '{
    "instanceId": "new-model@http://new-server:9090",
    "instance": {
      "name": "new-model",
      "baseUrl": "http://new-server:9090",
      "path": "/v1/chat/completions",
      "weight": 2
    }
  }'
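
The same calls can be scripted. The Python sketch below builds the add and update requests with the standard library only. The endpoint paths and payload shape are copied from the curl examples above; the `instanceId` format (`name@baseUrl`) is inferred from that single example and should be treated as an assumption, and the helper names are hypothetical.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080"  # same host and port as the curl examples


def instance_id(name, base_url):
    # Assumed format, inferred from the example: "<name>@<baseUrl>"
    return f"{name}@{base_url}"


def _json_request(path, payload, method):
    return urllib.request.Request(
        url=f"{BASE_URL}{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method=method,
    )


def build_add_request(service_type, instance):
    # POST /api/config/instance/add/<serviceType>
    return _json_request(f"/api/config/instance/add/{service_type}", instance, "POST")


def build_update_request(service_type, instance):
    # PUT /api/config/instance/update/<serviceType>
    payload = {
        "instanceId": instance_id(instance["name"], instance["baseUrl"]),
        "instance": instance,
    }
    return _json_request(f"/api/config/instance/update/{service_type}", payload, "PUT")
```

Passing the returned object to `urllib.request.urlopen()` sends the request; building it separately makes the payload easy to inspect or log first.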

Monitoring and Operations

Q13: How do I monitor JAiRouter's running status?

A: JAiRouter provides several monitoring methods:

Health Check:

curl http://localhost:8080/actuator/health

Prometheus Metrics:

curl http://localhost:8080/actuator/prometheus

Key Metrics:

- Total requests and success rate
- Response time distribution
- Rate limiting and circuit breaker statistics
- JVM memory and GC metrics
- System CPU and memory usage
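
For automated checks, the health endpoint can be polled from a script. This Python sketch relies on the standard Spring Boot Actuator response, a JSON object with a top-level `status` field; the function names and the `UNREACHABLE` and `UNKNOWN` labels are choices made for this example.

```python
import json
import urllib.error
import urllib.request


def parse_health(body):
    """Return the top-level status from an Actuator health payload."""
    try:
        payload = json.loads(body)
    except ValueError:
        return "UNKNOWN"
    if isinstance(payload, dict):
        return str(payload.get("status", "UNKNOWN"))
    return "UNKNOWN"


def fetch_health(url="http://localhost:8080/actuator/health", timeout=3.0):
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return parse_health(resp.read())
    except urllib.error.HTTPError as err:
        # By default Actuator answers 503 with a JSON body when the status is DOWN.
        return parse_health(err.read())
    except OSError:
        # Connection refused, DNS failure, timeout, and similar errors.
        return "UNREACHABLE"
```

Calling `fetch_health()` against a running instance is expected to return `"UP"` when all health indicators pass.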

Q14: How do I view logs?

A: Log file locations:

- Container Deployment: /app/logs/jairouter-debug.log
- Direct Run: ./logs/jairouter-debug.log

View real-time logs:

# Docker container
docker logs -f jairouter

# Direct run
tail -f logs/jairouter-debug.log

Adjust log level:

curl -X POST http://localhost:8080/actuator/loggers/org.unreal.modelrouter \
  -H "Content-Type: application/json" \
  -d '{"configuredLevel": "DEBUG"}'

Q15: How do I back up and restore configuration?

A: Back up the configuration:

# Backup configuration directory
cp -r config/ config-backup-$(date +%Y%m%d)

# Backup using API
curl -X POST http://localhost:8080/api/config/merge/backup

Restore configuration:

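# Restore configuration files (use the date suffix of your own backup; "/." copies the directory contents rather than the directory itself)
cp -r config-backup-20250115/. config/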

# Restart service to apply configuration
docker restart jairouter
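
Where a shell is not available, for example on Windows, the file-based backup and restore can be done with a short Python script. This is a sketch using only the standard library; the function names are hypothetical, and the backup directory naming mirrors the `config-backup-YYYYMMDD` pattern used above.

```python
import shutil
from datetime import date
from pathlib import Path


def backup_config(config_dir="config", today=None):
    """Copy config_dir to a sibling named <config_dir>-backup-YYYYMMDD."""
    stamp = (today or date.today()).strftime("%Y%m%d")
    target = Path(f"{Path(config_dir)}-backup-{stamp}")
    # copytree raises FileExistsError if a backup for this date already exists.
    shutil.copytree(config_dir, target)
    return target


def restore_config(backup_dir, config_dir="config"):
    """Copy the backup's contents over config_dir (requires Python 3.8+)."""
    shutil.copytree(backup_dir, config_dir, dirs_exist_ok=True)
```

Note that `restore_config` overwrites files that exist in the backup but does not delete files that were added to the configuration directory afterwards. The service still needs a restart for the restored files to take effect.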

Performance and Optimization

Q16: How does JAiRouter perform?

A: Performance figures (measured in a standard test environment):

- Throughput: 1000+ RPS
- Latency: P95 < 100ms
- Concurrency: 1000+ concurrent connections
- Memory: Approximately 200MB base runtime

Actual performance depends on:

- Hardware configuration
- Backend service performance
- Network latency
- Configuration parameters

Q17: How do I optimize performance?

A: Performance optimization recommendations:

JVM Tuning:

java -Xms2g -Xmx4g \
     -XX:+UseG1GC \
     -XX:MaxGCPauseMillis=200 \
     -jar target/model-router-*.jar

Connection Pool Optimization:

spring:
  webflux:
    httpclient:
      pool:
        max-connections: 200
        max-idle-time: 30s

Cache Configuration:

model:
  cache:
    enabled: true
    ttl: 300s
    max-size: 1000
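
To show what `ttl` and `max-size` typically control, here is a small Python sketch of a cache that expires entries after a fixed time and evicts the least recently used entry when full. Whether JAiRouter's cache uses this exact eviction policy is not stated in this FAQ, so treat the `TtlCache` class as a generic, hypothetical illustration.

```python
import time
from collections import OrderedDict


class TtlCache:
    """Size-bounded cache with per-entry expiry; a generic sketch."""

    def __init__(self, ttl=300.0, max_size=1000, clock=time.monotonic):
        self.ttl = ttl            # seconds an entry stays valid
        self.max_size = max_size  # maximum number of entries
        self._clock = clock
        self._entries = OrderedDict()  # key -> (expires_at, value)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # expired: drop and report a miss
            return None
        self._entries.move_to_end(key)  # mark as recently used
        return value

    def put(self, key, value):
        self._entries[key] = (self._clock() + self.ttl, value)
        self._entries.move_to_end(key)
        while len(self._entries) > self.max_size:
            self._entries.popitem(last=False)  # evict least recently used
```

A longer `ttl` raises the hit rate at the cost of staler responses, while `max-size` bounds memory use.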

Q18: How do I handle high-concurrency scenarios?

A: High-concurrency optimization strategies:

- Horizontal Scaling: Deploy multiple JAiRouter instances
- Load Balancing: Put Nginx or a cloud load balancer in front of the instances
- Connection Pool: Increase the connection pool size
- Caching: Enable response caching
- Rate Limiting: Set rate limiting parameters to match backend capacity
- Monitoring: Monitor performance in real time

Troubleshooting

Q19: What should I do if the service fails to start?

A: Common startup issues and solutions:

Port Already in Use:

# Check which process is listening on port 8080
netstat -tlnp | grep :8080
# Then change JAiRouter's port or stop the process that holds it
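
On systems where netstat is not installed, a quick Python check can tell whether the port is available. This sketch simply tries to bind the port; the `port_is_free` name is hypothetical, and the check only reports availability, not which process holds the port.

```python
import socket


def port_is_free(port, host="0.0.0.0"):
    """Return True if a TCP socket can be bound to host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Typically "address already in use"; binding a port below 1024
            # without privileges fails here as well.
            return False
        return True
```

For example, `port_is_free(8080)` returning `False` before startup suggests another process is already listening on JAiRouter's port.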

Configuration File Error:

# Start with Spring Boot debug output; YAML syntax errors are reported during startup
./mvnw spring-boot:run -Dspring-boot.run.arguments=--debug

Java Version Incompatibility:

# Check Java version
java -version
# Ensure using Java 17+

Q20: What should I do if a backend service connection fails?

A: Troubleshooting steps for connection issues:

Check Service Status:

curl -v http://backend-server:9090/health

Check Network Connectivity:
```
telnet backend-server 9090
```
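
Many minimal container images ship without telnet. As an alternative, the TCP check can be done with a few lines of Python; this is a sketch, and `can_connect` is a hypothetical helper name.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

`can_connect("backend-server", 9090)` performs the same reachability test as the telnet command. A `True` result only shows that the port accepts connections, not that the service behind it is healthy.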

Check Configuration:

curl http://localhost:8080/api/config/instance/type/chat

View Logs:

grep -i "connection\|timeout" logs/jairouter-debug.log

Q21: How do I troubleshoot performance issues?

A: Performance issue diagnosis:

Check System Resources:

# CPU usage
curl http://localhost:8080/actuator/metrics/system.cpu.usage

# Memory usage
curl http://localhost:8080/actuator/metrics/jvm.memory.used

# GC status
curl http://localhost:8080/actuator/metrics/jvm.gc.pause

Analyze Response Time:

# Response time distribution
curl http://localhost:8080/actuator/metrics/jairouter.request.duration

# Request statistics
curl http://localhost:8080/actuator/metrics/jairouter.requests.total
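
The Actuator metrics endpoint returns JSON with a `measurements` array. Assuming `jairouter.request.duration` is exposed as a Micrometer timer (which reports `COUNT`, `TOTAL_TIME`, and `MAX` statistics), the mean response time can be derived as in this Python sketch. The function name is hypothetical, and the figures in the usage note are made up for illustration.

```python
import json


def timer_mean_seconds(metric_json):
    """Mean duration from an Actuator timer payload, or None without samples."""
    payload = json.loads(metric_json)
    stats = {m["statistic"]: m["value"] for m in payload.get("measurements", [])}
    count = stats.get("COUNT", 0)
    if not count:
        return None
    # Mean = total recorded time divided by the number of recorded requests.
    return stats.get("TOTAL_TIME", 0.0) / count
```

For a payload reporting `COUNT` 200 and `TOTAL_TIME` 12.0 seconds, the function returns 0.06, a mean of 60 ms per request. The mean hides tail latency, so percentile data from the Prometheus endpoint is still needed to check a P95 target.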

Development and Integration

Q22: How do I integrate JAiRouter into an existing project?

A: Integration methods:

As Proxy Service:

# Python example (openai package v1 or later)
from openai import OpenAI

# Point the OpenAI client at JAiRouter
client = OpenAI(
    base_url="http://jairouter:8080/v1",
    api_key="your-api-key",
)

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)

As Load Balancer:

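// Node.js example
const axios = require('axios');

// CommonJS modules do not allow top-level await, so wrap the call in an async function
async function main() {
  const response = await axios.post('http://jairouter:8080/v1/chat/completions', {
    model: 'gpt-3.5-turbo',
    messages: [{ role: 'user', content: 'Hello!' }]
  }, {
    headers: {
      'Content-Type': 'application/json',
      'Authorization': 'Bearer your-api-key'
    }
  });
  console.log(response.data);
}

main().catch(console.error);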

Q23: How do I develop a custom adapter?

A: Custom adapter development:

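// BaseAdapter and ServiceInstance are JAiRouter classes; import them from the project's own packages.
import org.springframework.stereotype.Component;
import reactor.core.publisher.Mono;

@Component
public class CustomAdapter extends BaseAdapter {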

    @Override
    public Mono<String> processRequest(String serviceType, String requestBody, ServiceInstance instance) {
        return webClient.post()
            .uri(instance.getBaseUrl() + instance.getPath())
            .bodyValue(transformRequest(requestBody))
            .retrieve()
            .bodyToMono(String.class)
            .map(this::transformResponse);
    }

    private String transformRequest(String requestBody) {
        // Request format transformation logic
        return requestBody;
    }

    private String transformResponse(String responseBody) {
        // Response format transformation logic
        return responseBody;
    }
}

Q24: How do I contribute code?

A: Contribution process:

- Fork Project: Fork the project on GitHub
- Create Branch: git checkout -b feature/your-feature
- Develop Feature: Write code and tests
- Commit Code: git commit -m "feat: add new feature"
- Push Branch: git push origin feature/your-feature
- Create PR: Open a Pull Request on GitHub

For details, see the Contribution Guide.

Commercial Use

Q25: Can JAiRouter be used commercially?

A: Yes. JAiRouter is released under the MIT License, which permits commercial use. You can:

- Use it in commercial projects
- Modify and customize features
- Redistribute it (the license notice must be retained)
- Provide commercial services based on JAiRouter

Q26: Is technical support provided?

A: Support is currently available through:

- Community Support: GitHub Issues and Discussions
- Documentation Support: User and API documentation
- Sample Code: Usage examples

We are considering providing enterprise-level commercial technical support services.

Q27: How do I obtain enterprise-level features?

A: Planned enterprise-level features:

- Multi-tenant Support: Tenant isolation and management
- Advanced Security: Enterprise-level authentication and authorization
- Professional Monitoring: Advanced monitoring and analysis features
- Technical Support: Professional technical support services

For enterprise-level requirements, please contact us to discuss custom development.

Issue Feedback

If your question is not answered in this FAQ, please reach out through one of the following channels:

GitHub Issues: Submit Issue
GitHub Discussions: Participate in Discussion
Email Contact: jairouter@example.com

We will keep this FAQ updated as new questions come in.

Last Updated: January 15, 2025