Configuration Guide
Hapax provides a flexible, validated configuration system that helps you maintain a reliable and secure LLM service. This guide will help you understand and customize your deployment.
Configuration Overview
Hapax uses a YAML-based configuration system with built-in validation and dynamic updates. The configuration is organized into logical sections:
server: # HTTP server settings
llm: # LLM provider configuration
logging: # Logging preferences
routes: # API endpoint definitions
metrics: # Monitoring configuration
Understanding Defaults
Hapax comes with carefully chosen defaults that allow it to work out of the box for development. You only need to configure what you want to customize.
Default Configuration
The following settings are provided by default:
# These are built-in defaults - you don't need to copy this
server:
  port: 8080
  read_timeout: 30s
  write_timeout: 45s
  max_header_bytes: 2097152 # 2MB
  shutdown_timeout: 30s
  http3:
    enabled: false # Disabled by default
    port: 443 # Default HTTPS/QUIC port
llm:
  provider: "ollama" # Default local provider
  model: "llama2" # Default model
  max_context_tokens: 16384
  system_prompt: "You are a helpful AI assistant focused on providing accurate and detailed responses."
  options:
    temperature: 0.7 # Between 0 and 1
    top_p: 0.9 # Between 0 and 1
    frequency_penalty: 0.3 # Between -2 and 2
    presence_penalty: 0.3 # Between -2 and 2
    stream: true
  retry:
    max_retries: 5
    initial_delay: 100ms
    max_delay: 5s
    multiplier: 1.5
    retryable_errors: ["rate_limit", "timeout", "server_error"]
provider_preference: # Order of provider selection
  - ollama
  - anthropic
  - openai
logging:
  level: "info"
  format: "json"
# Default routes
routes:
  - path: "/v1/completions"
    handler: "completion"
    version: "v1"
    methods: ["POST", "OPTIONS"]
  - path: "/health"
    handler: "health"
    version: "v1"
    methods: ["GET"]
  - path: "/metrics"
    handler: "metrics"
    version: "v1"
    methods: ["GET"]
Configuration Inheritance
Configuration is resolved in layers, with each later layer overriding the one before it:
- Built-in defaults (shown above)
- Your config.yaml overrides
- Environment variables (highest priority)
For example, this minimal config.yaml works because it inherits most settings:
llm:
  provider: "anthropic"
  api_key: ${ANTHROPIC_API_KEY}
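The layering described above can be pictured as a recursive merge. The following is a minimal sketch, not Hapax's actual loader: the `deep_merge` helper, the trimmed `DEFAULTS` dictionary, and the `env_overrides` mapping are all hypothetical and exist only to show how a later layer can replace individual keys while everything else is inherited.

```python
import copy


def deep_merge(base: dict, override: dict) -> dict:
    """Return a new dict in which `override` wins, recursing into nested
    mappings so that keys the override does not mention are inherited."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical subset of the built-in defaults.
DEFAULTS = {
    "server": {"port": 8080, "read_timeout": "30s"},
    "llm": {"provider": "ollama", "max_context_tokens": 16384},
}

# The minimal config.yaml, assumed to be already parsed and expanded.
file_config = {"llm": {"provider": "anthropic", "api_key": "example-key"}}

# Hypothetical overrides derived from environment variables (highest priority).
env_overrides = {"server": {"port": 9090}}

effective = deep_merge(deep_merge(DEFAULTS, file_config), env_overrides)
print(effective["llm"]["provider"])            # anthropic (from config.yaml)
print(effective["llm"]["max_context_tokens"])  # 16384 (inherited default)
print(effective["server"]["port"])             # 9090 (from the environment)
```

Merging into a copy rather than in place keeps the defaults untouched, so the same baseline can be reused on every reload.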
When to Override Defaults
Override the defaults when you are:
- Switching providers (e.g., from Ollama to Anthropic)
- Running in production (ports, timeouts, and so on)
- Changing the logging level
- Using a custom system prompt
- Targeting a specific model
The following sections detail all available options, with notes about their defaults.
Core Components
Server Configuration
Configure the HTTP server and its behavior:
server:
  port: 8080 # HTTP server port
  read_timeout: 30s # Request read timeout
  write_timeout: 45s # Response write timeout
  max_header_bytes: 2097152 # 2MB header limit
  shutdown_timeout: 30s # Graceful shutdown period
  http3: # Optional HTTP/3 support
    enabled: false # Enable HTTP/3 (QUIC)
    port: 443 # HTTPS/QUIC port
    tls_cert_file: "cert.pem" # TLS certificate path
    tls_key_file: "key.pem" # TLS key path
    idle_timeout: 30s # Keep-alive timeout
    max_bi_streams_concurrent: 100 # Max bidirectional streams
    max_uni_streams_concurrent: 100 # Max unidirectional streams
    max_stream_receive_window: 6291456 # 6MB stream window
    max_connection_receive_window: 15728640 # 15MB connection window
    enable_0rtt: true # Enable 0-RTT support
    max_0rtt_size: 16384 # 16KB max 0-RTT size
    allow_0rtt_replay: false # Set to true to disable replay protection
    udp_receive_buffer_size: 8388608 # 8MB UDP buffer
The server configuration supports:
- Basic HTTP server settings
- Timeouts and limits
- Graceful shutdown
- HTTP/3 (QUIC) with TLS
- Advanced performance tuning
LLM Provider Settings
Configure your LLM providers and their preferences. Hapax supports two approaches:
Approach 1: Simple Configuration (Recommended)
# Primary LLM configuration
llm:
  provider: anthropic # Primary provider to use
  model: claude-3.5-haiku-latest
  api_key: ${ANTHROPIC_API_KEY}
  max_context_tokens: 100000
  retry:
    max_retries: 3
    initial_delay: 100ms
    max_delay: 2s
    multiplier: 2.0
    retryable_errors: ["rate_limit", "timeout", "server_error"]
# Provider definitions
providers:
  anthropic: # Provider name
    type: anthropic # Provider type
    model: claude-3.5-haiku-latest
    api_key: ${ANTHROPIC_API_KEY}
  ollama:
    type: ollama
    model: llama3
    api_key: ""
# Failover configuration
provider_preference: # Order of provider selection
  - anthropic
  - ollama
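The `retry` fields read like a capped exponential backoff. The sketch below assumes that interpretation (each delay is the previous one times `multiplier`, never exceeding `max_delay`); the guide does not spell out the formula or whether jitter is applied, so treat `backoff_delays` as a hypothetical illustration rather than Hapax's implementation.

```python
from typing import List


def backoff_delays(max_retries: int, initial_delay_ms: float,
                   max_delay_ms: float, multiplier: float) -> List[float]:
    """Return the wait before each retry, in milliseconds (assumed formula)."""
    delays = []
    delay = float(initial_delay_ms)
    for _ in range(max_retries):
        delays.append(min(delay, max_delay_ms))  # cap at max_delay
        delay *= multiplier                      # grow for the next attempt
    return delays


# Values from the block above: 3 retries, 100ms initial, 2s cap, multiplier 2.0
print(backoff_delays(3, 100, 2000, 2.0))  # [100.0, 200.0, 400.0]

# Values from the built-in defaults: 5 retries, 100ms initial, 5s cap, 1.5
print(backoff_delays(5, 100, 5000, 1.5))  # [100.0, 150.0, 225.0, 337.5, 506.25]
```

Under this reading, the default settings add roughly 1.3 seconds of waiting in the worst case before a request is handed to the next provider.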
Approach 2: Legacy Configuration
llm:
  provider: anthropic
  model: claude-3
  api_key: ${ANTHROPIC_API_KEY}
  backup_providers: # Legacy backup provider configuration
    - provider: openai
      model: gpt-4
      api_key: ${OPENAI_API_KEY}
Provider Failover
The provider failover system supports two modes:
- Modern Approach (Recommended):
  - Define providers in the providers map
  - Set failover order in provider_preference
  - More flexible and supports multiple providers
- Legacy Approach:
  - Use backup_providers in the llm section
  - Simple primary/backup configuration
  - Limited to one backup provider
The failover system will:
- Start with the first provider in the preference list
- If a provider fails, automatically try the next one
- Use the retry configuration to handle transient errors
- Track provider health and adjust routing accordingly
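The failover steps above can be sketched as a loop over the preference list. This is an illustrative outline only: `ProviderError`, `complete_with_failover`, and the two stand-in provider functions are hypothetical names, the error kinds simply mirror the `retryable_errors` values, and the backoff sleep between retries is omitted to keep the example short.

```python
from typing import Callable, Dict, List, Tuple


class ProviderError(Exception):
    """Hypothetical provider failure; `kind` mirrors retryable_errors names."""

    def __init__(self, kind: str):
        super().__init__(kind)
        self.kind = kind


def complete_with_failover(
    prompt: str,
    providers: Dict[str, Callable[[str], str]],
    preference: List[str],
    retryable: Tuple[str, ...] = ("rate_limit", "timeout", "server_error"),
    max_retries: int = 3,
) -> Tuple[str, str]:
    """Try providers in preference order; return (provider_name, reply)."""
    errors = []
    for name in preference:
        call = providers[name]
        for _attempt in range(max_retries + 1):  # first try plus retries
            try:
                return name, call(prompt)
            except ProviderError as exc:
                errors.append((name, exc.kind))
                if exc.kind not in retryable:
                    break  # not transient: move on to the next provider
    raise RuntimeError(f"all providers failed: {errors}")


def flaky_cloud(prompt: str) -> str:
    raise ProviderError("timeout")  # always times out in this demo


def local_model(prompt: str) -> str:
    return f"echo: {prompt}"


used, reply = complete_with_failover(
    "hi",
    {"anthropic": flaky_cloud, "ollama": local_model},
    ["anthropic", "ollama"],
)
print(used, reply)  # ollama echo: hi
```

A real deployment would also consult provider health before each attempt, so that a provider already known to be down is skipped rather than retried.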
Health Monitoring
Configure health checks to maintain service reliability:
health_check:
  enabled: true
  interval: 15s
  timeout: 5s
Logging Configuration
Configure logging behavior and output format:
logging:
  level: "info" # Logging level: debug, info, warn, error
  format: "json" # Output format: json or text
The logging system supports:
- Multiple verbosity levels
- Structured JSON output
- Plain text format
- Environment variable configuration
Environment Variables
Hapax expands environment variables referenced in your configuration. Set them in the shell before starting the service:
# Core settings
export OPENAI_API_KEY="your-key"
export ANTHROPIC_API_KEY="your-key"
export OLLAMA_API_KEY="your-key"
# Server settings
export HAPAX_PORT="8080"
export HAPAX_HOST="0.0.0.0"
# Logging settings
export LOG_LEVEL="info"
export LOG_FORMAT="json"
The environment variable system supports:
- Standard variable substitution: ${VAR}
- Default values: ${VAR:-default}
- Nested variable references
- Validation and error handling
Example configurations:
llm:
  provider: anthropic
  api_key: ${ANTHROPIC_API_KEY}
  endpoint: ${API_ENDPOINT:-http://localhost:11434}
server:
  port: ${PORT:-8080}
  host: ${HOST:-0.0.0.0}
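To make the two substitution forms concrete, here is a small sketch of how `${VAR}` and `${VAR:-default}` could be resolved against an environment mapping. The `expand` function is hypothetical and makes two assumptions that the guide does not confirm: an unset variable without a default is reported as an error, and an empty value falls back to the default, as in POSIX shells. Nested references are not handled.

```python
import re

# ${NAME} or ${NAME:-default}
_PATTERN = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)(?::-([^}]*))?\}")


def expand(text: str, env: dict) -> str:
    """Replace ${VAR} and ${VAR:-default} references using `env`."""

    def replace(match):
        name, default = match.group(1), match.group(2)
        # Use the variable if set; an empty value yields to a default if given.
        if name in env and (env[name] != "" or default is None):
            return env[name]
        if default is not None:
            return default
        raise KeyError(f"environment variable {name} is not set")

    return _PATTERN.sub(replace, text)


env = {"ANTHROPIC_API_KEY": "example-key"}
print(expand("api_key: ${ANTHROPIC_API_KEY}", env))  # api_key: example-key
print(expand("port: ${PORT:-8080}", env))            # port: 8080
print(expand("endpoint: ${API_ENDPOINT:-http://localhost:11434}", env))
```

In practice the mapping would be `os.environ`; passing it in explicitly keeps the function easy to test.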
Advanced Features
Dynamic Configuration
Hapax monitors your configuration file for changes and safely applies updates without restart:
# These settings can be updated while running
llm:
  provider: "openai"
  model: "gpt-4"
  options:
    temperature: 0.7
    top_p: 0.9
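One way to picture "safely applies updates" is a reloader that keeps the last good configuration whenever a new version fails validation. The sketch below is an assumption about the general pattern, not a description of Hapax's watcher: `ReloadingConfig` and `check` are hypothetical, change detection uses a content hash, and JSON stands in for YAML so the example needs only the standard library.

```python
import hashlib
import json
import os
import tempfile


class ReloadingConfig:
    """Holds the last valid config; refresh() swaps in a new one if valid."""

    def __init__(self, path, validate):
        self.path = path
        self.validate = validate  # returns a list of error strings
        self.current = None
        self._digest = None

    def refresh(self) -> bool:
        with open(self.path, "rb") as handle:
            raw = handle.read()
        digest = hashlib.sha256(raw).hexdigest()
        if digest == self._digest:
            return False  # file unchanged
        self._digest = digest
        try:
            candidate = json.loads(raw)
        except ValueError:
            return False  # unparsable: keep the previous config
        if self.validate(candidate):
            return False  # invalid: keep the previous config
        self.current = candidate
        return True


def check(config):
    ok = config.get("llm", {}).get("provider")
    return [] if ok else ["llm.provider is required"]


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "config.json")
    with open(path, "w") as handle:
        json.dump({"llm": {"provider": "ollama"}}, handle)
    cfg = ReloadingConfig(path, check)
    first = cfg.refresh()   # valid file is loaded
    with open(path, "w") as handle:
        json.dump({"llm": {}}, handle)
    second = cfg.refresh()  # invalid edit is rejected
    provider_after_bad_edit = cfg.current["llm"]["provider"]

print(first, second, provider_after_bad_edit)  # True False ollama
```

The key design point is that validation runs before the swap, so a typo in the file never replaces a working configuration.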
Health Monitoring
Configure health checks for providers and routes:
health_check:
  enabled: true
  interval: 15s
  timeout: 5s
  failure_threshold: 2
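As a rough illustration of how `failure_threshold` might interact with routing, the sketch below marks a provider unhealthy after that many consecutive failed checks and filters it out of the preference list. Counting consecutive failures is an assumption, and `HealthTracker` is a hypothetical class, not part of Hapax.

```python
from typing import Dict, List


class HealthTracker:
    """Tracks consecutive failed health checks per provider."""

    def __init__(self, failure_threshold: int = 2):
        self.failure_threshold = failure_threshold
        self._failures: Dict[str, int] = {}

    def record(self, provider: str, ok: bool) -> None:
        # A successful check resets the streak; a failure extends it.
        self._failures[provider] = 0 if ok else self._failures.get(provider, 0) + 1

    def healthy(self, provider: str) -> bool:
        return self._failures.get(provider, 0) < self.failure_threshold

    def route(self, preference: List[str]) -> List[str]:
        """Return the preference list with unhealthy providers removed."""
        return [name for name in preference if self.healthy(name)]


tracker = HealthTracker(failure_threshold=2)
tracker.record("anthropic", ok=False)
print(tracker.route(["anthropic", "openai"]))  # ['anthropic', 'openai']
tracker.record("anthropic", ok=False)
print(tracker.route(["anthropic", "openai"]))  # ['openai']
tracker.record("anthropic", ok=True)
print(tracker.route(["anthropic", "openai"]))  # ['anthropic', 'openai']
```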
Performance Tuning
Optimize for your workload with queue and circuit breaker settings:
queue:
  enabled: false # Enable for high-load scenarios
  initial_size: 1000 # Default queue size
  save_interval: 30s # State persistence interval
circuit_breaker:
  max_requests: 100 # Requests allowed in half-open state
  interval: 30s # Monitoring interval
  timeout: 10s # Open state duration
  failure_threshold: 5 # Failures before opening
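The four circuit breaker settings map onto the usual closed, open, and half-open states. The sketch below is a simplified model built from the comments above, not Hapax's implementation: the `CircuitBreaker` class is hypothetical, and it assumes that `interval` is the period after which the failure count is cleared while the breaker is closed.

```python
import time


class CircuitBreaker:
    def __init__(self, max_requests=100, interval=30.0, timeout=10.0,
                 failure_threshold=5, clock=time.monotonic):
        self.max_requests = max_requests            # allowed while half-open
        self.interval = interval                    # failure-count window
        self.timeout = timeout                      # how long to stay open
        self.failure_threshold = failure_threshold  # failures before opening
        self.clock = clock
        self.state = "closed"
        self.failures = 0
        self.half_open_requests = 0
        self.window_start = clock()
        self.opened_at = 0.0

    def allow(self) -> bool:
        """Return True if a request may be sent right now."""
        now = self.clock()
        if self.state == "open":
            if now - self.opened_at < self.timeout:
                return False
            self.state = "half-open"  # timeout elapsed: start probing
            self.half_open_requests = 0
        if self.state == "half-open":
            if self.half_open_requests >= self.max_requests:
                return False
            self.half_open_requests += 1
            return True
        if now - self.window_start >= self.interval:
            self.failures = 0         # new monitoring window
            self.window_start = now
        return True

    def record(self, success: bool) -> None:
        """Report the outcome of a request that allow() permitted."""
        if success:
            if self.state == "half-open":
                self.state = "closed"
                self.failures = 0
                self.window_start = self.clock()
            return
        if self.state == "half-open":
            self._open()
            return
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self._open()

    def _open(self) -> None:
        self.state = "open"
        self.opened_at = self.clock()
        self.failures = 0


now = [0.0]  # fake clock so the demo runs instantly
breaker = CircuitBreaker(failure_threshold=5, timeout=10.0, clock=lambda: now[0])
for _ in range(5):
    breaker.allow()
    breaker.record(success=False)
print(breaker.state)    # open
print(breaker.allow())  # False
now[0] = 10.0
print(breaker.allow())  # True
breaker.record(success=True)
print(breaker.state)    # closed
```

With the values shown, five failures inside a 30-second window stop traffic to a provider for 10 seconds, after which probe requests decide whether it is reopened.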
Configuration Validation
Hapax validates your configuration at startup and when changes are made. The validator checks:
Server Configuration
- Port number (0-65535)
- Positive timeouts (read, write, shutdown)
- Valid header size limits
- HTTP/3 settings when enabled:
  - TLS certificate and key files
  - Stream and connection limits
  - 0-RTT configuration
LLM Configuration
- Provider name is specified
- Model name is specified
- Valid context token limits
- API key presence
Logging Configuration
- Valid log levels: debug, info, warn, error
- Valid formats: json, text
Route Configuration
- Non-empty paths
- Valid handlers
- Version specification
- Method and middleware validation
Run manual validation with:
./hapax --validate --config config.yaml
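For a sense of what such checks look like in practice, here is a minimal sketch covering a few of the rules listed above (port range, positive timeouts, required LLM fields, log level and format). It is not the Hapax validator: `validate`, `parse_duration`, and the error wording are hypothetical, and the config is assumed to be already parsed into a dictionary.

```python
import re
from typing import Any, Dict, List

VALID_LEVELS = {"debug", "info", "warn", "error"}
VALID_FORMATS = {"json", "text"}
_DURATION = re.compile(r"^(-?\d+(?:\.\d+)?)(ms|s|m|h)$")
_SECONDS = {"ms": 0.001, "s": 1.0, "m": 60.0, "h": 3600.0}


def parse_duration(text: Any) -> float:
    """Convert a duration such as "30s" or "100ms" to seconds."""
    match = _DURATION.match(str(text))
    if match is None:
        raise ValueError(f"invalid duration: {text!r}")
    return float(match.group(1)) * _SECONDS[match.group(2)]


def validate(config: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems (empty when valid)."""
    errors: List[str] = []

    server = config.get("server", {})
    port = server.get("port", 8080)
    if not isinstance(port, int) or not 0 <= port <= 65535:
        errors.append("server.port must be between 0 and 65535")
    for field in ("read_timeout", "write_timeout", "shutdown_timeout"):
        if field not in server:
            continue
        try:
            positive = parse_duration(server[field]) > 0
        except ValueError:
            positive = False
        if not positive:
            errors.append(f"server.{field} must be a positive duration")

    llm = config.get("llm", {})
    for field in ("provider", "model"):
        if not llm.get(field):
            errors.append(f"llm.{field} is required")
    if llm.get("max_context_tokens", 1) <= 0:
        errors.append("llm.max_context_tokens must be positive")

    log = config.get("logging", {})
    if log.get("level", "info") not in VALID_LEVELS:
        errors.append("logging.level must be debug, info, warn, or error")
    if log.get("format", "json") not in VALID_FORMATS:
        errors.append("logging.format must be json or text")
    return errors


bad = {
    "server": {"port": -1, "read_timeout": "-5s"},
    "llm": {"model": "gpt-4", "max_context_tokens": -1},
    "logging": {"level": "invalid", "format": "yaml"},
}
problems = validate(bad)
for problem in problems:
    print(problem)
```

Collecting every problem before reporting, rather than stopping at the first, lets an operator fix a broken file in one pass.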
Best Practices
Security
- Use environment variables for sensitive data (API keys)
- Enable TLS for production deployments
- Configure appropriate timeouts
- Protect metrics endpoints with authentication
Reliability
- Configure multiple providers for failover
- Set up health checks for both providers and routes
- Use appropriate circuit breaker thresholds
- Implement retry logic for transient errors
Performance
- Enable HTTP/3 for improved performance
- Configure appropriate stream and connection limits
- Use caching for repeated requests
- Adjust queue size based on load
Monitoring
- Set appropriate log levels (info for production)
- Use structured JSON logging in production
- Enable metrics collection
- Configure health check intervals
Default Values
The system comes with production-tested defaults:
server:
  port: 8080
  read_timeout: 30s
  write_timeout: 45s
  max_header_bytes: 2097152 # 2MB
llm:
  provider: "ollama"
  model: "llama2"
  max_context_tokens: 16384
  retry:
    max_retries: 5
    initial_delay: 100ms
    max_delay: 5s
    multiplier: 1.5
circuit_breaker:
  max_requests: 100
  interval: 30s
  timeout: 10s
  failure_threshold: 5
logging:
  level: "info"
  format: "json"
Example Configurations
Basic Development Setup
Minimal configuration for local development:
# config.yaml - Local development
llm:
  provider: "ollama"
  model: "llama2"
  max_context_tokens: 16384
logging:
  level: "debug"
  format: "text"
Cloud Provider Setup
Configuration for using cloud LLM providers:
# config.yaml - Cloud setup
llm:
  provider: "anthropic"
  model: "claude-3-haiku"
  api_key: ${ANTHROPIC_API_KEY}
  max_context_tokens: 100000
  retry:
    max_retries: 3
    initial_delay: 100ms
    max_delay: 2s
providers:
  anthropic:
    type: anthropic
    model: claude-3-haiku
    api_key: ${ANTHROPIC_API_KEY}
  openai:
    type: openai
    model: gpt-4
    api_key: ${OPENAI_API_KEY}
provider_preference:
  - anthropic
  - openai
Production Deployment
Full production configuration with all features:
server:
  port: 443
  read_timeout: 30s
  write_timeout: 45s
  http3:
    enabled: true
    port: 443
    tls_cert_file: "/etc/certs/server.crt"
    tls_key_file: "/etc/certs/server.key"
llm:
  provider: anthropic
  model: claude-3-haiku
  api_key: ${ANTHROPIC_API_KEY}
  max_context_tokens: 100000
  retry:
    max_retries: 3
    initial_delay: 100ms
    max_delay: 2s
    multiplier: 2.0
    retryable_errors: ["rate_limit", "timeout", "server_error"]
  cache:
    enable: true
    type: "redis"
    redis:
      address: "localhost:6379"
      password: ${REDIS_PASSWORD}
      db: 0
providers:
  anthropic:
    type: anthropic
    model: claude-3-haiku
    api_key: ${ANTHROPIC_API_KEY}
  openai:
    type: openai
    model: gpt-4
    api_key: ${OPENAI_API_KEY}
provider_preference:
  - anthropic
  - openai
health_check:
  enabled: true
  interval: 15s
  timeout: 5s
  failure_threshold: 2
circuit_breaker:
  max_requests: 100
  interval: 30s
  timeout: 10s
  failure_threshold: 5
queue:
  enabled: true
  initial_size: 1000
  state_path: "/var/lib/hapax/queue.state"
  save_interval: 30s
logging:
  level: info
  format: json
Troubleshooting
Common Configuration Issues
- Server Configuration
  server:
    port: -1 # Error: Invalid port (must be 0-65535)
    read_timeout: -5s # Error: Negative timeout not allowed
- HTTP/3 Configuration
  server:
    http3:
      enabled: true # Error: Missing TLS files
      max_0rtt_size: 2097152 # Error: Exceeds 1MB limit
- LLM Configuration
  llm: # Error: Provider is required
    model: "gpt-4"
    max_context_tokens: -1 # Error: Negative tokens not allowed
- Logging Configuration
  logging:
    level: "invalid" # Error: Must be debug, info, warn, or error
    format: "yaml" # Error: Must be json or text
- Route Configuration
  routes:
    - path: "" # Error: Empty path not allowed
      handler: "" # Error: Handler is required
      version: "" # Error: Version is required
Configuration Checklist
Before deploying, verify:
- All required fields are set
- Port numbers are valid (0-65535)
- Timeouts are positive values
- TLS certificates exist and are readable
- Provider API keys are available
- Log level and format are valid
- All routes have paths, handlers, and versions
Validation Command
Run the built-in validator:
./hapax --validate --config config.yaml
The validator will:
- Check configuration syntax
- Validate all field values
- Verify file accessibility
- Report specific errors