# Operations Guide
## Deployment Modes
Open Model Prism supports three deployment modes controlled by the NODE_ROLE environment variable:
| Mode | NODE_ROLE | Description |
|---|---|---|
| Full (default) | `full` | Admin UI + Gateway in one pod. Use for dev and small deployments (<50 users). |
| Control Plane | `control` | Admin API + Frontend only. No gateway routes served. |
| Worker | `worker` | Gateway only (`/api/:tenant/v1/*`). Scales horizontally. |
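For illustration, the roles above can be pinned per service in a compose override. The service names `control` and `worker` below are assumptions, not necessarily the names used in the project's actual compose files:

```yaml
# Illustrative override only -- service names and structure are assumed,
# not taken from the project's real docker-compose.scaled.yml.
services:
  control:
    environment:
      NODE_ROLE: control   # Admin API + Frontend, no gateway routes
  worker:
    environment:
      NODE_ROLE: worker    # Gateway only; stateless, scale freely
```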
### Single-Pod Deployment (default)
```shell
docker compose up -d
```

Everything runs in one container. Good for local development and small teams (<50 developers). MongoDB starts with `--replSet rs0` to enable Change Streams. If a replica set is unavailable, the system automatically falls back to polling at a 15-second interval.
### Scaled Deployment (Control Plane + Workers)
```shell
docker compose -f docker-compose.yml -f docker-compose.scaled.yml up -d --scale worker=3
```

- 1× Control Plane (port 3000) — Admin UI, Admin API, setup wizard, authentication
- N× Worker Pods — Gateway only, handling all `/api/:tenant/v1/*` traffic
- Load balancer — routes gateway traffic to workers and admin traffic to the control plane
All state is in MongoDB — worker pods are fully stateless and can be added or removed at any time.
## Capacity Planning
| Team Size | Workers | Estimated Load | Notes |
|---|---|---|---|
| 1–20 developers | 1 (full mode) | ~300 req/min | Single pod is fine |
| 20–80 developers | 2–3 workers | 600–2,400 req/min | `--scale worker=2` |
| 80–200 developers | 4–6 workers | 2,400–6,000 req/min | Add nginx in front |
| 200+ developers | 8+ workers | 6,000+ req/min | Consider a dedicated MongoDB cluster |
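The table's sizing can be turned into a rough rule of thumb. The sketch below assumes ~15 req/min per developer and ~1,000 req/min sustained per worker, numbers read loosely off the table rather than official capacity figures:

```shell
# Rough worker-count estimate derived from the capacity table above.
# ASSUMPTIONS (not official sizing): ~15 req/min per developer and
# ~1,000 req/min sustained per worker pod.
estimate_workers() {
  devs=$1
  load=$((devs * 15))               # estimated aggregate load, req/min
  workers=$(((load + 999) / 1000))  # ceiling division by per-worker capacity
  if [ "$workers" -lt 1 ]; then workers=1; fi
  echo "$workers"
}

estimate_workers 100   # 100 devs -> ~1,500 req/min -> prints 2
```

Real sizing should be validated against observed request rates on the system dashboard; token-heavy workloads will saturate workers at lower request rates.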
## Load Balancer Configuration
Example nginx upstream config for a scaled deployment:
```nginx
upstream omp_workers {
    server worker1:3000;
    server worker2:3000;
    server worker3:3000;
}

upstream omp_control {
    server control:3000;
}

server {
    listen 443 ssl;
    server_name omp.example.com;

    location /api/ {
        location ~ ^/api/[^/]+/v1/ {
            proxy_pass http://omp_workers;
            proxy_http_version 1.1;
            proxy_set_header Connection "";
        }
        location /api/prism/admin/ {
            proxy_pass http://omp_control;
        }
        location /api/prism/auth/ {
            proxy_pass http://omp_control;
        }
    }

    location / {
        proxy_pass http://omp_control;
    }
}
```

Health check endpoint for load balancer probes: `GET /health`
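Open-source nginx only performs passive health checking; active probing of `GET /health` requires an external checker or NGINX Plus `health_check`. A passive-checking sketch (the `max_fails`/`fail_timeout` values are illustrative, not recommendations from this project):

```nginx
upstream omp_workers {
    # After 3 failed requests within 30s, take the worker out of
    # rotation for 30s. Tune the values to your traffic.
    server worker1:3000 max_fails=3 fail_timeout=30s;
    server worker2:3000 max_fails=3 fail_timeout=30s;
    server worker3:3000 max_fails=3 fail_timeout=30s;
}
```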
## Security Hardening
### Credentials and Secrets
- Provider credentials: encrypted at rest with AES-256-GCM
- Tenant API keys: SHA-256 hashed — never stored in plaintext
- Admin passwords: hashed with bcrypt
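One way to generate secrets of the required shape, assuming `openssl` is available:

```shell
# Generate strong values for JWT_SECRET (32+ characters) and
# ENCRYPTION_KEY (32 bytes, hex-encoded -> 64 hex characters).
JWT_SECRET=$(openssl rand -base64 48)
ENCRYPTION_KEY=$(openssl rand -hex 32)

# Sanity-check the encryption key length before deploying.
if [ "${#ENCRYPTION_KEY}" -ne 64 ]; then
  echo "ENCRYPTION_KEY must be exactly 64 hex characters" >&2
  exit 1
fi
echo "ENCRYPTION_KEY length OK"   # prints "ENCRYPTION_KEY length OK"
```

Store both values in your secret manager rather than in the compose file; changing `ENCRYPTION_KEY` after provider credentials have been saved will make them undecryptable.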
### Network
- Terminate HTTPS at the load balancer. Model Prism speaks HTTP internally.
- Set `CORS_ORIGINS` to specific domains (not `*`) in production
- Firewall the `/metrics` endpoint — it is a Prometheus scrape target for internal monitoring only
## Environment Variables
| Variable | Default | Description |
|---|---|---|
| `NODE_ROLE` | `full` | `full` / `control` / `worker` |
| `MONGO_URI` | `mongodb://localhost:27017/openmodelprism` | MongoDB connection string |
| `JWT_SECRET` | (required) | JWT signing secret — 32+ characters |
| `ENCRYPTION_KEY` | (required) | 32-byte hex string for AES-256-GCM |
| `PORT` | `3000` | Server listen port |
| `CORS_ORIGINS` | `*` | Comma-separated list of allowed origins |
| `OFFLINE` | `false` | `true` disables all outbound internet calls |
| `LOG_LEVEL` | `info` | `debug` / `info` / `warn` / `error` |
| `NODE_ENV` | `development` | `production` enables JSON structured logs |
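Putting the table together, a minimal production configuration might look like the following `.env` sketch; every value is a placeholder to replace:

```ini
# Illustrative .env -- all values are placeholders, not real secrets.
NODE_ROLE=worker
MONGO_URI=mongodb://mongo:27017/openmodelprism
JWT_SECRET=change-me-to-a-32-plus-character-secret
ENCRYPTION_KEY=<64 hex characters from your secret manager>
CORS_ORIGINS=https://omp.example.com
OFFLINE=false
LOG_LEVEL=info
NODE_ENV=production
```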
## System Dashboard
Available at `/system` (admin and maintainer roles only). Shows the live state of all pods registered with the shared MongoDB: pod list with role badges, per-pod resource metrics (heap, RSS, CPU), per-pod request rate, 60-minute traffic chart, and provider error rates.
Pod heartbeats are written to MongoDB every 30 seconds with a 90-second TTL. Pods that stop sending heartbeats disappear automatically.
Runtime controls (apply immediately, no restart): log level, prompt logging toggle, file logging configuration.
## Backup
- Back up the `openmodelprism` MongoDB database — it contains all tenant configuration, provider configuration, routing rules, users, and analytics.
- Provider credentials are encrypted with `ENCRYPTION_KEY` — back up this value separately and store it securely.
- Request logs grow approximately 1 KB per request. For high-volume deployments, add a TTL index on `RequestLog.timestamp` to cap collection size.
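A combined backup-and-retention sketch; the collection name `requestlogs` and the 30-day retention window are assumptions to adapt to your deployment:

```shell
# Dump the whole database (all configuration + analytics).
mongodump --uri "mongodb://localhost:27017/openmodelprism" --out "./backup-$(date +%F)"

# Cap request-log growth: expire documents 30 days (2,592,000 s) after
# their timestamp. The collection name "requestlogs" is an assumption --
# verify it against your deployment before running this.
mongosh "mongodb://localhost:27017/openmodelprism" --eval \
  'db.requestlogs.createIndex({ timestamp: 1 }, { expireAfterSeconds: 2592000 })'
```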
## Upgrading
```shell
git pull
docker compose build
docker compose up -d
```

All application state is stored in MongoDB. For zero downtime, bring up new pods before terminating old ones — the load balancer will route traffic away from pods that fail their `/health` check.