Model Monitoring
Real-time performance metrics for all deployed endpoints.
All Systems Operational
Total Requests (24h)
1,284,392
+12% vs yesterday
Avg Latency
115ms
-8ms vs yesterday
Error Rate
0.24%
-0.1% vs yesterday
Active Models
0 / 0
Requests per Second
Live — updates every 2s
55 RPS
12 seconds ago to now
Endpoint Metrics
| Model | Region | RPS | p50 | p95 | p99 | Error % | Status |
|---|---|---|---|---|---|---|---|
Latency Distribution
< 50ms: 8%
50–100ms: 31%
100–200ms: 42%
200–500ms: 16%
> 500ms: 3%
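A distribution like the one above could be computed from raw request latencies roughly as follows. This is a minimal sketch, not the dashboard's actual implementation: the function name `latency_distribution` is hypothetical, and the handling of values that land exactly on a boundary (50, 100, 200, 500 ms go into the higher bucket) is an assumption, since the panel does not say which side is inclusive.

```python
from bisect import bisect_right

# Bucket boundaries in milliseconds, taken from the panel labels.
BUCKET_EDGES_MS = [50, 100, 200, 500]
BUCKET_LABELS = ["< 50ms", "50–100ms", "100–200ms", "200–500ms", "> 500ms"]


def latency_distribution(latencies_ms):
    """Return the percentage of requests falling into each latency bucket.

    Assumption: a value equal to a boundary is counted in the higher bucket.
    """
    counts = [0] * len(BUCKET_LABELS)
    for value in latencies_ms:
        # bisect_right gives the index of the first edge greater than value,
        # which is also the index of the bucket the value belongs to.
        counts[bisect_right(BUCKET_EDGES_MS, value)] += 1

    total = len(latencies_ms)
    if total == 0:
        return {label: 0.0 for label in BUCKET_LABELS}
    return {
        label: 100.0 * count / total
        for label, count in zip(BUCKET_LABELS, counts)
    }
```

Calling `latency_distribution` with a list of per-request latencies in milliseconds returns a dictionary keyed by the same labels shown in the panel, with values that sum to 100 when the list is non-empty.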
Alert Rules
High error rate
Error rate > 5%
High latency
p95 Latency > 500ms
Low throughput
RPS < 1 for 5 min
Model offline
Status = sleeping
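The four rules above can be read as simple threshold checks against a per-endpoint snapshot. The sketch below is illustrative only: `EndpointSnapshot`, its field names, and `fired_alerts` are hypothetical, and "RPS < 1 for 5 min" is interpreted here as every RPS sample in the last five minutes being below 1, which is an assumption about how the window is evaluated.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class EndpointSnapshot:
    """Hypothetical per-endpoint metrics; field names are illustrative."""
    error_rate_pct: float        # error rate as a percentage
    p95_latency_ms: float        # 95th percentile latency in milliseconds
    rps_last_5_min: List[float]  # RPS samples covering the last 5 minutes
    status: str                  # e.g. "active" or "sleeping"


def fired_alerts(snapshot: EndpointSnapshot) -> List[str]:
    """Return the names of the alert rules that the snapshot triggers."""
    alerts = []
    if snapshot.error_rate_pct > 5:
        alerts.append("High error rate")
    if snapshot.p95_latency_ms > 500:
        alerts.append("High latency")
    # Assumption: the rule fires only if every sample in the window is below 1.
    if snapshot.rps_last_5_min and all(r < 1 for r in snapshot.rps_last_5_min):
        alerts.append("Low throughput")
    if snapshot.status == "sleeping":
        alerts.append("Model offline")
    return alerts
```

For example, a snapshot with a 0.24% error rate, a p95 latency of 620 ms, RPS samples that are all below 1, and status "active" would trigger "High latency" and "Low throughput" but neither of the other two rules.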
Region Health
us-east-1
142ms
62%
eu-west-1
88ms
28%
ap-southeast-1
—
0%
Recent Errors
RateLimitExceeded
llama3-sentiment-v2
10:33:12
×3
TimeoutError
gpt2-code-assistant
09:58:44
×1
InvalidInputError
llama3-sentiment-v2
08:21:09
×7