This page standardizes the HTTP responses and OpenAI-style error bodies produced by the unified inference pipeline. Use it for frontend/backend integration, monitoring, and alert routing.
1. Error object schema (OpenAI-style)
All errors return an application/json body:
{
  "error": {
    "type": "invalid_request_error | rate_limit_exceeded | service_unavailable | timeout | internal_error | server_error",
    "code": "unsupported_provider | executor_binding_validation_failed | model_not_found | invalid_api_key | permission_denied | rate_limit_exceeded | model_fetch_error | internal_error | service_unavailable | orchestrator_missing | closed_source_service_unavailable | timeout",
    "message": "Human-readable description",
    "param": "optional, when a specific field is invalid",
    "provider": "optional, provider/model involved",
    "request_id": "optional, for tracing"
  }
}
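For client code, the schema above can be mirrored as types. The following is a minimal sketch: the unions are transcribed from the "type" and "code" values listed above, while the names ErrorType, ErrorCode, ErrorBody, and isErrorBody are illustrative and not taken from the pipeline's source.

```typescript
// Values transcribed from the "type" field of the schema above.
type ErrorType =
  | "invalid_request_error"
  | "rate_limit_exceeded"
  | "service_unavailable"
  | "timeout"
  | "internal_error"
  | "server_error";

// Values transcribed from the "code" field of the schema above.
type ErrorCode =
  | "unsupported_provider"
  | "executor_binding_validation_failed"
  | "model_not_found"
  | "invalid_api_key"
  | "permission_denied"
  | "rate_limit_exceeded"
  | "model_fetch_error"
  | "internal_error"
  | "service_unavailable"
  | "orchestrator_missing"
  | "closed_source_service_unavailable"
  | "timeout";

interface ErrorBody {
  error: {
    type: ErrorType;
    code: ErrorCode;
    message: string;
    param?: string;      // optional, when a specific field is invalid
    provider?: string;   // optional, provider/model involved
    request_id?: string; // optional, for tracing
  };
}

// Hypothetical helper: a structural check on parsed JSON.
// It verifies that the three required fields are strings; it does not
// check that "type" and "code" are members of the unions above.
function isErrorBody(value: unknown): value is ErrorBody {
  if (typeof value !== "object" || value === null) return false;
  const err = (value as { error?: unknown }).error;
  if (typeof err !== "object" || err === null) return false;
  const e = err as Record<string, unknown>;
  return (
    typeof e.type === "string" &&
    typeof e.code === "string" &&
    typeof e.message === "string"
  );
}
```

A caller would typically parse the response body, run isErrorBody on it, and branch on error.code rather than on the message text.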
2. Status code matrix (overview)
HTTP | Source (entry) | error.type | error.code | Typical trigger | Fallback Notes
400 Bad Request | UnifiedInferenceService | invalid_request_error | unsupported_provider | Request explicitly selects an unsupported provider |
On validation failure, constructs an OpenAI-compatible error body and throws HttpException.
Typical cases: missing model binding; executor type mismatch.
3.3 SimpleErrorClassifier & Formatter
For HttpException, preserve original status; map to UnifiedErrorType:
401 → invalid_api_key
403 → permission_denied
404 → model_not_found
429 → rate_limit_exceeded
other 4xx → invalid_request
any 5xx → internal_error
For generic errors, classify by keywords (same heuristic as above), then emit OpenAI-style body.
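The status mapping above can be sketched as a small function. This is an illustration only: classifyStatus is a hypothetical name, the returned strings are exactly the values in the list above, and the keyword heuristic for generic errors is left out because its keywords are not listed in this section.

```typescript
// Hypothetical helper mirroring the HttpException status mapping listed above.
function classifyStatus(status: number): string {
  switch (status) {
    case 401:
      return "invalid_api_key";
    case 403:
      return "permission_denied";
    case 404:
      return "model_not_found";
    case 429:
      return "rate_limit_exceeded";
  }
  // Remaining 4xx statuses fall through to the generic client error.
  if (status >= 400 && status < 500) return "invalid_request";
  // Any 5xx status maps to the generic server error.
  if (status >= 500 && status < 600) return "internal_error";
  // Assumption: non-error statuses are a caller bug, so fail loudly.
  throw new RangeError(`not an error status: ${status}`);
}
```

The specific cases are checked before the range tests so that 429, which is itself a 4xx status, is not swallowed by the generic invalid_request branch.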
4. Fallback policy
shouldAttemptFallback treats the following HTTP statuses as fallbackable: 500, 502, 503, 504, 429.
Additionally, if the error message contains any of:
then fallback logic is engaged (try other providers/executors per policy).
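A rough sketch of this decision is shown below. The real signature of shouldAttemptFallback is not shown on this page, so the function name, its parameters, and the case-insensitive match are assumptions. The keyword list is passed in by the caller because it is not reproduced here.

```typescript
// Statuses listed above as fallbackable.
const FALLBACK_STATUSES: ReadonlySet<number> = new Set([500, 502, 503, 504, 429]);

// Sketch only; not the pipeline's actual implementation.
function shouldAttemptFallbackSketch(
  status: number | undefined,
  message: string,
  keywords: readonly string[],
): boolean {
  // Rule 1: the HTTP status alone is enough to engage fallback.
  if (status !== undefined && FALLBACK_STATUSES.has(status)) return true;
  // Rule 2: otherwise look for any caller-supplied keyword in the message.
  // Assumption: matching ignores case.
  const lower = message.toLowerCase();
  return keywords.some((k) => lower.includes(k.toLowerCase()));
}
```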
Client guidance: Frontends/SDKs should detect these statuses/codes and apply retry with jittered backoff and provider/model fallback, where appropriate.
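One way a client might implement the retry part of this guidance is exponential backoff with full jitter. The sketch below is not a prescribed implementation: the function names, the 250 ms base, the 8 s cap, and the limit of 4 attempts are all illustrative assumptions, and provider/model fallback is not shown.

```typescript
// Full-jitter backoff: a random delay in [0, min(capMs, baseMs * 2^attempt)).
// `random` is injectable so the delay can be made deterministic in tests.
function backoffDelayMs(
  attempt: number,
  baseMs = 250,
  capMs = 8000,
  random: () => number = Math.random,
): number {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}

// Retries `call` while `isRetryable` accepts the error, up to `maxAttempts` tries.
async function retryWithBackoff<T>(
  call: () => Promise<T>,
  isRetryable: (err: unknown) => boolean,
  maxAttempts = 4,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await call();
    } catch (err) {
      // Give up on the last attempt or on a non-retryable error.
      if (attempt + 1 >= maxAttempts || !isRetryable(err)) throw err;
      const delay = backoffDelayMs(attempt);
      await new Promise<void>((resolve) => setTimeout(resolve, delay));
    }
  }
}
```

The isRetryable callback is where a client would check the response status or error.code against the fallbackable set described in this section.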
5. Monitoring & alerting (minimal checklist)
Group by status, error.type, error.code, provider, model, region.
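As an illustration of this grouping, the sketch below counts error events by a composite key built from the six dimensions. The InferenceErrorEvent shape is an assumption: model and region are not part of the error body and would have to come from request context. The "unknown" placeholder for missing dimensions is also an assumption.

```typescript
// Hypothetical event shape; only status, type, and code are always present.
interface InferenceErrorEvent {
  status: number;
  type: string;
  code: string;
  provider?: string;
  model?: string;
  region?: string;
}

// Builds one key per (status, error.type, error.code, provider, model, region).
function groupKey(e: InferenceErrorEvent): string {
  return [
    e.status,
    e.type,
    e.code,
    e.provider ?? "unknown",
    e.model ?? "unknown",
    e.region ?? "unknown",
  ].join("|");
}

// Counts events per group key.
function countByGroup(events: readonly InferenceErrorEvent[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const e of events) {
    const key = groupKey(e);
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  return counts;
}
```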