Error Codes
Complete reference of the 34 error codes, grouped into 11 categories, returned by the EventML Service.

Error response format

```json
{
  "error": {
    "code": "ML_1000",
    "message": "Model configuration is invalid or missing required fields",
    "http": 400
  }
}
```

Models (ML_1000-1099)

| Code | HTTP | Name | Description | Resolution |
|---|---|---|---|---|
| ML_1000 | 400 | INVALID_MODEL_CONFIG | Model configuration is invalid or missing required fields | Check model framework, version, and required fields against API docs |
| ML_1001 | 404 | MODEL_NOT_FOUND | Specified model does not exist | Verify model ID via GET /models |
| ML_1002 | 409 | MODEL_ALREADY_EXISTS | A model with this name already exists for this tenant | Use a different name or update the existing model |
| ML_1003 | 400 | INVALID_FRAMEWORK | Unsupported ML framework specified | Use: PYTORCH, TENSORFLOW, SKLEARN, XGBOOST, ONNX, or CUSTOM |
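
The framework values listed for ML_1003 can be checked on the client before a model is registered, which avoids a round trip that would end in a 400. The following is a minimal sketch: `Framework` and `normalize_framework` are hypothetical names, and the trimming and case-insensitive matching are assumptions of the sketch, not documented service behavior.

```python
from enum import Enum


class Framework(str, Enum):
    """Framework values taken from the ML_1003 resolution column."""

    PYTORCH = "PYTORCH"
    TENSORFLOW = "TENSORFLOW"
    SKLEARN = "SKLEARN"
    XGBOOST = "XGBOOST"
    ONNX = "ONNX"
    CUSTOM = "CUSTOM"


def normalize_framework(value: str) -> str:
    """Return the canonical framework name, or raise ValueError.

    Trimming and upper-casing are conveniences of this sketch; the
    service itself may be stricter about case.
    """
    candidate = value.strip().upper()
    try:
        return Framework(candidate).value
    except ValueError:
        allowed = ", ".join(f.value for f in Framework)
        raise ValueError(
            f"unsupported framework {value!r}; use one of: {allowed}"
        ) from None
```

Calling `normalize_framework` on user input before building the request body keeps the accepted values in one place if the list changes.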

Training (ML_1100-1199)

| Code | HTTP | Name | Description | Resolution |
|---|---|---|---|---|
| ML_1100 | 400 | INVALID_TRAINING_CONFIG | Training job configuration is invalid | Check hyperparameters, data config, and instance settings |
| ML_1101 | 404 | TRAINING_JOB_NOT_FOUND | Training job does not exist | Verify job ID via GET /training |
| ML_1102 | 409 | TRAINING_ALREADY_RUNNING | A training job is already running for this model version | Stop the existing job before starting a new one |
| ML_1103 | 502 | TRAINING_INFRASTRUCTURE_ERROR | Training infrastructure (SageMaker/EC2) returned an error | Retry; infrastructure may be scaling up |
| ML_1104 | 504 | TRAINING_TIMEOUT | Training job exceeded maximum allowed duration | Increase maxRuntime or reduce dataset size |

Inference & Serving (ML_1200-1299)

| Code | HTTP | Name | Description | Resolution |
|---|---|---|---|---|
| ML_1200 | 400 | INVALID_INFERENCE_INPUT | Inference input format does not match model schema | Check input schema for the model version endpoint |
| ML_1201 | 404 | ENDPOINT_NOT_FOUND | Inference endpoint does not exist | Create an endpoint via POST /endpoints |
| ML_1202 | 503 | ENDPOINT_NOT_READY | Endpoint is not in ACTIVE state (still provisioning or scaling) | Wait for endpoint to reach ACTIVE status before invoking |
| ML_1203 | 504 | INFERENCE_TIMEOUT | Inference request timed out before returning a prediction | Reduce input size or increase endpoint timeout configuration |
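
The ML_1202 resolution (wait for ACTIVE before invoking) amounts to a polling loop with a deadline. The sketch below is one way to write it, under stated assumptions: `get_status` is a hypothetical callable that returns the endpoint's current status string (how that status is fetched is not covered by this reference), and the timeout and interval defaults are illustrative. Only the `ACTIVE` value comes from the table above.

```python
import time
from typing import Callable


def wait_until_active(
    get_status: Callable[[], str],
    timeout_s: float = 300.0,
    poll_interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll get_status() until it returns "ACTIVE" or the timeout elapses.

    Returns True if the endpoint became ACTIVE, False on timeout.
    `sleep` and `clock` are injectable so the loop can be tested
    without real waiting.
    """
    deadline = clock() + timeout_s
    while True:
        if get_status() == "ACTIVE":
            return True
        if clock() >= deadline:
            return False
        sleep(poll_interval_s)
```

Checking the status before sleeping means an endpoint that is already ACTIVE returns immediately, and a monotonic clock keeps the deadline stable if the wall clock is adjusted.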

Experiments (ML_1300-1399)

| Code | HTTP | Name | Description | Resolution |
|---|---|---|---|---|
| ML_1300 | 400 | INVALID_EXPERIMENT_CONFIG | Experiment configuration is invalid | Ensure variant traffic allocations sum to 100% |
| ML_1301 | 404 | EXPERIMENT_NOT_FOUND | Experiment does not exist | Verify experiment ID via GET /experiments |
| ML_1302 | 409 | EXPERIMENT_ALREADY_RUNNING | Experiment is already in RUNNING state | Stop the experiment before modifying its configuration |
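
The ML_1300 resolution (variant traffic allocations must sum to 100%) is easy to verify before submitting an experiment. The following is a minimal sketch, assuming allocations are expressed as percentages keyed by variant name; that shape, the function name `validate_allocations`, and the floating-point tolerance are assumptions, not part of the documented request schema.

```python
import math
from typing import Mapping


def validate_allocations(allocations: Mapping[str, float]) -> None:
    """Raise ValueError unless variant traffic percentages sum to 100."""
    if not allocations:
        raise ValueError("at least one variant is required")
    for name, pct in allocations.items():
        if pct < 0 or pct > 100:
            raise ValueError(f"variant {name!r} has out-of-range allocation {pct}")
    # fsum limits accumulated rounding error; isclose tolerates what remains.
    total = math.fsum(allocations.values())
    if not math.isclose(total, 100.0, abs_tol=1e-9):
        raise ValueError(f"allocations sum to {total}, expected 100")
```

A tolerance is used instead of `==` because splits such as 33.3 / 33.3 / 33.4 are not exactly representable in binary floating point.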

Drift & Monitoring (ML_1400-1499)

| Code | HTTP | Name | Description | Resolution |
|---|---|---|---|---|
| ML_1400 | 400 | INVALID_DRIFT_CONFIG | Drift monitor configuration is invalid | Check threshold, detection method (KS_TEST, PSI, etc.), and feature list |
| ML_1401 | 404 | DRIFT_MONITOR_NOT_FOUND | Drift monitor does not exist | Create a monitor via POST /drift-monitors |
| ML_1402 | 409 | DRIFT_DETECTED | Significant data drift detected above threshold | Retrain the model with updated data distribution |

Feature Store (ML_1500-1599)

| Code | HTTP | Name | Description | Resolution |
|---|---|---|---|---|
| ML_1500 | 400 | INVALID_FEATURE_GROUP | Feature group configuration is invalid | Check feature schema, data types, and entity key configuration |
| ML_1501 | 404 | FEATURE_GROUP_NOT_FOUND | Feature group does not exist | Create via POST /feature-store/groups |
| ML_1502 | 409 | FEATURE_VERSION_CONFLICT | Feature version already exists with conflicting schema | Use a new version number or update the existing version |

Governance (ML_1600-1699)

| Code | HTTP | Name | Description | Resolution |
|---|---|---|---|---|
| ML_1600 | 403 | GOVERNANCE_VIOLATION | Model governance policy violated (missing documentation, tests, or approvals) | Review and meet all governance requirements before releasing |
| ML_1601 | 400 | MISSING_APPROVAL | Required approval not obtained for this release stage | Submit model for approval workflow via POST /governance/approvals |

Authentication (ML_1700-1799)

| Code | HTTP | Name | Description | Resolution |
|---|---|---|---|---|
| ML_1700 | 401 | UNAUTHORIZED | Authentication required | Provide a valid JWT token in the Authorization header |
| ML_1701 | 403 | FORBIDDEN | Insufficient permissions for this ML operation | Request required role (ml-engineer, admin) from your admin |
| ML_1702 | 401 | INVALID_TOKEN | Invalid or expired authentication token | Obtain a fresh JWT token via auth-svc /login or /refresh |
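
Many ML_1702 responses can be avoided by checking the token's expiry on the client and refreshing it before the call. The sketch below reads the standard `exp` claim from a compact JWT without verifying the signature, so it is only a hint for when to refresh, never an authorization check. The function names, the 30-second skew, and the `Bearer` scheme in `auth_headers` are assumptions; this reference only states that the JWT goes in the Authorization header.

```python
import base64
import json
import time
from typing import Dict, Optional


def jwt_expiry(token: str) -> Optional[float]:
    """Return the `exp` claim (seconds since epoch), or None if absent.

    The signature is NOT verified; use this only to decide when to refresh.
    """
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("not a compact JWT (expected three dot-separated parts)")
    payload_b64 = parts[1]
    # base64url payloads omit padding; restore it before decoding.
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    claims = json.loads(base64.urlsafe_b64decode(padded))
    exp = claims.get("exp")
    return float(exp) if exp is not None else None


def needs_refresh(token: str, now: Optional[float] = None, skew_s: float = 30.0) -> bool:
    """True if the token expires within skew_s seconds (or already has)."""
    exp = jwt_expiry(token)
    if exp is None:
        return False  # no expiry claim; let the server decide
    current = time.time() if now is None else now
    return current >= exp - skew_s


def auth_headers(token: str) -> Dict[str, str]:
    # Assumption: the service expects the common "Bearer <token>" scheme.
    return {"Authorization": f"Bearer {token}"}
```

When `needs_refresh` returns True, the client would obtain a new token from auth-svc (`/login` or `/refresh`, as listed in the table) before building the headers.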

Validation (ML_1800-1899)

| Code | HTTP | Name | Description | Resolution |
|---|---|---|---|---|
| ML_1800 | 400 | VALIDATION_FAILED | Request body validation failed | Check request body against the API docs schema |
| ML_1801 | 400 | INVALID_INPUT | Invalid input provided for a field | Review field constraints (type, min/max, pattern, enum values) |

Database (ML_1900-1999)

| Code | HTTP | Name | Description | Resolution |
|---|---|---|---|---|
| ML_1900 | 500 | DATABASE_ERROR | Database operation failed | Retry; Aurora Serverless v2 may be scaling up from idle |
| ML_1901 | 500 | TRANSACTION_FAILED | Database transaction failed due to conflict or timeout | Retry the operation with a new Idempotency-Key |
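
The ML_1901 resolution calls for a new Idempotency-Key on each retry. A minimal sketch of that pattern follows, assuming the key is sent as an `Idempotency-Key` request header with a UUID value and that `send` is a hypothetical callable wrapping the actual HTTP request; neither the key format nor the attempt limit is specified by this reference.

```python
import uuid
from typing import Callable, Dict, Tuple

# send(headers) -> (http_status, parsed_json_body); hypothetical transport hook.
Send = Callable[[Dict[str, str]], Tuple[int, dict]]


def call_with_fresh_keys(send: Send, max_attempts: int = 3) -> Tuple[int, dict]:
    """Call send(), retrying on ML_1901 with a new Idempotency-Key each time.

    Returns the first response that is not ML_1901, or the last response
    once max_attempts is exhausted.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    status, body = 0, {}
    for _ in range(max_attempts):
        # A fresh key per attempt, as the ML_1901 resolution asks.
        headers = {"Idempotency-Key": str(uuid.uuid4())}
        status, body = send(headers)
        error = body.get("error") or {}
        if error.get("code") != "ML_1901":
            break
    return status, body
```

A production client would normally pause between attempts; the delay is left out here to keep the key handling in focus.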

General (ML_2000-2099)

| Code | HTTP | Name | Description | Resolution |
|---|---|---|---|---|
| ML_2000 | 500 | INTERNAL_SERVER_ERROR | Internal server error | Retry; if persistent, contact support with the x-request-id value |
| ML_2001 | 503 | SERVICE_UNAVAILABLE | Service temporarily unavailable (ECS task restarting) | Retry with exponential backoff |
| ML_2002 | 504 | TIMEOUT | Request timeout exceeded | Reduce request complexity or retry |
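
The error envelope and the code ranges in this reference can be combined into a small client-side helper that parses a response, looks up its category, and decides whether a retry is worthwhile. The following is a sketch under stated assumptions: `ServiceError`, `parse_error`, and `backoff_delays` are hypothetical names, the retryable set is simply the codes whose Resolution column mentions retrying, and the full-jitter schedule with its base and cap values is one common way to implement the exponential backoff that ML_2001 asks for.

```python
import random
from dataclasses import dataclass
from typing import List, Optional

# Range starts and names taken from the section headings of this reference.
CATEGORY_BY_BASE = {
    1000: "Models", 1100: "Training", 1200: "Inference & Serving",
    1300: "Experiments", 1400: "Drift & Monitoring", 1500: "Feature Store",
    1600: "Governance", 1700: "Authentication", 1800: "Validation",
    1900: "Database", 2000: "General",
}

# Codes whose Resolution column says to retry.
RETRYABLE_CODES = {"ML_1103", "ML_1900", "ML_1901", "ML_2000", "ML_2001", "ML_2002"}


@dataclass(frozen=True)
class ServiceError:
    code: str
    message: str
    http: int

    @property
    def category(self) -> Optional[str]:
        """Category name for the code's hundred-range, or None if unknown."""
        prefix, _, digits = self.code.partition("_")
        if prefix != "ML" or not digits.isdigit():
            return None
        return CATEGORY_BY_BASE.get(int(digits) // 100 * 100)

    @property
    def retryable(self) -> bool:
        return self.code in RETRYABLE_CODES


def parse_error(body: dict) -> Optional[ServiceError]:
    """Extract the error envelope from a parsed JSON body, if present."""
    err = body.get("error")
    if not isinstance(err, dict):
        return None
    return ServiceError(
        code=str(err["code"]),
        message=str(err.get("message", "")),
        http=int(err["http"]),
    )


def backoff_delays(
    attempts: int,
    base_s: float = 0.5,
    cap_s: float = 30.0,
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Full-jitter backoff: delay i is uniform in [0, min(cap_s, base_s * 2**i)]."""
    rng = rng if rng is not None else random.Random()
    return [rng.uniform(0.0, min(cap_s, base_s * 2 ** i)) for i in range(attempts)]
```

A caller would sleep for each delay in turn while `parse_error(body).retryable` stays true, and log the `x-request-id` value mentioned under ML_2000 if the retries are exhausted.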