Retry OpenAI API on internal server error #1226
Signed-off-by: Edwin Yu <edwinyyyu@gmail.com>
Pull request overview
Adds openai.InternalServerError to the set of retryable OpenAI SDK exceptions, addressing intermittent 5xx failures by allowing the existing retry/backoff logic to kick in.
Changes:
- Treat openai.InternalServerError as retryable for the OpenAI Responses language model.
- Treat openai.InternalServerError as retryable for the OpenAI Chat Completions language model.
- Treat openai.InternalServerError as retryable for the OpenAI embedder chunk-cluster embedding calls.
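The effect of adding an exception type to a retryable tuple can be sketched as below. This is a minimal illustration, not the project's actual retry code: `call_with_retry`, its parameters, and `FakeInternalServerError` are hypothetical names, and the stand-in exception class replaces `openai.InternalServerError` only so that the sketch runs without the SDK installed.

```python
import time


class FakeInternalServerError(Exception):
    """Hypothetical stand-in for openai.InternalServerError."""


def call_with_retry(call, retryable, max_attempts=3, base_delay=0.5, sleep=time.sleep):
    """Run call(), retrying with exponential backoff on retryable exceptions.

    retryable is a tuple of exception classes. Any other exception
    propagates immediately, and the last retryable one is re-raised
    once max_attempts is exhausted.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except retryable:
            if attempt == max_attempts:
                raise  # out of attempts: surface the error to the caller
            # Delay doubles each attempt: base_delay, 2 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
```

Because the attempt count is capped, an error that never clears is still re-raised after the final attempt rather than being retried indefinitely.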
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| packages/server/src/memmachine_server/common/language_model/openai_responses_language_model.py | Adds InternalServerError to the retryable exception tuple for response generation. |
| packages/server/src/memmachine_server/common/language_model/openai_chat_completions_language_model.py | Adds InternalServerError to the retryable exception tuple for chat completions generation. |
| packages/server/src/memmachine_server/common/embedder/openai_embedder.py | Adds InternalServerError to the retryable exception tuple for embedding requests. |
    openai.RateLimitError,
    openai.APITimeoutError,
    openai.APIConnectionError,
    openai.InternalServerError,

    except (
        openai.RateLimitError,
        openai.APITimeoutError,
        openai.APIConnectionError,
        openai.InternalServerError,

    except (
        openai.RateLimitError,
        openai.APITimeoutError,
        openai.APIConnectionError,
        openai.InternalServerError,
sscargal left a comment
LGTM. Unit tests for this could be handled separately.
    openai.RateLimitError,
    openai.APITimeoutError,
    openai.APIConnectionError,
    openai.InternalServerError,
Is it possible to define all the retryable exceptions in one common place instead of hardcoding them everywhere?
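One way this suggestion could look is a single shared tuple that every call site imports. The sketch below is an assumption about the shape of such a refactor, not code from this PR: `RETRYABLE_OPENAI_ERRORS` and `is_retryable` are hypothetical names, and the `ImportError` fallback exists only so the example runs where the OpenAI SDK is not installed.

```python
from types import SimpleNamespace

try:
    import openai  # the real SDK, when available
except ImportError:
    # Stand-in exception classes so the sketch runs without the SDK.
    openai = SimpleNamespace(
        RateLimitError=type("RateLimitError", (Exception,), {}),
        APITimeoutError=type("APITimeoutError", (Exception,), {}),
        APIConnectionError=type("APIConnectionError", (Exception,), {}),
        InternalServerError=type("InternalServerError", (Exception,), {}),
    )

# Hypothetical shared constant: the single place where retryable errors are listed.
RETRYABLE_OPENAI_ERRORS = (
    openai.RateLimitError,
    openai.APITimeoutError,
    openai.APIConnectionError,
    openai.InternalServerError,
)


def is_retryable(error: BaseException) -> bool:
    """Return True if the error is one of the shared retryable types."""
    return isinstance(error, RETRYABLE_OPENAI_ERRORS)
```

Each language model and embedder could then write `except RETRYABLE_OPENAI_ERRORS:` so that a future addition to the list is made once.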
There is a reproducible InternalServerError where a retry won't help: they appear to have gradually rolled out changes that make it consistent rather than intermittent.
Purpose of the change
Hit intermittent internal server errors from the OpenAI API today.
Description
Add openai.InternalServerError to the retryable errors for the OpenAI APIs.
Type of change
How Has This Been Tested?
Checklist
Maintainer Checklist