feat: enhance memmachine-compose.sh first-run setup wizard#1273
Conversation
vLLM serves one model per instance, so LLM and embedding models often run on different endpoints. This adds prompts for separate base URLs for OpenAI-compatible providers instead of applying a single URL to both. Signed-off-by: Kwangjin Ko <kwangjin.ko@sk.com>
When users decline to set API keys for OpenAI-compatible providers (answering 'N'), auto-populate with 'EMPTY' instead of leaving <YOUR_API_KEY> placeholders that cause runtime errors. This supports providers like vLLM that don't require API keys. Also recognizes 'EMPTY' as a valid value during environment checks. Signed-off-by: Kwangjin Ko <kwangjin.ko@sk.com>
Add interactive reranker configuration during first-run setup. Default is RRF hybrid (identity + BM25 for CPU, + cross-encoder for GPU). Optionally replace cross-encoder (GPU) or extend (CPU) with Cohere or AWS Bedrock reranker, including credential prompts. Signed-off-by: Kwangjin Ko <kwangjin.ko@sk.com>
94f4139 to
fd05dbd
Compare
|
Added GPG signatures to the commits. No code changes. |
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Enhances the memmachine-compose.sh first-run wizard to better support heterogeneous deployments by allowing distinct LLM vs embedding endpoints, tolerating keyless OpenAI-compatible providers, and adding interactive reranker configuration.
Changes:
- Split OpenAI-compatible base URL configuration into separate LLM and embedding prompts (with embedding defaulting to LLM URL).
- Treat “no API key” as
EMPTYfor OpenAI-compatible flows and recognize it in env checks. - Add first-run reranker configuration flow (Cohere/AWS options) with YAML patching.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if [ "$is_gpu" = true ]; then | ||
| print_info "Default reranker: RRF hybrid (identity + BM25 + cross-encoder)" | ||
| print_prompt | ||
| read -p "Replace cross-encoder provider? (None/Cohere/AWS) [None]: " reranker_choice |
There was a problem hiding this comment.
configure_reranker introduces several variables without local (e.g., reranker_choice, reply, cohere_key, cohere_model, aws_access_key, etc.), which makes them global in bash and risks accidental interference with variables in other functions (this script already uses reply elsewhere). Also, reranker_choice is uppercased but not trimmed; inputs like Cohere (trailing space) won’t match the case labels. Recommendation: declare these as local and normalize input by trimming leading/trailing whitespace before uppercasing.
| read -sp "Enter your AWS Access Key ID: " aws_access_key | ||
| echo | ||
| print_prompt | ||
| read -sp "Enter your AWS Secret Access Key: " aws_secret_key | ||
| echo | ||
| print_prompt | ||
| read -p "Enter your AWS Region [us-west-2]: " aws_region |
There was a problem hiding this comment.
configure_reranker introduces several variables without local (e.g., reranker_choice, reply, cohere_key, cohere_model, aws_access_key, etc.), which makes them global in bash and risks accidental interference with variables in other functions (this script already uses reply elsewhere). Also, reranker_choice is uppercased but not trimmed; inputs like Cohere (trailing space) won’t match the case labels. Recommendation: declare these as local and normalize input by trimming leading/trailing whitespace before uppercasing.
| if [ "$is_first_run" = true ]; then | ||
| print_prompt | ||
| read -p "Model base URL is not set. Would you like to configure a custom model base URL? (y/N) " reply | ||
| read -p "Model base URL is not set. Would you like to configure custom base URLs? (y/N) " reply |
There was a problem hiding this comment.
The prompt states “Model base URL is not set”, but the generated config typically already has a default base URL (e.g., https://api.openai.com/v1). This message is misleading even when the intent is “configure/change base URLs”. Suggest rewording to something like “Would you like to configure custom base URLs for LLM/embedding? (y/N)” without implying a missing value.
| read -p "Model base URL is not set. Would you like to configure custom base URLs? (y/N) " reply | |
| read -p "Would you like to configure custom base URLs for LLM/embedding? (y/N) " reply |
| if ! grep -q "aws_reranker_id" <(awk '/my_reranker_id:/,/^ [a-zA-Z]/' configuration.yml); then | ||
| safe_sed_inplace "/reranker_ids:/,/^ [a-zA-Z]/ { /- bm_ranker_id/a\\ - aws_reranker_id | ||
| }" configuration.yml | ||
| fi |
There was a problem hiding this comment.
The multi-line sed program (with an embedded newline inside the quoted argument) is fragile across sed implementations—especially given this script explicitly supports BSD/macOS sed. This can lead to reranker IDs not being inserted (or malformed configuration.yml) on some platforms. A more portable approach would be to avoid embedded-newline sed scripts (e.g., use an awk-based insertion in a temp file, or generate a temporary sed script file and pass it via -f).
| safe_sed_inplace "s|cohere_key: <COHERE_API_KEY>|cohere_key: $cohere_key|g" configuration.yml | ||
| print_success "Set Cohere API key in configuration.yml" |
There was a problem hiding this comment.
If the user answers “yes” to setting the Cohere key but then presses Enter at the secret prompt, the script will write an empty value into YAML (likely cohere_key: with nothing). That differs from the intended EMPTY sentinel behavior and can cause runtime parsing/config issues. Consider treating an empty input as EMPTY (and emitting the same warning) for consistency with the “no key” path.
| safe_sed_inplace "s|cohere_key: <COHERE_API_KEY>|cohere_key: $cohere_key|g" configuration.yml | |
| print_success "Set Cohere API key in configuration.yml" | |
| if [[ -n "$cohere_key" ]]; then | |
| safe_sed_inplace "s|cohere_key: <COHERE_API_KEY>|cohere_key: $cohere_key|g" configuration.yml | |
| print_success "Set Cohere API key in configuration.yml" | |
| else | |
| safe_sed_inplace "s|cohere_key: <COHERE_API_KEY>|cohere_key: EMPTY|g" configuration.yml | |
| print_warning "Cohere API key set to 'EMPTY'. Update it later if needed." | |
| fi |
…e#1273) * feat: support separate base URLs for LLM and embedding endpoints vLLM serves one model per instance, so LLM and embedding models often run on different endpoints. This adds prompts for separate base URLs for OpenAI-compatible providers instead of applying a single URL to both. Signed-off-by: Kwangjin Ko <kwangjin.ko@sk.com> * feat: graceful API key handling when user declines configuration When users decline to set API keys for OpenAI-compatible providers (answering 'N'), auto-populate with 'EMPTY' instead of leaving <YOUR_API_KEY> placeholders that cause runtime errors. This supports providers like vLLM that don't require API keys. Also recognizes 'EMPTY' as a valid value during environment checks. Signed-off-by: Kwangjin Ko <kwangjin.ko@sk.com> * feat: add reranker configuration prompts during setup Add interactive reranker configuration during first-run setup. Default is RRF hybrid (identity + BM25 for CPU, + cross-encoder for GPU). Optionally replace cross-encoder (GPU) or extend (CPU) with Cohere or AWS Bedrock reranker, including credential prompts. Signed-off-by: Kwangjin Ko <kwangjin.ko@sk.com> --------- Signed-off-by: Kwangjin Ko <kwangjin.ko@sk.com>
Purpose of the change
Improve the
memmachine-compose.shsetup wizard to support separate base URLs for LLM and embedding services, handle providers that don't require API keys, and add interactive reranker configuration during first-run setup.Description
The current setup script assumes a single base URL for both LLM and embedding, requires a valid API key, and does not provide any reranker configuration prompts. This is limiting for on-premise setups where LLM and embedding services run on different endpoints, providers like vLLM don't require API keys, and users may want to configure Cohere or AWS Bedrock rerankers during initial setup.
This PR makes three changes to
memmachine-compose.sh:base_urlprompt into two — one for LLM and one for embedding. If the user doesn't need a separate embedding URL, it defaults to the LLM URL.EMPTYinstead of leaving the placeholder<YOUR_API_KEY>, which would cause runtime errors. Thecheck_required_envstep also recognizesEMPTYas a valid value for keyless providers like vLLM.configure_rerankerfunction to the first-run setup flow. On GPU images, it offers to replace the default cross-encoder with Cohere or AWS Bedrock. On CPU images, it offers to add an optional neural reranker alongside the default RRF hybrid (identity + BM25). Handles API key input, model selection, andconfiguration.ymlpatching for both providers.Fixes/Closes
Fixes #1213 (partially fix improvement 1,3,4)
Type of change
How Has This Been Tested?
Test Results:
Snippet of
configuration.ymlChecklist
Maintainer Checklist
Screenshots/Gifs
N/A
Further comments
None