feat: enhance memmachine-compose.sh first-run setup wizard #1273

Merged
sscargal merged 3 commits into MemMachine:main from skhynix:1312-feat-mm-compose on Apr 17, 2026

@1saac-k 1saac-k commented Mar 26, 2026

Contributor

Purpose of the change

Improve the memmachine-compose.sh setup wizard to support separate base URLs for LLM and embedding services, handle providers that don't require API keys, and add interactive reranker configuration during first-run setup.

Description

The current setup script assumes a single base URL for both LLM and embedding, requires a valid API key, and does not provide any reranker configuration prompts. This is limiting for on-premise setups where LLM and embedding services run on different endpoints, providers like vLLM don't require API keys, and users may want to configure Cohere or AWS Bedrock rerankers during initial setup.

This PR makes three changes to memmachine-compose.sh:

  1. Separate base URLs for LLM and embedding: Split the single base_url prompt into two — one for LLM and one for embedding. If the user doesn't need a separate embedding URL, it defaults to the LLM URL.
  2. Graceful API key handling: When the user declines to enter an API key, auto-populate it with EMPTY instead of leaving the placeholder <YOUR_API_KEY>, which would cause runtime errors. The check_required_env step also recognizes EMPTY as a valid value for keyless providers like vLLM.
  3. Reranker configuration prompts: Add a new configure_reranker function to the first-run setup flow. On GPU images, it offers to replace the default cross-encoder with Cohere or AWS Bedrock. On CPU images, it offers to add an optional neural reranker alongside the default RRF hybrid (identity + BM25). Handles API key input, model selection, and configuration.yml patching for both providers.
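
The first two changes above can be sketched as two small helpers. This is a minimal illustration, not the script's actual code: the function names `resolve_embedding_url` and `normalize_api_key` are hypothetical, and only the `EMPTY` sentinel and the `<YOUR_API_KEY>` placeholder come from the description.

```bash
#!/usr/bin/env bash

# Hypothetical helper: choose the embedding base URL.
# $1 = LLM base URL, $2 = user-supplied embedding URL (may be empty).
resolve_embedding_url() {
    local llm_url="$1"
    local embedding_url="${2:-}"
    # ${var:-default} expands to the default when var is unset or empty,
    # so a blank answer falls back to the LLM URL.
    printf '%s\n' "${embedding_url:-$llm_url}"
}

# Hypothetical helper: map a declined or blank API key to the EMPTY sentinel
# instead of leaving the <YOUR_API_KEY> placeholder in place.
normalize_api_key() {
    local key="${1:-}"
    if [ -z "$key" ] || [ "$key" = "<YOUR_API_KEY>" ]; then
        printf 'EMPTY\n'
    else
        printf '%s\n' "$key"
    fi
}
```

In a wizard, the answers read from the user would be passed through these helpers before being written to `.env` or `configuration.yml`, so the fallback logic stays in one place.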

Fixes/Closes

Partially fixes #1213 (improvements 1, 3, and 4)

Type of change

  • New feature (non-breaking change which adds functionality)

How Has This Been Tested?

  • Manual verification (list step-by-step instructions)

Test Results:

$ ./memmachine-compose.sh
[SUCCESS] Docker and Docker Compose are available
MemMachine Docker Startup Script
====================================

[SUCCESS] Docker and Docker Compose are available
[SUCCESS] .env file found
[WARNING] configuration.yml file not found. Creating from template...
[PROMPT] Which configuration would you like to use for the Docker Image? (CPU/GPU) [CPU]: GPU
[INFO] GPU configuration selected.
[PROMPT] Which provider would you like to use? (OpenAI/Bedrock/Ollama/OpenAI-compatible) [OpenAI]: OpenAI-compatible
[INFO] Selected provider: OPENAI_COMPATIBLE
[SUCCESS] Set MEMMACHINE_IMAGE to memmachine/memmachine:latest-gpu in .env file
[PROMPT] Which OpenAI-compatible LLM model would you like to use? [qwen-flash]:
[SUCCESS] Selected OpenAI-compatible LLM model: qwen-flash
[PROMPT] Which OpenAI-compatible embedding model would you like to use? [text-embedding-v4]:
[SUCCESS] Selected OpenAI-compatible embedding model: text-embedding-v4
[INFO] Generating configuration file for OPENAI_COMPATIBLE provider...
[SUCCESS] Generated configuration file with OPENAI_COMPATIBLE provider settings
[PROMPT] API key is not set. Would you like to set your API key for the OpenAI-compatible provider? (y/N)
[WARNING] API key set to 'EMPTY'. Update it later if your provider requires authentication.     <----
[PROMPT] Model base URL is not set. Would you like to configure custom base URLs? (y/N) y
[PROMPT] LLM base URL [https://api.openai.com/v1]: http://localhost:8080/v1
[PROMPT] Use a different base URL for embedding? (y/N) y     <----
[PROMPT] Embedding base URL [http://localhost:8080/v1]: http://localhost:9090/v1     <----
[SUCCESS] Set LLM base URL to http://localhost:8080/v1
[SUCCESS] Set embedding base URL to http://localhost:9090/v1
[INFO] Default reranker: RRF hybrid (identity + BM25 + cross-encoder)     <----
[PROMPT] Replace cross-encoder provider? (None/Cohere/AWS) [None]: Cohere     <----
[PROMPT] Would you like to set your Cohere API key? (y/N)     <----
[WARNING] Cohere API key set to 'EMPTY'. Update it later if needed.
[PROMPT] Cohere reranker model [rerank-english-v3.0]:     <----
[SUCCESS] Added Cohere reranker to hybrid configuration
[INFO] OPENAI_API_KEY is set to 'EMPTY' (no authentication - OK for providers like vLLM)
[SUCCESS] OpenAI-compatible base URL appears to be configured
[SUCCESS] API key in configuration.yml appears to be configured
[SUCCESS] Database credentials in configuration.yml appear to be configured
[INFO] Pulling and starting MemMachine services...
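
The `EMPTY` handling visible in the log above could be expressed as a small classifier. This is a sketch under assumptions: `classify_api_key` and its four result labels are hypothetical and do not reflect how `check_required_env` is actually written.

```bash
#!/usr/bin/env bash

# Hypothetical classifier for an API key value read from .env.
# Prints one of: missing, placeholder, keyless, configured.
classify_api_key() {
    local value="${1:-}"
    case "$value" in
        "")       echo "missing" ;;
        "<"*">")  echo "placeholder" ;;  # e.g. <YOUR_API_KEY>
        EMPTY)    echo "keyless" ;;      # accepted for providers like vLLM
        *)        echo "configured" ;;
    esac
}
```

A caller might treat `keyless` and `configured` as passing, and stop with an error on `missing` or `placeholder`.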

Snippet of configuration.yml

..
   openai_compatible_embedder:
      provider: openai
      config:
        max_input_length: 2048
        model: "text-embedding-v4"
        api_key: EMPTY     <----
        base_url: "http://localhost:9090/v1"     <----
        dimensions: 1536
..
    openai_compatible_model:
      provider: openai-chat-completions
      config:
        model: "qwen-flash"
        api_key: EMPTY     <----
        base_url: "http://localhost:8080/v1"     <----
..
 rerankers:
    my_reranker_id:
      provider: "rrf-hybrid"
      config:
        reranker_ids:
          - id_ranker_id
          - bm_ranker_id
          - cohere_reranker_id     <----
..
    cohere_reranker_id:
      provider: "cohere"
      config:
        cohere_key: EMPTY     <----
        model: "rerank-english-v3.0"

Checklist

  • I have signed the commit(s) within this pull request
  • My code follows the style guidelines of this project (See STYLE_GUIDE.md)
  • I have performed a self-review of my own code
  • My changes generate no new warnings
  • I have checked my code and corrected any misspellings

Maintainer Checklist

  • Confirmed all checks passed
  • Contributor has signed the commit(s)
  • Reviewed the code
  • Run, Tested, and Verified the change(s) work as expected

Screenshots/Gifs

N/A

Further comments

None

1saac-k added 3 commits April 8, 2026 22:44

feat: support separate base URLs for LLM and embedding endpoints

vLLM serves one model per instance, so LLM and embedding models often
run on different endpoints. This adds prompts for separate base URLs
for OpenAI-compatible providers instead of applying a single URL to both.

Signed-off-by: Kwangjin Ko <kwangjin.ko@sk.com>

feat: graceful API key handling when user declines configuration

When users decline to set API keys for OpenAI-compatible providers
(answering 'N'), auto-populate with 'EMPTY' instead of leaving
<YOUR_API_KEY> placeholders that cause runtime errors. This supports
providers like vLLM that don't require API keys. Also recognizes
'EMPTY' as a valid value during environment checks.

Signed-off-by: Kwangjin Ko <kwangjin.ko@sk.com>

feat: add reranker configuration prompts during setup

Add interactive reranker configuration during first-run setup.
Default is RRF hybrid (identity + BM25 for CPU, + cross-encoder for GPU).
Optionally replace cross-encoder (GPU) or extend (CPU) with Cohere or
AWS Bedrock reranker, including credential prompts.

Signed-off-by: Kwangjin Ko <kwangjin.ko@sk.com>
@1saac-k 1saac-k force-pushed the 1312-feat-mm-compose branch from 94f4139 to fd05dbd Compare April 8, 2026 13:45
1saac-k commented Apr 8, 2026

Contributor Author

Added GPG signatures to the commits. No code changes.

@sscargal sscargal self-assigned this Apr 13, 2026
@sscargal sscargal added this to the v0.3.5 milestone Apr 13, 2026
@sscargal sscargal requested a review from Copilot April 17, 2026 19:10

Copilot AI left a comment

Contributor

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Enhances the memmachine-compose.sh first-run wizard to better support heterogeneous deployments by allowing distinct LLM vs embedding endpoints, tolerating keyless OpenAI-compatible providers, and adding interactive reranker configuration.

Changes:

  • Split OpenAI-compatible base URL configuration into separate LLM and embedding prompts (with embedding defaulting to LLM URL).
  • Treat “no API key” as EMPTY for OpenAI-compatible flows and recognize it in env checks.
  • Add first-run reranker configuration flow (Cohere/AWS options) with YAML patching.

Comment thread memmachine-compose.sh
if [ "$is_gpu" = true ]; then
print_info "Default reranker: RRF hybrid (identity + BM25 + cross-encoder)"
print_prompt
read -p "Replace cross-encoder provider? (None/Cohere/AWS) [None]: " reranker_choice

Copilot AI Apr 17, 2026

configure_reranker introduces several variables without local (e.g., reranker_choice, reply, cohere_key, cohere_model, aws_access_key, etc.), which makes them global in bash and risks accidental interference with variables in other functions (this script already uses reply elsewhere). Also, reranker_choice is uppercased but not trimmed, so inputs like "Cohere " (with a trailing space) won’t match the case labels. Recommendation: declare these as local and normalize input by trimming leading/trailing whitespace before uppercasing.
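
One way to apply this recommendation is sketched below. The helper names `normalize_choice` and `pick_reranker` are hypothetical and not part of the script. The sketch uses `tr` rather than `${var^^}` on the assumption that the script may run under the older bash shipped with macOS.

```bash
#!/usr/bin/env bash

# Hypothetical helper: trim surrounding whitespace, then uppercase.
normalize_choice() {
    local input="${1:-}"
    # Strip leading whitespace.
    input="${input#"${input%%[![:space:]]*}"}"
    # Strip trailing whitespace.
    input="${input%"${input##*[![:space:]]}"}"
    printf '%s\n' "$input" | tr '[:lower:]' '[:upper:]'
}

# Hypothetical mapping from a raw answer to a provider name.
# An empty answer is treated as the default, "None".
pick_reranker() {
    local choice   # local keeps this out of the global namespace
    choice="$(normalize_choice "${1:-}")"
    case "$choice" in
        COHERE)   echo "cohere" ;;
        AWS)      echo "aws" ;;
        ""|NONE)  echo "none" ;;
        *)        echo "unknown" ;;
    esac
}
```

With this shape, an answer typed with stray spaces still matches the intended case label, and neither helper leaks variables such as `reply` into other functions.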

Comment thread memmachine-compose.sh
Comment on lines +1019 to +1025
read -sp "Enter your AWS Access Key ID: " aws_access_key
echo
print_prompt
read -sp "Enter your AWS Secret Access Key: " aws_secret_key
echo
print_prompt
read -p "Enter your AWS Region [us-west-2]: " aws_region

Copilot AI Apr 17, 2026

configure_reranker introduces several variables without local (e.g., reranker_choice, reply, cohere_key, cohere_model, aws_access_key, etc.), which makes them global in bash and risks accidental interference with variables in other functions (this script already uses reply elsewhere). Also, reranker_choice is uppercased but not trimmed, so inputs like "Cohere " (with a trailing space) won’t match the case labels. Recommendation: declare these as local and normalize input by trimming leading/trailing whitespace before uppercasing.

Comment thread memmachine-compose.sh
if [ "$is_first_run" = true ]; then
print_prompt
- read -p "Model base URL is not set. Would you like to configure a custom model base URL? (y/N) " reply
+ read -p "Model base URL is not set. Would you like to configure custom base URLs? (y/N) " reply

Copilot AI Apr 17, 2026

The prompt states “Model base URL is not set”, but the generated config typically already has a default base URL (e.g., https://api.openai.com/v1). This message is misleading even when the intent is “configure/change base URLs”. Suggest rewording to something like “Would you like to configure custom base URLs for LLM/embedding? (y/N)” without implying a missing value.

Suggested change
- read -p "Model base URL is not set. Would you like to configure custom base URLs? (y/N) " reply
+ read -p "Would you like to configure custom base URLs for LLM/embedding? (y/N) " reply

Comment thread memmachine-compose.sh
Comment on lines +1040 to +1043
if ! grep -q "aws_reranker_id" <(awk '/my_reranker_id:/,/^ [a-zA-Z]/' configuration.yml); then
safe_sed_inplace "/reranker_ids:/,/^ [a-zA-Z]/ { /- bm_ranker_id/a\\ - aws_reranker_id
}" configuration.yml
fi

Copilot AI Apr 17, 2026

The multi-line sed program (with an embedded newline inside the quoted argument) is fragile across sed implementations—especially given this script explicitly supports BSD/macOS sed. This can lead to reranker IDs not being inserted (or malformed configuration.yml) on some platforms. A more portable approach would be to avoid embedded-newline sed scripts (e.g., use an awk-based insertion in a temp file, or generate a temporary sed script file and pass it via -f).
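
The awk-plus-temp-file approach suggested here might look like the sketch below. It is an illustration only: `add_reranker_id` is a hypothetical helper, and it assumes that `- bm_ranker_id` appears as a list entry in the file and that IDs contain no regex metacharacters.

```bash
#!/usr/bin/env bash

# Hypothetical helper: insert "- <new-id>" right after the first
# "- bm_ranker_id" list entry, unless that entry already exists.
# Usage: add_reranker_id <yaml-file> <new-id>
add_reranker_id() {
    local file="$1" new_id="$2" tmp rc
    # Nothing to do if the list entry is already present.
    if grep -Eq "^[[:space:]]*-[[:space:]]+${new_id}[[:space:]]*$" "$file"; then
        return 0
    fi
    tmp="$(mktemp)" || return 1
    awk -v id="$new_id" '
        { print }
        # After the first matching entry, emit the new one with the same indent.
        !done && /^[ \t]*-[ \t]+bm_ranker_id[ \t]*$/ {
            indent = $0
            sub(/-.*/, "", indent)   # keep only the leading whitespace
            print indent "- " id
            done = 1
        }
    ' "$file" > "$tmp" && cat "$tmp" > "$file"
    rc=$?
    rm -f "$tmp"
    return "$rc"
}
```

Because the awk program sits inside a single-quoted string and the new line is built with `print`, no sed `a\` continuation is involved, which is the part that tends to differ between GNU and BSD sed. Writing back with `cat` rather than `mv` keeps the original file's permissions.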

Comment thread memmachine-compose.sh
Comment on lines +995 to +996
safe_sed_inplace "s|cohere_key: <COHERE_API_KEY>|cohere_key: $cohere_key|g" configuration.yml
print_success "Set Cohere API key in configuration.yml"

Copilot AI Apr 17, 2026

If the user answers “yes” to setting the Cohere key but then presses Enter at the secret prompt, the script will write an empty value into YAML (likely cohere_key: with nothing). That differs from the intended EMPTY sentinel behavior and can cause runtime parsing/config issues. Consider treating an empty input as EMPTY (and emitting the same warning) for consistency with the “no key” path.

Suggested change
- safe_sed_inplace "s|cohere_key: <COHERE_API_KEY>|cohere_key: $cohere_key|g" configuration.yml
- print_success "Set Cohere API key in configuration.yml"
+ if [[ -n "$cohere_key" ]]; then
+     safe_sed_inplace "s|cohere_key: <COHERE_API_KEY>|cohere_key: $cohere_key|g" configuration.yml
+     print_success "Set Cohere API key in configuration.yml"
+ else
+     safe_sed_inplace "s|cohere_key: <COHERE_API_KEY>|cohere_key: EMPTY|g" configuration.yml
+     print_warning "Cohere API key set to 'EMPTY'. Update it later if needed."
+ fi

@sscargal sscargal left a comment

Contributor

LGTM

@sscargal sscargal merged commit 0b7b1dd into MemMachine:main Apr 17, 2026
48 checks passed
edwinyyyu pushed a commit to edwinyyyu/MemMachine that referenced this pull request Apr 20, 2026
…e#1273)

* feat: support separate base URLs for LLM and embedding endpoints

vLLM serves one model per instance, so LLM and embedding models often
run on different endpoints. This adds prompts for separate base URLs
for OpenAI-compatible providers instead of applying a single URL to both.

Signed-off-by: Kwangjin Ko <kwangjin.ko@sk.com>

* feat: graceful API key handling when user declines configuration

When users decline to set API keys for OpenAI-compatible providers
(answering 'N'), auto-populate with 'EMPTY' instead of leaving
<YOUR_API_KEY> placeholders that cause runtime errors. This supports
providers like vLLM that don't require API keys. Also recognizes
'EMPTY' as a valid value during environment checks.

Signed-off-by: Kwangjin Ko <kwangjin.ko@sk.com>

* feat: add reranker configuration prompts during setup

Add interactive reranker configuration during first-run setup.
Default is RRF hybrid (identity + BM25 for CPU, + cross-encoder for GPU).
Optionally replace cross-encoder (GPU) or extend (CPU) with Cohere or
AWS Bedrock reranker, including credential prompts.

Signed-off-by: Kwangjin Ko <kwangjin.ko@sk.com>

---------

Signed-off-by: Kwangjin Ko <kwangjin.ko@sk.com>

Development

Successfully merging this pull request may close these issues.

[Feat]: memmachine-compose.sh enhancement

3 participants