Embeddings Configuration¶
This project has two different embedding configurations, and which one applies depends entirely on TOOL_DISCOVERY_MODE.
Overview¶
Embeddings are used for semantic search in the registry.
Depending on deployment mode, semantic search is handled in one of two ways:
- `embedded`: local FAISS index with a local `sentence-transformers` model
- `external`: external vector backend with a configured embedding provider such as `aws_bedrock` or `openai`
Which Settings Apply?¶
| TOOL_DISCOVERY_MODE | Service Used | Relevant Variables |
|---|---|---|
| `embedded` | `EmbeddedFaissService` | `LOCAL_EMBEDDINGS_MODEL_NAME`, `LOCAL_EMBEDDINGS_MODEL_DIMENSIONS` |
| `external` | External vector backend | `VECTOR_STORE_TYPE`, `EMBEDDING_PROVIDER`, and provider-specific settings such as `EMBEDDING_MODEL`, `AWS_REGION`, `OPENAI_API_KEY`, `OPENAI_MODEL` |
In the current container setup, the default is TOOL_DISCOVERY_MODE=external.
Configuration¶
Embedded Mode¶
Use this configuration when TOOL_DISCOVERY_MODE=embedded.
Required variables:
- `TOOL_DISCOVERY_MODE=embedded`
- `LOCAL_EMBEDDINGS_MODEL_NAME`
- `LOCAL_EMBEDDINGS_MODEL_DIMENSIONS`
Example:
TOOL_DISCOVERY_MODE=embedded
LOCAL_EMBEDDINGS_MODEL_NAME=all-MiniLM-L6-v2
LOCAL_EMBEDDINGS_MODEL_DIMENSIONS=384
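Before pointing the registry at a new model, it can help to confirm the model's real output dimension locally. A minimal sketch, assuming the `sentence-transformers` package is installed:

```python
# Confirm that LOCAL_EMBEDDINGS_MODEL_DIMENSIONS matches the model's
# actual output dimension. Standalone sanity-check script, not registry code.
from sentence_transformers import SentenceTransformer

model_name = "all-MiniLM-L6-v2"  # LOCAL_EMBEDDINGS_MODEL_NAME
configured_dims = 384            # LOCAL_EMBEDDINGS_MODEL_DIMENSIONS

model = SentenceTransformer(model_name)
actual_dims = model.get_sentence_embedding_dimension()
if actual_dims != configured_dims:
    raise SystemExit(
        f"Dimension mismatch: model emits {actual_dims}, config says {configured_dims}"
    )
print(f"{model_name} emits {actual_dims} dimensions; config OK")
```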
External Mode¶
Use this configuration when TOOL_DISCOVERY_MODE=external.
Common variables:
- `TOOL_DISCOVERY_MODE=external`
- `VECTOR_STORE_TYPE`
- `EMBEDDING_PROVIDER`
External Mode with aws_bedrock¶
Required variables:
- `VECTOR_STORE_TYPE=weaviate`
- `EMBEDDING_PROVIDER=aws_bedrock`
- `EMBEDDING_MODEL`
- `AWS_REGION`
Optional variables:
- `AWS_ACCESS_KEY_ID`
- `AWS_SECRET_ACCESS_KEY`
- `AWS_SESSION_TOKEN`
Example:
TOOL_DISCOVERY_MODE=external
VECTOR_STORE_TYPE=weaviate
EMBEDDING_PROVIDER=aws_bedrock
EMBEDDING_MODEL=your_bedrock_embedding_model_id
AWS_REGION=us-east-1
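To confirm Bedrock access outside the registry, a minimal boto3 sketch can be useful. It assumes a Titan-style request body (`{"inputText": ...}`); other Bedrock model families use different body shapes, so treat the payload as an assumption:

```python
# Connectivity check for the Bedrock embedding path. A sketch, not the
# registry's code; the request/response shape assumes a Titan-style model.
import json

import boto3

region = "us-east-1"                          # AWS_REGION
model_id = "your_bedrock_embedding_model_id"  # EMBEDDING_MODEL

client = boto3.client("bedrock-runtime", region_name=region)
response = client.invoke_model(
    modelId=model_id,
    body=json.dumps({"inputText": "hello embeddings"}),
)
payload = json.loads(response["body"].read())
print(len(payload["embedding"]), "dimensions returned")
```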
External Mode with azure_openai¶
Cloud-based embedding service via Azure-hosted OpenAI models.
Example:
# In .env
VECTOR_STORE_TYPE=weaviate
EMBEDDING_PROVIDER=azure_openai
AZURE_OPENAI_API_KEY=<AZURE_OPENAI_API_KEY>
AZURE_OPENAI_ENDPOINT=https://example.openai.azure.com
AZURE_OPENAI_API_VERSION=2024-02-01
AZURE_OPENAI_DEPLOYMENT_NAME=<AZURE_OPENAI_DEPLOYMENT_NAME>
AZURE_OPENAI_MODEL=<AZURE_OPENAI_MODEL>
Characteristics:

- Cloud-based service via Azure
- Requires an Azure OpenAI resource and API key
- API costs apply (Azure pricing)
- Integrates with the Azure security model
- Data stays within Azure regions
- Supports private endpoints and VNet integration
- Enterprise-grade SLA and compliance

External Mode with openai¶
Required variables:

- `VECTOR_STORE_TYPE=weaviate`
- `EMBEDDING_PROVIDER=openai`
- `OPENAI_API_KEY`

Optional variables:

- `OPENAI_MODEL`

Example:
TOOL_DISCOVERY_MODE=external
VECTOR_STORE_TYPE=weaviate
EMBEDDING_PROVIDER=openai
OPENAI_API_KEY=sk-...
OPENAI_MODEL=text-embedding-3-small

Variable Reference¶
The table below lists both the current `EMBEDDING_PROVIDER`-style variables and the legacy `EMBEDDINGS_*` variables.

| Variable | Description | Default | Required |
|---|---|---|---|
| `EMBEDDINGS_PROVIDER` | Legacy provider type: `sentence-transformers` or `litellm` | `sentence-transformers` | No |
| `EMBEDDING_PROVIDER` | New format: `openai`, `aws_bedrock`, `azure_openai` | `aws_bedrock` | For new config |
| `EMBEDDINGS_MODEL_NAME` | Model identifier | `all-MiniLM-L6-v2` | Yes |
| `EMBEDDINGS_MODEL_DIMENSIONS` | Embedding dimension | `384` | Yes |
| `EMBEDDINGS_API_KEY` | API key for cloud provider (OpenAI, Cohere, etc.) | - | For cloud* |
| `EMBEDDINGS_API_BASE` | Custom API endpoint (LiteLLM only) | - | No |
| `EMBEDDINGS_AWS_REGION` | AWS region for Bedrock (LiteLLM only) | - | For Bedrock |
| `AZURE_OPENAI_API_KEY` | Azure OpenAI API key | - | For Azure OpenAI |
| `AZURE_OPENAI_ENDPOINT` | Azure OpenAI endpoint URL | - | For Azure OpenAI |
| `AZURE_OPENAI_API_VERSION` | Azure OpenAI API version | `2024-02-01` | No |
| `AZURE_OPENAI_DEPLOYMENT_NAME` | Azure OpenAI deployment name | - | For Azure OpenAI |
| `AZURE_OPENAI_MODEL` | Azure OpenAI model name | `text-embedding-3-small` | No |
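For the plain `openai` provider, a quick credential check outside the registry might look like the following sketch, using the official openai Python SDK; the registry itself wires these settings through its external vector backend:

```python
# Verify OPENAI_API_KEY and the configured embedding model with a single
# call. A sketch using the openai SDK (>= 1.0), not the registry's code.
import os

from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
result = client.embeddings.create(
    model=os.environ.get("OPENAI_MODEL", "text-embedding-3-small"),
    input="hello embeddings",
)
print(len(result.data[0].embedding), "dimensions returned")
```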
Embedded FAISS Mode¶
When TOOL_DISCOVERY_MODE=embedded, the registry loads EmbeddedFaissService and uses a local sentence-transformers model. In this mode, embeddings come from file-based Hugging Face models, not from OpenAI or Bedrock API calls.
Relevant settings:
| Variable | Description | Default |
|---|---|---|
| `LOCAL_EMBEDDINGS_MODEL_NAME` | Hugging Face sentence-transformers model name | `all-MiniLM-L6-v2` |
| `LOCAL_EMBEDDINGS_MODEL_DIMENSIONS` | Expected embedding dimension for the FAISS index | `384` |
Required variables when TOOL_DISCOVERY_MODE=embedded:
- `LOCAL_EMBEDDINGS_MODEL_NAME`
- `LOCAL_EMBEDDINGS_MODEL_DIMENSIONS`
Using Azure OpenAI (Terraform)¶
vector_store_type = "weaviate"
embedding_provider = "azure_openai"
azure_openai_api_key = "<AZURE_OPENAI_API_KEY>"
azure_openai_endpoint = "https://example.openai.azure.com"
azure_openai_api_version = "2024-02-01"
azure_openai_deployment_name = "<AZURE_OPENAI_DEPLOYMENT_NAME>"
azure_openai_model = "<AZURE_OPENAI_MODEL>"
See terraform/aws-ecs/terraform.tfvars.example for complete examples.
Notes:

- The model is loaded either from `registry/models/<model-name>` in local development or from `/app/registry/models/<model-name>` in the container (see the sketch after these notes).
- If the model is not already present, `sentence-transformers` downloads it from Hugging Face Hub.
- The Hugging Face cache is stored next to the local models directory under `.cache`.
- No local embedding provider, API key, or AWS region setting is used in this mode.
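To illustrate that lookup order, a hypothetical loader sketch; the helper name and the exact cache path are assumptions, not the registry's actual code:

```python
# Mirror the lookup order described above: prefer the local models
# directory, fall back to the container path, then let
# sentence-transformers download from Hugging Face Hub.
from pathlib import Path

from sentence_transformers import SentenceTransformer


def load_local_model(model_name: str) -> SentenceTransformer:
    candidates = [
        Path("registry/models") / model_name,       # local development
        Path("/app/registry/models") / model_name,  # container
    ]
    for path in candidates:
        if path.is_dir():
            return SentenceTransformer(str(path))
    # Not on disk: download from the Hub. The cache location next to the
    # models directory is an assumption based on the notes above.
    return SentenceTransformer(model_name, cache_folder="registry/.cache")
```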
Common Embedded Models¶
| Model | Typical Dimensions | Notes |
|---|---|---|
| `all-MiniLM-L6-v2` | 384 | Default, lightweight |
| `all-mpnet-base-v2` | 768 | Higher quality, larger model |
| `paraphrase-multilingual-MiniLM-L12-v2` | 384 | Multilingual use cases |
External Vector Search Mode¶
When TOOL_DISCOVERY_MODE=external, the registry does not load EmbeddedFaissService. Semantic vectorization is handled by the external vector backend configuration instead.
Relevant settings:
| Variable | Description | Default |
|---|---|---|
| `VECTOR_STORE_TYPE` | Vector store backend | `weaviate` |
| `EMBEDDING_PROVIDER` | Embedding provider for the external vector backend | `aws_bedrock` |
| `EMBEDDING_MODEL` | Embedding model ID used by the provider | provider-specific default |
| `AWS_REGION` | AWS region for Bedrock embeddings | `us-east-1` |
| `OPENAI_API_KEY` | OpenAI API key when `EMBEDDING_PROVIDER=openai` | - |
| `OPENAI_MODEL` | OpenAI embedding model name | `text-embedding-3-small` |
Notes:
- In AWS environments, an IAM role or the default AWS credential chain is preferred over hardcoded credentials.
- `EMBEDDING_MODEL` is the model identifier passed through pydantic for Bedrock embeddings.
- If `OPENAI_MODEL` is not provided, the default is `text-embedding-3-small`.
- `EMBEDDING_MODEL` is not used for the OpenAI embedding path in the current configuration model.
Azure OpenAI Models¶

- `text-embedding-3-small` (1536 dimensions)
- `text-embedding-3-large` (3072 dimensions)
- `text-embedding-ada-002` (1536 dimensions)

Note: When using Azure OpenAI, specify the deployment name configured in your Azure OpenAI resource, not the underlying model name, as shown in the sketch below.
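A sketch of that distinction with the official openai SDK's AzureOpenAI client, where the `model` argument carries the deployment name; endpoint, key, and deployment values are placeholders:

```python
# With Azure OpenAI, the embeddings call is addressed to a *deployment*,
# not a model name. A sketch using the openai SDK (>= 1.0).
import os

from openai import AzureOpenAI

client = AzureOpenAI(
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_version=os.environ.get("AZURE_OPENAI_API_VERSION", "2024-02-01"),
)
result = client.embeddings.create(
    model=os.environ["AZURE_OPENAI_DEPLOYMENT_NAME"],  # deployment, not model name
    input="hello embeddings",
)
print(len(result.data[0].embedding), "dimensions returned")
```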
Other Providers¶
- Azure OpenAI
- Anthropic (Claude)
- Google Vertex AI
- Hugging Face Inference API
- And 100+ more via LiteLLM
Common External Models¶
| Provider | Model | Typical Dimensions |
|---|---|---|
| `aws_bedrock` | bedrock-model-v2 | 1024 |
| `aws_bedrock` | bedrock-model-v1 | 1536 |
| `openai` | `text-embedding-3-small` | 1536 |
| `openai` | `text-embedding-3-large` | 3072 |
Behavior When Settings Change¶
Embedded Mode¶
If you change LOCAL_EMBEDDINGS_MODEL_NAME or LOCAL_EMBEDDINGS_MODEL_DIMENSIONS, make sure the configured dimension matches the model's real output dimension.
If the configured dimension does not match the existing FAISS index dimension, the registry re-initializes the FAISS index.
In practice, this means:
- switching to a model with a different output dimension requires rebuilding the local FAISS index
- using the wrong dimension value can cause the system to discard the old index and start a new one (see the sketch below)
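A minimal reproduction of this behavior with plain faiss, assuming the `faiss` package; this is an illustration, not the registry's code:

```python
# A FAISS index is fixed to one dimension; vectors of another size cannot
# be added, so a dimension change forces a fresh, empty index.
import faiss

existing = faiss.IndexFlatIP(384)  # index built for a 384-dimension model
configured_dims = 768              # e.g. after switching to all-mpnet-base-v2

if existing.d != configured_dims:
    # Start over: the old vectors are unusable at the new dimension, and
    # everything must be re-embedded into the new index.
    existing = faiss.IndexFlatIP(configured_dims)

print("index dimension:", existing.d, "vectors stored:", existing.ntotal)
```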
External Mode¶
If you change EMBEDDING_PROVIDER, EMBEDDING_MODEL, or provider-specific settings, semantic vectorization behavior changes in the external vector backend path.
In this mode the registry does not load EmbeddedFaissService, so local FAISS-specific settings do not apply.
Operational Notes¶
- `EMBEDDING_MODEL` is the pydantic-backed setting used for external embedding model selection. `BEDROCK_MODEL` is no longer the source of truth for registry vectorization.
- Because the current container setup defaults to `TOOL_DISCOVERY_MODE=external`, local FAISS embedding settings usually do not need to be added to AWS Secrets Manager.
- If you change the local sentence-transformers model, make sure `LOCAL_EMBEDDINGS_MODEL_DIMENSIONS` matches the model output dimension, or the FAISS index will be rebuilt.
- `LOCAL_EMBEDDINGS_MODEL_NAME` and `LOCAL_EMBEDDINGS_MODEL_DIMENSIONS` only apply when `TOOL_DISCOVERY_MODE=embedded`.
- `EMBEDDING_PROVIDER`, `EMBEDDING_MODEL`, `AWS_REGION`, `OPENAI_API_KEY`, and `OPENAI_MODEL` only apply when `TOOL_DISCOVERY_MODE=external`.
Troubleshooting¶
Embedded Mode: Model Downloads or Cache Location¶
- If the local model is not already present, `sentence-transformers` downloads it from Hugging Face Hub.
- The cache directory is stored next to the local models directory under `.cache`.
- In container mode, models live under `/app/registry/models`.
Embedded Mode: Dimension Mismatch¶
Symptoms:
- FAISS index is re-initialized on startup
- semantic search index appears to be rebuilt after changing the model
```python
class FaissService:
    async def _load_embedding_model(self):
        # Build the embeddings client from the EMBEDDINGS_* settings.
        self.embedding_model = create_embeddings_client(
            provider=settings.embeddings_provider,
            model_name=settings.embeddings_model_name,
            api_key=settings.embeddings_api_key,
            aws_region=settings.embeddings_aws_region,
            embedding_dimension=settings.embeddings_model_dimensions,
        )
```
What to check:
- `LOCAL_EMBEDDINGS_MODEL_NAME`
- `LOCAL_EMBEDDINGS_MODEL_DIMENSIONS`
- whether the configured dimension matches the selected sentence-transformers model
External Mode: Bedrock Authentication¶
What to check:
- `EMBEDDING_PROVIDER=aws_bedrock`
- `EMBEDDING_MODEL`
- `AWS_REGION`
- IAM role or AWS credential chain
In AWS environments, IAM-based authentication is preferred over hardcoded credentials.
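To see which identity the default credential chain actually resolves to before digging into Bedrock errors, a quick boto3 sketch:

```python
# Print the caller identity resolved by the default AWS credential chain.
import boto3

identity = boto3.client("sts").get_caller_identity()
print("Account:", identity["Account"])
print("ARN:", identity["Arn"])  # in AWS this should be the task or instance role
```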
External Mode: OpenAI Authentication¶
What to check:
- `EMBEDDING_PROVIDER=openai`
- `OPENAI_API_KEY`
- `OPENAI_MODEL` if you are overriding the default
API Reference¶
This section summarizes the main code-level configuration entry points used by the registry embeddings flow.
Settings¶
The registry application reads embedding-related settings from registry.core.config.Settings.
Important fields:
- `tool_discovery_mode`
- `local_embeddings_model_name`
- `local_embeddings_model_dimensions`
- `vector_store_type`
- `embedding_provider`
- `embedding_model`
- `aws_region`
- `openai_api_key`
- `openai_model`
VectorConfig¶
External vectorization settings are passed into the shared registry_pkgs.core.config.VectorConfig model.
Important fields:
- `vector_store_type`
- `embedding_provider`
- `embedding_model`
- `aws_region`
- `openai_api_key`
- `openai_model`
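Based on that field list and the defaults documented above, the model plausibly has the following shape; this is a sketch only, and the real definition in registry_pkgs.core.config may differ:

```python
# Approximate shape of the external vectorization settings, inferred from
# the field list above. Not the actual VectorConfig definition.
from typing import Optional

from pydantic import BaseModel


class VectorConfig(BaseModel):
    vector_store_type: str = "weaviate"
    embedding_provider: str = "aws_bedrock"
    embedding_model: Optional[str] = None  # provider-specific default applies
    aws_region: str = "us-east-1"
    openai_api_key: Optional[str] = None
    openai_model: str = "text-embedding-3-small"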
RegistryContainer.vector_service¶
The registry selects the vector search implementation in RegistryContainer.vector_service:
- when `tool_discovery_mode == "external"`, it returns `ExternalVectorSearchService`
- otherwise, it returns `EmbeddedFaissService`
This is the main switch that determines whether local FAISS settings or external embedding provider settings are used.
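Reduced to its essentials, the switch looks like the following sketch; the class wiring is illustrative, not the actual container code:

```python
# The mode switch described above, stripped to the selection logic.
def vector_service(settings, embedded_cls, external_cls):
    """Pick the vector search implementation from TOOL_DISCOVERY_MODE."""
    if settings.tool_discovery_mode == "external":
        # External path: VECTOR_STORE_TYPE, EMBEDDING_PROVIDER, etc. apply.
        return external_cls(settings)
    # Embedded path: LOCAL_EMBEDDINGS_* settings apply.
    return embedded_cls(settings)
```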
EmbeddedFaissService¶
EmbeddedFaissService is used only in embedded mode.
Relevant behavior:
- loads a local `sentence-transformers` model using `LOCAL_EMBEDDINGS_MODEL_NAME`
- validates or initializes the FAISS index using `LOCAL_EMBEDDINGS_MODEL_DIMENSIONS`
- stores model files under the local embeddings model directory
- stores the Hugging Face cache under the adjacent `.cache` directory
BedrockEmbeddingConfig¶
For EMBEDDING_PROVIDER=aws_bedrock, the external embedding model configuration is derived from:
- `embedding_model`
- `aws_region`
- optional AWS credentials
The model value comes from EMBEDDING_MODEL, not BEDROCK_MODEL.
OpenAIEmbeddingConfig¶
For EMBEDDING_PROVIDER=openai, the external embedding model configuration is derived from:
- `openai_api_key`
- `openai_model`
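For illustration, the two derivations can be sketched as plain dataclasses; field names here are assumptions based on the lists above, and the real config classes may differ:

```python
# Plausible shapes for the two provider configs, derived from the settings
# documented above. A sketch, not the actual class definitions.
from dataclasses import dataclass
from typing import Optional


@dataclass
class BedrockEmbeddingConfig:
    model: str                            # from EMBEDDING_MODEL, not BEDROCK_MODEL
    region: str                           # from AWS_REGION
    access_key_id: Optional[str] = None   # optional explicit credentials
    secret_access_key: Optional[str] = None
    session_token: Optional[str] = None


@dataclass
class OpenAIEmbeddingConfig:
    api_key: str                          # from OPENAI_API_KEY
    model: str = "text-embedding-3-small" # from OPENAI_MODEL
```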