# Azure AI Search - Vector Store
Use Azure AI Search as a vector store for RAG.
## Quick Start
You need three things:
- An Azure AI Search service
- An embedding model (to convert your queries to vectors)
- A search index with vector fields
## Usage
### Basic Search
```python
from litellm import vector_stores
import os

# Set your credentials
os.environ["AZURE_SEARCH_API_KEY"] = "your-search-api-key"
os.environ["AZURE_AI_SEARCH_EMBEDDING_API_BASE"] = "your-embedding-endpoint"
os.environ["AZURE_AI_SEARCH_EMBEDDING_API_KEY"] = "your-embedding-api-key"

# Search the vector store
response = vector_stores.search(
    vector_store_id="my-vector-index",  # Your Azure AI Search index name
    query="What is the capital of France?",
    custom_llm_provider="azure_ai",
    azure_search_service_name="your-search-service",
    litellm_embedding_model="azure/text-embedding-3-large",
    litellm_embedding_config={
        "api_base": os.getenv("AZURE_AI_SEARCH_EMBEDDING_API_BASE"),
        "api_key": os.getenv("AZURE_AI_SEARCH_EMBEDDING_API_KEY"),
    },
    api_key=os.getenv("AZURE_SEARCH_API_KEY"),
)
print(response)
```
### Async Search
```python
import os

from litellm import vector_stores

# Call from inside an async function
response = await vector_stores.asearch(
    vector_store_id="my-vector-index",
    query="What is the capital of France?",
    custom_llm_provider="azure_ai",
    azure_search_service_name="your-search-service",
    litellm_embedding_model="azure/text-embedding-3-large",
    litellm_embedding_config={
        "api_base": os.getenv("AZURE_AI_SEARCH_EMBEDDING_API_BASE"),
        "api_key": os.getenv("AZURE_AI_SEARCH_EMBEDDING_API_KEY"),
    },
    api_key=os.getenv("AZURE_SEARCH_API_KEY"),
)
print(response)
```
### Advanced Options
```python
import os

from litellm import vector_stores

response = vector_stores.search(
    vector_store_id="my-vector-index",
    query="What is the capital of France?",
    custom_llm_provider="azure_ai",
    azure_search_service_name="your-search-service",
    litellm_embedding_model="azure/text-embedding-3-large",
    litellm_embedding_config={
        "api_base": os.getenv("AZURE_AI_SEARCH_EMBEDDING_API_BASE"),
        "api_key": os.getenv("AZURE_AI_SEARCH_EMBEDDING_API_KEY"),
    },
    api_key=os.getenv("AZURE_SEARCH_API_KEY"),
    top_k=10,  # Number of results to return
    azure_search_vector_field="contentVector",  # Custom vector field name
)
print(response)
```
## Setup Config
Add this to your config.yaml:
```yaml
vector_store_registry:
  - vector_store_name: "azure-ai-search-litellm-website-knowledgebase"
    litellm_params:
      vector_store_id: "test-litellm-app_1761094730750"
      custom_llm_provider: "azure_ai"
      api_key: os.environ/AZURE_SEARCH_API_KEY
      litellm_embedding_model: "azure/text-embedding-3-large"
      litellm_embedding_config:
        api_base: https://your-resource.cognitiveservices.azure.com/
        api_key: os.environ/AZURE_API_KEY
        api_version: "2025-09-01"
```
## Start Proxy
```shell
litellm --config /path/to/config.yaml
```
## Search via API
```shell
curl -X POST 'http://0.0.0.0:4000/v1/vector_stores/my-vector-index/search' \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer sk-1234' \
  -d '{
    "query": "What is the capital of France?"
  }'
```
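If you're calling the proxy from code, the request above can be sketched in Python. This only constructs the URL and body (sending it requires a running proxy); the base URL and `sk-1234` key are the placeholder values from the curl example.

```python
import json

# Build a request for the proxy's OpenAI-compatible vector store
# search endpoint: POST /v1/vector_stores/{vector_store_id}/search
def build_search_request(proxy_base: str, vector_store_id: str, query: str):
    url = f"{proxy_base}/v1/vector_stores/{vector_store_id}/search"
    headers = {
        "Content-Type": "application/json",
        "Authorization": "Bearer sk-1234",  # placeholder proxy key
    }
    body = json.dumps({"query": query})
    return url, headers, body

url, headers, body = build_search_request(
    "http://0.0.0.0:4000", "my-vector-index", "What is the capital of France?"
)
print(url)
```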
## Required Parameters
| Parameter | Type | Description |
|---|---|---|
| `vector_store_id` | string | Your Azure AI Search index name |
| `custom_llm_provider` | string | Set to `"azure_ai"` |
| `azure_search_service_name` | string | Name of your Azure AI Search service |
| `litellm_embedding_model` | string | Model to generate query embeddings (e.g., `"azure/text-embedding-3-large"`) |
| `litellm_embedding_config` | dict | Config for the embedding model (`api_base`, `api_key`, `api_version`) |
| `api_key` | string | Your Azure AI Search API key |
## Supported Features
| Feature | Status | Notes |
|---|---|---|
| Logging | ✅ Supported | Full logging support available |
| Guardrails | ❌ Not Yet Supported | Guardrails are not currently supported for vector stores |
| Cost Tracking | ✅ Supported | Searches are tracked at $0 cost, per Azure pricing |
| Unified API | ✅ Supported | Call via the OpenAI-compatible `/v1/vector_stores/search` endpoint |
| Passthrough | ❌ Not Yet Supported | Passthrough is not currently supported |
## Response Format
The response follows the standard LiteLLM vector store format:
```json
{
  "object": "vector_store.search_results.page",
  "search_query": "What is the capital of France?",
  "data": [
    {
      "score": 0.95,
      "content": [
        {
          "text": "Paris is the capital of France...",
          "type": "text"
        }
      ],
      "file_id": "doc_123",
      "filename": "Document doc_123",
      "attributes": {
        "document_id": "doc_123"
      }
    }
  ]
}
```
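As a quick sketch of consuming this shape, the snippet below pulls text snippets and scores out of a response structured like the example above (the `response` dict here is that sample, hard-coded for illustration):

```python
# Sample response shaped like the format documented above
response = {
    "object": "vector_store.search_results.page",
    "search_query": "What is the capital of France?",
    "data": [
        {
            "score": 0.95,
            "content": [
                {"text": "Paris is the capital of France...", "type": "text"}
            ],
            "file_id": "doc_123",
            "filename": "Document doc_123",
            "attributes": {"document_id": "doc_123"},
        }
    ],
}

# Extract the text snippets from each result, keeping the score
for result in response["data"]:
    texts = [c["text"] for c in result["content"] if c["type"] == "text"]
    print(f"{result['score']:.2f}  {result['filename']}: {texts[0]}")
```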
## How It Works
When you search:
- LiteLLM converts your query to a vector using the embedding model you specified
- It sends the vector to Azure AI Search
- Azure AI Search finds the most similar documents in your index
- Results come back with similarity scores
The embedding model can be any model supported by LiteLLM (Azure OpenAI, OpenAI, Bedrock, etc.).
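The ranking step can be illustrated with a tiny sketch. This is not LiteLLM or Azure internals, just a toy cosine-similarity ranking over hand-made 3-dimensional vectors to show why the document whose embedding is closest to the query embedding comes back first:

```python
import math

# Cosine similarity between two equal-length vectors
def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

query_vec = [0.9, 0.1, 0.0]  # stand-in for the embedded query
docs = {
    "paris.txt": [0.8, 0.2, 0.1],  # points in nearly the same direction
    "tokyo.txt": [0.1, 0.9, 0.3],  # points elsewhere
}

# Rank documents by similarity to the query, highest first
ranked = sorted(docs, key=lambda d: cosine(query_vec, docs[d]), reverse=True)
print(ranked[0])  # paris.txt scores highest
```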
## Setting Up Your Azure AI Search Index
Your index needs a vector field. Here's what that looks like:
```json
{
  "name": "my-vector-index",
  "fields": [
    {
      "name": "id",
      "type": "Edm.String",
      "key": true
    },
    {
      "name": "content",
      "type": "Edm.String"
    },
    {
      "name": "contentVector",
      "type": "Collection(Edm.Single)",
      "searchable": true,
      "dimensions": 1536,
      "vectorSearchProfile": "myVectorProfile"
    }
  ]
}
```
The vector dimensions must match your embedding model. For example:

- `text-embedding-3-large`: 3072 dimensions by default (configurable via the `dimensions` parameter)
- `text-embedding-3-small`: 1536 dimensions
- `text-embedding-ada-002`: 1536 dimensions
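A pre-flight check like the hypothetical one below can catch this mismatch before you index anything. The `EXPECTED_DIMS` table is illustrative and reflects each model's default output size:

```python
# Default output dimensions for common OpenAI/Azure OpenAI embedding models
EXPECTED_DIMS = {
    "text-embedding-3-large": 3072,
    "text-embedding-3-small": 1536,
    "text-embedding-ada-002": 1536,
}

# Hypothetical helper: does the index's vector field size match the model?
def dims_match(model: str, index_dims: int) -> bool:
    return EXPECTED_DIMS.get(model) == index_dims

print(dims_match("text-embedding-ada-002", 1536))  # True
print(dims_match("text-embedding-3-large", 1536))  # False: defaults to 3072
```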
## Common Issues
"Failed to generate embedding for query"
Your embedding model config is wrong. Check:
litellm_embedding_confighas the right api_base and api_key- The embedding model name is correct
- Your credentials work
"Index not found"
The vector_store_id doesn't match any index in your search service. Check:
- The index name is correct
- You're using the right search service name
"Field 'contentVector' not found"
Your index uses a different vector field name. Pass it via azure_search_vector_field.