
Bring Your Own AI (BYOC)

By default, AtlasAI's AI features (root cause analysis, runbook generation, AI copilot, and natural language search) use AtlasAI's managed AI infrastructure. With Bring Your Own AI (BYOC), you connect AtlasAI to your own AI provider instead, giving you full control over which model is used, where your data goes, and how much you pay for inference.

BYOC is available on the Operations and Enterprise plans. It is required for on-prem and air-gapped deployments.


When to use BYOC

| Reason | Explanation |
| --- | --- |
| Data sovereignty | Incident context (logs, metrics, configurations) stays in your network and never reaches AtlasAI's AI infrastructure |
| Cost control | You pay your AI provider directly; no AtlasAI AI credit consumption |
| Model choice | Use the specific model version your compliance team has approved |
| Air-gapped deployment | AtlasAI's cloud AI is inaccessible; your own provider runs inside your network |
| Existing contract | You have an enterprise agreement with Azure or AWS and want to use committed spend |

Supported AI providers

| Provider | Works air-gapped? | Required configuration |
| --- | --- | --- |
| OpenAI | No (requires api.openai.com) | API key, model name |
| Azure OpenAI | Yes (private endpoint) | API key, endpoint URL, deployment name |
| AWS Bedrock | Yes (VPC endpoint) | IAM role or access keys, region, model ID |
| Anthropic | No (requires api.anthropic.com) | API key, model name |
| Ollama | Yes (runs locally) | Endpoint URL, model name |
| vLLM | Yes (runs locally) | Endpoint URL, model name |
| Any OpenAI-compatible API | Depends on where it runs | Base URL, API key, model name |

Configuration: environment variables

The simplest way to configure BYOC is through environment variables. In most cases, changes take effect without a restart; the exception is the model routing cache, which refreshes every 5 minutes, so a change can take up to 5 minutes to apply.
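As an illustration of how these variables fit together (a hypothetical sketch, not AtlasAI's actual implementation), a service reading the documented `BYOC_*` variables might resolve its provider settings like this:

```python
import os

def load_byoc_config(env=os.environ):
    """Resolve BYOC provider settings from environment variables.

    Illustrative sketch only: shows how the documented BYOC_* variables
    relate to each other, not how AtlasAI actually reads them.
    """
    provider = env.get("BYOC_PROVIDER", "openai")
    config = {
        "provider": provider,
        "api_key": env.get("BYOC_API_KEY", ""),
        "model": env.get("BYOC_MODEL", ""),
        # Only Azure / Ollama / vLLM need a custom endpoint URL.
        "endpoint_url": env.get("BYOC_ENDPOINT_URL") or None,
    }
    if provider == "bedrock":
        config["region"] = env.get("BYOC_REGION", "us-east-1")
    return config

cfg = load_byoc_config({"BYOC_PROVIDER": "openai",
                        "BYOC_API_KEY": "sk-proj-abc123",
                        "BYOC_MODEL": "gpt-4o-mini"})
```

Note that `endpoint_url` stays unset for plain OpenAI, matching the provider table above: only Azure, Ollama, and vLLM require one.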

OpenAI

```shell
BYOC_PROVIDER=openai
BYOC_API_KEY=sk-proj-abc123...
BYOC_MODEL=gpt-4o-mini

# Optional: use a specific organization
# BYOC_API_KEY_ORG=org-abc123
```

Which model to choose:

  • gpt-4o — best quality, higher cost, higher latency
  • gpt-4o-mini — good quality, low cost, fast — recommended for most use cases
  • gpt-4-turbo — strong reasoning, 128k context

Azure OpenAI

```shell
BYOC_PROVIDER=azure
BYOC_API_KEY=your-azure-openai-api-key
BYOC_ENDPOINT_URL=https://your-resource-name.openai.azure.com
BYOC_MODEL=gpt-4o  # This is your deployment name in Azure, not the model name
```

For private endpoint access (air-gapped within Azure VNet):

```shell
BYOC_ENDPOINT_URL=https://your-private-endpoint.privatelink.openai.azure.com
```

AWS Bedrock

Option A: EC2 / ECS task role (recommended for AWS deployments)

No access keys needed. The Tenant Plane uses the IAM role assigned to the EC2 instance or ECS task:

```shell
TP_BEDROCK_USE_TASK_ROLE=1
TP_BEDROCK_MODEL_ID=anthropic.claude-3-haiku-20240307-v1:0
```

Ensure the IAM role has the `bedrock:InvokeModel` permission.
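A minimal IAM policy granting that permission might look like the following sketch; the resource ARN shown restricts invocation to the Haiku model from the example above (broaden or replace it for other models):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["bedrock:InvokeModel"],
      "Resource": "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-haiku-20240307-v1:0"
    }
  ]
}
```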

Option B: Access keys

```shell
BYOC_PROVIDER=bedrock
BYOC_REGION=us-east-1
BYOC_AWS_ACCESS_KEY_ID=AKIA...
BYOC_AWS_SECRET_ACCESS_KEY=abc123...
BYOC_MODEL=anthropic.claude-3-haiku-20240307-v1:0
```

Available Bedrock models:

| Model | Use case | Cost |
| --- | --- | --- |
| anthropic.claude-3-haiku-20240307-v1:0 | RCA, NL search | Low |
| anthropic.claude-3-sonnet-20240229-v1:0 | Runbook generation, complex reasoning | Medium |
| meta.llama3-8b-instruct-v1:0 | General purpose, lower cost | Low |
| amazon.titan-text-lite-v1 | Classification, short tasks | Very low |

Anthropic (direct)

```shell
BYOC_PROVIDER=anthropic
BYOC_API_KEY=sk-ant-api03-...
BYOC_MODEL=claude-3-5-sonnet-20241022
```

Ollama (self-hosted, fully local)

Ollama is an open-source tool that runs LLMs locally. Install it on a server in your network.

```shell
# Install Ollama
curl -fsSL https://ollama.ai/install.sh | sh

# Pull a model
ollama pull llama3.2      # 2B parameter, fast
ollama pull llama3.2:7b   # 7B parameter, better quality

# Start Ollama server
ollama serve
```

Configure AtlasAI to use it:

```shell
BYOC_PROVIDER=openai  # Ollama uses OpenAI-compatible API
BYOC_API_KEY=ollama   # Any non-empty string works
BYOC_ENDPOINT_URL=http://ollama-server:11434/v1
BYOC_MODEL=llama3.2
```

Model recommendations for Ollama:

| Task | Recommended model | Requirements |
| --- | --- | --- |
| NL search, classification | llama3.2 | 4 GB RAM |
| RCA, runbook generation | llama3.2:7b or mistral:7b | 8 GB RAM |
| Complex reasoning | llama3.1:70b | 64 GB RAM |

vLLM / Text Generation Inference (TGI)

Any server exposing an OpenAI-compatible API (`/v1/chat/completions`) works:

```shell
BYOC_PROVIDER=openai
BYOC_API_KEY=your-server-key  # or "none" if no auth
BYOC_ENDPOINT_URL=http://vllm-server:8000/v1
BYOC_MODEL=meta-llama/Llama-3.1-8B-Instruct
```
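To sanity-check an OpenAI-compatible endpoint independently of AtlasAI, you can build the same kind of request the integration would send. This sketch (a hypothetical stdlib-only helper, not AtlasAI code) constructs the URL and JSON body for a `/v1/chat/completions` call:

```python
import json

def chat_completions_request(base_url: str, model: str, prompt: str):
    """Build the URL and JSON body for an OpenAI-compatible
    /v1/chat/completions call. Illustrative helper only."""
    url = base_url.rstrip("/") + "/chat/completions"
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return url, json.dumps(body)

url, body = chat_completions_request(
    "http://vllm-server:8000/v1", "meta-llama/Llama-3.1-8B-Instruct", "ping")
# url == "http://vllm-server:8000/v1/chat/completions"
```

POSTing that body to the URL (with your server's auth header) should return a JSON response containing a `choices` array; if it does, the endpoint is compatible.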

Configuration: Helm (Kubernetes)

For Kubernetes deployments, configure BYOC in your Helm values:

```yaml
# values.yaml
byoc:
  provider: openai            # Provider name
  apiKey: "sk-..."            # API key (stored as Kubernetes Secret)
  model: "gpt-4o-mini"        # Model to use
  endpointUrl: ""             # Custom endpoint (for Azure / Ollama / vLLM)
  region: ""                  # AWS region (for Bedrock)
  bedrockUseTaskRole: false   # Use IAM task role (AWS ECS/EC2)
  bedrockModelId: ""          # Bedrock model ID
```

Apply:

```shell
helm upgrade atlasai-tp deploy/helm/atlasai-tp -f values.yaml
```

Embeddings (AI search and RAG)

In addition to the main LLM, AtlasAI uses embedding models for:

  • AI-powered log and incident search
  • RAG (Retrieval Augmented Generation) for runbook knowledge
  • Semantic similarity in alert correlation

Configure embeddings separately:

```shell
# For OpenAI embeddings
EMBEDDING_PROVIDER=openai
OPENAI_API_KEY=sk-...

# For AWS Bedrock embeddings
EMBEDDING_PROVIDER=bedrock
AWS_REGION=us-east-1
# (uses same IAM role as Bedrock inference)

# For fully local / no embeddings (basic keyword search only)
EMBEDDING_PROVIDER=fallback
```

The fallback provider uses simple keyword matching instead of semantic embeddings. AI search quality is lower, but it works with no external dependencies.
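A rough illustration of the difference (a hypothetical sketch, not the actual fallback implementation): keyword matching scores documents by term overlap, so paraphrases that an embedding model would recognize score zero.

```python
def keyword_score(query: str, document: str) -> float:
    """Fraction of query terms that appear in the document.

    Sketch of keyword-style matching: unlike semantic embeddings,
    it only rewards exact term overlap.
    """
    terms = set(query.lower().split())
    doc_terms = set(document.lower().split())
    if not terms:
        return 0.0
    return len(terms & doc_terms) / len(terms)

keyword_score("disk full", "alert: disk volume full on node-3")  # 1.0
keyword_score("disk full", "storage capacity exhausted")         # 0.0, despite same meaning
```

The second query is the case where semantic embeddings pay off: "storage capacity exhausted" describes the same incident but shares no keywords with "disk full".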


Testing your BYOC configuration

After configuring BYOC, test the connection:

```shell
curl -X POST https://your-tenant-plane/api/ai/test \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  -H "x-tenant-id: your-tenant" \
  -H "Content-Type: application/json" \
  -d '{"test": true}'
```

Expected success response:

```json
{
  "status": "ok",
  "provider": "openai",
  "model": "gpt-4o-mini",
  "latency_ms": 850,
  "cp_connected": false
}
```

Or check in the UI: Settings → AI shows the current provider, model, and connection status with a Test Connection button.


Data flow and privacy

When BYOC is enabled:

```
Your Network
┌────────────────────────────────────────────────────────────┐
│                                                            │
│   Incident data, logs, metrics                             │
│        │                                                   │
│        ▼                                                   │
│  ┌──────────────┐     ┌──────────────────────────────┐     │
│  │ Tenant Plane │────▶│ Your AI Provider             │     │
│  │ (in your VPC │     │ (Ollama / Bedrock VPC /      │     │
│  │  or on-prem) │     │  Azure private endpoint)     │     │
│  └──────────────┘     └──────────────────────────────┘     │
│                                                            │
└────────────────────────────────────────────────────────────┘

NO DATA LEAVES YOUR NETWORK (when using private endpoints)
```

If you use public endpoints (e.g., api.openai.com), your incident context does leave your network to reach the AI provider. Check your AI provider’s data processing agreement.


Fallback behavior

If your BYOC provider is unavailable:

  • AI features return a clear error message: “AI provider temporarily unavailable”
  • Non-AI features (incident creation, runbooks, dashboards) continue working normally
  • The system does not automatically fall back to AtlasAI’s cloud AI (this would send data outside your network unexpectedly)

To enable fallback to AtlasAI cloud AI (only for BYOC customers with CP connectivity):

```shell
AI_CREDITS_FAIL_OPEN=1  # Allow cloud AI fallback when BYOC is down
```
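The resulting decision can be summarized in a sketch (hypothetical, derived only from the behavior described above, not AtlasAI's actual code): fall back to cloud AI only when BYOC is down, fail-open is explicitly enabled, and the Control Plane is reachable.

```python
import os

def should_fallback_to_cloud(byoc_available: bool, cp_connected: bool,
                             env=os.environ) -> bool:
    """Decide whether a request may be routed to AtlasAI cloud AI.

    Sketch of the documented behavior: never fall back silently;
    require AI_CREDITS_FAIL_OPEN=1 and Control Plane connectivity.
    """
    if byoc_available:
        return False  # BYOC is healthy; always use it
    fail_open = env.get("AI_CREDITS_FAIL_OPEN") == "1"
    return fail_open and cp_connected

should_fallback_to_cloud(False, True, {"AI_CREDITS_FAIL_OPEN": "1"})  # True
should_fallback_to_cloud(False, True, {})  # False: fallback stays off by default
```

The default-off branch is the important one: without the explicit flag, a BYOC outage produces an error rather than silently sending data outside your network.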

Model requirements for best results

AtlasAI AI features work best with models that support:

| Capability | Required for |
| --- | --- |
| Function calling / tool use | Runbook generation, structured output |
| Long context (32k+ tokens) | RCA over large log volumes |
| JSON mode | Structured incident analysis |
| Instruction following | Reliable remediation suggestions |

Models that work well: Claude 3+, the GPT-4o series, Llama 3.1 8B+, Mistral 7B+.

Models that do not work well: very small models (< 3B parameters), base (non-instruct) models, models without function calling support.