
Bring Your Own AI (BYOC)

By default, AtlasAI's AI features (root cause analysis, runbook generation, AI copilot, and natural language search) use AtlasAI's managed AI infrastructure. With Bring Your Own AI (BYOC), you connect AtlasAI to your own AI provider instead, giving you full control over which model is used, where your data goes, and how much you pay for inference.

BYOC is available on the Operations and Enterprise plans. It is required for on-prem and air-gapped deployments.


When to use BYOC

| Reason | Explanation |
| --- | --- |
| Data sovereignty | Incident context (logs, metrics, configurations) stays in your network and never reaches AtlasAI's AI infrastructure |
| Cost control | You pay your AI provider directly; no AtlasAI AI credit consumption |
| Model choice | Use the specific model version your compliance team has approved |
| Air-gapped deployment | AtlasAI's cloud AI is inaccessible; your own provider runs inside your network |
| Existing contract | You have an enterprise agreement with Azure or AWS and want to use committed spend |

Supported AI providers

| Provider | Works air-gapped? | Required configuration |
| --- | --- | --- |
| OpenAI | No (requires api.openai.com) | API key, model name |
| Azure OpenAI | Yes (private endpoint) | API key, endpoint URL, deployment name |
| AWS Bedrock | Yes (VPC endpoint) | IAM role or access keys, region, model ID |
| Anthropic | No (requires api.anthropic.com) | API key, model name |
| Ollama | Yes (runs locally) | Endpoint URL, model name |
| vLLM | Yes (runs locally) | Endpoint URL, model name |
| Any OpenAI-compatible API | Depends on where it runs | Base URL, API key, model name |

Configuration: environment variables

The simplest way to configure BYOC is through environment variables. In most cases, changes take effect without a restart; the exception is the model routing cache, which refreshes every 5 minutes, so a change can take up to 5 minutes to apply.
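As an illustration of how these variables fit together (a hypothetical sketch, not AtlasAI's actual implementation), a service reading the documented `BYOC_*` variables might resolve its provider settings like this:

```python
import os

def load_byoc_config(env=os.environ):
    """Resolve BYOC provider settings from environment variables.

    Illustrative sketch only: shows how the documented BYOC_* variables
    relate to each other, not how AtlasAI actually reads them.
    """
    provider = env.get("BYOC_PROVIDER", "openai")
    config = {
        "provider": provider,
        "api_key": env.get("BYOC_API_KEY", ""),
        "model": env.get("BYOC_MODEL", ""),
        # Only Azure / Ollama / vLLM need a custom endpoint URL.
        "endpoint_url": env.get("BYOC_ENDPOINT_URL") or None,
    }
    if provider == "bedrock":
        config["region"] = env.get("BYOC_REGION", "us-east-1")
    return config

cfg = load_byoc_config({"BYOC_PROVIDER": "openai",
                        "BYOC_API_KEY": "sk-proj-abc123",
                        "BYOC_MODEL": "gpt-4o-mini"})
```

Note that `endpoint_url` stays unset for plain OpenAI, matching the provider table above: only Azure, Ollama, and vLLM require one.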

OpenAI

```shell
BYOC_PROVIDER=openai
BYOC_API_KEY=sk-proj-abc123...
BYOC_MODEL=gpt-4o-mini

# Optional: use a specific organization
# BYOC_API_KEY_ORG=org-abc123
```

Which model to choose:

  • gpt-4o — best quality, higher cost, higher latency
  • gpt-4o-mini — good quality, low cost, fast — recommended for most use cases
  • gpt-4-turbo — strong reasoning, 128k context

Azure OpenAI

```shell
BYOC_PROVIDER=azure
BYOC_API_KEY=your-azure-openai-api-key
BYOC_ENDPOINT_URL=https://your-resource-name.openai.azure.com
BYOC_MODEL=gpt-4o  # This is your deployment name in Azure, not the model name
```

For private endpoint access (air-gapped within Azure VNet):

```shell
BYOC_ENDPOINT_URL=https://your-private-endpoint.privatelink.openai.azure.com
```

AWS Bedrock

Option A: EC2 / ECS task role (recommended for AWS deployments)

No access keys needed. The Tenant Plane uses the IAM role assigned to the EC2 instance or ECS task:

```shell
TP_BEDROCK_USE_TASK_ROLE=1
TP_BEDROCK_MODEL_ID=anthropic.claude-3-haiku-20240307-v1:0
```

Ensure the IAM role has the `bedrock:InvokeModel` permission.
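A minimal IAM policy granting that permission might look like the following sketch; the resource ARN shown restricts invocation to the Haiku model from the example above (broaden or replace it for other models):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["bedrock:InvokeModel"],
      "Resource": "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-haiku-20240307-v1:0"
    }
  ]
}
```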

Option B: Access keys

```shell
BYOC_PROVIDER=bedrock
BYOC_REGION=us-east-1
BYOC_AWS_ACCESS_KEY_ID=AKIA...
BYOC_AWS_SECRET_ACCESS_KEY=abc123...
BYOC_MODEL=anthropic.claude-3-haiku-20240307-v1:0
```

Available Bedrock models:

| Model | Use case | Cost |
| --- | --- | --- |
| anthropic.claude-3-haiku-20240307-v1:0 | RCA, NL search | Low |
| anthropic.claude-3-sonnet-20240229-v1:0 | Runbook generation, complex reasoning | Medium |
| meta.llama3-8b-instruct-v1:0 | General purpose, lower cost | Low |
| amazon.titan-text-lite-v1 | Classification, short tasks | Very low |

Anthropic (direct)

```shell
BYOC_PROVIDER=anthropic
BYOC_API_KEY=sk-ant-api03-...
BYOC_MODEL=claude-3-5-sonnet-20241022
```

Ollama (self-hosted, fully local)

Ollama is an open-source tool that runs LLMs locally. Install it on a server in your network.

```shell
# Install Ollama
curl -fsSL https://ollama.ai/install.sh | sh

# Pull a model
ollama pull llama3.2      # 2B parameter, fast
ollama pull llama3.2:7b   # 7B parameter, better quality

# Start Ollama server
ollama serve
```

Configure AtlasAI to use it:

```shell
BYOC_PROVIDER=openai  # Ollama uses OpenAI-compatible API
BYOC_API_KEY=ollama   # Any non-empty string works
BYOC_ENDPOINT_URL=http://ollama-server:11434/v1
BYOC_MODEL=llama3.2
```

Model recommendations for Ollama:

| Task | Recommended model | Requirements |
| --- | --- | --- |
| NL search, classification | llama3.2 | 4 GB RAM |
| RCA, runbook generation | llama3.2:7b or mistral:7b | 8 GB RAM |
| Complex reasoning | llama3.1:70b | 64 GB RAM |

vLLM / Text Generation Inference (TGI)

Any server exposing an OpenAI-compatible API (`/v1/chat/completions`) works:

```shell
BYOC_PROVIDER=openai
BYOC_API_KEY=your-server-key  # or "none" if no auth
BYOC_ENDPOINT_URL=http://vllm-server:8000/v1
BYOC_MODEL=meta-llama/Llama-3.1-8B-Instruct
```
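To sanity-check an OpenAI-compatible endpoint independently of AtlasAI, you can build the same kind of request the integration would send. This sketch (a hypothetical stdlib-only helper, not AtlasAI code) constructs the URL and JSON body for a `/v1/chat/completions` call:

```python
import json

def chat_completions_request(base_url: str, model: str, prompt: str):
    """Build the URL and JSON body for an OpenAI-compatible
    /v1/chat/completions call. Illustrative helper only."""
    url = base_url.rstrip("/") + "/chat/completions"
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return url, json.dumps(body)

url, body = chat_completions_request(
    "http://vllm-server:8000/v1", "meta-llama/Llama-3.1-8B-Instruct", "ping")
# url == "http://vllm-server:8000/v1/chat/completions"
```

POSTing that body to the URL (with your server's auth header) should return a JSON response containing a `choices` array; if it does, the endpoint is compatible.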

Configuration: Helm (Kubernetes)

For Kubernetes deployments, configure BYOC in your Helm values:

```yaml
# values.yaml
byoc:
  provider: openai            # Provider name
  apiKey: "sk-..."            # API key (stored as Kubernetes Secret)
  model: "gpt-4o-mini"        # Model to use
  endpointUrl: ""             # Custom endpoint (for Azure / Ollama / vLLM)
  region: ""                  # AWS region (for Bedrock)
  bedrockUseTaskRole: false   # Use IAM task role (AWS ECS/EC2)
  bedrockModelId: ""          # Bedrock model ID
```

Apply:

```shell
helm upgrade atlasai-tp deploy/helm/atlasai-tp -f values.yaml
```

Embeddings (AI search and RAG)

In addition to the main LLM, AtlasAI uses embedding models for:

  • AI-powered log and incident search
  • RAG (Retrieval Augmented Generation) for runbook knowledge
  • Semantic similarity in alert correlation

Configure embeddings separately:

```shell
# For OpenAI embeddings
EMBEDDING_PROVIDER=openai
OPENAI_API_KEY=sk-...

# For AWS Bedrock embeddings
EMBEDDING_PROVIDER=bedrock
AWS_REGION=us-east-1
# (uses same IAM role as Bedrock inference)

# For fully local / no embeddings (basic keyword search only)
EMBEDDING_PROVIDER=fallback
```

The fallback provider uses simple keyword matching instead of semantic embeddings. AI search quality is lower, but it works with no external dependencies.
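A rough illustration of the difference (a hypothetical sketch, not the actual fallback implementation): keyword matching scores documents by term overlap, so paraphrases that an embedding model would recognize score zero.

```python
def keyword_score(query: str, document: str) -> float:
    """Fraction of query terms that appear in the document.

    Sketch of keyword-style matching: unlike semantic embeddings,
    it only rewards exact term overlap.
    """
    terms = set(query.lower().split())
    doc_terms = set(document.lower().split())
    if not terms:
        return 0.0
    return len(terms & doc_terms) / len(terms)

keyword_score("disk full", "alert: disk volume full on node-3")  # 1.0
keyword_score("disk full", "storage capacity exhausted")         # 0.0, despite same meaning
```

The second query is the case where semantic embeddings pay off: "storage capacity exhausted" describes the same incident but shares no keywords with "disk full".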


Testing your BYOC configuration

After configuring BYOC, test the connection:

```shell
curl -X POST https://your-tenant-plane/api/ai/test \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  -H "x-tenant-id: your-tenant" \
  -H "Content-Type: application/json" \
  -d '{"test": true}'
```

Expected success response:

```json
{
  "status": "ok",
  "provider": "openai",
  "model": "gpt-4o-mini",
  "latency_ms": 850,
  "cp_connected": false
}
```

Or check in the UI: Settings → AI shows the current provider, model, and connection status with a Test Connection button.


Data flow and privacy

When BYOC is enabled:

```
Your Network
┌────────────────────────────────────────────────────────────┐
│                                                            │
│   Incident data, logs, metrics                             │
│        │                                                   │
│        ▼                                                   │
│  ┌──────────────┐     ┌──────────────────────────────┐     │
│  │ Tenant Plane │────▶│ Your AI Provider             │     │
│  │ (in your VPC │     │ (Ollama / Bedrock VPC /      │     │
│  │  or on-prem) │     │  Azure private endpoint)     │     │
│  └──────────────┘     └──────────────────────────────┘     │
│                                                            │
└────────────────────────────────────────────────────────────┘

NO DATA LEAVES YOUR NETWORK (when using private endpoints)
```

If you use public endpoints (e.g., api.openai.com), your incident context does leave your network to reach the AI provider. Check your AI provider’s data processing agreement.


Fallback behavior

If your BYOC provider is unavailable:

  • AI features return a clear error message: “AI provider temporarily unavailable”
  • Non-AI features (incident creation, runbooks, dashboards) continue working normally
  • The system does not automatically fall back to AtlasAI’s cloud AI (this would send data outside your network unexpectedly)

To enable fallback to AtlasAI cloud AI (only for BYOC customers with CP connectivity):

```shell
AI_CREDITS_FAIL_OPEN=1  # Allow cloud AI fallback when BYOC is down
```
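The resulting decision can be summarized in a sketch (hypothetical, derived only from the behavior described above, not AtlasAI's actual code): fall back to cloud AI only when BYOC is down, fail-open is explicitly enabled, and the Control Plane is reachable.

```python
import os

def should_fallback_to_cloud(byoc_available: bool, cp_connected: bool,
                             env=os.environ) -> bool:
    """Decide whether a request may be routed to AtlasAI cloud AI.

    Sketch of the documented behavior: never fall back silently;
    require AI_CREDITS_FAIL_OPEN=1 and Control Plane connectivity.
    """
    if byoc_available:
        return False  # BYOC is healthy; always use it
    fail_open = env.get("AI_CREDITS_FAIL_OPEN") == "1"
    return fail_open and cp_connected

should_fallback_to_cloud(False, True, {"AI_CREDITS_FAIL_OPEN": "1"})  # True
should_fallback_to_cloud(False, True, {})  # False: fallback stays off by default
```

The default-off branch is the important one: without the explicit flag, a BYOC outage produces an error rather than silently sending data outside your network.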

Model requirements for best results

AtlasAI AI features work best with models that support:

| Capability | Required for |
| --- | --- |
| Function calling / tool use | Runbook generation, structured output |
| Long context (32k+ tokens) | RCA over large log volumes |
| JSON mode | Structured incident analysis |
| Instruction following | Reliable remediation suggestions |

Models that work well: Claude 3+, the GPT-4o series, Llama 3.1 8B+, Mistral 7B+.

Models that do not work well: very small models (< 3B parameters), base (non-instruct) models, models without function calling support.