Production Deployment Guide¶
This guide covers deploying mcp-scan in production environments with high availability and security requirements.
Architecture Overview¶
```
┌─────────────────────────────────────────────────────────────┐
│                      Production Stack                       │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  ┌──────────────┐    ┌──────────────┐    ┌──────────────┐   │
│  │   mcp-scan   │    │    Ollama    │    │    CodeQL    │   │
│  │  (scanner)   │───▶│   (local)    │    │   (binary)   │   │
│  └──────────────┘    └──────────────┘    └──────────────┘   │
│         │                                       │           │
│         │            ┌──────────────┐           │           │
│         └───────────▶│  LSP Servers │◀──────────┘           │
│                      │   (pyright,  │                       │
│                      │    gopls)    │                       │
│                      └──────────────┘                       │
│                                                             │
│  ┌──────────────────────────────────────────────────────┐   │
│  │                    External APIs                     │   │
│  │  ┌──────────┐    ┌──────────┐    ┌──────────┐        │   │
│  │  │  Claude  │    │  OpenAI  │    │  Custom  │        │   │
│  │  │   API    │    │   API    │    │   LLM    │        │   │
│  │  └──────────┘    └──────────┘    └──────────┘        │   │
│  └──────────────────────────────────────────────────────┘   │
│                                                             │
└─────────────────────────────────────────────────────────────┘
```
System Requirements¶
Minimum Requirements¶
| Component | Requirement |
|---|---|
| CPU | 4 cores |
| RAM | 8 GB |
| Disk | 20 GB SSD |
| OS | Linux (Ubuntu 22.04+, RHEL 8+) |
Recommended for Production¶
| Component | Requirement |
|---|---|
| CPU | 8+ cores |
| RAM | 32 GB |
| Disk | 100 GB NVMe SSD |
| GPU | NVIDIA GPU (for local Ollama) |
| OS | Linux (Ubuntu 22.04 LTS) |
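Before installing, it can help to compare the host against the minimums above. A quick preflight sketch for GNU/Linux (assumes `/proc/meminfo` and GNU `df`; the thresholds mirror the minimum-requirements table):

```shell
# Preflight check (sketch): compare host resources to the minimum requirements
cores=$(nproc)
ram_gb=$(awk '/MemTotal/ {printf "%d", $2/1024/1024}' /proc/meminfo)
disk_gb=$(df -BG --output=avail / | tail -1 | tr -dc '0-9')

echo "cores=$cores ram=${ram_gb}GB free_disk=${disk_gb}GB"

[ "$cores" -ge 4 ]   || echo "WARN: minimum is 4 CPU cores"
[ "$ram_gb" -ge 8 ]  || echo "WARN: minimum is 8 GB RAM"
[ "$disk_gb" -ge 20 ] || echo "WARN: minimum is 20 GB free disk"
```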
Installation¶
1. Install mcp-scan¶
```bash
# From source
git clone https://github.com/mcphub/mcp-scan.git
cd mcp-scan
make build
sudo mv bin/mcp-scan /usr/local/bin/

# Or via go install
go install github.com/mcphub/mcp-scan/cmd/mcp-scan@latest
```
2. Install CodeQL¶
```bash
# Download CodeQL CLI
CODEQL_VERSION=2.16.0
curl -L -o codeql.tar.gz \
  "https://github.com/github/codeql-cli-binaries/releases/download/v${CODEQL_VERSION}/codeql-linux64.tar.gz"

# Install
sudo tar xzf codeql.tar.gz -C /opt/
sudo ln -sf /opt/codeql/codeql /usr/local/bin/codeql

# Download language packs
codeql pack download codeql/python-queries
codeql pack download codeql/javascript-queries
codeql pack download codeql/go-queries

# Verify
codeql version
codeql resolve languages
```
3. Install Language Servers¶
```bash
# Node.js language servers
npm install -g pyright typescript typescript-language-server

# Python language server (alternative)
pip3 install python-lsp-server[all]

# Go language server
go install golang.org/x/tools/gopls@latest

# Verify
./scripts/check-lsp.sh
```
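The repository's `scripts/check-lsp.sh` is not reproduced in this guide; a minimal equivalent (hypothetical sketch, covering the servers installed above) simply reports which binaries are on `PATH`:

```shell
#!/usr/bin/env sh
# check-lsp.sh (sketch): report which language servers are installed
status=0
for srv in pyright typescript-language-server gopls pylsp; do
  if command -v "$srv" >/dev/null 2>&1; then
    printf 'ok:      %s\n' "$srv"
  else
    printf 'missing: %s\n' "$srv"
    status=1
  fi
done
```

A nonzero `status` at the end means at least one server is missing.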
4. Install Ollama (Optional)¶
```bash
# Install Ollama
curl -fsSL https://ollama.ai/install.sh | sh

# Configure systemd service
sudo systemctl enable ollama
sudo systemctl start ollama

# Pull models
ollama pull llama3.2:3b

# Verify
curl http://localhost:11434/api/tags
```
Configuration¶
System Configuration¶
Create /etc/mcp-scan/config.yaml:
```yaml
version: "1"

# Analysis settings
mode: deep
timeout: 30m
workers: 0 # Auto-detect
fail_on: high

# File patterns
include:
  - "**/*.py"
  - "**/*.ts"
  - "**/*.js"
  - "**/*.go"
exclude:
  - "**/node_modules/**"
  - "**/venv/**"
  - "**/vendor/**"
  - "**/dist/**"
  - "**/build/**"
  - "**/*.min.js"
  - "**/*_test.*"
  - "**/test/**"

# LLM configuration
llm:
  enabled: true
  provider: "" # Auto-detect
  url: "http://localhost:11434"
  threshold: 0.7
  timeout: "60s"
  max_length: 5000

# CodeQL configuration
codeql:
  enabled: true
  path: "/usr/local/bin/codeql"
  timeout: "30m"
  queries_dir: "/opt/mcp-scan/codeql-queries/mcp"
  min_severity: 5.0
  languages:
    - python
    - javascript
    - go

# LSP configuration
lsp:
  enabled: true
  languages:
    - python
    - typescript
    - go

# ML configuration
ml:
  enabled: true
  threshold: 0.3
  model_path: "/opt/mcp-scan/ml_weights.json"

# Output configuration
output:
  format: sarif
  include_trace: true
  redact_snippets: false
```
Environment Variables¶
Create /etc/mcp-scan/env:
```bash
# LLM API keys
ANTHROPIC_API_KEY=sk-ant-...
OPENAI_API_KEY=sk-...

# Override defaults
MCP_SCAN_LLM_PROVIDER=claude
MCP_SCAN_CODEQL_TIMEOUT=1h
```
Systemd Service¶
Create /etc/systemd/system/mcp-scan.service:
```ini
[Unit]
Description=MCP-SCAN Security Scanner
After=network.target ollama.service

[Service]
Type=oneshot
EnvironmentFile=/etc/mcp-scan/env
ExecStart=/usr/local/bin/mcp-scan scan /var/lib/mcp-scan/workspace --config /etc/mcp-scan/config.yaml
WorkingDirectory=/var/lib/mcp-scan
User=mcp-scan
Group=mcp-scan

# Security hardening
NoNewPrivileges=true
ProtectSystem=strict
ProtectHome=true
ReadWritePaths=/var/lib/mcp-scan
PrivateTmp=true

[Install]
WantedBy=multi-user.target
```
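Because the service is `Type=oneshot`, recurring scans are easiest to drive with a systemd timer. A sketch as `/etc/systemd/system/mcp-scan.timer` (the nightly schedule is illustrative):

```ini
[Unit]
Description=Nightly mcp-scan run

[Timer]
OnCalendar=*-*-* 02:00:00
Persistent=true

[Install]
WantedBy=timers.target
```

Enable it with `sudo systemctl enable --now mcp-scan.timer`; `Persistent=true` runs a missed scan at the next boot.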
Security Hardening¶
API Key Management¶
```bash
# Use secrets manager
export ANTHROPIC_API_KEY=$(aws secretsmanager get-secret-value \
  --secret-id mcp-scan/anthropic-key \
  --query SecretString --output text)

# Or use HashiCorp Vault
export ANTHROPIC_API_KEY=$(vault kv get -field=api_key secret/mcp-scan/anthropic)
```
Network Security¶
```bash
# Restrict Ollama to localhost
# In /etc/ollama/config
OLLAMA_HOST=127.0.0.1:11434

# Firewall rules
sudo ufw allow from 127.0.0.1 to any port 11434
sudo ufw deny 11434
```
File Permissions¶
```bash
# Create service user
sudo useradd -r -s /bin/false mcp-scan

# Set ownership
sudo chown -R mcp-scan:mcp-scan /var/lib/mcp-scan
sudo chown -R mcp-scan:mcp-scan /opt/mcp-scan

# Restrict config file
sudo chmod 600 /etc/mcp-scan/env
sudo chown root:mcp-scan /etc/mcp-scan/env
```
High Availability¶
Load Balancing¶
For scanning at scale, deploy multiple instances behind a load balancer:
```yaml
# docker-compose.ha.yml
version: '3.8'

services:
  scanner:
    image: mcp-scan:latest
    environment:
      - ANTHROPIC_API_KEY
    deploy:
      replicas: 3
      resources:
        limits:
          cpus: '4'
          memory: 8G

  ollama:
    image: ollama/ollama:latest
    deploy:
      replicas: 2
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
```
Caching¶
Enable result caching to improve performance:
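The cache directory assumed by the maintenance tasks later in this guide is `/var/cache/mcp-scan`. The exact cache options depend on your mcp-scan version; an illustrative stanza for `/etc/mcp-scan/config.yaml` (key names are assumptions, not confirmed API):

```yaml
# Illustrative cache settings; verify key names against your mcp-scan version
cache:
  enabled: true
  dir: /var/cache/mcp-scan
  ttl: 168h   # one week; the Maintenance section also clears this weekly
```

Make sure the directory exists and is writable by the `mcp-scan` service user.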
Monitoring¶
Prometheus Metrics¶
```yaml
# prometheus.yml
scrape_configs:
  - job_name: 'mcp-scan'
    static_configs:
      - targets: ['localhost:9090']
    metrics_path: /metrics
```
Grafana Dashboard¶
Key metrics to monitor:
- `mcp_scan_duration_seconds` - Scan duration
- `mcp_scan_findings_total` - Total findings
- `mcp_scan_llm_requests_total` - LLM API calls
- `mcp_scan_llm_latency_seconds` - LLM response time
- `mcp_scan_errors_total` - Error count
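If the latency metric is exported as a Prometheus histogram (the `_bucket` suffix below is an assumption about the exporter), a p95 latency panel could use a query like:

```
histogram_quantile(0.95, sum(rate(mcp_scan_llm_latency_seconds_bucket[5m])) by (le))
```

A similar `rate(mcp_scan_errors_total[5m])` panel gives the error rate over time.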
Alerting¶
```yaml
# alerts.yml — Prometheus alerting rules (load via rule_files in prometheus.yml;
# Alertmanager only handles routing, not rule definitions)
groups:
  - name: mcp-scan
    rules:
      - alert: HighSeverityFinding
        expr: mcp_scan_findings_total{severity="critical"} > 0
        labels:
          severity: critical
        annotations:
          summary: Critical vulnerability detected

      - alert: LLMLatencyHigh
        expr: mcp_scan_llm_latency_seconds > 30
        labels:
          severity: warning
        annotations:
          summary: LLM response time is high
```
Backup and Recovery¶
Backup Strategy¶
```bash
# Backup ML model and config
tar czf mcp-scan-backup.tar.gz \
  /etc/mcp-scan/ \
  /opt/mcp-scan/ml_weights.json \
  /opt/mcp-scan/codeql-queries/

# Store in S3
aws s3 cp mcp-scan-backup.tar.gz s3://backups/mcp-scan/
```
Recovery¶
```bash
# Restore from backup
aws s3 cp s3://backups/mcp-scan/mcp-scan-backup.tar.gz .
tar xzf mcp-scan-backup.tar.gz -C /

# Restart services
sudo systemctl restart mcp-scan ollama
```
Troubleshooting¶
Common Issues¶
Out of Memory¶
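If scans are killed by the OOM killer, lower parallelism and shrink per-task work first. A sketch reusing keys from the configuration section above (`workers`, `llm.max_length`, `codeql.languages`; effective memory use also depends on which analyzers are enabled):

```yaml
# /etc/mcp-scan/config.yaml: settings that trade speed for memory
workers: 2          # fewer concurrent analyzers (0 = auto-detect)
llm:
  max_length: 2000  # smaller code snippets per LLM request
codeql:
  languages:        # analyze one language per run instead of all three
    - python
```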
Slow Scans¶
```bash
# Reduce CodeQL timeout
mcp-scan scan . --codeql-timeout 10m

# Limit languages
mcp-scan scan . --codeql-languages python

# Increase parallelism
mcp-scan scan . --workers 8
```
LLM Rate Limiting¶
```bash
# Raise the threshold so fewer findings are escalated to the LLM
mcp-scan scan . --llm-threshold 0.9

# Use local Ollama instead of cloud
mcp-scan scan . --llm-provider ollama
```
Debug Mode¶
```bash
# Enable verbose logging
mcp-scan scan . --verbose 2>&1 | tee scan.log

# Check specific component
MCP_SCAN_DEBUG=llm mcp-scan scan .
MCP_SCAN_DEBUG=codeql mcp-scan scan .
```
Maintenance¶
Regular Tasks¶
| Task | Frequency | Command |
|---|---|---|
| Update mcp-scan | Weekly | go install github.com/mcphub/mcp-scan/cmd/mcp-scan@latest |
| Update CodeQL | Monthly | codeql pack upgrade |
| Retrain ML model | Monthly | python scripts/train_ml.py |
| Clear cache | Weekly | rm -rf /var/cache/mcp-scan/* |
| Update Ollama models | Monthly | ollama pull llama3.2:3b |
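These tasks can be scheduled from the root crontab; a sketch using the commands from the table above (schedules illustrative, and `go`, `codeql`, and `ollama` must be on cron's `PATH`):

```
# Recurring maintenance (illustrative crontab entries)
0 3 * * 1  rm -rf /var/cache/mcp-scan/*                               # weekly: clear cache
0 4 * * 1  go install github.com/mcphub/mcp-scan/cmd/mcp-scan@latest  # weekly: update mcp-scan
0 5 1 * *  codeql pack upgrade                                        # monthly: update CodeQL packs
0 6 1 * *  ollama pull llama3.2:3b                                    # monthly: refresh Ollama model
```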