I build enterprise AI infrastructure that deploys in hours, not weeks. From bare metal to production LLMs with one command.
Enterprise AI infrastructure that actually works
Enterprises spend $200K+ on powerful AI hardware, then stare at blank terminals for weeks. The ecosystem is fragmented across hundreds of GitHub repos, tribal knowledge, and undocumented configurations.
I built a system that transforms bare Ubuntu servers into production AI infrastructure with a single command. No consultants. No guesswork. Just working AI.
Powerful infrastructure sitting idle because nobody knows how to configure it.
Traditional deployments require senior engineers and months of configuration.
Critical deployment steps exist only in experts' heads. No documentation.
Data sovereignty concerns and unpredictable costs blocking adoption.
Clone the repo. Run bootstrap.sh. Production AI in hours, not weeks.
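The kind of preflight check a bootstrap script typically opens with can be sketched as below. This is an illustration of the pattern, not the real bootstrap.sh: the tool list here is deliberately minimal (the real script would check docker, nvidia-smi, and more).

```shell
#!/usr/bin/env bash
# Sketch of a bootstrap preflight check (tool list and messages are
# assumptions about the real bootstrap.sh, kept minimal so it runs anywhere).
set -euo pipefail

ready=1
for tool in tar grep awk; do   # the real list would include docker, nvidia-smi, ...
  if command -v "$tool" >/dev/null 2>&1; then
    echo "ok: $tool"
  else
    echo "MISSING: $tool -- install before running bootstrap.sh"
    ready=0
  fi
done
echo "preflight ready=$ready"
```

Failing fast on missing prerequisites is what makes "fresh clone to production" repeatable: the script refuses to start a half-configured deployment.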
No cloud dependencies. Your data stays in your datacenter. Period.
Everything in Docker. Everything in YAML. Documentation that deploys.
100k+ AI sessions preserved. Institutional knowledge in version control.
9 LLM models hot-swappable in under 30 seconds. 184GB VRAM orchestrated across 4x NVIDIA L40S GPUs.
Full-stack AI infrastructure engineering
NVIDIA L40S deployment, CUDA optimization, multi-GPU orchestration, vLLM inference engines, and real-time GPU monitoring with DCGM.
Docker Compose for simplicity, Kubernetes-ready architecture, GitOps workflows, and automated deployment pipelines.
Model serving, hot-swapping, RAG pipelines, embedding generation, and production inference optimization.
100GbE fabric design, NFS storage optimization, air-gap configurations, and zero-trust security models.
Prometheus metrics, Grafana dashboards, custom GPU exporters, and comprehensive alerting systems.
Claude Code integration, CLAUDE.md patterns, session preservation, and AI-human collaboration workflows.
100,000+ AI sessions preserved with full context recovery, conversation threading, and cross-session learning.
Persistent storage architecture, automated backups, Timeshift snapshots, and GitLab offsite redundancy.
Zero-trust security model, air-gapped operations, encrypted communications, and access control systems.
Real-time monitoring across every component. GPU utilization, memory pressure, inference latency, and system health visualized in custom Grafana dashboards.
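One way a custom GPU exporter can feed Prometheus is the node_exporter textfile-collector pattern. The sketch below uses that pattern with an illustrative metric name and hard-coded utilization values so it runs without a GPU; on the real host the values would come from `nvidia-smi --query-gpu=utilization.gpu --format=csv,noheader,nounits`.

```shell
#!/usr/bin/env bash
# Sketch of a textfile-collector GPU exporter. Metric name is illustrative
# and utilization is hard-coded so the sketch runs without a GPU; the real
# exporter would parse nvidia-smi output instead.
set -euo pipefail

textfile_dir=$(mktemp -d)   # real path: node_exporter's --collector.textfile.directory

{
  echo '# HELP gpu_utilization_percent GPU core utilization'
  echo '# TYPE gpu_utilization_percent gauge'
  for gpu in 0 1 2 3; do          # one series per L40S
    util=85                       # placeholder for the nvidia-smi value
    echo "gpu_utilization_percent{gpu=\"$gpu\"} $util"
  done
} > "$textfile_dir/gpu.prom"

cat "$textfile_dir/gpu.prom"
```

node_exporter scrapes whatever `.prom` files land in that directory, so the exporter itself stays a dumb cron-able script with no HTTP server of its own.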
Three-layer backup strategy ensures zero data loss: Timeshift for hourly snapshots, versioned backup scripts, and GitLab for offsite redundancy.
Visual tour of the AI infrastructure stack
Quantified impact of the BTA AI POD project
Evolution of the BTA AI POD project
4x NVIDIA L40S GPUs installed. 184GB VRAM orchestrated. 100GbE network fabric deployed.
Docker containerization. vLLM inference engine. Model hot-swapping capability achieved.
Prometheus + Grafana stack. Custom GPU metrics exporters. Real-time alerting system.
bootstrap.sh achievement. Fresh clone to production in hours. Zero tribal knowledge required.
100k+ AI sessions. 9 LLM models available. Air-gap certified. Enterprise deployment proven.
Hard-won lessons from production AI infrastructure
"A crash tells you something is wrong. A silent failure tells you nothing. We'd rather have an ugly error message than a pretty button that does nothing."
"If it's not in the repo, it doesn't exist. The repository IS the system - not documentation about a system, but the actual deployable system itself."
"Every approach we tried that failed is documented. A dead end isn't a failure - it's information that prevents future wasted time."
"Your development environment lies to you. The only valid test is: Does it work on a machine that has NEVER seen this code before?"
"Every bit of complexity you add today is borrowed from someone's future time. Before adding complexity, ask: Who will pay the interest?"
"Timer-based protection has a fatal flaw: important events don't respect your schedule. Don't save every 30 minutes. Save when something HAPPENS."
"The best infrastructure is invisible infrastructure - it just works, letting humans focus on what matters."
Key architectural decisions explained
Live system runs from symlinks pointing to the git repo. Edit the repo, the system updates. No deployment step for dashboard changes.
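The symlink mechanism is small enough to show end to end. The paths below are illustrative, not the system's real layout:

```shell
#!/usr/bin/env bash
# Minimal sketch of symlink-based deployment; paths are illustrative.
set -euo pipefail

work=$(mktemp -d)
mkdir -p "$work/repo/dashboards"                       # stands in for the git checkout
echo '{"title": "GPU v1"}' > "$work/repo/dashboards/gpu.json"

# The "live" location the dashboard server reads is a symlink into the repo.
ln -s "$work/repo/dashboards" "$work/live"

# Editing the repo IS the deployment -- the live path changes instantly.
echo '{"title": "GPU v2"}' > "$work/repo/dashboards/gpu.json"
cat "$work/live/gpu.json"
```

Because the live path resolves through the symlink on every read, a `git pull` in the repo is the whole rollout — and `git revert` is the whole rollback.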
AI assistants read a single file at session start that contains everything needed to understand and work on the project.
QEMU snapshot, deploy, test, revert. Poor-man's CI/CD without cloud infrastructure. Under 15 minutes per cycle.
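With libvirt-managed QEMU VMs, the cycle maps onto three `virsh` commands. The VM name below is an assumption, and `DRY_RUN=1` (the default) records and prints the commands instead of executing them, so the sketch runs without a hypervisor:

```shell
#!/usr/bin/env bash
# Sketch of the snapshot -> deploy -> test -> revert cycle via virsh.
# VM name is an assumption; DRY_RUN=1 prints commands instead of running them.
set -euo pipefail

VM="${VM:-clean-ubuntu}"
DRY_RUN="${DRY_RUN:-1}"

steps=()
run() { steps+=("$*"); if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi; }

run virsh snapshot-create-as "$VM" pre-deploy   # checkpoint the pristine machine
run virsh start "$VM"                           # boot the clean VM
# ... ssh in, run bootstrap, smoke-test the deployment ...
run virsh snapshot-revert "$VM" pre-deploy      # back to pristine for the next cycle
```

Reverting to the snapshot takes seconds, which is what keeps the full cycle under 15 minutes — the VM never accumulates state between test runs.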
9 LLMs pre-configured. Switch from the admin panel without a cluster restart. Model swaps take roughly 30 seconds.

Timeshift (hourly automatic), backup.sh (versioned), GitLab (offsite). Multiple layers catch different failure modes.
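The middle layer — the versioned backup.sh — can be sketched as below. Timeshift and the GitLab push cover the other two layers; the paths and the retention count of 5 are illustrative, not the real script's values:

```shell
#!/usr/bin/env bash
# Sketch of a versioned backup.sh layer. Paths and retention count are
# illustrative, not the real script's values.
set -euo pipefail

src=$(mktemp -d)    # stands in for the data directory
dest=$(mktemp -d)   # stands in for the backup target
echo 'inference settings' > "$src/config.yaml"

# Every run produces a new timestamped archive -- nothing is overwritten.
stamp=$(date +%Y%m%d-%H%M%S)
tar -czf "$dest/backup-$stamp.tar.gz" -C "$src" .

# Retention: keep only the 5 newest archives.
ls -1t "$dest"/backup-*.tar.gz | tail -n +6 | xargs -r rm --

count=$(ls "$dest"/backup-*.tar.gz | wc -l)
echo "archives kept: $count"
```

Each layer fails differently — Timeshift catches OS-level mistakes, the versioned archives catch data-level ones, and the offsite GitLab remote survives the machine itself dying.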
3-way network toggle: AIR GAPPED / INTERNAL / FULL. Zero cloud dependencies. All inference happens on-premises.
Ready to build something extraordinary?