My Self-Hosted AI Stack: Everything Running on One €4/Month Server

I run 23 active automation workflows, 7 content websites, an AI proxy, a newsletter platform, and a vector database — all on a single server that costs €3.99 per month.

This is the exact setup, why each piece is there, and what I’ve learned maintaining it.

Why Self-Host at All?

The honest answer: cost and control.

The managed equivalent of this stack would run €200–500/month. Zapier for automation. OpenAI API with no cost cap. Mailchimp for newsletters. Cloudflare for storage. Hosted analytics.

Self-hosting the whole thing costs under €20/month. The trade-off is time spent on maintenance — but with Coolify managing containers, that’s closer to 30 minutes a month than a full-time job.

The second reason: no vendor lock-in. When a provider raises prices or changes an API, I swap it out in LiteLLM's config and nothing else in the stack changes.

The Server

Hetzner CAX11 — ARM architecture, 2 vCPU, 4GB RAM, 40GB SSD, 20TB traffic, €3.99/month.

The ARM architecture matters. Hetzner’s ARM servers have significantly better performance-per-euro than their x86 equivalents. I was skeptical at first — most Docker images are x86 — but multi-arch support has become good enough that I’ve had no compatibility issues with any service in this stack.

Located in Helsinki. Latency to Stockholm is ~15ms. Not relevant for async workflows, but nice for the occasional direct API call.

4GB RAM sounds tight. Here’s actual usage with everything running:

n8n:          ~280MB
LiteLLM:      ~180MB
Minio:        ~120MB
Umami:        ~90MB
Postgres (x2): ~160MB each
Listmonk:     ~70MB
Qdrant:       ~150MB
Coolify:      ~200MB
Total:        ~1.4GB

Leaves about 2.4GB free. The server has never OOM-killed a container.
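Those per-service numbers sum to the quoted total; a quick check (the gap between this sum and the 2.4GB-free figure is the OS and page cache, which the table doesn't count):

```python
# Approximate resident memory per container, in MB, from the table above.
usage_mb = {
    "n8n": 280, "LiteLLM": 180, "Minio": 120, "Umami": 90,
    "Postgres (x2)": 2 * 160, "Listmonk": 70, "Qdrant": 150, "Coolify": 200,
}
total = sum(usage_mb.values())
print(f"{total} MB")  # 1410 MB, i.e. the ~1.4GB quoted above
```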

Coolify: The Foundation

Coolify is what makes running 12+ containers manageable without becoming a full-time sysadmin.

What it handles:

  • Container lifecycle: deploy, restart, update with one click
  • HTTPS: Traefik auto-issues Let’s Encrypt certs for every custom domain
  • Environment variables: stored per-service, never in code
  • One-click services: Postgres, Redis, Minio, n8n are all available as pre-configured templates

Before Coolify I managed Docker Compose files manually. Every new service meant writing a compose file, handling the Traefik labels for HTTPS, managing env files. Coolify wraps all of that.

The one thing Coolify doesn’t do: it won’t alert you when a container goes unhealthy. You have to check. I solved that with a daily Telegram report (more on that below).

n8n: The Orchestration Layer

n8n is where the logic lives. Every automated action in the stack is an n8n workflow:

  • Daily keyword research for each site (runs at 08:00–09:45 depending on site)
  • Article generation and publishing (10:00–12:00)
  • Analytics reports (Umami stats to Telegram every morning)
  • GEO/AEO intelligence (daily RSS parsing + Groq analysis)
  • AI traffic monitoring (daily Umami Postgres query)
  • Monthly performance report (1st of month, 09:30)
  • Error monitoring (Telegram alert on any workflow failure)

23 active workflows total. n8n runs as a container with a Postgres backend for execution history.

What n8n is good at: chaining API calls, conditional logic, scheduled tasks, webhook endpoints. The visual editor makes it easy to debug — you can see exactly what data each node received and what it output.

What n8n is not good at: complex data transformations (use a Code node), fan-out patterns where multiple branches merge back (they break silently on empty results — use sequential chains instead).

The most important n8n lesson I learned: use sequential chains. Connect nodes in a single line A → B → C → D. Never have two nodes both connect to a third without a Merge node in between — empty results from one branch silently kill the chain.

LiteLLM: The AI Gateway

Every LLM call in the entire stack goes through LiteLLM. n8n workflows call http://litellm/v1/chat/completions, not the OpenAI API directly.
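Because LiteLLM exposes the OpenAI-compatible API, a workflow only ever needs the standard chat-completions shape. A sketch of the request body an n8n HTTP Request node would POST (the endpoint and model name are from this article; the helper function and prompt are illustrative, not the actual workflow):

```python
import json

# Internal Docker hostname from the article; LiteLLM serves the
# OpenAI-compatible chat completions route here.
LITELLM_URL = "http://litellm/v1/chat/completions"

def build_completion_request(model: str, prompt: str) -> dict:
    """Build the JSON body a workflow POSTs to the LiteLLM proxy."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

body = build_completion_request("llama-3.3-70b-versatile", "Draft three title ideas.")
print(json.dumps(body, indent=2))
```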

Why this matters:

Cost visibility. LiteLLM tracks spend per request, per model, and per day. I can see exactly what each workflow costs. My total March spend was $3.76 across all 23 workflows. Without LiteLLM I’d have no idea which workflow was expensive.

Provider flexibility. My article generators use Groq (fast, cheap, llama-3.3-70b). My hockey sites also use Gemini (free tier). LiteLLM routes to the right provider based on the model name. If Groq raises prices tomorrow, I change one config line — nothing in the workflows changes.

Rate limit handling. Groq has generous rate limits but they exist. LiteLLM queues requests and handles retries. Workflows never see a rate limit error.
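In LiteLLM's proxy config, the routing and the retry behaviour live in one file. A sketch of what such a config could look like, not the author's actual file; the model entries, env-var names, and retry count are illustrative:

```yaml
model_list:
  - model_name: llama-3.3-70b-versatile
    litellm_params:
      model: groq/llama-3.3-70b-versatile
      api_key: os.environ/GROQ_API_KEY
  - model_name: gemini-flash            # free-tier Gemini for the hockey sites
    litellm_params:
      model: gemini/gemini-2.0-flash
      api_key: os.environ/GEMINI_API_KEY
litellm_settings:
  num_retries: 3                        # retry on rate limits instead of surfacing them
```

Swapping a provider means editing one `litellm_params` block; every workflow keeps calling the same `model_name`.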

The model I use most: llama-3.3-70b-versatile via Groq. It’s fast (~1500 tokens/sec), good at following formatting instructions, and costs roughly $0.05 per 2000-token article output.

Minio: Workflow-to-Workflow Data Passing

This one surprised me. I didn’t expect object storage to become so central to the architecture.

The problem it solves: n8n workflows don’t share state. If keyword research runs at 09:45 and the article generator runs at 10:00, they need a handoff mechanism. You could use a database, but that’s overkill for a JSON file.

Minio is a self-hosted S3-compatible object store. I use it as a file system for workflow data:

seo-drafts/
├── tha-approved-keywords.json    (written 09:45, read 10:00)
├── eik-approved-keywords.json    (written 08:00, read 09:00)
├── performance-insights.json     (written Mon 09:15, read by keyword research)
├── geo-aeo-insights.json         (written daily 07:00, read by monthly report)
├── ai-traffic-2026-03-30.json    (daily AI traffic snapshot)
└── [article drafts waiting for approval]

n8n has a native S3 node that reads and writes these files. No custom code needed.
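The handoff is simple enough to sketch end to end. This version takes the S3 client as an argument so the sketch can run against an in-memory stand-in; in the real stack that role is played by n8n's S3 node (or a boto3 client) talking to Minio. The bucket and file names follow the listing above; the helper names are mine:

```python
import io
import json

BUCKET = "seo-drafts"

def write_handoff(s3, key: str, payload: dict) -> None:
    """Writer workflow: serialise the payload and drop it in the bucket."""
    s3.put_object(Bucket=BUCKET, Key=key, Body=json.dumps(payload).encode())

def read_handoff(s3, key: str) -> dict:
    """Reader workflow: fetch the handoff file and deserialise it."""
    obj = s3.get_object(Bucket=BUCKET, Key=key)
    return json.loads(obj["Body"].read())

class FakeS3:
    """Minimal in-memory stand-in mirroring the two S3 calls used above."""
    def __init__(self):
        self.store = {}
    def put_object(self, Bucket, Key, Body):
        self.store[(Bucket, Key)] = Body
    def get_object(self, Bucket, Key):
        return {"Body": io.BytesIO(self.store[(Bucket, Key)])}

s3 = FakeS3()
write_handoff(s3, "tha-approved-keywords.json", {"keywords": ["example keyword"]})
print(read_handoff(s3, "tha-approved-keywords.json"))  # {'keywords': ['example keyword']}
```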

Minio also stores EIK article drafts while they wait for my manual approval. The draft sits in Minio, I get a Telegram message with Approve/Reject buttons, and if I approve, the workflow reads the draft from Minio and pushes it to GitHub.

Umami: Analytics Without the Tracking Overhead

Umami is a self-hosted analytics platform. It runs on Postgres and tracks pageviews, sessions, and custom events — no cookies required, GDPR-compliant by default.

Every site in the stack has the Umami script installed. I query the Umami Postgres database directly from n8n to build daily and weekly analytics reports.

One specific use I’m pleased with: tracking AI referrer traffic. I filter sessions where the referrer contains perplexity.ai, chatgpt.com, claude.ai, or copilot.microsoft.com. This lets me measure whether my GEO (Generative Engine Optimisation) work is having any effect. It’s early data but the trend is upward.
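In code, that filter reduces to a hostname check against the list above. The author runs it as a Postgres query against Umami's tables; this is the same predicate sketched in Python:

```python
from urllib.parse import urlparse

# AI assistants whose referral traffic the article tracks.
AI_REFERRERS = {"perplexity.ai", "chatgpt.com", "claude.ai", "copilot.microsoft.com"}

def is_ai_referrer(referrer_url: str) -> bool:
    """True if the referrer hostname is (a subdomain of) a known AI assistant."""
    host = urlparse(referrer_url).hostname or ""
    return any(host == d or host.endswith("." + d) for d in AI_REFERRERS)

print(is_ai_referrer("https://www.perplexity.ai/search"))  # True
print(is_ai_referrer("https://news.google.com/"))          # False
```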

The Monitoring Layer

A self-hosted stack with no monitoring is a stack that breaks silently. I have two layers:

Daily infra report (08:05, Telegram):

  • Server uptime, load average, RAM usage, swap, disk
  • Status of every key container (✅ healthy / 🟡 up-no-healthcheck / ❌ down)
  • Generated by a bash script via cron, no n8n dependency
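The actual script is bash, but the report format is easy to show. A Python sketch of the status-line assembly, with hypothetical container states (the badge set is the one listed above):

```python
# Map container health states to the badges used in the daily Telegram report.
BADGES = {"healthy": "✅", "no-healthcheck": "🟡", "down": "❌"}

def report_lines(states: dict) -> list:
    """One badge line per container, in the report's format."""
    return [f"{BADGES[state]} {name}" for name, state in states.items()]

lines = report_lines({"n8n": "healthy", "litellm": "healthy", "umami": "no-healthcheck"})
print("\n".join(lines))
```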

n8n error monitor:

  • Any workflow execution that ends in an error triggers an immediate Telegram notification
  • Includes the workflow name, node that failed, and error message
  • Built as an n8n workflow listening to the n8n error event

These two together mean I know about any problem within minutes, usually before it affects anything.

The Full Cost Breakdown

Service                            Cost
Hetzner CAX11                      €3.99/month
Domain registrations (7 domains)   ~€5/month avg
Groq API (LLM inference)           ~$3–5/month
Netlify (hosting, free tier)       €0
GitHub (private repos)             €0
All self-hosted services           €0
Total                              ~€15/month

For comparison: Zapier’s Business plan starts at $49/month and doesn’t include AI, analytics, storage, or hosting.

What I Would Change

More RAM. 4GB works today, but there’s little headroom left for growth. If I add another service, I’ll either need to optimize or upgrade to the CAX21 (8GB, €7.49/month). Still cheap.

Backups. I have Coolify’s backup feature configured for Postgres, but I haven’t tested a restore. This is on the list.

Staging environment. Right now changes go straight to production. A failed workflow update can break a live pipeline. A second €4 server as a staging environment is worth it.

How to Replicate This

If you want to build a similar stack:

  1. Provision a Hetzner CAX11, point a domain at it
  2. Install Coolify following their one-line installer
  3. Deploy n8n, Postgres, and Minio as Coolify services
  4. Deploy LiteLLM with your API keys configured
  5. Start building workflows

The whole initial setup takes about 4 hours. Getting the workflows stable took considerably longer — that’s where the actual complexity lives.

If you’re building something like this for a business context — internal automation, content operations, data pipelines — that’s exactly the kind of project I help with. The architecture decisions and the pitfalls are non-obvious until you’ve hit them.

Reach out if it’s relevant.


Running costs are as of March 2026. Hetzner pricing may vary by region.
