AI Infrastructure
Inference on your hardware. Designed, deployed, monitored.
A single GPU server or a multi-node cluster. Model routing, auto-scaling, latency and token-throughput monitoring.
- vLLM
- SGLang
- LiteLLM
- Multi-GPU
Independent Consultant
I help Austrian companies run AI and critical services on their own hardware — no cloud lock-in, no data leaks, full control.
01 / Services
I run everything listed below in production myself. No PowerPoint, no reseller markup.
Inference on your hardware. Designed, deployed, monitored.
A single GPU server or a multi-node cluster. Model routing, auto-scaling, latency and token-throughput monitoring.
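A rough sketch of what the routing layer looks like in practice: LiteLLM's Python router in front of two vLLM backends that share one alias, so requests are spread across GPU nodes and retried on failure. Host names, the model and the "chat-default" alias are placeholders, not a reference deployment.

```python
# Sketch: LiteLLM routing over on-prem vLLM backends.
# Endpoints, model names and the alias are placeholders.
from litellm import Router

router = Router(
    model_list=[
        {
            "model_name": "chat-default",  # alias clients call; LiteLLM picks a backend
            "litellm_params": {
                "model": "openai/meta-llama/Llama-3.1-8B-Instruct",  # OpenAI-compatible vLLM server
                "api_base": "http://gpu-node-1:8000/v1",             # placeholder host
                "api_key": "unused",                                 # vLLM ignores the key by default
            },
        },
        {
            "model_name": "chat-default",
            "litellm_params": {
                "model": "openai/meta-llama/Llama-3.1-8B-Instruct",
                "api_base": "http://gpu-node-2:8000/v1",             # second replica for failover
                "api_key": "unused",
            },
        },
    ],
    num_retries=2,  # retry transient backend errors before surfacing them
)

response = router.completion(
    model="chat-default",
    messages=[{"role": "user", "content": "Summarize this ticket in two sentences."}],
)
print(response.choices[0].message.content)
```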
Open-source models on-prem. Audit-ready, optionally air-gapped.
Llama, Mistral, Qwen, DeepSeek — hosted, routed, documented. GDPR- and NISG-2026-compliant.
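Sovereign doesn't mean rewriting your tooling: anything that speaks the OpenAI API keeps working, just pointed at your own endpoint. A minimal sketch; the base_url and model name below are placeholders for a self-hosted server.

```python
# Sketch: existing OpenAI-style client code against an on-prem endpoint.
# base_url and model are placeholders for a self-hosted vLLM/SGLang server.
from openai import OpenAI

client = OpenAI(
    base_url="http://llm.internal.example:8000/v1",  # stays inside your network
    api_key="unused",                                # no cloud credentials involved
)

reply = client.chat.completions.create(
    model="mistralai/Mistral-7B-Instruct-v0.3",      # placeholder open-weight model
    messages=[{"role": "user", "content": "Classify this email as internal or external."}],
)
print(reply.choices[0].message.content)
```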
Knowledge retrieval over your own data. Provisioned and operated.
Embedding pipelines, vector stores, re-ranking, eval loops. For internal docs, codebases, support tickets — running on your infrastructure.
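The retrieval core itself is small; the real work is in the pipelines and eval loops around it. A compact sketch of embed, search, re-rank with sentence-transformers, using placeholder models and an in-memory store instead of a real vector database.

```python
# Sketch of the retrieval core: embed, search, re-rank.
# Models and the in-memory "store" are placeholders; a real setup uses a
# vector database and your own documents.
import numpy as np
from sentence_transformers import CrossEncoder, SentenceTransformer

docs = [
    "VPN access is requested through the IT service portal.",
    "Backups run nightly at 02:00 and are kept for 30 days.",
    "Expense reports are due by the 5th of each month.",
]

embedder = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")  # placeholder embedding model
reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")           # placeholder re-ranker
doc_vecs = embedder.encode(docs, normalize_embeddings=True)               # (n_docs, dim), unit-length rows

def retrieve(query: str, top_k: int = 2) -> list[str]:
    q_vec = embedder.encode([query], normalize_embeddings=True)[0]
    scores = doc_vecs @ q_vec                                   # cosine similarity via dot product
    candidates = [docs[i] for i in np.argsort(scores)[::-1][:top_k]]
    # Cross-encoder re-ranking: scores each (query, passage) pair directly.
    rerank_scores = reranker.predict([(query, d) for d in candidates])
    ranked = sorted(zip(rerank_scores, candidates), reverse=True)
    return [d for _, d in ranked]

print(retrieve("How do I get VPN access?"))
```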
Claude Code & co. — introduced, guard-railed, measured.
Training, workflow integration, review pipelines that keep AI output honest. Realistic expectations, measurable results.
SaaS out, your own servers in. Reproducible and documented.
Mail, identity, monitoring, backups, CI — one stack, no vendor lock-in. Proxmox as the foundation under all of the above.
02 / About
Independent, technical, hands-on.
I run the same stack I recommend: a 4-node Proxmox cluster, a GPU server with two RTX PRO 6000s, sovereign LLMs behind LiteLLM, my own CI runners, monitoring, mail and Vault.
Based in Austria. Working languages: German and English.
03 / Contact
I respond within one business day.