doradus-research

Doradus Research is the operator perspective of a small AI cluster: vLLM, sleep-mode rotation pools, consumer Blackwell GPUs, multi-model serving. We publish what we learn the hard way: dead ends, configurations that didn't work, and the recipes that ended up in production.

We also take on commissioned research engagements: formally verified arXiv papers structured to pass peer review, plus patent-services research (prior-art search, patentability assessment, claim drafting support). Same operators, same cluster: the autoresearch pipeline does the breadth, and Lean 4 does the mathematical verification.
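A toy of what the Lean side delivers: every load-bearing claim gets a machine-checked proof, not a proof sketch. The lemma below is invented for illustration and comes from no engagement.

```lean
-- Toy stand-in for the deliverable: a statement the paper relies on,
-- stated in Lean 4 and closed by a complete proof. Illustrative only.
def double : Nat → Nat
  | 0 => 0
  | n + 1 => double n + 2

theorem double_eq_two_mul (n : Nat) : double n = 2 * n := by
  induction n with
  | zero => rfl
  | succ n ih =>
    show double n + 2 = 2 * (n + 1)
    rw [ih, Nat.mul_add, Nat.mul_one]
```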

Compute: 3 GPU compute nodes carrying 10× NVIDIA RTX PRO 6000 Blackwell (95 GiB each, SM12.0a) + 4× RTX 5090 — split across a TP=4 rotation pool, two TP=2 sleep/wake cohorts, and dedicated single-GPU services. 2× NVIDIA DGX Spark (GB10, 128 GiB UMA each) for medium-tier always-on serving + ComfyUI. 2× Apple Mac Studio M3 Ultra (256 GiB UMA each) hosting MLX models. ~1.3 TB system RAM total. All on-prem. Built to accommodate ~40 daily users across internal tooling, autoresearch pipelines, and ad-hoc inference.
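For flavor, a minimal sketch of one sleep/wake cohort slot using vLLM's offline API; the model name, TP degree, and sleep level are illustrative, not our production config.

```python
# Minimal sleep/wake sketch with vLLM's offline API. Model, TP degree,
# and sleep level are placeholders, not our production settings.
from vllm import LLM, SamplingParams

llm = LLM(
    model="Qwen/Qwen2.5-7B-Instruct",  # placeholder model
    tensor_parallel_size=2,            # one TP=2 cohort slot
    enable_sleep_mode=True,            # must be set before sleep() works
)

out = llm.generate(["ping"], SamplingParams(max_tokens=8))
print(out[0].outputs[0].text)

# Level 1 offloads weights to CPU RAM and drops the KV cache, freeing
# the GPUs for the next cohort member without a full process restart.
llm.sleep(level=1)

# ... another model in the rotation runs here ...

llm.wake_up()  # weights back on GPU, ready to serve again
```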

Storage: ~75 TB across four tiers: hot NVMe per node, an erasure-coded warm cluster, a shared NFS model cache, and an SMB cold archive. Promotion and demotion between tiers are automatic.
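The mover itself isn't public, but the shape of the policy is roughly the sketch below; tier names, thresholds, and the heat metric are all invented for illustration.

```python
# Hedged sketch of an access-frequency tiering policy. Tier names,
# thresholds, and the heat metric are invented; the real mover differs.
import time
from dataclasses import dataclass, field

TIERS = ["hot-nvme", "warm-ec", "nfs-cache", "cold-smb"]  # fast -> slow

@dataclass
class StoredObject:
    path: str
    tier: str
    accesses: list[float] = field(default_factory=list)  # unix timestamps

    def heat(self, window_s: float = 7 * 86400) -> int:
        """Count accesses within the trailing window."""
        cutoff = time.time() - window_s
        return sum(1 for t in self.accesses if t >= cutoff)

def next_tier(obj: StoredObject, promote_at: int = 20, demote_at: int = 2) -> str:
    """One step hotter when busy, one step colder when idle."""
    i = TIERS.index(obj.tier)
    h = obj.heat()
    if h >= promote_at and i > 0:
        return TIERS[i - 1]
    if h <= demote_at and i < len(TIERS) - 1:
        return TIERS[i + 1]
    return obj.tier
```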

Network: 100 GbE fiber-optic backbone with managed switching across all nodes, fronted by a dedicated perimeter firewall appliance.

Orchestration: modern scheduler + service mesh with mTLS on every internal hop, including localhost. Inter-host trust over a private overlay.
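What that looks like from a client's seat, as a hedged Python sketch; cert paths, port, and the CA layout are placeholders, not our mesh's actual filesystem.

```python
# Client side of an mTLS hop, localhost included: present our own cert,
# verify the server against the mesh CA. Paths, port, and the expected
# SAN are placeholders.
import http.client
import ssl

ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
ctx.load_verify_locations("/etc/mesh/ca.pem")                   # mesh CA bundle
ctx.load_cert_chain("/etc/mesh/svc.pem", "/etc/mesh/svc.key")   # client identity
ctx.check_hostname = True  # even 127.0.0.1 traffic gets a verified SAN

conn = http.client.HTTPSConnection("localhost", 8443, context=ctx)
conn.request("GET", "/healthz")
print(conn.getresponse().status)
```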

Inference runtimes: vLLM (sleep-mode rotation for frontier MoE), SGLang (training-adjacent + long-context dense), llama.cpp + llama-swap (32 GiB single-GPU pool with auto-load / auto-evict), MLX (Mac side). The interesting bugs live at the seams between these.
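From the caller's side, the llama-swap pool is one OpenAI-compatible endpoint: a request naming a cold model blocks while it loads, and idle models get evicted. A sketch, with the endpoint URL and model names assumed.

```python
# Sketch of auto-load / auto-evict from the client side. The endpoint
# URL and model names are assumptions; llama-swap fronts the pool with
# a single OpenAI-compatible API and loads whatever a request names.
from openai import OpenAI

client = OpenAI(base_url="http://llama-swap.internal:8080/v1", api_key="unused")

# The first request to a cold model pays the load; the previous
# resident is evicted if the 32 GiB card can't hold both.
for model in ["qwen2.5-coder-32b", "llama-3.1-8b"]:
    r = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": "one-line self-intro"}],
    )
    print(model, "->", r.choices[0].message.content)
```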

Code at github.com/DoradusResearch. Cross-posted on Hugging Face at huggingface.co/DoradusResearch (forthcoming).

Reach out to commission research, or find us on GitHub or X (@DoradusAI). We also read r/LocalLLaMA and the vLLM Discord.