vibescoder
// Posts tagged: homelab

GLM Is the New Hotness, So Let's Test It On the Homelab

·14 min read

GLM is suddenly everywhere in developer conversations. Before we run the bakeoff, we need to answer two questions: what is GLM, and is it suitable for a single RTX 5090 homelab?

AI-NT-No-Problem: Cramming a 9950X3D and RTX 5090 Into an SFF Custom Loop

·11 min read

A full-tower AI homelab with a 420mm AIO gets rebuilt into an SFF open-frame case with custom hardline water cooling. Two 240mm slim radiators, a single shared loop, 580W peak heat load, and 2,726 sensor readings that show whether the tradeoff was worth it.

Model Showdown Round 7: Five Local Models vs. One Cloud Model on a Real Coding Task

·13 min read

I gave five local LLMs and one frontier cloud model the same coding task on my homelab: build a tag manager for the blog's admin panel. Only two shipped anything. Here's what happened.

Homelab Bakeoff: OpenClaw Outperforms Hermes… With Hermes Models

·15 min read

Two Discord bots, one 14B model, five fitness-tracker tasks. Both agents failed on the first try. Getting them working required debugging context overflow, silent tool parameter drops, and a chat template flag that changes everything. The results reveal as much about the state of local AI agents as they do about which framework won.

Updating Coder To Get User Secrets and the Art of Knowing Where Your Secrets Belong

·9 min read

Coder 2.34 shipped User Secrets — per-user credential storage that injects into every workspace automatically. We upgraded, audited 29 secrets across four projects, and found exactly two that belonged there. Here's how we decided, how we migrated, and what we cleaned up along the way.

QoL with WoL: Turning on the Homelab from Anywhere in the World

·8 min read

A full walkthrough of setting up Wake on LAN on a Linux homelab and wiring it into Google Home via SmartThings — including every dead end, expired link, and wrong interface name along the way.

Qwen Is Not Yet Ready to Power Local OpenClaw Deployments

·9 min read

Two weeks of using Qwen3.5-35B as my daily AI assistant — the Jinja template fix that made it work, the thermal spam incident that almost ended the experiment, and the session-context gap that makes it feel like a junior dev every morning. Plus: what's next with Qwen 3.6.

Wiring MCP Into My Fitness Tracker — and Asking OpenClaw About My Last Workout

·13 min read

I built a Model Context Protocol server into the fitness tracker I vibe coded a year ago, wired it through Vercel and Coder workspaces, and ended the afternoon asking my Discord bot what my last workout was. Here's the build, the wrong turn into Coder's AI Bridge, the workaround, and how the same endpoint now serves Claude Desktop, Codex, Coder Agents, and OpenClaw.

Installing OpenClaw on the Homelab

·11 min read

From curl to working Discord bot in one afternoon — with a local LLM on the RTX 5090. Every gotcha, every config mistake, and the one setting that silently ate every server channel reply for hours.

Thursday Thoughts: The Models We Can't Run

·7 min read

DeepSeek V4-Pro, V4-Flash, and Zyphra ZAYA1 are three of the most exciting new models in local AI. None of them run on our RTX 5090 homelab — for completely different reasons. Here's the research, the math, and what it means for anyone building a local inference rig.

The Fix That Was Fixed Four Times

·9 min read

A second user joined the homelab Coder instance and couldn't push to GitHub. What looked like a missing config turned into five chained problems, a domain migration aftershock, an agent-debugging-an-agent meta-moment, and the discovery that the same credential helper bug had been "fixed" four times in ten days — and never actually deployed.

Model Showdown Round 3: Ditching Ollama in Favor of llama.cpp

·17 min read

We ripped out Ollama, migrated to llama.cpp, and benchmarked five local models across 12 tasks on an RTX 5090. The results surprised us — and the winner wasn't who we expected.

Wacky Wednesday: Why I Won't Daily Linux as My Desktop

·6 min read

I asked an AI agent to turn off my RGB lights on Linux. 85 terminal commands, 35 failures, 4 hangs, 2 dead download links, one wrong build system, and the GPU is still glowing. This is the post.

Slaying the Gemma Beast: How We Fixed Local AI and Shipped Search

·17 min read

Gemma 4 failed to build a single feature in our last test. This time we diagnosed the problem, switched from Ollama to llama.cpp, tuned the inference settings, and Gemma shipped a working search feature to production. Then Opus reviewed the code and made it better. Here's what we learned about making local models actually work.

Invisible Failures: The Bugs That Hide in Plain Sight

·12 min read

Four bugs that were silently breaking things for days: a deploy that only crashes on new images, a shell guard that eats your auth tokens, a publish date frozen at draft creation, and a homelab with no emergency remote access. Plus: capacity planning for when you're running AI workspaces on a single machine.

The Agentic Gap: Claude Oneshots, Gemma Fails

·12 min read

We pitted Gemma 4 against Opus 4.6 on a real feature build for vibescoder.dev. Gemma is the fastest model in our benchmark. It also couldn't finish the job. Here's what happened when we stopped testing toy apps and started building production code.

Model Showdown Round 2: Adding Gemma, Kimi, and 579 GB of Stubborn Optimism

·15 min read

We added Google's Gemma 4 and Moonshot's 1-trillion-parameter Kimi K2 to the local model benchmark. Five out of six models scored perfect. Gemma 4 is the new speed king. And yes, we ran a 579 GB model off an NVMe drive — at 0.6 tokens per second.

Downtime Is a Feature: Custom Domains, Cloudflare, and MCP While Models Download

·11 min read

While waiting for massive open source models to download, I tackled the homelab backlog: custom domain for my Coder instance via Cloudflare Tunnel, security hardening (with a gotcha that could kill your AI search visibility), and wiring up MCP servers to give agents superpowers.

Model Showdown: Benchmarking Local vs Cloud LLMs on a Real Coding Task

·18 min read

We gave six LLMs the exact same coding prompt and measured everything: speed, tokens, and whether the code actually works. Three models scored perfect. Two built the wrong kind of app. One ran out of tokens mid-line.

Putting the GPU to Work: Running Local LLMs on a Home Lab

·12 min read

Installing Ollama, pulling five purpose-built models, wiring local inference into Coder Agents, and running agentic coding on an RTX 5090 workstation. 44 GB of models, zero cloud API calls, fully self-hosted.

From Idea to Infrastructure: Standing Up a Self-Hosted AI Dev Environment

·10 min read

The journey from "I should build a home lab" to a fully configured self-hosted Coder server with GitHub integration, multi-user workspaces, and AI agents that actually know how to use the tools available to them.