GLM Is the New Hotness, So Let's Test It On the Homelab
GLM is suddenly everywhere in developer conversations. Before we run the bakeoff, we need to answer two questions: what is GLM, and is it suitable for a single RTX 5090 homelab?
A full-tower AI homelab with a 420mm AIO gets rebuilt into an SFF open-frame case with custom hardline water cooling. Two 240mm slim radiators, a single shared loop, 580W peak heat load, and 2,726 sensor readings that prove whether the tradeoff was worth it.
Three bugs that looked fixed from the wrong vantage point. An unquoted YAML date that crashed the public homepage one month after we wrote a blog post about the same bug. A model string that worked until its deprecation date passed. A security commit that stacked three invisible failures on top of each other. And a lesson about what accumulates when you build fast with agents.
A finance intern is spending her summer observing business processes and vibe coding automation tools. Not a CS major. Not shadowing someone. Building something real. It is a small example that says something big about how AI is reshaping internships, careers, and what the word "developer" actually means.
Vibe coding has moved from hobbyist curiosity to enterprise rollout across knowledge workers, and the next wave of AI adoption will be defined by governance and token economics.
A CEO panel at an AI event sparked a simple but powerful question every startup founder should ask themselves: does your business get better as AI models improve, or does it get worse?
I gave five local LLMs and one frontier cloud model the same coding task on my homelab: build a tag manager for the blog's admin panel. Only two shipped anything. Here's what happened.
Four frontier models, ten tasks, one government shutdown. We ran Claude Fable 5 through the homelab benchmark harness three hours before Anthropic pulled the plug — and it came in second. Here's the full bakeoff.
Two Discord bots, one 14B model, five fitness-tracker tasks. Both agents failed on the first try. Getting them working required debugging context overflow, silent tool parameter drops, and a chat template flag that changes everything. The results reveal as much about the state of local AI agents as they do about which framework won.
Coder 2.34 shipped User Secrets — per-user credential storage that injects into every workspace automatically. We upgraded, audited 29 secrets across four projects, and found exactly two that belonged there. Here's how we decided, how we migrated, and what we cleaned up along the way.
A model refresh on the homelab (Qwen 3.6, new embeddings, 469 llama.cpp builds), a feature sprint on the vacation planning site (calendar sync, expense tracking, and three bugs that taught us more than the features did), and automating Substack syndication after discovering two more undocumented quirks. Three unrelated workstreams, one theme: maintenance is where the real learning happens.
At a C-suite roundtable in Palo Alto last week, ten-plus executives from a mix of gaming platforms, enterprise systems providers, job sites, and other Bay Area titans landed on the same analogy without being prompted: we've seen this before. The lift-and-shift era of AI is already here. The native era — where you redesign workflows from scratch for agents, not humans — is what comes next.
I've been running OpenClaw on the homelab for a month. A recommendation sent me down the Hermes Agent rabbit hole — and the research before the first real test revealed my daily driver model was broken for tool calling all along.
Substack supports RSS import, but the importer is finicky, undocumented, and rejects feeds for reasons it won't tell you. Here's how we got 13 curated posts from a Next.js blog into Substack — and what every other guide leaves out, including the dedup gotcha that bit us on the re-import.
The fitness tracker MCP server was a test run. This week I added the same thing to vibescoder.dev — 16 tools that let any agent list posts, publish drafts, check analytics, trigger deploys, cross-post to Dev.to, and troubleshoot the live site. Here's the build, the architectural decisions, and what it's like when the agent that built the feature can immediately use it.
I asked an agent to security-audit my fitness tracker after wiring MCP into it. It found nineteen things. I fixed them all in four neat batches. Then the dashboard went empty, Google sign-in died, and the real bugs turned out to be the ones the audit couldn't see — a middleware file that had been silently doing nothing for months, and an OAuth client that never existed in any project I owned.
One missing pair of quotes in one frontmatter field took down the admin drafts page. YAML 1.1 auto-parsed the date to a JS Date object, formatDate called .includes on it, and the route 500'd. Here's the bisect from a mobile screenshot to a one-line fix, why only the drafts page broke, and the lesson about trusting types at the YAML boundary. Part two of a two-part Friday Fixes — see #1 for the scheduled-publish workflow bugs that landed the same day.
The scheduled-publish GitHub Action broke twice in nine days. Bug one: a grep that matched body text instead of frontmatter, triggered by a post about the feature itself. Bug two: a dead-code line introduced by the fix for bug one — racy under set -euo pipefail, probabilistically silent for eight days, then 42 consecutive failures with zero notifications.
Someone vibe coded an app with Google AI Studio. The Gemini API key shipped in the client-side JavaScript bundle. Google suspended the project. Here's why every AI coding tool gets this wrong, why regular audits are the only real defense, and what you can do before it happens to you.
The Round 5 bakeoff produced four implementations. None of them shipped. What shipped was a merge of the best pieces from all four, then a polish pass against real data. Bakeoff → Merge → Polish is a generalizable pattern for any feature where the design space is genuinely unclear.
Three AI agents audited the blog and produced three different reports. Closing them out was its own job — triage, phasing, verification, and ten commits across two repos with zero build failures. Here's the remediation arc, what shipped, what got deferred, and what the process revealed about working through someone else's audit.
Our AEO audit gave vibescoder.dev a clean bill of health. Cloudflare's isitagentready.com gave it a 25 out of 100. Both audits were right — they were measuring two different competencies. Here's the side-by-side, what each one caught, and the two genuine gaps we shipped fixes for — taking the score from 25 to 33 (and on track for 39 after the next scan).
DeepSeek V4-Pro, V4-Flash, and Zyphra ZAYA1 are three of the most exciting new models in local AI. None of them run on our RTX 5090 homelab — for completely different reasons. Here's the research, the math, and what it means for anyone building a local inference rig.
A second user joined the homelab Coder instance and couldn't push to GitHub. What looked like a missing config turned into five chained problems, a domain migration aftershock, an agent-debugging-an-agent meta-moment, and the discovery that the same credential helper bug had been "fixed" four times in ten days — and never actually deployed.
Two AI models got the same prompt: review the blog fodder, check for redundancy, and draft a post. Opus chose a debugging war story. Qwen chose a data-driven redesign. Neither picked the same fodder. Here's what the difference reveals about how models think about content.
Three rounds of iPhone screenshots to fix spacing that should have been right the first time. The fix wasn't smaller padding — it was teaching the agent the pixel math once so it never forgets. Plus: admin pillbox for drafts, hamburger menu shortcut, Invalid Date bugs, and scheduled publishing for every draft.
As AI agents make code generation trivial, the real value shifts from storing source code to preserving the chat conversations that created it.
I asked an AI agent to turn off my RGB lights on Linux. 85 terminal commands, 35 failures, 4 hangs, 2 dead download links, one wrong build system, and the GPU is still glowing. This is the post.
A CRLF bug silently broke every workspace for weeks. Then we fixed it, taught the agent to remember, moved templates to Git, squashed a nested heredoc, cut boot time from 91 seconds to 5, automated the screenshot pipeline, and built scheduled publishing — which this post used to publish itself. Ten fixes, one week.
How AI agents are transforming software development the same way Google Maps revolutionized travel, making the impossible feel effortless and opening up new worlds of exploration.