From Chaos to Signal: How We Fixed Our Blog's Tag System

Last week, I shipped a filter bar that barely worked.

The feature was live, the code was clean, the transitions were smooth. But clicking through the filters barely changed anything. Click [ai] and three posts disappeared. That's not filtering — it's a rounding error.

The filters weren't broken. The data was.

Three PRs, two repos, 18 MDX files, and the realization that folksonomy without taxonomy is just noise.

Before We Start: The Broken Build

The Vercel build was already failing before any filtering work started:

Type error: 'Fuse' only refers to a type, but is being used as a namespace here.

const FUSE_OPTIONS: Fuse.IFuseOptions<SearchIndexEntry> = {
                     ^

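The cause was a stale import on a diverged branch: the code referenced Fuse.IFuseOptions as a namespace member, but fuse.js exports IFuseOptions as a named type. Main already had the fix (import Fuse, { type IFuseOptions }) but the branch hadn't caught up.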

While fixing that, I found a second problem: the Friday scheduled post ("Friday Fixes: The Agent Was Flying Blind") had publishAt: 2026-05-02 — a Saturday. The post title literally says "Friday Fixes." Fixed the date to 2026-05-01 and flipped published: true directly.
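A small check along these lines could catch this kind of date typo before the cron runs. It is only a sketch: `weekdayOf` and `weekdayMismatch` are hypothetical helpers, not part of the blog engine, and it assumes `publishAt` is a plain `YYYY-MM-DD` string.

```ts
const WEEKDAYS = [
  'Sunday', 'Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday',
] as const;
type Weekday = (typeof WEEKDAYS)[number];

// Parse the date as UTC so the weekday does not shift with the machine's time zone.
function weekdayOf(isoDate: string): Weekday {
  const match = /^(\d{4})-(\d{2})-(\d{2})$/.exec(isoDate);
  if (!match) throw new Error(`Expected YYYY-MM-DD, got "${isoDate}"`);
  const [, y, m, d] = match;
  const date = new Date(Date.UTC(Number(y), Number(m) - 1, Number(d)));
  return WEEKDAYS[date.getUTCDay()];
}

// If the title names a weekday, the publish date should fall on that weekday.
// Returns a message describing the mismatch, or null when everything lines up.
function weekdayMismatch(title: string, publishAt: string): string | null {
  const named = WEEKDAYS.find((day) => title.includes(day));
  if (!named) return null;
  const actual = weekdayOf(publishAt);
  return named === actual ? null : `Title says ${named}, but ${publishAt} is a ${actual}`;
}
```

Run against the post above, `weekdayMismatch` flags `2026-05-02` and passes `2026-05-01`.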

Fix the build, rescue the post, then ship the feature.

V1: Tag-Based Filter Bar

First pass. I added a filter bar labeled // grep to the homepage, with tag pills:

// grep                                               // sort
[*]  [ai]  [homelab]  [agents]  [coder]  [+14]       [newest ↓]

The architecture was clean: FilterBar.tsx renders tag pills and a sort toggle, PostListWithFilters.tsx wraps them in a "use client" boundary with AnimatePresence for smooth transitions, and page.tsx still server-renders all posts and passes them as props. URL sync via ?tag=ai&sort=oldest — shareable, clean URL at defaults.
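The URL sync can be sketched as a pair of pure functions. This is an illustration under assumptions, not the engine's actual code: `FilterState`, `toQuery`, and `fromQuery` are hypothetical names, and `newest` is assumed to be the default sort.

```ts
type SortOrder = 'newest' | 'oldest';

interface FilterState {
  tag: string | null; // null means the [*] wildcard
  sort: SortOrder;
}

const DEFAULT_STATE: FilterState = { tag: null, sort: 'newest' };

// Serialize only non-default values so the default view keeps a clean URL.
function toQuery(state: FilterState): string {
  const params = new URLSearchParams();
  if (state.tag !== null) params.set('tag', state.tag);
  if (state.sort !== DEFAULT_STATE.sort) params.set('sort', state.sort);
  const query = params.toString();
  return query ? `?${query}` : '';
}

// Parse a query string back into state; missing or unknown values fall back to defaults.
function fromQuery(search: string): FilterState {
  const params = new URLSearchParams(search);
  return {
    tag: params.get('tag') || null,
    sort: params.get('sort') === 'oldest' ? 'oldest' : 'newest',
  };
}
```

Keeping these functions free of router calls makes the round trip easy to test; the client component would only need to push the resulting string into the address bar.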

Design choices that felt right: [*] glob wildcard instead of "All" (dev personality), single-select (simplest UX), top 4 tags by frequency inline with [+N] overflow for the rest.
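The "top 4 inline, rest behind [+N]" split could look roughly like this. `splitTopTags` is a hypothetical helper that takes precomputed per-tag post counts; the alphabetical tie-break is an assumption added here so the pill order stays stable.

```ts
interface TagSplit {
  inline: string[];             // tags rendered as pills
  overflow: string[];           // tags hidden behind the overflow pill
  overflowLabel: string | null; // e.g. "[+2]", or null when nothing overflows
}

function splitTopTags(counts: Record<string, number>, inlineCount = 4): TagSplit {
  // Most frequent first; ties broken alphabetically (an assumption, see above).
  const ranked = Object.entries(counts)
    .sort(([tagA, countA], [tagB, countB]) => countB - countA || tagA.localeCompare(tagB))
    .map(([tag]) => tag);

  const inline = ranked.slice(0, inlineCount);
  const overflow = ranked.slice(inlineCount);
  return {
    inline,
    overflow,
    overflowLabel: overflow.length > 0 ? `[+${overflow.length}]` : null,
  };
}
```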

It shipped. It worked. And clicking through the filters barely changed anything.

The Tag Saturation Problem

I ran a data audit on all 15 published posts. The numbers were bad:

| Tag | Posts | % of all posts | Verdict |
|-----|-------|----------------|---------|
| building-in-public | 14 | 94% | Describes the blog, not the post |
| ai | 12 | 81% | AI is the tool, not always the subject |
| coder | 10 | 69% | Coder is the platform |
| meta | 8 | 50% | Vague |
| homelab | 7 | 44% | First tag with real signal |
| agents | 7 | 44% | Meaningful |

Clicking [ai] hid 3 posts out of 15. The top 4 tags shown in the filter bar were essentially "all posts, but with different labels."
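An audit like this needs very little code. The sketch below assumes each post's tags are available as a string array; `auditTags` and `AuditRow` are hypothetical names, and the sample data in the usage note is made up for illustration.

```ts
interface AuditRow {
  tag: string;
  posts: number;   // number of posts carrying the tag
  percent: number; // share of all posts, rounded to a whole percent
}

function auditTags(postTags: string[][]): AuditRow[] {
  const total = postTags.length;
  const counts = new Map<string, number>();

  for (const tags of postTags) {
    // De-duplicate within a post so a repeated tag is only counted once.
    for (const tag of Array.from(new Set(tags))) {
      counts.set(tag, (counts.get(tag) ?? 0) + 1);
    }
  }

  return Array.from(counts.entries())
    .map(([tag, posts]) => ({ tag, posts, percent: Math.round((posts / total) * 100) }))
    .sort((a, b) => b.posts - a.posts || a.tag.localeCompare(b.tag));
}
```

For four sample posts tagged `['ai', 'meta']`, `['ai']`, `['ai', 'homelab']`, and `['homelab']`, the rows come back as `ai` at 75%, `homelab` at 50%, and `meta` at 25%. A tag near the top with a share close to 100% hides almost nothing when clicked.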

The real content split was structural: 14 technical how-to posts and 1 opinion piece. Readers wanted to toggle between how-to, opinion, and popular — a distinction that freeform tags couldn't capture.

V2: Content-Type Filters

Replaced tag pills with three fixed content-type filters:

// grep                                               // sort
[*]  [how-to]  [opinion]  [popular]  [+tags]          [newest ↓]

Tags are folksonomy — freeform, inconsistent, grow unbounded. Content type is taxonomy — controlled vocabulary, exactly two values, every post gets one. I added a type field to the post schema:

export type PostType = 'how-to' | 'opinion';
 
export interface PostMeta {
  // ...existing...
  type?: PostType;  // undefined defaults to 'how-to'
}
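A controlled vocabulary only stays controlled if something rejects values outside it. One way to enforce that when reading frontmatter is sketched below. `parsePostType` is a hypothetical helper; treating `null` like `undefined` and throwing on unknown values are assumptions, not confirmed engine behavior.

```ts
const POST_TYPES = ['how-to', 'opinion'] as const;
type PostType = (typeof POST_TYPES)[number];

// Turn a raw frontmatter value into a PostType.
// A missing value defaults to 'how-to'; anything outside the vocabulary is an error.
function parsePostType(raw: unknown): PostType {
  if (raw === undefined || raw === null) return 'how-to';
  if (typeof raw === 'string' && (POST_TYPES as readonly string[]).includes(raw)) {
    return raw as PostType;
  }
  throw new Error(
    `Unknown post type ${JSON.stringify(raw)}; expected one of: ${POST_TYPES.join(', ')}`,
  );
}
```

Failing loudly on a typo such as `type: howto` keeps a third bucket from appearing by accident.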

The [popular] pill uses commentCount from the GitHub Discussions API — already wired to every post object on the homepage. Zero new infrastructure. It overrides the sort to comment-count DESC and hides the sort toggle (showing both implies they compose — they don't).
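The sort override can be sketched as follows. The names (`ActiveFilter`, `sortPosts`, `showSortToggle`) are hypothetical, `date` is assumed to be an ISO `YYYY-MM-DD` string, and breaking comment-count ties by newest first is an assumption.

```ts
type ActiveFilter = 'all' | 'how-to' | 'opinion' | 'popular';
type SortOrder = 'newest' | 'oldest';

interface ListedPost {
  slug: string;
  date: string; // ISO date, so plain string comparison orders chronologically
  commentCount: number;
}

const byDateAsc = (a: ListedPost, b: ListedPost): number =>
  a.date < b.date ? -1 : a.date > b.date ? 1 : 0;

function sortPosts(posts: ListedPost[], filter: ActiveFilter, sort: SortOrder): ListedPost[] {
  const sorted = [...posts]; // copy so the props array is never mutated
  if (filter === 'popular') {
    // popular imposes its own order and ignores the sort toggle entirely
    return sorted.sort((a, b) => b.commentCount - a.commentCount || byDateAsc(b, a));
  }
  return sorted.sort((a, b) => (sort === 'newest' ? byDateAsc(b, a) : byDateAsc(a, b)));
}

// The toggle is hidden whenever its value would be ignored.
function showSortToggle(filter: ActiveFilter): boolean {
  return filter !== 'popular';
}
```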

Tags didn't disappear. They moved to a [+tags] expander — same expand/collapse UX as the old [+N], but now a secondary filter layer that composes with the content-type selection. Click [how-to], then drill into [homelab] from the expanded tag row.
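Composing the two layers amounts to a logical AND of two optional predicates. In this sketch, `filterPosts` and `FilterablePost` are hypothetical names, and `null` stands for "no selection" on either layer.

```ts
type PostType = 'how-to' | 'opinion';

interface FilterablePost {
  slug: string;
  type?: PostType; // undefined is treated as 'how-to'
  tags: string[];
}

function filterPosts(
  posts: FilterablePost[],
  typeFilter: PostType | null,
  tag: string | null,
): FilterablePost[] {
  return posts.filter((post) => {
    const type = post.type ?? 'how-to';
    const typeMatches = typeFilter === null || type === typeFilter;
    const tagMatches = tag === null || post.tags.includes(tag);
    return typeMatches && tagMatches; // both layers must agree
  });
}
```

Selecting `[how-to]` and then `[homelab]` corresponds to `filterPosts(posts, 'how-to', 'homelab')`; clearing either layer passes `null` for it.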

The Tag Cleanup

Shipped alongside the engine changes as a single commit to the content repo. The cleanup rules:

Removed entirely: building-in-public (94%, so nearly every post had it) and meta (50%, too vague to mean anything).

Trimmed: ai kept only on posts where AI is the subject (benchmarks, LLM setup), not where it's just the tool. coder kept only on Coder-specific config posts.

Merged: ai-agents → agents (inconsistent naming — 1 post vs. 7).

Backfilled: Three early "Day N" posts had tags that were 100% noise. After cleanup they were tagless. Added next-js back so they had at least one meaningful tag.
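The mechanical rules (remove, merge, backfill) can be expressed as one function over a post's tag list. This is a sketch rather than the script that was actually run. `cleanTags` is a hypothetical name, the trimming of ai and coder is an editorial call that is deliberately left out, and using next-js as a blanket fallback generalizes what was done by hand for three posts.

```ts
// Tags removed from every post.
const NOISE_TAGS = new Set(['building-in-public', 'meta']);
// Inconsistent names folded into their canonical form.
const RENAMES = new Map<string, string>([['ai-agents', 'agents']]);
// Assumption: one fallback tag for any post left with nothing.
const FALLBACK_TAG = 'next-js';

function cleanTags(tags: string[]): string[] {
  const cleaned: string[] = [];
  for (const tag of tags) {
    const canonical = RENAMES.get(tag) ?? tag;
    if (NOISE_TAGS.has(canonical)) continue;
    // A rename can produce a duplicate (ai-agents and agents on the same post).
    if (!cleaned.includes(canonical)) cleaned.push(canonical);
  }
  // Edge case: every tag was noise, so backfill instead of leaving the post tagless.
  return cleaned.length > 0 ? cleaned : [FALLBACK_TAG];
}
```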

| Metric | Before | After |
|--------|--------|-------|
| Highest tag saturation | 94% (building-in-public) | 56% (agents) |
| Median tags per post | 5 | 2–3 |
| Tags with only 1 post | 11 | 6 |
| Unique tags | 22 | 16 |

Gotchas

Spread order matters. In { ...meta, type: meta.type ?? 'how-to' }, the explicit type must come after the spread. Put it before and any type key that exists on the raw frontmatter object, including one whose value is undefined, overwrites the default.
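The behavior is easy to reproduce in isolation. In this sketch the `raw` object stands in for parsed frontmatter in which the `type` key exists but has no value; whether a given frontmatter parser produces such a key is an assumption to verify against the real data.

```ts
type PostType = 'how-to' | 'opinion';

interface PostMeta {
  title: string;
  type?: PostType;
}

// The key is present, but its value is undefined.
const raw: PostMeta = { title: 'Day 1', type: undefined };

// Correct: the explicit property comes last, so the default wins.
const correct = { ...raw, type: raw.type ?? 'how-to' };

// Wrong: the spread comes last and copies raw's own `type: undefined` over the default.
const wrong = { type: raw.type ?? 'how-to', ...raw };

// When the key is absent altogether, the spread has nothing to copy,
// so even the wrong order happens to keep the default.
const missingKey: PostMeta = { title: 'Day 2' };
const accidentallyFine = { type: missingKey.type ?? 'how-to', ...missingKey };

console.log(correct.type, wrong.type, accidentallyFine.type);
```

The last case is why the bug can hide: it only appears for posts whose frontmatter object carries the key with an empty value.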

[popular] sort hides the toggle. When popular is active, it imposes commentCount DESC. Leaving the newest ↓ / oldest ↑ toggle visible implies two sorts are composable. Hiding it is cleaner.

Posts with zero remaining tags. After removing the noise, three posts had nothing left. The cleanup script has to account for the edge case where all of a post's tags were noise. Better to backfill one real tag than leave a post tagless.

What I Learned

Build the filter first, audit the data second — and you'll build it twice. V1 was architecturally sound. The problem was upstream. If I'd run the saturation analysis before writing FilterBar.tsx, I'd have gone straight to content-type filters and skipped the intermediate version entirely.

Folksonomy breaks at small scale, not large. The conventional wisdom is that freeform tags degrade as a corpus grows. At 15 posts, they were already useless — not because there were too many tags, but because the same tags appeared on everything. A 94% saturation rate means the tag is describing the blog, not the post.

Taxonomy is a design constraint, not a limitation. Exactly two values (how-to | opinion) sounds restrictive. In practice, it made every editorial decision simpler — there's no ambiguity about which bucket a post belongs in. Constraints that reduce decision fatigue are features.

What's Next

The filter bar is live. The data is clean. The question is whether it holds.

If we grow to 50 posts, will agents climb toward the 94% that building-in-public reached? The answer depends on whether the taxonomy structure nudges writers toward specificity or whether old habits return. I'll be watching the distribution. If agents hits 80%, it's time for a lint rule.
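Such a lint rule does not exist yet, but its core could be as small as the sketch below. `findSaturatedTags` is a hypothetical name, the input is assumed to be per-tag post counts from an earlier audit step, and the 0.8 default mirrors the 80% threshold mentioned above.

```ts
// Return every tag whose share of posts meets or exceeds the threshold, sorted by name.
function findSaturatedTags(
  counts: Record<string, number>,
  totalPosts: number,
  threshold = 0.8,
): string[] {
  if (totalPosts <= 0) return [];
  return Object.entries(counts)
    .filter(([, posts]) => posts / totalPosts >= threshold)
    .map(([tag]) => tag)
    .sort();
}
```

In CI, a non-empty result would fail the content build and name the offending tags.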

For now: the filters work, the data is clean, and we finally have a system that distinguishes what the post is about from what the blog is about.

By the Numbers

3 PRs merged to main across 2 repos in one session
310 lines added for V1 filter bar, then 107 added / 78 removed for V2 refactor
18 MDX files updated in the tag cleanup
47 tag instances removed, 20 added — net −27 across all posts
94% → 56% highest tag saturation (building-in-public before, agents after)
5 → 2–3 median tags per post
0 new dependencies across both versions
1 broken build fixed, 1 missed scheduled post rescued before the feature work started

This post was originally drafted by Qwen 3.5 35B-A3B running locally on the homelab, then editorially polished by Opus 4.6. Both models were given the same prompt and the same fodder files — Qwen chose this topic, Opus chose a different one entirely. The meta comparison is in Model Showdown Round 4: Opus vs Qwen — Writers, Not Coders. Qwen's unedited draft is below.

---
title: "From Chaos to Signal: How We Fixed Our Blog's Tag System"
date: '2026-05-12'
description: >-
  Tag filters barely changed anything. A data audit revealed the problem:
  94% of posts had the same tags. We replaced folksonomy with taxonomy,
  rebuilt the filter bar, and cut tag saturation from 94% to 56%.
  Here's the complete story of V1 to V2.
tags:
  - next-js
  - content-design
  - tagging
  - data-audit
published: false
type: how-to
sources:
  - blog-fodder-filtering-taxonomy-overhaul-may-1.md
---
 
# From Chaos to Signal: How We Fixed Our Blog's Tag System
 
Last week, I shipped a filter bar to my blog that barely worked.
 
The feature was live, the code was clean, the transitions were smooth. But
clicking through the filters barely changed anything. Click `[ai]` and three
posts disappeared. That's not filtering — it's a rounding error.
 
I had built a solution that solved the wrong problem. The filters weren't
broken; the data was.
 
This post covers the complete story: the broken build that started everything,
the V1 filter that shipped, the data audit that exposed the rot, the V2
rebuild, and the tag cleanup that cut saturation from 94% to 56%.
 
Three PRs. Two repos. 18 MDX files. And the realization that **folksonomy
without taxonomy is just noise.**
 
---
 
## The Broken Build (Context)
 
Before any filtering work started, the Vercel build was failing. The error:
 
```
Type error: 'Fuse' only refers to a type, but is being used as a namespace
here.
 
const FUSE_OPTIONS: Fuse.IFuseOptions<SearchIndexEntry> = {
                     ^
```
 
The `fix/search-polish` branch had a stale import — `Fuse.IFuseOptions` as a
namespace, but fuse.js exports `IFuseOptions` as a named type, not a namespace
member. Main already had the fix (`import Fuse, { type IFuseOptions }`) but the
branch was diverged.
 
**Also discovered:** The Friday scheduled post ("Friday Fixes: The Agent Was
Flying Blind") didn't go live because the `publishAt` date was `2026-05-02` — a
Saturday. The post title literally says "Friday Fixes." The cron would have
fired a day late. Fixed the date to `2026-05-01` and flipped `published: true`
directly.
 
**Root cause of both:** The build log was from a preview deploy on the branch,
not production. But the scheduled post was a genuine date typo.
 
The filters could wait. Fix the build, rescue the post, then ship the feature.
 
---
 
## V1: Tag-Based Filter Bar (PR #8)
 
First pass at filtering. Added a `// grep` labeled filter bar to the homepage
with tag pills:
 
```
// grep                                               // sort
[*]  [ai]  [homelab]  [agents]  [coder]  [+14]       [newest ↓]
```
 
### Architecture
 
- `FilterBar.tsx` — tag pills + sort toggle, `// grep` and `// sort` labels
  matching the blog's code-comment aesthetic
- `PostListWithFilters.tsx` — `"use client"` wrapper with `AnimatePresence` for
  smooth transitions
- `page.tsx` still server-renders all posts, passes them as props to the client
  boundary
- URL sync via `?tag=ai&sort=oldest` — shareable, clean URL at defaults
 
### Design decisions
 
- `[*]` glob wildcard instead of "All" — dev personality
- Single-select, not multi — simplest possible UX
- Top 4 tags by frequency shown inline, rest behind `[+N]` overflow pill
- Sort toggle: `newest ↓` / `oldest ↑`, single pill that flips on click
- Empty state: "No posts match. Try `[*]` to reset."
 
### The problem
 
It shipped and worked, but clicking through the filters barely changed
anything.
 
---
 
## The Tag Saturation Problem
 
I ran a data audit on all 15 published posts. The numbers were bad:
 
| Tag | Posts | % of all | Verdict |
|-----|-------|----------|---------|
| building-in-public | 14 | 94% | Describes the blog, not the post |
| ai | 12 | 81% | Same — AI is the tool, not always the subject |
| coder | 10 | 69% | Same — Coder is the platform |
| meta | 8 | 50% | Vague |
| homelab | 7 | 44% | First tag with real signal |
| agents | 7 | 44% | Meaningful |
 
Clicking `[ai]` hid 3 posts. That's not filtering — it's a rounding error.
The top 4 tags shown in the filter bar were essentially "all posts, but with
different labels."
 
The filter bar was showing noise at scale.
 
### Content classification
 
I classified all posts by type: 14 technical how-to posts and 1 opinion piece
("Agents Are My New Google Maps"). The user wanted to toggle between how-to,
opinion, and popular — a structural distinction that tags couldn't capture.
 
---
 
## V2: Content-Type Filters (PR #9)
 
Replaced tag pills with three fixed content-type filters:
 
```
// grep                                               // sort
[*]  [how-to]  [opinion]  [popular]  [+tags]          [newest ↓]
```
 
### Why frontmatter, not tags
 
Tags are folksonomy — freeform, inconsistent, grow unbounded. Content type is
taxonomy — controlled vocabulary, exactly 2 values (`how-to` | `opinion`),
every post gets exactly one. It's a structural property, not a descriptor.
 
Added `type` field to `PostMeta`:
 
```ts
export type PostType = 'how-to' | 'opinion';
 
export interface PostMeta {
  // ...existing...
  type?: PostType;  // undefined defaults to 'how-to'
}
```
 
### The `[popular]` pill
 
Uses `commentCount` from the GitHub Discussions API — already wired to every
post object on the homepage. Zero new infrastructure. Overrides the sort to
comment-count DESC. Sort toggle hides when popular is active (it imposes its
own sort).
 
Later can swap in Upstash Redis view counts (already being collected via
`PageViewTracker`) — the filter UI stays identical.
 
### `[+tags]` replaces `[+N]`
 
Same expand/collapse UX, but now it's the secondary filter layer. Tags and
content-type filters compose — you can select `[how-to]` and then drill into
`[homelab]` from the expanded tag row.
 
### Files changed
 
**Engine repo (`the-vibe-coder`):**
 
| File | Change |
|------|--------|
| `src/lib/types.ts` | Added `PostType`, `type` field on `PostMeta` and `Post` |
| `src/lib/posts.ts` | Default `type` to `'how-to'`, removed `getTopTags()` |
| `src/components/FilterBar.tsx` | Rewritten: content-type pills + `[+tags]`
expander |
| `src/components/PostListWithFilters.tsx` | Rewritten: type filtering, popular
sort, tag+type composition |
| `src/app/page.tsx` | Simplified props (removed `topTags`) |
 
**Content repo (`the-vibe-coder-content`):**
 
- 18 `.mdx` files updated (frontmatter `type` field added, tags cleaned)
 
---
 
## Tag Taxonomy Cleanup (Content PR #1)
 
Shipped alongside the engine changes. One commit to `the-vibe-coder-content`:
 
### Removed from all posts
 
- `building-in-public` (94% → every post had it, zero signal)
- `meta` (50%, vague — meant "about the blog itself")
 
### Trimmed
 
- `ai` — kept only on posts where AI is the *subject* (benchmarks, LLM setup),
  removed from posts where AI was just the tool
- `coder` — kept only on Coder-specific setup/config posts (2 posts)
 
### Merged
 
- `ai-agents` → `agents` (inconsistent naming, 1 vs 7 posts)
 
### Removed from opinion post
 
- `software-development`, `productivity`, `opinion` (now captured by
  `type: opinion` field)
 
### Added `next-js`
 
To early "Day N" posts that had no remaining tags after cleanup.
 
### Before/after
 
| Metric | Before | After |
|--------|--------|-------|
| Highest tag saturation | 94% (building-in-public) | 56% (agents) |
| Median tags per post | 5 | 2-3 |
| Tags with only 1 post | 11 | 6 |
| Unique tags | 22 | 16 |
 
---
 
## Gotchas
 
### 1. Spread order matters
 
```ts
{ ...meta, type: meta.type ?? 'how-to' }
```
 
The explicit `type` must come AFTER the spread, otherwise the spread overwrites
it with `undefined`.
 
### 2. `published: true` in body text
 
The tag cleanup script used `grep -q "published: true"` to find published
posts. One unpublished draft had that string in its body content ("someone flips
`published: false` → `published: true`"). The `sed` that added the `type` field
correctly matched `^published: true$` (anchored), but the grep guard didn't.
Always anchor your patterns when frontmatter fields can appear as prose.
 
### 3. Posts with zero remaining tags
 
After removing the noise tags, three early "Day N" posts had tags that were
100% noise (`meta`, `ai`, `coder`, `building-in-public`). Had to add `next-js`
back to keep them from being tagless.
 
### 4. `[popular]` sort needs to hide the sort toggle
 
When popular is active, it imposes `commentCount DESC`. Showing a `newest ↓` /
`oldest ↑` toggle alongside that is confusing — it implies two sorts are
composable. Hiding the toggle when popular is selected is cleaner.
 
---
 
## By the Numbers
 
- **3 PRs** — merged to main across 2 repos in one session
- **310 lines** — added for v1 filter bar, then **107 added / 78 removed** for
  v2 refactor
- **18 MDX files** — updated in the tag cleanup
- **47 tag instances removed**, 20 added (net -27 tag instances across all
  posts)
- **94% → 56%** — highest tag saturation dropped from `building-in-public` to
  `agents`
- **5 → 2-3** — median tags per post
- **0 new dependencies** — across both versions
- **1 broken build fixed**, **1 missed scheduled post rescued** — before the
  feature work even started
 
---
 
## What's Next
 
The new filter bar is live. The data is clean. The next question: what happens
when we add more content?
 
If we grow to 50 posts, will `agents` hit 94% again? Or will the taxonomy hold?
The answer depends on whether writers reach for the new taxonomy or drift back
to the old habits.
 
I'll be watching the tag distribution over the coming months. If `agents` creeps
back toward 80%, I'll know we need a nudge — maybe a tag suggestion UI, or a
lint rule that prevents noise tags.
 
For now: the filters work. The data is clean. And we finally have a system that
distinguishes **what the post is about** from **what the blog is about**.