Prairie Home Archive — Documentation
How this site was built, what it does, and how to deploy it
1. Project Overview
The Prairie Home Companion Archive is a full-stack web application that catalogs and streams 1,136 episodes of A Prairie Home Companion (1985–2017). Episodes carry metadata (date, venue, host, tags) and, where available, an MP3 audio stream; many also include timestamped rundowns listing every skit and musical number.
The app reads directly from a 74 MB SQLite database (data/prairiehome.db) using better-sqlite3 in read-only mode. There is no ORM — all queries are plain SQL in src/lib/db.ts.
Key Features
- Browse 1,136 episodes with year/audio/rundown filters and full-text search
- Deep search across titles, descriptions, venues, tags, and rundown content
- Persistent audio player that survives all client-side navigations
- Clickable rundown timestamps that seek the audio player (2 formats supported)
- Shareable deep links like /episode/59159#seek-00:08:37
- Dark/light theme persisted to localStorage
- Keyboard shortcuts for playback (Space, arrows, M)
- Volume, mute, and playback speed controls
- Auto-submitting search with debounce (no Search button)
- REST API for all data endpoints
2. Tech Stack
| Layer | Technology | Version |
|---|---|---|
| Framework | Next.js (App Router) | 16.2.7 |
| UI Library | React | 19.2.4 |
| Database | SQLite via better-sqlite3 | 12.10.0 |
| Styling | Tailwind CSS (v4) + custom CSS variables | 4.x |
| Fonts | Inter, EB Garamond, Public Sans (Google Fonts) | — |
| TypeScript | Strict mode | 5.x |
| Lint | ESLint (eslint-config-next) | 9.x |
| Python Tools | Scrapers, audio download, STT, ML (see §12) | 3.x |
3. Architecture & File Structure
The app is a hybrid: server components handle data fetching (browse, search, episode detail, stats), while client components (PlayerBar, EpisodeCard, RundownContent) handle interactivity. The PlayerProvider wraps the entire app via layout.tsx so audio state persists across navigations.
4. Database Schema
The database at data/prairiehome.db contains four tables.
episodes
The primary table — 1,136 rows, one per episode.
transcriptions
Transcribed audio (Deepgram STT). Keyed to episodes.show_id.
audio_events
Detected applause/music/intermission breaks from transcript gaps.
scrape_log
Audit log from the Python scraper.
id, url, status (ok/fetch_error), items_found, scraped_at
5. Pages & Routes
| Route | Type | Description |
|---|---|---|
| / | Server (dynamic) | Browse all episodes. Supports ?q, ?year, ?audio, ?rundown, ?page filters. Shows card grid with search/filter bar and pagination. |
| /search | Server (dynamic) | Deep search across all fields including rundowns. Uses multi-pass priority search: title → venue → tags → description → rundown. Displays match type badges and context snippets. |
| /stats | Server (static) | Archive statistics: total episodes, with audio, total duration, missing data, episodes per year with proportional bar chart. |
| /episode/[showId] | Server (dynamic) | Episode detail: metadata, play button, description, tags, and interactive rundown with clickable timestamp seek links. Also handles deep links. |
| /docs | Server (static) | This page — project documentation. |
6. Client Components
PlayerProvider + PlayerContext (PlayerContext.tsx)
React Context provider that owns the <audio> element and exposes all playback state + controls via usePlayer(). Key states: isPlaying, isLoading, currentTime, duration, volume, muted, speed, episode. Methods: play(), pause(), seek(), skip(), setVolume(), setMuted(), setSpeed(), togglePlay().
The play() method accepts an optional seekTo parameter for starting playback at a specific position. State is persisted to localStorage every 2 seconds and restored on mount (via an effect, avoiding hydration mismatches). Keyboard shortcuts (Space, arrows, M) are registered at the provider level.
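The persist-and-restore step could look roughly like the sketch below. It is not the code in PlayerContext.tsx: the storage key, the field subset, the default values, and the function names are all assumptions made for illustration, and the current episode is left out because its shape is not documented here. The store is passed in as a parameter so the same functions work with window.localStorage in the browser and with an in-memory object elsewhere.

```typescript
// Hypothetical subset of the persisted player state.
interface PersistedPlayerState {
  volume: number;
  muted: boolean;
  speed: number;
  currentTime: number;
}

// Minimal slice of the Web Storage API that this sketch needs.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Assumed key name and defaults; the real app may use different ones.
const STORAGE_KEY = "player-state";
const DEFAULT_STATE: PersistedPlayerState = {
  volume: 1,
  muted: false,
  speed: 1,
  currentTime: 0,
};

function savePlayerState(store: KeyValueStore, state: PersistedPlayerState): void {
  store.setItem(STORAGE_KEY, JSON.stringify(state));
}

// Falls back to defaults for missing, corrupt, or wrongly typed values.
function loadPlayerState(store: KeyValueStore): PersistedPlayerState {
  const raw = store.getItem(STORAGE_KEY);
  if (raw === null) return { ...DEFAULT_STATE };
  try {
    const parsed = JSON.parse(raw) as Partial<PersistedPlayerState> | null;
    if (parsed === null || typeof parsed !== "object") return { ...DEFAULT_STATE };
    return {
      volume: typeof parsed.volume === "number" ? parsed.volume : DEFAULT_STATE.volume,
      muted: typeof parsed.muted === "boolean" ? parsed.muted : DEFAULT_STATE.muted,
      speed: typeof parsed.speed === "number" ? parsed.speed : DEFAULT_STATE.speed,
      currentTime:
        typeof parsed.currentTime === "number" ? parsed.currentTime : DEFAULT_STATE.currentTime,
    };
  } catch (err) {
    return { ...DEFAULT_STATE };
  }
}
```

In a provider, loadPlayerState would run inside an effect after mount, so the server render and the first client render both see the defaults.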
PlayerBar (PlayerBar.tsx)
Fixed-position bar at the bottom of the viewport. Shows the current episode's cover thumbnail (linked to the episode page), title, venue/date, play/pause/skip buttons, progress bar with time display, volume slider, mute toggle, and playback speed cycle (0.5x–2x). Shows a spinning loader when audio is loading. When no episode is selected, displays “Select an episode to play.”
EpisodeCard (EpisodeCard.tsx)
Card in the browse grid. Shows a gradient cover with episode title text, a play overlay button (only if the episode has audio), and card info (title, year, venue). The card body links to /episode/[showId] via next/link. The play button calls usePlayer().play(). If the episode is currently playing, the overlay shows a pause icon and gets a .playing class. Uses suppressHydrationWarning to handle localStorage-based player state.
RundownContent (RundownContent.tsx)
Renders the episode rundown with interactive timestamp links. Handles two rundown formats:
- New format (506 episodes): HTML from the Python patcher with <a data-seek="SECONDS" href="#seek-HH:MM:SS"> links. Passed through as-is.
- Old format (394 episodes): Plain text with MM:SS and H:MM:SS timestamps at line starts. A two-pass regex (H:MM:SS first, then MM:SS) wraps each timestamp in a <a data-seek="…" href="#seek-HH:MM:SS"> link.
Deep link handling: On mount, reads window.location.hash. If it matches #seek-HH:MM:SS, starts playback at that position.
Correct episode switching: When a seek link is clicked, calls play(audioUrl, title, sub, showId, c1, c2, secs) which loads the correct episode at the right timestamp — even if a different episode was already playing.
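As an illustration of the deep-link handling described above, the hash can be turned into a seek position with a small pure function. This is a minimal sketch, not the code in RundownContent.tsx; the name parseSeekHash and the strict two-digit minute and second fields are assumptions.

```typescript
// Parse a "#seek-HH:MM:SS" fragment into a position in seconds.
// Returns null when the hash is not a seek link.
function parseSeekHash(hash: string): number | null {
  const match = /^#seek-(\d{1,2}):([0-5]\d):([0-5]\d)$/.exec(hash);
  if (match === null) return null;
  const hours = Number(match[1]);
  const minutes = Number(match[2]);
  const seconds = Number(match[3]);
  return hours * 3600 + minutes * 60 + seconds;
}
```

On mount, a component could call parseSeekHash(window.location.hash) and, when the result is not null, pass it as the final argument to play(). For the example link /episode/59159#seek-00:08:37 the function returns 517.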
AutoSearchForm (AutoSearchForm.tsx)
Form wrapper that auto-submits on input. Select dropdowns trigger immediate navigation via router.replace(). Text inputs debounce 400ms before navigating. Uses next/navigation router (client-side) so the audio player is never interrupted. Prevents full browser form submission.
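The debounce behaviour can be sketched independently of React. The helper below is an illustration, not the code in AutoSearchForm.tsx: createDebouncer and the Timers interface are hypothetical names, and the timer functions are injected so the logic can be exercised without a browser.

```typescript
// Injected timer functions; in the browser these would wrap
// window.setTimeout and window.clearTimeout.
interface Timers {
  set(callback: () => void, ms: number): number;
  clear(id: number): void;
}

// Returns a schedule() function. Each call cancels the previously scheduled
// action, so only the last action within the delay window actually runs.
function createDebouncer(delayMs: number, timers: Timers): (action: () => void) => void {
  let pending: number | null = null;
  return function schedule(action: () => void): void {
    if (pending !== null) timers.clear(pending);
    pending = timers.set(() => {
      pending = null;
      action();
    }, delayMs);
  };
}
```

A text input handler would then call schedule(() => router.replace(url)) with a 400 ms delay on every keystroke, while a select handler would call router.replace(url) directly.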
ThemeToggle (ThemeToggle.tsx)
Dark/light mode toggle. Reads initial state from localStorage, toggles the dark class on <html>, and persists the preference. The CSS variable system in globals.css switches between warm paper tones (light) and dark grays (dark) based on the .dark class.
7. REST API
All endpoints return JSON. Parameters are passed as query strings.
| Endpoint | Parameters | Returns |
|---|---|---|
| GET /api/episodes | ?page ?q ?year ?audio ?rundown | { episodes[], total, page, totalPages, years[] } |
| GET /api/episodes/:showId | — | Episode row (or 404) |
| GET /api/search | ?q ?limit=50 | { results[], years[] } |
| GET /api/stats | — | { total, withAudio, totalSecs, noDuration, byYear[], years[] } |
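A client might build the browse endpoint URL from the documented parameters as sketched below. The helper name buildEpisodesUrl and the parameter types are assumptions; in particular, the accepted values for the audio and rundown filters are not documented here, so they are treated as opaque strings.

```typescript
// Query parameters listed for GET /api/episodes. Value types are assumed.
interface EpisodeListQuery {
  page?: number;
  q?: string;
  year?: string;
  audio?: string;
  rundown?: string;
}

// Builds a relative URL, skipping parameters that are unset or empty.
function buildEpisodesUrl(query: EpisodeListQuery): string {
  const pairs: Array<[string, string | number | undefined]> = [
    ["page", query.page],
    ["q", query.q],
    ["year", query.year],
    ["audio", query.audio],
    ["rundown", query.rundown],
  ];
  const parts: string[] = [];
  pairs.forEach(([key, value]) => {
    if (value === undefined || value === "") return;
    parts.push(key + "=" + encodeURIComponent(String(value)));
  });
  return parts.length > 0 ? "/api/episodes?" + parts.join("&") : "/api/episodes";
}
```

The resulting string can be passed to fetch(), and the JSON body is expected to have the shape shown in the Returns column of the table.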
8. Audio Player Architecture
The player is built around a React Context pattern to ensure uninterrupted playback across page navigations:
- PlayerProvider wraps the entire app in layout.tsx. Since layouts never unmount during client-side navigation, the context and the <audio> element stay alive.
- The usePlayer() hook is consumed by PlayerBar, EpisodeCard, EpisodeClient, and RundownContent.
- State (volume, muted, speed, current episode, playback position) is persisted to localStorage every 2 seconds and restored after mount via a useEffect. This avoids hydration mismatches (SSR always sees defaults).
- play() accepts an optional seekTo seconds parameter. If the same track is already loaded, it just seeks. If a different track is requested, it loads the new source and seeks after the canplay event.
- Keyboard shortcuts (Space, ←→, M) are registered globally with focus-element checks to avoid interfering with text inputs.
- The isLoading state tracks loadstart, waiting, seeking events (→ true) and canplay, play, seeked, error events (→ false). The PlayerBar shows a spinning SVG when loading.
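The isLoading rule above amounts to a small state transition keyed on the media event type. The sketch below expresses it as a pure function; nextIsLoading and the two constant names are hypothetical, and the real provider presumably attaches listeners to the <audio> element rather than calling a reducer like this.

```typescript
// Media events that switch the loading indicator on or off,
// as listed in the player description.
const STARTS_LOADING: ReadonlyArray<string> = ["loadstart", "waiting", "seeking"];
const ENDS_LOADING: ReadonlyArray<string> = ["canplay", "play", "seeked", "error"];

// Given the current flag and an event type, return the next flag.
// Events outside both lists leave the flag unchanged.
function nextIsLoading(current: boolean, eventType: string): boolean {
  if (STARTS_LOADING.indexOf(eventType) !== -1) return true;
  if (ENDS_LOADING.indexOf(eventType) !== -1) return false;
  return current;
}
```

Wired to an element, each listener would call something like setIsLoading((prev) => nextIsLoading(prev, event.type)).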
9. Rundown Timestamp Seeking
Rundowns come in two formats from the database:
New Format (HTML from patcher)
Already has data-seek (seconds) and href (shareable URL).
Old Format (plain text)
Timestamps use MM:SS (e.g., 59:16, 100:37) until 1 hour, then switch to H:MM:SS (e.g., 1:10:43). A two-pass regex on the client wraps them in seek links before rendering:
- Pass 1: Match H:MM:SS timestamps (1–2 digit hours, 2 digit minutes and seconds). Convert to total seconds.
- Pass 2: Match MM:SS timestamps (1–3 digit minutes, 2 digit seconds). Convert MM to HH:MM for the href, compute total seconds for data-seek.
Each generated link gets data-seek="SECONDS" (for the click handler) and href="#seek-HH:MM:SS" (for right-click → Copy Link and deep links).
The click handler calls e.preventDefault() to cancel the link navigation, updates window.location.hash for sharing, and calls play(audioUrl, …, secs) to load the correct episode at the right position.
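The two-pass conversion for old-format rundowns can be sketched as follows. The exact regular expressions used in RundownContent.tsx are not shown in this document, so the patterns and the function names (pad2, seekLink, linkifyTimestamps) are assumptions. The sketch relies on timestamps sitting at line starts: once pass 1 has wrapped a line, that line begins with "<a", so pass 2 cannot match the tail of an H:MM:SS value. It also assumes the plain text is already safe to render as HTML.

```typescript
// Zero-pad a number to two digits.
function pad2(n: number): string {
  return n < 10 ? "0" + n : String(n);
}

// Build the anchor: data-seek carries seconds, href carries #seek-HH:MM:SS.
function seekLink(label: string, totalSeconds: number): string {
  const h = Math.floor(totalSeconds / 3600);
  const m = Math.floor((totalSeconds % 3600) / 60);
  const s = totalSeconds % 60;
  const href = "#seek-" + pad2(h) + ":" + pad2(m) + ":" + pad2(s);
  return '<a data-seek="' + totalSeconds + '" href="' + href + '">' + label + "</a>";
}

function linkifyTimestamps(plainText: string): string {
  // Pass 1: H:MM:SS at the start of a line.
  const afterPass1 = plainText.replace(
    /^(\d{1,2}):(\d{2}):(\d{2})\b/gm,
    (label: string, h: string, m: string, s: string) =>
      seekLink(label, Number(h) * 3600 + Number(m) * 60 + Number(s))
  );
  // Pass 2: MM:SS at the start of a line, where minutes may exceed 59.
  return afterPass1.replace(
    /^(\d{1,3}):(\d{2})\b/gm,
    (label: string, m: string, s: string) =>
      seekLink(label, Number(m) * 60 + Number(s))
  );
}
```

Running the passes in the opposite order would let the MM:SS pattern consume the first two fields of an H:MM:SS timestamp, which is why the longer pattern goes first.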
10. Search System
Two search paths serve different use cases:
| Feature | Browse Bar (getEpisodes) | Search Page (searchAll) |
|---|---|---|
| Fields searched | title, description, venue, tags | title → venue → tags → description → rundown_content |
| Method | Single LIKE query with AND filters | Multi-pass priority search, deduplicating |
| Results | Paginated episode cards | Up to 100, with match type badge + context snippet |
| Filters | year, audio, rundown (dropdowns) | None (text-only deep search) |
| Submission | Auto-submit on input (400ms debounce for text, immediate for selects) | Auto-submit on input (400ms debounce) |
Why different results? The browse bar search does NOT search rundown_content — it would be too broad for a simple filter. The search page is purpose-built for deep search including rundowns, where most interesting results (skit names, guest names, segment titles) live.
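The multi-pass priority idea can be illustrated in memory, without SQL. The sketch below is not searchAll from src/lib/db.ts, which presumably issues queries against SQLite; the row shape, the function name prioritySearch, and the case-insensitive substring match are assumptions. It only shows how passing over the fields in priority order with deduplication yields one hit per episode, labelled with the highest-priority field that matched.

```typescript
// Assumed row shape, based on the field names in the comparison table.
interface EpisodeRow {
  show_id: number;
  title: string | null;
  venue: string | null;
  tags: string | null;
  description: string | null;
  rundown_content: string | null;
}

type MatchType = "title" | "venue" | "tags" | "description" | "rundown";

interface SearchHit {
  showId: number;
  matchType: MatchType;
}

interface SearchPass {
  matchType: MatchType;
  pick: (row: EpisodeRow) => string | null;
}

// Passes run in priority order: title, venue, tags, description, rundown.
const SEARCH_PASSES: SearchPass[] = [
  { matchType: "title", pick: (row) => row.title },
  { matchType: "venue", pick: (row) => row.venue },
  { matchType: "tags", pick: (row) => row.tags },
  { matchType: "description", pick: (row) => row.description },
  { matchType: "rundown", pick: (row) => row.rundown_content },
];

function prioritySearch(rows: EpisodeRow[], query: string, limit: number): SearchHit[] {
  const needle = query.trim().toLowerCase();
  const hits: SearchHit[] = [];
  if (needle === "") return hits;
  const seen: { [showId: number]: boolean } = {};
  SEARCH_PASSES.forEach((pass) => {
    rows.forEach((row) => {
      // Stop adding once the limit is reached; skip episodes already matched.
      if (hits.length >= limit || seen[row.show_id]) return;
      const text = pass.pick(row);
      if (text !== null && text.toLowerCase().indexOf(needle) !== -1) {
        seen[row.show_id] = true;
        hits.push({ showId: row.show_id, matchType: pass.matchType });
      }
    });
  });
  return hits;
}
```

An episode whose title and rundown both contain the query appears once, tagged as a title match, and ranks ahead of episodes that match only in the rundown.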
11. Theme System
Styling uses Tailwind CSS v4 with project-specific CSS custom properties for theming. There is no tailwind.config.ts because Tailwind v4 uses CSS-based configuration via @import "tailwindcss".
Light/dark mode is controlled by the .dark class on <html>.
The ThemeToggle component reads localStorage on mount, sets the initial class, and toggles on click. Fonts (Inter, EB Garamond, Public Sans) are loaded via next/font/google with CSS variable fallbacks (--font-body, --font-display, --font-label).
12. Python Data Pipeline
All scripts live in scripts/ and read/write the same SQLite database.
| Script | Purpose |
|---|---|
| db_config.py | Shared path helper — returns data/prairiehome.db |
| scrape_prairiehome.py | Initial scraper — creates the episodes table, scrapes listing + detail pages |
| patch_prairiehome.py | Enriches episodes — fetches page_title, rundown_url, rundown_content. --migrate flag adds data-seek links to existing rundowns |
| download_audio.py | Downloads MP3 files from audio_url to local storage |
| transcribe_episodes.py | Speech-to-text via Deepgram API — creates/updates transcriptions table |
| detect_audio_events.py | Detects applause/music/intermission breaks from transcript utterance gaps |
| train_segment_boundaries.py | ML model — predicts segment boundaries for episodes without rundowns |
| app.py | Original Flask app — kept for reference |
13. Deployment
The app is configured with output: "standalone" in next.config.ts. Building produces a self-contained deployment at .next/standalone/.
Quick Deploy
Process Management (pm2)
Nginx Reverse Proxy
Requirements
- Node.js 18+ on the server
- ~180 MB disk space (DB + node_modules + static assets)
- Server architecture must match the build machine for the better-sqlite3 native binary (or run npm rebuild better-sqlite3 on the server)