Production Observability for $0: How I Monitor My Portfolio with Sentry + Pulsetic
I got my first Sentry weekly report. 23 errors. 1.7k transactions. On a side project. That's what production observability looks like — and it costs $0.
The Email That Made It Real
A few weeks after shipping the monitoring stack, the email landed:

I read it twice. Not because something was on fire — but because this is what production engineers actually see (or should) every Monday morning. Error counts. Transaction volume. Trends. I was flying blind before this. Not anymore.
In this post, I'll walk through how I built a 4-layer observability stack for my portfolio (luisfaria.dev): open source, free tier, real production data.
The Problem: Shipping Blind
My previous dev.to article (From git pull to GitOps) ended with this honest admission in the "Future Roadmap" section:
"Monitoring & Alerting: Sentry for error tracking, uptime monitoring, and resource alerts. Current health checks cover the basics, but production-grade observability is the next evolution."
Once the CI/CD pipeline was working — tests passing, Docker images building, Discord pings on deploy — I had a new problem. I had no idea what was happening after the deploy.
Was the site up? Were there errors? Were users hitting rate limits? Was the server about to OOM?
I didn't know. So I fixed it.
The Architecture: 4 Layers
┌─────────────────────────────────┐
│ External Uptime Monitor │
│ (Pulsetic) │
│ Pings /health/ready every 60s │
└────────────┬────────────────────┘
│ HTTPS
┌────────────▼────────────────────┐
│ Nginx (reverse proxy) │
│ Port 80/443 │
└────────────┬────────────────────┘
│
┌──────────────────┼──────────────────┐
│ │ │
┌─────────▼───────┐ ┌──────▼──────────┐ ┌─────▼───────┐
│ Frontend │ │ Backend API │ │ MongoDB │
│ (Next.js) │ │ (Express) │ │ + Redis │
│ @sentry/nextjs │ │ @sentry/node │ │ │
└────────┬────────┘ └──────┬──────────┘ └─────────────┘
│ │
└─────────┬─────────┘
│
┌────────▼────────┐
│ Sentry.io │
│ Error Tracking│
└─────────────────┘
┌─────────────────────────────────┐
│ Cron (every 5 min) │
│ monitor-resources.sh │
│ CPU / Memory / Disk / Docker │
│ → Discord Webhook │
│ (deduplicated, 30-min cooldown)│
└─────────────────────────────────┘
Each layer covers a different failure mode:
| Layer | What it catches | Latency |
|---|---|---|
| Health endpoints | Is the process running? DB/Redis connected? | Instant |
| Sentry | Code errors, crashes, slow transactions | < 1 min |
| Pulsetic | External view — is the site reachable? | < 2 min |
| Cron script | CPU/Mem/Disk/Docker going wrong | < 5 min |
Layer 1: Tiered Health Endpoints
Before wiring up external monitors, I needed something for them to ping. I built three tiers — each with a different audience and a different level of detail.
```typescript
// backend/src/routes/health.ts

// Liveness probe — "is the process running?"
// Always 200. Load balancers use this.
router.get('/health', (_req, res) => {
  res.status(200).json({ status: 'ok' });
});

// Readiness probe — "can it serve traffic?"
// 200 when healthy, 503 when degraded.
// Pulsetic targets this endpoint.
router.get('/health/ready', async (_req, res) => {
  const { healthy, checks } = await runChecks();

  // Strip latencies — no sensitive details for public consumers
  const coarseChecks: Record<string, { status: string }> = {};
  for (const [key, val] of Object.entries(checks)) {
    coarseChecks[key] = { status: val.status };
  }

  res.status(healthy ? 200 : 503).json({
    status: healthy ? 'ok' : 'degraded',
    timestamp: new Date().toISOString(),
    checks: coarseChecks,
  });
});

// Internal diagnostics — full checks + system info
// IP-whitelisted: loopback, Docker bridge, 10.x private networks only.
// CI pipeline uses this from inside the Docker network.
router.get('/health/details', async (req, res) => {
  if (!isTrusted(req)) {
    res.status(403).json({ error: 'Forbidden' });
    return;
  }

  const { healthy, checks } = await runChecks();
  const system = getSystemInfo();

  res.status(healthy ? 200 : 503).json({
    status: healthy ? 'ok' : 'degraded',
    timestamp: new Date().toISOString(),
    checks, // includes latencies
    system, // includes memoryUsage, loadAvg, cpus, uptime, nodeVersion
  });
});
```
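The routes above call `runChecks()`, which the post doesn't show. Here is a minimal hypothetical sketch of its shape: each dependency probe (a Mongo ping, a Redis `PING`) runs with a timeout, and any failure flips overall health to degraded. The probe-injection design and every name here are my assumptions for illustration, not the repo's actual code.

```typescript
// Hypothetical sketch of runChecks(): run each dependency probe with a
// timeout, record status and latency, and report overall health.
type CheckResult = { status: 'ok' | 'error'; latencyMs: number };

async function withTimeout<T>(p: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error('timeout')), ms);
  });
  try {
    return await Promise.race([p, timeout]);
  } finally {
    clearTimeout(timer); // don't leave the timer holding the event loop open
  }
}

async function runChecks(
  probes: Record<string, () => Promise<void>>, // e.g. { mongo, redis }
): Promise<{ healthy: boolean; checks: Record<string, CheckResult> }> {
  const checks: Record<string, CheckResult> = {};
  let healthy = true;
  for (const [name, probe] of Object.entries(probes)) {
    const start = Date.now();
    try {
      await withTimeout(probe(), 2000);
      checks[name] = { status: 'ok', latencyMs: Date.now() - start };
    } catch {
      checks[name] = { status: 'error', latencyMs: Date.now() - start };
      healthy = false; // any failed dependency turns /health/ready into a 503
    }
  }
  return { healthy, checks };
}
```

The payoff of returning per-check results rather than a single boolean is that the `/health/details` tier can expose latencies while `/health/ready` strips them.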
The IP guard for /health/details is worth calling out:
```typescript
const TRUSTED_EXACT = new Set(['127.0.0.1', '::1', '::ffff:127.0.0.1']);
const TRUSTED_PREFIXES = [
  '10.',
  ...Array.from({ length: 16 }, (_, i) => `172.${16 + i}.`),
  // Docker bridge ranges: 172.16.x through 172.31.x (the 172.16.0.0/12 block)
];

function isTrusted(req: Request): boolean {
  const ip = req.ip || req.socket?.remoteAddress || '';
  if (TRUSTED_EXACT.has(ip)) return true;
  return TRUSTED_PREFIXES.some((prefix) => ip.startsWith(prefix));
}
```
Calling it from the public internet returns 403 Forbidden. From inside Docker (the CI pipeline) it returns the full diagnostics JSON. One caveat worth knowing: behind a reverse proxy, `req.ip` depends on Express's `trust proxy` setting, so make sure nginx's `X-Forwarded-For` handling matches what the guard expects.
Layer 2: Sentry — Error Tracking for Both Services
The Backend Setup (@sentry/node)
The critical thing: Sentry must be the very first import in backend/src/index.ts. Before Express, before Apollo, before anything.
```typescript
// backend/src/instrument.ts
import * as Sentry from '@sentry/node';
import type { EventHint } from '@sentry/node';
import { GraphQLError } from 'graphql';

const AUTH_CODES = new Set(['UNAUTHENTICATED', 'FORBIDDEN', 'BAD_USER_INPUT']);

if (process.env.SENTRY_DSN) {
  Sentry.init({
    dsn: process.env.SENTRY_DSN,
    environment: process.env.NODE_ENV,
    tracesSampleRate: process.env.NODE_ENV === 'production' ? 0.2 : 1.0,
    beforeSend(event, hint: EventHint) {
      // Skip HTTP 401/403 — auth flow, not bugs
      const statusCode = event.contexts?.response?.status_code;
      if (statusCode === 401 || statusCode === 403) return null;

      // Skip GraphQL auth/validation errors
      const original = hint.originalException;
      if (original instanceof GraphQLError) {
        const code = original.extensions?.code;
        if (typeof code === 'string' && AUTH_CODES.has(code)) return null;
      }

      return event;
    },
    initialScope: { tags: { service: 'portfolio-api' } },
  });
}
```
The beforeSend filter is important. Without it, every unauthenticated API request fires a Sentry event. That's noise, not signal — so I filter out UNAUTHENTICATED, FORBIDDEN, BAD_USER_INPUT, and HTTP 401/403.
For GraphQL specifically, I added an Apollo plugin that captures non-auth errors:
```typescript
// In Apollo Server setup (backend/src/index.ts)
plugins: [
  {
    async requestDidStart() {
      return {
        async didEncounterErrors({ errors }) {
          for (const err of errors) {
            const code = err.extensions?.code as string | undefined;
            if (!AUTH_CODES.has(code ?? '')) {
              Sentry.captureException(err);
            }
          }
        },
      };
    },
  },
],
```
The Frontend Gotcha: instrumentation.ts
This is the part that trips up almost everyone on Next.js 13+, and it cost me more time than expected. You can install @sentry/nextjs, add sentry.client.config.ts, wrap your config with withSentryConfig() — and still get zero frontend errors in Sentry.
The missing piece: frontend/src/instrumentation.ts.
```typescript
// frontend/src/instrumentation.ts
export async function register() {
  if (process.env.NEXT_RUNTIME === 'nodejs') {
    await import('../sentry.server.config');
  }
  if (process.env.NEXT_RUNTIME === 'edge') {
    await import('../sentry.edge.config');
  }
}
```
This file is Next.js's official hook for initializing server-side code. Without it, Sentry's server/edge SDK never initializes, so SSR errors and API route errors silently vanish.
You need three Sentry config files at the frontend root:
frontend/
├── sentry.client.config.ts ← browser-side errors + session replay
├── sentry.server.config.ts ← SSR error capture
├── sentry.edge.config.ts ← middleware error capture
└── src/
└── instrumentation.ts ← THE HOOK THAT WIRES IT ALL TOGETHER
And next.config.ts needs to be wrapped:
```typescript
// frontend/next.config.ts
import { withSentryConfig } from '@sentry/nextjs';

export default withSentryConfig(nextConfig, sentryWebpackPluginOptions);
```
I also added src/app/global-error.tsx to catch React rendering errors. Otherwise component-level crashes disappear without a trace.
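For completeness, here's a sketch of what the client config can look like, assuming @sentry/nextjs v8+ (where session replay is `Sentry.replayIntegration()`; option names differ in older majors). The sample rates match the numbers quoted elsewhere in the post, and the `ignoreErrors` entries are the standard browser ResizeObserver noise; treat the whole fragment as illustrative rather than my exact file.

```typescript
// frontend/sentry.client.config.ts (illustrative sketch, assumes SDK v8+)
import * as Sentry from '@sentry/nextjs';

Sentry.init({
  dsn: process.env.NEXT_PUBLIC_SENTRY_DSN,
  tracesSampleRate: 0.2,
  // Replay: 1% of normal sessions, 100% of sessions that hit an error
  replaysSessionSampleRate: 0.01,
  replaysOnErrorSampleRate: 1.0,
  integrations: [Sentry.replayIntegration()],
  // Browser noise that isn't actionable
  ignoreErrors: [
    'ResizeObserver loop limit exceeded',
    'ResizeObserver loop completed with undelivered notifications.',
  ],
});
```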
Layer 3: Pulsetic — External Uptime Monitoring
Sentry tells you about code errors. Pulsetic tells you if the whole site is unreachable. These are different problems.
Setup is 5 minutes:
- Create a free account at pulsetic.com
- Add a monitor for https://luisfaria.dev/health/ready
- Check interval: 60 seconds; regions: Sydney + US East
- Confirmation period: 2 checks (avoids false positives during rolling deploys)
- Alert channel: Discord webhook
The key insight: configure Pulsetic to alert on 503, not just timeouts. When MongoDB goes down, /health/ready returns 503 degraded — not a network failure, but definitely something I want to know about.
Requiring 2 consecutive failures prevents alert spam during a normal deploy. Containers restart, health checks briefly fail - that's expected. Two consecutive failures means something is actually broken.
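Under the hood, an uptime check like this reduces to probing the endpoint and classifying the HTTP status. Pulsetic implements all of this for you; the sketch below is purely illustrative, and the function name is mine:

```shell
# Classify a health-check HTTP status the way an uptime monitor does.
classify_status() {
  case "$1" in
    200) echo "UP" ;;        # healthy
    503) echo "DEGRADED" ;;  # reachable, but a dependency is down: alert
    *)   echo "DOWN" ;;      # timeouts, connection failures, 5xx: alert
  esac
}

# The probe itself would look roughly like:
#   code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 10 \
#     https://luisfaria.dev/health/ready)
#   classify_status "$code"
```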
Layer 4: Cron Resource Monitor
Sentry and Pulsetic cover errors and availability. But what about the server silently running out of disk space? Or memory creeping up after a week of traffic? Those kill a VPS quietly - no crash, no error, just degradation.
I wrote a bash script that runs every 5 minutes:
```bash
# server/monitor-resources.sh (simplified)
# Thresholds: 85% for CPU, Mem, Disk
# Alerts: Discord webhook
# Dedup: 30-minute cooldown per alert type

DISCORD_WEBHOOK_URL="${DISCORD_WEBHOOK_URL}"
THRESHOLD=85
STATE_DIR="/var/lib/monitor"

check_memory() {
  local used_pct
  used_pct=$(free | awk '/^Mem:/ {printf "%.0f", $3/$2*100}')
  if [ "$used_pct" -gt "$THRESHOLD" ]; then
    send_alert_if_not_deduped "memory" "Memory at ${used_pct}%"
  fi
}

check_docker() {
  # Alert if any expected container is not running
  for container in frontend_webapp backend_api nginx_gateway mongodb_db redis_cache; do
    if ! docker ps --format '{{.Names}}' | grep -q "^${container}$"; then
      send_alert_if_not_deduped "docker_${container}" "Container ${container} is down"
    fi
  done
}
```
The deduplication is the part I'm most proud of. Without it, a memory spike at 86% would fire an alert every 5 minutes until someone fixed it. With it, the first alert fires and then nothing for 30 minutes. The disk doesn't lie, but it doesn't need to shout either.
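The script references `send_alert_if_not_deduped` without showing it. Here is a minimal sketch of how such a cooldown can work, assuming one timestamp file per alert key under `STATE_DIR`; the `echo` stands in for the real Discord webhook `curl` call, and the details are my reconstruction, not the repo's exact code:

```shell
# Hypothetical sketch: skip the alert if the last one for this key
# fired inside the cooldown window.
STATE_DIR="${STATE_DIR:-/var/lib/monitor}"
COOLDOWN_SECS=1800  # 30 minutes

send_alert_if_not_deduped() {
  key="$1"
  message="$2"
  mkdir -p "$STATE_DIR"
  stamp="$STATE_DIR/$key.last"
  now=$(date +%s)
  last=$(cat "$stamp" 2>/dev/null || echo 0)
  if [ $((now - last)) -ge "$COOLDOWN_SECS" ]; then
    # Real script: curl -s -H 'Content-Type: application/json' \
    #   -d "{\"content\": \"$message\"}" "$DISCORD_WEBHOOK_URL"
    echo "ALERT [$key] $message"
    echo "$now" > "$stamp"  # record when this key last alerted
  fi
}
```

A flat file per key is crude but survives reboots and needs no daemon, which suits a cron job.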
Security model — because this runs with Docker socket access:
| Concern | Solution |
|---|---|
| Runs as | Dedicated monitor system user (no login shell) |
| Docker access | monitor added to docker group (read-only monitoring) |
| Webhook secret | /etc/monitor/monitor.env (chmod 600, owned by monitor) |
| Logs | Logrotate: daily rotation, 7-day retention |
```bash
# Setup (on the server)
useradd --system --no-create-home --shell /usr/sbin/nologin monitor
usermod -aG docker monitor

# Cron entry
*/5 * * * * monitor /opt/monitor/monitor-resources.sh >> /var/log/monitor-resources.log 2>&1
```
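The log-retention policy from the table translates to a small logrotate fragment. The filename and the `compress` directive here are illustrative additions; daily rotation with 7-day retention is what the post specifies:

```
# /etc/logrotate.d/monitor (hypothetical filename)
/var/log/monitor-resources.log {
    daily
    rotate 7
    compress
    missingok
    notifempty
}
```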
Real Data: First Sentry Weekly Report
After running this for one week, the Sentry weekly email arrived:
| Service | Errors | Transactions |
|---|---|---|
| Frontend (Next.js) | 6 | 1,451 |
| Backend (Node.js) | 17 | 270 |
| Total | 23 | 1,721 |
The 17 backend errors were mostly from testing the error-capture flow (I fired test exceptions during setup). The 6 frontend errors included a couple of ResizeObserver events that I subsequently filtered out.
Most importantly: I could see which GraphQL resolvers were slow, which routes had errors, and exactly what the call stack looked like for each failure. Stack traces with source maps. Breadcrumbs showing what the user did before the crash. Session replay for frontend errors (1% of sessions, 100% of errored ones).
What I Learned: SRE Concepts Applied
| Concept | Implementation |
|---|---|
| Liveness probe | GET /health — always 200, load balancers use this |
| Readiness probe | GET /health/ready — 200 or 503, Pulsetic targets this |
| Internal diagnostics | GET /health/details — IP-whitelisted, CI pipeline uses this |
| Error budget | Sentry free: 5K errors/month — if you hit this, something is very wrong |
| Incident detection | Pulsetic catches outages in < 2 min |
| Alert fatigue | 30-min dedup prevents Discord spam |
| Least privilege | Monitor script runs as monitor user, not root |
| Secret management | Webhook URL in restricted /etc/monitor/monitor.env (chmod 600) |
| Graceful degradation | 503 with "degraded" when a dependency is down, not a hard crash |
| Observability pillars | Logs (Winston) + Metrics (health/cron) + Traces (Sentry) |
The Alert Flow
Error in code → Sentry (instant) → Sentry dashboard + email
Site goes down → Pulsetic (< 2 min) → Discord + email
CPU/Mem/Disk → Cron script (every 5m) → Discord (deduplicated)
Deploy fails → GitHub Actions (instant) → Discord (existing pipeline)
Container crash → Cron script (every 5m) → Discord (deduplicated)
Key Takeaways
1. The instrumentation.ts File Is Not Optional
For Next.js 13+ (/src directory structure), frontend/src/instrumentation.ts is the initialization hook that wires Sentry into SSR and edge runtimes. Skip it and you get zero server-side error data.
2. Filter Before You Drown in Auth Noise
Without beforeSend, every 401/403 becomes a Sentry event. On an app with auth, that's most of your error budget. Filter UNAUTHENTICATED, FORBIDDEN, BAD_USER_INPUT at the source.
3. 503 Is Not "Down" — Design for Degradation
Health checks that return 503 on dependency failures give uptime monitors something actionable. A binary "up/down" monitor misses the nuance of "site works but database is slow."
4. Alert Deduplication Is Not Optional
A 30-minute cooldown on resource alerts prevents alert fatigue. If your phone buzzes every 5 minutes for the same disk usage spike, you'll start ignoring it — which defeats the point.
5. Real Data Changes How You Think
Before the weekly report, I thought about errors abstractly. After seeing "23 errors, 1.7k transactions," the numbers have names, stack traces, and user actions attached. That's the difference between guessing and knowing.
Tech Stack
| Layer | Technology | Cost |
|---|---|---|
| Error tracking | Sentry (free tier: 5K errors/mo) | $0 |
| Uptime monitoring | Pulsetic (free tier: 10 monitors) | $0 |
| Resource alerts | Bash + cron + Discord webhook | $0 |
| Health endpoints | Express routes (already deployed) | $0 |
| Frontend | Next.js + @sentry/nextjs | $0 |
| Backend | Node.js + @sentry/node | $0 |
Try It Yourself
The full implementation is open source:
| Resource | Link |
|---|---|
| Live Site | luisfaria.dev |
| Open Source Repo | https://github.com/lfariabr/luisfaria.dev |
| Health Routes | backend/src/routes/health.ts |
| Backend Sentry | backend/src/instrument.ts |
| Frontend Sentry | frontend/src/instrumentation.ts |
| Cron Script | server/monitor-resources.sh |
| Epic Tracker | Issue #115 — Observability |
Let's Connect
If you're building observability on a budget, working with Next.js + Node.js in production, or navigating Sentry's Next.js integration (that instrumentation.ts gotcha gets everyone), I'd love to trade notes:
- LinkedIn: linkedin.com/in/lfariabr
- GitHub: github.com/lfariabr
- Portfolio: luisfaria.dev
Built with too many Discord pings and one very satisfying weekly Sentry email by Luis Faria
Whether it's concrete or code, structure is everything.