Articles

Notes from building real things

Web, React and Next.js, AI and RAG, and mobile. Written from production work, not tutorials about tutorials.

How Much Does It Cost to Build a RAG Chatbot in 2026?

An MVP RAG chatbot costs $4k-$12k to build in 2026, production builds run $15k-$40k+, and running costs land at $5-$30 per 1,000 queries. Full breakdown.

Jun 9, 2026 · 6 min read

RAG, AI, Chatbot

AI Engineering

Reliable JSON From LLMs: Structured Outputs Compared 2026

Strict structured outputs hold ~99.9% schema compliance while plain JSON mode fails 8-15% of the time. I compare OpenAI, Claude, and Gemini with one Zod schema.

Jun 10 · 7 min read

AI Engineering

Do AI Agents Need a Memory Layer? Mem0 vs Letta vs Zep

Most AI agents don't need a memory vendor. Unless you need consolidation, decay, or cross-agent state, Postgres with pgvector covers memory for $0 extra.

Jun 10 · 8 min read

AI Engineering

How to Migrate Your MCP Server to the 2026 Stateless Spec

The final MCP spec ships July 28, 2026 and removes sessions from the protocol. I migrated my production Node server; here is the exact diff and checklist.

Jun 10 · 6 min read

AI Engineering

Private RAG on Local Models: Qwen3 vs Gemma 4 in 2026

Yes, you can ship private RAG on one 24GB GPU in 2026. I ran a 50-question eval: Gemma 4 26B MoE wins English corpora, Qwen3.6 27B wins multilingual.

Jun 10 · 7 min read

AI Engineering

How to Secure an MCP Server: 2026 Hardening Checklist

I audited my production MCP stack against the NSA's May 2026 guidance and the OX Security RCE disclosure. Here is the 12-point hardening checklist I use.

Jun 10 · 7 min read

AI Engineering

Claude Code vs Cursor vs Codex for Real Client Work 2026

Pricing converged at $20/$200 and SWE-bench scores sit within a point, so workflow decides. Real cost-per-feature numbers from paid client projects.

Jun 10 · 7 min read

AI Engineering

Is RAG Dead in 2026? Agentic Retrieval in Production

No. I rebuilt my production SaaS pipeline as agentic retrieval: cost per query down 36%, accuracy up from 68% to 89%. Only naive top-k RAG died in 2026.

Jun 10 · 7 min read

Web Development

GEO in 2026: Getting Cited by ChatGPT and Perplexity

GEO means writing answer-first chunks AI engines can lift: 69% of Google searches are zero-click in 2026, but ChatGPT referrals convert at 15.9%.

Jun 10 · 7 min read

AI Engineering

How to Reduce LLM API Costs: Caching and Routing in 2026

I cut my reputation SaaS's LLM bill 79%, from $41.60 to $8.90 per 1,000 AI replies, using routing, prompt caching, semantic caching, and batching.

Jun 10 · 7 min read

[Cover image: Leaving Vercel in 2026: The Real Self-Hosting Cost Math, branded cover card by Hamza Shabbir]

Web Development

Leaving Vercel in 2026: The Real Self-Hosting Cost Math

Self-hosting Next.js on Hetzner with Coolify costs me $6-17/mo vs $40-150 on Vercel Pro. The real math, ops hours included, after the April 2026 breach.

Jun 10 · 8 min read

AI Engineering

AI Agent Observability in Node.js with OpenTelemetry

OTel GenAI spans went stable in early 2026. Here is how I instrument a TypeScript agent in Node.js, track cost per trace, and alert on silent failures.

Jun 10 · 7 min read

Web Development

React Compiler 1.0: Do You Still Need useMemo in 2026?

For referential equality, no: React Compiler 1.0 handles it. I profiled a production dashboard to show what you still hand-memoize and what breaks.

Jun 10 · 6 min read

Web Development

Next.js 16 Migration Guide: Fixing the Silent Breakages

Next.js 16 breaks middleware, revalidateTag, and caching with zero errors. Exact symptoms and fixes from migrating my production SaaS in 14 hours.

Jun 10 · 8 min read

Web Development

TypeScript 7 Beta (tsgo): What Broke in My Real Monorepo

tsgo cut my monorepo's full type check from 71s to 9.2s, but plugins and compiler-API tools broke. Real before/after numbers and a switch-or-wait verdict.

Jun 9 · 7 min read

Web Development

What Is WebMCP? Making Your Web App Work with AI Agents

WebMCP, announced at Google I/O 2026, lets your web app register typed tools AI agents can call in Chrome 149. Here is how I exposed mine, with code.

Jun 9 · 6 min read

Mobile Development

Google Play 16KB Page Size: Fix Failing React Native Builds

Google Play now blocks all React Native updates without 16KB page-size support. Here's the fix: RN 0.77+, AGP 8.5.1+, NDK r28, plus updated native deps.

Jun 9 · 7 min read

Web Development

How to Fix a Vibe-Coded App: My Rescue Audit Checklist

Roughly 8,000 of 10,000 AI-built startup apps need rescue work in 2026. Here is the 60-minute triage and fix order I use to save them without a rewrite.

Jun 9 · 7 min read

Web Development

Vibe Code Your MVP or Hire a Developer? A 2026 Framework

Vibe code to validate, hire an engineer before money flows. A five-question framework with six-month cost math from a developer who rescues vibe-coded apps.

Jun 9 · 7 min read

Mobile Development

Expo Go Not on the App Store? Move to Development Builds

Expo Go for SDK 55/56 was never approved on the App Store. Here is my tested migration to development builds, plus SDK 56 gotchas and real costs.

Jun 9 · 7 min read

Mobile Development

On-Device AI in React Native: Apple Foundation Models

WWDC 2026 made on-device AI the default for React Native. I built a private review summarizer: no API key, $0 per call, 28 tokens/sec on iPhone 16 Pro.

Jun 9 · 7 min read

Mobile Development

React Native vs Flutter for Startup MVPs: What I Pick and Why

For most startup MVPs I pick React Native: it reuses React web talent, code, and hiring pipelines. Flutter wins for custom UI-heavy apps. My framework inside.

Jun 8 · 6 min read

AI Engineering

How to Add AI Features to an Existing SaaS Without a Rewrite

You do not need a rewrite to add AI to a SaaS. Add one endpoint beside your existing API, wire it to one workflow, stream the output, and cap token spend.

Jun 6 · 8 min read

Building something similar?

I take on a few projects at a time: web apps, AI features, and mobile. Tell me what you are working on.

Start a conversation