Engineering

Adopting TypeScript Strict Mode in Brownfield Codebases

Incremental `strict` flags, boundary types, and the CI rules that prevent regression.

8 min read1.5k views

Flipping `strict: true` in one PR is a morale killer.

TypeScript code on a developer monitor
TypeScript code on a developer monitor

We enable flags incrementally — `strictNullChecks` first, then `noImplicitAny`, then the rest — with a tracked burn-down in the ticket backlog.

Production upgrades rarely fail because of framework bugs — they fail when cache assumptions, auth cookies, and CDN headers were never validated together on staging that mirrors real traffic shape.

Before committing to a migration window, align product, infrastructure, and support on rollback criteria. A written go/no-go checklist prevents heroics when metrics drift after deploy.

Boundary types at API edges contain unknown data: validate

Server infrastructure and network cables
Server infrastructure and network cables

with Zod at runtime, export inferred types inward.

Inventory every data fetch path: server components, route handlers, and client-side SWR hooks. Tag each call with expected staleness and document who owns invalidation when upstream data changes.

Middleware and edge handlers deserve the same regression suite as API routes — especially redirects, locale detection, and auth gates that behave differently under bot traffic.

`any` is allowed only in shim files with a ticket link

Database schema on a developer monitor
Database schema on a developer monitor

and expiry date — visible debt beats hidden casts.

Partial prerendering and streaming change how users perceive performance. Measure first meaningful paint separately from time-to-interactive on routes that mix static shells with dynamic holes.

Document boundary decisions in ADRs so the next squad does not collapse dynamic regions back into fully static pages for short-term convenience.

CI fails on new `any` introductions via ESLint and

API documentation on a laptop screen
API documentation on a laptop screen

reports strict coverage percentage weekly.

Staging must replay CDN cache keys, not only origin responses. We clone production cache headers and run synthetic crawls before promoting framework upgrades.

Load tests should include authenticated sessions and cart mutations — anonymous homepage tests alone miss the routes that break under cache policy changes.

Teams that finish strict adoption report fewer production

Server infrastructure and network cables
Server infrastructure and network cables

incidents tied to null access and malformed payloads.

Dashboard cache hit ratio, RSC payload size by route, and error rate per layout segment on one screen. On-call should not hunt across three tools during an incident.

Schedule a 48-hour post-upgrade review with engineering and client stakeholders — capture what surprised you while context is fresh.

The table below summarizes the reference points we review with client stakeholders before sign-off. Use it as a shared vocabulary in sprint planning and release reviews.

Migration risk matrix

AreaRisk levelMitigationOwner
Caching defaultsHighAudit fetch + revalidate usagePlatform
Dynamic routesMediumStaging parity with CDN headersWeb
MiddlewareMediumEdge-case test suiteWeb
ISR pagesHighLoad test under realistic trafficSRE
Auth cookiesHighCross-domain staging replaySecurity
ObservabilityMediumDashboard per route segmentSRE

Run through this checklist in order — skipping steps because of deadline pressure is how regressions reach production. Assign an owner for each item before you schedule a launch window.

Pre-launch gates

  • Run regression suite on staging with production-like data volume.
  • Validate observability dashboards and alert thresholds.
  • Document rollback steps before promoting to production.
  • Schedule a post-deploy review within 48 hours.
  • Confirm cache headers and CDN behavior match the signed-off staging replay.
  • Verify feature flags and kill switches for partial rollout paths.

More in Engineering