I ran component-level accessibility audits for enterprise SaaS teams: inventory, keyboard paths, contrast on real layers, and a fix queue devs could ship.
Blocker-tier issues cleared before renewal conversations. Major-tier backlog shrank enough that engineering could group fixes by component family instead of fighting fires. The QA team kept a living regression checklist they still run on release candidates. I report measured contrast ratios and keyboard paths; I do not stamp legal certification or VPAT sign-off unless a client engagement explicitly covers it.
40+ components · WCAG 2.1 AA · keyboard + contrast
- 40+: components inventoried per engagement
- WCAG 2.1 AA: keyboard · contrast · labels
- 3 tiers: blocker · major · minor severity
- 0: legal certification claims in reports
Why component audits beat page scans
Procurement teams increasingly ask for accessibility evidence, not a marketing badge. The SaaS launch team came to me with a green Lighthouse score and a product UI that still failed basic keyboard traversal in settings, billing, and admin modals.
Page-level scans catch document structure and alt text on the hero image. They miss the combobox that steals focus, the toast that never announces, and the data table header that disappears when you zoom text. Enterprise buyers notice those gaps in trial environments, not in a PDF summary.
I scope audits at the component layer because that is where design systems meet production. Every button variant, every dialog state, every dense operator panel gets the same bar: can a keyboard-only user complete the task, and can a low-vision user read the chrome, not just the marketing headline?
Audit workflow
Figure 1 walks the workflow left to right. Inventory comes first: I pull the design system catalog, map variants (default, hover, disabled, error), tag owning squads, and document baseline tab order before I file a single bug.
Test is where most time lands. I run keyboard-only paths through real flows, measure contrast on the foreground and background pairs the component actually uses (not the brand palette swatch page), validate ARIA labels and live regions, and check touch targets against the 44 by 44 CSS pixel bar (WCAG 2.1 SC 2.5.5, a Level AAA criterion) where mobile layouts ship the same components.
Report translates findings into blocker, major, and minor tiers with WCAG criterion references, screenshots, and repro steps a dev can follow without a design review meeting. Remediate closes the loop with a prioritized fix queue, per-component PR checklist, regression spot-check, and sign-off gate before release.
The workflow is deliberately repetitive. Repetition is what lets the QA team re-run the same checks after a framework upgrade without paying for a full re-audit.
Figure 1. Four phases: inventory, test, report, remediate.
Component test matrix
Figure 2 is the matrix I fill for every engagement. Rows are components; columns are test dimensions: keyboard and focus, contrast on real layers, screen reader name and role, touch target size, and whether motion and animation respect reduced-motion preferences.
A pass is not vibes. Keyboard pass means every interactive node is reachable, focus is visible on the actual background, and escape routes work from modals and menus. Contrast pass means the ratio is computed on the pair the user sees in the product skin, including semi-transparent overlays and nested panel chrome.
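The contrast rule above can be sketched in a few lines. This is a minimal illustration, assuming opaque 8-bit sRGB colors plus one semi-transparent overlay; the layer values and function names are hypothetical and not taken from any client audit. It uses the relative luminance and contrast ratio definitions from WCAG 2.1 and the AA thresholds of 4.5:1 for normal text and 3:1 for large text.

```python
def _linear(channel: int) -> float:
    """Convert an 8-bit sRGB channel to linear light (WCAG 2.1 definition)."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb) -> float:
    """Relative luminance of an opaque (r, g, b) color."""
    r, g, b = (_linear(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def composite(layer, backdrop):
    """Flatten an (r, g, b, alpha) layer onto an opaque backdrop (source-over)."""
    r, g, b, a = layer
    return tuple(
        round(a * top + (1 - a) * base) for top, base in zip((r, g, b), backdrop)
    )


def contrast_ratio(fg, bg) -> float:
    """WCAG contrast ratio between two opaque colors."""
    hi, lo = sorted((relative_luminance(fg), relative_luminance(bg)), reverse=True)
    return (hi + 0.05) / (lo + 0.05)


def passes_aa(ratio: float, large_text: bool = False) -> bool:
    """AA thresholds: 4.5:1 for normal text, 3:1 for large text."""
    return ratio >= (3.0 if large_text else 4.5)


# Hypothetical layers: white label over a 50% black scrim on a white panel.
panel = (255, 255, 255)
scrim = (0, 0, 0, 0.5)
label = (255, 255, 255, 1.0)

composed_bg = composite(scrim, panel)        # what the user actually sees
composed_fg = composite(label, composed_bg)
ratio = contrast_ratio(composed_fg, composed_bg)
print(f"composed: {ratio:.2f}:1, AA normal text: {passes_aa(ratio)}")
```

On the swatch page, white on black is 21:1. Composed over the white panel, the same scrim flattens to mid-gray and the label drops to about 3.95:1, which fails AA for normal-size text. That gap is the reason for measuring the composed pair.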
The component matrix surfaced a pattern I see often: marketing components passed while dense operator components failed. Data grids, filter chips, and inline editors carried the debt because they were added after the initial system launch and never re-audited as a set.
I publish the matrix with the audit report so product and engineering share one source of truth. When a squad adds a new variant, they extend the row instead of opening a Slack thread about whether the old audit still counts.
- One row per component variant set, not per page
- Contrast measured on composed UI, not isolated swatches
- Fail/warn/pass per column with WCAG reference
- Matrix becomes the regression checklist after fixes land
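One way to picture the matrix rules above is as a small data structure. The sketch below is illustrative only: the dimension keys, class names, and sample rows are assumptions, and a real engagement might keep the same information in a spreadsheet. It shows one row per component variant set, a pass/warn/fail result with a WCAG reference per cell, and a regression checklist derived from every cell that did not pass.

```python
from dataclasses import dataclass, field
from enum import Enum


class Result(Enum):
    PASS = "pass"
    WARN = "warn"
    FAIL = "fail"


# Hypothetical column keys mirroring the five test dimensions.
DIMENSIONS = (
    "keyboard_focus",
    "contrast",
    "name_role",
    "touch_target",
    "reduced_motion",
)


@dataclass
class Cell:
    result: Result
    wcag: str  # success criterion reference, e.g. "1.4.3"


@dataclass
class MatrixRow:
    component: str  # one row per component variant set, not per page
    cells: dict[str, Cell] = field(default_factory=dict)


def regression_checklist(rows):
    """List (component, dimension, wcag) for every cell that did not pass."""
    return [
        (row.component, dim, row.cells[dim].wcag)
        for row in rows
        for dim in DIMENSIONS
        if dim in row.cells and row.cells[dim].result is not Result.PASS
    ]


# Illustrative rows, not real audit data.
rows = [
    MatrixRow("button", {
        "keyboard_focus": Cell(Result.PASS, "2.1.1"),
        "contrast": Cell(Result.PASS, "1.4.3"),
    }),
    MatrixRow("data-grid", {
        "keyboard_focus": Cell(Result.FAIL, "2.1.2"),
        "contrast": Cell(Result.WARN, "1.4.3"),
        "name_role": Cell(Result.PASS, "4.1.2"),
    }),
]

for item in regression_checklist(rows):
    print(item)
```

When a squad adds a variant, it extends the row's cells instead of creating a new per-page record, so the checklist regenerates from the same source.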
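Figure 2. Test dimensions: keyboard and focus, contrast, screen reader name and role, touch target, reduced motion.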
Severity, reporting, and what I will not claim
Blocker tier stops ship: keyboard traps, missing names on icon-only controls, contrast below AA on primary actions. Major tier ships with a dated fix plan: confusing focus order, inconsistent heading hierarchy in modals, errors conveyed by color alone. Minor tier is polish that still gets logged so it does not disappear.
Every finding includes a WCAG reference, a screenshot or recording, repro steps, and a fix suggestion that respects the existing design language. I do not rewrite the system; I make the gap legible for the team that owns the component.
I report measured ratios and observable behavior. I do not promise legal certification, VPAT completion, or blanket WCAG sign-off unless the contract scope explicitly includes compliance documentation. That honesty kept the engagement credible with procurement: numbers and repro paths, not a logo swap on the footer.
Remediation queue
Figure 3 shows how findings become shippable work. Blockers sit at the top with owner and target sprint. Major items cluster by component family so one squad can clear related issues in one PR. Minor items land in a backlog column that still gets reviewed before major releases.
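The queue layout described above can be sketched as a grouping step. This is a hypothetical shape, not the export format used in the engagement: the `Finding` fields and `build_queue` name are assumptions chosen for illustration. Blockers must carry an owner and a target sprint, majors cluster by component family, and minors fall into a backlog list.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class Finding:
    id: str
    tier: str  # "blocker", "major", or "minor"
    component_family: str
    wcag: str
    owner: Optional[str] = None
    target_sprint: Optional[str] = None


def build_queue(findings):
    """Split findings into blockers, majors grouped by family, and a backlog."""
    blockers = [f for f in findings if f.tier == "blocker"]
    unassigned = [f.id for f in blockers if not (f.owner and f.target_sprint)]
    if unassigned:
        # A blocker without an owner and sprint is not yet shippable work.
        raise ValueError(f"blockers missing owner or sprint: {unassigned}")

    majors = defaultdict(list)
    for f in findings:
        if f.tier == "major":
            majors[f.component_family].append(f)

    backlog = [f for f in findings if f.tier == "minor"]
    return {
        "blockers": blockers,
        "majors_by_family": dict(majors),
        "backlog": backlog,
    }


# Illustrative findings, not real audit data.
findings = [
    Finding("F-3", "major", "table", "1.3.1"),
    Finding("F-1", "blocker", "dialog", "2.1.2", owner="billing", target_sprint="S14"),
    Finding("F-4", "major", "table", "2.4.3"),
    Finding("F-7", "minor", "chip", "1.4.11"),
]

queue = build_queue(findings)
print([f.id for f in queue["blockers"]])
print({fam: [f.id for f in fs] for fam, fs in queue["majors_by_family"].items()})
```

Grouping majors by family is what lets one squad clear related issues in a single PR instead of picking findings off one at a time.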
Each queue row links back to the matrix row, the WCAG criterion, and the stable class hook where I could attach a Playwright spot-check. The client's engineers said the queue was the first audit deliverable they could paste into Jira without translation.
Regression gates are non-negotiable after fixes merge. I give QA a short spot-check list: five keyboard paths, three contrast pairs, two screen reader spot tests per release candidate. Full re-audit happens on system upgrades or annual renewal prep, not every two-week sprint.
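The spot-check quota above (five keyboard paths, three contrast pairs, two screen reader tests) can be enforced with a small gate. The sketch below is an assumption about how a QA team might encode it; the check kinds, tuple layout, and `gate_release` name are hypothetical.

```python
from collections import Counter

# Minimum checks per release candidate, taken from the quota in the text.
REQUIRED = {"keyboard_path": 5, "contrast_pair": 3, "screen_reader": 2}


def gate_release(checks):
    """Evaluate a release candidate's spot-check list.

    checks: iterable of (kind, name, passed) tuples.
    Returns (ok, shortfalls, failures), where shortfalls maps each kind
    to the number of checks still missing and failures lists failed names.
    """
    checks = list(checks)
    counts = Counter(kind for kind, _, _ in checks)
    shortfalls = {
        kind: needed - counts[kind]
        for kind, needed in REQUIRED.items()
        if counts[kind] < needed
    }
    failures = [name for _, name, passed in checks if not passed]
    return (not shortfalls and not failures), shortfalls, failures


# Illustrative run: full quota, one failing contrast pair.
run = (
    [("keyboard_path", f"path-{i}", True) for i in range(5)]
    + [("contrast_pair", f"pair-{i}", i != 2) for i in range(3)]
    + [("screen_reader", f"sr-{i}", True) for i in range(2)]
)
ok, shortfalls, failures = gate_release(run)
print(ok, shortfalls, failures)
```

The gate fails both when a check fails and when the list is shorter than the quota, so a thin spot-check cannot pass by omission.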
Figure 3. Fix queue: blockers, majors by component family, minor backlog.
Enterprise scale: Rexel and Google
Before this engagement I learned the discipline at Rexel, where ninety-plus sub-brands shared one Hybris template. Accessibility was not a single checklist; it was typography, focus, and contrast surviving translation, currency switches, and market-specific chrome. Section maps and style-guide portals were how we kept product listing page (PLP) patterns testable across thirty-seven countries.
At Google I ran People Ops programs where inclusive imagery and hybrid tooling had to work for every Googler, including keyboard-only workflows in internal tools and high-contrast needs on dense operator panels. Scale changes the audit: you invent matrices and portals so squads do not re-discover the same failure mode in isolation.
Those engagements inform how I run SaaS audits today. The component matrix is the portable artifact; the workflow is the same whether the surface is a storefront PLP or a billing settings drawer.
SACA hooks and live contrast tooling
Audits stick when components have stable names. I attach findings to `saca-*` class hooks where teams adopt SACA (or equivalent BEM roots) so Playwright specs can target the same node QA manually tested. Without hooks, every framework upgrade breaks the regression list.
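A quick way to catch hook loss after an upgrade is to compare the hooks the regression list expects against the hooks present in rendered markup. The sketch below uses only Python's standard `html.parser`; the `saca-` prefix comes from the text, while the class names, sample markup, and `missing_hooks` helper are hypothetical. It is a pre-check that could run before any Playwright spec, not a replacement for one.

```python
from html.parser import HTMLParser


class HookCollector(HTMLParser):
    """Collect class names that start with a given hook prefix."""

    def __init__(self, prefix: str = "saca-"):
        super().__init__()
        self.prefix = prefix
        self.hooks = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "class" and value:
                self.hooks.update(
                    c for c in value.split() if c.startswith(self.prefix)
                )


def missing_hooks(html: str, expected, prefix: str = "saca-"):
    """Return expected hooks that no longer appear in the markup, sorted."""
    collector = HookCollector(prefix)
    collector.feed(html)
    return sorted(set(expected) - collector.hooks)


# Illustrative markup after a hypothetical upgrade dropped the dialog hook.
rendered = """
<button class="saca-button saca-button--primary">Save</button>
<div class="modal" role="dialog"><input class="saca-input" /></div>
"""

print(missing_hooks(rendered, ["saca-button", "saca-dialog", "saca-input"]))
```

Any hook reported missing points at a regression check that would silently target nothing, which is cheaper to learn before the spec run than after.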
On my own products I dogfood the same bar. weableColor runs WCAG and APCA math on real Figma layers and VS Code document colors; Mission Control scores panels on open. That is the same contrast-on-real-layers rule I enforce in client audits, packaged as tooling I ship publicly.
The team did not adopt SACA wholesale, but they did add stable data-testid and BEM-style roots on the components we fixed. That was enough to keep the remediation queue alive through a later React major upgrade.
Engagement timeline and deliverables
I time-box audits so eng leads see progress weekly. Week one is inventory and matrix scaffolding: catalog import, squad tags, baseline keyboard order on the twenty highest-traffic components. Week two is deep testing on billing, settings, admin, and shared primitives (buttons, inputs, dialogs, tables). Week three is the readout, queue prioritization with squad leads, and agreement on regression gates before code lands.
Deliverables are boring on purpose: the matrix spreadsheet, the severity-ranked queue export, WCAG-mapped findings with screenshots, a one-page executive summary for procurement, and the regression spot-check list QA keeps. Legal cared about the executive summary; engineering cared about the queue. Same data, two views, no duplicate audits.
I stay through the first remediation sprint when teams ask, not to own the fixes, but to unblock interpretation. I identify whether a contrast failure is semantic (a token pair) or component-level (a local override) and point to the right owner so work does not bounce between the design system team and feature squads.
- Week 1: catalog + matrix baseline on high-traffic components
- Week 2: deep keyboard and screen reader passes on operator UI
- Week 3: readout, queue, regression list, procurement summary
- Optional sprint embed: clarify findings, not own implementation
Lessons and what ships next
Hero-only audits create false confidence. Renewal conversations cleared because we proved keyboard paths and contrast on the components buyers use in trial, not because the landing page had alt text.
Severity tiers keep engineering from drowning. Blockers first, majors grouped by squad, minors logged but not mixed into emergency fire drills.
Regression beats one-time PDFs. A short spot-check list per release costs less than re-hiring an auditor every quarter.
Nested widgets deserve their own matrix rows. Date pickers, comboboxes, and sortable tables can fail on their own while the parent card passes, and you miss that when you only audit the parent. I split complex components into sub-rows so fixes do not hide behind a passing parent shell.
Next for teams on this pattern: annual matrix refresh tied to design system version tags, automated contrast exports from weableColor into the component catalog, and Playwright specs generated from the highest-traffic keyboard paths in the remediation queue.