Playwright MCP is Microsoft’s official answer to “let an LLM drive a browser without burning vision tokens on screenshots.” Instead of feeding the model pixels, it exposes the page as a structured accessibility snapshot: every button, link, and input appears with its ARIA role, its accessible name, and a short element reference (ref). The agent reads the snapshot, calls a tool such as browser_click or browser_type with the ref of its target, and gets an updated snapshot back.
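To make the read-then-act loop concrete, here is a minimal sketch of how an agent-side helper might pick a target out of a snapshot and build a click request. The snapshot line shape (`- button "Sign in" [ref=e9]`), the sample page, and the helper names are assumptions for illustration; the real snapshot format and tool arguments can differ between Playwright MCP versions, so check the tool schema your client reports.

```python
import re

# Assumed line shape: - button "Sign in" [ref=e9]
# Extra bracketed attributes (e.g. [level=1]) may appear before the ref.
LINE = re.compile(r'-\s*(?P<role>\w+)\s+"(?P<name>[^"]*)".*?\[ref=(?P<ref>[^\]]+)\]')

# Hypothetical snapshot text, invented for this example.
SAMPLE_SNAPSHOT = """\
- heading "Acme Login" [level=1] [ref=e2]
- textbox "Email" [ref=e5]
- textbox "Password" [ref=e7]
- button "Sign in" [ref=e9]
"""


def parse_snapshot(text):
    """Return a list of {role, name, ref} dicts, one per matching line."""
    nodes = []
    for line in text.splitlines():
        match = LINE.search(line)
        if match:
            nodes.append(match.groupdict())
    return nodes


def find_ref(nodes, role, name):
    """Return the ref of the first node with the given role and name."""
    for node in nodes:
        if node["role"] == role and node["name"] == name:
            return node["ref"]
    raise LookupError(f"no {role} named {name!r} in snapshot")


def click_call(nodes, role, name):
    """Build a click tool-call payload; the argument names are an assumption."""
    return {
        "name": "browser_click",
        "arguments": {
            "element": f"{name} {role}",  # human-readable description
            "ref": find_ref(nodes, role, name),
        },
    }


nodes = parse_snapshot(SAMPLE_SNAPSHOT)
print(click_call(nodes, "button", "Sign in"))
```

The point of the sketch is that the target is chosen by role and name, which is what the model reasons about, while the ref is only a handle that stays valid for that one snapshot.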
What it produces: deterministic browser actions and observations. Click, type, navigate, wait for state, extract text — all returned as structured data, not screenshots. Same primitives as the Playwright test runner but exposed as MCP tools.
Best for: end-to-end test generation, scraping behind logins, agentic checkout flows, QA bots that need to interact with a real DOM.
Skip if: you only need to fetch static HTML (use Fetch MCP, which is far cheaper). Also skip it if you need pixel-perfect visual diffs: Playwright MCP can still take screenshots, but screenshot-based MCPs are better suited to that job.
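Setup gotchas: the first run may need to download a browser (~300MB for Chromium) if one isn't already installed. In current releases the browser runs headed by default, so you can watch the agent click; pass --headless for servers and CI. Logins survive between runs because the default mode keeps a persistent profile on disk, and --user-data-dir lets you choose where that profile lives. The easy-to-miss part is --isolated, which keeps the profile in memory only: cookies vanish when the session ends, which breaks anything behind a login unless you also supply --storage-state.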
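As a rough sketch of wiring those options into an MCP client, the helper below assembles a server entry in the common `mcpServers` JSON shape. The flag names (--headless, --isolated, --user-data-dir, --storage-state) are the ones listed in the Playwright MCP README as far as I know; treat them as assumptions and confirm with `npx @playwright/mcp@latest --help`. The function name and the profile path are hypothetical.

```python
import json


def playwright_mcp_config(headless=False, isolated=False,
                          user_data_dir=None, storage_state=None):
    """Build an MCP client config entry that launches Playwright MCP via npx."""
    args = ["@playwright/mcp@latest"]
    if headless:
        args.append("--headless")  # no visible browser window
    if isolated:
        args.append("--isolated")  # in-memory profile, nothing saved to disk
    if user_data_dir:
        args += ["--user-data-dir", user_data_dir]  # where the profile lives
    if storage_state:
        args += ["--storage-state", storage_state]  # saved cookies for isolated runs
    return {"mcpServers": {"playwright": {"command": "npx", "args": args}}}


# Hypothetical profile path; pick any directory the agent's user can write to.
config = playwright_mcp_config(headless=True, user_data_dir="/tmp/pw-profile")
print(json.dumps(config, indent=2))
```

Paste the printed JSON into your client's MCP settings file; clients differ in where that file lives, so the exact location is left out here.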
Real-world workflow: I use it to monitor competitor pricing pages weekly. The agent navigates, waits for the React app to hydrate, extracts the structured tier info, and posts a diff to Slack. Setup: 15 minutes. Maintenance: zero, because the accessibility tree is far more stable than CSS selectors.
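The diff-and-post half of that workflow can be sketched in a few lines, assuming the agent has already returned the tiers as a simple tier-to-price mapping. The tier names, prices, and function names are invented for illustration and are not my actual data; `post_to_slack` assumes a standard Slack incoming webhook, which accepts a JSON body with a `text` field.

```python
import json
import urllib.request


def diff_tiers(old, new):
    """Compare {tier: price} mappings and return human-readable change lines."""
    lines = []
    for tier in sorted(old.keys() | new.keys()):
        before, after = old.get(tier), new.get(tier)
        if before == after:
            continue  # unchanged tiers are not reported
        if before is None:
            lines.append(f"+ {tier}: {after} (new tier)")
        elif after is None:
            lines.append(f"- {tier}: {before} (removed)")
        else:
            lines.append(f"~ {tier}: {before} -> {after}")
    return lines


def post_to_slack(webhook_url, lines):
    """Send the change lines to a Slack incoming webhook as one message."""
    body = json.dumps({"text": "\n".join(lines)}).encode("utf-8")
    request = urllib.request.Request(
        webhook_url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return response.status


# Invented sample data standing in for two weekly extractions.
last_week = {"Starter": "$19/mo", "Pro": "$49/mo", "Team": "$99/mo"}
this_week = {"Starter": "$19/mo", "Pro": "$59/mo", "Enterprise": "Contact us"}

changes = diff_tiers(last_week, this_week)
print("\n".join(changes))
# To notify a channel: post_to_slack("<your webhook URL>", changes)
```

Keeping the diff as a pure function means the browser step can fail or be rerun without touching the notification logic.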
Compatible alternatives: Puppeteer MCP for older codebases, Firecrawl MCP for static crawls at scale.
The accessibility-tree approach is what makes this roughly 5x cheaper in tokens than screenshot-based browser MCPs. Use it.