Scriptable Browsers: From Selenium to Playwright-Powered Monitoring
A practical history of browser automation — Selenium, Puppeteer, Playwright — and how the same technology now powers synthetic monitoring that catches what HTTP checks miss.
Web browsers were built for humans. For a long time that was fine — a person clicked links, filled forms, and eyeballed the results. But as web applications grew more complex (AJAX, SPAs, client-side rendering), the industry needed a way to drive browsers programmatically. First for testing. Then for scraping. And now, for monitoring.
This post traces the evolution of scriptable browsers and explains why the same technology that powers your E2E test suite is now the best tool for knowing whether your site actually works for real users.
A brief history
Selenium (2004)
Selenium appeared when Google Chrome didn’t exist yet and Firefox was the browser to automate. There was no browser API for external control, so Selenium worked around the gap by injecting a JavaScript library into the browser, loading custom extensions, and exposing an HTTP interface to accept commands.
Over time, this approach was standardized into the JSON Wire Protocol. Support for other engines followed: ChromeDriver, OperaDriver, GeckoDriver. Then the W3C WebDriver standard formalized the idea, and browsers began shipping native support. Selenium 4 (2018, stable in 2021) adopted WebDriver natively.
Today Selenium is at v4.41 with 12 releases shipped in 2025 alone. It’s far from dead — but its architecture carries 20 years of history.
Puppeteer (2017)
Google’s Chrome DevTools team introduced Puppeteer in May 2017, built on a different protocol: CDP (Chrome DevTools Protocol). Instead of the HTTP-based WebDriver, CDP uses WebSocket for real-time bidirectional communication with the browser.
This was a significant upgrade. CDP gives you access to everything the DevTools panel sees: network waterfall, console messages, performance traces, coverage data. For scraping and testing, it meant you could intercept network requests, inject JavaScript, and capture HAR files without an external MITM proxy.
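To make the protocol difference concrete, here is a minimal sketch of the JSON frames CDP exchanges over its WebSocket. Network.enable and Page.navigate are real CDP methods, but the CdpSession class and its method names are illustrative only, and no socket is opened here.

```typescript
// Shape of a CDP command: a numeric id, a "Domain.method" name, optional params.
interface CdpCommand {
  id: number;
  method: string;
  params?: Record<string, unknown>;
}

// Illustrative helper (not part of any library): serializes commands and
// pairs each response with the command that produced it.
class CdpSession {
  private nextId = 1;
  private pending: Record<number, string> = {};

  // Returns the JSON frame that would be written to the WebSocket.
  command(method: string, params?: Record<string, unknown>): string {
    const message: CdpCommand = { id: this.nextId++, method };
    if (params !== undefined) message.params = params;
    this.pending[message.id] = method;
    return JSON.stringify(message);
  }

  // Responses carry the id of their command; events carry a method and no id.
  receive(frame: string): string {
    const message = JSON.parse(frame) as { id?: number; method?: string };
    if (message.id !== undefined) {
      const method = this.pending[message.id] ?? "unknown";
      delete this.pending[message.id];
      return `response to ${method}`;
    }
    return `event ${message.method ?? "unknown"}`;
  }
}

const session = new CdpSession();
const enableFrame = session.command("Network.enable");
const navigateFrame = session.command("Page.navigate", { url: "https://example.com" });
```

Because events arrive without an id, the browser can push them at any time. That is what makes the channel bidirectional, in contrast to request/response polling over HTTP.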
The trade-off: Puppeteer was JavaScript-only and Chromium-first. Firefox support came later (via patched binaries), and the API was async-only.
Currently at v24.40, Puppeteer remains widely used, especially in data extraction and AI agent tooling, where its lean use of CDP makes it 15-20% faster than alternatives for certain workloads.
Playwright (2019)
Playwright was created by the same engineers who built Puppeteer, after they moved from Google to Microsoft. It launched in November 2019 and addressed Puppeteer’s limitations head-on:
- Multi-browser from day one. Chrome, Firefox, and WebKit — all using CDP (or Playwright’s own protocol variants).
- Multi-language. JavaScript/TypeScript first, but also Python, Java, and C#.
- Built-in test runner. npx playwright test ships with parallel execution, auto-retries, HTML reports, and trace recording.
- Sync and async APIs. In Python, you choose. Selenium only offered sync; Puppeteer only offered async.
- Auto-waiting. Elements are automatically waited for before interaction, so there are no more sleep(2) calls or manual wait loops.
- Built-in network introspection. Intercept, mock, or record all network traffic without external tools.
The numbers speak for themselves: 78,600+ GitHub stars, 13.5 million weekly npm downloads, and 235% year-over-year growth. In 2026, Playwright surpassed Selenium as the top automation testing tool in industry surveys. Its retention rate is 94%: once teams adopt it, they stay.
The comparison matrix
| | Selenium | Puppeteer | Playwright |
|---|---|---|---|
| Protocol | WebDriver (HTTP) | CDP (WebSocket) | CDP / custom (WebSocket) |
| Browsers | Chrome, Firefox, Safari, Edge, IE | Chrome, Firefox* | Chrome, Firefox, WebKit |
| Languages | Java, Python, C#, JS, Ruby | JS/TS only | JS/TS, Python, Java, C# |
| Network introspection | Requires MITM proxy | Built-in | Built-in |
| Auto-wait | No | No | Yes |
| Test runner | External (TestNG, pytest) | External | Built-in (@playwright/test) |
| Binary management | 3rd-party webdriver_manager | Built-in npx puppeteer browsers install | Built-in npx playwright install |
| Headless | Via browser flags + Xvfb | Native | Native |
* Puppeteer’s Firefox support originally relied on patched browser builds; Puppeteer does not drive WebKit.
If you’re starting fresh in 2026 — no legacy codebase, no deep investment in another framework — Playwright is the clear default for its simplicity, speed, multi-browser support, and built-in tooling.
What headless actually means
Headless mode runs the browser without a GUI. No window renders on screen, but the full engine runs: DOM parsing, JavaScript execution, layout computation, network requests. The advantages are practical:
- Faster. Some rendering steps (compositing, painting) can be skipped.
- Less resource-hungry. No GPU process, no display server dependency.
- Runs anywhere. Linux containers, CI runners, server VMs — no Xvfb needed.
As of Chrome 132 (January 2025), --headless defaults to the “new” headless mode — a full browser instance without a window, not the old stripped-down shell. The previous --headless=old mode was removed from the main binary and lives on as a separate chrome-headless-shell binary for lightweight use cases.
What’s happening now: BiDi and AI agents
Two shifts are reshaping the browser automation landscape:
WebDriver BiDi is a new W3C standard that brings bidirectional communication to WebDriver, combining CDP’s real-time capabilities with WebDriver’s cross-browser standardization. Firefox deprecated its CDP support in Firefox 129 and has since dropped it, so BiDi is its path for event-driven automation. Chrome still defaults to CDP, with BiDi as an opt-in. Playwright tracks BiDi progress but hasn’t adopted it yet due to missing features.
AI agent browsers are a new category. Tools like Browser Use, Steel Browser, and Hyperbrowser give AI agents the ability to navigate the web autonomously. Notably, Browser Use moved from Playwright to raw CDP for performance. And Lightpanda — a headless browser written in Zig from scratch — claims 11x faster execution and 9x less memory than headless Chrome, with a CDP-compatible API so existing scripts work as drop-in replacements.
From testing to monitoring
Here’s the connection that matters for operations teams: if you can drive a browser to test a user flow, you can drive a browser to monitor that flow continuously.
A traditional HTTP monitor sends a GET request, checks the status code, and measures response time. This tells you the endpoint is alive. But a 200 OK can hide:
- Broken JavaScript that prevents the page from rendering
- Failed API calls that leave the UI in an error state
- Missing assets (CSS, images, fonts) that break the layout
- Degraded Core Web Vitals that make the page unusable
Browser monitoring loads your page in a real Chromium instance and captures what a real user would experience. Same Playwright engine used for E2E testing — now running on a schedule from multiple global locations.
Oack Browser Monitoring: Pageload mode
Oack’s Pageload monitor opens your URL in headless Chromium (via Playwright) and captures everything:
Web Vitals — the metrics Google uses to measure user experience:
| Metric | What it measures | Good threshold |
|---|---|---|
| TTFB | Server response time | < 200 ms |
| FCP | First visible content | < 1.8 s |
| LCP | Main content ready | < 2.5 s |
| CLS | Layout stability | < 0.1 |
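The thresholds in the table can be applied mechanically to a captured sample. The sketch below is one way to do that; the function and constant names are illustrative and are not part of Oack or Playwright.

```typescript
type Vital = "TTFB" | "FCP" | "LCP" | "CLS";

// "Good" upper bounds from the table above.
// Timings are in milliseconds; CLS is a unitless score.
const GOOD_BELOW: Record<Vital, number> = { TTFB: 200, FCP: 1800, LCP: 2500, CLS: 0.1 };

// Returns the metrics in the sample that fall outside the "good" range.
// Metrics missing from the sample are ignored.
function vitalsOutsideGood(sample: Partial<Record<Vital, number>>): Vital[] {
  return (Object.keys(GOOD_BELOW) as Vital[]).filter((name) => {
    const value = sample[name];
    return value !== undefined && value >= GOOD_BELOW[name];
  });
}

// A page with a fast server but slow first paint and a shifting layout.
const flagged = vitalsOutsideGood({ TTFB: 150, FCP: 2100, LCP: 2400, CLS: 0.25 });
```

Tracking the flagged list per check, rather than a single pass/fail bit, makes it easier to see which metric regressed.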
Page timing — DOM Interactive, DOMContentLoaded, and window Load events, tracked as time series so you can spot regressions.
HAR waterfall — the full HTTP Archive of every network request the page makes. Every resource URL, status code, size, and timing bar. Filter by type (JS, CSS, images, XHR). Resources that returned 4xx/5xx are highlighted.
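As a rough illustration of what the waterfall filters do, the sketch below works on the standard HAR 1.2 structure (log.entries, request.url, response.status, response.content.mimeType). The helper names and the sample data are made up for the example.

```typescript
// Minimal subset of the HAR 1.2 structure used by this sketch.
interface HarEntry {
  request: { url: string };
  response: { status: number; content: { mimeType: string } };
  time: number; // total time for the request, in milliseconds
}
interface Har {
  log: { entries: HarEntry[] };
}

// Entries whose response status is a client or server error (4xx/5xx).
function failedResources(har: Har): HarEntry[] {
  return har.log.entries.filter((e) => e.response.status >= 400 && e.response.status < 600);
}

// Entries whose MIME type contains the given fragment, e.g. "javascript" or "css".
function byMimeType(har: Har, fragment: string): HarEntry[] {
  return har.log.entries.filter((e) => e.response.content.mimeType.indexOf(fragment) !== -1);
}

// Hypothetical capture: the document loads, but a stylesheet and an API call fail.
const sampleHar: Har = {
  log: {
    entries: [
      { request: { url: "https://example.com/" }, response: { status: 200, content: { mimeType: "text/html" } }, time: 120 },
      { request: { url: "https://example.com/app.js" }, response: { status: 200, content: { mimeType: "application/javascript" } }, time: 80 },
      { request: { url: "https://example.com/styles.css" }, response: { status: 404, content: { mimeType: "text/css" } }, time: 15 },
      { request: { url: "https://example.com/api/cart" }, response: { status: 503, content: { mimeType: "application/json" } }, time: 950 },
    ],
  },
};

const failedUrls = failedResources(sampleHar).map((e) => e.request.url);
```

Here the main document returns 200, so an HTTP check would pass, while the HAR shows two failed sub-resources.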
Screenshots — viewport or full-page snapshots captured on every check, not just failures. Visual proof of what the page looked like at that moment.
Console logs — JavaScript errors and warnings from the DevTools console. Set a threshold: if console.error() fires more than N times, the check fails.
Health evaluation is configurable: main document status code + page load timeout + console error threshold + resource error threshold. Any condition breach triggers your alert channels — Slack, Telegram, PagerDuty, email, webhooks.
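One way to picture the evaluation is as a list of independent conditions, any of which marks the check unhealthy. The sketch below assumes hypothetical field names and treats a main document status of 400 or above as a failure; it is not Oack’s actual schema or rule set.

```typescript
// Hypothetical shape of one Pageload result (names are assumptions).
interface PageloadResult {
  documentStatus: number;  // status code of the main document
  loadTimeMs: number;      // time until the page finished loading
  consoleErrors: number;   // number of console.error() calls observed
  failedResources: number; // number of sub-resources that returned 4xx/5xx
}

// Hypothetical thresholds mirroring the four conditions described above.
interface HealthPolicy {
  loadTimeoutMs: number;
  maxConsoleErrors: number;
  maxFailedResources: number;
}

// Returns every breached condition; an empty list means the check is healthy.
function evaluateHealth(result: PageloadResult, policy: HealthPolicy): string[] {
  const breaches: string[] = [];
  if (result.documentStatus >= 400) breaches.push(`document status ${result.documentStatus}`);
  if (result.loadTimeMs > policy.loadTimeoutMs) breaches.push("page load timeout");
  if (result.consoleErrors > policy.maxConsoleErrors) breaches.push("console error threshold");
  if (result.failedResources > policy.maxFailedResources) breaches.push("resource error threshold");
  return breaches;
}

const policy: HealthPolicy = { loadTimeoutMs: 30000, maxConsoleErrors: 0, maxFailedResources: 2 };
const healthy = evaluateHealth({ documentStatus: 200, loadTimeMs: 1800, consoleErrors: 0, failedResources: 1 }, policy);
const broken = evaluateHealth({ documentStatus: 200, loadTimeMs: 2200, consoleErrors: 3, failedResources: 5 }, policy);
```

The second sample still has a 200 on the main document, which is exactly the case a plain HTTP monitor would report as up.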
Each check runs in a fresh browser context — no cookies, no cache — simulating a first-time visitor. The default interval is 5 minutes (minimum: 60 seconds for higher tiers). This is heavier than an HTTP probe, but the signal quality is categorically different.
Oack Browser Monitoring: Test Suite mode
Pageload answers “does my page load correctly?” Test Suite answers “can a user actually do things?”
The key design decision: zero rewrites. You write standard Playwright tests with test() and expect(). The same .spec.ts file runs locally with npx playwright test and on Oack as a scheduled monitor. No custom API, no proprietary format, no vendor lock-in.
Here’s a real example — an E2E test for a login flow:
import { test, expect } from '@playwright/test';

test('user can log in and see the store', async ({ page }) => {
  // BASE_URL and the credentials are read from environment variables
  await page.goto(process.env.BASE_URL + '/login');
  await page.getByTestId('email-input').fill(process.env.LOGIN_EMAIL!);
  await page.getByTestId('password-input').fill(process.env.LOGIN_PASSWORD!);
  await page.getByTestId('login-submit').click();
  // A successful login redirects to the store
  await expect(page).toHaveURL(/\/store/);
});
That’s it. Standard Playwright. Credentials come from environment variables — managed securely via Oack’s team-level secrets (AES-256-GCM encrypted at rest, write-only after creation, never exposed in API responses).
Deploy with two commands
Test it against the platform:
oackctl test --team $TEAM --monitor $MONITOR --dir ./tests
This uploads your project, runs it on a remote browser-checker inside Docker, and returns the Playwright HTML report — complete with screenshots, error details, and timing. The report auto-opens in your browser.
Once you’re happy, deploy for continuous monitoring:
oackctl deploy --team $TEAM --monitor $MONITOR --dir ./tests
The platform runs npm install once, caches your node_modules, and executes npx playwright test on every scheduled check. Any test failure = monitor goes DOWN, alerts fire.
Filtering
You don’t have to run your entire test suite on every check. Filter by test name, project, or tag:
# Only login tests, every 5 minutes
oackctl deploy --team $T --monitor $M --dir ./tests --pw-grep "login"
# Only tests tagged @critical
oackctl deploy --team $T --monitor $M --dir ./tests --pw-tag "critical"
# Only the Chromium project
oackctl deploy --team $T --monitor $M --dir ./tests --pw-project "chromium"
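How oackctl applies these flags is not described here. The sketch below only illustrates the selection semantics, assuming --pw-grep behaves like a regular expression matched against the test title, --pw-tag matches a tag with or without its leading @, and --pw-project matches the project name exactly. All type and function names are hypothetical.

```typescript
// Hypothetical description of a test as a runner might list it.
interface TestCase {
  title: string;
  tags: string[]; // stored with a leading "@", e.g. "@critical"
  project: string;
}

// Hypothetical filter mirroring the three flags shown above.
interface RunFilter {
  grep?: string;
  tag?: string;
  project?: string;
}

// Keeps only the tests that satisfy every filter that is set.
function selectTests(tests: TestCase[], filter: RunFilter): TestCase[] {
  const pattern = filter.grep !== undefined ? new RegExp(filter.grep) : undefined;
  // Accept both "critical" and "@critical" as the tag value.
  const tag = filter.tag !== undefined ? "@" + filter.tag.replace(/^@/, "") : undefined;
  return tests.filter(
    (t) =>
      (pattern === undefined || pattern.test(t.title)) &&
      (tag === undefined || t.tags.indexOf(tag) !== -1) &&
      (filter.project === undefined || t.project === filter.project),
  );
}

const suite: TestCase[] = [
  { title: "login flow reaches the store", tags: ["@critical"], project: "chromium" },
  { title: "login flow reaches the store", tags: ["@critical"], project: "firefox" },
  { title: "footer links resolve", tags: [], project: "chromium" },
];

const criticalOnChromium = selectTests(suite, { tag: "critical", project: "chromium" });
```

Narrow filters keep frequent checks short, while the full suite can still run on a slower schedule or in CI.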
What the probe captures
Every test run produces:
- Playwright HTML report — the same report you see locally, with full step details, screenshots, and traces
- Pass/fail counts — 5 passed, 1 failed, 0 skipped
- Total duration — how long the suite took
- Git metadata — commit SHA, branch, who deployed
The report is served via signed URLs (HMAC + 1-hour expiry) and retained for 3-7 days depending on your plan.
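The exact signing scheme is not documented here, so the following is only a sketch of the general technique: an HMAC-SHA256 over the path and an expiry timestamp, checked in constant time. The query parameter names (expires, sig), the message layout, and the function names are assumptions.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

const ONE_HOUR_SECONDS = 3600;

// HMAC-SHA256 over "<path>:<expiry>", hex-encoded. The layout is an assumption.
function signatureFor(path: string, expires: number, secret: string): string {
  return createHmac("sha256", secret).update(`${path}:${expires}`).digest("hex");
}

// Appends an expiry one hour from nowSeconds and the matching signature.
function signReportUrl(path: string, secret: string, nowSeconds: number): string {
  const expires = nowSeconds + ONE_HOUR_SECONDS;
  return `${path}?expires=${expires}&sig=${signatureFor(path, expires, secret)}`;
}

// Accepts the URL only if it has not expired and the signature matches.
function verifyReportUrl(url: string, secret: string, nowSeconds: number): boolean {
  const [path, query = ""] = url.split("?");
  const params = new URLSearchParams(query);
  const expires = Number(params.get("expires"));
  if (!isFinite(expires) || nowSeconds > expires) return false;
  const provided = Buffer.from(params.get("sig") ?? "", "hex");
  const expected = Buffer.from(signatureFor(path, expires, secret), "hex");
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return provided.length === expected.length && timingSafeEqual(provided, expected);
}

const issuedAt = 1_700_000_000;
const signedUrl = signReportUrl("/reports/42/index.html", "example-secret", issuedAt);
```

Because the expiry is part of the signed message, a client cannot extend a link by editing the expires parameter.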
The infrastructure under the hood
Each browser check runs in an ephemeral Docker container — created, executed, and destroyed per probe. This is important for two reasons:
- Isolation. A frozen Chromium tab (infinite JS loop, OOM) can’t block other monitors. The container is killed after the timeout + 30-second grace period.
- Security. User test suites run in containers with a --read-only filesystem, --memory=512m, --cpus=1.5, and --pids-limit=256. No access to host credentials or other monitors’ data.
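The limits above map directly onto docker run flags. The sketch below assembles such an argument list and the kill deadline. The real browser-checker is a Go binary whose code is not shown here, so the function names, the image name, and the use of --rm are assumptions.

```typescript
// Resource limits quoted in the list above.
interface SandboxLimits {
  memory: string;
  cpus: number;
  pidsLimit: number;
}

const DEFAULT_LIMITS: SandboxLimits = { memory: "512m", cpus: 1.5, pidsLimit: 256 };
const GRACE_SECONDS = 30;

// Builds the argument list for `docker run`. --rm is an assumption that
// models "destroyed per probe"; the other flags come from the text above.
function dockerRunArgs(image: string, limits: SandboxLimits = DEFAULT_LIMITS): string[] {
  return [
    "run",
    "--rm",
    "--read-only",
    `--memory=${limits.memory}`,
    `--cpus=${limits.cpus}`,
    `--pids-limit=${limits.pidsLimit}`,
    image,
  ];
}

// The container is killed once the check timeout plus the grace period elapses.
function killDeadlineSeconds(checkTimeoutSeconds: number): number {
  return checkTimeoutSeconds + GRACE_SECONDS;
}

// "example/browser-checker:latest" is a placeholder image name.
const args = dockerRunArgs("example/browser-checker:latest");
```

Passing the arguments as an array rather than a shell string avoids quoting problems when the list is handed to a process spawner.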
The browser-checker binary (Go) manages the lifecycle: downloads the test suite, extracts cached dependencies, spawns the Docker container, reads artifacts from a bind-mounted temp directory, uploads the Playwright report to storage, and sends the probe result via WebSocket. Credentials for artifact storage and the API never enter the container.
Cold start is ~2 seconds (container creation + Node.js init + Chromium launch) — acceptable for checks running every 60-300 seconds.
When to use which
| Scenario | HTTP Monitor | Browser Pageload | Browser Test Suite |
|---|---|---|---|
| Is the endpoint alive? | Yes | Overkill | Overkill |
| Does the page render? | No | Yes | Overkill |
| Are Web Vitals healthy? | No | Yes | No (use Pageload) |
| Can users log in? | No | No | Yes |
| Does checkout work? | No | No | Yes |
| CI/CD integration? | N/A | N/A | Yes (oackctl test in pipeline) |
| Resource cost | Minimal | Medium | Higher |
Start with HTTP monitors for availability. Add Pageload monitors for your most important pages — homepage, landing pages, pricing. Add Test Suite monitors for critical user flows that, if broken, directly cost revenue.
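The layering in the table can be summarized as a small decision function. This is only an illustration of the recommendation above; the type and field names are invented for the example.

```typescript
type MonitorType = "http" | "pageload" | "test-suite";

// What you need to know about a page or flow (illustrative field names).
interface Needs {
  rendering?: boolean; // does the page render?
  webVitals?: boolean; // are Web Vitals healthy?
  userFlows?: boolean; // can users log in, check out, and so on?
}

// HTTP is always the baseline; heavier monitors are added only when needed.
function recommendMonitors(needs: Needs): MonitorType[] {
  const monitors: MonitorType[] = ["http"];
  if (needs.rendering || needs.webVitals) monitors.push("pageload");
  if (needs.userFlows) monitors.push("test-suite");
  return monitors;
}

const forPricingPage = recommendMonitors({ rendering: true, webVitals: true });
const forCheckout = recommendMonitors({ userFlows: true });
```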
From scraping to testing to monitoring
Scriptable browsers started as a scraping and testing tool. Twenty years later, the same technology — now mature, fast, and well-supported — is the foundation for answering the question that matters most in production: does my site work for a real user, right now?
Playwright won the framework war. The infrastructure to run it reliably at scale (ephemeral containers, artifact storage, scheduled execution) is the hard part. That’s what we built.