Documentation

Oack is an uptime and performance monitoring platform with deep TCP-level telemetry, multi-channel alerting, and AI-assisted troubleshooting via MCP. Run checkers on your own infrastructure, get notified in seconds, and diagnose issues all the way down to the network layer.

Quick Start

Get monitoring in three steps:

1. Create an account — Sign up at app.oack.io and create your first team.
2. Add a monitor — Enter the URL you want to watch, pick an HTTP method, set the check interval, and choose a checker region.
3. Configure alerts — Create alert channels (Slack, Discord, Telegram, PagerDuty, Email, or Webhook) and link them to your monitor. You'll be notified within seconds when something goes wrong.

HTTP Monitoring

Each monitor performs an HTTP/HTTPS request to your endpoint at a configurable interval. Every probe captures:

Full timing breakdown — DNS lookup, TCP connect, TLS handshake, send, wait (TTFB), and receive phases.
HTTP headers & body — Request and response headers plus truncated body (1 KB) captured on every probe.
TCP metrics — Kernel-level TCP_INFO data including RTT, retransmits, congestion window, and segment counters.
Packet capture — Optional per-probe pcap of the full HTTP exchange (SYN to FIN) for deep post-mortem analysis.

Monitors support custom HTTP methods (GET, POST, HEAD, etc.), custom headers, request body, and configurable timeouts. Check intervals range from 30 seconds (Business) to 5 minutes (Free).

Health Rules

Health rules define when a monitor is considered up or down. Each monitor has:

Success criteria — Expected HTTP status codes (e.g. 200-299) and maximum latency threshold.
Failure threshold — Number of consecutive failing probes required before the monitor transitions from UP to DOWN.
Recovery threshold — Number of consecutive passing probes required before the monitor transitions from DOWN back to UP.
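
The two thresholds above behave like a small state machine. The following is a minimal sketch, assuming each probe has already been reduced to pass or fail by the success criteria; `HealthTracker` and `recordProbe` are illustrative names, not part of the Oack API:

```typescript
type MonitorState = "UP" | "DOWN";

interface HealthTracker {
  state: MonitorState;
  streak: number; // consecutive probes that contradict the current state
}

// Record one probe result and return the (possibly updated) tracker.
function recordProbe(
  tracker: HealthTracker,
  passed: boolean,
  failureThreshold: number,
  recoveryThreshold: number,
): HealthTracker {
  const contradicts = tracker.state === "UP" ? !passed : passed;
  if (!contradicts) {
    // The streak is broken, so counting starts again from zero.
    return { state: tracker.state, streak: 0 };
  }
  const streak = tracker.streak + 1;
  const needed = tracker.state === "UP" ? failureThreshold : recoveryThreshold;
  if (streak >= needed) {
    return { state: tracker.state === "UP" ? "DOWN" : "UP", streak: 0 };
  }
  return { state: tracker.state, streak };
}
```

With a failure threshold of 3, two failures followed by a success leave the monitor UP, because the success resets the streak.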

SSL & Domain Expiration

Oack automatically monitors SSL certificate and domain registration expiration for every active HTTP monitor. A daily sweep checks:

SSL certificates — TLS handshake reads the leaf certificate's expiration date.
Domain registration — RDAP protocol (the ICANN standard replacement for WHOIS) checks registration expiry.

Notifications are sent through your linked alert channels at configurable thresholds (default: 30, 14, 7, and 1 days before expiration). Available on Pro and Business plans.
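
To illustrate how the default thresholds could be evaluated during a daily sweep, here is a rough sketch. The function names are hypothetical, and it assumes a threshold counts as crossed once the remaining whole days are at or below it; the actual notification logic is not documented here:

```typescript
const DEFAULT_THRESHOLDS_DAYS = [30, 14, 7, 1];

// Whole days from `now` until `expiresAt` (negative once expired).
function daysUntil(expiresAt: Date, now: Date): number {
  const msPerDay = 24 * 60 * 60 * 1000;
  return Math.floor((expiresAt.getTime() - now.getTime()) / msPerDay);
}

// Returns the tightest threshold that has been crossed, or null if none has.
function crossedThreshold(
  daysLeft: number,
  thresholds: number[] = DEFAULT_THRESHOLDS_DAYS,
): number | null {
  const crossed = thresholds.filter((t) => daysLeft <= t);
  return crossed.length > 0 ? Math.min(...crossed) : null;
}
```

A certificate with 12 days left has crossed the 30-day and 14-day thresholds, so `crossedThreshold(12)` reports 14.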

Web Checker — Pageload

Pageload monitors launch a real Chromium browser (Playwright) to load your page exactly like a visitor would. Designed for performance monitoring — measure how fast your page loads, track Web Vitals over time, and get alerted when performance degrades. No scripting required — just enter a URL.

Web Vitals & timing metrics

Metric	Name	What it tells you
TTFB	Time to First Byte	How long until the browser receives the first byte from the server. High TTFB points to slow server processing, DNS issues, or network latency. Under 200 ms is good; above 600 ms needs investigation.
FCP	First Contentful Paint	When the browser renders the first piece of visible content (text, image, or canvas). This is the moment your page stops being blank. Under 1.8 s is good; above 3 s feels slow to users.
LCP	Largest Contentful Paint	When the largest visible element (hero image, heading block, video poster) finishes rendering. This is the best proxy for "the page looks ready." Under 2.5 s is good; above 4 s means users are waiting too long for the main content.
CLS	Cumulative Layout Shift	How much the page layout shifts unexpectedly while loading (ads popping in, images resizing, fonts swapping). It's a score, not a time — under 0.1 is good; above 0.25 means things are jumping around and annoying your visitors.
DOM Interactive	DOM Interactive	When the HTML document has been fully parsed and the DOM is ready for JavaScript to manipulate. Render-blocking scripts and large HTML payloads push this number up.
DOMContentLoaded	DOMContentLoaded Event	When the HTML has been fully parsed and all deferred scripts have finished executing. A big gap between DOM Interactive and DOMContentLoaded usually means heavy synchronous JavaScript.
Load Event	Window Load	When the entire page — including images, stylesheets, iframes, and fonts — has finished loading. This is the "everything done" marker.
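
The good and poor boundaries quoted in the table can be turned into a simple rating helper. This is only a sketch: `rateVital` is a hypothetical name, and the label for the middle band ("needs improvement") is an assumption, since the table only names the two ends:

```typescript
type VitalName = "TTFB" | "FCP" | "LCP" | "CLS";
type Rating = "good" | "needs improvement" | "poor";

// [good below, poor above]; times in milliseconds, CLS is a unitless score.
const VITAL_THRESHOLDS: Record<VitalName, [number, number]> = {
  TTFB: [200, 600],
  FCP: [1800, 3000],
  LCP: [2500, 4000],
  CLS: [0.1, 0.25],
};

function rateVital(name: VitalName, value: number): Rating {
  const [good, poor] = VITAL_THRESHOLDS[name];
  if (value < good) return "good";
  if (value > poor) return "poor";
  return "needs improvement";
}
```

For example, an LCP of 2100 ms rates as good, while a CLS of 0.3 rates as poor.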

What each probe captures

Web Vitals — TTFB, FCP, LCP, and CLS measured from the real browser rendering pipeline.
Page timing — DOM Interactive, DOMContentLoaded, and Load Event timestamps.
HAR waterfall — Full HTTP Archive of every network request the page made, with timing, size, and status. Download and inspect in any HAR viewer.
Screenshots — Optional viewport or full-page screenshot captured after the page loads.
Console log — All console messages (errors, warnings, info) emitted during page load, with counts for each severity.
Resource summary — Total resource count, error count, and total bytes transferred.

Web Checker — Test Suite

Test Suite monitors run standard Playwright Test files to verify functional user flows — login, search, checkout, multi-page navigation. Designed for scenario testing, not page speed. The platform runs npx playwright test on schedule and alerts you when tests fail.

Write your tests with test() and expect(), run them locally with npx playwright test, then deploy the same directory to Oack. No custom API, no rewrites — the same tests run everywhere.

Example: PokéStore e2e tests

poke-store.oack.io is a demo Pokémon store with login, search, cart, and checkout flows. Source. The test suite lives alongside the frontend code:

tests/e2e/store.spec.ts

import { test, expect, type Page } from '@playwright/test';

// Shared helper: sign in through the login form and wait for the store page.
async function loginAsAsh(page: Page) {
  await page.goto('/login');
  await page.getByTestId('email-input').fill('[email protected]');
  await page.getByTestId('password-input').fill('pikachu123');
  await page.getByTestId('login-submit').click();
  await page.waitForURL(/\/store/);
}

test.describe('PokéStore', () => {
  test('should log in and see store', async ({ page }) => {
    await loginAsAsh(page);
    await expect(page.getByTestId('user-name')).toHaveText('Ash Ketchum');
  });

  test('should search Pokémon', async ({ page }) => {
    await loginAsAsh(page);
    await page.getByTestId('search-input').fill('pikachu');
    await expect(page.getByTestId('pokemon-name')).toContainText('Pikachu');
  });
});

Run locally

Terminal

cd web
npx playwright test

# 13 passed (24.1s)

Skip repetitive flags with .oackctl.env

Create a .oackctl.env file in your project root to avoid passing --team and --monitor on every command. oackctl auto-loads it from the current directory.

.oackctl.env

OACKCTL_TEAM=a98957b0-a129-4032-a2c4-d18ac8dd2287
OACKCTL_MONITOR=f190f477-48f7-46d7-a533-25ca3b1541e1

Now you can run commands without the flags:

Terminal

oackctl test --dir web
oackctl deploy --dir web

Every CLI flag maps to an OACKCTL_ env var: --team → OACKCTL_TEAM, --monitor → OACKCTL_MONITOR, --pw-grep → OACKCTL_PW_GREP, etc. Add .oackctl.env to your .gitignore if it contains team-specific IDs.
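
Judging by the three examples above, the mapping looks mechanical: drop the leading dashes, replace the remaining dashes with underscores, uppercase, and prefix with OACKCTL_. A small sketch of that rule follows; `flagToEnvVar` is a hypothetical helper and not part of oackctl:

```typescript
// Convert a long CLI flag such as "--pw-grep" to its env var name.
function flagToEnvVar(flag: string): string {
  const name = flag.replace(/^--/, "");
  return "OACKCTL_" + name.replace(/-/g, "_").toUpperCase();
}
```

Under this rule, `flagToEnvVar("--pw-grep")` yields `OACKCTL_PW_GREP`, matching the example in the text.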

Test on Oack (one-off run)

Upload the same directory for a one-off test run on Oack's browser infrastructure. The result includes a full Playwright HTML report.

Terminal

oackctl test --team <TEAM> --monitor <MONITOR> --dir web

# Packaging web...
#   Files: 74 (112.7 KB)
#
# Running test...
#
# Result: PASSED
# Report: https://api.oack.io/api/v1/artifacts/.../report/index.html

Deploy for continuous monitoring

Deploy the test suite to a browser monitor. It runs on schedule (e.g. every 5 minutes) and you get alerted when tests fail.

Terminal

oackctl deploy --team <TEAM> --monitor <MONITOR> --dir web

# Packaging web...
#   Files: 74 (112.7 KB)
#
# Uploading suite...
#   Suite: 112.7 KB
#   Tests: tests/e2e/store.spec.ts
#   Git:   8169f4ce (main)
#
# Deployed.
# Monitor: https://app.oack.io/teams/.../monitors/...

What you get

Playwright HTML report — full test breakdown with screenshots, error details, and timing. Opens in your browser after each test run.
Pass/fail health status — any test failure = monitor DOWN. Alerts fire through your configured channels (email, Slack, PagerDuty, etc.).
Git metadata — each deploy records the commit SHA, branch, and who deployed. Visible in the dashboard.
Environment variables — pass credentials and config via --env flags or team-level secrets. Tests access them via process.env.
Filters — run a subset of tests with --pw-grep, --pw-project, or --pw-tag.

Multi-monitor config

For complex setups, define all check suites in an oack.config.json file:

oack.config.json

{
  "team": "<TEAM_ID>",
  "dir": "web",
  "checks": [
    {
      "name": "PokéStore Login",
      "pw_grep": "login"
    },
    {
      "name": "PokéStore Chromium Only",
      "pw_project": "chromium"
    },
    {
      "name": "PokéStore Critical Flows",
      "pw_tag": "critical"
    },
    {
      "name": "PokéStore Full Suite"
    }
  ]
}

The name field is the unique key — monitors are matched by name within the team. If a monitor with that name already exists, it's updated. If not, a new browser monitor is created automatically. Removing a check from the config does not delete the monitor — use oackctl monitors delete for that.
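
The matching rule described above can be sketched as a planning step that decides, per check, whether to create or update a monitor. The types and the `planDeploy` function are illustrative assumptions, not the actual oackctl implementation:

```typescript
interface CheckConfig {
  name: string;
  pw_grep?: string;
  pw_project?: string;
  pw_tag?: string;
}

interface ExistingMonitor {
  id: string;
  name: string;
}

type PlannedAction =
  | { name: string; action: "create" }
  | { name: string; action: "update"; monitorId: string };

// Match checks to existing monitors by name. Monitors that are not
// mentioned in the config are left untouched (no delete action exists).
function planDeploy(checks: CheckConfig[], existing: ExistingMonitor[]): PlannedAction[] {
  const idByName = new Map<string, string>();
  for (const monitor of existing) idByName.set(monitor.name, monitor.id);
  return checks.map((check): PlannedAction => {
    const monitorId = idByName.get(check.name);
    return monitorId === undefined
      ? { name: check.name, action: "create" }
      : { name: check.name, action: "update", monitorId };
  });
}
```

Because the plan never contains a delete action, removing a check from the config leaves its monitor in place, as the paragraph above describes.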

Filters narrow which tests each monitor runs:

pw_grep — match test names (--grep flag in Playwright)
pw_project — run a specific Playwright project (e.g. chromium, firefox)
pw_tag — filter by @tag annotation in test titles

Terminal

oackctl config-deploy --config oack.config.json

# Deploying 4 check suites...
#   PokéStore Login .............. created (monitor abc12345)
#   PokéStore Chromium Only ...... created (monitor bcd23456)
#   PokéStore Critical Flows ..... created (monitor cde34567)
#   PokéStore Full Suite ......... created (monitor def45678)
# Done.

Alert Channels

Alert channels define where notifications are sent when a monitor changes state. Supported channel types:

Channel	Free	Pro	Business
Email	Yes	Yes	Yes
Slack	-	Yes	Yes
Discord	-	Yes	Yes
Telegram	-	Yes	Yes
Webhooks	-	Yes	Yes
PagerDuty	-	Yes	Yes
SMS & Calls	-	-	Coming soon

Telegram uses a one-click deep-link flow — no bot tokens or chat IDs to configure manually. Webhook payloads include HMAC signatures for verification.

Alert Behavior

Monitors notify only explicitly linked alert channels. If no channels are linked, the monitor stays silent.

Alert events are dispatched on two transitions:

DOWN — fired when the failure threshold is reached. Includes monitor URL, status code, and error details.
Recovery — fired when the recovery threshold is reached. Includes downtime duration.

Incident Lifecycle

Incidents track the full lifecycle of an outage or degradation — from detection to resolution. They can be created automatically from monitor failures or manually by any team member.

Every incident moves through a defined set of statuses:

Status	Meaning
Draft	Incident created but not yet declared. Allows teams to assess before notifying stakeholders.
Investigating	Team is actively looking into the issue. Escalation timers begin if an escalation policy is attached.
Identified	Root cause found. Responders are working on a fix.
Monitoring	Fix deployed. Team is watching to confirm stability before closing.
Resolved	Incident closed. Duration is calculated automatically from declared_at to resolved_at.

Each incident carries linked monitors, severity, tags, and a timeline of updates. Status page subscribers are notified automatically when an incident is published.

On-Call Scheduling

On-call schedules define who gets paged when an incident is triggered. Create rotation schedules so the right engineer is automatically notified — no manual routing required.

Key concepts

Rotations — Define a recurring schedule (daily, weekly, or custom) that cycles through team members.
Overrides — Temporarily replace the scheduled on-call for vacations, swaps, or out-of-band coverage.
Handoffs — Automatic transition between shifts with configurable overlap to ensure no gaps in coverage.
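
As a rough illustration of how rotations and overrides combine, the sketch below resolves who is on call at a given instant. It assumes fixed-length shifts and that an override simply wins over the rotation; handoff overlap is not modelled, and all names are hypothetical:

```typescript
interface Rotation {
  members: string[];
  startMs: number; // when the first member's first shift begins
  shiftMs: number; // length of one shift
}

interface Override {
  member: string;
  fromMs: number; // inclusive
  toMs: number; // exclusive
}

function onCallAt(rotation: Rotation, overrides: Override[], atMs: number): string {
  // An active override replaces whoever the rotation would have picked.
  const active = overrides.find((o) => atMs >= o.fromMs && atMs < o.toMs);
  if (active !== undefined) return active.member;

  const n = rotation.members.length;
  if (n === 0) throw new Error("rotation has no members");
  const shiftsElapsed = Math.floor((atMs - rotation.startMs) / rotation.shiftMs);
  const index = ((shiftsElapsed % n) + n) % n; // stays valid before startMs
  return rotation.members[index];
}
```

With a weekly rotation of three engineers, the second engineer is on call during the second week unless an override covers that moment.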

Escalation Policies

Escalation policies ensure incidents are never missed. If the primary on-call doesn't acknowledge within a configurable timeout, the incident automatically escalates to the next level.

How it works

1. Incident triggered — The on-call engineer at level 1 of the escalation policy is notified via their preferred channels.
2. No acknowledgment — If the no-ack timeout expires (e.g., 5 minutes), the incident escalates to level 2.
3. Acknowledgment — Acknowledging stops the escalation timer. The responder owns the incident.
4. Further escalation — Additional levels can notify team leads, managers, or entire channels as a last resort.

Escalation events are recorded in the incident timeline, creating a full audit trail of who was notified, when, and whether they responded.
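
The steps above can be sketched as a function that computes which level should currently be notified. This is an assumption-laden illustration: each level is given its own no-ack timeout, acknowledgment freezes escalation at the level reached, and `currentLevel` returns a zero-based index (so level 1 in the text is index 0):

```typescript
interface EscalationLevel {
  name: string;
  noAckTimeoutMs: number; // how long to wait at this level before escalating
}

function currentLevel(
  levels: EscalationLevel[],
  triggeredAtMs: number,
  nowMs: number,
  acknowledgedAtMs: number | null = null,
): number {
  // Acknowledging stops the timer, so time after the ack does not count.
  const stopMs = acknowledgedAtMs === null ? nowMs : Math.min(acknowledgedAtMs, nowMs);
  let remaining = stopMs - triggeredAtMs;
  let index = 0;
  // The last level has nowhere to escalate to, so the loop stops there.
  while (index < levels.length - 1 && remaining >= levels[index].noAckTimeoutMs) {
    remaining -= levels[index].noAckTimeoutMs;
    index += 1;
  }
  return index;
}
```

With timeouts of 5 and 10 minutes on the first two levels, an unacknowledged incident sits at the second level after 6 minutes and reaches the third after 15.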

War Rooms & Post-Mortems

War rooms provide a shared space for incident responders to coordinate in real time. Post updates, link monitors, tag team members, and track status transitions — all in one timeline.

After an incident is resolved, generate a post-mortem report that includes:

Timeline — Chronological record of all escalation events, status changes, and team comments.
Impact — Duration, affected monitors, and severity.
Root cause — Document what went wrong and why.
Action items — Track follow-up tasks to prevent recurrence.

Uptime, MTBF & MTTR

Oack computes three reliability metrics from monitor status change history:

Uptime % — percentage of time the monitor was in the UP state within the selected window.
MTBF — Mean Time Between Failures. Average duration between consecutive DOWN incidents.
MTTR — Mean Time To Recovery. Average duration of DOWN incidents before recovery.

Metrics are available over 7-day, 30-day, 90-day, and 365-day windows.
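
A sketch of how these three metrics could be derived from a list of outages inside one window. The definitions here are assumptions: outages are sorted, non-overlapping, and fully inside the window, and MTBF is measured from the recovery of one outage to the start of the next. Oack's exact formulas may differ:

```typescript
interface Outage {
  downAtMs: number;
  upAtMs: number;
}

interface Reliability {
  uptimePct: number;
  mttrMs: number | null; // null when there were no outages
  mtbfMs: number | null; // null when there are fewer than two outages
}

function reliability(outages: Outage[], windowStartMs: number, windowEndMs: number): Reliability {
  const windowMs = windowEndMs - windowStartMs;
  const downMs = outages.reduce((sum, o) => sum + (o.upAtMs - o.downAtMs), 0);

  // Sum the healthy gaps between consecutive outages.
  let gapMs = 0;
  for (let i = 1; i < outages.length; i++) {
    gapMs += outages[i].downAtMs - outages[i - 1].upAtMs;
  }

  return {
    uptimePct: ((windowMs - downMs) * 100) / windowMs,
    mttrMs: outages.length > 0 ? downMs / outages.length : null,
    mtbfMs: outages.length > 1 ? gapMs / (outages.length - 1) : null,
  };
}
```

Two 50 ms outages in a 1000 ms window give 90% uptime and an MTTR of 50 ms; the MTBF is the single gap between them.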

Probe Aggregation

For time ranges longer than 12 hours, probes are automatically aggregated into time buckets using SQL-level statistical functions.

Available aggregation functions: avg, median, min, max, p75, p90, p95, p99.

Bucket size scales with range: 5m buckets for 12-24h, up to 12h buckets for 90d+. Each bucket includes all six timing phases, probe count, and error count. Maximum 1,000 buckets per query.
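
The platform performs this aggregation in SQL; the sketch below shows the same idea in application code for a single timing value. It is illustrative only: the field names are hypothetical, only `avg` and `p95` are computed, and the nearest-rank percentile method is an assumption (SQL engines often interpolate instead):

```typescript
interface ProbePoint {
  atMs: number; // probe timestamp
  totalMs: number; // end-to-end latency
  ok: boolean;
}

interface Bucket {
  startMs: number;
  count: number;
  errors: number;
  avg: number;
  p95: number;
}

// Nearest-rank percentile over an ascending-sorted, non-empty array.
function percentile(sorted: number[], p: number): number {
  const rank = Math.max(1, Math.ceil((p / 100) * sorted.length));
  return sorted[rank - 1];
}

function aggregate(probes: ProbePoint[], bucketMs: number): Bucket[] {
  const groups = new Map<number, ProbePoint[]>();
  for (const probe of probes) {
    // Align each probe to the start of its bucket.
    const startMs = Math.floor(probe.atMs / bucketMs) * bucketMs;
    const group = groups.get(startMs) ?? [];
    group.push(probe);
    groups.set(startMs, group);
  }
  return Array.from(groups.entries())
    .sort((a, b) => a[0] - b[0])
    .map(([startMs, group]) => {
      const values = group.map((p) => p.totalMs).sort((a, b) => a - b);
      return {
        startMs,
        count: group.length,
        errors: group.filter((p) => !p.ok).length,
        avg: values.reduce((sum, v) => sum + v, 0) / values.length,
        p95: percentile(values, 95),
      };
    });
}
```

A real bucket would carry the same statistics for each of the six timing phases rather than for the total alone.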

TCP Telemetry

Every probe captures kernel-level TCP_INFO metrics with negligible overhead:

RTT (Round-Trip Time) — Smoothed RTT and RTT variance as seen by the kernel.
Retransmits — Total retransmitted segments during the connection.
Congestion window — TCP congestion window size, indicating bandwidth capacity.
Segment counters — Segments sent and received during the exchange.

Performance Percentiles

When you open a probe's detail view, Oack computes a percent rank for each latency fraction — telling you where this probe sits relative to all successful probes for the same monitor.

Percentile	Interpretation
0 – 50	Faster than average. This fraction performed better than at least half of all probes.
50 – 75	Normal range. Slightly above median but within typical variance.
75 – 90	Above average. This fraction was slower than most probes — worth noting but may not indicate a problem.
90 – 100	Anomalous. This fraction was slower than 90%+ of probes. Likely indicates a real issue.
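
To make the ranking concrete, here is a small sketch of a percent rank and the interpretation bands from the table. It assumes the rank is the share of reference probes that were strictly faster, and that a value on a band boundary falls into the upper band; the exact SQL function Oack uses is not stated here:

```typescript
// Percentage (0-100) of reference values strictly lower than `value`.
function percentRank(value: number, reference: number[]): number {
  if (reference.length === 0) return 0;
  const below = reference.filter((v) => v < value).length;
  return (below * 100) / reference.length;
}

// Map a percent rank onto the bands from the table above.
function interpretRank(rank: number): string {
  if (rank < 50) return "faster than average";
  if (rank < 75) return "normal range";
  if (rank < 90) return "above average";
  return "anomalous";
}
```

Against ten reference probes of 10 to 100 ms in steps of 10, a 95 ms fraction ranks at 90 and is flagged as anomalous.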

Latency fractions

Fraction	What it measures
dns_ms	DNS resolution time
connect_ms	TCP connection establishment
tls_ms	TLS handshake (null for plain HTTP)
send_ms	Time to send the request
wait_ms	Time to first byte (TTFB) — server processing time
receive_ms	Time to download the response body
total_ms	End-to-end request duration (sum of all fractions)
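
The fractions in the table can be modelled as a simple record. The sketch below sums them into a total and finds the slowest phase; the field names follow the table, while the `Timing` type and both helper functions are illustrative:

```typescript
interface Timing {
  dns_ms: number;
  connect_ms: number;
  tls_ms: number | null; // null for plain HTTP
  send_ms: number;
  wait_ms: number;
  receive_ms: number;
}

// Sum of all fractions; a null TLS phase contributes nothing.
function totalMs(t: Timing): number {
  return t.dns_ms + t.connect_ms + (t.tls_ms ?? 0) + t.send_ms + t.wait_ms + t.receive_ms;
}

// Name of the fraction that took the longest.
function slowestFraction(t: Timing): keyof Timing {
  const names: (keyof Timing)[] = [
    "dns_ms", "connect_ms", "tls_ms", "send_ms", "wait_ms", "receive_ms",
  ];
  let slowest: keyof Timing = "dns_ms";
  for (const name of names) {
    if ((t[name] ?? 0) > (t[slowest] ?? 0)) slowest = name;
  }
  return slowest;
}
```

Knowing the slowest fraction is a quick first step when reading a probe: a dominant `wait_ms` points at the server, a dominant `dns_ms` at name resolution.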

Time windows

Each fraction is ranked across four time windows:

1 day — Detects recent anomalies.
7 days — Weekly baseline.
30 days — Monthly baseline.
90 days — Long-term baseline.

CDN Enrichment (Cloudflare)

When your target sits behind Cloudflare, Oack streams edge logs directly into probe details using Cloudflare Instant Logs. Each probe is enriched with CDN-level context automatically.

Requirement: The Cloudflare zone must be on a Business plan or higher. Instant Logs is not available on Free or Pro zones.

What each probe captures

Edge PoP — The Cloudflare data center that served the request.
Cache status — HIT, MISS, DYNAMIC, EXPIRED, or other cf-cache-status values.
Edge timing — Edge TTFB and origin response time from the Cloudflare perspective.
CDN GEO — Geographic location of the edge node that handled the request.

Setup

1. Go to Account Settings → Integrations.
2. Add a Cloudflare Zone integration with your zone ID and an API token that has the Logs:Read permission.
3. Enrichment starts automatically for any monitor whose target matches the configured zone.

Probe Sharing

Share probe data with anyone using permalink share links.

Time range — Pick exact start and end timestamps.
Expiration — 1 hour, 24 hours, 7 days, 30 days, or 1 year.
Access mode — Public or authenticated links.
View count — Every share link tracks views.

Redaction Groups

When creating a share link you can hide sensitive data. Redaction is applied server-side.

Redaction Group	Fields hidden
Monitor name	Replaced with a generic label
Checker IP	Checker public IP address
Source ASN	Source AS number and network name
HTTP bodies & auth	Request/response bodies and authorization headers

Network Checker

A network checker is an agent that runs on your infrastructure, connects to Oack, and performs HTTP health checks against your monitors.

Shared — available to all accounts. Oack runs shared checkers in multiple regions.
Dedicated — private to your account. You deploy and manage the checker binary.

Checker Installation

The checker binary supports Linux (amd64, arm64), macOS (Intel & Apple Silicon), and FreeBSD (amd64, arm64).

	Linux	FreeBSD	macOS
Binary	✓ amd64 / arm64	✓ amd64 / arm64	✓ Intel / Apple Silicon
Package (deb/rpm)	✓ amd64 / arm64	-	-
Docker	✓ amd64 / arm64	-	✓ via Docker Desktop
Homebrew	✓	-	✓

Homebrew (macOS / Linux)

Terminal

brew tap oack-io/tap
brew install network-tester

Shell script

Terminal

curl -sSfL "https://raw.githubusercontent.com/oack-io/network-tester/refs/heads/main/install-network-tester.sh" | bash

Docker

Terminal

docker pull oackio/network-tester:latest

mkdir -p $HOME/.net-checker-data
docker run --rm \
    --cap-add NET_RAW \
    -v $HOME/.net-checker-data:/data \
    oackio/network-tester:latest \
    --token-db /data/tokens.db --mode shared

MCP (AI-Assisted Troubleshooting)

Oack exposes a Model Context Protocol server that lets AI agents read your monitoring data. All MCP tools are read-only.

Claude Code config

{
  "mcpServers": {
    "oack": {
      "type": "http",
      "url": "https://api.oack.io/mcp/"
    }
  }
}

To allow Claude to use all Oack MCP tools without permission prompts:

Terminal

/permissions add mcp__oack__* "allow all Oack MCP tools"

CLI (oackctl)

oackctl is the official command-line interface for the Oack platform.

Install via Homebrew

Terminal

brew tap oack-io/tap
brew install oackctl

Install via shell script

Terminal

curl -sSfL "https://raw.githubusercontent.com/oack-io/oackctl/refs/heads/main/install-oackctl.sh" | bash

Quick start

Terminal

# Authenticate (opens browser for device flow)
oackctl login

# List your teams
oackctl teams list

# List monitors in a team
oackctl monitors list --team <team-id>

# Create a monitor
oackctl monitors create --team <team-id> \
  --name "Production API" \
  --url "https://api.example.com/health" \
  --interval 60

# View probe results
oackctl probes list --team <team-id> --monitor <monitor-id> --limit 10

REST API

All platform functionality is available through the REST API at https://api.oack.io/api/v1/. Browse the full Swagger documentation.

The OpenAPI spec is available at https://api.oack.io/openapi.json. Import it into Postman, Insomnia, or any OpenAPI-compatible tool:

Postman: Import → Link → https://api.oack.io/openapi.json

Terraform Provider

Manage your Oack monitoring infrastructure as code with the official Terraform provider. Create teams, monitors, alert channels, status pages, and PagerDuty integrations — all in version-controlled HCL.

Installation

main.tf

terraform {
  required_providers {
    oack = {
      source  = "oack-io/oack"
      version = "~> 0.1"
    }
  }
}

provider "oack" {
  api_key    = var.oack_api_key    # or OACK_API_KEY env var
  account_id = var.oack_account_id # or OACK_ACCOUNT_ID env var
}

Available resources

Resource	Description
`oack_team`	Teams that own monitors, channels, and API keys
`oack_monitor`	HTTP/HTTPS monitors with SSL/domain expiry, latency thresholds, checker preferences
`oack_alert_channel`	Slack, Email, Webhook, Telegram, Discord, PagerDuty channels
`oack_monitor_alert_channel_link`	Route alerts from monitors to channels
`oack_status_page`	Public or password-protected status pages with custom branding
`oack_status_page_component`	Components and groups on status pages
`oack_status_page_watchdog`	Auto-create/resolve incidents when monitors change health
`oack_pagerduty_integration`	Two-way PagerDuty incident sync
`oack_external_link`	Quick links to Grafana, Datadog, or other dashboards
`oack_team_api_key`	Team-scoped API keys for CI/CD and deploy events

Example: full-stack setup

main.tf

resource "oack_team" "production" {
  name = "Production"
}

resource "oack_monitor" "api" {
  team_id           = oack_team.production.id
  name              = "API Health"
  url               = "https://api.example.com/health"
  check_interval_ms = 30000
  ssl_expiry_enabled    = true
  domain_expiry_enabled = true
}

resource "oack_alert_channel" "slack" {
  team_id = oack_team.production.id
  name    = "Engineering Slack"
  type    = "slack"
  config  = jsonencode({ webhook_url = var.slack_webhook })
}

resource "oack_monitor_alert_channel_link" "api_slack" {
  team_id    = oack_team.production.id
  monitor_id = oack_monitor.api.id
  channel_id = oack_alert_channel.slack.id
}

See the full GitHub repository for progressive examples and resource documentation.

Account Roles

Every user in an account has one of five roles:

Role	Description
Owner	Full control. Manage subscription, transfer ownership, delete account.
Admin	Create/manage teams, monitors, alert channels. Invite/remove members.
Billing Admin	View/manage subscription and billing. Read-only access to teams.
Member	Create teams and monitors, manage alert channels, invite team members.
Guest	Read-only access. Default role for newly invited users.

Team Roles

Role	Description
Owner	Full control. Delete team, transfer ownership.
Admin	Create/manage monitors and alert channels. Manage members.
Member	View monitors/probes/metrics. Create share links. Cannot modify monitors.

Permissions Summary

Action	Min Account Role	Min Team Role
View monitors & probes	Guest	Member
Create share links	Member	Member
Create/update/delete monitors	Member	Admin
Manage alert channels	Member	Admin
Invite account members	Admin	-
Manage subscription	Owner / Billing Admin	-
Delete account	Owner	-
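
The table can be read as a pair of minimum-role checks. The sketch below encodes a few rows under some assumptions: roles are strictly ordered, both the account and the team minimum must be met, and Billing Admin counts as Guest for team resources (it has read-only access to teams). The names are hypothetical and this is not Oack's authorization code:

```typescript
type AccountRole = "owner" | "admin" | "billing_admin" | "member" | "guest";
type TeamRole = "owner" | "admin" | "member";

const ACCOUNT_RANK: Record<AccountRole, number> = {
  guest: 0,
  billing_admin: 0, // assumption: read-only on teams, like Guest
  member: 1,
  admin: 2,
  owner: 3,
};

const TEAM_RANK: Record<TeamRole, number> = { member: 0, admin: 1, owner: 2 };

function canCreateShareLinks(account: AccountRole, team: TeamRole): boolean {
  return ACCOUNT_RANK[account] >= ACCOUNT_RANK.member && TEAM_RANK[team] >= TEAM_RANK.member;
}

function canManageMonitors(account: AccountRole, team: TeamRole): boolean {
  return ACCOUNT_RANK[account] >= ACCOUNT_RANK.member && TEAM_RANK[team] >= TEAM_RANK.admin;
}

function canManageSubscription(account: AccountRole): boolean {
  return account === "owner" || account === "billing_admin";
}
```

Billing is handled as a special case rather than a rank because Billing Admin can manage the subscription but cannot modify monitors.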

Plan Comparison

Feature	Free	Pro ($29/mo)	Business ($249/mo)
Monitors	10	100	500
Check interval	5 min	60 sec	30 sec
Teams	1	5	50
Members	3	20	Unlimited
Dedicated checkers	-	5	Unlimited
Probe retention	7 days	90 days	365 days
SSL & domain monitoring	-	Yes	Yes
Alert channels	Email	All standard	All + SMS (soon)
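
The monitor and interval limits from the table lend themselves to a quick pre-flight check before creating a monitor. This is a sketch with hypothetical names; the platform enforces its own limits regardless:

```typescript
type Plan = "free" | "pro" | "business";

interface PlanLimits {
  maxMonitors: number;
  minIntervalSec: number;
}

// Values taken from the Monitors and Check interval rows of the table.
const PLAN_LIMITS: Record<Plan, PlanLimits> = {
  free: { maxMonitors: 10, minIntervalSec: 300 },
  pro: { maxMonitors: 100, minIntervalSec: 60 },
  business: { maxMonitors: 500, minIntervalSec: 30 },
};

// Returns a list of problems; an empty list means the monitor fits the plan.
function checkNewMonitor(plan: Plan, intervalSec: number, existingMonitors: number): string[] {
  const limits = PLAN_LIMITS[plan];
  const problems: string[] = [];
  if (intervalSec < limits.minIntervalSec) {
    problems.push(`interval must be at least ${limits.minIntervalSec}s on the ${plan} plan`);
  }
  if (existingMonitors >= limits.maxMonitors) {
    problems.push(`monitor limit of ${limits.maxMonitors} reached on the ${plan} plan`);
  }
  return problems;
}
```

For instance, a 60-second interval is rejected on Free but accepted on Pro.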

See Pricing for full details.
