The Olostep MCP server gives any MCP-compatible AI client (Claude, Cursor, Windsurf, VS Code, Claude Code, etc.) 10 ready-to-use tools for the live web — scraping, search, AI answers with citations, batch jobs, site crawling, and URL discovery.

Scrape & extract

Pull markdown, HTML, JSON, or text from any URL with optional JS rendering

AI answers

Web-grounded answers with sources and structured output

Batch & crawl

Up to 10k URLs in parallel, or autonomously discover a whole site

Map & search

Find every URL on a site, or run parser-based web search

Before you start

You need an Olostep API key. Get one from the Olostep dashboard — the free tier covers personal use.

Pick a setup path

The fastest path for every client is the hosted endpoint at https://mcp.olostep.com/mcp: no installs, no Node, no Docker. Paste the URL and your API key. If you need the server process to run on your own machine (for example, behind a corporate proxy that blocks the hosted endpoint), every client also supports a local stdio install via npx. The local server still makes outbound requests to api.olostep.com, so it is not a fully offline option. Each section below shows both.
Hosted endpoint uses Authorization: Bearer YOUR_API_KEY. Local stdio uses OLOSTEP_API_KEY as an environment variable. Don’t mix them up — wrong auth mode is the #1 onboarding error.

Client setup

One-click install (recommended): use the "Add Olostep MCP server to Cursor" button, then replace YOUR_API_KEY in the resulting config with your real key.
Manual setup: create or edit .cursor/mcp.json in your project root (or ~/.cursor/mcp.json for a global config). For the hosted endpoint:
{
  "mcpServers": {
    "olostep": {
      "url": "https://mcp.olostep.com/mcp",
      "headers": {
        "Authorization": "Bearer YOUR_API_KEY"
      }
    }
  }
}
For a local stdio install:
{
  "mcpServers": {
    "olostep": {
      "command": "npx",
      "args": ["-y", "olostep-mcp"],
      "env": {
        "OLOSTEP_API_KEY": "YOUR_API_KEY"
      }
    }
  }
}
Requires Node.js 18+ on your machine.
Verify: Open Cursor → Settings → MCP. You should see olostep listed with 10 tools including scrape_website. If you see “Connected, 0 tools”, your API key is wrong.
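Since mixing the two auth modes is described above as the most common onboarding error, a small check run against a config entry can catch it before the client is launched. The following is a minimal sketch in Python, not part of Olostep's tooling: the function name `check_auth_mode` and the wording of the messages are illustrative assumptions, and it only inspects the keys shown in the two configs above.

```python
def check_auth_mode(server: dict) -> list:
    """Return a list of problems in one "mcpServers" entry (empty if none).

    Assumes the two shapes shown above: hosted entries carry "url" plus an
    Authorization header, local stdio entries carry "command" plus an env var.
    """
    problems = []
    is_hosted = "url" in server
    is_local = "command" in server
    if is_hosted == is_local:
        # Either both keys or neither key is present.
        return ["entry needs exactly one of 'url' (hosted) or 'command' (local stdio)"]

    if is_hosted:
        auth = server.get("headers", {}).get("Authorization", "")
        if not auth.startswith("Bearer "):
            problems.append("hosted entry needs an 'Authorization: Bearer ...' header")
        if "OLOSTEP_API_KEY" in server.get("env", {}):
            problems.append("hosted entry should carry the key in the header, not in 'env'")
    else:
        if not server.get("env", {}).get("OLOSTEP_API_KEY"):
            problems.append("local stdio entry needs OLOSTEP_API_KEY in 'env'")
        if "Authorization" in server.get("headers", {}):
            problems.append("local stdio entry should carry the key in 'env', not in a header")
    return problems


hosted = {
    "url": "https://mcp.olostep.com/mcp",
    "headers": {"Authorization": "Bearer YOUR_API_KEY"},
}
local = {
    "command": "npx",
    "args": ["-y", "olostep-mcp"],
    "env": {"OLOSTEP_API_KEY": "YOUR_API_KEY"},
}
# A hosted URL combined with the env-var style of auth: the mix-up warned about above.
mixed = {"url": "https://mcp.olostep.com/mcp", "env": {"OLOSTEP_API_KEY": "YOUR_API_KEY"}}

for name, entry in [("hosted", hosted), ("local", local), ("mixed", mixed)]:
    print(name, check_auth_mode(entry))
```

Both configs from this section pass with an empty list, while the mixed entry is reported twice: once for the missing header and once for the misplaced key.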

Picking the right tool

The MCP server exposes 10 tools. Use this decision tree to pick the right one — the agent uses the same reasoning:
| You want… | Use | Notes |
| --- | --- | --- |
| A specific page’s content | scrape_website or get_webpage_content | Set wait_before_scraping=2000–5000 for SPAs |
| A natural-language web answer with sources | answers | Returns AI synthesis + citations |
| Search results for a query | search_web | Parser-based, non-AI, structured |
| A list of URLs on a site | create_map | URL discovery only; does NOT scrape |
| URLs filtered by query | get_website_urls | Ranked by relevance to your search_query |
| Many known URLs at once | batch_scrape_urls + get_batch_results | Async: kicks off, then poll |
| A whole site or section | create_crawl + get_crawl_results | Async: follows links from a start URL |
Scraping a whole site? Use create_crawl, not batch_scrape_urls. Crawl discovers AND scrapes. Batch is for a known list of URLs you already have.

Tool details

scrape_website: Extract content from a single URL. Supports markdown, html, json, and text output. Optional parameters: country for geo-targeted requests, wait_before_scraping (0–10000 ms) for JS-heavy sites, and parser (e.g. @olostep/amazon-product) for structured extraction.
get_webpage_content: Lightweight markdown-only version of scrape_website. Use it when you just want clean markdown and don’t need format options.
search_web: Structured (parser-based) web search results for a query. Optional country for localized results. Returns JSON, not AI prose.
answers: AI-powered answer to a task, with sources and citations. Pass a json argument to get the answer in a specific shape, given either as a JSON schema or as a short natural-language description.
batch_scrape_urls: Async scrape of 2–10k URLs you already have. Returns a batch_id; call get_batch_results with it to fetch the content. Set wait_for_completion_seconds (up to 900) if you want a single blocking call instead of polling. Recommended: 60 for batches under 50 URLs, 300–600 for 50–1k, 0 (poll separately) for larger batches.
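The wait_for_completion_seconds recommendation above can be written down as a small helper. This is a sketch under stated assumptions: the function name is hypothetical, and where the text gives a 300–600 range the code simply picks the lower bound.

```python
def recommended_wait_seconds(url_count: int) -> int:
    """Map a batch size to a wait_for_completion_seconds value.

    Follows the recommendation in the text: 60 under 50 URLs, 300-600 for
    50 to 1,000 URLs, and 0 (poll separately) for anything larger.
    """
    if not 2 <= url_count <= 10_000:
        raise ValueError("batch_scrape_urls is described as taking 2 to 10,000 URLs")
    if url_count < 50:
        return 60
    if url_count <= 1_000:
        return 300  # assumption: lower end of the recommended 300-600 range
    return 0  # do not block; poll get_batch_results instead


for size in (10, 200, 5_000):
    print(size, recommended_wait_seconds(size))
```

All returned values stay under the documented 900-second cap.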
get_batch_results: Fetches the status and scraped content for a batch_id. Returns processing until done, then completed with the items array.
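The "kick off, then poll" pattern for batches might look like the loop below. It is a sketch, not the server's client API: `poll_batch` is a hypothetical helper, the tool call is passed in as a plain callable, and the response is assumed to be a dict with `status` and `items` keys (the text names the values processing and completed and an items array, but not the exact field names).

```python
import time
from typing import Callable


def poll_batch(
    get_batch_results: Callable[[str], dict],
    batch_id: str,
    interval_seconds: float = 5.0,
    timeout_seconds: float = 900.0,
    sleep: Callable[[float], None] = time.sleep,
) -> list:
    """Call get_batch_results until the batch is completed, then return its items."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        result = get_batch_results(batch_id)
        status = result.get("status")  # assumed field name
        if status == "completed":
            return result.get("items", [])  # assumed field name
        if status != "processing":
            raise RuntimeError(f"unexpected batch status: {status!r}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"batch {batch_id} still processing after {timeout_seconds}s")
        sleep(interval_seconds)


# Stand-in for the real tool call so the loop can be exercised offline.
_fake_responses = iter([
    {"status": "processing"},
    {"status": "processing"},
    {"status": "completed", "items": [{"url": "https://example.com"}]},
])


def fake_get_batch_results(batch_id: str) -> dict:
    return next(_fake_responses)


items = poll_batch(fake_get_batch_results, "batch_123", interval_seconds=0, sleep=lambda s: None)
print(items)
```

Passing the tool call and the sleep function in as arguments keeps the loop independent of any particular MCP client library and makes it easy to test.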
create_crawl: Async crawl that follows links from a start_url. Use include_url_patterns / exclude_url_patterns (glob syntax like /blog/**) to scope it. Returns a crawl_id; call get_crawl_results with it to fetch the pages.
get_crawl_results: Fetches the status and pages for a crawl_id. Supports pagination via cursor and items_limit (max 100 per call). Returns in_progress until done.
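Paging through crawl results with cursor and items_limit could be sketched as a generator. Several details here are assumptions rather than documented behavior: the helper name `iter_crawl_pages`, the `items` key, and especially the `next_cursor` key used to find the following page are hypothetical, and the sketch assumes the crawl has already finished.

```python
from typing import Callable, Iterator, Optional


def iter_crawl_pages(
    get_crawl_results: Callable[[str, Optional[str], int], dict],
    crawl_id: str,
    items_limit: int = 100,
) -> Iterator[dict]:
    """Yield every page of a finished crawl, following the cursor between calls."""
    if not 1 <= items_limit <= 100:
        raise ValueError("items_limit is capped at 100 per call")
    cursor: Optional[str] = None
    while True:
        page = get_crawl_results(crawl_id, cursor, items_limit)
        yield from page.get("items", [])  # assumed field name
        cursor = page.get("next_cursor")  # hypothetical field name
        if cursor is None:
            break


# Stand-in for the real tool call: two pages linked by a cursor.
_fake_pages = {
    None: {"items": [{"url": "https://example.com/a"}, {"url": "https://example.com/b"}],
           "next_cursor": "c2"},
    "c2": {"items": [{"url": "https://example.com/c"}]},
}


def fake_get_crawl_results(crawl_id: str, cursor: Optional[str], items_limit: int) -> dict:
    return _fake_pages[cursor]


crawled_urls = [p["url"] for p in iter_crawl_pages(fake_get_crawl_results, "crawl_1", items_limit=2)]
print(crawled_urls)
```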
create_map: Get a list of URLs on a site. URL discovery only; it does not scrape. Use it when you want to surface candidate URLs (e.g. to let the user pick a subset). Supports include_url_patterns / exclude_url_patterns and search_query.
get_website_urls: Like create_map, but URLs are ranked by relevance to a required search_query. Use it when you want the top N matching links on a site.

Troubleshooting

If the client shows “Connected, 0 tools”: your API key is invalid or rate-limited. Open the API keys dashboard and verify the key. If using the hosted endpoint, the header must be exactly Authorization: Bearer sk_... with no quotes around the value and no extra spaces.
Node.js isn’t installed (or not in your PATH). Install Node 18+ from nodejs.org, then restart your terminal and your MCP client. On Windows, switch to a CMD/PowerShell that has Node on the PATH.
You’re likely behind a corporate proxy or firewall blocking the host. Switch to the local stdio install (npx -y olostep-mcp) — it makes outbound requests to api.olostep.com instead, which is usually allowed.
The client cached the old config. Fully quit and relaunch — not just close the window. Claude Desktop in particular keeps running in the menu bar / system tray.
If npx errors out launching the server on Windows, use the CMD-wrapped form:
{
  "command": "cmd",
  "args": ["/c", "npx", "-y", "olostep-mcp"],
  "env": { "OLOSTEP_API_KEY": "YOUR_API_KEY" }
}
You hit the hosted endpoint without an auth header (or with the wrong format). Add the header to your client config exactly as shown under Client setup.
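The header-format rule stated at the top of this section (exactly `Bearer`, one space, then the key, with no quotes and no extra spaces) is easy to check mechanically. The following is an illustrative sketch. The function name is hypothetical, and it deliberately does not check the key's prefix or length, since the text does not specify them in full.

```python
import re

# "Bearer", exactly one space, then a key containing no whitespace.
_BEARER_PATTERN = re.compile(r"Bearer \S+")


def bearer_header_problems(value: str) -> list:
    """Return a list of formatting problems in an Authorization header value."""
    problems = []
    if value != value.strip():
        problems.append("leading or trailing whitespace")
    if '"' in value or "'" in value:
        problems.append("quote characters in the value")
    if not _BEARER_PATTERN.fullmatch(value.strip()):
        problems.append("expected 'Bearer <key>' separated by a single space")
    return problems


for candidate in ["Bearer YOUR_API_KEY", "Bearer  YOUR_API_KEY", '"Bearer YOUR_API_KEY"']:
    print(repr(candidate), bearer_header_problems(candidate))
```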

Recipes

Copy-paste prompts that work well with the tools:
  • Scrape a list of product URLs: “I have a CSV of 200 Amazon product URLs. Batch scrape them with parser=@olostep/amazon-product and return as JSON.”
  • Crawl a docs site: “Crawl https://stripe.com/docs with max_pages=50, follow_links=true, and include_url_patterns=['/docs/**']. Summarize each section as markdown.”
  • Find competitors: “Use answers to find the top 5 competitors to Notion for technical doc sites. Return name, homepage, and 1-line positioning.”
  • Map then scrape: “Run create_map on https://example.com filtered to /blog/**, then batch_scrape_urls on the top 20 results.”
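For readers who want to see what the "Map then scrape" recipe asks the agent to do between the two tool calls, here is a rough sketch of the selection step. It is an assumption-laden illustration: `select_for_batch` is a hypothetical helper, the `/blog/**` glob is approximated by a simple path-prefix test, and "top 20" is taken to mean the first 20 matches in the order the map returned them.

```python
from urllib.parse import urlparse


def select_for_batch(urls, path_prefix="/blog/", limit=20):
    """Pick up to `limit` unique URLs under `path_prefix`, keeping input order."""
    selected = []
    seen = set()
    for url in urls:
        if url in seen:
            continue
        seen.add(url)
        if urlparse(url).path.startswith(path_prefix):
            selected.append(url)
            if len(selected) == limit:
                break
    if len(selected) < 2:
        # batch_scrape_urls is described as taking 2 to 10k URLs.
        raise ValueError("need at least 2 matching URLs for a batch")
    return selected


mapped = [
    "https://example.com/",
    "https://example.com/blog/first-post",
    "https://example.com/about",
    "https://example.com/blog/second-post",
    "https://example.com/blog/first-post",  # duplicate, skipped
]
print(select_for_batch(mapped))
```

The resulting list is what would then be handed to batch_scrape_urls. With fewer than two matches, scrape_website on a single URL is the better fit.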

Source & versions