Scrape & extract
Pull markdown, HTML, JSON, or text from any URL with optional JS rendering
AI answers
Web-grounded answers with sources and structured output
Batch & crawl
Up to 10k URLs in parallel, or autonomously discover a whole site
Map & search
Find every URL on a site, or run parser-based web search
Before you start
You need an Olostep API key. Get one from the Olostep dashboard — the free tier covers personal use.Pick a setup path
The fastest path for every client is the hosted endpoint athttps://mcp.olostep.com/mcp. No installs, no Node, no Docker — just paste a URL and your API key.
If you need it to run fully local (offline use, corporate proxy, air-gapped), every client also supports a local stdio install via npx. Each section below shows both.
Hosted endpoint uses
Authorization: Bearer YOUR_API_KEY. Local stdio uses OLOSTEP_API_KEY as an environment variable. Don’t mix them up — wrong auth mode is the #1 onboarding error.Client setup
- Cursor
- Claude Code
- Claude Desktop
- VS Code
- Windsurf
- Docker
- Metorial
One-click install (recommended):
Replace
Requires Node.js 18+ on your machine.Verify: Open Cursor → Settings → MCP. You should see
Replace YOUR_API_KEY in the resulting config with your real key.Manual setup:Create or edit .cursor/mcp.json in your project root (or ~/.cursor/mcp.json for global):Local stdio install (optional)
Local stdio install (optional)
olostep listed with 10 tools including scrape_website. If you see “Connected, 0 tools”, your API key is wrong.Picking the right tool
The MCP server exposes 10 tools. Use this decision tree to pick the right one — the agent uses the same reasoning:| You want… | Use | Notes |
|---|---|---|
| A specific page’s content | scrape_website or get_webpage_content | Set wait_before_scraping=2000–5000 for SPAs |
| A natural-language web answer with sources | answers | Returns AI synthesis + citations |
| Search results for a query | search_web | Parser-based, non-AI, structured |
| A list of URLs on a site | create_map | URL discovery only — does NOT scrape |
| URLs filtered by query | get_website_urls | Ranked by relevance to your search_query |
| Many known URLs at once | batch_scrape_urls + get_batch_results | Async — kicks off, then poll |
| A whole site or section | create_crawl + get_crawl_results | Async — follows links from a start URL |
Tool details
scrape_website
scrape_website
Extract content from a single URL. Supports
markdown, html, json, text. Optional country for geo-targeted requests, wait_before_scraping (0–10000 ms) for JS-heavy sites, and parser (e.g. @olostep/amazon-product) for structured extraction.get_webpage_content
get_webpage_content
Lightweight markdown-only version of
scrape_website. Use when you just want clean markdown and don’t need format options.search_web
search_web
Structured (parser-based) web search results for a query. Optional
country for localized results. Returns JSON, not AI prose.answers
answers
AI-powered answer to a
task with sources and citations. Pass a json argument to get the answer in a specific shape — either a JSON schema or a short natural-language description.batch_scrape_urls
batch_scrape_urls
Async scrape of 2–10k URLs you already have. Returns a
batch_id — then call get_batch_results to fetch content. Set wait_for_completion_seconds (up to 900) if you want a single blocking call instead of polling. Recommended: 60 for batches under 50 URLs, 300–600 for 50–1k, 0 (poll separately) for larger batches.get_batch_results
get_batch_results
Fetches the status and scraped content for a
batch_id. Returns processing until done, then completed with the items array.create_crawl
create_crawl
Async crawl that follows links from a
start_url. Use include_url_patterns / exclude_url_patterns (glob syntax like /blog/**) to scope. Returns a crawl_id — then call get_crawl_results.get_crawl_results
get_crawl_results
Fetches the status and pages for a
crawl_id. Supports pagination via cursor and items_limit (max 100 per call). Returns in_progress until done.create_map
create_map
Get a list of URLs on a site. URL discovery only — does not scrape. Use when you want to surface candidate URLs (e.g. let the user pick a subset). Supports
include_url_patterns / exclude_url_patterns and search_query.get_website_urls
get_website_urls
Like
create_map, but URLs are ranked by relevance to a required search_query. Use when you want the top N matching links on a site.Troubleshooting
Server appears but shows 0 tools
Server appears but shows 0 tools
Your API key is invalid or rate-limited. Open the API keys dashboard and verify the key. If using the hosted endpoint, the header must be exactly
Authorization: Bearer sk_... — no quotes around the value, no extra spaces.`npx: command not found` or `command not found: olostep-mcp`
`npx: command not found` or `command not found: olostep-mcp`
Node.js isn’t installed (or not in your PATH). Install Node 18+ from nodejs.org, then restart your terminal and your MCP client. On Windows, switch to a CMD/PowerShell that has Node on the PATH.
Connection refused or DNS errors on `mcp.olostep.com`
Connection refused or DNS errors on `mcp.olostep.com`
You’re likely behind a corporate proxy or firewall blocking the host. Switch to the local stdio install (
npx -y olostep-mcp) — it makes outbound requests to api.olostep.com instead, which is usually allowed.Edited config but the tool list is stale
Edited config but the tool list is stale
The client cached the old config. Fully quit and relaunch — not just close the window. Claude Desktop in particular keeps running in the menu bar / system tray.
Windows-specific `npx` failures
Windows-specific `npx` failures
If
npx errors out launching the server on Windows, use the CMD-wrapped form:`401 Missing Authorization: Bearer <OLOSTEP_API_KEY>`
`401 Missing Authorization: Bearer <OLOSTEP_API_KEY>`
Recipes
Copy-paste prompts that work well with the tools:- Scrape a list of product URLs: “I have a CSV of 200 Amazon product URLs. Batch scrape them with
parser=@olostep/amazon-productand return as JSON.” - Crawl a docs site: “Crawl https://stripe.com/docs with
max_pages=50,follow_links=true, andinclude_url_patterns=['/docs/**']. Summarize each section as markdown.” - Find competitors: “Use
answersto find the top 5 competitors to Notion for technical doc sites. Return name, homepage, and 1-line positioning.” - Map then scrape: “Run
create_mapon https://example.com filtered to/blog/**, thenbatch_scrape_urlson the top 20 results.”