Features
The integration provides 6 powerful operations for automated web data extraction:
Scrape Website
Extract content from any single URL in multiple formats (Markdown, HTML, JSON, text)
Search
Search the Web and get structured results
Answers (AI)
Search the Web with AI and get structured answers with sources and citations
Batch Scrape URLs
Scrape up to 10k URLs at the same time. Perfect for large-scale data extraction
Create Crawl
Get the content of subpages of a URL. Autonomously discover and scrape entire websites
Create Map
Get all URLs on a website for site structure analysis and content discovery
Installation
1. Install the Node
Install the Olostep node package via npm:
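A typical install, assuming a self-hosted n8n instance (community nodes can also be installed from the n8n UI under Settings → Community Nodes):

```bash
npm install n8n-nodes-olostep
```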
2. Connect Your Account
When you first use the Olostep node in a workflow, you’ll need to configure credentials:
- Add the “Olostep Scrape” node to your workflow
- Click on the node to open its settings
- Click “Create New Credential” or select existing credentials
- Enter your Olostep API key
- Click “Save” to store the credential
Available Actions
Scrape Website
Extract content from a single URL. Supports multiple formats and JavaScript rendering.
Use Cases:
- Monitor specific pages for changes
- Extract product information from e-commerce sites
- Gather data from news articles or blog posts
- Pull content for content aggregation
Parameters:
- URL: Website URL to scrape (must include http:// or https://)
- Format: Markdown, HTML, JSON, or Plain Text
- Country: Country code for location-specific content (e.g., “US”, “GB”, “CA”)
- Wait Before Scraping: Wait time in milliseconds for JavaScript rendering (0-10000)
- Parser: Optional parser ID for specialized extraction (e.g., “@olostep/amazon-product”)
Output:
- Scrape ID
- Scraped URL
- Markdown Content
- HTML Content
- JSON Content
- Text Content
- Status
- Timestamp
- Screenshot URL (if available)
- Page Metadata
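For reference, a single scrape result might look like the sketch below. The field names mirror the output list above but are illustrative; inspect the node’s actual output in n8n to confirm the exact keys:

```json
{
  "scrape_id": "scrape_abc123",
  "url": "https://example.com",
  "markdown_content": "# Example Domain ...",
  "html_content": null,
  "json_content": null,
  "text_content": null,
  "status": "completed",
  "timestamp": "2024-01-01T09:00:00Z",
  "screenshot_url": null,
  "page_metadata": { "title": "Example Domain" }
}
```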
Monitor Competitor Pricing
Trigger: Schedule (Every day at 9 AM)
Action: Olostep - Scrape Website
- URL: Competitor product page
- Format: JSON
- Parser: @olostep/amazon-product
- Add price data to tracking spreadsheet
- Alert team about price changes
Extract and Save Blog Posts
Trigger: RSS Feed - New Item
Action: Olostep - Scrape Website
- URL: {{$json.link}}
- Format: Markdown
- Save article content to Notion database
Lead Enrichment
Trigger: Google Sheets - New Row
Action: Olostep - Scrape Website
- URL: Company website from sheet
- Format: Markdown
- Extract company information using AI
- Add enriched data back to sheet
Search
Search the Web for a given query and get structured results (non-AI, parser-based search results).
Use Cases:
- Automated research workflows
- Lead discovery and enrichment
- Competitive analysis
- Content research
Parameters:
- Query: Search query
Automated Research
Trigger: Schedule (Daily at 8 AM)
Action: Olostep - Search
- Query: “latest AI developments”
- Extract and format key information
- Store research findings
Lead Discovery
Trigger: Manual (Button)
Action: Olostep - Search
- Query: {{$json.searchTerm}}
- Store leads with contact information
Batch Scrape URLs
Scrape up to 10k URLs at the same time. Perfect for large-scale data extraction.
Use Cases:
- Scrape entire product catalogs
- Extract data from multiple search results
- Process lists of URLs from spreadsheets
- Bulk content extraction
Parameters:
- URLs: JSON array of objects with url and custom_id fields. Example: [{"url":"https://example.com","custom_id":"site1"}]
- Format: Markdown, HTML, JSON, or Plain Text (applied to all URLs)
- Country: Country code for location-specific scraping
- Wait Before Scraping: Wait time in milliseconds for JavaScript rendering
- Parser: Optional parser ID for specialized extraction
Output:
- Batch ID (use this to retrieve results later)
- Status
- Total URLs
- Created At
- Requested Format
- Country Code
- Parser Used
Scrape Product Catalog
Trigger: Webhook - Receive POST Request
Action: Code - Format URLs
- Convert CSV/list to JSON array format (see the Code node sketch after this example)
Action: Olostep - Batch Scrape URLs
- URLs: {{$json.urlArray}}
- Format: JSON
- Parser: @olostep/amazon-product
- Send batch ID to your system for retrieval
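A minimal Code node sketch for the “Format URLs” step above, assuming each incoming item carries a url field (rename it to match your source data):

```javascript
// n8n Code node ("Run Once for All Items"):
// build the JSON array of {url, custom_id} objects expected by Batch Scrape URLs.
const urls = $input.all().map((item, i) => ({
  url: item.json.url,          // assumed input field; adjust to your data
  custom_id: `item-${i + 1}`,  // any ID that lets you match results later
}));

// Expose the array for the next node, referenced as {{$json.urlArray}}.
return [{ json: { urlArray: JSON.stringify(urls) } }];
```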
Daily Content Monitoring
Trigger: Schedule - Every day at 6 AM
Action: Google Sheets - Read Rows
- Fetch URLs to monitor
- Convert to batch array format
- Process all URLs at once
- Notify team that scraping is complete
Create Crawl
Get the content of subpages of a URL. Autonomously discover and scrape entire websites by following links. Perfect for documentation sites, blogs, and content repositories.
Use Cases:
- Crawl and archive entire documentation sites
- Extract all blog posts from a website
- Build knowledge bases from web content
- Monitor website structure changes
Parameters:
- Start URL: Starting URL for the crawl (must include http:// or https://)
- Max Pages: Maximum number of pages to crawl
- Follow Links: Whether to follow links found on pages
- Format: Format for scraped content
- Country: Optional country code for location-specific crawling
- Parser: Optional parser ID for specialized content extraction
Output:
- Crawl ID (use this to retrieve results later)
- Object Type
- Status
- Start URL
- Maximum Pages
- Follow Links
- Created Timestamp
- Formats
Archive Documentation Site
Trigger: Schedule - Monthly on 1st at 12 AM
Action: Olostep - Create Crawl
- Start URL: https://docs.example.com
- Max Pages: 500
- Follow Links: true
- Format: Markdown
- Send crawl ID to your archive system
- Notify team that crawl is in progress
Competitor Content Analysis
Trigger: Schedule - Weekly on Monday at 9 AM
Action: Olostep - Create Crawl
- Start URL: Competitor blog URL
- Max Pages: 100
- Format: Markdown
- Wait for crawl to complete
- Store crawl data for analysis
Create Map
Extract all URLs from a website for content discovery and site structure analysis.
Use Cases:
- Build sitemaps and site structure diagrams
- Discover all pages before batch scraping
- Find broken or missing pages
- SEO audits and analysis
Parameters:
- URL: Website URL to extract links from (must include http:// or https://)
- Search Query: Optional search query to filter URLs (e.g., “blog”)
- Top N: Limit the number of URLs returned
- Include Patterns: Glob patterns to include specific paths (e.g., “/blog/**”)
- Exclude Patterns: Glob patterns to exclude specific paths (e.g., “/admin/**”)
Output:
- Map ID
- Object Type
- Website URL
- Total URLs Found
- URLs (JSON array)
- Search Query
- Top N Limit
Discover and Scrape
Trigger: Manual (Button)
Action: Olostep - Create Map
- URL: https://example.com
- Include Patterns: /products/**
- Top N: 500
- Parse URLs from map result (see the Code node sketch after this example)
Action: Olostep - Batch Scrape URLs
- URLs: {{$json.urls}}
- Format: JSON
- Add all product data to spreadsheet
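A sketch of the “Parse URLs from map result” step, assuming the map output exposes its URLs as a JSON array (per the Output fields listed above):

```javascript
// n8n Code node: turn the Create Map URL list into a Batch Scrape array.
const raw = $input.first().json.urls; // assumed field name from the map output
const mapUrls = Array.isArray(raw) ? raw : JSON.parse(raw);

const urlArray = mapUrls.map((url, i) => ({
  url,
  custom_id: `product-${i + 1}`,
}));
return [{ json: { urls: JSON.stringify(urlArray) } }];
```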
SEO Site Audit
Trigger: Schedule - Monthly
Action: Olostep - Create Map
- URL: Your website
- Top N: 1000
- Store all URLs for tracking
- Report total pages found
Popular Workflow Examples
E-commerce Price Monitoring
Monitor competitor prices and get instant alerts.
Content Aggregation
Aggregate content from multiple sources.
Lead Enrichment Pipeline
Enrich lead data with web information.
Research Automation
Automate research from multiple sources.
Social Media Monitoring
Track mentions and content.
Multi-Step Workflows
Complete Product Scraping Pipeline
Build a comprehensive product data pipeline:
Discover Product URLs
Use Create Map to find all product pages on the target website
- Include patterns: /products/**
- Exclude patterns: /cart/**, /checkout/**
Batch Process Products
Use Batch Scrape URLs to extract all product data
- Format: JSON
- Parser: Product-specific parser if available
Store in Database
Send batch ID to your system or wait and retrieve results
- Use Airtable, Google Sheets, or your database
SEO Content Strategy
Analyze competitors and plan content.
Specialized Parsers
Olostep provides pre-built parsers for popular websites. Use them with the Parser field:
Amazon Product
@olostep/amazon-product
Extract: title, price, rating, reviews, images, variants
Google Search
@olostep/google-search
Extract: search results, titles, snippets, URLs
Google Maps
@olostep/google-maps
Extract: business info, reviews, ratings, location
Extract Emails
@olostep/extract-emails
Extract: emails from pages, contact lists, and footers
Extract Socials
@olostep/extract-socials
Extract: social profile links (X/Twitter, GitHub, etc.)
Extract Calendars
@olostep/extract-calendars
Extract: calendar links (Google Calendar, ICS) from pages
Using Parsers
Simply add the parser ID to the Parser field, e.g., @olostep/amazon-product for Amazon product pages or @olostep/google-search for search results.
Integration with Popular Apps
Google Sheets
Perfect for data collection and tracking:
- Price tracking spreadsheets
- Lead enrichment databases
- Content inventory
- Competitor analysis sheets
Airtable
Build powerful databases with scraped data:
- Product catalogs
- Research databases
- Content calendars
- Link databases
Slack
Get instant notifications:
- Price drop alerts
- Content update notifications
- Error monitoring
- Daily digests
HubSpot / Salesforce
Enrich CRM data automatically:
- Lead enrichment
- Company research
- Competitive intelligence
- Account mapping
Notion
Build knowledge bases:
- Documentation mirrors
- Research repositories
- Content libraries
- Team wikis
Best Practices
Use Batch Processing for Multiple URLs
When scraping more than 3-5 URLs, use Batch Scrape URLs instead of multiple Scrape Website actions. Batch processing is:
- Much faster (parallel processing)
- More cost-effective
- Easier to manage
- Better for rate limits
Set Appropriate Wait Times
For JavaScript-heavy sites, use the “Wait Before Scraping” parameter:
- Simple sites: 0-1000ms
- Dynamic sites: 2000-3000ms
- Heavy JavaScript: 5000-8000ms
Use Specialized Parsers
Use pre-built parsers (e.g., Amazon, Google, and task-specific parsers from the Olostep Store like emails, socials, calendars):
- Get structured data automatically
- More reliable extraction
- No need for custom parsing
- Maintained by Olostep
Filter Before Scraping
Use n8n’s IF node to avoid unnecessary scrapes:
- Check if URL has changed
- Verify data hasn’t been scraped recently
- Apply business logic before scraping
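For instance, a Code node can drop items that were scraped recently, assuming a hypothetical lastScraped timestamp from your own tracking data:

```javascript
// n8n Code node: pass through only items not scraped in the last 24 hours.
const DAY_MS = 24 * 60 * 60 * 1000;
return $input.all().filter((item) => {
  const last = item.json.lastScraped; // hypothetical field from your tracking store
  return !last || Date.now() - new Date(last).getTime() > DAY_MS;
});
```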
Handle Async Operations
Batch, Crawl, and Map operations are asynchronous:
- Store the returned ID (batch_id, crawl_id, map_id)
- Use a Wait node if retrieving immediately
- Consider webhook callbacks for completion
- Set up separate workflows for retrieval
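As an illustration, a follow-up workflow (or a step after a Wait node) can fetch batch results with the stored ID. A minimal Node.js sketch, assuming a retrieval endpoint of the form documented in the Batches API (verify the exact path and response shape there):

```javascript
// Retrieve items for a stored batch ID. The endpoint path below is an assumption;
// confirm it against the Batches API documentation before relying on it.
const apiKey = process.env.OLOSTEP_API_KEY;
const batchId = process.env.BATCH_ID; // the ID stored when the batch was created

const res = await fetch(`https://api.olostep.com/v1/batches/${batchId}/items`, {
  headers: { Authorization: `Bearer ${apiKey}` },
});
if (!res.ok) throw new Error(`Retrieval failed: HTTP ${res.status}`);
console.log(await res.json());
```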
Store Results Properly
Choose the right storage based on your needs:
- Google Sheets: Simple tracking, team collaboration
- Airtable: Relational data, rich formatting
- Database: Large-scale, complex queries
- Notion: Knowledge base, documentation
Monitor and Alert
Set up monitoring for your scraping workflows:
- Use Error workflows in n8n
- Send alerts to Slack/Email on failures
- Track API usage in Olostep dashboard
- Log important metrics
Common Use Cases by Industry
E-commerce
- Price Monitoring: Track competitor pricing in real-time
- Product Research: Discover trending products and market gaps
- Inventory Tracking: Monitor stock availability
- Review Analysis: Aggregate and analyze customer reviews
Marketing & SEO
- Content Discovery: Find content opportunities
- Competitor Analysis: Track competitor strategies
- Backlink Research: Discover link opportunities
- Keyword Research: Extract keyword data from search results
Sales & Lead Generation
- Lead Enrichment: Enhance CRM data with web information
- Company Research: Gather company intelligence
- Contact Discovery: Find decision-makers
- Competitive Intelligence: Track competitor moves
Research & Analytics
- Data Collection: Gather data from multiple sources
- Market Research: Track industry trends
- Academic Research: Collect research data
- Price Intelligence: Analyze pricing strategies
Media & Publishing
- Content Aggregation: Curate content from multiple sites
- News Monitoring: Track news and mentions
- Social Media: Monitor social platforms
- Trend Detection: Identify trending topics
Troubleshooting
Authentication Failed
Error: “Invalid API key”
Solutions:
- Check API key from dashboard
- Ensure no extra spaces in API key
- Recreate the credential in n8n
- Verify API key is active
Scrape Returns Empty Content
Error: Content fields are empty
Solutions:
- Increase “Wait Before Scraping” time
- Check if website requires login
- Try different format (HTML vs Markdown)
- Verify URL is accessible
- Check if site blocks automated access
Batch Array Format Error
Error: “Invalid JSON format for batch array”
Solutions:
- Use the format: [{"url":"https://example.com","custom_id":"id1"}]
- Ensure proper JSON syntax
- Use Code node to format URLs correctly
- Test JSON with online validator
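When the error persists, a small validation step in a Code node can surface the problem before the batch call (a minimal sketch; urlArray is an assumed field name):

```javascript
// n8n Code node: fail fast with a readable error if the batch array is malformed.
const raw = $input.first().json.urlArray; // assumed field holding the batch array
let parsed;
try {
  parsed = typeof raw === 'string' ? JSON.parse(raw) : raw;
} catch (err) {
  throw new Error(`Batch array is not valid JSON: ${err.message}`);
}
if (!Array.isArray(parsed) || parsed.some((o) => typeof o.url !== 'string')) {
  throw new Error('Batch array must be a JSON array of objects with a url field');
}
return [{ json: { urlArray: JSON.stringify(parsed) } }];
```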
Rate Limit Exceeded
Error: “Rate limit exceeded”
Solutions:
- Space out workflow executions with Wait nodes
- Use batch processing instead of individual scrapes
- Upgrade your Olostep plan
- Check rate limit in dashboard
URL Not Scraped
Error: Specific URLs fail to scrape
Solutions:
- Verify URL format (include http:// or https://)
- Check if URL requires authentication
- Test URL in browser first
- Try with country parameter
- Contact support for blocked domains
n8n Advantages
Self-Hosted
n8n is self-hosted, giving you complete control over your workflows and data. No vendor lock-in, no data leaving your infrastructure.
No Task Limits
Unlike cloud-based automation platforms, n8n doesn’t impose task limits. Run as many workflows as you need without additional costs.
Open Source
n8n is open source, allowing you to customize and extend it to fit your specific needs.
Cost-Effective
Self-hosted n8n is free, with optional cloud hosting available. Only pay for the Olostep API usage.
Pricing
Olostep charges based on API usage, independent of n8n:
- Scrapes: Pay per scrape
- Batches: Pay per URL in batch
- Crawls: Pay per page crawled
- Maps: Pay per map operation
Support
Need help with the n8n integration?
Documentation
Browse complete API docs
Support Email
Email: info@olostep.com
n8n Community
Ask in n8n Community
Status Page
Check API status
Related Resources
Scrapes API
Learn about the Scrapes endpoint
Batches API
Learn about the Batches endpoint
Crawls API
Learn about the Crawls endpoint
Maps API
Learn about the Maps endpoint
Python SDK
Use Olostep with Python
LangChain Integration
Build AI agents with LangChain
Get Started
Ready to automate your web search, scraping, and crawling workflows?
Install the Node
Install n8n-nodes-olostep and start building automated workflows