Firecrawl Content Migrator

A powerful web scraping and content migration tool built with Next.js, TypeScript, and the Firecrawl API. Extract structured data from any website and export it in formats ready for your CMS, database, or data pipeline.

Repository: github.com/mendableai/firecrawl-migrator

What it does

Scrapes content from websites and exports structured data. Map a site, select URLs, define what data to extract, and export as CSV.

Key Features

  • Map website structure to discover all pages
  • Select specific URLs to scrape
  • Define custom fields (title, date, content, etc.)
  • Export data as CSV
  • Batch process multiple pages at once

Use Cases

  • Blog Migration: Extract posts, metadata, and content from any blog platform
  • E-commerce Data: Scrape product information, prices, and descriptions
  • News Archives: Collect articles with dates, authors, and categories
  • Documentation Sites: Extract technical documentation with proper structure
  • Content Audits: Analyze and export existing website content

Prerequisites

  • Node.js 18+ and npm
  • Firecrawl API key (get one at firecrawl.dev)

Quick Start

1. Clone the repository

git clone https://github.com/mendableai/firecrawl-migrator.git
cd firecrawl-migrator

2. Install dependencies

npm install

3. Get your Firecrawl API key

  • Sign up at firecrawl.dev
  • Navigate to your dashboard
  • Copy your API key

4. Configure environment variables

Create a .env.local file in the root directory:

touch .env.local

Add your Firecrawl API key to the file:

FIRECRAWL_API_KEY=fc-YOUR_ACTUAL_API_KEY_HERE

5. Run the development server

npm run dev

6. Open the application

Visit http://localhost:3000 in your browser

You should see the Firecrawl Content Migrator interface. If you see an API key error, double-check your .env.local file.

How It Works

Step 1: Map Website Structure

Enter a URL to analyze the website's structure. The mapping operation discovers all available pages and organizes them in a hierarchical tree view.
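The hierarchical tree view can be pictured as grouping discovered URLs by path segment. The sketch below is a hypothetical helper (not the tool's actual code) showing one way a flat list of mapped URLs might be turned into such a tree:

```typescript
// Build a nested tree from a flat list of URLs, grouping by path segment.
// Hypothetical sketch of how mapped URLs could be organized for display.
type TreeNode = { name: string; children: Map<string, TreeNode>; isPage: boolean };

function buildTree(urls: string[]): TreeNode {
  const root: TreeNode = { name: "/", children: new Map(), isPage: false };
  for (const url of urls) {
    const segments = new URL(url).pathname.split("/").filter(Boolean);
    let node = root;
    for (const seg of segments) {
      if (!node.children.has(seg)) {
        node.children.set(seg, { name: seg, children: new Map(), isPage: false });
      }
      node = node.children.get(seg)!;
    }
    node.isPage = true; // the final segment corresponds to an actual page
  }
  return root;
}
```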

Step 2: Select Content

Browse the interactive tree and select which pages to scrape. Use the built-in filters to:

  • Select all pages in a directory
  • Filter by URL patterns
  • Exclude categories, tags, or pagination
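The filtering above can be sketched as a small predicate over the mapped URL list. This is a hypothetical illustration (the function name and default exclude patterns are assumptions, not the app's real code):

```typescript
// Filter a list of discovered URLs before scraping: keep URLs matching an
// optional include pattern, and drop common non-content paths such as
// category, tag, and pagination pages. Hypothetical sketch.
function filterUrls(
  urls: string[],
  include?: RegExp,
  excludes: RegExp[] = [/\/category\//, /\/tag\//, /\/page\/\d+/]
): string[] {
  return urls.filter(
    (url) =>
      (!include || include.test(url)) &&
      !excludes.some((pattern) => pattern.test(url))
  );
}
```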

Step 3: Define Schema

Configure what data to extract from each page:

  • Default fields: title, date, content
  • Add custom fields: author, category, price, tags, etc.
  • Auto-detection: The tool can analyze pages and suggest fields
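A field schema could be represented as a simple list of field definitions, roughly like the sketch below. The shape is an assumption for illustration; the app's internal representation may differ:

```typescript
// A field definition as the schema step might represent it (hypothetical).
interface FieldDef {
  name: string;        // column name in the exported data
  description: string; // hint given to the extractor for this field
  required: boolean;
}

// Example schema for a blog migration, mixing default and custom fields.
const blogSchema: FieldDef[] = [
  { name: "title", description: "The post title", required: true },
  { name: "date", description: "Publication date", required: true },
  { name: "author", description: "Author's display name", required: false },
  { name: "content", description: "Full article body", required: true },
];
```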

Step 4: Extract & Export

Start the batch scraping process to extract structured data from all selected pages. Export results as:

  • CSV for spreadsheets and databases
  • JSON for APIs and applications
  • Custom formats for specific CMS platforms
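The CSV export boils down to serializing extracted records with proper quoting. A minimal sketch in the RFC 4180 style (hypothetical helper, not the app's actual exporter):

```typescript
// Serialize extracted records to CSV, quoting any field that contains a
// comma, a double quote, or a newline, and doubling embedded quotes.
function toCsv(rows: Record<string, string>[], columns: string[]): string {
  const escape = (value: string): string =>
    /[",\n]/.test(value) ? `"${value.replace(/"/g, '""')}"` : value;
  const header = columns.map(escape).join(",");
  const lines = rows.map((row) =>
    columns.map((col) => escape(row[col] ?? "")).join(",")
  );
  return [header, ...lines].join("\n");
}
```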

Troubleshooting

If you see "Firecrawl API key not configured":

  • Make sure you created the .env.local file
  • Check that your API key starts with fc-
  • Restart the development server
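The checks above can be expressed as a small validation helper. This is a hypothetical sketch mirroring the troubleshooting advice, not the app's actual error handling:

```typescript
// Sanity-check the Firecrawl API key: it must be present and, per the
// troubleshooting notes above, start with "fc-". Hypothetical helper.
function validateApiKey(key: string | undefined): string {
  if (!key) {
    throw new Error("Firecrawl API key not configured: set FIRECRAWL_API_KEY in .env.local");
  }
  if (!key.startsWith("fc-")) {
    throw new Error("Firecrawl API keys start with 'fc-'; check your .env.local");
  }
  return key;
}
```

In a Next.js app this would typically be called server-side with `process.env.FIRECRAWL_API_KEY`, after restarting the dev server so the env file is re-read.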

Development

Build for production

npm run build
npm start

Run linting

npm run lint

Contributing

Contributions are welcome. Fork the repository, make your changes, and submit a pull request.

License

MIT License - see LICENSE file for details
