In this guide, we'll scrape website data into Airtable using ParseHub. We'll use the Data Fetcher Airtable app to connect to ParseHub and import selected fields of the website data. We'll scrape event information from timeout.com, but you can use this method to scrape any website's data into Airtable.
ParseHub is a tool for extracting data from websites. You can download it here and sign up for a free account. You can also find ParseHub tutorials here to help you get started.
Install Data Fetcher from the Airtable app marketplace. After the app launches, sign up for a free Data Fetcher account by entering a password and clicking 'Sign up for free'.
On the home screen of the Data Fetcher app, click 'Create your first request'. Requests in Data Fetcher are how you import data to or send data from your Airtable base.
On the create request screen in Data Fetcher, for Application, select 'ParseHub'.
Copy and paste your personal API key from your ParseHub account into Data Fetcher below the label Authorization. Your API key authorizes Data Fetcher to read from your ParseHub account. You can find your ParseHub API key here.
For Endpoint, select 'Import data from a project's latest run'.
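Behind the scenes, this endpoint corresponds to ParseHub's REST API, which exposes each project's most recent completed run at `/projects/{project_token}/last_ready_run/data`. The sketch below, using only the Python standard library, shows roughly what such a call looks like; the project token and API key are placeholders, and Data Fetcher's exact internals are not public.

```python
# Sketch of the ParseHub call behind "Import data from a project's
# latest run". The endpoint path follows ParseHub's public API docs;
# the project token and API key below are placeholders.
from urllib.parse import urlencode

PARSEHUB_API = "https://www.parsehub.com/api/v2"

def latest_run_data_url(project_token: str, api_key: str, fmt: str = "json") -> str:
    """Build the URL for a project's latest completed run's data."""
    query = urlencode({"api_key": api_key, "format": fmt})
    return f"{PARSEHUB_API}/projects/{project_token}/last_ready_run/data?{query}"

# Fetching requires a real key; ParseHub returns gzipped JSON, e.g.:
# import urllib.request, gzip, json
# with urllib.request.urlopen(latest_run_data_url("tMYPROJECT", "MY_KEY")) as resp:
#     data = json.loads(gzip.decompress(resp.read()))
```

Data Fetcher makes this request for you; you never have to call the API directly.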
You can also add a Name for your request, e.g. 'Import Website Data'.
Click 'Save & Continue'.
On the next screen, select the ParseHub Project you want to fetch your data from.
Select the Output Table & View you want to import the data into.
Click 'Save & Run'.
This request will run, and the Response Field Mapping window will open. This is where you set which fields from the ParseHub project will map to which fields in the output table in Airtable.
You can easily find the fields you want to import using the 'Find field' search bar.
For this example we will import all of our fields.
You can choose whether to map these to existing Airtable fields or to create new ones. You can also choose the type of any new field (e.g. Single line text, Long text, Email, etc.).
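Conceptually, field mapping renames each scraped record's keys to your chosen Airtable field names and wraps the rows in the payload shape Airtable's REST API expects for record creation. The sketch below illustrates that transformation; the event field names are examples, not taken from the timeout.com project.

```python
# Sketch of what response field mapping does: rename each scraped row's
# keys to Airtable field names and wrap the rows in the batch-create
# payload shape used by Airtable's REST API. Field names are examples.
def map_records(scraped, field_map):
    """Convert ParseHub rows into an Airtable batch-create payload."""
    return {
        "records": [
            {"fields": {field_map[k]: v for k, v in row.items() if k in field_map}}
            for row in scraped
        ]
    }

events = [{"name": "Jazz Night", "date": "2023-10-20", "venue": "Blue Note"}]
payload = map_records(events, {"name": "Event Name", "date": "Date", "venue": "Venue"})
```

Keys missing from the mapping are simply dropped, which mirrors leaving a ParseHub field unmapped in the Response Field Mapping window.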
Click 'Save & Run'.
Data Fetcher will create any fields that need to be created in the output table, then run the request and import the scraped website data from ParseHub to Airtable. You can now choose to view the output table.
Instead of manually running a request every time you want updated data from a ParseHub project, you can use Data Fetcher's scheduled requests feature to import data automatically, e.g. every 15 minutes, hour, or day.
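A scheduled request is effectively a fetch-and-import run repeated on a fixed interval. The loop below is only an illustration of that idea, not Data Fetcher's implementation; `fetch_and_import` is a placeholder for the import step, and the interval labels are examples.

```python
# Rough equivalent of a scheduled request: re-run the import on a fixed
# interval. Data Fetcher handles this for you; fetch_and_import is a
# placeholder callable and the interval labels are illustrative.
import time

INTERVALS = {"Every 15 mins": 900, "Every hour": 3600, "Every day": 86400}

def run_on_schedule(fetch_and_import, interval_label, max_runs=None):
    """Call fetch_and_import repeatedly, sleeping between runs."""
    seconds = INTERVALS[interval_label]
    runs = 0
    while max_runs is None or runs < max_runs:
        fetch_and_import()
        runs += 1
        if max_runs is None or runs < max_runs:
            time.sleep(seconds)
```

In practice the scheduling toggle described below replaces all of this.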
To enable scheduling, you'll need both a paid Data Fetcher account and a paid ParseHub account.
In Data Fetcher, scroll to Schedule and click 'Upgrade'.
A new tab will open where you can select a plan and enter your payment details to upgrade.
Return to the Data Fetcher app and click 'I've done this'.
Under Schedule click '+ Authorize'.
A new window will now open for you to authorize the Airtable bases you need Data Fetcher to access.
It's recommended to select 'All current and future bases in all current and future workspaces' to avoid issues with unauthorized bases in the future.
Click 'Grant access'.
In the Data Fetcher interface, 'Schedule this request' is now toggled on.
Select how often you want the request to run, e.g. 'Every 15 mins'. Click 'Save' and the request will now run on the schedule and sync any new scraped website data automatically.
Oct 15, 2023 • Rosie Threlfall