If you need to extract data from Amazon and import it into Airtable, you can do it with the Apify API and the free Data Fetcher extension.
This is often referred to as data scraping (or web scraping) and describes the process of importing data from websites into files or spreadsheets. For this example, we are scraping data from Amazon into Airtable.
Follow this step-by-step tutorial to find out how to use the Apify Airtable integration, which can also be used to scrape data from other websites. You can read our guide on how to scrape Airbnb data into Airtable here.
Apify is a web scraping tool that enables you to collect data from any website. Apify has built-in tools to help you easily extract data such as Instagram profile info, or hashtags and users from TikTok videos.
Sign up for a free Apify account here using your email address, Google or GitHub profile.
You'll need to verify your email address and complete the requested onboarding info.
Use the Store search bar to locate and select the Amazon Product Scraper.
At the bottom of the next screen, select 'Start 14 days trial'.
Next, copy and paste the URL of the Amazon product (or product category) you want to scrape data from.
You have the option to set a maximum number of items to scrape, which is useful if you are scraping a whole Amazon category. For this example, we are using a single product, so skip this step.
Then select 'Save as new task'. This creates what Apify refers to as an 'Actor Task'.
Click 'Start', and Apify will start scraping the product data from your chosen Amazon product.
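If you would rather start the task from a script than click 'Start', the step above can be sketched in Python. This is a minimal sketch, assuming the Apify API v2 "run task" endpoint and a token passed as a query parameter. The task ID and token values are placeholders, and the helper names are our own, not part of Apify or Data Fetcher.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"


def build_task_run_url(task_id: str, token: str) -> str:
    """Return the URL that starts a new run of a saved Actor task."""
    # Task IDs may look like "username~task-name", so keep "~" unescaped.
    quoted_id = urllib.parse.quote(task_id, safe="~")
    query = urllib.parse.urlencode({"token": token})
    return f"{API_BASE}/actor-tasks/{quoted_id}/runs?{query}"


def start_task_run(task_id: str, token: str) -> dict:
    """Send a POST request to start the task and return the parsed JSON reply."""
    request = urllib.request.Request(build_task_run_url(task_id, token), method="POST")
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Calling start_task_run("my-user~amazon-task", "YOUR_API_TOKEN") with your own task ID and token should queue a run in the same way as the 'Start' button. Nothing is sent until you call the function.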
Next, set up a new Airtable base and install Data Fetcher via the Airtable marketplace. Sign up for a Data Fetcher account by entering a password and clicking 'Sign up for free'. Alternatively, you can use your Google login to create a new account. If you already have a Data Fetcher account, use the 'Have an account?' text in the bottom left of the screen to log in.
Data Fetcher is a powerful Airtable extension that can be used to import different types of data from APIs or websites into Airtable.
Data Fetcher requests are used to import and export data. You can create multiple requests within one installation of Data Fetcher in an Airtable base.
Click on 'Create your first request' from the Data Fetcher home screen.
On the create request screen, select 'Apify' for Application to use the Apify Airtable integration.
Click here to get your Personal API token from Apify and copy this to your clipboard using the copy button.
Paste the API token into Data Fetcher.
For Endpoint select 'Import results from an actor task's latest run'.
For Actor task, choose the Amazon scraper task you saved earlier.
Give your request a name such as 'Scrape Amazon Product Info' and click 'Save & Run'.
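For readers curious about what this request roughly corresponds to, here is a hedged Python sketch of fetching the dataset items from a task's latest run. It assumes the Apify API v2 "last run" dataset endpoint with the status and format query parameters. How Data Fetcher calls Apify internally is not documented here, so treat this as an illustration rather than its actual implementation. The function names and the task ID are placeholders.

```python
import json
import urllib.parse
import urllib.request


def build_last_run_items_url(task_id: str, token: str) -> str:
    """Return the URL for the dataset items of the task's last successful run."""
    query = urllib.parse.urlencode(
        {"token": token, "status": "SUCCEEDED", "format": "json"}
    )
    return (
        "https://api.apify.com/v2/actor-tasks/"
        f"{task_id}/runs/last/dataset/items?{query}"
    )


def fetch_last_run_items(task_id: str, token: str) -> list:
    """Download the scraped items as a list of dictionaries."""
    url = build_last_run_items_url(task_id, token)
    with urllib.request.urlopen(url, timeout=30) as response:
        return json.load(response)
```

Each item in the returned list would represent one scraped product, which is the data the field mapping step works with.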
The Response field mapping modal will open, where you choose which fields to import from your chosen Amazon product(s) and how they will map to your output table.
Click Filter all to remove any pre-selected fields, then use the Find field search bar to easily locate the fields you want to import.
For this example, we are going to import the following fields:
'Url', 'Title', 'Brand', 'Description', 'Stars'.
These will automatically be set to map to new fields that will be created in the output table.
Click 'Save & Run'.
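Conceptually, the field mapping keeps only the fields you chose and writes each one to a column in the output table. The sketch below shows that idea in Python. It assumes the scraper returns lower-case keys such as "url" and "stars", which may not match the real output exactly, and the sample product is made up purely for illustration.

```python
# Fields chosen in this example; lower-case key names are an assumption.
FIELDS_TO_IMPORT = ["url", "title", "brand", "description", "stars"]


def map_item_to_record(item: dict, fields=FIELDS_TO_IMPORT) -> dict:
    """Keep only the chosen fields and capitalise them as column names."""
    # Fields missing from the scraped item become empty (None) cells.
    return {field.capitalize(): item.get(field) for field in fields}


# Made-up sample item, not real Amazon data.
sample_item = {
    "url": "https://www.amazon.com/dp/EXAMPLE",
    "title": "Example product",
    "brand": "ExampleBrand",
    "stars": 4.5,
    "price": 19.99,  # not selected, so it is dropped
}

record = map_item_to_record(sample_item)
```

Here record contains the columns 'Url', 'Title', 'Brand', 'Description' and 'Stars'. The unselected price field is left out, and the missing description is left empty.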
The Apify Airtable integration will run, and you'll see the data for your Amazon product (or for the products in an Amazon category) in your output table.
At this point, you need to manually click 'Run' in Data Fetcher to import any updates to the Amazon product data. It's also possible to set the Apify Airtable integration to run at regular intervals.
There are two steps to automate this process, using both Apify's scheduling feature and Data Fetcher's scheduling feature.
In Apify, navigate to the Saved Tasks menu on the left and select your 'Amazon Scraper Task'.
Then choose Schedule from the menu on the top right of the screen.
You can enter a new name for My Schedule 1 if you wish.
By default the schedule is set to every day, but you can also choose hourly, weekly, yearly or choose your own schedule.
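If you choose your own schedule, it helps to know how these intervals look as cron expressions. The mapping below is a rough sketch, assuming the custom schedule accepts standard five-field cron syntax (minute, hour, day of month, month, day of week). The exact times Apify uses for its built-in presets may differ, and the preset names here are our own labels.

```python
# Standard five-field cron expressions: minute hour day-of-month month day-of-week.
SCHEDULE_PRESETS = {
    "hourly": "0 * * * *",  # at minute 0 of every hour
    "daily": "0 0 * * *",   # at midnight every day
    "weekly": "0 0 * * 0",  # at midnight every Sunday
    "yearly": "0 0 1 1 *",  # at midnight on 1 January
}


def cron_for(preset: str) -> str:
    """Look up the cron expression for one of the preset names above."""
    if preset not in SCHEDULE_PRESETS:
        raise ValueError(f"Unknown preset: {preset}")
    return SCHEDULE_PRESETS[preset]
```

For example, cron_for("daily") gives an expression that runs once a day at midnight, which you could adjust to a different hour by changing the second field.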
Next, you'll need to set up Data Fetcher's scheduling feature. You'll need to upgrade your account as this is a paid Data Fetcher feature. In Data Fetcher, scroll to Schedule and click 'Upgrade'.
Choose a plan from the different options depending on your needs and enter your payment details.
Back in Data Fetcher, click 'I've done this'.
Under Schedule click '+ Authorize'.
A new window will now open and prompt you to authorize the Airtable bases you need Data Fetcher to access.
By selecting 'All current and future bases in all current and future workspaces' you should avoid issues with unauthorized bases in the future.
Click 'Grant access'.
Back in Data Fetcher, you'll see Schedule this request is now toggled on.
Select a schedule for the Apify Airtable integration based on intervals of 'Minutes', 'Hours', 'Days' or 'Months'. Click 'Save', and the integration will automatically scrape product data from Amazon, then import any changes into your Airtable base on your chosen schedule.
May 1, 2022 • Rosie Threlfall • ParseHub, Web Scraping