Using the free Data Fetcher extension for Airtable, you can quickly and easily import any website's XML sitemap into your Airtable base.
An XML sitemap is essentially a text file listing the pages on a website. A sitemap helps Google and other search engines find and crawl a website's content.
XML is a markup language and file format used for storing and transporting data. In an XML sitemap, each page address sits in a <loc> element inside a <url> entry, and all entries are wrapped in a root <urlset> element.
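As a rough illustration of what an XML sitemap contains, the sketch below embeds a minimal sitemap following the sitemaps.org protocol and lists its page addresses using Python's standard library. The example.com URLs and the `extract_urls` helper are hypothetical and only serve the illustration; Data Fetcher handles this parsing for you.

```python
import xml.etree.ElementTree as ET

# Hypothetical sitemap content; real sitemaps use the same urlset > url > loc nesting.
SAMPLE_SITEMAP = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/</loc>
    <lastmod>2024-01-15</lastmod>
  </url>
  <url>
    <loc>https://example.com/blog/first-post</loc>
  </url>
</urlset>
"""

# Namespace declared by the sitemaps.org protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def extract_urls(sitemap_xml: str) -> list[str]:
    """Return the text of every <loc> element nested in a <url> entry."""
    root = ET.fromstring(sitemap_xml.encode("utf-8"))
    return [loc.text.strip() for loc in root.findall("sm:url/sm:loc", NS) if loc.text]


if __name__ == "__main__":
    for url in extract_urls(SAMPLE_SITEMAP):
        print(url)
```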
By importing a website's sitemap into Airtable, you can monitor new pages or blog posts on any website, whether your own or a competitor's.
Create an Airtable table to import your XML sitemap data into, and add a new 'Created time' field. This field lets you see exactly when each new sitemap URL was added.
Install Data Fetcher for your base using the Airtable marketplace. This is a free extension that enables us to import XML and many other types of data into Airtable.
Either create a free Data Fetcher account or sign in to your existing one using the 'Have an account' button on the bottom left of the screen.
It's easy to sign up with your Google account by selecting 'Continue with Google'. Alternatively, enter your email address and password. Your Data Fetcher account is separate from your Airtable account, and your details are kept secure.
Select 'Create your first request' from the Data Fetcher home screen. Setting up requests enables us to import data into Airtable from other applications, APIs or URLs.
On the Application drop-down menu, select 'Custom'. This allows us to create a custom request and connect to a URL.
Next, copy and paste the URL of the XML sitemap you would like to import into Airtable. If you need help locating the website's sitemap, there is more information here.
Make sure Output Table & View are set to the table containing the 'Created time' field, which is where your data will be imported. Enter the name 'Import XML Sitemap' in the box at the top of the screen. Adding a name is useful for keeping track of multiple Data Fetcher requests.
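One common way to locate a sitemap is the site's robots.txt file, which may contain one or more 'Sitemap:' lines. The sketch below pulls those lines out of robots.txt text. The sample content and the `find_sitemaps` name are hypothetical, and many sites simply publish the file at /sitemap.xml instead.

```python
def find_sitemaps(robots_txt: str) -> list[str]:
    """Collect the URLs declared on 'Sitemap:' lines of a robots.txt file."""
    sitemaps = []
    for raw_line in robots_txt.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and padding
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "sitemap" and value.strip():
            sitemaps.append(value.strip())
    return sitemaps


# Hypothetical robots.txt content for illustration.
SAMPLE_ROBOTS = """User-agent: *
Disallow: /admin/
Sitemap: https://example.com/sitemap.xml
"""

if __name__ == "__main__":
    print(find_sitemaps(SAMPLE_ROBOTS))
```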
Next, click 'Save & Run'.
The next screen is the Response field mapping window, where you choose which fields to import from the XML file into Airtable. Click 'filter all', then use the 'Find field' search bar to find and select the 'Urlset url loc' field. This field is needed to import each URL from your chosen sitemap.
You can choose whether to import each field of data into an existing Airtable field or create a new one. For this example, choose 'New field' and call it 'URL'.
Click 'Save & Run'.
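The 'Urlset url loc' name reads like the path from the sitemap's root element down to each page address. The sketch below derives labels of that shape from a small sample. How Data Fetcher actually builds its field names is an assumption here, and `field_labels`, `local_name` and the sample data are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical one-entry sitemap for illustration.
SAMPLE = b"""<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/</loc>
    <lastmod>2024-01-15</lastmod>
  </url>
</urlset>"""


def local_name(tag: str) -> str:
    """Strip the '{namespace}' prefix ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def field_labels(root) -> list[str]:
    """Return one space-joined path label per distinct leaf element."""
    labels = []

    def walk(node, trail):
        trail = trail + [local_name(node.tag)]
        children = list(node)
        if not children:
            # Assumed naming scheme: element names joined by spaces, first letter capitalised.
            label = " ".join(trail).capitalize()
            if label not in labels:
                labels.append(label)
        for child in children:
            walk(child, trail)

    walk(root, [])
    return labels


if __name__ == "__main__":
    print(field_labels(ET.fromstring(SAMPLE)))
```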
You can now view your Airtable table, where the new 'URL' field has been created and populated with all the URLs from your chosen sitemap. The 'Created time' field shows the date and time when each sitemap URL was added.
If you want changes in your XML file to stay synced with Airtable each time the Data Fetcher request is run, you can match records on any unique field in your table. 'URL' works as a unique field because each page in a sitemap has its own URL.
In Data Fetcher, open the Advanced settings options and find 'Update Based on Field'. Select your unique field as the Update Based on Field, which in this case is 'URL'.
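Conceptually, updating based on a unique field is an upsert: incoming rows whose key already exists update the stored record, and the rest are added as new records. A minimal sketch of that idea follows, using plain dictionaries in place of Airtable records. The `upsert` helper, the 'Lastmod' field and the sample rows are hypothetical and do not reflect Data Fetcher's internals.

```python
def upsert(table: list[dict], incoming: list[dict], key: str = "URL") -> list[dict]:
    """Merge incoming rows into table, matching on the unique key field."""
    index = {row[key]: row for row in table}
    for row in incoming:
        existing = index.get(row[key])
        if existing is not None:
            existing.update(row)       # same key: refresh the stored record
        else:
            new_row = dict(row)
            table.append(new_row)      # unseen key: add a new record
            index[row[key]] = new_row
    return table


if __name__ == "__main__":
    table = [{"URL": "https://example.com/", "Lastmod": "2024-01-15"}]
    incoming = [
        {"URL": "https://example.com/", "Lastmod": "2024-02-01"},
        {"URL": "https://example.com/blog/first-post", "Lastmod": "2024-02-01"},
    ]
    for row in upsert(table, incoming):
        print(row)
```

In this sketch, running the merge again with the same rows leaves the table unchanged, which is the behaviour a unique key is meant to provide.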
With the setup so far, you need to click 'Run' manually in Data Fetcher to import any new URLs from the XML file.
With Data Fetcher's scheduled requests, you can run the request automatically at regular intervals, so Data Fetcher periodically checks for changes in the XML file for you.
This is a paid Data Fetcher feature, so you will need to upgrade your account. To do this, in Data Fetcher, scroll to Schedule and click 'Upgrade'.
Select a plan from the options and enter your payment details to upgrade. Depending on how many Data Fetcher requests and users you require, there are different usage plans.
Return to Data Fetcher and click 'I've done this'.
Under Schedule click '+ Authorize'.
A window will open where you'll be prompted to authorize which Airtable bases you want Data Fetcher to have access to.
We recommend selecting 'All current and future bases in all current and future workspaces' to avoid any issues with unauthorized bases in the future.
Click 'Grant access'.
Back in Data Fetcher, you'll see that 'Schedule this request' is now toggled on.
Select a schedule based on intervals of 'Minutes', 'Hours', 'Days' or 'Months'. Click 'Save', and any new or amended URLs will automatically be imported into your Airtable base on your chosen schedule.