Extract URLs from Sitemap reads an XML sitemap and returns the URLs it lists. It is useful for SEO audits, competitive analysis, and bulk website monitoring.


When to Use It

Use this node to:

  • Get complete URL lists for SEO audits
  • Analyze competitor website structure
  • Monitor large websites for changes
  • Bulk check page status across entire sites
  • Feed URLs into scraping or analysis workflows

Inputs

Field | Type | Required | Description
XML Sitemap URL | Text | Yes | The sitemap.xml URL you want to extract URLs from
Limit | Number | No | Maximum number of URLs to extract (optional)

How It Works

This node reads an XML sitemap file and extracts the URLs listed in it, up to the optional Limit. Sitemaps are files that websites publish to tell search engines about their pages.
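
To make the extraction step concrete, here is a minimal Python sketch of reading URLs out of sitemap XML. It is an illustration only, not the node's actual implementation: the function name `extract_urls`, its `limit` parameter, and the sample data are assumptions chosen to mirror the inputs above.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol (version 0.9).
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def extract_urls(sitemap_xml, limit=None):
    """Return the <loc> values found in a sitemap, optionally capped at `limit`.

    Hypothetical helper for illustration; the node itself may work differently.
    """
    root = ET.fromstring(sitemap_xml)
    urls = []
    for loc in root.iter(f"{SITEMAP_NS}loc"):
        if limit is not None and len(urls) >= limit:
            break
        if loc.text and loc.text.strip():
            urls.append(loc.text.strip())
    return urls


# Made-up sample sitemap used only to exercise the function.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/about</loc></url>
  <url><loc>https://example.com/contact</loc></url>
</urlset>"""

print(extract_urls(SAMPLE, limit=2))
```

Note that in a sitemap index file the `<loc>` entries point to other sitemaps rather than to pages, so a real workflow would need to tell the two apart.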

Common Sitemap Locations

Most websites have sitemaps at these standard locations:

  • https://example.com/sitemap.xml
  • https://example.com/sitemap_index.xml
  • https://example.com/sitemaps/sitemap.xml

You can also find sitemap URLs in:

  • robots.txt file (usually at https://example.com/robots.txt)
  • Google Search Console
  • Website footer links
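
Sites that declare their sitemaps in robots.txt do so with `Sitemap:` lines. The sketch below shows one way to collect those lines from robots.txt text you have already downloaded. The function name `sitemaps_from_robots` and the sample content are assumptions for illustration.

```python
def sitemaps_from_robots(robots_txt):
    """Collect the URLs declared on 'Sitemap:' lines in robots.txt text.

    Hypothetical helper; the field name is matched case-insensitively.
    """
    found = []
    for line in robots_txt.splitlines():
        # Split on the first colon only, so the colon in "https://" is kept.
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "sitemap" and value.strip():
            found.append(value.strip())
    return found


# Made-up robots.txt content used only to exercise the function.
ROBOTS = """User-agent: *
Disallow: /private/
Sitemap: https://example.com/sitemap.xml
sitemap: https://example.com/news-sitemap.xml
"""

print(sitemaps_from_robots(ROBOTS))
```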

Sitemap Types

Standard Sitemaps:

  • List a website's pages in XML format
  • Can include last modification dates (optional in the sitemap format)
  • Can include page priority and change frequency (also optional)

Sitemap Index Files:

  • Point to multiple sitemap files
  • Common for large websites
  • May contain thousands of URLs across multiple files

Specialized Sitemaps:

  • News sitemaps (news articles)
  • Image sitemaps (image content)
  • Video sitemaps (video content)
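
In the sitemaps.org format, a standard sitemap has a `<urlset>` root element and a sitemap index has a `<sitemapindex>` root, so the two can be told apart by inspecting the root tag. The following is a rough sketch of that check. The names `classify_sitemap` and `child_sitemaps` are hypothetical, and whether the node follows index files automatically is not stated here.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def classify_sitemap(sitemap_xml):
    """Return 'index', 'urlset', or 'unknown' based on the root element."""
    root = ET.fromstring(sitemap_xml)
    if root.tag == f"{SITEMAP_NS}sitemapindex":
        return "index"
    if root.tag == f"{SITEMAP_NS}urlset":
        return "urlset"
    return "unknown"


def child_sitemaps(index_xml):
    """List the sitemap files that a sitemap index points to."""
    root = ET.fromstring(index_xml)
    path = f"{SITEMAP_NS}sitemap/{SITEMAP_NS}loc"
    return [loc.text.strip() for loc in root.iterfind(path) if loc.text]


# Made-up index file used only to exercise the functions.
INDEX = """<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap><loc>https://example.com/sitemap-pages.xml</loc></sitemap>
  <sitemap><loc>https://example.com/sitemap-posts.xml</loc></sitemap>
</sitemapindex>"""

print(classify_sitemap(INDEX))
print(child_sitemaps(INDEX))
```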

Output

The node returns:

  • URLs - List of all URLs found in the sitemap
  • Total Count - Number of URLs extracted
  • Last Modified - When each URL was last updated (if available)
  • Priority - Page priority as specified in sitemap (if available)
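
As a rough picture of how these fields relate to the sitemap XML, the sketch below builds one record per `<url>` entry and leaves the optional fields empty when the sitemap omits them. The key names (`url`, `last_modified`, `priority`, `total_count`) are assumptions for illustration and may not match the node's exact output format.

```python
import xml.etree.ElementTree as ET

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def extract_entries(sitemap_xml):
    """Build one record per <url>, with None for missing optional fields."""
    root = ET.fromstring(sitemap_xml)
    entries = []
    for url in root.findall("sm:url", NS):
        loc = url.findtext("sm:loc", default="", namespaces=NS).strip()
        if not loc:
            continue  # skip entries without a usable URL
        lastmod = url.findtext("sm:lastmod", namespaces=NS)
        priority = url.findtext("sm:priority", namespaces=NS)
        entries.append({
            "url": loc,
            "last_modified": lastmod.strip() if lastmod else None,
            "priority": float(priority.strip()) if priority else None,
        })
    return {"urls": entries, "total_count": len(entries)}


# Made-up sitemap: the second entry has no lastmod or priority.
SAMPLE_WITH_META = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/</loc>
    <lastmod>2024-01-15</lastmod>
    <priority>0.8</priority>
  </url>
  <url><loc>https://example.com/about</loc></url>
</urlset>"""

result = extract_entries(SAMPLE_WITH_META)
print(result["total_count"], result["urls"])
```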

Tips

Finding Sitemaps:

  • Check /robots.txt for sitemap declarations
  • Try common sitemap URLs first
  • Look in Google Search Console for verified sitemaps
  • Some sites have multiple sitemaps for different content types

Large Sitemaps:

  • Use the Limit input to cap the number of URLs during initial testing
  • Large sites may have sitemap index files linking to multiple sitemaps
  • Consider processing in batches for very large sites
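
If you process the extracted URLs in your own code, batching can be as simple as slicing the list into fixed-size groups. This is a minimal sketch; the helper name `batched` and the batch size are arbitrary choices, not settings of the node.

```python
from itertools import islice


def batched(urls, size):
    """Yield successive lists of at most `size` URLs."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(urls)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


# Made-up URL list used only to exercise the function.
all_urls = [f"https://example.com/page-{n}" for n in range(1, 6)]
for batch in batched(all_urls, 2):
    print(len(batch), batch)
```

Because the helper is a generator, it also works on URL sources that are produced lazily, without holding every batch in memory at once.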

Error Handling:

  • Not all websites have sitemaps
  • Some sitemaps may be incomplete or outdated
  • Private or restricted sitemaps may not be accessible
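
When fetching sitemaps yourself, these cases typically surface as HTTP errors or unparseable responses. The sketch below shows one way to turn them into readable messages using only the Python standard library. The names `fetch_bytes` and `load_sitemap`, the message wording, and the mapping of status codes to causes are all assumptions for illustration, not the node's behavior.

```python
import urllib.error
import urllib.request
import xml.etree.ElementTree as ET


def fetch_bytes(url, timeout=10):
    """Download the raw response body for `url`."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def load_sitemap(url, fetch=fetch_bytes):
    """Return (root_element, None) on success or (None, reason) on failure.

    `fetch` is injectable so the function can be tried without network access.
    """
    try:
        body = fetch(url)
    except urllib.error.HTTPError as exc:
        if exc.code == 404:
            return None, "no sitemap at this URL (HTTP 404)"
        if exc.code in (401, 403):
            return None, f"sitemap is restricted (HTTP {exc.code})"
        return None, f"HTTP error {exc.code}"
    except urllib.error.URLError as exc:
        return None, f"could not reach host: {exc.reason}"
    try:
        return ET.fromstring(body), None
    except ET.ParseError as exc:
        return None, f"response is not valid XML: {exc}"
```

A caller would check the second element of the returned pair before using the first, for example `root, error = load_sitemap("https://example.com/sitemap.xml")`.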

FAQ