What is XPath?
XML Path Language (XPath) is a query language developed by the W3C for navigating XML documents and selecting specific nodes of data. This XPath SEO guide explains how XPath works and how to put it to use for SEO.
Use of XPath in SEO
In SEO, XPath is mainly used for scraping: XPath selectors let you extract specific data from a page, including attribute values.
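To make the idea of selecting nodes concrete, here is a minimal sketch using Python's standard-library `xml.etree.ElementTree`, which supports only a limited subset of XPath. The `SITE_XML` document and its `page`, `url`, and `title` names are hypothetical and exist only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML document, used only to illustrate node selection.
SITE_XML = """
<site>
  <page url="/">
    <title>Home</title>
  </page>
  <page url="/about">
    <title>About us</title>
  </page>
</site>
"""

root = ET.fromstring(SITE_XML)

# Path-style selection: every <title> that is a child of a <page>.
all_titles = [t.text for t in root.findall("./page/title")]

# Predicate selection: the <title> of the page whose url attribute is "/about".
about_title = root.find("./page[@url='/about']/title").text

print(all_titles)
print(about_title)
```

The first expression walks a path through the tree, while the second adds a predicate in square brackets to filter on an attribute. The same two ideas underlie the selectors used in the rest of this guide.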
How to Find XPath For a Website
The easiest way to find an XPath is with Chrome’s Inspect tool. Here’s how:
Right-click the section of the page you want an XPath for and select Inspect.
Once the source is open, right-click the highlighted element and select Copy > Copy XPath.
🔶 Then run the Screaming Frog tool.
From the top menu, select Configuration > Custom > Extraction.
🔶 Then paste the copied XPath into the XPath field, as shown in the screenshot above, and make sure the extraction option is set to Extract Text.
🔶 Next, crawl the website with Screaming Frog.
After that, view the scraped data under the Custom Extraction tab, in Extractor 1, which we set up in the previous step. For this example we picked the site’s <H2> headings as the data to scrape.
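The following sketch shows roughly what a text extraction of `//h2` yields. It is not Screaming Frog's implementation. It assumes a hypothetical, well-formed page fragment (`PAGE`) and a hypothetical helper `extract_text`, and it uses Python's standard-library `xml.etree.ElementTree`, which needs well-formed markup and a leading `.` on paths; real-world HTML usually calls for a dedicated HTML parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed page fragment used only for illustration.
PAGE = """
<html>
  <body>
    <h1>XPath guide</h1>
    <h2>What is <em>XPath</em>?</h2>
    <p>Intro text.</p>
    <h2>  Cheat sheet </h2>
  </body>
</html>
"""


def extract_text(root, path):
    """Return the stripped text content of every element matching `path`."""
    # itertext() also collects text inside nested tags such as <em>.
    return ["".join(el.itertext()).strip() for el in root.findall(path)]


root = ET.fromstring(PAGE)
h2_texts = extract_text(root, ".//h2")
print(h2_texts)
```

Each matching heading becomes one extracted value, with nested markup flattened to plain text.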
XPath Cheat Sheet
Basic XPaths
ELEMENT | XPATH FOR SCREAMING FROG | EXTRACTION |
Any element | //* | Extract Text |
Any <p> element | //p | Extract Text |
Any <div> element | //div | Extract Text |
Any element with class “example” | //*[@class='example'] | Extract Text |
The whole webpage | /html | Extract Inner HTML |
The whole page body | /html/body | Extract Inner HTML |
All text | //text() | Extract Text |
All links | //@href | Extract Text |
Links with specific anchor text “example” | //a[contains(., 'example')]/@href | Extract Text |
Email Addresses | //a[starts-with(@href, 'mailto')] | Extract Text |
Elements can have different classes and IDs, but a handful of basic XPaths account for most site formatting.
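A few rows of the table above can be tried out with the short sketch below. It uses Python's standard-library `xml.etree.ElementTree`, which implements only a subset of XPath: attribute selection (`/@href`) and functions such as `starts-with()` are not available, so the sketch emulates them in plain Python. The `PAGE` fragment and its URLs are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed page fragment used only for illustration.
PAGE = """
<html>
  <body>
    <div class="example"><p>First paragraph</p></div>
    <p>Second paragraph</p>
    <a href="/contact">Contact page</a>
    <a href="mailto:hello@example.com">Email us</a>
  </body>
</html>
"""

root = ET.fromstring(PAGE)

# //p : any <p> element.
paragraphs = [p.text for p in root.findall(".//p")]

# //*[@class='example'] : any element with class "example".
example_tags = [el.tag for el in root.findall(".//*[@class='example']")]

# //@href : emulated by selecting elements that carry href, then reading it.
hrefs = [el.get("href") for el in root.findall(".//*[@href]")]

# //a[starts-with(@href, 'mailto')] : starts-with() emulated in Python.
mailto_links = [
    a.text
    for a in root.findall(".//a[@href]")
    if a.get("href").startswith("mailto")
]

print(paragraphs, example_tags, hrefs, mailto_links)
```

A full XPath 1.0 engine would accept the expressions from the table directly; the emulation here only keeps the sketch free of third-party dependencies.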
XPath for SEO
ELEMENT | XPATH | EXTRACTION |
H3 | //h3 | Extract Text |
H3 with specific text “example” | //h3[contains(text(), "example")] | Extract Text |
Count of H3s | count(//h3) | Function Value |
Full hreflang (link + value) | //*[@hreflang] | Extract HTML Element |
Hreflang values | //*[@hreflang]/@hreflang | Extract Text |
Types of Schema | //*[@itemtype]/@itemtype | Extract Text |
Schema itemprop rules | //*[@itemprop]/@itemprop | Extract Text |
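The SEO selectors above can be approximated in a few lines of Python. This is a sketch under stated assumptions: the `PAGE` fragment and its URLs are hypothetical, and because the standard-library `xml.etree.ElementTree` lacks `count()`, `contains()`, and attribute selection, those parts are emulated in Python.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed page fragment used only for illustration.
PAGE = """
<html>
  <head>
    <link rel="alternate" hreflang="en-gb" href="https://example.com/uk/" />
    <link rel="alternate" hreflang="de" href="https://example.com/de/" />
  </head>
  <body itemscope="itemscope" itemtype="https://schema.org/Article">
    <h3>First</h3>
    <h3>Second example</h3>
  </body>
</html>
"""

root = ET.fromstring(PAGE)

# count(//h3) : emulated with len() over the matched elements.
h3_count = len(root.findall(".//h3"))

# //h3[contains(text(), "example")] : contains() emulated with `in`.
h3_with_example = [
    h.text for h in root.findall(".//h3") if "example" in (h.text or "")
]

# //*[@hreflang]/@hreflang : read the attribute from each matched element.
hreflang_values = [el.get("hreflang") for el in root.findall(".//*[@hreflang]")]

# //*[@itemtype]/@itemtype : types of Schema declared via microdata.
schema_types = [el.get("itemtype") for el in root.findall(".//*[@itemtype]")]

print(h3_count, h3_with_example, hreflang_values, schema_types)
```

Note that `<link>` elements have no text content, which is why the attribute value (or the whole HTML element) is the useful thing to extract for hreflang.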
Conclusion
When the progress bar reaches 100%, the crawl has finished and you can export the data using the Export button.
In this XPath SEO guide, we extracted the site’s H2 headings, as shown in the screenshot of the exported Excel file: