How To Identify Lost Links In Linkscape, Majestic or Webmaster Tools Data

Enemy sighted!

I think the last two months at the Gadgetplex have been pretty much the busiest of my life. As a result, my minimum “blog every 2 weeks” target has been woefully ignored. Fortunately we’ve got some great new hires in place and, with a little luck, I’ll be able to give some love back to the blog in the coming weeks and months.

Anyway, here’s to the first of many new posts to come (setting myself a target there): a quick tip to solve what I think remains quite a big problem for most SEOs – identifying links that may have mysteriously disappeared.

SEOs don’t like to lose links

Most people are naturally loss averse: we would much rather avoid a loss than spend energy pursuing a gain. With links, the practical problem is that reliable link monitoring is no easy task. Using tools like BuzzStream, we can keep an eye on the links we’ve built, but what about the legacy stuff that you simply assume will stick around?

In the past 6 months, we’ve worked hard to rescue a client’s site that fell victim to a vile SEO tactic. Their former agency owned a large network of links, and had been acquiring sites and domains specifically for the purpose of building links to the client domain. Guess what happened when the agency was fired? A dramatic reduction in linking IP C Blocks and root domain links over the following months.

SEOs – you’ve got to keep an eye on your link data all of the time, and especially when you’re working with a new client.

Looking at link data is looking at the past

Astrophysicists marvel at catching photons in their telescopes that left a star many, many millions of years before they were born. SEO people need the freshest data to make decisions, but frequently forget that while they’re trawling through link data, they’re looking back at an internet from the past.

How do you know the link you’re observing in the data is still there? You have to check manually, build a tool, or use a free tool or script. One thing’s for sure – pages get lost, links decay, links can be pulled from right under your feet, and errors occur:

Httparchive’s 17k site crawl generated from top sites on Quantcast and Alexa data (amongst other sources)

44 billion page webcrawl from SEOmoz

At least 6% of Linkscape’s web crawl was 404 or unreachable back in late 2009 and the more recent updates show around a 9% decay.

Checking your links are still live from Majestic and Linkscape Data

For such a “quick tip” I’ve probably gone off on a bit of a tangent, but you’re still reading, right? At this stage I’ll give SEOdoctor a shout for this great post, and a tip found in the comments after a tip from Weip  to use - and now, I’ll carry on.

For my tip to work you’ll need to have Niels Bosma’s SEO Tools extension installed – here’s a write-up from a few months ago. Frankly, if you’re not using it you should really start. It’s amazing.

Check a link can be found on a page with XPath and an IF statement

There’s a function in SEO Tools that allows the use of XPath in an Excel formula (take that, GDocs users!). It’s called XPathOnUrl, and it’s beautiful. So, if you can use XPath to fetch all the href attributes on a page that match a certain (domain) name, you’ll be able to check whether a link is still live with a simple IF statement.

Here’s one I made earlier:

Here’s how:

Just make sure you’re looking for the right domain (in this case, and that your cell reference for the inbound link (in this case, C2) is correct. That’s about it!

Just a note on XPathOnUrl

XPathOnUrl doesn’t return a value if there is nothing to return – you get a blank cell. That’s why I’ve used two quotation marks. If the result from my query is blank, the condition in my IF statement is met and it returns “not found”.
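
If you’d rather run the same check outside Excel, here’s a rough Python sketch of the idea. It is not the SEO Tools formula: it uses only the standard library, and the names HrefCollector, links_to_domain and link_status are made up for illustration. It also assumes you already have the page’s HTML as a string (fetching is left out), and it matches on the link’s host name rather than on any href that contains the domain text.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class HrefCollector(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def links_to_domain(html, domain):
    """Return every href in `html` whose host is `domain` or a subdomain of it."""
    collector = HrefCollector()
    collector.feed(html)
    domain = domain.lower()
    matches = []
    for href in collector.hrefs:
        try:
            host = (urlparse(href).hostname or "").lower()
        except ValueError:
            continue  # skip malformed hrefs rather than fail the whole page
        if host == domain or host.endswith("." + domain):
            matches.append(href)
    return matches


def link_status(html, domain):
    """Same logic as the IF statement: an empty result means "not found"."""
    return "found" if links_to_domain(html, domain) else "not found"
```

The empty list plays the part of the blank cell: nothing matched, so the link is reported as “not found”. Relative links such as /about have no host and are ignored, which is what you want when checking for links to another site.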

Save your historic data

Save your OSE / Linkscape downloads! Save them every month. If you’re not backing up your link data, you’re going to become dependent on the oldest and most infrequently updated data sources. That’s OK, but it’s always more work to clean up. I tend to prefer directly comparing like with like (Linkscape to Linkscape) rather than scratching my head over Google WMT vs Majestic, or Linkscape vs Majestic. If you do mix sources, my best advice is to create a master data table and de-dupe to create one big long list of all of your IBLs. Then, get your analysis skills rocking.
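
To make the month-to-month comparison and the de-duped master list concrete, here’s a minimal Python sketch. It assumes each export has already been reduced to a plain list of linking URLs. The function names (normalise, master_list, lost_links) and the normalisation rules are my own assumptions, not anything the tools provide: two URLs count as the same link if they differ only by scheme, host capitalisation, fragment or a trailing slash.

```python
from urllib.parse import urlsplit


def normalise(url):
    """Reduce a URL to host + path + query: lower-case host, no scheme,
    no fragment and no trailing slash (assumed rules, adjust to taste)."""
    parts = urlsplit(url.strip())
    host = (parts.hostname or "").lower()
    path = parts.path.rstrip("/")
    query = "?" + parts.query if parts.query else ""
    return host + path + query


def master_list(*exports):
    """Merge several exports into one de-duplicated, sorted list of linking URLs.
    The first spelling seen for each normalised URL is the one that is kept."""
    seen = {}
    for export in exports:
        for url in export:
            seen.setdefault(normalise(url), url)
    return sorted(seen.values())


def lost_links(previous, current):
    """Linking URLs present in the older export but missing from the newer one."""
    current_keys = {normalise(url) for url in current}
    return sorted(url for url in previous if normalise(url) not in current_keys)
```

Anything lost_links returns is only a candidate: the newer export may simply not have recrawled that page yet, so it is worth re-checking those URLs against the live page before treating the link as gone.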

One final thing – get to new links quickly

If finding new links quickly is your bag, check out Rob and Tom’s new tool, Linkstant. Enjoy!


Image credit: Johnson Cameraface


  1. Jon Q

    Nice post, really want to spend some more time playing with the excel plugin you mentioned. Had a go recently and fell in love, I’ll add this function to the list! Really cool to see a bit more detail on tracking lost links, I usually stick to tracking with the Majestic Fresh Index + OSE etc but this will take it one step further for sure.

    Sounds like it could be worth building a master sheet using this with your competitors data too…

  2. Razvan

    With our tool we take the link data and recrawl it for every site. From the data I’ve seen, lost and broken links can make up as much as 70%-80% on certain sites. Almost every site has a ratio of up to 25%-30% broken links.

    That is why we re-crawl everything … because the analysis is only half true if you are looking at data that was crawled once and can, at the present time, only give you a vague idea about the link profile of that site.

    Lost & Broken links are as natural as they can get. That is why you need to re-crawl any site that you truly want to analyze and not just overview.

  3. seo analyst

    Amazing! So much information in one article! It’s worth waiting for your post, Richard, and spending 2 hours on a single post (considering all the links visited as well).

    Some of the resources here are, in one word, mind-blowing. Big help, thanks again.

    - livingseolife

  4. Chairs Back

    Google webmaster tools have been an obvious choice as a means to getting the most accurate and straight to the point data. This way, we have also been able to get a good idea on the rate at which people come on the website and what words are undergoing improvements.

  5. Razvan

    Google Webmaster Tools is ok for your site. What about other sites that you don’t have access to ? What about your competitors ?

    That is why you need external checking.

  6. Erica

    The XPathOnUrl formula was working like a charm then it broke all of a sudden. I ordered alphabetically (found/not found) then it stopped working when I added more rows to the spreadsheet – it doesn’t work even if I redo everything from scratch. Any ideas?

  7. Paul May

    Great post, Richard! Neil’s toolset looks fantastic.

    One thing to note…you can also use BuzzStream to check these links for you. Just import your links from Majestic or OSE and BuzzStream’s backlink checker will automatically check them. In addition to looking for lost links, it’ll check to see if they’re nofollowed, check PR and mozRank, look for any words on the page that might signal you’re on a page you want to be on, etc.


    Paul May
    BuzzStream co-founder

  8. Alex

    Loads of “Interesting” comments with self promotional intent… Anyho, xpathonurl seems to have stopped working for some reason – says invalid URI. Anyone knows how to fix it?