Missed Opportunities – Finding Sites That Publish Your Infographic But Don’t Link
If you have launched a successful infographic, or any other successful creative piece, you are probably missing out on links. People repost your work without attribution (linking) – how rude. Well, the good news is that it’s pretty easy to turn those reposts into links. Most folks are willing to link to you if they are asked, since they are already posting your work. This is your step-by-step guide to finding who is posting your infographic and whether they are linking to you.
Find Who is Posting Your Infographic
There are a couple different ways to do this, and to find as many sites as possible, you will want to use all of these methods. For this post, we will use the following infographic by AYTM. It is called The Global Appeal of Angry Birds.
Searching with the Infographic Title
While this is pretty self-explanatory, there are a couple things you need to do first to set yourself up later. First, you need to install the Scrape This plugin for Chrome. Second, you are going to need to go into your search settings for Google (click on the gear in the top right, then click search settings) and change the number of results per page to 100.
Once you have completed these two items, just put the title of the infographic (in quotes) into Google. With our example, you would search “global appeal of angry birds”. Now, right click on the search result and select “Scrape Similar” – This will scrape all of the search results from this page.
Pro Tip: Make sure you don’t click directly on the text that says “Global Appeal of Angry Birds”. As this text is bolded in the search result, the plugin will only scrape other listings with bolded text.
Once you have clicked scrape similar, export the scraping to Google Docs.
Now scrape each of the following paginated search result pages and export those to Google Docs as well.
Pro Tip: The first result in each of the Google docs will be a search result URL (which redirects to the actual URL). You will need to copy this and paste it into your browser, then copy the actual URL and paste it back into your spreadsheet.
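If you have a lot of these redirect URLs, copying and pasting gets old fast. Here is a minimal Python sketch that pulls the real URL out of the redirect instead. It assumes the redirect looks like https://www.google.com/url?q=<actual URL>&... (or uses a url= parameter); that format is an assumption on my part, so check a couple of your own rows first. The function name resolve_google_redirect is made up for this example.

```python
from urllib.parse import urlparse, parse_qs


def resolve_google_redirect(url):
    """Return the target of a Google redirect URL, or the URL unchanged."""
    parsed = urlparse(url.strip())
    host = (parsed.hostname or "").lower()
    # Assumption: redirect links have the path /url and carry the real
    # destination in a "q" or "url" query parameter.
    if host.endswith("google.com") and parsed.path == "/url":
        params = parse_qs(parsed.query)  # parse_qs also percent-decodes values
        for key in ("q", "url"):
            if key in params:
                return params[key][0]
    return url.strip()
```

Anything that doesn’t match the assumed pattern is returned untouched, so it is safe to run over the whole URL column.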
Search Similar Images
Now that you’ve gotten all the pages that Google knows about with the title of the infographic, we’re going to search with the image itself. Drag the image into the Google Images search box.
Unfortunately, with the search similar images functionality, Google won’t display 100 results per page, only 10 at a time. Scrape each page of results as we did before. Once you have scraped them into Google Docs, paste all the results into your Excel spreadsheet.
Search By Embed Code
Alright, now that you have searched for the title and the image, let’s search for a snippet of text from the embed code. As you’ll notice below, the embed code contains the text “Infographic by: AYTM Market Research”.
This search yielded even more results that we can add to our list.
Alright, now scrape the results using Scrape Similar and put them in your Excel doc.
Cleaning The Data
So at this point, you should have a big ass list of pages that have your infographic on them. We have a couple problems though. First, there are going to be category and tag pages in the search results. We don’t want these, since they aren’t the true pages containing the infographic, so let’s get rid of them.
Apply a filter to your list of links:
In the URL column, add a custom text filter:
To filter out most of the tag and category pages, you are going to want to filter out all the pages that contain one of the following:
When you are writing the custom text filter, use “does not contain” and “and”, not “or”.
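If you would rather do this step outside of Excel, the same “does not contain X and does not contain Y” logic can be sketched in a few lines of Python. The fragments below ("/tag/" and "/category/") are placeholders I picked for illustration, not a definitive list, so swap in whatever patterns show up in your own data.

```python
# Hypothetical URL fragments for illustration; replace with your own list.
EXCLUDE_FRAGMENTS = ["/tag/", "/category/"]


def drop_archive_pages(urls, fragments=EXCLUDE_FRAGMENTS):
    """Keep only the URLs that contain none of the given fragments."""
    kept = []
    for url in urls:
        lowered = url.lower()
        # "does not contain A AND does not contain B" == contains none of them
        if not any(fragment in lowered for fragment in fragments):
            kept.append(url)
    return kept
```

Using “or” instead would keep any URL that is missing at least one fragment, which is nearly all of them. That is why the filter needs “and”.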
Now on to the second problem. Since we are searching several different ways, there is inherently going to be some overlap, which means duplicates. This is easy to solve though: just click the “Remove Duplicates” button in the Data tab. Then select “expand the current selection” and hit Remove Duplicates.
Then only select URL.
Now you should have a clean list of URLs that have your infographic on them.
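For anyone working with the list outside of Excel, here is a rough Python equivalent of the de-duplication step. It is only a sketch: it treats two URLs as duplicates when they match exactly after trimming whitespace, which may not be identical to how Excel compares cells. The function name dedupe_urls is my own.

```python
def dedupe_urls(urls):
    """Remove exact duplicate URLs while keeping the original order."""
    seen = set()
    unique = []
    for url in urls:
        cleaned = url.strip()
        # Skip blank rows and anything we have already kept.
        if cleaned and cleaned not in seen:
            seen.add(cleaned)
            unique.append(cleaned)
    return unique
```

Keeping the original order means the first place you found each URL stays on top, which is handy if you sorted by search method.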
Finding Which Pages Link to Your Infographic & Which Ones Don’t
So now that we know who has published our infographic, the next step is to determine who links to us and who doesn’t so that we can email everyone who doesn’t link to us. First, we will need to take the list of URLs and put them in a .txt file.
Now fire up Screaming Frog SEO Spider. If you aren’t familiar with this program, you should spend some time with it. You are going to want to choose list mode (rather than just crawling a site). This will allow us to crawl our list of URLs.
Then click “Select File” and choose your .txt file full of pages with your infographic on it. Before we can start we need to set up some custom filters. This will allow us to determine if people are linking to us or not. To do this, go to configuration and click custom.
We are going to set up filters 1 and 2 to display pages that link to us and filters 3 and 4 to display pages that don’t link to us. We don’t strictly need filters 1 and 2, as we are really looking for the sites that don’t link to us, but they give us a nice list which we can give to our client :)
While AYTM isn’t on a www subdomain, it is important to check for both the www and non-www versions of links as some bloggers may not properly link to us. Doing both will prevent false positives from showing up in either list.
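To make the www / non-www check concrete, here is a small Python sketch of the same idea. To be clear, this is not how Screaming Frog works internally; as far as I know its custom filters match text in the page source, while this sketch only looks at the href of anchor tags. The names LinkCollector and links_to_domain are made up for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def links_to_domain(html, domain):
    """True if any anchor in the HTML points at domain, with or without www."""
    bare = domain.lower()
    if bare.startswith("www."):
        bare = bare[4:]
    collector = LinkCollector()
    collector.feed(html)
    for href in collector.hrefs:
        try:
            host = (urlparse(href).hostname or "").lower()
        except ValueError:
            continue  # skip malformed hrefs rather than failing the page
        if host in (bare, "www." + bare):
            return True
    return False
```

Because it only counts anchors, a page that hotlinks the image from aytm.com without a clickable link is reported as not linking, which is the case we care about for outreach.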
Now you can start your crawl!
When your crawl is completed, click on the “custom” tab. You can then select the filter that you want to view. Now click Export for each filter. In Excel, combine filters 1 & 2 and filters 3 & 4. You now have two lists, one that shows everyone who links to your infographic and one that shows people who have published your infographic but haven’t linked to it.
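If you prefer to combine the exports with a script, one cautious way to do it is sketched below in Python. It assumes that filters 1 and 2 are the two “links to us” exports (non-www and www), and it treats every other URL from your original list as “doesn’t link”. That is my own simplification rather than the exact Excel steps, and the function and argument names are hypothetical.

```python
def split_by_link_status(all_urls, filter_1_urls, filter_2_urls):
    """Split the crawl list into (pages that link, pages that do not)."""
    # A page counts as linking if it showed up in either "links to us" export.
    linking = set(filter_1_urls) | set(filter_2_urls)
    links = [url for url in all_urls if url in linking]
    no_links = [url for url in all_urls if url not in linking]
    return links, no_links
```

One caveat: any page that failed to load during the crawl lands in the “doesn’t link” list, so spot check a few before you email anyone.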
Prioritize Your Efforts
Now that you have these two huge lists, you’ll want to prioritize your email outreach efforts. Do this by pulling Open Site Explorer data for all the URLs on both lists. You can do this easily in Google Docs. Instructions here and example here.
This can be applied to anything, really. You can do this with:
Slide Share Presentations
Let’s look at one of Rand’s recent presentations, here. You can see that the embed code (shown below) from the presentation automatically puts the text “Can’t Buy Me Love from Rand Fishkin” below the presentation. If we search for this phrase, there are 341 search results.
Same with video - if we search for the snippet of text below the embedded video, we get over 42,000 results.
Anything Else Embeddable
If you have any embeddable asset, you can easily find copies of it online. You can expand the method in this post to almost anything.