
Top SEO Conferences to tick off your Bucket List in 2020

Attending industry conferences is a great way to keep up to date with the latest SEO trends, while also letting you meet potential clients and network with like-minded professionals. Often, though, time and/or budget limits the number of SEO conferences we can attend annually.

I conducted a Twitter poll to discover how many SEO conferences professionals within the industry attended annually.

Unsurprisingly, the majority of respondents said they attended one to two conferences per year.

Conference choices may be influenced by location, which makes local conferences more convenient and accessible, but there are also other important factors to consider such as the event’s reputation, the speakers, topics covered, price and whether there are networking opportunities.

We have updated our pick of the top SEO conferences around the world to help you decide which ones to attend this year.

Top 2020 SEO Conferences

  • Search Y

Start the year in Paris, where international experts share SEO and SEA news and advice drawn from their day-to-day experience. The event features French and English speakers, with simultaneous translation in both languages (in case you don’t feel like practising your French).

It’s a packed day with more than 25 presentations in French and English across different tracks. Topics include content tips, algorithm changes, digital strategy and technical SEO, and one of the highlights this year is that Martin Splitt from Google will be speaking!

When: 7 February 2020

Location: Paris

More information about Search Y

  • UnGagged

UnGagged’s motto is ‘Honest discussion. Actionable tactics. No Cameras. No BS’, and they deliver exactly that. The quality of the speakers and presentations is impressive, with the event usually having two to three key themes and topics, too. This one’s great for those of you looking for advanced SEO talks and networking opportunities over lunch, which is included within the ticket price, or over drinks after the conference.

They now run events in four locations (one in London, UK, the rest in the US) with different speakers and topics but the same high quality, to the extent that some attendees go to more than one of their conferences each year.

UnGagged London

When: TBC (usually May/June)

Where: London, UK

More information about UnGagged

  • SMX Munich

Search Marketing Expo, organised by Search Engine Land, runs several SEM and SEO conferences around Europe. The Munich event includes great advanced technical SEO sessions and takes place in March. It’s a multi-track event offering top talks from 77 amazing speakers, including John Mueller, Webmaster Trends Analyst at Google. Deep-dive sessions are also included in the ticket price, although places are limited, so be sure to book in advance. Additionally, the SEMY awards ceremony takes place on the first evening, and there’s an SMX after-dark event for networking.

When: 18 – 19 March 2020

Location: Munich, Germany

More information about SMX Munich 2020

  • Advanced Search Summit

Another digital marketing conference worth bookmarking (or, even better, attending) in the US in 2020. SEOs and search industry leaders share their knowledge and experience in a stunning location, surrounded by vineyards. Networking evenings spent enjoying the best wines in Napa Valley sound like a great combo!

When: 22 – 24 April 2020

Location: Napa Valley, California

More Information about Advanced Search Summit 2020

  • BrightonSEO

This is one of the most popular search marketing conferences in the UK, with 2018’s edition being attended by roughly 3,500 international professionals. BrightonSEO happens bi-annually (in April and September, usually), with one day of conference talks from an array of speakers, and one day of workshops. There’s something for everyone at this event, with talks surrounding content marketing, links, social media and technical SEO. There’s a pre-party the evening before the conference and networking drinks at the end of the event, too. The cherry on top is the ‘fireside chat’, where search personalities or even Google employees answer common SEO questions.

When: 17 April & 2 October 2020

Where: Brighton, UK

More information about BrightonSEO 2020

  • OutREACH

This is a niche conference organised by Verve Search, which focuses on link building and outreach. It’s a one-day, single-track event designed to give first-hand insights into successful outreach strategies and case studies. Speakers are always top-level link builders sharing their best tips, journalists sharing their know-how and experience, and content marketing and PR specialists. The conference is followed by networking drinks with speakers and attendees.

When: 12 June 2020

Location: London, UK

More information about OutREACH 2020

  • SEontheBeach

This Spanish SEO conference, with national and international speakers, is set in a unique location: the beach at La Manga del Mar Menor. It’s the ideal way to soak up the sunshine while enhancing your SEO knowledge. Besides the interesting SEO talks, there are plenty of fun activities, such as “beachworking”, to help break the ice and make networking easier. There’s a great line-up of confirmed speakers so far, including Gary Illyes from Google.

When: 12 – 13 June 2020

Where: La Manga, Murcia, Spain

More information about SEontheBeach 2020

  • MozCon

Moz organises one of the most reputable conferences in the US for technical SEO, mobile strategy, analytics and content. Running every year since 2011, it is a one-track event which always has top-tier speakers sharing actionable insights. They also host networking evenings, such as the Kickoff Party, where attendees and speakers can mingle and share their knowledge.

When: 6 – 8 July 2020

Location: Seattle, USA

More information about MozCon 2020

  • Learn Inbound

This is a single-track conference with 20 industry experts speaking over the space of two days, followed by networking drinks and an afterparty. The advantage of the single-track is that you won’t miss any of the speakers and you’ll attend a variety of talks, including focused SEO ones, in addition to PPC, CRO and PR. You can also expect tips, tactics and strategies from the experts.

When: TBC  (usually August)

Location: Dublin, Ireland

More information about Learn Inbound 2020

  • TechSEO Boost

This is the first conference focused purely on technical SEO, organised by Catalyst Digital since 2017. It’s free to attend; you just need to sign up to the attendance lottery, as places are limited. If you don’t get a spot, or you aren’t based near Boston, where the event is hosted, and want to save on travel costs, they also offer a free live stream (cheers to that!). Renowned international SEO speakers share their technical SEO tips and wisdom, making it a great conference for polishing your tech skills.

When: TBC (usually November/December)

Location: Boston, US

More information about TechSEO Boost

Other great digital marketing conferences to consider are Pubcon in Las Vegas; SearchLove, hosted in various locations including London, Boston and San Diego; and the exclusive, invite-only SEOktoberfest in Munich, which gathers top international SEO experts.

Another important thing to remember when planning to attend a conference is that preparation is key, so book well in advance to take advantage of discounted early-bird tickets.

Do you have other international SEO conference recommendations? Leave your suggestions below.

Measuring Page Speed Performance Using Google Metrics and Tools

Speed has always been one of the most important metrics when measuring a website’s performance, as well as a confirmed desktop ranking factor since 2010.

Nowadays we are in a mobile era, and speed matters even more since Google rolled out mobile-first indexing in March 2018, announcing that they primarily use the mobile version of a page for indexing and ranking. Later, in July, they rolled out the “Speed Update”, which made page speed a mobile search ranking factor.

The importance of Site Speed for E-commerce

Speed matters, so the faster your pages load, the better. If we analyse the top 10 results for a specific query, we’ll find that relevant, fast sites rank better, because speed matters to both users and Google; both want the best result for a query as fast as possible. Research has found that:

  • 53% of mobile users are likely to leave the page if it doesn’t load within 3s.
  • One out of two people expect a page to load in less than 2s.
  • 46% of people say that waiting for pages to load is what they dislike the most when browsing the web on mobile devices.

It is important to understand that speed performance influences business ROI, which is why in-depth technical SEO audits should include site speed analysis. It has been demonstrated that a page’s speed has a big influence on bounce rates, traffic, conversion rates and sales.

Google Tools for Website Speed Tests

When it comes to measuring speed, there are different tools for analysing your site’s performance that will help you determine the technical improvements needed to achieve better results, including Google tools such as:

  • PageSpeed Insights
  • Lighthouse
  • Speed Scorecard
  • Chrome Dev Tools

Google PageSpeed Insights is a free tool that analyses a page’s performance and displays a speed score, from 0 to 100, for both mobile and desktop. (Check out PageSpeed Insights here)

The PageSpeed Insights tool was updated in November 2018 and now includes more data from Lighthouse and real-world data from the Chrome User Experience Report (CrUX). The reports are very useful, surfacing important performance metrics such as First Contentful Paint (FCP) and First Meaningful Paint (FMP), both key to understanding how quickly a page loads. It also includes speed optimisation recommendations, such as image optimisation and eliminating render-blocking JavaScript.
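If you want to pull these scores programmatically, PageSpeed Insights also exposes a public v5 API. Below is a minimal sketch in Python; the endpoint is real, but the example page URL and the optional API key are placeholders:

```python
from urllib.parse import urlencode

# Endpoint for the PageSpeed Insights v5 API
PSI_ENDPOINT = "https://www.googleapis.com/pagespeedonline/v5/runPagespeed"

def psi_request_url(page_url, strategy="mobile", api_key=None):
    """Build a PageSpeed Insights API request URL for a page.

    strategy is "mobile" or "desktop"; an API key is only needed
    for higher request volumes."""
    params = {"url": page_url, "strategy": strategy}
    if api_key:
        params["key"] = api_key
    return PSI_ENDPOINT + "?" + urlencode(params)

# Fetching this URL (with urllib.request, requests, etc.) returns JSON
# containing a lighthouseResult with the 0-100 performance score and
# metrics such as First Contentful Paint.
print(psi_request_url("https://www.example.com"))
```

Fetching the built URL in a script lets you track a page’s score over time rather than checking it by hand.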

Lighthouse is another free, open-source tool developed by Google to audit web performance and the quality of web pages. It highlights the issues causing a low performance score and creates a report with opportunities (listing things that could be improved for better performance) and diagnostics. The suggested speed optimisations are prioritised by their impact on the page’s performance, which is useful when communicating with web development teams. To use Lighthouse, there are two options: you can either install the Chrome extension or run it from Chrome’s built-in DevTools, under the “Audits” tab.

Speed Scorecard is very useful for benchmarking your mobile site speed against competitors. The score is calculated by combining several metrics from the Chrome User Experience Report, and it’s possible to check it across different countries and network speeds (4G, 3G). Another interesting feature is the Impact Calculator, which estimates the impact of speed on a company’s revenue. To obtain a personalised result, fill in the required metrics: average monthly visitors, average order value and conversion rate.
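The arithmetic behind that kind of calculator is straightforward. Here’s an illustrative sketch, assuming a 5% relative conversion uplift per second of load time saved (our own placeholder figure for the example, not Google’s published model):

```python
def monthly_revenue(visitors, conversion_rate, avg_order_value):
    """Revenue = monthly visitors x conversion rate x average order value."""
    return visitors * conversion_rate * avg_order_value

def speed_impact(visitors, conversion_rate, avg_order_value,
                 seconds_saved, uplift_per_second=0.05):
    """Estimate extra monthly revenue from a faster page.

    uplift_per_second (5% relative conversion uplift per second saved)
    is an illustrative assumption, not Google's actual model."""
    baseline = monthly_revenue(visitors, conversion_rate, avg_order_value)
    improved_rate = conversion_rate * (1 + uplift_per_second * seconds_saved)
    return monthly_revenue(visitors, improved_rate, avg_order_value) - baseline

# 100k monthly visitors, 2% conversion rate, £50 average order value,
# one second shaved off the load time:
print(round(speed_impact(100_000, 0.02, 50, 1), 2))  # -> 5000.0
```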

Chrome DevTools is a set of web developer tools built directly into the Chrome browser. Its main use is finding and diagnosing site issues, helping developers fix bugs quickly. With Chrome DevTools anyone can view and change any page, although the modifications are only displayed temporarily for that user and aren’t saved.

In addition to the mentioned tools, Google has recently launched a new tool: web.dev, which looks at performance, PWA and best practices. I recommend using it with caution as the tool is still in beta and there have been bugs reported. The audit does provide a list of items to fix or optimise, along with notes on the impact of each change.

What should you work on to improve your website speed performance?

The main page elements that affect loading times, and therefore speed, are images, JavaScript and fonts: the bigger they are, the longer the page takes to load. It is important to always benchmark results against your main competitors to see how that vertical or industry is performing, as this helps shape specific recommendations. Speed is a metric that should always be evaluated in context.

The main elements and metrics worth checking using Google tools and to focus on improving for better speed performance are the following:

  • Page size: This can be checked using Chrome DevTools, under the Network tab. Empty the cache and hard reload, then check the transferred KB or MB shown in the grey bar.

  • Fonts: You can also check the fonts used on each page with Chrome DevTools, under Network > Font. Web font optimisation is a critical piece of the overall performance strategy: loading times depend on the fonts selected, and some fonts may even block rendering of the text.

  • Compression: You can check whether compression is used on any given page in the HTTP headers. With DevTools open, click the Network tab and reload the page. Then click the top entry in the resources list and select Headers on the right-hand side. Look for the “Content-Encoding” field to see if a compression method is being used.
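If you prefer to script that lookup, it is easy to mirror in Python. The helper below is a hypothetical sketch; the header dicts are made-up examples of what you’d see under Headers in the Network tab:

```python
def compression_method(headers):
    """Return the compression method from HTTP response headers,
    or None if the response isn't compressed."""
    # Header names are case-insensitive, so normalise before looking up
    normalised = {k.lower(): v for k, v in headers.items()}
    encoding = normalised.get("content-encoding")
    return encoding if encoding in ("gzip", "br", "deflate") else None

print(compression_method({"Content-Type": "text/html",
                          "Content-Encoding": "gzip"}))  # -> gzip
print(compression_method({"Content-Type": "image/png"}))  # -> None
```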

  • Unused Bytes of Content: You can check unused bytes for JavaScript and CSS files. Treat this as an indication only, as the results aren’t always accurate: some code may be marked as unused even though it’s triggered by user interactions. To check what percentage is marked as unused, go to the Coverage tab in Chrome DevTools, reload the page and check the metrics displayed for each file. The unused parts of the code are also highlighted in red.

  • Images Size: It’s possible to check the size of images with Chrome Dev Tools. Under the “Network” tab click “Img” and a list of images will be displayed showing status code, file type and size.

For further image optimisation you can use Lighthouse, which is available under the Audits tab in Chrome DevTools or as a standalone tool, as previously mentioned. A great thing to check there is whether images are being served larger than needed: the aspect ratio section tells you whether each image is displayed at its appropriate aspect ratio.
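The oversized-image check itself is just a size comparison. A hypothetical helper makes the logic explicit (the dimensions and the high-DPI tolerance are illustrative assumptions):

```python
def oversized(natural_w, natural_h, display_w, display_h, tolerance=1.0):
    """Flag an image whose natural size exceeds the size it's displayed at.

    A tolerance above 1.0 leaves headroom for high-DPI (retina) screens."""
    return (natural_w > display_w * tolerance or
            natural_h > display_h * tolerance)

# A 1600x1200 image rendered in a 400x300 slot is wasting bytes:
print(oversized(1600, 1200, 400, 300))  # -> True
# The same image served at 2x for retina screens is acceptable:
print(oversized(800, 600, 400, 300, tolerance=2.0))  # -> False
```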

If you have any other tips, or want to share other useful ways to check web speed performance using Google tools, leave a comment below; it would be great to hear your thoughts!

Ultimate guide to Digital Marketing Awards

With so many agencies out there, it’s important to stand out from the crowd. Winning awards for your great work is one of the best ways to get noticed, whether by potential new clients or to attract new recruits to your team. But, with so many potential awards out there, it can be hard to decide which ones best suit your needs.

So, being the thoughtful folks that we are, we’ve done the hard work for you. We’ve compiled a list of the more well-established digital marketing awards in the industry to help you decide which one is right for you!

And if that’s not enough, we’ve created a visual timeline of the best digital marketing awards to enter, for you to print out, stick on your fridge or tape to your desk; your choice. It’s also available as a PDF.

Ultimate Guide to Digital Marketing Awards

The Marketing Awards
London
Great for! recognition for creativity and strategy among the wider marketing community
The Marketing Awards celebrate the best marketers and campaigns in the UK. Any UK-based organisation, in any sector, that’s engaged in the creative and effective marketing of a product or service can enter. There is almost certainly a category for everyone at these awards.
Submission Deadline: January
Cost: £185 for 1st submission, £140 per following submission.
Ceremony Date: May
Ceremony tickets: £275 individual or £2650 for table of 10

Best Business Awards
Manchester
Great for! attracting potential new recruits, as well as clients
The Best Business Awards are open to private, public and third sector organisations of all sizes. This accolade will say a lot about the quality of your organisation and the strength of your management team.
Submission Deadline: There are four rounds each year with quarterly deadlines in January, April, July and October.
Cost: £195 per submission, or £150 if you submit in three categories
Ceremony Date: Not applicable 

Digital Trading Awards
London
Great for! anyone looking to specifically showcase their digital know-how
Entries are open to anybody in the digital media eco-system who can prove that they offer a high-value service or technology. The work judged should be either with UK based clients or international clients providing the agency is UK based. Agencies must have had a UK presence for the last 6 months.
Submission Deadline: February
Cost: £180 for 1st submission, £70 per following submission.
Ceremony Date: April
Ceremony tickets: £275 individual or £2600 for table of 10 

Performance Marketing Awards
London
Great for! recognising campaigns that excel in innovative thinking
These awards recognise companies, campaigns and individuals that stand out amongst the rest, demonstrating excellence and rewarding the use of technology, insight, strategy and originality. Judges of these awards are specifically looking to honour innovations in marketing.
Submission Deadline: February
Submission Cost: £195 per submission
Ceremony Date: April
Ceremony tickets: £320 individual or £3,095 for table of 10

Recommended Agencies Register
London
Great for! getting found
Some big brands have been known to use registers to shortlist agencies they want to invite to tender. Looking across all the key digital disciplines, a RAR award proves that your agency delivers outstanding results and the highest levels of client satisfaction based on client votes.
Submission Deadline: February
Cost: Free (although there are costs involved in becoming a RAR member)
Report Published: April

The European Search Awards
Germany
Great for! organisations that have executed digital campaigns across Europe
This is an international competition that celebrates the very best in SEO, PPC, Digital and Content Marketing across Europe. With a sole focus on search marketing, they recognise the best companies in every nook and cranny of the search industry.
Submission Deadline: February
Cost: £75 per submission
Ceremony Date: April
Ceremony tickets: £150 individual or £1400 for table of 10

MOMA Awards
London
Great for! companies looking to highlight their work on mobile campaigns
The MOMAs (Marketing on Mobile Awards) identify the great work being produced on mobile and reward those who are delivering effective and creative strategies and campaigns.
Submission Deadline: February
Cost: £170 for 1st submission, £75 per following submission.
Ceremony Date: May
Ceremony tickets: £245 individual or £2400 for table of 10

Digital Awards
London
Great for! recognising the application of technology in the marketing world
These awards are open to everyone, whether you’re agency or in-house, a big brand or a small organisation. There are also categories specifically for those with small budgets, not-for-profits and B2B campaigns, so everyone has an opportunity to shine in their own particular area of expertise at The Digital Awards.
Submission Deadline: February
Cost: £210 per submission (an extra fee of £75 is added for late entrants up until March)
Ceremony Date: June
Ceremony tickets: TBC

Search Awards
London
Great for! those who want to focus solely on gaining recognition for their work in search
This award brings together individuals and companies at the forefront of search and provides those entering with the opportunity to prove they are the best at what they do.
Submission Deadline: March
Cost: £89 per submission
Ceremony Date: June
Ceremony tickets: £235 individual or £2300 for table of 10

The Big Chip Awards
Manchester
Great for! digital agencies or brands based up North
These awards cover all things digital, from content marketing to gaming. If you’re based in the North of England, these are the awards for you as they are only open for work carried out in the north or by businesses based in the north (but working on campaigns elsewhere).
Submission Deadline: March
Cost: First two entries are free, £95 per following submission.
Ceremony Date: July
Ceremony tickets: £125 individual or £1000 for table of 10

DADI Awards
London
Great for! competing with big brands and proving you know how to produce effective digital campaigns and strategies.
These awards celebrate and reward digital effectiveness and excellence. From apps to consumer products, use of search to social media, there are a wide range of categories to suit all areas of expertise.
Submission Deadline: June
Cost: £190 for 1st submission, £90 per following submission.
Ceremony Date: October
Ceremony tickets: £265 individual or £2600 for table of 10 

The Digital Impact Awards
London
Great for! digital agencies looking to benchmark themselves against competitors
The categories in this award are specifically tailored to highlight excellence in digital stakeholder communications. Now in its sixth year, the awards provide a benchmark for companies in choosing agencies able to provide creative or strategic advice.
Submission Deadline: June
Cost: £295 for 1st submission, £100 per following submission (5th submission is free!)
Ceremony Date: October
Ceremony tickets: TBC

The Digital Census
London
Great for! ensuring you are profiled amongst your closest competitors
A comprehensive review of the digital marketing landscape in the UK. The Digital Census comprises three polls: financial, client and peer. Agencies that appear in all three polls, ranking consistently well for financial performance, client satisfaction and peer ratings, achieve elite status. You must be a RAR recommended agency at the time of publication in order to appear in the client polls.
Submission Deadline: July
Cost: Free
Report Published: September

UK Agency Award
London
Great for! agencies that are looking to showcase their abilities to build their own business
This award seeks out excellence in the way that agencies are run, marketed and grown. The awards are open to all creative, design, digital, marketing, advertising, media and public relations agencies that are based in the UK.
Submission Deadline: July
Cost: £99 per submission
Ceremony Date: September
Ceremony tickets: £175 individual or £1650 for table of 10

The Sunday Times Hiscox Tech Track 100
London
Great for! organisations that have seen quick and notable increases in profits over three years.
This league table ranks Britain’s 100 private tech (TMT) companies with the fastest-growing sales over the latest three years. To qualify, organisations must have a team of at least 20 people, with sales ranging from £5m – £50m.
Nomination Deadline: July
Cost: Free
Report Published: September

International Content Marketing Awards
London
Great for! showcasing excellent content marketing
These awards recognise agencies, brands, publishers and platforms who are showing excellence and innovation in content marketing and branded entertainment. Whatever the channel, so long as it is content produced for a brand, you can enter it and it stands an equal chance of winning.
Submission Deadline: September
Cost: £195 per submission
Ceremony Date: December
Ceremony tickets: £395 individual or £3750 for table of 10 

Masters of Marketing Awards
London
Great for! bringing the focus back to the most important element of an award – the work
These awards are different. With 46 categories to choose from, that’s a long time for anyone to sit politely listening out for winners. So instead of announcing them all in one go while you tuck into a chicken dinner, there’ll be mini pop-up ceremonies peppered over the course of two days in The Masters Gallery at the Festival of Marketing.
Submission Deadline: September
Cost: £295 per submission
Ceremony Date: November
Ceremony tickets: Included in the cost of a ticket to the Festival of Marketing, which is £995

The UK Search Awards
London
Great for! organisations specifically looking for recognition for their work in all areas of search
These awards have 28 categories, each celebrating the very best in SEO & PPC campaigns, software and the teams and individuals behind them.
Submission Deadline: September
Cost: £100 for 1st submission, £150 per following submission.
Ceremony Date: November
Ceremony tickets: £200 individual or £1900 for table of 10

Deloitte Technology Fast 50
London
Great for! organisations in tech that have seen financial success over the last 4 years
The Fast 50 is a ranking of the UK’s 50 fastest-growing technology companies, driven by intellectual property and based on revenue growth over the last four years. To qualify, an organisation’s products or services must be technology-intensive or use unique technology to solve problems.
Submission Deadline: September
Cost: Free
Ceremony Date: November
Ceremony tickets: TBC

Growing Business Awards
London
Great for! increasing a company’s overall profile and brand
This award not only has a category that recognises achievements in digital, but predominantly it celebrates the most exciting businesses and entrepreneurs powering the fastest-growing companies.
Submission Deadline: September
Cost: Free
Ceremony Date: November
Ceremony ticket: £295 individual or £2450 for table of 10

Good luck with your submissions!

SEO Metric Ranking Correlation Study

When we first started link building, we’d talk about the raw number of links. Most would like to think those days are gone and that, since Penguin, quality matters more than quantity. As a result, many have moved to talking about the number of links above a certain metric, typically Domain Authority or TrustFlow. Yet this is only useful, beyond a raw count of links, if you believe the metric tells you something meaningful and comparative about those links.

Almost every day we get asked how many links a campaign has generated, and when we enter campaigns for awards we know the raw number of links is key. The lack of adoption of these link metrics as the true arbiter of a link’s value is, we think, evidence that they might not be very indicative of quality at all.

The Problem with Most Metrics

If you’re a tool provider, chances are you have your own metric: Moz has Domain Authority, Majestic has TrustFlow, Ahrefs has Domain Rating, and so on. Yet, as a tool provider, you have significant limitations, including:

  1. You’re limited to your own database(s), as incorporating a competitor’s implies yours isn’t up to the task.
  2. Your databases are mostly link databases, so you’ll need to heavily weight how you measure quality towards links.
  3. Your audience is usually geographically centred around a handful of countries, so you’re incentivised to use those countries’ data for testing and make the metric as accurate as possible there. To be fair, anyone who’s played with Google outside an English-speaking country will know this bias isn’t limited to tool providers.

In a world where we know Google takes over 200 different factors into account when assessing rankings, and tool providers typically consider less than a handful, we don’t think it should be surprising that the metrics don’t correlate well with rankings. So, we decided to find out.

Earlier this year SEOMonitor was kind enough to send us through over 450k SERPs, containing 4.5 million ranking positions worth of data. This data was UK rankings for commercial terms with at least 1k searches a month in the UK. We only gave them a vague understanding of what we’d use the data for so that no funny business could occur (once again, thanks SEOMonitor team for putting up with our seemingly random requests).

For the purpose of this first study we randomly selected 1k of those 450k SERPs and, for each page ranking, we collected:

  1. Majestic Domain-Level TrustFlow
  2. Majestic Domain-Level CitationFlow
  3. Majestic Page-Level TrustFlow
  4. Majestic Page-Level CitationFlow
  5. Moz Page Authority
  6. Moz Domain Authority
  7. Moz MozRank
  8. Moz Subdomain-level MozRank
  9. AHrefs Domain Rating

We then combined and averaged each metric for each ranking position to produce the graph below:

[Graph: average SEO metric score by ranking position]

If we’re honest, we were surprised with quite how well each correlated. There is a clear pattern of sites ranking better receiving higher scores from each of the metrics – with MozRank (Subdomain) and CitationFlow just coming out top. Here are the correlation scores:

[Table: correlation scores for each metric]
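For anyone who wants to run a similar check on their own data, a rank correlation such as Spearman’s fits in a few lines of Python. The metric values below are invented for illustration, not our study data:

```python
def spearman(xs, ys):
    """Spearman rank correlation for two lists without tied values."""
    def ranks(vals):
        # Rank 1 = highest value
        order = sorted(range(len(vals)), key=lambda i: vals[i], reverse=True)
        r = [0] * len(vals)
        for position, idx in enumerate(order, start=1):
            r[idx] = position
        return r
    n = len(xs)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(xs), ranks(ys)))
    return 1 - (6 * d_squared) / (n * (n ** 2 - 1))

# Hypothetical average metric score for positions 1 (best) to 5:
positions = [5, 4, 3, 2, 1]        # 5 = strongest position signal
avg_metric = [70, 65, 66, 50, 40]  # made-up averages per position
print(round(spearman(positions, avg_metric), 2))  # -> 0.9
```

A score near 1 means the metric orders sites much as Google does; near 0 means it tells you little.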

Yet this is a pretty easy test – we’d likely get the same results if we looked at average:

  1. readership
  2. number of URLs
  3. direct traffic
  4. any number of factors that clearly aren’t ranking factors and naturally increase as a site’s importance increases

That doesn’t mean that any of these are a good indication of what order sites might rank in and, as a result, their ability to predict future ranking success. As a result, we asked a harder question – what percentage of ranking positions could each metric accurately predict? The results, it turned out, were not encouraging:

[Graph: percentage of ranking positions each metric predicted correctly]

We found that:

  1. The majority of metrics struggled to predict more than 15% of the ranking positions. To put it another way, if you looked at a random SERP, the majority of the time each individual metric was unlikely to guess the right position for more than one of the ten results.
  2. What’s not shown in the data is that, when they did get one right, it was typically the number one position, in cases where the top result was so dominant that it was incredibly obvious it deserved to be first.
  3. Surprisingly, given Moz’s comparatively small index, Page Authority predicted the most ranking positions, whilst MozRank (despite its name) predicted the fewest.

Yet there’s something weird going on here: MozRank wins one test and comes dead last in another? The answer, it turns out, lies in what happens when MozRank gets it wrong. Imagine a site ranking in position 1; TrustFlow predicts it should rank in position 2, whilst MozRank predicts position 10. Both are wrong, so if you judge purely on how many results a metric gets right, the two are equal, but it matters that, when a metric gets it wrong, it’s wrong by as little as possible. It turns out that when MozRank gets it wrong, it gets it far more wrong than most other metrics:

[Graph: how far each metric’s predicted order was from the actual ranking order]
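Both measures (how often a metric predicts a position exactly, and how far off it is when it misses) can be sketched in a few lines of Python; the SERP and scores below are invented for illustration:

```python
def prediction_stats(actual_order, metric_scores):
    """Compare a SERP's actual order with the order a metric predicts.

    actual_order: page ids in ranking order (position 1 first).
    metric_scores: page id -> metric score (higher = better).
    Returns (fraction of positions predicted exactly,
             mean absolute error in positions)."""
    predicted = sorted(actual_order,
                       key=lambda page: metric_scores[page], reverse=True)
    n = len(actual_order)
    exact = sum(1 for a, p in zip(actual_order, predicted) if a == p)
    predicted_pos = {page: i for i, page in enumerate(predicted)}
    error = sum(abs(i - predicted_pos[page])
                for i, page in enumerate(actual_order)) / n
    return exact / n, error

actual = ["a", "b", "c", "d"]                  # a ranks 1st, d ranks 4th
scores = {"a": 60, "b": 40, "c": 55, "d": 10}  # the metric swaps b and c
print(prediction_stats(actual, scores))  # -> (0.5, 0.5)
```

Two metrics can tie on the first number while differing wildly on the second, which is exactly the MozRank situation above.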

So the trite answer might be ‘use and assess a range of metrics’. <sigh> Or, if we’re being cynical: ‘all SEO metrics are equally bad at assessing a link’s potential value, they’re just bad in different ways’.

This is inevitable and, we think, only going to get worse given the increasing complexity of what ranks where and why.

What Makes a Good Link

Beyond the number of links and authority that a site has there are a few things that, as humans, we naturally take into account including:

  1. Where on the page the link is – is it in content or in the sidebar or footer.
  2. The language of the linking page to the page it’s linking to – why is an Estonian site linking to your English product page, aimed at people in the UK, and why should I trust their opinion on buying products in the UK?
  3. The relevance of the linking page to the page it’s linking to – everyone would much prefer a link in an article all about what they do than in one about something completely unrelated.
  4. How Google’s treating the site you’re getting a link from – if you know that a site’s just got a penalty, would you want a link from it?

Each of these is obviously important from a human perspective, but none is taken into account, at all, by the tool providers' metrics. That's why those metrics can't work out the rankings well.

What can you do about it?

Be smart on what you’re reporting on. You might want to consider reporting on:

  1. Social shares
  2. Coverage views
  3. Rankings and (YoY) Traffic/Revenue changes

We got so frustrated with this that, years ago, we built LinkScore. Taking into account more than a dozen off- and on-page metrics, it provides a score out of 500 for how likely a link is to improve the ranking of the linked page in the target country. This is also handy in terms of how you set your outreach team up for success: if all you want from them is high-DA sites, don't be surprised to get comment spam and other 'tactics' to achieve those KPIs. Their KPIs need to follow your KPIs as an agency, which need to support your clients' KPIs.
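LinkScore's actual inputs and weights aren't public, so purely as a hypothetical sketch of the idea – blending several normalised signals, rather than leaning on a single authority number, into one score out of 500 – it might look something like:

```python
# Hypothetical weights -- NOT LinkScore's real inputs, just the shape of
# the idea: several normalised (0-1) signals blended into a 0-500 score.
WEIGHTS = {
    "authority": 0.35,          # third-party authority metric, normalised
    "in_content": 0.25,         # 1.0 if the link sits in the main content
    "topical_relevance": 0.25,  # how related the linking page's topic is
    "language_match": 0.15,     # linking page matches the target's language
}

def link_score(signals):
    """Blend 0-1 signal values into a single score out of 500."""
    total = sum(WEIGHTS[name] * signals.get(name, 0.0) for name in WEIGHTS)
    return round(total * 500)

print(link_score({"authority": 0.8, "in_content": 1.0,
                  "topical_relevance": 1.0, "language_match": 1.0}))
```

The point of a blended score for KPI-setting is that an outreach team can't game it with a single metric: a high-authority footer link in the wrong language still scores poorly.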

Ultimately, we’re not saying don’t use tool provider metrics – they’re a quick and dirty indication – but be aware of their limitations and plan towards the things that actually make a difference to your clients.

Comparing Screaming Frog Crawl Files

Handing over technical recommendations often comes with some trepidation; how long might it take for them to be implemented, will they be implemented at all and, if so, will they be implemented correctly? That’s why understanding how development cycles occur, how items are prioritised and who you need to get onside is as key to successful technical SEO as the recommendations themselves. However well you understand those, though, changes are often implemented without any feedback that they’re now complete.

It's for that reason that tools like ContentKing have sprung up; to keep an eye on the site and alert you of changes. It's not always feasible to run SaaS crawlers on a site, though, so many of us rely on running crawls with Screaming Frog's crawler. Comparing crawl files can be a pain. Usually, you'll end up dumping the data into Excel and running a bunch of VLOOKUP or INDEX/MATCH functions, only to find that no, the developer hasn't implemented the changes.
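If you'd rather avoid the Excel step, the same comparison can be sketched in a few lines of pandas (the column names below are assumptions – adjust them to match whichever Screaming Frog export you're comparing):

```python
import pandas as pd

# Two crawl exports, before and after the dev work (toy data).
before = pd.DataFrame({
    "Address": ["https://example.com/", "https://example.com/a"],
    "Title 1": ["Home", "Old title"],
})
after = pd.DataFrame({
    "Address": ["https://example.com/", "https://example.com/a"],
    "Title 1": ["Home", "New title"],
})

# Join the two crawls on URL, then keep only rows where the title changed.
merged = before.merge(after, on="Address", suffixes=("_before", "_after"))
changed = merged[merged["Title 1_before"] != merged["Title 1_after"]]
print(changed["Address"].tolist())
```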

Meanwhile, you’ll occasionally want to compare crawl files of different sites to:

  1. Compare a dev environment with a staging environment
  2. Make sure content has been ported to a new site correctly
  3. Run technical SEO competitive analysis/comparisons – we wrote about this recently here.

This has always been a pain, which is why, for a while now, we’ve had a tool that quickly compares crawl_overview files for us. Today, we’re making it available for free.

It's a simple Python script. If you don't have Python installed, you can read a guide for Windows here and for macOS here (you'll need Python 2, rather than 3, for the script to work – though feel free to install both using virtual environments if you're really keen on 3). The script itself is here:

import csv
import sys

from tqdm import tqdm


class color:
   PURPLE = '\033[95m'
   CYAN = '\033[96m'
   DARKCYAN = '\033[36m'
   BLUE = '\033[94m'
   GREEN = '\033[92m'
   YELLOW = '\033[93m'
   RED = '\033[91m'
   BOLD = '\033[1m'
   UNDERLINE = '\033[4m'
   END = '\033[0m'


def main(argv):
	if len(argv) != 4:
		print 'Usage: programname.py crawl_overview1.csv crawl_overview2.csv output.csv'
		sys.exit()

	headerrows = 5
	endline = 191

	fileone = get_csv(argv[1])
	filetwo = get_csv(argv[2])

	fileone = fileone[0:endline]
	filetwo = filetwo[0:endline]

	fileonesite = fileone[1][1]
	filetwosite = filetwo[1][1]

	fileone = fileone[headerrows:]
	filetwo = filetwo[headerrows:]

	fileonedata = []
	filetwodata = []
	combineddata = []
	firstcolumn = []

	firstcolumn.extend(get_column(fileone,0))
	fileonedata.extend(get_column(fileone,1))
	filetwodata.extend(get_column(filetwo,1))
	combineddata.extend(zip(firstcolumn,fileonedata,filetwodata))


	outFile = csv.writer(open(argv[3], 'w'))
	outFile.writerow(["",fileonesite,filetwosite])
	for i in tqdm(combineddata):
		outFile.writerow(i)

	if fileonedata == filetwodata:
		print (color.BOLD + color.RED + "Crawl files are identical" + color.END)
	else:
		print (color.BOLD + color.GREEN + "Crawl files are NOT identical" + color.END)

def get_csv(thefile):
	datafile = open(thefile, 'r')
	datareader = csv.reader(datafile, delimiter=",")
	data=[]
	for row in tqdm(datareader):
		data.append(row)
	datafile.close()
	return data

def get_column(thelist,thecolumn):
	newlist =[]
	for row in tqdm(thelist):
		if len(row) >= thecolumn +1:
			newlist.append(row[thecolumn])
		else:
			newlist.append("")
	return newlist

if __name__ == '__main__':
  main(sys.argv)

The only thing you might need to pip install is tqdm – which, if you're not already using it, we heartily recommend: it creates the nice little loading bars. If you're new to Python and the script errors when you run it, mentioning tqdm, simply type:

pip install tqdm (on windows)

sudo pip install tqdm (on Mac)

You’ll only ever need to do that once.

Save it in a folder, navigate to that folder using Command Prompt or Terminal and then run it the same way you'd run any Python script (typically 'python <nameoffile.py>'). It takes three inputs:

  1. The name of the first crawl_overview file
  2. The name of the second crawl_overview file
  3. The name of file you’d like to save the output as – it should be a csv, but doesn’t need to already exist

Both files should be in the same folder as the Python script and so a valid input would look something like this:

Python crawl_comparison.py crawl_overview1.csv crawl_overview2.csv output.csv

Compare Screaming Frog Crawl Files

The script's pretty fast – it'll chew through the files within seconds and then report either 'Crawl files are identical' or 'Crawl files are NOT identical'. It will also have saved a CSV, under whatever name you gave as the third input, in the same directory, comparing both crawl files – ready for you to:

  1. Send onwards as proof as to whether recommendations have or haven’t been implemented; or
  2. Create industry comparison graphs to show how the sites compare; or
  3. Do with as you please.

comparison-output-csv

Future Updates

Now that the script is publicly available there are a few changes we plan to make to it. These include:

  1. Creating a front-end and installer for those who don’t like to mess around with Python
  2. Allowing for the comparison of multiple crawl_overview files at once
  3. Allowing for the comparison of other Screaming Frog outputs – not just crawl_overview files.

We’d love your feedback as to what features you’d like to see added.

How to Crawl An Entire Industry Faster Than Most Crawl A Single Site

We’ve previously talked about running Screaming Frog’s Crawler on Google’s Compute Cloud. Now, I want to share how we took this to the next level – how you can use this to automatically crawl as many websites as you want simultaneously.

Running Screaming Frog In the Cloud

As a quick reminder, the advantages of running Screaming Frog in the cloud include:

  1. As it’s not running locally, it doesn’t slow your computer down so you’re free to get on with other work whilst it’s running.
  2. For the same reason, it doesn’t slow your internet connection down. Something, I’m sure, your colleagues will thank you for.
  3. It’s fast. As it uses the internet connection attached to Google’s Compute Cloud it can potentially crawl much faster than your office internet connection. We’ve seen speeds of over 1.5k URLs per second.
  4. You can run it on a computer with (practically) any amount of RAM, hard-disk space and processors you want allowing you to scale to crawl the largest websites.
  5. It’s cheap; a single Virtual Machine with 30GB of RAM will cost you less than $0.30 an hour to run and you only run it when you need it.

Why Command A Screaming Army?

Technical SEO has become increasingly complex. As a result, we rely on our technical SEOs to have an increasingly deep understanding of the subject to be able to find insights. Perhaps more importantly, though, these need to be communicated to people who are often not SEO-savvy in a way that’s compelling; that’ll inspire action. What almost all businesses care about is their competition. As a result, it’s very common, when discussing links, to compare companies to their competition.

We do this to give meaning to numbers:

DTOX Risk Comparison

Telling someone that their website has a DTOXRisk of 156 is likely meaningless; telling them that their link portfolio has 'less than half the industry-average risk of a penalty' is immediately accessible.

We provide industry comparisons to show the depth of the problem or opportunity:

anchor-text comparisons

Here, instead of saying that 15% of your anchor text uses commercial terms, we might comment that the analysed site has 4x the commercial anchor text of the industry average – and that they'd need to increase the size of their link portfolio by 11%, with purely branded anchor text, just to get back to that average.
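The arithmetic behind that kind of statement is simple to reproduce (illustrative numbers below, not the ones behind the chart above):

```python
def branded_links_needed(total_links, commercial_share, target_share):
    """Branded links to add so the commercial share falls to the target."""
    commercial = total_links * commercial_share
    # Solve commercial / (total_links + x) = target_share for x.
    return commercial / target_share - total_links

# A portfolio of 1,000 links, 15% commercial anchors, industry average 12%:
extra = branded_links_needed(1000, 0.15, 0.12)
print(round(extra))
```

Here 250 extra branded links – a 25% increase in portfolio size – would be needed just to dilute the commercial share back to the average.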

As almost every company has that one competitor that it really really hates, when presenting to the C-suite, we find that comparing directly with that one competitor can often yield the fastest results:

link- competitor-comparison

Something strange happens when we start to discuss technical SEO, though. We start showing relatively complex formulas to explain how fixing a canonical issue, for example, might influence total revenue generated. We ditch the competitor comparisons and don’t show graphs like these:

technical-seo-comparison

If we’re honest, the reason we don’t create graphs like these is not that they’re ineffective, but because you’d have to crawl an entire industry. Crawling an entire industry would either take a prohibitive amount of time, if using Screaming Frog, or be expensive if using a SaaS crawler like Deepcrawl.

What if, instead, you could run multiple machines at the same time, each running its own copy of Screaming Frog? That way, you could simultaneously crawl every site in an industry. Typically, this isn’t an option because they’d be fighting with each other for network bandwidth and putting in an order to your boss for 10 laptops is unlikely to get the green light. If you use Google Compute Cloud, though, it suddenly becomes possible.

In the next section, I'm going to explain how to set up a system in which you feed in a series of websites, run some scripts and, in the background, it initiates multiple virtual machines, each running Screaming Frog Crawler and each allocated a website from the list you gave it to start crawling. This makes it possible to crawl an entire industry faster than you could typically crawl one website using just the computer you're reading this article on.

Side-note: did you know the collective noun for a group of frogs is an army of frogs? I didn’t…

Pre-requisites: you'll need three things before you start:

  1. A Google account
  2. A debit or credit card that Google can charge for use of their servers (you'll be able to use them for free for a while, but you have to enter card details immediately in any case)
  3. One Screaming Frog license for each machine you intend to run simultaneously – if you intend to crawl 10 websites at the same time, you need at least 10 licenses. Screaming Frog is inexpensive – support its continued development.

Step 1: Creating A Virtual Machine

You start by creating a single virtual machine on Google Compute Cloud that’s running everything you’ll need to crawl a single site. I’ve written up how to do this here so pop over to that article, get set up and then return here for the next step.

NOTE: Whilst you’re getting set up note down your username shown in the terminal (the name before the @ on each line) – you’ll need it later on and now is the most convenient time to note it down.

Step 2: Connecting Your VM to the Outside World

Welcome back. You should now have a virtual machine, running Screaming Frog Crawler and Chrome. Now, we need to create a way to automagically control that Virtual Machine. Luckily, it’s pretty simple:

  1. VNC into the Virtual Machine
  2. Open Google Chrome (it’ll be under ‘Applications’ ‘Internet’ and will have been installed via one of the scripts you ran previously).
  3. Load up this post in Google Chrome and download screaming-frog-auto-start.sh by clicking here. Save it to your virtual machine’s desktop.
  4. Open ‘Applications’ ‘Settings’ ‘Session and Startup’ and click on the ‘Application Autostart’ tab.
  5. Click ‘Add’, then the folder icon, choosing the ‘/’ in the left-hand box
  6. Browse to ‘usr’, then ‘bin’ and select ‘google-chrome’ and press OK
  7. Name it ‘Chrome’ and then click OK. (you’ve just set Google Chrome to auto-start as you’ll almost certainly open it up every time to save the output of Screaming Frog in any case)
  8. Click ‘Add’ again, then the folder icon, choosing ‘Desktop’ this time and selecting the script you previously downloaded.
  9. Click OK, name it anything you like (I went with ‘screaming-start’) and click OK again.
  10. Then click ‘Close’ and you’re done.

With these steps you've set Linux to boot Chrome on startup and to run a script that pulls the metadata set for that machine (we'll set it in another script) containing the URL of the site to crawl, then starts Screaming Frog with an instruction to crawl that site.
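We haven't reproduced the downloadable shell script here, but the idea it implements can be sketched in Python: read the site to crawl from the VM's instance metadata (the `crawl-url` attribute name is our assumption – use whatever key your launch script actually sets), then hand it to Screaming Frog's command-line interface (available from version 10 onwards):

```python
import urllib.request

# GCE exposes per-instance metadata over this well-known endpoint; the
# 'crawl-url' attribute name is an assumption, set by the launch script.
METADATA_URL = ("http://metadata.google.internal/computeMetadata/v1/"
                "instance/attributes/crawl-url")

def get_crawl_url():
    """Read the site this VM should crawl from its instance metadata."""
    req = urllib.request.Request(METADATA_URL,
                                 headers={"Metadata-Flavor": "Google"})
    return urllib.request.urlopen(req).read().decode().strip()

def build_command(url):
    """Command line to launch Screaming Frog against the given site."""
    return ["screamingfrogseospider", "--crawl", url]

# On the VM itself you would then run something like:
#     import subprocess
#     subprocess.Popen(build_command(get_crawl_url()))
```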

Step 3: Set Up Screaming Frog

Currently, Screaming Frog has all the default options and doesn't even have a license key entered. You'll find it under 'Applications' 'Internet'. Load it up, enter a license key and set the settings up how you like them. As the internet connection is so good – and you don't have to worry about slowing it down for your colleagues – I typically set it to crawl with a maximum of 100 threads. Be wary of the type of sites you're crawling, though: that's enough to take down many smaller sites, which is not what you're trying to achieve! When you have the settings how you like them, close Screaming Frog and close the virtual machine window.

Pop into Google Cloud Console and stop the instance, so you’re not charged for it doing anything else.

Step 4: Set Up the Virtual Machine as a Snapshot

Your virtual machine is all set up, but now we have to make it easily reproducible. We do this by creating a snapshot. Snapshots are also compressed, and so are cheaper to store than virtual machines themselves. Here's how:

  1. Log in to Google Compute Cloud Console and, from the left-hand menu, select ‘Snapshots’.
  2. Click ‘Create Snapshot’
  3. Name it ‘screaming-snapshot’ and then select whatever you called the virtual machine you’ve been working from thus far from the ‘Source disk’ menu.
  4. Click ‘Create’
  5. You can now click back into ‘VM Instances’ and delete your virtual machine – you’ve backed it up in the previous step.

Step 5: Setting Up Python

The script that automates everything for you is written in a programming language called Python. It's great; I'm terrible at it. Feel free to look at my rag-tag pieces of code if you'd like a comprehensive lesson in how not to implement best practice and, generally, want to amuse yourself.

If you’ve not used Python before on the computer you’re on then follow these guides to get set up:

NOTE: The guides above will take you through installing Python 2.7 rather than the latest version, Python 3. For historical reasons we use Python 2.7 – I'm sure that, with a few changes, you could get the script working in Python 3 too. If you don't know what the difference between 2.7 and 3 is, please ignore this segue entirely.

Step 6: Download and Edit the Scripts

You now have a virtual machine template that, when booted, will open VNC Server on a port you tell it, open Screaming Frog and begin crawling a site of your choice. Now, we need to create a script that automatically creates copies of that virtual machine template, boots them and provides them with the right details to get on with their work.

  1. Create a folder on your computer where the scripts can live
  2. Create a text file called 'sites-to-crawl.txt'. The text file should contain the absolute URLs of the sites you want to crawl, with each site on a new line.
  3. Next, we’ll be saving and editing the Python code that pulls everything together. Download our template files here, and here, saving them in the same directory you saved the sites-to-crawl.txt file.
  4. Once downloaded, open the files in your favourite editor (I like Sublime, our devs like Visual Studio –  though you could just use TextEdit if you don’t want to waste time installing another thing).
  5. Search the file for [[[ ]]] sections. These should be replaced (including the brackets) with inputs that are specific to your setup. We’ve explained within the brackets of each what’s required.
  6. Now download terminal.py from here (more info on this awesomely useful script here) and save it in the same directory.

Step 7: Setting Up VNC Viewer

In the guide to setting up your original virtual machine, you will have set up VNC Viewer to connect to one virtual machine at 127.0.0.1:5901. You now need to add connections for as many virtual machines as you think you might create.

  1. Open VNC Viewer
  2. Right-click on your existing VNC connection and choose ‘Duplicate’.
  3. Repeat the step above until you have as many as the number of sites you think you might want to crawl simultaneously
  4. Now right click on each new connection that you created and choose ‘Properties’.
  5. Change the port number (the part of the text box next to 'VNC Server' that has a four-digit number) to match the order of the virtual machine it is – e.g. your original connection should be 5901, your next should be 5902, the next 5903, etc.
  6. Change the name given to that connection to whatever you want – I’m boring and so use ‘1’,’2′,’3′ etc, but feel free to name them after dinosaurs, power rangers or wizarding houses if you really want to. I won’t judge you. Well, maybe a little.

Step 8: In Which the Magic Happens

You’re now all set to go!

  1. List all the sites you want to crawl in sites-to-crawl.txt, saving and closing the file.
  2. Open up a Terminal (or Command Prompt) window and change the folder to the folder you have the Python code in (using ‘cd’ and then the name of the next layer of directory)
  3. When in the correct folder, type ‘python scream.py sites-to-crawl.txt’
  4. The code will now begin to run – don’t worry about any errors about hard-drives being small as, in practice, it’s not a problem.
  5. After a couple of minutes (depending on how many websites you’re looking to crawl) it will have spun up each machine for you and they’ll already be crawling the sites.
  6. You can now type in your SSH password (or set one if this is your first time booting into that instance number) and then connect to them using the VNC Viewer connections we set up previously. You don't have to rush this last step – the VMs are already crawling the sites for you, so connect whenever you're ready.
  7. When you’re done with the VMs, click back to the main script window and press Enter – the script will automatically delete each of the automated instances for you. If you’re at all unsure if it’s managed to do so please do check Google’s cloud compute console – a virtual machine left running will cause an unexpected bill.

IMPORTANT NOTE: You need to buy one Screaming Frog license for each machine you intend to run simultaneously. That means, if you intend to crawl 10 websites at the same time, you need to have at least 10 licenses. Screaming Frog is inexpensive – support its continued development.

FAQs

I’m getting an error

Try re-running the main script. If you were part way through running the script when you got the error, running the script again will cause errors at every step leading up to where it previously failed – but don’t worry, these errors are just due to those steps having already completed – it should continue from where it left off.

How do I get the crawl files off the VMs?

That’s up to you, though I find it quickest to open up a Dropbox file request link in Chrome on the virtual machine and then save the files there – that way they quickly go into a shared folder in your existing filing (if you already use Dropbox).

Changes I make to automated instances aren’t saved.

Yes, that's by design. If you want changes to an automated instance to persist, you need to change the snapshot itself. An easy way to do this is to make your changes on a running instance, delete the existing snapshot and then save that instance as the new snapshot.

I get an error when I try to run more than X Virtual Machines

By default, Google limits you to 24 CPUs per region. If you’re bumping into this limit you have three choices:

  1. Decrease the number of CPUs per virtual machine so that your CPU quota stretches to more machines. Note, however, that for some reason network bandwidth is allocated on a per-CPU basis so the fewer CPUs you provision per VM the slower it’ll crawl.
  2. Edit the script so that it tracks how many CPUs it’s provisioned and changes zone when you’ve hit your limit. Note that not all zones provide access to exactly the same (virtual) hardware and that most are a little bit more expensive than the one chosen, which is why it was chosen in the first place.
  3. Request an increased quota for ‘Google Compute Engine API CPUs (all regions)’ on the Quotas page. This isn’t instant, but it also isn’t something you’ll be waiting months to hear back on.

This sounds really expensive?

As you’re only running the virtual machines when you need them, it actually works out really cheap. Each virtual machine, depending on how you spec it, costs around $0.30 per hour. That means if you’re crawling 10 sites simultaneously and the whole thing takes you 30 minutes, you’ll have only spent $1.50 (plus the cost of 10 Screaming Frog Crawler licenses). As each instance crawls surprisingly fast, you’ll find that entire industries can often be crawled for less than that.

Fixing Google’s Keyword Search Volume Aggregation

If you ask Google Keyword Planner for the search volume of 'cheap windows laptop' in the US, it'll tell you it's 100-1k – thanks for the help, Google! If, instead, you turn to the tool providers, you'll get answers somewhere in the range of 1.5k (Searchmetrics) to 2.9k (SEOMonitor). Yet what happens when you ask about variations of those keywords? You'll get this result:

keyword-variations

Clearly, they don’t actually each have the same search volume.

What’s going on?

Since 2012, Google has included, within exact-match search volumes, the search volume of misspelt and pluralised close variants of the keywords entered. In March 2017, this was expanded to include alternate orderings of those same keywords (e.g. 'cheap windows laptops' and 'windows laptop cheap' appearing to have the same volume). Google also ignores stop-words (words like 'as', 'in' and 'of') and understands abbreviations (that 'lol' is the same as 'laugh out loud', for example).

If you’re conducting keyword research and are putting together a list of 50 keywords this is pretty easy to solve by spotting and removing the duplication. When you’re working on a list of tens or even hundreds of thousands of keywords, though, this is practically impossible to do manually.

That means those keywords could each appear in your list showing a search volume of 2.9k – and when you add up the total addressable audience you end up with a figure in excess of 11k. Any forecast based on that data will skew too high, making what would otherwise be a reasonable forecast potentially unreachable. In tests, we've found this to affect anywhere between 0.5% and 10% of search volume, depending on where the original keyword list comes from. 10% is the difference between confidently beating a target and ending up below it.

Canonical Keywords

We fixed this through the concept of a 'canonical keyword': the simplest form a keyword can take, with all the words in alphabetical order. That means no pluralisation, no conjugation, no misspellings and no pesky word-order differences.

It turns out, this sounds a lot easier to implement than it is.

Removing pluralisation is hard because it’s not always a case of removing the ending ‘s’ – see, for example, woman/women, genius/geniuses and tooth/teeth.

There's no 'fix all' button in Excel for spelling mistakes and, whilst VBA scripts exist to reorder the words in a cell alphabetically, those scripts are unwieldy – frankly, at that stage you should be in Python or R in any case.
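To show the shape of the approach, here's a deliberately naive sketch – the `rstrip('s')` is a stand-in, since real de-pluralisation needs a lemmatiser precisely because of woman/women and tooth/teeth:

```python
# Naive canonicaliser: lowercase, drop stop-words, crudely singularise,
# then sort the remaining words alphabetically.
STOP_WORDS = {"a", "an", "as", "in", "of", "the", "for"}

def canonicalise(keyword):
    words = [w.rstrip("s") for w in keyword.lower().split()
             if w not in STOP_WORDS]
    return " ".join(sorted(words))

print(canonicalise("cheap windows laptops"))
print(canonicalise("windows laptop cheap"))
```

Both variants collapse to the same canonical form ('cheap laptop window'), which is what lets you spot the aggregated duplicates.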

The Keyword Cleaner

As a result, we built the Keyword Cleaner, which is available for free here.


Simply enter your keyword list and then click ‘clean’. After a moment (it processes roughly 3k keywords a minute depending on how many people are using it) it’ll give you the canonical version of each ready for you to export.

Next, take those values and add them into a column next to your original keywords in Excel. You'll then want to see how many times each canonical keyword appears in your keyword list where the search volume and landing page also match (this is to stop decreasing the search volume in cases of a false match). The formula will depend on how you've set up your table, though it should look roughly like this:

=COUNTIFS([Canonical],[@Canonical],[Search Volume],[@[Search Volume]],[URL],[@URL])

Next, you can simply divide the search volume for each keyword by the number of occurrences of that canonical keyword in the list (as computed above).
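If you'd rather skip Excel entirely, the COUNTIFS-and-divide step can be sketched in pandas (column names assumed to match the formula above):

```python
import pandas as pd

# Three orderings of the same keyword all report the aggregated 2.9k.
df = pd.DataFrame({
    "Keyword": ["cheap windows laptop", "cheap windows laptops",
                "windows laptop cheap", "gaming laptop"],
    "Canonical": ["cheap laptop window"] * 3 + ["gaming laptop"],
    "Search Volume": [2900, 2900, 2900, 1600],
    "URL": ["/laptops"] * 3 + ["/gaming"],
})

# COUNTIFS equivalent: count rows sharing canonical keyword, volume and URL...
occurrences = df.groupby(["Canonical", "Search Volume", "URL"])["Keyword"].transform("count")
# ...then share the aggregated volume out between them.
df["Deduped Volume"] = df["Search Volume"] / occurrences
print(df[["Keyword", "Deduped Volume"]])
```

The three duplicated variants now carry roughly 967 each, so the total addressable volume sums to 4.5k rather than the inflated 10.3k.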

Now, obviously, that search volume won’t be accurate on a per keyword basis – we know, for example, that misspellings get roughly 10% of the search volume of the correctly spelt variant. There are two things to remember though: 1) it’s still more accurate than the aggregated volume and 2) this is about getting an accurate forecast based on all the keywords and an accurate total search volume – this solution fixes for that.

In a future version, we’ll likely identify which canonical keywords were fixed misspellings so that you can reduce search volumes accordingly, but that’s for another time and another blog post. Have a play with the tool and leave some feedback below. We’d love to hear your thoughts.