
How to use data analysis for PR campaigns (and our favourite tools for it)

Data. The not-so-secret weapon of most businesses. It’s far more than just numbers in a spreadsheet – it has the power to reveal incredible stories and newsworthy content – but if you’re at the beginning of your data analysis journey, it can be daunting to know where to begin.

Fret not, though. Let us guide you through the best ways of unlocking the potential of your data.

There are so many ways to streamline and enhance your analysis to ensure your numbers are rock solid. With these processes, your PR campaigns can become storytelling masterpieces that journalists and readers can’t help but engage with.

From understanding what your campaign objectives are to having the best data tools for the job, it’s all about what you know and how you use it.

Here at Verve Search, data is the cornerstone of our digital PR work, creating original and exciting content that is bespoke to each brand we work with. Our award-winning campaigns are proof that good data and analysis are imperative if you want to make an impact. 

Here’s exactly how we collect, process, and analyse data for our digital PR campaigns.

 

Step 1: Campaign Objectives

Remember those objectives we just mentioned? They should be the first steps towards creating streamlined data campaigns.

But to achieve your campaign objectives, you must first work out what they are. Identifying clear and measurable goals helps focus and guide your analysis more effectively.

This involves asking questions like:

  • What problems are you trying to solve? – This depends on the idea as well as client input.
  • What metrics are best suited to the analysis? – This will depend on the angles you expect to pull out from the data and what you think will appeal to journalists the most.
  • What is the time frame of the project? – This will impact the scope of your campaign.

Once you’ve answered these questions, you’ll be well on your way to understanding what you need to do to get there.

 

Step 2: Data Collection

Credit: Choong Deng Xiang via Unsplash

Once you have laid out your campaign objectives, it’s time to bring them to life with data collection. In data-driven campaigns, there are countless data sources that can form the basis of an idea – and they often require different methods of collection.

Clients may offer up their internal data, or you may have to go scraping the web to create your dataset. It’s useful to know what criteria make a good-quality source, whether a source is suitable for your campaign, how you can use it to your advantage, and how long it will take to gather the data from it.

Looking for potential data sources that don’t require web scraping? Here are some examples:

Government Websites

Government websites like gov.uk or usa.gov hold extensive amounts of publicly available data across a broad range of topics and industries; all you need to do is hit download on whichever file you need.

For example, we leveraged government housing and land registry data to create our Forever Homes and Priced Out Property campaigns. This produced trustworthy, newsworthy content that generated hundreds of links while staying highly relevant to the brand.
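To give a feel for how little friction this involves, here’s a minimal sketch of loading a downloaded government CSV with Pandas – the file name and column names are hypothetical placeholders, not an official schema:

import pandas as pd

# Hypothetical example: a price paid CSV downloaded from a government portal.
# The file name and column names below are placeholders, not the official schema.
df = pd.read_csv("price_paid_2023.csv")

# Quick sanity checks before any analysis
print(df.shape)               # rows x columns
print(df.columns.tolist())    # confirm the columns you expect are present

# Example aggregation: average price by county
avg_by_county = df.groupby("county")["price"].mean().sort_values(ascending=False)
print(avg_by_county.head(10))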

APIs

Many websites offer an ‘API’ (Application Programming Interface) which gives access to their data in a clearly formatted and relatively accessible way.

For example, the Spotify API is free and all you need to do is sign up for a developer account to have access to the API keys that are required.

There are a few different ways to incorporate an API into your data collection, but in our case Python was the tool of choice. With it, we collected data for a music-focused campaign for a language tutoring client.

For them, we produced a lyric study that looked at the number of syllables in rappers’ songs. It relied heavily on Spotify’s API, which gave us access to a variety of metrics ranging from general song and artist information to scores for different elements of a track. In this case, “speechiness” was the most important metric for scoring how wordy each track was.
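To show what this looks like in practice, here’s a minimal sketch using the third-party spotipy client – the credentials, artist query and exact fields are illustrative placeholders rather than our actual campaign code:

import pandas as pd
import spotipy
from spotipy.oauth2 import SpotifyClientCredentials

# Placeholder credentials from a Spotify developer account
sp = spotipy.Spotify(auth_manager=SpotifyClientCredentials(
    client_id="YOUR_CLIENT_ID",
    client_secret="YOUR_CLIENT_SECRET",
))

# Search for an artist's tracks, then pull audio features for each one
results = sp.search(q="artist:Example Rapper", type="track", limit=20)
tracks = results["tracks"]["items"]
track_ids = [t["id"] for t in tracks]
names = {t["id"]: t["name"] for t in tracks}

features = sp.audio_features(track_ids)  # each item includes 'speechiness', 'tempo', etc.
df = pd.DataFrame([f for f in features if f is not None])
df["track_name"] = df["id"].map(names)

# Rank tracks by how wordy they are
print(df.sort_values("speechiness", ascending=False)[["track_name", "speechiness"]])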

Freedom of Information Requests

Why not request information from a public authority? Freedom of Information (FOI) requests are a great way to gather credible data by leveraging your right to information. You have the right to ask for recorded information held by public authorities including government departments, the local council, and the police and fire services. All this information is free to access.

Keep in mind that if you’re contacting departments from all over the country, you will need to carefully craft your request so that the data you receive is in a consistent format. This makes it much easier when it comes to analysis.
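When the responses do come back in a consistent format, combining them is straightforward. Here’s a minimal sketch, assuming a hypothetical folder of FOI response CSVs that all share the same columns:

from pathlib import Path
import pandas as pd

# Hypothetical folder of FOI responses, one CSV per authority,
# each returned in the same agreed format (same columns, same units)
responses = []
for path in Path("foi_responses").glob("*.csv"):
    df = pd.read_csv(path)
    df["authority"] = path.stem  # record which authority each row came from
    responses.append(df)

combined = pd.concat(responses, ignore_index=True)
print(combined.groupby("authority").size())  # quick check of rows per authority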

Surveys

If you want truly original data, there is no better collection method than an organic survey. Carefully crafted questions will yield useful results and easily analysable data.

We leveraged first-hand survey data to create a campaign focused on female health for a fem-tech brand, analysing responses from around 10,000 users. The results yielded many headline-worthy statistics, and the real-life opinions and answers were fundamental to the campaign.

Just keep in mind that collecting first-hand data like this can often be expensive, so make sure to bake that cost into your campaign’s budget.

 

Step 3: Data Processing

Before you can start to analyse the heaps of data you’ve collected, it’s key to first process your data.

Processing, often referred to as ‘cleaning your data’, is the most important step. By making everything you’ve collected consistent, it ensures your analysis runs smoothly and produces accurate results.

The two main tools referenced below – and that we use day in, day out – are Excel and Pandas (a data analysis library for Python, particularly useful for larger datasets that may sometimes crash Excel).

Remove duplicates

Excel’s built-in ‘Remove Duplicates’ function can do this for you. Alternatively, the Python library Pandas has a similar function, drop_duplicates. More on the best data tools later…
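In Pandas, that looks something like the sketch below – the file and column names are hypothetical:

import pandas as pd

df = pd.read_csv("campaign_data.csv")  # hypothetical dataset

# Drop rows that are identical across every column...
df = df.drop_duplicates()

# ...or treat rows as duplicates based on selected columns only
df = df.drop_duplicates(subset=["artist", "track_name"], keep="first")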

Checking for errors or missing values

Pivot tables, built into both Excel and Pandas, are your best friend for getting an overview of your data and easily checking issues before analysis.

They allow you to check for anomalies and outliers within your dataset and summarise key metrics, while also letting you filter and separate different columns for analysis. Below is an example of raw data for a potential campaign which, as you can see, would be difficult to analyse in its current form…

Credit: Verve Search

This is why we put it into a pivot table, as shown below. By selecting the individual columns to analyse, the data is presented in a much more digestible format for picking out insights.

This specific table shows the number of international artists within the singles charts of each country, and many more columns can be joined to it – for example, comparing Tempo scores by country to see which one has the most upbeat music.

Credit: Verve Search
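The same overview can be built in Pandas with pivot_table, alongside a quick missing-value check. The sketch below assumes hypothetical column names rather than our actual dataset:

import pandas as pd

df = pd.read_csv("charts_data.csv")  # hypothetical chart dataset

# Spot missing values and obvious gaps before analysing anything
print(df.isna().sum())

# Pivot: count of international artists per country (column names are illustrative)
intl = pd.pivot_table(
    df[df["is_international"]],
    index="country",
    values="artist",
    aggfunc="count",
)

# The same approach works for other metrics, e.g. average tempo by country
tempo = pd.pivot_table(df, index="country", values="tempo", aggfunc="mean")

print(intl.join(tempo).sort_values("tempo", ascending=False))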

Checking for and removing anomalies

Anomalies can disrupt analysis and skew your results. They may occur for all kinds of reasons, but the most common is incorrect data collection. As mentioned above, pivot tables can help to identify anomalies, as can Excel’s filter function: sort a column in ascending or descending order and anomalous values will rise to the top or bottom, depending on your dataset.
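In Pandas, a quick describe() and sort cover the same ground, and a simple standard-deviation rule can flag likely outliers – again, the dataset and column names here are hypothetical:

import pandas as pd

df = pd.read_csv("campaign_data.csv")  # hypothetical dataset

# describe() surfaces suspicious minimums and maximums at a glance
print(df["price"].describe())

# Sorting is the code equivalent of Excel's ascending/descending filter
print(df.sort_values("price").head())  # suspiciously low values
print(df.sort_values("price").tail())  # suspiciously high values

# A simple rule of thumb: flag values more than 3 standard deviations from the mean
zscores = (df["price"] - df["price"].mean()) / df["price"].std()
print(df[zscores.abs() > 3])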

Transforming

Any unchecked formatting may cause issues down the road, so check that your data is in a consistent format. Double-check that your columns are the data types you want them to be – numbers, text or dates.

Sometimes numbers will be imported as text and need to be converted. If you’re working with multiple data sources, formats will need to be uniform across the board so you don’t have to keep referring back to the original datasets, which avoids confusion further along the analysis process.
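Pandas handles most of these conversions in a few lines. A minimal sketch, assuming hypothetical column names:

import pandas as pd

df = pd.read_csv("campaign_data.csv")  # hypothetical dataset

# Numbers imported as text (e.g. "1,250") can be coerced back to numeric
df["price"] = pd.to_numeric(df["price"].str.replace(",", ""), errors="coerce")

# Parse dates into proper datetimes so they sort and group correctly
df["date_sold"] = pd.to_datetime(df["date_sold"], dayfirst=True, errors="coerce")

# Force a consistent format on columns that should be plain text
df["postcode"] = df["postcode"].astype(str).str.strip().str.upper()

print(df.dtypes)  # confirm every column is now the type you expect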

 

Step 4: Data Analysis

With processing complete, you’re now ready to dive into data analysis.

Firstly, refer to your objectives. Check the initial goals to make sure your analysis still aligns with them and whether they’re still achievable with what you’ve got.

Make sure to explore your data. Most analysis can be done with pivot tables, either in Excel or Python.

Then visualise your data. Once you have dived into it with pivot tables, you can use them to create charts and graphs that clearly represent the findings. Your visualisations should make it obvious at a glance what the data is showing (refer to the data vis catalogue for inspiration).
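As a simple illustration, here’s a sketch that charts a pivot table with Matplotlib – the dataset and column names are placeholders carried over from the chart example above:

import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("charts_data.csv")  # hypothetical dataset

# Pivot first, then chart the pivot so the story is obvious at a glance
tempo_by_country = pd.pivot_table(df, index="country", values="tempo", aggfunc="mean")

ax = tempo_by_country.sort_values("tempo").plot(kind="barh", legend=False)
ax.set_xlabel("Average tempo (BPM)")
ax.set_title("Which country has the most upbeat singles chart?")
plt.tight_layout()
plt.savefig("tempo_by_country.png", dpi=200)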

Finally, it’s time to tell a story. Once trends have been identified and visualisations have been created, crafting a narrative around these findings will be what sets your campaign apart.

Pull out key statistics and create headline-worthy points from your analysis. Focus on making your points digestible for everyone, like opting for percentages rather than raw numbers as this will be clearer to both journalists and readers.

Data Analysis Tools

Just like Batman, data analysts have a variety of tools in their utility belt. Below are the ones we use the most day-to-day, helping us at every step from planning through to analysis.

Excel

Credit: Verve Search

Hate it or love it, Excel is a necessary tool for viewing, interpreting, and analysing data. Pivot tables form a large part of the analysis process and are just one of the many useful functions that Excel contains.

Python Interpreter

Credit: Jupyter

To write and run Python scripts, you need a Python interpreter. The choice is always down to personal preference, but we prefer using Jupyter Notebook/JupyterLab to run code in a more modular fashion, as opposed to a whole script at once. Not your cup of tea? Look to alternatives such as VS Code and PyCharm.

Python

Credit: Selenium

Python is a powerful programming language that can be used in all sorts of situations – for data analysts, most notably scraping and analysis. By leveraging the power of Python’s many libraries, you can easily collect and manipulate large datasets and create visualisations.

The most-used libraries for data collection are BeautifulSoup, Requests and Selenium – and a combination of these can be used to access and scrape websites.

In the same vein, the most common library for analysis is Pandas, with which you can create DataFrames to hold and manipulate your data before exporting the results to Excel.
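Here’s a minimal sketch of that collection-to-Excel workflow using Requests, BeautifulSoup and Pandas. The URL and CSS selectors are hypothetical – every site’s markup is different, and you should check a site’s terms before scraping it:

import requests
import pandas as pd
from bs4 import BeautifulSoup

# Hypothetical example page; swap in the site you actually intend to scrape
url = "https://example.com/singles-chart"
response = requests.get(url, headers={"User-Agent": "Mozilla/5.0"}, timeout=30)
soup = BeautifulSoup(response.text, "html.parser")

# Selectors are illustrative and will differ for every site
rows = []
for entry in soup.select(".chart-entry"):
    rows.append({
        "position": entry.select_one(".position").get_text(strip=True),
        "artist": entry.select_one(".artist").get_text(strip=True),
        "track": entry.select_one(".track").get_text(strip=True),
    })

# Load into a DataFrame and export the results to Excel for the wider team
df = pd.DataFrame(rows)
df.to_excel("chart_scrape.xlsx", index=False)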

You can create visualisations using the libraries Matplotlib, Seaborn or Plotly; check out their documentation for more detailed information.

SQL

Credit: SQLite

When working with large datasets, an SQL database can be useful in storing and accessing data. You can connect to a database through Python and store any scraped data directly. MySQL or SQLite are good starting points.
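For example, here’s a sketch of storing scraped results in SQLite via Pandas and querying them back later – the table and column names are placeholders:

import sqlite3
import pandas as pd

# Hypothetical scraped results as a DataFrame
df = pd.DataFrame([
    {"country": "UK", "artist": "Example Artist", "tempo": 128},
    {"country": "FR", "artist": "Example Artiste", "tempo": 102},
])

# Store the data in a local SQLite database...
conn = sqlite3.connect("campaign.db")
df.to_sql("chart_entries", conn, if_exists="append", index=False)

# ...and query it back later without re-scraping
uk_rows = pd.read_sql_query("SELECT * FROM chart_entries WHERE country = 'UK'", conn)
print(uk_rows)
conn.close()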

Data Visualisation Tool

 

Credit: Tableau

Looking for a more powerful method for creating visualisations? Platforms such as Tableau or Power BI can be used to create more interactive charts, which allow for a different type of storytelling.

AI

Whether you’re ready to accept it or not, AI and Large Language Models such as ChatGPT are here to stay. They’re useful when used correctly, especially for assisting with code or through their built-in analysis capabilities.

However, be careful: it’s not recommended to paste your data directly into these tools, as it may contain sensitive information. Instead, use them to generate Python code or to point you towards the best methods for analysis.


Interested in our content marketing and digital PR services, including data processing and analysis? Get in touch.

Comparing Screaming Frog Crawl Files

Handing over technical recommendations often comes with some trepidation; how long might it take for them to be implemented, will they be implemented at all and, if so, will they be implemented correctly? That’s why understanding how development cycles occur, how items are prioritised and who you need to get onside is as key to successful technical SEO as the recommendations themselves. However well you understand those, though, changes are often implemented without any feedback that they’re now complete.

It’s for that reason that tools like ContentKing have sprung up: to keep an eye on the site and alert you to changes. It’s not always feasible to run SaaS crawlers on a site, though, so many of us rely on running crawls with Screaming Frog’s crawler. Comparing crawl files can be a pain. Usually, you’ll end up dumping the data into Excel and running a bunch of VLOOKUPs or MATCH/INDEX functions, only to find that no, the developer hasn’t implemented the changes.

Meanwhile, you’ll occasionally want to compare crawl files of different sites to:

  1. Compare a dev environment with a staging environment
  2. Make sure content has been ported to a new site correctly
  3. Run technical SEO competitive analysis/comparisons – we wrote about this recently here.

This has always been a pain, which is why, for a while now, we’ve had a tool that quickly compares crawl_overview files for us. Today, we’re making it available for free.

It’s a simple Python script. If you don’t have Python installed, you can read a guide for Windows here and for macOS here (you’ll need Python 2, rather than 3, for the script to work – though feel free to install both using virtual environments if you’re really keen on 3). The script itself is here:

import pandas
import csv
import sys

from tqdm import tqdm


class color:
    PURPLE = '\033[95m'
    CYAN = '\033[96m'
    DARKCYAN = '\033[36m'
    BLUE = '\033[94m'
    GREEN = '\033[92m'
    YELLOW = '\033[93m'
    RED = '\033[91m'
    BOLD = '\033[1m'
    UNDERLINE = '\033[4m'
    END = '\033[0m'


def main(argv):
	if len(argv) != 4:
		print 'Usage: programname.py crawl_overview1.csv crawl_overview2.csv output.csv'
		sys.exit()

	headerrows = 5
	endline = 191

	fileone = get_csv(argv[1])
	filetwo = get_csv(argv[2])

	fileone = fileone[0:endline]
	filetwo = filetwo[0:endline]

	fileonesite = fileone[1][1]
	filetwosite = filetwo[1][1]

	fileone = fileone[headerrows:]
	filetwo = filetwo[headerrows:]

	fileonedata = []
	filetwodata = []
	combineddata = []
	firstcolumn = []

	firstcolumn.extend(get_column(fileone,0))
	fileonedata.extend(get_column(fileone,1))
	filetwodata.extend(get_column(filetwo,1))
	combineddata.extend(zip(firstcolumn,fileonedata,filetwodata))


	outFile = csv.writer(open(argv[3], 'w'))
	outFile.writerow(["",fileonesite,filetwosite])
	for i in tqdm(combineddata):
		outFile.writerow(i)

	if fileonedata == filetwodata:
		print (color.BOLD + color.RED + "Crawl files are identical" + color.END)
	else:
		print (color.BOLD + color.GREEN + "Crawl files are NOT identical" + color.END)

def get_csv(thefile):
	datafile = open(thefile, 'r')
	datareader = csv.reader(datafile, delimiter=",")
	data=[]
	for row in tqdm(datareader):
		data.append(row)
	datafile.close()
	return data

def get_column(thelist,thecolumn):
	newlist =[]
	for row in tqdm(thelist):
		if len(row) >= thecolumn +1:
			newlist.append(row[thecolumn])
		else:
			newlist.append("")
	return newlist

if __name__ == '__main__':
  main(sys.argv)

The only thing you might need to pip install is tqdm – which if you’re not already using we heartily recommend – it creates the nice little loading bars. If you’re new to Python and the script errors when you run it, mentioning tqdm, simply type:

pip install tqdm (on windows)

sudo pip install tqdm (on Mac)

You’ll only ever need to do that once.

Save it in a folder, navigate to that folder using command prompt or terminal and then run it the same way you’d run any Python script (typically ‘Python <nameoffile.py>’). It takes three inputs:

  1. The name of the first crawl_overview file
  2. The name of the second crawl_overview file
  3. The name of the file you’d like to save the output as – it should be a CSV, but doesn’t need to already exist

Both files should be in the same folder as the Python script and so a valid input would look something like this:

Python crawl_comparison.py crawl_overview1.csv crawl_overview2.csv output.csv

Compare Screaming Frog Crawl Files

The script’s pretty fast – it’ll chew through the files within seconds and then report either ‘Crawl files are identical’ or ‘Crawl files are NOT identical’. It will also have saved your chosen output file (‘output.csv’ in the example above) in the same directory, comparing both crawl files side by side – ready for you to:

  1. Send onwards as proof as to whether recommendations have or haven’t been implemented; or
  2. Create industry comparison graphs to show how the sites compare; or
  3. Do with as you please.


Future Updates

Now that the script is publicly available there are a few changes we plan to make to it. These include:

  1. Creating a front-end and installer for those who don’t like to mess around with Python
  2. Allowing for the comparison of multiple crawl_overview files at once
  3. Allowing for the comparison of other Screaming Frog outputs – not just crawl_overview files.

We’d love your feedback as to what features you’d like to see added.

Verve Search Introduces: The LinkScore Tool

Want to understand the real value of the links you’re building?
Here at Verve Search, for the past five years, we’ve been developing a proprietary metric to do just that. Up to now, we’ve kept it exclusively for our clients, but, in the interests of transparency, and for the benefit of the industry as a whole, in the back end of last year we took the decision to build a free to use, public version.

After a whole bunch of work, we’re delighted to say it’s live and ready for you to play with.

For those of you who just want to get your mitts on it, it’s right here.

For those who want to learn more about how it was developed, read on!

Why did we build the LinkScore tool?

A single metric might not always tell the full story!

We’d always felt that there was probably little point (from a rankings perspective) in having a link on an amazingly authoritative domain if it’s no-followed and in a language that neither you nor your customers speak. Yet, if you use a single metric to determine the authority of a link you may find that you’d be treating those links as if they were of equal value.

As such, rather than using a single metric, our tool blends more than 10 different on- and off-site metrics in order to assign a value to a link.

We needed an international metric…
We found that many SEO tools that assign link metrics are primarily focussed on English-speaking audiences. So, whilst their metrics might work well in primarily English-speaking countries, that might not always be the case in countries where English is not the native language. Therefore, we built the LinkScore to give equal value to equivalently authoritative sites in each country – meaning quality links in any country are assigned an appropriate value.

We wanted a tool which could evolve & keep pace with the industry!
Each of the different variables that feeds into the LinkScore was chosen based on our own testing and benchmarking. Where we use third-party metrics, we felt it was important not to be tied to one particular database; as a result, we’ve been able to choose multiple best-in-class metrics that get us as close as possible to measuring the true ranking value of a link. Over the years, the LinkScore tool has continually evolved alongside this fast-paced industry.

What does the LinkScore tool do?
It allows you to measure a link’s ability to influence rankings. It also allows links to be compared with each other, and groups of links to be compared periodically. Please note, we built this as an SEO tool, and as such, the tool does not take into account the value a link might provide in terms of PR, branding or any other type of marketing.

When you run a link through the tool, a score between 0 and 500 will be returned. This scale is not logarithmic; however, some of the variables used to calculate the score are.

Semantically relevant, followed, in-content links in unique content on authoritative domains yield the highest scores. Example sites which would yield high scores include the BBC and the New York Times.

How are scores calculated?

We could tell you, but we’d have to kill you! Kidding 🙂

We keep the exact metrics, and how they are combined, a closely guarded secret. This is partly to stop people gaming the algorithm, partly because the algorithm is updated annually, and partly because we think the accuracy of the final scores speaks more to the quality of the LinkScore than any one of its metrics does.

What do I need to do to use the tool?
You’ll need to input your Majestic, Dandelion and SEMRush API credentials and you’re good to go. Why? Well, the LinkScore uses metrics from each of these providers as part of its algorithm. To prevent abuse of the tool, we require users to use their own API accounts rather than providing free access to our own. Rest assured, your API credentials are stored locally on your computer; we do not keep a copy of your API credentials, nor do we use them for any purpose other than analysing the links you add to the LinkScore tool.

How much does the LinkScore tool cost?
Except to the extent that it uses your third-party API credits, the LinkScore tool is free to use. Rate limiting may, however, be put in place to maintain the experience for all users.

Do we store your data?
Definitely not! We do not store your API credentials, the links you run through the LinkScore tool, or the score output. However, we do run Google Analytics and so store a number of metrics related to your visit including, but not limited to, your location, browser, time on site and pages visited.

Got more questions? Check out our FAQ, or contact us.

And if you do play with the tool, do let us know what you think.