Saturday, 4 May 2013

What Do Google Trends Scores Mean?

What Google Trends Tells Us:

Sample Data:

Google Trends tells us that its scores are based on an analysis of a portion of search volume. This means the scores are only based on a sample of available data, rather than all available data.

How are Interest Over Time Scores Calculated?:

Google Trends tells us that its scores are relative. An “interest over time” score for any particular day / week / month is awarded on the basis of its actual search volume relative to the actual search volume of the other days / weeks / months. A score of 100 is always awarded to the day / week / month with the highest relative search volume, with the other days / weeks / months scaled between 0 and 100 accordingly.
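A rough sketch of this scaling in Python is given below. The search counts are made up for illustration, and the function name is my own. It shows the general idea of relative scoring, not Google's actual implementation.

```python
def interest_over_time_scores(term_searches, total_searches):
    """Scale each period's share of total search so the peak share scores 100."""
    # Relative search volume: the term's share of all searches in each period.
    shares = [t / n for t, n in zip(term_searches, total_searches)]
    peak = max(shares)
    if peak == 0:
        return [0 for _ in shares]
    return [round(100 * s / peak) for s in shares]


# Hypothetical weekly figures (not real Google data).
term_searches = [200, 500, 1_000, 800]
total_searches = [1_000_000, 1_000_000, 1_000_000, 2_000_000]

scores = interest_over_time_scores(term_searches, total_searches)
print(scores)
```

Note that the fourth week has more searches for the term than the first two weeks, yet it scores lower than the second week because total search volume doubled in that week.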

When a term is analysed over time within global parameters the scores are calculated using an average score (based on the scores returned from each country). This prevents search terms with high popularity in countries with heavy internet usage constantly achieving high scores.

How are Regional Interest Scores Calculated?: 

In the case of “regional interest” scores, Google Trends uses relativity in order to prevent countries with high internet usage automatically receiving the highest scores. The scores are calculated by comparing the relative popularity of the search term within each region. This data is then scaled between 0 and 100, with the country with the highest relative interest receiving a score of 100.

How Broad is the Search Volume Data Used to Calculate Scores?:

Google Trends offers users the ability to search for an exact phrase by enclosing it in quotation marks. According to the Help Centre, when this function is used, the scores generated are based on "specific order" search volume. This means that only searches that include the inputted search terms in the exact order in which they were entered are counted. No synonyms or variations are considered.

Google Trends also states that where no "quotation marks" are used, results include searches for the inputted terms in any order and may also include extra terms. No synonyms or variations are considered.
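The difference between the two matching rules can be sketched in Python as follows. This is only an illustration of the rules as described above, with function names of my own choosing. It is not how Google actually matches queries.

```python
def matches_exact_phrase(query, phrase):
    """True if the words of `phrase` appear in `query` together, in the same order."""
    q = query.lower().split()
    p = phrase.lower().split()
    return any(q[i:i + len(p)] == p for i in range(len(q) - len(p) + 1))


def matches_any_order(query, terms):
    """True if every word of `terms` appears somewhere in `query`, in any order."""
    q = set(query.lower().split())
    return all(t in q for t in terms.lower().split())


# Quoted search: order matters, extra words are allowed around the phrase.
print(matches_exact_phrase("buy ice cream cone", "ice cream"))
print(matches_exact_phrase("cream on ice", "ice cream"))

# Unquoted search: any order, extra words allowed.
print(matches_any_order("cream on ice", "ice cream"))
```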

Wednesday, 1 May 2013

Cross-Analysis: Keyword Tool and Google Trends

The steps below can be used to help estimate the value of Google Trends scores in terms of actual search numbers.

Note:

Please read the caveat at the end of this article, which discusses the impact of normalisation / scaling and the accuracy of Keyword Tool.

Step 1:


Keyword Tool should be used to ascertain the monthly search volume for the term being analysed.

This figure should then be multiplied by 12 and divided by 365, to give the average daily search volume.


For example, if the search term hummingbird is reported as having 1 million searches a month, the average daily search volume is 32,877:


(1,000,000 x 12) / 365 = 32,877 (rounded)

Step 2:


The average Google Trends score for the search term over the previous twelve months should be calculated.


For example, if between May 2012 and April 2013 hummingbird achieved scores totalling 384, its monthly average was 32:


384 / 12 = 32


Step 3:


The average daily search volume should be divided by the average Google Trends score. This provides the value of 1 Google Trends score point for one day.


For example, in the case of hummingbird, the value of 1 Google Trends score point for one day would be 1027:


32,877 / 32 = 1027 (rounded)


Step 4:


In order to now ascertain the actual search volume within any month, the value of 1 score point should be multiplied by that month’s score and then multiplied by the number of days in the month.


For example, if the Google Trends score for hummingbird was 62 in June 2011, this would equate to 1,910,220 actual searches:


1027 x 62 x 30
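The four steps can be sketched in Python as follows. The function names are my own, and the twelve monthly scores of 32 are hypothetical, chosen only so that they total 384 as in the example. Intermediate results are rounded in the same way as in the worked figures.

```python
def score_point_value(monthly_volume, yearly_scores):
    """Steps 1 to 3: estimated searches represented by one score point for one day."""
    avg_daily_volume = round(monthly_volume * 12 / 365)   # Step 1
    avg_score = sum(yearly_scores) / len(yearly_scores)   # Step 2
    return round(avg_daily_volume / avg_score)            # Step 3


def estimate_monthly_searches(point_value, month_score, days_in_month):
    """Step 4: multiply the point value by the month's score and its number of days."""
    return point_value * month_score * days_in_month


# Hypothetical scores: twelve months totalling 384 (an average of 32).
yearly_scores = [32] * 12

point_value = score_point_value(1_000_000, yearly_scores)
june_2011 = estimate_monthly_searches(point_value, 62, 30)
print(point_value, june_2011)
```

With these inputs the sketch reproduces the figures in the hummingbird example: a point value of 1027 and an estimate of 1,910,220 searches for June 2011.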



Caveat:

I have good reason to believe that this calculation provides a reasonably reliable result. Read about why I think this calculation works here: Working With Google Trends and Keyword Tool Together.

However, the calculation does make one pretty big assumption. It assumes that Keyword Tool data is updated monthly, which seems unlikely.

If this assumption is wrong, the calculation above can at least be used as an estimator of search interest, where Google Trends reports a year-on-year increase in popularity.

The calculation also ignores the fact that Google Trends scores are both normalised and scaled. This means that scores over time do not have a direct quantitative correlation. Instead, they are determined by two steps:

(1) the absolute search volume relative to the total search queries received by Google

(2) the relative popularity on each day / week / month compared to the relative popularity of other days / weeks / months and then scaled between 0 and 100.

This has a significant impact where internet use rises or falls. In theory, a score of 10 on a peak internet day could represent more actual searches than a score of 100 on a slow internet day. However, as my calculation uses a one-year average of Google Trends scores as its starting point, the impact of such anomalies should be diminished.
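The "score of 10 versus score of 100" scenario can be illustrated with a short Python sketch. The daily figures are invented purely to show the effect, and the function is a simplified stand-in for the normalise-then-scale process described above.

```python
def scores_from_counts(term_searches, total_searches):
    """Normalise by total search volume, then scale so the peak share scores 100."""
    shares = [t / n for t, n in zip(term_searches, total_searches)]
    peak = max(shares)
    return [round(100 * s / peak) for s in shares]


# Hypothetical figures: a slow internet day followed by a peak internet day.
term_searches = [1_000, 2_000]
total_searches = [1_000_000, 20_000_000]

scores = scores_from_counts(term_searches, total_searches)
print(scores)
```

Here the slow day scores 100 with 1,000 searches for the term, while the peak day scores only 10 despite having 2,000 searches for the term.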

The calculation will naturally work best in territories that have stable internet usage and with terms that are not radically more popular on certain days (e.g. eBay on Sundays) or at certain times of the year.

Sunday, 28 April 2013

Google Trends: Terms, Locations, Time Ranges Comparisons

Introduction:

Google Trends includes a function that allows users to compare up to five terms, locations or time ranges. Where this function is used, Google Trends generates a line graph with multiple data lines on it, as well as a bar graph.

Google Trends will also generate a set of "regional interest" tables. However, the scores contained within these tables are not modified as a result of the comparison. In other words, they are the same as if no comparison had been undertaken.

Unfortunately, it is not possible to utilise more than one type of comparison simultaneously. Therefore, a comparison of time ranges and a comparison of locations is restricted to a single term.

How Are Scores Calculated?

Comparison of Search Terms:

Scores generated in a comparison of search terms are relative to one another. 

In order to obtain these relative scores, Google Trends compares the relative search volume for each term within the specified time range.

If, for example, "Search Term A" accounted for 0.0001% of total search queries on 10th January and "Search Term B" represented 0.00005% of total search queries on the same day, the scores awarded to A and B would be in a ratio of 2:1. The scaling of the scores between 0 and 100 would depend on how the relative search volume for both terms on the day compares to the highest relative search volume within the time range.
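A minimal Python sketch of this joint scaling is given below. The 10th January figures come from the example above. The second day's figures are hypothetical, added only so that the time range has a peak to scale against, and the function name is my own.

```python
def compare_terms(shares_by_term):
    """Scale all terms against the single highest share found across every term."""
    peak = max(max(series) for series in shares_by_term.values())
    return {
        term: [round(100 * share / peak) for share in series]
        for term, series in shares_by_term.items()
    }


# Percent of total search queries: 10th January, then a hypothetical later day.
shares = {
    "Search Term A": [0.0001, 0.0002],
    "Search Term B": [0.00005, 0.0001],
}

scores = compare_terms(shares)
print(scores)
```

On 10th January, A scores 50 and B scores 25, which preserves the 2:1 ratio. The score of 100 goes to A on the later day, because that is the highest relative search volume anywhere in the comparison.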

The scores contained within the bar graph represent the average score for the terms over the specified time range. 

When a comparison of search terms is undertaken with "global" parameters, the relative search volume that is compared is based on an average derived from all the national relative search volumes.

Read more about comparisons of search terms here: Google Trends - Comparison of Terms.

Comparison of Time Ranges:

A comparison of time ranges calculates scores in a very similar way to a comparison of terms (described above).

Google Trends calculates the relative search volume for the terms across the different time ranges and then collectively scales the data between 0 and 100.


Comparison of locations:

A comparison between locations generates scores calculated in the same way that "regional interest" scores are calculated. The scores are calculated by analysing:

(1) the popularity of the search term relative to total search within the specific territories

and

(2) the relative popularity of the search term within each specific territory, relative to the other territories.

For more general information on "regional interest" scores read Understanding Google Trends.

Google Trends: What Is Partial Data?


Google Trends offers users highly up-to-date information on the relative popularity of a search. This is indicated by the inclusion of a score for the current month. This score is based on partial data, because the month is not yet complete and therefore, the data is not complete either.



Google Trends calculates all its "interest over time" scores using the following steps:


  • The actual search volume of each day within the time range is compared to the total number of search queries received by Google on the respective days. An average relative daily search volume is then worked out for each month or week. 

  • The month / week with the highest average relative daily search volume receives a score of 100. The scores of the remaining months are calculated relative to this.  


In the case of "partial data", scores are based on an interim average relative daily search volume. 
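The steps above, including the interim average used for a partial month, can be sketched in Python as follows. The daily relative search volumes are invented for illustration, and the function name is my own.

```python
from statistics import mean


def monthly_scores(daily_shares_by_month):
    """Average each month's daily relative volume, then scale the peak month to 100."""
    averages = {month: mean(days) for month, days in daily_shares_by_month.items()}
    peak = max(averages.values())
    return {month: round(100 * avg / peak) for month, avg in averages.items()}


# Hypothetical daily shares of total search (not real data).
daily_shares = {
    "2013-03": [0.0010] * 31,  # complete month
    "2013-04": [0.0008] * 18,  # partial month: only 18 days of data so far
}

scores = monthly_scores(daily_shares)
print(scores)
```

The April score is based on the interim average of the 18 days available, so it would change if the remaining days of the month turned out higher or lower.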


Google Trends tells us that it updates its information daily. Thus, it is not 100% live. In most instances, the "partial data" score takes into account search data up to 1 or 2 days before the Google Trends analysis is conducted.


It is possible to determine precisely which days have been included in the calculation of a "partial data" month, by narrowing the Google Trends search to "past week". This will show which days have been awarded a Google Trends score and which days are yet to be assessed for scores.


For example, if you searched for "ice cream" on 20th April 2013, the score awarded for April 2013 would be based on the average daily search volume up to either 18th or 19th April 2013.  


Any score based on partial data is capable of either rising or falling before the month completes.


Read more about partial data here: Understanding Google Trends

Saturday, 27 April 2013

Understanding Google Trends

What is Google Trends?:



Google Trends is a search analysis tool that provides data on the relative popularity of search terms (or websites). Users can input a particular term or set of terms and where there is sufficient data available, Google Trends will generate a line graph, indicating how interest has risen or fallen over a period of time, as well as a table indicating the relative popularity of the search term within specific territories ("regional interest").


Google Trends also allows users to limit the analysis by content type, location, time range and category, or to compare terms, time ranges and locations. Analysis parameters can be set to exclude terms or to return one set of results based on the cumulative search activity of several terms.


How Does Google Trends Work?:



Where there is sufficient data available, Google Trends awards a score of between 0 and 100 to inputted search terms on a month-by-month / day-by-day basis and on a geographical basis. The meaning of these scores differs according to whether users are looking at "interest over time" or "regional interest".


Interest Over Time:



The scores awarded by Google Trends on the "interest over time" line graph express the popularity of that term over a specified time range.


Google Trends scores are based on the absolute search volume for a term, relative to the number of searches received by Google.


The scores have no direct quantitative meaning. For example, two different terms could achieve scores of 100 in the same month, but one received 1,000 search requests, whilst the other received 1,000,000. This is because the scores have been scaled between 0 and 100. A score of 100 always represents the highest relative search volume.


Day scores are based on absolute search volume for the term within the day relative to absolute search volume on Google on the same day.  Month / week scores are calculated on the basis of the average relative daily search volume within the month / week. 


A rising line does not necessarily indicate a rise in the number of searches for a term. Because scores are relative to total search volume, a line may also rise or fall as a result of changes in general search use over the time range. For the same reason, a declining line does not always represent a fall in the number of searches. In order to gain the maximum insight from Google Trends, it is necessary to have an understanding of how internet usage might rise or fall.


The inclusion of scores based on "partial data" is an indicator of just how up-to-date Google Trends is. Read more on partial data here: Google Trends: What Is Partial Data?

It seems likely that Global scores are based on an average score from each country. If this wasn't the case, terms that are popular in countries with high internet usage would constantly perform better than terms that are popular in countries with low internet usage.


Regional Interest: 



The scores awarded by Google Trends on the "regional interest" table / map are not directly relative to one another in a quantitative way. If this was the case, countries or cities with high internet usage or big populations, such as the United States, would permanently find themselves at the top of tables, giving the misleading impression that these countries are "most interested". 


Instead of awarding scores based on direct relativity, Google Trends utilises a kind of "double relativity".  The calculation of scores for particular territories is based on the following data: 


(1) the popularity of a search term within a particular region, relative to the total volume of search within the region over the period specified. 


(2) the relative popularity of the search term for the territory (as determined by step one above) compared to the relative popularity of the search term in other territories.


An example of how regional interest is calculated is given below:


If the search term "Facebook" accounted for 1% of total search requests in the United Kingdom from January 2004 to the present day, but 4% of total search requests in Ireland over the same period, Ireland's score would be 100, relative to the United Kingdom's score of 25.


Ireland's higher score is awarded despite the fact that 1% of total search in the United Kingdom would invariably account for a much higher actual volume of search than 4% of total search in Ireland (given the far higher internet usage / population in the UK).  
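The "double relativity" in the Facebook example can be sketched in Python as follows. The absolute search counts are hypothetical, chosen only so that the shares come out at 1% for the United Kingdom and 4% for Ireland, and the function name is my own.

```python
def regional_interest(term_searches, total_searches):
    """Step 1: share of search within each region. Step 2: scale the top share to 100."""
    shares = {
        region: term_searches[region] / total_searches[region]
        for region in term_searches
    }
    peak = max(shares.values())
    return {region: round(100 * share / peak) for region, share in shares.items()}


# Hypothetical counts giving shares of 1% (UK) and 4% (Ireland).
term_searches = {"United Kingdom": 1_000_000, "Ireland": 200_000}
total_searches = {"United Kingdom": 100_000_000, "Ireland": 5_000_000}

scores = regional_interest(term_searches, total_searches)
print(scores)
```

Ireland receives 100 and the United Kingdom 25, even though the made-up UK count is five times larger in absolute terms.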

As well as generating a simple table, Google Trends also produces a heat map, which shows interest across the globe. The scores shown on the heat map are calculated in the same way as the scores in the table (i.e. steps 1 and 2 above).