For the most part, SMBs rely on free analytics solutions like Google Analytics for their web and digital strategy. Google Analytics is a powerful platform in its own right, and it can be combined with R to create custom visualizations, dig deeper into the data, and draw statistical inferences. This article focuses on using R with the Google Analytics API. We will go over connecting to the API, querying data, and making a quick time series graph of a metric.
To make an API call, you’ll need two things: a Client ID and a Client Secret. You can reuse these credentials, so you only need to do the following steps once:
- Log in to your Google Analytics account
- Go to the Google Developers page: https://console.developers.google.com/project
- Create a New Project and enable the Google Analytics API
- On the Credentials screen (under the “APIs and auth” menu), create a new Client ID for Application Type “Installed Application”
- Copy the Client ID and Client Secret
In R (I’ll be using RStudio), load the necessary packages:
    library(ggplot2)
    library(RGoogleAnalytics)
    library(scales)
With the packages loaded, we will run the OAuth call to the Google API:
    oauth_token <- Auth(client.id = "Client ID", client.secret = "Client Secret")
**Note:** This part can be a little confusing if you haven’t called an API from R before. A new tab should open in your web browser asking you to allow the application to access your Google Analytics data. Press “Accept”; the page should then show the message “Authentication complete. Please close this page and return to R”. Back in your R IDE, the console should also say “Authentication complete.”
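If you would rather not paste the Client ID and Client Secret into a script, one option is to read them from environment variables. The following is a minimal sketch in base R; the helper `read_credential()` and the variable names `GA_CLIENT_ID` and `GA_CLIENT_SECRET` are assumptions made for illustration, not part of RGoogleAnalytics.

```r
# Hypothetical helper: read a credential from an environment variable and
# fail loudly if it is missing, instead of hard-coding it in the script.
read_credential <- function(name) {
  value <- Sys.getenv(name, unset = "")
  if (!nzchar(value)) {
    stop("Environment variable ", name, " is not set")
  }
  value
}

# Usage (the variable names are assumptions; set them in ~/.Renviron or
# in your shell before starting R):
# oauth_token <- Auth(client.id     = read_credential("GA_CLIENT_ID"),
#                     client.secret = read_credential("GA_CLIENT_SECRET"))
```

This keeps the credentials out of any script you share or commit, while the `Auth()` call itself stays the same.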
Now save the authorization token for future sessions:
    save(oauth_token, file = "oauth_token")
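In a later session the saved file can be loaded again instead of repeating the browser step. The sketch below shows one way to wrap that logic; `load_or_create_token()` is a hypothetical helper, and it assumes the file was written with `save()` under the object name `oauth_token`, as above.

```r
# Hypothetical helper: reuse a saved token if the file exists; otherwise run
# the supplied authorisation function and cache its result for next time.
load_or_create_token <- function(path, authorize) {
  if (file.exists(path)) {
    env <- new.env()
    load(path, envir = env)  # restores the object saved as `oauth_token`
    return(env$oauth_token)
  }
  oauth_token <- authorize()
  save(oauth_token, file = path)
  oauth_token
}

# Usage in a real session (needs RGoogleAnalytics and valid credentials):
# oauth_token <- load_or_create_token(
#   "oauth_token",
#   function() Auth(client.id = "Client ID", client.secret = "Client Secret")
# )
# ValidateToken(oauth_token)
```

Passing the authorisation step in as a function keeps the helper independent of any particular package, so it can be tried out with a dummy function first.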
Using Google Analytics with R
To query your analytics data, you’ll need to decide a few things first: the start date and end date of the query, and which metric(s) you want to pull.
    ValidateToken(oauth_token)
    query.list <- Init(start.date = "2015-01-01",
                       end.date = "2015-02-01",
                       dimensions = "ga:date",
                       metrics = "ga:sessions,ga:bounces",
                       max.results = 1000,
                       sort = "ga:date",
                       table.id = "ga:TABLE ID")
**Note:** The Table ID is in the URL of your Google Analytics page. It is everything after the last “p” in the URL. For example: https://www.google.com/analytics/web/?hl=en#management/Settings/a48963421w80588688pTABLE_ID_NUMBER
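The Table ID can also be pulled out of the URL with a regular expression. This is a rough sketch in base R: `extract_table_id()` is a hypothetical helper, it assumes the URL follows the `a…w…p…` pattern shown in the example, and the ID `12345678` is made up.

```r
# Hypothetical helper: take the digits after the final "p" in the
# a<digits>w<digits>p<digits> part of the URL and prefix them with "ga:".
extract_table_id <- function(url) {
  m <- regmatches(url, regexec("a[0-9]+w[0-9]+p([0-9]+)", url))[[1]]
  if (length(m) < 2) {
    stop("No Table ID found in URL")
  }
  paste0("ga:", m[2])
}

# Example with a made-up ID in place of TABLE_ID_NUMBER
url <- "https://www.google.com/analytics/web/?hl=en#management/Settings/a48963421w80588688p12345678"
extract_table_id(url)  # "ga:12345678"
```

The result can be passed directly as the `table.id` argument of `Init()`.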
Create the QueryBuilder object so that the query parameters are validated:
    ga.query <- QueryBuilder(query.list)
Extract the data and store it in a data frame:
    ga.data <- GetReportData(ga.query, oauth_token, split_daywise = TRUE)
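The query above uses fixed dates. If you want to rerun the same report for a rolling window, the date strings can be computed rather than typed in. The sketch below uses base R only; `last_n_days()` is a hypothetical helper, and the `today` argument exists just to make the example reproducible.

```r
# Hypothetical helper: start and end dates covering the `n` days before
# `today`, formatted as the "YYYY-MM-DD" strings used in the query.
last_n_days <- function(n, today = Sys.Date()) {
  list(
    start.date = format(today - n, "%Y-%m-%d"),
    end.date   = format(today - 1, "%Y-%m-%d")
  )
}

window <- last_n_days(30, today = as.Date("2015-02-01"))
window$start.date  # "2015-01-02"
window$end.date    # "2015-01-31"
```

These two values could then be supplied as `start.date` and `end.date` when building the query list.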
You can now make a quick graph of your data. Here we will look at bounces in January:
    ga.data$date <- as.Date(ga.data$date, "%Y%m%d")
    df <- ga.data[order(ga.data$date), ]
    dt <- qplot(date, bounces, data = df, geom = "line") + theme(aspect.ratio = 1/2)
    dt + scale_x_date(labels = date_format("%m/%d"), breaks = date_breaks("day"))
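To try the plotting step without API access, you can build a stand-in data frame first. The sketch below assumes the returned data has a `date` column of "YYYYMMDD" strings plus `sessions` and `bounces` columns, as the code above implies; the numbers are random and purely illustrative. It uses `ggplot()` with `geom_line()` as an alternative to `qplot()`, which recent ggplot2 releases mark as deprecated, and labels one tick per week rather than per day to keep the axis readable.

```r
# Synthetic stand-in for the data frame returned by GetReportData():
# one row per day in January 2015. All values are made up.
set.seed(42)
days <- seq(as.Date("2015-01-01"), as.Date("2015-01-31"), by = "day")
fake.data <- data.frame(
  date     = format(days, "%Y%m%d"),
  sessions = sample(200:400, length(days), replace = TRUE),
  bounces  = sample(50:150, length(days), replace = TRUE),
  stringsAsFactors = FALSE
)

# Same preparation steps as for the real data
fake.data$date <- as.Date(fake.data$date, "%Y%m%d")
df <- fake.data[order(fake.data$date), ]

# Draw the line chart only if ggplot2 is installed
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(df, aes(x = date, y = bounces)) +
    geom_line() +
    scale_x_date(date_labels = "%m/%d", date_breaks = "1 week") +
    theme(aspect.ratio = 1/2)
  print(p)
}
```

Once the chart looks right on the synthetic data, swapping `fake.data` for `ga.data` should give the same plot for the real report.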
For further documentation and use cases, refer to this link:
https://github.com/Tatvic/RGoogleAnalytics/blob/master/demo/data_extraction_demo.R