Analyze Instagram with R
This tutorial shows you how to create an Instagram app, set up an authentication process in R, and get data via the Instagram API.
There is no R package for this yet, so we have to configure the authentication and data download process on our own. But Instagram offers a well-documented API and uses OAuth 2, which makes it easy to use with R and, for example, the httr package.

Authentication

The place to start for everybody who wants to work with the Instagram API is http://instagram.com/developer/
Here you can find all the information you need and also manage your apps.
So click on "Register Your Application" and go through the login.
On the next screen you can set the parameters for your app. Choose an application name, write a short description of what your app will be about, and add a website.
Then you have to enter an OAuth redirect URI. To find it, go to your R console and call oauth_callback() from the httr package.
This will show you the preferred callback URI for httr. Copy this URL and paste it in your app settings.
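A minimal sketch of this lookup, following the script quoted in the comments below: it asks httr for its callback URL and trims it to the plain localhost address. The helper names trim_callback_url and get_callback_url are made up for illustration.

```r
# Hypothetical helper: keep only the "http://localhost:<port>/" part of a URL.
trim_callback_url <- function(full_url) {
  sub("(.*localhost:[0-9]{1,5}/).*", "\\1", full_url)
}

# Hypothetical helper: ask httr for its preferred callback URL.
# The httr package is only needed once this function is actually called.
get_callback_url <- function() {
  trim_callback_url(httr::oauth_callback())
}

# Usage (with httr installed):
# print(get_callback_url())
```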
This is what my settings look like:
[Screenshot: app settings]
After clicking on "Register" you will be redirected to your app's authentication details, which we will need for our analysis.
In R we have to define four variables: app_name, client_id, client_secret and scope.
The first three you get from your app settings. The fourth one, scope, is basically the level of authorization you want to get. "basic" is enough to download data like likes or comments. If you actually want to post something to Instagram you need another scope. You can find more information about that on the Instagram developer page.
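As a sketch, the four variables could be defined like this. The variable names follow the script quoted in the comments; all values except the scope are placeholders you have to replace with your own.

```r
# Placeholder values: replace them with the details from your own app settings.
app_name      <- "MyInstagramApp"      # hypothetical application name
client_id     <- "YOUR_CLIENT_ID"      # placeholder, taken from the app settings
client_secret <- "YOUR_CLIENT_SECRET"  # placeholder, keep it out of shared scripts
scope         <- "basic"               # enough to download likes and comments
```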
Then we create our Instagram app object in R for the httr package. This is the app we will use to connect to the API. To do so we have to provide the API's authorization and access token endpoints.
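A sketch of this step, using the endpoint URLs from the script quoted in the comments. The wrapper function make_instagram_app is a made-up name; oauth_endpoint() and oauth_app() come from httr.

```r
# Hypothetical wrapper; httr is only needed once the function is called.
make_instagram_app <- function(app_name, client_id, client_secret) {
  # The two access points: where the user authorizes the app, and where
  # the resulting code is exchanged for an access token.
  endpoint <- httr::oauth_endpoint(
    authorize = "https://api.instagram.com/oauth/authorize",
    access    = "https://api.instagram.com/oauth/access_token"
  )
  app <- httr::oauth_app(app_name, client_id, client_secret)
  list(endpoint = endpoint, app = app)
}

# Usage (with httr installed):
# ig <- make_instagram_app(app_name, client_id, client_secret)
```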
In the next step we do the authentication.
Now your browser should open and ask you to give permission to the app. After you return to R you should have received your access token.
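The authentication call could look roughly like the sketch below. The oauth2.0_token() arguments and the string-splitting fallback are taken from the script quoted in the comments; the function names and the first lookup via credentials[["access_token"]] are my assumptions, since how the token is stored may differ between httr versions.

```r
# Hypothetical helper: pull the access token out of the credentials list.
extract_token <- function(credentials) {
  token <- credentials[["access_token"]]
  if (is.null(token)) {
    # Fallback from the script in the comments: the JSON reply ends up as
    # the *name* of the credentials list, so split that name on quotes.
    token <- strsplit(toString(names(credentials)), '"')[[1]][4]
  }
  token
}

# Hypothetical wrapper around the browser-based OAuth 2.0 flow.
# Needs httr and an interactive R session.
authenticate_instagram <- function(endpoint, app, scope = "basic") {
  ig_oauth <- httr::oauth2.0_token(
    endpoint, app,
    scope = scope,
    type  = "application/x-www-form-urlencoded",
    cache = FALSE
  )
  extract_token(ig_oauth$credentials)
}

# Usage (interactive session only):
# token <- authenticate_instagram(ig$endpoint, ig$app, scope)
```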
Our analysis starts on the basis of a username. In my example I will use "therock", the account of actor Dwayne Johnson.
username <- "therock"
But most functions of the Instagram API work with the user ID, which we don't have yet.
So we use the search function to get information about the user with the username "therock".
This usually returns a list of around 50 people. We only extract the first returned user and check whether it is the one we were looking for, because if the username exists, the first result is always the exact match.
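A sketch of the search step, built around the request quoted in the comments (getURL() from RCurl, fromJSON() from rjson with unexpected.escape = "keep"). The helper names and the curl_opts argument are made up for illustration, and the v1 endpoint is the one used throughout this tutorial.

```r
# Hypothetical helper: build the user search URL.
build_search_url <- function(username, token) {
  paste0("https://api.instagram.com/v1/users/search?q=", username,
         "&access_token=", token)
}

# Hypothetical helper: return the id of the first hit if its username
# matches exactly, otherwise NULL (also when the result list is empty).
pick_user_id <- function(user_info, username) {
  if (length(user_info$data) == 0) return(NULL)
  first <- user_info$data[[1]]
  if (identical(first$username, username)) first$id else NULL
}

# Hypothetical wrapper: needs RCurl and rjson plus a valid token.
# curl_opts is handed to getURL() as .opts (for example SSL settings).
find_user_id <- function(username, token, curl_opts = list()) {
  raw <- RCurl::getURL(build_search_url(username, token), .opts = curl_opts)
  user_info <- rjson::fromJSON(raw, unexpected.escape = "keep")
  pick_user_id(user_info, username)
}

# Usage (needs network access and a token):
# user_id <- find_user_id(username, token)
```

If RCurl reports the SSL certificate error discussed in the comments, one option (my assumption, not part of the original script) is to pass curl_opts = list(cainfo = system.file("CurlSSL", "cacert.pem", package = "RCurl")), which points curl at the CA bundle shipped with RCurl instead of switching certificate verification off.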

Now that we have the user id we can start getting the post data.
This returns the user's 20 most recent pictures, which we will use for our analysis. We go through them with a for loop to extract the number of likes and comments and the date and time each photo was posted.
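A sketch of the media request, based on the call quoted in the comments. The helper names are made up; the optional count parameter follows a reader's suggestion below and is not part of the original proof of concept.

```r
# Hypothetical helper: URL for a user's recent media.
# count is optional; a reader comment suggests it can raise the default of 20.
build_media_url <- function(user_id, token, count = NULL) {
  url <- paste0("https://api.instagram.com/v1/users/", user_id,
                "/media/recent/?access_token=", token)
  if (!is.null(count)) url <- paste0(url, "&count=", count)
  url
}

# Hypothetical wrapper: needs RCurl and rjson plus a valid token.
get_recent_media <- function(user_id, token, count = NULL) {
  rjson::fromJSON(RCurl::getURL(build_media_url(user_id, token, count)))
}

# Usage (needs network access and a token):
# media <- get_recent_media(user_id, token)
```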
Instagram uses UNIX timestamps for its dates, so we have to convert them to make them readable.
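The extraction and date conversion could be sketched as follows. The field names (comments$count, likes$count, created_time) come from the script quoted in the comments; the function name media_to_df, the use of vapply() instead of the for loop, and the fixed UTC time zone are my choices for illustration.

```r
# Hypothetical helper: one row per post with comments, likes and date.
media_to_df <- function(media, tz = "UTC") {
  posts <- media$data
  # created_time arrives as a UNIX timestamp (seconds since 1970-01-01).
  created <- as.numeric(vapply(posts, function(p) as.character(p$created_time),
                               character(1)))
  data.frame(
    no       = seq_along(posts),
    comments = vapply(posts, function(p) as.numeric(p$comments$count), numeric(1)),
    likes    = vapply(posts, function(p) as.numeric(p$likes$count), numeric(1)),
    date     = format(as.POSIXct(created, origin = "1970-01-01", tz = tz),
                      "%Y-%m-%d %H:%M:%S"),
    stringsAsFactors = FALSE
  )
}

# Usage with a small made-up response:
mock_media <- list(data = list(
  list(comments = list(count = 3), likes = list(count = 10),
       created_time = "1420070400")
))
df <- media_to_df(mock_media)
```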

Visualization

Now we can visualize the data. I will use the rCharts package to do so. Of course you can also use ggplot2 or whatever package you like.
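A sketch of the chart, using the mPlot() call quoted in the comments. The wrapper name plot_engagement is made up, and it assumes a data frame df with the columns date, likes and comments.

```r
# Hypothetical wrapper: needs the rCharts package, which is only
# loaded once the function is actually called.
plot_engagement <- function(df) {
  m1 <- rCharts::mPlot(x = "date", y = c("likes", "comments"),
                       type = "Line", data = df)
  m1  # returned so the caller can print or otherwise display it
}

# Usage (interactive session with rCharts installed):
# m1 <- plot_engagement(df)
# m1
```

Note that assigning the chart to a variable does not draw anything by itself. In an interactive session, typing the object's name afterwards is what should display it.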

 

You can find the complete code of the tutorial on my GitHub account: https://github.com/JulianHill/R-Tutorials/blob/master/r_instagram.r
  • Sal

    Hi Julia. Interesting approach. Thank you for sharing.

    Which package is fromJSON() from ? When I follow your exact procedure, I got an error message “Error: could not find function “fromJSON” ”

    Thanks

    • Hey,
      thanks for your hint.
      I used the fromJSON() function from the rjson package.
      You can find it in the version on github: https://github.com/JulianHill/R-Tutorials/blob/master/r_instagram.r

      Hope this could solve your problem

      Regards

      • Sal

        Thanks. I installed the proper JSON package. Is it normal to get the following warning message?
        Warning message:
        In fromJSON(getURL(paste("https://api.instagram.com/v1/users/search?q=", :
        unexpected escaped character '\_' at pos 24. Keeping value.

        • Actually yes. Sometimes the Instagram API returns character combinations that fromJSON() can't really deal with, which would normally cause it to abort the call. But I added the unexpected.escape = "keep" parameter to the function call. This makes fromJSON() issue a warning instead of an error and keep the data it received.

          Regards

      • Sal

        Everything works fine now. Thank you
        My understanding is that the number of pictures imported into the ‘media’ list is set to 20 by default. Can this be modified ? (any number, or ‘all’, or ‘last 3 months’)

  • Rees Klintworth

    Is there any way to automatically close the browser page/get the access_token so that I don’t have to manually close the browser page?

    • Hey Rees
      I am also looking for a way at the moment. This would also be necessary if you wanted to use this kind of authentication in a Shiny application, for example.
      I'll let you know when I figure out a solution.

      Regards

  • You can get more than 20 items, but you’ll have to set the pagination for the JSON response. Here’s how you get there:

    fromJSON(getURL(paste('https://api.instagram.com/v1/users/',user_id,'/media/recent?count=200&min_timestamp=0&access_token=',token,sep="")))

    Cheers,
    Marco

    • Hey Marco
      thank you for this great info. I could really add this to the tutorial.
      I used the recent function without count as this is just a proof of concept.
      And to get a general overview 20 items are enough.

      Regards

  • Rees Klintworth

    Also, I have attempted to deploy this in a shiny application (https://rees.shinyapps.io/InstagramR/), but I am getting the following error: “Error : oauth_listener() needs an interactive environment.”

    Any advice as to how I can fix this? I’m new to oauth and everything that goes along with it.

    • Rees – looks like you fixed your problem! Share with us how you did?

  • philtiongson

    First of all, thanks very much for this. This is really useful and very informative. I am quite new to the world of R. Your posts are definitely helping me get the hang of using it.

    I have a question with regard to your code above. When I ran the following code –

    media <- fromJSON(getURL(paste('https://api.instagram.com/v1/users/',user_id,'/media/recent/?access_token=',token,sep="")))

    I get an error response from R that says the following:

    Error in function (type, msg, asError = TRUE) : SSL certificate problem: unable to get local issuer certificate

    May I ask for your help in addressing this error?

    Thank you very much!

    Phil

    • Hey
      sorry for the late answer.
      I heard of several people with your problem.
      Do you use Windows and RStudio? This seems to be the case for most people with this problem.
      Could you please try to run the code in the R console?

      Regards

  • Chris

    I’ve been working on this for a couple days, because this tutorial is awesome and it’s a good display of some pretty cool things that R can do…plus, I have an excess of free time because of the holiday. Might as well put it to good use.

    I keep getting an error :

    Error in function (type, msg, asError = TRUE) :
    SSL certificate problem: unable to get local issuer certificate

    In the section where we look to bring the data from the instagram api into R:

    user_info <- fromJSON(getURL(paste('https://api.instagram.com/v1/users/search?q=',username,'&access_token=',token,sep="")),unexpected.escape = "keep")

    Any suggestions on what to do? I've tried doing some basic research on RCurl, and on a whim used the flag 'ssl.verifypeer=false' but that didn't fix the issue.

    I'm on Windows 7 using RStudio, R Version 3.1.2.

    • Hey
      These SSL problems appear pretty often on Windows. Did you try it in the R console?
      Sometimes RStudio has problems with this kind of thing.
      Please tell me if that worked for you.

      Regards

  • Louise

    Can you help me?

    When I run the authentication code and the browser opens, this error appears:

    {"code": 400, "error_type": "OAuthException", "error_message": "Redirect URI does not match registered redirect URI"}

  • Stephane Doyen

    Hi Julian and community,

    Thanks for the great tutorial!
    I have followed it carefully and can get it to work for the REST api. Now, I’m trying to implement and adapt your example to the real time part of instagram api using the httr package.

    I’m using the following code to POST the instructions:

    r = POST(url = 'https://api.instagram.com/v1/subscriptions',
    body = "client_id=XXXXX;client_secret=XXXX;aspect=media;access_token=XXXX;callback_url=http://localhost:1410/;object=tag;object_id=selfie",
    encode = "form",
    verbose()
    )
    str(content(r))

    But all I get in return is: error_message: chr “Invalid URL. The URL may be on a private network.”

    Any help on how to handle the PubSubHubbub protocol in R would be very much appreciated.

  • Thanks for your post, it helped me a lot with analyzing Instagram. For those who get the SSL error, you may want to consider adding this line to your script. Hopefully it will fix your issue.

    options(RCurlOptions = list(verbose = FALSE, capath = system.file("CurlSSL", "cacert.pem", package = "RCurl"), ssl.verifypeer = FALSE))

    original post by skardhamar https://github.com/skardhamar/rga/issues/6

    • Thanks for sharing this solution!
      I will add it to the tutorial.

      Regards
      Julian

  • I am on a Windows 7 machine, using RStudio. I solved the SSL certificate problem with this code:

    user_info <- fromJSON(getURL(paste('https://api.instagram.com/v1/users/search?q=',username,
    '&access_token=',token,sep=""), .opts = list(ssl.verifypeer = FALSE)
    ),unexpected.escape = "keep")

    The addition of .opts = list(ssl.verifypeer = FALSE) fixed my problems.

    Now, I am adapting this to retrieve a complete list of users who liked a photo, but Instagram's API only returns the first 120 users. Anyone know if pagination works for the /user endpoint, i can't seem to get it to work?

    And Julian, you are a rock star. Thanks for this tutorial.

    • Hey
      sorry for the late answer and thanks for the kind words.

      A user posted this code some time ago:
      fromJSON(getURL(paste('https://api.instagram.com/v1/users/',user_id,'/media/recent?count=200&min_timestamp=0&access_token=',token,sep="")))

      I think that should make it work for you.

      Regards

  • Vanessa

    how do we fix the error:
    Redirect URI does not match registered redirect URI

    • Hey
      could you please post the exact code you used?
      And where exactly does this error show up?

      Regards

  • The code runs successfully, but at the end I'm not getting the chart. Help!

  • Hi. I’ve followed the directions, downloaded all necessary packages but don’t get a chart at the end. Is there something I’m not doing right?

    I’ve omitted some of the script for privacy but this is what it looks like.

    #Analyze Instagram with R
    #Author: Julian Hillebrand
    #Recreated for Research Purposes by Tamren Estevez-Lopez

    #packages
    require(httr)
    require(rjson)
    require(RCurl)

    #Authentication

    ## getting callback URL
    full_url <- oauth_callback()
    full_url <- gsub("(.*localhost:[0-9]{1,5}/).*", x=full_url, replacement="\\1")
    print(full_url)
    #notetoself: this gets pasted on Instagram App Settings

    app_name <- "MiningWithR"
    client_id <- "omit"
    client_secret <- "omit"
    scope = "basic"

    instagram <- oauth_endpoint(
    authorize = "https://api.instagram.com/oauth/authorize",
    access = "https://api.instagram.com/oauth/access_token")
    myapp <- oauth_app(app_name, client_id, client_secret)

    ig_oauth <- oauth2.0_token(instagram, myapp,scope="basic", type = "application/x-www-form-urlencoded",cache=FALSE)
    tmp <- strsplit(toString(names(ig_oauth$credentials)), '"')
    token <- tmp[[1]][4]

    ########################################################

    username <- "OMIT"

    #search for the username
    user_info <- fromJSON(getURL(paste('https://api.instagram.com/v1/users/search?q=',username,'&access_token=',token,sep="")),unexpected.escape = "keep")
    received_profile <- user_info$data[[1]]
    if(grepl(received_profile$username,username))
    {
    user_id <- received_profile$id
    #Get recent media (20 pictures)
    media <- fromJSON(getURL(paste('https://api.instagram.com/v1/users/',user_id,'/media/recent/?access_token=',token,sep="")))
    df = data.frame(no = 1:length(media$data))
    for(i in 1:length(media$data))
    {
    #comments
    df$comments[i] <-media$data[[i]]$comments$count
    #likes:
    df$likes[i] <- media$data[[i]]$likes$count
    #date
    df$date[i] <- toString(as.POSIXct(as.numeric(media$data[[i]]$created_time), origin="1970-01-01"))
    }
    #Visualization
    require(rCharts)
    m1 <- mPlot(x = "date", y = c("likes", "comments"), type = "Line", data = df)

    Any help would be greatly appreciated!

    • Hey
      what is the content of the m1 object at the end?

      Best regards

  • I’m having the same issue as shruti. Script is running successfully but I’m not getting a chart at the end. The last line I’ve entered is

    m1 <- mPlot(x = "date", y = c("likes", "comments"), type = "Line", data = df)

  • Joseph Southwell

    So we were using this tutorial today, and it started out the day working fine, but now we can no longer get a list of users.

    It comes back empty every time no matter what we put in and in the documentation for their api there is a massive changelog entry for today. They added new scope called public_content. (searching users now requires public_content but changing that didn’t help.) They talk about sandboxing but I tried a substring of my own username and that didn’t work either.

    ig_oauth tmp token user_info user_info
    $meta
    $meta$code
    [1] 200
    $data
    list()

    • Sandbox isn’t available yet. It will be a limited ‘testing’ environment. If you didn’t register an app under your account prior to today you won’t be able to do anything until I think 12/3/2015. It’s apparently an attempt to discard malicious third party apps. The sandbox appears as if it will be greatly limited if you aren’t actually developing an application and submitting for review.

    • Shixiao Cui

      I am having the same problem as you. I got an empty list too.

  • Sergio Aguado

    Hi!

    I have some problems when I tried to get the token in shiny apps. Do you have an example? Thank you so much.

  • Karl Miller

    Hi!

    I got the message

    Error in file(con, "r") : cannot open the connection

    if I run:

    media<-fromJSON(getURL(paste('https://api.instagram.com/v1/users/',user_id,'/media/recent/?access_token=',token,sep="")))

    Any suggestions or recommendations?

    Thank you so much.