Create a wordcloud with Twitter Data and R

First, follow the steps described in my tutorial about Sentiment Analysis with Twitter, but stop before the section “The Analyzing”.

At this point we have our tweets.

We now have to extract the text from our tweets in order to analyze it. We do this with:
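
A minimal sketch of this step, assuming `tweets` is the list of status objects returned by `searchTwitter()` from the twitteR package, each of which offers a `getText()` method. The stand-in list below only mimics that interface so the snippet runs without a Twitter connection; the variable names and sample tweets are made up for illustration.

```r
# Stand-in for the result of twitteR::searchTwitter(): every element
# offers getText(), just like the real status objects do.
tweets <- list(
  list(getText = function() "RT @user: Wordclouds with #rstats are fun!"),
  list(getText = function() "Twitter data &amp; R http://t.co/abc123"),
  list(getText = function() "Learning text mining with the tm package")
)

# Pull the plain text out of every tweet into a character vector
tweets.text <- sapply(tweets, function(t) t$getText())
```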

Sometimes this text contains invalid characters, which can make the following steps crash, so we have to remove them.

We can use a function from the Viralheat site to do so:
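
As a rough sketch of what such a cleaning function can look like, the following base R version strips invalid bytes, retweet markers, user mentions, links, punctuation and digits. The body is an assumption for illustration and not the Viralheat code itself; only the name `clean.text()` is taken from this tutorial.

```r
clean.text <- function(txt) {
  # Drop bytes that are not valid UTF-8
  txt <- iconv(txt, from = "UTF-8", to = "UTF-8", sub = "")
  # Remove retweet markers such as "RT @user" or "via @user"
  txt <- gsub("(RT|via)((?:\\b\\W*@\\w+)+)", "", txt, perl = TRUE)
  # Remove remaining user mentions and links
  txt <- gsub("@\\w+", "", txt)
  txt <- gsub("http\\S+", "", txt)
  # Remove punctuation and digits
  txt <- gsub("[[:punct:]]", "", txt)
  txt <- gsub("[[:digit:]]", "", txt)
  # Collapse runs of whitespace and trim both ends
  txt <- gsub("\\s+", " ", txt)
  txt <- gsub("^ | $", "", txt)
  txt <- tolower(txt)
  # Drop tweets that ended up empty
  unname(txt[txt != ""])
}

# Example call on two made-up tweets
cleaned <- clean.text(c("RT @user: Wordclouds with #rstats are fun!",
                        "@friend 123"))
```

The second sample tweet consists only of a mention and digits, so nothing is left of it after cleaning and it is dropped from the result.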

Just copy and paste this code into R and hit Enter. You can then use the function by applying it to the text we extracted from the tweets.

We add this clean text to a so-called Corpus, which is the main structure in the tm package for storing collections of text documents. To fill this Corpus from our character vector we have to use the VectorSource() function. This looks like this:

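A minimal sketch of this step, assuming the tm package is installed and the cleaned tweets are stored in a character vector (called `tweets.text` here, filled with made-up sample content):

```r
library(tm)

# Sample cleaned tweet texts; in the tutorial this is the output of clean.text()
tweets.text <- c("wordclouds with rstats are fun",
                 "twitter data and rstats",
                 "learning text mining with the tm package")

# VectorSource() treats every element of the vector as one document
tweets.corpus <- Corpus(VectorSource(tweets.text))
```
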
Next we have to transform this Corpus into a so-called term-document matrix. This matrix describes how often each term occurs in each document of the collection.

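Sketched with the tm package, and again assuming a small made-up corpus in place of the real tweets, the conversion could look like this:

```r
library(tm)

# Made-up corpus standing in for the cleaned tweets
tweets.corpus <- Corpus(VectorSource(c(
  "wordclouds with rstats are fun",
  "twitter data and rstats",
  "learning text mining with the tm package"
)))

# Rows are terms, columns are documents, cells hold the counts
tdm <- TermDocumentMatrix(tweets.corpus)
```

TermDocumentMatrix() also accepts a `control` list, for example to drop stopwords, which usually makes the resulting cloud more informative.
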
OK, now we have our tdm. All we have to do now is sort our words by frequency and put them in the wordcloud.
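
The sorting can be sketched in base R as follows. The small hand-made matrix `m` stands in for `as.matrix(tdm)`; its terms and counts, as well as the names `word_freqs` and `dm`, are assumptions for illustration.

```r
# Stand-in for as.matrix(tdm): rows are terms, columns are documents
m <- matrix(c(1, 0, 2,
              3, 1, 1,
              0, 1, 0),
            nrow = 3, byrow = TRUE,
            dimnames = list(Terms = c("cloud", "twitter", "data"),
                            Docs = c("1", "2", "3")))

# Total count per term, most frequent first
word_freqs <- sort(rowSums(m), decreasing = TRUE)

# One row per word, ready to hand over to the plotting step
dm <- data.frame(word = names(word_freqs), freq = word_freqs,
                 row.names = NULL, stringsAsFactors = FALSE)
```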

But first we have to install the wordcloud package:
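
A minimal sketch of installing the package and drawing the cloud. The data frame `dm` holds made-up frequencies here, and the plotting options are just one possible choice.

```r
# install.packages("wordcloud")   # only needed once
library(wordcloud)
library(RColorBrewer)             # provides brewer.pal() for the colours

# Hypothetical word frequencies, sorted from most to least frequent
dm <- data.frame(word = c("twitter", "rstats", "wordcloud", "data"),
                 freq = c(12, 9, 7, 4),
                 stringsAsFactors = FALSE)

wordcloud(dm$word, dm$freq,
          min.freq = 1,           # keep rare words too (default is 3)
          random.order = FALSE,   # plot words in decreasing frequency
          colors = brewer.pal(8, "Dark2"))
```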

OK, here we have our wordcloud. If you want to save it to your computer, you can do it with:
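
One way to do this, sketched with the base R png() device: open the device, draw the cloud, then close the device so the file gets written. The image size and the sample frequencies are assumptions.

```r
library(wordcloud)

# Hypothetical word frequencies
dm <- data.frame(word = c("twitter", "rstats", "wordcloud", "data"),
                 freq = c(12, 9, 7, 4),
                 stringsAsFactors = FALSE)

png("Cloud.png", width = 1200, height = 800)   # open a PNG graphics device
wordcloud(dm$word, dm$freq, min.freq = 1, random.order = FALSE)
dev.off()                                      # close the device to write the file
```

With a relative file name like this one, the picture ends up in the current R working directory, which getwd() prints.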

[Image: the resulting wordcloud, Cloud.png]

Now you can find the file Cloud.png on your computer. Enjoy your own clouds!

Info: In the cloud picture you can see that the word “amp” was often used. This is a small mistake: it is what remains of the HTML entity “&amp;” once the punctuation has been stripped. To get rid of it, add this keyword to the clean.text() function so that it is removed as well.
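
A small sketch of the problem and one possible fix, using a made-up example string. Inside clean.text() the same gsub() call would have to run before the punctuation is removed; where exactly it goes depends on the version of the function you are using.

```r
raw <- "cats &amp; dogs"

# Without the fix, stripping punctuation leaves "amp" behind as a word
without_fix <- gsub("[[:punct:]]", "", raw)

# Fix: remove the entity first, then tidy up the whitespace
with_fix <- gsub("&amp;", " ", raw, fixed = TRUE)
with_fix <- gsub("\\s+", " ", with_fix)
```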

Julian Hillebrand



  • Great post for #R beginners like me, thanks! Do you have an example of how to create the wordcloud based only on the hashtags inside tweets (instead of the complete tweet text/words)?

    • Thank you Jochen! I’ll think about how to do it, but it shouldn’t be that hard, as you just have to search for words starting with # in the tweet text.

  • lena

    great tutorial ! thank you!

  • giuseppe d

    Is it possible to make a cloud or an interest map from the tweets of a given Twitter user? In a few words: given a Twitter user, make a summary of the phrases or interests that are most “popular” within all of their tweets.

  • raviteja

    thank u soo much
    it helped me a lot, i tried with different words and got the output..
    i hv one doubt
    how to store the values in mysql using r.
    I need to store the vector values from r in mysql tables.
    cn u pls tell me.