Analytics Blog

Workout of the Week: Sentiment Analysis

And the Oscar goes to? Alpine!

As we gear up for the Oscars this weekend, we started wondering: can Alpine predict the winners?

This week, we dive into text analysis and use another classifier model called “Naive Bayes” to predict a reviewer’s sentiment toward a movie. Our very own Dillon Woods published the full analysis on his blog, where he describes how he transformed Rotten Tomatoes reviews into a usable data format.

Data: Movie Reviews from Rotten Tomatoes

Analysis: If social websites such as Twitter and Facebook have taught us anything, it’s that people are not afraid to express their opinions openly on public forums. While most opinionated posts on the internet reach a very small audience of friends and family, some people have developed large social followings that give their words weight. An audience of millions reads anxiously when LeBron James tweets about his preference for basketball shoes. The opinions being expressed about brands are valuable information that allows companies to tweak their products and marketing campaigns in order to find and retain customers. As the volume of data being created on social websites continues to grow, it has become impossible for companies to react to trending opinions without the aid of automated sentiment analysis.

Although sentiment analysis has become more popular in recent years, it has yet to become an easy problem to solve. Text must be cleansed, parsed, and analyzed before a statistical model can be developed that is capable of automatically determining whether the writer was expressing a positive or negative opinion about a particular brand. The complexity of these tasks has proved daunting to most companies, but this article describes an approach for using Alpine, Greenplum, and GPText to easily create a sentiment analysis model.
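
To make the cleanse, parse, and model steps concrete, here is a minimal Python sketch of a Naive Bayes sentiment classifier. It is not the Alpine, Greenplum, or GPText workflow from the full analysis. The function names and the four toy reviews are invented for illustration, and add-one (Laplace) smoothing is an assumption made here so that unseen word/label pairs do not zero out a score.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Cleanse and parse: lowercase the text and keep runs of letters/apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def train(labeled_reviews):
    """Count reviews per label and word occurrences per label."""
    doc_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in labeled_reviews:
        doc_counts[label] += 1
        for tok in tokenize(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return doc_counts, word_counts, vocab


def predict(model, text):
    """Return the label with the highest log-probability under Naive Bayes."""
    doc_counts, word_counts, vocab = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in doc_counts:
        # Start from the log prior: the share of training reviews with this label.
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for tok in tokenize(text):
            if tok not in vocab:
                continue  # ignore words never seen during training
            # Add-one smoothing keeps unseen word/label pairs from zeroing the score.
            score += math.log(
                (word_counts[label][tok] + 1) / (total_words + len(vocab))
            )
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical toy data, not taken from the Rotten Tomatoes data set.
TOY_REVIEWS = [
    ("A brilliant, moving film with a wonderful cast", "positive"),
    ("Wonderful direction and a brilliant script", "positive"),
    ("A dull, boring mess with a terrible script", "negative"),
    ("Terrible pacing and a boring plot", "negative"),
]

model = train(TOY_REVIEWS)
print(predict(model, "a brilliant and wonderful film"))  # positive
print(predict(model, "boring and terrible"))             # negative
```

With only four training reviews this is a toy. A real model needs far more labeled text and more careful cleansing than a single regular expression.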


Once again, be sure to subscribe to this blog to receive alerts for new posts in this series. You can subscribe at the top right of this page or add this blog to Feedly or your RSS reader.