As part of my CMPUT 500 – Introduction to Natural Language Processing course (Fall 2007), I was required to complete an individual course project. Because I was affiliated with the Software Engineering lab at the University of Alberta, I decided to integrate some NLP into eNulog, an existing visualization project created by that lab. (Some of the following is taken from my Canadian AI paper, which can be found here in its entirety.)
The eNulog project aimed to aggregate and visualize RSS feeds and movie blog postings in a simple, easy-to-use interface. Within the eNulog interface, each movie, actor, director, or movie genre is represented by a node. A user can click on a node and view all of the blogs, comments, or RSS feeds that relate to the given movie (or actor, director, or movie genre). The size of each node reflects the number of posts, comments, or feeds about that movie. Once a node has been clicked/selected, all of the nodes that are similar to it gather around it; those that are dissimilar (e.g., the comedy genre is vastly different from the action genre) move farther away.
For my project, I took the eNulog program and mined all of the blog postings to determine their relative polarity. In the Canadian AI conference paper that I wrote along with Dr. Greg Kondrak, we compared two different sentiment analysis techniques (a lexical/dictionary-based approach and a machine learning/support vector machine approach) using the eNulog blog data set. The stronger approach (using support vector machines via SVMLite) was incorporated into the eNulog visualization program. The paper can be found here. The two pictures illustrate the interface: green nodes indicate that the reviews of the movie in question were positive on average, red nodes that they were negative, and yellow nodes that they were neutral or split 50-50 (or somewhat close to even). The first picture shows that the movie ‘Batman Begins’ has 14 blog postings in total, which generally recommend the movie. In the second picture, the movie ‘Charlie and the Chocolate Factory’ has been selected/clicked. All of the movies similar to it have moved closer to it, and those that are dissimilar are farther away.
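To make the colouring rule a little more concrete, here is a rough Python sketch of how per-post polarity labels could be turned into a node colour. This is not the actual eNulog code: the function name `node_color`, the `"pos"`/`"neg"` labels, and the `margin` cut-off that decides what counts as "somewhat close to even" are all assumptions made for illustration.

```python
from collections import Counter


def node_color(labels, margin=0.1):
    """Map the polarity labels of a movie's posts to a node colour.

    labels: iterable of "pos" / "neg" strings, one per blog post
            (hypothetical label names, e.g. the output of a classifier).
    margin: hypothetical cut-off; a positive/negative balance within
            +/- margin is treated as "close to even".
    """
    counts = Counter(labels)
    total = counts["pos"] + counts["neg"]
    if total == 0:
        # No classified posts: treat the node as neutral.
        return "yellow"

    # Balance ranges from -1.0 (all negative) to +1.0 (all positive).
    balance = (counts["pos"] - counts["neg"]) / total
    if balance > margin:
        return "green"
    if balance < -margin:
        return "red"
    return "yellow"
```

For example, a movie whose posts are mostly labelled `"pos"` would come out green, while an even split of `"pos"` and `"neg"` would come out yellow. The real program may well weight posts or draw the boundaries differently.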