Analyzing the Relationships Between Users in Wiki Settings

During the summer of 2007, when I was working on the wiEGO project, I supervised my second WISEST student, Joyce Lam. Because there was a lot of work in the Software Engineering lab investigating wikis and blogs, we set Joyce up on a project that analyzed the relationships that existed between wiki users. Using the Annoki Software Engineering wiki, she modified an existing wiki visualization program to help users compare the behavior (number of wiki page edits) of any two wiki contributors. As the first picture illustrates, a user of the Java-based visualization program can select two users who have contributed to the Annoki Wiki and then view an orbit graph (Image 2). In the main application panel, you can also choose to view a one-person or a two-person orbit diagram. You can also zoom in and out of the graph, hide pages you don’t want to see, and view a number of wiki statistics on the right-hand side.

In an orbit graph, there is a central orb surrounded by a number of concentric circles. The center orb holds the two wiki contributors who were selected for comparison. Each concentric circle holds a number of colored dots, and each dot represents a wiki page in the Annoki wiki. The color of each dot indicates which wiki contributor edited the page (one, the other, or both). The dots closest to the center orb have been edited the most; those edited the least (if at all) are the farthest away.
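The placement rule described above can be sketched in a few lines of Python. This is only an illustration, not the code of the original Java program: the `PageEdits` record, the number of rings, and the linear mapping from edit count to ring are all assumptions made for the example.

```python
from collections import namedtuple

# Hypothetical record: edit counts on one wiki page for the two selected users.
PageEdits = namedtuple("PageEdits", ["title", "edits_a", "edits_b"])


def dot_color(page):
    """Color encodes who edited the page: one user, the other, both, or nobody."""
    if page.edits_a and page.edits_b:
        return "both"
    if page.edits_a:
        return "user_a"
    if page.edits_b:
        return "user_b"
    return "unedited"


def ring_index(page, max_edits, num_rings):
    """Map a page's total edits to a ring; ring 0 is closest to the center orb."""
    if max_edits == 0:
        return num_rings - 1
    # Share of the busiest page's edits, inverted so more edits means a smaller ring.
    fraction = (page.edits_a + page.edits_b) / max_edits
    return min(num_rings - 1, int((1 - fraction) * num_rings))


def layout(pages, num_rings=4):
    """Return {page title: (ring index, color)} for every page."""
    max_edits = max((p.edits_a + p.edits_b for p in pages), default=0)
    return {
        p.title: (ring_index(p, max_edits, num_rings), dot_color(p))
        for p in pages
    }
```

With this mapping, the most heavily edited page always lands on the innermost ring and pages nobody touched land on the outermost one.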

One can also select a specific wiki contributor and view all of the wiki pages that were edited by that contributor (via lines/edges connecting the dots). Although this project was not completely finished, both Joyce and I learned many interesting things about integrating wiki information with visualizations during her summer internship.

 

 

Invenio – Tracking Music Trends Using Web Services

The Invenio project was part of the course requirements for my CMPUT 660 – Web Services course. When I took the course (Winter 2008), I was reading lots of celebrity gossip websites (such as Perez Hilton) every day and was listening to a ton of top 40 hits in Winamp. It was around this time that Perez Hilton started posting information about new, unsigned (or little known) artists in the USA and Europe. About once a week I would find out about some new person that I had never heard of, or who had not yet gotten ‘big’ in Canada or Alberta. This started me thinking about how music starts to spread across North America and why some artists are popular everywhere, but some are entirely regional.

For my final course project, one of our requirements was to incorporate geographic information into a web service. Coupling this requirement with my thoughts on music trends, I decided to build a web service that would geovisualize radio station music. (At the time, XM and Sirius weren’t as big as they are today, and this project was a few months before last.fm [I think that’s right] came out with their big analytics software and algorithms). Over a period of 6 weeks (last week in January until 2nd week in March), I collected and organized the music chart information for 190 radio stations that are registered with Nielsen SoundScan/Billboard Magazine.

I then took this information and created a data-intensive, REST-based RIA entitled Invenio. Invenio combines a variety of different technologies (Yahoo! Maps, Amazon Associates Web Service, REST, and the Adobe Flex framework) to geographically visualize aggregated music chart information. You can watch a short-ish / long-ish video about Invenio’s features below (or on YouTube here). This project was very successful for me – I got a publication accepted into CASCON 2008 (co-authored by the course instructor Dr. Eleni Stroulia). You can read the paper in its entirety here on the ACM website or email me and I can send you a copy.

If you don’t feel like watching the whole video (I know it’s long), here are a couple of pictures that illustrate Invenio’s main features:

In the main Invenio view, you can select the artist, song, and time period and then view the song’s position on each of the 190 radio stations in the US and Canada over this time period. Each circle represents one radio station (or song, or artist, depending upon the view). In some views (Track By Artist), the size of the circle indicates the song or artist’s chart position; in other views (Track By Success), the color of the circle indicates the genre of music. Pictured above is the Track By Success view, whereby one can view the top or bottom genre that was on each radio station during the selected time period. If one chooses to ‘View All Weeks’, then they will see a geographic time-lapsed animation of the options they have chosen (e.g., how Alicia Keys’ song ‘No One’ fared on the charts for the 6 week period).

In this second picture, instead of circles indicating song position or genre, we are shown the cover of the album (pulled from Amazon) that contains the song currently on the chart (in this case, in the #1 position). The main window’s maps are fully interactive – you can zoom in and out, pan the map, and change its type (e.g., satellite, hybrid, map). You can also elect to have tool tips appear (that provide additional information about the radio station and link to the radio station’s website). Also, you can display additional information which is pulled from the Amazon web site (e.g., album price, a link to the album’s page, a review of the album, the number of lists that the album is on, the genre of music, etc.).

Another type of visualization that is available in Invenio is the Cover Flow or Display Shelf (when I was making Invenio, Apple hadn’t popularized them yet). Each radio station has six display shelves associated with it, one per week, and each display shelf visualizes the music that was on that week’s chart. One can select a radio station using the combo box (or by clicking on a circle or artist album cover in the main application window) and all of the display shelves will appear. Once they appear, you can choose to flip through each of them individually, or ‘lock’ them according to chart position or song. This alternative view makes it easy to see how a song has fared using a method that is very different from the main application’s map. Similar to the main application window, you can choose to see tool tips and additional artist/album information from Amazon.

 

The last visualizations that are contained within Invenio are the charting views. Because most people are used to viewing information via charts rather than display shelves or maps, I chose to include three different charting options in Invenio. There are Bubble Charts (the picture directly below this one; the size of each bubble indicates the position of that song on the corresponding radio station’s chart), a line chart indicating a chart score (the picture in the middle; the average position of a song across all radio stations), and a vertical chart (the picture at the bottom; it indicates if each song/artist has been fairly consistent over time or if they have had a large variance in chart position over the given time period).
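The two aggregate measures mentioned above (an average chart position across stations, and how much that position varies over time) can be sketched as follows. This is a minimal Python illustration rather than Invenio's Flex code; the function names and the choice of population variance as the consistency measure are assumptions.

```python
from statistics import mean, pvariance


def chart_score(positions):
    """Average chart position of a song across the stations that charted it."""
    return mean(positions)


def weekly_scores(positions_by_week):
    """Chart score for each week, given {week: [position on each station]}."""
    return {week: chart_score(p) for week, p in positions_by_week.items()}


def consistency(scores):
    """Population variance of the weekly scores; lower means more consistent."""
    return pvariance(scores)
```

For example, a song that sits at positions 1, 3, and 5 on three stations in a given week gets a chart score of 3 for that week, and a song whose weekly score never changes has a variance of 0.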

 

AViz – Visualizing Execution Traces for the Dynamic Analysis of Software

For my CMPUT 666 – Reverse Software Engineering course (Fall 2007), I worked on an individual project relating to the visualization of execution traces for dynamic analysis of a software program.

As many software engineers will attest, one of the most important and time-consuming activities within the software development cycle is the continual maintenance of a software system or program. Contrary to popular belief, roughly 50% of the costs encountered during a typical software development cycle are incurred during the modification and maintenance phases of a system, not during the design or implementation activities. Estimates have shown that 50% of the software maintenance phase is spent trying to comprehend a software system. Because program comprehension contributes so much time, effort and money to the total cost of system development, a logical question to ask is: what are the problems with the methods, tools, and techniques that are being used, and how can they be improved upon?

Although each has its own problems, static and dynamic analyses are very beneficial for discovering the behavior and architecture of a system. One of the most surprising things one notices after reading the static and dynamic analysis literature is that these two techniques have not been combined very often. A hybrid of these approaches would involve aggregating the static and dynamic artifacts together, and creating a new visualization that represents the system’s architecture and behavior. Of the limited research which has tried to create a ‘hybrid’ analysis, the most common visualizations that are created are UML collaboration diagrams (a hybrid UML class and sequence diagram):

Within the context of my final project, I wanted to explore the possibilities of using this hybrid approach to assist maintenance personnel in program comprehension when they are performing evolutionary maintenance tasks. Motivated by the success of a previously created software program at the University of Alberta (JDEvAn Viewer), I wanted to extend this work to include a dynamic analysis element. To this end, I experimented with the creation of a hybrid analysis system, AViz (Aspect Visualization). I utilized the typical reverse engineering paradigm to extract execution traces from my program (using the AspectJ language and writing to files), analyzed the traces (in Java) and then output the results using a UML collaboration diagram (using the Java SWT and Draw2D libraries).
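To make the middle step of that pipeline concrete, here is a small Python sketch of how trace lines might be aggregated into the caller-to-callee message counts that a collaboration diagram draws as edges. It is not the AViz implementation (which was written in Java); the trace line format `CallerClass.method -> CalleeClass.method` is a hypothetical one chosen for the example.

```python
from collections import Counter


def parse_trace(lines):
    """Count class-to-class interactions found in execution trace lines.

    Assumed (hypothetical) line format:
        CallerClass.method -> CalleeClass.method
    """
    edges = Counter()
    for line in lines:
        line = line.strip()
        if "->" not in line:
            continue  # skip blank or malformed lines
        caller, callee = (part.strip() for part in line.split("->", 1))
        # Drop the method name so that edges connect classes, not methods.
        caller_cls = caller.rsplit(".", 1)[0]
        callee_cls = callee.rsplit(".", 1)[0]
        edges[(caller_cls, callee_cls)] += 1
    return edges
```

Each key in the resulting counter corresponds to one edge between two classes, and the count gives the number of messages sent along it.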

 
Thanks to Zhenchang Xing (for the JDEvAn Viewer code) and Mike Smit (for his AspectJ sample code to get me started) at the UofA.

eNulog – Determining and Visualizing the Polarity of Movie Reviews Using Sentiment Analysis

As part of my CMPUT 500 – Introduction to Natural Language Processing course (Fall 2007), I was required to do an individual course project. Because I was tied to the Software Engineering lab at the University of Alberta, I decided to integrate some NLP into an existing visualization project (eNulog) that was created by the Software Engineering lab. (Some of the following is from my Canadian AI paper which can be found here in its entirety).

The eNulog project aimed to aggregate and visualize RSS feeds and movie blog postings in a simple, easy to use interface. Within the eNulog interface, each movie, actor, director, or movie genre is represented by a node. A user can click on a node and view all of the blogs, comments, or RSS feeds that relate to the given movie (or actor, director, or movie genre). The size of each node relates to the number of posts, comments, or feeds about that movie. Once a node has been clicked/selected, all of the nodes which are similar to it will aggregate around it; those nodes which are dissimilar (e.g., the comedy genre is vastly different from the action genre) will move farther away.


For my project, I took the eNulog program and mined all of the blog postings to determine their relative polarity. In the Canadian AI conference paper that I wrote along with Dr. Greg Kondrak, we compared two different sentiment analysis techniques (a lexical/dictionary-based approach and a machine learning/support vector machine approach) using the eNulog blog data set. The stronger approach (using support vector machines via SVMLite) was incorporated into the eNulog visualization program. The paper can be found here. The two pictures illustrate the interface: green nodes indicate that the average movie reviews for the movie in question were positive, red nodes indicate that they were negative, and yellow nodes indicate that they were neutral or split 50-50 (or somewhat close to even). The first picture shows that the movie ‘Batman Begins’ has 14 total blog postings that generally recommend the movie. In the second picture, the movie ‘Charlie and the Chocolate Factory’ has been selected/clicked. All of the movies similar to it have moved closer to it, and those that are dissimilar are farther away.
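As an illustration of the simpler of the two techniques, the sketch below scores posts with a tiny word list and maps the average polarity to a node color. It shows the lexical/dictionary-based idea only, not the SVM approach that was actually incorporated into eNulog; the word lists, the `neutral_band` threshold, and the function names are assumptions made for the example.

```python
import re

# Tiny illustrative lexicon (an assumption); a real one would be far larger.
POSITIVE = {"great", "good", "excellent", "recommend", "loved"}
NEGATIVE = {"bad", "boring", "awful", "terrible", "hated"}


def post_polarity(text):
    """Return +1, 0, or -1 depending on which kind of word dominates the post."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return (score > 0) - (score < 0)


def node_color(posts, neutral_band=0.2):
    """Green for mostly positive posts, red for mostly negative, else yellow."""
    if not posts:
        return "yellow"
    average = sum(post_polarity(p) for p in posts) / len(posts)
    if average > neutral_band:
        return "green"
    if average < -neutral_band:
        return "red"
    return "yellow"
```

A movie whose posts split evenly between positive and negative averages out to zero and is drawn in yellow, matching the "split 50-50" case described above.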

wiEGO – Wiki Integrating Electronic Graphical Organizers

In the summer of 2007, I was awarded a Canadian Distributed Mentor Project award and spent the summer continuing my previous work with Dr. Eleni Stroulia. wiEGO, my summer project, is a Java-based applet (built off of AnnokiBlooms) that interacts with an open source content management system (Moodle) and a wiki (Annoki) to assist Junior High students with their group projects. Keeping Bloom’s Taxonomy of learning objectives in mind, and to enable increasingly rich levels of learning, wiEGO supports a variety of graphic organizer structures of different complexities and enables the association of a single wiki document with multiple graphic organizers. wiEGO is comprised of three main sections.

Moodle

Moodle (Modular Object-Oriented Dynamic Learning Environment) is a platform-independent content management system (CMS) that is used by educators at all levels of education (elementary, secondary and post-secondary) throughout the world. Moodle takes a “constructivist and social constructionist approach to education, emphasizing that learners (and not just teachers) can contribute to the educational experience in many ways” [8]. It is used to post course materials, administer quizzes, create assignments, blogs, message boards, and forums, and manage course content through an online e-learning website. Moodle is not only user friendly for students and teachers, but it is also developer friendly. Everything in Moodle is modular, so any new additions that need to be made, which are called ‘blocks’, are extremely easy to author and implement. All code is written in PHP.

To implement wiEGO, not many modifications needed to be made to our Moodle installation. I added in a ‘wiEGO Block’ that allows a user to view all of the wiki changes fellow group members have made since their last login to Moodle. A user can select a page title from the list of recent changes and they will not only be taken to this page to view the changes, but also logged into Annoki so they can edit, view or add new pages. This transparent login allows a student to only log in once, to Moodle, and not twice (once to Moodle and once to Annoki).
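The selection logic behind such a block can be sketched as a simple filter. The real block was written in PHP against the Moodle and Annoki databases; the Python below only illustrates the idea, and the `(page_title, editor, timestamp)` tuple shape and function name are hypothetical.

```python
from datetime import datetime


def changes_since_last_login(changes, group_members, user, last_login: datetime):
    """List pages that fellow group members edited after the user's last login.

    `changes` is an iterable of (page_title, editor, timestamp) tuples.
    Titles are returned most recent first, without duplicates.
    """
    titles = []
    for title, editor, timestamp in sorted(
        changes, key=lambda change: change[2], reverse=True
    ):
        is_group_mate = editor in group_members and editor != user
        if is_group_mate and timestamp > last_login and title not in titles:
            titles.append(title)
    return titles
```

Edits made by the user themselves, by people outside the group, or before the last login are all left out of the list.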

Annoki

Annoki (Annotated Wiki) is a web-based wiki that contains a variety of wiki extensions. A wiki is a website that encourages individuals to collaborate with others by editing and commenting on ‘wiki’ pages. A wiki is an excellent piece of software to use for group project planning because it can track user behavior and allow for anytime, anywhere collaboration between group members. When a group is writing a final report, this is very efficient because any member can edit a section of the page, and no one will have to email their section of the report to another group member. As well, a teacher can look at the history of a page and determine which group members have collaborated when, and how meaningful their contributions were.

With this in mind, Annoki is based on the very popular MediaWiki software application. MediaWiki is a LAMP system (Linux, Apache Server, MySQL, PHP) that is platform independent. In terms of the wiEGO project, the seamless combination of PHP and MySQL allowed me to easily query the MySQL database and output XML files that can be used in the EGO program.

There are many different extensions and modifications that I added to Annoki for the wiEGO project. First, I needed to update Annoki from version 1.5 to 1.10, which took me roughly 2 weeks. I also added a spellchecker, PDF viewer, task pages, calendar, group access control, special pages to create new group projects (for teachers) and wiEGO extensions to seamlessly allow users to open EGO in a new browser window, to name a few. I also spent a large amount of time on the CSS of the website and changed the wiki skin (Figure 3). Some of the new functionality required me to write some JavaScript to interact with the PHP and MySQL of MediaWiki.

EGO

Because wiEGO contains an extension of AnnokiBlooms, it builds on a previous toolkit that one of Dr. Stroulia’s students had created, TOMU. TOMU is a Java-based applet that reads in an XML XTM file. The XML files TOMU uses are created by our wiki, Annoki: the database is queried for every wiki page’s name and associated pages, and the results are output to an XML XTM file. This XML file is read by the TOMU toolkit, and TOMU uses a package called TouchGraph to render a topic map.
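The export step described above (page names and their associated pages written out as XML) might look roughly like the sketch below. It is a simplified stand-in, not the Annoki code: the original was PHP querying MySQL, the input here is a plain dictionary rather than a database query, and the element and attribute names are hypothetical and do not follow the real XTM schema.

```python
import xml.etree.ElementTree as ET


def pages_to_xml(links):
    """Serialize {page title: [associated page titles]} to an XML string.

    Element and attribute names are illustrative only, not valid XTM.
    """
    root = ET.Element("topicMap")
    for title, related in links.items():
        topic = ET.SubElement(root, "topic", id=title)
        for other in related:
            ET.SubElement(topic, "association", target=other)
    return ET.tostring(root, encoding="unicode")
```

A reader such as TOMU would then parse this file and turn each topic into a node and each association into an edge of the topic map.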

Keeping with the methods used in TOMU and AnnokiBlooms, EGO also reads in an XML XTM file, but unlike TOMU, EGO can also create new topic maps or edit the existing topic maps. The new graphs can take the form of spider maps, topic maps, timelines, hierarchies or flow charts. Not only are new graphs created in EGO, they are also transformed into new wiki pages, wiki sections or content, meaning that a user can change the wiki content from a source outside of the wiki. By allowing students to do this, we are providing visual-spatial learners with an alternative method for creating projects and reports and for collaborating with their peers.