This is an archived cached-text copy of the developerWorks article. Please consider viewing the original article at: IBM developerWorks




Social-networking open source visualization aids

Use Graphviz, the Google Chart API, and CAIDA's plot-latlong tool to analyze your social networks' attributes


Level: Intermediate

Nathan Harrington, Programmer, IBM

06 Jan 2009

Social-networking data analysis can help you understand content, connections, and opportunities for your personal and business associations. This article presents tools and code to extract key components of your social network using the Twitter API to chart, geolocate, and visualize your social-networking data.

This article is a proof-of-concept that shows how to build applications to visualize your interconnections and influence. Graph common subject-matter keywords in your discussions and create geographical maps of your friends' locations. The code presented here relies on Perl, Graphviz, the Cooperative Association for Internet Data Analysis (CAIDA) plot-latlong, and the Google Chart API to create helpful visualizations to analyze your social networks.

Hardware and software requirements

Any PC manufactured after 2000 should provide plenty of horsepower for compiling and running the code here. As of this writing, CAIDA's plot-latlong tool requires a UNIX®-like operating system for geographical map creation. The other visualizations are made using curl and Graphviz, which are available for a wider variety of platforms.

You need Perl and the XML::Simple, Geo::Coder::Yahoo, and GD Perl modules, which process the social-networking data. A good image viewer, such as feh, is also recommended. To manipulate user images into a standard PNG format, the "convert" component of ImageMagick is required. See Resources for information on where to find these programs.

To install these applications on a Debian-based distribution of Linux®, such as Ubuntu, enter the following command in a terminal window: sudo apt-get install perl feh imagemagick curl graphviz. You need to download plot-latlong manually. After unpacking the plot-latlong archive, copy the .mapimages directory and the .mapinfo file to your ${HOME} directory.

Although this article demonstrates the code on Linux, the data gathering and processing code can be adapted easily to work on any platform that supports Perl, such as Microsoft® Windows®.





Extracting social-network data using the Twitter API

Twitter's RESTful interface and clear API documentation provide excellent methods for you to access social-networking attributes. See Resources for more information about the Twitter API. Listing 1 shows the initial buildViz.pl program setup.


Listing 1. buildViz.pl, Part 1

#!/usr/bin/perl -w
# buildViz.pl create social networking visualizations
use strict;
use XML::Simple;

die "specify searchUser, username, password, mode\n" unless @ARGV == 4;
my( $user, $authUser, $pass, $mode ) = @ARGV;  # $user is the account to search

my $cmd = "mkdir -p xml img";
system( $cmd ) unless( -d "xml" && -d "img" );

# get the searched user's profile data
$cmd  = qq{ curl -u $authUser:$pass "http://twitter.com/users/show/$user.xml" };
$cmd .= qq{ > xml/$user.xml };

system( $cmd ) unless( -e "xml/$user.xml" );

  # get profile image
  my $xmlImg = XMLin( "xml/$user.xml" );
  my $imgUrl = $xmlImg->{profile_image_url};
  $cmd  = qq{ curl "$imgUrl" > img/$user.png ; };
  $cmd .= qq{ convert -format png img/$user.png img/$user.png };
  system( $cmd ) unless( -e "img/$user.png" );

# get the user's friends (people that user is following)
$cmd  = qq{ curl -u $authUser:$pass "http://twitter.com/statuses/friends/$user.xml" };
$cmd .= qq{ > xml/$user.friends.xml };
system( $cmd ) unless( -e "xml/$user.friends.xml" );

After the required modules are loaded and the command-line arguments are read, the img and xml directories are created and the XML for the searched user is retrieved. Note that you can create visualizations for any Twitter user who does not protect their updates. Good form requires that each XML file be retrieved only once, so a file is downloaded only if it does not already exist on the local filesystem. You'll need to delete these files manually if the most recent data is required.

Next, the image for the specified user is downloaded, along with a list of that user's friends. In keeping with the Twitter API documentation, this article uses the terms "friends" and "people you are following" interchangeably. Listing 2 continues by retrieving data for each of the specified user's friends.


Listing 2. buildViz.pl, Part 2

my $xmlFriend = XMLin( "xml/$user.friends.xml" );

for my $name ( keys %{ $xmlFriend->{user} } )
{
  my $userFr = $xmlFriend->{user}->{$name}->{screen_name};

  # get friends' friends
  $cmd  = qq{ curl -u $authUser:$pass "http://twitter.com/statuses/friends/};
  $cmd .= qq{$userFr.xml?page=1" > xml/$userFr.friends.xml};
  system( $cmd ) unless( -e "xml/$userFr.friends.xml" );

  # get the friend's most recent 200 tweets
  $cmd  = qq{ curl -u $authUser:$pass "http://twitter.com/statuses/user_timeline/};
  $cmd .= qq{$userFr.xml?count=200" > xml/$userFr.user_timeline.xml};
  system( $cmd ) unless( -e "xml/$userFr.user_timeline.xml" );

  # get the friend's image (requires imagemagick convert)
  my $imgUrl = $xmlFriend->{user}->{$name}->{profile_image_url};
  $cmd  = qq{ curl "$imgUrl" > img/$userFr.png ; };
  $cmd .= qq{ convert -format png img/$userFr.png img/$userFr.png };
  system( $cmd ) unless( -e "img/$userFr.png" );

}#for each friend

As you review your social-networking connections, you may find that your friends share many friends. The unless( -e ... ) checks help reduce the burden on Twitter's servers by retrieving each XML file only once.

In addition to the "friends of friends" list, each friend's timeline is retrieved, along with that friend's profile image. Save the contents of Listings 1 and 2 as the file buildViz.pl and type the command perl buildViz.pl searchUser yourUserName yourPassword retrieve. Here, searchUser is the username of the Twitter user whose social-networking data you want to retrieve, yourUserName and yourPassword are your authentication credentials, and retrieve is a placeholder mode that specifies XML downloads only.
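
The retrieve-once behavior used throughout the listings could be factored into a small helper. The following is a minimal sketch, not code from the article's download: fetch_once and the stand-in fetcher are hypothetical names, and a real version would run the curl command inside the code reference.

```perl
#!/usr/bin/perl
# Sketch: run a retrieval step only when its output file is missing.
use strict;
use warnings;
use File::Temp qw( tempdir );

# fetch_once( $path, $fetcher ) is a hypothetical helper, not part of buildViz.pl.
# $fetcher is any code reference that returns the content to store.
sub fetch_once
{
  my( $path, $fetcher ) = @_;
  return 0 if( -e $path );             # already cached, skip the request

  my $content = $fetcher->();
  open( my $out, '>', $path ) or die "cannot write $path: $!\n";
  print {$out} $content;
  close( $out ) or die "cannot close $path: $!\n";
  return 1;
}

# demonstration with a stand-in for the curl call
my $dir   = tempdir( CLEANUP => 1 );
my $calls = 0;
my $fake  = sub { $calls++; return "<user><screen_name>example</screen_name></user>\n" };

fetch_once( "$dir/example.xml", $fake );   # writes the file
fetch_once( "$dir/example.xml", $fake );   # skipped, the file exists
print "requests made: $calls\n";
```

Because the check lives in one place, forcing a refresh only requires deleting the cached file, just as with the listings above.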

The buildViz.pl program creates the img and xml subdirectories and fills them with files like those shown below.


Listing 3. Example img/ xml/ directories

 87953 2008-11-26 08:21 xml/agberg.friends.xml
187263 2008-11-26 08:21 xml/agberg.user_timeline.xml
 85451 2008-11-26 08:23 xml/alphaworks.friends.xml
 50967 2008-11-26 08:23 xml/alphaworks.user_timeline.xml
 85854 2008-11-26 08:21 xml/andysc.friends.xml
163570 2008-11-26 08:21 xml/andysc.user_timeline.xml
 83236 2008-11-26 08:23 xml/BillHiggins.friends.xml
177740 2008-11-26 08:23 xml/BillHiggins.user_timeline.xml
...
  5626 2008-11-26 08:21 img/agberg.png
  5753 2008-11-26 08:23 img/alphaworks.png
  2080 2008-11-26 08:21 img/andysc.png
  4527 2008-11-26 08:23 img/BillHiggins.png





Developing interconnections data and visualization using Graphviz

One way to gauge how readily a particular user's friends can be influenced is to count how many people each of those friends follows. In theory, friends who follow fewer people have more time to read social-networking updates and respond to questions. Add the contents of Listing 4 at line 53 in buildViz.pl.


Listing 4. visualizeInfluence subroutine

visualizeInfluence() if( $mode eq "influence" );

### begin subroutines

sub visualizeInfluence
{
  my %frHash = ();
  my $xmlFriend = XMLin( "xml/$user.friends.xml" );
  for my $name ( keys %{ $xmlFriend->{user} } )
  {
    my $userFr = $xmlFriend->{user}->{$name}->{screen_name};
    my $xmlSec = XMLin( "xml/$userFr.friends.xml" );

    $frHash{ $userFr } = 0;
    for my $linkUser( keys %{ $xmlSec->{user} } ){  $frHash{$userFr}++  }

  }#for each friend

  my $infList = "1 $user\n";
  for my $name ( sort {$frHash{$a} <=> $frHash{$b}} keys %frHash )
  {
    $infList .= "$frHash{$name} $name\n";
    last if( ($infList =~ s/\n/\n/g) == 15 );  # exit after fifteen lines

  }# for each key sorted

  chop($infList); # remove last newline
  $cmd  = qq{ echo "$infList" | perl twitdot.pl $user img > influence.fdp ; };
  $cmd .= qq{ fdp influence.fdp -Tpng -o graphviz_influence.png };

  system($cmd);

}#visualizeInfluence

Each friend's list of friends is counted, and the friends with the fewest connections (the most "influence-able") are added to the $infList variable until it holds 15 lines, including the opening entry for the user. These count and friend-name pairs are passed as input to the twitdot.pl program. Based on code from the "Explore relationships among Web pages visually" article, the twitdot.pl program generates fdp graph-generation syntax for Graphviz. Consult the article and the code Download section for more information about the modifications necessary for this particular visualization.
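
The ranking step can be tried in isolation. The following is a minimal sketch under stated assumptions: least_connected is a hypothetical helper, the friend counts are made up, and ties are broken by name so the order is repeatable (the listing above does not specify a tie order).

```perl
use strict;
use warnings;

# least_connected( \%counts, $limit ) is a hypothetical helper: it returns up
# to $limit names ordered by ascending friend count, with ties broken by name.
sub least_connected
{
  my( $counts, $limit ) = @_;
  my @sorted = sort { $counts->{$a} <=> $counts->{$b} or $a cmp $b } keys %{$counts};
  $#sorted = $limit - 1 if( @sorted > $limit );   # truncate to $limit entries
  return @sorted;
}

# made-up friend counts for illustration only
my %sampleCounts = ( alice => 40, bob => 12, carol => 97, dave => 12 );
my @ranked = least_connected( \%sampleCounts, 3 );

# same "count name" line format that buildViz.pl pipes to twitdot.pl
print "$sampleCounts{$_} $_\n" for @ranked;
```

Passing the limit as a parameter avoids counting newlines in the output string to decide when to stop.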

Next, fdp is called with the fdp graph syntax file to generate the visualization. Run the program with the command perl buildViz.pl searchUser yourUserName yourPassword influence and view the output file (graphviz_influence.png) in your favorite image viewer. Figure 1 shows an example of what this can look like.


Figure 1. Example graphviz_influence.png

The width and color of the arrows indicate the "influence-ability" of each of the friends, based on the number of friends they have.





Developing keyword data and visualization using the Google Chart API

Influence has been measured, but what about content? Add the code shown in Listing 5 at line 87 in buildViz.pl to create a chart showing the most commonly used words in your friends' message histories.


Listing 5. visualizeKeywords subroutine

sub visualizeKeywords
{
  my %wordHash = ();
  my $xmlFriend = XMLin( "xml/$user.friends.xml" );
  for my $name ( keys %{ $xmlFriend->{user} } )
  {
    my $userFr = $xmlFriend->{user}->{$name}->{screen_name};
    my $xmlSec = XMLin( "xml/$userFr.user_timeline.xml" );

    for my $linkUser(  keys %{ $xmlSec->{status} }  )
    {
      my $msgText = $xmlSec->{status}->{$linkUser}->{text};
      for my $key( split " ", lc($msgText) ){  $wordHash{$key}++  }

    }#for each text update 

  }#for each friend

  my $tStr = "";
  my $chlStr = "";
  for my $word ( sort {$wordHash{$b} <=> $wordHash{$a}} keys %wordHash )
  {
    next unless( length($word) > 10 );    # only print 'long' entries
    $tStr .= "$wordHash{$word},";         # append url data
    $chlStr .= "$word|";                  # append url labels
    last if( ($tStr =~ s/,/,/g) == 10 );  # exit loop after first ten words

  }#for the top words

  chop($tStr); chop($chlStr);  # remove trailing delimiters

  $cmd  = qq{ curl "http://chart.apis.google.com/chart?cht=p&chd=t:$tStr};
  $cmd .= qq{&chs=1000x300&chl=$chlStr" > chart_keywords.png };
  system($cmd);

}#visualizeKeywords

Each word from each of your friends' timelines is recorded in the %wordHash variable. To focus on the more significant vocabulary, only words longer than 10 characters are graphed. The top 10 words meeting this requirement and their frequency counts are then packed into a URL for generation using the Google Chart API. Check the Resources section for more information about the URL formats and the options available with Google Charts.
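
Long "words" in tweets are often links or contain characters such as & and quotes, which can break both the chart URL and the shell command that carries it. The following is a minimal sketch of the same counting and URL-building steps with each label percent-encoded. The helper names count_words, url_escape, and chart_url are hypothetical, and the sketch assumes byte strings (text with wide characters would need to be encoded to UTF-8 first).

```perl
use strict;
use warnings;

# count_words, url_escape, and chart_url are hypothetical helpers,
# not part of buildViz.pl.

# tally lowercase, whitespace-separated words across a list of messages
sub count_words
{
  my %count;
  for my $text ( @_ ){ $count{$_}++ for split " ", lc($text) }
  return \%count;
}

# percent-encode everything except RFC 3986 unreserved characters
# (assumes byte strings)
sub url_escape
{
  my( $str ) = @_;
  $str =~ s/([^A-Za-z0-9\-._~])/sprintf( "%%%02X", ord($1) )/ge;
  return $str;
}

# build a pie-chart URL from the $top most frequent words longer than $minLen
sub chart_url
{
  my( $count, $minLen, $top ) = @_;
  my @words = grep { length($_) > $minLen }
              sort { $count->{$b} <=> $count->{$a} or $a cmp $b } keys %{$count};
  $#words = $top - 1 if( @words > $top );

  my $data   = join( ",", map { $count->{$_} } @words );
  my $labels = join( "|", map { url_escape($_) } @words );
  return "http://chart.apis.google.com/chart?cht=p&chd=t:$data&chs=1000x300&chl=$labels";
}

# made-up messages for illustration only
my $counts = count_words( "Graphviz visualization", "visualization rock&roll-forever" );
print chart_url( $counts, 10, 10 ), "\n";
```

The label separator | is added after escaping, so a stray | inside a word cannot split one label into two.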

Add the subroutine call shown below to buildViz.pl at line 54.


Listing 6. visualizeKeywords logic call

visualizeKeywords()  if( $mode eq "keywords"  );

Run the keyword visualization with the command perl buildViz.pl searchUser yourUserName yourPassword keywords. View the output chart_keywords.png file with your image viewer. Figure 2 demonstrates what this can look like.


Figure 2. Example chart_keywords.png





Developing geolocated data and visualization using plot-latlong

After charting who can be influenced and what is being said, we can move on to visualizing where in the world these people are. Add the code shown in Listing 7 at line 125 in buildViz.pl.


Listing 7. visualizeLocations subroutine

sub visualizeLocations
{
  use Geo::Coder::Yahoo;
  my $geocoder = Geo::Coder::Yahoo->new(appid => 'my_app' );

  open( LOCOUT, ">locationNames" ) or die "no locationNames out\n";
  open( COORDS, ">cityCoords" )    or die "no cityCoords out \n";

  # record all friends geographical locations
  my $xmlFriend = XMLin( "xml/$user.friends.xml" );
  for my $name ( keys %{ $xmlFriend->{user} } )
  {
    my $userLoc = $xmlFriend->{user}->{$name}->{location};
    next if( !defined $userLoc || ref $userLoc );  # XML::Simple parses an empty <location> as {}
    my $imgName = $xmlFriend->{user}->{$name}->{screen_name};
    my $location = $geocoder->geocode( location => "$userLoc" );

    for my $coords( @{$location} )
    {
      my %hashRef = %{ $coords };
      print "$hashRef{latitude} $hashRef{longitude} # $userLoc\n";
      print COORDS "$hashRef{latitude} $hashRef{longitude} # $userLoc\n";
      print LOCOUT "$userLoc ##$imgName.png\n";

    }#for coordinates returned  

  }#for each friend

  close( COORDS ); close( LOCOUT );

  # draw the map
  $cmd  = qq{ cat cityCoords | perl plot-latlong -s 5 -c };
  $cmd .= qq{ > cityMap.png 2>cityPixels };
  system( $cmd );

  # Annotate the map with the first 7 friends information
  $cmd  = qq{ head -n7 locationNames > 7.locationNames ; };
  $cmd .= qq{ head -n7 cityPixels > 7.cityPixels ; };
  $cmd .= qq{ perl worldCompositeMap.pl 7.cityPixels 7.locationNames };
  $cmd .= qq{ cityMap.png worldCityMap_annotated.png };
  system($cmd);

}#visualizeLocations

The worldCompositeMap.pl program, again drawn from previously published developerWorks code, is detailed in "Create geographical plots of your data using Perl, GD, and plot-latlong." Using the excellent Geo::Coder::Yahoo module, it's relatively easy to record the city coordinates for your friends' locations in the cityCoords file, and the associated name and image data in the locationNames file.

The first seven friends' locations and identifiers are then passed to worldCompositeMap.pl for rendering. Consult the article named above or the Download section for more information about the worldCompositeMap.pl program.
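
Not every friend fills in a location, and XML::Simple by default parses an empty element into an empty hash reference rather than an empty string, so it is worth checking the value before geocoding it. The following is a minimal sketch with made-up records shaped like XMLin output; has_location is a hypothetical helper.

```perl
use strict;
use warnings;

# has_location is a hypothetical helper: true only for a non-blank plain string
sub has_location
{
  my( $loc ) = @_;
  return 0 unless( defined $loc );   # element missing
  return 0 if( ref $loc );           # empty element parsed as {}
  return $loc =~ /\S/ ? 1 : 0;       # ignore whitespace-only values
}

# made-up records shaped like the per-user hashes XML::Simple produces
my %friends = (
  alice => { screen_name => "alice", location => "Raleigh, NC" },
  bob   => { screen_name => "bob",   location => {} },
  carol => { screen_name => "carol" },
);

for my $name ( sort keys %friends )
{
  my $loc = $friends{$name}->{location};
  my $msg = has_location($loc) ? "geocode $name: $loc" : "skip $name";
  print "$msg\n";
}
```

Skipping these entries before the geocoder call also keeps the cityCoords and locationNames files aligned line for line.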

Add the subroutine call shown in Listing 8 at line 55 in buildViz.pl.


Listing 8. visualizeLocations logic call

visualizeLocations() if( $mode eq "locations" );

Run the command perl buildViz.pl searchUser yourUserName yourPassword locations to build the worldCityMap_annotated.png file, and open that file in your image viewer. Figure 3 is an example of what this can look like.
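
As more modes are added, the chain of if( $mode eq ... ) calls could be replaced with a dispatch table that also reports a mistyped mode. The following is a minimal sketch, not the article's code: run_mode is a hypothetical helper, and the three subroutines are stand-ins so the example runs on its own.

```perl
use strict;
use warnings;

# stand-ins for the article's subroutines so the sketch runs on its own
sub visualizeInfluence { return "influence" }
sub visualizeKeywords  { return "keywords"  }
sub visualizeLocations { return "locations" }

# map each mode argument to the subroutine that handles it
my %modes = (
  retrieve  => sub { return "retrieve" },   # XML download only, nothing to draw
  influence => \&visualizeInfluence,
  keywords  => \&visualizeKeywords,
  locations => \&visualizeLocations,
);

# run_mode is a hypothetical helper that rejects unknown modes
sub run_mode
{
  my( $mode ) = @_;
  my $handler = $modes{$mode}
    or die "unknown mode '$mode', expected one of: " . join( ", ", sort keys %modes ) . "\n";
  return $handler->();
}

print run_mode( "keywords" ), "\n";
```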


Figure 3. Example worldCityMap_annotated.png





Conclusion, further examples

With the code and tools presented here, you can create a variety of visualizations to help analyze attributes of your social network. Use these tools to track keywords as they spread through your network of friends. Visualize the paths of particular links as they travel to different areas of activity around the world. Create charts and analysis that show your employers the deep value of social networking.






Download

Description: Sample code
Name: os-socialtoolstwitterVisualizations.0.1.zip
Download method: HTTP


Resources

Learn

Get products and technologies

Discuss


About the author

Nathan Harrington

Nathan Harrington is a programmer working with Linux at IBM. You can find more information about him at nathanharrington.info.



