Monday, April 23, 2012

Data Visualization


Data visualization is a way of representing data in a manner that is both appealing and informative. Its main goal is to communicate information effectively through graphical means.

The most common way of doing this is to represent data in the form of graphs. There are different types of graphs available, and you need to use the one that is best suited to the kind of data you’re portraying and its context.

Another method that has been gaining popularity of late is networks/maps. Although this method is best suited to displaying information from social media, it is a very good and attractive way to display information.

Motion slides are another interesting tool for displaying information from your analysis. They are somewhat similar to PowerPoint slides, but they make very good use of graphics to display information in a dynamic manner. They let you create different instances for different periods of time and thereby display information effectively. One very good example of this is the video below, where Dr. Hans Rosling uses motion slides to put forth his findings.



Sunday, April 22, 2012

My Infographic resume

When this assignment was announced, I thought it would be the easiest one. I now stand corrected!

A lot of time and effort has gone into making this resume. I have designed a web page and I have kept it simple. Just click here to have a look at it!



Tuesday, March 6, 2012

Lecture 12: Centrality Measures

Having built our own social graphs, we are all familiar with graphs and network visualizations now. But it is important to understand that building colorful and good-looking graphs is not the main objective. We need to analyze and interpret the graphs properly to make full use of them. So, I thought I should write about something that is significant and really helpful for all of us. This blog is about one important property called Centrality and how different centrality measures can be used to analyze a graph.

There are different measures of centrality available to determine the relative importance of a node in a graph. They are:

    1. Degree centrality: This considers the node that has the largest number of direct connections (interactions) in a graph to be the most important node. So, when this measure is used, all the nodes that have a relatively high number of connections will occupy the center portion of the graph. The picture below illustrates this concept:


     2. Closeness centrality: The node that has the shortest average distance to all the other nodes in a graph is considered the most important node. It also implies that this node can communicate easily and quickly with the other nodes in the network. When this measure is used, the node that is most closely connected to the other nodes will be at the center of the graph.



    3. Betweenness centrality: The most important node by this measure is the one that lies on the largest number of shortest paths between other nodes in the network. This can be a very important measure, as it identifies the node that acts as a bridge and increases the connectivity of the network. Below is an interesting image to illustrate this concept:


[Find out which of these has a high betweenness centrality :) ]


These three measures are quite different from each other, and the same graph can have a different most important node under each of them. You should use the one that is most appropriate to the nature of the network and the kind of analysis involved.
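To make the three measures concrete, here is a minimal sketch in plain Python. The toy network in `EDGES`, the node names, and all the function names are my own assumptions for illustration; they are not from the lecture. Closeness is computed as the number of other reachable nodes divided by the sum of distances to them, and betweenness uses Brandes' algorithm for unweighted graphs, which is one common way of counting shortest paths.

```python
from collections import deque

# Hypothetical toy network: two triangles (A-B-C and E-F-G) joined through D.
EDGES = [
    ("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"),
    ("D", "E"), ("E", "F"), ("E", "G"), ("F", "G"),
]


def build_graph(edges):
    """Turn an edge list into an undirected adjacency map."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def degree_centrality(graph):
    """Number of direct connections of each node."""
    return {node: len(neighbors) for node, neighbors in graph.items()}


def closeness_centrality(graph):
    """(Other reachable nodes) / (sum of shortest-path distances to them)."""
    result = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search gives shortest hop counts
            node = queue.popleft()
            for neighbor in graph[node]:
                if neighbor not in dist:
                    dist[neighbor] = dist[node] + 1
                    queue.append(neighbor)
        total = sum(dist.values())
        result[source] = (len(dist) - 1) / total if total else 0.0
    return result


def betweenness_centrality(graph):
    """Count of shortest paths through each node (Brandes' algorithm)."""
    score = {node: 0.0 for node in graph}
    for s in graph:
        order = []                           # nodes in order of discovery
        preds = {n: [] for n in graph}       # predecessors on shortest paths
        sigma = {n: 0 for n in graph}        # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {n: 0.0 for n in graph}
        for w in reversed(order):            # accumulate from the farthest node back
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # In an undirected graph every pair is visited from both ends, so halve it.
    return {node: value / 2 for node, value in score.items()}


graph = build_graph(EDGES)
degree = degree_centrality(graph)
closeness = closeness_centrality(graph)
betweenness = betweenness_centrality(graph)
for node in sorted(graph):
    print(node, degree[node], round(closeness[node], 3), betweenness[node])
```

In this toy network, C and E have the highest degree (three connections each), but D, which has only two connections, comes out on top for both closeness and betweenness because every path between the two triangles passes through it.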

Saturday, February 25, 2012

The rising interest - Pinterest!


We have heard so many things about Facebook and Twitter. There’s just no doubt that these two sites are the top social networking sites. But, we need to know that there are other good sites too! They might even get better than FB and Twitter in a few days!  This blog is about one such site which has already started proving the big potential it has for marketing brands and consumers. For those who have started wondering, this is about an emerging social networking site - Pinterest.

Pinterest is a photo sharing website. It takes social networking to the next level. You’re not gonna do it with just 140 characters or a status update anymore! If you see a photo that you like on the web, or if you have one on your desktop, you can upload it to Pinterest. These photos are called ‘Pins’. You can create a bulletin board full of photos that you like, which is called a Board.

Once a photo is pinned, it can soon be re-pinned by other Pinterest users. This is how a photo or an idea spreads virally on the network. Just like on other networking sites, what appears on your home page is determined by the people you choose to follow. The interesting thing here is that you can pick and choose which boards of a user you want to follow, rather than following all the boards of that user.

The other significant feature of Pinterest is that you can add photos from other websites if you have bookmarked Pinterest in your web browser. Let’s say you find an interesting picture on Amazon.com. You can create a Pin for this image on Pinterest just by clicking on the bookmark. This will also create a link to Amazon.com, where people can find out a lot more information about it. And, when you add a price to an image like this, it will automatically create a banner for that image.

Pinterest can also be used for collaboration, in a way. You can choose to let your friends post something on your board. This is particularly useful when you have had a trip with some of your friends. It is only natural that you have some photos and your friends have some other photos of the same trip. So, when you have created a board for the trip and when you are done posting your photos on the board, you can add your friend’s name as a contributor to the board. He will now be able to pin his set of photos to the board.

Pinterest is one site which has tremendous potential. It has been around for only two years, but the number of users on this site has jumped from a few thousand to over 10 million in the last few months. This provides an excellent platform for marketing brands and advertising. We are now in the era of Google Ads and Facebook Ads. But there will soon come a time when we will start designing campaigns for Pin-ads! One metric that can be used to measure success on Pinterest is the number of times an image gets re-pinned. It would be more valuable than the number of impressions (on Google Ads), as these images are spread by the users themselves.





Remember Sun-Moon-Star?






Monday, February 20, 2012

Lecture 10: The Essence of having a long tail


Of late, the phrase that’s getting quite a lot of attention is ‘long tail’. For those who aren’t familiar with this phrase, it might sound like something that is never ending or something that’s longer than usual. Well, that could probably be another way of looking at it. What ‘long tail’ actually means is having not just the most popular items, but also the ‘forgotten’ ones. This could be applied to any context, and it is the idea that has been at the root of some great success stories.

Let us consider Amazon for example. All of us know that it has a wide range of products and that it caters to different sections of web users. But one of the most significant factors in its huge success is having a long tail! It should be noted that all the products offered by Amazon have long shelf lives, so Amazon is able to market those products at different times, including times when a product is considered to be dead in the market.

It would probably be easier to understand with a real-world example. So, here comes one:

I guess everyone would know the latest Vidya Balan blockbuster “Dirty Picture”. It is based on the life of a famous South Indian actress, Silk Smitha. When Dirty Picture released in 2011, it turned out to be a massive hit in North India and people started flocking to the movie theatres. Stories and rumors about the actress started making the rounds all across North India. What Amazon does is, it senses this big opportunity and starts recommending the old movies of the 1970s and 1980s starring Silk Smitha on its website. When people see those movies on the site, they tend to purchase them too, along with the new one. This is a simple concept that has resulted in a huge success. It was possible only because Amazon had not just the latest hits, but also the old forgotten ones. Now you can see how useful it is for products to have a long shelf life and how valuable it is to have the ‘forgotten’ ones.

As I already mentioned, this concept of the long tail could well be applied to different fields. In the context of Google Ads, it would be really helpful to have the ‘forgotten’ keywords, or the most rarely used (specific) keywords, as a part of your ad campaign. But it is to be noted that having just one keyword of this type wouldn’t make a significant difference, whereas having a handful of them could prove to be profitable.



Sunday, February 19, 2012

Guest Lecture: How Communication has evolved!!


Back in the day, when people wanted to pass information from one place to another, they used to write it down on a tiny piece of paper and tie it to a pigeon’s leg, and the pigeon acted as a carrier between the sender and the receiver. After the invention of telephone systems, the process of communication got simpler. It got even simpler and easier when the ‘web’ evolved. People could transmit information in the form of emails. This was believed to be the simplest and most efficient way to communicate until now.

But, with the advent of Social Media, the process of communication has become much more interesting than ever before. It is amazing, in a way, that web services like Facebook and Twitter, which were actually developed with entertainment as their primary focus, have now become tools of communication. Studies have shown that people spend more time on these social networking sites than actually browsing when they are online.

Even in our last lecture, Gilad Lotan showed us a few significant examples of how Twitter is being used to pass on information. From the examples that he showed us, the most significant facts that I inferred are:

•  The rate at which information gets transmitted (on Twitter) is unimaginable. One of the stand-out examples was the map of information flow he showed, from when the East Coast faced an earthquake. Someone in Columbia posted a tweet about the quake, the information spread virally, and people in New York got to know about the quake even before it actually hit the city.

•  The scale of communication on a platform like Twitter is HUGE! It not only reaches the intended recipient, but it also informs all the other passersby! This is Information Broadcast on a level that can’t be compared to anything!

These facts are not just raw, they are real! In fact, I did a similar thing two years ago. One night, when I was alone at home, I was watching TV and happily enjoying a song when I suddenly saw a glass of water on the table shake a little. I was not sure if the glass had actually moved / vibrated. I wasn’t sure if it was a quake, and since it was half past midnight, I did not want to call or text anyone to ask if there was a quake. I checked a few local news channels, but none of them had any info about a quake. I was also online on Facebook at that time, so I updated my status: “Did I feel a mild tremor or is it just a wild imagination”! Within seconds, I got five replies asking me to run outta the house.

Little did I know that Facebook could be so useful!


Tuesday, February 7, 2012

Lecture 7: Traditional vs Online Advertising


In our last class, we discussed some pros & cons of Online advertising and Traditional advertising. Each one has a different business model and each one caters to a different type of audience. Although the line of separation is getting blurred by the day, you need to remember that there are still people who don’t use the internet and there are also people who don’t watch television (grad students like us :) ) or read newspapers regularly. So, you cannot reach out to the entire population of this world via a single medium.

Although online advertising is growing rapidly, it has still not reached the standards of Traditional advertising. A significant percentage of the population still prefers Traditional advertising (TV or newspapers) to Online advertising. It may be because Online advertising is purely commercial, whereas Traditional advertising has an entertainment factor to it. Some people watch ads on TV not because they want to buy a product/service, but because they like those commercials and enjoy them.

One of my favorite commercials is:


And, the most popular Super Bowl ad:


There are chances that people, who don’t like the actual Doritos, will still love this ad. 

This is one weakness of Online advertising. People do not enjoy it. Actually, there’s nothing in it to enjoy! It is just a piece of text or an image which talks about the product or service. There’s nothing interesting about it unless you add some graphics to it, which in most cases turns out to be annoying to the user/visitor.

However, the sellers don’t crib about it as it helps their business and it is just another way of reaching out to people. No matter what model is being used, be it Cost per click or Cost per 1000, they are gonna pay for it as long as it helps them grow their business.

Saturday, February 4, 2012

Lecture 6 - Social Graphs


This week, we were given insights into how network analysis is done on social networking sites.  We all logged into our LinkedIn accounts and created InMaps.  Apparently, this was one of those things which we were not aware of, before this class.

So, what is an InMap??
As the name says, it is nothing but a map which is used to convey some information. It is just a visual representation of your social network. But little did we know about the process behind these InMaps until our Professor briefed us on it. This process is called NETWORK ANALYSIS, the results of which are portrayed as InMaps (on LinkedIn). These InMaps have different nodes and connections between them. The nodes are the different people that a user is connected to, and the connections between them are based on the interaction between those people.

So, this is what LinkedIn uses as its social graph. Different social graphs are available now and each one has a different way of portraying information.  Our Professor also asked us one interesting question in class – How would it be if Twitter had a social graph like LinkedIn? Many students came up with different ideas and we had an interesting discussion.

When I was returning home after the class, I was wondering what a Facebook social graph would look like. I have never seen a graph like that on Facebook. However, I have seen InMaps on LinkedIn and I know how these graphs are made. So, I sat in a corner of the house and started pondering different ways of representing Facebook data.

As we all know, Facebook has a lotta different things, unlike Twitter/LinkedIn. In InMaps, the nodes represent people/users of LinkedIn. But when it comes to Facebook, you can have different types of nodes: one for the users, another for the groups and another for the pages.

When you have fixed the nodes, the next task is to determine the connections between them. You can use one or more parameters to do this. The most common way of doing it is by determining the frequency of comments posted between the two nodes. An alternative is to use the number of ‘likes’.

Let nodes for users be represented by a small circle, nodes for groups be represented by a square and pages can be represented by a small diamond. We can all now start visualizing the maps.

You would be connected to all of your friends, obviously. And you would also see connections from your node to the groups (represented by squares) which you are a part of. Some groups can also have some of your friends as their members. So, the connection goes from your node to the group node and then to your friend’s node.

Remember, you and your friend are connected by another connection, too! But, there can never exist a connection between two groups!

Connections for Pages can be represented in the same way as Groups.
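The scheme above can be sketched in a few lines of Python. This is only an illustration of my own imagination, not how Facebook stores anything: the `SocialGraph` class, its method names, and the sample names are all hypothetical. I am also assuming that every connection must have a user on at least one end (which is how I rule out group-to-group connections, and page-to-page ones as well), and that the strength of a connection is simply comments plus likes.

```python
from dataclasses import dataclass, field

# Marker shape proposed in the post for each node type.
SHAPES = {"user": "circle", "group": "square", "page": "diamond"}


@dataclass
class SocialGraph:
    node_types: dict = field(default_factory=dict)  # node name -> node type
    weights: dict = field(default_factory=dict)     # frozenset({a, b}) -> strength

    def add_node(self, name, node_type):
        if node_type not in SHAPES:
            raise ValueError(f"unknown node type: {node_type}")
        self.node_types[name] = node_type

    def connect(self, a, b, comments=0, likes=0):
        # Assumption: a connection needs a user on at least one end, so two
        # groups (or two pages) can never be connected directly.
        if "user" not in (self.node_types[a], self.node_types[b]):
            raise ValueError(f"cannot connect {a!r} and {b!r} without a user")
        # Assumption: connection strength is simply comments plus likes.
        key = frozenset((a, b))
        self.weights[key] = self.weights.get(key, 0) + comments + likes

    def weight(self, a, b):
        """Strength of the connection between a and b (0 if not connected)."""
        return self.weights.get(frozenset((a, b)), 0)

    def shape(self, name):
        """Shape used to draw this node on the map."""
        return SHAPES[self.node_types[name]]


graph = SocialGraph()
graph.add_node("me", "user")
graph.add_node("friend", "user")
graph.add_node("Hiking Club", "group")
graph.add_node("Book Club", "group")

graph.connect("me", "friend", comments=4, likes=2)   # direct friendship
graph.connect("me", "Hiking Club", comments=1)       # I am in the group
graph.connect("friend", "Hiking Club", likes=3)      # so is my friend
```

Here my friend and I are connected twice over: once directly and once through the group node. Calling `graph.connect("Hiking Club", "Book Club")` raises a `ValueError`, because neither end is a user.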

Facebook has tons and tons of data and a lot of different dimensions or features. You can come up with different ways to represent those data or information. What I have discussed here is purely my imagination and your views or opinions are most welcome!

Friday, January 27, 2012

Lecture 3: Getting Deeper into Analytics!!



In the previous lecture, we had an introduction to Google Analytics (GA). We also had a warm-up exercise on GA over the weekend. It was the first time for all of us and I was very curious about it. We all know GA is used for monitoring and analyzing website traffic, but I had never seen it before! I had not seen what it does or what it displays on the screen. My imagination was running wild: I was wondering if it would be something like the screen you get to see in the Space Research Centers, which is shown on news channels when a rocket is launched. It would have all possible lines and curves, technically called a graph. But when I first logged into the Analytics account, I got a feeling that it was no rocket science. The first (main) page on GA is simple and clear and, most of all, it makes sense! Thanks to Google :). It had all the metrics which we discussed in class listed on one side, and it had a graph at the top which tells you how many pageviews the website has had over a certain period.

Our Professor told us that GA has two aspects to it. One is the Mechanical aspect and the other is the Analytical aspect. The Mechanical aspect of GA involves setting the timeline for your analysis, choosing the metric you want, etc. The Analytical aspect is when you interpret and derive meaning outta the results (graphs or tables) displayed on the screen. Both aspects are very important and can greatly influence your analysis results/decisions.

The most important metric of all is the Bounce Rate, which is used to measure visit quality. According to Google, Bounce Rate is the percentage of single-page visits, that is, visits in which the person left your site from the entrance (landing) page. At first, it was pretty confusing, as it is very similar to another metric called Exit Rate, which Google defines as the percentage of site exits that occurred from a page or set of pages. The difference is that a bounce counts only visits that both started and ended on the same page, whereas an exit counts any visit that ended on the page, no matter how many pages came before it. We had an interesting debate over this and all our doubts were cleared by our Professor.
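A small Python sketch helped me keep the two metrics apart. The visits in `VISITS`, the page names, and the function `page_rates` are hypothetical, and the per-page formulas are my assumption of how the definitions translate into arithmetic: bounce rate as single-page visits divided by visits that entered on that page, and exit rate as visits that ended on that page divided by its pageviews. It is not a description of what GA does internally.

```python
from collections import Counter

# Hypothetical visits: each list is the sequence of pages viewed in one visit.
VISITS = [
    ["/home"],                          # a bounce on /home
    ["/home", "/courses", "/contact"],
    ["/courses"],                       # a bounce on /courses
    ["/home", "/courses"],
    ["/courses", "/home"],
]


def page_rates(visits):
    """Return bounce rate and exit rate for every page seen in the visits."""
    entrances, bounces, exits, pageviews = Counter(), Counter(), Counter(), Counter()
    for pages in visits:
        if not pages:
            continue
        entrances[pages[0]] += 1        # the visit started here
        exits[pages[-1]] += 1           # the visit ended here
        if len(pages) == 1:
            bounces[pages[0]] += 1      # started and ended on the same single page
        pageviews.update(pages)

    rates = {}
    for page in pageviews:
        # Bounce rate is undefined for a page that no visit ever entered on.
        bounce = bounces[page] / entrances[page] if entrances[page] else None
        rates[page] = {
            "bounce_rate": bounce,
            "exit_rate": exits[page] / pageviews[page],
        }
    return rates


rates = page_rates(VISITS)
for page, values in sorted(rates.items()):
    print(page, values)
```

With this sample data, /home has an exit rate of one half but a bounce rate of only one third, because one of the two visits that ended on /home had started on another page.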

This is where the lecture came to an end and we all left the class thinking about our first homework: to analyze the traffic on our own MIS-Eller website and come up with recommendations for the least performing sections of the site.



Sunday, January 22, 2012

Business Intelligence – 2nd Lecture



The second lecture started off with the definition again, and so does my second blog post on BI. You can define BI in a lotta different ways, because BI is such a big field. Putting it all together, BI can simply be defined as any tool, technology or technique used for collection, measurement, understanding, analysis and prediction using (past) data for Performance Management. Our Professor stressed ‘Performance Management’, as it is the whole point of using BI for your business. BI is used to help improve business, keep track of your performance, establish goals and targets for the future and find ways to achieve them. This is exactly where KPIs come into play.

Now, what is KPI?

KPI stands for Key Performance Indicator.

KPIs are commonly used to measure the performance of a business towards a specified target/goal, which involves monitoring and measuring metrics. Metrics are defined in terms that are pertinent to the business. A good example of a metric for a sales company could be: the amount of dollars earned from sales of a product over a period of time.
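As a rough sketch of how such a metric could feed a KPI, the Python below adds up the dollars earned from one product over a period and compares the total with a target. The sales records, the product names, the target of 2500 dollars and both function names are hypothetical and exist only for illustration.

```python
from datetime import date

# Hypothetical sales records: (date of sale, product, amount in dollars).
SALES = [
    (date(2012, 1, 5), "widget", 1200.0),
    (date(2012, 1, 20), "widget", 800.0),
    (date(2012, 1, 22), "gadget", 500.0),
    (date(2012, 2, 3), "widget", 1500.0),
]


def sales_revenue(sales, product, start, end):
    """Metric: dollars earned from one product between start and end (inclusive)."""
    return sum(
        amount
        for day, item, amount in sales
        if item == product and start <= day <= end
    )


def kpi_progress(actual, target):
    """KPI: fraction of the target achieved so far."""
    return actual / target


january_revenue = sales_revenue(SALES, "widget", date(2012, 1, 1), date(2012, 1, 31))
progress = kpi_progress(january_revenue, target=2500.0)
print(f"January widget revenue: ${january_revenue:.2f} ({progress:.0%} of target)")
```

The metric on its own is just a number. It becomes a KPI only when it is tracked against the goal the business has set, which is what `kpi_progress` does here.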

Our Professor also talked about different types of data which are Unstructured, Historical and Real-time data. Data can also be broadly classified into two:

Internal – Data that a company or an organization possesses which are mainly about the operations and transactions performed.
External – Data that a company collects from discussion forums, review sites, blogs, etc. which will be helpful for its business.

I understood how data collected from these various sources is being used by companies. They don’t just use it blindly. The data that is collected is first cleansed and then it undergoes a process called ‘Data Profiling’ by which you eliminate fake, unnecessary and useless information. At this point, you will have to use the 6Ws which are called the Web Metrics: What, When, Why, Where, Who & Which.

Ex: What does a customer expect from your business? Where does he come from? Which category or age group does he belong to?

These questions help you derive some meaning outta the data that has been collected which is obviously useful for your company/business.


Thursday, January 19, 2012

What makes BI interesting?



Of late, B and I are the two letters that are on everyone's mind. I was trying to figure out what makes BI so interesting.

The very idea of using BI is to help improve your business, based on (past) data analysis. It helps you foresee the future and prepare yourself for the demand/trend/rise/downfall of the market. But which part of this makes it interesting?

Is it because of the fact that it is related to social media? Of course not!

Is it because it fetches you awesome money?
Okay, this is something that you can't plainly refuse. But it can't be a proper reason, as there are loads of other things in this world which could fetch you even more money.

I guess it is probably because of human nature. We, as humans, love to predict the future. We always think about the future and we always work towards it.

When you have this kinda inborn curiosity, you are obviously gonna like BI. It gives you a chance to analyse data and find out what customers think about a company, what opinions they have, what they like, what attracts them to a product/service, what the highs and lows of a company are, what the trend is gonna be like in the near future, etc. That obviously sounds interesting! After doing all of this, you also get the privilege of suggesting or recommending solutions to the business, which is the icing on the cake.