Monday, April 14, 2014

Final Proj

For our final project, we were given the following information:

"Project 2: Client wants to see the daily evolution of own Page and Profile over the whole year complemented by a few static metrics. There should also be head-to-head comparison with two closest competitors. He also wants to see how the whole business evolution."

In addition, we were given a PowerPoint slide and two Excel files, one with weekly data and one with monthly data.

The PowerPoint slide provided data on both Facebook and Twitter, but we chose to focus on Facebook only, as the Excel files contained the metrics 'Like', 'Comment', and 'Share'.

For our research question, we wanted to see how we could provide customized information and tips to help the organization decide how to improve its social media presence.

We looked at what the client's current profile might look like:


The profile provided the most important information, such as the engagement rate, the number of fans, and the interaction rate. However, it was very much in a table format, and did not allow one to compare competitors easily. As such, we decided on three tasks that we wanted to do.

1. Redesign the profile to be time-based, to show current and possible predicted future trends.

2. Provide possible solutions to improve the engagement rate

3. Design an infographic to provide the client with the relevant information


Initially, we did not have much to work with. Without the name of the client, it would be extremely difficult to create customized and personal information of the client in an infographic form.

We requested the client's name and found that it was Paris Saint-Germain, a French soccer club. None of us had previous experience with a soccer club or was really interested in soccer, but we tried to find out as much as possible about the club and the industry, and searched its Facebook Page to look at its postings for the period that we were given.

Initially, we wanted the infographic to be in this format, and produced the sample below:


It would have contained the following sections:

1. A calendar of events
2. A quick glance at the latest month's most important information
3. Tips and tricks to increase interactions, and ultimately, engagement.

It would also be named, "A Facebook Guide", to show the idea of providing tips and information to improve engagement rate.

However, further research showed that PSG plays games very often, almost every other day. This would have made the calendar very cluttered and complicated. As such, the calendar idea was removed.

In addition, the infographic would not contain the daily evolution of the profile page, which was something that Socialbakers wanted to see. 

The daily evolution proved difficult to create. For a time-based infographic, a line graph is more suitable than bar charts, pie charts, and the like. However, an infographic that contains many line graphs is hard to make visually appealing.

After much discussion, we decided to focus on the interactions and the top posts of the month. That would be the first visualization that the client would see. Engagement rate was not chosen for two reasons. First, engagement rate is expressed as a percentage, and the values were very low and changed little. Second, interactions show the exact number of likes, comments, and shares, giving the client the actual numbers. Actual numbers seemed more useful to the client, and they can still be converted into an engagement rate if needed.


It began with this. This line chart shows the average interactions that PSG has per month. We decided to use monthly averages rather than daily values because the daily changes were complicated and it was difficult to see any patterns in them, as can be seen from the line graph below:


It would be easy to show the daily interactions, but that would not have been very useful, as it goes into too much detail and one would not be able to see patterns from it. Instead, we took the average number of interactions per month and plotted it on a graph from Nov '12 to Oct '13.
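As a rough illustration of the averaging step described above, the sketch below groups daily interaction counts by calendar month and takes the mean. The function name `monthly_average` and the sample figures are made up for illustration; they are not PSG's actual data, and this is not necessarily how the averages on the infographic were computed.

```python
from collections import defaultdict
from datetime import date


def monthly_average(daily):
    """Average daily interaction counts per calendar month.

    `daily` maps a date to that day's total interactions.
    Returns {(year, month): mean interactions per day}.
    """
    totals = defaultdict(lambda: [0, 0])  # (year, month) -> [sum, number of days]
    for day, interactions in daily.items():
        bucket = totals[(day.year, day.month)]
        bucket[0] += interactions
        bucket[1] += 1
    return {month: total / days for month, (total, days) in sorted(totals.items())}


# Hypothetical sample values, not taken from the Socialbakers files.
sample = {
    date(2012, 11, 1): 1200,
    date(2012, 11, 2): 1800,
    date(2012, 12, 1): 900,
}
averages = monthly_average(sample)
```

Each resulting value becomes one point on the monthly line chart, which is what smooths out the day-to-day noise.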

In addition, the top post of each month was selected and placed at that month's point on the graph. This gave us valuable information about what kind of posts PSG's fans are interested in and how much interaction each managed to garner. Here is a closeup of one of the top posts:


After some time, it no longer seemed a good idea to have photos on the infographic, or to include the full content of each status. It was because of this that we decided to propose an interactive infographic. We did not have the skills to build an interactive infographic; otherwise we would have done so. The photographs were removed and the posts were categorized thematically into games/wins, player news, and announcements. This way, PSG would be able to quickly identify which posts gained the most attention from its fans.

Apart from the proposal of an interactive infographic, we also decided to have the infographic run lengthwise, from left to right, rather than downwards.

The purpose of doing so was to prevent users from reading the infographic wrongly. In having an infographic with many time-based visualizations, we tend to read from left to right. Having an infographic that scrolls down may defeat the purpose. This is because some people may read the left column first before reading the right column. Some may also feel that it is quite annoying to have their eyes look in a Z-shape as they look left to right and left to right again.


The next section showed information for October. We wanted to show point-in-time information in addition to time-based data. This would give the client a better understanding of what their situation was like in October, and how they could seek to improve it. It also shows a comparison with their competitors. Taking Jing's advice that circles do not work well for representing data, we changed the infographic to use icons to represent numbers instead.

The final infographic contained 5 sections.

1. Average post per month + Top post of the month
2. Quick Look
3. Daily Changes
4. Year in Review
5. Tips and Tricks

Most of the information from the previous drafts stayed on, with just aesthetic refinements.

Daily posts were mainly from the information that Socialbakers gave us.



As you can see, the infographic is far too long to take in as a whole, so we will look at each section individually and briefly, to avoid duplicating the report and presentation.
The first part is the header. The entire design sticks to a minimalistic style and aims to present as much information as possible clearly and concisely, without distraction from other design elements.

Just like in previous drafts, the title was kept as 'A Facebook Guide'. The header introduced the three theme colours used throughout the infographic: blue, yellow, and red. PSG's main colours are blue and red, so blue was used for PSG: any blue line stands for PSG. As we did not know who the competitors were, yellow and red were chosen for them. As the primary triadic colours, red, yellow, and blue harmonize well with each other.

This part shows the first half of the first section. The French soccer seasons were added to show the periods when PSG would be busy. These periods also happened to correlate with the number of interactions. Icons were used, and the posts showed quick information about the top post of the month (based on interactions).


This section showed point data for the month of October. As mentioned above, circles were changed to icons, with each icon representing a certain value. One can also see changes from the previous month from the green and red arrows below. The formulas for total interactions and engagement rate were also added to remind users how the figures were derived. This would be especially useful in the event that new staff are hired to manage PSG's Facebook Page.
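The formulas themselves are not reproduced in this post. As a minimal sketch, total interactions can be taken as the sum of likes, comments, and shares (the three metrics in the Excel files), and engagement rate is assumed here to be the average interactions per post as a percentage of fans. That engagement rate definition and the sample numbers are assumptions for illustration, not figures or formulas copied from the infographic.

```python
def total_interactions(likes, comments, shares):
    """Sum of the three interaction metrics from the data files."""
    return likes + comments + shares


def engagement_rate(likes, comments, shares, posts, fans):
    """Assumed definition: average interactions per post, as a percentage of fans."""
    if posts == 0 or fans == 0:
        return 0.0  # avoid dividing by zero on days with no posts or no fans
    interactions = total_interactions(likes, comments, shares)
    return interactions / posts / fans * 100


# Hypothetical day: 4 posts on a page with 1,000,000 fans.
interactions = total_interactions(likes=9000, comments=500, shares=500)
rate = engagement_rate(likes=9000, comments=500, shares=500, posts=4, fans=1_000_000)
```

With these made-up inputs the page gets 10,000 interactions and an engagement rate of roughly 0.25%, which illustrates why the percentage values look so small next to the raw interaction counts.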


The third section showed the daily changes. The icons indicate when a post was made that could have caused a spike. We ourselves were not able to tell whether a particular post caused a given spike. Of course, Socialbakers may have the information required to fill in this section. In addition, the legend and the actual post are shown to the right.

What is interesting is that the top post for the month, as shown below, does not seem to have caused any spikes in the increase in interactions or fans.


Although the top post was made on October 24, there was no spike on that day, and we speculated that a post on Oct 25 caused the spike on the 26-27th. However, we cannot tell definitively that it was the post on the 25th that caused the increase in interactions. It might just as well have been the post on the 24th. I will explain more towards the end of the post.


Continuing section 3, the daily changes in the engagement rate are shown. Measured against Socialbakers' standard of 0.36, PSG and both competitors fell short of the optimal engagement rate in the month of October. This is where the client (PSG) can take note of the decreasing engagement trendline. PSG can then take action to improve the situation, perhaps by having more interesting content or by posting more, based on the tips given at the end.


The fourth section is the year in review, and it shows the averages for PSG and its competitors. This gives a definitive answer to how PSG has been doing in comparison to its competitors over the last year. In addition to looking at the monthly values, PSG can look at the yearly averages to see how it compares with its competitors and with Socialbakers' benchmark. As can be seen, both PSG and Competitor 1 actually met Socialbakers' engagement rate benchmark on average. This goes to show that the engagement rate of one month may not be representative of the whole, and sometimes we need to step back and look at the overall performance to judge how an organization is doing.


In addition to the averages, the admin vs user posts are shown. C1 may have more user posts than admin posts because it has more interactive or engaging content, which could have led more of its users to respond, as compared to PSG and C2. This is an area that PSG can look into.

Looking at the distribution of fans, a top tip was given to tell PSG how to build better relationships with its fans. This will give PSG a way of engaging its fans more.


Finally, the last section shows general tips that PSG can use to improve its engagement rate. As we are not social media experts, these are some tips that PSG can use. It does not show customized information, and that is something that we proposed.

Proposal for an interactive visualization:

In creating an interactive infographic, the following are proposed.

1. In section 1, PSG can hover over the month to see the actual top post, and clicking will enable one to go directly into that post.
2. PSG should be able to easily change the time frame that they want to look at. For instance, instead of looking at just one year, PSG can look at 2 years or 6 months. A slider should be provided for ease of use.
3. Each section in the infographic is movable, allowing the client to customize the infographic accordingly. The client may want to have the year in review first, rather than the average interactions, and this should allow them to do so.
4. When one hovers over the tips, PSG's own information should be shown too. For instance, when PSG hovers over the Photo Posts tip, a small information box can show the number of photos that PSG has posted in the last week or month.

Limitations
In creating this infographic, we found that much of the information is based on speculation due to the lack of data. As mentioned above, we were not able to tell which post caused a given spike or increase.

In addition, the brief was scoped tightly around the daily evolution. This meant that we were limited to time-based visualizations such as line graphs. Although this restricted what else we could explore, we used point data to compensate, and to give the infographic more information and visual appeal.

All in all, this has been an interesting journey, from crafting the research question and the story on PSG to creating the final infographic. The process of data visualization is certainly not easy, but it is certainly rewarding to create such work and to see the fruits of our labour showcased in the infographic.

Monday, March 31, 2014

Visualization of 1.2 million photos

As a photographer myself, I found that it was interesting to be able to combine both data visualization and photography.

It is true, I do know about metadata, and the type of information that one can retrieve from a single digital image, but I did not really think much on what I can do with that kind of information.

Personally, I've taken about 50,000 photos over the 6-7 years that I have been a photographer. I've usually classified them by event and in chronological order. This gives me the ability to see all the photos at once, and to search through them easily.

On Flickr itself, I would get pretty annoyed when someone would just tag 'Singapore', and now I see how many photos people have taken there. It just appeared to me that some people are too lazy to even tag a proper location. I wouldn't mind if tourists or foreigners did not know the place well, but I've seen many Singaporeans who do that too, and that annoys me a bit.

But let's not go off-topic.

We were given a task to create a visualization based on the information that one can retrieve on the camera.

For the purpose of this exercise, we decided to look at time of the photo taken, and time of the photo uploaded. This is our visualization, as shown below:


On one end, we would be able to see the time that the photo was taken, and on the other end, would be time that the user had uploaded the image.

We were expecting to see a visualization like this, but when I think about it now, perhaps the visualization would be clearer if the axes were vertical instead of horizontal.

From this, we are able to see the time of day when people take the most photos, and also the time when they upload the most photos.

Of course, we would typically expect the mobile phone photographer to upload the image almost instantly, probably just throw a filter on and upload.

However, many in the photography community wouldn't really do that, even on their mobile phone. Instead, we would take the photograph and post-process it (in an app such as Snapseed or PS) before we upload it. This could take anywhere from a couple of minutes to maybe an hour in Photoshop (if on the computer).

We could further tell who uploads their images immediately, and whether people like such photos more or less than those from people who take the time to edit their photos. Perhaps from the upload time we would be able to sift out the pros from the amateur photographers.

In addition, we would also be able to see the geographical locations where the most photos have been taken, and the largest number of photos that have been uploaded at a single time.

We could probably tell where these Flickr photographers reside, and how long they take to upload a photograph.
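To make the idea concrete, here is a small sketch of the two calculations discussed above: counting photos by hour of day for the taken and uploaded times, and measuring the delay between the two. The timestamps are invented, and the five-minute cut-off for an "instant" upload is an arbitrary assumption, not something derived from the Flickr data.

```python
from collections import Counter
from datetime import datetime, timedelta


def hour_histogram(timestamps):
    """Count how many timestamps fall in each hour of the day (0-23)."""
    return Counter(ts.hour for ts in timestamps)


def upload_delays(photos):
    """Time elapsed between taking and uploading each photo."""
    return [uploaded - taken for taken, uploaded in photos]


# Hypothetical (taken, uploaded) pairs, not real Flickr metadata.
photos = [
    (datetime(2014, 3, 1, 18, 5), datetime(2014, 3, 1, 18, 7)),
    (datetime(2014, 3, 1, 18, 40), datetime(2014, 3, 1, 21, 15)),
    (datetime(2014, 3, 2, 9, 30), datetime(2014, 3, 2, 9, 31)),
]

taken_by_hour = hour_histogram(taken for taken, _ in photos)
uploaded_by_hour = hour_histogram(uploaded for _, uploaded in photos)

delays = upload_delays(photos)
INSTANT = timedelta(minutes=5)  # assumed threshold for an "instant" upload
instant_uploads = [d for d in delays if d <= INSTANT]
```

The two histograms correspond to the two ends of the visualization, while the delays are what would separate the instant uploaders from those who edit first.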

###

Based on the photo metadata, we are able to tell many things, and I think that it is essential that we do not ignore such information. Flickr is a wonderful place to host photographs, but I think that Facebook is a better place to share photographs.

Data visualization indeed can come in many forms, whether it be from just the data from a single image, or the collected information from a census. I believe that this exercise taught us to look at every little thing, and the things we take for granted, such as a photograph, and to tell us that visualizations can be found anywhere, if only we just put in the time and effort to produce it.

Assignment 3

For Assignment 3, our task was to explore the various types of visualization tools that are available.

As a Mac user, I found it difficult to download one of the recommended programs, Tableau.

Nonetheless, there are many other programs out there that are able to help us with the data visualization, and there are many tools that are available to help us. I've decided to look at browser based tools, as one can access them from anywhere, and would not need to have the hassle of downloading a different program whenever I changed computers. The three tools that I used include: Many Eyes, Plotly, and Jolicharts.

Of the three tools that I used, I found Plotly to be the most useful. It was easy to manipulate data, change graphics, and save and download the visualizations that it produced. In addition, it allowed for statistical calculations.

I decided to look at the largest fast food chain in the world, McDonald's, to get a closer insight into what is contained in the food that they produce.

The aim of the study was to see if McDonald's top selling and most popular foods contained sufficient good nutrients, and reduced the bad nutrients as recommended by the U.S. Food and Drug Administration.

I narrowed down the items that I wanted to look at, from over 80 plus menu items to 16.

The following are the items that I selected, based on the most popular menu items that customers purchase. The items in green are McDonald's signature items. They are also very popular among customers.

McDONALD'S, Bacon Ranch Salad with Crispy Chicken
McDONALD'S, Apple Dippers
McDONALD'S, Bacon Ranch Salad with Grilled Chicken
McDONALD'S, Bacon Ranch Salad without chicken
McDONALD'S, Baked Apple Pie
McDONALD'S, Caesar Salad with Crispy Chicken
McDONALD'S, Caesar Salad with Grilled Chicken
McDONALD'S, Caesar Salad without chicken
McDONALD'S, Chicken McNUGGETS
McDONALD'S, Double Cheeseburger
McDONALD'S, French Fries
McDONALD'S, Sausage Burrito
McDONALD'S, Sausage McGRIDDLES
McDONALD'S, BIG MAC
McDONALD'S, DOUBLE QUARTER POUNDER with Cheese
McDONALD'S, Egg McMUFFIN

I looked at 8 nutrients that the U.S. FDA recommends consumers either increase or reduce. The following are the 5 nutrients that one is supposed to increase:

1. Dietary Fiber
2. Vitamin A
3. Vitamin C
4. Calcium
5. Iron

The U.S. FDA recommends reducing the following 3 nutrients:

1. Cholesterol
2. Fat
3. Sodium

Of course, one would expect that one eats a full meal each time that one goes to McDonald's. However, for the purpose of this exercise, I only used each individual menu item. Eg: one burger, and not one set meal.

Based on the FDA recommendations, I worked out the optimal amount of each nutrient that one should take per meal:

Nutrients that should be limited (per meal):
Fat: 21,666mg
Sodium: 800mg
Cholesterol: 100mg

Nutrients that should be taken more (per meal):
Dietary Fiber: 8,333mg
Vitamin A: 1,666 IU
Vitamin C: 20mg
Calcium: 333mg
Iron: 6mg

This is based on one meal that one should consume.
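The post does not spell out how the per-meal figures were derived. They are consistent with taking the FDA Daily Values used on nutrition labels at the time (based on a 2,000-calorie diet) and dividing by three meals a day, rounding down. That reading is an assumption, and the sketch below simply reproduces the arithmetic under it.

```python
MEALS_PER_DAY = 3  # assumption: three equal meals a day

# FDA Daily Values (2,000-calorie diet) as used on labels at the time,
# converted to the units used in this post.
DAILY_VALUES = {
    "Fat (mg)": 65_000,
    "Sodium (mg)": 2_400,
    "Cholesterol (mg)": 300,
    "Dietary Fiber (mg)": 25_000,
    "Vitamin A (IU)": 5_000,
    "Vitamin C (mg)": 60,
    "Calcium (mg)": 1_000,
    "Iron (mg)": 18,
}

# Integer division rounds down, matching figures such as 21,666 and 8,333.
per_meal = {name: value // MEALS_PER_DAY for name, value in DAILY_VALUES.items()}

for name, value in per_meal.items():
    print(f"{name}: {value:,}")
```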

This was labeled 'Optimal Meal' and was at the top of the list of the other McDonald's menu items.

To prevent repetition of the report, I've summarized the three tools into a table, where we are able to see the various features of the tool itself:

Item                        Many Eyes  Plotly                                 Jolicharts
Loading speed               Slow       Fast                                   Slow
Interactive Visualizations  Yes        Yes                                    Yes
Download/Save               No         Yes                                    Yes
Sharing function            No         Yes                                    Yes
Data Privacy                Public     Private                                Private
Ability to edit data        No         Yes                                    No
Data Analysis               No         Yes                                    No
Customizability             No         Yes                                    Yes
Requires account            Yes        No, but need account to share or save  Yes
Browser-based               Yes        Yes                                    Yes
Ability to zoom in          No         Yes                                    No
Slideshow                   No         No                                     Yes
Support                     None       None                                   Live Chat
Notes                       No         Yes                                    No
Ability to Undo             No         Yes                                    No
API Support                 No         Yes                                    No
Cost                        Free       Free                                   Free

Despite using all three tools, I felt that they still could not provide the simple visualization that I wanted. Ideally, it would show all the nutrients and tell the user at first glance what is lacking. In fact, a table with coloured cells seems to show the answer better than all the visualizations so far:


(All values are in mg, except Vitamin A, which is in IU.)

Food Item                                           Total Lipid  Sodium  Cholesterol  Dietary Fibre  Vitamin A  Vitamin C  Calcium  Iron
Optimal Meal                                        21,666       800     100          8,333          1,666      20         333      6
McDONALD'S, Bacon Ranch Salad with Crispy Chicken   20100        871     70           3200           7927       30.9       147      1.95
McDONALD'S, Apple Dippers                           0            0       0            0              5          188.4      42       0.07
McDONALD'S, Bacon Ranch Salad with Grilled Chicken  9580         702     85           3000           6137       31.1       146      1.98
McDONALD'S, Bacon Ranch Salad without chicken       8120         294     27           3300           6447       30.1       140      1.49
McDONALD'S, Baked Apple Pie                         12060        153     0            1500           0          24.9       15       1.53
McDONALD'S, Caesar Salad with Crispy Chicken        16410        742     56           3400           7917       30.9       188      1.76
McDONALD'S, Caesar Salad with Grilled Chicken       5620         580     71           3300           6139       31.4       189      1.81
McDONALD'S, Caesar Salad without chicken            4370         177     11           3400           6441       30         183      1.3
McDONALD'S, Chicken McNUGGETS                       12680        362     28           0              0          0.8        7        0.57
McDONALD'S, Double Cheeseburger                     24940        1035    74           1200           0          0.6        276      3.47
McDONALD'S, French Fries                            10980        134     0            2800           0          4          13       0.57
McDONALD'S, Sausage Burrito                         17120        763     173          1200           382        0.9        203      1.84
McDONALD'S, Sausage McGRIDDLES                      23980        995     32           1400           0          0          85       1.92
McDONALD'S, BIG MAC                                 32760        1007    79           3500           412        0.9        254      4.38
McDONALD'S, DOUBLE QUARTER POUNDER with Cheese      45420        1333    160          2800           560        1.7        297      6.1
McDONALD'S, Egg McMUFFIN                            12170        777     208          1400           0          1.5        242      2.9

Apologies for the small print, and the horribly out of size chart, I was unable to fit the table into the blog post properly.

Nonetheless, at one glance, one can tell if there are sufficient nutrients in a certain item. Of these 16 items, none contains sufficient amounts of all the 'good' nutrients, but most of them stay within the limits for the 'bad' nutrients.
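The colouring logic behind such a table can be sketched in a few lines. The values below are copied from the table above; the green/red labels and the exact comparison rules (at or under the optimal value for nutrients to limit, at or over it for nutrients to increase) are assumptions about how the cells were coloured.

```python
LIMIT = ("fat", "sodium", "cholesterol")
INCREASE = ("fibre", "vit_a", "vit_c", "calcium", "iron")
COLUMNS = LIMIT + INCREASE

OPTIMAL = dict(zip(COLUMNS, (21666, 800, 100, 8333, 1666, 20, 333, 6)))

MENU = {
    "Bacon Ranch Salad with Crispy Chicken": (20100, 871, 70, 3200, 7927, 30.9, 147, 1.95),
    "Apple Dippers": (0, 0, 0, 0, 5, 188.4, 42, 0.07),
    "Bacon Ranch Salad with Grilled Chicken": (9580, 702, 85, 3000, 6137, 31.1, 146, 1.98),
    "Bacon Ranch Salad without chicken": (8120, 294, 27, 3300, 6447, 30.1, 140, 1.49),
    "Baked Apple Pie": (12060, 153, 0, 1500, 0, 24.9, 15, 1.53),
    "Caesar Salad with Crispy Chicken": (16410, 742, 56, 3400, 7917, 30.9, 188, 1.76),
    "Caesar Salad with Grilled Chicken": (5620, 580, 71, 3300, 6139, 31.4, 189, 1.81),
    "Caesar Salad without chicken": (4370, 177, 11, 3400, 6441, 30, 183, 1.3),
    "Chicken McNUGGETS": (12680, 362, 28, 0, 0, 0.8, 7, 0.57),
    "Double Cheeseburger": (24940, 1035, 74, 1200, 0, 0.6, 276, 3.47),
    "French Fries": (10980, 134, 0, 2800, 0, 4, 13, 0.57),
    "Sausage Burrito": (17120, 763, 173, 1200, 382, 0.9, 203, 1.84),
    "Sausage McGRIDDLES": (23980, 995, 32, 1400, 0, 0, 85, 1.92),
    "BIG MAC": (32760, 1007, 79, 3500, 412, 0.9, 254, 4.38),
    "DOUBLE QUARTER POUNDER with Cheese": (45420, 1333, 160, 2800, 560, 1.7, 297, 6.1),
    "Egg McMUFFIN": (12170, 777, 208, 1400, 0, 1.5, 242, 2.9),
}


def colour_cells(values):
    """Return {nutrient: 'green' or 'red'} for one menu item."""
    cells = {}
    for name, value in zip(COLUMNS, values):
        if name in LIMIT:
            ok = value <= OPTIMAL[name]  # nutrients to limit: stay at or under
        else:
            ok = value >= OPTIMAL[name]  # nutrients to increase: reach or exceed
        cells[name] = "green" if ok else "red"
    return cells


table = {item: colour_cells(values) for item, values in MENU.items()}
within_limits = [i for i, c in table.items() if all(c[n] == "green" for n in LIMIT)]
meets_all_good = [i for i, c in table.items() if all(c[n] == "green" for n in INCREASE)]
```

Under these rules, `within_limits` lists the items that stay under all three limits and `meets_all_good` lists the items that reach all five targets, which is the same reading one gets from scanning the coloured cells by eye.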

Some interesting facts:

1. The Big Mac contains more dietary fibre than any of the other food items, even the salads, though it is still insufficient for a meal.

2. Apple Dippers had 0 'bad' nutrients, but apart from vitamin C did not have much of the 'good' nutrients (possibly due to the small serving size of the Apple Dippers).

3. Based on a 'good' to 'bad' nutrient ratio, the Chicken McNuggets have the highest disparity, with absolutely no dietary fibre or vitamin A, and little iron, calcium, and vitamin C.

4. There is absolutely no cholesterol in the French Fries, Apple Dippers, and the Apple Pie.
- There is good and bad cholesterol, and having absolutely no cholesterol at all could potentially be unhealthy.

That being said, none of the three tools is without flaws, but most of the flaws are minor, and one will find the tools extremely useful for creating data visualizations. They are not really suited to infographics (as discussed in the earlier blog post). Perhaps tools such as Piktochart would be better at creating infographics.

Nonetheless, the assignment gave me a better understanding of the various browser-based tools that are out there and that I was able to employ for this project.

Perhaps it is time to learn some basic programming to be able to create data visualizations/infographics that are both interactive and completely customizable.