Data Analysis and Visualization

Providing Visual Understanding to Complex Data

Data analysis and visualization can be presented in a variety of ways, depending on the audience. On this page I demonstrate my ability to analyze data and present it formally, as well as to present data that has already been analyzed in an informal format.

Data Visualization Graphic.

Formal Data Visualization

Weather Data Visualization - An Exploration of Weather Patterns Across the US.

An analysis of weather patterns and events in the US for the year 2014. Total injuries, deaths and damages were analyzed by weather event. These two initial analyses provide descriptive statistics of the most important weather variables.

Contributors

Team Macro

Carin Camen
Joseph Clay

Team Micro

Dina Alsharif
Ghia Mei Khoo (Jamie)
Rama Linga


           Total Deaths   Total Injuries    Total Damages
Count            58,574           58,574           58,574
Minimum               0                0                0
Median                0                0                0
Maximum             200               43   $1,500,000,000
Mean              0.045            0.009         $130,845
Std Dev           0.221            1.674       $8,595,137

Table 1: Descriptive Statistics of the most important variables
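The statistics in Table 1 can be reproduced for any numeric column with a few standard-library calls. The sketch below is illustrative only: the `describe` helper and the sample values are hypothetical, not the project's code or the 2014 data, and it assumes the sample standard deviation (n - 1), which the original analysis may or may not have used.

```python
import statistics

def describe(values):
    """Return the six summary statistics reported in Table 1 for one column."""
    return {
        "count": len(values),
        "min": min(values),
        "median": statistics.median(values),
        "max": max(values),
        "mean": statistics.mean(values),
        # Assumption: sample standard deviation (divides by n - 1).
        "std_dev": statistics.stdev(values),
    }

# Hypothetical per-event death counts, NOT the actual 2014 data.
sample_deaths = [0, 0, 0, 1, 0, 0, 2, 0, 0, 0]
summary = describe(sample_deaths)
for name, value in summary.items():
    print(f"{name:>8}: {value}")
```

A median of 0 alongside a non-zero mean, as in Table 1, is the signature of a heavily skewed column in which most events cause no deaths at all.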

Descriptive statistics of the most important weather variables.


Plot 1 - This graph shows direct deaths by weather event and state. Washington, Arkansas, Mississippi, Illinois, Florida and Nevada had the most weather-related direct deaths. Direct deaths were caused by the precipitation and snowfall weather variables. The six states selected for further study were ultimately chosen based on this visualization.

Damages, Deaths and Injuries
by Month 2014

Plot 2 - Ghia Mei Khoo (Jamie)

This graph shows total deaths, injuries and damages in the US overall by month, to explore how they progress over time. In the rose plot, each month's wedge spans the same constant angle, and the wedge radius is scaled by the square root of the deaths and injuries it represents. The number of deaths was small compared to injuries, so a log scale was required for total deaths and total injuries. The circle plotted on top of each month shows the size of the damage.

A different color and mark (a bubble) was used for damage (in dollars) than for injuries and deaths (in counts) because the units of measure differ. This method makes it easy to compare months. It is easy to see that April, August and December have the most damage. There were many deaths and injuries in January, April and July. There also appears to be a correlation between deaths and injuries, which makes sense because the more severe the event, the more deaths and injuries are expected. As time was an interesting trend to explore, the next few plots (Plots 2, 3, 4 and 5) show an element of time (season, month).
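The two scalings mentioned for the rose plot can be sketched as follows. This is a minimal illustration under stated assumptions, not the code behind Plot 2: the function names and monthly counts are hypothetical, and `log10(1 + value)` is one common way to apply a log scale while keeping zero counts at zero.

```python
import math

def wedge_radius(value, scale=1.0):
    """Radius for a rose-plot wedge whose AREA should be proportional to value."""
    return scale * math.sqrt(value)

def log_scaled(value):
    """log10(1 + value), so that a count of zero maps to zero."""
    return math.log10(1 + value)

months = 12
wedge_angle = 2 * math.pi / months  # every month gets the same constant angle

# Hypothetical monthly injury counts, NOT the actual 2014 data.
monthly_injuries = {"Jan": 99, "Apr": 999, "Jul": 9}
for month, count in monthly_injuries.items():
    radius = wedge_radius(count)
    area = 0.5 * wedge_angle * radius ** 2  # area of a circular sector
    print(month, round(radius, 2), round(area, 2), round(log_scaled(count), 2))
```

Taking the square root keeps wedge area, rather than wedge length, proportional to the count, so large months are not visually exaggerated.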

Plot 2 - Damages, Deaths and Injuries in 2014.


Total Tragedies
by Season and Months 2014

Plot 3 - Ghia Mei Khoo (Jamie)

This graph shows total deaths, injuries and damages in the US overall by month. The number of deaths was small compared to injuries, so a log scale was required for total deaths and total injuries. A different color was used for damage (in dollars) than for injuries and deaths (in counts) because the units of measure differ. The subtle background colors show the seasons.

The colors were chosen because they mimic the colors of the seasons (e.g., blue for cold winter weather, green for new spring foliage). The graph shows that a lot of damage happens in spring and summer. Interestingly, April is also a month with high deaths and injuries. The following plot explains this trend.

Plot 3 - Total tragedies by season and months in 2014.


Total Tragedies
by Season and Months 2014

Plot 3-A Ghia Mei Khoo (Jamie)

This graph shows a drilled-down view of the previous plot, using a technique similar to that of Plot 3. As April was an interesting month, we wanted to understand more about what happened then. The plot shows that the majority of deaths and injuries were caused by tornadoes, while damages were caused by both tornadoes and floods.

Plot 3-A - Total tragedies by season and months in 2014.


Storm Events Pattern
by State and Season 2014

Plot 4 - Rama Linga

This graph visualizes the most frequent storm event by state and season in 2014. In winter 2014, the most frequent storm events were winter-related storms such as Blizzard, Extreme Cold/Wind Chill, Heavy Snow and Winter Weather, especially in the West, Midwest and East region states, which faced many of them. At the same time, the most frequent storm event in the Southwest region states was Drought.

In spring 2014, the most frequent event was Hail, especially in the Mountain and Central time zone states, whereas in the Southeast and West regions the most frequent storm events were Rain and Wind respectively. In summer 2014, the most frequent storm events were Rain, Hail and Flood, while Wildfire was the most frequent storm event in Oregon. Autumn appears to be a mixture of different storm events across the U.S.: mostly Rain and Hail in the Midwest and Southeast regions, and a mixture of Flood, Winter Storms and Wind in the West.

Interestingly, in New Mexico the most frequent storm event was Drought irrespective of the season, and in Hawaii High Surf was the most frequent storm event. Overall, this graph shows the patterns in storm events by visualizing the most frequent storm event by state and season.
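Finding the most frequent storm event for each state and season amounts to counting event types within each group and taking the mode. The sketch below shows one way to do that; the `most_frequent_event` helper and the records are hypothetical examples, not the project's code or the source data.

```python
from collections import Counter, defaultdict

def most_frequent_event(records):
    """Map (state, season) -> the event type reported most often.

    Ties are resolved in favor of the event type seen first.
    """
    counts = defaultdict(Counter)
    for state, season, event in records:
        counts[(state, season)][event] += 1
    return {key: counter.most_common(1)[0][0] for key, counter in counts.items()}

# Hypothetical (state, season, event type) records, NOT the actual 2014 data.
records = [
    ("Oregon", "Summer", "Wildfire"),
    ("Oregon", "Summer", "Wildfire"),
    ("Oregon", "Summer", "Hail"),
    ("New Mexico", "Winter", "Drought"),
    ("Hawaii", "Winter", "High Surf"),
]
result = most_frequent_event(records)
for (state, season), event in result.items():
    print(f"{state}, {season}: {event}")
```

The same grouping works at county level by replacing the state in each record with a (state, county) pair.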

Plot 4 - Storm events pattern by state and season in 2014.


Storm Events Pattern
by County and Season 2014

Plot 4-A Rama Linga

This graph visualizes the most frequent storm event by county and season in 2014. At a very high level, it shows that Winter Storms can happen in winter, spring and autumn, not only in winter as we expected. In summer the most frequent storm events were Rain and Hail, and autumn has a mixture of different storm event types.

Plot 4-A - Storm events pattern by county and season in 2014.


Six State Approach

Plot 5 - Joseph Clay

The choropleths show which states incurred the most deaths or the greatest total number of events. These plots were not conclusive in determining which states were most dangerous, so a correlation plot was made to examine the relationship between deaths and events. Doing so revealed that a few states incurred high numbers of deaths relative to their number of events.
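The comparison between deaths and events can be sketched as a Pearson correlation plus a deaths-per-event ratio, which highlights states whose death toll is high relative to how many events they recorded. This is an assumed illustration: the helper names, the state labels and all numbers are hypothetical, and the original correlation plot may have been built differently.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def deaths_per_event(deaths, events):
    """Deaths divided by events for each state, skipping states with no events."""
    return {s: deaths[s] / events[s] for s in deaths if events.get(s, 0) > 0}

# Hypothetical state totals, NOT the actual 2014 data.
events = {"State A": 100, "State B": 400, "State C": 250, "State D": 50}
deaths = {"State A": 10, "State B": 2, "State C": 3, "State D": 0}

states = sorted(events)
r = pearson([events[s] for s in states], [deaths[s] for s in states])
ratios = deaths_per_event(deaths, events)
print(f"correlation between events and deaths: {r:.2f}")
print("highest deaths per event:", max(ratios, key=ratios.get))
```

A weak or negative correlation is what makes outliers interesting: a state with few events but many deaths stands out in the ratio even when it looks unremarkable on a choropleth of totals.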

Plot 5 - Six state approach.


Macro and Micro view of
Weather Event by State

Plot 6 - Carin Camen and Joseph Clay

For this graph we combined a geographical map of the entire US with Rose/Consultant graphs for the six individual states, each showing the event types in that state where deaths occurred. The consultant chart was used to maximize the space. The colors were chosen to bind all of the graphs together cohesively while also providing sufficient contrast. We combined these graphs into one visualization to tell the story of why the six states are the worst places to live. We have included a larger format with our final submission. For convenience, only one Rose/Consultant graph is shown here.

Plot 6 - Macro and micro view of weather event by state.


Number of Deaths
Associated by Weather Event 2014

Plot 7 - Carin Camen and Joseph Clay

This visualization continues to provide a macro and micro view by combining a calendar plot with geographical maps. The geographical maps show the event type, state and number of events. The calendar plot provides detail on the number of deaths associated with each event type. We have included a larger format with our final submission. For convenience, only one sample of the geographical map and calendar graph is shown here. These weather events were selected because they were the dominant weather event in each of the six states, irrespective of the number of deaths that occurred.

Plot 7 - Numbers of deaths associated by weather event.


Weather Events by State
comparing Occurrence Probability

Plot 8 Carin Camen and Joseph Clay

With this graph, we continued to visualize the story of storms in each of the six states and then provided details about the storms. A Rose/Consultant graph and a Kernel Density Diagram were used. The Rose/Consultant graph shows the weather events by state. The Kernel Density Diagram provides a means to compare the probability of occurrence of each weather event for each state. Only events that occurred more than once were included, as it is not possible to estimate the probability from a single event. The colors correspond to the color palette used in earlier plots.
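A kernel density estimate needs at least two observations, which is why single-occurrence events were excluded. The sketch below is a minimal Gaussian kernel density estimator, not the tool used for Plot 8. The `gaussian_kde` helper, the choice of day of year as the measured variable, the sample values and the normal reference rule for the bandwidth are all assumptions made for illustration.

```python
import math
import statistics

def gaussian_kde(samples, bandwidth=None):
    """Return a density function estimated from samples with Gaussian kernels."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two observations to estimate a density")
    if bandwidth is None:
        # Assumption: normal reference rule of thumb, 1.06 * sigma * n^(-1/5).
        bandwidth = 1.06 * statistics.stdev(samples) * n ** (-1 / 5)
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")

    def density(x):
        total = sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)
        return total / (n * bandwidth * math.sqrt(2 * math.pi))

    return density

# Hypothetical days of the year on which one event type occurred in one state.
tornado_days = [95, 100, 110, 118, 120]
density = gaussian_kde(tornado_days)
for day in (60, 105, 150):
    print(day, round(density(day), 4))
```

Because each curve integrates to one, densities for different event types can be overlaid and compared by shape even when the underlying event counts differ widely.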

Plot 8 - Weather events by state comparing occurrence probability.


Total Damages Caused by
Storm Events 2014

Plot 9 - Dina Alsharif

This graph provides an in-depth view of the six selected states by visualizing the total damages in each state. A square tile grid map provides a clear visualization of the total damages across these states without the distraction of each state's size. Geographical maps detail the distribution of damages by county in each of the six states. A ranking table shows the top storm events that caused these damages.

Plot 9 - Total damages caused by storm events in 2014.


Results

A Final Analysis

Overall, we were successful in our stated purpose: To identify the most dangerous states to live in due to their weather through the utilization of various data visualization techniques. These techniques were choropleths, rose plots, kernel density plots, calendar plots, tree maps and correlation plots. Using these techniques, combined with the significant variables of states, events, deaths, and damage, we were able to create a rich view on what makes each of the selected states so dangerous and deserving of the title "Worst Place to Live."

Analyzing the weather patterns across the 50 U.S. states and identifying the 6 states that are the worst places to live has brought us to several conclusions:

  • Springtime events are the worst. Scientifically, this is not an uncommon phenomenon, as temperature, humidity and air pressure shift in spring. These shifts often create the conditions for very damaging storms such as tornadoes, floods and hail. Tornadoes were frequent, and debris flow (a water-related event) caused significant deaths and injuries. Most damages were also caused by springtime events (tornadoes, floods, hail). This is made clear by our top 6 states view.

  • Cold and water storm events (rain, winter weather, debris flow) were the most frequently reported. From both the 50-state view and the 6-state view, we can see a common trend of frequent reports of events of this nature. This could be due to the persistence of these events and to the event types under which they are reported. For example, freeze and winter storms are two different phenomena but can occur concurrently. We may also be able to draw some conclusions about NOAA's extensive emphasis on cold weather for public reporting and safety management purposes (e.g., clearing roads).

  • Debris flow and tornadoes cause the most deaths. There are only a handful of fatal events. The top two appear to be debris flow (a water-related event) and tornadoes (wind-related events), in Washington, Arkansas and Mississippi. It is interesting that the top tornado states were not represented here, but that may be due to using only one year's worth of data.

Implications:

Insurance: Insurance companies may want to increase the property or life risk premiums in certain states due to the storm conditions expected in those states.

Personal choice of where to live: As we have pointed out, several states show a trend of being hit by very damaging storm patterns. Winter weather is also often cited as the most frequent event in some areas. A person who prefers not to deal with cold weather might choose to avoid living in those areas.

Medical emergency preparation: With a pattern of event types (by time), major locations, storm frequency and storm severity, first responders and medical care providers can better prepare for and handle tragic situations when they occur.

Further Exploration:

Top safest places to live: We could explore the top "safest places to live." This view would be interesting because it would show the places with the lowest frequency of storm events, damages, injuries and deaths. The results may be confounded by low population density.

Deep dive into weather patterns: Now that we know which events were most tragic, it would be interesting to provide a further micro analysis of specific events. For example, if we focused on tornadoes in Mississippi, we could look at the temperature, air pressure, wind velocity, wind direction, etc. and compare them against a stable average.

Multiple year view: For the purpose of this study, we only used 2014. It would be interesting to compare storm trends over the years and compare the overall average against several significant trends or years.

Informal Data Visualization

HCI Salaries - Data obtained from Indeed.com

In this example of data visualization, I compared salaries across the top three cities in the US. I obtained the data from Indeed.com, where the analysis and formal presentation had already been completed. My comparative analysis is presented informally.

Location, Alphabetically, Time, Category, Hierarchy


IA salary information design.


UI salary information design.


UX salary information design.
