As I discussed when posting the dataset, the mechanisms for tornado warnings underwent a huge evolution during the time period that this data highlights. In addition to the changes in warning mechanisms, the reporting of events was also improving, and we therefore see an increase in the number of tornados in the dataset. I don't believe there were truly significantly more tornados in the decades that followed the 1950s; rather, as populations became more spread out and reporting became more standardized, we simply recorded more of them.
What is interesting is the decrease in the number of fatalities over time. This is a testament to our warning systems, public awareness and preparedness. Similarly, after peaking in the 1970s, property damage decreases in later years. I'm taking these values with a grain of salt, but would attribute this to better building materials and practices.
I've left the map for last. The Midwest, Southeast and Texas experience the majority of the tornados, with events west of the Rocky Mountains infrequent. The tornados with the largest numbers of fatalities occurred in areas where tornados are common but not regular occurrences, population density is high and the Fujita scale rating is also high.
Data Viz Approach
For this dataset, I wanted to focus on the geographic distribution of tornados, their destructive nature and time-based analysis. I accomplished this with three distinct visualization types: a map, bar charts and BANs.
I considered several map types: choropleth maps (states colored by a metric), hexbins (latitudes and longitudes rounded and binned into groups, typically colored by a metric), state tile maps (constant-sized shapes representing the approximate geographical location of each state, colored by a metric) and point distribution maps (a single point per event, with flexibility for size and color based on metrics or dimensions). After weighing the pros and cons of each, I decided to go with a point distribution map in an effort to highlight three distinct items:
- The location of the tornados across the contiguous United States (based on beginning latitude and longitude),
- The force of the tornados (Fujita scale indicated by color) and
- The destruction associated with the tornados (fatalities per tornado indicated by size).
As the sheer number of points makes for a very visually intensive map, I decided to use a custom map created in Mapbox that allowed me to focus just on the United States, remove all state & waterway labels and use a custom palette that complemented the rest of the dashboard.
I chose trusty bar charts to represent metrics by year or decade, based upon the user's parameter input. The bars effectively show how the metrics change across time, while being familiar to the average user. The labels complement the bars and provide a clear indication of the metric value. I kept the color scheme simple, highlighting both the min and max for each metric. The bottommost viz, looking at tornado frequency by month, incorporates an average line, providing a visual cue as to which months exceed or fall below the average.
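For anyone who wants to reproduce the year/decade toggle and the min/max highlighting outside of a dashboard tool, here is a rough Python sketch of the idea. It is not how the dashboard itself was built; the function names `count_by_period` and `flag_extremes`, the `grain` argument and the sample years are all made up for illustration.

```python
from collections import Counter

def count_by_period(years, grain="year"):
    """Count events per year or per decade, depending on `grain`."""
    if grain not in ("year", "decade"):
        raise ValueError("grain must be 'year' or 'decade'")
    # Integer division floors each year to the start of its decade (1957 -> 1950).
    keys = [y if grain == "year" else (y // 10) * 10 for y in years]
    return dict(sorted(Counter(keys).items()))

def flag_extremes(counts):
    """Label each period 'min', 'max' or 'other' to drive the bar colors."""
    lo, hi = min(counts.values()), max(counts.values())
    flags = {}
    for period, n in counts.items():
        if n == hi:
            flags[period] = "max"
        elif n == lo:
            flags[period] = "min"
        else:
            flags[period] = "other"
    return flags

# Made-up sample years, for illustration only (not taken from the dataset).
sample_years = [1953, 1953, 1957, 1965, 1974, 1974, 1974, 1988]
by_decade = count_by_period(sample_years, grain="decade")
print(by_decade)
print(flag_extremes(by_decade))
```

Ties are all flagged, so two decades sharing the highest count would both be highlighted as the max.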
Big numbers are an easy way to summarize the metrics in the bar charts immediately above them.
There were a couple of challenges with the Property Damage column that was included with the NOAA data source. The first was that the column came in as text, with K, M and B suffixes indicating scale (thousands, millions and billions). Other values came in without a unit and, based on their magnitudes, I assumed these to be millions. I created calculated fields to convert the numeric part of each string to a float and then multiplied it by the scale implied by its unit.
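I did this conversion with calculated fields, but the same logic can be sketched in a few lines of Python. Treat this as an approximation of the approach rather than the exact formula I used: the function name `parse_damage`, the `SCALE` table and the `default_unit` argument are hypothetical, and returning `None` for blank values is an assumption of the sketch.

```python
# Multipliers for the unit suffixes: thousands, millions, billions.
SCALE = {"K": 1e3, "M": 1e6, "B": 1e9}

def parse_damage(raw, default_unit="M"):
    """Convert a damage string such as '2.5K' to a dollar amount.

    Values with no suffix fall back to `default_unit`; 'M' mirrors the
    assumption described in the text. Blank values return None.
    """
    text = (raw or "").strip().upper()
    if not text:
        return None
    if text[-1] in SCALE:
        number, unit = text[:-1], text[-1]
    else:
        number, unit = text, default_unit
    if not number:
        # A bare suffix with no digits carries no usable amount.
        return None
    return float(number) * SCALE[unit]

print(parse_damage("2.5K"))  # 2500.0
print(parse_damage("3"))     # 3000000.0 (no unit, assumed millions)
```

Keeping the default unit as an argument makes it easy to test how sensitive the totals are to the "no unit means millions" assumption.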
To put the property damage values on a common baseline, I brought in a Consumer Price Index data source from the Federal Reserve Bank and used it to estimate property damage in 1999 dollars.
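The adjustment itself is just a ratio of index values: damage in 1999 dollars is the reported damage multiplied by the 1999 CPI divided by the CPI for the year of the event. Here is a minimal Python sketch of that arithmetic. The function name `to_base_dollars` and the `cpi_by_year` lookup are hypothetical, and the index values below are placeholders picked for easy arithmetic, not real CPI figures.

```python
def to_base_dollars(amount, event_year, cpi_by_year, base_year=1999):
    """Restate `amount` from event-year dollars in base-year dollars."""
    ratio = cpi_by_year[base_year] / cpi_by_year[event_year]
    return amount * ratio

# Placeholder index values for illustration only; NOT real CPI data.
cpi = {1950: 25.0, 1975: 50.0, 1999: 100.0}

print(to_base_dollars(1_000_000, 1950, cpi))  # 4000000.0 with these placeholders
print(to_base_dollars(1_000_000, 1999, cpi))  # 1000000.0 (base year is unchanged)
```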