Analysis and Visualization of Tsunami Data using Tableau
Analysis and insights from visualizing a tsunami dataset
What’s a tsunami?
A tsunami is a series of large waves caused by sudden movements of the sea surface, usually triggered by earthquakes at tectonic plate boundaries, underwater landslides, or volcanic eruptions (National Geographic).
Most tsunamis–about 80 percent–happen within the Pacific Ocean’s “Ring of Fire,” a geologically active area where tectonic shifts make volcanoes and earthquakes common.
To gain real insight into tsunamis, I retrieved data for 2015 to 2019 from the NGDC/WDS Global Historical Tsunami Database provided by NOAA. I will present two types of visualization, exploration and prediction, built with the BI tool Tableau.
1 — Exploration Visualization
The basic idea of exploratory data visualization is to present data in various visual formats so that users can interact with it directly, gain deeper insight, and describe and draw conclusions. It is characterized by presenting insights from an existing dataset and highlighting the relationships within it.
- Tsunami Frequency Visualization
This interactive dashboard visualizes tsunami frequency by cause. Clicking a cause code description in the text table filters the view, so the map shows only the chosen cause and the bar chart compares its frequency per year.
Out of a total of 1660 tsunamis, earthquakes were the most common cause, with 852 events. This is consistent with most tsunamis happening within the Ring of Fire, where tectonic shifts are common. The map shows the same pattern: most of the countries with many tsunami cases lie on the Ring of Fire.
Overall, Indonesia had the most cases with 693, followed by Chile with 313. The bar chart shows that 2018 was the year with the most tsunamis, with 861 cases in Ring of Fire (ROF) countries and 15 cases in non-ROF countries.
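The counts behind a dashboard like this are simple group-and-count aggregations. The sketch below is only an illustration in Python of what the text table, bar chart, and cause filter compute; the records are invented toy rows, not data from the NOAA database, and `count_by` is a hypothetical helper name.

```python
from collections import Counter

# Toy records, invented for illustration only: (year, country, cause).
# They are NOT rows from the NOAA database.
events = [
    (2015, "Chile", "Earthquake"),
    (2016, "Indonesia", "Earthquake"),
    (2018, "Indonesia", "Volcano"),
    (2018, "Indonesia", "Earthquake"),
    (2018, "Chile", "Landslide"),
]

def count_by(records, field_index, cause=None):
    """Count records per value of one field, optionally filtered by cause."""
    return Counter(
        rec[field_index]
        for rec in records
        if cause is None or rec[2] == cause
    )

by_cause = count_by(events, 2)   # text table: events per cause
by_year = count_by(events, 0)    # bar chart: events per year
# Clicking "Earthquake" in the text table corresponds to this filter.
quake_by_country = count_by(events, 1, cause="Earthquake")

print(by_cause.most_common(1))
print(by_year.most_common(1))
print(dict(quake_by_country))
```

With real data, the same three counters would be built from the exported rows of the database rather than from a hand-written list.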
2 — Prediction Visualization
Visualizing data can surface new insights and reveal new relationships in existing information. Prediction visualization can be done with clustering, time-series analysis, and predictive analysis such as forecasting and trend lines.
- Clustering the Tsunami Area and Location
In this dashboard, I clustered areas by earthquake magnitude into major (cluster 2), strong (cluster 3), and moderate (cluster 1) clusters. The map shows which cluster each area's average earthquake magnitude falls into. It also shows that areas on the Ring of Fire mostly fall into the major and strong clusters. Information about the clusters and their centers (the average earthquake magnitude of each cluster) is provided below.
In the second graph, locations are clustered by average tsunami water height (m) and average inundation distance (m) into light, moderate, and strong clusters. Hovering over a data point (a circle) shows which area it represents, along with its inundation distance and water height. The higher the average water height and inundation distance, the greater the impact the tsunami had on that location.
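Tableau's clustering feature is based on k-means, so the grouping above can be approximated outside Tableau. The following is a minimal sketch of plain k-means on a single variable, assuming three clusters as in the dashboard. The magnitudes are invented for illustration, `kmeans_1d` is a hypothetical helper, and the cluster numbers it produces are arbitrary, which is why the names are assigned by ranking the centers.

```python
def kmeans_1d(values, k, iterations=100):
    """Plain Lloyd's k-means on a list of numbers. Returns (centers, labels)."""
    ordered = sorted(values)
    # Deterministic start: spread the initial centers across the sorted data.
    centers = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    labels = []
    for _ in range(iterations):
        # Assignment step: each value joins its nearest center.
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            new_centers.append(sum(members) / len(members) if members else centers[c])
        if new_centers == centers:
            break  # converged: no center moved
        centers = new_centers
    return centers, labels

# Hypothetical average magnitudes per area, invented for illustration.
magnitudes = [5.1, 5.3, 5.4, 6.4, 6.6, 6.7, 7.6, 7.8, 8.1]
centers, labels = kmeans_1d(magnitudes, k=3)

# Name the clusters by ranking their centers from lowest to highest.
names = ["moderate", "strong", "major"]
order = sorted(range(3), key=lambda c: centers[c])
rank = {cluster: names[position] for position, cluster in enumerate(order)}
named = [rank[lab] for lab in labels]

for magnitude, name in zip(magnitudes, named):
    print(magnitude, name)
```

The centers returned here play the same role as the cluster centers reported for the dashboard: each one is the average magnitude of the areas assigned to that cluster.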
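When two measures with very different ranges are clustered together, they are usually rescaled first so that the larger-ranged one (here inundation distance, in hundreds of metres) does not drown out the smaller one (water height, in metres). The sketch below shows min-max scaling as one common way to do that; the location averages are invented for illustration, and whether this matches the exact preprocessing used in the dashboard is an assumption.

```python
def min_max_scale(values):
    """Rescale a list of numbers to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def distance(p, q):
    """Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5

# Hypothetical location averages, invented for illustration.
water_height_m = [0.5, 1.0, 4.0, 8.0]
inundation_m = [20.0, 200.0, 60.0, 800.0]

raw = list(zip(water_height_m, inundation_m))
scaled = list(zip(min_max_scale(water_height_m), min_max_scale(inundation_m)))

# On raw values, location 0 looks nearest to location 2 because the
# inundation distances dominate; after scaling, water height counts too
# and location 1 becomes the nearer neighbour.
print(distance(raw[0], raw[1]), distance(raw[0], raw[2]))
print(distance(scaled[0], scaled[1]), distance(scaled[0], scaled[2]))
```

Because k-means assigns points by distance, a change like this in which neighbour is nearest can change which cluster a location ends up in.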
- Forecasting Tsunami’s Frequency
Science cannot predict exactly when a tsunami will be generated, since tsunamis do not have a season. On this dashboard, I only provide a forecast of the tsunami frequency for 2020 based on the historical data (2015–2019), looking for potential seasonal patterns every 12 months using an additive model. With a forecast total of 301 cases, the peak is expected in September 2020 with 221 cases.
That was my visualization and analysis of tsunamis using the dataset from 2015 to 2019. The dataset is missing some data points, which might affect the granularity of the analysis and the prediction models. For future studies, I suggest using a more complete dataset, which would give a better chance of exploring the patterns in the data.
Note: This was my project back in August 2020 with Anastasia Milenia, but I have made some improvements to it since.
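To make the idea of an additive model with a 12-month season concrete, here is a deliberately simplified sketch: each month's forecast is an overall level plus a seasonal offset for that calendar month. Tableau's forecasting uses exponential smoothing, which this stand-in does not reproduce; the function name and the two-year monthly history are invented for illustration and are not the dashboard's data.

```python
def additive_seasonal_forecast(monthly_counts, season_length=12):
    """Forecast one season ahead as overall level + per-month seasonal offset.

    With no trend term, this reduces to the mean of each calendar month.
    """
    if not monthly_counts or len(monthly_counts) % season_length != 0:
        raise ValueError("history must cover one or more whole seasons")
    level = sum(monthly_counts) / len(monthly_counts)
    forecast = []
    for month in range(season_length):
        # All observations of the same calendar month across the years.
        same_month = monthly_counts[month::season_length]
        seasonal_offset = sum(same_month) / len(same_month) - level
        forecast.append(level + seasonal_offset)
    return forecast

# Hypothetical monthly counts for two years (January first), invented.
history = [
    1, 0, 2, 1, 0, 1, 3, 2, 9, 1, 0, 2,
    1, 2, 0, 1, 2, 1, 1, 2, 11, 3, 0, 0,
]

forecast = additive_seasonal_forecast(history)
peak_month = forecast.index(max(forecast))  # 0 = January

print("forecast total:", round(sum(forecast), 1))
print("peak month index:", peak_month)
```

A single unusually active month in the history pulls that month's seasonal offset up sharply, which is one way a forecast can end up concentrating most of its cases in one month.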