Module 10

 


Exploring Time Series Data Visualization with ggplot2


In this blog post, I explore the visualization of time series data using ggplot2, a data visualization package for R. A time series is a sequence of observations recorded at successive points in time, which makes it well suited to analyzing trends, patterns, and relationships over time.


1. Hot Dog Eating Contest Results Visualization


The first example focuses on visualizing the results of Nathan's Hot Dog Eating Contest from 1980 to 2010. We start by loading the data and examining the first few rows:


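```R
# read.csv() (unlike readr's read_csv()) converts header names into
# syntactic ones such as Dogs.eaten and New.record, which the code below uses
hotdogs <- read.csv("http://datasets.flowingdata.com/hot-dog-contest-winners.csv")
head(hotdogs)
```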


Next, we create a bar plot using base R graphics to display the number of hot dogs and buns eaten each year, with bars colored dark red in years when a new record was set and grey otherwise:


```R
# One color per bar: dark red for record years, grey otherwise
colors <- ifelse(hotdogs$New.record == 1, "darkred", "grey")
barplot(hotdogs$Dogs.eaten, names.arg = hotdogs$Year, col = colors, border = NA,
        main = "Nathan's Hot Dog Eating Contest Results, 1980-2010",
        xlab = "Year", ylab = "Hot dogs and buns (HDBs) eaten")
```
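To see what the `ifelse()` call produces, here is a minimal sketch on a toy data frame. The values in `toy` are made up for illustration and are not the contest data; only the column names mirror the ones used above.

```R
# Toy data: made-up values, NOT the real contest results
toy <- data.frame(
  Year       = 2001:2005,
  Dogs.eaten = c(20, 25, 24, 30, 28),
  New.record = c(0, 1, 0, 1, 0)
)

# ifelse() is vectorized: it returns one color per row
toy_colors <- ifelse(toy$New.record == 1, "darkred", "grey")
print(toy_colors)

# Same kind of barplot() call, applied to the toy data
barplot(toy$Dogs.eaten, names.arg = toy$Year, col = toy_colors, border = NA,
        xlab = "Year", ylab = "Hot dogs and buns (HDBs) eaten")
```

Because `col` takes one color per bar, the color vector must be in the same row order as the heights passed to `barplot()`.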


Then, we replicate the plot using ggplot2, which offers more flexibility and customization options:


```R
library(ggplot2)

# geom_col() is shorthand for geom_bar(stat = "identity")
ggplot(hotdogs) +
  geom_col(aes(x = Year, y = Dogs.eaten, fill = factor(New.record))) +
  labs(title = "Nathan's Hot Dog Eating Contest Results, 1980-2010", fill = "New Record") +
  xlab("Year") + ylab("Hot dogs and buns (HDBs) eaten")
```
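By default ggplot2 picks its own fill colors. If the goal is to match the dark red and grey of the base R version, one option is `scale_fill_manual()`. The sketch below uses a toy data frame with made-up values (not the contest data), and the names `toy` and `record_colors` are only illustrative.

```R
library(ggplot2)

# Toy data: made-up values, NOT the real contest results
toy <- data.frame(
  Year       = 2001:2005,
  Dogs.eaten = c(20, 25, 24, 30, 28),
  New.record = c(0, 1, 0, 1, 0)
)

# Map each level of the 0/1 flag to the colors used in the base R plot
record_colors <- c("0" = "grey", "1" = "darkred")

p <- ggplot(toy, aes(x = Year, y = Dogs.eaten, fill = factor(New.record))) +
  geom_col() +
  scale_fill_manual(values = record_colors, labels = c("No", "Yes")) +
  labs(fill = "New Record", x = "Year", y = "Hot dogs and buns (HDBs) eaten")

print(p)
```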


Lastly, we draw a stacked bar plot of the contest results by finishing place, with one bar per year from 2000 to 2010 and one segment per place:


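```R
# hotdog_places is assumed to be loaded already as a data frame with
# one column per year (2000 to 2010) and one row per finishing place
hotdog_places <- as.matrix(hotdog_places)
colnames(hotdog_places) <- as.character(2000:2010)
barplot(hotdog_places, border = NA,
        main = "Hot Dog Eating Contest Results, 2000-2010",
        xlab = "Year", ylab = "Hot dogs and buns (HDBs) eaten")
```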
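A ggplot2 version of a stacked bar plot needs the data in long format, with one row per place and year. The following is a rough sketch on a toy matrix; the numbers and the names `toy_places` and `toy_long` are made up for illustration and do not come from the contest data.

```R
library(ggplot2)

# Toy matrix: made-up values, NOT the real contest data.
# Rows are finishing places, columns are years, as barplot() expects.
toy_places <- matrix(
  c(25, 24, 22,
    30, 26, 25,
    28, 27, 20),
  nrow = 3,
  dimnames = list(c("1st", "2nd", "3rd"), c("2001", "2002", "2003"))
)

# Reshape to long format: one row per (place, year) pair
toy_long <- as.data.frame(as.table(toy_places))
names(toy_long) <- c("Place", "Year", "HDBs")

# geom_col() stacks bars that share an x value by default
p <- ggplot(toy_long, aes(x = Year, y = HDBs, fill = Place)) +
  geom_col() +
  labs(x = "Year", y = "Hot dogs and buns (HDBs) eaten")

print(p)
```

The height of each stacked bar equals the column sum of the matrix, which is also what base `barplot()` draws for a matrix input.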


2. Economics Data Visualization


In the second example, we explore the `economics` data set that ships with ggplot2. We start by examining the data and extracting the year from the date column:


```R
head(economics)

# POSIXlt stores the year as an offset from 1900
year <- function(x) as.POSIXlt(x)$year + 1900
economics$year <- year(economics$date)
```
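As a quick check on the year extraction, the sketch below compares the helper with a `format()`-based alternative and then uses the new column to summarise the data by calendar year. The names `econ`, `year_alt`, and `yearly` are illustrative only, and the yearly mean is just one possible use of the column.

```R
library(ggplot2)  # provides the economics data set

# Work on a copy so the original data set is left untouched
econ <- as.data.frame(economics)

# Same helper as in the post: POSIXlt stores years as an offset from 1900
year <- function(x) as.POSIXlt(x)$year + 1900
econ$year <- year(econ$date)

# Alternative that should give the same result
year_alt <- as.integer(format(econ$date, "%Y"))
print(all(econ$year == year_alt))

# Example use: mean of the median unemployment duration per calendar year
yearly <- aggregate(uempmed ~ year, data = econ, FUN = mean)
head(yearly)
```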


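Then, we create a line plot of the number of unemployed as a share of the total population (`unemploy / pop`) over time. Note that this is not the official unemployment rate, which is measured against the labor force rather than the whole population: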


```R
# qplot() is deprecated in recent ggplot2 releases;
# this is the equivalent ggplot() call
plot1 <- ggplot(economics, aes(x = date, y = unemploy / pop)) +
  geom_line()
plot1
```


Next, we create another line plot to visualize the median duration of unemployment over time:


```R
# uempmed is the median duration of unemployment, in weeks
plot2 <- ggplot(economics, aes(x = date, y = uempmed)) +
  geom_line()
plot2
```


Lastly, we arrange the two line plots side by side using the `grid.arrange()` function from the gridExtra package:


```R
library(gridExtra)

# ncol = 2 places the two plots in a single row
grid.arrange(plot1, plot2, ncol = 2)
```
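An alternative to arranging separate plots is to put both series in one long data frame and let ggplot2 facet them. This is only a sketch of that approach, not the method used in the post; `econ_long` and the `series` labels are names introduced here for illustration.

```R
library(ggplot2)

# Build a long data frame: one row per (date, series) pair
econ_long <- rbind(
  data.frame(date = economics$date, series = "unemploy / pop",
             value = economics$unemploy / economics$pop),
  data.frame(date = economics$date, series = "uempmed",
             value = economics$uempmed)
)

# One panel per series, stacked vertically so the dates line up;
# free y scales because the two series have different units
p <- ggplot(econ_long, aes(x = date, y = value)) +
  geom_line() +
  facet_wrap(~ series, ncol = 1, scales = "free_y")

print(p)
```

Stacking the panels vertically keeps a shared time axis, which can make it easier to compare when the two series rise and fall.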


Conclusion

Visualization plays a central role in time series analysis because it reveals trends, patterns, and relationships over time. The examples above show how ggplot2 can reproduce base R bar plots with more control over color and labeling, and how line plots of the `economics` data can be compared side by side.
