Assignment #9: Visualization in R – Base Graphics, Lattice, and ggplot2

# Load the iris dataset (listed in the Rdatasets collection; shipped with R's datasets package)
data("iris", package = "datasets")
head(iris)

# Base R Graphics

# Scatter plot
plot(iris$Sepal.Length, iris$Petal.Length,
     main = "Base R: Sepal Length vs Petal Length",
     xlab = "Sepal Length",
     ylab = "Petal Length",
     col = as.numeric(iris$Species),
     pch = 19)

legend("topleft",
       legend = levels(iris$Species),
       col = 1:3,
       pch = 19)

# Histogram
hist(iris$Sepal.Width,
     main = "Base R: Distribution of Sepal Width",
     xlab = "Sepal Width",
     col = "lightblue",
     border = "white")

# Lattice Graphics
library(lattice)

# Conditional scatter plot
xyplot(Petal.Length ~ Sepal.Length | Species,
       data = iris,
       main = "Lattice: Petal Length vs Sepal Length by Species",
       xlab = "Sepal Length",
       ylab = "Petal Length",
       col = "darkgreen",
       pch = 19)

# Box plot
bwplot(Sepal.Width ~ Species,
       data = iris,
       main = "Lattice: Sepal Width by Species",
       xlab = "Species",
       ylab = "Sepal Width",
       col = "orange")

# ggplot2 Graphics
library(ggplot2)

# Scatter plot with smoothing
ggplot(iris, aes(x = Sepal.Length, y = Petal.Length, color = Species)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE) +
  labs(title = "ggplot2: Petal Length vs Sepal Length with Trend by Species",
       x = "Sepal Length",
       y = "Petal Length")

# Faceted histogram
ggplot(iris, aes(x = Sepal.Width, fill = Species)) +
  geom_histogram(binwidth = 0.2, color = "black") +
  facet_wrap(~ Species) +
  labs(title = "ggplot2: Sepal Width Distribution by Species",
       x = "Sepal Width",
       y = "Count")

For this assignment, I used the iris dataset from the datasets package and created visualizations in base R, lattice, and ggplot2. Base R was the quickest way to get a plot on screen, but details such as point colors and the legend had to be set by hand: the scatter plot needs a separate legend() call whose colors must match the col argument. Lattice suited conditioned plots, because the formula Petal.Length ~ Sepal.Length | Species splits the data into one panel per species, which made comparison easy. ggplot2 felt the most flexible and polished: its layered syntax let me combine points, per-species trend lines, and facets into publication-quality graphics with relatively little code, and the legend was generated automatically. The biggest challenge when switching between systems was adjusting to the different syntax styles, since base R uses direct plotting commands, lattice uses formulas, and ggplot2 builds plots in layers. Overall, ggplot2 gave me the most control and the cleanest final output, while base R was the easiest for simple visuals.
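
As a minimal sketch of that syntax difference, the same box plot of Sepal.Width by Species can be written in all three systems. The object names p_lattice and p_gg and the short titles are only illustrative; they are not part of the assignment code above.

```r
library(lattice)
library(ggplot2)

# Base R: the call draws straight to the graphics device
boxplot(Sepal.Width ~ Species, data = iris,
        main = "Base R", xlab = "Species", ylab = "Sepal Width")

# lattice: the formula call returns a "trellis" object,
# which is drawn when it is printed
p_lattice <- bwplot(Sepal.Width ~ Species, data = iris, main = "lattice")
print(p_lattice)

# ggplot2: start from data + aesthetics, then add one layer at a time
p_gg <- ggplot(iris, aes(x = Species, y = Sepal.Width))
p_gg <- p_gg + geom_boxplot()
p_gg <- p_gg + labs(title = "ggplot2", x = "Species", y = "Sepal Width")
print(p_gg)
```

Because lattice and ggplot2 return plot objects, a plot can be stored, modified, and printed later, whereas a base R plot is changed by issuing further drawing calls such as legend().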

