– Tidyverse Using ggplot2
– exercises
– provide a visual representation and a detailed explanation, with a screenshot of each solution
– RStudio is recommended for all exercises
– NO PLAGIARISM
– NEED PLAGIARISM REPORT
– IN-TEXT CITATIONS
– QUESTION IS ATTACHED, AND THE SOURCE TOO
R For Data Science Cheat Sheet
Tidyverse for Beginners
Learn More R for Data Science Interactively at www.datacamp.com
Tidyverse
The tidyverse is a powerful collection of R packages for transforming and
visualizing data. All packages of the tidyverse share an underlying
philosophy and common APIs.
The core packages are:
• ggplot2, which implements the grammar of graphics. You can use it
to visualize your data.
• dplyr, a grammar of data manipulation. You can use it to solve the
most common data manipulation challenges.
• tidyr, which helps you create tidy data: data where each variable is in a
column, each observation is in a row, and each value is in a cell.
• readr, a fast and friendly way to read rectangular data.
• purrr, which enhances R's functional programming (FP) toolkit by providing a
complete and consistent set of tools for working with functions and
vectors.
• tibble, a modern reimagining of the data frame.
• stringr, which provides a cohesive set of functions designed to make
working with strings as easy as possible.
• forcats, which provides a suite of useful tools that solve common problems
with factors.
You can install the complete tidyverse with:
> install.packages("tidyverse")
Then, load the core tidyverse and make it available in your current R
session by running:
> library(tidyverse)
Note: there are many other tidyverse packages with more specialised usage. They are not
loaded automatically with library(tidyverse), so you'll need to load each one with its own call
to library().
Useful Functions
> tidyverse_conflicts()   # Conflicts between tidyverse and other packages
> tidyverse_deps()        # List all tidyverse dependencies
> tidyverse_logo()        # Get tidyverse logo, using ASCII or unicode characters
> tidyverse_packages()    # List all tidyverse packages
> tidyverse_update()      # Update tidyverse packages
Loading in the data
> library(datasets)       # Load the datasets package
> library(gapminder)      # Load the gapminder package
> attach(iris)            # Attach iris data to the R search path
dplyr
Filter
filter() allows you to select a subset of rows in a data frame.
> iris %>%                              # Select iris data of species "virginica"
    filter(Species == "virginica")
> iris %>%                              # Select iris data of species "virginica"
    filter(Species == "virginica",      # and sepal length greater than 6
           Sepal.Length > 6)
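As a rough illustration of the filter() calls in this section, the sketch below assumes only that dplyr is installed. The object names big_virginica, same_subset and two_species are made up for the example, and the %in% operator is base R rather than something the cheat sheet covers. Conditions separated by commas are combined with a logical AND.

```r
library(dplyr)

# Keep virginica flowers whose sepal length exceeds 6 (a comma means AND).
big_virginica <- iris %>%
  filter(Species == "virginica", Sepal.Length > 6)

# The same subset written with an explicit & operator.
same_subset <- iris %>%
  filter(Species == "virginica" & Sepal.Length > 6)

# %in% keeps rows that match any of several values.
two_species <- iris %>%
  filter(Species %in% c("setosa", "virginica"))
```

Assigning the result to a name, as done here, keeps the subset for later steps; the original iris data frame is left unchanged.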
Arrange
arrange() sorts the observations in a dataset in ascending or descending order
based on one of its variables.
> iris %>%                              # Sort in ascending order of sepal length
    arrange(Sepal.Length)
> iris %>%                              # Sort in descending order of sepal length
    arrange(desc(Sepal.Length))
Combine multiple dplyr verbs in a row with the pipe operator %>%:
> iris %>%                              # Filter for species "virginica",
    filter(Species == "virginica") %>%  # then arrange in descending
    arrange(desc(Sepal.Length))         # order of sepal length
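To make the role of the pipe more concrete, here is a small sketch that assumes dplyr is installed. The names piped and nested are invented for the illustration. The pipe passes the value on its left as the first argument of the call on its right, so both versions should produce the same data frame.

```r
library(dplyr)

# Piped form: reads top to bottom, one verb per line.
piped <- iris %>%
  filter(Species == "virginica") %>%
  arrange(desc(Sepal.Length))

# Equivalent nested form: reads from the inside out.
nested <- arrange(filter(iris, Species == "virginica"), desc(Sepal.Length))
```

The piped form tends to be easier to extend, since a further verb can be appended without rewrapping the whole expression.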
Mutate
mutate() allows you to update or create new columns of a data frame.
> iris %>%                                    # Change Sepal.Length to be in millimeters
    mutate(Sepal.Length = Sepal.Length * 10)
> iris %>%                                    # Create a new column called SLMm
    mutate(SLMm = Sepal.Length * 10)
Combine the verbs filter(), arrange(), and mutate():
> iris %>%
    filter(Species == "virginica") %>%
    mutate(SLMm = Sepal.Length * 10) %>%
    arrange(desc(SLMm))
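One point that is easy to miss is that mutate() returns a new data frame rather than changing the input. The sketch below assumes dplyr is installed; the name iris_mm is hypothetical.

```r
library(dplyr)

# mutate() returns a modified copy; assign it to keep the new column.
iris_mm <- iris %>%
  mutate(SLMm = Sepal.Length * 10)

# The original iris data frame still has only its five columns.
original_columns <- names(iris)
new_columns <- names(iris_mm)
```

Without the assignment, the result is only printed and the new column is lost.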
Summarize
summarize() allows you to turn many observations into a single data point.
> iris %>%                                      # Summarize to find the median sepal length
    summarize(medianSL = median(Sepal.Length))
> iris %>%                                      # Filter for virginica, then summarize
    filter(Species == "virginica") %>%          # the median sepal length
    summarize(medianSL = median(Sepal.Length))
You can also summarize multiple variables at once:
> iris %>%
    filter(Species == "virginica") %>%
    summarize(medianSL = median(Sepal.Length),
              maxSL = max(Sepal.Length))
group_by() allows you to summarize within groups instead of summarizing the
entire dataset:
> iris %>%                                      # Find median and max sepal length
    group_by(Species) %>%                       # of each species
    summarize(medianSL = median(Sepal.Length),
              maxSL = max(Sepal.Length))
> iris %>%                                      # Find median and max petal length
    filter(Sepal.Length > 6) %>%                # of each species with sepal
    group_by(Species) %>%                       # length > 6
    summarize(medianPL = median(Petal.Length),
              maxPL = max(Petal.Length))
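As a sketch of how grouped summaries behave, the example below assumes dplyr is installed. The name by_species_summary and the column n_flowers are invented for the illustration, and n() is a dplyr helper that the cheat sheet itself does not mention; it counts the rows in each group.

```r
library(dplyr)

# One output row per species, with a row count and two summaries.
by_species_summary <- iris %>%
  group_by(Species) %>%
  summarize(n_flowers = n(),
            medianSL = median(Sepal.Length),
            maxSL = max(Sepal.Length))
```

The result has one row per group, so here it has three rows, one for each species in iris.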
ggplot2
Scatter plot
Scatter plots allow you to compare two variables within your data. To do this with
ggplot2, you use geom_point().
> iris_small <- iris %>%
    filter(Sepal.Length > 5)
> ggplot(iris_small, aes(x = Petal.Length,     # Compare petal width and length
                         y = Petal.Width)) +
    geom_point()
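The scatter plot can also be stored in a variable and drawn later, which is convenient when taking screenshots for the exercises. This is a minimal sketch assuming dplyr and ggplot2 are installed; the name scatter and the axis label text are illustrative choices, not part of the cheat sheet.

```r
library(dplyr)
library(ggplot2)

iris_small <- iris %>%
  filter(Sepal.Length > 5)

# Store the plot in a variable; labs() sets readable axis labels.
scatter <- ggplot(iris_small, aes(x = Petal.Length, y = Petal.Width)) +
  geom_point() +
  labs(x = "Petal length (cm)", y = "Petal width (cm)")

# Printing the object draws it on the active graphics device.
print(scatter)
```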
Additional Aesthetics
• Color
> ggplot(iris_small, aes(x = Petal.Length,
                         y = Petal.Width,
                         color = Species)) +
    geom_point()
• Size
> ggplot(iris_small, aes(x = Petal.Length,
                         y = Petal.Width,
                         color = Species,
                         size = Sepal.Length)) +
    geom_point()
Faceting
> ggplot(iris_small, aes(x = Petal.Length,
                         y = Petal.Width)) +
    geom_point() +
    facet_wrap(~Species)
Line Plots
> by_year <- gapminder %>%
    group_by(year) %>%
    summarize(medianGdpPerCap = median(gdpPercap))
> ggplot(by_year, aes(x = year,
                      y = medianGdpPerCap)) +
    geom_line() +
    expand_limits(y = 0)
Bar Plots
> by_species <- iris %>%
    filter(Sepal.Length > 6) %>%
    group_by(Species) %>%
    summarize(medianPL = median(Petal.Length))
> ggplot(by_species, aes(x = Species,
                         y = medianPL)) +
    geom_col()
Histograms
> ggplot(iris_small, aes(x = Petal.Length)) +
    geom_histogram()
Box Plots
> ggplot(iris_small, aes(x = Species,
                         y = Sepal.Width)) +
    geom_boxplot()
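Called without arguments, geom_histogram() falls back to 30 bins and prints a message suggesting a better value. The sketch below shows one way to set the bin width explicitly, assuming dplyr and ggplot2 are installed. The width of 0.25 and the names hist_plot and hist_data are arbitrary choices for illustration.

```r
library(dplyr)
library(ggplot2)

iris_small <- iris %>%
  filter(Sepal.Length > 5)

# Choose the bin width explicitly instead of relying on the default of 30 bins.
hist_plot <- ggplot(iris_small, aes(x = Petal.Length)) +
  geom_histogram(binwidth = 0.25)

# layer_data() returns the computed bins, one row per bar.
hist_data <- layer_data(hist_plot)
```

The count column of hist_data adds up to the number of rows in iris_small, which is a quick check that no observation fell outside the bins.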
Data Visualization with ggplot2 Cheat Sheet
RStudio® is a trademark of RStudio, Inc. • CC BY RStudio • info@rstudio.com • 844-448-1212 • rstudio.com Learn more at docs.ggplot2.org • ggplot2 0.9.3.1 • Updated: 3/15
Geoms – Use a geom to represent data points, use the geom's aesthetic properties to represent variables. Each function returns a layer.
One Variable
Continuous
a <- ggplot(mpg, aes(hwy))
a + geom_area(stat = "bin")
  x, y, alpha, color, fill, linetype, size
a + geom_area(aes(y = ..density..), stat = "bin")
a + geom_density(kernel = "gaussian")
  x, y, alpha, color, fill, linetype, size, weight
a + geom_density(aes(y = ..count..))
a + geom_dotplot()
  x, y, alpha, color, fill
a + geom_freqpoly()
  x, y, alpha, color, linetype, size
a + geom_freqpoly(aes(y = ..density..))
a + geom_histogram(binwidth = 5)
  x, y, alpha, color, fill, linetype, size, weight
a + geom_histogram(aes(y = ..density..))
Discrete
b <- ggplot(mpg, aes(fl))
b + geom_bar()
  x, alpha, color, fill, linetype, size, weight
Graphical Primitives
map <- map_data("state")   # also used under Maps
c <- ggplot(map, aes(long, lat))
c + geom_polygon(aes(group = group))
  x, y, alpha, color, fill, linetype, size
d <- ggplot(economics, aes(date, unemploy))
d + geom_path(lineend = "butt",
  linejoin = "round", linemitre = 1)
  x, y, alpha, color, linetype, size
d + geom_ribbon(aes(ymin = unemploy - 900,
  ymax = unemploy + 900))
  x, ymax, ymin, alpha, color, fill, linetype, size
e <- ggplot(seals, aes(x = long, y = lat))
e + geom_segment(aes(
  xend = long + delta_long,
  yend = lat + delta_lat))
  x, xend, y, yend, alpha, color, linetype, size
e + geom_rect(aes(xmin = long, ymin = lat,
  xmax = long + delta_long,
  ymax = lat + delta_lat))
  xmax, xmin, ymax, ymin, alpha, color, fill, linetype, size
Two Variables
Continuous X, Continuous Y
f <- ggplot(mpg, aes(cty, hwy))
f + geom_blank()
f + geom_jitter()
  x, y, alpha, color, fill, shape, size
f + geom_point()
  x, y, alpha, color, fill, shape, size
f + geom_quantile()
  x, y, alpha, color, linetype, size, weight
f + geom_rug(sides = "bl")
  alpha, color, linetype, size
f + geom_smooth(method = lm)
  x, y, alpha, color, fill, linetype, size, weight
f + geom_text(aes(label = cty))
  x, y, label, alpha, angle, color, family, fontface, hjust, lineheight, size, vjust
Discrete X, Continuous Y
g <- ggplot(mpg, aes(class, hwy))
g + geom_bar(stat = "identity")
  x, y, alpha, color, fill, linetype, size, weight
g + geom_boxplot()
  lower, middle, upper, x, ymax, ymin, alpha, color, fill, linetype, shape, size, weight
g + geom_dotplot(binaxis = "y",
  stackdir = "center")
  x, y, alpha, color, fill
g + geom_violin(scale = "area")
  x, y, alpha, color, fill, linetype, size, weight
Discrete X, Discrete Y
h <- ggplot(diamonds, aes(cut, color))
h + geom_jitter()
  x, y, alpha, color, fill, shape, size
Continuous Bivariate Distribution
i <- ggplot(movies, aes(year, rating))
i + geom_bin2d(binwidth = c(5, 0.5))
  xmax, xmin, ymax, ymin, alpha, color, fill, linetype, size, weight
i + geom_density2d()
  x, y, alpha, colour, linetype, size
i + geom_hex()
  x, y, alpha, colour, fill, size
Continuous Function
j <- ggplot(economics, aes(date, unemploy))
j + geom_area()
  x, y, alpha, color, fill, linetype, size
j + geom_line()
  x, y, alpha, color, linetype, size
j + geom_step(direction = "hv")
  x, y, alpha, color, linetype, size
Visualizing error
df <- data.frame(grp = c("A", "B"), fit = 4:5, se = 1:2)
k <- ggplot(df, aes(grp, fit, ymin = fit - se, ymax = fit + se))
k + geom_crossbar(fatten = 2)
  x, y, ymax, ymin, alpha, color, fill, linetype, size
k + geom_errorbar()
  x, ymax, ymin, alpha, color, linetype, size, width (also geom_errorbarh())
k + geom_linerange()
  x, ymin, ymax, alpha, color, linetype, size
k + geom_pointrange()
  x, y, ymin, ymax, alpha, color, fill, linetype, shape, size
Maps
data <- data.frame(murder = USArrests$Murder,
  state = tolower(rownames(USArrests)))
l <- ggplot(data, aes(fill = murder))
l + geom_map(aes(map_id = state), map = map) +
  expand_limits(x = map$long, y = map$lat)
  map_id, alpha, color, fill, linetype, size
Three Variables
seals$z <- with(seals, sqrt(delta_long^2 + delta_lat^2))
m <- ggplot(seals, aes(long, lat))
m + geom_contour(aes(z = z))
  x, y, z, alpha, colour, linetype, size, weight
m + geom_raster(aes(fill = z), hjust = 0.5,
  vjust = 0.5, interpolate = FALSE)
  x, y, alpha, fill
m + geom_tile(aes(fill = z))
  x, y, alpha, color, fill, linetype, size
Basics
Build a graph with qplot() or ggplot()
ggplot2 is based on the grammar of graphics, the
idea that you can build every graph from the same
few components: a data set, a set of geoms—visual
marks that represent data points, and a coordinate
system.
To display data values, map variables in the data set
to aesthetic properties of the geom like size, color,
and x and y locations.
[Figure: data (columns F, M, A) + geom (x = F, y = A; optionally color = F, size = A) + coordinate system = plot]
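The three components named under Basics (a data set, a geom, and a coordinate system) can be spelled out in a short sketch. It assumes ggplot2 is installed and uses the mpg data set that ships with it; the name basic_plot and the choice of class for color are illustrative only.

```r
library(ggplot2)

# Data: the mpg data set bundled with ggplot2.
# Aesthetic mappings: cty -> x position, hwy -> y position, class -> color.
basic_plot <- ggplot(data = mpg,
                     mapping = aes(x = cty, y = hwy, color = class)) +
  geom_point() +       # geom: one point per row of the data
  coord_cartesian()    # coordinate system (Cartesian is already the default)

print(basic_plot)
```

Writing coord_cartesian() explicitly changes nothing here; it is included only to show where the coordinate system enters the plot specification.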
qplot(x = cty, y = hwy, color = cyl, data = mpg, geom = "point")
Creates a complete plot with given data, geom, and
mappings. Supplies many useful defaults.
ggplot(data = mpg, aes(x = cty, y = hwy))
Begins a plot that you finish by adding layers to. No
defaults, but provides more control than qplot().
ggplot(mpg, aes(hwy, cty)) +           # data, aesthetic mappings
  geom_point(aes(color = cyl)) +       # layer = geom + default stat + layer-specific mappings
  geom_smooth(method = "lm") +         # add layers and elements with +
  coord_cartesian() +                  # additional elements
  scale_color_gradient() +
  theme_bw()
Add a new layer to a plot with a geom_*()
or stat_*() function. Each provides a geom, a
set of aesthetic mappings, and a default stat
and position adjustment.
last_plot()
Returns the last plot.
ggsave("plot.png", width = 5, height = 5)
Saves the last plot as a 5 in x 5 in file named "plot.png" in the
working directory. Matches file type to file extension.
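For the screenshot requirement, saving a plot to a file can be sketched as follows. The example assumes ggplot2 is installed and that a PNG graphics device is available. The file name cty_vs_hwy.png, the temporary directory, and the dpi value are illustrative choices, not part of the cheat sheet.

```r
library(ggplot2)

saved_plot <- ggplot(mpg, aes(x = cty, y = hwy)) +
  geom_point()

# Write to a temporary directory so the working directory stays clean.
out_file <- file.path(tempdir(), "cty_vs_hwy.png")

# Passing plot = saved_plot avoids relying on last_plot().
ggsave(out_file, plot = saved_plot, width = 5, height = 5,
       units = "in", dpi = 150)
```

Naming the plot explicitly is a little more robust than the default, which saves whatever plot was displayed last.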
Stats – An alternative way to build a layer
Some plots visualize a transformation of the original data set.
Use a stat to choose a common transformation to visualize,
e.g. a + geom_bar(stat = "bin")
[Figure: data (fl, cty, cyl) + stat (x, ..count..) + geom (x = x, y = ..count..) + coordinate system = plot]
Each stat creates additional variables to map aesthetics
to. These variables use a common ..name.. syntax.
stat functions and geom functions both combine a stat
with a geom to make a layer, i.e. stat_bin(geom = "bar")
does the same as geom_bar(stat = "bin")
i + stat_density2d(aes(fill = ..level..),
  geom = "polygon", n = 100)
  stat_density2d: stat function
  aes(fill = ..level..): layer-specific mappings; ..level.. is a variable created by the transformation
  geom = "polygon": geom for layer
  n = 100: parameter for stat
1D distributions
a + stat_bin(binwidth = 1, origin = 10)
  x, y | ..count.., ..ncount.., ..density.., ..ndensity..
a + stat_bindot(binwidth = 1, binaxis = "x")
  x, y | ..count.., ..ncount..
a + stat_density(adjust = 1, kernel = "gaussian")
  x, y | ..count.., ..density.., ..scaled..
2D distributions
f + stat_bin2d(bins = 30, drop = TRUE)
  x, y, fill | ..count.., ..density..
f + stat_binhex(bins = 30)
  x, y, fill | ..count.., ..density..
f + stat_density2d(contour = TRUE, n = 100)
  x, y, color, size | ..level..
3 Variables
m + stat_contour(aes(z = z))
  x, y, z, order | ..level..
m + stat_spoke(aes(radius = z, angle = z))
  angle, radius, x, xend, y, yend | ..x.., ..xend.., ..y.., ..yend..
m + stat_summary_hex(aes(z = z), bins = 30, fun = mean)
  x, y, z, fill | ..value..
m + stat_summary2d(aes(z = z), bins = 30, fun = mean)
  x, y, z, fill | ..value..
Comparisons
g + stat_boxplot(coef = 1.5)
  x, y | ..lower.., ..middle.., ..upper.., ..outliers..
g + stat_ydensity(adjust = 1, kernel = "gaussian", scale = "area")
  x, y | ..density.., ..scaled.., ..count.., ..n.., ..violinwidth.., ..width..
Functions
f + stat_ecdf(n = 40)
  x, y | ..x.., ..y..
f + stat_quantile(quantiles = c(0.25, 0.5, 0.75), formula = y ~ log(x),
  method = "rq")
  x, y | ..quantile.., ..x.., ..y..
f + stat_smooth(method = "auto", formula = y ~ x, se = TRUE, n = 80,
  fullrange = FALSE, level = 0.95)
  x, y | ..se.., ..x.., ..y.., ..ymin.., ..ymax..
General Purpose
ggplot() + stat_function(aes(x = -3:3),
  fun = dnorm, n = 101, args = list(sd = 0.5))
  x | ..y..
f + stat_identity()
ggplot() + stat_qq(aes(sample = 1:100), distribution = qt,
  dparams = list(df = 5))
  sample, x, y | ..x.., ..y..
f + stat_sum()
  x, y, size | ..size..
f + stat_summary(fun.data = "mean_cl_boot")
f + stat_unique()
Scales
Scales control how a plot maps data values to the visual
values of an aesthetic. To change the mapping, add a
custom scale.
n <- b + geom_bar(aes(fill = fl))
n
n + scale_fill_manual(
  values = c("skyblue", "royalblue", "blue", "navy"),
  limits = c("d", "e", "p", "r"), breaks = c("d", "e", "p", "r"),
  name = "fuel", labels = c("D", "E", "P", "R"))
  scale_fill_manual: scale_ + aesthetic to adjust (fill) + prepackaged scale to use (manual)
  values: scale-specific arguments
  limits: range of values to include in mapping
  breaks: breaks to use in legend/axis
  name: title to use in legend/axis
  labels: labels to use in legend/axis
General Purpose scales
Use with any aesthetic:
alpha, color, fill, linetype, shape, size
scale_*_continuous() – map continuous values to visual values
scale_*_discrete() – map discrete values to visual values
scale_*_identity() – use data values as visual values
scale_*_manual(values = c()) – map discrete values to
manually chosen visual values
X and Y location scales
Use with x or y aesthetics (x shown here)
scale_x_date(labels = date_format("%m/%d"),
  breaks = date_breaks("2 weeks")) – treat x
values as dates. See ?strptime for label formats.
scale_x_datetime() – treat x values as date times. Use
same arguments as scale_x_date().
scale_x_log10() – Plot x on log10 scale
scale_x_reverse() – Reverse direction of x axis
scale_x_sqrt() – Plot x on square root scale
Color and fill scales
Discrete
n <- b + geom_bar(aes(fill = fl))
n + scale_fill_brewer(palette = "Blues")
For palette choices:
library(RColorBrewer)
display.brewer.all()
n + scale_fill_grey(
  start = 0.2, end = 0.8,
  na.value = "red")
Continuous
o <- a + geom_dotplot(aes(fill = ..x..))
o + scale_fill_gradient(
  low = "red",
  high = "yellow")
o + scale_fill_gradient2(
  low = "red", high = "blue",
  mid = "white", midpoint = 25)
o + scale_fill_gradientn(
  colours = terrain.colors(6))
Also: rainbow(), heat.colors(),
topo.colors(), cm.colors(),
RColorBrewer::brewer.pal()
Shape scales
p <- f + geom_point(aes(shape = fl))
p + scale_shape(solid = FALSE)
p + scale_shape_manual(values = c(3:7))
Manual shape values: the integers 0 to 25, or a single character such as "*", ".", "o", "O", "0", "+", "-", "|", "%", "#"
Size scales
q <- f + geom_point(aes(size = cyl))
q + scale_size_area(max_size = 6)
Value mapped to area of circle (not radius)
Coordinate Systems
r <- b + geom_bar()
r + coord_cartesian(xlim = c(0, 5))
  xlim, ylim
The default cartesian coordinate system
r + coord_fixed(ratio = 1/2)
  ratio, xlim, ylim
Cartesian coordinates with fixed aspect
ratio between x and y units
r + coord_flip()
  xlim, ylim
Flipped Cartesian coordinates
r + coord_polar(theta = "x", direction = 1)
  theta, start, direction
Polar coordinates
r + coord_trans(ytrans = "sqrt")
  xtrans, ytrans, limx, limy
Transformed cartesian coordinates. Set
xtrans and ytrans to the name
of a window function.
z + coord_map(projection = "ortho",
  orientation = c(41, -74, 0))
  projection, orientation, xlim, ylim
Map projections from the mapproj package
(mercator (default), azequalarea, lagrange, etc.)
Position Adjustments
Position adjustments determine how to arrange
geoms that would otherwise occupy the same space.
s <- ggplot(mpg, aes(fl, fill = drv))
s + geom_bar(position = "dodge")
Arrange elements side by side
s + geom_bar(position = "fill")
Stack elements on top of one another,
normalize height
s + geom_bar(position = "stack")
Stack elements on top of one another
f + geom_point(position = "jitter")
Add random noise to X and Y position
of each element to avoid overplotting
Each position adjustment can be recast as a function
with manual width and height arguments
s + geom_bar(position = position_dodge(width = 1))
Themes
r + theme_bw()
White background with grid lines
r + theme_grey()
Grey background (default theme)
r + theme_classic()
White background, no gridlines
r + theme_minimal()
Minimal theme
ggthemes – Package with additional ggplot2 themes
Faceting
Facets divide a plot into subplots based on the values
of one or more discrete variables.
t <- ggplot(mpg, aes(cty, hwy)) + geom_point()
t + facet_grid(. ~ fl)
facet into columns based on fl
t + facet_grid(year ~ .)
facet into rows based on year
t + facet_grid(year ~ fl)
facet into both rows and columns
t + facet_wrap(~ fl)
wrap facets into a rectangular layout
Set scales to let axis limits vary across facets
t + facet_grid(y ~ x, scales = "free")
x and y axis limits adjust to individual facets
• "free_x" – x axis limits adjust
• "free_y" – y axis limits adjust
Set labeller to adjust facet labels
t + facet_grid(. ~ fl, labeller = label_both)     # labels: fl: c, fl: d, fl: e, fl: p, fl: r
t + facet_grid(. ~ fl, labeller = label_bquote(alpha ^ .(x)))
t + facet_grid(. ~ fl, labeller = label_parsed)   # labels: c, d, e, p, r
Labels
t + ggtitle("New Plot Title")
Add a main title above the plot
t + xlab("New X label")
Change the label on the X axis
t + ylab("New Y label")
Change the label on the Y axis
t + labs(title = "New title", x = "New x", y = "New y")
All of the above
Use scale functions to update legend labels
Legends
t + theme(legend.position = "bottom")
Place legend at "bottom", "top", "left", or "right"
t + guides(color = "none")
Set legend type for each aesthetic: colorbar, legend,
or none (no legend)
t + scale_fill_discrete(name = "Title",
  labels = c("A", "B", "C"))
Set legend title and labels with a scale function.
Zooming
Without clipping (preferred)
t + coord_cartesian(
  xlim = c(0, 100), ylim = c(10, 20))
With clipping (removes unseen data points)
t + xlim(0, 100) + ylim(10, 20)
t + scale_x_continuous(limits = c(0, 100)) +
  scale_y_continuous(limits = c(0, 100))
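The difference between the two zooming approaches in this cheat sheet can be checked with a small sketch. It assumes ggplot2 is installed; the names zoomed, clipped, kept_when_zoomed and kept_when_clipped are invented here, and layer_data() is a ggplot2 helper that the cheat sheet does not mention. coord_cartesian() only changes the visible window, whereas xlim() and ylim() set scale limits that discard values outside the range before the layer is drawn.

```r
library(ggplot2)

t <- ggplot(mpg, aes(cty, hwy)) + geom_point()

# Zoom only: every row is still part of the plot data.
zoomed <- t + coord_cartesian(xlim = c(0, 100), ylim = c(10, 20))

# Scale limits: hwy values outside 10 to 20 are discarded before drawing.
clipped <- t + xlim(0, 100) + ylim(10, 20)

# Count the rows whose y value survives in each version.
kept_when_zoomed  <- sum(!is.na(layer_data(zoomed)$y))
kept_when_clipped <- sum(!is.na(layer_data(clipped)$y))
```

This matters most when a layer computes a statistic, such as a smooth or a box plot, because with scale limits the statistic is computed only from the points that remain.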
Graphical Primitives
Data Visualization
with ggplot2
Cheat Sheet
RStudio® is a trademark of RStudio, Inc. • CC BY RStudio • info@rstudio.com • 844-448-1212 • rstudio.com Learn more at docs.ggplot2.org • ggplot2 0.9.3.1 • Updated: 3/1
5
Geoms – Use a geom to represent data points, use the geom’s aesthetic properties to represent variables. Each function returns a layer.
One Variable
a + geom_area(stat = “bin”)
x, y, alpha, color, fill, linetype, size
b + geom_area(aes(y = ..density..), stat = “bin”)
a + geom_density(kernel = “gaussian”)
x, y, alpha, color, fill, linetype, size, weight
b + geom_density(aes(y = ..county..))
a + geom_dotplot()
x, y, alpha, color, fill
a + geom_freqpoly()
x, y, alpha, color, linetype, size
b + geom_freqpoly(aes(y = ..density..))
a + geom_histogram(binwidth = 5)
x, y, alpha, color, fill, linetype, size, weight
b + geom_histogram(aes(y = ..density..))
Discrete
b <- ggplot(mpg, aes(fl))
b + geom_bar()
x, alpha, color, fill, linetype, size, weight
Continuous
a <- ggplot(mpg, aes(hwy))
Two Variables
Continuous Function
Discrete X, Discrete Y
h <- ggplot(diamonds, aes(cut, color))
h + geom_jitter()
x, y, alpha, color, fill, shape, size
Discrete X, Continuous Y
g <- ggplot(mpg, aes(class, hwy))
g + geom_bar(stat = “identity”)
x, y, alpha, color, fill, linetype, size, weight
g + geom_boxplot()
lower, middle, upper, x, ymax, ymin, alpha,
color, fill, linetype, shape, size, weight
g + geom_dotplot(binaxis = “y”,
stackdir = “center”)
x, y, alpha, color, fill
g + geom_violin(scale = “area”)
x, y, alpha, color, fill, linetype, size, weight
Continuous X, Continuous Y
f <- ggplot(mpg, aes(cty, hwy))
f + geom_blank()
f + geom_jitter()
x, y, alpha, color, fill, shape, size
f + geom_point()
x, y, alpha, color, fill, shape, size
f + geom_quantile()
x, y, alpha, color, linetype, size, weight
f + geom_rug(sides = “bl”)
alpha, color, linetype, size
f + geom_smooth(model = lm)
x, y, alpha, color, fill, linetype, size, weight
f + geom_text(aes(label = cty))
x, y, label, alpha, angle, color, family, fontface,
hjust, lineheight, size, vjust
Three Variables
m + geom_contour(aes(z = z))
x, y, z, alpha, colour, linetype, size, weight
seals$z <- with(seals, sqrt(delta_long^2 + delta_lat^2)) m <- ggplot(seals, aes(long, lat))
j <- ggplot(economics, aes(date, unemploy)) j + geom_area()
x, y, alpha, color, fill, linetype, size
j + geom_line()
x, y, alpha, color, linetype, size
j + geom_step(direction = “hv”)
x, y, alpha, color, linetype, size
Continuous Bivariate Distribution
i <- ggplot(movies, aes(year, rating))
i + geom_bin2d(binwidth = c(5, 0.5))
xmax, xmin, ymax, ymin, alpha, color, fill,
linetype, size, weight
i + geom_density2d()
x, y, alpha, colour, linetype, size
i + geom_hex()
x, y, alpha, colour, fill size
e + geom_segment(aes(
xend = long + delta_long,
yend = lat + delta_lat))
x, xend, y, yend, alpha, color, linetype, size
e + geom_rect(aes(xmin = long, ymin = lat,
xmax= long + delta_long,
ymax = lat + delta_lat))
xmax, xmin, ymax, ymin, alpha, color, fill,
linetype, size
c + geom_polygon(aes(group = group))
x, y, alpha, color, fill, linetype, size
e <- ggplot(seals, aes(x = long, y = lat))
m + geom_raster(aes(fill = z), hjust=0.5,
vjust=0.5, interpolate=FALSE)
x, y, alpha, fill
m + geom_tile(aes(fill = z))
x, y, alpha, color, fill, linetype, size
k + geom_crossbar(fatten = 2)
x, y, ymax, ymin, alpha, color, fill, linetype,
size
k + geom_errorbar()
x, ymax, ymin, alpha, color, linetype, size,
width (also geom_errorbarh())
k + geom_linerange()
x, ymin, ymax, alpha, color, linetype, size
k + geom_pointrange()
x, y, ymin, ymax, alpha, color, fill, linetype,
shape, size
Visualizing error
df <- data.frame(grp = c("A", "B"), fit = 4:5, se = 1:2)
k <- ggplot(df, aes(grp, fit, ymin = fit-se, ymax = fit+se))
d + geom_path(lineend = "butt",
linejoin = "round", linemitre = 1)
x, y, alpha, color, linetype, size
d + geom_ribbon(aes(ymin = unemploy - 900,
ymax = unemploy + 900))
x, ymax, ymin, alpha, color, fill, linetype, size
d <- ggplot(economics, aes(date, unemploy))
c <- ggplot(map, aes(long, lat))
data <- data.frame(murder = USArrests$Murder, state = tolower(rownames(USArrests)))
map <- map_data("state")
l <- ggplot(data, aes(fill = murder))
l + geom_map(aes(map_id = state), map = map) +
expand_limits(x = map$long, y = map$lat)
map_id, alpha, color, fill, linetype, size
Maps
Basics
Build a graph with qplot() or ggplot()
ggplot2 is based on the grammar of graphics, the
idea that you can build every graph from the same
few components: a data set, a set of geoms—visual
marks that represent data points, and a coordinate
system.
To display data values, map variables in the data set
to aesthetic properties of the geom like size, color,
and x and y locations.
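As a minimal sketch of these components, the following assumes the mpg data set that ships with ggplot2. Mapping class to color is only an illustrative choice, and the coordinate system is written out even though it is the default.

```r
library(ggplot2)

# data: mpg; aesthetic mappings: cty -> x, hwy -> y, class -> color
p <- ggplot(data = mpg, mapping = aes(x = cty, y = hwy, color = class)) +
  geom_point() +      # geom: one point per row of mpg
  coord_cartesian()   # coordinate system (the default, written out explicitly)

p
```

Swapping geom_point() for another geom, or coord_cartesian() for another coordinate system, changes the picture without touching the data or the mappings.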
Graphical Primitives
Data Visualization
with ggplot2
Cheat Sheet
RStudio® is a trademark of RStudio, Inc. • CC BY RStudio • info@rstudio.com • 844-448-1212 • rstudio.com Learn more at docs.ggplot2.org • ggplot2 0.9.3.1 • Updated: 3/15
Geoms – Use a geom to represent data points, use the geom’s aesthetic properties to represent variables
Basics
One Variable
a + geom_area(stat = "bin")
x, y, alpha, color, fill, linetype, size
a + geom_area(aes(y = ..density..), stat = "bin")
a + geom_density(kernel = "gaussian")
x, y, alpha, color, fill, linetype, size, weight
a + geom_density(aes(y = ..count..))
a + geom_dotplot()
x, y, alpha, color, fill
a + geom_freqpoly()
x, y, alpha, color, linetype, size
a + geom_freqpoly(aes(y = ..density..))
a + geom_histogram(binwidth = 5)
x, y, alpha, color, fill, linetype, size, weight
a + geom_histogram(aes(y = ..density..))
Discrete
b <- ggplot(mpg, aes(fl))
b + geom_bar()
x, alpha, color, fill, linetype, size, weight
Continuous
a <- ggplot(mpg, aes(hwy))
Two Variables
Discrete X, Discrete Y
h <- ggplot(diamonds, aes(cut, color))
h + geom_jitter()
x, y, alpha, color, fill, shape, size
Discrete X, Continuous Y
g <- ggplot(mpg, aes(class, hwy))
g + geom_bar(stat = "identity")
x, y, alpha, color, fill, linetype, size, weight
g + geom_boxplot()
lower, middle, upper, x, ymax, ymin, alpha,
color, fill, linetype, shape, size, weight
g + geom_dotplot(binaxis = "y",
stackdir = "center")
x, y, alpha, color, fill
g + geom_violin(scale = "area")
x, y, alpha, color, fill, linetype, size, weight
Continuous X, Continuous Y
f <- ggplot(mpg, aes(cty, hwy))
f + geom_blank()
f + geom_jitter()
x, y, alpha, color, fill, shape, size
f + geom_point()
x, y, alpha, color, fill, shape, size
f + geom_quantile()
x, y, alpha, color, linetype, size, weight
f + geom_rug(sides = "bl")
alpha, color, linetype, size
f + geom_smooth(method = lm)
x, y, alpha, color, fill, linetype, size, weight
f + geom_text(aes(label = cty))
x, y, label, alpha, angle, color, family, fontface,
hjust, lineheight, size, vjust
Three Variables
i + geom_contour(aes(z = z))
x, y, z, alpha, colour, linetype, size, weight
seals$z <- with(seals, sqrt(delta_long^2 + delta_lat^2))
i <- ggplot(seals, aes(long, lat))
g <- ggplot(economics, aes(date, unemploy))
Continuous Function
g + geom_area()
x, y, alpha, color, fill, linetype, size
g + geom_line()
x, y, alpha, color, linetype, size
g + geom_step(direction = "hv")
x, y, alpha, color, linetype, size
Continuous Bivariate Distribution
h <- ggplot(movies, aes(year, rating))
h + geom_bin2d(binwidth = c(5, 0.5))
xmax, xmin, ymax, ymin, alpha, color, fill,
linetype, size, weight
h + geom_density2d()
x, y, alpha, colour, linetype, size
h + geom_hex()
x, y, alpha, colour, fill, size
d + geom_segment(aes(
xend = long + delta_long,
yend = lat + delta_lat))
x, xend, y, yend, alpha, color, linetype, size
d + geom_rect(aes(xmin = long, ymin = lat,
xmax= long + delta_long,
ymax = lat + delta_lat))
xmax, xmin, ymax, ymin, alpha, color, fill,
linetype, size
c + geom_polygon(aes(group = group))
x, y, alpha, color, fill, linetype, size
d<- ggplot(seals, aes(x = long, y = lat))
i + geom_raster(aes(fill = z), hjust=0.5,
vjust=0.5, interpolate=FALSE)
x, y, alpha, fill
i + geom_tile(aes(fill = z))
x, y, alpha, color, fill, linetype, size
e + geom_crossbar(fatten = 2)
x, y, ymax, ymin, alpha, color, fill, linetype,
size
e + geom_errorbar()
x, ymax, ymin, alpha, color, linetype, size,
width (also geom_errorbarh())
e + geom_linerange()
x, ymin, ymax, alpha, color, linetype, size
e + geom_pointrange()
x, y, ymin, ymax, alpha, color, fill, linetype,
shape, size
Visualizing error
df <- data.frame(grp = c("A", "B"), fit = 4:5, se = 1:2)
e <- ggplot(df, aes(grp, fit, ymin = fit-se, ymax = fit+se))
g + geom_path(lineend = "butt",
linejoin = "round", linemitre = 1)
x, y, alpha, color, linetype, size
g + geom_ribbon(aes(ymin = unemploy - 900,
ymax = unemploy + 900))
x, ymax, ymin, alpha, color, fill, linetype, size
g <- ggplot(economics, aes(date, unemploy))
map <- map_data("state")
c <- ggplot(map, aes(long, lat))
data <- data.frame(murder = USArrests$Murder, state = tolower(rownames(USArrests)))
e <- ggplot(data, aes(fill = murder))
e + geom_map(aes(map_id = state), map = map) +
expand_limits(x = map$long, y = map$lat)
map_id, alpha, color, fill, linetype, size
Maps
[Figure: data (columns F, M, A) + geom (x = F, y = A) + coordinate system = plot. A second panel adds the mappings color = F and size = A to the geom.]
ggsave("plot.png", width = 5, height = 5)
Saves last plot as a 5" x 5" file named "plot.png" in the
working directory. Matches file type to file extension.
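A short sketch of saving a plot follows. It assumes a temporary output path (the out_file name is hypothetical) and passes the plot explicitly rather than relying on the last plot displayed.

```r
library(ggplot2)

p <- ggplot(mpg, aes(cty, hwy)) + geom_point()

# Hypothetical output path; a temporary file keeps the working directory clean.
out_file <- tempfile(fileext = ".pdf")

# The ".pdf" extension selects the PDF device; width and height are in inches by default.
ggsave(out_file, plot = p, width = 5, height = 5)

file.exists(out_file)
```

Changing the extension of the file name (for example to ".png") is what switches the output format.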
qplot(x = cty, y = hwy, color = cyl, data = mpg, geom = "point")
Creates a complete plot with given data, geom, and
mappings. Supplies many useful defaults.
ggplot(data = mpg, aes(x = cty, y = hwy))
Begins a plot that you finish by adding layers to. No
defaults, but provides more control than qplot().
ggplot(mpg, aes(hwy, cty)) +
geom_point(aes(color = cyl)) +
geom_smooth(method = "lm") +
coord_cartesian() +
scale_color_gradient() +
theme_bw()
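[Annotations on the code above: ggplot() supplies the data and aesthetic mappings; + adds layers and other elements; each layer = geom + default stat + layer-specific mappings; the coord, scale, and theme calls are additional elements.]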
Add a new layer to a plot with a geom_*()
or stat_*() function. Each provides a geom, a
set of aesthetic mappings, and a default stat
and position adjustment.
last_plot()
Returns the last plot
Stats – An alternative way to build a layer Coordinate Systems
r + coord_cartesian(xlim = c(0, 5))
xlim, ylim
The default cartesian coordinate system
r + coord_fixed(ratio = 1/2)
ratio, xlim, ylim
Cartesian coordinates with fixed aspect
ratio between x and y units
r + coord_flip()
xlim, ylim
Flipped Cartesian coordinates
r + coord_polar(theta = "x", direction = 1)
theta, start, direction
Polar coordinates
r + coord_trans(ytrans = "sqrt")
xtrans, ytrans, limx, limy
Transformed Cartesian coordinates. Set
xtrans and ytrans to the name
of a window function.
r <- b + geom_bar()
Scales Faceting
t <- ggplot(mpg, aes(cty, hwy)) + geom_point()
Position Adjustments
s + geom_bar(position = "dodge")
Arrange elements side by side
s + geom_bar(position = "fill")
Stack elements on top of one another,
normalize height
s + geom_bar(position = "stack")
Stack elements on top of one another
f + geom_point(position = "jitter")
Add random noise to X and Y position
of each element to avoid overplotting
s <- ggplot(mpg, aes(fl, fill = drv))
Labels
t + ggtitle("New Plot Title")
Add a main title above the plot
t + xlab("New X label")
Change the label on the X axis
t + ylab("New Y label")
Change the label on the Y axis
t + labs(title = "New title", x = "New x", y = "New y")
All of the above
Legends
Zooming
Themes
Facets divide a plot into subplots based on the values
of one or more discrete variables.
t + facet_grid(. ~ fl)
facet into columns based on fl
t + facet_grid(year ~ .)
facet into rows based on year
t + facet_grid(year ~ fl)
facet into both rows and columns
t + facet_wrap(~ fl)
wrap facets into a rectangular layout
Set scales to let axis limits vary across facets
t + facet_grid(y ~ x, scales = "free")
x and y axis limits adjust to individual facets
• "free_x" – x axis limits adjust
• "free_y" – y axis limits adjust
Set labeller to adjust facet
labels
t + facet_grid(. ~ fl, labeller = label_both)
t + facet_grid(. ~ fl, labeller = label_bquote(alpha ^ .(x)))
t + facet_grid(. ~ fl, labeller = label_parsed)
Position adjustments determine how to arrange
geoms that would otherwise occupy the same space.
Each position adjustment can be recast as a function
with manual width and height arguments
s + geom_bar(position = position_dodge(width = 1))
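The position adjustments described here can be compared side by side. This is a sketch only: it uses layer_data() to inspect the computed bar geometry, and the object names stacked, filled, and dodged are illustrative.

```r
library(ggplot2)

s <- ggplot(mpg, aes(fl, fill = drv))

stacked <- s + geom_bar(position = "stack")                    # bars piled on top of one another
filled  <- s + geom_bar(position = "fill")                     # stacked, then rescaled to a common height
dodged  <- s + geom_bar(position = position_dodge(width = 1))  # bars placed side by side

# layer_data() returns the computed bar geometry for inspection.
max(layer_data(stacked)$ymax)   # tallest stack, in counts
max(layer_data(filled)$ymax)    # tallest stack after normalizing
range(layer_data(dodged)$ymin)  # where the dodged bars start
```

Passing position_dodge(width = 1) instead of the string "dodge" is what exposes the width argument.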
r + theme_classic()
White background
no gridlines
r + theme_minimal()
Minimal theme
Without clipping (preferred)
t + coord_cartesian(
xlim = c(0, 100), ylim = c(10, 20))
With clipping (removes unseen data points)
t + xlim(0, 100) + ylim(10, 20)
t + scale_x_continuous(limits = c(0, 100)) +
scale_y_continuous(limits = c(0, 100))
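The difference between the two zooming approaches can be checked directly. The sketch below assumes limits of 10 to 20 on cty (chosen only so that some points fall outside) and counts how many x values survive in each built plot.

```r
library(ggplot2)

t <- ggplot(mpg, aes(cty, hwy)) + geom_point()

zoomed  <- t + coord_cartesian(xlim = c(10, 20))  # changes the viewport, leaves the data alone
clipped <- t + xlim(10, 20)                       # drops cty values outside the limits

# Count the x values still present after each plot is built.
n_kept_zoomed  <- sum(!is.na(layer_data(zoomed)$x))
n_kept_clipped <- sum(!is.na(layer_data(clipped)$x))
c(zoomed = n_kept_zoomed, clipped = n_kept_clipped)
```

Because xlim() discards data before any statistics are computed, layers such as geom_smooth() can give different fits under the two approaches.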
t + theme(legend.position = "bottom")
Place legend at "bottom", "top", "left", or "right"
t + guides(color = "none")
Set legend type for each aesthetic: colorbar, legend,
or none (no legend)
t + scale_fill_discrete(name = "Title",
labels = c("A", "B", "C"))
Set legend title and labels with a scale function.
Each stat creates additional variables to map aesthetics
to. These variables use a common ..name.. syntax.
stat functions and geom functions both combine a stat
with a geom to make a layer, i.e. stat_bin(geom = "bar")
does the same as geom_bar(stat = "bin")
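The stat/geom equivalence can be sketched as follows. Note the assumptions: the example uses the count stat on the discrete variable fl rather than the bin stat named above, and the last line assumes ggplot2 3.3.0 or later, where after_stat() is the replacement for the ..name.. syntax.

```r
library(ggplot2)

a <- ggplot(mpg, aes(fl))

via_geom <- a + geom_bar(stat = "count")   # geom function with its stat named
via_stat <- a + stat_count(geom = "bar")   # stat function with its geom named

# Both layers compute the same per-category counts.
identical(layer_data(via_geom)$count, layer_data(via_stat)$count)

# Assumption: ggplot2 >= 3.3.0, where after_stat() replaces the ..name.. syntax.
proportions <- a + geom_bar(aes(y = after_stat(count / sum(count))))
```

Mapping y to a computed variable, as in the proportions plot, is the typical reason to reach for a stat variable at all.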
[Figure: data (fl, cty, cyl) → stat (x, ..count..) → geom (x = x, y = ..count..) + coordinate system = plot.]
ggplot() + stat_function(aes(x = -3:3),
fun = dnorm, n = 101, args = list(sd=0.5))
x | ..y..
f + stat_identity()
ggplot() + stat_qq(aes(sample=1:100), distribution = qt,
dparams = list(df=5))
sample, x, y | ..x.., ..y..
f + stat_sum()
x, y, size | ..size..
f + stat_summary(fun.data = "mean_cl_boot")
f + stat_unique()
i + stat_density2d(aes(fill = ..level..),
geom = "polygon", n = 100)
stat function
layer specific
mappings
variable created
by transformation
geom for layer parameters for stat
a + stat_bin(binwidth = 1, origin = 10)
x, y | ..count.., ..ncount.., ..density.., ..ndensity..
a + stat_bindot(binwidth = 1, binaxis = "x")
x, y, | ..count.., ..ncount..
a + stat_density(adjust = 1, kernel = "gaussian")
x, y, | ..count.., ..density.., ..scaled..
f + stat_bin2d(bins = 30, drop = TRUE)
x, y, fill | ..count.., ..density..
f + stat_binhex(bins = 30)
x, y, fill | ..count.., ..density..
f + stat_density2d(contour = TRUE, n = 100)
x, y, color, size | ..level..
m + stat_contour(aes(z = z))
x, y, z, order | ..level..
m+ stat_spoke(aes(radius= z, angle = z))
angle, radius, x, xend, y, yend | ..x.., ..xend.., ..y.., ..yend..
m + stat_summary_hex(aes(z = z), bins = 30, fun = mean)
x, y, z, fill | ..value..
m + stat_summary2d(aes(z = z), bins = 30, fun = mean)
x, y, z, fill | ..value..
g + stat_boxplot(coef = 1.5)
x, y | ..lower.., ..middle.., ..upper.., ..outliers..
g + stat_ydensity(adjust = 1, kernel = "gaussian", scale = "area")
x, y | ..density.., ..scaled.., ..count.., ..n.., ..violinwidth.., ..width..
f + stat_ecdf(n = 40)
x, y | ..x.., ..y..
f + stat_quantile(quantiles = c(0.25, 0.5, 0.75), formula = y ~ log(x),
method = "rq")
x, y | ..quantile.., ..x.., ..y..
f + stat_smooth(method = "auto", formula = y ~ x, se = TRUE, n = 80,
fullrange = FALSE, level = 0.95)
x, y | ..se.., ..x.., ..y.., ..ymin.., ..ymax..
1D distributions
2D distributions
3 Variables
Comparisons
Functions
General Purpose
Scales control how a plot maps data values to the visual
values of an aesthetic. To change the mapping, add a
custom scale.
n <- b + geom_bar(aes(fill = fl))
n
n + scale_fill_manual(
values = c("skyblue", "royalblue", "blue", "navy"),
limits = c("d", "e", "p", "r"), breaks = c("d", "e", "p", "r"),
name = "fuel", labels = c("D", "E", "P", "R"))
scale_ aesthetic
to adjust
prepackaged
scale to use
scale specific
arguments
range of values to
include in mapping
title to use in
legend/axis
labels to use in
legend/axis
breaks to use in
legend/axis
General Purpose scales
Use with any aesthetic:
alpha, color, fill, linetype, shape, size
scale_*_continuous() – map cont’ values to visual values
scale_*_discrete() – map discrete values to visual values
scale_*_identity() – use data values as visual values
scale_*_manual(values = c()) – map discrete values to
manually chosen visual values
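A manual scale can be sketched as below. The palette fuel_colors and the legend labels are hypothetical; the named vector simply assigns one color to each level of fl.

```r
library(ggplot2)

n <- ggplot(mpg, aes(fl)) + geom_bar(aes(fill = fl))

# Hypothetical palette: one named color per level of fl.
fuel_colors <- c(c = "grey50", d = "skyblue", e = "royalblue", p = "blue", r = "navy")

n_manual <- n + scale_fill_manual(
  values = fuel_colors,
  name   = "fuel",
  labels = c("C", "D", "E", "P", "R")
)

# The fill column of the built layer holds the colors actually used.
unique(layer_data(n_manual)$fill)
```

Naming the elements of values ties each color to a specific level, so the mapping does not depend on the order of the levels.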
X and Y location scales
Color and fill scales
Shape scales
Size scales
Use with x or y aesthetics (x shown here)
scale_x_date(labels = date_format("%m/%d"),
breaks = date_breaks("2 weeks")) – treat x
values as dates. See ?strptime for label formats.
scale_x_datetime() – treat x values as date times. Use
same arguments as scale_x_date().
scale_x_log10() – Plot x on log10 scale
scale_x_reverse() – Reverse direction of x axis
scale_x_sqrt() – Plot x on square root scale
Discrete Continuous
n <- b + geom_bar(
aes(fill = fl))
o <- a + geom_dotplot(
aes(fill = ..x..))
n + scale_fill_brewer(
palette = "Blues")
For palette choices:
library(RColorBrewer)
display.brewer.all()
n + scale_fill_grey(
start = 0.2, end = 0.8,
na.value = "red")
o + scale_fill_gradient(
low = "red",
high = "yellow")
o + scale_fill_gradient2(
low = "red", high = "blue",
mid = "white", midpoint = 25)
o + scale_fill_gradientn(
colours = terrain.colors(6))
Also: rainbow(), heat.colors(),
topo.colors(), cm.colors(),
RColorBrewer::brewer.pal()
p <- f + geom_point( aes(shape = fl))
p + scale_shape(
solid = FALSE)
p + scale_shape_manual(
values = c(3:7))
Shape values shown in
chart on right
Manual shape values: the integers 0 to 25 select the built-in point symbols; a single character such as "*", ".", "o", "O", "0", "+", "-", "|", "%", or "#" can also be used as a shape.
q <- f + geom_point( aes(size = cyl))
q + scale_size_area(max_size = 6)
Value mapped to area of circle
(not radius)
ggthemes – Package with additional ggplot2 themes
z + coord_map(projection = "ortho",
orientation = c(41, -74, 0))
projection, orientation, xlim, ylim
Map projections from the mapproj package
(mercator (default), azequalarea, lagrange, etc.)
Use scale functions
to update legend
labels
r + theme_bw()
White background
with grid lines
r + theme_grey()
Grey background
(default theme)
Some plots visualize a transformation of the original data set.
Use a stat to choose a common transformation to visualize,
e.g. a + geom_bar(stat = "bin")
Data Visualization with ggplot2 : : CHEAT SHEET
ggplot2 is based on the grammar of graphics, the idea
that you can build every graph from the same
components: a data set, a coordinate system,
and geoms—visual marks that represent data points.
Basics
GRAPHICAL PRIMITIVES
a + geom_blank()
(Useful for expanding limits)
b + geom_curve(aes(yend = lat + 1,
xend = long + 1), curvature = 1) – x, xend, y, yend,
alpha, angle, color, curvature, linetype, size
a + geom_path(lineend = "butt", linejoin = "round",
linemitre = 1)
x, y, alpha, color, group, linetype, size
a + geom_polygon(aes(group = group))
x, y, alpha, color, fill, group, linetype, size
b + geom_rect(aes(xmin = long, ymin = lat, xmax =
long + 1, ymax = lat + 1)) – xmax, xmin, ymax,
ymin, alpha, color, fill, linetype, size
a + geom_ribbon(aes(ymin = unemploy - 900,
ymax = unemploy + 900)) – x, ymax, ymin,
alpha, color, fill, group, linetype, size
To display values, map variables in the data to visual
properties of the geom (aesthetics) like size, color, and x
and y locations.
[Figure: data (columns F, M, A) + geom (x = F, y = A) + coordinate system = plot. A second panel adds color = F and size = A.]
Complete the template below to build a graph.
required
ggplot(data = mpg, aes(x = cty, y = hwy)) Begins a plot
that you finish by adding layers to. Add one geom
function per layer.
qplot(x = cty, y = hwy, data = mpg, geom = "point")
Creates a complete plot with given data, geom, and
mappings. Supplies many useful defaults.
last_plot() Returns the last plot
ggsave("plot.png", width = 5, height = 5) Saves last plot
as a 5" x 5" file named "plot.png" in the working directory.
Matches file type to file extension.
LINE SEGMENTS
b + geom_abline(aes(intercept=0, slope=1))
b + geom_hline(aes(yintercept = lat))
b + geom_vline(aes(xintercept = long))
common aesthetics: x, y, alpha, color, linetype, size
b + geom_segment(aes(yend=lat+1, xend=long+1))
b + geom_spoke(aes(angle = 1:1155, radius = 1))
a <- ggplot(economics, aes(date, unemploy))
b <- ggplot(seals, aes(x = long, y = lat))
ONE VARIABLE continuous
c <- ggplot(mpg, aes(hwy)); c2 <- ggplot(mpg)
c + geom_area(stat = "bin")
x, y, alpha, color, fill, linetype, size
c + geom_density(kernel = "gaussian")
x, y, alpha, color, fill, group, linetype, size, weight
c + geom_dotplot()
x, y, alpha, color, fill
c + geom_freqpoly() x, y, alpha, color, group,
linetype, size
c + geom_histogram(binwidth = 5) x, y, alpha,
color, fill, linetype, size, weight
c2 + geom_qq(aes(sample = hwy)) x, y, alpha,
color, fill, linetype, size, weight
discrete
d <- ggplot(mpg, aes(fl))
d + geom_bar()
x, alpha, color, fill, linetype, size, weight
e + geom_label(aes(label = cty), nudge_x = 1,
nudge_y = 1, check_overlap = TRUE) x, y, label,
alpha, angle, color, family, fontface, hjust,
lineheight, size, vjust
e + geom_jitter(height = 2, width = 2)
x, y, alpha, color, fill, shape, size
e + geom_point() x, y, alpha, color, fill, shape,
size, stroke
e + geom_quantile() x, y, alpha, color, group,
linetype, size, weight
e + geom_rug(sides = "bl") x, y, alpha, color,
linetype, size
e + geom_smooth(method = lm) x, y, alpha,
color, fill, group, linetype, size, weight
e + geom_text(aes(label = cty), nudge_x = 1,
nudge_y = 1, check_overlap = TRUE) x, y, label,
alpha, angle, color, family, fontface, hjust,
lineheight, size, vjust
discrete x , continuous y
f <- ggplot(mpg, aes(class, hwy))
f + geom_col() x, y, alpha, color, fill, group,
linetype, size
f + geom_boxplot() x, y, lower, middle, upper,
ymax, ymin, alpha, color, fill, group, linetype,
shape, size, weight
f + geom_dotplot(binaxis = "y", stackdir =
"center") x, y, alpha, color, fill, group
f + geom_violin(scale = "area") x, y, alpha, color,
fill, group, linetype, size, weight
discrete x , discrete y
g <- ggplot(diamonds, aes(cut, color))
g + geom_count() x, y, alpha, color, fill, shape,
size, stroke
THREE VARIABLES
seals$z <- with(seals, sqrt(delta_long^2 + delta_lat^2)); l <- ggplot(seals, aes(long, lat))
l + geom_contour(aes(z = z))
x, y, z, alpha, colour, group, linetype,
size, weight
l + geom_raster(aes(fill = z), hjust=0.5, vjust=0.5,
interpolate=FALSE)
x, y, alpha, fill
l + geom_tile(aes(fill = z)) x, y, alpha, color, fill,
linetype, size, width
h + geom_bin2d(binwidth = c(0.25, 500))
x, y, alpha, color, fill, linetype, size, weight
h + geom_density2d()
x, y, alpha, colour, group, linetype, size
h + geom_hex()
x, y, alpha, colour, fill, size
i + geom_area()
x, y, alpha, color, fill, linetype, size
i + geom_line()
x, y, alpha, color, group, linetype, size
i + geom_step(direction = "hv")
x, y, alpha, color, group, linetype, size
j + geom_crossbar(fatten = 2)
x, y, ymax, ymin, alpha, color, fill, group, linetype,
size
j + geom_errorbar() x, ymax, ymin, alpha, color,
group, linetype, size, width (also
geom_errorbarh())
j + geom_linerange()
x, ymin, ymax, alpha, color, group, linetype, size
j + geom_pointrange()
x, y, ymin, ymax, alpha, color, fill, group, linetype,
shape, size
continuous function
i <- ggplot(economics, aes(date, unemploy))
visualizing error
df <- data.frame(grp = c("A", "B"), fit = 4:5, se = 1:2)
j <- ggplot(df, aes(grp, fit, ymin = fit-se, ymax = fit+se))
maps
data <- data.frame(murder = USArrests$Murder,
state = tolower(rownames(USArrests)))
map <- map_data("state")
k <- ggplot(data, aes(fill = murder))
k + geom_map(aes(map_id = state), map = map) +
expand_limits(x = map$long, y = map$lat)
map_id, alpha, color, fill, linetype, size
Not
required,
sensible
defaults
supplied
Geoms Use a geom function to represent data points, use the geom’s aesthetic properties to represent variables.
Each function returns a layer.
TWO VARIABLES
continuous x , continuous y
e <- ggplot(mpg, aes(cty, hwy))
continuous bivariate distribution
h <- ggplot(diamonds, aes(carat, price))
RStudio® is a trademark of RStudio, Inc. • CC BY SA RStudio • info@rstudio.com • 844-448-1212 • rstudio.com • Learn more at http://ggplot2.tidyverse.org • ggplot2 3.1.0 • Updated: 2018-12
ggplot (data = ) +
stat =
Scales Coordinate Systems
A stat builds new variables to plot (e.g., count, prop).
Stats An alternative way to build a layer
[Figure: data (fl, cty, cyl) → stat (x, ..count..) → geom (x = x, y = ..count..) + coordinate system = plot.]
Visualize a stat by changing the default stat of a geom
function, geom_bar(stat = "count"), or by using a stat
function, stat_count(geom = "bar"), which calls a default
geom to make a layer (equivalent to a geom function).
Use ..name.. syntax to map stat variables to aesthetics.
i + stat_density2d(aes(fill = ..level..),
geom = "polygon")
[Annotations: stat_density2d is the stat function, aes(...) holds the geom mappings, ..level.. is a variable created by the stat, and geom = "polygon" is the geom to use.]
c + stat_bin(binwidth = 1, origin = 10)
x, y | ..count.., ..ncount.., ..density.., ..ndensity..
c + stat_count(width = 1) x, y, | ..count.., ..prop..
c + stat_density(adjust = 1, kernel = "gaussian")
x, y, | ..count.., ..density.., ..scaled..
e + stat_bin_2d(bins = 30, drop = T)
x, y, fill | ..count.., ..density..
e + stat_bin_hex(bins=30) x, y, fill | ..count.., ..density..
e + stat_density_2d(contour = TRUE, n = 100)
x, y, color, size | ..level..
e + stat_ellipse(level = 0.95, segments = 51, type = "t")
l + stat_contour(aes(z = z)) x, y, z, order | ..level..
l + stat_summary_hex(aes(z = z), bins = 30, fun = max)
x, y, z, fill | ..value..
l + stat_summary_2d(aes(z = z), bins = 30, fun = mean)
x, y, z, fill | ..value..
f + stat_boxplot(coef = 1.5) x, y | ..lower..,
..middle.., ..upper.., ..width.. , ..ymin.., ..ymax..
f + stat_ydensity(kernel = "gaussian", scale = "area") x, y |
..density.., ..scaled.., ..count.., ..n.., ..violinwidth.., ..width..
e + stat_ecdf(n = 40) x, y | ..x.., ..y..
e + stat_quantile(quantiles = c(0.1, 0.9), formula = y ~
log(x), method = "rq") x, y | ..quantile..
e + stat_smooth(method = "lm", formula = y ~ x, se=T,
level=0.95) x, y | ..se.., ..x.., ..y.., ..ymin.., ..ymax..
ggplot() + stat_function(aes(x = -3:3), n = 99, fun =
dnorm, args = list(sd=0.5)) x | ..x.., ..y..
e + stat_identity(na.rm = TRUE)
ggplot() + stat_qq(aes(sample=1:100), dist = qt,
dparam=list(df=5)) sample, x, y | ..sample.., ..theoretical..
e + stat_sum() x, y, size | ..n.., ..prop..
e + stat_summary(fun.data = "mean_cl_boot")
h + stat_summary_bin(fun.y = "mean", geom = "bar")
e + stat_unique()
Scales map data values to the visual values of an
aesthetic. To change a mapping, add a new scale.
(n <- d + geom_bar(aes(fill = fl)))
n + scale_fill_manual(
values = c("skyblue", "royalblue", "blue", "navy"),
limits = c("d", "e", "p", "r"), breaks = c("d", "e", "p", "r"),
name = "fuel", labels = c("D", "E", "P", "R"))
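[Annotations: the function name is scale_ + the aesthetic to adjust (fill) + the prepackaged scale to use (manual); values is a scale-specific argument; limits is the range of values to include in the mapping; breaks are the breaks to use in the legend/axis; name is the title to use in the legend/axis; labels are the labels to use in the legend/axis.]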
GENERAL PURPOSE SCALES
Use with most aesthetics
scale_*_continuous() – map cont’ values to visual ones
scale_*_discrete() – map discrete values to visual ones
scale_*_identity() – use data values as visual ones
scale_*_manual(values = c()) – map discrete values to
manually chosen visual ones
scale_*_date(date_labels = "%m/%d", date_breaks = "2
weeks") – treat data values as dates.
scale_*_datetime() – treat data values as date times.
Use same arguments as scale_x_date(). See ?strptime for
label formats.
X & Y LOCATION SCALES
Use with x or y aesthetics (x shown here)
scale_x_log10() – Plot x on log10 scale
scale_x_reverse() – Reverse direction of x axis
scale_x_sqrt() – Plot x on square root scale
COLOR AND FILL SCALES (DISCRETE)
n <- d + geom_bar(aes(fill = fl))
n + scale_fill_brewer(palette = "Blues")
For palette choices:
RColorBrewer::display.brewer.all()
n + scale_fill_grey(start = 0.2, end = 0.8,
na.value = "red")
COLOR AND FILL SCALES (CONTINUOUS)
o <- c + geom_dotplot(aes(fill = ..x..))
o + scale_fill_distiller(palette = "Blues")
o + scale_fill_gradient(low = "red", high = "yellow")
o + scale_fill_gradient2(low = "red", high = "blue",
mid = "white", midpoint = 25)
o + scale_fill_gradientn(colours=topo.colors(6))
Also: rainbow(), heat.colors(), terrain.colors(),
cm.colors(), RColorBrewer::brewer.pal()
SHAPE AND SIZE SCALES
p <- e + geom_point(aes(shape = fl, size = cyl))
p + scale_shape() + scale_size()
p + scale_shape_manual(values = c(3:7))
p + scale_radius(range = c(1,6))
p + scale_size_area(max_size = 6)
r <- d + geom_bar()
r + coord_cartesian(xlim = c(0, 5))
xlim, ylim
The default cartesian coordinate system
r + coord_fixed(ratio = 1/2)
ratio, xlim, ylim
Cartesian coordinates with fixed aspect ratio between x and y units
r + coord_flip()
xlim, ylim
Flipped Cartesian coordinates
r + coord_polar(theta = "x", direction = 1)
theta, start, direction
Polar coordinates
r + coord_trans(ytrans = "sqrt")
xtrans, ytrans, limx, limy
Transformed cartesian coordinates. Set xtrans and
ytrans to the name of a window function.
π + coord_quickmap()
π + coord_map(projection = "ortho",
orientation = c(41, -74, 0))
projection, xlim, ylim
Map projections from the mapproj package
(mercator (default), azequalarea, lagrange, etc.)
Position Adjustments
Position adjustments determine how to arrange geoms
that would otherwise occupy the same space.
s <- ggplot(mpg, aes(fl, fill = drv))
s + geom_bar(position = "dodge")
Arrange elements side by side
s + geom_bar(position = "fill")
Stack elements on top of one another, normalize height
e + geom_point(position = "jitter")
Add random noise to X and Y position of each element to avoid overplotting
e + geom_label(position = "nudge")
Nudge labels away from points
s + geom_bar(position = "stack")
Stack elements on top of one another
Each position adjustment can be recast as a function with
manual width and height arguments
s + geom_bar(position = position_dodge(width = 1))
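One way to see what each position adjustment does is to inspect the bar extents ggplot2 computes. The sketch below assumes ggplot2 is installed and uses `layer_data()` on the same `s` plot; the variable names `stacked`, `filled`, and `dodged` are illustrative only.

```r
library(ggplot2)

# Same base plot as `s` in the cheat sheet
s <- ggplot(mpg, aes(fl, fill = drv))

# layer_data() returns the data frame computed for a layer,
# including ymin/ymax after the position adjustment is applied
stacked <- layer_data(s + geom_bar(position = "stack"))
filled  <- layer_data(s + geom_bar(position = "fill"))
dodged  <- layer_data(s + geom_bar(position = "dodge"))

# "stack": the tallest stack reaches the total count of the largest fl group
max(stacked$ymax)
# "fill": every stack is rescaled so its top sits at 1
max(filled$ymax)
# "dodge": bars sit side by side, so the tallest bar is one fl/drv cell
max(dodged$ymax)
```

Comparing the three maxima against `table(mpg$fl)` and `table(mpg$fl, mpg$drv)` is a quick sanity check before looking at the plots themselves.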
Themes
r + theme_bw()
White background
with grid lines
r + theme_gray()
Grey background
(default theme)
r + theme_dark()
dark for contrast
r + theme_classic()
r + theme_light()
r + theme_linedraw()
r + theme_minimal()
Minimal themes
r + theme_void()
Empty theme
Faceting
Facets divide a plot into subplots based on the values of one or more discrete variables.
t <- ggplot(mpg, aes(cty, hwy)) + geom_point()
t + facet_grid(cols = vars(fl))
facet into columns based on fl
t + facet_grid(rows = vars(year))
facet into rows based on year
t + facet_grid(rows = vars(year), cols = vars(fl))
facet into both rows and columns
t + facet_wrap(vars(fl))
wrap facets into a rectangular layout
Set scales to let axis limits vary across facets
t + facet_grid(rows = vars(drv), cols = vars(fl),
  scales = "free")
x and y axis limits adjust to individual facets
"free_x" - x axis limits adjust
"free_y" - y axis limits adjust
Set labeller to adjust facet labels
t + facet_grid(cols = vars(fl), labeller = label_both)
t + facet_grid(rows = vars(fl),
labeller = label_bquote(alpha ^ .(fl)))
label_both strip labels: fl: c, fl: d, fl: e, fl: p, fl: r
label_bquote(alpha ^ .(fl)) strip labels: α^c, α^d, α^e, α^p, α^r
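As a rough check of how faceting splits the data, the sketch below (assuming ggplot2 3.0 or later, where `vars()` is accepted by `facet_wrap()`) counts the panels and the points per panel. `wrapped`, `n_panels`, and `points_per_panel` are illustrative names.

```r
library(ggplot2)

# Same scatterplot as `t` in the cheat sheet
t <- ggplot(mpg, aes(cty, hwy)) + geom_point()

# Each row of the computed layer data carries the PANEL it is drawn in
wrapped <- layer_data(t + facet_wrap(vars(fl)))

# One panel per distinct fuel type
n_panels <- length(unique(wrapped$PANEL))

# Faceting only partitions the rows; no observation is dropped
points_per_panel <- table(wrapped$PANEL)
n_panels
points_per_panel
```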
Labels
t + labs(x = "New x axis label", y = "New y axis label",
  title = "Add a title above the plot",
  subtitle = "Add a subtitle below title",
  caption = "Add a caption below plot")
Use scale functions to update legend labels.
t + annotate(geom = "text", x = 8, y = 9, label = "A")
annotate() places a geom at manually specified values for the geom's aesthetics.
Legends
n + theme(legend.position = "bottom")
Place legend at "bottom", "top", "left", or "right"
n + guides(fill = "none")
Set legend type for each aesthetic: colorbar, legend, or
none (no legend)
n + scale_fill_discrete(name = "Title",
  labels = c("A", "B", "C", "D", "E"))
Set legend title and labels with a scale function.
Zooming
Without clipping (preferred)
t + coord_cartesian(
xlim = c(0, 100), ylim = c(10, 20))
With clipping (removes unseen data points)
t + xlim(0, 100) + ylim(10, 20)
t + scale_x_continuous(limits = c(0, 100)) +
scale_y_continuous(limits = c(0, 100))
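The difference between the two zooming approaches is easiest to see with a layer that computes a statistic. The sketch below is an illustration that swaps the scatterplot for a histogram of `hwy` (the plot `h` and the limits 20 to 30 are assumptions chosen for the example, not part of the cheat sheet).

```r
library(ggplot2)

# Histogram of highway mileage; the stat counts cars per bin
h <- ggplot(mpg, aes(hwy)) + geom_histogram(binwidth = 1)

# Zoom without clipping: the stat still sees every observation
zoomed <- layer_data(h + coord_cartesian(xlim = c(20, 30)))

# Zoom with clipping: values outside the limits are discarded before
# the stat runs (ggplot2 warns about the removed rows)
clipped <- suppressWarnings(layer_data(h + xlim(20, 30)))

n_zoomed  <- sum(zoomed$count)   # all cars are still counted
n_clipped <- sum(clipped$count)  # only cars inside the limits are counted
c(zoomed = n_zoomed, clipped = n_clipped)
```

Because scale limits change the data a statistic receives, summaries such as smooths or boxplots can shift under `xlim()`/`ylim()`, which is why `coord_cartesian()` is marked as preferred.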
RStudio® is a trademark of RStudio, Inc. • CC BY SA RStudio • info@rstudio.com • 844-448-1212 • rstudio.com • Learn more at http://ggplot2.tidyverse.org • ggplot2 3.1.0 • Updated: 2018-12
RStudio IDE : : CHEAT SHEET
Write Code
Pro Features
RStudio® is a trademark of RStudio, Inc. • CC BY SA RStudio • info@rstudio.com • 844-448-1212 • rstudio.com • Learn more at www.rstudio.com • RStudio IDE 0.99.832 • Updated: 2016-01
Turn project into package. Enable roxygen documentation with Tools > Project Options > Build Tools
Roxygen guide at Help > Roxygen Quick Reference
File > New Project > New Directory > R Package
Share Project with Collaborators
Active shared collaborators
Select R Version
Start new R Session in current project
Close R Session in project
RStudio saves the call history, workspace, and working directory associated with a project. It reloads each when you re-open a project.
Name of current project
View() opens spreadsheet-like view of data set
Sort by values
Filter rows by value or value range
Search for value
Viewer Pane displays HTML content, such as Shiny apps, RMarkdown reports, and interactive visualizations
Stop Shiny app
Publish to shinyapps.io, rpubs, RSConnect, …
Refresh
RStudio opens documentation in a dedicated Help pane
Home page of helpful links
Search within help file
Search for help file
GUI Package manager lists every installed package
Click to load package with library(). Unclick to detach package with detach()
Delete from library
Install Packages
Update Packages
Create reproducible package library for your project
Package version installed
RStudio opens plots in a dedicated Plots pane
Navigate recent plots
Open in window
Export plot
Delete plot
Delete all plots
Examine variables in executing environment
Open with debug(), browser(), or a breakpoint. RStudio will open the debugger mode when it encounters a breakpoint while executing code.
Open traceback to examine the functions that R called before the error occurred
Launch debugger mode from origin of error
Click next to line number to add/remove a breakpoint.
Select function in traceback to debug
Highlighted line shows where execution has paused
Run commands in environment where execution has paused
Step through code one line at a time
Step into and out of functions to run
Resume execution
Quit debug mode
Open Shiny, R Markdown, knitr, Sweave, LaTeX, .Rd files and more in Source Pane
Check spelling
Render output
Choose output format
Choose output location
Insert code chunk
Jump to previous chunk
Jump to next chunk
Run selected lines
Publish to server
Show file outline
Set knitr chunk options
Run this and all previous code chunks
Run this code chunk
Jump to chunk
RStudio recognizes that files named app.R, server.R, ui.R, and global.R belong to a shiny app
Run app
Choose location to view app
Publish to shinyapps.io or server
Manage publish accounts
Access markdown guide at Help > Markdown Quick Reference
Stage files
Show file diff
Commit staged files
Push/Pull to remote
View History
Current branch
File status codes: A = Added, D = Deleted, M = Modified, R = Renamed, ? = Untracked
Turn on at Tools > Project Options > Git/SVN
Open shell to type commands
Search inside environment
Syntax highlighting based on your file's extension
Code diagnostics that appear in the margin. Hover over diagnostic symbols for details.
Tab completion to finish function names, file paths, arguments, and more.
Multi-language code snippets to quickly use common blocks of code.
Open in new window
Save
Find and replace
Compile as notebook
Run selected code
Re-run previous code
Source with or without Echo
Show file outline
Jump to function in file
Change file type
Navigate tabs
A File browser keyed to your working directory. Click on file or directory name to open.
Path to displayed directory
Upload file
Create folder
Delete file
Rename file
Change directory
Displays saved objects by type with short description
View function source code
View in data viewer
Load workspace
Save workspace
Import data with wizard
Delete all saved objects
Display objects as list or grid
Choose environment to display from list of parent environments
History of past commands to run/copy
Display .RPres slideshows: File > New File > R Presentation
Working Directory
Maximize, minimize panes
Drag pane boundaries
Cursors of shared users
File > New Project
Press ↑ to see command history
Multiple cursors/column selection with Alt + mouse drag.
Documents and Apps
R Support
PROJECT SYSTEM
Debug Mode
Version Control with Git or SVN
Package Writing
3 NAVIGATE CODE Windows/Linux Mac
Goto File/Function Ctrl+. Ctrl+.
Fold Selected Alt+L Cmd+Option+L
Unfold Selected Shift+Alt+L Cmd+Shift+Option+L
Fold All Alt+O Cmd+Option+O
Unfold All Shift+Alt+O Cmd+Shift+Option+O
Go to line Shift+Alt+G Cmd+Shift+Option+G
Jump to Shift+Alt+J Cmd+Shift+Option+J
Switch to tab Ctrl+Shift+. Ctrl+Shift+.
Previous tab Ctrl+F11 Ctrl+F11
Next tab Ctrl+F12 Ctrl+F12
First tab Ctrl+Shift+F11 Ctrl+Shift+F11
Last tab Ctrl+Shift+F12 Ctrl+Shift+F12
Navigate back Ctrl+F9 Cmd+F9
Navigate forward Ctrl+F10 Cmd+F10
Jump to Brace Ctrl+P Ctrl+P
Select within Braces Ctrl+Shift+Alt+E Ctrl+Shift+Option+E
Use Selection for Find Ctrl+F3 Cmd+E
Find in Files Ctrl+Shift+F Cmd+Shift+F
Find Next Win: F3, Linux: Ctrl+G Cmd+G
Find Previous Win: Shift+F3, Linux: Ctrl+Shift+G Cmd+Shift+G
Jump to Word Ctrl+←/→ Option+←/→
Jump to Start/End Ctrl+↑/↓ Cmd+↑/↓
Toggle Outline Ctrl+Shift+O Cmd+Shift+O
8 DOCUMENTS AND APPS Windows/Linux Mac
Preview HTML (Markdown, etc.) Ctrl+Shift+K Cmd+Shift+K
Knit Document (knitr) Ctrl+Shift+K Cmd+Shift+K
Compile Notebook Ctrl+Shift+K Cmd+Shift+K
Compile PDF (TeX and Sweave) Ctrl+Shift+K Cmd+Shift+K
Insert chunk (Sweave and Knitr) Ctrl+Alt+I Cmd+Option+I
Insert code section Ctrl+Shift+R Cmd+Shift+R
Re-run previous region Ctrl+Shift+P Cmd+Shift+P
Run current document Ctrl+Alt+R Cmd+Option+R
Run from start to current line Ctrl+Alt+B Cmd+Option+B
Run the current code section Ctrl+Alt+T Cmd+Option+T
Run previous Sweave/Rmd code Ctrl+Alt+P Cmd+Option+P
Run the current chunk Ctrl+Alt+C Cmd+Option+C
Run the next chunk Ctrl+Alt+N Cmd+Option+N
Sync Editor & PDF Preview Ctrl+F8 Cmd+F8
7 MAKE PACKAGES Windows/Linux Mac
Build and Reload Ctrl+Shift+B Cmd+Shift+B
Load All (devtools) Ctrl+Shift+L Cmd+Shift+L
Test Package (Desktop) Ctrl+Shift+T Cmd+Shift+T
Test Package (Web) Ctrl+Alt+F7 Cmd+Opt+F7
Check Package Ctrl+Shift+E Cmd+Shift+E
Document Package Ctrl+Shift+D Cmd+Shift+D
6 VERSION CONTROL Windows/Linux Mac
Show diff Ctrl+Alt+D Ctrl+Option+D
Commit changes Ctrl+Alt+M Ctrl+Option+M
Scroll diff view Ctrl+↑/↓ Ctrl+↑/↓
Stage/Unstage (Git) Spacebar Spacebar
Stage/Unstage and move to next Enter Enter
5 DEBUG CODE Windows/Linux Mac
Toggle Breakpoint Shift+F9 Shift+F9
Execute Next Line F10 F10
Step Into Function Shift+F4 Shift+F4
Finish Function/Loop Shift+F6 Shift+F6
Continue Shift+F5 Shift+F5
Stop Debugging Shift+F8 Shift+F8
2 RUN CODE Windows/Linux Mac
Search command history Ctrl+↑ Cmd+↑
Navigate command history ↑/↓ ↑/↓
Move cursor to start of line Home Cmd+←
Move cursor to end of line End Cmd+→
Change working directory Ctrl+Shift+H Ctrl+Shift+H
Interrupt current command Esc Esc
Clear console Ctrl+L Ctrl+L
Quit Session (desktop only) Ctrl+Q Cmd+Q
Restart R Session Ctrl+Shift+F10 Cmd+Shift+F10
Run current line/selection Ctrl+Enter Cmd+Enter
Run current (retain cursor) Alt+Enter Option+Enter
Run from current to end Ctrl+Alt+E Cmd+Option+E
Run the current function definition Ctrl+Alt+F Cmd+Option+F
Source a file Ctrl+Alt+G Cmd+Option+G
Source the current file Ctrl+Shift+S Cmd+Shift+S
Source with echo Ctrl+Shift+Enter Cmd+Shift+Enter
1 LAYOUT Windows/Linux Mac
Move focus to Source Editor Ctrl+1 Ctrl+1
Move focus to Console Ctrl+2 Ctrl+2
Move focus to Help Ctrl+3 Ctrl+3
Show History Ctrl+4 Ctrl+4
Show Files Ctrl+5 Ctrl+5
Show Plots Ctrl+6 Ctrl+6
Show Packages Ctrl+7 Ctrl+7
Show Environment Ctrl+8 Ctrl+8
Show Git/SVN Ctrl+9 Ctrl+9
Show Build Ctrl+0 Ctrl+0
RStudio® is a trademark of RStudio, Inc. • CC BY SA RStudio • info@rstudio.com • 844-448-1212 • rstudio.com • Learn more at www.rstudio.com • RStudio IDE 0.1.0 • Updated: 2017-09
Previous plot Ctrl+Alt+F11 Cmd+Option+F11
Next plot Ctrl+Alt+F12 Cmd+Option+F12
Show Keyboard Shortcuts Alt+Shift+K Option+Shift+K
WHY RSTUDIO SERVER PRO?
RSP extends the open source server with a
commercial license, support, and more:
• open and run multiple R sessions at once
• tune your resources to improve performance
• edit the same project at the same time as others
• see what you and others are doing on your server
• switch easily from one version of R to a different version
• integrate with your authentication, authorization, and audit practices
Download a free 45 day evaluation at
www.rstudio.com/products/rstudio-server-pro/
4 WRITE CODE Windows/Linux Mac
Attempt completion Tab or Ctrl+Space Tab or Cmd+Space
Navigate candidates ↑/↓ ↑/↓
Accept candidate Enter, Tab, or → Enter, Tab, or →
Dismiss candidates Esc Esc
Undo Ctrl+Z Cmd+Z
Redo Ctrl+Shift+Z Cmd+Shift+Z
Cut Ctrl+X Cmd+X
Copy Ctrl+C Cmd+C
Paste Ctrl+V Cmd+V
Select All Ctrl+A Cmd+A
Delete Line Ctrl+D Cmd+D
Select Shift+[Arrow] Shift+[Arrow]
Select Word Ctrl+Shift+←/→ Option+Shift+←/→
Select to Line Start Alt+Shift+← Cmd+Shift+←
Select to Line End Alt+Shift+→ Cmd+Shift+→
Select Page Up/Down Shift+PageUp/Down Shift+PageUp/Down
Select to Start/End Shift+Alt+↑/↓ Cmd+Shift+↑/↓
Delete Word Left Ctrl+Backspace Ctrl+Opt+Backspace
Delete Word Right (Mac only) Option+Delete
Delete to Line End (Mac only) Ctrl+K
Delete to Line Start (Mac only) Option+Backspace
Indent Tab (at start of line) Tab (at start of line)
Outdent Shift+Tab Shift+Tab
Yank line up to cursor Ctrl+U Ctrl+U
Yank line after cursor Ctrl+K Ctrl+K
Insert yanked text Ctrl+Y Ctrl+Y
Insert <- Alt+- Option+-
Insert %>% Ctrl+Shift+M Cmd+Shift+M
Show help for function F1 F1
Show source code for function F2 F2
New document Ctrl+Shift+N Cmd+Shift+N
New document (Chrome) Ctrl+Alt+Shift+N Cmd+Shift+Opt+N
Open document Ctrl+O Cmd+O
Save document Ctrl+S Cmd+S
Close document Ctrl+W Cmd+W
Close document (Chrome) Ctrl+Alt+W Cmd+Option+W
Close all documents Ctrl+Shift+W Cmd+Shift+W
Extract function Ctrl+Alt+X Cmd+Option+X
Extract variable Ctrl+Alt+V Cmd+Option+V
Reindent lines Ctrl+I Cmd+I
(Un)Comment lines Ctrl+Shift+C Cmd+Shift+C
Reflow Comment Ctrl+Shift+/ Cmd+Shift+/
Reformat Selection Ctrl+Shift+A Cmd+Shift+A
Select within braces Ctrl+Shift+E Ctrl+Shift+E
Show Diagnostics Ctrl+Shift+Alt+P Cmd+Shift+Opt+P
Transpose Letters Ctrl+T
Move Lines Up/Down Alt+↑/↓ Option+↑/↓
Copy Lines Up/Down Shift+Alt+↑/↓ Cmd+Option+↑/↓
Add New Cursor Above Ctrl+Alt+Up Ctrl+Option+Up
Add New Cursor Below Ctrl+Alt+Down Ctrl+Option+Down
Move Active Cursor Up Ctrl+Alt+Shift+Up Ctrl+Option+Shift+Up
Move Active Cursor Down Ctrl+Alt+Shift+Down Ctrl+Opt+Shift+Down
Find and Replace Ctrl+F Cmd+F
Use Selection for Find Ctrl+F3 Cmd+E
Replace and Find Ctrl+Shift+J Cmd+Shift+J
Assignment 2
Tidyverse Using ggplot2
For these exercises, take the code, enter it into RStudio, and produce the plot. The point of the exercises is to generate a visual representation of the data. Once the plot is produced, take a screenshot of the full RStudio window showing both the code and the plot. Paste the screenshots into a Microsoft Word document and submit all 6 exercises.
####################
# #
# Exercise 1 #
# #
####################
library(tidyverse)
iris %>%
  ggplot(aes(Sepal.Length, Sepal.Width, color = Species, shape = Species)) +
  geom_point() +
  geom_density2d() +
  ggtitle('IRIS') +
  theme_light()
####################
# #
# Exercise 2 #
# #
####################
iris %>%
  mutate(Species = 'ALL') %>%
  bind_rows(iris) %>%
  ggplot(aes(Petal.Length, Petal.Width, color = Species)) +
  geom_point() +
  geom_smooth() +
  xlab('Petal Length') +
  ylab('Petal Width') +
  facet_wrap(~Species, scales = 'free') +
  theme_bw()
## Warning in bind_rows_(x, .id): binding character and factor vector,
## coercing into character vector
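The `mutate()` plus `bind_rows()` step in Exercise 2 is what produces the extra "ALL" panel. A minimal sketch of just that data step, assuming dplyr is installed; converting `Species` to character first is an addition made here to sidestep the factor/character coercion warning, and `iris_all` and `panel_sizes` are illustrative names.

```r
library(dplyr)

# Copy of iris relabelled "ALL", stacked on top of the original rows
iris_all <- iris %>%
  mutate(Species = "ALL") %>%
  bind_rows(mutate(iris, Species = as.character(Species)))

# Rows available to each facet: 150 for "ALL", 50 for each species
panel_sizes <- table(iris_all$Species)
panel_sizes
```

Every flower therefore appears twice in the plotted data: once in its own species panel and once in the pooled panel.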
####################
# #
# Exercise 3 #
# #
####################
mtcars %>%
  rownames_to_column() %>%
  mutate(rowname = forcats::fct_reorder(rowname, mpg)) %>%
  ggplot(aes(rowname, mpg, label = rowname)) +
  geom_point() +
  geom_text(nudge_y = .3, hjust = 'left') +
  coord_flip() +
  ylab('Miles per gallon fuel consumption') +
  ylim(10, 40) +
  theme_classic() +
  theme(plot.title = element_text(hjust = 0, size = 16),
        axis.title.x = element_text(face = 'bold'),
        axis.title.y = element_blank(),
        axis.text.y = element_blank(),
        axis.ticks.y = element_blank(),
        axis.line.y = element_blank())
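In Exercise 3, `fct_reorder()` is what sorts the cars by fuel economy instead of alphabetically. A small sketch of that step on its own, assuming forcats is installed; `car` and `car_by_mpg` are illustrative names.

```r
library(forcats)

# Car names are the row names of mtcars
car <- rownames(mtcars)

# A plain factor orders its levels alphabetically
alphabetical <- levels(factor(car))

# fct_reorder() orders the levels by the values of a second vector (here mpg)
car_by_mpg <- fct_reorder(car, mtcars$mpg)

head(alphabetical, 3)
# The last level is the car with the highest mpg
tail(levels(car_by_mpg), 1)
```

Since a discrete axis is drawn in level order, the reordered factor places the most economical car at the far end of the axis after `coord_flip()`.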
####################
# #
# Exercise 4 #
# #
####################
mtcars %>%
  ggplot(aes(mpg, qsec, size = disp, color = as.factor(am))) +
  geom_point() +
  # In mtcars, am is the transmission type: 0 = automatic, 1 = manual
  scale_colour_discrete(name = "Transmission",
                        breaks = c(0, 1),
                        labels = c("Automatic", "Manual")) +
  scale_size_continuous(name = 'Displacement') +
  xlab('Miles per gallon') +
  ylab('1/4 mile time') +
  theme_light()
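The legend in Exercise 4 depends on how `am` is coded. According to the `mtcars` help page, `am` is the transmission type with 0 = automatic and 1 = manual. A base R sketch that makes the mapping explicit (`transmission` and `counts` are illustrative names):

```r
# Recode am (0/1) into a labelled factor; label order follows `levels`
transmission <- factor(mtcars$am,
                       levels = c(0, 1),
                       labels = c("Automatic", "Manual"))

# Number of cars per transmission type
counts <- table(transmission)
counts
```

Mapping `color = transmission` instead of `as.factor(am)` would give the legend readable labels without setting `breaks` and `labels` on the scale.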
####################
# #
# Exercise 5 #
# #
####################
diamonds2plot <- diamonds %>%
  group_by(cut, color) %>%
  summarise(price = mean(price)) %>%
  arrange(color, price) %>%
  ungroup() %>%
  mutate(id = row_number(),
         angle = 90 - 360 * (id - 0.5) / n())
diamonds2plot %>%
  ggplot(aes(factor(id), price, fill = color, group = cut, label = cut)) +
  geom_bar(stat = 'identity', position = 'dodge') +
  geom_text(hjust = 0, angle = diamonds2plot$angle, alpha = .5) +
  coord_polar() +
  ggtitle('Mean diamond price') +
  ylim(-3000, 7000) +
  theme_void() +
  theme(plot.title = element_text(hjust = 0.5, size = 16, face = 'bold'))
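The `angle` column in Exercise 5 rotates each label so it points outward along its bar. A worked sketch of the formula in base R, using a hypothetical circle of `n = 8` bars rather than the diamonds summary:

```r
# Hypothetical example: 8 equally spaced bars around the circle
n  <- 8
id <- seq_len(n)

# The centre of bar `id` lies (id - 0.5) / n of the way around, measured
# clockwise from 12 o'clock. Text angles are measured counter-clockwise
# from horizontal, so the outward direction is 90 degrees minus that share
# of 360.
angle <- 90 - 360 * (id - 0.5) / n
angle
```

The first label is tilted just right of vertical and each later one rotates a further 360 / n degrees clockwise, which matches how `coord_polar()` lays the bars out.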
####################
# #
# Exercise 6 #
# #
####################
economics %>%
  ggplot(aes(date, uempmed)) +
  geom_line() +
  expand_limits(y = 0) +
  ggtitle('Median duration of unemployment [weeks]') +
  ylab('Median duration of unemployment [weeks]') +
  ggthemes::theme_economist_white() +
  theme(axis.title.x = element_blank())