Welcome
Welcome to Lab 7! Today, we’ll focus on writing dplyr code. In particular, we will both use the verbs individually and “write a sentence” with the verbs by stringing them together with pipes.
Learning objectives
- Use dplyr verbs together with pipes.
Deliverables (i.e., what to put in the lab drop box)
Upload your rendered PDF (lab_07.pdf) and Quarto (lab_07.qmd) documents to the lab drop box. Make sure the Quarto document renders properly to PDF.
Exercise 0
Load any packages you’ll need for this lab below.
Exercise 1
Create the following dataset and call it plots. The resulting tibble should look like this when printed:
# A tibble: 5 × 6
plot tree dbh logs type live
<dbl> <dbl> <dbl> <dbl> <chr> <lgl>
1 1 1 20.2 2 D TRUE
2 1 2 10.4 1 D TRUE
3 2 1 5 0.5 D TRUE
4 2 2 18 NA C FALSE
5 2 3 10.5 1.5 C TRUE
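A small tibble can be typed in by hand either column by column with tibble() or row by row with tribble(). The sketch below uses a made-up dataset (stems and its columns are hypothetical, chosen only for illustration, and are not the lab data) to show both layouts.

```r
library(tibble)

# Hypothetical data, not the lab's plots tibble.
# Column-wise: each argument is one column.
stems <- tibble(
  stand     = c("A", "A", "B"),
  height_ft = c(42, 55.5, NA),      # NA marks a missing measurement
  alive     = c(TRUE, TRUE, FALSE)
)

# Row-wise: the same data laid out the way it prints.
stems_rowwise <- tribble(
  ~stand, ~height_ft, ~alive,
  "A",    42,         TRUE,
  "A",    55.5,       TRUE,
  "B",    NA,         FALSE
)

stems
```

Either form should give the same columns; which one to use is mostly a matter of which is easier to read against the printed target.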
Exercise 2
Write some code to figure out the following features of plots:
- How many rows and columns?
- What are the column names?
- What is the data type of each column?
- Are there any NA values? If so, in which column?
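The questions above can each be answered with a one-line call. A minimal sketch, assuming a hypothetical tibble named stems (not the lab data) and using only base R helpers; other functions, such as glimpse(), would work as well.

```r
library(tibble)

# Hypothetical data, not the lab's plots tibble.
stems <- tibble(
  stand     = c("A", "A", "B"),
  height_ft = c(42, 55.5, NA),
  alive     = c(TRUE, TRUE, FALSE)
)

dim(stems)              # number of rows, then number of columns
names(stems)            # column names
sapply(stems, class)    # data type of each column
colSums(is.na(stems))   # count of NA values in each column
```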
Exercise 3
Use a dplyr function to print all trees in plot 2.
# A tibble: 3 × 6
plot tree dbh logs type live
<dbl> <dbl> <dbl> <dbl> <chr> <lgl>
1 2 1 5 0.5 D TRUE
2 2 2 18 NA C FALSE
3 2 3 10.5 1.5 C TRUE
Exercise 4
Use a dplyr function to print all trees in plot 2 that have dbh less than or equal to 10.
# A tibble: 1 × 6
plot tree dbh logs type live
<dbl> <dbl> <dbl> <dbl> <chr> <lgl>
1 2 1 5 0.5 D TRUE
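As a reminder of how row filtering on more than one condition works, here is a small sketch on a hypothetical tibble (stems is made up and is not the lab data). Conditions passed to filter() as separate arguments are combined with AND, and rows where a condition evaluates to NA are dropped.

```r
library(dplyr)
library(tibble)

# Hypothetical data, not the lab's plots tibble.
stems <- tibble(
  stand     = c("A", "A", "B", "B"),
  height_ft = c(42, 55.5, 30.2, NA),
  alive     = c(TRUE, TRUE, FALSE, TRUE)
)

# Keep rows in stand A whose height is at most 50 ft.
short_in_a <- stems |>
  filter(stand == "A", height_ft <= 50)

short_in_a
```

The native pipe |> assumes R 4.1 or later; %>% behaves the same way here.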
Exercise 5
Use a dplyr function to print the tree with the largest dbh.
# A tibble: 1 × 6
plot tree dbh logs type live
<dbl> <dbl> <dbl> <dbl> <chr> <lgl>
1 1 1 20.2 2 D TRUE
Exercise 6
Use a series of piped dplyr functions to find the largest dbh tree on plot 2.
# A tibble: 1 × 6
plot tree dbh logs type live
<dbl> <dbl> <dbl> <dbl> <chr> <lgl>
1 2 2 18 NA C FALSE
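Exercises of this shape read naturally as a two-verb sentence: first narrow the rows, then pick the extreme one. A minimal sketch on a hypothetical tibble (stems is made up, not the lab data), using slice_max() as one possible way to pick the largest value; other approaches exist.

```r
library(dplyr)
library(tibble)

# Hypothetical data, not the lab's plots tibble.
stems <- tibble(
  stand     = c("A", "A", "B", "B"),
  height_ft = c(42, 55.5, 30.2, 61),
  alive     = c(TRUE, TRUE, FALSE, TRUE)
)

tallest_in_b <- stems |>
  filter(stand == "B") |>        # keep only stand B
  slice_max(height_ft, n = 1)    # then keep the row with the largest height

tallest_in_b
```

Adding more conditions to filter(), or swapping slice_max() for slice_min(), changes which row the same sentence selects.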
Exercise 7
Use a series of piped dplyr functions to find the largest dbh live tree on plot 2.
# A tibble: 1 × 6
plot tree dbh logs type live
<dbl> <dbl> <dbl> <dbl> <chr> <lgl>
1 2 3 10.5 1.5 C TRUE
Exercise 8
Use a series of piped dplyr functions to find the largest dbh dead tree on plot 2.
# A tibble: 1 × 6
plot tree dbh logs type live
<dbl> <dbl> <dbl> <dbl> <chr> <lgl>
1 2 2 18 NA C FALSE
Exercise 9
Use a series of piped dplyr functions to find the largest dbh live tree on plot 2 of type D.
# A tibble: 1 × 6
plot tree dbh logs type live
<dbl> <dbl> <dbl> <dbl> <chr> <lgl>
1 2 1 5 0.5 D TRUE
Exercise 10
Use a series of piped dplyr functions to find the smallest dbh tree on plot 1.
# A tibble: 1 × 6
plot tree dbh logs type live
<dbl> <dbl> <dbl> <dbl> <chr> <lgl>
1 1 2 10.4 1 D TRUE
Exercise 11
Use a dplyr function to add a new column to plots to hold each tree’s basal area (ft\(^2\)). This new column should be called ba with values equal to 0.005454*dbh^2 (assuming dbh is in inches).
# A tibble: 5 × 7
plot tree dbh logs type live ba
<dbl> <dbl> <dbl> <dbl> <chr> <lgl> <dbl>
1 1 1 20.2 2 D TRUE 2.23
2 1 2 10.4 1 D TRUE 0.590
3 2 1 5 0.5 D TRUE 0.136
4 2 2 18 NA C FALSE 1.77
5 2 3 10.5 1.5 C TRUE 0.601
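For a sense of how a computed column is added, here is a sketch on a hypothetical tibble (stems and height_m are made-up names, not part of the lab). It converts a height in feet to metres using the factor 0.3048; the basal area column follows the same pattern with the formula given above.

```r
library(dplyr)
library(tibble)

# Hypothetical data, not the lab's plots tibble.
stems <- tibble(
  stand     = c("A", "B"),
  height_ft = c(42, 55.5)
)

# mutate() adds the new column at the right-hand end by default.
with_metric <- stems |>
  mutate(height_m = height_ft * 0.3048)

with_metric
```

Note that mutate() returns a new tibble; the result has to be assigned if later steps need the new column.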
Exercise 12
Use a dplyr function to move your newly created column ba so that it sits between the dbh and logs columns.
# A tibble: 5 × 7
plot tree dbh ba logs type live
<dbl> <dbl> <dbl> <dbl> <dbl> <chr> <lgl>
1 1 1 20.2 2.23 2 D TRUE
2 1 2 10.4 0.590 1 D TRUE
3 2 1 5 0.136 0.5 D TRUE
4 2 2 18 1.77 NA C FALSE
5 2 3 10.5 0.601 1.5 C TRUE
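Column order can be changed without touching the data. A small sketch, assuming a hypothetical tibble (stems, height_m are made-up names) and using relocate() with its .after argument as one way to do it.

```r
library(dplyr)
library(tibble)

# Hypothetical data, not the lab's plots tibble.
stems <- tibble(
  stand     = c("A", "B"),
  height_ft = c(42, 55.5),
  alive     = c(TRUE, FALSE),
  height_m  = c(12.8, 16.9)
)

# Move height_m so it comes directly after height_ft.
reordered <- stems |>
  relocate(height_m, .after = height_ft)

names(reordered)
```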
Exercise 13
Use a series of piped dplyr functions to compute the mean dbh for trees on plots 1 and 2. Note, I called my mean mean_dbh.
# A tibble: 2 × 2
plot mean_dbh
<dbl> <dbl>
1 1 15.3
2 2 11.2
Exercise 14
Use a series of piped dplyr functions to compute plot-specific mean dbh and logs for trees. Exclude NA values from the mean calculations (hint, use the na.rm argument in mean()). Note, I called my means mean_dbh and mean_logs.
# A tibble: 2 × 3
plot mean_dbh mean_logs
<dbl> <dbl> <dbl>
1 1 15.3 1.5
2 2 11.2 1
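The effect of na.rm is easiest to see side by side. A minimal sketch on a hypothetical tibble (stems and the summary column names are made up, not the lab data): without na.rm = TRUE, a single NA makes the whole group mean NA.

```r
library(dplyr)
library(tibble)

# Hypothetical data, not the lab's plots tibble.
stems <- tibble(
  stand     = c("A", "A", "B", "B"),
  height_ft = c(42, 55.5, 30.2, NA)
)

by_stand <- stems |>
  group_by(stand) |>
  summarize(
    mean_with_na = mean(height_ft),               # NA for stand B
    mean_no_na   = mean(height_ft, na.rm = TRUE)  # NA dropped before averaging
  )

by_stand
```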
Exercise 15
Use a series of piped dplyr functions to compute plot-specific mean dbh and logs for live trees. Note, I called my means mean_dbh and mean_logs. Why did only the plot 2 mean dbh change from your solution to Exercise 14?
# A tibble: 2 × 3
plot mean_dbh mean_logs
<dbl> <dbl> <dbl>
1 1 15.3 1.5
2 2 7.75 1
Exercise 16
Sort plots by increasing plot number and increasing dbh within plot.
# A tibble: 5 × 7
plot tree dbh ba logs type live
<dbl> <dbl> <dbl> <dbl> <dbl> <chr> <lgl>
1 1 2 10.4 0.590 1 D TRUE
2 1 1 20.2 2.23 2 D TRUE
3 2 1 5 0.136 0.5 D TRUE
4 2 3 10.5 0.601 1.5 C TRUE
5 2 2 18 1.77 NA C FALSE
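Sorting on more than one key works left to right: later columns only break ties in earlier ones. A small sketch on a hypothetical tibble (stems is made up, not the lab data) that also shows desc() for a decreasing sort, which this exercise does not need but is often useful.

```r
library(dplyr)
library(tibble)

# Hypothetical data, not the lab's plots tibble.
stems <- tibble(
  stand     = c("B", "A", "B", "A"),
  height_ft = c(30.2, 42, 61, 55.5)
)

# Stand in increasing order, then height in decreasing order within stand.
sorted <- stems |>
  arrange(stand, desc(height_ft))

sorted
```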
Exercise 17
The type column holds values “D” and “C” which stand for deciduous and conifer, respectively. Use mutate() and the case_when() function to change values in the type column from “D” and “C” to “deciduous” and “conifer”.
# A tibble: 5 × 7
plot tree dbh ba logs type live
<dbl> <dbl> <dbl> <dbl> <dbl> <chr> <lgl>
1 1 1 20.2 2.23 2 deciduous TRUE
2 1 2 10.4 0.590 1 deciduous TRUE
3 2 1 5 0.136 0.5 deciduous TRUE
4 2 2 18 1.77 NA conifer FALSE
5 2 3 10.5 0.601 1.5 conifer TRUE
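The general shape of a recode with case_when() is a list of condition ~ value pairs, checked in order. A minimal sketch on a hypothetical tibble (stems and its status codes are made up, not the lab data); values matching none of the conditions would become NA.

```r
library(dplyr)
library(tibble)

# Hypothetical data, not the lab's plots tibble.
stems <- tibble(
  stem   = 1:4,
  status = c("L", "D", "L", "D")
)

# Overwrite status with a longer label for each code.
labeled <- stems |>
  mutate(status = case_when(
    status == "L" ~ "live",
    status == "D" ~ "dead"
  ))

labeled
```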
Exercise 18
Use a series of piped dplyr functions to compute type-specific mean dbh and logs. More specifically, I want you to use a grouped summarize(), where you group by type. Note, I called my means mean_dbh and mean_logs.
# A tibble: 2 × 3
type mean_dbh mean_logs
<chr> <dbl> <dbl>
1 conifer 14.2 1.5
2 deciduous 11.9 1.17
Exercise 19
Use a series of piped dplyr functions to count the number of trees by type. Hint, use n() within a grouped summarize(). I called my count n_trees.
# A tibble: 2 × 2
type n_trees
<chr> <int>
1 conifer 2
2 deciduous 3
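Inside a grouped summarize(), n() takes no arguments and returns the number of rows in the current group. A small sketch on a hypothetical tibble (stems and n_stems are made-up names, not the lab data):

```r
library(dplyr)
library(tibble)

# Hypothetical data, not the lab's plots tibble.
stems <- tibble(
  species = c("oak", "pine", "oak", "oak")
)

counts <- stems |>
  group_by(species) |>
  summarize(n_stems = n())   # one row per species, with its row count

counts
```

The count column is an integer, which is why the printed header in the expected output shows <int> rather than <dbl>.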
Exercise 20
Use a series of piped dplyr functions to print the trees with the largest basal area within each plot.
# A tibble: 2 × 7
# Groups: plot [2]
plot tree dbh ba logs type live
<dbl> <dbl> <dbl> <dbl> <dbl> <chr> <lgl>
1 1 1 20.2 2.23 2 deciduous TRUE
2 2 2 18 1.77 NA conifer FALSE
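The # Groups: line in the printed output indicates that the result is still a grouped tibble. A minimal sketch on a hypothetical tibble (stems is made up, not the lab data) showing a per-group slice_max() and, as an optional extra step that the exercise does not ask for, ungroup() to drop the grouping afterwards.

```r
library(dplyr)
library(tibble)

# Hypothetical data, not the lab's plots tibble.
stems <- tibble(
  stand     = c("A", "A", "B", "B"),
  height_ft = c(42, 55.5, 30.2, 61)
)

# slice_max() is applied separately within each stand.
tallest_grouped <- stems |>
  group_by(stand) |>
  slice_max(height_ft, n = 1)

# Same rows, with the grouping removed.
tallest_ungrouped <- tallest_grouped |>
  ungroup()

tallest_grouped
```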
Wrap up
Congratulations! You’ve made it to the end of Lab 7. Make sure to render your final document and submit both the .pdf and .qmd files to D2L.