Unit A: The targets Package
Learning Goals
After this lesson, you should be able to:
Explain what the targets package does
Configure a _targets.R file
This unit provides a very brief example of using the targets package for workflow management in R.
Introduction
The targets package uses an R script, _targets.R, to configure the workflows
and steps in a project. You can use the package’s use_targets function to
generate a template version of _targets.R.
Within _targets.R, you use a list at the end of the file to define the steps
in the project’s workflows. Each step must be provided via a call to the
tar_target function. The first argument to tar_target is the name of the
step (without quotes). The second argument to tar_target is the R code to run
for the step, which can refer to outputs from other steps.
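As a minimal sketch of this structure, the following builds a two-step workflow in a temporary directory. The step names raw_numbers and total are hypothetical and chosen only for illustration. It uses the package's tar_script function to write the _targets.R file from R, rather than editing the file by hand.

```r
library(targets)

# Work in a temporary directory so the current project is left untouched.
demo_dir <- tempfile("targets_demo_")
dir.create(demo_dir)
old_dir <- setwd(demo_dir)

# tar_script() writes the code in braces to _targets.R.
# The script ends with a list of tar_target() calls: the first argument is
# the step name (unquoted), the second is the R code to run for the step.
# The total step refers to the output of the raw_numbers step by name.
tar_script({
  list(
    tar_target(raw_numbers, c(1, 2, 3)),
    tar_target(total, sum(raw_numbers))
  )
}, ask = FALSE)

tar_make()                      # run both steps
total_value <- tar_read(total)  # read the saved output of the total step

setwd(old_dir)
```

Because total uses raw_numbers in its code, the package can work out on its own that raw_numbers has to run first.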
There are two ideas that are key to understanding the targets package:
It assumes that each step corresponds to a separate function defined in the project’s R code. By default, the package will search for code in the R/ subdirectory.
It automatically saves the R output from every step. There’s no need to save outputs to files unless you want to share those files. The output is saved in a special _targets/ subdirectory in the project directory.
Keep these in mind as you go through the following example.
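Both ideas can be seen in a small sketch, run in a temporary directory. The function double_values and the file name R/functions.R are hypothetical. The sketch assumes the _targets.R script calls tar_source(), which runs the scripts in R/ so their functions are available to the steps, and that outputs land under _targets/objects/, which is where the package keeps them by default.

```r
library(targets)

# Work in a temporary project directory that has an R/ subdirectory.
demo_dir <- tempfile("targets_store_demo_")
dir.create(file.path(demo_dir, "R"), recursive = TRUE)
old_dir <- setwd(demo_dir)

# Idea 1: each step calls a function defined in the project's R code.
# double_values() is a hypothetical function saved in R/functions.R.
writeLines(
  "double_values <- function(x) x * 2",
  file.path("R", "functions.R")
)

# The pipeline script sources R/ and then defines the steps.
tar_script({
  tar_source()
  list(
    tar_target(values, c(1, 2, 3)),
    tar_target(doubled, double_values(values))
  )
}, ask = FALSE)

tar_make()

# Idea 2: the output of every step is saved automatically.
# Each step gets a file named after it under _targets/objects/.
stored_files <- list.files(file.path("_targets", "objects"))
doubled_value <- tar_read(doubled)

setwd(old_dir)
```

Neither step writes a file explicitly; the files in stored_files are created by the package itself.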
Case Study: Davis Bike Counts, Part IV
Let’s try using the targets package to manage the workflow in the project from
Case Study: Davis Bike Counts, Part I. To get started, run R in the
project directory, load the targets package, and call the use_targets
function:
use_targets()
This will create a _targets.R file in the project directory.
To configure the project workflow, open the _targets.R file and edit the list
at the end of the file to look like this:
# Replace the target list below with your own:
list(
  tar_target(file, "data/2020_davis_bikes.rds", format = "file"),
  tar_target(source_bikes, read_bike_data(file)),
  tar_target(clean_bikes, clean_bike_data(source_bikes)),
  tar_target(bikes, make_bike_features(clean_bikes)),
  tar_target(model, fit_bike_model(bikes)),
  tar_target(plot, plot_bike_model(bikes, model))
)
Each call to tar_target defines a separate step in the workflow. The first
call to tar_target registers the data/2020_davis_bikes.rds file as an input
and does not run any project code. Setting format = "file" tells the package
to track the file for changes. Each of the remaining calls to tar_target
describes a call to a single function in the project’s R scripts.
After setting up the _targets.R file, you can call the package’s
tar_manifest function to get a data frame with information about all of the
steps in the workflow:
tar_manifest()
# A tibble: 6 × 2
  name         command
  <chr>        <chr>
1 file         "\"data/2020_davis_bikes.rds\""
2 source_bikes "read_bike_data(file)"
3 clean_bikes  "clean_bike_data(source_bikes)"
4 bikes        "make_bike_features(clean_bikes)"
5 model        "fit_bike_model(bikes)"
6 plot         "plot_bike_model(bikes, model)"
Tip
You can also call the package’s tar_visnetwork function to view the project’s
workflow as a graph.
To run a step in the workflow, use the package’s tar_make function. If you
call the function without any arguments, the package will try to run every step
in the workflow. You can also set the first argument to the names of the steps
you want to run. Try running the plot step:
tar_make(plot)
+ file dispatched
✔ file completed [0ms, 4.02 kB]
+ source_bikes dispatched
✔ source_bikes completed [0ms, 4.02 kB]
+ clean_bikes dispatched
✔ clean_bikes completed [11ms, 3.85 kB]
+ bikes dispatched
✔ bikes completed [1ms, 3.88 kB]
+ model dispatched
✔ model completed [3ms, 61.21 kB]
+ plot dispatched
✔ plot completed [15ms, 244.39 kB]
✔ ended pipeline [285ms, 6 completed, 0 skipped]
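The log above shows that asking for the plot step also ran every step it depends on. The following sketch, run in a temporary directory with hypothetical step names (numbers, average, label), illustrates the other side of that behavior: a step that the requested step does not depend on is left alone.

```r
library(targets)

# Work in a temporary directory so the current project is left untouched.
demo_dir <- tempfile("targets_make_demo_")
dir.create(demo_dir)
old_dir <- setwd(demo_dir)

# Three hypothetical steps: average depends on numbers, label on nothing.
tar_script({
  list(
    tar_target(numbers, c(4, 8, 15)),
    tar_target(average, mean(numbers)),
    tar_target(label, "not needed for average")
  )
}, ask = FALSE)

# Request only the average step. The package also runs numbers, because
# average depends on it, but it does not run label.
tar_make(average)

# List the steps that have saved output so far.
built_steps <- list.files(file.path("_targets", "objects"))

setwd(old_dir)
```

Calling tar_make() with no arguments afterwards would run the remaining label step as well.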
After running the workflow, you can call the tar_read function to materialize
the output from any step. So to display the plot, run:
tar_read(plot)
Similarly, to get or display the fitted model, run:
tar_read(model)
Call:
lm(formula = count ~ date * site * pandemic, data = bikes)

Coefficients:
                (Intercept)                         date
                 -95997.742                        5.313
                  sitethird                 pandemicTRUE
                  51762.400                    90878.360
             date:sitethird            date:pandemicTRUE
                     -2.859                       -5.022
     sitethird:pandemicTRUE  date:sitethird:pandemicTRUE
                 -29280.107                        1.653
This way you can access outputs from any step in the workflow.
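The value returned by tar_read is an ordinary R object, so you can assign it to a variable and keep working with it. Here is a minimal sketch, run in a temporary directory on a made-up data set; the step names toy_data and toy_model are hypothetical and unrelated to the bike count project.

```r
library(targets)

# Work in a temporary directory so the current project is left untouched.
demo_dir <- tempfile("targets_read_demo_")
dir.create(demo_dir)
old_dir <- setwd(demo_dir)

# A hypothetical two-step workflow: build a small data frame, then fit a
# linear model to it.
tar_script({
  list(
    tar_target(
      toy_data,
      data.frame(x = 1:5, y = c(3.1, 4.9, 7.2, 9.0, 10.8))
    ),
    tar_target(toy_model, lm(y ~ x, data = toy_data))
  )
}, ask = FALSE)

tar_make()

# tar_read() returns the saved object, which can be assigned and reused.
fitted_model <- tar_read(toy_model)
model_coefficients <- coef(fitted_model)

setwd(old_dir)
```

Here fitted_model behaves like any other lm object, so functions such as summary or predict work on it as usual.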
See also
For more about how to use the targets package, see the official documentation.