% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/drake_plan.R
\name{drake_plan}
\alias{drake_plan}
\title{Create a drake plan
for the \code{plan} argument of \code{\link[=make]{make()}}.
\lifecycle{stable}}
\usage{
drake_plan(
  ...,
  list = NULL,
  file_targets = NULL,
  strings_in_dots = NULL,
  tidy_evaluation = NULL,
  transform = TRUE,
  trace = FALSE,
  envir = parent.frame(),
  tidy_eval = TRUE,
  max_expand = NULL
)
}
\arguments{
\item{...}{A collection of symbols/targets
with commands assigned to them. See the examples for details.}

\item{list}{Deprecated.}

\item{file_targets}{Deprecated.}

\item{strings_in_dots}{Deprecated.}

\item{tidy_evaluation}{Deprecated. Use \code{tidy_eval} instead.}

\item{transform}{Logical, whether to transform the plan
into a larger plan with more targets.
Requires the \code{transform} field in
\code{target()}. See the examples for details.}

\item{trace}{Logical, whether to add columns to show
what happens during target transformations.}

\item{envir}{Environment for tidy evaluation.}

\item{tidy_eval}{Logical, whether to use tidy evaluation
(e.g. unquoting/\verb{!!}) when resolving commands.
Tidy evaluation in transformations is always turned on
regardless of the value you supply to this argument.}

\item{max_expand}{Positive integer, optional.
\code{max_expand} is the maximum number of targets to generate in each
\code{map()}, \code{split()}, or \code{cross()} transform.
Useful if you have a massive plan and you want to
test and visualize a strategic subset of targets
before scaling up.
Note: the \code{max_expand} argument of \code{drake_plan()} and
\code{transform_plan()} is for static branching only.
The dynamic branching \code{max_expand}
is an argument of \code{make()} and \code{drake_config()}.}
}
\value{
A data frame of targets, commands, and optional
custom columns.
}
\description{
A \code{drake} plan is a data frame with columns
\code{"target"} and \code{"command"}. Each target is an R object
produced in your workflow, and each command is the
R code to produce it.
}
\details{
Besides \code{"target"} and \code{"command"}, \code{\link[=drake_plan]{drake_plan()}}
understands a special set of optional columns. For details, visit
\url{https://books.ropensci.org/drake/plans.html#special-custom-columns-in-your-plan}
}
\section{Columns}{

\code{\link[=drake_plan]{drake_plan()}} creates a special data frame. At minimum, that data frame
must have columns \code{target} and \code{command} with the target names and the
R code chunks to build them, respectively.

You can add custom columns yourself, either with \code{target()} (e.g.
\code{drake_plan(y = target(f(x), transform = map(x = c(1, 2)), format = "fst"))})
or by appending columns post-hoc (e.g. \code{plan$col <- vals}).

Some of these custom columns are special. They are optional,
but \code{drake} looks for them at various points in the workflow.
\itemize{
\item \code{transform}: a call to \code{\link[=map]{map()}}, \code{\link[=split]{split()}}, \code{\link[=cross]{cross()}}, or
\code{\link[=combine]{combine()}} to create and manipulate large collections of targets.
Details: \url{https://books.ropensci.org/drake/plans.html#large-plans}.
\item \code{format}: set a storage format to save big targets more efficiently.
See the "Formats" section of this help file for more details.
\item \code{trigger}: rule to decide whether a target needs to run.
It is recommended that you define this one with \code{target()}.
Details: \url{https://books.ropensci.org/drake/triggers.html}.
\item \code{hpc}: logical values (\code{TRUE}/\code{FALSE}/\code{NA}) indicating whether to send each target
to parallel workers.
Visit \url{https://books.ropensci.org/drake/hpc.html#selectivity}
to learn more.
\item \code{resources}: target-specific lists of resources for a computing cluster.
See
\url{https://books.ropensci.org/drake/hpc.html#advanced-options}
for details.
\item \code{caching}: overrides the \code{caching} argument of \code{\link[=make]{make()}} for each target
individually. Possible values:
\itemize{
\item \code{"master"}: tell the master process to store the target in the cache.
\item \code{"worker"}: tell the HPC worker to store the target in the cache.
\item \code{NA}: default to the \code{caching} argument of \code{\link[=make]{make()}}.
}
\item \code{elapsed} and \code{cpu}: number of seconds to wait for the target to build
before timing out (\code{elapsed} for elapsed time and \code{cpu} for CPU time).
\item \code{retries}: number of times to retry building a target
in the event of an error.
\item \code{seed}: an optional pseudo-random number generator (RNG)
seed for each target. \code{drake} usually comes up with its own
unique reproducible target-specific seeds using the global seed
(the \code{seed} argument to \code{\link[=make]{make()}} and \code{\link[=drake_config]{drake_config()}})
and the target names, but you can overwrite these automatic seeds.
\code{NA} entries default back to \code{drake}'s automatic seeds.
\item \code{max_expand}: for dynamic branching only. Same as the \code{max_expand}
argument of \code{\link[=make]{make()}}, but on a target-by-target basis.
Limits the number of sub-targets created for a given target.
}
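
As a minimal sketch of both approaches, the plan below sets a few special
columns through \code{target()} and then appends \code{hpc} post-hoc.
The functions \code{get_data()} and \code{fit_model()} are hypothetical
placeholders; \code{drake_plan()} only records the commands and does not
run them.

\preformatted{
library(drake)

# get_data() and fit_model() are hypothetical; they are never called here.
plan <- drake_plan(
  data = target(get_data(), retries = 2, elapsed = 60),
  fit = target(fit_model(data), seed = 123)
)

# Append a special column after the fact, one value per target.
plan$hpc <- c(FALSE, TRUE)
plan
}

Targets that do not set a given column through \code{target()} should
get \code{NA} in that column.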
}

\section{Formats}{

Specialized target formats increase efficiency and flexibility.
Some allow you to save specialized objects like \code{keras} models,
while others save and load targets faster and use less storage and memory.
You can declare target-specific formats in the plan
(e.g. \code{drake_plan(x = target(big_data_frame, format = "fst"))})
or supply a global default \code{format} for all targets in \code{make()}.
Either way, most formats have specialized installation requirements
(e.g. R packages) that are not installed with \code{drake} by default.
You will need to install them separately yourself.
Available formats:
\itemize{
\item \code{"file"}: Dynamic files. To use this format, simply create
local files and directories yourself and then return
a character vector of paths as the target's value.
Then, \code{drake} will watch for changes to those files in
subsequent calls to \code{make()}. This is a more flexible
alternative to \code{file_in()} and \code{file_out()}, and it is
compatible with dynamic branching.
See \url{https://github.com/ropensci/drake/pull/1178} for an example.
\item \code{"fst"}: save big data frames fast. Requires the \code{fst} package.
Note: this format strips non-data-frame attributes.
\item \code{"fst_tbl"}: Like \code{"fst"}, but for \code{tibble} objects.
Requires the \code{fst} and \code{tibble} packages.
Strips away non-data-frame non-tibble attributes.
\item \code{"fst_dt"}: Like \code{"fst"} format, but for \code{data.table} objects.
Requires the \code{fst} and \code{data.table} packages.
Strips away non-data-frame non-data-table attributes.
\item \code{"diskframe"}:
Stores \code{disk.frame} objects, which could potentially be
larger than memory. Requires the \code{fst} and \code{disk.frame} packages.
Coerces objects to \code{disk.frame}s.
Note: \code{disk.frame} objects get moved to the \code{drake} cache
(a subfolder of \verb{.drake/} for most workflows).
To ensure this data transfer is fast, it is best to
save your \code{disk.frame} objects to the same physical storage
drive as the \code{drake} cache, e.g. with
\code{as.disk.frame(your_dataset, outdir = drake_tempfile())}.
\item \code{"keras"}: save Keras models as HDF5 files.
Requires the \code{keras} package.
\item \code{"qs"}: save any R object that can be properly serialized
with the \code{qs} package. Requires the \code{qs} package.
Uses \code{qsave()} and \code{qread()}.
Uses the default settings in \code{qs} version 0.20.2.
\item \code{"rds"}: save any R object that can be properly serialized.
Requires R version >= 3.5.0 due to ALTREP.
Note: the \code{"rds"} format uses gzip compression, which is slow.
\code{"qs"} is a superior format.
}
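
The sketch below illustrates two of the formats listed above in one
plan, assuming the \code{qs} package is installed when the plan is
eventually built. The file name \code{"mtcars.csv"} is only an
illustration.

\preformatted{
library(drake)

plan <- drake_plan(
  # Dynamic file: write the file yourself, then return its path.
  csv = target(
    {
      write.csv(mtcars, "mtcars.csv")
      "mtcars.csv"
    },
    format = "file"
  ),
  # Store the fitted model with the qs package.
  fit = target(lm(mpg ~ wt, data = read.csv(csv)), format = "qs")
)

# Alternatively, supply a global default format to make():
# make(plan, format = "qs")
}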
}

\section{Keywords}{

\code{\link[=drake_plan]{drake_plan()}} understands special keyword functions for your commands.
With the exception of \code{\link[=target]{target()}}, each one is a proper function
with its own help file.
\itemize{
\item \code{\link[=target]{target()}}: give the target more than just a command.
Using \code{\link[=target]{target()}}, you can apply a transformation
(examples: \url{https://books.ropensci.org/drake/plans.html#large-plans}),
supply a trigger (\url{https://books.ropensci.org/drake/triggers.html}),
or set any number of custom columns.
\item \code{\link[=file_in]{file_in()}}: declare an input file dependency.
\item \code{\link[=file_out]{file_out()}}: declare an output file to be produced
when the target is built.
\item \code{\link[=knitr_in]{knitr_in()}}: declare a \code{knitr} file dependency such as an
R Markdown (\verb{*.Rmd}) or R LaTeX (\verb{*.Rnw}) file.
\item \code{\link[=ignore]{ignore()}}: force \code{drake} to entirely ignore a piece of code:
do not track it for changes and do not analyze it for dependencies.
\item \code{\link[=no_deps]{no_deps()}}: tell \code{drake} to not track the dependencies
of a piece of code. \code{drake} still tracks the code itself for changes.
\item \code{\link[=id_chr]{id_chr()}}: get the name of the current target.
\item \code{\link[=drake_envir]{drake_envir()}}: get the environment where \code{drake} builds targets.
Intended for advanced custom memory management.
}
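
A brief sketch of some of these keywords inside commands follows.
The file names and the functions \code{tidy()} and \code{label()} are
hypothetical, and nothing runs until you call \code{make()}.

\preformatted{
library(drake)

plan <- drake_plan(
  # Track raw.csv as an input file.
  raw = read.csv(file_in("raw.csv")),
  # Declare cleaned.csv as an output file of this target.
  cleaned = write.csv(tidy(raw), file_out("cleaned.csv")),
  # Keep the timestamp out of change tracking and dependency analysis.
  stamped = label(raw, ignore(Sys.time()))
)
plan
}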
}

\section{Transformations}{

\code{drake} has special syntax for generating large plans.
Your code will look something like
\verb{drake_plan(y = target(f(x), transform = map(x = c(1, 2, 3))))}.
You can read about this interface at
\url{https://books.ropensci.org/drake/plans.html#large-plans}.
}

\section{Static branching}{

In static branching, you define batches of targets
based on information you know in advance.
Overall usage looks like
\verb{drake_plan(<x> = target(<...>, transform = <call>))},
where
\itemize{
\item \verb{<x>} is the name of the target or group of targets.
\item \verb{<...>} is optional arguments to \code{\link[=target]{target()}}.
\item \verb{<call>} is a call to one of the transformation functions.
}

Transformation function usage:
\itemize{
\item \code{map(..., .data, .names, .id, .tag_in, .tag_out)}
\item \code{split(..., slices, margin = 1L, drop = FALSE, .names, .tag_in, .tag_out)}
\item \code{cross(..., .data, .names, .id, .tag_in, .tag_out)}
\item \code{combine(..., .by, .names, .id, .tag_in, .tag_out)}
}
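
As a small illustration of the usage pattern above, the following sketch
expands one target with \code{map()} and gathers the results with
\code{combine()}. The function \code{simulate_data()} is a hypothetical
placeholder, and \code{trace = TRUE} adds columns that show how each
target was generated.

\preformatted{
library(drake)

plan <- drake_plan(
  # One target per value of n; simulate_data() is hypothetical.
  sim = target(simulate_data(n), transform = map(n = c(10, 20, 30))),
  # A single target that depends on all the sim targets.
  results = target(rbind(sim), transform = combine(sim)),
  trace = TRUE
)
plan
}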
}

\section{Dynamic branching}{

Transformation function usage:
\itemize{
\item \code{map(..., .trace)}
\item \code{cross(..., .trace)}
\item \code{group(..., .by, .trace)}
}

\code{map()} and \code{cross()} create dynamic sub-targets from the variables
supplied to the dots. As with static branching, the variables
supplied to \code{map()} must all have equal length.
\code{group(f(data), .by = x)} makes new dynamic
sub-targets from \code{data}. Here, \code{data} can be either static or dynamic.
If \code{data} is dynamic, \code{group()} aggregates existing sub-targets.
If \code{data} is static, \code{group()} splits \code{data} into multiple
subsets based on the groupings from \code{.by}.

Differences from static branching:
\itemize{
\item \code{...} must contain \emph{unnamed} symbols with no values supplied,
and they must be the names of targets.
\item Arguments \code{.id}, \code{.tag_in}, and \code{.tag_out} no longer apply.
}
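
A minimal sketch of dynamic \code{map()} follows. Note that the argument
is the unnamed symbol \code{x}, which is the name of another target, and
that the transformation goes in the \code{dynamic} field of
\code{target()} rather than \code{transform}.

\preformatted{
library(drake)

plan <- drake_plan(
  x = c(1, 2, 3),
  # One dynamic sub-target of y per element of x, created during make().
  y = target(x^2, dynamic = map(x))
)
plan

# make(plan)
# readd(y)
}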
}

\examples{
\dontrun{
isolate_example("contain side effects", {
# For more examples, visit
# https://books.ropensci.org/drake/plans.html.

# Create drake plans:
mtcars_plan <- drake_plan(
  write.csv(mtcars[, c("mpg", "cyl")], file_out("mtcars.csv")),
  value = read.csv(file_in("mtcars.csv"))
)
if (requireNamespace("visNetwork", quietly = TRUE)) {
  plot(mtcars_plan) # fast simplified call to vis_drake_graph()
}
mtcars_plan
make(mtcars_plan) # Makes `mtcars.csv` and then `value`
head(readd(value))
# You can use knitr inputs too. See the top command below.

load_mtcars_example()
head(my_plan)
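if (
  requireNamespace("knitr", quietly = TRUE) &&
    requireNamespace("visNetwork", quietly = TRUE)
) {
  plot(my_plan) # needs visNetwork to draw, knitr to analyze report.Rmd
}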
# The `knitr_in("report.Rmd")` tells `drake` to dive into the active
# code chunks to find dependencies.
# There, `drake` sees that `small`, `large`, and `coef_regression2_small`
# are loaded in with calls to `loadd()` and `readd()`.
deps_code("report.Rmd")

# Formats are great for big data: https://github.com/ropensci/drake/pull/977
# Below, each target is 1.6 GB in memory.
# Run make() on this plan to see how much faster fst is!
n <- 1e8
plan <- drake_plan(
  data_fst = target(
    data.frame(x = runif(n), y = runif(n)),
    format = "fst"
  ),
  data_old = data.frame(x = runif(n), y = runif(n))
)

# Use transformations to generate large plans.
# Read more at
# <https://books.ropensci.org/drake/plans.html#create-large-plans-the-easy-way>. # nolint
drake_plan(
  data = target(
    simulate(nrows),
    transform = map(nrows = c(48, 64)),
    custom_column = 123
  ),
  reg = target(
    reg_fun(data),
    transform = cross(reg_fun = c(reg1, reg2), data)
  ),
  summ = target(
    sum_fun(data, reg),
    transform = cross(sum_fun = c(coef, residuals), reg)
  ),
  winners = target(
    min(summ),
    transform = combine(summ, .by = c(data, sum_fun))
  )
)

# Split data among multiple targets.
drake_plan(
  large_data = get_data(),
  slice_analysis = target(
    analyze(large_data),
    transform = split(large_data, slices = 4)
  ),
  results = target(
    rbind(slice_analysis),
    transform = combine(slice_analysis)
  )
)

# Set trace = TRUE to show what happened during the transformation process.
drake_plan(
  data = target(
    simulate(nrows),
    transform = map(nrows = c(48, 64)),
    custom_column = 123
  ),
  reg = target(
    reg_fun(data),
    transform = cross(reg_fun = c(reg1, reg2), data)
  ),
  summ = target(
    sum_fun(data, reg),
    transform = cross(sum_fun = c(coef, residuals), reg)
  ),
  winners = target(
    min(summ),
    transform = combine(summ, .by = c(data, sum_fun))
  ),
  trace = TRUE
)

# You can create your own custom columns too.
# See ?triggers for more on triggers.
drake_plan(
  website_data = target(
    command = download_data("www.your_url.com"),
    trigger = "always",
    custom_column = 5
  ),
  analysis = analyze(website_data)
)

# Tidy evaluation can help generate super large plans.
sms <- rlang::syms(letters) # To sub in character args, skip this.
drake_plan(x = target(f(char), transform = map(char = !!sms)))

# Dynamic branching
# Get the mean mpg for each cyl in the mtcars dataset.
plan <- drake_plan(
  raw = mtcars,
  group_index = raw$cyl,
  munged = target(raw[, c("mpg", "cyl")], dynamic = map(raw)),
  mean_mpg_by_cyl = target(
    data.frame(mpg = mean(munged$mpg), cyl = munged$cyl[1]),
    dynamic = group(munged, .by = group_index)
  )
)
make(plan)
readd(mean_mpg_by_cyl)
})
}
}
\seealso{
make, drake_config, transform_plan, map, split, cross, combine
}
