% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/multiObjMatch.R
\name{distBalMatch}
\alias{distBalMatch}
\title{Optimal tradeoffs among distance, exclusion and marginal imbalance}
\usage{
distBalMatch(
  df,
  treatCol,
  myBalCol,
  rhoExclude = c(1),
  rhoBalance = c(1, 2, 3),
  distMatrix = NULL,
  distList = NULL,
  exactlist = NULL,
  propensityCols = NULL,
  pScores = NULL,
  ignore = NULL,
  maxUnMatched = 0.25,
  caliperOption = NULL,
  toleranceOption = 0.01,
  maxIter = 0,
  rho.max.f = 10
)
}
\arguments{
\item{df}{data frame that contains columns indicating treatment, outcome and
covariates.}

\item{treatCol}{character, name of the column indicating treatment
assignment.}

\item{myBalCol}{character, name of the column holding the variable on which
to evaluate marginal balance.}

\item{rhoExclude}{(optional) numeric vector of values of exclusion penalty.
Default value is c(1).}

\item{rhoBalance}{(optional) numeric vector of values of marginal balance
penalty. Default value is c(1,2,3).}

\item{distMatrix}{(optional) a matrix that specifies the pair-wise distances
between any two objects.}

\item{distList}{(optional) character vector of variable names used for
calculating within-pair distance.}

\item{exactlist}{(optional) character vector of variable names on which to
match exactly; NULL by default.}

\item{propensityCols}{(optional) character vector, variable names on which to
fit a propensity score (to supply a caliper).}

\item{pScores}{(optional) character, giving the variable name for the fitted
propensity score.}

\item{ignore}{(optional) character vector of variable names that should be
ignored when constructing the internal matching. NULL by default.}

\item{maxUnMatched}{(optional) numeric, the maximum proportion of unmatched
units that can be accepted; default is 0.25.}

\item{caliperOption}{(optional) numeric, the propensity score caliper value
in standard deviations of the estimated propensity scores; default is NULL,
which is no caliper.}

\item{toleranceOption}{(optional) numeric, tolerance of close match distance;
default is 1e-2.}

\item{maxIter}{(optional) integer, maximum number of iterations to use in
searching for penalty combinations that improve the matching; default is 0.}

\item{rho.max.f}{(optional) numeric, the scaling factor used in proposal for
rhos; default is 10.}
}
\value{
a named list whose elements are:
\itemize{
\item "rhoList": list of penalty combinations for each match
\item "matchList": list of matches indexed by number
\item "treatmentCol": character of treatment variable
\item "covs": character vector of names of the variables used for calculating
within-pair distance
\item "exactCovs": character vector of names of variables that we want
exact or close match on
\item "idMapping": numeric vector of row indices for each observation in the
sorted data frame for internal use
\item "stats": data frame of important statistics (total variation distance)
for variable on which marginal balance is measured
\item "b.var": character, name of variable on which marginal balance is
measured
\item "dataTable": data frame sorted by treatment value
\item "t": a treatment vector
\item "df": the original dataframe input by the user
\item "pair_cost1": list of pair-wise distance sum using the first distance
measure
\item "pair_cost2": list of pair-wise distance sum using the second distance
measure (left NULL since only one distance measure is used here).
\item "version": (for internal use) the version of the matching function
called; "Basic" indicates the matching comes from distBalMatch and
"Advanced" from twoDistMatch.
\item "fPair": a vector of values for the first objective function; it
corresponds to the pair-wise distance sum according to the first distance
measure.
\item "fExclude": a vector of values for the second objective function; it
corresponds to the number of treated units being unmatched.
\item "fMarginal": a vector of values for the third objective function; it
corresponds to the marginal balanced distance for the specified variable(s).
}
}
\description{
Explores tradeoffs among three important objective functions in
an optimal matching problem: the sum of covariate distances within matched
pairs, the number of treated units included in the match, and the marginal
imbalance on pre-specified covariates (in total variation distance).
}
\details{
Matched designs generated by this function are Pareto optimal for
the three objective functions.  The degree of relative emphasis among the
three objectives in any specific solution is controlled by the penalties,
denoted by Greek letter rho. Larger values of \code{rhoExclude} correspond to
increased emphasis on retaining treated units (all else being equal), while
larger values of \code{rhoBalance} correspond to increased emphasis on marginal
balance. Additional details:
\itemize{
\item Users may either specify their own distance
matrix via the \code{distMatrix} argument or ask the function to create a
Mahalanobis distance matrix internally on a set of covariates specified by
the \code{distList} argument; if neither argument is specified an error will
result.  User-specified distance matrices should have row count equal to
the number of treated units and column count equal to the number of
controls.
\item If the \code{caliperOption} argument is specified, a propensity
score caliper will be imposed, forbidding matches between units more than a
fixed distance apart on the propensity score.  The caliper will be based
either on a user-fit propensity score, identified in the input data frame by
argument \code{pScores}, or on an internally fit propensity score based on
logistic regression on the variables named in \code{propensityCols}.  If
\code{caliperOption} is non-NULL and neither of the other two arguments is
specified, an error will result.
\item \code{toleranceOption} controls the precision at which
the objective functions are evaluated. When matching problems are especially
large or complex it may be necessary to increase \code{toleranceOption} in order
to prevent integer overflows in the underlying network flow solver;
generally this will be suggested in appropriate warning messages.
\item While
by default tradeoffs are only assessed at penalty combinations provided by
the user, the user may ask for the algorithm to search over additional
penalty values in order to identify additional Pareto optimal solutions.
\code{rho.max.f} is a multiplier applied to initial penalty values to discover
new solutions, and setting it larger leads to wider exploration; similarly,
\code{maxIter} controls how long the exploration routine runs, with larger
values leading to more exploration.
}
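
One way to read the penalties, offered as an interpretive sketch rather than
a description of the internal implementation: assuming each match minimizes
a penalized sum of the three objectives, a given penalty pair corresponds to
\deqn{f_{Pair} + \rho_{Exclude} \, f_{Exclude} + \rho_{Balance} \, f_{Marginal}}{fPair + rhoExclude * fExclude + rhoBalance * fMarginal}
where the three terms mirror the \code{fPair}, \code{fExclude} and
\code{fMarginal} elements of the returned list, and the two weights mirror
the \code{rhoExclude} and \code{rhoBalance} arguments. Under this assumed
form, raising \code{rhoExclude} makes each unmatched treated unit more
costly relative to within-pair distance, and raising \code{rhoBalance} does
the same for marginal imbalance.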
}
\examples{
data("lalonde", package="cobalt")
psCols <- c("age", "educ", "married", "nodegree")
treatVal <- "treat"
responseVal <- "re78"
pairDistVal <- c("age", "married", "educ", "nodegree")
exactVal <- c("educ")
myBalVal <- c("race")
r1s <- c(0.1, 0.3, 0.5, 0.7, 0.9, 1, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7)
r2s <- c(0.01)
matchResult <- distBalMatch(
  df = lalonde, treatCol = treatVal, myBalCol = myBalVal,
  rhoExclude = r1s, rhoBalance = r2s,
  distList = pairDistVal, exactlist = exactVal,
  propensityCols = psCols, ignore = c(responseVal), maxUnMatched = 0.1,
  caliperOption = NULL, toleranceOption = 1e-1, maxIter = 0, rho.max.f = 10
)
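
# A minimal sketch of building a user-supplied distance matrix of the shape
# described in Details: one row per treated unit, one column per control.
# Everything below (toy data, variable names) is hypothetical and uses base R
# only; how rows and columns must be ordered relative to df is not stated on
# this page, so check that before passing such a matrix as distMatrix.
set.seed(1)
toy <- data.frame(treat = c(1, 1, 0, 0, 0), x1 = rnorm(5), x2 = rnorm(5))
X <- as.matrix(toy[, c("x1", "x2")])
S <- cov(X)  # pooled covariance of the toy covariates
treated <- X[toy$treat == 1, , drop = FALSE]
controls <- X[toy$treat == 0, , drop = FALSE]
# stats::mahalanobis() returns squared distances, hence the sqrt();
# apply() stacks results column-wise, so transpose to get treated in rows
toyDist <- t(apply(treated, 1, function(tr) sqrt(mahalanobis(controls, tr, S))))
dim(toyDist)  # rows = treated units, columns = controls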
}
\seealso{
Other main matching function: 
\code{\link{twoDistMatch}()}
}
\concept{main matching function}
