% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/jackstraw_lfa.R
\name{jackstraw_lfa}
\alias{jackstraw_lfa}
\title{Non-Parametric Jackstraw for Logistic Factor Analysis}
\usage{
jackstraw_lfa(
  dat,
  r,
  FUN = function(x) lfa::lfa(x, r),
  r1 = NULL,
  s = NULL,
  B = NULL,
  covariate = NULL,
  permute_alleles = TRUE,
  verbose = TRUE
)
}
\arguments{
\item{dat}{either a genotype matrix with \code{m} rows as variables and \code{n} columns as observations, or a \code{BEDMatrix} object (see package \code{BEDMatrix}).
A \code{BEDMatrix} object is transposed compared to a regular matrix, but this works as-is, so there is no need to modify a \code{BEDMatrix} input (see example).
A \code{BEDMatrix} input triggers a low-memory mode where permuted data are also written to and processed from disk, whereas a regular matrix input stores permutations in memory.
The tradeoff is that the \code{BEDMatrix} version typically runs considerably slower, but it enables analysis of very large data that would otherwise be impossible.}

\item{r}{the number of significant LFs.}

\item{FUN}{a function to use for LFA (by default, \code{lfa::lfa} from the \code{lfa} package).}

\item{r1}{a numeric vector of LFs of interest (implying you are not interested in all \code{r} LFs).}

\item{s}{the number of ``synthetic'' null variables. Out of \code{m} variables, \code{s} variables are independently permuted.}

\item{B}{the number of resampling iterations. There will be a total of \code{s*B} null statistics.}

\item{covariate}{a data matrix of covariates with corresponding \code{n} observations (do not include an intercept term).}

\item{permute_alleles}{If \code{TRUE} (default), alleles (rather than genotypes) are permuted, which yields synthetic null data that are closer to Binomial when the data are highly structured.
Changing to \code{FALSE} is not recommended, except for research purposes to confirm that it performs worse than the default.}

\item{verbose}{a logical specifying whether to print the computational progress.}
}
\value{
\code{jackstraw_lfa} returns a list consisting of
\item{p.value}{\code{m} p-values of association tests between variables and their LFs}
\item{obs.stat}{\code{m} observed deviances}
\item{null.stat}{\code{s*B} null deviances}
}
\description{
Test association between the observed variables and their latent variables captured by logistic factors (LFs).
}
\details{
This function uses logistic factor analysis (LFA) from Hao et al. (2016).
In particular, the deviance in logistic regression (the full model with \code{r} LFs vs. the intercept-only model) is used to assess significance.
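
As an illustration only, the deviance statistic for a single variable can be sketched in base R with \code{glm}. This is not the package's internal code: the names \code{lf_one}, \code{x_one}, \code{fit_full}, \code{fit_null} and \code{dev_stat} are hypothetical, and a single simulated factor stands in for the estimated LFs.
\preformatted{
# one simulated factor and one genotype vector (values 0, 1, 2)
set.seed(1)
lf_one <- rnorm(100)
x_one <- rbinom(100, 2, plogis(0.5 * lf_one))
# full model (intercept + factor) vs. intercept-only model,
# treating each genotype as 2 Binomial trials
fit_full <- glm(cbind(x_one, 2 - x_one) ~ lf_one, family = binomial)
fit_null <- glm(cbind(x_one, 2 - x_one) ~ 1, family = binomial)
# deviance difference: larger values indicate stronger association
dev_stat <- deviance(fit_null) - deviance(fit_full)
}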

The random outputs of the regular matrix and \code{BEDMatrix} versions are equal in distribution.
However, fixing a seed and providing the same data to both versions does not produce exactly the same outputs,
because the \code{BEDMatrix} version permutes loci in a different order by necessity.
}
\examples{
\dontrun{
## simulate genotype data from a logistic factor model:
## genotypes are drawn with rbinom, with probabilities given by the inverse logit of BL
m <- 5000; n <- 100; pi0 <- .9
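m0 <- round(m*pi0)  # number of null variables
m1 <- m - m0        # number of non-null variables
## B holds one coefficient per variable (m x 1); only the first m1 are non-zero
B <- matrix(0, nrow=m, ncol=1)
B[1:m1,] <- matrix(runif(m1, min=-.5, max=.5), nrow=m1, ncol=1)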
L <- matrix(rnorm(n), nrow=1, ncol=n)
BL <- B \%*\% L
prob <- exp(BL)/(1+exp(BL))

dat <- matrix(rbinom(m*n, 2, as.numeric(prob)), m, n)

## apply the jackstraw_lfa
out <- jackstraw_lfa(dat, r = 2)
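
## a minimal sketch of inspecting the result, assuming the call above succeeded;
## in this simulation the first m1 variables are the non-null ones,
## so their p-values should tend to be smaller than the rest
hist(out$p.value, breaks = 20)
mean(out$p.value[1:m1] < 0.05)     # fraction called significant among non-null variables
mean(out$p.value[-(1:m1)] < 0.05)  # fraction called significant among null variables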

# if you had very large genotype data in plink BED/BIM/FAM files,
# use BEDMatrix and save memory by reading from disk (at the expense of speed)
library(BEDMatrix)
dat_BM <- BEDMatrix( 'filepath' ) # assumes filepath.bed, .bim and .fam exist
# run jackstraw!
out <- jackstraw_lfa(dat_BM, r = 2)
}

}
\references{
Chung and Storey (2015) Statistical significance of variables driving systematic variation in high-dimensional data. Bioinformatics, 31(4): 545-554 \doi{10.1093/bioinformatics/btu674}
}
\seealso{
\link{jackstraw_pca} \link{jackstraw} \link{jackstraw_subspace}
}
\author{
Neo Christopher Chung \email{nchchung@gmail.com}
}
