
<!-- README.md is generated from README.Rmd. Please edit that file -->

# highDmean

<!-- badges: start -->

<!-- badges: end -->

The highDmean package implements the high-dimensional two-sample test
proposed by Zhang and Wang (2020), “Result consistency of high
dimensional two-sample tests applied to gene ontology terms with gene
sets”. The classical test for equality of two multivariate mean vectors
is Hotelling’s T-square test. When the dimension exceeds the sample
sizes, Hotelling’s test breaks down because the sample covariance matrix
is singular. The test of Zhang and Wang (2020), available in this
package as `zwl_test()`, remains applicable in this setting and provides
a reliable and powerful alternative. The package also implements the
test proposed by Srivastava, Katayama, and Kano (2013), “A two sample
test in high dimensional data”, available as `SKK_test()`.
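
The singularity problem can be seen directly in base R. The sketch below
does not use highDmean; the sample sizes and dimension are illustrative
values chosen to match the example further down, and the data are
standard normal draws picked only for demonstration.

``` r
set.seed(1)                       # arbitrary seed, for reproducibility only
n <- 45; m <- 60; p <- 300        # illustrative sizes with p > n + m - 2

X <- matrix(rnorm(n * p), nrow = n)   # n by p sample from population 1
Y <- matrix(rnorm(m * p), nrow = m)   # m by p sample from population 2

# Pooled sample covariance matrix (p by p), as used by Hotelling's test
S_pooled <- ((n - 1) * cov(X) + (m - 1) * cov(Y)) / (n + m - 2)

# The rank cannot exceed n + m - 2, so the matrix is not invertible here
rank_S <- qr(S_pooled)$rank
rank_S < p
```

Because Hotelling’s statistic requires the inverse of this matrix, it is
undefined whenever the rank falls short of `p`.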

## Installation

You can install the released version of highDmean from
[CRAN](https://CRAN.R-project.org) with:

``` r
install.packages("highDmean")
```

## Example

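This example simulates one data set with n = 45 and m = 60 observations
on p = 300 components, with both mean vectors equal to zero, and applies
`zwl_test()` to it. The data are random, so the exact values differ
between runs: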

``` r
library(highDmean)
data <- buildData(n = 45, m = 60, p = 300,
                  muX = rep(0, 300), muY = rep(0, 300),
                  dep = 'IND', S = 1, innov = rnorm)
zwl_test(data[[1]]$X, data[[1]]$Y, order = 2)
#> $statistic
#> [1] 0.7534648
#> 
#> $pvalue
#> [1] 0.4511707
#> 
#> $Tn
#> [1] 1.08859
#> 
#> $var
#> [1] 0.007897337
```

## Main functions

The functions `zwl_test()` and `SKK_test()` each take an n by p data
matrix from the first population and an m by p data matrix from the
second population, and return the test statistic and p-value for the
null hypothesis of equal means.
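
To see how the result changes when the null hypothesis is false, the
sketch below reuses the `buildData()` and `zwl_test()` calls from the
example above and changes only `muY`. The shift of 0.5 in every
component is an arbitrary value chosen for illustration, and the names
`data_alt` and `res_alt` are hypothetical.

``` r
library(highDmean)

# Same settings as the example above, except that every component of
# the second mean vector is shifted by 0.5 (an arbitrary choice).
data_alt <- buildData(n = 45, m = 60, p = 300,
                      muX = rep(0, 300), muY = rep(0.5, 300),
                      dep = 'IND', S = 1, innov = rnorm)

# Apply the test to the single simulated data set
res_alt <- zwl_test(data_alt[[1]]$X, data_alt[[1]]$Y, order = 2)
res_alt$pvalue
```

With a shift of this size one would expect a small p-value, though the
exact value depends on the random draw.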

The `buildData()` function simulates high-dimensional data in the
two-population setting with specified sample sizes, number of
components, covariance structure, etc. The functions `zwl_sim()` and
`SKK_sim()` return test statistic values and p-values for lists of
simulated data sets generated by `buildData()`.
