--- title: "Working with HDF5-Backed Matrices in BigDataStatMeth" author: - name: Dolors Pelegri-Siso affiliation: - &uab Universitat Autonoma de Barcelona (UAB) - &isglobal ISGlobal, Centre for Research in Environmental Epidemiology (ISGlobal) - &brge Bioinformatics Research Group in Epidemiology (BRGE) email: dolors.pelegri@isglobal.org - name: Juan R. Gonzalez affiliation: - *uab - *isglobal - *brge email: juanr.gonzalez@isglobal.org date: "`r Sys.Date()`" output: BiocStyle::html_document: number_sections: yes toc: yes fig_caption: yes toc_float: yes abstract: | This vignette illustrates how to work with HDF5-backed matrices using the S3 interface provided by the BigDataStatMeth package. It covers matrix creation, data import, subsetting, algebraic and statistical operations, matrix decompositions, global options, compression, HDF5 file-space management, and resource handling. vignette: | %\VignetteIndexEntry{Working with HDF5-Backed Matrices in BigDataStatMeth} %\VignetteEncoding{UTF-8} %\VignetteEngine{knitr::rmarkdown} bibliography: references.bib --- ```{r setup, include = FALSE} library(knitr) library(BiocStyle) knitr::opts_chunk$set( collapse = TRUE, comment = "", cache = FALSE, message = FALSE, warning = FALSE, width = 80, crop = NULL ) ``` # Overview `BigDataStatMeth` provides statistical and linear algebra operations for matrices stored in HDF5 files. The package is designed for workflows in which matrices may be too large to be held entirely in memory, while still allowing users to work with familiar R functions. The examples in this vignette use small matrices so that they can be run quickly and compared with standard in-memory R results. The same interface is intended for larger HDF5-backed matrices. All files created in the examples are written to temporary locations. ```{r load-package} library(BigDataStatMeth) ``` # User-facing interface and global options The recommended user-facing interface is based on `HDF5Matrix` objects and standard R generics. Thus, HDF5-backed matrices can be manipulated using calls such as `dim()`, `[`, `%*%`, `crossprod()`, `scale()`, `cor()`, `svd()`, `qr()`, and `prcomp()`. Internally, `BigDataStatMeth` uses an R6 and C++ backend. Advanced users may interact with lower-level interfaces when needed, but this vignette focuses on the S3 interface because it allows code to remain close to ordinary R matrix workflows. The following table summarizes the main user-facing functionality available for `HDF5Matrix` objects. It is not an exhaustive list of all exported functions; rather, it highlights the standard R-style methods and high-level helpers used throughout this vignette. Additional domain-specific, lower-level, and compatibility functions are documented in their corresponding help pages. | Category | Representative calls | |:---|:---| | Core object handling | `hdf5_create_matrix()`, `hdf5_matrix()`, `dim()`, `nrow()`, `ncol()`, `is_open()`, `close()` | | HDF5 inspection and I/O | `list_datasets()`, `hdf5_import()`, `hdf5_import_multiple()`, `as.matrix()`, `as.data.frame()` | | Subsetting and assignment | `X[i, j]`, `X[i, j] <- value` | | Dimension names | `rownames()`, `colnames()`, `dimnames()` | | Element-wise arithmetic | `X + Y`, `X - Y`, `X * Y`, `X / Y` | | Matrix algebra | `%*%`, `crossprod()`, `tcrossprod()` | | Binding | `cbind()`, `rbind()` | | Aggregations | `colSums()`, `rowSums()`, `colMeans()`, `rowMeans()`, `colVars()`, `rowVars()`, `colSds()`, `rowSds()`, `colMins()`, `rowMins()`, `colMaxs()`, `rowMaxs()` | | Scalar summaries | `mean()`, `var()`, `sd()` | | Normalization and transformations | `scale()`, `sweep()` | | Correlation | `cor()` | | Matrix decompositions | `svd()`, `prcomp()`, `eigen()`, `pseudoinverse()` | | Factorizations and solvers | `qr()`, `chol()`, `solve()` | | Diagonal operations | `diag()`, `diag<-()`, `diag_op()`, `diag_scale()` | | Split, reduce, and apply | `split_dataset()`, `split()`, `reduce()`, `apply_function()` | | Resource management and options | `hdf5matrix_options()`, `show_hdf5matrix_options()`, `hdf5_close_all()` | | Specialized high-level helpers | selected `bd*` functions for utilities without a direct standard R generic, such as `bdCreate_hdf5_group()`, `bdmove_hdf5_dataset()`, and `bdWrite_hdf5_dimnames()` | Most examples in this vignette use the standard `HDF5Matrix`/S3 interface. Some high-level helper functions keep the `bd*` prefix because they provide specialized functionality that does not map directly to an existing base R generic. These functions remain part of the package API and are documented in their corresponding help pages. ## Global options Some S3 operators, such as element-wise arithmetic, follow the standard R syntax and do not expose additional arguments in the function call. Global options make it possible to configure block-wise processing, parallelization, the number of threads, and compression for such operations. ```{r options-view} old_opts <- hdf5matrix_options() old_opts ``` For example, the following settings enable parallel execution with two threads, use a fixed block size, and set the compression level used for new output datasets when not explicitly specified by a method call. ```{r options-set} hdf5matrix_options( paral = TRUE, threads = 2L, block_size = 512L, compression = 6L ) hdf5matrix_options() ``` When an individual method provides explicit arguments, those arguments should be used for local control of that operation. The global options are especially useful for generic operators where the standard R call is kept unchanged. Additional operation-specific parameters are documented in the corresponding help pages, for example `?svd.HDF5Matrix`, `?prcomp.HDF5Matrix`, `?qr.HDF5Matrix`, and `?hdf5matrix_options`. # Creating, opening, and inspecting HDF5 datasets The function `hdf5_create_matrix()` creates an HDF5 dataset and returns an `HDF5Matrix` object pointing to it. In the example below, a standard R matrix is written to an HDF5 file and then manipulated through the S3 interface. ```{r create-matrix} h5file <- tempfile(fileext = ".h5") set.seed(123) X <- matrix(rnorm(500 * 100), nrow = 500, ncol = 100) X_h5 <- hdf5_create_matrix( filename = h5file, dataset = "data/X", data = X, overwrite = TRUE ) X_h5 dim(X_h5) nrow(X_h5) ncol(X_h5) ``` The datasets stored in an HDF5 file can be listed with `list_datasets()`. Existing datasets can be reopened with `hdf5_matrix()`. ```{r open-matrix} list_datasets(h5file) X_h5_reopened <- hdf5_matrix( filename = h5file, path = "data/X" ) dim(X_h5_reopened) ``` Operations that return an `HDF5Matrix` object write their results to an HDF5 dataset. The printed object shows the file and dataset path. Some methods, such as `crossprod()` and `tcrossprod()`, allow the output group and dataset name to be specified explicitly. Other S3 operators, such as element-wise arithmetic, currently use package-defined output names. # Importing tabular data Delimited text files can be imported directly into HDF5 with `hdf5_import()`. The following example uses the small cholesterol data file distributed with the package. In the final package layout, the file should be available under `inst/extdata/colesterol.csv`. The fallback below is included only to allow this vignette to be tested against older source layouts in which the same file was stored elsewhere. ```{r import-tabular-data} # Generate reproducible example data matching the cholesterol dataset structure set.seed(4214123) n <- 1000L colesterol <- data.frame( TCholesterol = rnorm(n, mean = 195.8, sd = 30.1), Age = rnorm(n, mean = 58.4, sd = 9.9), Insulin = rnorm(n, mean = 12.5, sd = 7.4), Creatinine = rnorm(n, mean = 0.798, sd = 0.098), BUN = rnorm(n, mean = 14.6, sd = 3.4), LLDR = rnorm(n, mean = 0.981, sd = 0.450), Triglycerides= rnorm(n, mean = 136.6, sd = 55.1), HDL_C = rnorm(n, mean = 48.2, sd = 9.2), LDL_C = rnorm(n, mean = 116.4, sd = 25.5), Sex = rbinom(n, 1, prob = 0.58) ) csv_file <- tempfile(fileext = ".csv") write.csv(colesterol, csv_file, row.names = FALSE) h5_csv <- tempfile(fileext = ".h5") cholesterol_h5 <- hdf5_import( source = csv_file, filename = h5_csv, dataset = "cholesterol/data", sep = ",", header = TRUE, overwrite = TRUE ) cholesterol_h5 dim(cholesterol_h5) cholesterol_h5[1:5, 1:min(6, ncol(cholesterol_h5))] ``` # Subsetting, assignment, dimnames, and conversion to memory The `[` operator can be used to read subsets from an `HDF5Matrix` object. Only the requested rows and columns are read. ```{r subsetting} X_h5[1:5, 1:6] sub_X <- X_h5[1:20, 1:10] dim(sub_X) ``` The replacement form `[<-` writes values back to the HDF5 dataset. Changes are applied directly to the data stored on disk. ```{r assignment} X_edit <- hdf5_create_matrix( h5file, "data/X_edit", data = X[1:10, 1:6], overwrite = TRUE ) X_edit[1, 1] <- 999 X_edit[1:3, 1:3] ``` Dimension names can also be stored and retrieved with the usual R accessors. ```{r dimnames-example} DN_h5 <- hdf5_create_matrix( h5file, "data/dimnames_example", data = matrix(seq_len(12), nrow = 4, ncol = 3), overwrite = TRUE ) rownames(DN_h5) <- paste0("sample_", seq_len(nrow(DN_h5))) colnames(DN_h5) <- paste0("feature_", seq_len(ncol(DN_h5))) rownames(DN_h5) colnames(DN_h5) dimnames(DN_h5) ``` The function `as.matrix()` reads an HDF5-backed matrix into memory. It is useful for small examples or final results, but should be used with care for very large datasets. ```{r convert-to-memory} X_small <- as.matrix(X_h5[1:10, 1:5]) X_small ``` # Matrix algebra ## Element-wise arithmetic Element-wise arithmetic between `HDF5Matrix` objects can be performed using the standard arithmetic operators. The results are stored as new HDF5-backed matrices. ```{r arithmetic} set.seed(1) A <- matrix(rnorm(300 * 80), nrow = 300, ncol = 80) B <- matrix(rnorm(300 * 80), nrow = 300, ncol = 80) C <- matrix(rnorm(80 * 40), nrow = 80, ncol = 40) A_h5 <- hdf5_create_matrix( h5file, "data/A", data = A, overwrite = TRUE ) B_h5 <- hdf5_create_matrix( h5file, "data/B", data = B, overwrite = TRUE ) C_h5 <- hdf5_create_matrix( h5file, "data/C", data = C, overwrite = TRUE ) S_h5 <- A_h5 + B_h5 D_h5 <- A_h5 - B_h5 S_h5 dim(S_h5) all.equal(as.matrix(S_h5), A + B) all.equal(as.matrix(D_h5), A - B) ``` The output datasets can be inspected directly from the file. This is useful when results need to be reopened later in the workflow. ```{r arithmetic-output-location} list_datasets(h5file, group = "OUTPUT", recursive = TRUE) ``` ## Matrix multiplication Matrix multiplication uses the same `%*%` syntax as ordinary R matrices, while the computation is performed block-wise on the HDF5-backed data. ```{r multiplication} M_h5 <- A_h5 %*% C_h5 M_h5 dim(M_h5) all.equal(as.matrix(M_h5), A %*% C) ``` ## Crossproducts Crossproducts are available through the standard R calls. These methods also allow the output group and dataset name to be specified explicitly. ```{r crossproducts} XtX_h5 <- crossprod( A_h5, outgroup = "RESULTS", outdataset = "A_crossprod" ) XXt_h5 <- tcrossprod( A_h5, outgroup = "RESULTS", outdataset = "A_tcrossprod" ) XtX_h5 list_datasets(h5file, group = "RESULTS", recursive = TRUE) all.equal(as.matrix(XtX_h5), crossprod(A)) all.equal(as.matrix(XXt_h5), tcrossprod(A)) ``` ## Binding matrices The functions `cbind()` and `rbind()` can be used to combine compatible HDF5-backed matrices by columns or rows. ```{r bind-example} A1_h5 <- hdf5_create_matrix( h5file, "bind/A1", data = A[1:50, 1:10], overwrite = TRUE ) A2_h5 <- hdf5_create_matrix( h5file, "bind/A2", data = A[1:50, 11:20], overwrite = TRUE ) Cbind_h5 <- cbind( A1_h5, A2_h5, out_group = "BIND_RESULTS", out_dataset = "A1_A2_cbind", overwrite = TRUE ) Rbind_h5 <- rbind( A1_h5, A1_h5, out_group = "BIND_RESULTS", out_dataset = "A1_A1_rbind", overwrite = TRUE ) dim(Cbind_h5) dim(Rbind_h5) all.equal(as.matrix(Cbind_h5), cbind(A[1:50, 1:10], A[1:50, 11:20])) all.equal(as.matrix(Rbind_h5), rbind(A[1:50, 1:10], A[1:50, 1:10])) ``` # Statistical operations Many summary operations return standard R vectors or scalars. This is convenient when the output is small relative to the input matrix. ```{r aggregations} all.equal(colMeans(A_h5), colMeans(A)) all.equal(rowSums(A_h5), rowSums(A)) all.equal(colVars(A_h5), apply(A, 2, var)) all.equal(rowSds(A_h5), apply(A, 1, sd)) ``` The `scale()` method centers and/or scales an HDF5-backed matrix by blocks and stores the result on disk. ```{r scaling} A_scaled_h5 <- scale(A_h5) A_scaled <- scale(A) all.equal( as.matrix(A_scaled_h5), A_scaled, check.attributes = FALSE ) ``` Correlation matrices can be computed directly from an `HDF5Matrix`. ```{r correlation} Cor_h5 <- cor(A_h5) all.equal( as.matrix(Cor_h5), cor(A), tolerance = 1e-6 ) ``` Some operations can be combined with small HDF5-backed vectors. The following example subtracts column means from each column using `sweep()`. ```{r sweep-example} col_means_h5 <- hdf5_create_matrix( h5file, "stats/A_col_means", data = matrix(colMeans(A), nrow = 1), overwrite = TRUE ) A_centered_h5 <- sweep(A_h5, MARGIN = 2, STATS = col_means_h5, FUN = "-") all.equal( as.matrix(A_centered_h5), sweep(A, MARGIN = 2, STATS = colMeans(A), FUN = "-"), check.attributes = FALSE ) ``` # Matrix decompositions `BigDataStatMeth` implements several matrix decompositions for HDF5-backed matrices. The examples below use small matrices to validate results against standard R computations. ## Singular value decomposition The SVD method can center and scale the input before decomposition. The returned singular values are stored in memory, while singular vectors are stored as HDF5-backed matrices. ```{r svd-example} set.seed(123) X_svd <- matrix(rnorm(120 * 300), nrow = 120, ncol = 300) X_svd_h5 <- hdf5_create_matrix( h5file, "decomp/X_svd", data = X_svd, overwrite = TRUE ) svd_h5 <- svd( X_svd_h5, nu = 10, nv = 10, center = TRUE, scale = TRUE, overwrite = TRUE ) head(svd_h5$d) dim(svd_h5$u) dim(svd_h5$v) ``` The first singular values can be compared with the corresponding in-memory R calculation. ```{r svd-validation} svd_r <- svd(scale(X_svd), nu = 10, nv = 10) all.equal( svd_h5$d[1:10], svd_r$d[1:10], tolerance = 1e-6 ) ``` For large matrices, `svd()` can use a block-wise strategy. The parameter `k` controls the number of local SVDs per incremental level, whereas `q` controls the number of incremental levels used to combine intermediate decompositions. Larger values may reduce the size of each local problem, but they can also increase overhead. Therefore, these parameters should be selected according to the matrix dimensions and the available hardware. The argument `threads` controls the number of threads used by parallel parts of the computation. The arguments `nu` and `nv` follow the same idea as in `base::svd()`: `nu` controls how many left singular vectors are returned, and `nv` controls how many right singular vectors are returned. Requesting only the leading components is useful when the complete set of singular vectors is not needed. ```{r block-svd-example} svd_blk_h5 <- svd( X_svd_h5, nu = 5, nv = 5, k = 4, q = 1, threads = 2, overwrite = TRUE ) head(svd_blk_h5$d) dim(svd_blk_h5$u) dim(svd_blk_h5$v) ``` ## Principal component analysis The `prcomp()` method provides a PCA interface for HDF5-backed matrices. The result includes standard PCA summaries together with HDF5-backed matrices for quantities such as scores and loadings. The argument `ncomponents` controls the number of principal components computed; a value of `0` requests all available components. ```{r pca-example} set.seed(124) n <- 180 p <- 40 group <- rep(c("Group 1", "Group 2", "Group 3"), each = n / 3) X_pca <- matrix(rnorm(n * p), nrow = n, ncol = p) X_pca[group == "Group 2", 1:8] <- X_pca[group == "Group 2", 1:8] + 1.5 X_pca[group == "Group 3", 9:16] <- X_pca[group == "Group 3", 9:16] - 1.5 X_pca_h5 <- hdf5_create_matrix( h5file, "decomp/X_pca", data = X_pca, overwrite = TRUE ) pca_h5 <- prcomp( X_pca_h5, center = TRUE, scale. = TRUE, ncomponents = 5, overwrite = TRUE ) pca_h5 head(pca_h5$sdev) ``` The PCA scores are stored in the `x` component of the returned object. When `x` is an HDF5-backed matrix, the required columns can be subsetted and converted to memory for visualization. Here the synthetic groups are used only to make the low-dimensional structure visible. ```{r pca-plot, fig.width = 7, fig.height = 5, fig.cap = "PCA scores computed from an HDF5-backed matrix."} class(pca_h5$x) dim(pca_h5$x) pca_scores <- as.matrix(pca_h5$x[, 1:2]) pca_df <- data.frame( PC1 = pca_scores[, 1], PC2 = pca_scores[, 2], group = group ) if (requireNamespace("ggplot2", quietly = TRUE)) { ggplot2::ggplot(pca_df, ggplot2::aes(PC1, PC2, colour = group)) + ggplot2::geom_point(size = 2.4, alpha = 0.85) + ggplot2::stat_ellipse(linewidth = 0.6, alpha = 0.7) + ggplot2::theme_minimal(base_size = 12) + ggplot2::theme( legend.position = "top", panel.grid.minor = ggplot2::element_blank() ) + ggplot2::labs( title = "PCA of an HDF5-backed matrix", subtitle = "Scores returned by prcomp.HDF5Matrix()", x = "PC1", y = "PC2", colour = "Synthetic group" ) } else { plot( pca_df$PC1, pca_df$PC2, pch = 19, xlab = "PC1", ylab = "PC2", main = "PCA of an HDF5-backed matrix" ) } ``` ## QR decomposition The QR decomposition returns HDF5-backed `Q` and `R` matrices. The argument `thin` controls whether a reduced decomposition is returned. With `thin = TRUE`, `Q` has only the columns needed to reconstruct the input matrix, which is usually preferable for tall matrices because it avoids storing unnecessary columns. With `thin = FALSE`, a full `Q` matrix is returned. Rather than comparing the signs of individual columns, the example validates the factorization and the orthogonality of `Q`. ```{r qr-example} QR_h5 <- qr(A_h5, thin = TRUE, overwrite = TRUE) Q <- as.matrix(QR_h5$Q) R <- as.matrix(QR_h5$R) all.equal(Q %*% R, A, tolerance = 1e-6) all.equal(crossprod(Q), diag(ncol(Q)), tolerance = 1e-6) ``` For tall-skinny matrices, the QR method can use the TSQR path. The `method = "auto"` setting selects the backend according to the matrix shape, while `method = "tsqr"` requests the tall-skinny algorithm explicitly. ```{r tsqr-example} X_tsqr <- matrix(rnorm(600 * 30), nrow = 600, ncol = 30) X_tsqr_h5 <- hdf5_create_matrix( h5file, "decomp/X_tsqr", data = X_tsqr, overwrite = TRUE ) QR_tsqr_h5 <- qr( X_tsqr_h5, thin = TRUE, method = "tsqr", threads = 2L, overwrite = TRUE ) dim(QR_tsqr_h5$Q) dim(QR_tsqr_h5$R) ``` ## Cholesky decomposition and inverse For symmetric positive-definite matrices, `chol()` and `solve()` can be used directly on `HDF5Matrix` objects. ```{r chol-solve-example} set.seed(321) Z <- matrix(rnorm(200 * 50), nrow = 200, ncol = 50) SPD <- crossprod(Z) + diag(1e-6, 50) SPD_h5 <- hdf5_create_matrix( h5file, "decomp/SPD", data = SPD, overwrite = TRUE ) Ch_h5 <- chol(SPD_h5, overwrite = TRUE) Inv_h5 <- solve(SPD_h5, overwrite = TRUE) Ch <- as.matrix(Ch_h5) chol_error <- min( max(abs(crossprod(Ch) - SPD)), max(abs(tcrossprod(Ch) - SPD)) ) chol_error < 1e-6 all.equal(as.matrix(Inv_h5), solve(SPD), tolerance = 1e-6) ``` ## Eigen decomposition and pseudoinverse Additional decompositions are available through S3 methods. For example, `eigen()` can be applied to symmetric HDF5-backed matrices, and `pseudoinverse()` computes the Moore--Penrose pseudoinverse. ```{r eigen-pseudoinverse} Eig_h5 <- eigen( SPD_h5, symmetric = TRUE, k = 5L, overwrite = TRUE ) head(Eig_h5$values) dim(Eig_h5$vectors) Pinv_h5 <- pseudoinverse( A_h5, overwrite = TRUE ) dim(Pinv_h5) ``` # Compression and HDF5 file space management ## Compression HDF5 datasets can be stored with different compression levels. Lower compression levels generally reduce writing time but use more disk space. Higher compression levels may reduce file size but can increase the time needed to write and read data. The default value in `BigDataStatMeth` is `compression = 6`, which provides a balanced setting for many workflows. The following small example illustrates the trade-off between writing time and file size. It is not intended as a formal benchmark. ```{r compression-example} set.seed(123) X_cmp <- round(matrix(rnorm(2500 * 250), nrow = 2500, ncol = 250), 2) f0 <- tempfile(fileext = ".h5") f6 <- tempfile(fileext = ".h5") t0 <- system.time( hdf5_create_matrix( f0, "data/X", data = X_cmp, compression = 0, overwrite = TRUE ) ) t6 <- system.time( hdf5_create_matrix( f6, "data/X", data = X_cmp, compression = 6, overwrite = TRUE ) ) data.frame( compression = c(0, 6), elapsed = c(t0[["elapsed"]], t6[["elapsed"]]), file_size_MB = round(file.info(c(f0, f6))$size / 1024^2, 3) ) ``` Formal performance comparisons should be run on the target hardware and with representative datasets. ## HDF5 file space management When datasets are deleted or overwritten in an HDF5 file, the physical file size on disk does not always decrease immediately. `BigDataStatMeth` creates HDF5 files using a file space management strategy that allows free space inside the file to be tracked and reused by subsequent writes. Thus, space released by removed datasets can be reused within the same file instead of unnecessarily increasing the file size after repeated operations. This behavior helps control file growth during workflows that create, overwrite, or remove intermediate datasets. Nevertheless, reducing the physical size of an existing HDF5 file may require repacking the file. The HDF Group provides external command-line tools for this purpose. In particular, the [`h5repack` user guide](https://support.hdfgroup.org/documentation/hdf5/latest/_h5_t_o_o_l__r_p__u_g.html) describes how to rewrite an HDF5 file into a new compacted or reconfigured file. This is an external HDF5 utility and is not executed from this vignette. # Managing HDF5 resources HDF5-backed objects keep file handles open while they are in use. In ordinary interactive workflows these handles are released when objects are garbage-collected. Users can also close objects explicitly. ```{r close-single-object} close(X_h5_reopened) ``` The function `hdf5_close_all()` closes the HDF5 handles currently tracked by the package, including open file handles. After this call, HDF5-backed objects that pointed to those handles should be reopened before further use. Calling `gc()` may help trigger R finalizers for objects that are no longer referenced, which can be useful when repeatedly re-running code during development. ```{r restore-options-and-close-all} hdf5matrix_options( paral = old_opts$paral, block_size = old_opts$block_size, threads = old_opts$threads, compression = old_opts$compression ) hdf5_close_all() gc() ``` # Extending BigDataStatMeth through C++ The examples above use the standard R interface, which is the recommended entry point for most users. However, `BigDataStatMeth` also provides a C++ infrastructure for developers who need to implement new scalable methods. Internally, the package defines C++ classes for managing HDF5 files, groups, and datasets, together with block-wise routines for linear algebra and statistical operations. These are the same computational building blocks used by the R/S3 interface. As a result, developers can reuse the package infrastructure from Rcpp-based code instead of implementing HDF5 file management, block iteration, and numerical operations from scratch. This makes the package useful not only as an end-user R package, but also as a development framework for extending HDF5-backed statistical computing in C++. Since the purpose of this vignette is to introduce the standard R interface, the C++ API is not discussed in detail here. Nevertheless, it is an important part of the package architecture and provides the computational infrastructure behind the R/S3 methods shown above. # Session information ```{r session-info} sessionInfo() ```