Perform geodesic pairing using ATAC/RNA co-embedding components

Usage

cell_pairing(
  ATACpcs,
  RNApcs,
  mode = "geodesic",
  tol = 1e-04,
  search_range = 0.2,
  max_multimatch = 5,
  umap_knn_k = 30,
  min_subgraph_size = 50,
  cca_umap_df = NULL,
  seed = 123
)

Arguments

ATACpcs

combined co-embedding components matrix of the ATAC cells. Must have valid ATAC cell barcode names as the rownames.

RNApcs

combined co-embedding components matrix of the RNA cells. Must have valid RNA cell barcode names as the rownames (distinct from the ATAC cell barcodes) and the same number of components (columns) as the `ATACpcs` matrix.

mode

character specifying the pairing mode. Must be one of 'geodesic' (default) or 'greedy'.

tol

Tolerance used in the matching step. Default is 1e-04. See `fullmatch` in the optmatch package for more information on this parameter.

search_range

Determines the size of the kNN search window for allowed pairings: size of kNN = search_range * total number of cells. Default is 0.2. Increasing this can increase run time, since more candidate pairs have to be evaluated.

max_multimatch

Maximum number of cells in the larger dataset that are allowed to be matched to each cell in the smaller dataset (after up-sampling). Default is 5. This only serves to allow a solvable optmatch solution given the geodesic constraints.

umap_knn_k

Number of geodesic ATAC x RNA neighbors to consider in the co-embedding graph. Default is 30.

min_subgraph_size

Minimum number of cells (ATAC/RNA each) needed for pairing in a given subgraph. Subgraphs with fewer cells than this are skipped. Default is 50 cells. It is useful to set this to the smallest cell cluster size you might have in the data.

cca_umap_df

optional UMAP of all ATAC+RNA cells, used to visualize the cells from each chunk being paired, colored by subgraph. Default is NULL.

seed

numeric specifying the seed to use for the subgraph UMAP and for down-sampling in case subgraphs are imbalanced. Default is 123.
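
The requirements in the argument descriptions above can be checked before calling `cell_pairing`. The following is a minimal base R sketch, not part of the package: the toy matrices, the barcode names, the helper `check_pairing_inputs`, and the per-subgraph counts are all hypothetical. The window-size and subgraph calculations only restate the descriptions above. How the function rounds the window size, and what exactly it counts as the "total number of cells", is not stated on this page, so an arbitrary cell count is used here.

```r
# Toy co-embedding components: rows are cells, columns are components.
# Barcode names and dimensions are made up for illustration.
set.seed(123)
n_comp <- 10
ATACpcs <- matrix(rnorm(200 * n_comp), nrow = 200,
                  dimnames = list(paste0("ATAC_", seq_len(200)), NULL))
RNApcs <- matrix(rnorm(150 * n_comp), nrow = 150,
                 dimnames = list(paste0("RNA_", seq_len(150)), NULL))

# Hypothetical helper (not a package function): checks the documented
# requirements on the two input matrices. Uniqueness of barcodes within
# each matrix is an extra assumption made here.
check_pairing_inputs <- function(atac, rna) {
  stopifnot(
    is.matrix(atac), is.matrix(rna),
    !is.null(rownames(atac)), !is.null(rownames(rna)),
    !anyDuplicated(rownames(atac)), !anyDuplicated(rownames(rna)),
    # ATAC and RNA barcodes must not overlap
    length(intersect(rownames(atac), rownames(rna))) == 0,
    # both matrices need the same number of components (columns)
    ncol(atac) == ncol(rna)
  )
  invisible(TRUE)
}
inputs_ok <- check_pairing_inputs(ATACpcs, RNApcs)

# search_range * total number of cells = size of the kNN search window.
# n_cells is an arbitrary illustrative count.
search_range <- 0.2
n_cells <- 1000
knn_size <- search_range * n_cells

# Hypothetical per-subgraph cell counts. A subgraph is kept for pairing
# only if it has at least min_subgraph_size ATAC cells and RNA cells.
min_subgraph_size <- 50
subgraph_counts <- data.frame(
  subgraph = c("sg1", "sg2", "sg3"),
  n_ATAC = c(400, 60, 30),
  n_RNA = c(350, 45, 80)
)
subgraph_counts$paired <- subgraph_counts$n_ATAC >= min_subgraph_size &
  subgraph_counts$n_RNA >= min_subgraph_size
```

In this toy table only `sg1` would be paired: `sg2` has too few RNA cells and `sg3` too few ATAC cells. Lowering `min_subgraph_size` toward the smallest expected cluster size would retain more subgraphs.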

Value

a data.frame object containing ATAC and RNA cell barcodes from the resulting pairing
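
As a rough illustration of working with a pairing result, the sketch below uses a hand-made data.frame. The column names `ATAC` and `RNA` and the barcodes are assumptions, since the actual column names are not stated on this page. It also assumes that a barcode from the smaller dataset may appear in more than one row, as the `max_multimatch` description suggests.

```r
# Hypothetical pairing result; real output column names may differ.
pairs <- data.frame(
  ATAC = c("ATAC_1", "ATAC_2", "ATAC_3", "ATAC_4"),
  RNA = c("RNA_1", "RNA_1", "RNA_2", "RNA_3"),
  stringsAsFactors = FALSE
)

# Number of ATAC cells matched to each RNA cell
matches_per_rna <- table(pairs$RNA)
max_matches <- max(matches_per_rna)

# Keep one pair per RNA barcode (the first occurrence)
unique_pairs <- pairs[!duplicated(pairs$RNA), ]
```

Keeping the first occurrence is only one possible way to reduce multi-matches to one-to-one pairs; whether that is appropriate depends on the downstream analysis.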

Author

Yan Hu, Vinay Kartha