Markov Chain Construction and Probability Calculation
markovProbability.RdThis function preprocesses data, constructs a Markov chain, and calculates transition probabilities based on pseudotime information.
Usage
markovProbability(
  milo,
  diffusionmap,
  terminal_state = NULL,
  root_cell,
  knn = 30L,
  diffusiontime = NULL,
  pseudotime_key = "pseudotime",
  scale_components = TRUE,
  num_waypoints = 500,
  n_eigs = NULL,
  verbose = TRUE,
  use_RANN = TRUE
)Arguments
- milo
 A
MiloorSingleCellExperimentobject. This object should have pseudotime stored incolData, which will be used to calculate probabilities. If pseudotime is available inmilo, it takes precedence over the value provided through thediffusiontimeparameter.- diffusionmap
 A
DiffusionMapobject corresponding to themiloobject. Used for Markov chain construction.- terminal_state
 Integer. The index of the terminal state in the Markov chain.
- root_cell
 Integer. The index of the root state in the Markov chain.
- knn
 Integer. The number of nearest neighbors for graph construction. Default is
30L.- diffusiontime
 Numeric vector. If pseudotime is not stored in
milo, this parameter can be used to provide pseudotime values to the function.- pseudotime_key
 Character. The name of the column in
colDatathat contains the inferred pseudotime.- scale_components
 Logical. If
TRUE, the components will be scaled before constructing the Markov chain. Default isFALSE.- num_waypoints
 Integer. The number of waypoints to sample when constructing the Markov chain. Default is
500L.- n_eigs
 integer, default is NULL. Number of eigen vectors to use.
If is not specified, the number of eigen vectors will be determined using the eigen gap.
- verbose
 Logical. If
TRUE, print progress. Default isTRUE.- use_RANN
 parameter to make user choose whether to use RANN to construct Markov chain, or keep using bluster
Examples
data(sce_vdj)
# downsample to first 2000 cells
sce_vdj <- sce_vdj[, 1:2000]
sce_vdj <- setupVdjPseudobulk(sce_vdj,
    already.productive = FALSE,
    allowed_chain_status = c("Single pair", "Extra pair")
)
#> Checking productivity from productive_abT_VDJ, productive_abT_VJ ...
#> 1461 of cells filtered
#> checking allowed chains...
#> 3 of cells filtered
#> VDJ data extraction begin:
#> extract_cols not specified, automatically generate colnames for extraction.
#> Extract main TCR from v_call_abT_VJ, j_call_abT_VJ, v_call_abT_VDJ, d_call_abT_VDJ, j_call_abT_VDJ ...
#> Complete.
#> Filtering cells from v_call_abT_VJ_main, j_call_abT_VJ_main, v_call_abT_VDJ_main, j_call_abT_VDJ_main ...
#> 11 of cells filtered
#> 525 of cells remain.
# Build Milo Object
set.seed(100)
milo_object <- miloR::Milo(sce_vdj)
milo_object <- miloR::buildGraph(milo_object,
    k = 50, d = 20,
    reduced.dim = "X_scvi"
)
#> Constructing kNN graph with k:50
milo_object <- miloR::makeNhoods(milo_object,
    reduced_dims = "X_scvi",
    d = 20
)
#> Checking valid object
#> Running refined sampling with reduced_dim
# Construct Pseudobulked VDJ Feature Space
pb.milo <- vdjPseudobulk(milo_object, col_to_take = "anno_lvl_2_final_clean")
pb.milo <- scater::runPCA(pb.milo, assay.type = "Feature_space")
#> Warning: more singular values/vectors requested than available
#> Warning: You're computing too large a percentage of total singular values, use a standard svd instead.
# Define root and branch tips
pca <- t(as.matrix(SingleCellExperiment::reducedDim(pb.milo, type = "PCA")))
branch.tips <- c(which.min(pca[, 2]), which.max(pca[, 2]))
names(branch.tips) <- c("CD8+T", "CD4+T")
root <- which.min(pca[, 1])
# Construct Diffusion Map
dm <- destiny::DiffusionMap(t(pca), n_pcs = 10, n_eigs = 5)
#> 'as(<dsCMatrix>, "dgTMatrix")' is deprecated.
#> Use 'as(as(., "generalMatrix"), "TsparseMatrix")' instead.
#> See help("Deprecated") and help("Matrix-deprecated").
dif.pse <- destiny::DPT(dm, tips = c(root, branch.tips), w_width = 0.1)
# Markov Chain Construction
pb.milo <- markovProbability(
    milo = pb.milo,
    diffusionmap = dm,
    diffusiontime = dif.pse[[paste0("DPT", root)]],
    terminal_state = branch.tips,
    root_cell = root,
    pseudotime_key = "pseudotime"
)
#> Sampling and flocking waypoints...
#> Markov chain construction...
#> Computing fundamental matrix and absorption probabilities...
#> Project probabilites from waypoints to each pseudobulk...
#> Complete.