Generate Pseudobulk V(D)J Feature Space
This function creates a pseudobulk V(D)J feature space from single-cell data,
aggregating V(D)J information into pseudobulk groups. It supports input as
either a Milo object or a SingleCellExperiment object.
Arguments
- milo
A Milo or SingleCellExperiment object containing V(D)J data.
- pbs
Optional. A binary matrix with cells as rows and pseudobulk groups as columns.
If milo is a Milo object, this parameter is not required.
If milo is a SingleCellExperiment object, either pbs or col_to_bulk must be provided.
- col_to_bulk
Optional character or character vector. Specifies the colData column(s) used to generate pbs. If multiple columns are provided, they will be combined. Default is NULL.
If milo is a Milo object, this parameter is not required.
If milo is a SingleCellExperiment object, either pbs or col_to_bulk must be provided.
- extract_cols
Character vector. Specifies column names where V(D)J information is stored. Default is c('v_call_abT_VDJ_main', 'j_call_abT_VDJ_main', 'v_call_abT_VJ_main', 'j_call_abT_VJ_main').
- mode_option
Character. Specifies the mode for extracting V(D)J genes. Must be one of c('B', 'abT', 'gdT'). Default is 'abT'.
Note: This parameter is considered only when extract_cols = NULL.
If mode_option is NULL, uses column names such as v_call_VDJ instead of v_call_abT_VDJ.
- col_to_take
Optional character or character vector. Specifies the colData columns of milo for which the most common value is identified in each pseudobulk. Default is NULL.
- normalise
Logical. If TRUE, scales the counts of each V(D)J gene group so that they sum to 1 for each pseudobulk. Default is TRUE.
- renormalize
Logical. If TRUE, rescales the counts of each V(D)J gene group to 1 for each pseudobulk after removing 'missing' calls. Useful when setupVdjPseudobulk() was run with remove_missing = FALSE. Default is FALSE.
- min_count
Integer. Sets pseudobulk counts in V(D)J gene groups with fewer than this many non-missing calls to 0. Default is 1.
- verbose
Logical. If TRUE, prints messages and warnings. Default is TRUE.
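To make the pbs, col_to_bulk and col_to_take arguments more concrete, the base R sketch below builds a binary cells-by-pseudobulk matrix from two metadata columns and then picks the most common annotation in each pseudobulk. The data frame cell_meta, its column names, and the helper most_common() are hypothetical stand-ins for colData. This only illustrates the behaviour described above and is not the package's internal code.

```r
# Hypothetical per-cell metadata standing in for colData(milo)
cell_meta <- data.frame(
    sample = c("s1", "s1", "s2", "s2", "s2"),
    celltype = c("CD4", "CD8", "CD4", "CD4", "CD8"),
    anno = c("naive", "effector", "naive", "naive", "effector"),
    row.names = paste0("cell", 1:5)
)

# Combine the chosen columns into a single grouping label per cell
# (the "_" separator is an assumption made for this sketch)
group <- interaction(cell_meta$sample, cell_meta$celltype,
    sep = "_", drop = TRUE
)

# Binary membership matrix: cells as rows, pseudobulk groups as columns
pbs <- sapply(levels(group), function(g) as.integer(group == g))
rownames(pbs) <- rownames(cell_meta)

# Most common value of a metadata column within each pseudobulk
most_common <- function(x) names(which.max(table(x)))
top_anno <- apply(pbs, 2, function(member) {
    most_common(cell_meta$anno[member == 1])
})
```

Here each cell belongs to exactly one pseudobulk, so every row of pbs sums to 1. With a Milo object the neighbourhoods can overlap, so a cell may belong to several pseudobulks.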
Details
This function aggregates V(D)J data into pseudobulk groups based on the following logic:
Input Requirements:
- If milo is a Milo object, neither pbs nor col_to_bulk is required.
- If milo is a SingleCellExperiment object, the user must provide either pbs or col_to_bulk.
Normalization:
- When normalise = TRUE, scales V(D)J counts to 1 for each pseudobulk group.
- When renormalize = TRUE, rescales the counts after removing 'missing' calls.
Mode Selection:
- If extract_cols = NULL, the function relies on mode_option to determine which V(D)J columns to extract.
Filtering:
- Uses min_count to filter pseudobulks with insufficient counts for V(D)J groups.
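The normalisation and filtering steps can be sketched in base R as follows. The toy count matrix, the grouping vector gene_group, and the helper normalise_by_group() are hypothetical. The sketch assumes that scaling "to 1" means the values of each gene group sum to 1 within a pseudobulk, and that min_count zeroes the whole gene group for that pseudobulk. The package's own implementation may differ in detail.

```r
# Hypothetical pseudobulk-by-gene count matrix
# (rows: pseudobulks, columns: V(D)J genes)
counts <- matrix(
    c(
        3, 1, 0, 4,
        0, 0, 2, 2
    ),
    nrow = 2, byrow = TRUE,
    dimnames = list(
        c("pb1", "pb2"),
        c("TRBV1", "TRBV2", "TRBJ1", "TRBJ2")
    )
)
# Gene group of each column (for example, one group per extracted column)
gene_group <- c("v_call", "v_call", "j_call", "j_call")

normalise_by_group <- function(counts, gene_group, min_count = 1) {
    out <- counts
    for (g in unique(gene_group)) {
        cols <- gene_group == g
        totals <- rowSums(counts[, cols, drop = FALSE])
        # Gene groups with fewer than min_count calls are set to 0
        keep <- totals >= min_count
        out[!keep, cols] <- 0
        # Remaining gene groups are divided by their total, so they sum to 1
        out[keep, cols] <- counts[keep, cols, drop = FALSE] / totals[keep]
    }
    out
}

freq <- normalise_by_group(counts, gene_group, min_count = 1)
```

In this toy input, pb2 has no V gene calls, so its V gene group stays at 0 and is never divided by a zero total.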
Examples
data(sce_vdj)
sce_vdj <- setupVdjPseudobulk(sce_vdj,
    already.productive = FALSE,
    allowed_chain_status = c("Single pair", "Extra pair")
)
#> Checking productivity from productive_abT_VDJ, productive_abT_VJ ...
#> 7279 of cells filtered
#> checking allowed chains...
#> 12 of cells filtered
#> VDJ data extraction begin:
#> extract_cols not specified, automatically generate colnames for extraction.
#> Extract main TCR from v_call_abT_VJ, j_call_abT_VJ, v_call_abT_VDJ, d_call_abT_VDJ, j_call_abT_VDJ ...
#> Complete.
#> Filtering cells from v_call_abT_VJ_main, j_call_abT_VJ_main, v_call_abT_VDJ_main, j_call_abT_VDJ_main ...
#> 63 of cells filtered
#> 2646 of cells remain.
# Build Milo Object
milo_object <- miloR::Milo(sce_vdj)
milo_object <- miloR::buildGraph(milo_object,
    k = 50, d = 20,
    reduced.dim = "X_scvi"
)
#> Constructing kNN graph with k:50
milo_object <- miloR::makeNhoods(milo_object,
    reduced_dims = "X_scvi",
    d = 20
)
#> Checking valid object
#> Running refined sampling with reduced_dim
# Construct pseudobulked VDJ feature space
pb.milo <- vdjPseudobulk(milo_object, col_to_take = "anno_lvl_2_final_clean")