Generate Pseudobulk V(D)J Feature Space
vdjPseudobulk.Rd
This function creates a pseudobulk V(D)J feature space from single-cell data,
aggregating V(D)J information into pseudobulk groups. It supports input as
either a Milo object or a SingleCellExperiment object.
Arguments
- milo
A Milo or SingleCellExperiment object containing V(D)J data.
- pbs
Optional. A binary matrix with cells as rows and pseudobulk groups as columns.
If milo is a Milo object, this parameter is not required.
If milo is a SingleCellExperiment object, either pbs or col_to_bulk must be provided.
- col_to_bulk
Optional character or character vector. Specifies the colData column(s) used to generate pbs. If multiple columns are provided, they will be combined. Default is NULL.
If milo is a Milo object, this parameter is not required.
If milo is a SingleCellExperiment object, either pbs or col_to_bulk must be provided.
- extract_cols
Character vector. Specifies the column names where V(D)J information is stored. Default is
c('v_call_abT_VDJ_main', 'j_call_abT_VDJ_main', 'v_call_abT_VJ_main', 'j_call_abT_VJ_main').
- mode_option
Character. Specifies the mode for extracting V(D)J genes. Must be one of c('B', 'abT', 'gdT'). Default is 'abT'.
Note: This parameter is considered only when extract_cols = NULL.
If mode_option is NULL, column names such as v_call_VDJ are used instead of v_call_abT_VDJ.
- col_to_take
Optional character or character vector. Specifies the colData columns of milo for which the most common value is identified for each pseudobulk. Default is NULL.
- normalise
Logical. If TRUE, scales the counts of each V(D)J gene group so that they sum to 1 within each pseudobulk. Default is TRUE.
- renormalize
Logical. If TRUE, rescales the counts of each V(D)J gene group to 1 for each pseudobulk after removing 'missing' calls. Useful when setupVdjPseudobulk() was run with remove_missing = FALSE. Default is FALSE.
- min_count
Integer. Sets pseudobulk counts in V(D)J gene groups with fewer than this many non-missing calls to 0. Default is 1.
- verbose
Logical. If TRUE, prints messages and warnings. Default is TRUE.
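To make the shape of pbs concrete, the following is a minimal base R sketch of how a binary cells-by-pseudobulks matrix could be derived from one or more metadata columns. The data frame cell_meta, its column names, and the underscore-joined group labels are all hypothetical stand-ins for colData; how vdjPseudobulk() combines multiple col_to_bulk columns internally is not specified here, so treat the pasting step as an assumption.

```r
# Hypothetical per-cell metadata standing in for colData of the input object.
cell_meta <- data.frame(
  cell_type = c("CD4", "CD4", "CD8", "CD8", "CD8"),
  donor     = c("d1", "d2", "d1", "d1", "d2"),
  row.names = paste0("cell", 1:5)
)

# Columns to pseudobulk by (illustrative values only).
col_to_bulk <- c("cell_type", "donor")

# Assumption: multiple columns are combined into one label per cell.
group <- do.call(paste, c(cell_meta[col_to_bulk], sep = "_"))
groups <- unique(group)

# One-hot encode: cells as rows, pseudobulk groups as columns.
pbs <- vapply(
  groups,
  function(g) as.integer(group == g),
  integer(length(group))
)
rownames(pbs) <- rownames(cell_meta)

print(pbs)
```

Each row contains exactly one 1 in this sketch because every cell belongs to a single metadata-defined group. With Milo neighbourhoods a cell may belong to several groups, so rows can contain more than one 1.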
Details
This function aggregates V(D)J data into pseudobulk groups based on the following logic:
Input Requirements:
- If milo is a Milo object, neither pbs nor col_to_bulk is required.
- If milo is a SingleCellExperiment object, the user must provide either pbs or col_to_bulk.
Normalization:
- When normalise = TRUE, scales V(D)J counts to 1 for each pseudobulk group.
- When renormalize = TRUE, rescales the counts after removing 'missing' calls.
Mode Selection:
- If extract_cols = NULL, the function relies on mode_option to determine which V(D)J columns to extract.
Filtering:
- Uses min_count to filter pseudobulks with insufficient counts for V(D)J groups.
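The filtering and normalization steps described above can be illustrated with a small base R sketch for a single V(D)J gene group. The gene calls, the pbs matrix, and the order of operations (filter first, then scale) are assumptions made for illustration and are not taken from the package's implementation.

```r
# Hypothetical V gene calls for five cells (one V(D)J gene group).
v_call <- c("TRBV1", "TRBV2", "TRBV1", "TRBV1", "TRBV2")

# Hypothetical binary matrix: cells as rows, pseudobulks as columns.
pbs <- cbind(pb1 = c(1, 1, 1, 0, 0), pb2 = c(0, 0, 0, 1, 1))

# One-hot encode the calls, then count each gene within each pseudobulk.
genes <- sort(unique(v_call))
onehot <- vapply(
  genes,
  function(g) as.numeric(v_call == g),
  numeric(length(v_call))
)
counts <- t(onehot) %*% pbs  # genes as rows, pseudobulks as columns

# min_count: zero out pseudobulks with too few calls in this gene group.
min_count <- 3
totals <- colSums(counts)    # pb1 = 3, pb2 = 2
counts[, totals < min_count] <- 0

# normalise: scale each pseudobulk so the gene group sums to 1,
# leaving all-zero pseudobulks untouched to avoid dividing by zero.
totals <- colSums(counts)
freq <- sweep(counts, 2, ifelse(totals > 0, totals, 1), "/")

print(freq)
```

In this toy input, pb1 keeps its three calls and becomes the fractions 2/3 and 1/3, while pb2 has only two calls, falls below min_count, and is set to 0.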
Examples
data(sce_vdj)
sce_vdj <- setupVdjPseudobulk(sce_vdj,
    already.productive = FALSE,
    allowed_chain_status = c("Single pair", "Extra pair")
)
#> Checking productivity from productive_abT_VDJ, productive_abT_VJ ...
#> 7279 of cells filtered
#> checking allowed chains...
#> 12 of cells filtered
#> VDJ data extraction begin:
#> extract_cols not specified, automatically generate colnames for extraction.
#> Extract main TCR from v_call_abT_VDJ, d_call_abT_VDJ, j_call_abT_VDJ, v_call_abT_VJ, j_call_abT_VJ ...
#> Complete.
#> Filtering cells from v_call_abT_VDJ_main, j_call_abT_VDJ_main, v_call_abT_VJ_main, j_call_abT_VJ_main ...
#> 63 of cells filtered
#> 2646 of cells remain.
# Build Milo Object
milo_object <- miloR::Milo(sce_vdj)
milo_object <- miloR::buildGraph(milo_object,
    k = 50, d = 20,
    reduced.dim = "X_scvi"
)
#> Constructing kNN graph with k:50
milo_object <- miloR::makeNhoods(milo_object,
    reduced_dims = "X_scvi",
    d = 20
)
#> Checking valid object
#> Running refined sampling with reduced_dim
# Construct pseudobulked VDJ feature space
pb.milo <- vdjPseudobulk(milo_object, col_to_take = "anno_lvl_2_final_clean")