Human immune

Human immune cells dataset from the scIB benchmarks

Info

openproblems_v1/immune_cells
Luecken et al. (2021)
1.18 GiB
02-02-2024
33506 cells × 12303 genes
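
For a sense of scale, the dimensions above can be turned into a rough dense-storage estimate. This is only an illustrative sketch: the helper name `dense_bytes` is hypothetical, and it assumes 4-byte (float32) values, matching the data type the reference table lists for the expression layers. A single dense layer would need about 1.5 GiB, more than the whole 1.18 GiB file, which is consistent with the table listing the layers and neighbour graphs as sparse matrices.

```python
def dense_bytes(n_rows, n_cols, itemsize=4):
    """Bytes needed to hold an n_rows x n_cols matrix densely (float32 by default)."""
    return n_rows * n_cols * itemsize


GIB = 2**30  # bytes per GiB

# Dimensions taken from the info block above.
n_obs, n_vars = 33506, 12303

layer_bytes = dense_bytes(n_obs, n_vars)  # one cells x genes layer
graph_bytes = dense_bytes(n_obs, n_obs)   # one cells x cells matrix

print(f"dense cells x genes layer: {layer_bytes / GIB:.2f} GiB")
print(f"dense cells x cells graph: {graph_bytes / GIB:.2f} GiB")
```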

Description

Human immune cells from peripheral blood and bone marrow, drawn from 5 datasets that together comprise 10 batches across sequencing technologies (10X, Smart-seq2).

Preview

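dataset is an AnnData object with n_obs × n_vars = 33506 × 12303 with slots:

obs: 'batch', 'cell_type', 'size_factors', 'tissue'
var: 'feature_name', 'hvg', 'hvg_score'
obsp: 'knn_connectivities', 'knn_distances'
obsm: 'X_pca'
varm: 'pca_loadings'
layers: 'counts', 'normalized'
uns: 'dataset_description', 'dataset_id', 'dataset_name', 'dataset_organism', 'dataset_reference', 'dataset_summary', 'dataset_url', 'knn', 'normalization_id', 'pca_variance'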

Reference

| Slot | Name | Description | Type | Data type | Size |
|---|---|---|---|---|---|
| obs | batch | A batch identifier. This label is very context-dependent and may be a combination of the tissue, assay, donor, etc. | vector | category | 33506 |
| obs | cell_type | Classification of the cell type based on its characteristics and function within the tissue or organism. | vector | category | 33506 |
| obs | size_factors | The size factors created by the normalisation method, if any. | vector | float32 | 33506 |
| obs | tissue | Specific tissue from which the cells were derived, key for context and specificity in cell studies. | vector | category | 33506 |
| var | feature_name | A human-readable name for the feature, usually a gene symbol. | vector | object | 12303 |
| var | hvg | Whether or not the feature is considered to be a ‘highly variable gene’. | vector | bool | 12303 |
| var | hvg_score | A ranking of the features by hvg. | vector | float64 | 12303 |
| obsp | knn_connectivities | K nearest neighbors connectivities matrix. | sparsematrix | float32 | 33506 × 33506 |
| obsp | knn_distances | K nearest neighbors distance matrix. | sparsematrix | float64 | 33506 × 33506 |
| obsm | X_pca | The resulting PCA embedding. | densematrix | float32 | 33506 × 50 |
| varm | pca_loadings | The PCA loadings matrix. | densematrix | float32 | 12303 × 50 |
| layers | counts | Raw counts. | sparsematrix | float32 | 33506 × 12303 |
| layers | normalized | Normalised expression values. | sparsematrix | float32 | 33506 × 12303 |
| uns | dataset_description | Long description of the dataset. | atomic | str | 1 |
| uns | dataset_id | A unique identifier for the dataset. This is different from the obs.dataset_id field, which is the identifier for the dataset from which the cell data is derived. | atomic | str | 1 |
| uns | dataset_name | A human-readable name for the dataset. | atomic | str | 1 |
| uns | dataset_organism | The organism of the sample in the dataset. | atomic | str | 1 |
| uns | dataset_reference | Bibtex reference of the paper in which the dataset was published. | atomic | str | 1 |
| uns | dataset_summary | Short description of the dataset. | atomic | str | 1 |
| uns | dataset_url | Link to the original source of the dataset. | atomic | str | 1 |
| uns | knn | Supplementary K nearest neighbors data. | dict | | 3 |
| uns | normalization_id | Which normalization was used. | atomic | str | 1 |
| uns | pca_variance | The PCA variance objects. | dict | | 2 |
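
The slot listing above can double as a quick sanity check after loading the file. The following is a minimal sketch, not part of the dataset's own tooling: the names `EXPECTED_SLOTS` and `missing_slots` are hypothetical, and the file path in the comment is a placeholder. It relies only on the `in` operator, which tests column labels on a pandas DataFrame (`obs`, `var`) and keys on the mapping-style slots, so it should work on a real AnnData object as well as on the stand-in used here.

```python
from types import SimpleNamespace

# Slot and field names copied from the reference table; nothing is read from the file.
EXPECTED_SLOTS = {
    "obs": ["batch", "cell_type", "size_factors", "tissue"],
    "var": ["feature_name", "hvg", "hvg_score"],
    "obsp": ["knn_connectivities", "knn_distances"],
    "obsm": ["X_pca"],
    "varm": ["pca_loadings"],
    "layers": ["counts", "normalized"],
    "uns": [
        "dataset_description", "dataset_id", "dataset_name", "dataset_organism",
        "dataset_reference", "dataset_summary", "dataset_url", "knn",
        "normalization_id", "pca_variance",
    ],
}


def missing_slots(adata, expected=EXPECTED_SLOTS):
    """Return 'slot/name' strings for every expected entry that is absent."""
    missing = []
    for slot, names in expected.items():
        container = getattr(adata, slot, None)
        for name in names:
            # `in` checks column labels on a DataFrame and keys on a mapping.
            if container is None or name not in container:
                missing.append(f"{slot}/{name}")
    return missing


# Stand-in object so the sketch runs without the real file. With anndata
# installed, it could be replaced by something like:
#   adata = anndata.read_h5ad("immune_cells.h5ad")  # placeholder path
demo = SimpleNamespace(
    **{slot: dict.fromkeys(names) for slot, names in EXPECTED_SLOTS.items()}
)
del demo.uns["knn"]  # simulate one missing entry

print(missing_slots(demo))  # → ['uns/knn']
```

An empty list means every field named in the table is present; the check says nothing about data types or shapes, which would need separate assertions.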

References

Luecken, Malte D., M. Büttner, K. Chaichoompu, A. Danese, M. Interlandi, M. F. Mueller, D. C. Strobl, et al. 2021. “Benchmarking Atlas-Level Data Integration in Single-Cell Genomics.” Nature Methods 19 (1): 41–50. https://doi.org/10.1038/s41592-021-01336-8.