Hello!
Thanks for the interesting package!
I'm testing it on one of our datasets, which contains about 1M cells and 1.5k individuals. When calling the lemur function, I noticed that the tool converts the sparse count matrix into a dense array. This takes up a huge amount of memory and makes it hard to apply the method to large-scale data.
Briefly, I have a SingleCellExperiment object that I generated from a dgCMatrix and data tables, like this:
sce <- SingleCellExperiment(
  assays = list(logcounts = sparse_matrix),
  colData = cell_info,
  rowData = gene_info
)
Then I'm trying to run lemur as suggested in the tutorial:
fit <- lemur(sce, design = ~ phenotype + Exp,
             n_embedding = 20, test_fraction = 0.5)
And I get this message:
Warning: sparse->dense coercion: allocating vector of size 172.0 GiB
Memory usage then quickly climbs to about 250 GB.
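For context, an allocation of that size is roughly what a dense double-precision copy of the full matrix would need. Here is a minimal sketch of the arithmetic; the helper name dense_size_gib and the gene count of 20,000 are illustrative assumptions, not values taken from my dataset:

```r
# Estimate the memory needed to hold a dense matrix, in GiB.
# bytes_per_value = 8 corresponds to R's double-precision numeric type.
dense_size_gib <- function(n_rows, n_cols, bytes_per_value = 8) {
  n_rows * n_cols * bytes_per_value / 1024^3
}

# Hypothetical dimensions: 20,000 genes (assumed) x 1,000,000 cells.
dense_size_gib(n_rows = 20000, n_cols = 1e6)  # about 149 GiB
```

With a real object, the same estimate could be obtained from prod(dim(sparse_matrix)) * 8 / 1024^3, which is the same order of magnitude as the 172 GiB in the warning.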
Am I doing something wrong? Is there a way to make the method work with a sparse matrix (the standard dgCMatrix used by SingleCellExperiment)?
Thanks!