pip install scDenorm
#or
conda install -c changebio scdenorm
import scanpy as sc
from scipy.io import mmwrite
from scDenorm.denorm import *
ad=sc.datasets.pbmc3k()
ad.layers['count']=ad.X.copy()
ad
AnnData object with n_obs × n_vars = 2700 × 32738
var: 'gene_ids'
layers: 'count'
sc.pp.normalize_total(ad, target_sum=1e4)
sc.pp.log1p(ad)
smtx = ad.X.tocsr().astype('float32')  # asfptype() is deprecated in newer SciPy
smtx.data
array([1.6352079, 1.6352079, 2.2258174, ..., 1.7980369, 1.7980369,
2.779648 ], dtype=float32)
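The values above are the result of total-count scaling followed by log1p. As a rough illustration of why such values can be mapped back to counts, the sketch below undoes both steps for the first three values printed above. It is not scDenorm's implementation: it assumes the three values come from the same cell, that the natural log was used, and that the smallest non-zero count in the cell is 1. The function name denormalize_cell is hypothetical.

```python
import math

def denormalize_cell(values):
    """Recover counts from log1p(count * scale) values of a single cell.

    Assumptions (for illustration only): natural-log log1p was applied,
    and the smallest non-zero raw count in the cell is 1.
    """
    # Undo log1p: each entry becomes count * scale.
    scaled = [math.expm1(v) for v in values]
    nonzero = [s for s in scaled if s > 0]
    if not nonzero:
        return [0 for _ in values]
    # Under the assumption above, the smallest non-zero entry equals the scale.
    scale = min(nonzero)
    return [round(s / scale) for s in scaled]

# First three normalized values shown in the output above.
recovered = denormalize_cell([1.6352079, 1.6352079, 2.2258174])
print(recovered)  # [1, 1, 2]
```

The recovered values agree with the first entries of the count layer stored earlier. A cell whose smallest raw count is larger than 1 would break this simple sketch.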
ad.write_h5ad('data/pbmc3k_norm.h5ad')
Write a subset of rows out as a sparse matrix in Matrix Market (.mtx) format.
mmwrite('data/scaled.mtx', smtx[1:10,])
scdenorm('data/pbmc3k_norm.h5ad',fout='data/pbmc3k_denorm.h5ad',verbose=1)
INFO:root:Reading input file: data/pbmc3k_norm.h5ad
INFO:root:The dimensions of this data are (2700, 32738).
INFO:root:select base
INFO:root:denormlizing ...
100%|██████████| 2700/2700 [00:00<00:00, 2900.90it/s]
INFO:root:Writing output file: data/pbmc3k_denorm.h5ad
If no output path is given, scdenorm returns a new AnnData object.
new_ad=scdenorm('data/pbmc3k_norm.h5ad')
100%|██████████| 2700/2700 [00:00<00:00, 2969.22it/s]
new_ad
View of AnnData object with n_obs × n_vars = 2700 × 32738
var: 'gene_ids'
uns: 'log1p'
ad.layers['count'].data
array([1., 1., 2., ..., 1., 1., 3.], dtype=float32)
new_ad.X.data
array([1. , 1. , 2.0000002, ..., 1. , 1. ,
3. ], dtype=float32)
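The denormalized values match the original counts only up to floating-point error (note the 2.0000002 above), so an exact equality check can fail. A minimal sketch of a tolerance-based comparison, using only the six values visible in the two printed arrays (the elided middle of each array is not reproduced here) and a tolerance of 1e-6 chosen for illustration:

```python
import math

# Values visible in the two arrays printed above.
original = [1.0, 1.0, 2.0, 1.0, 1.0, 3.0]
denormalized = [1.0, 1.0, 2.0000002, 1.0, 1.0, 3.0]

# Compare with a relative tolerance instead of exact equality.
matches = all(
    math.isclose(a, b, rel_tol=1e-6) for a, b in zip(original, denormalized)
)

# Round to the nearest integer if exact integer counts are needed downstream.
as_integers = [round(v) for v in denormalized]

print(matches, as_integers)
```

On the full arrays, numpy.allclose and numpy.rint would serve the same purpose.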
If the input matrix is gene by cell, set gxc=True.
scdenorm('data/scaled.mtx',fout='data/scd_scaled.h5ad')
100%|██████████| 9/9 [00:00<00:00, 2883.12it/s]
!scdenorm data/pbmc3k_norm.h5ad --fout data/pbmc3k_denorm.h5ad
100%|█████████████████████████████████████| 2700/2700 [00:00<00:00, 2719.59it/s]
!scdenorm data/scaled.mtx --fout data/scd_scaled_c.h5ad
100%|███████████████████████████████████████████| 9/9 [00:00<00:00, 1333.31it/s]
Or write the output in mtx format.
!scdenorm data/scaled.mtx --fout data/scd_scaled_c.mtx
100%|███████████████████████████████████████████| 9/9 [00:00<00:00, 1290.78it/s]