CITE-seq is a technique that measures both gene expression and protein levels within individual cells, providing valuable insight into each cell's functional state. However, accurately predicting protein levels from gene expression values is challenging, because protein levels are often regulated by multiple mechanisms beyond gene expression alone. We aim to construct a model that best predicts surface protein levels from gene expression data.
We implemented four regression models, two encoder-decoder neural network models, and one deep learning conv1d encoder-decoder model. Of these, the deep learning conv1d model performed best, with an R-squared value of 0.2064 when predicting surface protein levels from gene expression data.
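
To make the evaluation metric concrete, here is a minimal sketch of how an R-squared value could be computed for a model that maps gene expression to protein levels. Everything in it is an assumption for illustration: the data are synthetic, the helper names `r_squared` and `fit_ridge` are hypothetical, the ridge baseline stands in for any of the regression models, and pooling the score over all proteins is one possible convention, not necessarily the one used for the reported 0.2064.

```python
import numpy as np


def r_squared(y_true, y_pred):
    """Coefficient of determination, pooled over all cells and proteins.

    Assumption: the total sum of squares uses each protein's own mean.
    """
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean(axis=0)) ** 2)
    return 1.0 - ss_res / ss_tot


def fit_ridge(X, Y, alpha=1.0):
    """Closed-form ridge regression; the intercept is handled by centering."""
    x_mean, y_mean = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - x_mean, Y - y_mean
    W = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ Yc)
    b = y_mean - x_mean @ W
    return W, b


# Synthetic stand-in for a cells x genes matrix and a cells x proteins matrix.
rng = np.random.default_rng(0)
n_cells, n_genes, n_proteins = 500, 50, 5
X = rng.normal(size=(n_cells, n_genes))
true_W = rng.normal(size=(n_genes, n_proteins))
Y = X @ true_W + rng.normal(scale=5.0, size=(n_cells, n_proteins))

# Hold out the last 100 cells so the score reflects unseen data.
X_train, X_test = X[:400], X[400:]
Y_train, Y_test = Y[:400], Y[400:]

W, b = fit_ridge(X_train, Y_train)
r2 = r_squared(Y_test, X_test @ W + b)
print(f"held-out R^2: {r2:.4f}")
```

An R-squared of 1 means the predictions match the measurements exactly, while 0 means the model does no better than predicting each protein's mean level, so a value near 0.2 indicates that only a modest share of the variance in protein levels is explained.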