A Python package for standardized and reproducible processing, analysis, and integration of clinical biomarker data
Written by Nicholas Giangreco
Copyright (c) 2019 by the Tatonetti Lab
Discovering disease biomarkers or risk factors requires experiments on samples collected from a cohort of patients. These clinical data are often collected from multiple patient cohorts, either within one institution or across many institutions.
This Python package aims to standardize the data structure that represents each clinical dataset, while allowing seamless integration of multiple instances.
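As a rough illustration of the idea only, one data structure per cohort plus an integration step might look like the sketch below. This is not the package's actual API: the names `Cohort` and `integrate`, the field layout, and the marker names and values are all hypothetical, and the notebooks remain the reference for real usage.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Cohort:
    """Hypothetical container for one clinical dataset (not the package's API)."""
    name: str
    # Biomarker values keyed by sample ID, then by marker name.
    samples: Dict[str, Dict[str, float]] = field(default_factory=dict)

    def markers(self) -> Set[str]:
        """Return the marker names measured in at least one sample."""
        return {m for values in self.samples.values() for m in values}


def integrate(cohorts: List[Cohort]) -> Dict[str, Dict[str, float]]:
    """Combine cohorts, keeping only markers that appear in every cohort.

    Sample IDs are prefixed with the cohort name so they stay unique.
    """
    if not cohorts:
        return {}
    shared = set.intersection(*(c.markers() for c in cohorts))
    combined: Dict[str, Dict[str, float]] = {}
    for c in cohorts:
        for sample_id, values in c.samples.items():
            combined[f"{c.name}:{sample_id}"] = {
                m: v for m, v in values.items() if m in shared
            }
    return combined


# Made-up values, for illustration only.
site_a = Cohort("site_a", {"s1": {"CRP": 1.2, "IL6": 3.4}})
site_b = Cohort("site_b", {"s1": {"CRP": 0.8, "TNF": 2.0}})
merged = integrate([site_a, site_b])
```

Restricting the result to shared markers is just one possible integration policy, chosen here to keep the sketch short.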
To install the package:
git clone https://github.com/ngiangre/cohorts.git
cd cohorts/
pip3 install .
To install the package inside a virtual environment for development:
mkdir pkg_dev
cd pkg_dev/
virtualenv cohorts -p python3
cd cohorts
source bin/activate
git clone https://github.com/ngiangre/cohorts.git
cd cohorts/
pip3 install .
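After installing, a quick check from Python can confirm that the package is visible to the active interpreter. This is a minimal sketch that assumes the importable module is named `cohorts`, matching the repository name; the text above does not state the module name, and the helper `is_importable` is introduced here only for illustration.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if the active interpreter can locate the named top-level module."""
    return importlib.util.find_spec(module_name) is not None


# Assumption: the module name matches the repository name.
print("cohorts importable:", is_importable("cohorts"))
```

If this prints `False` inside the virtual environment, check that the environment was activated before running `pip3 install .`.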
Please see the accompanying python notebooks in the directory for how to use the package.
Introduction.ipynb gives an overview of the package's features and shows how to use them.
Bioinformatics_Note_Implementation.ipynb reproduces the code and figures for the Implementation section of the manuscript that accompanies this python package. The manuscript will give more detail on the motivation and structure of this package.
Contributions are welcome! Both cohort data structure and functionality features are needed.
To contribute, please submit a pull request.
This software is released under the MIT license, which can be found in LICENSE in the root directory of this repository.
Giangreco, N., Fine, B., Tatonetti, N. cohorts: A Python package for clinical ‘omics data management. bioRxiv. link