GitHub - thoughtworks/cell-reprogram-workflow: A framework to perform cellular reprogramming computationally

CRGEM: Cellular Reprogramming using mechanism-driven Gene Expression Modulation

A framework built from existing tools and databases to perform cellular reprogramming computationally. Without it, researchers have to run each tool separately to carry out its specific task. CRGEM eases this work by integrating the tools in one place. With this integration, researchers can spend their time on biological inference and on experimentally verifying the key modulators and their effects.

Steps:

Install all R requirements
Clone the repository using the installation command given below
cd into the pipeline directory created by the clone command
Run the setup.sh file
Download the feather file into the data directory using the command given below (Requirement 8)
Run the commands as required (refer to the Commands section below)

Installation

git clone --depth 1 git@github.com:Avani7/Pipeline.git pipeline

Requirements

1. R version used (in RStudio): R version 4.2.2

2. R requirements

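Use this command to install the R requirements. RcisTarget and AUCell are distributed through Bioconductor, so they may need to be installed with BiocManager::install() instead.

install.packages(c("gtools","Matrix", "tibble","dplyr","stringr","purrr","Rcpp","reshape2","umap","pheatmap", "igraph","GGally","ggplot2","RcisTarget","AUCell"))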

3. Python version used: Python 3.9

4. Cytoscape version recommended: 3.9.1. Cytoscape needs to be open in the background while the workflow runs.

5. The PathLinker app must be added to Cytoscape using the App Manager in the software.

6. boost (Collection of portable C++ source libraries):

brew install boost

The user needs to define CPATH and the library path variable (LIBRARY_PATH or LD_LIBRARY_PATH, as listed below)

For M1 users:

CPATH=/opt/homebrew/include

LIBRARY_PATH=/opt/homebrew/lib

For Intel users:

CPATH=/usr/local/include

LD_LIBRARY_PATH=/usr/local/lib
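The choice between the two sets of paths can be sketched in Python. This is only an illustration, not part of CRGEM: the helper names `boost_env` and `export_lines` are hypothetical, the include variable is assumed to be the compiler's standard `CPATH`, and the mapping from `platform.machine()` values (`arm64` for M1, `x86_64` for Intel) to Homebrew prefixes is an assumption about a default Homebrew installation.

```python
import platform

# Homebrew prefixes listed above: Apple Silicon (M1) vs. Intel Macs.
# Assumption: platform.machine() reports "arm64" on M1 and "x86_64" on Intel.
HOMEBREW_PREFIX = {"arm64": "/opt/homebrew", "x86_64": "/usr/local"}


def boost_env(machine=None):
    """Return the include/library variables for the given CPU architecture."""
    machine = machine or platform.machine()
    prefix = HOMEBREW_PREFIX.get(machine)
    if prefix is None:
        raise ValueError(f"no Homebrew prefix known for architecture {machine!r}")
    # The variable names follow the two lists above (LIBRARY_PATH for M1,
    # LD_LIBRARY_PATH for Intel).
    lib_var = "LIBRARY_PATH" if machine == "arm64" else "LD_LIBRARY_PATH"
    return {"CPATH": f"{prefix}/include", lib_var: f"{prefix}/lib"}


def export_lines(env):
    """Format the variables as shell export statements."""
    return [f"export {name}={value}" for name, value in env.items()]
```

Calling `print("\n".join(export_lines(boost_env())))` on a Mac prints lines that can be pasted into the shell profile before running setup.sh.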

7. wget:

brew install wget

8. feather file

The user needs to download the feather file from https://resources.aertslab.org/cistarget/databases/old/homo_sapiens/hg19/refseq_r45/mc9nr/gene_based/ and add it to the data folder:

curl -O https://resources.aertslab.org/cistarget/databases/old/homo_sapiens/hg19/refseq_r45/mc9nr/gene_based/hg19-tss-centered-10kb-10species.mc9nr.feather

9. TRRUST data network in .tsv format

The file is already present in the data directory. To download the latest data, use the following command:

curl -s 'https://www.grnpedia.org/trrust/data/trrust_rawdata.human.tsv' > trrust_rawdata_human.tsv

The user needs to make sure that the downloaded file is in the data directory.
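Before running the workflow, it can help to confirm that both downloaded files are where the pipeline expects them. The following is a minimal sketch, not part of CRGEM: the helper name `missing_data_files` is hypothetical, and the two file names are taken from the download commands in requirements 8 and 9.

```python
from pathlib import Path

# File names taken from the download commands in requirements 8 and 9.
REQUIRED_DATA_FILES = (
    "hg19-tss-centered-10kb-10species.mc9nr.feather",
    "trrust_rawdata_human.tsv",
)


def missing_data_files(data_dir="data"):
    """Return the required file names that are not present in data_dir."""
    data_path = Path(data_dir)
    return [name for name in REQUIRED_DATA_FILES if not (data_path / name).is_file()]
```

An empty list from `missing_data_files("data")` means both files are in place.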

Definitions

1. artefacts: Directory provided by the user, where all the results will be saved.

2. stage: The part of the tool the user wants to run.

3. params: Input arguments required by the stage.

Input file preparation

All input files should be saved in a folder called data. Input files the user needs to create:

Gene expression files

Starting cell population: create a .txt file with gene expression data of the starting cell population. The rows should be gene names and the columns should be sample IDs. Eg: start.txt
Starting cell and terminal cell population combined: create a .txt file with gene expression data of both the starting cell and terminal cell populations. The rows should be gene names and the columns should be sample IDs. Eg: start_terminal.txt
Terminal cell population: create a .csv file with gene expression data of the terminal cell population. The rows should be gene names and the columns should be sample IDs. Eg: terminal.csv
annotation.txt: create a .txt file with the label IDs of the samples from the combined starting cell and terminal cell population expression data. The first row should contain the sample IDs, as in the combined starting cell and terminal cell population file, and the second row should contain the cluster IDs of the population. Also, one of the sample ID and cluster ID should match.

The starting cell population and terminal cell population cluster IDs entered as parameters should match the ones in the annotation file.
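The cluster ID requirement above can be checked before a run. The sketch below is an illustration under stated assumptions, not CRGEM's own validation: the helper names `read_annotation` and `unknown_cluster_ids` are hypothetical, and the annotation file is assumed to be tab-delimited (the delimiter is not stated above, so it is exposed as a parameter).

```python
import csv


def read_annotation(path, delimiter="\t"):
    """Read a two-row annotation file: sample IDs on row 1, cluster IDs on row 2.

    Assumption: the file is tab-delimited; pass another delimiter if it is not.
    """
    with open(path, newline="") as handle:
        rows = [row for row in csv.reader(handle, delimiter=delimiter) if row]
    if len(rows) < 2:
        raise ValueError("annotation file needs a sample-ID row and a cluster-ID row")
    samples, clusters = rows[0], rows[1]
    if len(samples) != len(clusters):
        raise ValueError("sample-ID row and cluster-ID row differ in length")
    return dict(zip(samples, clusters))


def unknown_cluster_ids(annotation, *cluster_ids):
    """Return the requested cluster IDs that do not occur in the annotation."""
    known = set(annotation.values())
    return [cluster_id for cluster_id in cluster_ids if cluster_id not in known]
```

For example, `unknown_cluster_ids(read_annotation("data/annotation.txt"), "HPROGFPM", "HNES")` returns an empty list when both cluster IDs used as parameters occur in the annotation file.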

Commands

stage: all (TransSynW + PAGA + SIGNET + TRRUST + Cytoscape + Uniprot)

crgem run all --artefacts ./artefacts/[directory_name] --params [start_cell population] [start and terminal_cell population] [annotation file] [terminal cell cluster ID] [starting cell cluster ID] ./data/terminal.csv ./data/trrust_rawdata_human.tsv

Eg: crgem run all --artefacts ./artefacts/temp --params start.txt start_terminal.txt annotation.txt HPROGFPM HNES ./data/terminal.csv ./data/trrust_rawdata_human.tsv

stage: generate_hypothesis (TransSynW)

crgem run generate_hypothesis --artefacts ./artefacts/[directory_name] --params [start_cell population] [start and terminal_cell population] [annotation file] [terminal cell cluster ID]

Eg: crgem run generate_hypothesis --artefacts ./artefacts/[directory_name] --params start.txt start_terminal.txt annotation.txt HPROGFPM

stage: mechanistic insights (TransSynW + PAGA)

crgem run mechanistic_insights --artefacts ./artefacts/[directory_name] --params [start_cell population] [start and terminal_cell population] [annotation file] [terminal cell cluster ID] [starting cell cluster ID]

Eg: crgem run mechanistic_insights --artefacts ./artefacts/temp --params start.txt start_terminal.txt annotation.txt HPROGFPM HNES

stage: grn inference (SIGNET)

crgem run grn_inference --artefacts ./artefacts/[directory_name] --params ./data/terminal.csv

Eg: crgem run grn_inference --artefacts ./artefacts/temp --params ./data/terminal.csv

stage: trrust analysis (TRRUST)

crgem run trrust_analysis --artefacts ./artefacts/[directory_name] --params ./data/trrust_rawdata_human.tsv

Eg: crgem run trrust_analysis --artefacts ./artefacts/temp --params ./data/trrust_rawdata_human.tsv

stage: gene network (Cytoscape)

crgem run create_network --artefacts ./artefacts/[directory_name] --params ./artefacts/[directory_name]/Trrust_Analysis/trrust_analysis.csv

Eg: crgem run create_network --artefacts ./artefacts/temp --params ./artefacts/temp/Trrust_Analysis/trrust_analysis.csv

stage: functional analysis (Uniprot)

crgem run functional_analysis --artefacts ./artefacts/[directory_name] /Trrust_analysis/transsynw_genes.csv /Trrust_analysis/signet_genes.csv

Eg: crgem run functional_analysis --artefacts ./artefacts/temp /Trrust_analysis/transsynw_genes.csv /Trrust_analysis/signet_genes.csv
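When stages are launched from a script, the command line can be assembled programmatically. The sketch below is a hypothetical helper, not part of CRGEM: `crgem_command` simply reproduces the `crgem run [stage] --artefacts [dir] --params [...]` shape used by most of the commands above, and the example values are the ones from the grn_inference example.

```python
import shlex


def crgem_command(stage, artefacts, params):
    """Build the argument list for: crgem run <stage> --artefacts <dir> --params ..."""
    return ["crgem", "run", stage, "--artefacts", artefacts, "--params", *params]


# Values taken from the grn_inference example above.
cmd = crgem_command("grn_inference", "./artefacts/temp", ["./data/terminal.csv"])
print(shlex.join(cmd))
# prints: crgem run grn_inference --artefacts ./artefacts/temp --params ./data/terminal.csv
```

Once crgem is installed, the list can be handed to `subprocess.run(cmd, check=True)`, which avoids shell quoting problems with file paths.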
