Skip to content

This is the R code to extract Gene Expression Profiles of particular genes from GEO (Gene Expression Omnibus) datasets.

Notifications You must be signed in to change notification settings

DanishAslam7/GenExp

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

7 Commits
 
 
 
 

Repository files navigation

GenExp

This R code is designed to extract gene expression profiles of specific genes from the Gene Expression Omnibus (GEO) datasets. The code leverages publicly available data from GEO, a comprehensive repository of gene expression datasets. By focusing on genes of interest related to the progression of disease of interest, this code aims to obtain relevant expression data for further analysis.

Prerequisites

Before using this code, ensure the following prerequisites are met:

Data Collection:

Conduct a thorough literature review to identify genes of interest that play significant roles in the development of Alzheimer's disease. GenBank, the NIH genetic sequence database, can be used to access publicly available DNA sequences (https://www.ncbi.nlm.nih.gov/genbank/).

GEO Datasets Selection:

Choose expression datasets from GEO that are directly relevant to your study. These datasets should be associated with Alzheimer's and contain gene expression information. The code focuses on extracting data for specific genes, so make sure to identify the datasets that align with your research.

Procedure:

Identify Genes of Interest: Based on your literature review and research objectives, select genes of interest for studying their expression patterns in Alzheimer's disease.

Note the Reference Series Number: For the samples of interest, take note of the reference series numbers associated with the accession numbers. These reference series numbers will be used in the extraction steps to target specific samples.

Run the R Code: Execute the provided R code to perform the extraction process. The code reads and normalizes gene expression data from the selected GEO datasets using the Robust Multi-array Average (RMA) method.

Data Manipulation: The code manipulates the gene expression data, merging with additional feature-level data that likely contains gene annotations.

Extract Expression Profiles: The code extracts the expression profiles for the genes of interest based on their unique identifiers (IDs) from the normalized data.

Resulting Data: The output will be a subset of gene expression profiles specific to the selected genes, ready for further analysis and investigation.

Note: It is essential to ensure that the chosen datasets from GEO are relevant to your research and aligned with the focus of your study. The code is designed to concentrate on specific genes, so accuracy in selecting the genes of interest and associated reference series numbers is critical for obtaining meaningful results. Additionally, no need to download the datasets; the code extracts the necessary information directly from GEO.

For any questions or clarifications, feel free to contact the author of this code.

About

This is the R code to extract Gene Expression Profiles of particular genes from GEO (Gene Expression Omnibus) datasets.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages