Recent Advances in High Throughput Sequencing Analysis

Yan Guo, Leng Han, Quanhu Sheng
2017 International Journal of Genomics  
doi:10.1155/2017/2454780 pmid:28706940 pmcid:PMC5494582 fatcat:ascgydcwhzhhzhoqghtfz6lhlu

Advanced Heat Map and Clustering Analysis Using Heatmap3

Shilin Zhao, Yan Guo, Quanhu Sheng, Yu Shyr
2014 BioMed Research International  
Heat maps and clustering are used frequently in expression analysis studies for data visualization and quality control. Simple clustering and heat maps can be produced from the "heatmap" function in R. However, the "heatmap" function lacks certain functionalities and customizability, preventing it from generating advanced heat maps and dendrograms. To tackle the limitations of the "heatmap" function, we have developed an R package "heatmap3" which significantly improves the original "heatmap"
more » ... nction by adding several more powerful and convenient features. The "heatmap3" package allows users to produce highly customizable state of the art heat maps and dendrograms. The "heatmap3" package is developed based on the "heatmap" function in R, and it is completely compatible with it. The new features of "heatmap3" include highly customizable legends and side annotation, a wider range of color selections, new labeling features which allow users to define multiple layers of phenotype variables, and automatically conducted association tests based on the phenotypes provided. Additional features such as different agglomeration methods for estimating distance between two samples are also added for clustering.
doi:10.1155/2014/986048 pmid:25143956 pmcid:PMC4124803 fatcat:veohll23mbec3nxluulh7avfma

beRBP: binding estimation for human RNA-binding proteins

Hui Yu, Jing Wang, Quanhu Sheng, Qi Liu, Yu Shyr
2018 Nucleic Acids Research  
Identifying binding targets of RNA-binding proteins (RBPs) can greatly facilitate our understanding of their functional mechanisms. Most computational methods employ machine learning to train classifiers on either RBP-specific targets or pooled RBP-RNA interactions. The former strategy is more powerful, but it only applies to a few RBPs with a large number of known targets; conversely, the latter strategy sacrifices prediction accuracy for a wider application, since specific interaction
more » ... are inevitably obscured through pooling heterogeneous datasets. Here, we present beRBP, a dual approach to predict human RBP-RNA interaction given PWM of a RBP and one RNA sequence. Based on Random Forests, beRBP not only builds a specific model for each RBP with a decent number of known targets, but also develops a general model for RBPs with limited or null known targets. The specific and general models both compared well with existing methods on three benchmark datasets. Notably, the general model achieved a better performance than existing methods on most novel RBPs. Overall, as a composite solution overarching the RBP-specific and RBP-General strategies, beRBP is a promising tool for human RBP binding estimation with good prediction accuracy and a broad application scope.
doi:10.1093/nar/gky1294 pmid:30590704 pmcid:PMC6411931 fatcat:a4tv6v26bbbuvlgiygbsftrl4u

RnaSeqSampleSize: real data based sample size estimation for RNA sequencing

Shilin Zhao, Chung-I Li, Yan Guo, Quanhu Sheng, Yu Shyr
2018 BMC Bioinformatics  
One of the most important and often neglected components of a successful RNA sequencing (RNA-Seq) experiment is sample size estimation. A few negative binomial model-based methods have been developed to estimate sample size based on the parameters of a single gene. However, thousands of genes are quantified and tested for differential expression simultaneously in RNA-Seq experiments. Thus, additional issues should be carefully addressed, including the false discovery rate for multiple statistic
more » ... tests, widely distributed read counts and dispersions for different genes. Results: To solve these issues, we developed a sample size and power estimation method named RnaSeqSampleSize, based on the distributions of gene average read counts and dispersions estimated from real RNA-seq data. Datasets from previous, similar experiments such as the Cancer Genome Atlas (TCGA) can be used as a point of reference. Read counts and their dispersions were estimated from the reference's distribution; using that information, we estimated and summarized the power and sample size. RnaSeqSampleSize is implemented in the R language and can be installed from the Bioconductor website. A user-friendly graphical web interface is provided at http://cqs.mc.vanderbilt.edu/shiny/RnaSeqSampleSize/. Conclusions: RnaSeqSampleSize provides a convenient and powerful way to estimate power and sample size for an RNA-seq experiment. It is also equipped with several unique features, including estimation for genes or pathways of interest, power curve visualization, and parameter optimization.
doi:10.1186/s12859-018-2191-5 pmid:29843589 pmcid:PMC5975570 fatcat:ygcf2il4rrei5d42bj2bewtf6q

Heatmap3: an improved heatmap package with more powerful and convenient features

Shilin Zhao, Yan Guo, Quanhu Sheng, Yu Shyr
2014 BMC Bioinformatics  
doi:10.1186/1471-2105-15-s10-p16 pmcid:PMC4196034 fatcat:dbpatatjyraafmrfbpbbkhoo2y

MultiRankSeq: Multiperspective Approach for RNAseq Differential Expression Analysis and Quality Control

Yan Guo, Shilin Zhao, Fei Ye, Quanhu Sheng, Yu Shyr
2014 BioMed Research International  
Background. After a decade of microarray technology dominating the field of high-throughput gene expression profiling, the introduction of RNAseq has revolutionized gene expression research. While RNAseq provides more abundant information than microarray, its analysis has proved considerably more complicated. To date, no consensus has been reached on the best approach for RNAseq-based differential expression analysis. Not surprisingly, different studies have drawn different conclusions as to
more » ... best approach to identify differentially expressed genes based upon their own criteria and scenarios considered. Furthermore, the lack of effective quality control may lead to misleading interpretation of results and erroneous conclusions. To solve these problems, we propose a simple yet safe and practical rank-sum approach for RNAseq-based differential gene expression analysis named MultiRankSeq. MultiRankSeq first performs quality control assessment. For data meeting the quality control criteria, MultiRankSeq compares the study groups using several of the most commonly applied analytical methods and combines their results to generate a new rank-sum interpretation. MultiRankSeq provides a unique analysis approach to RNAseq differential expression analysis. MultiRankSeq is written in R, and it is easily applicable. Detailed graphical and tabular analysis reports can be generated with a single command line.
doi:10.1155/2014/248090 pmid:24977143 pmcid:PMC4058234 fatcat:tj7333z5hjf4tgte7jkpittgq4
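
The rank-sum idea in the MultiRankSeq abstract above can be illustrated with a minimal Python sketch. This is not the package's implementation (MultiRankSeq is written in R, and the abstract does not name the combined methods or say how ties are handled). The method names, gene names, and p-values below are hypothetical, and ties are simply left in input order.

```python
def rank_genes(pvalues):
    """Rank genes within one method: the smallest p-value gets rank 1."""
    ordered = sorted(pvalues, key=lambda gene: pvalues[gene])
    return {gene: position + 1 for position, gene in enumerate(ordered)}


def rank_sum(results):
    """Sum each gene's per-method ranks; a lower total means more consistent evidence."""
    totals = {}
    for method_pvalues in results.values():
        for gene, rank in rank_genes(method_pvalues).items():
            totals[gene] = totals.get(gene, 0) + rank
    # Sort by total rank, then by gene name so the output order is deterministic.
    return sorted(totals.items(), key=lambda item: (item[1], item[0]))


# Hypothetical p-values from three unnamed differential expression methods.
results = {
    "method_a": {"gene1": 0.001, "gene2": 0.02, "gene3": 0.5},
    "method_b": {"gene1": 0.01, "gene2": 0.005, "gene3": 0.3},
    "method_c": {"gene1": 0.002, "gene2": 0.04, "gene3": 0.2},
}

combined = rank_sum(results)
print(combined)
```

A gene that ranks near the top under every method ends up with a small total, so the combined ordering rewards agreement between methods rather than an extreme result from any single one.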

DupChecker: a bioconductor package for checking high-throughput genomic data redundancy in meta-analysis

Quanhu Sheng, Yu Shyr, Xi Chen
2014 BMC Bioinformatics  
Meta-analysis has become a popular approach for high-throughput genomic data analysis because it often can significantly increase power to detect biological signals or patterns in datasets. However, when using publicly available databases for meta-analysis, duplication of samples is an often-encountered problem, especially for gene expression data. Not removing duplicates could lead to false-positive findings, misleading clustering patterns, or model over-fitting in the subsequent data
more » ... is. Results: We developed a Bioconductor package, DupChecker, that efficiently identifies duplicated samples by generating MD5 fingerprints for raw data. A real data example was demonstrated to show the usage and output of the package. Conclusions: Researchers may not pay enough attention to checking and removing duplicated samples, and then data contamination could make the results or conclusions from meta-analysis questionable. We suggest applying DupChecker to examine all gene expression data sets before any data analysis step.
doi:10.1186/1471-2105-15-323 pmid:25267467 pmcid:PMC4261523 fatcat:daeoyyyfdrb6dbk27k3unnxuze
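
The MD5-fingerprint approach described in the DupChecker abstract above can be sketched in a few lines of Python. This is an illustration of the general technique only, not DupChecker's code (DupChecker is a Bioconductor/R package). The sample names and byte contents are hypothetical stand-ins for raw data files.

```python
import hashlib
from collections import defaultdict


def md5_fingerprint(data):
    """Return the hexadecimal MD5 digest of a sample's raw bytes."""
    return hashlib.md5(data).hexdigest()


def find_duplicates(samples):
    """Group sample names whose raw content has an identical MD5 fingerprint."""
    groups = defaultdict(list)
    for name, content in samples.items():
        groups[md5_fingerprint(content)].append(name)
    # Keep only fingerprints shared by more than one sample.
    return [sorted(names) for names in groups.values() if len(names) > 1]


# Hypothetical raw data: two samples are byte-for-byte identical.
samples = {
    "study1_sampleA.CEL": b"intensity data 0001",
    "study1_sampleB.CEL": b"intensity data 0002",
    "study2_sampleC.CEL": b"intensity data 0001",
}

duplicates = find_duplicates(samples)
print(duplicates)
```

Comparing fixed-length digests avoids pairwise comparison of whole files, but it only catches byte-identical content; a re-processed or re-normalized copy of the same sample would produce a different fingerprint.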

Practicability of detecting somatic point mutation from RNA high throughput sequencing data

Quanhu Sheng, Shilin Zhao, Chung-I Li, Yu Shyr, Yan Guo
2016 Genomics  
Genomics. Author manuscript; available in PMC 2017 October 31.
doi:10.1016/j.ygeno.2016.03.006 pmid:27046520 pmcid:PMC5663213 fatcat:wnb2cvcdgvbatl7lb55n3goydi

A pan-cancer immunogenomic atlas for immune checkpoint blockade immunotherapy

Jing Yang, Shilin Zhao, Jing Wang, Quanhu Sheng, Qi Liu, Yu Shyr
2021 Cancer Research  
The ability to identify robust genomic signatures that predict response to immune checkpoint blockade is restricted by limited sample sizes and ungeneralizable performance across cohorts. To address these challenges, we established Cancer-Immu (http://bioinfo.vanderbilt.edu/database/Cancer-Immu/), a comprehensive platform that integrates large-scale multidimensional omics data, including genetic, bulk, and single-cell transcriptomic, proteomic, and dynamic genomic profiles, with clinical
more » ... es to explore consistent and rare immunogenomic connections. Currently Cancer-Immu has incorporated data for 3,652 samples for 16 cancer types. It provides easy access to immunogenomic data and empowers researchers to translate omics datasets into biological insights and clinical applications.
doi:10.1158/0008-5472.can-21-2335 pmid:34903605 pmcid:PMC9189237 fatcat:jmidmzquqrazbawpfrdeiowrha

Detection of internal exon deletion with exon Del

Yan Guo, Shilin Zhao, Brian D Lehmann, Quanhu Sheng, Timothy M Shaver, Thomas P Stricker, Jennifer A Pietenpol, Yu Shyr
2014 BMC Bioinformatics  
Exome sequencing allows researchers to study the human genome in unprecedented detail. Among the many types of variants detectable through exome sequencing, one of the most overlooked types of mutation is internal deletion of exons. Internal exon deletions (IEDs) are the absence of consecutive exons in a gene. Such deletions have potentially significant biological meaning, and they are often too short to be considered copy number variation. Therefore, the need for efficient detection of such
more » ... ons using exome sequencing data exists. Results: We present ExonDel, a tool specially designed to detect homozygous exon deletions efficiently. We tested ExonDel on exome sequencing data generated from 16 breast cancer cell lines and identified both novel and known IEDs. Subsequently, we verified our findings using RNAseq and PCR technologies. Further comparisons with multiple sequencing-based CNV tools showed that ExonDel is capable of detecting unique IEDs not found by other CNV tools. Conclusions: ExonDel is an efficient way to screen for novel and known IEDs using exome sequencing data. ExonDel and its source code can be downloaded freely at https://github.com/slzhao/ExonDel.
doi:10.1186/1471-2105-15-332 pmid:25322818 pmcid:PMC4288651 fatcat:2d3w3ue62vf3let7qqag2dbaoa

O18Quant: A Semiautomatic Strategy for Quantitative Analysis of High-Resolution 16O/18O Labeled Data

Yan Guo, Masaru Miyagi, Rong Zeng, Quanhu Sheng
2014 BioMed Research International  
Proteolytic 18O-labeling has been widely used in quantitative proteomics since it can uniformly label all peptides from different kinds of proteins. There have been multiple algorithms and tools developed over the last few years to analyze high-resolution proteolytic 16O/18O labeled mass spectra. We have developed a software package, O18Quant, which addresses two major issues in the previously developed algorithms. First, O18Quant uses a robust linear model (RLM) for peptide-to-protein ratio
more » ... ation. RLM can minimize the effect of outliers instead of iteratively removing them, which is a common practice in other approaches. Second, the existing algorithms lack applicable implementation. We address this by implementing O18Quant using C# under the Microsoft .NET Framework and R. O18Quant automatically calculates the peptide/protein relative ratio and provides a friendly graphical user interface (GUI) which allows the user to manually validate the quantification results at scan, peptide, and protein levels. The intuitive GUI of O18Quant can greatly enhance the user's visualization and understanding of the data analysis. O18Quant can be downloaded for free as part of the software suite ProteomicsTools.
doi:10.1155/2014/971857 pmid:24901003 pmcid:PMC4037588 fatcat:fynrpsw6pvaefiaxe5uudzsyue

Maternal Hyperglycemia Induces Changes in Gene Expression and Morphology in Mouse Placentas

Molly Eckmann, Quanhu Sheng, Scott Baldwin H, Rolanda L. Lister
2021 Gynecology & Reproductive Health  
Pregestational diabetes complicates one million pregnancies in the United States and is associated with placental dysfunction. Placental dysfunction can manifest as stillbirth, spontaneous abortions, fetal growth restriction, and preeclampsia in the mother. However, the underlying mechanisms of placental dysfunction are not well understood. Objective: We hypothesize that maternal hyperglycemia disrupts cellular processes important for normal vascular development and function. Study Design:
more » ... glycemia, defined as a non-fasting glucose concentration of >250 mg/dL, was induced in eight-week-old female CD-1 mice by injecting a one-time intraperitoneal dose of 150 mg/kg streptozotocin. Control mice received an equal volume of normal saline. Hyperglycemic and control females were mated with CD-1 males. At Embryonic Day 17.5, the pregnant mice were euthanized. Sixty-eight placentas were harvested from the six euglycemic dams and twenty-six placentas were harvested from three hyperglycemic dams. RNA was extracted from homogenized placental tissue (N=12/group; 2-4 placentas per litter of each group). Total RNA was prepared and sequenced. Differentially expressed genes with a >2-fold change were considered significant. Placentas (9-20/group) were fixed in paraffin wax and sectioned at 6 µm. Cross-sectional areas of placental zones were evaluated using slides stained for hematoxylin and eosin, glycogen, collagen, proliferation and apoptosis. Quantification of staining intensity and percent positive nuclei was done using Leica Image Hub Data software. Data were compared between the control and experimental group using t-tests. Values of p < 0.05 were considered to be statistically significant. Results: The average maternal blood glucose concentrations for control and diabetic dams were 112 ± 24 and 473 ± 47, respectively (p<0.0001). A higher rate of resorptions was noted in the hyperglycemia-exposed placentas compared to euglycemia-exposed placentas (24% vs 7%; p=0.04). A total of 24 RNA libraries (12/group) were prepared. Placentas from hyperglycemic pregnancies exhibited 1374 differentially expressed genes (DEGs). The 10 most significantly differentially expressed genes are Filip1, Prom2, Fam78a, Pde4d, Pou3f1, Kcnk5, Dusp4, Cxcr4, Slc6a4 and D430019H16Rik. Their corresponding biologic functions are related to chemotaxis, ossification, and cellular and vascular development.
Histologically, we found that hyperglycemia-exposed placentas demonstrated increased proliferation, apoptosis, and glycogen content and decreased collagen deposition. Conclusion: There was a higher rate of resorptions in the pregnancies of hyperglycemic dams. Pregestational diabetes resulted in significant changes in placental morphology, including increased glycogen content in the spongiotrophoblast, decreased collagen deposition, and increased apoptosis and proliferation in the junction zone. Maternal diabetes causes widespread disruption in multiple cellular processes important for normal vascular development and sets the platform for placental dysfunction.
doi:10.33425/2639-9342.1140 fatcat:74xsk2vtczcs5di4y36wzneup4

Maternal Hyperglycemia Induces Changes in Gene Expression and Morphology in Mouse Placentas

Molly Eckmann, Quanhu Sheng, Scott Baldwin H, Rolanda L Lister
2021 Gynecology & Reproductive Health  
Pregestational diabetes complicates one million pregnancies in the United States and is associated with placental dysfunction. Placental dysfunction can manifest as stillbirth, spontaneous abortions, fetal growth restriction, and preeclampsia in the mother. However, the underlying mechanisms of placental dysfunction are not well understood. We hypothesize that maternal hyperglycemia disrupts cellular processes important for normal vascular development and function. Hyperglycemia, defined as a
more » ... n-fasting glucose concentration of >250 mg/dL, was induced in eight-week-old female CD-1 mice by injecting a one-time intraperitoneal dose of 150 mg/kg streptozotocin. Control mice received an equal volume of normal saline. Hyperglycemic and control females were mated with CD-1 males. At Embryonic Day 17.5, the pregnant mice were euthanized. Sixty-eight placentas were harvested from the six euglycemic dams and twenty-six placentas were harvested from three hyperglycemic dams. RNA was extracted from homogenized placental tissue (N=12/group; 2-4 placentas per litter of each group). Total RNA was prepared and sequenced. Differentially expressed genes with a >2-fold change were considered significant. Placentas (9-20/group) were fixed in paraffin wax and sectioned at 6 μm. Cross-sectional areas of placental zones were evaluated using slides stained for hematoxylin and eosin, glycogen, collagen, proliferation and apoptosis. Quantification of staining intensity and percent positive nuclei was done using Leica Image Hub Data software. Data were compared between the control and experimental group using t-tests. Values of p < 0.05 were considered to be statistically significant. The average maternal blood glucose concentrations for control and diabetic dams were 112 ± 24 and 473 ± 47, respectively (p<0.0001). A higher rate of resorptions was noted in the hyperglycemia-exposed placentas compared to euglycemia-exposed placentas (24% vs 7%; p=0.04). A total of 24 RNA libraries (12/group) were prepared. Placentas from hyperglycemic pregnancies exhibited 1374 differentially expressed genes (DEGs). The 10 most significantly differentially expressed genes are Filip1, Prom2, Fam78a, Pde4d, Pou3f1, Kcnk5, Dusp4, Cxcr4, Slc6a4 and D430019H16Rik. Their corresponding biologic functions are related to chemotaxis, ossification, and cellular and vascular development.
Histologically, we found that hyperglycemia-exposed placentas demonstrated increased proliferation, apoptosis, and glycogen content and decreased collagen deposition. There was a higher rate of resorptions in the pregnancies of hyperglycemic dams. Pregestational diabetes resulted in significant changes in placental morphology, including increased glycogen content in the spongiotrophoblast, decreased collagen deposition, and increased apoptosis and proliferation in the junction zone. Maternal diabetes causes widespread disruption in multiple cellular processes important for normal vascular development and sets the platform for placental dysfunction.
pmid:34250501 pmcid:PMC8270392 fatcat:u737vf5rgrc3hnrn3kzmvvgthm

RNAseqPS: A Web Tool for Estimating Sample Size and Power for RNAseq Experiment

Yan Guo, Shilin Zhao, Chung-I Li, Quanhu Sheng, Yu Shyr
2014 Cancer Informatics  
Sample size and power determination is the first step in the experimental design of a successful study. Sample size and power calculation is required for applications for National Institutes of Health (NIH) funding. Sample size and power calculation is well established for traditional biological studies such as mouse model, genome wide association study (GWAS), and microarray studies. Recent developments in high-throughput sequencing technology have allowed RNAseq to replace microarray as the
more » ... chnology of choice for high-throughput gene expression profiling. However, the sample size and power analysis of RNAseq technology is an underdeveloped area. Here, we present RNAseqPS, an advanced online RNAseq power and sample size calculation tool based on the Poisson and negative binomial distributions. RNAseqPS was built using the Shiny package in R. It provides an interactive graphical user interface that allows the users to easily conduct sample size and power analysis for RNAseq experimental design. RNAseqPS can be accessed directly at http://cqs.mc.vanderbilt.edu/shiny/RNAseqPS/.
doi:10.4137/cin.s17688 pmid:25374457 pmcid:PMC4213196 fatcat:iatllwtct5c5jb6qcgvnpmzqvm

Quantitative assessment of cell population diversity in single-cell landscapes [article]

Qi Liu, Charles A Herring, Quanhu Sheng, Jie Ping, Alan J Simmons, Bob Chen, Amrita Banerjee, Guoqiang Gu, Robert J Coffey, Yu Shyr, Ken S Lau
2018 bioRxiv   pre-print
Single-cell RNA-sequencing (scRNA-seq) has become a powerful tool for the systematic investigation of cellular diversity. As a number of computational tools have been developed to identify and visualize cell populations within a single scRNA-seq dataset, there is a need for methods to quantitatively and statistically define proportional shifts in cell population structures across datasets, such as expansion or shrinkage, or emergence or disappearance of cell populations. Here we present
more » ... a framework to statistically quantify compositional diversity in cell populations between single-cell transcriptome landscapes. sc-UniFrac enables sensitive and robust quantification in simulated and experimental datasets in terms of both population identity and quantity. We have demonstrated the utility of sc-UniFrac in multiple applications, including assessment of biological and technical replicates, classification of tissue phenotypes, identification and definition of altered cell populations, and benchmarking batch correction tools. sc-UniFrac provides a framework for quantifying diversity or alterations in cell populations across conditions, and has broad utility for gaining insight on how cell populations respond to perturbations.
doi:10.1101/333393 fatcat:mwzdlnmaiba5lfflchlwna5fo4