Browsing by Subject "Software"

An Automated Tool for Measuring Aortic Pulse Wave Velocity
(2013-01-22) Goel, Akshay; Peshock, Ronald M.; McColl, Roderick; King, Kevin; Whittemore, Anthony
PURPOSE: Aortic pulse wave velocity (APV) has been shown to be associated with end organ damage independent of age, sex, and hypertension duration. The purpose of this study is to evaluate an automated approach for computing transit time (Δt) for the measurement of APV as a tool for future investigations and clinical application. METHODS AND MATERIALS: Phase contrast cardiac gated MRI of the aorta in the transverse plane at the level of the pulmonary artery was utilized from the Dallas Heart Study-2 (DHS2), a multiethnic, population-based study of cardiovascular health. A three-step algorithm was used to analyze all 1884 phase contrast MRI studies from the DHS2 central database: 1) isolating contours for the ascending aorta and descending aorta using a computer vision technique known as the Hough Transform; 2) using the isolated contours and phase contrast MRI to generate flow curves for the ascending and descending aorta; and 3) computing Δt, defined as the time shift between the flow curves in the ascending aorta (AA) and descending aorta (DA), calculated using the half maximum of AA and DA. Fifty of these studies, uniformly distributed across all Δt, were then randomly selected and manually analyzed with the standard approach utilizing QFlow (v. 4.1.6, Medis), and the corresponding manually derived flow curves were used to compute Δt. The results from the manual analysis using QFlow were compared to results from the automated algorithm using linear regression and Bland-Altman difference analysis. RESULTS: The mean Δt in the 1884 studies analyzed with our automated tool was 19.8 ± 6.5 ms. In the validation set of 50 studies, linear regression analysis showed an excellent correlation between the automated (A) and manual (M) methods (r = 0.97, A = 1.01M - 0.885 ms). Bland-Altman difference analysis showed strong agreement with no significant bias (mean difference (A-M) = -0.386 ± 0.768 ms).
CONCLUSION: Our automated approach for computing transit time (Δt) for the measurement of APV demonstrates excellent agreement with the standard manual method. These findings suggest this approach could serve as a useful tool for future investigations and clinical application.
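Step 3 of the algorithm described above (the time shift between the AA and DA flow curves at half maximum) can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a zero flow baseline, takes the first upstroke crossing of half the peak flow, and interpolates linearly between samples. The function names `half_max_time` and `transit_time` are hypothetical.

```python
def half_max_time(times, flow):
    """Time at which the upstroke of `flow` first reaches half of its peak.

    Assumes a zero baseline and uses linear interpolation between the two
    samples that bracket the half-maximum level (both are assumptions).
    """
    if len(times) != len(flow) or len(flow) < 2:
        raise ValueError("times and flow must have equal length >= 2")
    half = max(flow) / 2.0
    for i in range(1, len(flow)):
        lo, hi = flow[i - 1], flow[i]
        if lo < half <= hi:
            frac = (half - lo) / (hi - lo)
            return times[i - 1] + frac * (times[i] - times[i - 1])
    raise ValueError("flow never crosses its half maximum on an upstroke")


def transit_time(times, flow_aa, flow_da):
    """Delta-t: half-maximum time of the DA curve minus that of the AA curve."""
    return half_max_time(times, flow_da) - half_max_time(times, flow_aa)
```

With toy curves sampled every 10 ms, where the descending-aorta curve is the ascending-aorta curve delayed by one sample, `transit_time` returns 10 ms.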
High-Performance Software Development for Genomic Sequence Alignment and Analysis
(2023-05-01) Zhang, Yun; Zhan, Xiaowei; Kim, Daehwan; Li, Bo; Wang, Tao; Hon, Gary C.
Nucleic acid sequencing technology is a powerful tool for understanding genetic information. Genomic data analysis software is critical for transforming complex sequencing results into meaningful biological information. Emerging sequencing technologies help scientists to understand biological processes from multiple angles, but they also raise the challenge of developing new sequence analysis tools, especially new alignment methods, to support these techniques. In this dissertation, I developed a rapid and accurate sequence alignment program, HISAT-3N, to solve the alignment problem of nucleotide conversion (NC) sequencing technologies. NC technologies, such as BS-seq and SLAM-seq, involve converting one type of nucleotide to another, which allows researchers to identify specific chemical modifications in DNA or RNA molecules. However, the conversions generated in these NC technologies make it difficult to align the reads back to the reference genome. To solve this issue, I implemented the 3-letter alignment algorithm in HISAT2, which was developed by our lab previously, to create HISAT-3N. I thoroughly tested HISAT-3N and demonstrated that it is more than seven times faster and more accurate than widely used sequence aligners, and can support all types of nucleotide conversion sequencing technologies, including those that have not yet been developed. Additionally, to generalize the process of developing new alignment methods to support new sequencing technologies, I created a platform that allows for the modularized design of sequence alignment software. This platform incorporates algorithms from HISAT2, STAR, and BWA, providing greater efficiency for developers to create novel sequence alignment software and more flexibility for users to analyze different types of data in a variety of computational environments. 
Finally, I developed a metagenomics analysis pipeline that effectively organizes and manages multiple well-known sequence analysis programs for rapid and accurate soil microbial analysis. The successful development and implementation of these tools demonstrate the robustness of a well-designed software and pipeline framework for bioinformatics analysis. Overall, my work emphasizes the significance of continuously improving genomics data analysis tools. This is important to support emerging sequencing technologies and deliver more precise results, which assist researchers in revealing valuable genetic information.
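The idea behind 3-letter alignment mentioned in the abstract above can be shown with a toy Python sketch: collapse the converted nucleotide in both the read and the reference, so that conversions no longer appear as mismatches. This is only an illustration under simplifying assumptions (exact matching on one strand, a C-to-T conversion as in bisulfite sequencing). It does not reproduce how HISAT-3N indexes or aligns reads, and the function names are hypothetical.

```python
def to_three_letter(seq, src="C", dst="T"):
    """Collapse a 4-letter sequence to 3 letters by rewriting `src` as `dst`."""
    return seq.upper().replace(src, dst)


def find_converted_read(reference, read, src="C", dst="T"):
    """Return every 0-based position where the read matches the reference
    once both are reduced to the same 3-letter alphabet.

    Toy exact-match search on the forward strand only (an assumption made
    for illustration; real aligners use an index and tolerate mismatches).
    """
    ref3 = to_three_letter(reference, src, dst)
    read3 = to_three_letter(read, src, dst)
    positions = []
    start = ref3.find(read3)
    while start != -1:
        positions.append(start)
        start = ref3.find(read3, start + 1)
    return positions
```

For the reference `AACCGGTTACGT`, the fully converted read `ATTGG` (originally `ACCGG`) has no exact match in the 4-letter reference, but the 3-letter search finds it at position 1.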
PROCAIN: Protein Profile Comparison with Assisting Information
(2009-06-19) Wang, Yong; Grishin, Nick V.
Detection of remote sequence homology is essential for the accurate inference of protein structure, function, and evolution. The most sensitive detection methods involve the comparison of evolutionary patterns reflected in multiple sequence alignments (MSAs) of protein families. We present PROCAIN, a new method for MSA comparison based on the combination of 'vertical' MSA context (substitution constraints at individual sequence positions) and 'horizontal' context (patterns of residue content at multiple positions). Based on a simple and tractable profile methodology and primitive measures for the similarity of horizontal MSA patterns, the method achieves a quality of homology detection comparable to that of a more complex advanced method employing hidden Markov models and secondary structure prediction. Adding secondary structure information further improves PROCAIN performance beyond the capabilities of current state-of-the-art tools. The potential value of the method for structure/function predictions is illustrated by the detection of subtle homology between evolutionarily distant yet structurally similar protein domains. PROCAIN, relevant databases and tools can be downloaded from http://prodata.swmed.edu/procain/download. The web server can be accessed at http://prodata.swmed.edu/procain/procain.php.
[Southwestern News]
(2004-01-22) Maier, Scott
Studies on Combining Sequence and Structure for Protein Classification
(2010-01-12) Kim, Bong-Hyun; Grishin, Nick V.
The ultimate goal of our research is to develop a better understanding of how proteins evolve different structures and functions. Large-scale protein clustering can provide a useful platform to identify such principles of protein evolution. Manual classification schemes accurately group homologous proteins, but they are slow and subjective. Automatic protein clustering methods are largely based on sequence information. Therefore, they often do not accurately reflect remote homologies that can be recognized by structural information. We hypothesized that combining evolutionary signals from protein sequence and 3D structure will improve automated protein classification. To test this hypothesis, we clustered proteins into evolutionary groups using both sequence and structure by a fully automated method. We developed a stringent algorithm, the self-consistency grouping (SCG) method, which clusters proteins if all the proteins in the group are more similar to each other than to proteins outside the group. Comparison of SCG and other commonly used clustering methods to a widely accepted manual classification scheme, Structural Classification of Proteins (SCOP), showed that SCG groups better reflect the reference classification. In-depth analysis of SCG clusters highlights new non-trivial evolutionary links between proteins. SCG clustering can be further developed as a reference for evolutionary classification of proteins.
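The self-consistency criterion stated in the abstract above can be expressed as a simple check. The Python sketch below is one possible reading, not the SCG algorithm itself: it assumes a symmetric pairwise similarity score and treats a group as self-consistent when its lowest within-group similarity exceeds the highest similarity between any member and any outsider. How candidate groups are generated is not described in the abstract and is not attempted here. The function name and the dictionary layout are hypothetical.

```python
from itertools import combinations


def is_self_consistent(group, universe, sim):
    """Check one reading of the self-consistency criterion.

    `sim` maps frozenset({a, b}) to a symmetric similarity score (assumed
    data layout). The group passes if every within-group pair is more
    similar than every member-outsider pair.
    """
    group = set(group)
    outside = set(universe) - group
    if len(group) < 2:
        return False  # a single protein is not treated as a group here
    weakest_inside = min(sim[frozenset(p)] for p in combinations(group, 2))
    strongest_across = max(
        (sim[frozenset((g, o))] for g in group for o in outside),
        default=float("-inf"),
    )
    return weakest_inside > strongest_across
```

Under this reading, a group fails as soon as one member is closer to an outside protein than to some fellow member, which is what makes the criterion stringent.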
Tackling Computational Challenges in High-Throughput RNA Interference Screening
(2014-04-07) Zhong, Rui; Minna, John D.; Shay, Jerry W.; White, Michael A.; Xie, Yang; Xiao, Guanghua
Since its discovery decades ago, RNAi has been increasingly used in biomedical and biological research. The success of analyzing single genes using siRNAs has resulted in the large-scale application of RNAi for genome-wide loss-of-function phenotype screening while reducing cost and decreasing time. High-throughput RNAi screening (HTS) has been widely accepted and used in a variety of biomedical and biological research projects as the first step to identifying novel drug targets or pathway components. Huge data sets are being generated, but computational challenges remain in data analysis and hit identification, which have become hurdles in HTS. These must be tackled before we can more accurately and precisely interpret the HTS results, since they are often blurred by spatial noise and off-target effects. In my thesis research, I have been working on statistical modeling of high-throughput RNAi screening results. I developed SbacHTS (spatial background noise correction in high-throughput RNAi screening) to identify and remove spatially correlated background noise from HTS, which helps enhance statistical detection power in triplicate experiments. On top of that, I also created a novel algorithm, DeciRNAi (deconvolution analysis of high-throughput RNAi screening results), to quantify the strength and direction of siRNA-mimic-miRNA off-target effects in HTS projects. As a special case, image-based high-content HTS requires management of high-dimensional data analysis and visualization. I built a new R package “iScreen” (image-based high-throughput RNAi screening analysis tools) to deal with such problems.
To Develop a Small Interfering RNA (siRNA) Design and Information Resource to Facilitate Genetic Manipulation of Human Cells
(2004-05-25) Shah, Jyoti Khetsi; Minna, John D.
Part I: Small interfering RNAs (siRNAs) have revolutionized our ability to study the effects of altering the expression of single genes in mammalian (and other) cells through targeted knockdown of gene expression. In the past, there was a set of rules for designing siRNAs that worked efficiently in most cases. More recent analyses have further refined these rules in an attempt to address the question of what most closely determines siRNA functionality. I have designed and implemented a new software tool, siRNA Information Resource ('sIR'), that incorporates the most recent refinements in the design algorithm in order to provide fast and efficient siRNA design. sIR is a web-based computational tool which takes these existing rules for designing synthetic siRNAs and puts them in a software architecture that allows the researcher to design siRNAs for every gene. It also provides a database of already developed siRNAs, drawn from the literature and various other sources, which the researcher can access. This will ultimately help in future siRNA-related discoveries. It also includes a scoring system which helps in rational selection of efficient siRNAs. sIR was successfully validated using already designed and developed target siRNA sequences. Part II: One of the major problems in using chemotherapy to treat cancer is whether patients whose tumors do not respond to one drug would respond to another. Thus, it would be very useful if one could rationally select the appropriate chemotherapy for each patient's tumor. We are asking whether tumor gene "expression signatures" detected by microarray analysis could identify a set of genes correlating with sensitivity or resistance to a particular drug. 
A large panel of breast cancer cell lines was tested with cisplatin, paclitaxel, vinorelbine, doxorubicin and gemcitabine in vitro, using a colorimetric assay to determine the concentration of drug that gives 50% growth inhibition (IC50). Gene expression profiling was also performed using Affymetrix chips, and the two data sets were merged. It was found that a panel of ~100 genes was significantly upregulated (4-fold or more) for each drug in resistant cells. As an alternative approach, Pearson correlations between each gene's expression data and each drug's IC50 across all cell lines analyzed were determined. A positive correlation for a gene-drug pair indicates the gene may be associated with resistance to the drug, whereas a negative correlation would associate that gene with sensitivity to the drug. Some of these genes might be associated with the drug's mechanism of action. We conclude that gene expression signatures do exist for individual breast tumor cell chemosensitivity and that these could be of clinical significance.
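The correlation step described in Part II above can be sketched in a few lines of Python. This is a generic Pearson correlation under the abstract's stated interpretation (a positive coefficient suggests resistance, a negative one suggests sensitivity). It is not the author's code, and any preprocessing of expression values or IC50 values (for example, log transformation) is not specified in the abstract and is left out. The function name is hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between one gene's expression values and one
    drug's IC50 values, measured across the same ordered set of cell lines."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)
```

A gene whose expression rises with IC50 across the cell lines yields a coefficient near +1, and one whose expression falls as IC50 rises yields a coefficient near -1.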
Toward Structural and Functional Predictions from Biological Sequences
(2018-05-25) Li, Wenlin; Otwinowski, Zbyszek; Grishin, Nick V.; Thomas, Philip J.; Rosenbaum, Daniel M.
Biological sequences, including DNA and protein sequences, are believed to encode sufficient information to determine the structure and function of biological molecules, which in turn decide the phenotypic traits of animals. Deciphering biological sequences is an important and multiscale problem that connects the information flow from genotypes to phenotypes. Recent advances in next-generation sequencing technology have provided vast amounts of sequencing data, demanding innovations in computational algorithms for better interpretation. I developed computational methodologies to understand biological sequences at various levels. At the primary sequence level, I analyzed the evolutionary information encoded in protein families and predicted the function (and active sites) of the proteins. To aid my sequence analysis, I developed a set of computational methodologies and deployed them as public web servers. At the protein structure level, I studied the plasticity of 3D structures and demonstrated its effect on the uncertainty of computational scoring algorithms. At the organism level, I developed new computational methodology to assemble and analyze complete genomes of butterflies and discovered convergent evolution in butterfly wing patterns. In conclusion, I advanced the knowledge of biological sequences at multiple levels using computational approaches.
[UT Southwestern Medical Center News]
(2008-01-23) Siegfried, Amanda