Results 1 - 5 of 5
1.
Front Bioinform ; 3: 1277923, 2023.
Article in English | MEDLINE | ID: mdl-37885757

ABSTRACT

MOTIVATION: For a number of neurological diseases, such as Alzheimer's disease and amyotrophic lateral sclerosis, certain genes are known to be involved in the disease mechanism. A common question is whether a structural variant in any such gene is related to drug response in clinical trials, and how this relationship can inform the drug development lifecycle. RESULTS: To this end, we introduce VariantSurvival, a tool that identifies changes in survival relative to structural variants within target genes. VariantSurvival matches annotated structural variants with genes that are clinically relevant to neurological diseases. A Cox regression model determines the change in survival between the placebo and clinical trial groups with respect to the number of structural variants in the drug target genes. We demonstrate the functionality of our approach using the SETX gene as an example. VariantSurvival has a user-friendly and lightweight graphical user interface built on the Shiny web application package.
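The Cox regression step mentioned in the abstract can be illustrated with a minimal sketch. This is not VariantSurvival's code (the tool is built on Shiny); it is a plain-Python Newton-Raphson fit of a Cox model with a single covariate, the structural-variant count, using the Breslow convention for ties. The function names and the toy data are hypothetical, and the sketch ignores the placebo/treatment grouping that the tool takes into account.

```python
import math


def cox_score_and_information(times, events, x, beta):
    """Return the score (gradient) and information (negative Hessian)
    of the Cox partial log-likelihood for one covariate."""
    score = 0.0
    info = 0.0
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if not d_i:
            continue  # censored subjects only appear in risk sets
        s0 = s1 = s2 = 0.0
        for t_j, x_j in zip(times, x):
            if t_j >= t_i:  # subject j is still at risk at time t_i
                w = math.exp(beta * x_j)
                s0 += w
                s1 += w * x_j
                s2 += w * x_j * x_j
        mean = s1 / s0
        score += x[i] - mean
        info += s2 / s0 - mean * mean
    return score, info


def fit_cox_single_covariate(times, events, x, iterations=50, tol=1e-10):
    """Estimate the log hazard ratio per unit of x by Newton-Raphson."""
    beta = 0.0
    for _ in range(iterations):
        score, info = cox_score_and_information(times, events, x, beta)
        if info == 0.0:
            break  # covariate does not vary within any risk set
        step = score / info
        beta += step
        if abs(step) < tol:
            break
    return beta


# Hypothetical toy data: follow-up time, event flag (1 = event, 0 = censored),
# and number of structural variants in a target gene.
times = [5, 8, 12, 3, 6, 10, 4, 9]
events = [1, 1, 0, 1, 1, 1, 1, 1]
sv_counts = [0, 0, 0, 2, 2, 1, 1, 1]

beta = fit_cox_single_covariate(times, events, sv_counts)
print("log hazard ratio per structural variant:", beta)
print("hazard ratio per structural variant:", math.exp(beta))
```

A positive coefficient here means that, in the toy data, each additional structural variant is associated with a higher hazard. A real analysis would use an established survival library and would model the treatment group as well.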

2.
BMC Bioinformatics ; 22(1): 71, 2021 Feb 16.
Article in English | MEDLINE | ID: mdl-33593271

ABSTRACT

BACKGROUND: Specialized data structures are required for online algorithms to efficiently handle large sequencing datasets. The counting quotient filter (CQF), a compact hash table, can efficiently store k-mers with a skewed distribution. RESULTS: Here, we present the mixed-counters quotient filter (MQF) as a new variant of the CQF with novel counting and labeling systems. The new counting system adapts to a wider range of data distributions for increased space efficiency and is faster than the CQF for insertions and queries in most of the tested scenarios. A buffered version of the MQF can offload storage to disk, trading speed of insertions and queries for a significant memory reduction. The labeling system provides a flexible framework for assigning labels to member items while maintaining good data locality and a concise memory representation. These labels serve as a minimal perfect hash function but are approximately tenfold faster than BBhash, with no need to re-analyze the original data for further insertions or deletions. CONCLUSIONS: The MQF is a flexible and efficient data structure that extends our ability to work with high-throughput sequencing data.


Subject(s)
Metadata; Software; Algorithms; High-Throughput Nucleotide Sequencing; Sequence Analysis, DNA
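The quotienting idea behind quotient filters such as the CQF can be sketched as follows. This is a toy illustration only, not the CQF or MQF implementation: each k-mer is hashed to a fixed-width fingerprint, the high bits (the quotient) select a slot, and only the low bits (the remainder) are stored alongside a count. A real quotient filter packs remainders into a flat array with metadata bits and linear probing; here a dictionary per slot stands in for that machinery, and the class name, bit widths, and hash choice are all assumptions.

```python
import hashlib
from collections import defaultdict


def fingerprint(kmer, p_bits):
    """Hash a k-mer to a p-bit integer fingerprint."""
    digest = hashlib.blake2b(kmer.encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big") & ((1 << p_bits) - 1)


class ToyQuotientCounter:
    """Approximate k-mer counter based on quotient/remainder splitting."""

    def __init__(self, q_bits=8, r_bits=8):
        self.q_bits = q_bits
        self.r_bits = r_bits
        # quotient -> {remainder: count}; a stand-in for the packed slot array
        self.slots = defaultdict(dict)

    def _split(self, kmer):
        fp = fingerprint(kmer, self.q_bits + self.r_bits)
        quotient = fp >> self.r_bits
        remainder = fp & ((1 << self.r_bits) - 1)
        return quotient, remainder

    def add(self, kmer):
        q, r = self._split(kmer)
        self.slots[q][r] = self.slots[q].get(r, 0) + 1

    def count(self, kmer):
        # May overcount if two k-mers share a fingerprint; never undercounts.
        q, r = self._split(kmer)
        return self.slots.get(q, {}).get(r, 0)


def kmers(sequence, k):
    """Yield every overlapping k-mer of a sequence."""
    for i in range(len(sequence) - k + 1):
        yield sequence[i:i + k]


counter = ToyQuotientCounter()
for km in kmers("ACGTACGTAC", 3):
    counter.add(km)
print("count for ACG:", counter.count("ACG"))
```

Because only the remainder is stored, two distinct k-mers with the same fingerprint are indistinguishable, which is why the structure gives approximate counts with a false-positive rate controlled by the remainder width.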
3.
BMC Bioinformatics ; 21(1): 435, 2020 Oct 06.
Article in English | MEDLINE | ID: mdl-33023475

ABSTRACT

An amendment to this paper has been published and can be accessed via the original article.

4.
BMC Bioinformatics ; 21(1): 397, 2020 Sep 09.
Article in English | MEDLINE | ID: mdl-32907531

ABSTRACT

BACKGROUND: Ion Torrent is one of the major next generation sequencing (NGS) technologies and is frequently used in medical research and diagnosis. The built-in software for the Ion Torrent sequencing machines delivers the sequencing results in the BAM format. In addition to the usual SAM/BAM fields, the Ion Torrent BAM file includes technology-specific flow signal data. The flow signals occupy a large portion of the BAM file (about 75% for the human genome). Compressing SAM/BAM into CRAM format significantly reduces the space needed to store the NGS results. However, the tools for generating the CRAM format are not designed to handle the flow signals. This missing feature motivated us to develop a new program to improve the compression of Ion Torrent files for long-term archiving. RESULTS: In this paper, we present IonCRAM, the first reference-based compression tool to compress Ion Torrent BAM files for long-term archiving. For the BAM files, IonCRAM could achieve a space saving of about 43%. This space saving is superior to what is achieved with the CRAM format by about 8-9%. CONCLUSIONS: Reducing the space consumption of NGS data reduces the cost of storage and data transfer. Therefore, developing efficient compression software for clinical NGS data goes beyond computational interest, as it ultimately contributes to reducing the overall cost of the clinical test. The space saving achieved by our tool is a practical step in this direction. The tool is open source and available at Code Ocean, GitHub, and http://ioncram.saudigenomeproject.com.


Subject(s)
User-Computer Interface; Algorithms; Databases, Genetic; Genome, Human; High-Throughput Nucleotide Sequencing/methods; Humans
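For readers unfamiliar with the metric, the "space saving" figure quoted in the abstract is conventionally the fraction of the original size eliminated by compression. The sketch below assumes that conventional definition; the abstract does not spell out its formula, and the file sizes used are hypothetical, chosen only so that the result is about 43%.

```python
def space_saving(original_bytes, compressed_bytes):
    """Fraction of the original size removed by compression."""
    if original_bytes <= 0:
        raise ValueError("original size must be positive")
    return 1.0 - compressed_bytes / original_bytes


# Hypothetical sizes, not taken from the paper.
original = 100 * 10**9    # a 100 GB BAM file
compressed = 57 * 10**9   # the same data after compression
print(f"space saving: {space_saving(original, compressed):.0%}")
```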
5.
PLoS One ; 15(8): e0237087, 2020.
Article in English | MEDLINE | ID: mdl-32813723

ABSTRACT

Water buffalo (Bubalus bubalis) is an important source of meat and milk in countries with relatively warm weather. Compared to the cattle genome, little has been done to reveal its genome structure and genomic traits. This is due to complications stemming from the large genome size, the complexity of the genome, and the high repetitive content. In this paper, we introduce a high-quality draft assembly of the Egyptian water buffalo genome. The Egyptian breed is used as a dual-purpose animal (milk/meat). It is distinguished by its adaptability to the local environment and to changes in feed quality, as well as its high resistance to diseases. The genome assembly of the Egyptian water buffalo was achieved using a reference-based assembly workflow. Our workflow significantly reduced the computational complexity of the assembly process and improved the assembly quality by integrating different public resources. We also compared our assembly to the currently available draft assemblies of water buffalo breeds. A total of 21,128 genes were identified in the produced assembly. Lists of milk-related genes associated with the virgin, pregnancy, lactation, involution, and mastitis states were identified in the assembly. Our results will significantly contribute to a better understanding of the genetics of the Egyptian water buffalo, which will eventually support ongoing breeding efforts and facilitate the future discovery of genes responsible for complex processes such as dairy and meat production and disease resistance, among other significant traits.


Subject(s)
Buffaloes/genetics; Genome; Animals; Molecular Sequence Annotation; Whole Genome Sequencing