Results 1 - 16 of 16
1.
Nucleic Acids Res ; 52(D1): D513-D521, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-37962356

ABSTRACT

In this update paper, we present the latest developments in the OMA browser knowledgebase, which aims to provide high-quality orthology inferences and facilitate the study of gene families, genomes and their evolution. First, we discuss the addition of new species in the database, particularly an expanded representation of prokaryotic species. The OMA browser now offers Ancestral Genome pages and an Ancestral Gene Order viewer, allowing users to explore the evolutionary history and gene content of ancestral genomes. We also introduce a revamped Local Synteny Viewer to compare genomic neighborhoods across both extant and ancestral genomes. Hierarchical Orthologous Groups (HOGs) are now annotated with Gene Ontology annotations, and users can easily perform extant or ancestral GO enrichments. Finally, we recap new tools in the OMA Ecosystem, including OMAmer for proteome mapping, OMArk for proteome quality assessment, OMAMO for model organism selection and Read2Tree for phylogenetic species tree construction from reads. These new features provide exciting opportunities for orthology analysis and comparative genomics. OMA is accessible at https://omabrowser.org.


Subjects
Databases, Genetic; Ecosystem; Genome; Proteome; Genome/genetics; Phylogeny; Synteny; Internet; Gene Order/genetics
2.
BMC Genomics ; 25(1): 390, 2024 Apr 22.
Article in English | MEDLINE | ID: mdl-38649807

ABSTRACT

Medicinal plants are rich sources for treating various diseases due to their bioactive secondary metabolites. Fenugreek (Trigonella foenum-graecum) is a medicinal plant traditionally used in human nutrition and medicine; it contains an active substance, diosgenin, with anticancer properties. Biosynthesis of this important anticancer compound in fenugreek can be enhanced using eliciting agents, which manipulate metabolite and biochemical pathways and stimulate defense responses. Methyl jasmonate has been used as an elicitor to increase diosgenin biosynthesis in fenugreek plants. However, the molecular mechanism and gene expression profiles underlying diosgenin accumulation remain unexplored. In the current study we performed an extensive analysis of publicly available RNA-sequencing datasets to elucidate the biosynthesis and expression profile of fenugreek plants treated with methyl jasmonate. For this purpose, seven read datasets of methyl jasmonate-treated plants were obtained, covering several post-treatment time points (6-120 h). Transcriptomics analysis revealed upregulation of several key genes involved in the diosgenin biosynthetic pathway, including Squalene Synthase (SQS), the first committed step in diosgenin biosynthesis, as well as Squalene Epoxidase (SEP) and Cycloartenol Synthase (CAS), upon methyl jasmonate application. Bioinformatics analysis, including gene ontology enrichment and pathway analysis, further supported the involvement of these genes in diosgenin biosynthesis. The bioinformatics analysis led to a comprehensive validation, with expression profiling across three different fenugreek populations treated with the same methyl jasmonate application. Initially, key genes like SQS, SEP, and CAS showed upregulation, followed by later upregulation of Δ24, suggesting dynamic pathway regulation. Real-time PCR confirmed consistent upregulation of SQS and SEP, peaking at 72 h. Additionally, candidate genes Δ24 and SMT1 highlighted roles in directing metabolic flux towards diosgenin biosynthesis. This integrated approach validates the bioinformatics findings and elucidates fenugreek's molecular response to methyl jasmonate elicitation, offering insights for enhancing diosgenin yield. The assembled transcripts and gene expression profiles are deposited in the Zenodo open repository at https://doi.org/10.5281/zenodo.8155183.


Subjects
Biosynthetic Pathways; Gene Expression Profiling; Oxylipins; Terpenes; Transcriptome; Trigonella; Trigonella/metabolism; Trigonella/genetics; Biosynthetic Pathways/drug effects; Biosynthetic Pathways/genetics; Terpenes/metabolism; Oxylipins/pharmacology; Cyclopentanes/pharmacology; Cyclopentanes/metabolism; Acetates/pharmacology; Gene Expression Regulation, Plant/drug effects
3.
Bioinformatics ; 38(10): 2965-2966, 2022 05 13.
Article in English | MEDLINE | ID: mdl-35561194

ABSTRACT

SUMMARY: The conservation of pathways and genes across species has allowed scientists to use non-human model organisms to gain a deeper understanding of human biology. However, the use of traditional model systems such as mice, rats and zebrafish is costly, time-consuming and increasingly raises ethical concerns, which highlights the need to search for less complex model organisms. Existing tools only focus on the few well-studied model systems, most of which are complex animals. To address these issues, we have developed Orthologous Matrix and Alternative Model Organism (OMAMO), a software and a web service that provides the user with the best non-complex organism for research into a biological process of interest based on orthologous relationships between human and the species. The outputs provided by OMAMO were supported by a systematic literature review. AVAILABILITY AND IMPLEMENTATION: https://omabrowser.org/omamo/, https://github.com/DessimozLab/omamo. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Software; Zebrafish; Animals; Mice; Rats; Zebrafish/genetics
4.
BMC Bioinformatics ; 21(1): 253, 2020 Jun 18.
Article in English | MEDLINE | ID: mdl-32552661

ABSTRACT

BACKGROUND: Haplotype information is essential for many genetic and genomic analyses, including genotype-phenotype associations in humans, animals and plants. Haplotype assembly is a method for reconstructing haplotypes from DNA sequencing reads. With the advent of new sequencing technologies, new algorithms are needed to ensure long and accurate haplotypes. While a few linked-read haplotype assembly algorithms are available for diploid genomes, to the best of our knowledge, no algorithms have yet been proposed for polyploids specifically exploiting linked reads. RESULTS: The first haplotyping algorithm designed for linked reads generated from a polyploid genome is presented, built on a typical short-read haplotyping method, SDhaP. Using the input aligned reads and called variants, the haplotype-relevant information is extracted. Next, reads with the same barcodes are combined to produce molecule-specific fragments. Then, these fragments are clustered into strongly connected components, which are then used as input to a haplotype assembly core in order to estimate accurate and long haplotypes. CONCLUSIONS: Hap10 is a novel algorithm for haplotype assembly of polyploid genomes using linked reads. The performance of the algorithm is evaluated in a number of simulation scenarios and its applicability is demonstrated on a real dataset of sweet potato.
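
The barcode-merging step mentioned in the abstract can be illustrated with a minimal sketch. This is not Hap10's actual implementation: the function name `merge_by_barcode`, the input layout (a barcode paired with a SNP-index-to-allele mapping) and the tie-handling rule are all assumptions made for illustration, and the sketch ignores the possibility that one barcode tags several distinct molecules.

```python
from collections import Counter, defaultdict


def merge_by_barcode(reads):
    """Combine reads sharing a barcode into one fragment per barcode.

    reads: iterable of (barcode, {snp_index: allele}) pairs (hypothetical layout).
    Returns {barcode: {snp_index: allele}}.
    """
    # votes[barcode][site] counts how often each allele was observed.
    votes = defaultdict(lambda: defaultdict(Counter))
    for barcode, alleles in reads:
        for site, allele in alleles.items():
            votes[barcode][site][allele] += 1

    fragments = {}
    for barcode, sites in votes.items():
        frag = {}
        for site, counts in sorted(sites.items()):
            (top, n_top), *rest = counts.most_common()
            # Keep a site only when one allele is the unique most common one;
            # tied sites are dropped rather than guessed (an assumption).
            if not rest or n_top > rest[0][1]:
                frag[site] = top
        fragments[barcode] = frag
    return fragments
```

Dropping tied sites is one possible design choice; a real pipeline might instead weight observations by base quality.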


Subjects
Genome, Human/genetics; Haplotypes/physiology; Polyploidy; Algorithms; Humans
5.
Nat Biotechnol ; 42(1): 139-147, 2024 Jan.
Article in English | MEDLINE | ID: mdl-37081138

ABSTRACT

Current methods for inference of phylogenetic trees require running complex pipelines at substantial computational and labor costs, with additional constraints in sequencing coverage, assembly and annotation quality, especially for large datasets. To overcome these challenges, we present Read2Tree, which directly processes raw sequencing reads into groups of corresponding genes and bypasses traditional steps in phylogeny inference, such as genome assembly, annotation and all-versus-all sequence comparisons, while retaining accuracy. In a benchmark encompassing a broad variety of datasets, Read2Tree is 10-100 times faster than assembly-based approaches and in most cases more accurate, the exception being when sequencing coverage is high and reference species very distant. Here, to illustrate the broad applicability of the tool, we reconstruct a yeast tree of life of 435 species spanning 590 million years of evolution. We also apply Read2Tree to >10,000 Coronaviridae samples, accurately classifying highly diverse animal samples and near-identical severe acute respiratory syndrome coronavirus 2 sequences on a single tree. The speed, accuracy and versatility of Read2Tree enable comparative genomics at scale.


Subjects
Genomics; Animals; Phylogeny; Sequence Analysis; Genomics/methods
6.
Genome Biol ; 25(1): 270, 2024 Oct 14.
Article in English | MEDLINE | ID: mdl-39402664

ABSTRACT

The exponential increase in sequencing data calls for conceptual and computational advances to extract useful biological insights. One such advance, minimizers, allows for reducing the quantity of data handled while maintaining some of its key properties. We provide a basic introduction to minimizers, cover recent methodological developments, and review the diverse applications of minimizers to analyze genomic data, including de novo genome assembly, metagenomics, read alignment, read correction, and pangenomes. We also touch on alternative data sketching techniques including universal hitting sets, syncmers, or strobemers. Minimizers and their alternatives have rapidly become indispensable tools for handling vast amounts of data.
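
As a concrete illustration of the minimizer idea summarized above, the following sketch selects, for every window of `w` consecutive k-mers, the smallest k-mer. It assumes plain lexicographic ordering and leftmost tie-breaking purely for simplicity (practical tools typically use hashed orderings), and the function name `minimizers` is hypothetical rather than taken from any particular tool.

```python
def minimizers(seq: str, k: int, w: int) -> list[tuple[int, str]]:
    """Return the (position, k-mer) minimizers of seq.

    A window holds w consecutive k-mers; its minimizer is the
    lexicographically smallest k-mer (leftmost on ties).
    """
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    selected: list[tuple[int, str]] = []
    for start in range(len(kmers) - w + 1):
        window = kmers[start:start + w]
        # min() returns the first index achieving the smallest k-mer.
        offset = min(range(w), key=lambda j: window[j])
        pick = (start + offset, window[offset])
        # Adjacent windows often share a minimizer; record it only once.
        if not selected or selected[-1] != pick:
            selected.append(pick)
    return selected


print(minimizers("ACGTTGCA", k=3, w=3))
# → [(0, 'ACG'), (1, 'CGT'), (2, 'GTT'), (5, 'GCA')]
```

Here six k-mers are reduced to four sampled positions; on longer sequences with larger windows the reduction is more pronounced, which is the data-reduction property the abstract refers to.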


Subjects
Genomics; Genomics/methods; Metagenomics/methods; Humans; Software
7.
Nat Commun ; 15(1): 9029, 2024 Oct 19.
Article in English | MEDLINE | ID: mdl-39424793

ABSTRACT

Despite the growing variety of sequencing and variant-calling tools, no workflow performs equally well across the entire human genome. Understanding context-dependent performance is critical for enabling researchers, clinicians, and developers to make informed tradeoffs when selecting sequencing hardware and software. Here we describe a set of "stratifications," which are BED files that define distinct contexts throughout the genome. We define these for GRCh37/38 as well as the new T2T-CHM13 reference, adding many new hard-to-sequence regions which are critical for understanding performance as the field progresses. Specifically, we highlight the increase in hard-to-map and GC-rich stratifications in CHM13 relative to the previous references. We then compare the benchmarking performance with each reference and show the performance penalty brought about by these additional difficult regions in CHM13. Additionally, we demonstrate how the stratifications can track context-specific improvements over different platform iterations, using Oxford Nanopore Technologies as an example. The means to generate these stratifications are available as a snakemake pipeline at https://github.com/usnistgov/giab-stratifications. We anticipate this being useful in enabling precise risk-reward calculations when building sequencing pipelines for any of the commonly-used reference genomes.
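
Since the stratifications are distributed as BED files, a small sketch can show how a variant position might be tested against one. The helper names `load_bed` and `in_stratification` are hypothetical and not part of the giab-stratifications pipeline; the sketch assumes the intervals within each chromosome do not overlap and relies only on the standard BED convention of 0-based, half-open coordinates.

```python
import bisect


def load_bed(lines):
    """Parse BED lines into {chrom: sorted list of (start, end)}."""
    regions = {}
    for line in lines:
        # Skip blank lines and the optional BED header/comment lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        regions.setdefault(chrom, []).append((int(start), int(end)))
    for intervals in regions.values():
        intervals.sort()
    return regions


def in_stratification(regions, chrom, pos0):
    """True if the 0-based position pos0 lies in an interval on chrom.

    Assumes non-overlapping intervals, so only the last interval
    starting at or before pos0 needs to be checked.
    """
    intervals = regions.get(chrom, [])
    i = bisect.bisect_right(intervals, (pos0, float("inf"))) - 1
    return i >= 0 and intervals[i][0] <= pos0 < intervals[i][1]
```

Counting true and false calls separately for positions inside and outside a stratification is one way to obtain the context-specific performance figures the abstract describes.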


Subjects
Genome, Human; Genomics; Software; Humans; Genomics/methods; Sequence Analysis, DNA/methods; Benchmarking; High-Throughput Nucleotide Sequencing/methods
8.
Genome Biol Evol ; 16(10)2024 Oct 09.
Article in English | MEDLINE | ID: mdl-39404012

ABSTRACT

The era of biodiversity genomics is characterized by large-scale genome sequencing efforts that aim to represent each living taxon with an assembled genome. Generating knowledge from this wealth of data has not kept up with this pace. We here discuss major challenges to integrating these novel genomes into a comprehensive functional and evolutionary network spanning the tree of life. In summary, the expanding datasets create a need for scalable gene annotation methods. To trace gene function across species, new methods must seek to increase the resolution of ortholog analyses, e.g. by extending analyses to the protein domain level and by accounting for alternative splicing. Additionally, the scope of orthology prediction should be pushed beyond well-investigated proteomes. This demands the development of specialized methods for the identification of orthologs to short proteins and noncoding RNAs and for the functional characterization of novel gene families. Furthermore, protein structures predicted by machine learning are now readily available, but this new information is yet to be integrated with orthology-based analyses. Finally, an increasing focus should be placed on making orthology assignments adhere to the findable, accessible, interoperable, and reusable (FAIR) principles. This fosters green bioinformatics by avoiding redundant computations and helps integrate diverse scientific communities sharing the need for comparative genetics and genomics information. It should also help with communicating orthology-related concepts in a format that is accessible to the public, to counteract existing misinformation about evolution.


Subjects
Biodiversity; Genomics; Genomics/methods; Animals; Evolution, Molecular; Molecular Sequence Annotation; Computational Biology/methods
9.
Genome Biol ; 24(1): 221, 2023 10 05.
Article in English | MEDLINE | ID: mdl-37798733

ABSTRACT

Genomic benchmark datasets are essential to driving the field of genomics and bioinformatics. They provide a snapshot of the performances of sequencing technologies and analytical methods and highlight future challenges. However, they depend on sequencing technology, reference genome, and available benchmarking methods. Thus, creating a genomic benchmark dataset is laborious and highly challenging, often involving multiple sequencing technologies, different variant calling tools, and laborious manual curation. In this review, we discuss the available benchmark datasets and their utility. Additionally, we focus on the most recent benchmark of genes with medical relevance and challenging genomic complexity.


Subjects
Benchmarking; Genomics; Genomics/methods; Computational Biology/methods; Genome; High-Throughput Nucleotide Sequencing/methods
10.
bioRxiv ; 2022 Dec 13.
Article in English | MEDLINE | ID: mdl-36561179

ABSTRACT

The inference of phylogenetic trees is foundational to biology. However, state-of-the-art phylogenomics requires running complex pipelines, at significant computational and labour costs, with additional constraints in sequencing coverage, assembly and annotation quality. To overcome these challenges, we present Read2Tree, which directly processes raw sequencing reads into groups of corresponding genes. In a benchmark encompassing a broad variety of datasets, our assembly-free approach was 10-100x faster than conventional approaches, and in most cases more accurate, the exception being when sequencing coverage was high and reference species very distant. To illustrate the broad applicability of the tool, we reconstructed a yeast tree of life of 435 species spanning 590 million years of evolution. Applied to Coronaviridae samples, Read2Tree accurately classified highly diverse animal samples and near-identical SARS-CoV-2 sequences on a single tree, thereby exhibiting remarkable breadth and depth. The speed, accuracy, and versatility of Read2Tree enable comparative genomics at scale.

11.
F1000Res ; 11: 530, 2022.
Article in English | MEDLINE | ID: mdl-36262335

ABSTRACT

In October 2021, 59 scientists from 14 countries and 13 U.S. states collaborated virtually in the Third Annual Baylor College of Medicine & DNANexus Structural Variation hackathon. The goal of the hackathon was to advance research on structural variants (SVs) by prototyping and iterating on open-source software. This led to nine hackathon projects focused on diverse genomics research interests, including various SV discovery and genotyping methods, SV sequence reconstruction, and clinically relevant structural variation, including SARS-CoV-2 variants. Repositories for the projects that participated in the hackathon are available at https://github.com/collaborativebioinformatics.


Subjects
COVID-19; SARS-CoV-2; Humans; SARS-CoV-2/genetics; Genomics; Software
12.
FEMS Microbiol Ecol ; 97(8)2021 08 03.
Article in English | MEDLINE | ID: mdl-34289042

ABSTRACT

The microbial communities associated with the rhizosphere (the rhizomicrobiome) have a substantial impact on plant growth and yield. Understanding the effects of agricultural management on the rhizomicrobiome is very important for selecting efficient practices. By sequencing the V4 region of 16S rRNA for bacteria and the ITS1 region for fungi, we investigated the influences of agronomic practices, including cucumber grafting on cucurbit hybrid (Cucurbita moschata × C. maxima), cucumber-garlic intercropping, and treatment with fungicide iprodione-carbendazim, on cucumber rhizosphere microbial communities during plant growth. Soil dehydrogenase activity (DHA) and plant vegetative parameters were assessed as an indicator of overall soil microbial activity. We found that both treatments and growth stage induced significant shifts in microbial community structure. Grafting had the highest number of differentially abundant OTUs compared to control samples, followed by intercropping and fungicide treatment, while plant development stage affected both alpha and beta diversity indices and composition of the rhizomicrobiome. DHA was more dependent on plant growth stages than on treatments. Among the assessed factors, grafting and plant developmental stage resulted in the greatest changes in the microbial community composition. Grafting also increased the plant growth parameters, suggesting that this method should be further investigated in vegetable production systems.


Subjects
Cucumis sativus; Microbiota; Plant Roots; RNA, Ribosomal, 16S/genetics; Rhizosphere; Soil; Soil Microbiology
13.
Gigascience ; 9(7)2020 07 01.
Article in English | MEDLINE | ID: mdl-32706368

ABSTRACT

BACKGROUND: The detection of which mutations are occurring on the same DNA molecule is essential to predict their consequences. This can be achieved by phasing the genomic variations. Nevertheless, state-of-the-art haplotype phasing is currently a black box in which the accuracy and quality of the reconstructed haplotypes are hard to assess. FINDINGS: Here we present PhaseME, a versatile method to provide insights into and improvement of sample phasing results based on linkage data. We showcase the performance and the importance of PhaseME by comparing phasing information obtained from Pacific Biosciences including both continuous long reads and high-quality consensus reads, Oxford Nanopore Technologies, 10x Genomics, and Illumina sequencing technologies. We found that 10x Genomics and Oxford Nanopore phasing can be significantly improved while retaining a high N50 and completeness of phase blocks. PhaseME generates reports and summary plots to provide insights into phasing performance and correctness. We observed unique phasing issues for each of the sequencing technologies, highlighting the necessity of quality assessments. PhaseME is able to decrease the Hamming error rate significantly by 22.4% on average across all 5 technologies. Additionally, a significant improvement is obtained in the reduction of long switch errors. Especially for high-quality consensus reads, the improvement is 54.6% in return for only a 5% decrease in phase block N50 length. CONCLUSIONS: PhaseME is a universal method to assess the phasing quality and accuracy and improves the quality of phasing using linkage information. The package is freely available at https://github.com/smajidian/phaseme.
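
The two error measures named in the abstract can be sketched for a single phase block as follows. This is not PhaseME's code: the function names are hypothetical, the sketch assumes a diploid sample with every site heterozygous and biallelic (so one haplotype, encoded as 0/1, determines the other), and it counts every phase flip without distinguishing long switches from isolated flips.

```python
def hamming_error_rate(pred: list[int], truth: list[int]) -> float:
    """Fraction of wrongly phased sites, using the better of the two
    possible pairings between predicted and true haplotypes."""
    mismatches = sum(p != t for p, t in zip(pred, truth))
    n = len(truth)
    return min(mismatches, n - mismatches) / n


def switch_errors(pred: list[int], truth: list[int]) -> int:
    """Number of adjacent site pairs where the prediction changes from
    agreeing with the truth to disagreeing, or vice versa."""
    agree = [p == t for p, t in zip(pred, truth)]
    return sum(1 for a, b in zip(agree, agree[1:]) if a != b)


truth = [0, 1, 0, 0, 1, 1]
pred = [0, 1, 0, 1, 0, 0]   # phase flips once, after the third site
print(hamming_error_rate(pred, truth), switch_errors(pred, truth))
# → 0.5 1
```

The example shows why both measures are reported: a single switch in the middle of a block already yields the worst possible Hamming error rate.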


Subjects
Computational Biology/methods; Genomics/methods; Software; Genomics/standards; Haplotypes; Humans; Mutation; Polymorphism, Single Nucleotide; Sequence Analysis, DNA/methods; Workflow
14.
PLoS One ; 15(6): e0234470, 2020.
Article in English | MEDLINE | ID: mdl-32530974

ABSTRACT

The single nucleotide polymorphism (SNP) is the most widely studied type of genetic variation. A haplotype is defined as the sequence of alleles at SNP sites on each haploid chromosome. Haplotype information is essential in unravelling the genome-phenotype association. Haplotype assembly is a well-known approach for reconstructing haplotypes, exploiting reads generated by DNA sequencing devices. The Minimum Error Correction (MEC) metric is often used for reconstruction of haplotypes from reads. However, problems with the MEC metric have been reported. Here, we investigate the MEC approach to demonstrate that it may result in incorrectly reconstructed haplotypes for devices that produce error-prone long reads. Specifically, we evaluate this approach for devices developed by Illumina, Pacific BioSciences and Oxford Nanopore Technologies. We show that imprecise haplotypes may be reconstructed with a lower MEC than that of the exact haplotype. The performance of MEC is explored for different coverage levels and error rates of data. Our simulation results reveal that in order to avoid incorrect MEC-based haplotypes, a coverage of 25 is needed for reads generated by Pacific BioSciences RS systems.
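
To make the MEC metric concrete, the sketch below scores a candidate diploid haplotype against a set of read fragments. The function name `mec_score` and the fragment layout (a mapping from SNP index to observed allele) are assumptions for illustration, and the second haplotype is taken to be the complement of the first, i.e. all sites are treated as heterozygous.

```python
def mec_score(fragments: list[dict[int, int]], hap: list[int]) -> int:
    """Minimum Error Correction score of a diploid haplotype pair.

    Each fragment maps SNP index -> observed allele (0 or 1).
    The score is the number of allele observations that must be
    corrected so that every fragment matches one of the two haplotypes.
    """
    comp = [1 - a for a in hap]
    total = 0
    for frag in fragments:
        mism_hap = sum(1 for site, allele in frag.items() if hap[site] != allele)
        mism_comp = sum(1 for site, allele in frag.items() if comp[site] != allele)
        # A fragment is charged against whichever haplotype fits it best.
        total += min(mism_hap, mism_comp)
    return total


fragments = [{0: 0, 1: 0, 2: 1}, {1: 1, 2: 0, 3: 1}, {0: 0, 1: 1}]
print(mec_score(fragments, [0, 0, 1, 0]))
# → 1
```

Because the score depends only on agreement with the reads, a haplotype that fits erroneous reads well can score lower than the true one, which is the failure mode the abstract examines.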


Subjects
Electronic Data Processing/methods; Haplotypes/genetics; Polymorphism, Single Nucleotide/genetics; Scientific Experimental Error; Data Analysis; Genome, Human; Humans; Sequence Analysis, DNA/instrumentation; Sequence Analysis, DNA/methods
15.
Gene ; 754: 144856, 2020 Sep 05.
Article in English | MEDLINE | ID: mdl-32512160

ABSTRACT

Growing evidence indicates the antitumor and antiangiogenesis activities of testis-specific gene antigen 10 (TSGA10). However, the underlying mechanisms and precise role of TSGA10 in angiogenesis are still elusive. In this study, we isolated human umbilical cord vein endothelial cells (HUVECs) and stably transfected them with pcDNA3.1 carrying the TSGA10 coding sequence. We demonstrated that TSGA10 over-expression significantly decreases HUVEC tubulogenesis and interconnected capillary network formation. HUVECs over-expressing TSGA10 exhibited a significant decrease in migration and proliferation rates. TSGA10 over-expression markedly decreased expression of angiogenesis-related genes, including VEGF-A, VEGFR-2, Ang-1, Ang-2, and Tie-2. Our ELISA results showed that the decrease in VEGF-A mRNA expression level is associated with a significant decrease in its protein secretion. Additionally, over-expressing TSGA10 decreased expression levels of marker genes of cell migration (MMP-2, MMP-9, and SDF-1a) and proliferation (PCNA and Ki-67). Furthermore, ERK-1 and AKT phosphorylation was significantly reduced in HUVECs over-expressing TSGA10. Our findings suggest a potent anti-angiogenesis activity of TSGA10 in HUVECs through down-regulation of ERK and AKT signalling pathways, and may provide therapeutic benefits for the management of different pathological angiogenesis.


Subjects
Angiogenesis Inhibitors/metabolism; Cell Movement; Cell Proliferation; Cytoskeletal Proteins/metabolism; Human Umbilical Vein Endothelial Cells/cytology; Human Umbilical Vein Endothelial Cells/metabolism; Neovascularization, Physiologic; Angiogenesis Inhibitors/genetics; Cytoskeletal Proteins/genetics; Extracellular Signal-Regulated MAP Kinases/genetics; Extracellular Signal-Regulated MAP Kinases/metabolism; Humans; Matrix Metalloproteinases/genetics; Matrix Metalloproteinases/metabolism; Mitogen-Activated Protein Kinase 3/genetics; Mitogen-Activated Protein Kinase 3/metabolism; Signal Transduction; Vascular Endothelial Growth Factor Receptor-2/genetics; Vascular Endothelial Growth Factor Receptor-2/metabolism
16.
PLoS One ; 14(3): e0214455, 2019.
Article in English | MEDLINE | ID: mdl-30913270

ABSTRACT

We apply matrix completion methods for haplotype assembly from NGS reads to develop the new HapSVT, HapNuc, and HapOPT algorithms. This is performed by applying a mathematical model to convert the reads to an incomplete matrix and estimating unknown components. This process is followed by quantizing and decoding the completed matrix in order to estimate haplotypes. These algorithms are compared to the state-of-the-art algorithms using simulated data as well as the real fosmid data. It is shown that the SNP missing rate and the haplotype block length of the proposed HapOPT are better than those of HapCUT2 with comparable accuracy in terms of reconstruction rate and switch error rate. A program implementing the proposed algorithms in MATLAB is freely available at https://github.com/smajidian/HapMC.


Subjects
Haplotypes/genetics; High-Throughput Nucleotide Sequencing; Algorithms; Benchmarking; Cosmids/genetics; Models, Genetic; Polymorphism, Single Nucleotide