Results 1 - 18 of 18
1.
Nat Commun ; 15(1): 372, 2024 Jan 08.
Article in English | MEDLINE | ID: mdl-38191463

ABSTRACT

Homing-based gene drives are recently proposed interventions promising the area-wide, species-specific genetic control of harmful insect populations. Here we characterise a first set of gene drives in a tephritid agricultural pest species, the Mediterranean fruit fly Ceratitis capitata (medfly). Our results show that the medfly is highly amenable to homing-based gene drive strategies. By targeting the medfly transformer gene, we also demonstrate how CRISPR-Cas9 gene drive can be coupled to sex conversion, whereby genetic females are transformed into fertile and harmless XX males. Given this unique malleability of sex determination, we modelled gene drive interventions that couple sex conversion and female sterility and found that such approaches could be effective and tolerant of resistant allele selection in the target population. Our results open the door for developing gene drive strains for the population suppression of the medfly and related tephritid pests by co-targeting female reproduction and shifting the reproductive sex ratio towards males. They demonstrate the untapped potential for gene drives to tackle agricultural pests in an environmentally friendly and economical way.


Subjects
Ceratitis capitata, Gene Drive Technology, Female, Male, Animals, Ceratitis capitata/genetics, Agriculture, Alleles, Electric Power Supplies
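The abstract above mentions modelling of gene drive interventions. As a rough illustration of why a homing drive spreads, the following minimal sketch iterates the textbook recursion for a drive allele under random mating. It assumes no fitness cost, no resistance alleles, and no sex conversion, so it is not the model used in the paper; the function name and parameters are hypothetical.

```python
def drive_frequency_trajectory(q0, homing_efficiency, generations):
    """Return drive-allele frequencies for generation 0..generations.

    Illustrative assumptions (not from the paper): random mating,
    no fitness cost, no resistance alleles, no sex conversion.
    """
    freqs = [q0]
    q = q0
    for _ in range(generations):
        # Heterozygotes transmit the drive with probability (1 + e) / 2,
        # so q' = q^2 + 2q(1 - q)(1 + e)/2 = q + e * q * (1 - q).
        q = q + homing_efficiency * q * (1.0 - q)
        freqs.append(q)
    return freqs


# Hypothetical usage: 1% starting frequency, 90% homing efficiency.
trajectory = drive_frequency_trajectory(0.01, 0.9, 12)
for generation, q in enumerate(trajectory):
    print(f"generation {generation}: drive allele frequency {q:.3f}")
```

With a homing efficiency of zero the recursion reduces to ordinary Mendelian inheritance and the frequency stays constant, which is a quick sanity check on the formula.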
2.
Mem Inst Oswaldo Cruz ; 118: e230122, 2023.
Article in English | MEDLINE | ID: mdl-37937604

ABSTRACT

BACKGROUND: Epstein-Barr virus (EBV) is a human gammaherpesvirus etiologically linked to several benign and malignant diseases. EBV-associated malignancies exhibit an unusual global distribution that might be partly attributed to virus and host genetic backgrounds. OBJECTIVES: To assemble a new genome of EBV (CEMO3) from a paediatric Burkitt's lymphoma from Rio de Janeiro State (Southeast Brazil). In addition, to perform global phylogenetic analysis using complete EBV genomes, including CEMO3, and investigate the genetic relationship of some South American (SA) genomes through EBV subgenomic targets. METHODS: CEMO3 was sequenced through next-generation sequencing and its coverage and gaps were corrected through the Sanger method. CEMO3 and 67 EBV genomes representing diverse geographic regions were evaluated through maximum likelihood phylogenetic analysis. Further, the polymorphism of subgenomic regions of some SA EBV genomes was assessed. FINDINGS: The whole bulk tumour sequencing yielded 23,217 reads related to EBV, from which the 172,713 base pairs of the new EBV genome CEMO3 were assembled. CEMO3 and most SA EBV genomes clustered within the SA subclade closely related to the African Raji strain, forming the South American/Raji clade. Notably, these Raji-related genomes exhibit significant genetic diversity, characterised by distinctive synapomorphies at some gene levels absent in the original Raji strain. CONCLUSION: CEMO3 represents a newly assembled South American EBV genome. Although the majority of EBV genomes from SA are Raji-related, they harbour a high diversity that distinguishes them from the original Raji strain.


Subjects
Epstein-Barr Virus Infections, Human Herpesvirus 4, Child, Humans, Human Herpesvirus 4/genetics, Epstein-Barr Virus Infections/genetics, Epstein-Barr Virus Infections/pathology, Phylogeny, Viral Genome/genetics, Brazil
3.
DNA Res ; 30(1)2023 Feb 01.
Article in English | MEDLINE | ID: mdl-36370138

ABSTRACT

The New World Screwworm, Cochliomyia hominivorax (Calliphoridae), is the most important myiasis-causing species in America. Screwworm myiasis is a zoonosis that can cause severe lesions in livestock, domesticated and wild animals, and occasionally in people. Beyond the sanitary problems associated with this species, these infestations negatively impact economic sectors, such as the cattle industry. Here, we present a chromosome-scale assembly of C. hominivorax's genome, organized in 6 chromosome-length and 515 unplaced scaffolds spanning 534 Mb. There was a clear correspondence between the D. melanogaster linkage groups A-E and the chromosomal-scale scaffolds. Chromosome quotient (CQ) analysis identified a single scaffold from the X chromosome that contains most of the orthologs of genes that are on the D. melanogaster fourth chromosome (linkage group F or dot chromosome). CQ analysis also identified potential X and Y unplaced scaffolds and genes. Y-linkage for selected regions was confirmed by PCR with male and female DNA. Some of the long chromosome-scale scaffolds include Y-linked sequences, suggesting misassembly of these regions. These resources will provide a basis for future studies aiming at understanding the biology and evolution of this devastating obligate parasite.


Subjects
Myiasis, Screw Worm Infection, Animals, Male, Female, Cattle, Calliphoridae, Drosophila melanogaster, Myiasis/veterinary, Screw Worm Infection/veterinary, Chromosomes
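The chromosome quotient (CQ) analysis mentioned in the abstract above compares how many female and male reads align to each sequence. The sketch below illustrates the idea for an XY system, assuming the counts are already normalised to equal sequencing depth. The thresholds and function names are illustrative assumptions, not values taken from the paper.

```python
def chromosome_quotient(female_count, male_count):
    """Female-to-male ratio of aligned read counts for one sequence.

    Returns None when no male reads align, because the ratio is undefined.
    Counts are assumed to be normalised to equal sequencing depth.
    """
    if male_count == 0:
        return None
    return female_count / male_count


def classify_by_cq(cq, y_max=0.3, x_min=1.5):
    """Label a sequence from its CQ; both thresholds are illustrative."""
    if cq is None:
        return "undetermined"
    if cq <= y_max:
        return "Y-like"        # few or no female reads align
    if cq >= x_min:
        return "X-like"        # females carry two copies, males one
    return "autosome-like"     # similar coverage in both sexes


# Hypothetical read counts per scaffold: (female, male).
scaffolds = {"scaffold_a": (0, 50), "scaffold_b": (100, 50), "scaffold_c": (48, 50)}
for name, (female, male) in scaffolds.items():
    cq = chromosome_quotient(female, male)
    print(name, cq, classify_by_cq(cq))
```

Because repeats shared between chromosomes blur these ratios, candidate Y-linked regions still need experimental confirmation, which the abstract reports was done by PCR with male and female DNA.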
4.
Mem. Inst. Oswaldo Cruz ; 118: e230122, 2023. tab, graf
Article in English | LILACS-Express | LILACS | ID: biblio-1521242

ABSTRACT

BACKGROUND: Epstein-Barr virus (EBV) is a human gammaherpesvirus etiologically linked to several benign and malignant diseases. EBV-associated malignancies exhibit an unusual global distribution that might be partly attributed to virus and host genetic backgrounds. OBJECTIVES: To assemble a new genome of EBV (CEMO3) from a paediatric Burkitt's lymphoma from Rio de Janeiro State (Southeast Brazil). In addition, to perform global phylogenetic analysis using complete EBV genomes, including CEMO3, and investigate the genetic relationship of some South American (SA) genomes through EBV subgenomic targets. METHODS: CEMO3 was sequenced through next-generation sequencing and its coverage and gaps were corrected through the Sanger method. CEMO3 and 67 EBV genomes representing diverse geographic regions were evaluated through maximum likelihood phylogenetic analysis. Further, the polymorphism of subgenomic regions of some SA EBV genomes was assessed. FINDINGS: The whole bulk tumour sequencing yielded 23,217 reads related to EBV, from which the 172,713 base pairs of the new EBV genome CEMO3 were assembled. CEMO3 and most SA EBV genomes clustered within the SA subclade closely related to the African Raji strain, forming the South American/Raji clade. Notably, these Raji-related genomes exhibit significant genetic diversity, characterised by distinctive synapomorphies at some gene levels absent in the original Raji strain. CONCLUSION: CEMO3 represents a newly assembled South American EBV genome. Although the majority of EBV genomes from SA are Raji-related, they harbour a high diversity that distinguishes them from the original Raji strain.

5.
Sci Rep ; 12(1): 7619, 2022 05 10.
Article in English | MEDLINE | ID: mdl-35538127

ABSTRACT

Nucleic-acid barcoding is an enabling technique for many applications, but its use remains limited in emerging long-read sequencing technologies with intrinsically low raw accuracy. Here, we apply so-called NS-watermark barcodes, whose error correction capability was previously validated in silico, in a proof of concept where we synthesize 3840 NS-watermark barcodes and use them to asymmetrically tag and simultaneously sequence amplicons from two evolutionarily distant species (namely Bordetella pertussis and Drosophila mojavensis) on the ONT MinION platform. To our knowledge, this is the largest number of distinct, non-random tags ever sequenced in parallel and the first report of microarray-based synthesis as a source for large oligonucleotide pools for barcoding. We recovered the identity of more than 86% of the barcodes, with a crosstalk rate of 0.17% (i.e., one misassignment every 584 reads). This falls in the range of the index hopping rate of established, high-accuracy Illumina sequencing, despite the increased number of tags and the relatively low accuracy of both microarray-based synthesis and long-read sequencing. The robustness of NS-watermark barcodes, together with their scalable design and compatibility with low-cost massive synthesis, makes them promising for present and future sequencing applications requiring massive labeling, such as long-read single-cell RNA-Seq.


Subjects
Taxonomic DNA Barcoding, High-Throughput Nucleotide Sequencing, Taxonomic DNA Barcoding/methods, High-Throughput Nucleotide Sequencing/methods, DNA Sequence Analysis/methods
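The abstract above states the crosstalk rate both as a percentage (0.17%) and as one misassignment every 584 reads. The short sketch below shows how the two figures relate; the helper names are hypothetical.

```python
def one_in_n_to_percent(n):
    """Convert 'one event every n reads' into a percentage."""
    return 100.0 / n


def percent_to_one_in_n(percent):
    """Convert a percentage into 'one event every n reads'."""
    return 100.0 / percent


crosstalk_percent = one_in_n_to_percent(584)
print(f"{crosstalk_percent:.2f}%")  # prints "0.17%"
```

Converting the rounded 0.17% back gives roughly one in 588 reads, so the figure of 584 presumably reflects the unrounded rate.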
7.
BMC Biol ; 19(1): 78, 2021 04 16.
Article in English | MEDLINE | ID: mdl-33863334

ABSTRACT

BACKGROUND: Genetic sex ratio distorters are systems aimed at effecting a bias in the reproductive sex ratio of a population and could be applied for the area-wide control of sexually reproducing insects that vector disease or disrupt agricultural production. One example of such a system leading to male bias is X-shredding, an approach that interferes with the transmission of the X-chromosome by inducing multiple DNA double-strand breaks during male meiosis. Endonucleases targeting the X-chromosome and whose activity is restricted to male gametogenesis have recently been pioneered as a means to engineer such traits. RESULTS: Here, we enabled endogenous CRISPR/Cas9 and CRISPR/Cas12a activity during spermatogenesis of the Mediterranean fruit fly Ceratitis capitata, a worldwide agricultural pest of extensive economic significance. In the absence of a chromosome-level assembly, we analysed long- and short-read genome sequencing data from males and females to identify two clusters of abundant and X-chromosome-specific sequence repeats. When targeted by gRNAs in conjunction with Cas9, cleavage of these repeats yielded a significant and consistent distortion of the sex ratio towards males in independent transgenic strains, while the combination of distinct distorters induced a strong bias (~ 80%). CONCLUSION: We provide a first demonstration of CRISPR-based sex distortion towards male bias in a non-model organism, the global pest insect Ceratitis capitata. Although the sex ratio bias reached in our study would require improvement, possibly through the generation and combination of additional transgenic lines, to result in a system with realistic applicability in the field, our results suggest that strains with characteristics suitable for field application can now be developed for a range of medically or agriculturally relevant insect species.


Subjects
Ceratitis capitata, Animals, Genetically Modified Animals, CRISPR-Cas Systems/genetics, Ceratitis capitata/genetics, Female, Male, Kinetoplastida Guide RNA, Sex Ratio, X Chromosome/genetics
8.
Genome Biol ; 21(1): 215, 2020 08 26.
Article in English | MEDLINE | ID: mdl-32847630

ABSTRACT

BACKGROUND: The Asian tiger mosquito Aedes albopictus is globally expanding and has become the main vector for human arboviruses in Europe. With limited antiviral drugs and vaccines available, vector control is the primary approach to prevent mosquito-borne diseases. A reliable and accurate DNA sequence of the Ae. albopictus genome is essential to develop new approaches that involve genetic manipulation of mosquitoes. RESULTS: We use long-read sequencing methods and modern scaffolding techniques (PacBio, 10X, and Hi-C) to produce AalbF2, a dramatically improved assembly of the Ae. albopictus genome. AalbF2 reveals widespread viral insertions, novel microRNAs and piRNA clusters, the sex-determining locus, and new immunity genes, and enables genome-wide studies of geographically diverse Ae. albopictus populations and analyses of the developmental and stage-dependent network of expression data. Additionally, we build the first physical map for this species with 75% of the assembled genome anchored to the chromosomes. CONCLUSION: The AalbF2 genome assembly represents the most up-to-date collective knowledge of the Ae. albopictus genome. These resources represent a foundation to improve understanding of the adaptation potential and the epidemiological relevance of this species and foster the development of innovative control measures.


Subjects
Aedes/genetics, Arboviruses/genetics, Genome, Mosquito Vectors/genetics, Aedes/immunology, Aedes/virology, Animals, Chromosome Mapping, Chromosomes, Genome Size, Immunity, Insect Vectors, Mosquito Vectors/immunology, Mosquito Vectors/virology, Small Interfering RNA/genetics, Transcriptome
9.
BMC Genomics ; 19(Suppl 8): 860, 2018 Dec 11.
Article in English | MEDLINE | ID: mdl-30537925

ABSTRACT

BACKGROUND: In living organisms, small heat shock proteins (sHSPs) are triggered in response to stress situations. This family of proteins is large in plants and, in the case of tomato (Solanum lycopersicum), 33 genes have been identified, most of them related to heat stress response and to the ripening process. Transcriptomic and proteomic studies have revealed complex patterns of expression for these genes. In this work, we investigate the coregulation of these genes by performing a computational analysis of their promoter architecture to find regulatory motifs known as heat shock elements (HSEs). We leverage the presence of sHSP members that originated from tandem duplication events and analyze the promoter architecture diversity of the whole sHSP family, focusing on the identification of HSEs. RESULTS: We performed a search for conserved genomic sequences in the promoter regions of the sHSPs of tomato, plus several other proteins (mainly HSPs) that are functionally related to heat stress situations or to ripening. Several computational analyses were performed to build multiple sequence motifs and identify transcription factor binding sites (TFBS) homologous to HSF1AE and HSF21 in Arabidopsis. We also investigated the expression and interaction of these proteins under two heat stress situations in whole tomato plants and in protoplast cells, both in the presence and in the absence of heat shock transcription factor A2 (HsfA2). The results of these analyses indicate that different sHSPs are up-regulated depending on the activation or repression of HsfA2, a key regulator of HSPs. Further, the analysis of protein-protein interaction between the sHSP protein family and other heat shock response proteins (Hsp70, Hsp90 and MBF1c) suggests that several sHSPs are mediating alternative stress response through a regulatory subnetwork that is not dependent on HsfA2. 
CONCLUSIONS: Overall, this study identifies two regulatory motifs (HSF1AE and HSF21) associated with the sHSP family in tomato which are considered genomic HSEs. The study also suggests that, despite the apparent redundancy of these proteins, which has been linked to gene duplication, tomato sHSPs showed different up-regulation and different interaction patterns when analyzed under different stress situations.


Subjects
Plant Gene Expression Regulation, Small Heat-Shock Proteins/genetics, Nucleotide Motifs, Plant Proteins/genetics, Nucleic Acid Regulatory Sequences, Solanum lycopersicum/genetics, Gene Duplication, Small Heat-Shock Proteins/metabolism, Heat-Shock Response, Solanum lycopersicum/growth & development, Solanum lycopersicum/metabolism, Plant Proteins/metabolism, Genetic Promoter Regions, Protein Interaction Maps
10.
PLoS Genet ; 14(11): e1007770, 2018 11.
Article in English | MEDLINE | ID: mdl-30388103

ABSTRACT

Y chromosomes are widely believed to evolve from a normal autosome through a process of massive gene loss (with preservation of some male genes), shaped by sex-antagonistic selection and complemented by occasional gains of male-related genes. The net result of these processes is a male-specialized chromosome. This might be expected to be an irreversible process, but it was found in 2005 that the Drosophila pseudoobscura Y chromosome was incorporated into an autosome. Y chromosome incorporations have important consequences: a formerly male-restricted chromosome reverts to autosomal inheritance, and the species may shift from an XY/XX to X0/XX sex-chromosome system. In order to assess the frequency and causes of this phenomenon we searched for Y chromosome incorporations in 400 species from Drosophila and related genera. We found one additional large scale event of Y chromosome incorporation, affecting the whole montium subgroup (40 species in our sample); overall 13% of the sampled species (52/400) have Y incorporations. While previous data indicated that after the Y incorporation the ancestral Y disappeared as a free chromosome, the much larger data set analyzed here indicates that a copy of the Y survived as a free chromosome both in montium and pseudoobscura species, and that the current Y of the pseudoobscura lineage results from a fusion between this free Y and the neoY. The 400 species sample also showed that the previously suggested causal connection between X-autosome fusions and Y incorporations is, at best, weak: the new case of Y incorporation (montium) does not have X-autosome fusion, whereas nine independent cases of X-autosome fusions were not followed by Y incorporations. Y incorporation is an underappreciated mechanism affecting Y chromosome evolution; our results show that at least in Drosophila it plays a relevant role and highlight the need of similar studies in other groups.


Subjects
Drosophila/classification, Drosophila/genetics, Y Chromosome/genetics, Animals, Molecular Evolution, Female, Gene Duplication, Insect Genes, Genetic Linkage, Male, Genetic Models, Phylogeny, Genetic Selection, Species Specificity, Genetic Translocation, X Chromosome/genetics
11.
Sci Rep ; 8(1): 7757, 2018 05 17.
Article in English | MEDLINE | ID: mdl-29773825

ABSTRACT

The GO-Cellular Component (GO-CC) ontology provides a controlled vocabulary for the consistent description of the subcellular compartments or macromolecular complexes where proteins may act. Current machine learning-based methods used for the automated GO-CC annotation of proteins suffer from the inconsistency of individual GO-CC term predictions. Here, we present FGGA-CC+, a class of hierarchical graph-based classifiers for the consistent GO-CC annotation of protein coding genes at the subcellular compartment or macromolecular complex levels. To boost the accuracy of GO-CC predictions, we make use of the protein localization knowledge in the GO-Biological Process (GO-BP) annotations. As a result, FGGA-CC+ classifiers are built from annotation data in both the GO-CC and GO-BP ontologies. Due to their graph-based design, FGGA-CC+ classifiers are fully interpretable and their predictions amenable to expert analysis. Promising results on protein annotation data from five model organisms were obtained. Additionally, successful validation results were obtained in the annotation of a challenging subset of tandem duplicated genes in tomato, a non-model organism. Overall, these results suggest that FGGA-CC+ classifiers can indeed be useful for meeting the huge demand for GO-CC annotation arising from ubiquitous high-throughput sequencing and proteomic projects.


Subjects
Arabidopsis/metabolism, Computational Biology/methods, Drosophila melanogaster/metabolism, Gene Ontology, Proteins/metabolism, Saccharomyces cerevisiae/metabolism, Solanum lycopersicum/metabolism, Animals, Protein Databases, Molecular Sequence Annotation, Proteins/analysis, Proteomics, Software
12.
Int J Mol Sci ; 19(3)2018 Mar 13.
Article in English | MEDLINE | ID: mdl-29534015

ABSTRACT

Classical Hodgkin lymphoma (cHL) cells overexpress heat-shock protein 90 (HSP90), an important intracellular signaling hub regulating cell survival, which is emerging as a promising therapeutic target. Here, we report the antitumor effect of celastrol, an anti-inflammatory compound and a recognized HSP90 inhibitor, in Hodgkin and Reed-Sternberg cell lines. Two disparate responses were recorded. In KM-H2 cells, celastrol inhibited cell proliferation, induced G0/G1 arrest, and triggered apoptosis through the activation of caspase-3/7. Conversely, L428 cells exhibited resistance to the compound. A proteomic screening identified a total of 262 differentially expressed proteins in sensitive KM-H2 cells and revealed that celastrol's toxicity involved the suppression of the MAPK/ERK (mitogen-activated protein kinase/extracellular signal-regulated kinase) pathway. The apoptotic effects were preceded by a decrease in RAS (proto-oncogene protein Ras), p-ERK1/2 (phospho-extracellular signal-regulated kinase-1/2), and c-Fos (proto-oncogene protein c-Fos) protein levels, as validated by immunoblot analysis. The L428 resistant cells exhibited a marked induction of HSP27 mRNA and protein after celastrol treatment. Our results provide the first evidence that celastrol has antitumor effects in cHL cells through the suppression of the MAPK/ERK pathway. Resistance to celastrol has rarely been described, and our results suggest that in cHL it may be mediated by the upregulation of HSP27. The antitumor properties of celastrol against cHL and whether the disparate responses observed in vitro have clinical correlates deserve further research.


Subjects
Antineoplastic Agents/pharmacology, Antineoplastic Drug Resistance, HSP90 Heat-Shock Proteins/antagonists & inhibitors, Hodgkin Disease/metabolism, Reed-Sternberg Cells/metabolism, Triterpenes/pharmacology, Apoptosis, Tumor Cell Line, Cell Proliferation, Humans, MAP Kinase Signaling System, Mitogen-Activated Protein Kinase 1/metabolism, Mitogen-Activated Protein Kinase 3/metabolism, Pentacyclic Triterpenes, Proteome, Proto-Oncogene Mas, Reed-Sternberg Cells/drug effects, ras Proteins/metabolism
13.
Bioinformatics ; 33(6): 807-813, 2017 03 15.
Article in English | MEDLINE | ID: mdl-27259539

ABSTRACT

Motivation: To attain acceptable sample misassignment rates, current approaches to multiplex single-molecule real-time sequencing require upstream quality improvement, which is obtained from multiple passes over the sequenced insert and significantly reduces the effective read length. In order to fully exploit the raw read length on multiplex applications, robust barcodes capable of dealing with the full single-pass error rates are needed. Results: We present a method for designing sequencing barcodes that can withstand a large number of insertion, deletion and substitution errors and are suitable for use in multiplex single-molecule real-time sequencing. The manuscript focuses on the design of barcodes for full-length single-pass reads, impaired by challenging error rates on the order of 11%. The proposed barcodes can multiplex hundreds or thousands of samples while achieving sample misassignment probabilities as low as 10^-7 under the above conditions, and are designed to be compatible with chemical constraints imposed by the sequencing process. Availability and Implementation: Software tools for constructing watermark barcode sets and demultiplexing barcoded reads, together with example sets of barcodes and synthetic barcoded reads, are freely available at www.cifasis-conicet.gov.ar/ezpeleta/NS-watermark. Contact: ezpeleta@cifasis-conicet.gov.ar.


Subjects
High-Throughput Nucleotide Sequencing/methods, DNA Sequence Analysis/methods, Software, Computer Simulation
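The abstract above concerns barcodes that tolerate insertions, deletions and substitutions. To illustrate the demultiplexing problem only, the sketch below assigns a read to the nearest barcode by plain Levenshtein edit distance. This is a generic nearest-neighbour baseline swapped in for illustration, not the watermark decoding described in the paper; the barcode sequences, names, and the `max_distance` threshold are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def assign_barcode(observed, barcodes, max_distance=2):
    """Return the name of the unique closest barcode, else None.

    None is returned when the barcode set is empty, when the best match
    is further than max_distance, or when two barcodes tie for closest.
    """
    if not barcodes:
        return None
    scored = sorted((edit_distance(observed, seq), name)
                    for name, seq in barcodes.items())
    best_distance, best_name = scored[0]
    if best_distance > max_distance:
        return None
    if len(scored) > 1 and scored[1][0] == best_distance:
        return None
    return best_name


# Hypothetical barcode set and a read with one deleted base.
barcodes = {"sample_1": "ACGTACGT", "sample_2": "TTGGCCAA"}
print(assign_barcode("ACGTCGT", barcodes))
```

Rejecting ties and distant matches trades read loss for fewer misassignments, which is the same trade-off the abstract quantifies for its own barcodes.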
14.
G3 (Bethesda) ; 6(10): 3027-3034, 2016 10 13.
Article in English | MEDLINE | ID: mdl-27565886

ABSTRACT

In plants, fruit maturation and oxidative stress can induce small heat shock protein (sHSP) synthesis to maintain cellular homeostasis. Although the tomato reference genome was published in 2012, the actual number and functionality of sHSP genes remain unknown. Using a transcriptomic (RNA-seq) and evolutionary genomic approach, putative sHSP genes in the Solanum lycopersicum (cv. Heinz 1706) genome were investigated. A sHSP gene family of 33 members was established. Remarkably, roughly half of the members of this family can be explained by nine independent tandem duplication events that determined, evolutionarily, their functional fates. Within a mitochondrial class subfamily, only one duplicated member, Solyc08g078700, retained its ancestral chaperone function, while the others, Solyc08g078710 and Solyc08g078720, likely degenerated under neutrality and lack ancestral chaperone function. Functional conservation occurred within a cytosolic class I subfamily, whose four members, Solyc06g076570, Solyc06g076560, Solyc06g076540, and Solyc06g076520, support ∼57% of the total sHSP mRNA in the red ripe fruit. Subfunctionalization occurred within a new subfamily, whose two members, Solyc04g082720 and Solyc04g082740, show heterogeneous differential expression profiles during fruit ripening. These findings, involving the birth/death of some genes or the preferential/plastic expression of some others during fruit ripening, highlight the importance of tandem duplication events in the expansion of the sHSP gene family in the tomato genome. Despite its evolutionary diversity, the sHSP gene family in the tomato genome seems to be endowed with a core set of four homeostasis genes: Solyc05g014280, Solyc03g082420, Solyc11g020330, and Solyc06g076560, which appear to provide a baseline protection during both fruit ripening and heat shock stress in different tomato tissues.


Subjects
Gene Duplication, Plant Genes, Small Heat-Shock Proteins/genetics, Multigene Family, Solanum lycopersicum/genetics, Tandem Repeat Sequences, Computational Biology/methods, Gene Expression Profiling, Plant Gene Expression Regulation, Small Heat-Shock Proteins/classification, Small Heat-Shock Proteins/metabolism, Solanum lycopersicum/metabolism, Molecular Sequence Annotation, Phylogeny, Protein Transport, Transcriptome
15.
PLoS One ; 11(1): e0146986, 2016.
Article in English | MEDLINE | ID: mdl-26771463

ABSTRACT

As the volume of genomic data grows, computational methods become essential for providing a first glimpse into gene annotations. Automated Gene Ontology (GO) annotation methods based on hierarchical ensemble classification techniques are particularly interesting when interpretability of annotation results is a main concern. In these methods, raw GO-term predictions computed by base binary classifiers are leveraged by checking the consistency of predefined GO relationships. Both formal leveraging strategies, with a main focus on annotation precision, and heuristic alternatives, with a main focus on scalability issues, have been described in the literature. In this contribution, a factor graph approach to the hierarchical ensemble formulation of the automated GO annotation problem is presented. In this formal framework, a core factor graph is first built based on the GO structure and then enriched to take into account the noisy nature of GO-term predictions. Hence, starting from raw GO-term predictions, an iterative message passing algorithm between nodes of the factor graph is used to compute marginal probabilities of target GO-terms. Evaluations on Saccharomyces cerevisiae, Arabidopsis thaliana and Drosophila melanogaster protein sequences from the GO Molecular Function domain showed significant improvements over competing approaches, even when protein sequences were naively characterized by their physicochemical and secondary structure properties or when loose noisy annotation datasets were considered. Based on these promising results and using Arabidopsis thaliana annotation data, we extend our approach to the identification of the most promising molecular function annotations for a set of proteins of unknown function in Solanum lycopersicum.


Subjects
Drosophila melanogaster/genetics, Gene Ontology, Algorithms, Animals, Arabidopsis/genetics, Computational Biology, Solanum lycopersicum/genetics, Saccharomyces cerevisiae/genetics, Software
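The abstract above describes turning raw, possibly inconsistent GO-term predictions into consistent marginal probabilities. The toy sketch below shows the idea on the smallest possible hierarchy, one parent term and one child term, where annotating the child implies annotating the parent. It computes exact marginals by enumerating the three allowed states, which stands in for message passing on a graph this small; it is an assumption-laden illustration, not the factor graph construction from the paper, and the function name is hypothetical.

```python
from itertools import product


def consistent_marginals(p_parent, p_child):
    """Marginals for (parent, child) after enforcing child => parent.

    p_parent and p_child are raw classifier scores treated as independent
    unary potentials (an illustrative assumption).
    """
    weights = {}
    for parent, child in product((0, 1), repeat=2):
        if child == 1 and parent == 0:
            continue  # hard consistency factor: child without parent is forbidden
        w_parent = p_parent if parent else 1.0 - p_parent
        w_child = p_child if child else 1.0 - p_child
        weights[(parent, child)] = w_parent * w_child
    total = sum(weights.values())
    if total == 0.0:
        raise ValueError("predictions contradict the hierarchy with certainty")
    marg_parent = sum(w for (p, _), w in weights.items() if p == 1) / total
    marg_child = sum(w for (_, c), w in weights.items() if c == 1) / total
    return marg_parent, marg_child


# Inconsistent raw scores: the child looks likelier than its parent.
parent_marginal, child_marginal = consistent_marginals(0.3, 0.8)
print(f"parent {parent_marginal:.3f}, child {child_marginal:.3f}")
```

After enforcing the constraint, the parent marginal can never fall below the child marginal, which is the kind of consistency the raw binary classifiers do not guarantee on their own.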
16.
PLoS One ; 10(10): e0140459, 2015.
Article in English | MEDLINE | ID: mdl-26492348

ABSTRACT

Many parallel applications of Next-Generation Sequencing (NGS) technologies demand short barcodes able to accurately multiplex a large number of samples. To address these competing requirements, the use of error-correcting codes is advised. Current barcoding systems are mostly built from short random error-correcting codes, a feature that strongly limits their multiplexing accuracy and experimental scalability. To overcome these problems on sequencing systems impaired by mismatch errors, the alternative use of binary BCH and pseudo-quaternary Hamming codes has been proposed. However, these codes either fail to provide fine-grained control over barcode size (BCH) or have intrinsically poor error-correcting abilities (Hamming). Here, the design of barcodes from shortened binary BCH codes and quaternary Low Density Parity Check (LDPC) codes is introduced. Simulation results show that although accurate barcoding systems of high multiplexing capacity can be obtained with any of these codes, using quaternary LDPC codes may be particularly advantageous due to the lower rates of read losses and undetected sample misidentification errors. Even at mismatch error rates of 10^-2 per base, 24-nt LDPC barcodes can be used to multiplex roughly 2000 samples with a sample misidentification error rate on the order of 10^-9 at the expense of a rate of read losses just on the order of 10^-6.


Subjects
Taxonomic DNA Barcoding/methods, Probability
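The abstract above relies on error-correcting codes to recover a barcode despite mismatch errors. As a minimal sketch of the principle, the code below implements the textbook binary Hamming(7,4) code, which corrects any single flipped bit. It is not the shortened BCH or quaternary LDPC construction from the paper, and the mapping from bits to nucleotides is left out; the function names are hypothetical.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits as a 7-bit codeword [p1, p2, d1, p3, d2, d3, d4]."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # parity over positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(codeword):
    """Return (data_bits, error_position); position 0 means no error seen."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    # The syndrome, read as a binary number, is the 1-based error position.
    error_position = s1 + 2 * s2 + 4 * s3
    if error_position:
        c[error_position - 1] ^= 1  # flip the corrupted bit back
    return [c[2], c[4], c[5], c[6]], error_position


# Encode, corrupt one bit, and recover the original data.
codeword = hamming74_encode([1, 0, 1, 1])
corrupted = list(codeword)
corrupted[4] ^= 1
print(hamming74_decode(corrupted))
```

Two flipped bits would be miscorrected by this code, which illustrates why the abstract describes Hamming-based barcodes as having limited error-correcting ability and turns to stronger codes.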
17.
G3 (Bethesda) ; 5(6): 1145-50, 2015 Apr 09.
Article in English | MEDLINE | ID: mdl-25858959

ABSTRACT

The autosomal gene Mst77F of Drosophila melanogaster is essential for male fertility. In 2010, Krsticevic et al. (Genetics 184: 295-307) found 18 Y-linked copies of Mst77F ("Mst77Y"), which collectively account for 20% of the functional Mst77F-like mRNA. The Mst77Y genes were severely misassembled in the then-available genome assembly and were identified by cloning and sequencing polymerase chain reaction products. The genomic structure of the Mst77Y region and the possible existence of additional copies remained unknown. The recent publication of two long-read assemblies of D. melanogaster prompted us to reinvestigate this challenging region of the Y chromosome. We found that the Illumina Synthetic Long Reads assembly failed in the Mst77Y region, most likely because of its tandem duplication structure. The PacBio MHAP assembly of the Mst77Y region seems to be very accurate, as revealed by comparisons with the previously found Mst77Y genes, a bacterial artificial chromosome sequence, and Illumina reads of the same strain. We found that the Mst77Y region spans 96 kb and originated from a 3.4-kb transposition from chromosome 3L to the Y chromosome, followed by tandem duplications inside the Y chromosome and invasion of transposable elements, which account for 48% of its length. Twelve of the 18 Mst77Y genes found in 2010 were confirmed in the PacBio assembly, the remaining six being polymerase chain reaction-induced artifacts. There are several identical copies of some Mst77Y genes, coincidentally bringing the total copy number to 18. Besides providing a detailed picture of the Mst77Y region, our results highlight the utility of PacBio technology in assembling difficult genomic regions such as tandemly repeated genes.


Subjects
Drosophila melanogaster/genetics, Gene Dosage, Insect Genes, DNA Sequence Analysis/methods, Y Chromosome/genetics, Algorithms, Animals, Molecular Evolution, Molecular Sequence Data, Reproducibility of Results
18.
Genetics ; 184(1): 295-307, 2010 Jan.
Article in English | MEDLINE | ID: mdl-19897751

ABSTRACT

The Y chromosome of Drosophila melanogaster has <20 protein-coding genes. These genes originated from the duplication of autosomal genes and have male-related functions. In 1993, Russell and Kaiser found three Y-linked pseudogenes of the Mst77F gene, which is a testis-expressed autosomal gene that is essential for male fertility. We did a thorough search using experimental and computational methods and found 18 Y-linked copies of this gene (named Mst77Y-1 to Mst77Y-18). Ten Mst77Y genes encode defective proteins and the other eight are potentially functional. These eight genes produce approximately 20% of the functional Mst77F-like mRNA, and molecular evolutionary analysis shows that they evolved under purifying selection. Hence several Mst77Y genes have all the features of functional genes. Mst77Y genes are present only in D. melanogaster, and phylogenetic analysis confirmed that the duplication is a recent event. The identification of functional Mst77Y genes reinforces the previous finding that gene gains play a prominent role in the evolution of the Drosophila Y chromosome.


Subjects
Drosophila Proteins/genetics, Drosophila melanogaster/genetics, Gene Dosage, Insect Genes/genetics, Histones/genetics, Y Chromosome/genetics, Animals, DNA Restriction Enzymes/metabolism, Drosophila Proteins/metabolism, Molecular Evolution, Female, Y-Linked Genes/genetics, Histones/metabolism, Male, DNA Sequence Analysis, Gene Transcription