Results 1 - 20 of 96
1.
Cell ; 187(19): 5146-5150, 2024 Sep 19.
Article in English | MEDLINE | ID: mdl-39303683

ABSTRACT

Rapid expansion of pathogen sequencing capacity in Africa has led to a paradigm shift from relying on others to locally generating genomic data and sharing it with the global community. However, several barriers remain to be unlocked for timely processing, analysis, dissemination, and effective use of pathogen sequence data for pandemic prevention, preparedness, and response.


Subject(s)
Genomics , Humans , Africa/epidemiology , Pandemics , Information Dissemination , COVID-19/virology , COVID-19/epidemiology , COVID-19/genetics
2.
BMC Genomics ; 23(1): 520, 2022 Jul 18.
Article in English | MEDLINE | ID: mdl-35850574

ABSTRACT

Genetic evolution of Rift Valley fever virus (RVFV) in Africa has been shaped mainly by environmental changes such as abnormal rainfall patterns and climate change that has occurred over the last few decades. These gradual environmental changes are believed to have effected gene migration from macro (geographical) to micro (reassortment) levels. Presently, 15 lineages of RVFV have been identified to be circulating within the Sub-Saharan Africa. International trade in livestock and movement of mosquitoes are thought to be responsible for the outbreaks occurring outside endemic or enzootic regions. Virus spillover events contribute to outbreaks as was demonstrated by the largest epidemic of 1977 in Egypt. Genomic surveillance of the virus evolution is crucial in developing intervention strategies. Therefore, we have developed a computational tool for rapidly classifying and assigning lineages of the RVFV isolates. The computational method is presented both as a command line tool and a web application hosted at https://www.genomedetective.com/app/typingtool/rvfv/ . Validation of the tool has been performed on a large dataset using glycoprotein gene (Gn) and whole genome sequences of the Large (L), Medium (M) and Small (S) segments of the RVFV retrieved from the National Center for Biotechnology Information (NCBI) GenBank database. Using the Gn nucleotide sequences, the RVFV typing tool was able to correctly classify all 234 RVFV sequences at species level with 100% specificity, sensitivity and accuracy. All the sequences in lineages A (n = 10), B (n = 1), C (n = 88), D (n = 1), E (n = 3), F (n = 2), G (n = 2), H (n = 105), I (n = 2), J (n = 1), K (n = 4), L (n = 8), M (n = 1), N (n = 5) and O (n = 1) were also correctly classified at phylogenetic level. Lineage assignment using whole RVFV genome sequences (L, M and S-segments) did not achieve 100% specificity, sensitivity and accuracy for all the sequences analyzed. 
We further tested our tool using genomic data that we generated by sequencing 5 samples collected following a recent RVF outbreak in Kenya. All 5 samples were assigned to lineage C by both the partial (Gn) and whole genome sequence classifiers. The tool is useful in tracing the origin of outbreaks and supporting surveillance efforts. Availability: https://github.com/ajodeh-juma/rvfvtyping.


Subject(s)
Rift Valley Fever , Rift Valley fever virus , Animals , Commerce , Genomics , Internationality , Kenya , Phylogeny , Rift Valley Fever/epidemiology , Rift Valley fever virus/genetics
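
The validation figures quoted in the abstract above (specificity, sensitivity and accuracy) follow from confusion-matrix counts. The following is a minimal sketch of those standard definitions, not the authors' evaluation code; the function name is hypothetical, and the 50 negatives in the usage lines are an assumed number chosen only for illustration (the abstract reports 234 correctly classified RVFV sequences but no negative-set size).

```python
def classification_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    # Sensitivity: fraction of true positives among all actual positives.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    # Specificity: fraction of true negatives among all actual negatives.
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    total = tp + fp + tn + fn
    # Accuracy: fraction of all calls that were correct.
    accuracy = (tp + tn) / total if total else float("nan")
    return sensitivity, specificity, accuracy


# Hypothetical counts: 234 RVFV sequences all called correctly,
# plus an ASSUMED 50 non-RVFV sequences all rejected correctly.
sens, spec, acc = classification_metrics(tp=234, fp=0, tn=50, fn=0)
print(sens, spec, acc)
```

Any single misclassification lowers at least one of the three values below 1.0, which is why the abstract reports them separately for the Gn and whole-genome classifiers.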
3.
Bioinformatics ; 36(3): 982-983, 2020 02 01.
Article in English | MEDLINE | ID: mdl-31504165

ABSTRACT

MOTIVATION: Recent advancements in genomic technologies have enabled high-throughput, cost-effective generation of 'omics' data from M. tuberculosis (M.tb) isolates, which then get shared via a number of heterogeneous publicly available biological databases. Albeit useful, fragmented curation negatively impacts the researcher's ability to leverage the data via federated queries. RESULTS: We present Combat-TB-NeoDB, an integrated M.tb 'omics' knowledge-base. Combat-TB-NeoDB is based on Neo4j and was created by binding the labeled property graph model to a suitable ontology, namely Chado. Combat-TB-NeoDB enables researchers to execute complex federated queries by linking prominent biological databases and supplementary M.tb variants data from published literature. AVAILABILITY AND IMPLEMENTATION: The Combat-TB-NeoDB (https://neodb.sanbi.ac.za) repository and all tools mentioned in this manuscript are freely available at https://github.com/COMBAT-TB. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Mycobacterium tuberculosis , Tuberculosis , Databases, Factual , Genome , Genomics , Humans , Software
4.
Future Oncol ; 17(34): 4769-4783, 2021 Dec.
Article in English | MEDLINE | ID: mdl-34751044

ABSTRACT

Background: Neuroblastoma is the most common extracranial solid tumor in childhood. Amplification of MYCN in neuroblastoma is a predictor of poor prognosis. Materials and methods: DNA methylation data from the TARGET data matrix were stratified into MYCN amplified and non-amplified groups. Differential methylation analysis, clustering, recursive feature elimination (RFE), machine learning (ML), Cox regression analysis and Kaplan-Meier estimates were performed. Results and Conclusion: 663 CpGs were differentially methylated between the two groups. A total of 25 CpGs were selected by RFE for clustering and ML, and a 100% clustering accuracy was obtained. ML validation on three external datasets produced high accuracy scores of 100%, 97% and 93%. Eight survival-associated CpGs were also identified. Therapeutic interventions may need to be targeted to patient subgroups.


Lay abstract: Neuroblastoma is the most common extracranial solid tumor in childhood. Elevated levels of the MYCN protein in neuroblastoma are a predictor of poor prognosis. MYCN is the most relevant prognostic factor in neuroblastoma, and predicting MYCN gene amplification (which leads to increased gene expression and more protein) from epigenetic data rather than genetic testing might be useful in the oncology clinic. This study was designed to identify a DNA methylation (epigenetic) signature that can be used to diagnose MYCN amplification without actually testing for the gene. The authors also aimed to correlate this DNA methylation signature with patient survival and poorer prognosis. Based on statistical and computational methods applied to DNA methylation data for neuroblastoma, signatures that are predictive of MYCN amplification and poor prognosis were found, which clinicians can use for early patient diagnosis and selection of the best therapies for patients at high risk.


Subject(s)
Biomarkers, Tumor/genetics , DNA Methylation , Epigenesis, Genetic , N-Myc Proto-Oncogene Protein/genetics , Neuroblastoma/mortality , Child , CpG Islands/genetics , Datasets as Topic , Gene Amplification , Gene Expression Regulation, Neoplastic , Humans , Kaplan-Meier Estimate , Machine Learning , Neuroblastoma/genetics , Prognosis , Progression-Free Survival , Risk Assessment/methods
5.
Molecules ; 26(12)2021 Jun 16.
Article in English | MEDLINE | ID: mdl-34208597

ABSTRACT

Several natural products (NPs) have displayed varying in vitro activities against methicillin-resistant Staphylococcus aureus (MRSA). However, few of these compounds have been developed into potential antimicrobial drug candidates. This may be due to the high cost and the tedious, time-consuming process of conducting the necessary preclinical tests on these compounds. In this study, cheminformatic profiling was performed on 111 anti-MRSA NPs (AMNPs), using a few orally administered conventional drugs for MRSA (CDs) as reference, to identify compounds with prospects to become drug candidates. This was followed by prioritizing these hits and identifying the liabilities among the AMNPs for possible optimization. Cheminformatic profiling revealed that most of the AMNPs were within the required drug-like region of the investigated properties. For example, more than 76% of the AMNPs showed compliance with the Lipinski, Veber, and Egan predictive rules for oral absorption and permeability. About 34% of the AMNPs showed the prospect to penetrate the blood-brain barrier (BBB), an advantage over the CDs, which are generally non-permeant of the BBB. The analysis of toxicity revealed that 59% of the AMNPs might have negligible or no toxicity risks. Structure-activity relationship (SAR) analysis revealed chemical groups that may be determinants of the reported bioactivity of the compounds. A hit prioritization strategy using a novel "desirability scoring function" was able to identify AMNPs with the desired drug-likeness. Hit optimization strategies implemented on AMNPs with poor desirability scores led to the design of two compounds with improved desirability scores.


Subject(s)
Biological Products/chemistry , Biological Products/pharmacology , Methicillin-Resistant Staphylococcus aureus/drug effects , Anti-Bacterial Agents/pharmacology , Anti-Infective Agents/pharmacology , Cheminformatics/methods , Databases, Factual , Drug Evaluation, Preclinical/methods , Methicillin-Resistant Staphylococcus aureus/metabolism , Microbial Sensitivity Tests , Staphylococcus aureus/drug effects , Staphylococcus aureus/metabolism , Structure-Activity Relationship
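
The Lipinski, Veber and Egan rules named in the abstract above are simple threshold filters on molecular descriptors. The sketch below uses the commonly cited cut-offs for each rule; the abstract does not state which thresholds or software the authors applied, so the class name, function names and the descriptor values in the usage lines are all assumptions made for illustration. Descriptors are taken as given inputs rather than computed from structures.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Precomputed molecular descriptors (hypothetical container)."""
    mol_weight: float   # molecular weight in Da
    logp: float         # octanol-water partition coefficient
    h_donors: int       # hydrogen-bond donors
    h_acceptors: int    # hydrogen-bond acceptors
    rot_bonds: int      # rotatable bonds
    tpsa: float         # topological polar surface area in square angstroms


def lipinski_violations(d):
    """Count violations of Lipinski's rule of five."""
    return sum([
        d.mol_weight > 500,
        d.logp > 5,
        d.h_donors > 5,
        d.h_acceptors > 10,
    ])


def passes_lipinski(d):
    # A compound is usually considered compliant with at most one violation.
    return lipinski_violations(d) <= 1


def passes_veber(d):
    # Veber: at most 10 rotatable bonds and TPSA no greater than 140.
    return d.rot_bonds <= 10 and d.tpsa <= 140


def passes_egan(d):
    # Egan: logP no greater than 5.88 and TPSA no greater than 131.6.
    return d.logp <= 5.88 and d.tpsa <= 131.6


# Hypothetical descriptor values, not taken from the study.
compound = Descriptors(mol_weight=350.0, logp=2.5, h_donors=2,
                       h_acceptors=5, rot_bonds=4, tpsa=90.0)
print(passes_lipinski(compound), passes_veber(compound), passes_egan(compound))
```

Counting how many compounds in a set pass all three checks gives a compliance percentage of the kind the abstract reports.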
6.
Molecules ; 26(13)2021 Jun 29.
Article in English | MEDLINE | ID: mdl-34209681

ABSTRACT

The growing antimicrobial resistance (AMR) of pathogenic organisms to currently prescribed drugs has resulted in the failure to treat various infections caused by these superbugs. Therefore, to keep pace with the increasing drug resistance, there is a pressing need for novel antimicrobial agents, especially from non-conventional sources. Several natural products (NPs) have been shown to display promising in vitro activities against multidrug-resistant pathogens. Still, only a few of these compounds have been studied as prospective drug candidates. This may be due to the expensive and time-consuming process of conducting important studies on these compounds. The present review focuses on applying cheminformatics strategies to characterize, prioritize, and optimize NPs to develop new lead compounds against antimicrobial-resistant pathogens. Moreover, case studies where these strategies have been used to identify potential drug candidates, including a few selected open-access tools commonly used for these studies, are briefly outlined.


Subject(s)
Anti-Infective Agents/chemistry , Biological Products/chemistry , Lead/chemistry , Anti-Infective Agents/therapeutic use , Biological Products/therapeutic use , Drug Resistance , Humans , Lead/therapeutic use
7.
Bioinformatics ; 34(24): 4159-4164, 2018 12 15.
Article in English | MEDLINE | ID: mdl-29945178

ABSTRACT

Motivation: Triplet amino acids have successfully been included in feature selection to predict human-HPV protein-protein interactions (PPI). The utility of supervised learning methods is curtailed because experimental data are not available in sufficient quantities. Improvements in machine learning techniques and feature selection will enhance the study of PPI between host and pathogen. Results: We present a comparison of a neural network model versus SVM for prediction of host-pathogen PPI based on a combination of features including amino acid quadruplets, pairwise sequence similarity, and human interactome properties. The neural network and SVM were implemented using the Python Sklearn library. The neural network model using quadruplet features and other network features outperformed the SVM model. The models are tested against published predictors and then applied to the human-B. anthracis case. Gene ontology term enrichment analysis identifies immunology response and regulation as functions of interacting proteins. For prediction of human-viral PPI, our model (neural network) is a significant improvement in overall performance compared to a predictor using the triplets feature and achieves a good accuracy in predicting human-B. anthracis PPI. Availability and implementation: All code can be downloaded from ftp://ftp.sanbi.ac.za/machine_learning/. Supplementary information: Supplementary data are available at Bioinformatics online.


Subject(s)
Bacillus anthracis , Bacterial Proteins/metabolism , Neural Networks, Computer , Protein Interaction Mapping , Support Vector Machine , Computational Biology , Humans , Models, Molecular , Proteins
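
The amino acid quadruplet features mentioned in the abstract above can be pictured as frequencies of overlapping 4-residue windows along a protein sequence. The sketch below is an assumption about the general idea, not the authors' encoding: the abstract does not say whether residues were grouped into classes before counting, and the function name and toy peptide are hypothetical.

```python
from collections import Counter


def kmer_frequencies(sequence, k=4):
    """Return the relative frequency of each overlapping k-mer in a sequence."""
    seq = sequence.upper()
    n_windows = len(seq) - k + 1
    if n_windows <= 0:
        # Sequence shorter than k: no k-mers to count.
        return {}
    counts = Counter(seq[i:i + k] for i in range(n_windows))
    return {kmer: count / n_windows for kmer, count in counts.items()}


# Toy peptide (hypothetical); "MKVL" occurs in 2 of the 8 windows.
features = kmer_frequencies("MKVLAAGMKVL")
print(features["MKVL"])
```

A fixed-length feature vector for a classifier would then be built by looking up each k-mer of a chosen vocabulary in this dictionary, using 0 for absent k-mers.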
8.
Nature ; 496(7445): 311-6, 2013 Apr 18.
Article in English | MEDLINE | ID: mdl-23598338

ABSTRACT

The discovery of a living coelacanth specimen in 1938 was remarkable, as this lineage of lobe-finned fish was thought to have become extinct 70 million years ago. The modern coelacanth looks remarkably similar to many of its ancient relatives, and its evolutionary proximity to our own fish ancestors provides a glimpse of the fish that first walked on land. Here we report the genome sequence of the African coelacanth, Latimeria chalumnae. Through a phylogenomic analysis, we conclude that the lungfish, and not the coelacanth, is the closest living relative of tetrapods. Coelacanth protein-coding genes are significantly more slowly evolving than those of tetrapods, unlike other genomic features. Analyses of changes in genes and regulatory elements during the vertebrate adaptation to land highlight genes involved in immunity, nitrogen excretion and the development of fins, tail, ear, eye, brain and olfaction. Functional assays of enhancers involved in the fin-to-limb transition and in the emergence of extra-embryonic tissues show the importance of the coelacanth genome as a blueprint for understanding tetrapod evolution.


Subject(s)
Biological Evolution , Fishes/classification , Fishes/genetics , Genome/genetics , Animals , Animals, Genetically Modified , Chick Embryo , Conserved Sequence/genetics , Enhancer Elements, Genetic/genetics , Evolution, Molecular , Extremities/anatomy & histology , Extremities/growth & development , Fishes/anatomy & histology , Fishes/physiology , Genes, Homeobox/genetics , Genomics , Immunoglobulin M/genetics , Mice , Molecular Sequence Annotation , Molecular Sequence Data , Phylogeny , Sequence Alignment , Sequence Analysis, DNA , Vertebrates/anatomy & histology , Vertebrates/genetics , Vertebrates/physiology
10.
PLoS Genet ; 12(4): e1005954, 2016 Apr.
Article in English | MEDLINE | ID: mdl-27082250

ABSTRACT

We report here the ~670 Mb genome assembly of the Asian seabass (Lates calcarifer), a tropical marine teleost. We used long-read sequencing augmented by transcriptomics, optical and genetic mapping along with shared synteny from closely related fish species to derive a chromosome-level assembly with a contig N50 size over 1 Mb and scaffold N50 size over 25 Mb that span ~90% of the genome. The population structure of L. calcarifer species complex was analyzed by re-sequencing 61 individuals representing various regions across the species' native range. SNP analyses identified high levels of genetic diversity and confirmed earlier indications of a population stratification comprising three clades with signs of admixture apparent in the South-East Asian population. The quality of the Asian seabass genome assembly far exceeds that of any other fish species, and will serve as a new standard for fish genomics.


Subject(s)
Bass/genetics , Chromosome Mapping , Animals , Bass/classification , Genome , In Situ Hybridization, Fluorescence , Phylogeny
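
The contig and scaffold N50 values quoted in the abstract above are standard assembly contiguity statistics: N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembled length. A minimal sketch of that definition follows; the function name and the toy lengths are illustrative and unrelated to the seabass assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("at least one length is required")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half the total is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


# Toy lengths: total is 20, and 6 + 5 = 11 first reaches half of it.
print(n50([2, 3, 4, 5, 6]))  # prints 5
```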
11.
BMC Genet ; 18(1): 119, 2017 12 22.
Article in English | MEDLINE | ID: mdl-29273003

ABSTRACT

BACKGROUND: Drought is the most disastrous abiotic stress that severely affects agricultural productivity worldwide. Understanding the biological basis of drought-regulated traits requires identification and in-depth characterization of genetic determinants using model organisms and high-throughput technologies. However, studies on drought tolerance have generally been limited to the traditional candidate gene approach, which targets only a single gene in a pathway that is related to a trait. In this study, we used sorghum, one of the model crops that is well adapted to arid regions, to mine genes and define determinants for drought tolerance using drought expression libraries and RNA-seq data. RESULTS: We provide an integrated and comparative in silico candidate gene identification, characterization and annotation approach, with an emphasis on genes playing a prominent role in conferring drought tolerance in sorghum. A total of 470 non-redundant functionally annotated drought responsive genes (DRGs) were identified using experimental data from drought responses by employing pairwise sequence similarity searches, pathway and interpro-domain analysis, expression profiling and orthology relation. Comparison of the genomic locations between these genes and sorghum quantitative trait loci (QTLs) showed that 40% of these genes were co-localized with QTLs known for drought tolerance. The genome reannotation conducted using the Program to Assemble Spliced Alignment (PASA) resulted in 9.6% of existing single gene models being updated. In addition, 210 putative novel genes were identified using AUGUSTUS and PASA-based analysis of the expression dataset. Among these, 50% were single exonic, 69.5% represented drought responsive and 5.7% were complete gene structure models. Analysis of biochemical metabolism revealed 14 metabolic pathways that are related to drought tolerance and also had a strong biological network among the categories of genes involved.
Identification of these pathways signifies the interplay of biochemical reactions that make up the metabolic network, constituting a fundamental interface for the sorghum defence mechanism against drought stress. CONCLUSIONS: This study suggests untapped natural variability in sorghum that could be used for developing drought tolerance. The data presented here may be regarded as an initial reference point in functional and comparative genomics in the Gramineae family.


Subject(s)
Genes, Plant , Molecular Sequence Annotation , Sorghum/genetics , Sorghum/physiology , Computer Simulation , Droughts , Exons , Metabolic Networks and Pathways , Quantitative Trait Loci , Transcriptome
12.
PLoS Comput Biol ; 12(2): e1004395, 2016 Feb.
Article in English | MEDLINE | ID: mdl-26845152

ABSTRACT

Bioinformatics is now a critical skill in many research and commercial environments as biological data are increasing in both size and complexity. South African researchers recognized this need in the mid-1990s and responded by working with the government as well as international bodies to develop initiatives to build bioinformatics capacity in the country. Significant injections of support from these bodies provided a springboard for the establishment of computational biology units at multiple universities throughout the country, which took on teaching, basic research and support roles. Several challenges were encountered, for example with unreliability of funding, lack of skills, and lack of infrastructure. However, the bioinformatics community worked together to overcome these, and South Africa is now arguably the leading country in bioinformatics on the African continent. Here we discuss how the discipline developed in the country, highlighting the challenges, successes, and lessons learnt.


Subject(s)
Computational Biology , Biotechnology , Computational Biology/education , Computational Biology/history , Computational Biology/organization & administration , History, 20th Century , History, 21st Century , Humans , South Africa
13.
BMC Bioinformatics ; 17: 75, 2016 Feb 08.
Article in English | MEDLINE | ID: mdl-26856535

ABSTRACT

BACKGROUND: Increasing resistance to anti-tuberculosis drugs has driven the need for developing new drugs. Resources such as the tropical disease research (TDR) target database and AssessDrugTarget can help to prioritize putative drug targets. However, these resources do not necessarily map to metabolic pathways and the targets are not involved in dormancy. In this study, we specifically identify drug resistance pathways to allow known drug resistant mutations in one target to be offset by inhibiting another enzyme of the same metabolic pathway. One of the putative targets, Rv1712, was analysed by modelling its three-dimensional structure and docking potential inhibitors. RESULTS: We mapped 18 TB drug resistance gene products to 15 metabolic pathways critical for mycobacterial growth and latent TB by screening publicly available microarray data. Nine putative targets, Rv1712, Rv2984, Rv2194, Rv1311, Rv1305, Rv2195, Rv1622c, Rv1456c and Rv2421c, were found to be essential, to lack a close human homolog, and to share >67 % sequence identity and >87 % query coverage with mycobacterial orthologs. A structural model was generated for Rv1712 and subjected to molecular dynamics simulation, and 10 compounds were identified with affinities better than that for the ligand cytidine-5'-monophosphate (C5P). Each compound formed more interactions with the protein than C5P. CONCLUSIONS: We focused on metabolic pathways associated with bacterial drug resistance and proteins unique to pathogenic bacteria to identify novel putative drug targets. The ten compounds identified in this study should be considered for experimental studies to validate their potential as inhibitors of Rv1712.


Subject(s)
Anti-Bacterial Agents/pharmacology , Drug Resistance, Bacterial/genetics , Gene Expression Regulation, Bacterial/drug effects , Metabolic Networks and Pathways , Mycobacterium tuberculosis/drug effects , Tuberculosis/genetics , Genes, Bacterial , Genome, Bacterial , Humans , Mycobacterium tuberculosis/genetics , Quantitative Structure-Activity Relationship , Tuberculosis/drug therapy , Tuberculosis/microbiology
14.
BMC Genomics ; 17: 561, 2016 08 08.
Article in English | MEDLINE | ID: mdl-27503259

ABSTRACT

BACKGROUND: Iron metabolism and regulation is an indispensable part of species survival, most importantly for blood feeding insects. Iron regulatory proteins are central regulators of iron homeostasis, whose binding to iron response element (IRE) stem-loop structures within the UTRs of genes regulates expression at the post-transcriptional level. Despite the extensive literature on the mechanism of iron regulation in humans, less attention has been given to insects and more specifically the blood feeding insects, where research has mainly focused on the characterization of ferritin and transferrin. We thus examined the mechanism of iron homeostasis through a genome-wide computational identification of IREs and other enriched motifs in the UTRs of Glossina morsitans with a view to identifying new IRE-regulated genes. RESULTS: We identified 150 genes, of which two are known to contain IREs, namely the ferritin heavy chain and the MRCK-alpha. The remainder of the identified genes are considered novel, including 20 hypothetical proteins, for which an iron-regulatory mechanism of action was inferred. Forty-three genes were found with IRE-signatures of regulation in two or more insects, while 46 were only found to be IRE-regulated in two species. Notably 39 % of the identified genes exclusively shared IRE-signatures in other Glossina species, which are potentially Glossina-specific adaptive measures in addressing its unique reproductive biology and blood meal-induced iron overload. In line with previous findings, we found no evidence pertaining to an IRE regulation of Transferrin, which highlights the importance of ferritin heavy chain and the other proposed transporters in the tsetse fly. In the context of iron-sequestration, key players of tsetse immune defence against trypanosomes have been introduced, namely 14 stress and immune response genes, while 28 cell-envelope, transport, and binding genes were assigned a putative role in iron trafficking.
Additionally, we identified and annotated enriched motifs in the UTRs of the putative IRE-regulated genes to arrive at a co-regulatory network that maintains iron homeostasis in tsetse flies. Three putative microRNA-binding sites, namely Gy-box, Brd-box and K-box motifs, were identified among the regulatory motifs enriched in the UTRs of the putative IRE-regulated genes. CONCLUSION: Beyond our current view of iron metabolism in insects, with ferritin and transferrin as its key players, this study provides a comprehensive catalogue of genes with possible roles in the acquisition, transport and storage of iron, and hence iron homeostasis, in the tsetse fly.


Subject(s)
Iron/metabolism , Models, Biological , Response Elements , Tsetse Flies/genetics , Tsetse Flies/metabolism , Animals , Biological Transport , Disease Vectors , Genes, Insect , Iron-Regulatory Proteins/genetics , Iron-Regulatory Proteins/metabolism
15.
Lancet ; 395(10217): 29-30, 2020 01 04.
Article in English | MEDLINE | ID: mdl-31908277
16.
Malar J ; 15: 50, 2016 Jan 29.
Article in English | MEDLINE | ID: mdl-26823078

ABSTRACT

BACKGROUND: A large number of natural products have shown in vitro antiplasmodial activities. Early identification and prioritization of these natural products with potential for novel mechanism of action, desirable pharmacokinetics and likelihood for development into drugs is advantageous. Chemo-informatic profiling of these natural products was conducted and compared to currently registered anti-malarial drugs (CRAD). METHODS: Natural products with in vitro antiplasmodial activities (NAA) were compiled from various sources. These natural products were sub-divided into four groups based on inhibitory concentration (IC50). Key molecular descriptors and physicochemical properties were computed for these compounds and analysis of variance was used to assess statistical significance amongst the sets of compounds. Molecular similarity analysis, estimation of drug-likeness, in silico pharmacokinetic profiling, and exploration of structure-activity landscape were also carried out on these sets of compounds. RESULTS: A total of 1040 natural products were selected and a total of 13 molecular descriptors were analysed. Significant differences were observed among the sub-groups of NAA and CRAD for at least 11 of the molecular descriptors, including number of hydrogen bond donors and acceptors, molecular weight, polar and hydrophobic surface areas, chiral centres, oxygen and nitrogen atoms, and shape index. The remaining molecular descriptors, including clogP, number of rotatable bonds and number of aromatic rings, did not show any significant difference when comparing the two compound sets. Molecular similarity and chemical space analysis identified natural products that were structurally diverse from CRAD. Prediction of the pharmacokinetic properties and drug-likeness of these natural products identified over 50% with desirable drug-like properties. Nearly 70% of all natural products were identified as potentially promiscuous compounds.
Structure-activity landscape analysis highlighted compound pairs that form 'activity cliffs'. In all, prioritization strategies for the NAA were proposed. CONCLUSIONS: Chemo-informatic profiling of NAA and CRAD has produced a wealth of information that may guide decisions and facilitate anti-malarial drug development from natural products. Articulation of the information provided within an interactive data-mining environment led to a prioritized list of NAA.


Subject(s)
Antimalarials/chemistry , Biological Products/chemistry , Molecular Weight
17.
Malar J ; 15(1): 542, 2016 Nov 08.
Article in English | MEDLINE | ID: mdl-27825380

ABSTRACT

BACKGROUND: Over the past several years, thousands of microRNAs (miRNAs) have been identified in the genomes of various insects through cloning and sequencing or even by computational prediction. However, the number of miRNAs identified in anopheline species is low and little is known about their role. The mosquito Anopheles funestus is one of the dominant vectors in Africa of malaria, which infects and kills millions of people every year. Therefore, small RNA molecules isolated from the four life stages (eggs, larvae, pupae and unfed adult females) of An. funestus were sequenced using next generation sequencing technology. RESULTS: High throughput sequencing of four replicates in combination with computational analysis identified 107 mature miRNA sequences expressed in the An. funestus mosquito. These include 20 novel miRNAs without sequence identity in any organism and eight miRNAs not previously reported in the Anopheles genus but known in non-anopheles mosquitoes. Finally, the changes in the expression of miRNAs during mosquito development were determined and the analysis showed that many miRNAs have stage-specific expression, and are co-transcribed and co-regulated during development. CONCLUSIONS: This study presents the first direct experimental evidence of miRNAs in An. funestus and the first profiling study of miRNAs associated with maturation in this mosquito. Overall, the results indicate that miRNAs play important roles during growth and development. Silencing such molecules in a specific life stage could decrease the vector population and therefore interrupt malaria transmission.


Subject(s)
Anopheles/growth & development , Anopheles/genetics , Gene Expression Profiling , Life Cycle Stages , MicroRNAs/biosynthesis , Mosquito Vectors/growth & development , Mosquito Vectors/genetics , Africa , Animals , Female , High-Throughput Nucleotide Sequencing , MicroRNAs/genetics
18.
Molecules ; 21(1): 104, 2016 Jan 16.
Article in English | MEDLINE | ID: mdl-26784165

ABSTRACT

In light of current resistance to antimalarial drugs, there is a need to discover new classes of antimalarial agents with unique mechanisms of action. Identification of unique scaffolds from natural products with in vitro antiplasmodial activities may be the starting point for such new classes of antimalarial agents. We therefore conducted scaffold diversity and comparison analysis of natural products with in vitro antiplasmodial activities (NAA), currently registered antimalarial drugs (CRAD) and malaria screen data from Medicines for Malaria Venture (MMV). The scaffold diversity analyses on the three datasets were performed using scaffold counts and cumulative scaffold frequency plots. Scaffolds from the NAA were compared to those from CRAD and MMV. A Scaffold Tree was also generated for each of the datasets and the scaffold diversity of NAA was found to be higher than that of MMV. Among the NAA compounds, we identified unique scaffolds that were not contained in any of the other compound datasets. These scaffolds from NAA also possess desirable drug-like properties, making them ideal starting points for antimalarial drug design considerations. The Scaffold Tree showed the preponderance of ring systems in NAA and identified virtual scaffolds, which may be potential bioactive compounds.


Subject(s)
Antimalarials/chemistry , Biological Products/chemistry , Drug Design , Small Molecule Libraries/chemistry , Antimalarials/pharmacology , Databases, Chemical , Drug Discovery , Humans , Malaria/drug therapy , Plasmodium/drug effects , Small Molecule Libraries/pharmacology , Structure-Activity Relationship , User-Computer Interface
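
The scaffold counts and cumulative scaffold frequency plots mentioned in the abstract above can be derived from a list that assigns one scaffold identifier to each compound. The sketch below shows one plausible way to compute the curve and is not the authors' workflow; it assumes scaffolds have already been extracted by a cheminformatics toolkit and are represented as plain strings, and the function name and toy labels are hypothetical.

```python
from collections import Counter


def cumulative_scaffold_frequency(scaffolds):
    """Return (fraction of scaffolds, fraction of compounds) points.

    `scaffolds` holds one scaffold identifier per compound. Scaffolds are
    ranked from most to least populated, so the curve shows how quickly a
    few scaffolds account for most compounds.
    """
    counts = Counter(scaffolds)
    n_compounds = len(scaffolds)
    n_scaffolds = len(counts)
    curve = []
    covered = 0
    for rank, (_, count) in enumerate(counts.most_common(), start=1):
        covered += count
        curve.append((rank / n_scaffolds, covered / n_compounds))
    return curve


# Toy data: six compounds spread over three scaffolds.
points = cumulative_scaffold_frequency(["A", "A", "A", "B", "B", "C"])
for x, y in points:
    print(round(x, 3), round(y, 3))
```

A curve that rises steeply indicates a dataset dominated by a few scaffolds, while a curve close to the diagonal indicates higher scaffold diversity.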
19.
BMC Bioinformatics ; 16: 58, 2015 Feb 21.
Article in English | MEDLINE | ID: mdl-25880035

ABSTRACT

BACKGROUND: De novo transcriptome assembly of short transcribed fragments (transfrags) produced from sequencing-by-synthesis technologies often results in redundant datasets with differing levels of unassembled, partially assembled or mis-assembled transcripts. Post-assembly processing intended to reduce redundancy typically involves reassembly or clustering of assembled sequences. However, these approaches are mostly based on common word heuristics and often create clusters of biologically unrelated sequences, resulting in loss of unique transfrag annotations and propagation of mis-assemblies. RESULTS: Here, we propose a structured, pipeline-style framework of a few steps for Inferring Functionally Relevant Assembly-derived Transcripts (IFRAT). IFRAT combines 1) removal of identical subsequences, 2) error-tolerant CDS prediction, 3) identification of coding potential, and 4) BLAST annotation complemented with a multiple domain architecture annotation that reduces non-specific domain annotation. We demonstrate that, independent of the assembler, IFRAT selects bona fide transfrags (with CDS and coding potential) from the transcriptome assembly of a model organism without relying on post-assembly clustering or reassembly. The robustness of IFRAT is assessed on RNA-Seq data of Neurospora crassa assembled using de Bruijn graph-based assemblers, in single (Trinity and Oases-25) and multiple (Oases-Merge and additive or pooled) k-mer modes. Single k-mer assemblies contained fewer transfrags than the multiple k-mer assemblies. However, Trinity identified a number of predicted coding sequences and gene loci comparable to the Oases pooled assembly. IFRAT selects bona fide transfrags representing over 94% of the cumulative BLAST-derived functional annotations of the unfiltered assemblies. Between 4% and 6% are lost when orphan transfrags are excluded, which represents only a small fraction of the annotations derived from functional transference by sequence similarity. The median length of bona fide transfrags ranged from 1.5 kb (Trinity) to 2 kb (Oases), which is consistent with the average coding sequence length in fungi. The fraction of transfrags that could be associated with gene ontology terms ranged from 33% to 50%, which is also high for domain-based annotation. We showed that unselected transfrags were mostly truncated and represent sequences from intronic, untranslated (5' and 3') regions and non-coding gene loci. CONCLUSIONS: IFRAT simplifies post-assembly processing, providing a reference transcriptome enriched with functionally relevant assembly-derived transcripts for non-model organisms.
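As an illustration of the first step listed in the abstract (removal of identical subsequences), the following is a minimal Python sketch and not the IFRAT implementation. The function name `remove_contained`, the example identifiers and the toy sequences are all hypothetical. It assumes that "identical subsequence" means an exact, same-strand substring of a longer transfrag; handling of reverse complements is not shown.

```python
def remove_contained(transfrags: dict[str, str]) -> dict[str, str]:
    """Drop every transfrag that is an exact substring of a retained one.

    Assumption: same-strand exact matching only; this is an illustrative
    sketch, not the method used by IFRAT.
    """
    # Visit longest sequences first so any potential container is already
    # retained before its substrings are examined. Ties are broken by name
    # so the result is deterministic.
    ordered = sorted(transfrags.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    kept: dict[str, str] = {}
    for name, seq in ordered:
        if any(seq in longer for longer in kept.values()):
            continue  # exact duplicate or contained subsequence
        kept[name] = seq
    return kept


# Hypothetical toy input: t2 is contained in t1, t3 duplicates t1.
example = {
    "t1": "ATGGCGTACGTTAG",
    "t2": "GCGTACG",
    "t3": "ATGGCGTACGTTAG",
    "t4": "TTTTCCCC",
}
filtered = remove_contained(example)
```

The pairwise substring check is quadratic in the number of transfrags, which is acceptable for a sketch but would need an index-based approach for a full assembly.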


Subject(s)
Algorithms , Genome, Fungal , High-Throughput Nucleotide Sequencing/methods , Molecular Sequence Annotation , Neurospora crassa/genetics , Sequence Analysis, RNA/methods , Transcriptome , Cluster Analysis , Software
20.
BMC Genomics ; 16: 722, 2015 Sep 22.
Article in English | MEDLINE | ID: mdl-26394619

ABSTRACT

BACKGROUND: Transcription initiation regulation is mediated by sequence-specific interactions between DNA-binding proteins (transcription factors) and cis-elements, where BRE, TATA, INR, DPE and MTE motifs constitute canonical core motifs for basal transcription initiation of genes. Accurate identification of transcription start sites (TSS) and their corresponding promoter regions is critical for delineation of these motifs. To date, genome-scale analysis of core promoter architecture in insects has been confined to Drosophila. The recently sequenced Tsetse fly genome provides a unique opportunity to analyze transcription initiation regulation machinery in blood-feeding insects. RESULTS: A computational method for identification of TSS in the newly sequenced Tsetse fly genome was evaluated, using TSS seq tags sampled from two developmental stages, namely larvae and pupae. Of the 3134 tag clusters, 45.4% (1424) mapped to first coding exons or their proximal predicted 5'UTR regions and 1.0% (31) mapped to transposons, within a threshold of 100 tags per cluster. The 1393 (1424 - 31) non-transposon-derived core promoters had a propensity for AT nucleotides. The -1/+1 and 1/+1 positions in D. melanogaster, and G. m. morsitans had a propensity for CA and AA dinucleotides respectively. The 1393 tag clusters comprised narrow promoters (5%), broad with peak promoters (23%) and broad without peak promoters (72%). Two-way motif co-occurrence analysis showed that the MTE-DPE pair is over-represented in broad core promoters. The most frequently occurring triplet motifs in all promoter classes are INR-MTE-DPE, TATA-MTE-DPE and TATA-INR-DPE. Promoters without the TATA motif had a higher frequency of the MTE and INR motifs than observed in Drosophila, where the DPE motif occurs more frequently in promoters without the TATA motif. Gene ontology terms associated with developmental processes were over-represented in the narrow and broad with peak promoters. CONCLUSIONS: The study has identified different motif combinations associated with broad promoters in a blood-feeding insect. In the case of TATA-less core promoters, G. m. morsitans uses the MTE to compensate for the lack of a TATA motif. The increasing availability of TSS seq data allows for revision of existing gene annotation datasets, with the potential of identifying new transcriptional units.
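The two-way and triplet motif co-occurrence counts mentioned in the abstract can be illustrated with a short Python sketch. This is an assumption-laden toy, not the authors' method: the function name `cooccurrence`, the promoter identifiers and the motif assignments are hypothetical, and only raw counts are produced. The statistical test for over-representation is not described in the abstract and is not shown.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(promoter_motifs: dict[str, set[str]], size: int) -> Counter:
    """Count how many promoters contain each combination of `size` motifs.

    Assumption: each promoter is summarised as the set of core motifs
    detected in it; motif positions and spacing are ignored.
    """
    counts: Counter = Counter()
    for motifs in promoter_motifs.values():
        # Sort so that each combination has one canonical ordering.
        for combo in combinations(sorted(motifs), size):
            counts[combo] += 1
    return counts


# Hypothetical toy data, not taken from the study.
promoters = {
    "p1": {"INR", "MTE", "DPE"},
    "p2": {"TATA", "MTE", "DPE"},
    "p3": {"MTE", "DPE"},
    "p4": {"TATA"},
}
pairs = cooccurrence(promoters, 2)
triplets = cooccurrence(promoters, 3)
```

In this toy input the MTE-DPE pair occurs in three of the four promoters, which is the kind of raw count an over-representation analysis would then compare against a background expectation.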


Subject(s)
Insect Vectors , Promoter Regions, Genetic , Transcription Initiation Site , Tsetse Flies/genetics , Animals , Base Composition , Cluster Analysis , Computational Biology/methods , Contig Mapping , Genome, Insect , Genomics , Molecular Sequence Annotation , Nucleotide Motifs , Trypanosoma , Trypanosomiasis/transmission , Tsetse Flies/parasitology