Results 1 - 20 of 344
1.
Genome Biol ; 22(1): 202, 2021 07 12.
Article in English | MEDLINE | ID: mdl-34253237

ABSTRACT

GRIDSS2 is the first structural variant caller to explicitly report single breakends, that is, breakpoints in which only one side can be unambiguously determined. By treating single breakends as a fundamental genomic rearrangement signal on par with breakpoints, GRIDSS2 can explain 47% of somatic centromere copy number changes using single breakends to non-centromere sequence. On a cohort of 3782 deeply sequenced metastatic cancers, GRIDSS2 achieves an unprecedented 3.1% false negative rate and 3.3% false discovery rate and identifies a novel 32-100 bp duplication signature. GRIDSS2 simplifies complex rearrangement interpretation through phasing of structural variants, with 16% of somatic calls phasable using paired-end sequencing.


Subject(s)
Chromosome Breakpoints , DNA Copy Number Variations , Neoplasms/genetics , Software , Contig Mapping , Databases, Genetic , Datasets as Topic , Genome, Human , Genomics , Humans , Neoplasm Metastasis , Neoplasms/pathology
2.
Cancer Med ; 9(18): 6776-6790, 2020 09.
Article in English | MEDLINE | ID: mdl-32738030

ABSTRACT

Glioblastoma multiforme (GBM) is one of the deadliest tumors. It has been speculated that viruses play a role in GBM, but the evidence is controversial. Published research is mainly limited to studies on the presence of human cytomegalovirus (HCMV) in GBM. No comprehensive assessment of the brain virome, the collection of viral material in the brain, based on recently sequenced data has been performed. Here, we characterized the virome from 111 GBM samples and 57 normal brain samples from eight projects in the SRA database by a tested and comprehensive assembly approach. The annotation of the assembled contigs showed that most viral sequences in the brain belong to the viral family Retroviridae. In some GBM samples, we also detected the full genome sequence of a novel picornavirus recently discovered in invertebrates. Unlike previous reports, our study did not detect herpesviruses such as HCMV in GBM in the data we used. However, some contigs that could not be annotated with any known genes exhibited antibody epitopes in their sequences. These findings provide several avenues for potential cancer therapy: the newly discovered picornavirus could be a starting point for engineering a novel oncolytic virus, and the exhibited antibody epitopes could be a source for exploring potential drug targets for immune cancer therapy. By characterizing the virosphere in GBM and normal brain at a global level, the results from this study strengthen the link between GBM and viral infection, which warrants further investigation.


Subject(s)
Brain Neoplasms/virology , Brain/virology , Glioblastoma/virology , High-Throughput Nucleotide Sequencing , Metagenomics , Picornaviridae/genetics , Retroviridae/genetics , Virome/genetics , Brain/pathology , Brain Neoplasms/pathology , Case-Control Studies , Contig Mapping , Databases, Nucleic Acid , Glioblastoma/pathology , Humans , RNA, Viral/genetics , RNA-Seq
3.
Genomics ; 112(2): 2028-2033, 2020 03.
Article in English | MEDLINE | ID: mdl-31760041

ABSTRACT

Tobacco (Nicotiana tabacum L.) is an essential commercial crop and an ideal model plant for biological mechanism studies. As an allopolyploid species, tobacco harbors a massive and complex genome, which makes the application of molecular markers complicated and challenging. In our study, we performed whole-genome sequencing of an intraspecific recombinant inbred line (RIL) population, an F1 generation and their parents. With the Nicotiana tabacum (K326 cultivar) genome as reference, a total of 45,081 markers were characterized to construct the genetic map, which spanned a genetic distance of 3486.78 cM. Evaluation of a two-dimensional heat map proved the high quality of the genetic map. We utilized these markers to anchor scaffolds and analyzed the ancestral genome origin of linkage groups (LGs). Furthermore, such a high-density genetic map will be applied for quantitative trait locus (QTL) detection, gene localization, genome-wide association studies (GWAS), and marker-assisted breeding in tobacco.


Subject(s)
Genetic Linkage , Genome, Plant , Nicotiana/genetics , Contig Mapping , Polymorphism, Single Nucleotide , Quantitative Trait Loci , Whole Genome Sequencing
4.
Arch Virol ; 165(1): 227-231, 2020 Jan.
Article in English | MEDLINE | ID: mdl-31659444

ABSTRACT

Three viral contig sequences, which represented the complete genome of a novel virus with three dsRNAs of 1,712 nucleotides (nt) (dsRNA1), 1,504 nt (dsRNA2) and 1,353 nt (dsRNA3), were found in tea-oil camellia plants by high-throughput sequencing analysis. The three dsRNAs were re-sequenced by RT-PCR cloning. The largest dsRNA, dsRNA1, had a single open reading frame (ORF) that encoded a putative 52.7-kDa protein of a putative viral RNA-dependent RNA polymerase (RdRp). DsRNA2 and dsRNA3 were predicted to encode putative capsid proteins (CPs) of 40.47 kDa and 40.59 kDa, respectively. The virus, which is provisionally named "tea-oil camellia deltapartitivirus 1", shared amino acid sequence identities of 36.09-69.18% with members of the genus Deltapartitivirus on RdRp. Phylogenetic analysis based on RdRp also placed the new virus and other deltapartitiviruses together in a group, suggesting that this virus should be considered a new member of the genus Deltapartitivirus.


Subject(s)
Camellia/virology , RNA Viruses/genetics , Whole Genome Sequencing/methods , Contig Mapping , Genome, Viral , High-Throughput Nucleotide Sequencing/methods , Open Reading Frames , Phylogeny , RNA Viruses/classification , RNA, Double-Stranded/genetics
5.
Gigascience ; 8(12), 2019 12 01.
Article in English | MEDLINE | ID: mdl-31782791

ABSTRACT

BACKGROUND: Sugarcane cultivars are polyploid interspecific hybrids of giant genomes, typically with 10-13 sets of chromosomes from 2 Saccharum species. The ploidy, hybridity, and size of the genome, estimated to have >10 Gb, pose a challenge for sequencing. RESULTS: Here we present a gene space assembly of SP80-3280, including 373,869 putative genes and their potential regulatory regions. The alignment of single-copy genes in diploid grasses to the putative genes indicates that we could resolve 2-6 (up to 15) putative homo(eo)logs that are 99.1% identical within their coding sequences. Dissimilarities increase in their regulatory regions, and gene promoter analysis shows differences in regulatory elements within gene families that are expressed in a species-specific manner. We exemplify these differences for sucrose synthase (SuSy) and phenylalanine ammonia-lyase (PAL), 2 gene families central to carbon partitioning. SP80-3280 has particular regulatory elements involved in sucrose synthesis not found in the ancestor Saccharum spontaneum. PAL regulatory elements are found in co-expressed genes related to fiber synthesis within gene networks defined during plant growth and maturation. Comparison with sorghum reveals predominantly bi-allelic variations in sugarcane, consistent with the formation of 2 "subgenomes" after their divergence ∼3.8-4.6 million years ago and reveals single-nucleotide variants that may underlie their differences. CONCLUSIONS: This assembly represents a large step towards a whole-genome assembly of a commercial sugarcane cultivar. It includes a rich diversity of genes and homo(eo)logous resolution for a representative fraction of the gene space, relevant to improve biomass and food production.


Subject(s)
Contig Mapping/methods , Glucosyltransferases/genetics , Phenylalanine Ammonia-Lyase/genetics , Saccharum/growth & development , Biomass , Crops, Agricultural/genetics , Crops, Agricultural/growth & development , Genetic Variation , Genome Size , Genome, Plant , Multigene Family , Plant Proteins/genetics , Polyploidy , Promoter Regions, Genetic , Saccharum/genetics
6.
Sci Rep ; 9(1): 12629, 2019 09 02.
Article in English | MEDLINE | ID: mdl-31477765

ABSTRACT

The centromere is important for segregation of chromosomes during cell division in eukaryotes. Its destabilization results in chromosomal missegregation and aneuploidy, hallmarks of cancers and birth defects. In primate genomes, centromeres contain tandem repeats of ~171 bp alpha satellite DNA, commonly organized into higher order repeats (HORs). In spite of their crucial importance, satellites have been understudied because of gaps in sequencing, the genomic "black holes". Bioinformatic studies of genomic sequences open possibilities to revolutionize the understanding of repetitive DNA datasets. Here, using the robust Global Repeat Map algorithm, we identified in the hg38 sequence of human chromosome 21 a complete ensemble of alpha satellite HORs with six long repeat units (≥20 mers), five of them novel. The novel 33mer HOR has the longest HOR unit identified so far among all somatic chromosomes, and the novel 23mer reverse HOR is located far from the centromere. We also discovered that in the hg38 assembly the 33mer sequences in chromosomes 21, 13, 14, and 22 are 100% identical but nearby gaps are present, which seems to call for additional, more precise sequencing. Chromosome 21 is of significant interest for deciphering the molecular basis of Down syndrome and of aneuploidies in general. Since chromosome identifier probes are largely based on the detection of higher order alpha satellite repeats, the distinctions between alpha satellite HORs in chromosomes 21 and 13 identified here might lead to a unique chromosome 21 probe in molecular cytogenetics, which would find utility in diagnostics. It is expected that its complete sequence analysis will have profound implications for understanding the pathogenesis of diseases and the development of new therapeutic approaches.


Subject(s)
Chromosomes, Human, Pair 21/genetics , DNA, Satellite/genetics , Tandem Repeat Sequences/genetics , Algorithms , Contig Mapping , Humans
7.
Gene ; 691: 96-105, 2019 Apr 05.
Article in English | MEDLINE | ID: mdl-30630096

ABSTRACT

Vriesea carinata is an endemic bromeliad from the Brazilian Atlantic Forest. It has trichomes and a tank system in its leaves, which allow it to absorb water and nutrients. It belongs to the Bromeliaceae family, which includes several species highly enriched in cysteine proteases (CysPs). These proteolytic enzymes regulate processes such as senescence, cell differentiation, pathogen-linked programmed cell death and mobilization of proteins. Despite their biological importance, there are no genomic resources for V. carinata that can help to identify and understand the molecular mechanisms involved in different biological processes. Thus, high-throughput transcriptome sequencing of V. carinata is necessary to generate sequences for the purpose of gene discovery and functional genomic studies. In the present study, we sequenced and assembled the V. carinata transcriptome to identify CysPs. A total of 43,232 contigs were assembled for the leaf tissue. BLAST analysis indicated that 23,803 contigs exhibited similarity to non-redundant Viridiplantae proteins. Of the contigs, 28.24% were classified into the COG database, and gene ontology categorized them into 61 functional groups. A metabolic pathway analysis with KEGG revealed 9679 contigs assigned to 31 metabolic pathways. Among 16 full-length CysPs identified, 11 were evaluated with respect to their expression patterns in the leaf apex, base and inflorescence tissues. The results showed differential expression levels of legumain, metacaspase, pyroglutamyl and papain-like CysPs depending on the leaf region. These results provide a global overview of V. carinata gene functions and expression activities of CysPs in those tissues.


Subject(s)
Bromeliaceae/genetics , Contig Mapping/methods , Cysteine Proteases/genetics , Gene Expression Profiling/methods , Gene Expression Regulation, Plant , High-Throughput Nucleotide Sequencing , Metabolic Networks and Pathways , Molecular Sequence Annotation , Multigene Family , Plant Leaves/genetics , Plant Proteins/genetics , Sequence Analysis, RNA
8.
Aquat Toxicol ; 202: 46-56, 2018 Sep.
Article in English | MEDLINE | ID: mdl-30007154

ABSTRACT

Thyroid hormones (THs) regulate vertebrate growth, development, and metabolism. Despite their importance, there is a need for effective detection of TH-disruption by endocrine disrupting chemicals (EDCs). The frog olfactory system substantially remodels during TH-dependent metamorphosis and the objective of the present study is to examine olfactory system gene expression for TH biomarkers that can evaluate the biological effects of complex mixtures such as municipal wastewater. We first examine classic TH-response gene transcripts using reverse transcription-quantitative real-time polymerase chain reaction (RT-qPCR) in the olfactory epithelium (OE) and olfactory bulb (OB) of premetamorphic Rana (Lithobates) catesbeiana tadpoles after 48 h exposure to biologically-relevant concentrations of the THs, 3,5,3'-triiodothyronine (T3) and L-thyroxine (T4), or 17-beta estradiol (E2), a hormone that can crosstalk with THs. As the OE was particularly sensitive to THs, further RNA-seq analysis found >30,000 TH-responsive contigs. In contrast, E2 affected 267 contigs of which only 57 overlapped with THs, suggesting that E2 has limited effect on the OE at this developmental phase. Gene ontology enrichment analyses identified sensory perception and nucleoside diphosphate phosphorylation as the top affected terms for THs and E2, respectively. Using classic and additional RNA-seq-derived TH-response gene transcripts, we queried TH-disrupting activity in municipal wastewater effluent from two different treatment systems: anaerobic membrane bioreactor (AnMBR) and membrane enhanced biological phosphorus removal (MEBPR). While we observed physical EDC removal in both systems, some TH disruption activity was retained in the effluents. This work lays an important foundation for linking TH-dependent gene expression with olfactory system function in amphibians.


Subject(s)
Endocrine Disruptors/toxicity , Olfactory Bulb/drug effects , Rana catesbeiana/genetics , Transcriptome/drug effects , Water Pollutants, Chemical/toxicity , Animals , Contig Mapping , Estradiol/metabolism , Gene Expression Profiling , Iodide Peroxidase/genetics , Iodide Peroxidase/metabolism , Larva/drug effects , Larva/metabolism , Olfactory Bulb/metabolism , Rana catesbeiana/growth & development , Thyroid Hormone Receptors alpha/genetics , Thyroid Hormone Receptors alpha/metabolism , Thyroid Hormone Receptors beta/genetics , Thyroid Hormone Receptors beta/metabolism , Thyroid Hormones/toxicity , Thyroxine/toxicity , Triiodothyronine/toxicity , Iodothyronine Deiodinase Type II
9.
Gigascience ; 7(5), 2018 05 01.
Article in English | MEDLINE | ID: mdl-29762659

ABSTRACT

Background: The accurate sequencing and assembly of very large, often polyploid, genomes remains a challenging task, limiting long-range sequence information and phased sequence variation for applications such as plant breeding. The 15-Gb hexaploid bread wheat (Triticum aestivum) genome has been particularly challenging to sequence, and several different approaches have recently generated long-range assemblies. Mapping and understanding the types of assembly errors are important for optimising future sequencing and assembly approaches and for comparative genomics. Results: Here we use a Fosill 38-kb jumping library to assess medium and longer-range order of different publicly available wheat genome assemblies. Modifications to the Fosill protocol generated longer Illumina sequences and enabled comprehensive genome coverage. Analyses of two independent Bacterial Artificial Chromosome (BAC)-based chromosome-scale assemblies, two independent Illumina whole genome shotgun assemblies, and a hybrid Single Molecule Real Time (SMRT-PacBio) and short read (Illumina) assembly were carried out. We revealed a surprising scale and variety of discrepancies using Fosill mate-pair mapping and validated several of each class. In addition, Fosill mate-pairs were used to scaffold a whole genome Illumina assembly, leading to a 3-fold increase in N50 values. Conclusions: Our analyses, using an independent means to validate different wheat genome assemblies, show that whole genome shotgun assemblies based solely on Illumina sequences are significantly more accurate by all measures compared to BAC-based chromosome-scale assemblies and hybrid SMRT-Illumina approaches. Although current whole genome assemblies are reasonably accurate and useful, additional improvements will be needed to generate complete assemblies of wheat genomes using open-source, computationally efficient, and cost-effective methods.


Subject(s)
Gene Library , Genome, Plant , Sequence Analysis, DNA/methods , Triticum/genetics , Chromosomes, Artificial, Bacterial/genetics , Chromosomes, Plant/genetics , Contig Mapping
10.
Funct Integr Genomics ; 18(5): 533-543, 2018 Sep.
Article in English | MEDLINE | ID: mdl-29730772

ABSTRACT

One of the main challenges in eliminating oil contamination from polluted environments is improving biodegradation by highly efficient microorganisms. Bacillus subtilis MJ01 has been evaluated as a new resource for producing biosurfactant compounds. This bacterium, which produces surfactin, is able to enhance bio-accessibility to oil hydrocarbons in contaminated soils. The genome of B. subtilis MJ01 was sequenced and assembled by PacBio RS sequencing technology. One large contig with a length of 4,108,293 bp without any gaps was assembled. Genome annotation and gene prediction showed that the MJ01 genome is very similar to that of B. subtilis spizizenii TU-B-10 (95% similarity). The comparison and analysis of orthologous genes was carried out between B. subtilis MJ01, the reference strain B. subtilis subsp. subtilis str. 168, and the close relative spizizenii TU-B-10 using the MicroScope platform and various bioinformatics tools. More than 88% of the 4269 predicted coding sequences in MJ01 had at least one similar sequence in the genomes of the reference strain and spizizenii TU-B-10. Despite this high similarity, some differences were detected among the encoding sequences of non-ribosome protein and bacteriocins in MJ01 and spizizenii TU-B-10. MJ01 has unique nucleotide sequences and a novel predicted lasso-peptide bacteriocin, which has no similar nucleotide sequence in the non-redundant nucleotide database.


Subject(s)
Bacillus subtilis/genetics , Bacterial Proteins/genetics , Gene Expression Regulation, Bacterial , Genome, Bacterial , Industrial Oils/analysis , Soil Pollutants/metabolism , Bacillus subtilis/classification , Bacillus subtilis/metabolism , Bacterial Proteins/metabolism , Bacteriocins/biosynthesis , Bacteriocins/genetics , Biodegradation, Environmental , Computational Biology , Contig Mapping , Gene Ontology , Lipopeptides/biosynthesis , Lipopeptides/genetics , Molecular Sequence Annotation , Peptides, Cyclic/biosynthesis , Peptides, Cyclic/genetics , Phylogeny , Soil/chemistry , Soil Microbiology , Surface-Active Agents/chemistry , Surface-Active Agents/metabolism , Whole Genome Sequencing
12.
Sci Rep ; 7(1): 15274, 2017 11 10.
Article in English | MEDLINE | ID: mdl-29127298

ABSTRACT

Like many agricultural crops, cultivated cotton is an allotetraploid and has a large genome (~2.5 gigabase pairs). The two subgenomes, A and D, are highly similar but unequally sized and repeat-rich, which poses significant challenges for accurate genome reconstruction using standard approaches. Here we report the development of BAC libraries, subgenome-specific physical maps, and a new-generation sequencing approach that will lead to a reference-grade genome assembly for Upland cotton. Three BAC libraries were constructed, fingerprinted, and integrated with BAC-end sequences (BES) to produce a de novo whole-genome physical map. The BAC map was partitioned by subgenome through alignment to the diploid progenitor D-genome reference sequence with densely spaced BES anchor points and computational filtering. The physical maps were validated with FISH and genetic mapping of SNP markers derived from BES. Two pairs of homeologous chromosomes, A11/D11 and A12/D12, were used to assess multiplex sequencing approaches for completeness and scalability. The results represent the first subgenome-anchored physical maps of Upland cotton, and a new-generation approach to whole-genome sequencing, which will lead to the reference-grade assembly of allopolyploid cotton and serve as a general strategy for sequencing other polyploid species.


Subject(s)
Chromosomes, Plant/genetics , Genetic Linkage , Genome, Plant , Gossypium/genetics , Polyploidy , Chromosomes, Artificial, Bacterial , Contig Mapping , Gene Library , Sequence Analysis, DNA
13.
Gigascience ; 6(9): 1-7, 2017 09 01.
Article in English | MEDLINE | ID: mdl-28922823

ABSTRACT

Camptotheca acuminata is 1 of a limited number of species that produce camptothecin, a pentacyclic quinoline alkaloid with anti-cancer activity due to its ability to inhibit DNA topoisomerase. While transcriptome studies have been performed previously with various camptothecin-producing species, no genome sequence for a camptothecin-producing species is available to date. We generated a high-quality de novo genome assembly for C. acuminata representing 403 174 860 bp on 1394 scaffolds with an N50 scaffold size of 1752 kbp. Quality assessments of the assembly revealed robust representation of the genome sequence including genic regions. Using a novel genome annotation method, we annotated 31 825 genes encoding 40 332 gene models. Based on sequence identity and orthology with validated genes from Catharanthus roseus as well as Pfam searches, we identified candidate orthologs for genes potentially involved in camptothecin biosynthesis. Extensive gene duplication including tandem duplication was widespread in the C. acuminata genome, with 2571 genes belonging to 997 tandem duplicated gene clusters. To our knowledge, this is the first genome sequence for a camptothecin-producing species, and access to the C. acuminata genome will permit not only discovery of genes encoding the camptothecin biosynthetic pathway but also reagents that can be used for heterologous expression of camptothecin and camptothecin analogs with novel pharmaceutical applications.


Subject(s)
Camptotheca/genetics , Genome, Plant , Antineoplastic Agents/chemistry , Antineoplastic Agents/metabolism , Camptotheca/classification , Camptothecin/biosynthesis , Contig Mapping , Gene Duplication , Molecular Sequence Annotation , Plant Proteins/genetics , Tandem Repeat Sequences , Whole Genome Sequencing
14.
Genome Res ; 27(5): 885-896, 2017 05.
Article in English | MEDLINE | ID: mdl-28420692

ABSTRACT

Advances in genome sequencing and assembly technologies are generating many high-quality genome sequences, but assemblies of large, repeat-rich polyploid genomes, such as that of bread wheat, remain fragmented and incomplete. We have generated a new wheat whole-genome shotgun sequence assembly using a combination of optimized data types and an assembly algorithm designed to deal with large and complex genomes. The new assembly represents >78% of the genome with a scaffold N50 of 88.8 kb that has a high fidelity to the input data. Our new annotation combines strand-specific Illumina RNA-seq and Pacific Biosciences (PacBio) full-length cDNAs to identify 104,091 high-confidence protein-coding genes and 10,156 noncoding RNA genes. We confirmed three known and identified one novel genome rearrangements. Our approach enables the rapid and scalable assembly of wheat genomes, the identification of structural variants, and the definition of complete gene models, all powerful resources for trait analysis and breeding of this key global crop.


Subject(s)
Contig Mapping/methods , Genome, Plant , Molecular Sequence Annotation/methods , Plant Proteins/genetics , Translocation, Genetic , Triticum/genetics , Algorithms , Contig Mapping/standards , Molecular Sequence Annotation/standards , Polymorphism, Genetic , Polyploidy
15.
BMC Genomics ; 18(1): 204, 2017 02 27.
Article in English | MEDLINE | ID: mdl-28241794

ABSTRACT

BACKGROUND: The parasite Echinococcus canadensis (G7) (phylum Platyhelminthes, class Cestoda) is one of the causative agents of echinococcosis. Echinococcosis is a worldwide chronic zoonosis affecting humans as well as domestic and wild mammals, which has been reported as a prioritized neglected disease by the World Health Organisation. No genomic data, comparative genomic analyses or efficient therapeutic and diagnostic tools are available for this severe disease. The information presented in this study will help to understand the peculiar biological characters and to design species-specific control tools. RESULTS: We sequenced, assembled and annotated the 115-Mb genome of E. canadensis (G7). Comparative genomic analyses using whole genome data of three Echinococcus species not only confirmed the status of E. canadensis (G7) as a separate species but also demonstrated a high nucleotide sequence divergence in relation to E. granulosus (G1). The E. canadensis (G7) genome contains 11,449 genes with a core set of 881 orthologs shared among five cestode species. Comparative genomics revealed that there are more single nucleotide polymorphisms (SNPs) between E. canadensis (G7) and E. granulosus (G1) than between E. canadensis (G7) and E. multilocularis. This result was unexpected since E. canadensis (G7) and E. granulosus (G1) were considered to belong to the species complex E. granulosus sensu lato. We described SNPs in known drug targets and metabolism genes in the E. canadensis (G7) genome. Regarding gene regulation, we analysed three particular features: CpG island distribution along the three Echinococcus genomes, DNA methylation system and small RNA pathway. The results suggest the occurrence of yet unknown gene regulation mechanisms in Echinococcus. CONCLUSIONS: This is the first work that addresses Echinococcus comparative genomics. The resources presented here will promote the study of mechanisms of parasite development as well as new tools for drug discovery.
The availability of a high-quality genome assembly is critical for fully exploring the biology of a pathogenic organism. The E. canadensis (G7) genome presented in this study provides a unique opportunity to address the genetic diversity among the genus Echinococcus and its particular developmental features. At present, there is no unequivocal taxonomic classification of Echinococcus species; however, the genome-wide SNP analysis performed here revealed the phylogenetic distance among these three Echinococcus species. Additional cestode genomes need to be sequenced to be able to resolve their phylogeny.


Subject(s)
Echinococcosis/genetics , Echinococcus/genetics , Genome, Protozoan , Animals , Argonaute Proteins/antagonists & inhibitors , Argonaute Proteins/genetics , Argonaute Proteins/metabolism , Comparative Genomic Hybridization , Contig Mapping , CpG Islands , DNA Methylation , Echinococcosis/parasitology , Echinococcosis/pathology , Echinococcus/classification , Echinococcus/metabolism , Humans , Interspersed Repetitive Sequences/genetics , Phylogeny , Polymorphism, Single Nucleotide , Protozoan Proteins/antagonists & inhibitors , Protozoan Proteins/genetics , Protozoan Proteins/metabolism
16.
Sci Rep ; 7: 41458, 2017 02 02.
Article in English | MEDLINE | ID: mdl-28150733

ABSTRACT

Wound healing and regeneration in cnidarian species, a group that forms the sister phylum to Bilateria, remain poorly characterised despite the ability of many cnidarians to rapidly repair injuries, regenerate lost structures, or re-form whole organisms from small populations of somatic cells. Here we present results from a fully replicated RNA-Seq experiment to identify genes that are differentially expressed in the sea anemone Calliactis polypus following catastrophic injury. We find that a large-scale transcriptomic response is established in C. polypus, comprising an abundance of genes involved in tissue patterning, energy dynamics, immunity, cellular communication, and extracellular matrix remodelling. We also identified a substantial proportion of uncharacterised genes that were differentially expressed during regeneration and that appear to be restricted to cnidarians. Overall, our study serves both to identify the role that conserved genes play in eumetazoan wound healing and regeneration and to highlight the lack of information regarding many genes involved in this process. We suggest that functional analysis of the large group of uncharacterised genes found in our study may contribute to a better understanding of the regenerative capacity of cnidarians, as well as provide insight into how wound healing and regeneration have evolved in different lineages.


Subject(s)
Regeneration/genetics , Sea Anemones/genetics , Transcriptome/genetics , Wound Healing/genetics , Animals , Computational Biology , Contig Mapping , Gene Expression Profiling , Gene Expression Regulation , Genetic Association Studies , Molecular Sequence Annotation , Open Reading Frames/genetics , Time Factors
17.
Genome Res ; 27(5): 686-696, 2017 05.
Article in English | MEDLINE | ID: mdl-28137821

ABSTRACT

The American alligator, Alligator mississippiensis, like all crocodilians, has temperature-dependent sex determination, in which the sex of an embryo is determined by the incubation temperature of the egg during a critical period of development. The lack of genetic differences between male and female alligators leaves open the question of how the genes responsible for sex determination and differentiation are regulated. Insight into this question comes from the fact that exposing an embryo incubated at male-producing temperature to estrogen causes it to develop ovaries. Because estrogen response elements are known to regulate genes over long distances, a contiguous genome assembly is crucial for predicting and understanding their impact. We present an improved assembly of the American alligator genome, scaffolded with in vitro proximity ligation (Chicago) data. We use this assembly to scaffold two other crocodilian genomes based on synteny. We perform RNA sequencing of tissues from American alligator embryos to find genes that are differentially expressed between embryos incubated at male- versus female-producing temperature. Finally, we use the improved contiguity of our assembly along with the current model of CTCF-mediated chromatin looping to predict regions of the genome likely to contain estrogen-responsive genes. We find that these regions are significantly enriched for genes with female-biased expression in developing gonads after the critical period during which sex is determined by incubation temperature. We thus conclude that estrogen signaling is a major driver of female-biased gene expression in the post-temperature sensitive period gonads.


Subject(s)
Alligators and Crocodiles/genetics , Conserved Sequence , Estrogens/genetics , Genome , Signal Transduction , Alligators and Crocodiles/embryology , Animals , CCCTC-Binding Factor/metabolism , Chromatin/metabolism , Contig Mapping , Estrogens/metabolism , Female , Male , Sequence Analysis, DNA , Sex Determination Processes/genetics , Synteny
18.
Genome Res ; 27(5): 793-800, 2017 05.
Article in English | MEDLINE | ID: mdl-28104618

ABSTRACT

Achieving complete, accurate, and cost-effective assembly of human genomes is of great importance for realizing the promise of precision medicine. The abundance of repeats and genetic variations in human genomes and the limitations of existing sequencing technologies call for the development of novel assembly methods that can leverage the complementary strengths of multiple technologies. We propose a Hybrid Structural variant Assembly (HySA) approach that integrates sequencing reads from next-generation sequencing and single-molecule sequencing technologies to accurately assemble and detect structural variants (SVs) in human genomes. By identifying homologous SV-containing reads from different technologies through a bipartite-graph-based clustering algorithm, our approach turns a whole genome assembly problem into a set of independent SV assembly problems, each of which can be effectively solved to enhance the assembly of structurally altered regions in human genomes. We used data generated from a haploid hydatidiform mole genome (CHM1) and a diploid human genome (NA12878) to test our approach. The results showed that, compared with existing methods, our approach had a low false discovery rate and substantially improved the detection of many types of SVs, particularly novel large insertions, small indels (10-50 bp), and short tandem repeat expansions and contractions. Our work highlights the strengths and limitations of current approaches and provides an effective solution for extending the power of existing sequencing technologies for SV discovery.


Subject(s)
Contig Mapping/methods , Genome, Human , Genomic Structural Variation , Genomics/methods , Sequence Analysis, DNA/methods , Software , Animals , Contig Mapping/standards , Diploidy , Genomics/standards , Haploidy , Humans , Mice , Sequence Analysis, DNA/standards , Tandem Repeat Sequences
19.
Sci Rep ; 6: 39256, 2016 12 19.
Article in English | MEDLINE | ID: mdl-27991536

ABSTRACT

Norcoclaurine synthase (NCS) catalyzes the enantioselective Pictet-Spengler condensation of dopamine and 4-hydroxyphenylacetaldehyde as the first step in benzylisoquinoline alkaloid (BIA) biosynthesis. NCS orthologs in available transcriptome databases were screened for variants that might improve the low yield of BIAs in engineered microorganisms. Databases for 21 BIA-producing species from four plant families yielded 33 assembled contigs with homology to characterized NCS genes. Predicted translation products generated from nine contigs consisted of two to five sequential repeats, each containing most of the sequence found in single-domain enzymes. Assembled contigs containing tandem domain repeats were detected only in members of the Papaveraceae family, including opium poppy (Papaver somniferum). Fourteen cDNAs were generated from 10 species, five of which encoded NCS orthologs with repeated domains. Functional analysis of corresponding recombinant proteins yielded six active NCS enzymes, including four containing either two, three or four repeated catalytic domains. Truncation of the first 25 N-terminal amino acids from the remaining polypeptides revealed two additional enzymes. Multiple catalytic domains correlated with a proportional increase in catalytic efficiency. Expression of NCS genes in Saccharomyces cerevisiae also produced active enzymes. The metabolic conversion capacity of engineered yeast positively correlated with the number of repeated domains.


Subject(s)
Carbon-Nitrogen Ligases/genetics , Plant Proteins/genetics , Alkaloids/biosynthesis , Amino Acid Sequence , Benzylisoquinolines/metabolism , Biocatalysis , Carbon-Nitrogen Ligases/chemistry , Carbon-Nitrogen Ligases/classification , Carbon-Nitrogen Ligases/metabolism , Catalytic Domain , Cloning, Molecular , Contig Mapping , DNA, Complementary/metabolism , Databases, Factual , Enzyme Assays , Escherichia coli/metabolism , Kinetics , Papaveraceae , Phylogeny , Plant Proteins/chemistry , Plant Proteins/classification , Plant Proteins/metabolism , Recombinant Proteins/biosynthesis , Recombinant Proteins/chemistry , Recombinant Proteins/isolation & purification , Saccharomyces cerevisiae/metabolism , Sequence Alignment
20.
BMC Genomics ; 17(1): 1005, 2016 12 08.
Article in English | MEDLINE | ID: mdl-27931186

ABSTRACT

BACKGROUND: The evolutionary arms race between plants and insects has driven the co-evolution of sophisticated defense mechanisms used by plants to deter herbivores and equally sophisticated strategies that enable phytophagous insects to rapidly detoxify the plant's defense metabolites. In this study, we identify the genetic determinants that enable the mirid, Tupiocoris notatus, to feed on its well-defended host plant, Nicotiana attenuata, an outstanding model for plant-insect interaction studies. RESULTS: We used an RNAseq approach to evaluate the global gene expression of T. notatus after feeding on a transgenic N. attenuata line which does not accumulate jasmonic acid (JA) after herbivory, and consequently accumulates very low levels of defense metabolites. Using Illumina sequencing, we generated a de novo assembled transcriptome which resulted in 63,062 contigs (putative transcript isoforms) contained in 42,610 isotigs (putative identified genes). Differential expression analysis based on RSEM-estimated transcript abundances identified 82 differentially expressed (DE) transcripts between T. notatus fed on wild-type and the defenseless plants. The same analysis conducted with Corset-estimated transcript abundances identified 59 DE clusters containing 85 transcripts. In both analyses, a larger number of DE transcripts were found down-regulated in mirids feeding on JA-silenced plants (around 70%). Among these down-regulated transcripts we identified seven transcripts possibly involved in the detoxification of N. attenuata defense metabolite, specifically, one glutathione-S-transferase (GST), one UDP-glucosyltransferase (UGT), five cytochrome P450 (P450s), and six serine proteases. Real-time quantitative PCR confirmed the down-regulation for six transcripts (encoding GST, UGT and four P450s) and revealed that their expression was only slightly decreased in mirids feeding on another N. attenuata transgenic line specifically silenced in the accumulation of diterpene glycosides, one of the many classes of JA-mediated defenses in N. attenuata. CONCLUSIONS: The results provide a transcriptional overview of the changes in a specialist hemimetabolous insect associated with feeding on host plants depleted in chemical defenses. Overall, the analysis reveals that T. notatus responses to host plant defenses are narrow and engage P450 detoxification pathways. It further identifies candidate genes which can be tested in future experiments to understand their role in shaping the T. notatus-N. attenuata interaction.


Subject(s)
Heteroptera/genetics , Cyclopentanes/metabolism , Nicotiana/genetics , Oxylipins/metabolism , Plant Growth Regulators/metabolism , Animals , Heteroptera/enzymology , Contig Mapping , Cytochrome P-450 Enzyme System/classification , Cytochrome P-450 Enzyme System/genetics , Cytochrome P-450 Enzyme System/metabolism , Down-Regulation , Gene Expression Profiling , Gene Silencing , Glutathione Transferase/classification , Glutathione Transferase/genetics , Glutathione Transferase/metabolism , Herbivory , Inactivation, Metabolic/genetics , Monosaccharide Transport Proteins/classification , Monosaccharide Transport Proteins/genetics , Monosaccharide Transport Proteins/metabolism , Phylogeny , Plant Growth Regulators/genetics , Plants, Genetically Modified/genetics , RNA/chemistry , RNA/isolation & purification , RNA/metabolism , Sequence Analysis, RNA , Up-Regulation