Results 1 - 20 of 112
1.
BMC Genomics ; 23(1): 327, 2022 Apr 27.
Article in English | MEDLINE | ID: mdl-35477350

ABSTRACT

The cosmopolitan Thalassionema species are often dominant components of the plankton diatom flora and sediment diatom assemblages in all but the Polar regions, making an important ecological contribution to primary productivity. Historical studies concentrated on their indicative function for the marine environment based primarily on morphological features and essentially ignored their genomic information, hindering in-depth investigation of Thalassionema biodiversity. In this project, we constructed the complete chloroplast genomes (cpDNAs) of seven Thalassionema strains representing three different species, which were also the first cpDNAs constructed for any species in the order Thalassionematales, which includes 35 reported species and varieties. The sizes of these Thalassionema cpDNAs, which showed typical quadripartite structures, varied from 124,127 bp to 140,121 bp. Comparative analysis revealed that Thalassionema cpDNAs possess conserved gene content both between and within species, along with several gene losses and transfers. Their cpDNAs also have expanded inverted repeat regions (IRs) and preserve large intergenic spacers compared to other diatom cpDNAs. In addition, substantial genome rearrangements were discovered not only among different Thalassionema species but also among strains of the same species, T. frauenfeldii, suggesting much higher diversity than previously reported. In addition to confirming the phylogenetic position of Thalassionema species, this study also estimated their emergence time at approximately 38 Mya. The availability of the Thalassionema species cpDNAs not only helps understand the Thalassionema species, but also facilitates phylogenetic analysis of diatoms.


Subjects
Diatoms , Chloroplast Genome , Biodiversity , Chloroplasts/genetics , Chloroplast DNA/genetics , Diatoms/genetics , Molecular Evolution , Phylogeny
2.
BMC Bioinformatics ; 22(Suppl 3): 522, 2021 Oct 25.
Article in English | MEDLINE | ID: mdl-34696728

ABSTRACT

BACKGROUND: In the process of designing drugs and proteins, it is crucial to recognize hot regions in protein-protein interactions. Each hot region of protein-protein interaction is composed of at least three hot spots, which play an important role in binding. However, identifying hot spots through biological experiments is time-consuming and labor-intensive. If predictive models based on machine learning methods can be trained, the drug design process can be effectively accelerated. RESULTS: The results show that different machine learning algorithms perform similarly when evaluated using the F-measure. The main differences between these methods are in recall and precision. Since the key attribute of hot regions is that they are tightly packed, we used a clustering algorithm to predict hot regions. By combining Gaussian Naïve Bayes and DBSCAN, the F-measure of hot region prediction can reach 0.809. CONCLUSIONS: In this paper, different machine learning models such as Gaussian Naïve Bayes, SVM, XGBoost, Random Forest, and Artificial Neural Network are used to predict hot spots. The experimental results show that combining a hot spot classification algorithm with higher recall and a clustering algorithm with higher precision can effectively improve the accuracy of hot region prediction.


Subjects
Algorithms , Machine Learning , Bayes Theorem , Cluster Analysis , Proteins , Support Vector Machine
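As an illustration of the approach summarized in the abstract above, the following minimal Python sketch groups predicted hot spots into hot regions and computes the F-measure. It is not the authors' implementation: the residue labels and coordinates are invented, the distance cutoff is an arbitrary assumption, and a simple distance-based connected-components grouping stands in for DBSCAN. The only constraints taken from the abstract are that a hot region contains at least three hot spots and that performance is evaluated with the F-measure.

```python
from math import dist

def f_measure(precision, recall):
    """Harmonic mean of precision and recall (F1)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def group_hot_regions(hot_spots, cutoff=6.5, min_size=3):
    """Group predicted hot spots into candidate hot regions.

    hot_spots: dict mapping a residue label to (x, y, z) coordinates.
    Residues within `cutoff` of each other are linked; each connected
    group with at least `min_size` members is reported as one region.
    This is a simplified stand-in for DBSCAN, not DBSCAN itself.
    """
    unvisited = set(hot_spots)
    regions = []
    while unvisited:
        seed = unvisited.pop()
        group, frontier = {seed}, [seed]
        while frontier:
            current = frontier.pop()
            near = {r for r in unvisited
                    if dist(hot_spots[current], hot_spots[r]) <= cutoff}
            unvisited -= near
            group |= near
            frontier.extend(near)
        if len(group) >= min_size:
            regions.append(sorted(group))
    return sorted(regions)

# Hypothetical predicted hot spots (labels and coordinates are made up).
predicted = {
    "A12": (0.0, 0.0, 0.0),
    "A15": (4.0, 0.0, 0.0),
    "A18": (8.0, 0.0, 0.0),
    "B40": (30.0, 30.0, 30.0),  # isolated, so not part of any region
}
regions = group_hot_regions(predicted)
print(regions)
print(round(f_measure(0.75, 0.6), 3))
```

In this toy input the three chained residues form one region, while the isolated residue is dropped because a group of one falls below the three-hot-spot minimum.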
3.
BMC Genomics ; 22(1): 746, 2021 Oct 15.
Article in English | MEDLINE | ID: mdl-34654361

ABSTRACT

BACKGROUND: Skeletonema species are prominent primary producers, some of which can also cause massive harmful algal blooms (HABs) in coastal waters under specific environmental conditions. Nevertheless, genomic information on Skeletonema species is currently limited, hindering advanced research on their role as primary producers and as HAB species. The mitochondrial genome (mtDNA) has been extensively used as a "super barcode" in phylogenetic and comparative genomic analyses. However, of the 21 accepted Skeletonema species, full-length mtDNAs are currently available only for a single species, S. marinoi. RESULTS: In this study, we constructed full-length mtDNAs for six strains of five Skeletonema species, including S. marinoi, S. tropicum, S. grevillei, S. pseudocostatum and S. costatum (with two strains), which were isolated from coastal waters in China. The mtDNAs of all of these Skeletonema species were compact, with short intergenic regions, no introns, and no repeat regions. Comparative analyses of these Skeletonema mtDNAs revealed high conservation, with a few discrete regions of high variation, some of which could be used as molecular markers for distinguishing Skeletonema species and for tracking the biogeographic distribution of these species with high resolution and specificity. We estimated divergence times among these Skeletonema species using 34 mtDNA genes with fossil data as calibration points in PAML, which revealed that the Skeletonema species formed an independent clade diverging from Thalassiosira species approximately 48.30 Mya. CONCLUSIONS: The availability of mtDNAs of five Skeletonema species provides valuable reference sequences for further evolutionary studies, including speciation time estimation and comparative genomic analysis among diatom species. Divergent regions could be used as molecular markers for tracking different Skeletonema species in field studies of coastal regions.


Subjects
Diatoms , Mitochondrial Genome , Mitochondrial DNA , Diatoms/genetics , Harmful Algal Bloom , Phylogeny
4.
Genetica ; 149(1): 63-72, 2021 Feb.
Article in English | MEDLINE | ID: mdl-33449239

ABSTRACT

Ulva prolifera O.F. Müller (Ulvophyceae, Chlorophyta) is well known as a typical green-tide-forming macroalga that has caused the world's largest macroalgal blooms in the Yellow Sea of China. In this study, two full-length γ-carbonic anhydrase (γ-CA) genes (UpγCA1 and UpγCA2) were cloned from U. prolifera. UpγCA1 has three conserved histidine residues, which act as an active site for binding a zinc metal ion. In UpγCA2, two of the three histidine residues were replaced by serine and arginine, respectively. The two γ-CA genes cluster with other γ-CAs in Chlorophyta with strong support (100% bootstrap) in the maximum likelihood (ML) phylogenetic tree. Quantitative real-time PCR (qRT-PCR) analysis showed that stressful environmental conditions markedly inhibited transcription levels of these two γ-CA genes. Low pH (pH 7.5) significantly increased the transcription level of UpγCA2 but not UpγCA1 at 12 h, whereas high pH (pH 8.5) significantly inhibited the transcription of both γ-CA genes at 6 h. These findings enhance our understanding of the transcriptional regulation of γ-CA genes in response to environmental factors in U. prolifera.


Subjects
Carbonic Anhydrase II/genetics , Carbonic Anhydrase I/genetics , Genetic Transcription , Ulva/genetics , Carbonic Anhydrase I/isolation & purification , Carbonic Anhydrase II/isolation & purification , China , Molecular Cloning , Gene Expression Regulation , Phylogeny , Ulva/enzymology
5.
Microb Cell Fact ; 16(1): 63, 2017 Apr 19.
Article in English | MEDLINE | ID: mdl-28420406

ABSTRACT

BACKGROUND: Efficient biomass bioconversion is a promising solution to alternative energy resources and environmental issues associated with lignocellulosic wastes. The Trichoderma species of cellulolytic fungi have strong cellulose-degrading capability, and their cellulase systems have been extensively studied. Currently, a major limitation of Trichoderma strains is their low production of β-glucosidases. RESULTS: We isolated two Trichoderma hamatum strains, YYH13 and YYH16, with drastically different cellulose-degrading efficiencies. YYH13 has higher cellobiose-hydrolyzing efficiency. To understand the mechanisms underlying such differences, we sequenced the genomes of YYH13 and YYH16, which are essentially identical in size (38.93 and 38.92 Mb, respectively) and are similar to that of the T. hamatum strain GD12. Using GeneMark-ES, we annotated 11,316 and 11,755 protein-coding genes in YYH13 and YYH16, respectively. Comparative analysis identified 13 functionally important genes in YYH13 under positive selection. Through examining orthologous relationships, we identified 172, 655, and 320 genome-specific genes in YYH13, YYH16, and GD12, respectively. We found 15 protease families that show differences between YYH13 and YYH16. Enzymatic tests showed that exoglucanase, endoglucanase, and β-glucosidase activities were higher in YYH13 than in YYH16. Additionally, YYH13 contains 10 families of carbohydrate-active enzymes, including GH1, GH3, GH18, GH35, and GH55 families of chitinases, glucosidases, galactosidases, and glucanases, which are subject to stronger positive selection pressure. Furthermore, we found that the β-glucosidase gene (YYH1311079) and the pGEX-KG/YYH1311079 bacterial expression vector may provide valuable insight for designing β-glucosidases with higher cellobiose-hydrolyzing efficiencies. CONCLUSIONS: This study suggests that the YYH13 strain of T. hamatum has the potential to serve as a model organism for producing cellulase because of its strong ability to efficiently degrade cellulosic biomass. The genome sequences of YYH13 and YYH16 represent a valuable resource for studying efficient production of biofuels.


Subjects
Cellobiose/metabolism , Fungal Genome , Trichoderma/genetics , Trichoderma/metabolism , Biofuels , Biomass , Cellulase/biosynthesis , Cellulase/genetics , Cellulase/metabolism , Cellulose/metabolism , Fermentation , Genetic Variation , Genomics , Hydrolysis , Peptide Hydrolases/genetics , Peptide Hydrolases/metabolism , DNA Sequence Analysis , Trichoderma/enzymology , beta-Glucosidase/genetics , beta-Glucosidase/metabolism
6.
Methods ; 110: 73-80, 2016 Nov 01.
Article in English | MEDLINE | ID: mdl-27346249

ABSTRACT

The hot regions of protein-protein interactions are the active areas formed by the residues most important to the protein binding process. As research on protein interactions has developed, many predicted hot regions can be discovered efficiently by intelligent computing methods, but verifying every prediction by biological experiment is hardly feasible because of the time cost and complexity of the experiments. Building on research into the conservation of hot spot residues, this study proposes a method for verifying the authenticity of predicted hot regions obtained with a machine learning algorithm that combines the protein's biological features and sequence conservation. A conservation scoring algorithm is built through multiple sequence alignment, a substitution matrix module, and sequence similarity, and a threshold module is then used to verify the conservation tendency of hot regions in evolution. This work provides an effective method for verifying predicted hot regions in protein-protein interactions and a useful way to investigate the functional activities of protein hot regions in depth.


Subjects
Computational Biology/methods , Molecular Evolution , Protein Interaction Mapping/methods , Protein Interaction Maps/genetics , Animals , Conserved Sequence/genetics , Protein Databases , Humans , Protein Conformation , Proteins/genetics , Sequence Alignment
7.
J Biol Chem ; 290(19): 12079-89, 2015 May 08.
Article in English | MEDLINE | ID: mdl-25795783

ABSTRACT

The generation of personalized induced pluripotent stem cells (iPSCs) followed by targeted genome editing provides an opportunity for developing customized effective cellular therapies for genetic disorders. However, it is critical to ascertain whether edited iPSCs harbor unfavorable genomic variations before their clinical application. To examine the mutation status of the edited iPSC genome and trace the origin of possible mutations at different steps, we have generated virus-free iPSCs from amniotic cells carrying homozygous point mutations in the β-hemoglobin gene (HBB) that cause severe β-thalassemia (β-Thal), corrected the mutations in both HBB alleles by zinc finger nuclease-aided gene targeting, and obtained the final HBB gene-corrected iPSCs by excising the exogenous drug resistance gene with Cre recombinase. Through comparative genomic hybridization and whole-exome sequencing, we uncovered seven copy number variations, five small insertions/deletions, and 64 single nucleotide variations (SNVs) in β-Thal iPSCs before the gene targeting step and found a single small copy number variation, 19 insertions/deletions, and 340 single nucleotide variations in the final gene-corrected β-Thal iPSCs. Our data revealed that substantial but different genomic variations occurred at factor-induced somatic cell reprogramming and zinc finger nuclease-aided gene targeting steps, suggesting that stringent genomic monitoring and selection are needed both at the time of iPSC derivation and after gene targeting.


Subjects
Cellular Reprogramming , Endonucleases/metabolism , Endoribonucleases/metabolism , Gene Targeting , Genomic Instability , Induced Pluripotent Stem Cells/cytology , Alleles , Animals , Cell Differentiation , Chromosomes/ultrastructure , Comparative Genomic Hybridization , DNA Copy Number Variations , Erythroblasts/cytology , Exome , Gene Deletion , Genetic Variation , Humans , Mice , Mutation , Oligonucleotide Array Sequence Analysis , Zinc Fingers/genetics , beta-Globins/genetics , beta-Thalassemia/genetics
8.
Mol Genet Metab ; 118(2): 92-9, 2016 Jun.
Article in English | MEDLINE | ID: mdl-27142465

ABSTRACT

Sialuria, a rare inborn error of metabolism, was diagnosed in a healthy 12-year-old boy through whole exome sequencing. The patient had experienced mild delays of speech and motor development, as well as persistent hepatomegaly. Identification of the 8th individual with this disorder prompted follow-up of the mother-son pair of patients diagnosed over 15 years ago. Hepatomegaly was confirmed in the now 19-year-old son, but in the 46-year-old mother a clinically silent liver tumor was detected by ultrasound and MRI. The tumor was characterized as an intrahepatic cholangiocarcinoma (IHCC), and DNA analysis of both tumor and normal liver tissue confirmed the original GNE mutation. As the maternal grandmother in the latter family died at age 49 years of a liver tumor, a retrospective study of the remaining pathology slides was conducted and confirmed it to have been an IHCC as well. The overall observation generated the hypothesis that sialuria may predispose to development of this form of liver cancer. As proof of sialuria in the grandmother could not be obtained, an alternate cause of IHCC cannot be ruled out. In a series of 102 patients with IHCC, not a single instance was found with the allosteric site mutation in the GNE gene. This confirms that sialuria is rare even in a selected group of patients, but does not invalidate the concern that sialuria may be a risk factor for IHCC. SYNOPSIS: Sialuria is a rare inborn error of metabolism characterized by excessive synthesis and urinary excretion of free sialic acid with only minimal clinical morbidity in early childhood, but may be a risk factor for intrahepatic cholangiocarcinoma in adulthood.


Subjects
Bile Duct Neoplasms/genetics , Cholangiocarcinoma/genetics , Liver Neoplasms/genetics , Rare Diseases/genetics , Sialic Acid Storage Disease/genetics , Bile Duct Neoplasms/diagnosis , Bile Duct Neoplasms/surgery , Child , Cholangiocarcinoma/diagnosis , Cholangiocarcinoma/surgery , Female , Hepatomegaly/diagnosis , Heterozygote , Humans , Liver/pathology , Liver Neoplasms/diagnosis , Liver Neoplasms/surgery , Male , Middle Aged , N-Acetylneuraminic Acid/biosynthesis , N-Acetylneuraminic Acid/urine , Rare Diseases/diagnosis , Retrospective Studies , Risk Factors , Sialic Acid Storage Disease/diagnosis , Exome Sequencing , Young Adult
9.
BMC Genomics ; 16: 1039, 2015 Dec 09.
Article in English | MEDLINE | ID: mdl-26645802

ABSTRACT

BACKGROUND: The large and complex hexaploid genome has greatly hindered genomics studies of common wheat (Triticum aestivum, AABBDD). Here, we investigated transcripts in developing caryopses of common wheat using the emerging single-molecule real-time (SMRT) sequencing technology PacBio RSII, and assessed the resultant data for improving common wheat genome annotation and grain transcriptome research. RESULTS: We obtained 197,709 full-length non-chimeric (FLNC) reads, 74.6% of which were estimated to carry a complete open reading frame. A total of 91,881 high-quality FLNC reads were identified and mapped to 16,188 chromosomal loci, corresponding to 13,162 known genes and 3026 new genes not annotated previously. Although some FLNC reads could not be unambiguously mapped to the current draft genome sequence, many of them are likely useful for studying highly similar homoeologous or paralogous loci or for improving chromosomal contig assembly in further research. The 91,881 high-quality FLNC reads represented 22,768 unique transcripts, 9591 of which were newly discovered. We found 180 transcripts each spanning two or three previously annotated adjacent loci, suggesting that these loci should be merged to form correct gene models. Finally, our data facilitated the identification of 6030 genes differentially regulated during caryopsis development, and of full-length transcripts for 72 transcribed gluten gene members that are important for the end-use quality control of common wheat. CONCLUSIONS: Our work demonstrated the value of PacBio transcript sequencing for improving common wheat genome annotation by uncovering loci and full-length transcripts not discovered previously. The resource obtained may aid further structural genomics and grain transcriptome studies of common wheat.


Subjects
Edible Grain/genetics , Plant Genome , Genomics , High-Throughput Nucleotide Sequencing , Transcriptome , Triticum/genetics , Computational Biology/methods , Contig Mapping , Developmental Gene Expression Regulation , Plant Gene Expression Regulation , Genetic Loci , Genomics/methods , Glutens/genetics , Open Reading Frames
10.
BMC Genomics ; 16: 210, 2015 Mar 18.
Article in English | MEDLINE | ID: mdl-25880765

ABSTRACT

BACKGROUND: Whole and partial chromosome losses or gains and structural chromosome changes are hallmarks of human tumors. Guanine-rich DNA, which has a potential to form a G-quadruplex (G4) structure, is particularly vulnerable to changes. In Caenorhabditis elegans, faithful transmission of G-rich DNA is ensured by the DOG-1/FANCJ deadbox helicase. RESULTS: To identify a spectrum of mutations, after long-term propagation, we combined whole genome sequencing (WGS) and oligonucleotide array Comparative Genomic Hybridization (oaCGH) analysis of a C. elegans strain that was propagated, in the absence of DOG-1 and MDF-1/MAD1, for a total of 470 generations, with samples taken for long term storage (by freezing) in generations 170 and 270. We compared the genomes of F170 and F470 strains and identified 94 substitutions, 17 InDels, 3 duplications, and 139 deletions larger than 20 bp. These homozygous variants were predicted to impact 101 protein-coding genes. Phenotypic analysis of this strain revealed remarkable fitness recovery indicating that mutations, which have accumulated in the strain, are not only tolerated but also cooperate to achieve long-term population survival in the absence of DOG-1 and MDF-1. Furthermore, deletions larger than 20 bp were the only variants that frequently occurred in G-rich DNA. We showed that 126 of the possible 954 predicted monoG/C tracts, larger than 14 bp, were deleted in unc-46 mdf-1 such-4; dog-1 F470 (JNC170). CONCLUSIONS: Here, we identified variants that accumulated in C. elegans' genome after long-term propagation in the absence of DOG-1 and MDF-1. We showed that DNA sequences, with G4-forming potential, are vulnerable to deletion-formation in this genetic background.


Subjects
Caenorhabditis elegans Proteins/genetics , Caenorhabditis elegans/genetics , Cell Cycle Proteins/genetics , DNA Helicases/genetics , Genome , Animals , Caenorhabditis elegans/metabolism , Comparative Genomic Hybridization , G-Quadruplexes , High-Throughput Nucleotide Sequencing , Homozygote , Mutation , Phenotype , DNA Sequence Analysis , Sequence Deletion
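The abstract above counts deletions at mono-G/C tracts longer than 14 bp. As a rough sketch of how such tracts could be located in a sequence, the Python snippet below scans for uninterrupted runs of G or C. The function name, the toy sequence, and the use of a regular expression are illustrative assumptions, not the authors' pipeline.

```python
import re

def find_mono_gc_tracts(sequence, min_length=15):
    """Return (start, end, base) for each run of G or C of at least min_length.

    Coordinates are 0-based and end-exclusive. A run of C on the given
    strand corresponds to a run of G on the complementary strand.
    """
    pattern = re.compile(r"G{%d,}|C{%d,}" % (min_length, min_length))
    return [(m.start(), m.end(), m.group()[0])
            for m in pattern.finditer(sequence.upper())]

# Toy sequence (invented): a 16-bp G tract, a 14-bp C tract (too short
# to be reported), and a 15-bp C tract.
toy = "AT" + "G" * 16 + "TA" + "C" * 14 + "AT" + "C" * 15 + "TT"
tracts = find_mono_gc_tracts(toy)
print(tracts)
```

The default `min_length=15` encodes "larger than 14 bp"; a different threshold can be passed when another tract definition is needed.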
11.
Genome Res ; 22(8): 1567-80, 2012 Aug.
Article in English | MEDLINE | ID: mdl-22772596

ABSTRACT

Curation of a high-quality gene set is the critical first step in genome research, enabling subsequent analyses such as ortholog assignment, cis-regulatory element finding, and synteny detection. In this project, we have reannotated the genome of Caenorhabditis briggsae, the best studied sister species of the model organism Caenorhabditis elegans. First, we applied a homology-based gene predictor genBlastG to annotate the C. briggsae genome. We then validated and further improved the C. briggsae gene annotation through RNA-seq analysis of the C. briggsae transcriptome, which resulted in the first validated C. briggsae gene set (23,159 genes), among which 7347 genes (33.9% of all genes with introns) have all of their introns confirmed. Most genes (14,812, or 68.3%) have at least one intron validated, compared with only 3.9% in the most recent WormBase release (WS228). Of all introns in the revised gene set (103,083), 61,503 (60.1%) have been confirmed. Additionally, we have identified numerous trans-splicing leaders (SL1 and SL2 variants) in C. briggsae, leading to the first genome-wide annotation of operons in C. briggsae (1105 operons). The majority of the annotated operons (564, or 51.0%) are perfectly conserved in C. elegans, with an additional 345 operons (or 31.2%) somewhat divergent. Additionally, RNA-seq analysis revealed over 10 thousand small-size assembly errors in the current C. briggsae reference genome that can be readily corrected. The revised C. briggsae genome annotation represents a solid platform for comparative genomics analysis and evolutionary studies of Caenorhabditis species.


Subjects
Caenorhabditis/genetics , Helminth Genome , Molecular Sequence Annotation/methods , RNA Sequence Analysis/methods , Transcriptome , Alternative Splicing , Animals , Base Sequence , Conserved Sequence , Molecular Evolution , Gene Expression Profiling/methods , Introns , Genetic Models , Operon , RNA Splice Sites , Spliced Leader RNA/genetics , Spliced Leader RNA/metabolism , Sequence Alignment/methods , Synteny , Trans-Splicing
12.
BMC Genomics ; 15: 255, 2014 Apr 02.
Article in English | MEDLINE | ID: mdl-24694239

ABSTRACT

BACKGROUND: Increasing genetic and phenotypic differences found among natural isolates of C. elegans have encouraged researchers to explore the natural variation of this nematode species. RESULTS: Here we report on the identification of genomic differences between the reference strain N2 and the Hawaiian strain CB4856, one of the most genetically distant strains from N2. To identify both small- and large-scale genomic variations (GVs), we have sequenced the CB4856 genome using both Roche 454 (~400 bp single reads) and Illumina GA DNA sequencing methods (101 bp paired-end reads). Compared to previously described variants (available in WormBase), our effort uncovered twice as many single nucleotide variants (SNVs) and increased the number of small InDels almost 20-fold. Moreover, we identified and validated large insertions in the CB4856 strain, most of which range from 150 bp to 1.2 kb in length. Identified GVs had a widespread impact on protein-coding sequences, including 585 single-copy genes that have associated severe phenotypes of reduced viability in RNAi and genetics studies. Sixty of these genes are homologs of human genes associated with diseases. Furthermore, our work confirms previously identified GVs associated with differences in behavioural and biological traits between the N2 and CB4856 strains. CONCLUSIONS: The identified GVs provide a rich resource for future studies that aim to explain the genetic basis for other trait differences between the N2 and CB4856 strains.


Subjects
Caenorhabditis elegans/genetics , Genetic Variation , Helminth Genome , Animals , Base Composition , Caenorhabditis elegans/drug effects , Chromosome Mapping , Codon , Comparative Genomic Hybridization , Computational Biology , DNA Transposable Elements , Drug Resistance/genetics , Genetic Association Studies , Genotype , High-Throughput Nucleotide Sequencing , INDEL Mutation , Multigene Family , Insertional Mutagenesis , Open Reading Frames , Phenotype , Single Nucleotide Polymorphism , Sequence Deletion
13.
BMC Genomics ; 15: 440, 2014 Jun 06.
Article in English | MEDLINE | ID: mdl-24906389

ABSTRACT

BACKGROUND: Evidence based on genomic sequences is urgently needed to confirm the phylogenetic relationship between Mesorhizobium strain MAFF303099 and M. huakuii. To define underlying causes for the rather striking difference in host specificity between M. huakuii strain 7653R and MAFF303099, several probable determinants also require comparison at the genomic level. An improved understanding of mobile genetic elements that can be integrated into the main chromosomes of Mesorhizobium to form genomic islands would enrich our knowledge of how genome dynamics may contribute to Mesorhizobium evolution in general. RESULTS: In this study, we sequenced the complete genome of 7653R and compared it with five other Mesorhizobium genomes. Genomes of 7653R and MAFF303099 were found to share a large set of orthologs and, most importantly, a conserved chromosomal backbone and even larger perfectly conserved synteny blocks. We also identified candidate molecular differences responsible for the different host specificities of these two strains. Finally, we reconstructed an ancestral Mesorhizobium genomic island that has evolved into diverse forms in different Mesorhizobium species. CONCLUSIONS: Our ortholog and synteny analyses firmly establish MAFF303099 as a strain of M. huakuii. Differences in nodulation factors and secretion systems T3SS, T4SS, and T6SS may be responsible for the unique host specificities of 7653R and MAFF303099 strains. The plasmids of 7653R may have arisen by excision of the original genomic island from the 7653R chromosome.


Subjects
Bacterial Genome , Mesorhizobium/genetics , Molecular Evolution , Host Specificity , Mesorhizobium/classification , Mesorhizobium/physiology , Molecular Sequence Data , Phylogeny , Plant Physiological Phenomena , Plants/microbiology , DNA Sequence Analysis , Symbiosis
14.
Nucleic Acids Res ; 40(1): 53-64, 2012 Jan.
Article in English | MEDLINE | ID: mdl-21908398

ABSTRACT

In humans, mutations of a growing list of regulatory factor X (RFX) target genes have been associated with devastating genetic diseases, including ciliopathies. However, the mechanisms underlying gene expression regulation mediated by RFX transcription factors (TFs), especially differential gene expression regulation, are largely unknown. In this study, we explore the functional significance of the co-existence of multiple X-box motifs in regulating differential gene expression in Caenorhabditis elegans. We hypothesize that the effect of multiple X-box motifs is not a simple summation of the binding effects of individual X-box motifs located within the same gene. To test this hypothesis, we identified eight C. elegans genes that contain two or more X-box motifs using comparative genomics. We examined one of these genes, F25B4.2, which contains two 15-bp X-box motifs. F25B4.2 expression in ciliated neurons is driven by the proximal motif, and its expression is repressed by the distal motif. Our data suggest that the two X-box motifs cooperate to regulate the location and intensity of F25B4.2 expression. We propose that multiple X-box motifs might be required to tune specific expression levels. Our identification of genes with multiple X-box motifs will also improve our understanding of RFX/DAF-19-mediated regulation in C. elegans and in other organisms, including humans.


Subjects
Caenorhabditis elegans Proteins/genetics , Caenorhabditis elegans/genetics , Gene Expression Regulation , Nerve Tissue Proteins/genetics , Genetic Promoter Regions , Transcription Factors/metabolism , Animals , Binding Sites , Caenorhabditis/genetics , Caenorhabditis elegans/metabolism , Caenorhabditis elegans Proteins/metabolism , Helminth Genes , Genomics , Nerve Tissue Proteins/metabolism , Neurons/metabolism , Nucleotide Motifs
15.
Sci Data ; 11(1): 403, 2024 Apr 20.
Article in English | MEDLINE | ID: mdl-38643276

ABSTRACT

Skeletonema tropicum is a marine diatom of the genus Skeletonema, which also includes many well-known species such as S. marinoi. S. tropicum is a high-temperature-preferring species that thrives in tropical ocean regions, or in temperate ocean regions during summer-autumn. However, the mechanisms of ecological adaptation of S. tropicum remain poorly understood, due in part to the lack of a high-quality whole genome assembly. Here, we report the first high-quality chromosome-scale genome assembly for S. tropicum, using cutting-edge technologies including PacBio single-molecule sequencing and high-throughput chromatin conformation capture. The assembled genome has a size of 78.78 Mb with a scaffold N50 of 3.17 Mb, anchored to 23 pseudo-chromosomes. In total, 20,613 protein-coding genes were predicted, of which 17,757 (86.14%) were functionally annotated. Collinearity analysis of the genomes of S. tropicum and S. marinoi revealed that these two genomes were highly homologous. This chromosome-level genome assembly of S. tropicum provides a valuable genomic platform for comparative analysis of mechanisms of ecological adaptation.


Subjects
Diatoms , Genome , Chromatin , Chromosomes , Diatoms/genetics , Genomics , Phylogeny
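The assembly statistics quoted in the abstract above (scaffold N50, fraction of functionally annotated genes) follow standard definitions. The Python sketch below shows how they are typically computed. The scaffold lengths are invented for illustration and do not come from the S. tropicum assembly; the gene counts are those reported in the abstract.

```python
def n50(lengths):
    """Return the N50: the largest length L such that sequences of
    length >= L together cover at least half of the total assembly."""
    if not lengths:
        raise ValueError("no sequence lengths given")
    half_total = sum(lengths) / 2
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half_total:
            return length

# Hypothetical scaffold lengths (arbitrary units, for illustration only).
scaffolds = [8, 8, 4, 3, 3, 2, 2, 2]
print(n50(scaffolds))

# Gene counts as reported in the abstract.
annotated, predicted = 17_757, 20_613
print(f"{100 * annotated / predicted:.2f}%")
```

With these toy lengths the total is 32, so the two longest scaffolds (8 + 8 = 16) already reach half of the assembly and the N50 is 8.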
16.
Harmful Algae ; 132: 102568, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38331542

ABSTRACT

The application of high-throughput sequencing (HTS) technologies has revolutionized research on phytoplankton biodiversity by generating an unprecedented amount of molecular data in marine ecosystem surveys. However, the high level of molecular diversity uncovered in HTS-based metabarcoding analyses may lead to overinterpretation of phytoplankton diversity due to excessive intra-genomic variations (IGVs). This study aims to explore the nature of phytoplankton molecular diversity and to test this hypothesis. We carried out single-cell metabarcoding analysis of 18S rDNA V4 sequences obtained from single Noctiluca scintillans cells isolated from various sites in coastal waters of China. Results showed that each single N. scintillans cell harbored a high level of IGVs, with about 100 amplicon sequence variants (ASVs). The large numbers of non-dominant ASVs identified in N. scintillans cells, which might correspond to the larger numbers of ASVs annotated as N. scintillans and showing similar temporal dynamics in metabarcoding analyses, could inflate the inter-species diversity or intra-species genetic diversity. In addition, there were large numbers of additional ASVs that were not annotated as N. scintillans. These non-N. scintillans ASVs might represent diverse prey of N. scintillans, consistent with previous reports that N. scintillans may act as a chance predator of a broad spectrum of prey. This single-cell study has unambiguously demonstrated the existence of high levels of IGVs in N. scintillans and most likely many other phytoplankton species, indicating that the majority of the molecular diversity revealed in metabarcoding analysis, which has generally been interpreted as the sum of inter-species diversity and intra-species diversity, actually includes high levels of IGVs and should be interpreted with caution.


Subjects
Dinoflagellida , Ecosystem , Ribosomal DNA/genetics , Dinoflagellida/genetics , Phytoplankton/genetics , Genomics
17.
ISME Commun ; 4(1): ycad009, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38313810

ABSTRACT

Thalassiosira is a species-rich genus in Bacillariophyta that not only contributes positively as a primary producer, but also has negative impacts on ecosystems by causing harmful algal blooms. Although taxonomical studies have identified a large number of Thalassiosira species, the composition of Thalassiosira species and their geographical distribution in marine ecosystems are not well understood, primarily because of the limited resolution of the morphology-based approaches used previously in ecological expeditions. In this study, we systematically analyzed the composition and spatial-temporal dynamic distributions of Thalassiosira in the model marine ecosystem Jiaozhou Bay by applying metabarcoding analysis. Through analyzing samples collected monthly from 12 sampling sites, 14 Thalassiosira species were identified, including five species that were not previously reported in Jiaozhou Bay, demonstrating the resolution and effectiveness of metabarcoding analysis in ecological research. Many Thalassiosira species showed prominent temporal preferences in Jiaozhou Bay: some, represented by Thalassiosira tenera, displayed a spring-winter preference, while others, represented by Thalassiosira lundiana and Thalassiosira minuscula, displayed a summer-autumn preference, indicating that temperature is an important driving factor in the temporal dynamics. The application of metabarcoding analysis, equipped with appropriate molecular markers of high resolution and high specificity and with databases of reference molecular marker sequences for potentially all Thalassiosira species, will revolutionize ecological research on Thalassiosira species in Jiaozhou Bay and other marine ecosystems.

18.
Mar Pollut Bull ; 201: 116198, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38428045

ABSTRACT

Metabarcoding analysis is an effective technique for monitoring the domoic acid-producing Pseudo-nitzschia species in marine environments, uncovering high levels of molecular diversity. However, such efforts may result in the overinterpretation of Pseudo-nitzschia species diversity, as molecular diversity encompasses not only interspecies and intraspecies diversity but also extensive intragenomic variations (IGVs). In this study, we analyzed the V4 region of the 18S rDNA of 30 strains of Pseudo-nitzschia multistriata collected from the coasts of China. The results showed that each P. multistriata strain harbored about a hundred unique 18S rDNA V4 sequence variants, each represented by a unique amplicon sequence variant (ASV). This study demonstrated the extensive degree of IGVs in P. multistriata strains, suggesting that IGVs may also be present in other Pseudo-nitzschia species and other phytoplankton species. Understanding the scope and levels of IGVs is crucial for accurately interpreting the results of metabarcoding analysis.


Subjects
Diatoms , Diatoms/genetics , Ribosomal DNA , Phytoplankton/genetics , High-Throughput Nucleotide Sequencing , China
19.
Infect Dis Poverty ; 13(1): 19, 2024 Feb 27.
Article in English | MEDLINE | ID: mdl-38414088

ABSTRACT

BACKGROUND: Schistosoma japonicum is a parasitic flatworm that causes human schistosomiasis, a significant cause of morbidity in China, the Philippines and Indonesia. Oncomelania hupensis (Gastropoda: Pomatiopsidae) is the unique intermediate host of S. japonicum. A complete genome sequence of O. hupensis will enable a fundamental understanding of snail biology as well as of its co-evolution with the S. japonicum parasite. Assembling a high-quality reference genome of O. hupensis will provide data for further research on snail biology and on controlling the spread of S. japonicum. METHODS: The draft genome was assembled de novo using long-read sequencing technology (PacBio Sequel II) and corrected with Illumina sequencing data. Then, using Hi-C sequencing data, the genome was assembled at the chromosomal level. CAFE was used to analyze the contraction and expansion of gene families, and the CodeML module in PAML was used for positive selection analysis of protein-coding sequences. RESULTS: We obtained a high-quality O. hupensis genome with a total length of 1.46 Gb and 17 unique full-length chromosomes (2n = 34), with a contig N50 of 1.35 Mb and a scaffold N50 of 75.08 Mb. Additionally, 95.03% of the contig sequences were anchored to the 17 chromosomes. After scanning the assembled genome, a total of 30,604 protein-coding genes were predicted. Among them, 86.67% were functionally annotated. Further phylogenetic analysis revealed that O. hupensis separated from a common ancestor of Pomacea canaliculata and Bellamya purificata approximately 170 million years ago. Compared with its most recent common ancestor, the O. hupensis genome showed 266 significantly expanded and 58 significantly contracted gene families (P < 0.05). Functional enrichment of the expanded gene families indicated that they were mainly involved with intracellular, DNA-mediated transposition, DNA integration and transposase activity.
CONCLUSIONS: Through the integrated use of multiple sequencing technologies, we have successfully constructed a chromosome-level genome of O. hupensis. These data will not only provide comprehensive genomic information, but also benefit future work on the population genetics of this snail as well as evolutionary studies of S. japonicum and its snail host.
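
The contig N50 and scaffold N50 quoted in the abstract above are standard contiguity statistics: the N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. The following is a minimal illustrative sketch of that definition, not the authors' pipeline; the function name `n50` and the toy contig lengths are assumptions made for the example.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Hypothetical helper for illustration only: sort lengths from longest
    to shortest and return the length at which the running total first
    reaches half of the summed length.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy contig lengths (arbitrary units), invented for the example.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))
```

Applied separately to contig lengths and to scaffold lengths, the same calculation gives the two kinds of N50 value; a larger N50 indicates a more contiguous assembly.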


Subjects
Gastropoda , Schistosoma japonicum , Animals , Humans , Schistosoma japonicum/genetics , Phylogeny , Gastropoda/genetics , Chromosomes/genetics , DNA , China
20.
J Biol Chem ; 287(46): 38980-91, 2012 Nov 09.
Article in English | MEDLINE | ID: mdl-22988242

ABSTRACT

CTP:phosphocholine cytidylyltransferase (CCT), an amphitropic enzyme that regulates phosphatidylcholine synthesis, is composed of a catalytic head domain and a regulatory tail. The tail region has dual functions as a regulator of membrane binding/enzyme activation and as an inhibitor of catalysis in the unbound form of the enzyme, suggesting conformational plasticity. These functions are well conserved in CCTs across diverse phyla, although the sequences of the tail regions are not. CCT regulatory tails of diverse origins are composed of a long membrane lipid-inducible amphipathic helix (m-AH) followed by a highly disordered segment, reminiscent of the Parkinson disease-linked protein, α-synuclein, which we show shares a novel sequence motif with vertebrate CCTs. To unravel features required for silencing, we created chimeric enzymes by fusing the catalytic domain of rat CCTα to the regulatory tail of CCTs from Drosophila, Caenorhabditis elegans, or Saccharomyces cerevisiae or to α-synuclein. Only the tail domains of the two invertebrate CCTs were competent for both suppression of catalytic activity and for activation by lipid vesicles. Thus, both silencing and activating functions of the m-AH can tolerate significant changes in length and sequence. We identified a highly amphipathic 22-residue segment in the m-AH with features conserved among animal CCTs but not yeast CCT or α-synuclein. Deletion of this segment from rat CCT increased the lipid-independent V(max) by 10-fold, equivalent to the effect of deleting the entire tail, and severely weakened membrane binding affinity. However, membrane binding was required for additional increases in catalytic efficiency. Thus, full activation of CCT may require not only loss of a silencing conformation in the m-AH but a gain of an activating conformation, promoted by membrane binding.


Subjects
Choline-Phosphate Cytidylyltransferase/physiology , Cytidine Triphosphate/chemistry , Amino Acid Motifs , Amino Acid Sequence , Animals , Catalysis , Catalytic Domain , Choline-Phosphate Cytidylyltransferase/chemistry , Computational Biology/methods , Enzyme Activation , Gene Silencing , Kinetics , Lipids/chemistry , Molecular Sequence Data , Phosphatidylcholines/chemistry , Protein Conformation , Secondary Protein Structure , Tertiary Protein Structure , Rats , Amino Acid Sequence Homology , alpha-Synuclein/chemistry