Results 1 - 20 of 29
1.
Breed Sci ; 71(2): 125-133, 2021 Apr.
Article in English | MEDLINE | ID: mdl-34377060

ABSTRACT

Understanding genetic diversity among local populations is a primary goal of modern crop breeding programs. Here, we characterized the genetic relationships among rice varieties in Hokkaido, Japan, one of the northern limits of rice cultivation worldwide, and used genome sequences to characterize artificial selection during rice breeding programs. We used 8,565 single nucleotide polymorphism and insertion/deletion markers distributed across the genome, obtained by genotyping-by-sequencing, for genetic diversity analyses. Phylogenetic, population structure, and principal component analyses classified a total of 110 varieties into four distinct clusters corresponding to geographically and historically different populations. Furthermore, genome sequences of 19 rice varieties that are historically representative of Hokkaido, together with nucleotide diversity and FST values in each cluster, revealed that artificial selection of elite phenotypes focused on particular chromosomal regions. These results clearly trace the history of selection on agronomic traits in the genome sequences of current rice varieties from Hokkaido.
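
The abstract above reports nucleotide diversity and FST per cluster but does not say which estimators were used. As a minimal sketch, assuming short aligned sequences and a simple pairwise-difference definition of both quantities (the function names and the toy sequences are illustrative, not taken from the study), the two statistics could be computed as follows:

```python
from itertools import combinations


def pairwise_diff(a, b):
    """Proportion of aligned sites at which two sequences differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def nucleotide_diversity(seqs):
    """Mean pairwise difference per site within one group (pi)."""
    pairs = list(combinations(seqs, 2))
    return sum(pairwise_diff(a, b) for a, b in pairs) / len(pairs)


def between_group_diversity(group1, group2):
    """Mean pairwise difference per site between two groups."""
    diffs = [pairwise_diff(a, b) for a in group1 for b in group2]
    return sum(diffs) / len(diffs)


def fst(group1, group2):
    """Assumed estimator: FST = (pi_between - pi_within) / pi_between."""
    pi_within = (nucleotide_diversity(group1) + nucleotide_diversity(group2)) / 2
    pi_between = between_group_diversity(group1, group2)
    return (pi_between - pi_within) / pi_between


# Toy aligned sequences for two hypothetical clusters (not real rice data).
cluster_a = ["AAAA", "AAAT", "AAAA"]
cluster_b = ["TTAA", "TTAT", "TTAA"]

print(nucleotide_diversity(cluster_a))
print(fst(cluster_a, cluster_b))
```

Under this definition, low within-cluster diversity combined with high FST in a chromosomal window is the kind of signal usually read as a candidate for selection; real analyses work in sliding windows over many more sites.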

2.
Chembiochem ; 20(16): 2054-2058, 2019 08 16.
Article in English | MEDLINE | ID: mdl-31269328

ABSTRACT

Endomorphins are neuropeptides that bind strongly to µ-opioid receptors and are considered to play important roles in pain modulation and other biological functions. Two endomorphins have been identified to date, endomorphin-1 and -2; both are tetrapeptides and differ by only a single amino acid in the third position. Both peptides were isolated from bovine brains; however, their precursor genes have not been identified. In this study, a nucleotide sequence corresponding to the endomorphin-1 peptide in an expressed sequence tag database has been found and a preproendomorphin-like precursor peptide from human brain complementary DNA (cDNA) has been cloned. The cDNA consists of nucleotide sequences of two already annotated predicted genes, and the putative peptide differs by one amino acid from the isolated endomorphin peptides. It is proposed herein that there is the possibility of unknown short proteins or peptide precursors being missed by automated gene prediction programs based on similarities of known protein sequences. A novel concept of how to produce endomorphins from a similar peptide is described. The oxidatively modified base might provide a clue for understanding discrepancies between nucleotide sequences on the genome and those on cDNAs.


Subjects
Oligopeptides/biosynthesis; Receptors, Opioid, mu/genetics; Algorithms; Animals; Cattle; Oligopeptides/genetics; Oligopeptides/isolation & purification
4.
J Virol ; 89(12): 6209-17, 2015 Jun.
Article in English | MEDLINE | ID: mdl-25833048

ABSTRACT

Human mastadenovirus D (HAdV-D) is exceptionally rich in types among the seven human adenovirus species. This feature is attributed to frequent intertypic recombination events that have reshuffled orthologous genomic regions between different HAdV-D types. However, this trend appears to be paradoxical, as it has been demonstrated that the replacement of some of the interacting proteins for a specific function with other orthologues causes malfunction, indicating that intertypic recombination events may be deleterious. In order to understand why the paradoxical trend has been possible in HAdV-D evolution, we conducted an interregional coevolution analysis between different genomic regions of 45 different HAdV-D types and found that ca. 70% of the genome has coevolved, even though these are fragmented into several pieces via short intertypic recombination hot spot regions. Since it is statistically and biologically unlikely that all of the coevolving fragments have synchronously recombined between different genomes, it is probable that these regions have stayed in their original genomes during evolution as a platform for frequent intertypic recombination events in limited regions. It is also unlikely that the same genomic regions have remained almost untouched during frequent recombination events, independently, in all different types, by chance. In addition, the coevolving regions contain the coding regions of physically interacting proteins for important functions. Therefore, the coevolution of these regions should be attributed at least in part to natural selection due to common biological constraints operating on all types, including protein-protein interactions for essential functions. Our results predict additional unknown protein interactions.
IMPORTANCE: Human mastadenovirus D, an exceptionally type-rich human adenovirus species and causative agent of different diseases in a wide variety of tissues, including those of the ocular region and digestive tract, as well as an opportunistic infection in immunocompromised patients, is known to have highly diverged through frequent intertypic recombination events; however, it has also been demonstrated that the replacement of a component protein of a multiprotein system with a homologous protein causes malfunction. The present study solved this apparent paradox by looking at which genomic parts have coevolved using a newly developed method. The results revealed that intertypic recombination events have occurred in limited genomic regions and been avoided in the genomic regions encoding proteins that physically interact for a given function. This approach detects purifying selection against recombination events causing the replacement of partial components of multiprotein systems and therefore predicts physical and functional interactions between different proteins and/or genomic elements.


Subjects
Evolution, Molecular; Genetic Variation; Genome, Viral; Mastadenovirus/classification; Mastadenovirus/genetics; Humans; Protein Binding; Recombination, Genetic; Selection, Genetic; Viral Proteins/genetics; Viral Proteins/metabolism
5.
Breed Sci ; 65(5): 403-10, 2015 Dec.
Article in English | MEDLINE | ID: mdl-26719743

ABSTRACT

Plant breeding programs in local regions may generate genetic variations that are desirable to local populations and shape adaptability during the establishment of local populations. To elucidate genetic bases for this process, we proposed a new approach for identifying the genetic bases for the traits improved during rice breeding programs; association mapping focusing on a local population. In the present study, we performed association mapping focusing on a local rice population, consisting of 63 varieties, in Hokkaido, the northernmost region of Japan and one of the northern limits of rice cultivation worldwide. Six and seventeen QTLs were identified for heading date and low temperature germinability, respectively. Of these, 13 were novel QTLs in this population and 10 corresponded to the QTLs previously reported based on QTL mapping. The identification of QTLs for traits in local populations including elite varieties may lead to a better understanding of the genetic bases of elite traits. This is of direct relevance for plant breeding programs in local regions.

6.
Plant J ; 72(5): 817-28, 2012 Dec.
Article in English | MEDLINE | ID: mdl-22900922

ABSTRACT

In plant genomes, the incorporation of DNA segments is not a common method of artificial gene transfer. Nevertheless, various segments of pararetroviruses have been found in plant genomes in recent decades. The rice genome contains a number of segments of endogenous rice tungro bacilliform virus-like sequences (ERTBVs), many of which are present between AT dinucleotide repeats (ATrs). Comparison of genomic sequences between two closely related rice subspecies, japonica and indica, allowed us to verify the preferential insertion of ERTBVs into ATrs. In addition to ERTBVs, the comparative analyses showed that ATrs occasionally incorporate repeat sequences including transposable elements, and a wide range of other sequences. Besides the known genomic sequences, the insertion sequences also represented DNAs of unclear origins together with ERTBVs, suggesting that ATrs have integrated episomal DNAs that would have been suspended in the nucleus. Such insertion DNAs might be trapped by ATrs in the genome in a host-dependent manner. Conversely, other simple mono- and dinucleotide sequence repeats (SSR) were less frequently involved in insertion events relative to ATrs. Therefore, ATrs could be regarded as hot spots of double-strand breaks that induce non-homologous end joining. The insertions within ATrs occasionally generated new gene-related sequences or involved structural modifications of existing genes. Likewise, in a comparison between Arabidopsis thaliana and Arabidopsis lyrata, the insertions preferred ATrs to other SSRs. Therefore ATrs in plant genomes could be considered as genomic dumping sites that have trapped various DNA molecules and may have exerted a powerful evolutionary force.


Subjects
AT Rich Sequence; DNA Transposable Elements; Genome, Plant; Oryza/genetics; Tungrovirus/genetics; Arabidopsis/genetics; Base Sequence; Dinucleotide Repeats; Molecular Sequence Data
7.
BMC Res Notes ; 16(1): 222, 2023 Sep 19.
Article in English | MEDLINE | ID: mdl-37726849

ABSTRACT

OBJECTIVE: Diversification of cell types and changes in epigenetic states during cell differentiation processes are important for understanding development. Recently, phylogenetic analysis using DNA methylation and histone modification information has been shown useful for inferring these processes. The purpose of this study was to examine whether chromatin accessibility data can help infer these processes in murine hematopoiesis. RESULTS: Chromatin accessibility data could partially infer the hematopoietic differentiation hierarchy. Furthermore, based on the ancestral state estimation of internal nodes, the open/closed chromatin states of differentiating progenitor cells could be predicted with a specificity of 0.86-0.99 and sensitivity of 0.29-0.72. These results suggest that the phylogenetic analysis of chromatin accessibility could offer important information on cell differentiation, particularly for organisms from which progenitor cells are difficult to obtain.


Subjects
Chromatin; Hematopoiesis; Animals; Mice; Chromatin/genetics; Phylogeny; Hematopoiesis/genetics; Cell Differentiation/genetics; DNA Methylation
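
The abstract of record 7 evaluates predicted open/closed chromatin states by specificity and sensitivity. A minimal sketch of how those two measures are derived from predicted and observed binary states is shown below; the function name and the toy state vectors are illustrative assumptions, not data from the study.

```python
def sensitivity_specificity(predicted, observed):
    """Return (sensitivity, specificity) for binary states (1 = open, 0 = closed)."""
    pairs = list(zip(predicted, observed))
    tp = sum(1 for p, o in pairs if p == 1 and o == 1)  # open predicted as open
    fn = sum(1 for p, o in pairs if p == 0 and o == 1)  # open predicted as closed
    tn = sum(1 for p, o in pairs if p == 0 and o == 0)  # closed predicted as closed
    fp = sum(1 for p, o in pairs if p == 1 and o == 0)  # closed predicted as open
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical chromatin states at eight regions of one progenitor cell type.
predicted_states = [1, 0, 0, 1, 0, 0, 1, 0]
observed_states = [1, 1, 0, 1, 0, 0, 0, 0]

sens, spec = sensitivity_specificity(predicted_states, observed_states)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

A pattern like the one reported (high specificity, lower sensitivity) means that regions predicted as open are usually truly open, while a substantial share of truly open regions is missed.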
8.
Front Plant Sci ; 14: 1261705, 2023.
Article in English | MEDLINE | ID: mdl-37965031

ABSTRACT

Introduction: Rice genomes contain endogenous viral elements homologous to rice tungro bacilliform virus (RTBV) from the pararetrovirus family Caulimoviridae. These viral elements, known as endogenous RTBV-like sequences (eRTBVLs), comprise five subfamilies, eRTBVL-A, -B, -C, -D, and -X. Four subfamilies (A, B, C, and X) are present to a limited degree in the genomes of the Asian cultivated rice Oryza sativa (spp. japonica and indica) and the closely related wild species Oryza rufipogon. Methods: The eRTBVL-D sequences are widely distributed within these and other Oryza AA-genome species. Fifteen eRTBVL-D segments identified in the japonica (Nipponbare) genome occur mostly at orthologous chromosomal positions in other AA-genome species. The eRTBVL-D sequences were inserted into the genomes just before speciation of the AA-genome species. Results and discussion: Ten eRTBVL-D segments are located at six loci, which were used for our evolutionary analyses during the speciation of the AA-genome species. The degree of genetic differentiation varied among the eRTBVL-D segments. Of the six loci, three showed phylogenetic trees consistent with the standard speciation pattern (SSP) of the AA-genome species (Type A), and the other three represented phylogenies different from the SSP (Type B). The atypical phylogenetic trees for the Type B loci revealed chromosome region-specific evolution among the AA-genome species that is associated with phylogenetic incongruences: complex genome rearrangements between eRTBVL-D segments, an introgression between the distant species, and low genetic diversity of a shared eRTBVL-D segment. Using eRTBVL-D as an indicator, this study revealed the phylogenetic incongruence of local chromosomal regions with different topologies that developed during speciation.

9.
J Gen Virol ; 92(Pt 6): 1251-1259, 2011 Jun.
Article in English | MEDLINE | ID: mdl-21402595

ABSTRACT

Human adenovirus type 53 (HAdV-53) has commonly been detected in samples from epidemic keratoconjunctivitis (EKC) patients in Japan since 1996. HAdV-53 is an intermediate virus, containing hexon-chimeric, penton base and fiber structures similar to HAdV-22 and -37, HAdV-37 and HAdV-8, respectively. HAdV-53-like intermediate strains were first isolated from EKC samples in Japan in the 1980s. Here, the complete genome sequences of three such HAdV-53-like intermediate strains (870006C, 880249C and 890357C) and four HAdV-53 strains were determined, and their relationships were analysed. The seven HAdV strains were classified into three groups, 870006C/880249C, 890357C and the four HAdV-53 strains, on the basis of phylogenetic analyses of the partial and complete genome sequences. HAdV strains within the same group showed the highest nucleotide identities (99.87-100.00 %). Like HAdV-53, the hexon loop 1 and 2 regions of 870006C, 880249C and 890357C showed the highest identity with HAdV-22. However, these strains did not show a hexon-chimeric structure similar to HAdV-22 and -37, or a penton base similar to HAdV-37. The fiber genes of 870006C and 880249C were identical to that of HAdV-37, but not HAdV-8. Thus, the three intermediate HAdVs isolated in the 1980s were similar to each other but not to HAdV-53. The recombination breakpoints were inferred by the Recombination Detection Program (rdp) using whole-genome sequences of these seven HAdV and of 12 HAdV-D strains from GenBank. HAdV-53 may have evolved from intermediate HAdVs circulating in the 1980s, and from HAdV-8, -22 and -37, by recombination of sections cut at the putative breakpoints.


Subjects
Adenovirus Infections, Human/virology; Adenoviruses, Human/genetics; Genome, Viral; Keratoconjunctivitis/virology; Recombination, Genetic; Adenovirus Infections, Human/epidemiology; Adenoviruses, Human/classification; Adenoviruses, Human/isolation & purification; Humans; Japan/epidemiology; Keratoconjunctivitis/epidemiology; Molecular Sequence Data; Phylogeny; Sequence Analysis, DNA
10.
J Clin Microbiol ; 49(2): 484-90, 2011 Feb.
Article in English | MEDLINE | ID: mdl-21147954

ABSTRACT

For 4 months from September 2008, 102 conjunctival swab specimens were collected for surveillance purposes from patients across Japan suspected of having epidemic keratoconjunctivitis (EKC). Human adenovirus (HAdV) DNA was detected in 61 samples by PCR, though the HAdV type for 6 of the PCR-positive samples could not be determined by phylogenetic analysis using a partial hexon gene sequence. Moreover, for 2 months from January 2009, HAdV strains with identical sequences were isolated from five conjunctival swab samples obtained from EKC patients in five different regions of Japan. For the analyses of the 11 samples mentioned above, we determined the nucleotide sequences of the entire penton base, hexon, and fiber genes and early 3 (E3) region, which are variable regions among HAdV types, and compared them to those of other HAdV species D strains. The nucleotide sequences of loops 1 and 2 in the hexons of all 11 samples showed high degrees of identity with those of the HAdV type 15 (HAdV-15) and HAdV-29 prototype strains. However, the fiber gene and E3 region sequences showed high degrees of identity with those of HAdV-9, and the penton base gene sequence showed a high degree of identity with the penton base gene sequences of HAdV-9 and -26. Moreover, the complete genome sequence of the 2307-S strain, which was isolated by viral culture from 1 of the 11 samples, was determined. The 2307-S strain was a recombinant HAdV between HAdV-9, -15, -26, -29, and/or another HAdV type; however, the recombination sites in the genome were not obvious. We propose that this virus is a novel intertypic recombinant, HAdV-15/29/H9, and may be an etiological agent of EKC.


Subjects
Adenoviridae Infections/epidemiology; Adenoviruses, Human/genetics; Adenoviruses, Human/isolation & purification; DNA, Viral/genetics; Genome, Viral; Keratoconjunctivitis/epidemiology; Adenoviridae Infections/virology; Cluster Analysis; DNA, Viral/chemistry; Humans; Japan/epidemiology; Keratoconjunctivitis/virology; Molecular Epidemiology; Molecular Sequence Data; Phylogeny; Sequence Analysis, DNA; Sequence Homology; Viral Proteins/genetics
11.
Nucleic Acids Res ; 36(Database issue): D787-92, 2008 Jan.
Article in English | MEDLINE | ID: mdl-17982176

ABSTRACT

Orthologs are genes in different species that evolved from a common ancestral gene by speciation. Currently, with the rapid growth of transcriptome data of various species, more reliable orthology information is prerequisite for further studies. However, detection of orthologs could be erroneous if pairwise distance-based methods, such as reciprocal BLAST searches, are utilized. Thus, as a sub-database of H-InvDB, an integrated database of annotated human genes (http://h-invitational.jp/), we constructed a fully curated database of evolutionary features of human genes, called 'Evola'. In the process of the ortholog detection, computational analysis based on conserved genome synteny and transcript sequence similarity was followed by manual curation by researchers examining phylogenetic trees. In total, 18 968 human genes have orthologs among 11 vertebrates (chimpanzee, mouse, cow, chicken, zebrafish, etc.), either computationally detected or manually curated orthologs. Evola provides amino acid sequence alignments and phylogenetic trees of orthologs and homologs. In 'd(N)/d(S) view', natural selection on genes can be analyzed between human and other species. In 'Locus maps', all transcript variants and their exon/intron structures can be compared among orthologous gene loci. We expect the Evola to serve as a comprehensive and reliable database to be utilized in comparative analyses for obtaining new knowledge about human genes. Evola is available at http://www.h-invitational.jp/evola/.


Subjects
Databases, Genetic; Genes; Genome, Human; Phylogeny; Animals; Computational Biology; Genomics; Humans; Internet; RNA, Messenger/chemistry; Selection, Genetic; Sequence Alignment; Sequence Analysis, Protein; Synteny
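
The abstract of record 11 notes that pairwise methods such as reciprocal BLAST searches can mis-assign orthologs, which is why Evola adds synteny and manual curation. For orientation only, here is a minimal sketch of the reciprocal-best-hit idea it refers to. It is not the Evola pipeline; the gene identifiers and similarity scores are hypothetical, and no BLAST call is made.

```python
def best_hits(scores):
    """Map each query gene to its single highest-scoring subject gene."""
    return {
        query: max(subjects, key=subjects.get)
        for query, subjects in scores.items()
        if subjects
    }


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs in which each gene is the other's best hit."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


# Hypothetical similarity scores between genes of species A and species B.
a_vs_b = {
    "hA": {"mA": 950, "mB": 400},
    "hB": {"mB": 700, "mA": 300},
    "hC": {"mA": 500},
}
b_vs_a = {
    "mA": {"hA": 940, "hC": 480},
    "mB": {"hB": 690, "hA": 390},
}

print(reciprocal_best_hits(a_vs_b, b_vs_a))
```

In this toy case `hC` is left without a partner because its best hit `mA` prefers `hA`. Cases like this, for example after a gene duplication or loss, are where a score-only method can go wrong and where phylogenetic trees and synteny provide additional evidence.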
12.
Nucleic Acids Res ; 36(Database issue): D1028-33, 2008 Jan.
Article in English | MEDLINE | ID: mdl-18089549

ABSTRACT

The Rice Annotation Project Database (RAP-DB) was created to provide the genome sequence assembly of the International Rice Genome Sequencing Project (IRGSP), manually curated annotation of the sequence, and other genomics information that could be useful for comprehensive understanding of the rice biology. Since the last publication of the RAP-DB, the IRGSP genome has been revised and reassembled. In addition, a large number of rice-expressed sequence tags have been released, and functional genomics resources have been produced worldwide. Thus, we have thoroughly updated our genome annotation by manual curation of all the functional descriptions of rice genes. The latest version of the RAP-DB contains a variety of annotation data as follows: clone positions, structures and functions of 31 439 genes validated by cDNAs, RNA genes detected by massively parallel signature sequencing (MPSS) technology and sequence similarity, flanking sequences of mutant lines, transposable elements, etc. Other annotation data such as Gnomon can be displayed along with those of RAP for comparison. We have also developed a new keyword search system to allow the user to access useful information. The RAP-DB is available at: http://rapdb.dna.affrc.go.jp/ and http://rapdb.lab.nig.ac.jp/.


Subjects
Databases, Nucleic Acid; Genome, Plant; Oryza/genetics; Genes, Plant; Genomics; Internet; MicroRNAs/genetics; RNA, Small Interfering/genetics; User-Computer Interface
13.
Nucleic Acids Res ; 36(Database issue): D793-9, 2008 Jan.
Article in English | MEDLINE | ID: mdl-18089548

ABSTRACT

Here we report the new features and improvements in our latest release of the H-Invitational Database (H-InvDB; http://www.h-invitational.jp/), a comprehensive annotation resource for human genes and transcripts. H-InvDB, originally developed as an integrated database of the human transcriptome based on extensive annotation of large sets of full-length cDNA (FLcDNA) clones, now provides annotation for 120 558 human mRNAs extracted from the International Nucleotide Sequence Databases (INSD), in addition to 54 978 human FLcDNAs, in the latest release H-InvDB_4.6. We mapped those human transcripts onto the human genome sequences (NCBI build 36.1) and determined 34 699 human gene clusters, which could define 34 057 (98.1%) protein-coding and 642 (1.9%) non-protein-coding loci; 858 (2.5%) transcribed loci overlapped with predicted pseudogenes. For all these transcripts and genes, we provide comprehensive annotation including gene structures, gene functions, alternative splicing variants, functional non-protein-coding RNAs, functional domains, predicted sub cellular localizations, metabolic pathways, predictions of protein 3D structure, mapping of SNPs and microsatellite repeat motifs, co-localization with orphan diseases, gene expression profiles, orthologous genes, protein-protein interactions (PPI) and annotation for gene families. The current H-InvDB annotation resources consist of two main views: Transcript view and Locus view and eight sub-databases: the DiseaseInfo Viewer, H-ANGEL, the Clustering Viewer, G-integra, the TOPO Viewer, Evola, the PPI view and the Gene family/group.


Subjects
Databases, Genetic; Genes; RNA, Messenger/chemistry; Animals; Chromosome Mapping; DNA, Complementary/chemistry; Humans; Internet; Proteins/chemistry; Proteins/genetics; Proteins/metabolism; RNA, Messenger/genetics; User-Computer Interface
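
The locus counts quoted in the abstract of record 13 can be checked with simple arithmetic. The sketch below uses only the figures stated in the abstract; the variable names are illustrative.

```python
# Counts quoted in the H-InvDB_4.6 abstract.
gene_clusters = 34_699
protein_coding = 34_057
non_protein_coding = 642
pseudogene_overlap = 858


def percent(part, whole):
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100 * part / whole, 1)


# The two locus classes should partition the gene clusters.
print(protein_coding + non_protein_coding == gene_clusters)
print(percent(protein_coding, gene_clusters))
print(percent(non_protein_coding, gene_clusters))
print(percent(pseudogene_overlap, gene_clusters))
```

The protein-coding and non-protein-coding loci sum to the total number of gene clusters, and the rounded shares agree with the 98.1%, 1.9%, and 2.5% given in the abstract.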
15.
Gene ; 721S: 100021, 2019.
Article in English | MEDLINE | ID: mdl-34530996

ABSTRACT

Revealing the landscape of epigenetic changes in cells during differentiation is important for understanding the development of organisms. In this study, to infer such epigenetic changes during human hematopoiesis, ancestral state estimation based on a phylogenetic tree was applied to map the epigenomic changes in six kinds of histone modifications onto the hierarchical cell differentiation process of hematopoiesis using epigenomes of eight types of differentiated hematopoietic cells. The histone modification changes inferred during hematopoiesis showed that changes that occurred on the branches separating different cell types reflected the characteristics of hematopoiesis in terms of genomic position and gene function. These results suggested that ancestral state estimation based on phylogenetic analysis of histone modifications in differentiated hematopoietic cells could reconstruct an appropriate landscape of histone modification changes during hematopoiesis. Since integration of the inferred changes of different histone modifications could reveal genes with specific histone marks such as active histone marks and bivalent histone marks on each internal branch of cell-type trees, this approach could provide valuable information for understanding the cell differentiation steps of each cell lineage.

16.
Gene X ; 3: 100021, 2019 Sep.
Article in English | MEDLINE | ID: mdl-32550550

ABSTRACT

Revealing the landscape of epigenetic changes in cells during differentiation is important for understanding the development of organisms. In this study, to infer such epigenetic changes during human hematopoiesis, ancestral state estimation based on a phylogenetic tree was applied to map the epigenomic changes in six kinds of histone modifications onto the hierarchical cell differentiation process of hematopoiesis using epigenomes of eight types of differentiated hematopoietic cells. The histone modification changes inferred during hematopoiesis showed that changes that occurred on the branches separating different cell types reflected the characteristics of hematopoiesis in terms of genomic position and gene function. These results suggested that ancestral state estimation based on phylogenetic analysis of histone modifications in differentiated hematopoietic cells could reconstruct an appropriate landscape of histone modification changes during hematopoiesis. Since integration of the inferred changes of different histone modifications could reveal genes with specific histone marks such as active histone marks and bivalent histone marks on each internal branch of cell-type trees, this approach could provide valuable information for understanding the cell differentiation steps of each cell lineage.

17.
Nucleic Acids Res ; 34(14): 3917-28, 2006.
Article in English | MEDLINE | ID: mdl-16914452

ABSTRACT

We report the first genome-wide identification and characterization of alternative splicing in human gene transcripts based on analysis of the full-length cDNAs. Applying both manual and computational analyses for 56,419 completely sequenced and precisely annotated full-length cDNAs selected for the H-Invitational human transcriptome annotation meetings, we identified 6,877 alternative splicing genes with 18,297 different alternative splicing variants. A total of 37,670 exons were involved in these alternative splicing events. The encoded protein sequences were affected in 6,005 of the 6,877 genes. Notably, alternative splicing affected protein motifs in 3,015 genes, subcellular localizations in 2,982 genes and transmembrane domains in 1,348 genes. We also identified interesting patterns of alternative splicing, in which two distinct genes seemed to be bridged, nested or having overlapping protein coding sequences (CDSs) of different reading frames (multiple CDS). In these cases, completely unrelated proteins are encoded by a single locus. Genome-wide annotations of alternative splicing, relying on full-length cDNAs, should lay firm groundwork for exploring in detail the diversification of protein function, which is mediated by the fast expanding universe of alternative splicing variants.


Subjects
Alternative Splicing; DNA, Complementary/chemistry; Genome, Human; Proteins/genetics; RNA, Messenger/chemistry; Amino Acid Motifs; Amino Acid Sequence; Base Sequence; Computational Biology/methods; Exons; Genetic Variation; Genomics/methods; Humans; Proteins/chemistry; Proteins/physiology; RNA, Messenger/metabolism; Sequence Analysis, DNA
18.
Gene ; 389(2): 196-203, 2007 Mar 15.
Article in English | MEDLINE | ID: mdl-17196768

ABSTRACT

Despite the wide distribution of processed pseudogenes in mammalian genomes, such as those of human and mouse, relatively little is known about their roles in genomic evolution. While gene duplications are recognized as one of the major driving forces in genome evolution, processed pseudogenes, which are retrotransposed copies of mRNAs, have been regarded as junk or selfish DNA for a long time. In order to elucidate the quantitative and qualitative contribution of processed pseudogenes to the mammalian genome evolution, we attempted to detect processed pseudogenes by extensively mapping the mRNAs to both the human and mouse genomes, and then we estimated the rate of their emergence. As a result, we revealed that the rate of pseudogene emergence was about 1-2% per gene per million years, which was as high as the rate (0.9%) of gene duplication in the human genome, although the rate of pseudogene emergence was found to drastically decrease in the hominid lineage. Furthermore, 1% of the processed pseudogenes seemed to be reinvigorated by post-retrotransposition transcription, many of them preserving the intact coding regions. Since the expression patterns of transcribed pseudogenes in various tissues were quite different between human and mouse, their emergence might have led to species-specific evolution. Our results indicate that the generation of processed pseudogenes was not wholly futile but instead has been an indispensable resource, driving dynamic evolution of the mammalian genomes.


Subjects
Evolution, Molecular; Genome, Human; Genome; Pseudogenes; Amino Acid Sequence; Animals; Humans; Mice; Molecular Sequence Data; Phylogeny; RNA, Messenger/genetics; RNA, Messenger/metabolism; Sequence Alignment; Transcription, Genetic
19.
PLoS Biol ; 2(6): e162, 2004 Jun.
Article in English | MEDLINE | ID: mdl-15103394

ABSTRACT

The human genome sequence defines our inherent biological potential; the realization of the biology encoded therein requires knowledge of the function of each gene. Currently, our knowledge in this area is still limited. Several lines of investigation have been used to elucidate the structure and function of the genes in the human genome. Even so, gene prediction remains a difficult task, as the varieties of transcripts of a gene may vary to a great extent. We thus performed an exhaustive integrative characterization of 41,118 full-length cDNAs that capture the gene transcripts as complete functional cassettes, providing an unequivocal report of structural and functional diversity at the gene level. Our international collaboration has validated 21,037 human gene candidates by analysis of high-quality full-length cDNA clones through curation using unified criteria. This led to the identification of 5,155 new gene candidates. It also manifested the most reliable way to control the quality of the cDNA clones. We have developed a human gene database, called the H-Invitational Database (H-InvDB; http://www.h-invitational.jp/). It provides the following: integrative annotation of human genes, description of gene structures, details of novel alternative splicing isoforms, non-protein-coding RNAs, functional domains, subcellular localizations, metabolic pathways, predictions of protein three-dimensional structure, mapping of known single nucleotide polymorphisms (SNPs), identification of polymorphic microsatellite repeats within human genes, and comparative results with mouse full-length cDNAs. The H-InvDB analysis has shown that up to 4% of the human genome sequence (National Center for Biotechnology Information build 34 assembly) may contain misassembled or missing regions. We found that 6.5% of the human gene candidates (1,377 loci) did not have a good protein-coding open reading frame, of which 296 loci are strong candidates for non-protein-coding RNA genes. 
In addition, among 72,027 uniquely mapped SNPs and insertions/deletions localized within human genes, 13,215 nonsynonymous SNPs, 315 nonsense SNPs, and 452 indels occurred in coding regions. Together with 25 polymorphic microsatellite repeats present in coding regions, they may alter protein structure, causing phenotypic effects or resulting in disease. The H-InvDB platform represents a substantial contribution to resources needed for the exploration of human biology and pathology.


Subjects
Computational Biology/methods; DNA, Complementary/genetics; Databases, Genetic; Genes/physiology; Genome, Human; Alternative Splicing/genetics; Genes/genetics; Humans; Internet; Microsatellite Repeats/genetics; Open Reading Frames/genetics; Polymorphism, Genetic; Polymorphism, Single Nucleotide; Protein Structure, Tertiary
20.
Biol Direct ; 11: 35, 2016 08 04.
Article in English | MEDLINE | ID: mdl-27487948

ABSTRACT

BACKGROUND: Retroposition, one of the processes of copying the genetic material, is an important RNA-mediated mechanism leading to the emergence of new genes. Because the transcription controlling segments are usually not copied to the new location in this mechanism, the duplicated gene copies (retrocopies) become pseudogenized. However, a few can still survive, e.g. by recruiting novel regulatory elements from the region of insertion. Subsequently, these duplicated genes can contribute to the formation of lineage-specific traits and phenotypic diversity. Despite the numerous studies of the functional retrocopies (retrogenes) in animals and plants, very little is known about their presence in green algae, including morphologically diverse species. The current availability of the genomes of both uni- and multicellular algae provides a good opportunity to conduct a genome-wide investigation in order to fill the knowledge gap on the retroposition phenomenon in this lineage. RESULTS: Here we present a comparative genomic analysis of uni- and multicellular algae, Chlamydomonas reinhardtii and Volvox carteri, respectively, to explore their retrogene complements. By adopting a computational approach, we identified 141 retrogene candidates in total in both genomes, with their fraction being significantly higher in the multicellular Volvox. The majority of the retrogene candidates showed signatures of functional constraints, thus indicating their functionality. Detailed analyses of the identified retrogene candidates, their parental genes, and homologs of both, revealed that most of the retrogene candidates were derived from ancient retroposition events in the common ancestor of the two algae and that the parental genes were subsequently lost from the respective lineages, making many retrogenes 'orphan'.
CONCLUSION: We revealed that the genomes of the green algae have maintained many possibly functional retrogenes in spite of experiencing various molecular evolutionary events during a long evolutionary time after the retroposition events. Our first report about the retrogene set in the green algae provides a good foundation for any future investigation of the repertoire of retrogenes and facilitates the assessment of the evolutionary impact of retroposition on diverse morphological traits in this lineage. REVIEWERS: This article was reviewed by William Martin and Piotr Zielenkiewicz.


Subjects
Algal Proteins/genetics; Chlamydomonas reinhardtii/genetics; Genome, Plant; Plant Proteins/genetics; Retroelements; Volvox/genetics; Algal Proteins/metabolism; Plant Proteins/metabolism