1.
Front Genet ; 14: 1114774, 2023.
Article in English | MEDLINE | ID: mdl-37065472

ABSTRACT

Dyslipidemias are risk factors in diseases of significant importance to public health, such as atherosclerosis, a condition that contributes to the development of cardiovascular disease. Unhealthy lifestyles, the pre-existence of diseases, and the accumulation of genetic variants in some loci contribute to the development of dyslipidemia. The genetic causality behind these diseases has been studied primarily on populations with extensive European ancestry. Only some studies have explored this topic in Costa Rica, and none have focused on identifying variants that can alter blood lipid levels and quantifying their frequency. To fill this gap, this study focused on identifying variants in 69 genes involved in lipid metabolism using genomes from two studies in Costa Rica. We contrasted the allelic frequencies with those of groups reported in the 1000 Genomes Project and gnomAD and identified potential variants that could influence the development of dyslipidemias. In total, we detected 2,600 variants in the evaluated regions. However, after various filtering steps, we obtained 18 variants that have the potential to alter the function of 16 genes: nine variants have pharmacogenomic or protective implications, eight have high risk in Variant Effect Predictor, and eight were found in other Latin American genetic studies of lipid alterations and the development of dyslipidemia. Some of these variants have been linked to changes in blood lipid levels in other global studies and databases. In future studies, we propose to confirm at least 40 variants of interest from 23 genes in a larger cohort from Costa Rica and Latin American populations to determine their relevance regarding the genetic burden for dyslipidemia. Additionally, more complex studies should arise that include diverse clinical, environmental, and genetic data from patients and controls and functional validation of the variants.

2.
Front Public Health ; 11: 1095202, 2023.
Article in English | MEDLINE | ID: mdl-36935725

ABSTRACT

Latin America is one of the regions where the COVID-19 pandemic has had a stronger impact, with more than 72 million reported infections and 1.6 million deaths as of June 2022. Since this region is ecologically diverse and is affected by enormous social inequalities, efforts to identify genomic patterns of the circulating SARS-CoV-2 genotypes are necessary for the suitable management of the pandemic. To contribute to the genomic surveillance of SARS-CoV-2 in Latin America, we extended the number of SARS-CoV-2 genomes available from the region by sequencing and analyzing the viral genome from COVID-19 patients from seven countries (Argentina, Brazil, Costa Rica, Colombia, Mexico, Bolivia, and Peru). Subsequently, we analyzed the genomes circulating mainly during 2021, including records from the GISAID database from Latin America. A total of 1,534 genome sequences were generated from seven countries, demonstrating the laboratory and bioinformatics capabilities for genomic surveillance of pathogens that have been developed locally. For Latin America, patterns regarding several variants associated with multiple re-introductions, a relatively low percentage of sequenced samples, as well as an increase in the mutation frequency since the beginning of the pandemic, are in line with worldwide data. In addition, some variants of concern (VOC) and variants of interest (VOI) such as Gamma, Mu and Lambda, and at least 83 other lineages have predominated locally with country-specific enrichment. This work has contributed to the understanding of the dynamics of the pandemic in Latin America as part of the local and international efforts to achieve timely genomic surveillance of SARS-CoV-2.


Subject(s)
COVID-19 , SARS-CoV-2 , Humans , SARS-CoV-2/genetics , COVID-19/epidemiology , Latin America/epidemiology , Pandemics , Genotype
3.
Rev. biol. trop ; 69(4)dic. 2021.
Article in Spanish | LILACS, SaludCR | ID: biblio-1387685

ABSTRACT

Resumen Introducción: La disciplina científica de la bioinformática tiene el potencial de generar aplicaciones innovadoras para las sociedades humanas. Costa Rica, pequeña en tamaño y población en comparación con otros países de América Latina, ha ido adoptando la disciplina de manera progresiva. El reconocer los avances permite determinar hacia dónde puede dirigirse el país en este campo, así como su contribución a la región latinoamericana. Objetivo: En este manuscrito se reporta evidencia de la evolución de la bioinformática en Costa Rica, para identificar debilidades y fortalezas que permitan definir acciones a futuro. Métodos: Se realizaron búsquedas en bases de datos de publicaciones científicas y repositorios de secuencias, así como información de actividades de capacitación, redes, infraestructura, páginas web y fuentes de financiamiento. Resultados: Se observan avances importantes desde el 2010, incluyendo un aumento en oportunidades de entrenamiento y número de publicaciones, aportes significativos a las bases de datos de secuencias y conexiones por medio de redes. Sin embargo, ciertas áreas, como la masa crítica y la financiación requieren más desarrollo. La comunidad científica y sus patrocinadores deben promover la investigación basada en bioinformática, invertir en la formación de estudiantes de posgrado, aumentar la formación de profesionales, crear oportunidades laborales para carreras en bioinformática y promover colaboraciones internacionales a través de redes. Conclusiones: Se sugiere que para experimentar los beneficios de las aplicaciones de la bioinformática se deben fortalecer tres aspectos clave: la comunidad científica, la infraestructura de investigación y las oportunidades de financiamiento. El impacto de tal inversión sería el desarrollo de proyectos ambiciosos pero factibles y colaboraciones extendidas dentro de la región latinoamericana. 
Esto permitiría realizar contribuciones significativas para abordar los desafíos globales y la aplicación de nuevos enfoques de investigación, innovación y transferencia de conocimiento para el desarrollo de la economía, dentro de un marco de ética de la investigación.


Abstract Introduction: The scientific discipline of bioinformatics has the potential to generate innovative applications for human societies. Costa Rica, small in size and population compared to other Latin American countries, has been progressively adopting the discipline. Recognizing progress makes it possible to determine where the country can go in this field, as well as its contribution to the Latin American region. Objective: This manuscript reports evidence of the evolution of bioinformatics in Costa Rica, to identify weaknesses and strengths that can inform future action plans. Methods: We searched databases of scientific publications and sequence repositories, as well as information on training activities, networks, infrastructure, web pages and funding sources. Results: Important advances have been observed since 2010, such as increases in training opportunities and the number of publications, significant contributions to the sequence databases and connections through networks. However, areas such as critical mass and financing require further development. The scientific community and its sponsors should promote bioinformatics-based research, invest in graduate student training, increase professional training, create career opportunities in bioinformatics, and promote international collaborations through networks. Conclusions: It is suggested that in order to experience the benefits of bioinformatics applications, three key aspects must be strengthened: the scientific community, the research infrastructure, and funding opportunities. The impact of such investment would be the development of ambitious but feasible projects and extended collaborations within the Latin American region and abroad. This would allow significant contributions to address global challenges and the implementation of new approaches to research, innovation and knowledge transfer for the development of the economy, within an ethics of research framework.


Subject(s)
Computational Biology/trends , Data Management , Costa Rica
4.
Biosystems ; 205: 104411, 2021 Jul.
Article in English | MEDLINE | ID: mdl-33757842

ABSTRACT

Tolerance to stress conditions is vital for organismal survival, including bacteria under specific environmental conditions, antibiotics, and other perturbations. Some studies have described common modulation and shared genes during stress response to different types of disturbances (termed the perturbome), leading to the idea of central control at the molecular level. We implemented a robust machine learning approach to identify and describe genes associated with multiple perturbations, or the perturbome, in a Pseudomonas aeruginosa PAO1 model. Using microarray datasets from the Gene Expression Omnibus (GEO), we evaluated six approaches to rank and select genes: using two methodologies, a single data partition (SP method) or multiple partitions (MP method) for training and testing datasets, we evaluated three classification algorithms (SVM, Support Vector Machine; KNN, K-Nearest Neighbor; and RF, Random Forest). Gene expression patterns and topological features at the systems level were included to describe the perturbome elements. We were able to select and describe 46 core response genes associated with multiple perturbations in P. aeruginosa PAO1, which can be considered a first report of the P. aeruginosa perturbome. Molecular annotations, patterns in expression levels, and topological features in molecular networks revealed biological functions of biosynthesis, binding, and metabolism, many of them related to DNA damage repair and aerobic respiration in the context of tolerance to stress. We also discuss different issues related to implemented and assessed algorithms, including data partitioning, classification approaches, and metrics. Altogether, this work offers a different and robust framework to select genes using a machine learning approach.


Subject(s)
Genes, Bacterial , Genomics , Machine Learning , Models, Biological , Pseudomonas aeruginosa/genetics , Stress, Physiological/genetics , Algorithms , Principal Component Analysis , Transcriptome
5.
Sci Rep ; 10(1): 13717, 2020 08 13.
Article in English | MEDLINE | ID: mdl-32792590

ABSTRACT

Pseudomonas aeruginosa is an opportunistic pathogen that thrives in diverse environments and causes a variety of human infections. Pseudomonas aeruginosa AG1 (PaeAG1) is a high-risk sequence type 111 (ST-111) strain isolated from a Costa Rican hospital in 2010. PaeAG1 has both blaVIM-2 and blaIMP-18 genes encoding for metallo-β-lactamases, and it is resistant to β-lactams (including carbapenems), aminoglycosides, and fluoroquinolones. Ciprofloxacin (CIP) is an antibiotic commonly used to treat P. aeruginosa infections, and it is known to produce DNA damage, triggering a complex molecular response. In order to evaluate the effects of a sub-inhibitory CIP concentration on PaeAG1, growth curves using increasing CIP concentrations were compared. We then measured gene expression using RNA-Seq at three time points (0, 2.5 and 5 h) after CIP exposure to identify the transcriptomic determinants of the response (i.e. hub genes, gene clusters and enriched pathways). Changes in expression were determined using differential expression analysis and network analysis using a top-down systems biology approach. A hybrid model using database-based and co-expression analysis approaches was implemented to predict gene-gene interactions. We observed a reduction of the growth curve rate as the sub-inhibitory CIP concentrations were increased. In the transcriptomic analysis, we detected that over time CIP treatment resulted in the differential expression of 518 genes, showing a complex impact at the molecular level. The transcriptomic determinants were 14 hub genes, multiple gene clusters at different levels (associated to hub genes or as co-expression modules) and 15 enriched pathways. Down-regulation of genes implicated in several metabolism pathways, virulence elements and ribosomal activity was observed. In contrast, amino acid catabolism, RpoS factor, proteases, and phenazines genes were up-regulated. 
Remarkably, > 80 resident-phage genes were up-regulated after CIP treatment, which was validated at phenomic level using a phage plaque assay. Thus, reduction of the growth curve rate and increasing phage induction was evidenced as the CIP concentrations were increased. In summary, transcriptomic and network analyses, as well as the growth curves and phage plaque assays provide evidence that PaeAG1 presents a complex, concentration-dependent response to sub-inhibitory CIP exposure, showing pleiotropic effects at the systems level. Manipulation of these determinants, such as phage genes, could be used to gain more insights about the regulation of responses in PaeAG1 as well as the identification of possible therapeutic targets. To our knowledge, this is the first report of the transcriptomic analysis of CIP response in a ST-111 high-risk P. aeruginosa strain, in particular using a top-down systems biology approach.


Subject(s)
Bacterial Proteins/genetics , Ciprofloxacin/pharmacology , Computational Biology/methods , Gene Expression Regulation, Bacterial/drug effects , Pseudomonas Infections/genetics , Pseudomonas aeruginosa/genetics , Transcriptome/drug effects , Anti-Bacterial Agents/pharmacology , Bacterial Proteins/metabolism , Biofilms/drug effects , Biofilms/growth & development , Gene Regulatory Networks , Humans , Pseudomonas Infections/drug therapy , Pseudomonas Infections/microbiology , Pseudomonas aeruginosa/growth & development , Pseudomonas aeruginosa/isolation & purification , Virulence
6.
Genome Biol Evol ; 12(6): 842-859, 2020 06 01.
Article in English | MEDLINE | ID: mdl-32374870

ABSTRACT

Multicopy ampliconic gene families on the Y chromosome play an important role in spermatogenesis. Thus, studying their genetic variation in endangered great ape species is critical. We estimated the sizes (copy number) of nine Y ampliconic gene families in population samples of chimpanzee, bonobo, and orangutan with droplet digital polymerase chain reaction, combined these estimates with published data for human and gorilla, and produced genome-wide testis gene expression data for great apes. Analyzing this comprehensive data set within an evolutionary framework, we, first, found high inter- and intraspecific variation in gene family size, with larger families exhibiting higher variation as compared with smaller families, a pattern consistent with random genetic drift. Second, for four gene families, we observed significant interspecific size differences, sometimes even between sister species (chimpanzee and bonobo). Third, despite substantial variation in copy number, Y ampliconic gene families' expression levels did not differ significantly among species, suggesting dosage regulation. Fourth, for three gene families, size was positively correlated with gene expression levels across species, suggesting that, given sufficient evolutionary time, copy number influences gene expression. Our results indicate high variability in size but conservation in gene expression levels in Y ampliconic gene families, significantly advancing our understanding of Y-chromosome evolution in great apes.


Subject(s)
Biological Evolution , Gene Dosage , Gene Expression , Hominidae/genetics , Y Chromosome , Animals , Hominidae/metabolism , Male , Multigene Family
7.
Sci Rep ; 10(1): 1392, 2020 Jan 29.
Article in English | MEDLINE | ID: mdl-31996747

ABSTRACT

Genotyping methods and genome sequencing are indispensable to reveal the genomic structure of bacterial species displaying a high level of genome plasticity. However, reconstruction of the genome, or assembly, is not straightforward due to data complexity, including repeats and mobile and accessory genetic elements of bacterial genomes. Moreover, since the solution to this problem is strongly influenced by sequencing technology, bioinformatics pipelines, and selection criteria to assess assemblers, there is no systematic way to select a priori the optimal assembler and parameter settings. To assemble the genome of Pseudomonas aeruginosa strain AG1 (PaeAG1), short-read (Illumina) and long-read (Oxford Nanopore) sequencing data were used in 13 different non-hybrid and hybrid approaches. PaeAG1 is a multiresistant high-risk sequence type 111 (ST-111) clone that was isolated from a Costa Rican hospital, and it was the first report of an isolate of P. aeruginosa carrying both blaVIM-2 and blaIMP-18 genes encoding for metallo-β-lactamase (MBL) enzymes. To assess the assemblies, multiple metrics regarding contiguity, correctness and completeness (3C criterion, as we define here) were used to benchmark the 13 approaches and to select a definitive assembly. In addition, annotation was done to identify genes (coding and RNA regions) and to describe the genomic content of PaeAG1. Whereas long reads and hybrid approaches showed better performances in terms of contiguity, higher correctness and completeness metrics were obtained for short read only and hybrid approaches. A manually curated and polished hybrid assembly gave rise to a single circular sequence with 100% of core genes and known regions identified, >98% of reads mapped back, no gaps, and uniform coverage. The strategy followed to obtain this high-quality 3C assembly is detailed in the manuscript and we provide readers with an all-in-one script to replicate our results or to apply it to other troublesome cases. 
The final 3C assembly revealed that the PaeAG1 genome has 7,190,208 bp, a 65.7% GC content and 6,709 genes (6,620 coding sequences), many of which are included in multiple mobile genomic elements, such as 57 genomic islands, six prophages, and two complete integrons with blaVIM-2 and blaIMP-18 MBL genes. Up to 250 and 60 of the predicted genes are anticipated to play a role in virulence (adherence, quorum sensing and secretion) and antibiotic resistance (β-lactamases, efflux pumps, etc.), respectively. Altogether, the assembly and annotation of the PaeAG1 genome provide new perspectives to continue studying the genomic diversity and gene content of this important human pathogen.


Subject(s)
Computational Biology/methods , Drug Resistance, Multiple, Bacterial/genetics , Genome, Bacterial/genetics , Pseudomonas aeruginosa/genetics , Sequence Analysis, DNA/methods , DNA, Bacterial/genetics , DNA, Bacterial/isolation & purification , Genotyping Techniques/methods , High-Throughput Nucleotide Sequencing , Interspersed Repetitive Sequences/genetics , Molecular Sequence Annotation , Pseudomonas aeruginosa/drug effects
8.
Mol Biol Evol ; 33(10): 2744-58, 2016 10.
Article in English | MEDLINE | ID: mdl-27413049

ABSTRACT

Transcript variation has important implications for organismal function in health and disease. Most transcriptome studies focus on assessing variation in gene expression levels and isoform representation. Variation at the level of transcript sequence is caused by RNA editing and transcription errors, and leads to nongenetically encoded transcript variants, or RNA-DNA differences (RDDs). Such variation has been understudied, in part because its detection is obscured by reverse transcription (RT) and sequencing errors. It has only been evaluated for intertranscript base substitution differences. Here, we investigated transcript sequence variation for short tandem repeats (STRs). We developed the first maximum-likelihood estimator (MLE) to infer RT error and RDD rates, taking next generation sequencing error rates into account. Using the MLE, we empirically evaluated RT error and RDD rates for STRs in a large-scale DNA and RNA replicated sequencing experiment conducted in a primate species. The RT error rates increased exponentially with STR length and were biased toward expansions. The RDD rates were approximately 1 order of magnitude lower than the RT error rates. The RT error rates estimated with the MLE from a primate data set were concordant with those estimated with an independent method, barcoded RNA sequencing, from a Caenorhabditis elegans data set. Our results have important implications for medical genomics, as STR allelic variation is associated with >40 diseases. STR nonallelic transcript variation can also contribute to disease phenotype. The MLE and empirical rates presented here can be used to evaluate the probability of disease-associated transcripts arising due to RDD.


Subject(s)
DNA/genetics , Microsatellite Repeats , RNA/genetics , Reverse Transcription , Alleles , DNA Repair , Genetic Variation , Genomics/methods , High-Throughput Nucleotide Sequencing/methods , Humans , Sequence Analysis, RNA , Transcriptome
9.
PLoS Comput Biol ; 12(6): e1004956, 2016 06.
Article in English | MEDLINE | ID: mdl-27309962

ABSTRACT

Endogenous retroviruses (ERVs), the remnants of retroviral infections in the germ line, occupy ~8% and ~10% of the human and mouse genomes, respectively, and affect their structure, evolution, and function. Yet we still have a limited understanding of how the genomic landscape influences integration and fixation of ERVs. Here we conducted a genome-wide study of the most recently active ERVs in the human and mouse genome. We investigated 826 fixed and 1,065 in vitro HERV-Ks in human, and 1,624 fixed and 242 polymorphic ETns, as well as 3,964 fixed and 1,986 polymorphic IAPs, in mouse. We quantitated >40 human and mouse genomic features (e.g., non-B DNA structure, recombination rates, and histone modifications) in ±32 kb of these ERVs' integration sites and in control regions, and analyzed them using Functional Data Analysis (FDA) methodology. In one of the first applications of FDA in genomics, we identified genomic scales and locations at which these features display their influence, and how they work in concert, to provide signals essential for integration and fixation of ERVs. The investigation of ERVs of different evolutionary ages (young in vitro and polymorphic ERVs, older fixed ERVs) allowed us to disentangle integration vs. fixation preferences. As a result of these analyses, we built a comprehensive model explaining the uneven distribution of ERVs along the genome. We found that ERVs integrate in late-replicating AT-rich regions with abundant microsatellites, mirror repeats, and repressive histone marks. Regions favoring fixation are depleted of genes and evolutionarily conserved elements, and have low recombination rates, reflecting the effects of purifying selection and ectopic recombination removing ERVs from the genome. In addition to providing these biological insights, our study demonstrates the power of exploiting multiple scales and localization with FDA. These powerful techniques are expected to be applicable to many other genomic investigations.


Subject(s)
Endogenous Retroviruses/genetics , Virus Integration/genetics , Animals , Chromosome Mapping , Computational Biology , DNA Replication , Data Interpretation, Statistical , Epigenesis, Genetic , Genome, Human , Humans , Logistic Models , Mice , Models, Biological , Recombination, Genetic , Repetitive Sequences, Nucleic Acid , Selection, Genetic
10.
Genome Res ; 26(4): 530-40, 2016 Apr.
Article in English | MEDLINE | ID: mdl-26934921

ABSTRACT

The mammalian Y Chromosome sequence, critical for studying male fertility and dispersal, is enriched in repeats and palindromes, and thus, is the most difficult component of the genome to assemble. Previously, expensive and labor-intensive BAC-based techniques were used to sequence the Y for a handful of mammalian species. Here, we present a much faster and more affordable strategy for sequencing and assembling mammalian Y Chromosomes of sufficient quality for most comparative genomics analyses and for conservation genetics applications. The strategy combines flow sorting, short- and long-read genome and transcriptome sequencing, and droplet digital PCR with novel and existing computational methods. It can be used to reconstruct sex chromosomes in a heterogametic sex of any species. We applied our strategy to produce a draft of the gorilla Y sequence. The resulting assembly allowed us to refine gene content, evaluate copy number of ampliconic gene families, locate species-specific palindromes, examine the repetitive element content, and produce sequence alignments with human and chimpanzee Y Chromosomes. Our results inform the evolution of the hominine (human, chimpanzee, and gorilla) Y Chromosomes. Surprisingly, we found the gorilla Y Chromosome to be similar to the human Y Chromosome, but not to the chimpanzee Y Chromosome. Moreover, we have utilized the assembled gorilla Y Chromosome sequence to design genetic markers for studying the male-specific dispersal of this endangered species.


Subject(s)
Computational Biology , High-Throughput Nucleotide Sequencing , Mammals/genetics , Y Chromosome , Animals , Computational Biology/methods , Gene Rearrangement , Genome , Genomics , Gorilla gorilla/genetics , Humans , Inverted Repeat Sequences , Male , Microsatellite Repeats , Pan troglodytes/genetics , Repetitive Sequences, Nucleic Acid , Sequence Analysis, DNA
11.
Mol Biol Evol ; 31(7): 1816-32, 2014 Jul.
Article in English | MEDLINE | ID: mdl-24809961

ABSTRACT

The integration and fixation preferences of DNA transposons, one of the major classes of eukaryotic transposable elements, have never been evaluated comprehensively on a genome-wide scale. Here, we present a detailed study of the distribution of DNA transposons in the human and bat genomes. We studied three groups of DNA transposons that integrated at different evolutionary times: 1) ancient (>40 My) and currently inactive human elements, 2) younger (<40 My) bat elements, and 3) ex vivo integrations of piggyBat and Sleeping Beauty elements in HeLa cells. Although the distribution of ex vivo elements reflected integration preferences, the distribution of human and (to a lesser extent) bat elements was also affected by selection. We used regression techniques (linear, negative binomial, and logistic regression models with multiple predictors) applied to 20-kb and 1-Mb windows to investigate how the genomic landscape in the vicinity of DNA transposons contributes to their integration and fixation. Our models indicate that genomic landscape explains 16-79% of variability in DNA transposon genome-wide distribution. Importantly, we not only confirmed previously identified predictors (e.g., DNA conformation and recombination hotspots) but also identified several novel predictors (e.g., signatures of double-strand breaks and telomere hexamer). Ex vivo integrations showed a bias toward actively transcribed regions. Older DNA transposons were located in genomic regions scarce in most conserved elements, likely reflecting purifying selection. Our study highlights how DNA transposons are integral to the evolution of bat and human genomes, and has implications for the development of DNA transposon assays for gene therapy and mutagenesis applications.


Subject(s)
Chiroptera/genetics , DNA Transposable Elements , Evolution, Molecular , Animals , Genetic Variation , Genome , HeLa Cells , Humans , Models, Genetic , Mutagenesis, Insertional , Regression Analysis
12.
Hum Biol ; 85(5): 721-40, 2013 Oct.
Article in English | MEDLINE | ID: mdl-25078957

ABSTRACT

The genetic structure of Costa Rica's population is complex, both by region and by individual, due to the admixture process that started during the 15th century and historical events thereafter. Previous studies have been done mostly on Amerindian populations and the Central Valley inhabitants using various microsatellites and mitochondrial DNA markers. Here, we study for the first time a random sample from all regions of the country with ancestry informative markers (AIMs) to address the individual and regional admixture proportions. A sample of 160 male individuals was screened for 78 AIMs customized in a GoldenGate platform from Illumina. We observed that this small set of AIMs has the same power as hundreds of microsatellites and thousands of single-nucleotide polymorphisms to evaluate admixture, with the benefit of reducing genotyping costs. This type of investigation is necessary to explore new genetic markers useful for forensic and genetic investigation. Our data showed a mean admixture proportion of 49.2% European (EUR), 37.8% Native American (NAM), and 12.9% African (AFR), with a disproportionate admixture composition by region. In addition, when Chinese (CHB) was included as a fourth component, the proportions changed to 45.6% EUR, 33.5% NAM, 11.7% AFR, and 9.2% CHB. The admixture trend is consistent among all regions (EUR > NAM > AFR), and individual admixture estimates vary broadly in each region. Though we did not find stratification in Costa Rica's population, gene admixture should be evaluated in future genetic studies of Costa Rica, especially for the Caribbean region, as it contains the largest proportion of African ancestry (30.9%).


Subject(s)
Genetic Variation/genetics , Pedigree , Asian People/genetics , Black People/genetics , Costa Rica/epidemiology , DNA, Mitochondrial/genetics , Genetic Markers/genetics , Genotype , Geography , Humans , Indians, Central American/genetics , Male , Microsatellite Repeats/genetics , White People/genetics
13.
PLoS Genet ; 8(8): e1002842, 2012.
Article in English | MEDLINE | ID: mdl-22912586

ABSTRACT

Alu elements are trans-mobilized by the autonomous non-LTR retroelement, LINE-1 (L1). Alu-induced insertion mutagenesis contributes to about 0.1% human genetic disease and is responsible for the majority of the documented instances of human retroelement insertion-induced disease. Here we introduce a SINE recovery method that provides a complementary approach for comprehensive analysis of the impact and biological mechanisms of Alu retrotransposition. Using this approach, we recovered 226 de novo tagged Alu inserts in HeLa cells. Our analysis reveals that in human cells marked Alu inserts driven by either exogenously supplied full length L1 or ORF2 protein are indistinguishable. Four percent of de novo Alu inserts were associated with genomic deletions and rearrangements and lacked the hallmarks of retrotransposition. In contrast to L1 inserts, 5' truncations of Alu inserts are rare, as most of the recovered inserts (96.5%) are full length. De novo Alus show a random pattern of insertion across chromosomes, but further characterization revealed an Alu insertion bias exists favoring insertion near other SINEs, highly conserved elements, with almost 60% landing within genes. De novo Alu inserts show no evidence of RNA editing. Priming for reverse transcription rarely occurred within the first 20 bp (most 5') of the A-tail. The A-tails of recovered inserts show significant expansion, with many at least doubling in length. Sequence manipulation of the construct led to the demonstration that the A-tail expansion likely occurs during insertion due to slippage by the L1 ORF2 protein. We postulate that the A-tail expansion directly impacts Alu evolution by reintroducing new active source elements to counteract the natural loss of active Alus and minimizing Alu extinction.


Subject(s)
Alu Elements/genetics , Long Interspersed Nucleotide Elements/genetics , Mutagenesis, Insertional , Terminal Repeat Sequences/genetics , 3' Flanking Region , 5' Flanking Region , Base Sequence , Endonucleases/genetics , Endonucleases/metabolism , Evolution, Molecular , Exons , Genome, Human , HeLa Cells , Humans , Introns , Molecular Sequence Data , RNA-Directed DNA Polymerase/genetics , RNA-Directed DNA Polymerase/metabolism , Reverse Transcription
14.
Hum Biol ; 78(5): 551-63, 2006 Oct.
Article in English | MEDLINE | ID: mdl-17506286

ABSTRACT

Two hundred seventeen male subjects from Costa Rica, Mexico, and the Hispanic population of the southwestern United States were studied. Twelve Y-chromosome STRs and the HVSI sequence of the mtDNA were analyzed to describe their genetic structure and to compare maternal and paternal lineages. All subjects are part of two NIMH-funded studies to localize schizophrenia susceptibility genes in Hispanic populations of Mexican and Central American ancestry. We showed that these three populations are similar in their internal genetic characteristics, as revealed by analyses of mtDNA and Y-chromosome STR diversity. These populations are related through their maternal lineage in a stronger way than through their paternal lineage, because a higher number of shared haplotypes and polymorphisms are seen in the mtDNA (compared to Y-chromosome STRs). These results provide evidence of previous contact between the three populations and shared histories. An analysis of molecular variance revealed no genetic differentiation for the mtDNA for the three populations, but differentiation was detected in the Y-chromosome STRs. Genetic distance analysis showed that the three populations are closely related, probably as a result of migration between close neighbors, as indicated by shared haplotypes and their demographic histories. This relationship could be an important common feature for genetic studies in Latin American and Hispanic populations.


Subject(s)
Chromosomes, Human, Y , DNA, Mitochondrial/analysis , Genetics, Population , Hispanic or Latino/genetics , Tandem Repeat Sequences , Costa Rica , Haplotypes , Humans , Male , Mexico , Microsatellite Repeats , Polymorphism, Genetic , Southwestern United States
15.
Forensic Sci Int ; 135(2): 150-7, 2003 Aug 12.
Article in English | MEDLINE | ID: mdl-12927417

ABSTRACT

The Spanish and Portuguese ISFG Working Group (GEP-ISFG) carried out a collaborative exercise in order to assess the performance of two Y chromosome STR tetraplexes, which include the loci DYS461, GATA C4, DYS437 and DYS438 (GEPY I), and DYS460, GATA A10, GATA H4 and DYS439 (GEPY II). The groups that reported correct results in all the systems were also asked to analyse a population sample in order to evaluate the informative content of these STRs in different populations. A total of 1020 males out of 13 population samples from Argentina, Brazil, Costa Rica, Macao, Mozambique, Portugal and Spain were analysed for all the loci included in the present study. Haplotype and allele frequencies of these eight Y-STRs were estimated in all samples. The lowest haplotype diversity was found in the Lara (Argentina) population (95.44%) and the highest (99.90%) in Macao (China). Pairwise haplotype analysis showed the relative homogeneity of the Iberian origin samples, in accordance with what was previously found in the European populations for other Y-STR haplotypes (http://www.ystr.org). As expected, the four non-Caucasian samples, Macao (Chinese), Mozambique (Africans), Costa Rica (Africans) and Argentina (Lara, Amerindians), show highly significant Phist values in the pairwise comparisons with all the Caucasian samples.


Subject(s)
Chromosomes, Human, Y , Genetics, Population , Haplotypes , Tandem Repeat Sequences , DNA Fingerprinting/methods , Ethnicity/genetics , Gene Frequency , Humans , Male , Portugal , Spain