Results 1 - 9 of 9
1.
Mol Ecol ; 28(21): 4737-4754, 2019 Nov.
Article in English | MEDLINE | ID: mdl-31550391

ABSTRACT

For half a century population genetics studies have put type II restriction endonucleases to work. Now, coupled with massively-parallel, short-read sequencing, the family of RAD protocols that wields these enzymes has generated vast genetic knowledge from the natural world. Here, we describe the first software natively capable of using paired-end sequencing to derive short contigs from de novo RAD data. Stacks version 2 employs a de Bruijn graph assembler to build and connect contigs from forward and reverse reads for each de novo RAD locus, which it then uses as a reference for read alignments. The new architecture allows all the individuals in a metapopulation to be considered at the same time as each RAD locus is processed. This enables a Bayesian genotype caller to provide precise SNPs, and a robust algorithm to phase those SNPs into long haplotypes, generating RAD loci that are 400-800 bp in length. To prove its recall and precision, we tested the software with simulated data and compared reference-aligned and de novo analyses of three empirical data sets. Our study shows that the latest version of Stacks is highly accurate and outperforms other software in assembling and genotyping paired-end de novo data sets.


Subjects
Population Genetics/methods, High-Throughput Nucleotide Sequencing/methods, DNA Sequence Analysis/methods, Algorithms, Bayes Theorem, Genotype, Humans, Metagenomics/methods, Phenotype, Single Nucleotide Polymorphism/genetics, Software
2.
Mol Biol Evol ; 31(4): 832-45, 2014 Apr.
Article in English | MEDLINE | ID: mdl-24398320

ABSTRACT

The evolutionary origin of eukaryotes is a question of great interest for which many different hypotheses have been proposed. These hypotheses predict distinct patterns of evolutionary relationships for individual genes of the ancestral eukaryotic genome. The availability of numerous completely sequenced genomes covering the three domains of life makes it possible to contrast these predictions with empirical data. We performed a systematic analysis of the phylogenetic relationships of ancestral eukaryotic genes with archaeal and bacterial genes. In contrast with previous studies, we emphasize the critical importance of methods accounting for statistical support, horizontal gene transfer, and gene loss, and we disentangle the processes underlying the phylogenomic pattern we observe. We first recover a clear signal indicating that a fraction of the bacteria-like eukaryotic genes are of alphaproteobacterial origin. Then, we show that the majority of bacteria-related eukaryotic genes actually do not point to a relationship with a specific bacterial taxonomic group. We also provide evidence that eukaryotes branch close to the last archaeal common ancestor. Our results demonstrate that there is no phylogenetic support for hypotheses involving a fusion with a bacterium other than the ancestor of mitochondria. Overall, they leave only two possible interpretations, respectively, based on the early-mitochondria hypotheses, which suppose an early endosymbiosis of an alphaproteobacterium in an archaeal host and on the slow-drip autogenous hypothesis, in which early eukaryotic ancestors were particularly prone to horizontal gene transfers.


Subjects
Molecular Evolution, Genetic Models, Archaea/genetics, Bacteria/genetics, Biological Evolution, Horizontal Gene Transfer, Genetic Speciation, Human Genome, Humans, Phylogeny, Symbiosis/genetics, Yeasts/genetics
3.
Mol Biol Evol ; 30(8): 1745-50, 2013 Aug.
Article in English | MEDLINE | ID: mdl-23699471

ABSTRACT

Efficient algorithms and programs for the analysis of the ever-growing amount of biological sequence data are strongly needed in the genomics era. The pace at which new data and methodologies are generated calls for the use of pre-existing, optimized (yet extensible) code, typically distributed as libraries or packages. This motivated the Bio++ project, aiming at developing a set of C++ libraries for sequence analysis, phylogenetics, population genetics, and molecular evolution. The main attractiveness of Bio++ is the extensibility and reusability of its components through its object-oriented design, without compromising the computer-efficiency of the underlying methods. We present here the second major release of the libraries, which provides an extended set of classes and methods. These extensions notably provide built-in access to sequence databases and new data structures for handling and manipulating sequences from the omics era, such as multiple genome alignments and sequencing reads libraries. More complex models of sequence evolution, such as mixture models and generic n-tuples alphabets, are also included.


Subjects
Computational Biology, Molecular Evolution, Software, Algorithms, Computational Biology/methods, Genomics/methods, Humans, Internet
4.
Mol Ecol Resour ; 23(6): 1299-1318, 2023 Aug.
Article in English | MEDLINE | ID: mdl-37062860

ABSTRACT

Library preparation protocols for most sequencing technologies involve PCR amplification of the template DNA, which opens the possibility that a given template DNA molecule is sequenced multiple times. Reads arising from this phenomenon, known as PCR duplicates, inflate the cost of sequencing and can jeopardize the reliability of affected experiments. Despite the pervasiveness of this artefact, our understanding of its causes and of its impact on downstream statistical analyses remains essentially empirical. Here, we develop a general quantitative model of amplification distortions in sequencing data sets, which we leverage to investigate the factors controlling the occurrence of PCR duplicates. We show that the PCR duplicate rate is determined primarily by the ratio between library complexity and sequencing depth, and that amplification noise (including in its dependence on the number of PCR cycles) only plays a secondary role for this artefact. We confirm our predictions using new and published RAD-seq libraries and provide a method to estimate library complexity and amplification noise in any data set containing PCR duplicates. We discuss how amplification-related artefacts impact downstream analyses, and in particular genotyping accuracy. The proposed framework unites the numerous observations made on PCR duplicates and will be useful to experimenters of all sequencing technologies where DNA availability is a concern.


Subjects
DNA, High-Throughput Nucleotide Sequencing, Reproducibility of Results, Polymerase Chain Reaction/methods, DNA Sequence Analysis/methods, DNA/genetics, Gene Library, High-Throughput Nucleotide Sequencing/methods
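
The central claim of the abstract above, that the PCR duplicate rate depends mainly on the ratio between library complexity and sequencing depth, can be illustrated with a deliberately simplified sketch. It assumes that reads are drawn uniformly and with replacement from a pool of distinct template molecules (a Poisson approximation) and ignores amplification noise entirely. This is a textbook-style approximation, not the quantitative model developed in the paper, and the function name `expected_duplicate_rate` and its parameters are hypothetical.

```python
import math


def expected_duplicate_rate(n_molecules: float, n_reads: float) -> float:
    """Expected fraction of reads that are duplicates.

    Assumption (not from the paper): reads are sampled uniformly, with
    replacement, from n_molecules distinct templates, so the number of
    reads per template is approximately Poisson with mean
    n_reads / n_molecules.
    """
    if n_molecules <= 0 or n_reads <= 0:
        raise ValueError("n_molecules and n_reads must be positive")
    ratio = n_reads / n_molecules  # sequencing depth relative to complexity
    # Expected number of templates seen at least once: N * (1 - exp(-ratio)).
    # -expm1(-x) computes 1 - exp(-x) accurately for small x.
    expected_unique = n_molecules * -math.expm1(-ratio)
    # Every read beyond the first one for a template counts as a duplicate.
    return 1.0 - expected_unique / n_reads


if __name__ == "__main__":
    # Hypothetical library of one million distinct molecules.
    for reads in (1e5, 1e6, 1e7):
        rate = expected_duplicate_rate(1e6, reads)
        print(f"reads/molecules = {reads / 1e6:g}: duplicate rate = {rate:.3f}")
```

Under this approximation, two libraries with the same reads-to-molecules ratio have the same expected duplicate rate regardless of their absolute size, which is the qualitative behaviour the abstract describes.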
5.
Mol Ecol Resour ; 21(2): 363-378, 2021 Feb.
Article in English | MEDLINE | ID: mdl-32275349

ABSTRACT

Restriction-site associated DNA sequencing (RADseq) has become a powerful and versatile tool in modern population genomics, enabling large-scale evolutionary and genomic analyses in otherwise inaccessible biological systems. With its widespread use, different variants on the protocol have been developed to suit specific experimental needs. Researchers face the challenge of choosing the optimal molecular and sequencing protocols for their reduced representation experimental design, an often-complicated process. Strategic errors can lead to biased data generation that has reduced power to answer biological questions. Here, we present RADinitio, simulation software for the selection and optimization of RADseq experiments via the generation of sequencing data that behave similarly to empirical sources. RADinitio provides an evolutionary simulation of populations, implementation of various RADseq protocols with customizable parameters, and thorough assessment of missing data. We test the efficacy of the software using different RAD protocols across several organisms, highlighting the importance of protocol selection on the magnitude and quality of data acquired. Additionally, we test the effects of RAD library preparation and sequencing on allelic dropout, observing that library preparation and sequencing often contribute more to missing alleles than population-level variation.


Subjects
Computer Simulation, Genomics, Research Design, DNA Sequence Analysis, Software, Metagenomics
6.
Heliyon ; 7(11): e08335, 2021 Nov.
Article in English | MEDLINE | ID: mdl-34825075

ABSTRACT

BACKGROUND: Breastmilk is considered the gold standard of infant nutrition. Many mothers have difficulty with breastfeeding and over 50% of women stop due to perceived low production. AIMS AND METHODS: Our study compared gene expression in 8 samples of low and high producers of milk. All subjects were administered GAD-7 and PHQ-9 questionnaires. Low-producers were all found to have more depression and anxiety compared to high-producers. RESULTS: We did not find significant differences in gene expression between low and high milk producers. Only 5 of 8 samples contained a significant number of human cells. We did find differences in the amount of various bacterial populations. CONCLUSION: Our results indicate that gene expression in breastmilk is complicated by collection methods. We recommend that even though some women produced less than 600 ml of milk over a 24-hour period, due to the nature of the bacteria found in milk they try to breastfeed as much as they can for the health benefits of their infants. The rich bacterial diversity in all patients, including the low producers, strongly suggests that even women producing lesser quantities of milk confer numerous benefits on their children by breastfeeding them.

7.
Nat Ecol Evol ; 4(4): 652-658, 2020 Apr.
Article in English | MEDLINE | ID: mdl-32152530

ABSTRACT

Only recently have we begun to understand the ecological and evolutionary effects of urbanization on species, with studies revealing drastic impacts on community composition, gene flow, behaviour, morphology and physiology. However, our understanding of how adaptive evolution allows species to persist, and even thrive, in urban landscapes is still nascent. Here, we examine phenotypic, genomic and regulatory impacts of urbanization on a widespread lizard, the Puerto Rican crested anole (Anolis cristatellus). We find that urban lizards endure higher environmental temperatures and display greater heat tolerance than their forest counterparts. A single non-synonymous polymorphism within a protein synthesis gene (RARS) is associated with heat tolerance plasticity within urban heat islands and displays parallel signatures of selection in cities. Additionally, we identify groups of differentially expressed genes between habitats showing elevated genetic divergence in multiple urban-forest comparisons. These genes display evidence of adaptive regulatory evolution within cities and disproportionately cluster within regulatory modules associated with heat tolerance. This study provides evidence of temperature-mediated selection in urban heat islands with repeatable impacts on physiological evolution at multiple levels of biological hierarchy.


Subjects
Lizards, Animals, Cities, Hot Temperature, Islands, Puerto Rico
8.
Nat Protoc ; 12(12): 2640-2659, 2017 Dec.
Article in English | MEDLINE | ID: mdl-29189774

ABSTRACT

Restriction site-associated DNA sequencing (RAD-seq) allows for the genome-wide discovery and genotyping of single-nucleotide polymorphisms in hundreds of individuals at a time in model and nonmodel species alike. However, converting short-read sequencing data into reliable genotype data remains a nontrivial task, especially as RAD-seq is used in systems that have very diverse genomic properties. Here, we present a protocol to analyze RAD-seq data using the Stacks pipeline. This protocol will be of use in areas such as ecology and population genetics. It covers the assessment and demultiplexing of the sequencing data, read mapping, inference of RAD loci, genotype calling, and filtering of the output data, as well as providing two simple examples of downstream biological analyses. We place special emphasis on checking the soundness of the procedure and choosing the main parameters, given the properties of the data. The procedure can be completed in 1 week, but determining definitive methodological choices will typically take up to 1 month.


Subjects
Genotyping Techniques/methods, DNA Sequence Analysis/methods, Animals, Chromosome Mapping/methods, Population Genetics, Genome, Genotype, High-Throughput Nucleotide Sequencing/methods, Humans, Single Nucleotide Polymorphism, Software
9.
Mol Biol Cell ; 24(8): 1232-49, 2013 Apr.
Article in English | MEDLINE | ID: mdl-23427270

ABSTRACT

In vertebrates, zyxin is a LIM-domain protein belonging to a family composed of seven members. We show that the nematode Caenorhabditis elegans has a unique zyxin-like protein, ZYX-1, which is the orthologue of the vertebrate zyxin subfamily composed of zyxin, migfilin, TRIP6, and LPP. The ZYX-1 protein is expressed in the striated body-wall muscles and localizes at dense bodies/Z-discs and M-lines, as well as in the nucleus. In yeast two-hybrid assays ZYX-1 interacts with several known dense body and M-line proteins, including DEB-1 (vinculin) and ATN-1 (α-actinin). ZYX-1 is mainly localized in the middle region of the dense body/Z-disc, overlapping the apical and basal regions containing, respectively, ATN-1 and DEB-1. The localization and dynamics of ZYX-1 at dense bodies depend on the presence of ATN-1. Fluorescence recovery after photobleaching experiments revealed a high mobility of the ZYX-1 protein within muscle cells, in particular at dense bodies and M-lines, indicating a peripheral and dynamic association of ZYX-1 at these muscle adhesion structures. A portion of the ZYX-1 protein shuttles from the cytoplasm into the nucleus, suggesting a role for ZYX-1 in signal transduction. We provide evidence that the zyx-1 gene encodes two different isoforms, ZYX-1a and ZYX-1b, which exhibit different roles in dystrophin-dependent muscle degeneration occurring in a C. elegans model of Duchenne muscular dystrophy.


Subjects
Caenorhabditis elegans Proteins/physiology, Caenorhabditis elegans/metabolism, Dystrophin/metabolism, Muscles/metabolism, Zyxin/physiology, Actinin/metabolism, Amino Acid Sequence, Animals, Caenorhabditis elegans/cytology, Caenorhabditis elegans Proteins/chemistry, Gene Expression, Molecular Sequence Data, Muscles/cytology, Organ Specificity, Phylogeny, Protein Isoforms/chemistry, Protein Isoforms/physiology, Protein Transport, Amino Acid Sequence Homology, Zyxin/chemistry