Results 1 - 20 of 41
1.
Nature ; 533(7602): 200-5, 2016 05 12.
Article in English | MEDLINE | ID: mdl-27088604

ABSTRACT

The whole-genome duplication 80 million years ago of the common ancestor of salmonids (salmonid-specific fourth vertebrate whole-genome duplication, Ss4R) provides unique opportunities to learn about the evolutionary fate of a duplicated vertebrate genome in 70 extant lineages. Here we present a high-quality genome assembly for Atlantic salmon (Salmo salar), and show that large genomic reorganizations, coinciding with bursts of transposon-mediated repeat expansions, were crucial for the post-Ss4R rediploidization process. Comparisons of duplicate gene expression patterns across a wide range of tissues with orthologous genes from a pre-Ss4R outgroup unexpectedly demonstrate far more instances of neofunctionalization than subfunctionalization. Surprisingly, we find that genes that were retained as duplicates after the teleost-specific whole-genome duplication 320 million years ago were not more likely to be retained after the Ss4R, and that the duplicate retention was not influenced to a great extent by the nature of the predicted protein interactions of the gene products. Finally, we demonstrate that the Atlantic salmon assembly can serve as a reference sequence for the study of other salmonids for a range of purposes.


Subjects
Diploidy, Molecular Evolution, Gene Duplication/genetics, Duplicate Genes/genetics, Genome/genetics, Salmo salar/genetics, Animals, DNA Transposable Elements/genetics, Female, Genomics, Male, Genetic Models, Mutagenesis/genetics, Phylogeny, Reference Standards, Salmo salar/classification, Sequence Homology
2.
Genome Res ; 27(5): 787-792, 2017 05.
Article in English | MEDLINE | ID: mdl-28130360

ABSTRACT

Long sequencing reads generated by single-molecule sequencing technology offer the possibility of dramatically improving the contiguity of genome assemblies. The biggest challenge today is that long reads have relatively high error rates, currently around 15%. The high error rates make it difficult to use this data alone, particularly with highly repetitive plant genomes. Errors in the raw data can lead to insertion or deletion errors (indels) in the consensus genome sequence, which in turn create significant problems for downstream analysis; for example, a single indel may shift the reading frame and incorrectly truncate a protein sequence. Here, we describe an algorithm that solves the high error rate problem by combining long, high-error reads with shorter but much more accurate Illumina sequencing reads, whose error rates average <1%. Our hybrid assembly algorithm combines these two types of reads to construct mega-reads, which are both long and accurate, and then assembles the mega-reads using the CABOG assembler, which was designed for long reads. We apply this technique to a large data set of Illumina and PacBio sequences from the species Aegilops tauschii, a large and extremely repetitive plant genome that has resisted previous attempts at assembly. We show that the resulting assembled contigs are far larger than in any previous assembly, with an N50 contig size of 486,807 nucleotides. We compare the contigs to independently produced optical maps to evaluate their large-scale accuracy, and to a set of high-quality bacterial artificial chromosome (BAC)-based assemblies to evaluate base-level accuracy.
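
The N50 contig size quoted in the abstract above is a length-weighted summary of contiguity. As a minimal sketch (not the authors' code), the function below computes it under the common definition: the largest length L such that contigs of length at least L contain at least half of all assembled bases. The function name `n50` and the example lengths are hypothetical.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the largest length L such that contigs of length >= L
    together contain at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    # Walk from the longest contig down, accumulating bases.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical contig lengths, for illustration only.
example_lengths = [80, 70, 50, 40, 30, 20, 10]
print(n50(example_lengths))
```

Because the statistic weights each contig by its length, a handful of long contigs can dominate it, which is why contiguity alone does not establish correctness.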


Subjects
Contig Mapping/methods, Plant Genome, Genomics/methods, Poaceae/genetics, Repetitive Nucleic Acid Sequences, DNA Sequence Analysis/methods, Software, Contig Mapping/standards, Genome Size, Genomics/standards, DNA Sequence Analysis/standards
3.
Chaos ; 28(10): 103110, 2018 Oct.
Article in English | MEDLINE | ID: mdl-30384627

ABSTRACT

The dynamics on a chaotic attractor can be quite heterogeneous, being much more unstable in some regions than others. Some regions of a chaotic attractor can be expanding in more dimensions than other regions. Imagine a situation with two such regions, each containing trajectories that stay in the region for all time, while typical trajectories wander throughout the attractor. Furthermore, if arbitrarily close to each point of the attractor there are points on periodic orbits that have different unstable dimensions, then we say such an attractor is "hetero-chaotic" (i.e., it has heterogeneous chaos). This is hard to picture, but we believe that most physical systems possessing a high-dimensional attractor are of this type. We have created simplified models with that behavior to give insight into real high-dimensional phenomena.

4.
BMC Genomics ; 17(Suppl 10): 826, 2016 11 11.
Article in English | MEDLINE | ID: mdl-28185554

ABSTRACT

BACKGROUND: The diversity in eukaryotic life reflects a diversity in regulatory pathways. Nocedal and Johnson argue that the rewiring of gene regulatory networks is a major force for the diversity of life, that changes in regulation can create new species. RESULTS: We have created a method (based on our new "ping-pong algorithm") for detecting more complicated rewirings, where several transcription factors can substitute for one or more transcription factors in the regulation of a family of co-regulated genes. An example is illustrative. A rewiring has been reported by Hogues et al. that RAP1 in Saccharomyces cerevisiae substitutes for TBF1/CBF1 in Candida albicans for ribosomal RP genes. There one transcription factor substitutes for another on some collection of genes. Such a substitution is referred to as a "rewiring". We agree with this finding of rewiring as far as it goes, but the situation is more complicated. Many transcription factors can regulate a gene, and our algorithm finds that in this example a "team" (or collection) of three transcription factors including RAP1 substitutes for TBF1 for 19 genes. The switch occurs for a branch of the phylogenetic tree containing 10 species (including Saccharomyces cerevisiae), while the remaining 13 species (including Candida albicans) are regulated by TBF1. CONCLUSIONS: To gain insight into more general evolutionary mechanisms, we have created a mathematical algorithm that finds such general switching events and we prove that it converges. Of course, any such computational discovery should be validated in biological tests. For each branch of the phylogenetic tree and each gene module, our algorithm finds a sub-group of co-regulated genes and a team of transcription factors that substitutes for another team of transcription factors. In most cases the signal will be small, but in some cases we find a strong signal of switching. We report our findings for 23 Ascomycota fungi species.


Subjects
Algorithms, Molecular Evolution, Saccharomyces cerevisiae Proteins/genetics, Saccharomyces cerevisiae/genetics, Transcription Factors/genetics, Candida albicans/classification, Candida albicans/genetics, Candida albicans/metabolism, Gene Regulatory Networks, Phylogeny, Saccharomyces cerevisiae/classification, Saccharomyces cerevisiae/metabolism, Saccharomyces cerevisiae Proteins/metabolism, Shelterin Complex, Telomere-Binding Proteins/genetics, Transcription Factors/metabolism, Genetic Transcription
5.
Genome Res ; 22(3): 557-67, 2012 Mar.
Article in English | MEDLINE | ID: mdl-22147368

ABSTRACT

New sequencing technology has dramatically altered the landscape of whole-genome sequencing, allowing scientists to initiate numerous projects to decode the genomes of previously unsequenced organisms. The lowest-cost technology can generate deep coverage of most species, including mammals, in just a few days. The sequence data generated by one of these projects consist of millions or billions of short DNA sequences (reads) that range from 50 to 150 nt in length. These sequences must then be assembled de novo before most genome analyses can begin. Unfortunately, genome assembly remains a very difficult problem, made more difficult by shorter reads and unreliable long-range linking information. In this study, we evaluated several of the leading de novo assembly algorithms on four different short-read data sets, all generated by Illumina sequencers. Our results describe the relative performance of the different assemblers as well as other significant differences in assembly difficulty that appear to be inherent in the genomes themselves. Three overarching conclusions are apparent: first, that data quality, rather than the assembler itself, has a dramatic effect on the quality of an assembled genome; second, that the degree of contiguity of an assembly varies enormously among different assemblers and different genomes; and third, that the correctness of an assembly also varies widely and is not well correlated with statistics on contiguity. To enable others to replicate our results, all of our data and methods are freely available, as are all assemblers used in this study.


Subjects
Algorithms, Genomics/methods, DNA Sequence Analysis, Animals, Computational Biology/methods, Genome, Bacterial Genome/genetics, Humans, Internet, Reproducibility of Results
6.
Bioinformatics ; 29(21): 2669-77, 2013 Nov 01.
Article in English | MEDLINE | ID: mdl-23990416

ABSTRACT

MOTIVATION: Second-generation sequencing technologies produce high coverage of the genome by short reads at a low cost, which has prompted development of new assembly methods. In particular, multiple algorithms based on de Bruijn graphs have been shown to be effective for the assembly problem. In this article, we describe a new hybrid approach that has the computational efficiency of de Bruijn graph methods and the flexibility of overlap-based assembly strategies, and which allows variable read lengths while tolerating a significant level of sequencing error. Our method transforms large numbers of paired-end reads into a much smaller number of longer 'super-reads'. The use of super-reads allows us to assemble combinations of Illumina reads of differing lengths together with longer reads from 454 and Sanger sequencing technologies, making it one of the few assemblers capable of handling such mixtures. We call our system the Maryland Super-Read Celera Assembler (abbreviated MaSuRCA and pronounced 'mazurka'). RESULTS: We evaluate the performance of MaSuRCA against two of the most widely used assemblers for Illumina data, Allpaths-LG and SOAPdenovo2, on two datasets from organisms for which high-quality assemblies are available: the bacterium Rhodobacter sphaeroides and chromosome 16 of the mouse genome. We show that MaSuRCA performs on par with or better than Allpaths-LG and significantly better than SOAPdenovo on these data, when evaluated against the finished sequence. We then show that MaSuRCA can significantly improve its assemblies when the original data are augmented with long reads. AVAILABILITY: MaSuRCA is available as open-source code at ftp://ftp.genome.umd.edu/pub/MaSuRCA/. Previous (pre-publication) releases have been publicly available for over a year. CONTACT: alekseyz@ipst.umd.edu. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
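
One way to picture how a short read might grow into a longer "super-read" is to extend it base by base for as long as the k-mers seen in the data allow only one continuation. The sketch below is a deliberately simplified illustration of that idea, not MaSuRCA's implementation: it extends to the right only, ignores reverse complements, sequencing errors, and mate pairs, and all names (`build_successors`, `extend_right`) and the toy reads are hypothetical.

```python
from collections import defaultdict


def build_successors(reads, k):
    """Map each (k-1)-mer to the set of bases that follow it in any read."""
    successors = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            successors[kmer[:-1]].add(kmer[-1])
    return successors


def extend_right(read, successors, k, max_steps=10_000):
    """Extend a read to the right while the next base is unambiguous (k >= 2)."""
    extended = read
    seen = set()  # contexts already used, to stop on cycles
    for _ in range(max_steps):
        context = extended[-(k - 1):]
        choices = successors.get(context, set())
        # Stop at a dead end, a branch, or a repeated context.
        if len(choices) != 1 or context in seen:
            break
        seen.add(context)
        extended += next(iter(choices))
    return extended


# Toy reads, for illustration only.
reads = ["ACGTTG", "GTTGCA", "TGCAAT"]
table = build_successors(reads, k=4)
print(extend_right("ACGTTG", table, k=4))
```

Reads that extend to the same sequence collapse into one longer string, which is the intuition behind replacing many paired-end reads with far fewer super-reads.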


Subjects
Genomics/methods, Algorithms, Animals, Bacterial Genome, Mice, Rhodobacter sphaeroides/genetics, DNA Sequence Analysis/methods, Software
7.
Phys Rev Lett ; 113(8): 084101, 2014 Aug 22.
Article in English | MEDLINE | ID: mdl-25192099

ABSTRACT

The character of the time-asymptotic evolution of physical systems can have complex, singular behavior with variation of a system parameter, particularly when chaos is involved. A perturbation of the parameter by a small amount ε can convert an attractor from chaotic to nonchaotic or vice versa. We call a parameter value where this can happen ε uncertain. The probability that a random choice of the parameter is ε uncertain commonly scales like a power law in ε. Surprisingly, two seemingly similar ways of defining this scaling, both of physical interest, yield different numerical values for the scaling exponent. We show why this happens and present a quantitative analysis of this phenomenon.
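
The notion of an ε-uncertain parameter can be explored numerically. The sketch below makes several assumptions that are not in the abstract: it uses the logistic map x -> r x (1 - x) as the example system, labels a parameter "chaotic" when a finite-time Lyapunov exponent estimate is positive, and tests only the two perturbations r + ε and r - ε. It is a rough illustration of the definition, not the authors' analysis, and the finite-time estimate can mislabel parameters close to a transition.

```python
import math
import random


def lyapunov_logistic(r, n_transient=500, n_iter=2000, x0=0.3):
    """Estimate the Lyapunov exponent of x -> r*x*(1-x) along one orbit."""
    x = x0
    for _ in range(n_transient):
        x = r * x * (1.0 - x)
    total = 0.0
    for _ in range(n_iter):
        x = r * x * (1.0 - x)
        # The tiny offset avoids log(0) if the orbit lands exactly on x = 0.5.
        total += math.log(abs(r * (1.0 - 2.0 * x)) + 1e-300)
    return total / n_iter


def is_chaotic(r):
    """Crude label: positive finite-time exponent means 'chaotic'."""
    return lyapunov_logistic(r) > 0.0


def is_eps_uncertain(r, eps):
    """True if shifting r by +eps or -eps flips the chaotic/nonchaotic label."""
    here = is_chaotic(r)
    return is_chaotic(r + eps) != here or is_chaotic(r - eps) != here


def uncertain_fraction(eps, r_min=3.6, r_max=3.95, samples=200, seed=0):
    """Fraction of randomly chosen parameters that are eps-uncertain."""
    if r_min - eps < 0.0 or r_max + eps > 4.0:
        raise ValueError("perturbed parameters must stay inside [0, 4]")
    rng = random.Random(seed)
    hits = sum(is_eps_uncertain(rng.uniform(r_min, r_max), eps)
               for _ in range(samples))
    return hits / samples


for eps in (1e-2, 1e-3, 1e-4):
    print(eps, uncertain_fraction(eps, samples=100))
```

Fitting a straight line to log(fraction) against log(ε) over a wider range of ε would give an estimate of the scaling exponent under this particular definition.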

8.
Proc Natl Acad Sci U S A ; 108(14): 5673-8, 2011 Apr 05.
Article in English | MEDLINE | ID: mdl-21282631

ABSTRACT

Ants are some of the most abundant and familiar animals on Earth, and they play vital roles in most terrestrial ecosystems. Although all ants are eusocial, and display a variety of complex and fascinating behaviors, few genomic resources exist for them. Here, we report the draft genome sequence of a particularly widespread and well-studied species, the invasive Argentine ant (Linepithema humile), which was accomplished using a combination of 454 (Roche) and Illumina sequencing and community-based funding rather than federal grant support. Manual annotation of >1,000 genes from a variety of different gene families and functional classes reveals unique features of the Argentine ant's biology, as well as similarities to Apis mellifera and Nasonia vitripennis. Distinctive features of the Argentine ant genome include remarkable expansions of gustatory (116 genes) and odorant receptors (367 genes), an abundance of cytochrome P450 genes (>110), lineage-specific expansions of yellow/major royal jelly proteins and desaturases, and complete CpG DNA methylation and RNAi toolkits. The Argentine ant genome contains fewer immune genes than Drosophila and Tribolium, which may reflect the prominent role played by behavioral and chemical suppression of pathogens. Analysis of the ratio of observed to expected CpG nucleotides for genes in the reproductive development and apoptosis pathways suggests higher levels of methylation than in the genome overall. The resources provided by this genome sequence will offer an abundance of tools for researchers seeking to illuminate the fascinating biology of this emerging model organism.


Subjects
Ants/genetics, Insect Genome/genetics, Genomics/methods, Phylogeny, Animals, Ants/physiology, Base Sequence, California, DNA Methylation, Gene Library, Population Genetics, Social Hierarchy, Molecular Sequence Data, Single Nucleotide Polymorphism/genetics, Odorant Receptors/genetics, DNA Sequence Analysis
9.
Chaos ; 23(3): 033113, 2013 Sep.
Article in English | MEDLINE | ID: mdl-24089949

ABSTRACT

A period-doubling cascade is often seen in numerical studies of those smooth (one-parameter families of) maps for which, as the parameter is varied, the map transitions from one without chaos to one with chaos. Our emphasis in this paper is on establishing the existence of such a cascade for many maps with phase space dimension 2. We use continuation methods to show the following: under certain general assumptions, if at one parameter there are only finitely many periodic orbits, and at another parameter value there is chaos, then between those two parameter values there must be a cascade. We investigate only families that are generic in the sense that all periodic orbit bifurcations are generic. Our method of proof in showing there is one cascade is to show there must be infinitely many cascades. We discuss in detail two-dimensional families like those which arise as time-2π maps for the Duffing equation and the forced damped pendulum equation.
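
A period-doubling cascade is easiest to see numerically in one dimension. The sketch below uses the logistic map x -> r x (1 - x), which is an assumption made purely for illustration; the paper itself concerns two-dimensional families. The function name `attractor_period` and the chosen parameter values are hypothetical.

```python
def attractor_period(r, max_period=64, n_transient=20000, tol=1e-9, x0=0.3):
    """Return the period of the attracting orbit of x -> r*x*(1-x).

    Returns None if no period up to max_period is detected, which is
    what one expects when the orbit is chaotic.
    """
    x = x0
    # Discard the transient so the orbit settles onto the attractor.
    for _ in range(n_transient):
        x = r * x * (1.0 - x)
    start = x
    # The smallest p with x_p close to the starting point is the period.
    for p in range(1, max_period + 1):
        x = r * x * (1.0 - x)
        if abs(x - start) < tol:
            return p
    return None


# Parameters chosen inside the period-1, 2, 4, and 8 windows, then one
# value in the range usually described as chaotic.
for r in (2.8, 3.2, 3.5, 3.55, 3.9):
    print(r, attractor_period(r))
```

Sweeping r finely between 2.8 and 3.57 shows the detected period doubling repeatedly over ever-shorter parameter intervals before chaos sets in.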

10.
J Biol Dyn ; 17(1): 2259223, 2023 12.
Article in English | MEDLINE | ID: mdl-37728890

ABSTRACT

Steady states of dynamical systems, whether stable or unstable, are critical for understanding future evolution. Robust steady states, ones that persist under small changes in the model parameters, are desired when modelling ecological systems, where it is common for accurate and detailed information on functional form and parameters to be unavailable. Previous work by Jahedi et al. [Robustness of solutions of almost every system of equations, SIAM J. Appl. Math. 82(5) (2022), pp. 1791-1807; Structured systems of nonlinear equations, SIAM J. Appl. Math. 83(4) (2023), pp. 1696-1716.] has established criteria to imply the prevalence of robust steady states for systems with minimal predetermined structure, including conventional structured systems. We review that work and extend it by allowing symmetries in the system structure, which present added obstructions to robustness.


Subjects
Ecosystem, Biological Models
11.
Chaos ; 22(4): 047507, 2012 Dec.
Article in English | MEDLINE | ID: mdl-23278093

ABSTRACT

Safe sets are a basic ingredient in the strategy of partial control of chaotic systems. Recently we have found an algorithm, the sculpting algorithm, which allows us to construct them, when they exist. Here we define another type of set, an asymptotic safe set, to which trajectories are attracted asymptotically when the partial control strategy is applied. We apply all these ideas to a specific example of a Duffing oscillator showing the geometry of these sets in phase space. The software for creating all the figures appearing in this paper is available as supplementary material.

12.
Biology (Basel) ; 9(11)2020 Oct 23.
Article in English | MEDLINE | ID: mdl-33114047

ABSTRACT

As the coronavirus pandemic spreads across the globe, people are debating policies to mitigate its severity. Many complex, highly detailed models have been developed to help policy setters make better decisions. However, the basis of these models is unlikely to be understood by non-experts. We describe the advantages of simple models for COVID-19. We say a model is "simple" if its only parameter is the rate of contact between people in the population. This contact rate can vary over time, depending on choices by policy setters. Such models can be understood by a broad audience, and thus can be helpful in explaining the policy decisions to the public. They can be used to evaluate the outcomes of different policies. However, simple models have a disadvantage when dealing with inhomogeneous populations. To augment the power of a simple model to evaluate complicated situations, we add what we call "satellite" equations that do not change the original model. For example, with the help of a satellite equation, one could know what his/her chance is of remaining uninfected through the end of an epidemic. Satellite equations can model the effects of the epidemic on high-risk individuals, death rates, and nursing homes and other isolated populations. To compare simple models with complex models, we introduce our "slightly complex" Model J. We find the conclusions of simple and complex models can be quite similar. However, for each added complexity, a modeler may have to choose additional parameter values describing who will infect whom under what conditions, choices for which there is often little rationale but that can have big impacts on predictions. Our simulations suggest that the added complexity offers little predictive advantage.
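
To make the idea of a "simple" model with a time-varying contact rate concrete, the sketch below integrates a standard SIR model by forward Euler steps. It is not the authors' model: the SIR form, the fixed infectious period, the initial infected fraction, and the example contact rates are all assumptions chosen for illustration. The function `chance_uninfected` plays the role of a satellite-style calculation in the sense that it is computed from the model output without changing the model.

```python
def simulate_sir(contact_rate, days, infectious_period=5.0,
                 initial_infected=1e-4, dt=0.1):
    """Integrate an SIR model whose contact rate may vary with time.

    contact_rate: function mapping time in days to infectious contacts
    per person per day. Returns a list of (t, s, i, r) tuples, where
    s, i, r are fractions of the population.
    """
    s, i, r = 1.0 - initial_infected, initial_infected, 0.0
    history = [(0.0, s, i, r)]
    steps = int(round(days / dt))
    for n in range(steps):
        t = n * dt
        new_infections = contact_rate(t) * s * i * dt
        recoveries = i / infectious_period * dt
        s -= new_infections
        i += new_infections - recoveries
        r += recoveries
        history.append((t + dt, s, i, r))
    return history


def chance_uninfected(history):
    """Chance that an initially susceptible person is still uninfected at the end."""
    return history[-1][1] / history[0][1]


# Hypothetical policies: no intervention versus reduced contact after day 20.
no_change = simulate_sir(lambda t: 0.5, days=200)
reduced = simulate_sir(lambda t: 0.5 if t < 20 else 0.15, days=200)
print(chance_uninfected(no_change), chance_uninfected(reduced))
```

Because the policy enters only through the `contact_rate` function, comparing two policies amounts to running the same few lines twice.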

13.
Bioinformatics ; 24(4): 462-7, 2008 Feb 15.
Article in English | MEDLINE | ID: mdl-18202027

ABSTRACT

MOTIVATION: Sequences produced by automated Sanger sequencing machines frequently contain fragments of the cloning vector on their ends. Software tools currently available for identifying and removing the vector sequence require knowledge of the vector sequence, specific splice sites and any adapter sequences used in the experiment, information that is often omitted from public databases. Furthermore, the clipping coordinates themselves are missing or incorrectly reported. As an example, within the approximately 1.24 billion shotgun sequences deposited in the NCBI Trace Archive, as many as approximately 735 million (approximately 60%) lack vector clipping information. Correct clipping information is essential to scientists attempting to validate, improve and even finish the increasingly large number of genomes released at a 'draft' quality level. RESULTS: We present here Figaro, a novel software tool for identifying and removing the vector from raw sequence data without prior knowledge of the vector sequence. The vector sequence is automatically inferred by analyzing the frequency of occurrence of short oligo-nucleotides using Poisson statistics. We show that Figaro achieves 99.98% sensitivity when tested on approximately 1.5 million shotgun reads from Drosophila pseudoobscura. We further explore the impact of accurate vector trimming on the quality of whole-genome assemblies by re-assembling two bacterial genomes from shotgun sequences deposited in the Trace Archive. Designed as a module in large computational pipelines, Figaro is fast, lightweight and flexible. AVAILABILITY: Figaro is released under an open-source license through the AMOS package (http://amos.sourceforge.net/Figaro).


Subjects
Computational Biology/methods, Genetic Vectors/analysis, Genetic Vectors/genetics, DNA Sequence Analysis/methods, Software, Animals, Base Sequence, Drosophila/genetics, Bacterial Genome/genetics, Molecular Sequence Data
14.
Bioinformatics ; 24(1): 42-5, 2008 Jan 01.
Article in English | MEDLINE | ID: mdl-18057021

ABSTRACT

MOTIVATION: Many genomes are sequenced by a collaboration of several centers, and then each center produces an assembly using their own assembly software. The collaborators then pick the draft assembly that they judge to be the best and the information contained in the other assemblies is usually not used. METHODS: We have developed a technique that we call assembly reconciliation that can merge draft genome assemblies. It takes one draft assembly, detects apparent errors, and, when possible, patches the problem areas using pieces from alternative draft assemblies. It also closes gaps in places where one of the alternative assemblies has spanned the gap correctly. RESULTS: Using the Assembly Reconciliation technique, we produced reconciled assemblies of six Drosophila species in collaboration with Agencourt Bioscience and The J. Craig Venter Institute. These assemblies are now the official (CAF1) assemblies used for analysis. We also produced a reconciled assembly of the Rhesus Macaque genome, and this assembly is available from our website http://www.genome.umd.edu. AVAILABILITY: The reconciliation software is available for download from http://www.genome.umd.edu/software.htm


Subjects
Algorithms, Chromosome Mapping/methods, Contig Mapping/methods, Genome/genetics, Sequence Alignment/methods, DNA Sequence Analysis/methods, Base Sequence, Molecular Sequence Data
15.
Phys Rev E Stat Nonlin Soft Matter Phys ; 78(5 Pt 2): 056203, 2008 Nov.
Article in English | MEDLINE | ID: mdl-19113196

ABSTRACT

We describe a simple continuous-time flow such that Lyapunov exponents fail to exist at nearly every point in the phase space R^2, despite the fact that the flow admits a unique natural measure. This example illustrates that the existence of Lyapunov exponents is a subtle question for systems that are not conservative.

16.
Phys Rev E Stat Nonlin Soft Matter Phys ; 78(6 Pt 1): 061912, 2008 Dec.
Article in English | MEDLINE | ID: mdl-19256873

ABSTRACT

We study quantitative features of complex repetitive DNA in several genomes by studying sequences that are sufficiently long that they are unlikely to have repeated by chance. For each genome we study, we determine the number of identical copies, the "duplication count," of each sequence of length 40, that is of each "40-mer." We say a 40-mer is "repeated" if its duplication count is at least 2. We focus mainly on "complex" 40-mers, those without short internal repetitions. We find that we can classify most of the complex repeated 40-mers into two categories: one category has its copies clustered closely together on one chromosome, the other has its copies distributed widely across multiple chromosomes. For each genome and each of the categories above, we compute N(c), the number of 40-mers that have duplication count c, for each integer c. In each case, we observe a power-law-like decay in N(c) as c increases from 3 to 50 or higher. In particular, we find that N(c) decays much more slowly than would be predicted by evolutionary models where each 40-mer is equally likely to be duplicated. We also analyze an evolutionary model that does reflect the slow decay of N(c).
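
The quantities defined in the abstract above, the duplication count of each k-mer and the tally N(c), can be sketched in a few lines. This is a minimal illustration on a single forward strand. It does not reproduce the paper's handling of reverse complements, its filter for "complex" 40-mers, or its two repeat categories, and the function names and toy sequence are hypothetical.

```python
from collections import Counter


def duplication_counts(sequence, k=40):
    """Count how many times each k-mer occurs in the sequence."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def count_spectrum(sequence, k=40):
    """N(c): the number of distinct k-mers whose duplication count is exactly c."""
    return Counter(duplication_counts(sequence, k).values())


def repeated_kmers(sequence, k=40):
    """k-mers with duplication count at least 2, i.e. 'repeated' k-mers."""
    return {kmer for kmer, c in duplication_counts(sequence, k).items() if c >= 2}


# Toy sequence with a short k so that a repeat is visible.
toy = "ACGTACGTTT"
print(dict(count_spectrum(toy, k=4)))
print(repeated_kmers(toy, k=4))
```

On a real genome, plotting N(c) against c on log-log axes is the natural way to look for the power-law-like decay the abstract describes.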


Subjects
DNA/chemistry, DNA/genetics, Genetic Models, Animals, Base Sequence, Biophysical Phenomena, Chromosomes/genetics, Gene Duplication, Genomics, Humans, Markov Chains, Chemical Models, Multigene Family, Repetitive Nucleic Acid Sequences
17.
Phys Rev E Stat Nonlin Soft Matter Phys ; 77(5 Pt 2): 055201, 2008 May.
Article in English | MEDLINE | ID: mdl-18643119

ABSTRACT

In a region in phase space where there is a chaotic saddle, all initial conditions will escape from it after a transient with the exception of a set of points of zero Lebesgue measure. The action of an external noise makes all trajectories escape faster. Attempting to avoid those escapes by applying a control smaller than noise seems to be an impossible task. Here we show, however, that this goal is indeed possible, based on a geometrical property found typically in this situation: the existence of a horseshoe. The horseshoe implies that there exist what we call safe sets, which assures that there is a general strategy that allows one to keep trajectories inside that region with control smaller than noise. We call this type of control partial control of chaos.

18.
Philos Trans A Math Phys Eng Sci ; 375(2088)2017 Mar 06.
Article in English | MEDLINE | ID: mdl-28115608

ABSTRACT

Transient chaos is a characteristic behaviour in nonlinear dynamics where trajectories in a certain region of phase space behave chaotically for a while, before escaping to an external attractor. In some situations, the escapes are highly undesirable, so that it would be necessary to avoid such a situation. In this paper, we apply a control method known as partial control that allows one to prevent the escapes of the trajectories to the external attractors, keeping the trajectories in the chaotic region forever. We also show, for the first time, the application of this method in three dimensions, which is the major step forward in this work. To illustrate how the method works, we have chosen the Lorenz system for a choice of parameters where transient chaos appears, as a paradigmatic example in nonlinear dynamics. We analyse three quite different ways to implement the method. First, we apply this method by building a one-dimensional map using the successive maxima of one of the variables. Next, we implement it by building a two-dimensional map through a Poincaré section. Finally, we build a three-dimensional map, which has the advantage of using a fixed time interval between applications of the control, which can be useful for practical applications. This article is part of the themed issue 'Horizons of cybernetical physics'.

19.
Gigascience ; 6(10): 1, 2017 10 01.
Article in English | MEDLINE | ID: mdl-29020755

ABSTRACT

The 22-gigabase genome of loblolly pine (Pinus taeda) is one of the largest ever sequenced. The draft assembly published in 2014 was built entirely from short Illumina reads, with lengths ranging from 100 to 250 base pairs (bp). The assembly was quite fragmented, containing over 11 million contigs whose weighted average (N50) size was 8206 bp. To improve this result, we generated approximately 12-fold coverage in long reads using the Single Molecule Real Time sequencing technology developed at Pacific Biosciences. We assembled the long and short reads together using the MaSuRCA mega-reads assembly algorithm, which produced a substantially better assembly, P. taeda version 2.0. The new assembly has an N50 contig size of 25 361, more than three times as large as achieved in the original assembly, and an N50 scaffold size of 107 821, 61% larger than the previous assembly.

20.
Gigascience ; 6(1): 1-4, 2017 01 01.
Article in English | MEDLINE | ID: mdl-28369353

ABSTRACT

The 22-gigabase genome of loblolly pine (Pinus taeda) is one of the largest ever sequenced. The draft assembly published in 2014 was built entirely from short Illumina reads, with lengths ranging from 100 to 250 base pairs (bp). The assembly was quite fragmented, containing over 11 million contigs whose weighted average (N50) size was 8206 bp. To improve this result, we generated approximately 12-fold coverage in long reads using the Single Molecule Real Time sequencing technology developed at Pacific Biosciences. We assembled the long and short reads together using the MaSuRCA mega-reads assembly algorithm, which produced a substantially better assembly, P. taeda version 2.0. The new assembly has an N50 contig size of 25 361, more than three times as large as achieved in the original assembly, and an N50 scaffold size of 107 821, 61% larger than the previous assembly.


Subjects
Contig Mapping, Plant Genome, High-Throughput Nucleotide Sequencing, Pinus taeda/genetics, DNA Sequence Analysis, Algorithms, Genomics