Results 1 - 4 of 4
1.
Genome Res ; 29(4): 635-645, 2019 Apr.
Article in English | MEDLINE | ID: mdl-30894395

ABSTRACT

Large-scale population analyses coupled with advances in technology have demonstrated that the human genome is more diverse than originally thought. To date, this diversity has largely been uncovered using short-read whole-genome sequencing. However, these short-read approaches fail to give a complete picture of a genome. They struggle to identify structural events, cannot access repetitive regions, and fail to resolve the human genome into haplotypes. Here, we describe an approach that retains long-range information while maintaining the advantages of short reads. Starting from ∼1 ng of high molecular weight DNA, we produce barcoded short-read libraries. Novel informatic approaches allow for the barcoded short reads to be associated with their original long molecules, producing a novel data type known as "Linked-Reads". This approach allows for simultaneous detection of small and large variants from a single library. In this manuscript, we show the advantages of Linked-Reads over standard short-read approaches for reference-based analysis. Linked-Reads allow mapping to 38 Mb of sequence not accessible to short reads, adding sequence in 423 difficult-to-sequence genes including disease-relevant genes STRC, SMN1, and SMN2. Both Linked-Read whole-genome and whole-exome sequencing identify complex structural variations, including balanced events and single exon deletions and duplications. Further, Linked-Reads extend the region of high-confidence calls by 68.9 Mb. The data presented here show that Linked-Reads provide a scalable approach for comprehensive genome analysis that is not possible using short reads alone.
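
The idea of associating barcoded short reads with their original long molecules can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `group_reads_into_molecules`, the `(barcode, chrom, pos)` read format, and the 50 kb `max_gap` threshold are all assumptions introduced here for illustration. The sketch simply groups aligned reads by barcode and chromosome, then splits each group wherever two neighboring reads are farther apart than `max_gap`.

```python
from collections import defaultdict


def group_reads_into_molecules(reads, max_gap=50_000):
    """Cluster aligned reads that share a barcode into inferred molecules.

    reads: iterable of (barcode, chrom, pos) tuples (hypothetical format).
    max_gap: assumed maximum distance between neighboring reads
             that still belong to the same molecule.
    Returns a list of (barcode, chrom, start, end, n_reads) tuples.
    """
    # Collect read positions per (barcode, chromosome) pair.
    by_key = defaultdict(list)
    for barcode, chrom, pos in reads:
        by_key[(barcode, chrom)].append(pos)

    molecules = []
    for (barcode, chrom), positions in sorted(by_key.items()):
        positions.sort()
        start = prev = positions[0]
        count = 1
        for pos in positions[1:]:
            if pos - prev > max_gap:
                # Gap too large: close the current molecule, start a new one.
                molecules.append((barcode, chrom, start, prev, count))
                start = pos
                count = 0
            prev = pos
            count += 1
        molecules.append((barcode, chrom, start, prev, count))
    return molecules
```

In this toy model, two reads with the same barcode that map 20 kb apart are assigned to one molecule, while a third read 480 kb further along starts a new one. A real implementation would also have to handle barcode sequencing errors and multi-mapping reads, which are ignored here.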


Subjects
Genome-Wide Association Study/methods, Genetic Polymorphism, Whole Genome Sequencing/methods, Cell Line, Human Genome, Humans, Intercellular Signaling Peptides and Proteins, Membrane Proteins/genetics, Survival of Motor Neuron 1 Protein/genetics, Survival of Motor Neuron 2 Protein/genetics
2.
PLoS Biol ; 7(5): e1000112, 2009 May 05.
Article in English | MEDLINE | ID: mdl-19468303

ABSTRACT

The mouse (Mus musculus) is the premier animal model for understanding human disease and development. Here we show that a comprehensive understanding of mouse biology is only possible with the availability of a finished, high-quality genome assembly. The finished clone-based assembly of the mouse strain C57BL/6J reported here has over 175,000 fewer gaps and over 139 Mb more of novel sequence, compared with the earlier MGSCv3 draft genome assembly. In a comprehensive analysis of this revised genome sequence, we are now able to define 20,210 protein-coding genes, over a thousand more than predicted in the human genome (19,042 genes). In addition, we identified 439 long, non-protein-coding RNAs with evidence for transcribed orthologs in human. We analyzed the complex and repetitive landscape of 267 Mb of sequence that was missing or misassembled in the previously published assembly, and we provide insights into the reasons for its resistance to sequencing and assembly by whole-genome shotgun approaches. Duplicated regions within newly assembled sequence tend to be of more recent ancestry than duplicates in the published draft, correcting our initial understanding of recent evolution on the mouse lineage. These duplicates appear to be largely composed of sequence regions containing transposable elements and duplicated protein-coding genes; of these, some may be fixed in the mouse population, but at least 40% of segmentally duplicated sequences are copy number variable even among laboratory mouse strains. Mouse lineage-specific regions contain 3,767 genes drawn mainly from rapidly-changing gene families associated with reproductive functions. The finished mouse genome assembly, therefore, greatly improves our understanding of rodent-specific biology and allows the delineation of ancestral biological functions that are shared with human from derived functions that are not.


Subjects
Computational Biology/methods, Genome/genetics, Animals, Genetic Databases, Gene Duplication, Genome/physiology, Humans, Mice
3.
Nat Protoc ; 2(3): 677-84, 2007.
Article in English | MEDLINE | ID: mdl-17406630

ABSTRACT

This protocol describes pulsed-field gel electrophoresis (PFGE), a method developed for separation of large DNA molecules. Whereas standard DNA gel electrophoresis commonly resolves fragments up to approximately 50 kb in size, PFGE fractionates DNA molecules up to 10 Mb. The mechanism driving these separations exploits the fact that very large DNA molecules unravel and "snake" through a gel matrix, and such electrophoretic trajectories are perturbed in a size-dependent manner by carefully oriented electrical pulses. PFGE has enabled the rapid genomic analysis of microbes and mammalian cells, and motivated development of large-insert cloning systems such as bacterial and yeast artificial chromosomes. As such, this protocol includes descriptions of two types of PFGE instrumentation (not commercially available), along with detailed instructions for their operation. Additionally, this protocol provides basic instructions for the preparation of intact chromosomal DNA from several types of organisms. PFGE takes 2-3 days, excluding sample preparation.


Subjects
DNA/isolation & purification, Pulsed-Field Gel Electrophoresis/instrumentation, Pulsed-Field Gel Electrophoresis/methods
4.
Plant Physiol ; 137(1): 13-30, 2005 Jan.
Article in English | MEDLINE | ID: mdl-15644464

ABSTRACT

Approximately 5% of the Arabidopsis (Arabidopsis thaliana) proteome is predicted to be involved in the ubiquitination/26S proteasome pathway. The majority of these predicted proteins have identity to conserved domains found in E3 ligases, of which there are multiple types. The RING-type E3 is characterized by the presence of a cysteine-rich domain that coordinates two zinc atoms. Database searches followed by extensive manual curation identified 469 predicted Arabidopsis RING domain-containing proteins. In addition to the two canonical RING types (C3H2C3 or C3HC4), additional types of modified RING domains, named RING-v, RING-D, RING-S/T, RING-G, and RING-C2, were identified. The modified RINGs either differ in the spacing between metal ligands or have substitutions at one or more of the metal ligand positions. The majority of the canonical and modified RING domain-containing proteins analyzed were active in in vitro ubiquitination assays, catalyzing polyubiquitination with the E2 AtUBC8. To help identify regions of the proteins that may interact with substrates, domain analyses of the amino acids outside the RING domain classified RING proteins into 30 different groups. Several characterized protein-protein interaction domains were identified, as well as additional conserved domains not described previously. The two largest classes of RING proteins contain either no identifiable domain or a transmembrane domain. The presence of such a large and diverse number of RING domain-containing proteins that function as ubiquitin E3 ligases suggests that target-specific proteolysis by these E3 ligases is a complex and important part of cellular regulation in Arabidopsis.
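
The distinction between the two canonical RING types (C3H2C3 and C3HC4) can be sketched as a pattern match on the eight metal-ligand positions. This is an illustration only, not the database search and manual curation described in the abstract. The spacing used below, C-X2-C-X(9-39)-C-X(1-3)-H-X(2-3)-[C/H]-X2-C-X(4-48)-C-X2-C, is a commonly cited consensus that is assumed here rather than taken from the abstract, and the function name `classify_canonical_ring` is hypothetical.

```python
import re

# Assumed canonical RING consensus; the fifth metal ligand (captured group)
# is C in the C3HC4 type and H in the C3H2C3 type.
RING_PATTERN = re.compile(
    r"C.{2}C.{9,39}C.{1,3}H.{2,3}([CH]).{2}C.{4,48}C.{2}C"
)


def classify_canonical_ring(protein_sequence):
    """Return 'C3HC4', 'C3H2C3', or None for a one-letter protein sequence.

    Only the two canonical types are recognized; the modified RING domains
    named in the abstract (RING-v, RING-D, and so on) are not matched.
    """
    match = RING_PATTERN.search(protein_sequence.upper())
    if match is None:
        return None
    return "C3HC4" if match.group(1) == "C" else "C3H2C3"
```

A pattern like this is permissive and will produce false positives on cysteine-rich sequences that are not RING domains, which is one reason a real survey would need curation on top of any automated search.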


Subjects
Arabidopsis/enzymology, Ubiquitin-Protein Ligases/chemistry, Amino Acid Sequence, Arabidopsis Proteins/chemistry, Consensus Sequence, Ligands, Metals, Molecular Sequence Data, Protein Tertiary Structure, Ubiquitin-Protein Ligases/metabolism