Search | VHL Regional Portal

1.

DNA read count calibration for single-molecule, long-read sequencing.

Soares, Luis M M; Hanscom, Terrence; Selby, Donald E; Adjei, Samuel; Wang, Wei; Przybylski, Dariusz; Thompson, John F.

Sci Rep ; 12(1): 17257, 2022 11 01.

Article in English | MEDLINE | ID: mdl-36319642

ABSTRACT

There are many applications in which quantitative information about DNA mixtures with different molecular lengths is important. Gene therapy vectors are much longer than can be sequenced individually via short-read NGS. However, vector preparations may contain smaller DNAs that behave differently during sequencing. We have used two library preparations each for Pacific Biosciences (PacBio) and Oxford Nanopore Technologies NGS to determine their suitability for quantitative assessment of varying sized DNAs. Equimolar length standards were generated from E. coli genomic DNA. Both PacBio library preparations provided a consistent length dependence though with a complex pattern. This method is sufficiently sensitive that differences in genomic copy number between DNA from E. coli grown in exponential and stationary phase conditions could be detected. The transposase-based Oxford Nanopore library preparation provided a predictable length dependence, but the random sequence starts caused the loss of original length information. The ligation-based approach retained length information but read frequency was more variable. Modeling of E. coli versus lambda read frequency via cubic spline smoothing showed that the shorter genome could be used as a suitable internal spike-in for DNAs in the 200 bp to 10 kb range, allowing meaningful QC to be carried out with AAV preparations.

Subjects

Escherichia coli , High-Throughput Nucleotide Sequencing , High-Throughput Nucleotide Sequencing/methods , Calibration , Sequence Analysis, DNA/methods , DNA

2.

A transcriptomic taxonomy of Drosophila circadian neurons around the clock.

Ma, Dingbang; Przybylski, Dariusz; Abruzzi, Katharine C; Schlichting, Matthias; Li, Qunlong; Long, Xi; Rosbash, Michael.

Elife ; 10, 2021 01 13.

Article in English | MEDLINE | ID: mdl-33438579

ABSTRACT

Many different functions are regulated by circadian rhythms, including those orchestrated by discrete clock neurons within animal brains. To comprehensively characterize and assign cell identity to the 75 pairs of Drosophila circadian neurons, we optimized a single-cell RNA sequencing method and assayed clock neuron gene expression at different times of day. The data identify at least 17 clock neuron categories with striking spatial regulation of gene expression. Transcription factor regulation is prominent and likely contributes to the robust circadian oscillation of many transcripts, including those that encode cell-surface proteins previously shown to be important for cell recognition and synapse formation during development. The many other clock-regulated genes also constitute an important resource for future mechanistic and functional studies between clock neurons and/or for temporal signaling to circuits elsewhere in the fly brain.

Subjects

Biological Clocks , Circadian Rhythm , Drosophila melanogaster/physiology , Gene Expression Regulation , Neurons/physiology , Transcriptome , Animals , Drosophila melanogaster/genetics , Female , Male , Time Factors

3.

Lipid availability determines fate of skeletal progenitor cells via SOX9.

van Gastel, Nick; Stegen, Steve; Eelen, Guy; Schoors, Sandra; Carlier, Aurélie; Daniëls, Veerle W; Baryawno, Ninib; Przybylski, Dariusz; Depypere, Maarten; Stiers, Pieter-Jan; Lambrechts, Dennis; Van Looveren, Riet; Torrekens, Sophie; Sharda, Azeem; Agostinis, Patrizia; Lambrechts, Diether; Maes, Frederik; Swinnen, Johan V; Geris, Liesbet; Van Oosterwyck, Hans; Thienpont, Bernard; Carmeliet, Peter; Scadden, David T; Carmeliet, Geert.

Nature ; 579(7797): 111-117, 2020 03.

Article in English | MEDLINE | ID: mdl-32103177

ABSTRACT

The avascular nature of cartilage makes it a unique tissue, but whether and how the absence of nutrient supply regulates chondrogenesis remain unknown. Here we show that obstruction of vascular invasion during bone healing favours chondrogenic over osteogenic differentiation of skeletal progenitor cells. Unexpectedly, this process is driven by a decreased availability of extracellular lipids. When lipids are scarce, skeletal progenitors activate forkhead box O (FOXO) transcription factors, which bind to the Sox9 promoter and increase its expression. Besides initiating chondrogenesis, SOX9 acts as a regulator of cellular metabolism by suppressing oxidation of fatty acids, and thus adapts the cells to an avascular life. Our results define lipid scarcity as an important determinant of chondrogenic commitment, reveal a role for FOXO transcription factors during lipid starvation, and identify SOX9 as a critical metabolic mediator. These data highlight the importance of the nutritional microenvironment in the specification of skeletal cell fate.

Subjects

Bone and Bones/cytology , Cellular Microenvironment , Chondrogenesis , Lipid Metabolism , SOX9 Transcription Factor/metabolism , Stem Cells/cytology , Stem Cells/metabolism , Animals , Bone and Bones/blood supply , Chondrocytes/cytology , Chondrocytes/metabolism , Fatty Acids/metabolism , Female , Food Deprivation , Forkhead Transcription Factors/metabolism , Male , Mice , Mice, Inbred C57BL , Osteogenesis , Oxidation-Reduction , SOX9 Transcription Factor/genetics , Signal Transduction , Wound Healing

4.

A Cellular Taxonomy of the Bone Marrow Stroma in Homeostasis and Leukemia.

Baryawno, Ninib; Przybylski, Dariusz; Kowalczyk, Monika S; Kfoury, Youmna; Severe, Nicolas; Gustafsson, Karin; Kokkaliaris, Konstantinos D; Mercier, Francois; Tabaka, Marcin; Hofree, Matan; Dionne, Danielle; Papazian, Ani; Lee, Dongjun; Ashenberg, Orr; Subramanian, Ayshwarya; Vaishnav, Eeshit Dhaval; Rozenblatt-Rosen, Orit; Regev, Aviv; Scadden, David T.

Cell ; 177(7): 1915-1932.e16, 2019 06 13.

Article in English | MEDLINE | ID: mdl-31130381

ABSTRACT

Stroma is a poorly defined non-parenchymal component of virtually every organ with key roles in organ development, homeostasis, and repair. Studies of the bone marrow stroma have defined individual populations in the stem cell niche regulating hematopoietic regeneration and capable of initiating leukemia. Here, we use single-cell RNA sequencing (scRNA-seq) to define a cellular taxonomy of the mouse bone marrow stroma and its perturbation by malignancy. We identified seventeen stromal subsets expressing distinct hematopoietic regulatory genes spanning new fibroblastic and osteoblastic subpopulations including distinct osteoblast differentiation trajectories. Emerging acute myeloid leukemia impaired mesenchymal osteogenic differentiation and reduced regulatory molecules necessary for normal hematopoiesis. These data suggest that tissue stroma responds to malignant cells by disadvantaging normal parenchymal cells. Our taxonomy of the stromal compartment provides a comprehensive bone marrow cell census and experimental support for cancer cell crosstalk with specific stromal elements to impair normal tissue function and thereby enable emergent cancer.

Subjects

Bone Marrow Cells/metabolism , Cell Differentiation , Homeostasis , Leukemia, Myeloid, Acute/metabolism , Osteoblasts/metabolism , Osteogenesis , Tumor Microenvironment , Animals , Bone Marrow Cells/pathology , Humans , Leukemia, Myeloid, Acute/pathology , Mice , Osteoblasts/pathology , Stromal Cells/metabolism , Stromal Cells/pathology

5.

Fas Promotes T Helper 17 Cell Differentiation and Inhibits T Helper 1 Cell Development by Binding and Sequestering Transcription Factor STAT1.

Meyer Zu Horste, Gerd; Przybylski, Dariusz; Schramm, Markus A; Wang, Chao; Schnell, Alexandra; Lee, Youjin; Sobel, Raymond; Regev, Aviv; Kuchroo, Vijay K.

Immunity ; 48(3): 556-569.e7, 2018 03 20.

Article in English | MEDLINE | ID: mdl-29562202

ABSTRACT

The death receptor Fas removes activated lymphocytes through apoptosis. Previous transcriptional profiling predicted that Fas positively regulates interleukin-17 (IL-17)-producing T helper 17 (Th17) cells. Here, we demonstrate that Fas promoted the generation and stability of Th17 cells and prevented their differentiation into Th1 cells. Mice with T-cell- and Th17-cell-specific deletion of Fas were protected from induced autoimmunity, and Th17 cell differentiation and stability were impaired. Fas-deficient Th17 cells instead developed a Th1-cell-like transcriptional profile, which a new algorithm predicted to depend on STAT1. Experimentally, Fas indeed bound and sequestered STAT1, and Fas deficiency enhanced IL-6-induced STAT1 activation and nuclear translocation, whereas deficiency of STAT1 reversed the transcriptional changes induced by Fas deficiency. Thus, our computational and experimental approach identified Fas as a regulator of the Th17-to-Th1 cell balance by controlling the availability of opposing STAT1 and STAT3 to have a direct impact on autoimmunity.

Subjects

Cell Differentiation/immunology , STAT1 Transcription Factor/metabolism , Th1 Cells/immunology , Th1 Cells/metabolism , Th17 Cells/immunology , Th17 Cells/metabolism , fas Receptor/metabolism , Animals , Apoptosis/immunology , Biomarkers , Caspases/metabolism , Gene Expression Profiling , Gene Knockout Techniques , Lymphocyte Activation , Mice , Phenotype , Phosphorylation , Protein Binding , Protein Transport , STAT3 Transcription Factor/metabolism , Th17 Cells/cytology , Transcriptome , fas Receptor/genetics

6.

Cancer immunogenomic approach to neoantigen discovery in a checkpoint blockade responsive murine model of oral cavity squamous cell carcinoma.

Zolkind, Paul; Przybylski, Dariusz; Marjanovic, Nemanja; Nguyen, Lan; Lin, Tianxiang; Johanns, Tanner; Alexandrov, Anton; Zhou, Liye; Allen, Clint T; Miceli, Alexander P; Schreiber, Robert D; Artyomov, Maxim; Dunn, Gavin P; Uppaluri, Ravindra.

Oncotarget ; 9(3): 4109-4119, 2018 Jan 09.

Article in English | MEDLINE | ID: mdl-29423108

ABSTRACT

Head and neck squamous cell carcinomas (HNSCC) are an ideal immunotherapy target due to their high mutation burden and frequent infiltration with lymphocytes. Preclinical models to investigate targeted and combination therapies as well as defining biomarkers to guide treatment represent an important need in the field. Immunogenomics approaches have illuminated the role of mutation-derived tumor neoantigens as potential biomarkers of response to checkpoint blockade as well as representing therapeutic vaccines. Here, we aimed to define a platform for checkpoint and other immunotherapy studies using syngeneic HNSCC cell line models (MOC2 and MOC22), and evaluated the association between mutation burden, predicted neoantigen landscape, infiltrating T cell populations and responsiveness of tumors to anti-PD1 therapy. We defined dramatic hematopoietic cell transcriptomic alterations in the MOC22 anti-PD1 responsive model in both tumor and draining lymph nodes. Using a cancer immunogenomics pipeline and validation with ELISPOT and tetramer analysis, we identified the H-2Kb-restricted ICAM1P315L (mICAM1) as a neoantigen in MOC22. Finally, we demonstrated that mICAM1 vaccination was able to protect against MOC22 tumor development defining mICAM1 as a bona fide neoantigen. Together these data define a pre-clinical HNSCC model system that provides a foundation for future investigations into combination and novel therapeutics.

7.

An Integrative Framework Reveals Signaling-to-Transcription Events in Toll-like Receptor Signaling.

Mertins, Philipp; Przybylski, Dariusz; Yosef, Nir; Qiao, Jana; Clauser, Karl; Raychowdhury, Raktima; Eisenhaure, Thomas M; Maritzen, Tanja; Haucke, Volker; Satoh, Takashi; Akira, Shizuo; Carr, Steven A; Regev, Aviv; Hacohen, Nir; Chevrier, Nicolas.

Cell Rep ; 19(13): 2853-2866, 2017 06 27.

Article in English | MEDLINE | ID: mdl-28658630

ABSTRACT

Building an integrated view of cellular responses to environmental cues remains a fundamental challenge due to the complexity of intracellular networks in mammalian cells. Here, we introduce an integrative biochemical and genetic framework to dissect signal transduction events using multiple data types and, in particular, to unify signaling and transcriptional networks. Using the Toll-like receptor (TLR) system as a model cellular response, we generate multifaceted datasets on physical, enzymatic, and functional interactions and integrate these data to reveal biochemical paths that connect TLR4 signaling to transcription. We define the roles of proximal TLR4 kinases, identify and functionally test two dozen candidate regulators, and demonstrate a role for Ap1ar (encoding the Gadkin protein) and its binding partner, Picalm, potentially linking vesicle transport with pro-inflammatory responses. Our study thus demonstrates how deciphering dynamic cellular responses by integrating datasets on various regulatory layers defines key components and higher-order logic underlying signaling-to-transcription pathways.

Subjects

Dendritic Cells/metabolism , Toll-Like Receptors/metabolism , Humans , Phosphorylation , Signal Transduction

8.

A Genome-wide CRISPR Screen in Primary Immune Cells to Dissect Regulatory Networks.

Parnas, Oren; Jovanovic, Marko; Eisenhaure, Thomas M; Herbst, Rebecca H; Dixit, Atray; Ye, Chun Jimmie; Przybylski, Dariusz; Platt, Randall J; Tirosh, Itay; Sanjana, Neville E; Shalem, Ophir; Satija, Rahul; Raychowdhury, Raktima; Mertins, Philipp; Carr, Steven A; Zhang, Feng; Hacohen, Nir; Regev, Aviv.

Cell ; 162(3): 675-86, 2015 Jul 30.

Article in English | MEDLINE | ID: mdl-26189680

ABSTRACT

Finding the components of cellular circuits and determining their functions systematically remains a major challenge in mammalian cells. Here, we introduced genome-wide pooled CRISPR-Cas9 libraries into dendritic cells (DCs) to identify genes that control the induction of tumor necrosis factor (Tnf) by bacterial lipopolysaccharide (LPS), a key process in the host response to pathogens, mediated by the Tlr4 pathway. We found many of the known regulators of Tlr4 signaling, as well as dozens of previously unknown candidates that we validated. By measuring protein markers and mRNA profiles in DCs that are deficient in known or candidate genes, we classified the genes into three functional modules with distinct effects on the canonical responses to LPS and highlighted functions for the PAF complex and oligosaccharyltransferase (OST) complex. Our findings uncover new facets of innate immune circuits in primary cells and provide a genetic approach for dissection of mammalian cell circuits.

Subjects

CRISPR-Cas Systems , Genetic Techniques , Immunity, Innate , Animals , Bone Marrow Cells/immunology , Cell Differentiation , Cell Survival , Dendritic Cells/cytology , Dendritic Cells/immunology , Gene Knockout Techniques , Gene Regulatory Networks , Hexosyltransferases/metabolism , Membrane Proteins/metabolism , Mice , Mice, Transgenic , Toll-Like Receptor 4/immunology , Tumor Necrosis Factor-alpha/immunology

9.

Immunogenetics. Dynamic profiling of the protein life cycle in response to pathogens.

Jovanovic, Marko; Rooney, Michael S; Mertins, Philipp; Przybylski, Dariusz; Chevrier, Nicolas; Satija, Rahul; Rodriguez, Edwin H; Fields, Alexander P; Schwartz, Schraga; Raychowdhury, Raktima; Mumbach, Maxwell R; Eisenhaure, Thomas; Rabani, Michal; Gennert, Dave; Lu, Diana; Delorey, Toni; Weissman, Jonathan S; Carr, Steven A; Hacohen, Nir; Regev, Aviv.

Science ; 347(6226): 1259038, 2015 Mar 06.

Article in English | MEDLINE | ID: mdl-25745177

ABSTRACT

Protein expression is regulated by the production and degradation of messenger RNAs (mRNAs) and proteins, but their specific relationships remain unknown. We combine measurements of protein production and degradation and mRNA dynamics so as to build a quantitative genomic model of the differential regulation of gene expression in lipopolysaccharide-stimulated mouse dendritic cells. Changes in mRNA abundance play a dominant role in determining most dynamic fold changes in protein levels. Conversely, the preexisting proteome of proteins performing basic cellular functions is remodeled primarily through changes in protein production or degradation, accounting for more than half of the absolute change in protein molecules in the cell. Thus, the proteome is regulated by transcriptional induction for newly activated cellular functions and by protein life-cycle changes for remodeling of preexisting functions.

Subjects

Bone Marrow Cells/immunology , Dendritic Cells/immunology , Host-Pathogen Interactions/immunology , Molecular Dynamics Simulation , Protein Biosynthesis , Proteolysis , Amino Acids/chemistry , Amino Acids/metabolism , Animals , Cell Culture Techniques , Isotope Labeling/methods , Lipopolysaccharides/immunology , Mice , Mitochondrial Proteins/metabolism , RNA, Messenger/biosynthesis , RNA, Messenger/genetics , Sequence Analysis, RNA

10.

Changes in nucleosome occupancy associated with metabolic alterations in aged mammalian liver.

Bochkis, Irina M; Przybylski, Dariusz; Chen, Jenny; Regev, Aviv.

Cell Rep ; 9(3): 996-1006, 2014 Nov 06.

Article in English | MEDLINE | ID: mdl-25437555

ABSTRACT

Aging is accompanied by physiological impairments, which, in insulin-responsive tissues, including the liver, predispose individuals to metabolic disease. However, the molecular mechanisms underlying these changes remain largely unknown. Here, we analyze genome-wide profiles of RNA and chromatin organization in the liver of young (3 months) and old (21 months) mice. Transcriptional changes suggest that derepression of the nuclear receptors PPARα, PPARγ, and LXRα in aged mouse liver leads to activation of targets regulating lipid synthesis and storage, whereas age-dependent changes in nucleosome occupancy are associated with binding sites for both known regulators (forkhead factors and nuclear receptors) and candidates associated with nuclear lamina (Hdac3 and Srf) implicated to govern metabolic function of aging liver. Winged-helix transcription factor Foxa2 and nuclear receptor corepressor Hdac3 exhibit a reciprocal binding pattern at PPARα targets contributing to gene expression changes that lead to steatosis in aged liver.

Subjects

Aging/metabolism , Liver/growth & development , Liver/metabolism , Mammals/metabolism , Nucleosomes/metabolism , Animals , Base Sequence , DNA-Binding Proteins/metabolism , Fatty Liver/pathology , Gene Expression Regulation, Developmental , Hepatocyte Nuclear Factor 3-beta/metabolism , Histone Deacetylases/metabolism , Inflammation/pathology , Liver/pathology , Male , Mice, Inbred C57BL , Models, Biological , Molecular Sequence Data , Nuclear Lamina/metabolism , PPAR alpha/metabolism , Protein Binding , Transcription Factors/metabolism

11.

The genomic substrate for adaptive radiation in African cichlid fish.

Brawand, David; Wagner, Catherine E; Li, Yang I; Malinsky, Milan; Keller, Irene; Fan, Shaohua; Simakov, Oleg; Ng, Alvin Y; Lim, Zhi Wei; Bezault, Etienne; Turner-Maier, Jason; Johnson, Jeremy; Alcazar, Rosa; Noh, Hyun Ji; Russell, Pamela; Aken, Bronwen; Alföldi, Jessica; Amemiya, Chris; Azzouzi, Naoual; Baroiller, Jean-François; Barloy-Hubler, Frederique; Berlin, Aaron; Bloomquist, Ryan; Carleton, Karen L; Conte, Matthew A; D'Cotta, Helena; Eshel, Orly; Gaffney, Leslie; Galibert, Francis; Gante, Hugo F; Gnerre, Sante; Greuter, Lucie; Guyon, Richard; Haddad, Natalie S; Haerty, Wilfried; Harris, Rayna M; Hofmann, Hans A; Hourlier, Thibaut; Hulata, Gideon; Jaffe, David B; Lara, Marcia; Lee, Alison P; MacCallum, Iain; Mwaiko, Salome; Nikaido, Masato; Nishihara, Hidenori; Ozouf-Costaz, Catherine; Penman, David J; Przybylski, Dariusz; Rakotomanga, Michaelle.

Nature ; 513(7518): 375-381, 2014 Sep 18.

Article in English | MEDLINE | ID: mdl-25186727

ABSTRACT

Cichlid fishes are famous for large, diverse and replicated adaptive radiations in the Great Lakes of East Africa. To understand the molecular mechanisms underlying cichlid phenotypic diversity, we sequenced the genomes and transcriptomes of five lineages of African cichlids: the Nile tilapia (Oreochromis niloticus), an ancestral lineage with low diversity; and four members of the East African lineage: Neolamprologus brichardi/pulcher (older radiation, Lake Tanganyika), Metriaclima zebra (recent radiation, Lake Malawi), Pundamilia nyererei (very recent radiation, Lake Victoria), and Astatotilapia burtoni (riverine species around Lake Tanganyika). We found an excess of gene duplications in the East African lineage compared to tilapia and other teleosts, an abundance of non-coding element divergence, accelerated coding sequence evolution, expression divergence associated with transposable element insertions, and regulation by novel microRNAs. In addition, we analysed sequence data from sixty individuals representing six closely related species from Lake Victoria, and show genome-wide diversifying selection on coding and regulatory variants, some of which were recruited from ancient polymorphisms. We conclude that a number of molecular mechanisms shaped East African cichlid genomes, and that amassing of standing variation during periods of relaxed purifying selection may have been important in facilitating subsequent evolutionary diversification.

Subjects

Cichlids/classification , Cichlids/genetics , Evolution, Molecular , Genetic Speciation , Genome/genetics , Africa, Eastern , Animals , DNA Transposable Elements/genetics , Gene Duplication/genetics , Gene Expression Regulation/genetics , Genomics , Lakes , MicroRNAs/genetics , Phylogeny , Polymorphism, Genetic/genetics

12.

Finished bacterial genomes from shotgun sequence data.

Ribeiro, Filipe J; Przybylski, Dariusz; Yin, Shuangye; Sharpe, Ted; Gnerre, Sante; Abouelleil, Amr; Berlin, Aaron M; Montmayeur, Anna; Shea, Terrance P; Walker, Bruce J; Young, Sarah K; Russ, Carsten; Nusbaum, Chad; MacCallum, Iain; Jaffe, David B.

Genome Res ; 22(11): 2270-7, 2012 Nov.

Article in English | MEDLINE | ID: mdl-22829535

ABSTRACT

Exceptionally accurate genome reference sequences have proven to be of great value to microbial researchers. Thus, to date, about 1800 bacterial genome assemblies have been "finished" at great expense with the aid of manual laboratory and computational processes that typically iterate over a period of months or even years. By applying a new laboratory design and new assembly algorithm to 16 samples, we demonstrate that assemblies exceeding finished quality can be obtained from whole-genome shotgun data and automated computation. Cost and time requirements are thus dramatically reduced.

Subjects

Bacteria/genetics , Genome, Bacterial , Genomic Library , Sequence Analysis, DNA/methods , Algorithms

13.

High-quality draft assemblies of mammalian genomes from massively parallel sequence data.

Gnerre, Sante; Maccallum, Iain; Przybylski, Dariusz; Ribeiro, Filipe J; Burton, Joshua N; Walker, Bruce J; Sharpe, Ted; Hall, Giles; Shea, Terrance P; Sykes, Sean; Berlin, Aaron M; Aird, Daniel; Costello, Maura; Daza, Riza; Williams, Louise; Nicol, Robert; Gnirke, Andreas; Nusbaum, Chad; Lander, Eric S; Jaffe, David B.

Proc Natl Acad Sci U S A ; 108(4): 1513-8, 2011 Jan 25.

Article in English | MEDLINE | ID: mdl-21187386

ABSTRACT

Massively parallel DNA sequencing technologies are revolutionizing genomics by making it possible to generate billions of relatively short (~100-base) sequence reads at very low cost. Whereas such data can be readily used for a wide range of biomedical applications, it has proven difficult to use them to generate high-quality de novo genome assemblies of large, repeat-rich vertebrate genomes. To date, the genome assemblies generated from such data have fallen far short of those obtained with the older (but much more expensive) capillary-based sequencing approach. Here, we report the development of an algorithm for genome assembly, ALLPATHS-LG, and its application to massively parallel DNA sequence data from the human and mouse genomes, generated on the Illumina platform. The resulting draft genome assemblies have good accuracy, short-range contiguity, long-range connectivity, and coverage of the genome. In particular, the base accuracy is high (≥99.95%) and the scaffold sizes (N50 size = 11.5 Mb for human and 7.2 Mb for mouse) approach those obtained with capillary-based sequencing. The combination of improved sequencing technology and improved computational methods should now make it possible to increase dramatically the de novo sequencing of large genomes. The ALLPATHS-LG program is available at http://www.broadinstitute.org/science/programs/genome-biology/crd.
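The N50 figures quoted in this abstract are a standard contiguity statistic: the length L such that sequences of length L or longer together hold at least half of all assembled bases. The sketch below shows one common way to compute it. It is an illustration only, not code from ALLPATHS-LG; the function name `n50` and the toy lengths in the usage note are hypothetical.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    N50 is the largest length L such that sequences of length >= L
    account for at least half of the total assembled bases.
    """
    lengths = list(lengths)
    if not lengths:
        raise ValueError("need at least one sequence length")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down, accumulating bases.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:  # integer test for "at least half"
            return length


if __name__ == "__main__":
    # Toy lengths (hypothetical, not taken from the paper).
    print(n50([2, 3, 4, 5, 6]))
```

With the toy input the total is 20 bases; the two longest sequences (6 and 5) already cover 11 of them, so the function returns 5.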

Subjects

Algorithms , Genomics/methods , Sequence Analysis, DNA/methods , Software , Animals , Genome/genetics , Humans , Internet , Mice , Reproducibility of Results

14.

ALLPATHS 2: small genomes assembled accurately and with high continuity from short paired reads.

Maccallum, Iain; Przybylski, Dariusz; Gnerre, Sante; Burton, Joshua; Shlyakhter, Ilya; Gnirke, Andreas; Malek, Joel; McKernan, Kevin; Ranade, Swati; Shea, Terrance P; Williams, Louise; Young, Sarah; Nusbaum, Chad; Jaffe, David B.

Genome Biol ; 10(10): R103, 2009.

Article in English | MEDLINE | ID: mdl-19796385

ABSTRACT

We demonstrate that genome sequences approaching finished quality can be generated from short paired reads. Using 36 base (fragment) and 26 base (jumping) reads from five microbial genomes of varied GC composition and sizes up to 40 Mb, ALLPATHS2 generated assemblies with long, accurate contigs and scaffolds. Velvet and EULER-SR were less accurate. For example, for Escherichia coli, the fraction of 10-kb stretches that were perfect was 99.8% (ALLPATHS2), 68.7% (Velvet), and 42.1% (EULER-SR).

Subjects

Bacteria/genetics , Fungi/genetics , Genome/genetics , Genomics/methods , Software , Base Pairing/genetics , Reproducibility of Results

15.

Powerful fusion: PSI-BLAST and consensus sequences.

Przybylski, Dariusz; Rost, Burkhard.

Bioinformatics ; 24(18): 1987-93, 2008 Sep 15.

Article in English | MEDLINE | ID: mdl-18678588

ABSTRACT

MOTIVATION: A typical PSI-BLAST search consists of iterative scanning and alignment of a large sequence database during which a scoring profile is progressively built and refined. Such a profile can also be stored and used to search against a different database of sequences. Using it to search against a database of consensus rather than native sequences is a simple add-on that boosts performance surprisingly well. The improvement comes at a price: we hypothesized that random alignment score statistics would differ between native and consensus sequences. Thus PSI-BLAST-based profile searches against consensus sequences might incorrectly estimate statistical significance of alignment scores. In addition, iterative searches against consensus databases may fail. Here, we addressed these challenges in an attempt to harness the full power of the combination of PSI-BLAST and consensus sequences. RESULTS: We studied alignment score statistics for various types of consensus sequences. In general, the score distribution parameters of profile-based consensus sequence alignments differed significantly from those derived for the native sequences. PSI-BLAST partially compensated for the parameter variation. We have identified a protocol for building specialized consensus sequences that significantly improved search sensitivity and preserved score distribution parameters. As a result, PSI-BLAST profiles can be used to search specialized consensus sequences without sacrificing estimates of statistical significance. We also provided results indicating that iterative PSI-BLAST searches against consensus sequences could work very well. Overall, we showed how a very popular and effective method could be used to identify significantly more relevant similarities among protein sequences. AVAILABILITY: http://www.rostlab.org/services/consensus/.

Subjects

Consensus Sequence , Sequence Alignment/methods , Sequence Analysis, Protein/methods , Algorithms , Amino Acid Sequence , Proteins/chemistry

16.

Consensus sequences improve PSI-BLAST through mimicking profile-profile alignments.

Przybylski, Dariusz; Rost, Burkhard.

Nucleic Acids Res ; 35(7): 2238-46, 2007.

Article in English | MEDLINE | ID: mdl-17369271

ABSTRACT

Sequence alignments may be the most fundamental computational resource for molecular biology. The best methods that identify sequence relatedness through profile-profile comparisons are much slower and more complex than sequence-sequence and sequence-profile comparisons such as, respectively, BLAST and PSI-BLAST. Families of related genes and gene products (proteins) can be represented by consensus sequences that list the nucleic/amino acid most frequent at each sequence position in that family. Here, we propose a novel approach for consensus-sequence-based comparisons. This approach improved searches and alignments as a standard add-on to PSI-BLAST without any changes of code. Improvements were particularly significant for more difficult tasks such as the identification of distant structural relations between proteins and their corresponding alignments. Despite the fact that the improvements were higher for more divergent relations, they were consistent even at high accuracy/low error rates for non-trivially related proteins. The improvements were very easy to achieve; no parameter used by PSI-BLAST was altered and no single line of code changed. Furthermore, the consensus sequence add-on required relatively little additional CPU time. We discuss how advanced users of PSI-BLAST can immediately benefit from using consensus sequences on their local computers. We have also made the method available through the Internet (http://www.rostlab.org/services/consensus/).
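The abstract defines a consensus sequence as the most frequent nucleic or amino acid at each position in a family. A minimal sketch of that majority rule, assuming the family is already given as equal-length aligned strings, could look as follows. It illustrates the general idea only and is not the authors' protocol; the function name `consensus`, the gap symbol, and the tie-breaking rule (first residue seen wins) are assumptions made for the example.

```python
from collections import Counter


def consensus(aligned, gap="-"):
    """Majority-rule consensus of equal-length aligned sequences.

    For each alignment column, pick the most frequent non-gap residue.
    Ties go to the residue seen first in the column (an assumption of
    this sketch); columns made only of gaps stay as a gap.
    """
    aligned = list(aligned)
    if not aligned:
        raise ValueError("need at least one aligned sequence")
    width = len(aligned[0])
    if any(len(seq) != width for seq in aligned):
        raise ValueError("aligned sequences must have equal length")

    out = []
    for column in zip(*aligned):
        counts = Counter(residue for residue in column if residue != gap)
        out.append(counts.most_common(1)[0][0] if counts else gap)
    return "".join(out)


if __name__ == "__main__":
    # Toy alignment (hypothetical sequences).
    print(consensus(["ACDE", "ACDF", "GCDE"]))
```

For the toy alignment, the first column holds A, A, G and the last holds E, F, E, so the result is `ACDE`.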

Subjects

Sequence Alignment/methods , Amino Acid Sequence , Amino Acid Substitution , Base Sequence , Consensus Sequence , Sequence Analysis, Protein , Software

17.

Improving fold recognition without folds.

Przybylski, Dariusz; Rost, Burkhard.

J Mol Biol ; 341(1): 255-69, 2004 Jul 30.

Article in English | MEDLINE | ID: mdl-15312777

ABSTRACT

The most reliable way to align two proteins of unknown structure is through sequence-profile and profile-profile alignment methods. If the structure for one of the two is known, fold recognition methods outperform purely sequence-based alignments. Here, we introduced a novel method that aligns generalised sequence and predicted structure profiles. Using predicted 1D structure (secondary structure and solvent accessibility) significantly improved over sequence-only methods, both in terms of correctly recognising pairs of proteins with different sequences and similar structures and in terms of correctly aligning the pairs. The scores obtained by our generalised scoring matrix followed an extreme value distribution; this yielded accurate estimates of the statistical significance of our alignments. We found that mistakes in 1D structure predictions correlated between proteins from different sequence-structure families. The impact of this surprising result was that our method succeeded in significantly out-performing sequence-only methods even without explicitly using structural information from any of the two. Since AGAPE also outperformed established methods that rely on 3D information, we made it available through. If we solved the problem of CPU-time required to apply AGAPE on millions of proteins, our results could also impact everyday database searches.

Subjects

Protein Folding , Proteins/metabolism , Sequence Analysis, Protein , Databases, Protein , Models, Molecular , Sequence Alignment
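
The abstract above states that alignment scores followed an extreme value distribution, which is what allows a score to be converted into a statistical significance. A minimal sketch of that conversion is given below, assuming a Gumbel (type I extreme value) form with location `mu` and scale `beta`. Both parameters and the function name `evd_p_value` are assumptions for illustration; the abstract does not give fitted values or the exact parameterisation used.

```python
import math


def evd_p_value(score, mu, beta):
    """P(S >= score) for a Gumbel distribution with location mu and scale beta.

    mu and beta are hypothetical fitted parameters, not values from the paper.
    """
    if beta <= 0:
        raise ValueError("beta must be positive")
    z = (score - mu) / beta
    if z < -700:
        # exp(-z) would overflow; the tail probability is 1 to machine precision.
        return 1.0
    # -expm1(-exp(-z)) equals 1 - exp(-exp(-z)) but stays accurate for tiny p-values.
    return -math.expm1(-math.exp(-z))


# Higher scores give smaller p-values under the same (made-up) parameters.
for s in (10.0, 20.0, 40.0):
    print(s, evd_p_value(s, mu=10.0, beta=2.0))
```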

18.

Predicting transmembrane beta-barrels in proteomes.

Bigelow, Henry R; Petrey, Donald S; Liu, Jinfeng; Przybylski, Dariusz; Rost, Burkhard.

Nucleic Acids Res ; 32(8): 2566-77, 2004.

Article in English | MEDLINE | ID: mdl-15141026

ABSTRACT

Very few methods address the problem of predicting beta-barrel membrane proteins directly from sequence. One reason is that only very few high-resolution structures for transmembrane beta-barrel (TMB) proteins have been determined thus far. Here we introduced the design, statistics and results of a novel profile-based hidden Markov model for the prediction and discrimination of TMBs. The method carefully attempts to avoid over-fitting the sparse experimental data. While our model training and scoring procedures were very similar to a recently published work, the architecture and structure-based labelling were significantly different. In particular, we introduced a new definition of beta-hairpin motifs, explicit state modelling of transmembrane strands, and a log-odds whole-protein discrimination score. The resulting method reached an overall four-state (up-, down-strand, periplasmic-, outer-loop) accuracy as high as 86%. Furthermore, it accurately discriminated TMB from non-TMB proteins (45% coverage at 100% accuracy). This high precision enabled the application to 72 entirely sequenced Gram-negative bacteria. We found over 164 previously uncharacterized TMB proteins at high confidence. Database searches did not implicate any of these proteins with membranes. We challenge that the vast majority of our 164 predictions will eventually be verified experimentally. All proteome predictions and the PROFtmb prediction method are available at http://www.rostlab.org/services/PROFtmb/.

Subjects

Membrane Proteins/chemistry , Proteome/chemistry , Proteomics/methods , Sequence Analysis, Protein/methods , Markov Chains , Membrane Proteins/physiology , Protein Structure, Secondary , Reproducibility of Results , Sequence Alignment
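
The four-state accuracy quoted in the abstract above (up-strand, down-strand, periplasmic loop, outer loop) can be read as the fraction of residues whose predicted state matches the observed one. A minimal sketch of such a per-residue measure follows, assuming one label per residue encoded as the single letters `U`, `D`, `P` and `O`. The letter encoding, the function name and the example strings are illustrative assumptions, and the abstract does not confirm that PROFtmb computes its accuracy exactly this way.

```python
def per_residue_accuracy(predicted, observed):
    """Fraction of positions where the predicted state equals the observed state."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("at least one residue is required")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


# Made-up 12-residue example with a single mismatch at the loop/strand boundary.
predicted = "PPUUUOODDDPP"
observed = "PPUUUOOODDPP"
print(per_residue_accuracy(predicted, observed))
```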

19.

CAFASP3 in the spotlight of EVA.

Eyrich, Volker A; Przybylski, Dariusz; Koh, Ingrid Y Y; Grana, Osvaldo; Pazos, Florencio; Valencia, Alfonso; Rost, Burkhard.

Proteins ; 53 Suppl 6: 548-60, 2003.

Article in English | MEDLINE | ID: mdl-14579345

ABSTRACT

We have analysed fold recognition, secondary structure and contact prediction servers from CAFASP3. This assessment was carried out in the framework of the fully automated, web-based evaluation server EVA. Detailed results are available at http://cubic.bioc.columbia.edu/eva/cafasp3/. We observed that the sequence-unique targets from CAFASP3/CASP5 were not fully representative for evaluating performance. For all three categories, we showed how careless ranking might be misleading. We compared methods from all categories to experts in secondary structure and contact prediction and homology modellers to fold recognisers. While the secondary structure experts clearly outperformed all others, the contact experts appeared to outperform only novel fold methods. Automatic evaluation servers are good at getting statistics right and at using these to discard misleading ranking schemes. We challenge that to let machines rule where they are best might be the best way for the community to enjoy the tremendous benefit of CASP as a unique opportunity for brainstorming.

Subjects

Computational Biology/methods , Proteins/chemistry , Algorithms , Protein Folding , Protein Structure, Secondary , Sensitivity and Specificity

20.

EVA: Evaluation of protein structure prediction servers.

Koh, Ingrid Y Y; Eyrich, Volker A; Marti-Renom, Marc A; Przybylski, Dariusz; Madhusudhan, Mallur S; Eswar, Narayanan; Graña, Osvaldo; Pazos, Florencio; Valencia, Alfonso; Sali, Andrej; Rost, Burkhard.

Nucleic Acids Res ; 31(13): 3311-5, 2003 Jul 01.

Article in English | MEDLINE | ID: mdl-12824315

ABSTRACT

EVA (http://cubic.bioc.columbia.edu/eva/) is a web server for evaluation of the accuracy of automated protein structure prediction methods. The evaluation is updated automatically each week, to cope with the large number of existing prediction servers and the constant changes in the prediction methods. EVA currently assesses servers for secondary structure prediction, contact prediction, comparative protein structure modelling and threading/fold recognition. Every day, sequences of newly available protein structures in the Protein Data Bank (PDB) are sent to the servers and their predictions are collected. The predictions are then compared to the experimental structures once a week; the results are published on the EVA web pages. Over time, EVA has accumulated prediction results for a large number of proteins, ranging from hundreds to thousands, depending on the prediction method. This large sample assures that methods are compared reliably. As a result, EVA provides useful information to developers as well as users of prediction methods.

Subjects

Protein Conformation , Sequence Analysis, Protein , Automation , Databases, Protein , Internet , Protein Folding , Protein Structure, Secondary , Proteins/chemistry , Reproducibility of Results , Structural Homology, Protein
