Results 1 - 13 of 13
1.
Sci Rep ; 14(1): 7731, 2024 04 02.
Article in English | MEDLINE | ID: mdl-38565928

ABSTRACT

Data storage in DNA has recently emerged as a promising archival solution, offering space-efficient and long-lasting digital storage solutions. Recent studies suggest leveraging the inherent redundancy of synthesis and sequencing technologies by using composite DNA alphabets. A major challenge of this approach involves the noisy inference process, obstructing large composite alphabets. This paper introduces a novel approach for DNA-based data storage, offering, in some implementations, a 6.5-fold increase in logical density over standard DNA-based storage systems, with near-zero reconstruction error. Combinatorial DNA encoding uses a set of clearly distinguishable DNA shortmers to construct large combinatorial alphabets, where each letter consists of a subset of shortmers. We formally define various combinatorial encoding schemes and investigate their theoretical properties. These include information density and reconstruction probabilities, as well as required synthesis and sequencing multiplicities. We then propose an end-to-end design for a combinatorial DNA-based data storage system, including encoding schemes, two-dimensional (2D) error correction codes, and reconstruction algorithms, under different error regimes. We performed simulations and show, for example, that the use of 2D Reed-Solomon error correction has significantly improved reconstruction rates. We validated our approach by constructing two combinatorial sequences using Gibson assembly, imitating a 4-cycle combinatorial synthesis process. We confirmed the successful reconstruction, and established the robustness of our approach for different error types. Subsampling experiments supported the important role of sampling rate and its effect on the overall performance. Our work demonstrates the potential of combinatorial shortmer encoding for DNA-based data storage and describes some theoretical research questions and technical challenges. 
Combining combinatorial principles with error-correcting strategies, and investing in the development of DNA synthesis technologies that efficiently support combinatorial synthesis, can pave the way to efficient, error-resilient DNA-based storage solutions.
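
The idea of a combinatorial alphabet, where each letter is a subset of shortmers, can be illustrated with a minimal sketch. It assumes the simplest scheme, in which every letter is a subset of exactly `k` out of `n` shortmers. The shortmer sequences, the function names, and the parameter values are hypothetical and are not taken from the paper, and the sketch does not attempt to reproduce the reported density figures.

```python
from itertools import combinations
from math import comb, log2


def bits_per_letter(n_shortmers: int, k: int) -> float:
    """Bits carried by one letter when each letter is a k-subset of n shortmers."""
    return log2(comb(n_shortmers, k))


def build_alphabet(shortmers, k):
    """Enumerate every k-subset of the shortmer set; each subset is one letter."""
    return [frozenset(c) for c in combinations(sorted(shortmers), k)]


def encode(symbols, alphabet):
    """Map integer symbols to combinatorial letters (subsets of shortmers)."""
    return [alphabet[s] for s in symbols]


def decode(letters, alphabet):
    """Map observed subsets of shortmers back to integer symbols."""
    index = {letter: i for i, letter in enumerate(alphabet)}
    return [index[frozenset(letter)] for letter in letters]


# Hypothetical toy shortmers, chosen only for illustration.
toy_shortmers = ["AAC", "CCG", "GGT", "TTA"]
alphabet = build_alphabet(toy_shortmers, k=2)   # 6 letters
message = [0, 5, 3]
assert decode(encode(message, alphabet), alphabet) == message
```

With 4 shortmers and `k=2` the alphabet has 6 letters, so each letter carries log2(6) bits. Larger `n` and `k` grow the alphabet quickly, which is where the reconstruction and sampling questions discussed in the abstract become relevant.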


Subjects
DNA Replication; DNA; Sequence Analysis, DNA/methods; DNA/genetics; Algorithms; Information Storage and Retrieval
2.
Nat Commun ; 12(1): 3042, 2021 05 24.
Article in English | MEDLINE | ID: mdl-34031394

ABSTRACT

Controlling off-target editing activity is one of the central challenges in making CRISPR technology accurate and applicable in medical practice. Current algorithms for analyzing off-target activity do not provide statistical quantification, are not sufficiently sensitive in separating signal from noise in experiments with low editing rates, and do not address the detection of translocations. Here we present CRISPECTOR, a software tool that supports the detection and quantification of on- and off-target genome-editing activity from NGS data using paired treatment/control CRISPR experiments. In particular, CRISPECTOR facilitates the statistical analysis of NGS data from multiplex-PCR comparative experiments to detect and quantify adverse translocation events. We validate the observed results and show independent evidence of the occurrence of translocations in human cell lines, after genome editing. Our methodology is based on a statistical model comparison approach leading to better false-negative rates in sites with weak yet significant off-target activity.
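
The abstract does not spell out CRISPECTOR's statistical model, so the sketch below is not the tool's method. It only illustrates the general idea of a paired treatment/control comparison at one site, using a one-sided Fisher's exact test as a stand-in. The function name and the read counts are hypothetical.

```python
from math import comb


def fisher_one_sided(edited_tx, total_tx, edited_ctrl, total_ctrl):
    """One-sided Fisher's exact test for a single site.

    Returns the probability of seeing at least `edited_tx` edited reads in the
    treatment sample if edited reads were distributed between treatment and
    control purely by chance (hypergeometric null).
    """
    edited = edited_tx + edited_ctrl          # edited reads in both samples
    total = total_tx + total_ctrl             # all reads in both samples
    denom = comb(total, total_tx)
    upper = min(edited, total_tx)
    tail = sum(
        comb(edited, k) * comb(total - edited, total_tx - k)
        for k in range(edited_tx, upper + 1)
    )
    return tail / denom


# Hypothetical read counts: 30 edited of 1000 treatment reads,
# 2 edited of 1000 control reads.
p_value = fisher_one_sided(30, 1000, 2, 1000)
print(f"one-sided p-value: {p_value:.3g}")
```

A small p-value indicates that the edited reads are concentrated in the treatment sample beyond what background noise in the control would explain. A real analysis would also need multiple-testing correction across sites, which this sketch omits.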


Subjects
CRISPR-Cas Systems; Computational Biology/methods; Gene Editing/methods; Algorithms; DNA-Binding Proteins/genetics; HEK293 Cells; Homeodomain Proteins/genetics; Humans; Nuclear Proteins/genetics; Software; Transcription Factors/genetics
3.
Bioinformatics ; 37(5): 720-722, 2021 05 05.
Article in English | MEDLINE | ID: mdl-32840559

ABSTRACT

MOTIVATION: Recent years have seen a growing number and an expanding scope of studies using synthetic oligo libraries for a range of applications in synthetic biology. As experiments are growing by numbers and complexity, analysis tools can facilitate quality control and support better assessment and inference. RESULTS: We present a novel analysis tool, called SOLQC, which enables fast and comprehensive analysis of synthetic oligo libraries, based on NGS analysis performed by the user. SOLQC provides statistical information such as the distribution of variant representation, different error rates and their dependence on sequence or library properties. SOLQC produces graphical reports from the analysis, in a flexible format. We demonstrate SOLQC by analyzing literature libraries. We also discuss the potential benefits and relevance of the different components of the analysis. AVAILABILITY AND IMPLEMENTATION: SOLQC is a free software for non-commercial use, available at https://app.gitbook.com/@yoav-orlev/s/solqc/. For commercial use please contact the authors. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Libraries; Software; Gene Library; Quality Control; Synthetic Biology
5.
Sci Rep ; 9(1): 15750, 2019 10 31.
Article in English | MEDLINE | ID: mdl-31673038

ABSTRACT

Recent developments in personalized medicine are based on molecular measurement steps that guide personally adjusted medical decisions. A central approach to molecular profiling consists of measuring DNA, RNA, and/or proteins in tissue samples, most notably in and around tumors. This measurement yields molecular biomarkers that are potentially predictive of response and of tumor type. Current methods in cancer therapy mostly use tissue biopsy as the starting point of molecular profiling. Tissue biopsies involve a physical resection of a small tissue sample, leading to localized tissue injury, bleeding, inflammation and stress, as well as to an increased risk of metastasis. Here we developed a technology for harvesting biomolecules from tissues using electroporation. We show that tissue electroporation, achieved using a combination of high-voltage short pulses (50 pulses, 500 V cm⁻¹, 30 µs, 1 Hz) with low-voltage long pulses (50 pulses, 50 V cm⁻¹, 10 ms, 1 Hz), allows for tissue-specific extraction of RNA and proteins. We specifically tested RNA and protein extraction from excised kidney and liver samples and from excised HepG2 tumors in mice. Further in vivo development of extraction methods based on electroporation can drive novel approaches to the molecular profiling of tumors and of the tumor environment and to related diagnostic practices.


Subjects
Electroporation/methods; Kidney/metabolism; Liver/metabolism; Animals; Female; Gene Ontology; Genomics; Hep G2 Cells; Humans; Kidney/pathology; Liver/pathology; Mice; Mice, Nude; Neoplasm Proteins/metabolism; Neoplasms/genetics; Neoplasms/metabolism; Neoplasms/pathology; Proteomics; RNA, Neoplasm/metabolism; Transplantation, Heterologous
6.
Nat Biotechnol ; 37(10): 1237, 2019 Oct.
Article in English | MEDLINE | ID: mdl-31527732

ABSTRACT

An amendment to this paper has been published and can be accessed via a link at the top of the paper.

7.
Nat Biotechnol ; 37(10): 1229-1236, 2019 10.
Article in English | MEDLINE | ID: mdl-31501560

ABSTRACT

The density and long-term stability of DNA make it an appealing storage medium, particularly for long-term data archiving. Existing DNA storage technologies involve the synthesis and sequencing of multiple nominally identical molecules in parallel, resulting in information redundancy. We report the development of encoding and decoding methods that exploit this redundancy using composite DNA letters. A composite DNA letter is a representation of a position in a sequence that consists of a mixture of all four DNA nucleotides in a predetermined ratio. Our methods encode data using fewer synthesis cycles. We encode 6.4 MB into composite DNA, with distinguishable composition medians, using 20% fewer synthesis cycles per unit of data, as compared to previous reports. We also simulate encoding with larger composite alphabets, with distinguishable composition deciles, to show that 75% fewer synthesis cycles are potentially sufficient. We describe applicable error-correcting codes and inference methods, and investigate error patterns in the context of composite DNA letters.
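
A composite letter, as defined above, is a predetermined mixture of the four nucleotides at one position, so reading it back means estimating that mixture from many reads. The following is a minimal sketch of such an inference step, assuming a small hypothetical alphabet and a nearest-ratio rule (L1 distance). The abstract does not describe the authors' actual inference method, and the alphabet, names, and reads here are illustrative only.

```python
from collections import Counter

BASES = "ACGT"

# Hypothetical alphabet: the four pure bases plus two 1:1 mixtures.
# Each value is the expected fraction of (A, C, G, T) at the position.
ALPHABET = {
    "A": (1.0, 0.0, 0.0, 0.0),
    "C": (0.0, 1.0, 0.0, 0.0),
    "G": (0.0, 0.0, 1.0, 0.0),
    "T": (0.0, 0.0, 0.0, 1.0),
    "M": (0.5, 0.5, 0.0, 0.0),  # A:C = 1:1
    "K": (0.0, 0.0, 0.5, 0.5),  # G:T = 1:1
}


def observed_frequencies(reads, position):
    """Fraction of A, C, G, T observed at one position across all reads.

    Assumes every read covers the position and contains only A, C, G, T.
    """
    counts = Counter(read[position] for read in reads)
    total = sum(counts[b] for b in BASES)
    return tuple(counts[b] / total for b in BASES)


def infer_letter(reads, position, alphabet=ALPHABET):
    """Pick the composite letter whose ratio is closest (L1) to the observation."""
    freqs = observed_frequencies(reads, position)
    return min(
        alphabet,
        key=lambda name: sum(abs(f - q) for f, q in zip(freqs, alphabet[name])),
    )


# Five hypothetical reads of the same two-position design.
reads = ["AG", "CT", "AG", "CT", "AG"]
print([infer_letter(reads, i) for i in range(2)])
```

The sketch makes the dependence on sequencing depth visible: with few reads the observed fractions are noisy, and letters with similar ratios become hard to tell apart, which is why larger composite alphabets require deeper sampling.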


Subjects
DNA/chemical synthesis; Algorithms; Base Sequence; High-Throughput Nucleotide Sequencing/methods; Humans; Information Storage and Retrieval; Sequence Analysis, DNA/methods
8.
Anal Chim Acta ; 1051: 32-40, 2019 Mar 21.
Article in English | MEDLINE | ID: mdl-30661617

ABSTRACT

Visible/near-infrared (VIS/NIR) spectroscopy has led the revolution in high-throughput phenotyping methods used to determine chemical and structural elements of organic materials. In the current state of the art, spectrophotometers used for imaging techniques are either very expensive or too large to be used as field-operable devices. In this study we developed a Sparse NIR Optimization method (SNIRO) that selects a pre-determined number of wavelengths that enable quantification of analytes in a given sample using linear regression. We compared the computed time complexity and the accuracy of SNIRO with those of Martens' test, forward selection, and LASSO, all applied to the determination of protein content in corn flour and meat and of octane number in diesel, using publicly available datasets. In addition, for the first time, we determined the glucose content in the green seaweed Ulva sp., an important feedstock for marine biorefinery. The SNIRO approach can be used as a first step in designing a spectrophotometer that scans a small number of specific spectral regions, thus potentially decreasing production costs and scanner size and enabling the development of field-operable devices for content analysis of complex organic materials.


Subjects
Spectroscopy, Near-Infrared/methods; Meat Proteins/analysis; Octanes/analysis; Ulva/chemistry; Vehicle Emissions/analysis; Zea mays/chemistry
9.
Proc Natl Acad Sci U S A ; 115(17): 4459-4464, 2018 04 24.
Article in English | MEDLINE | ID: mdl-29626130

ABSTRACT

The evolution of development has been studied through the lens of gene regulation by examining either closely related species or extremely distant animals of different phyla. In nematodes, detailed cell- and stage-specific expression analyses are focused on the model Caenorhabditis elegans, in part leading to the view that the developmental expression of gene cascades in this species is archetypic for the phylum. Here, we compared two species of an intermediate evolutionary distance: the nematodes C. elegans (clade V) and Acrobeloides nanus (clade IV). To examine A. nanus molecularly, we sequenced its genome and identified the expression profiles of all genes throughout embryogenesis. In comparison with C. elegans, A. nanus exhibits a much slower embryonic development and has a capacity for regulative compensation of missing early cells. We detected conserved stages between these species at the transcriptome level, as well as a prominent middevelopmental transition, at which point the two species converge in terms of their gene expression. Interestingly, we found that genes originating at the dawn of the Ecdysozoa supergroup show the least expression divergence between these two species. This led us to detect a correlation between the time of expression of a gene and its phylogenetic age: evolutionarily ancient and young genes are enriched for expression in early and late embryogenesis, respectively, whereas Ecdysozoa-specific genes are enriched for expression during the middevelopmental transition. Our results characterize the developmental constraints operating on each individual embryo in terms of developmental stages and genetic evolutionary history.


Subjects
Evolution, Molecular; Gene Expression Regulation, Developmental/physiology; Phylogeny; Rhabditida/embryology; Transcriptome/physiology; Animals; Rhabditida/classification; Rhabditida/genetics
10.
Cell Rep ; 21(3): 845-858, 2017 Oct 17.
Article in English | MEDLINE | ID: mdl-29045849

ABSTRACT

We use an oligonucleotide library of >10,000 variants to identify an insulation mechanism encoded within a subset of σ54 promoters. Insulation manifests itself as reduced protein expression for a downstream gene that is expressed by transcriptional readthrough. It is strongly associated with the presence of short CT-rich motifs (3-5 bp), positioned within 25 bp upstream of the Shine-Dalgarno (SD) motif of the silenced gene. We provide evidence that insulation is triggered by binding of the ribosome binding site (RBS) to the upstream CT-rich motif. We also show that, in E. coli, insulator sequences are preferentially encoded within σ54 promoters, suggesting an important regulatory role for these sequences in natural contexts. Our findings imply that sequence-specific regulatory effects that are sparsely encoded by short motifs may not be easily detected by lower throughput studies. Such sequence-specific phenomena can be uncovered with a focused oligo library (OL) design that mitigates sequence-related variance, as exemplified herein.


Subjects
Escherichia coli/genetics; Gene Library; Insulator Elements/genetics; Promoter Regions, Genetic; Sequence Analysis, DNA; Sigma Factor/genetics; Binding Sites/genetics; Down-Regulation/genetics; Gene Expression Regulation, Bacterial; Gene Silencing; Genome, Bacterial; Mutation/genetics; Nucleotide Motifs/genetics; Ribosomes/metabolism
11.
Genome Biol ; 17: 77, 2016 Apr 28.
Article in English | MEDLINE | ID: mdl-27121950

ABSTRACT

Single-cell transcriptomics requires a method that is sensitive, accurate, and reproducible. Here, we present CEL-Seq2, a modified version of our CEL-Seq method, with threefold higher sensitivity, lower costs, and less hands-on time. We implemented CEL-Seq2 on Fluidigm's C1 system, providing its first single-cell, on-chip barcoding method, and we detected gene expression changes accompanying the progression through the cell cycle in mouse fibroblast cells. We also compare with Smart-Seq to demonstrate CEL-Seq2's increased sensitivity relative to other available methods. Collectively, the improvements make CEL-Seq2 uniquely suited to single-cell RNA-Seq analysis in terms of economics, resolution, and ease of use.


Subjects
Algorithms; Gene Expression Profiling/methods; Sequence Analysis, RNA/methods; Single-Cell Analysis/methods; Animals; Cell Cycle; Cells, Cultured; Fibroblasts/cytology; Fibroblasts/metabolism; Mice; Sensitivity and Specificity
12.
Nature ; 531(7596): 637-641, 2016 Mar 31.
Article in English | MEDLINE | ID: mdl-26886793

ABSTRACT

Animals are grouped into ~35 'phyla' based upon the notion of distinct body plans. Morphological and molecular analyses have revealed that a stage in the middle of development, known as the phylotypic period, is conserved among species within some phyla. Although these analyses provide evidence for their existence, phyla have also been criticized as lacking an objective definition, and consequently based on arbitrary groupings of animals. Here we compare the developmental transcriptomes of ten species, each annotated to a different phylum, with a wide range of life histories and embryonic forms. We find that in all ten species, development comprises the coupling of early and late phases of conserved gene expression. These phases are linked by a divergent 'mid-developmental transition' that uses species-specific suites of signalling pathways and transcription factors. This mid-developmental transition overlaps with the phylotypic period that has been defined previously for three of the ten phyla, suggesting that transcriptional circuits and signalling mechanisms active during this transition are crucial for defining the phyletic body plan and that the mid-developmental transition may be used to define phylotypic periods in other phyla. Placing these observations alongside the reported conservation of mid-development within phyla, we propose that a phylum may be defined as a collection of species whose gene expression at the mid-developmental transition is both highly conserved among them, yet divergent relative to other species.


Subjects
Body Patterning; Embryonic Development; Phylogeny; Animals; Body Patterning/genetics; Conserved Sequence/genetics; Embryonic Development/genetics; Evolution, Molecular; Gene Expression Regulation, Developmental; Gene Regulatory Networks; Genes, Developmental/genetics; Models, Biological; Phenotype; Species Specificity; Transcriptome/genetics
13.
Development ; 141(5): 1161-6, 2014 Mar.
Article in English | MEDLINE | ID: mdl-24504336

ABSTRACT

RNA-Seq enables the efficient transcriptome sequencing of many samples from small amounts of material, but the analysis of these data remains challenging. In particular, in developmental studies, RNA-Seq is challenged by the morphological staging of samples, such as embryos, since these often lack clear markers at any particular stage. In such cases, the automatic identification of the stage of a sample would enable previously infeasible experimental designs. Here we present the 'basic linear index determination of transcriptomes' (BLIND) method for ordering samples comprising different developmental stages. The method is an implementation of a traveling salesman algorithm to order the transcriptomes according to their inter-relationships as defined by principal components analysis. To establish the direction of the ordered samples, we show that an appropriate indicator is the entropy of transcriptomic gene expression levels, which increases over developmental time. Using BLIND, we correctly recover the annotated order of previously published embryonic transcriptomic timecourses for frog, mosquito, fly and zebrafish. We further demonstrate the efficacy of BLIND by collecting 59 embryos of the sponge Amphimedon queenslandica and ordering their transcriptomes according to developmental stage. BLIND is thus useful in establishing the temporal order of samples within large datasets and is of particular relevance to the study of organisms with asynchronous development and when morphological staging is difficult.
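
The direction-setting step described above, in which the entropy of expression levels increases over developmental time, can be sketched as follows. This is an illustration only: it assumes the samples have already been ordered along a path (the traveling-salesman step over principal components is not implemented here), it compares only the two endpoints, and the function names and expression values are hypothetical rather than taken from BLIND.

```python
from math import log2


def expression_entropy(levels):
    """Shannon entropy (bits) of one sample's expression levels.

    Levels are normalized to fractions of the sample total; zeros are skipped.
    """
    total = sum(levels)
    probs = [x / total for x in levels if x > 0]
    return -sum(p * log2(p) for p in probs)


def orient(ordered_samples):
    """Flip an already-ordered list of samples so that entropy rises over time.

    Simplification: only the first and last samples are compared.
    """
    first = expression_entropy(ordered_samples[0])
    last = expression_entropy(ordered_samples[-1])
    return ordered_samples if first <= last else ordered_samples[::-1]


# Hypothetical expression vectors over four genes, listed in an order whose
# direction is unknown.
path = [[1, 1, 1, 1], [3, 1, 0, 0], [4, 0, 0, 0]]
print(orient(path))
```

In this toy input the last sample has the lowest entropy, so the path is reversed and the sample dominated by a single gene is placed at the early end.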


Subjects
Gene Expression Profiling/methods; High-Throughput Nucleotide Sequencing/methods; Transcriptome/genetics; Animals; Gene Expression Regulation, Developmental; Principal Component Analysis