Results 1 - 20 of 29
1.
JAMA Netw Open ; 4(12): e2138219, 2021 12 01.
Article in English | MEDLINE | ID: mdl-34882180

ABSTRACT

Importance: In March 2018, Medicare issued a national coverage determination (NCD) for next-generation sequencing (NGS) to facilitate access to NGS testing among Medicare beneficiaries. It is unknown whether the NCD affected health equity issues for Medicare beneficiaries and the overall population. Objective: To examine the association between the Medicare NCD and NGS use by insurance types and race and ethnicity. Design, Setting, and Participants: A retrospective cohort analysis was conducted using electronic health record data derived from a real-world database. Data originated from approximately 280 cancer clinics (approximately 800 sites of care) in the US. Patients with advanced non-small cell lung cancer (aNSCLC), metastatic colorectal cancer (mCRC), metastatic breast cancer (mBC), or advanced melanoma diagnosed from January 1, 2011, through March 31, 2020, were included. Exposure: Pre- vs post-NCD period. Main Outcomes and Measures: Patients were classified by insurance type and race and ethnicity to examine patterns in NGS testing less than or equal to 60 days after diagnosis. Difference-in-differences models examined changes in average NGS testing in the pre- and post-NCD periods by race and ethnicity, and interrupted time-series analysis examined whether trends over time varied by insurance type and race and ethnicity. Results: Among 92 687 patients with aNSCLC, mCRC, mBC, or advanced melanoma, mean (SD) age was 66.6 (11.2) years, 51 582 (55.7%) were women, and 63 864 (68.9%) were Medicare beneficiaries. The largest racial and ethnic categories according to the database used and further classification were Black or African American (8605 [9.3%]) and non-Hispanic White (59 806 [64.5%]). Compared with Medicare beneficiaries, changes in pre- to post-NCD NGS testing trends were similar in commercially insured patients (odds ratio [OR], 1.03; 95% CI, 0.98-1.08; P = .25). 
Pre- to post-NCD NGS testing trends increased at a slower rate among patients in assistance programs (OR, 0.93; 95% CI, 0.87-0.99; P = .03) compared with Medicare beneficiaries. The rate of increase for patients receiving Medicaid was not statistically significantly different compared with those receiving Medicare (OR, 0.92; 95% CI, 0.84-1.01; P = .07). The NCD was not associated with statistically significant changes in NGS use trends by racial and ethnic groups within Medicare beneficiaries alone or across all insurance types. Compared with non-Hispanic White individuals, increases in average NGS use from the pre-NCD to post-NCD period were 14% lower (OR, 0.86; 95% CI, 0.74-0.99; P = .04) among African American and 23% lower (OR, 0.77; 95% CI, 0.62-0.96; P = .02) among Hispanic/Latino individuals; increases among Asian individuals and those with other races and ethnicities were similar. Conclusions and Relevance: The findings of this study suggest that expansion of Medicare-covered benefits may not occur equally across insurance types, thereby further widening or maintaining disparities in NGS testing. Additional efforts beyond coverage policies are needed to ensure equitable access to the benefits of precision medicine.
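The "14% lower" and "23% lower" figures above can be read directly off the reported odds ratios. The sketch below shows that arithmetic only; the function name is hypothetical, and the interpretation is on the odds scale, not the probability scale.

```python
def percent_change_in_odds(odds_ratio: float) -> float:
    """Convert an odds ratio into a percent change in odds.

    An odds ratio below 1 gives a negative value (a relative decrease).
    """
    return (odds_ratio - 1.0) * 100.0


# Odds ratios quoted in the abstract (relative to non-Hispanic White individuals).
reported = {"African American": 0.86, "Hispanic/Latino": 0.77}

for group, odds_ratio in reported.items():
    print(f"{group}: {percent_change_in_odds(odds_ratio):+.0f}% change in odds")
```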


Subjects
Genetic Predisposition to Disease; Genetic Testing/economics; High-Throughput Nucleotide Sequencing/economics; High-Throughput Nucleotide Sequencing/trends; Medicare/economics; Medicare/trends; Neoplasms/genetics; Adolescent; Adult; Aged; Aged, 80 and over; Female; Forecasting; Genetic Testing/statistics & numerical data; Genetic Testing/trends; High-Throughput Nucleotide Sequencing/statistics & numerical data; Humans; Insurance Coverage/standards; Insurance Coverage/statistics & numerical data; Insurance Coverage/trends; Male; Medicare/statistics & numerical data; Middle Aged; Retrospective Studies; United States; Young Adult
2.
Clin Genet ; 100(5): 504-521, 2021 11.
Article in English | MEDLINE | ID: mdl-34080181

ABSTRACT

Full coverage of the cost of clinical genetic testing is not always available through public or private insurance programs, or a public healthcare system. Consequently, some patients may be faced with the decision of whether to finance testing out-of-pocket (OOP), meet OOP expenses required by their insurer, or not proceed with testing. A scoping review was conducted to identify literature associated with patient OOP and private pay in clinical genetic testing. Seven databases (EMBASE, MEDLINE, CINAHL, PsycINFO, PAIS, the Cochrane Database of Systematic Reviews, and the JBI Evidence-Based Practice database) were searched, resulting in 83 unique publications included in the review. The presented evidence includes a descriptive analysis, followed by a narrative account of the extracted data. Results were divided into four groups according to clinical indication: (1) hereditary breast and ovarian cancer, (2) other hereditary cancers, (3) prenatal testing, (4) other clinical indications. The majority of studies focused on hereditary cancer and prenatal genetic testing. Overall trends indicated that OOP costs have fallen and payer coverage has improved, but OOP expenses continue to present a barrier to patients who do not qualify for full coverage.


Subjects
Genetic Testing/economics; Health Expenditures/statistics & numerical data; Costs and Cost Analysis; Genetic Testing/methods; Genetic Testing/statistics & numerical data; Genetic Testing/trends; Health Care Costs/statistics & numerical data; Health Care Costs/trends; Health Services Accessibility/statistics & numerical data; High-Throughput Nucleotide Sequencing/economics; High-Throughput Nucleotide Sequencing/methods; High-Throughput Nucleotide Sequencing/statistics & numerical data; Humans; Mass Screening; Neoplasms/diagnosis; Neoplasms/epidemiology; Neoplasms/genetics; Prenatal Diagnosis/economics; Prenatal Diagnosis/methods; Prenatal Diagnosis/statistics & numerical data; United States/epidemiology
3.
PLoS Comput Biol ; 16(11): e1008415, 2020 11.
Article in English | MEDLINE | ID: mdl-33175836

ABSTRACT

Small non-coding RNAs (ncRNAs) are short non-coding sequences involved in gene regulation in many biological processes and diseases. Because their biological function is still incompletely understood, especially at genome-wide scale, new computational approaches are needed to annotate their roles. Secondary structure is widely regarded as a key determinant of RNA function, and machine-learning approaches have been shown to predict RNA function from secondary structure information. Here we show that RNA function can be predicted with good accuracy from a lightweight representation of sequence information, without computing secondary structure features, which is computationally expensive. This finding appears to go against the dogma of secondary structure being a key determinant of function in RNA. Compared to recent secondary-structure-based methods, the proposed solution is more robust to sequence boundary noise and drastically reduces the computational cost, allowing annotation of large data volumes. Scripts and datasets to reproduce the results of the experiments in this study are available at: https://github.com/bioinformatics-sannio/ncrna-deep.


Subjects
Deep Learning; RNA, Untranslated/genetics; RNA, Untranslated/physiology; Computational Biology; Databases, Nucleic Acid/statistics & numerical data; High-Throughput Nucleotide Sequencing/statistics & numerical data; Humans; Monte Carlo Method; Neural Networks, Computer; Nucleic Acid Conformation; RNA, Untranslated/chemistry; Sequence Analysis, RNA/statistics & numerical data; Exome Sequencing/statistics & numerical data
5.
Comput Math Methods Med ; 2020: 7231205, 2020.
Article in English | MEDLINE | ID: mdl-32952600

ABSTRACT

Although sequencing a human genome has become affordable, identifying genetic variants from whole-genome sequence data is still a hurdle for researchers without adequate computing equipment or bioinformatics support. GATK is a gold standard method for the identification of genetic variants and has been widely used in genome projects and population genetic studies for many years. This was until the Google Brain team developed a new method, DeepVariant, which utilizes deep neural networks to construct an image classification model to identify genetic variants. However, the superior accuracy of DeepVariant comes at the cost of computational intensity, largely constraining its applications. Accordingly, we present DeepVariant-on-Spark to optimize resource allocation, enable multi-GPU support, and accelerate the processing of the DeepVariant pipeline. To make DeepVariant-on-Spark more accessible to everyone, we have deployed DeepVariant-on-Spark to the Google Cloud Platform (GCP). Users can deploy DeepVariant-on-Spark on the GCP within 20 minutes by following our instructions and start to analyze at least ten whole-genome sequencing datasets using free credits provided by the GCP. DeepVariant-on-Spark is freely available for small-scale genome analysis using a cloud-based computing framework, which is suitable for pilot testing or preliminary study, while retaining the flexibility and scalability needed for large-scale sequencing projects.


Subjects
Cloud Computing; Deep Learning; Genetic Variation; Whole Genome Sequencing/statistics & numerical data; Cloud Computing/economics; Computational Biology/methods; Cost-Benefit Analysis; Genome, Human; High-Throughput Nucleotide Sequencing/economics; High-Throughput Nucleotide Sequencing/standards; High-Throughput Nucleotide Sequencing/statistics & numerical data; Humans; Neural Networks, Computer; Software; Whole Genome Sequencing/economics; Whole Genome Sequencing/standards
6.
Hum Genomics ; 13(1): 9, 2019 02 13.
Article in English | MEDLINE | ID: mdl-30795817

ABSTRACT

BACKGROUND: Accurate and reliable identification of sequence variants, including single nucleotide polymorphisms (SNPs) and insertion-deletion polymorphisms (INDELs), plays a fundamental role in next-generation sequencing (NGS) applications. Existing methods for calling these variants often make simplified assumptions of positional independence and fail to leverage the dependence between genotypes at nearby loci that is caused by linkage disequilibrium (LD). RESULTS AND CONCLUSION: We propose vi-HMM, a hidden Markov model (HMM)-based method for calling SNPs and INDELs in mapped short-read data. This method allows transitions between hidden states (defined as "SNP," "Ins," "Del," and "Match") of adjacent genomic bases and determines an optimal hidden state path by using the Viterbi algorithm. The inferred hidden state path provides a direct solution to the identification of SNPs and INDELs. Simulation studies show that, under various sequencing depths, vi-HMM outperforms commonly used variant calling methods in terms of sensitivity and F1 score. When applied to the real data, vi-HMM demonstrates higher accuracy in calling SNPs and INDELs.
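The Viterbi decoding step described above can be sketched on a toy model. The four hidden states come from the abstract; the observation alphabet (`ref`, `alt`, `extra`, `gap`) and every probability below are hypothetical values chosen for illustration and are not the parameters or the emission model of vi-HMM.

```python
import math

STATES = ["Match", "SNP", "Ins", "Del"]

# Hypothetical toy parameters (assumptions, not taken from the vi-HMM paper).
START = {"Match": 0.97, "SNP": 0.01, "Ins": 0.01, "Del": 0.01}
TRANS = {
    "Match": {"Match": 0.97, "SNP": 0.01, "Ins": 0.01, "Del": 0.01},
    "SNP":   {"Match": 0.97, "SNP": 0.01, "Ins": 0.01, "Del": 0.01},
    "Ins":   {"Match": 0.69, "SNP": 0.01, "Ins": 0.29, "Del": 0.01},
    "Del":   {"Match": 0.69, "SNP": 0.01, "Ins": 0.01, "Del": 0.29},
}
# Simplified per-position evidence: the reads agree with the reference ("ref"),
# show a different base ("alt"), show an inserted base ("extra"),
# or show a missing base ("gap").
EMIT = {
    "Match": {"ref": 0.997, "alt": 0.001, "extra": 0.001, "gap": 0.001},
    "SNP":   {"ref": 0.05,  "alt": 0.93,  "extra": 0.01,  "gap": 0.01},
    "Ins":   {"ref": 0.01,  "alt": 0.01,  "extra": 0.97,  "gap": 0.01},
    "Del":   {"ref": 0.01,  "alt": 0.01,  "extra": 0.01,  "gap": 0.97},
}


def viterbi(observations):
    """Return the most probable hidden-state path for a list of observations."""
    # scores[s] is the best log-probability of any path that ends in state s.
    scores = {
        s: math.log(START[s]) + math.log(EMIT[s][observations[0]]) for s in STATES
    }
    backpointers = []
    for obs in observations[1:]:
        new_scores, pointers = {}, {}
        for s in STATES:
            best_prev = max(STATES, key=lambda p: scores[p] + math.log(TRANS[p][s]))
            new_scores[s] = (
                scores[best_prev]
                + math.log(TRANS[best_prev][s])
                + math.log(EMIT[s][obs])
            )
            pointers[s] = best_prev
        backpointers.append(pointers)
        scores = new_scores

    # Trace back from the best final state to recover the full path.
    state = max(STATES, key=scores.get)
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    return path[::-1]


calls = viterbi(["ref", "ref", "alt", "ref", "gap", "gap", "ref"])
print(calls)
```

Working in log space avoids numerical underflow on long sequences, and the decoded path labels each position as a variant or a match directly, which is the sense in which the abstract calls the path a "direct solution".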


Subjects
Algorithms; Genetic Variation; High-Throughput Nucleotide Sequencing/methods; Markov Chains; Databases, Genetic; Haplotypes; High-Throughput Nucleotide Sequencing/statistics & numerical data; Humans; INDEL Mutation; Linkage Disequilibrium; Polymorphism, Single Nucleotide
7.
Brief Bioinform ; 20(4): 1151-1159, 2019 07 19.
Article in English | MEDLINE | ID: mdl-29028869

ABSTRACT

As technologies change, MG-RAST is adapting. Newly available software is being included to improve accuracy and performance. As a computational service constantly running large-volume scientific workflows, MG-RAST is the right location to perform benchmarking and implement algorithmic or platform improvements, in many cases involving trade-offs between specificity, sensitivity and run-time cost. The work in [Glass EM, Dribinsky Y, Yilmaz P, et al. ISME J 2014;8:1-3] is an example; we use existing well-studied data sets as gold standards representing different environments and different technologies to evaluate any changes to the pipeline. Currently, we use well-understood data sets in MG-RAST as a platform for benchmarking. The use of artificial data sets for pipeline performance optimization has not added value, as these data sets do not present the same challenges as real-world data sets. In addition, the MG-RAST team welcomes suggestions for improvements of the workflow. We are currently working on versions 4.02 and 4.1, both of which contain significant input from the community and our partners that will enable double barcoding and stronger inferences supported by longer-read technologies, and will increase throughput while maintaining sensitivity by using Diamond and SortMeRNA. On the technical platform side, the MG-RAST team intends to support the Common Workflow Language as a standard to specify bioinformatics workflows, to facilitate both development and efficient high-performance implementation of the community's data analysis tasks.


Subjects
High-Throughput Nucleotide Sequencing/methods; Metagenome; Metagenomics/methods; Software; Algorithms; Budgets; Computational Biology/methods; High-Throughput Nucleotide Sequencing/economics; High-Throughput Nucleotide Sequencing/statistics & numerical data; Internet; Metagenomics/economics; Metagenomics/statistics & numerical data; Sequence Analysis, DNA/economics; Sequence Analysis, DNA/methods; Sequence Analysis, DNA/statistics & numerical data; User-Computer Interface; Workflow
8.
Brief Bioinform ; 20(4): 1222-1237, 2019 07 19.
Article in English | MEDLINE | ID: mdl-29220512

ABSTRACT

MOTIVATION: Since the dawn of the bioinformatics field, sequence alignment scores have been the main method for comparing sequences. However, alignment algorithms are quadratic, requiring long execution time. As alternatives, scientists have developed tens of alignment-free statistics for measuring the similarity between two sequences. RESULTS: We surveyed tens of alignment-free k-mer statistics. Additionally, we evaluated 33 statistics and multiplicative combinations between the statistics and/or their squares. These statistics are calculated on two k-mer histograms representing two sequences. Our evaluations using global alignment scores revealed that the majority of the statistics are sensitive and capable of finding similar sequences to a query sequence. Therefore, any of these statistics can filter out dissimilar sequences quickly. Further, we observed that multiplicative combinations of the statistics are highly correlated with the identity score. Furthermore, combinations involving sequence length difference or Earth Mover's distance, which takes the length difference into account, are always among the highest correlated paired statistics with identity scores. Similarly, paired statistics including length difference or Earth Mover's distance are among the best performers in finding the K-closest sequences. Interestingly, similar performance can be obtained using histograms of shorter words, resulting in reducing the memory requirement and increasing the speed remarkably. Moreover, we found that simple single statistics are sufficient for processing next-generation sequencing reads and for applications relying on local alignment. Finally, we measured the time requirement of each statistic. The survey and the evaluations will help scientists with identifying efficient alternatives to the costly alignment algorithm, saving thousands of computational hours. AVAILABILITY: The source code of the benchmarking tool is available as Supplementary Materials.
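As a rough illustration of what "statistics calculated on two k-mer histograms" means, the sketch below builds the histograms and computes two simple quantities. The Manhattan distance is used here only as an example of an alignment-free statistic and is an assumption, not necessarily one of the 33 statistics evaluated in the survey; the function names are hypothetical.

```python
from collections import Counter


def kmer_histogram(sequence: str, k: int) -> Counter:
    """Count every overlapping word of length k in the sequence."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def manhattan_distance(hist_a: Counter, hist_b: Counter) -> int:
    """Sum of absolute count differences over all k-mers seen in either histogram."""
    # Counter returns 0 for k-mers that are absent, so no special casing is needed.
    return sum(abs(hist_a[kmer] - hist_b[kmer]) for kmer in set(hist_a) | set(hist_b))


def length_difference(seq_a: str, seq_b: str) -> int:
    """Absolute difference in sequence length."""
    return abs(len(seq_a) - len(seq_b))


query = "ACGTACGT"
candidate = "ACGTTCGT"
k = 2

distance = manhattan_distance(kmer_histogram(query, k), kmer_histogram(candidate, k))
print(distance, length_difference(query, candidate))
```

Both quantities take time linear in the sequence lengths, in contrast to the quadratic cost of alignment mentioned in the abstract. A smaller `k` gives a smaller histogram, which is the memory and speed trade-off the abstract refers to when it mentions shorter words.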


Subjects
Computational Biology/methods; Models, Statistical; Sequence Analysis, DNA/statistics & numerical data; Algorithms; Databases, Nucleic Acid/statistics & numerical data; High-Throughput Nucleotide Sequencing/statistics & numerical data; Humans; Markov Chains; Sequence Alignment/statistics & numerical data
9.
PLoS One ; 13(11): e0206855, 2018.
Article in English | MEDLINE | ID: mdl-30481188

ABSTRACT

Genetic testing availability in the health care system is rapidly increasing, along with the diffusion of next-generation sequencing (NGS) into diagnostics. These developments make knowledge-driven optimization of testing in the clinical setting imperative. Time estimates for wet laboratory procedures in Italian molecular laboratories offering genetic diagnosis were evaluated to provide data suitable for adjusting efficiency and optimizing health policies and costs. A survey was undertaken by the Italian Society of Human Genetics (SIGU). Forty-two laboratories participated. For most molecular techniques, the most time-consuming steps are those requiring intensive manual intervention or those in which human bias can affect the overall process time. For NGS, for which the study also surveyed interpretation time, interpretation was the step requiring the longest time. We report the first survey describing the hands-on times required for different molecular diagnostic procedures, including NGS. The analysis of this survey suggests that some analytical processes need to be optimized, for example by implementing laboratory information management systems to minimize manual procedures in pre-analytical steps, which may affect accuracy; accuracy represents the major challenge to be faced in the future setting of the molecular genetics laboratory.


Subjects
Genetic Testing/statistics & numerical data; Laboratories/statistics & numerical data; Surveys and Questionnaires/statistics & numerical data; Workload/statistics & numerical data; Genetic Testing/economics; Genetic Testing/trends; High-Throughput Nucleotide Sequencing/economics; High-Throughput Nucleotide Sequencing/statistics & numerical data; Italy; Laboratories/economics; Laboratories/trends; Management Information Systems; Time Factors; Workload/economics
10.
Brief Bioinform ; 19(5): 737-753, 2018 09 28.
Article in English | MEDLINE | ID: mdl-28334228

ABSTRACT

DNA methylation is an important epigenetic mechanism that plays a crucial role in cellular regulatory systems. Recent advancements in sequencing technologies now enable us to generate high-throughput methylation data and to measure methylation up to single-base resolution. This wealth of data does not come without challenges, and one of the key challenges in DNA methylation studies is to identify the significant differences in the methylation levels of the base pairs across distinct biological conditions. Several computational methods have been developed to identify differential methylation using bisulfite sequencing data; however, there is no clear consensus among existing approaches. A comprehensive survey of these approaches would be of great benefit to potential users and researchers to get a complete picture of the available resources. In this article, we present a detailed survey of 22 such approaches focusing on their underlying statistical models, primary features, key advantages and major limitations. Importantly, the intrinsic drawbacks of the approaches pointed out in this survey could potentially be addressed by future research.


Subjects
DNA Methylation; High-Throughput Nucleotide Sequencing/methods; Sequence Analysis, DNA/methods; Computational Biology/methods; CpG Islands; Epigenesis, Genetic; High-Throughput Nucleotide Sequencing/statistics & numerical data; Humans; Logistic Models; Markov Chains; Sequence Analysis, DNA/statistics & numerical data; Sulfites
12.
Genet Epidemiol ; 41(2): 145-151, 2017 Feb.
Article in English | MEDLINE | ID: mdl-27990689

ABSTRACT

Genome-wide association studies (GWAS) of common disease have been hugely successful in implicating loci that modify disease risk. The bulk of these associations have proven robust and reproducible, in part due to community adoption of statistical criteria for claiming significant genotype-phenotype associations. As the cost of sequencing continues to drop, assembling large samples in global populations is becoming increasingly feasible. Sequencing studies interrogate not only common variants, as was true for genotyping-based GWAS, but variation across the full allele frequency spectrum, yielding many more (independent) statistical tests. We sought to empirically determine genome-wide significance thresholds for various analysis scenarios. Using whole-genome sequence data, we simulated sequencing-based disease studies of varying sample size and ancestry. We determined that future sequencing efforts in >2,000 samples of European, Asian, or admixed ancestry should set genome-wide significance at approximately P = 5 × 10⁻⁹, and studies of African samples should apply a more stringent genome-wide significance threshold of P = 1 × 10⁻⁹. Adoption of a revised multiple test correction will be crucial in avoiding irreproducible claims of association.
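One way to read the two thresholds is as Bonferroni-style corrections. The sketch below back-calculates the number of independent tests each threshold would imply, assuming a family-wise error rate of 0.05; that error rate and the Bonferroni interpretation are assumptions for illustration, since the abstract states only that the thresholds were determined empirically.

```python
def implied_number_of_tests(p_threshold: float, family_wise_alpha: float = 0.05) -> float:
    """Number of independent tests for which alpha / n equals the given threshold."""
    return family_wise_alpha / p_threshold


# Thresholds quoted in the abstract.
thresholds = {
    "European, Asian, or admixed ancestry": 5e-9,
    "African ancestry": 1e-9,
}

for ancestry, threshold in thresholds.items():
    tests = implied_number_of_tests(threshold)
    print(f"{ancestry}: about {tests:.0e} independent tests")
```

Under these assumptions the stricter threshold corresponds to five times as many independent tests, which matches the abstract's point that sequencing the full allele frequency spectrum yields many more tests than genotyping-based GWAS.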


Subjects
Ethnicity/genetics; Genome, Human; Genome-Wide Association Study/methods; Genome-Wide Association Study/statistics & numerical data; High-Throughput Nucleotide Sequencing/statistics & numerical data; Metagenomics; Polymorphism, Single Nucleotide/genetics; Genotype; Global Health; High-Throughput Nucleotide Sequencing/methods; Humans
13.
Article in German | MEDLINE | ID: mdl-27999872

ABSTRACT

BACKGROUND: The diagnostic use of whole-genome sequencing (WGS) is a growing issue in medical care. Due to limited resources in the public health service, budget-impact analyses are necessary prior to implementation. OBJECTIVE: To conduct a budget-impact analysis for WGS of all newborns and for the diagnostic investigation of tumor patients in different oncologic indications. METHODS: A cost analysis of WGS based on a quality-assured process chart for WGS at the German Cancer Research Center (DKFZ), Heidelberg, constitutes the basis for this evaluation. Data from the National Association of Statutory Health Insurance Funds and the Robert-Koch-Institute, Berlin, were used for calculations of specific clinical applications. RESULTS AND DISCUSSION: WGS in newborn screening leads to costs of €2.85 bn and an increase in total expenditure of 1.41%. Sequencing of all tumor patients would cost approximately €0.84 bn, which corresponds to 0.42% of total expenditure. In all scenarios, considering procedure costs alone results in increasing costs. However, cost discussions should also consider potential savings (reduction of disease-related follow-up costs, improved cost-effectiveness of medical measures, etc.). Such considerations are the subject of economic indication-specific evaluations. WGS has the potential to generate a large number of deterministic findings for which treatment options are limited. Hence, it is necessary to limit WGS to indications for which there is proven medical evidence.
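The two cost figures and their percentage shares can be cross-checked against each other. The sketch below back-calculates the total expenditure each pair implies; the total itself is not stated in the abstract, so the result is an inference from the quoted numbers, and the function name is hypothetical.

```python
def implied_total_expenditure(cost_bn: float, share_percent: float) -> float:
    """Total expenditure (in bn) for which cost_bn equals share_percent of the total."""
    return cost_bn / (share_percent / 100.0)


# Figures quoted in the abstract, in billions of euros.
newborn_total = implied_total_expenditure(2.85, 1.41)  # about 202 bn
tumor_total = implied_total_expenditure(0.84, 0.42)    # about 200 bn

print(f"Implied by newborn screening: {newborn_total:.0f} bn")
print(f"Implied by tumor sequencing:  {tumor_total:.0f} bn")
```

Both scenarios point to a base of roughly €200 bn, so the two reported percentages are consistent with a single underlying expenditure figure.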


Subjects
Chromosome Mapping/economics; Genetic Testing/economics; Health Care Costs/statistics & numerical data; High-Throughput Nucleotide Sequencing/economics; Neonatal Screening/economics; Practice Patterns, Physicians'/economics; Chromosome Mapping/statistics & numerical data; Cost of Illness; Genetic Testing/statistics & numerical data; Germany/epidemiology; High-Throughput Nucleotide Sequencing/statistics & numerical data; Humans; Infant, Newborn; Neonatal Screening/statistics & numerical data; Practice Patterns, Physicians'/statistics & numerical data
14.
Stat Appl Genet Mol Biol ; 15(2): 139-50, 2016 Apr.
Article in English | MEDLINE | ID: mdl-26926866

ABSTRACT

The statistical methodology developed in this study was motivated by our interest in studying neurodevelopment using a mouse brain RNA-Seq data set, in which gene expression levels were measured in multiple layers of the somatosensory cortex across time in both female and male samples. We aim to identify differentially expressed genes between adjacent time points, which may provide insights into the dynamics of brain development. Because of the extremely small sample size (one male and one female at each time point), simple marginal analysis may be underpowered. We propose a Markov random field (MRF)-based approach that capitalizes on the similarity between layers, the temporal dependency and the similarity between sexes. The model parameters are estimated by an efficient EM algorithm with a mean field-like approximation. Simulation results and real data analysis suggest that the proposed model has greater power to detect differentially expressed genes than simple marginal analysis. Our method also reveals biologically interesting results in the mouse brain RNA-Seq data set.


Subjects
High-Throughput Nucleotide Sequencing/statistics & numerical data; Models, Statistical; Sequence Analysis, RNA/statistics & numerical data; Transcriptome/genetics; Animals; Computer Simulation; Female; Gene Expression Profiling/statistics & numerical data; Male; Markov Chains; Mice; Regression Analysis; Sequence Analysis, RNA/methods
15.
Pac Symp Biocomput ; 21: 393-404, 2016.
Article in English | MEDLINE | ID: mdl-26776203

ABSTRACT

We present a feature allocation model to reconstruct tumor subclones based on mutation pairs. The key innovation lies in the use of a pair of proximal single nucleotide variants (SNVs) for the subclone reconstruction as opposed to a single SNV. Using the categorical extension of the Indian buffet process (cIBP), we define the subclones as a vector of categorical matrices corresponding to a set of mutation pairs. Through Bayesian inference we report posterior probabilities of the number, genotypes and population frequencies of subclones in one or more tumor samples. We demonstrate the proposed methods using simulated and real-world data. A free software package is available at http://www.compgenome.org/pairclone.


Subjects
Bayes Theorem; Mutation; Neoplasms/genetics; Statistics, Nonparametric; Algorithms; Computational Biology/methods; Computational Biology/statistics & numerical data; Computer Simulation; Head and Neck Neoplasms/genetics; High-Throughput Nucleotide Sequencing/statistics & numerical data; Humans; Markov Chains; Models, Genetic; Monte Carlo Method; Polymorphism, Single Nucleotide; Software
16.
Pac Symp Biocomput ; 21: 456-67, 2016.
Article in English | MEDLINE | ID: mdl-26776209

ABSTRACT

Small non-coding RNAs (sRNAs) are regulatory RNA molecules that have been identified in a multitude of bacterial species and shown to control numerous cellular processes through various regulatory mechanisms. In the last decade, next generation RNA sequencing (RNA-seq) has been used for the genome-wide detection of bacterial sRNAs. Here we describe sRNA-Detect, a novel approach to identify expressed small transcripts from prokaryotic RNA-seq data. Using RNA-seq data from three bacterial species and two sequencing platforms, we performed a comparative assessment of five computational approaches for the detection of small transcripts. We demonstrate that sRNA-Detect improves upon current standalone computational approaches for identifying novel small transcripts in bacteria.


Subjects
High-Throughput Nucleotide Sequencing/statistics & numerical data; RNA, Bacterial/genetics; RNA, Small Untranslated/genetics; Sequence Analysis, RNA/statistics & numerical data; Algorithms; Base Sequence; Computational Biology/methods; Computational Biology/statistics & numerical data; Databases, Nucleic Acid/statistics & numerical data; Deinococcus/genetics; Erwinia amylovora/genetics; Markov Chains; Rhodobacter capsulatus/genetics; Software; Software Design
17.
PLoS One ; 10(6): e0131166, 2015.
Article in English | MEDLINE | ID: mdl-26115441

ABSTRACT

Next Generation Sequencing (NGS) methods are driving profound changes in biomedical research, with a growing impact on patient care. Many academic medical centers are evaluating potential models to prepare for the rapid increase in NGS information needs. This study sought to investigate (1) how and where sequencing data is generated and analyzed, (2) research objectives and goals for NGS, (3) workforce capacity and unmet needs, (4) storage capacity and unmet needs, (5) available and anticipated funding resources, and (6) future challenges. As a precursor to informed decision making at our institution, we undertook a systematic needs assessment of investigators using survey methods. We recruited 331 investigators from over 60 departments and divisions at the University of Pittsburgh Schools of Health Sciences and had 140 respondents, or a 42% response rate. Results suggest that both sequencing and analysis bottlenecks currently exist. Significant educational needs were identified, including both investigator-focused needs, such as selection of NGS methods suitable for specific research objectives, and program-focused needs, such as support for training an analytic workforce. The absence of centralized infrastructure was identified as an important institutional gap. Key principles for organizations managing this change were formulated based on the survey responses. This needs assessment provides an in-depth case study which may be useful to other academic medical centers as they identify and plan for future needs.


Subjects
Academic Medical Centers; Biomedical Research; High-Throughput Nucleotide Sequencing/statistics & numerical data; Needs Assessment; Academic Medical Centers/economics; Academic Medical Centers/organization & administration; Biomedical Research/economics; Biomedical Research/instrumentation; Biomedical Research/methods; Cooperative Behavior; Costs and Cost Analysis; High-Throughput Nucleotide Sequencing/economics; High-Throughput Nucleotide Sequencing/instrumentation; High-Throughput Nucleotide Sequencing/methods; Humans; Knowledge; Research Personnel/statistics & numerical data; Surveys and Questionnaires; Workforce
18.
Pac Symp Biocomput ; : 467-78, 2015.
Article in English | MEDLINE | ID: mdl-25592605

ABSTRACT

In this paper, we present a novel feature allocation model to describe tumor heterogeneity (TH) using next-generation sequencing (NGS) data. Taking a Bayesian approach, we extend the Indian buffet process (IBP) to define a class of nonparametric models, the categorical IBP (cIBP). A cIBP takes categorical values to denote homozygous or heterozygous genotypes at each SNV. We define a subclone as a vector of these categorical values, each corresponding to an SNV. Instead of partitioning somatic mutations into non-overlapping clusters with similar cellular prevalences, we take a different approach using feature allocation. Importantly, we do not assume that somatic mutations with similar cellular prevalence must be from the same subclone, and we allow overlapping mutations shared across subclones. We argue that this is closer to the underlying theory of phylogenetic clonal expansion, as somatic mutations that occurred in parent subclones should be shared across the parent and child subclones. Bayesian inference yields posterior probabilities of the number, genotypes, and proportions of subclones in a tumor sample, thereby providing point estimates as well as variabilities of the estimates for each subclone. We report results on both simulated and real data. BayClone is available at http://health.bsd.uchicago.edu/yji/soft.html.


Subjects
High-Throughput Nucleotide Sequencing/statistics & numerical data; Models, Statistical; Neoplasms/genetics; Software; Bayes Theorem; Computational Biology; Computer Simulation; Humans; Likelihood Functions; Lung Neoplasms/genetics; Markov Chains; Monte Carlo Method; Mutation; Polymorphism, Single Nucleotide; Statistics, Nonparametric
19.
Brief Bioinform ; 16(2): 232-41, 2015 Mar.
Article in English | MEDLINE | ID: mdl-24562872

ABSTRACT

Solid tumor samples typically contain multiple distinct clonal populations of cancer cells, as well as contaminating stromal and immune cells. The majority of cancer genomics and transcriptomics studies do not explicitly consider genetic heterogeneity and impurity, and draw inferences based on mixed populations of cells. Deconvolution of genomic data from heterogeneous samples provides a powerful tool to address this limitation. We discuss several computational tools that enable deconvolution of genomic and transcriptomic data from heterogeneous samples. We also performed a systematic comparative assessment of these tools. If properly used, these tools have the potential to complement single-cell genomics and immunoFISH analyses, and provide novel insights into tumor heterogeneity.


Subjects
Computational Biology/methods; Neoplasms/genetics; Neoplasms/pathology; Gene Expression Profiling/statistics & numerical data; Genome, Human; Genomics/statistics & numerical data; High-Throughput Nucleotide Sequencing/statistics & numerical data; Humans; Software