Results 1 - 20 of 114
1.
PLoS Comput Biol ; 17(11): e1008946, 2021 11.
Article in English | MEDLINE | ID: mdl-34843453

ABSTRACT

Sickle cell disease, a genetic disorder affecting a sizeable global population, manifests in sickle red blood cells (sRBCs) with altered shape and biomechanics. sRBCs show heightened adhesive interactions with inflamed endothelium, triggering painful vascular occlusion events. Numerous studies employ microfluidic-assay-based monitoring tools to quantify characteristics of adhered sRBCs from high-resolution channel images. The current image analysis workflow relies on detailed morphological characterization and cell counting by a specially trained worker; this is time- and labor-intensive and prone to user bias. Here we establish a morphology-based classification scheme to identify two naturally arising sRBC subpopulations, deformable and non-deformable sRBCs, using novel visual markers that link to underlying cell biomechanical properties and hold promise for clinically relevant insights. We then set up a standardized, reproducible, and fully automated image analysis workflow designed to carry out this classification. It relies on a two-part deep neural network architecture that works in tandem to segment channel images and classify adhered cells into subtypes. Network training utilized an extensive data set of images generated by the SCD BioChip, a microfluidic assay that injects clinical whole blood samples into protein-functionalized microchannels, mimicking physiological conditions in the microvasculature. Here we carried out the assay with the sub-endothelial protein laminin. The machine learning approach segmented the resulting channel images with 99.1 ± 0.3% mean IoU and classified detected sRBCs with 96.0 ± 0.3% mean accuracy (both on the validation set, across 5 k-folds), and matched trained personnel in overall characterization of whole channel images with R2 = 0.992, 0.987, and 0.834 for total, deformable, and non-deformable sRBC counts, respectively. Average analysis time per channel image improved by two orders of magnitude (∼2 minutes vs. ∼2-3 hours) over manual characterization. Finally, the network results show an order of magnitude less variance in counts on repeat trials than humans. This kind of standardization is a prerequisite for the viability of any diagnostic technology, making our system suitable for affordable and high-throughput disease monitoring.
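The segmentation metric quoted above, mean intersection-over-union (IoU), reduces to a simple set computation per image. A minimal sketch (illustrative only, not the authors' pipeline):

```python
def iou(pred, truth):
    """Intersection-over-union of two binary masks, each given as a
    collection of pixel coordinates."""
    pred, truth = set(pred), set(truth)
    union = pred | truth
    if not union:
        return 1.0  # two empty masks agree perfectly
    return len(pred & truth) / len(union)

# Toy example: two 2x2 masks sharing 2 pixels out of 6 total
pred  = [(0, 0), (0, 1), (1, 0), (1, 1)]
truth = [(0, 1), (1, 1), (0, 2), (1, 2)]
print(iou(pred, truth))  # 2 / 6 -> 0.333...
```

In practice the mean IoU reported above would be this quantity averaged over all validation images.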


Subjects
Anemia, Sickle Cell/blood; Deep Learning; Erythrocytes, Abnormal/classification; Microfluidics/statistics & numerical data; Anemia, Sickle Cell/diagnostic imaging; Biophysical Phenomena; Computational Biology; Diagnosis, Computer-Assisted/statistics & numerical data; Erythrocyte Deformability/physiology; Erythrocytes, Abnormal/pathology; Erythrocytes, Abnormal/physiology; Hemoglobin, Sickle/chemistry; Hemoglobin, Sickle/metabolism; High-Throughput Screening Assays/statistics & numerical data; Humans; Image Interpretation, Computer-Assisted/statistics & numerical data; In Vitro Techniques; Lab-On-A-Chip Devices/statistics & numerical data; Laminin/metabolism; Neural Networks, Computer; Protein Multimerization
2.
Bioprocess Biosyst Eng ; 45(3): 503-514, 2022 Mar.
Article in English | MEDLINE | ID: mdl-35031864

ABSTRACT

The coronavirus disease 2019 (COVID-19) pandemic caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has had severe consequences for health and the global economy. To control transmission, there is an urgent demand for early diagnosis and treatment in the general population. In the present study, an automatic system for SARS-CoV-2 diagnosis was designed and built to deliver high specificity, high sensitivity, and high throughput with minimal workforce involvement. The system, based on cross-priming amplification (CPA) rather than conventional reverse transcription-polymerase chain reaction (RT-PCR), was evaluated using more than 1000 real-world samples for direct comparison. This fully automated robotic system performed SARS-CoV-2 nucleic acid-based diagnosis on 192 samples in under 180 min at 100 copies per reaction in a "specimen in, data out" manner. This throughput translates to a daily screening capacity of 800-1000 samples in an assembly-line manner with limited workforce involvement. The sensitivity of this device could be further improved using a CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats)-based assay, which opens the door to mixed samples and potentially to SARS-CoV-2 variant screening in extensively scaled testing for fighting COVID-19.


Subjects
COVID-19 Nucleic Acid Testing/methods; COVID-19/diagnosis; SARS-CoV-2; Algorithms; Biomedical Engineering/instrumentation; Biomedical Engineering/methods; Biomedical Engineering/statistics & numerical data; COVID-19/epidemiology; COVID-19/virology; COVID-19 Nucleic Acid Testing/instrumentation; COVID-19 Nucleic Acid Testing/statistics & numerical data; Clustered Regularly Interspaced Short Palindromic Repeats; Equipment Design; High-Throughput Screening Assays/instrumentation; High-Throughput Screening Assays/methods; High-Throughput Screening Assays/statistics & numerical data; Humans; Nucleic Acid Amplification Techniques/instrumentation; Nucleic Acid Amplification Techniques/methods; Nucleic Acid Amplification Techniques/statistics & numerical data; Pandemics; Robotics/instrumentation; Robotics/methods; Robotics/statistics & numerical data; SARS-CoV-2/genetics; Sensitivity and Specificity; Systems Analysis
3.
Brief Bioinform ; 20(1): 299-316, 2019 01 18.
Article in English | MEDLINE | ID: mdl-29028878

ABSTRACT

Drug repurposing (a.k.a. drug repositioning) is the search for new indications or molecular targets distinct from a drug's putative activity, pharmacological effect, or binding specificities. With ever-increasing rates of drug termination in clinical trials, drug repositioning has emerged as an effective hedge against the risk of drug failure. Repositioning offers a way to reverse the grim but real trend that Eroom's law portends for the pharmaceutical and biotech industry, and for drug discovery in general. Further, the advent of high-throughput technologies to explore biological systems has enabled the generation of zettabytes of data and a massive collection of databases that store them. Computational analytics and mining are frequently used as effective tools to explore this byzantine mass of biological and biomedical data. However, advanced computational tools are often difficult to understand or use, limiting their accessibility to scientists without a strong computational background. Hence it is of great importance to build user-friendly interfaces that extend the user base beyond computational scientists to include life scientists, who may have deeper chemical and biological insights. This survey focuses on systematically presenting the available Web-based tools that aid in repositioning drugs.


Subjects
Drug Repositioning/methods; Internet; Software; Algorithms; Binding Sites; Computational Biology/methods; Databases, Pharmaceutical/statistics & numerical data; Drug Discovery/methods; Drug Discovery/statistics & numerical data; Drug Repositioning/statistics & numerical data; High-Throughput Screening Assays/statistics & numerical data; Humans; Ligands; Search Engine
4.
J Transl Med ; 19(1): 29, 2021 01 07.
Article in English | MEDLINE | ID: mdl-33413480

ABSTRACT

BACKGROUND: Limited data are available for rapid and accurate detection of COVID-19 using CT-based machine learning models. This study aimed to investigate the value of chest CT radiomics for diagnosing COVID-19 pneumonia compared with a clinical model and the COVID-19 Reporting and Data System (CO-RADS), and to develop an open-source diagnostic tool based on the constructed radiomics model. METHODS: This study enrolled 115 laboratory-confirmed COVID-19 and 435 non-COVID-19 pneumonia patients (training dataset, n = 379; validation dataset, n = 131; testing dataset, n = 40). Key radiomics features extracted from chest CT images were selected to build a radiomics signature using least absolute shrinkage and selection operator (LASSO) regression. Clinical and clinico-radiomics combined models were constructed. The combined model was further validated in the viral pneumonia cohort and compared with the performance of two radiologists using CO-RADS. Diagnostic performance was assessed by receiver operating characteristic (ROC) curve analysis, calibration curves, and decision curve analysis (DCA). RESULTS: Eight radiomics features and 5 clinical variables were selected to construct the combined radiomics model, which outperformed the clinical model in diagnosing COVID-19 pneumonia with an area under the ROC curve (AUC) of 0.98 and good calibration in the validation cohort. The combined model also performed better in distinguishing COVID-19 from other viral pneumonia, with an AUC of 0.93 compared with 0.75 (P = 0.03) for the clinical model, and 0.69 (P = 0.008) or 0.82 (P = 0.15) for two trained radiologists using CO-RADS. The combined model achieved a sensitivity of 0.85 and a specificity of 0.90. The DCA confirmed the clinical utility of the combined model. An easy-to-use open-source diagnostic tool was developed using the combined model. CONCLUSIONS: The combined radiomics model outperformed the clinical model and CO-RADS for diagnosing COVID-19 pneumonia, which can facilitate more rapid and accurate detection.
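The LASSO step in the METHODS above selects features by shrinking regression coefficients exactly to zero. Its core operation, applied coordinate-wise during fitting, is soft-thresholding; a minimal sketch with made-up coefficients (not the study's data):

```python
def soft_threshold(z, lam):
    """LASSO proximal step: shrink z toward 0, clipping values in
    [-lam, lam] to exactly 0 -- which is what discards weak features."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

# Weak coefficients are eliminated outright; a radiomics signature
# ends up with only the surviving (nonzero) features.
coeffs = [2.5, -0.3, 0.9, -1.8, 0.1]
print([soft_threshold(c, 1.0) for c in coeffs])  # [1.5, 0.0, 0.0, -0.8, 0.0]
```

This is why the study's signature contains only eight features: the penalty zeroed out the rest.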


Subjects
COVID-19 Testing/methods; COVID-19/diagnostic imaging; COVID-19/diagnosis; Pneumonia, Viral/diagnostic imaging; Pneumonia, Viral/diagnosis; SARS-CoV-2; Tomography, X-Ray Computed/methods; Adult; Aged; COVID-19/epidemiology; COVID-19 Testing/statistics & numerical data; China/epidemiology; Female; High-Throughput Screening Assays/methods; High-Throughput Screening Assays/statistics & numerical data; Humans; Machine Learning; Male; Middle Aged; Models, Statistical; Nomograms; Pandemics; Pneumonia, Viral/epidemiology; Radiographic Image Interpretation, Computer-Assisted/methods; Radiographic Image Interpretation, Computer-Assisted/statistics & numerical data; Retrospective Studies; Sensitivity and Specificity; Tomography, X-Ray Computed/statistics & numerical data; Translational Research, Biomedical
5.
Chem Res Toxicol ; 34(9): 2110-2124, 2021 09 20.
Article in English | MEDLINE | ID: mdl-34448577

ABSTRACT

Heart disease remains a significant human health burden worldwide, with a sizeable fraction of morbidity attributable to environmental exposures. However, the extent to which the thousands of chemicals in commerce and the environment may contribute to heart disease morbidity is largely unknown because, in contrast to pharmaceuticals, environmental chemicals are seldom tested for potential cardiotoxicity. Human induced pluripotent stem cell (iPSC)-derived cardiomyocytes have become an informative in vitro model for cardiotoxicity testing of drugs, with the availability of cells from multiple individuals allowing in vitro testing of population variability. In this study, we hypothesized that a panel of iPSC-derived cardiomyocytes from healthy human donors can be used to screen for the potential cardiotoxicity hazard and risk of environmental chemicals. We conducted concentration-response testing of 1029 chemicals (drugs, pesticides, flame retardants, polycyclic aromatic hydrocarbons (PAHs), plasticizers, industrial chemicals, food/flavor/fragrance agents, etc.) in iPSC-derived cardiomyocytes from 5 donors. We used kinetic calcium flux and high-content imaging to derive quantitative measures as inputs into Bayesian population concentration-response modeling of the effects of each chemical. We found that many environmental chemicals pose a hazard to human cardiomyocytes in vitro, with more than half of all chemicals eliciting positive or negative chronotropic or arrhythmogenic effects. However, most of the tested environmental chemicals for which human exposure and high-throughput toxicokinetics data were available had wide margins of exposure and thus do not appear to pose a significant human health risk in the general population. Still, relatively narrow margins of exposure (<100) were estimated for some perfluoroalkyl substances and phthalates, raising concerns that cumulative exposures may pose a cardiotoxicity risk. Collectively, this study demonstrated the value of using a population-based human in vitro model for rapid, high-throughput hazard and risk characterization of chemicals for which little to no cardiotoxicity data are available from guideline studies in animals.
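The margin-of-exposure reasoning above compares a point of departure against an exposure estimate; a minimal sketch with hypothetical values (function name, units, and numbers are illustrative, not from the study):

```python
def margin_of_exposure(pod_uM, exposure_uM):
    """Ratio of the in vitro point of departure to the estimated human
    exposure, in matching units. Narrow margins (e.g. < 100) flag
    chemicals that deserve closer scrutiny."""
    return pod_uM / exposure_uM

# Hypothetical numbers for illustration only
print(margin_of_exposure(50.0, 0.01))  # 5000.0 -> wide margin, low concern
print(margin_of_exposure(5.0, 0.1))    # 50.0   -> narrow (< 100), flagged
```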


Subjects
Cardiotoxicity/etiology; Induced Pluripotent Stem Cells/drug effects; Myocytes, Cardiac/drug effects; Organic Chemicals/toxicity; Bayes Theorem; Biological Assay/statistics & numerical data; Female; High-Throughput Screening Assays/statistics & numerical data; Humans; Male; Reproducibility of Results; Risk Factors
6.
PLoS Comput Biol ; 15(4): e1006867, 2019 04.
Article in English | MEDLINE | ID: mdl-30986217

ABSTRACT

Genome-scale metabolic models provide a valuable context for analyzing data from diverse high-throughput experimental techniques. Models can quantify the activities of diverse pathways and cellular functions. Since some metabolic reactions are only catalyzed in specific environments, several algorithms exist that build context-specific models. However, these methods make differing assumptions that influence the content and associated predictive capacity of the resulting models, such that model content varies more with the extraction method used than with cell type. Here we overcome this problem with a novel framework for inferring the metabolic functions of a cell before model construction. For this, we curated a list of metabolic tasks and developed a framework to infer the activity of these functionalities from transcriptomic data. We protected the data-inferred tasks during the implementation of diverse context-specific model extraction algorithms for 44 cancer cell lines. We show that protecting data-inferred metabolic tasks decreases the variability of models across extraction methods. Furthermore, the resulting models better capture the actual biological variability across cell lines. This study highlights the potential of using biological knowledge, inferred from omics data, to obtain better consensus between existing extraction algorithms, and it provides guidelines for the development of the next generation of data contextualization methods.


Subjects
Metabolic Networks and Pathways; Models, Biological; Algorithms; Animals; Cell Line, Tumor; Computational Biology; Data Interpretation, Statistical; Gene Expression Profiling; Genomics/statistics & numerical data; High-Throughput Screening Assays/statistics & numerical data; Humans; Metabolic Networks and Pathways/genetics; Neoplasms/genetics; Neoplasms/metabolism; Principal Component Analysis
7.
PLoS Comput Biol ; 15(8): e1006813, 2019 08.
Article in English | MEDLINE | ID: mdl-31381559

ABSTRACT

Prediction of compounds that are active against a desired biological target is a common step in drug discovery efforts. Virtual screening methods seek to identify an active-enriched fraction of a library for experimental testing. Where data are too scarce to train supervised learning models for compound prioritization, initial screening must provide the necessary data. Commonly, such an initial library is selected on the basis of chemical diversity, by some pseudo-random process (for example, the first few plates of a larger library), or by selecting an entire smaller library. These approaches may not produce a sufficient number or diversity of actives. An alternative is to select an informer set of screening compounds on the basis of chemogenomic information from previous testing of compounds against a large number of targets. We compare different ways of using chemogenomic data to choose a small informer set of compounds based on previously measured bioactivity data. We develop this Informer-Based-Ranking (IBR) approach using the Published Kinase Inhibitor Sets (PKIS) as the chemogenomic data from which to select the informer sets. We test the informer compounds on a target that is not part of the chemogenomic data, then predict the activity of the remaining compounds based on the experimental informer data and the chemogenomic data. Through new chemical screening experiments, we demonstrate the utility of IBR strategies in a prospective test on three kinase targets not included in the PKIS.


Subjects
Drug Discovery/methods; Protein Kinase Inhibitors/chemistry; Protein Kinase Inhibitors/pharmacology; Cheminformatics/methods; Cheminformatics/statistics & numerical data; Computational Biology; Computer Simulation; Databases, Chemical; Databases, Pharmaceutical; Drug Discovery/statistics & numerical data; Drug Evaluation, Preclinical/methods; Drug Evaluation, Preclinical/statistics & numerical data; High-Throughput Screening Assays/methods; High-Throughput Screening Assays/statistics & numerical data; Humans; Prospective Studies; Protein Serine-Threonine Kinases/antagonists & inhibitors; Protozoan Proteins; Structure-Activity Relationship; User-Computer Interface; Viral Proteins/antagonists & inhibitors
8.
Proteomics ; 19(21-22): e1900109, 2019 11.
Article in English | MEDLINE | ID: mdl-31321850

ABSTRACT

The cancer tissue proteome has enormous potential as a source of novel predictive biomarkers in oncology. Progress in the development of mass spectrometry (MS)-based tissue proteomics now presents an opportunity to exploit this by applying the strategies of comprehensive molecular profiling and big-data analytics that are refined in other fields of 'omics research. ProCan (ProCan is a registered trademark) is a program aiming to generate high-quality tissue proteomic data across a broad spectrum of cancer types. It is based on data-independent acquisition-MS proteomic analysis of annotated tissue samples sourced through collaboration with expert clinical and cancer research groups. The practical requirements of a high-throughput translational research program have shaped the approach that ProCan is taking to address challenges in study design, sample preparation, raw data acquisition, and data analysis. The ultimate goal is to establish a large proteomics knowledge-base that, in combination with other cancer 'omics data, will accelerate cancer research.


Subjects
Neoplasms/genetics; Proteome/genetics; Proteomics/statistics & numerical data; Software; Biomarkers, Tumor/genetics; Data Analysis; High-Throughput Screening Assays/statistics & numerical data; Humans; Mass Spectrometry; Neoplasms/pathology; Specimen Handling
9.
Anal Chem ; 91(22): 14489-14497, 2019 11 19.
Article in English | MEDLINE | ID: mdl-31660729

ABSTRACT

Authentication of Cannabis products is important for assuring manufacturing quality as consumption and regulation increase. In this report, a two-stage pipeline was developed for high-throughput screening and chemotyping of spectra from two sets of botanical extracts from the Cannabis genus. The first set contains marijuana samples with higher concentrations of tetrahydrocannabinol (THC). The other set includes samples from hemp, a variety of Cannabis sativa with a THC concentration below 0.3%. The first stage applies class modeling to determine whether spectra belong to marijuana or hemp and to reject novel spectra that may be neither. An automatic soft independent modeling of class analogy (aSIMCA), which self-optimizes the number of principal components and the decision threshold, is used in this first stage to achieve excellent efficiency and efficacy. Once spectra are recognized by aSIMCA as marijuana or hemp, they are routed to the appropriate classifiers in the second stage for chemotyping, i.e., assigning the spectra to different chemotypes so that their pharmacological properties and cultivars can be recognized. Three multivariate classifiers, a fuzzy rule-building expert system (FuRES), super partial least-squares discriminant analysis (sPLS-DA), and support vector machine tree type entropy (SVMtreeH), are employed for chemotyping. The discriminant ability of the pipeline was evaluated with different spectral data sets of these two groups of botanical samples, including proton nuclear magnetic resonance, mass, and ultraviolet spectra. All evaluations gave good results, with accuracies greater than 95%, demonstrating the promise of the pipeline for automated high-throughput screening and chemotyping of marijuana and hemp, as well as other botanical products.
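The two-stage routing described above (a class-model gate, then a chemotype classifier) can be sketched with a toy distance-to-class-mean gate standing in for aSIMCA. Everything here, including the threshold and the two-channel "spectra", is illustrative, not the paper's method:

```python
import math

def dist(a, b):
    """Euclidean distance between two equal-length spectra."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def two_stage(spectrum, class_means, threshold):
    """Stage 1: reject spectra far from every modeled class (soft class
    modeling). Stage 2: route accepted spectra to the nearest class
    (a stand-in for the downstream chemotyping classifiers)."""
    scores = {name: dist(spectrum, mean) for name, mean in class_means.items()}
    best = min(scores, key=scores.get)
    if scores[best] > threshold:
        return "novel"  # neither marijuana nor hemp
    return best

# Toy 2-channel "spectra" as class models
means = {"marijuana": [0.9, 0.1], "hemp": [0.1, 0.9]}
print(two_stage([0.8, 0.2], means, threshold=0.5))  # marijuana
print(two_stage([5.0, 5.0], means, threshold=0.5))  # novel
```

The key design point mirrored here is that stage 1 is a class *model* (it can say "none of the above"), not a forced-choice classifier.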


Subjects
Cannabis/chemistry; Cannabis/classification; High-Throughput Screening Assays/methods; Plant Extracts/analysis; Discriminant Analysis; Fuzzy Logic; High-Throughput Screening Assays/statistics & numerical data; Least-Squares Analysis; Mass Spectrometry/statistics & numerical data; Models, Chemical; Proton Magnetic Resonance Spectroscopy/statistics & numerical data; Support Vector Machine
10.
BMC Med ; 17(1): 133, 2019 07 17.
Article in English | MEDLINE | ID: mdl-31311528

ABSTRACT

BACKGROUND: There is great interest in and excitement about the concept of personalized or precision medicine and, in particular, advancing this vision via various 'big data' efforts. While these methods are necessary, they are insufficient to achieve the full personalized medicine promise. A rigorous, complementary 'small data' paradigm that can function both autonomously from and in collaboration with big data is also needed. By 'small data' we build on Estrin's formulation and refer to the rigorous use of data by and for a specific N-of-1 unit (i.e., a single person, clinic, hospital, healthcare system, community, city, etc.) to facilitate improved individual-level description, prediction and, ultimately, control for that specific unit. MAIN BODY: The purpose of this piece is to articulate why a small data paradigm is needed and is valuable in itself, and to provide initial directions for future work that can advance study designs and data analytic techniques for a small data approach to precision health. Scientifically, the central value of a small data approach is that it can uniquely manage complex, dynamic, multi-causal, idiosyncratically manifesting phenomena, such as chronic diseases, in comparison to big data. Beyond this, a small data approach better aligns the goals of science and practice, which can result in more rapid agile learning with less data. There is also, feasibly, a unique pathway towards transportable knowledge from a small data approach, which is complementary to a big data approach. Future work should (1) further refine appropriate methods for a small data approach; (2) advance strategies for better integrating a small data approach into real-world practices; and (3) advance ways of actively integrating the strengths and limitations from both small and big data approaches into a unified scientific knowledge base that is linked via a robust science of causality. CONCLUSION: Small data is valuable in its own right. 
That said, small and big data paradigms can and should be combined via a foundational science of causality. With these approaches combined, the vision of precision health can be achieved.


Subjects
Data Interpretation, Statistical; Datasets as Topic/supply & distribution; Precision Medicine; Cooperative Behavior; Data Science/methods; Data Science/trends; Datasets as Topic/standards; Datasets as Topic/statistics & numerical data; Delivery of Health Care/methods; Delivery of Health Care/statistics & numerical data; High-Throughput Screening Assays/methods; High-Throughput Screening Assays/statistics & numerical data; Humans; Learning; Precision Medicine/methods; Precision Medicine/statistics & numerical data; Small-Area Analysis
11.
Plasmid ; 101: 28-34, 2019 01.
Article in English | MEDLINE | ID: mdl-30599142

ABSTRACT

Horizontal gene transfer is an essential component of bacterial evolution. Quantitative information on transfer rates is particularly useful for understanding, and possibly predicting, the spread of antimicrobial resistance. A variety of methods has been proposed to estimate the rates of plasmid-mediated gene transfer, all of which require substantial labor or financial resources. A cheap but reliable method with high-throughput capabilities has yet to be developed to better capture the variability of plasmid transfer rates, e.g., among strains or in response to environmental cues. We explored a new approach to culture-based estimation of plasmid transfer rates in liquid media that allows for a large number of parallel experiments. It deviates from established approaches in that it exploits data on the absence/presence of transconjugant cells in the wells of a well plate observed over time. Specifically, the binary observations are compared to the probability of transconjugant detection as predicted by a dynamic model. The bulk transfer rate is found as the best-fit value of a designated model parameter. The feasibility of the approach is demonstrated on mating experiments in which the RP4 plasmid is transferred from Serratia marcescens to several Escherichia coli recipients. The method's uncertainty is explored via split sampling and virtual experiments.
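The estimation idea, matching binary detect/no-detect well observations against a model-predicted detection probability, can be sketched with a deliberately simplified Poisson stand-in for the paper's dynamic model. The rate form, well data, and search grid below are all assumptions for illustration:

```python
import math

def p_detect(gamma, donors, recipients, t):
    """P(at least one transconjugant by time t), assuming transconjugant
    formation is a Poisson process with rate gamma * D * R (a toy model,
    not the dynamic model of the paper)."""
    return 1.0 - math.exp(-gamma * donors * recipients * t)

def log_likelihood(gamma, wells):
    """wells: list of (donors, recipients, time, detected) observations;
    the Bernoulli likelihood of the binary detection outcomes."""
    ll = 0.0
    for d, r, t, hit in wells:
        p = p_detect(gamma, d, r, t)
        ll += math.log(p if hit else 1.0 - p)
    return ll

# Synthetic well-plate data; grid search for the best-fit bulk rate
wells = [(1e3, 1e3, 2.0, True), (1e3, 1e3, 0.5, False), (1e2, 1e3, 2.0, False)]
grid = [10 ** (k / 4) * 1e-8 for k in range(0, 13)]
best = max(grid, key=lambda g: log_likelihood(g, wells))
print(f"best-fit gamma ~ {best:.2e}")
```

The real method replaces this toy formation model with the paper's dynamic model and explores uncertainty via split sampling, but the fit-by-likelihood-over-binary-outcomes structure is the same.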


Subjects
Escherichia coli/genetics; Gene Transfer, Horizontal; Models, Statistical; Plasmids/metabolism; Serratia marcescens/genetics; Biological Evolution; Conjugation, Genetic; High-Throughput Screening Assays/statistics & numerical data; Plasmids/chemistry; Uncertainty
12.
Nat Rev Genet ; 14(2): 89-99, 2013 Feb.
Article in English | MEDLINE | ID: mdl-23269463

ABSTRACT

Our understanding of gene expression has changed dramatically over the past decade, largely catalysed by technological developments. High-throughput experiments - microarrays and next-generation sequencing - have generated large amounts of genome-wide gene expression data that are collected in public archives. Added-value databases process, analyse and annotate these data further to make them accessible to every biologist. In this Review, we discuss the utility of the gene expression data that are in the public domain and how researchers are making use of these data. Reuse of public data can be very powerful, but there are many obstacles in data preparation and analysis and in the interpretation of the results. We will discuss these challenges and provide recommendations that we believe can improve the utility of such data.


Subjects
Databases, Genetic; Gene Expression Profiling/statistics & numerical data; Oligonucleotide Array Sequence Analysis/statistics & numerical data; Public Sector; Animals; Computational Biology; Databases, Genetic/standards; Databases, Genetic/statistics & numerical data; High-Throughput Nucleotide Sequencing/statistics & numerical data; High-Throughput Screening Assays/statistics & numerical data; Humans
13.
Rev Epidemiol Sante Publique ; 67 Suppl 1: S19-S23, 2019 Feb.
Article in French | MEDLINE | ID: mdl-30635133

ABSTRACT

Big Data, the production of massive amounts of heterogeneous data, is often presented as a means to ensure the economic survival and sustainability of health systems. According to this perspective, Big Data could help save the spirit of our welfare states, based on the principles of risk-sharing and equal access to care for all. According to a second, opposing perspective, Big Data would fuel a process of demutualization, transferring to individuals a growing share of responsibility for managing their health. This article proposes a third approach: Big Data does not induce a loss of solidarity but a transformation of the European model of the welfare state. It is now the data themselves that are pooled, and individual and collective responsibilities are thus redistributed. However, this model, novel as it is, remains liberal in inspiration; it essentially allows the continuation of political liberalism by other means.


Subjects
Altruism; Datasets as Topic; Delivery of Health Care; Inventions; Behavioral Sciences; Datasets as Topic/standards; Datasets as Topic/supply & distribution; Datasets as Topic/trends; Delivery of Health Care/organization & administration; Delivery of Health Care/standards; Delivery of Health Care/trends; Genetic Testing/trends; High-Throughput Screening Assays/standards; High-Throughput Screening Assays/statistics & numerical data; High-Throughput Screening Assays/trends; Humans; Individuality; Inventions/trends; Precision Medicine/adverse effects; Precision Medicine/methods; Precision Medicine/standards; Precision Medicine/trends; Quality Improvement/trends; Risk Factors; Social Justice; Social Security
14.
Biometrics ; 74(3): 803-813, 2018 09.
Article in English | MEDLINE | ID: mdl-29192968

ABSTRACT

The outcome of high-throughput biological experiments is affected by many operational factors in the experimental and data-analytical procedures. Understanding how these factors affect the reproducibility of the outcome is critical for establishing workflows that produce replicable discoveries. In this article, we propose a regression framework, based on a novel cumulative link model, to assess the covariate effects of operational factors on the reproducibility of findings from high-throughput experiments. In contrast to existing graphical approaches, our method allows one to succinctly characterize the simultaneous and independent effects of covariates on reproducibility and to compare reproducibility while controlling for potential confounding variables. We also establish a connection between our model and certain Archimedean copula models. This connection not only offers our regression framework an interpretation in copula models, but also provides guidance on choosing the functional forms of the regression. Furthermore, it also opens a new way to interpret and utilize these copulas in the context of reproducibility. Using simulations, we show that our method produces calibrated type I error and is more powerful in detecting difference in reproducibility than existing measures of agreement. We illustrate the usefulness of our method using a ChIP-seq study and a microarray study.
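The cumulative link model at the heart of this regression framework maps covariates to probabilities of ordered outcome categories. A minimal proportional-odds sketch with hypothetical cutpoints and a single covariate (the real model, its link, and its covariates differ):

```python
import math

def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))

def category_probs(cutpoints, x, beta):
    """Cumulative link model: P(Y <= j | x) = logistic(theta_j - x . beta),
    differenced into per-category probabilities for ordered classes
    (e.g. low / medium / high reproducibility). Toy illustration."""
    eta = sum(xi * bi for xi, bi in zip(x, beta))
    cum = [logistic(theta - eta) for theta in cutpoints] + [1.0]
    return [cum[0]] + [cum[j] - cum[j - 1] for j in range(1, len(cum))]

# Two cutpoints -> three ordered categories; one operational covariate
# (say, sequencing depth) with a hypothetical positive effect.
theta = [-1.0, 1.0]
for depth in (0.0, 2.0):
    p = category_probs(theta, [depth], [0.8])
    print([round(v, 3) for v in p])
```

Raising the covariate shifts probability mass toward the highest category, which is exactly the kind of "effect of an operational factor on reproducibility" the regression framework quantifies.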


Subjects
Confounding Factors, Epidemiologic; High-Throughput Screening Assays/statistics & numerical data; Regression Analysis; Algorithms; Binding Sites; CCCTC-Binding Factor/chemistry; Calibration; Computer Simulation; Gene Expression Profiling/statistics & numerical data; High-Throughput Screening Assays/standards; Humans; Microarray Analysis/statistics & numerical data; Models, Statistical; Reproducibility of Results
15.
Arch Toxicol ; 92(9): 2913-2922, 2018 09.
Article in English | MEDLINE | ID: mdl-29995190

ABSTRACT

The development and application of high-throughput in vitro assays is an important development for risk assessment in the twenty-first century. However, there are still significant challenges in incorporating in vitro assays into routine toxicity-testing practice. In this paper, a robust learning approach was developed to infer the in vivo point of departure (POD) from in vitro assay data from the ToxCast and Tox21 projects. Assay data from these projects were used to derive in vitro PODs for several hundred chemicals. These were combined with in vivo PODs from ToxRefDB for rat and mouse liver to build a high-dimensional robust regression model. This approach separates the chemicals into a majority, well-predicted set and a minority, outlier set, so that salient relationships can be learned from the data. For both mouse and rat liver PODs, over 93% of chemicals have values inferred from in vitro PODs that fall within ±1 of the in vivo PODs on the log10 scale (the target learning region, or TLR), with an R2 of 0.80 (rats) and 0.78 (mice) for these chemicals. This is comparable with extrapolation between related species (mouse and rat), which places 93% of chemicals within the TLR with an R2 of 0.78. Chemicals in the outlier set also tend to have more biologically variable characteristics. With the continued accumulation of high-throughput data for a wide range of chemicals, predictive modeling can provide a valuable complement to adverse outcome pathway-based approaches in risk assessment.
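The headline summaries (fraction of chemicals inside the target learning region, plus R2 on that subset) are simple to compute once paired PODs are available. A minimal sketch with fabricated log10 POD pairs (the numbers below are illustrative, not from ToxRefDB or ToxCast):

```python
# Hypothetical paired PODs on the log10(mg/kg/day) scale:
# (in vitro-inferred, in vivo) for a handful of chemicals.
pod_pairs = [
    (1.2, 1.0), (0.3, 0.9), (2.1, 1.8), (-0.5, -0.2),
    (1.7, 1.5), (0.0, 2.4),  # the last pair mimics an "outlier-set" chemical
]

def in_tlr(pred, obs, half_width=1.0):
    """Target learning region: |predicted - observed| <= 1 log10 unit."""
    return abs(pred - obs) <= half_width

def r_squared(pairs):
    """Coefficient of determination of predictions against observations."""
    obs_mean = sum(o for _, o in pairs) / len(pairs)
    ss_res = sum((o - p) ** 2 for p, o in pairs)
    ss_tot = sum((o - obs_mean) ** 2 for _, o in pairs)
    return 1.0 - ss_res / ss_tot

tlr_pairs = [(p, o) for p, o in pod_pairs if in_tlr(p, o)]
tlr_fraction = len(tlr_pairs) / len(pod_pairs)
print(f"TLR fraction: {tlr_fraction:.2f}, R^2 within TLR: {r_squared(tlr_pairs):.2f}")
```

On real data the robust regression decides which chemicals land in the outlier set; here the split is simply read off the TLR criterion for illustration.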


Subjects
Models, Theoretical; Toxicity Tests, Chronic/methods; Animals; Databases, Factual; High-Throughput Screening Assays/methods; High-Throughput Screening Assays/statistics & numerical data; Humans; Liver/drug effects; Mice; Rats; Toxicity Tests, Chronic/statistics & numerical data
16.
Methods ; 96: 27-32, 2016 Mar 01.
Article in English | MEDLINE | ID: mdl-26476368

ABSTRACT

High content screening (HCS) experiments create a classic data management challenge: multiple, large sets of heterogeneous structured and unstructured data that must be integrated and linked to produce a set of "final" results. These data include images, reagents, protocols, analytic output, and phenotypes, all of which must be stored, linked, and made accessible to users, scientists, collaborators and, where appropriate, the wider community. The OME Consortium has built several open-source tools for managing, linking and sharing these different types of data. The OME Data Model is a metadata specification that supports the image data and metadata recorded in HCS experiments. Bio-Formats is a Java library that reads recorded image data and metadata, with support for several HCS screening systems. OMERO is an enterprise data management application that integrates image data with experimental and analytic metadata and makes them accessible for visualization, mining, sharing and downstream analysis. We discuss how Bio-Formats and OMERO handle these different data types, and how they can be used to integrate, link and share HCS experiments in facilities and public data repositories. OME specifications and software are open source and are available at https://www.openmicroscopy.org.


Subjects
Computational Biology/statistics & numerical data; Data Mining/statistics & numerical data; High-Throughput Screening Assays/statistics & numerical data; Information Storage and Retrieval/statistics & numerical data; Software; Computational Biology/methods; Datasets as Topic; High-Throughput Screening Assays/methods; Humans; Information Dissemination; Information Storage and Retrieval/methods; Internet
17.
Methods ; 96: 12-26, 2016 Mar 01.
Article in English | MEDLINE | ID: mdl-26476369

ABSTRACT

Heterogeneity is well recognized as a common property of cellular systems that impacts biomedical research and the development of therapeutics and diagnostics. Several studies have shown that analysis of heterogeneity gives insight into mechanisms of action of perturbagens, can be used to predict optimal combination therapies, and can be applied to tumors, where heterogeneity is believed to be associated with adaptation and resistance. Cytometry methods, including high content screening (HCS), high-throughput microscopy, flow cytometry, mass spectrometry imaging and digital pathology, capture cell-level data for populations of cells. However, it is often assumed that the population response is normally distributed and that the average therefore adequately describes the results. A deeper understanding of the measurements, and more effective comparison of perturbagen effects, requires analysis that takes the distribution of the measurements, i.e. the heterogeneity, into account. However, the reproducibility of heterogeneous data collected on different days and in different plates or slides has not previously been evaluated. Here we show that conventional assay quality metrics alone are not adequate for quality control of the heterogeneity in the data. To address this need, we demonstrate the use of the Kolmogorov-Smirnov statistic as a metric for monitoring the reproducibility of heterogeneity in an SAR screen, and describe a workflow for quality control in heterogeneity analysis. One major challenge in high-throughput biology is the evaluation and interpretation of heterogeneity across thousands of samples, such as compounds in a cell-based screen. In this study we also demonstrate that three previously reported heterogeneity indices capture the shapes of the distributions and provide a means to filter and browse large data sets of cellular distributions in order to compare and identify distributions of interest.
These metrics and methods are presented as a workflow for the analysis of heterogeneity in large-scale biology projects.
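The two-sample Kolmogorov-Smirnov statistic used here for heterogeneity QC is easy to compute from scratch: it is the maximum vertical distance between the two empirical CDFs. A pure-Python sketch on simulated per-cell intensities (the "plates" below are fabricated, not the authors' screen):

```python
import random

def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: maximum vertical distance between the
    empirical CDFs of the two samples (ties handled by advancing both)."""
    a, b = sorted(sample_a), sorted(sample_b)
    na, nb = len(a), len(b)
    i = j = 0
    d = 0.0
    while i < na and j < nb:
        x = min(a[i], b[j])
        while i < na and a[i] == x:
            i += 1
        while j < nb and b[j] == x:
            j += 1
        d = max(d, abs(i / na - j / nb))
    return d

rng = random.Random(42)
# Two replicate "plates" of per-cell intensities from the same population,
# and one plate whose distribution has drifted (a QC failure to flag).
plate_day1 = [rng.gauss(100.0, 15.0) for _ in range(500)]
plate_day2 = [rng.gauss(100.0, 15.0) for _ in range(500)]
plate_drift = [rng.gauss(130.0, 15.0) for _ in range(500)]

ks_replicate = ks_statistic(plate_day1, plate_day2)
ks_drifted = ks_statistic(plate_day1, plate_drift)
print(f"replicate KS = {ks_replicate:.3f}, drifted KS = {ks_drifted:.3f}")
```

A small KS value between replicate plates and a large one against the drifted plate is exactly the behavior that makes the statistic useful as a reproducibility monitor: unlike a mean-based metric, it reacts to any change in the shape of the distribution.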


Subjects
Epithelial Cells/ultrastructure; Flow Cytometry/statistics & numerical data; Gene Expression Regulation, Neoplastic; High-Throughput Screening Assays/statistics & numerical data; Microscopy/statistics & numerical data; Molecular Imaging/statistics & numerical data; Cell Line, Tumor; Decision Trees; Epithelial Cells/drug effects; Epithelial Cells/metabolism; Flow Cytometry/standards; High-Throughput Screening Assays/standards; Humans; Interleukin-6/pharmacology; Microscopy/standards; Molecular Imaging/standards; Phenotype; Quality Control; Reproducibility of Results; STAT3 Transcription Factor/genetics; STAT3 Transcription Factor/metabolism; Signal Transduction; Statistics, Nonparametric
18.
Nucleic Acids Res ; 43(6): e41, 2015 Mar 31.
Article in English | MEDLINE | ID: mdl-25586222

ABSTRACT

Defining the RNA target selectivity of the proteins regulating mRNA metabolism is a key issue in RNA biology. Here we present a novel use of principal component analysis (PCA) to extract the RNA sequence preference of RNA binding proteins. We show that PCA can be used to compare the changes in the nuclear magnetic resonance (NMR) spectrum of a protein upon binding a set of quasi-degenerate RNAs and define the nucleobase specificity. We couple this application of PCA to an automated NMR spectra recording and processing protocol and obtain an unbiased and high-throughput NMR method for the analysis of nucleobase preference in protein-RNA interactions. We test the method on the RNA binding domains of three important regulators of RNA metabolism.


Subjects
High-Throughput Screening Assays/methods; Nuclear Magnetic Resonance, Biomolecular/methods; RNA-Binding Proteins/metabolism; RNA/genetics; RNA/metabolism; Base Sequence; DNA-Binding Proteins/chemistry; DNA-Binding Proteins/metabolism; High-Throughput Screening Assays/statistics & numerical data; Humans; Models, Molecular; Principal Component Analysis; Protein Interaction Domains and Motifs; RNA-Binding Proteins/chemistry; Recombinant Proteins/chemistry; Recombinant Proteins/metabolism; Saccharomyces cerevisiae Proteins/chemistry; Saccharomyces cerevisiae Proteins/metabolism; mRNA Cleavage and Polyadenylation Factors/chemistry; mRNA Cleavage and Polyadenylation Factors/metabolism
19.
Biostatistics ; 16(3): 611-25, 2015 Jul.
Article in English | MEDLINE | ID: mdl-25792622

ABSTRACT

A number of biomedical problems require performing many hypothesis tests, with an attendant need to apply stringent thresholds. Often the data take the form of a series of predictor vectors, each of which must be compared with a single response vector, perhaps with nuisance covariates. Parametric tests of association are often used, but can result in inaccurate type I error at the extreme thresholds, even for large sample sizes. Furthermore, standard two-sided testing can reduce power compared with the doubled p-value, due to asymmetry in the null distribution. Exact (permutation) testing is attractive, but can be computationally intensive and cumbersome. We present an approximation to exact association tests of trend that is accurate and fast enough for standard use in high-throughput settings, and can easily provide standard two-sided or doubled p-values. The approach is shown to be equivalent under permutation to likelihood ratio tests for the most commonly used generalized linear models (GLMs). For linear regression, covariates are handled by working with covariate-residualized responses and predictors. For GLMs, stratified covariates can be handled in a manner similar to exact conditional testing. Simulations and examples illustrate the wide applicability of the approach. The accompanying mcc package is available on CRAN: http://cran.r-project.org/web/packages/mcc/index.html.
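The doubled p-value the abstract refers to is twice the smaller of the two one-sided p-values, capped at 1. The paper's contribution is a fast approximation to the exact permutation version; the brute-force reference computation it approximates can be sketched directly for a score-type trend statistic on toy genotype/response data (all data below are simulated for illustration):

```python
import random

def trend_stat(g, y):
    """Score-type trend statistic: inner product of genotype and response."""
    return sum(gi * yi for gi, yi in zip(g, y))

def doubled_perm_pvalue(g, y, n_perm=999, seed=0):
    """Doubled p-value: twice the smaller one-sided permutation p-value,
    capped at 1. Uses the add-one convention to avoid zero p-values."""
    rng = random.Random(seed)
    obs = trend_stat(g, y)
    n_ge = n_le = 0
    yp = list(y)
    for _ in range(n_perm):
        rng.shuffle(yp)
        s = trend_stat(g, yp)
        n_ge += s >= obs
        n_le += s <= obs
    p_upper = (n_ge + 1) / (n_perm + 1)
    p_lower = (n_le + 1) / (n_perm + 1)
    return min(1.0, 2.0 * min(p_upper, p_lower))

# Toy data: 30 samples with genotypes coded 0/1/2 and a response that
# tracks genotype plus noise (a clear positive trend).
rng = random.Random(7)
genotypes = [rng.choice((0, 1, 2)) for _ in range(30)]
response = [gi + rng.gauss(0.0, 0.5) for gi in genotypes]
p_signal = doubled_perm_pvalue(genotypes, response)

null_response = [rng.gauss(0.0, 1.0) for _ in range(30)]
p_null = doubled_perm_pvalue(genotypes, null_response)
print(f"doubled p (trend): {p_signal:.3f}, doubled p (null): {p_null:.3f}")
```

Permuting only the response (with genotypes fixed) is what makes this a test of association; in a genome-wide setting this naive loop is exactly the cost the paper's approximation avoids.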


Subjects
High-Throughput Screening Assays/statistics & numerical data; Biostatistics; Breast Neoplasms/genetics; Computer Simulation; Cystic Fibrosis/genetics; Databases, Nucleic Acid/statistics & numerical data; Female; High-Throughput Nucleotide Sequencing/statistics & numerical data; Humans; Likelihood Functions; Linear Models; Polymorphism, Single Nucleotide; Sample Size; Software
20.
Brief Bioinform ; 15(2): 229-43, 2014 Mar.
Article in English | MEDLINE | ID: mdl-23620135

ABSTRACT

Identifying early warning signals of critical transitions during disease progression is key to achieving early diagnosis of complex diseases. By exploiting the rich information in high-throughput data, a novel model-free method has been developed to detect early warning signals of diseases. Its theoretical foundation is the dynamical network biomarker (DNB), also called the driver (or leading) network of the disease, because components or molecules in the DNB actually drive the whole system from one state (e.g. the normal state) to another (e.g. the disease state). In this article, we first review the concept and main results of DNB theory, and then apply the method to the analysis of type 2 diabetes mellitus (T2DM). Specifically, based on temporal-spatial gene expression data for T2DM, we identified tissue-specific DNBs corresponding to the critical transitions occurring in liver, adipose and muscle during T2DM development and progression. We found two distinct critical states during T2DM development, characterized as responses to insulin resistance and to serious inflammation, respectively. Interestingly, a new T2DM-associated function, steroid hormone biosynthesis, was discovered, with the related genes significantly dysregulated in liver and adipose tissue at the first critical transition during T2DM deterioration. Dysfunction of hormone-response genes was also detected in muscle at a similar stage. Based on functional and network analysis of the pathogenic molecular mechanism of T2DM, we show that most DNB genes, in particular the core ones, tend to be located upstream in biological pathways, implying that DNB genes act as causal factors, rather than consequences, driving downstream molecules to change their transcriptional activities. This also validates our theoretical prediction of the DNB as the driver network.
As shown in this study, the DNB can not only signal the emergence of critical transitions for early diagnosis of diseases, but can also provide the causal network of the transitions, revealing molecular mechanisms of disease initiation and progression at the network level.
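The DNB criterion can be made concrete with a toy composite index. In the original DNB formulation, a candidate gene module signals an approaching transition when the standard deviation of its members rises, correlation within the module rises, and correlation between the module and other genes falls; these are often combined as I = SD_in x PCC_in / PCC_out. The simulation below is a sketch under that assumption, with fabricated expression data (a shared fluctuation drives the module near the critical point):

```python
import random

def mean(xs):
    return sum(xs) / len(xs)

def sd(xs):
    m = mean(xs)
    return (sum((x - m) ** 2 for x in xs) / (len(xs) - 1)) ** 0.5

def pcc(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    mx, my = mean(xs), mean(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = (sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys)) ** 0.5
    return num / den

def dnb_index(module, others):
    """Composite index I = <SD within module> * <|PCC| within module>
    / <|PCC| between module and other genes>."""
    sd_in = mean([sd(g) for g in module])
    pcc_in = mean([abs(pcc(a, b)) for i, a in enumerate(module)
                   for b in module[i + 1:]])
    pcc_out = mean([abs(pcc(a, b)) for a in module for b in others])
    return sd_in * pcc_in / pcc_out

def simulate(n_samples, fluct, rng):
    """Three module genes driven by a shared factor of size `fluct`,
    plus three independent background genes."""
    shared = [rng.gauss(0.0, fluct) for _ in range(n_samples)]
    module = [[s + rng.gauss(0.0, 0.3) for s in shared] for _ in range(3)]
    others = [[rng.gauss(0.0, 0.3) for _ in range(n_samples)] for _ in range(3)]
    return module, others

rng = random.Random(3)
i_normal = dnb_index(*simulate(50, 0.1, rng))    # far from the transition
i_critical = dnb_index(*simulate(50, 2.0, rng))  # near the critical point
print(f"I(normal) = {i_normal:.2f}, I(pre-transition) = {i_critical:.2f}")
```

A sharp rise in the index for one tissue-specific module, against flat indices elsewhere, is the kind of signal used in the T2DM analysis to flag an imminent critical transition.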


Subjects
Biomarkers/metabolism; Diabetes Mellitus, Type 2/diagnosis; Diabetes Mellitus, Type 2/metabolism; Algorithms; Animals; Computational Biology/methods; Diabetes Mellitus, Type 2/genetics; Disease Progression; Early Diagnosis; Genetic Markers; High-Throughput Screening Assays/statistics & numerical data; Humans; Models, Biological; Rats; Signal Transduction; Steroids/biosynthesis; Systems Biology; Tissue Distribution