Results 1 - 20 of 39
1.
Environ Sci Technol ; 56(18): 13473-13484, 2022 09 20.
Article in English | MEDLINE | ID: mdl-36048618

ABSTRACT

Rapid progress in various advanced analytical methods, such as single-cell technologies, enables unprecedented and deeper understanding of microbial ecology beyond the resolution of conventional approaches. A major application challenge lies in determining a sufficient sample size without sufficient prior knowledge of the community complexity, while balancing statistical power against limited time or resources. This hinders the desired standardization and wider application of these technologies. Here, we proposed, tested, and validated a computational sampling size assessment protocol that takes advantage of a metric named kernel divergence. This metric has two advantages: First, it directly compares data set-wise distributional differences with no requirement for human intervention or prior knowledge-based preclassification. Second, data processing makes minimal assumptions about the distribution and sample space, which broadens its application domain and enables test-verified, appropriate handling of data sets with both linear and nonlinear relationships. The model was then validated in a case study with Single-cell Raman Spectroscopy (SCRS) phenotyping data sets from eight different enhanced biological phosphorus removal (EBPR) activated sludge communities located across North America. The model allows the determination of sufficient sampling size for any targeted or customized information capture capacity or resolution level. Owing to its flexibility and minimal restrictions on input data types, the proposed method is expected to become a standardized approach for sampling size optimization, enabling more comparable and reproducible experiments and analyses of complex environmental samples. Finally, these advantages allow the capability to be extended to other single-cell technologies or environmental applications with data sets exhibiting continuous features.


Subjects
Biological Products, Phosphorus, Humans, Machine Learning, Phosphorus/chemistry, Polyphosphates, Sewage, Spectrum Analysis, Raman
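The abstract does not give the exact form of the kernel divergence; a minimal sketch of the general idea, using an RBF-kernel maximum mean discrepancy (MMD) as a stand-in data set-wise divergence and a subsample-size sweep on synthetic "spectra" (all names, sizes, and parameters are illustrative assumptions):

```python
# Sketch: kernel-based divergence between two data sets and a subsample-size
# sweep. MMD is used as a stand-in for the paper's kernel divergence; the
# synthetic data and kernel width are illustrative assumptions.
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

def mmd2(X, Y, gamma=1.0):
    """Squared maximum mean discrepancy between samples X and Y."""
    kxx = rbf_kernel(X, X, gamma=gamma)
    kyy = rbf_kernel(Y, Y, gamma=gamma)
    kxy = rbf_kernel(X, Y, gamma=gamma)
    return kxx.mean() + kyy.mean() - 2.0 * kxy.mean()

rng = np.random.default_rng(0)
community = rng.normal(size=(2000, 30))          # stand-in for SCRS spectra

# How does the divergence between two independent subsamples shrink as the
# subsample size grows? A small, stable value suggests the sampling size is
# sufficient to capture the data set's distribution.
for n in (25, 50, 100, 200, 400, 800):
    a = community[rng.choice(len(community), n, replace=False)]
    b = community[rng.choice(len(community), n, replace=False)]
    print(f"n={n:4d}  kernel divergence ~ {mmd2(a, b):.4f}")
```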
2.
Am J Respir Crit Care Med ; 198(8): 1033-1042, 2018 10 15.
Article in English | MEDLINE | ID: mdl-29671603

ABSTRACT

RATIONALE: The relationship between longitudinal lung function trajectories, chest computed tomography (CT) imaging, and genetic predisposition to chronic obstructive pulmonary disease (COPD) has not been explored. OBJECTIVES: 1) To model trajectories using a data-driven approach applied to longitudinal data spanning adulthood in the Normative Aging Study (NAS), and 2) to apply these models to demographically similar subjects in the COPDGene (Genetic Epidemiology of COPD) Study with detailed phenotypic characterization including chest CT. METHODS: We modeled lung function trajectories in 1,060 subjects in NAS with a median follow-up time of 29 years. We assigned 3,546 non-Hispanic white males in COPDGene to these trajectories for further analysis. We assessed phenotypic and genetic differences between trajectories and across age strata. MEASUREMENTS AND MAIN RESULTS: We identified four trajectories in NAS with differing levels of maximum lung function and rate of decline. In COPDGene, 617 subjects (17%) were assigned to the lowest trajectory and had the greatest radiologic burden of disease (P < 0.01); 1,283 subjects (36%) were assigned to a low trajectory with evidence of airway disease preceding emphysema on CT; 1,411 subjects (40%) and 237 subjects (7%) were assigned to the remaining two trajectories and tended to have preserved lung function and negligible emphysema. The genetic contribution to these trajectories was as high as 83% (P = 0.02), and membership in lower lung function trajectories was associated with greater parental histories of COPD, decreased exercise capacity, greater dyspnea, and more frequent COPD exacerbations. CONCLUSIONS: Data-driven analysis identifies four lung function trajectories. Trajectory membership has a genetic basis and is associated with distinct lung structural abnormalities.


Subjects
Lung/physiopathology, Pulmonary Disease, Chronic Obstructive/complications, Smoking/adverse effects, Adult, Aged, Aged, 80 and over, Case-Control Studies, Disease Progression, Forced Expiratory Volume, Humans, Longitudinal Studies, Male, Middle Aged, Pulmonary Disease, Chronic Obstructive/physiopathology, Respiratory Function Tests, Young Adult
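The trajectory model itself is not specified in the abstract; a minimal sketch of one common data-driven approach, summarizing each subject's longitudinal FEV1 by a per-subject linear fit and clustering the (intercept, slope) summaries, with four components chosen only to mirror the four reported trajectories (synthetic data, illustrative assumptions throughout):

```python
# Sketch: cluster longitudinal lung-function curves into trajectories by
# fitting a line per subject and clustering (level, rate-of-decline) pairs.
# Synthetic data; four components mirror the abstract's four trajectories.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(1)
n_subjects, n_visits = 300, 8
ages = np.linspace(30, 65, n_visits)

# Simulate subjects with different maximum lung function and decline rates.
levels = rng.normal(4.0, 0.6, n_subjects)            # FEV1 at age 30 (L)
slopes = rng.normal(-0.030, 0.012, n_subjects)       # L per year
fev1 = levels[:, None] + slopes[:, None] * (ages - 30) \
       + rng.normal(0, 0.1, (n_subjects, n_visits))

# Per-subject linear fit -> (intercept, slope) summary.
summaries = np.array([np.polyfit(ages, y, 1)[::-1] for y in fev1])

gmm = GaussianMixture(n_components=4, random_state=0).fit(summaries)
trajectory = gmm.predict(summaries)
for k in range(4):
    m = summaries[trajectory == k].mean(axis=0)
    print(f"trajectory {k}: mean level {m[0]:.2f} L, slope {m[1]*1000:.1f} mL/yr")
```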
3.
Am J Epidemiol ; 187(10): 2109-2116, 2018 10 01.
Article in English | MEDLINE | ID: mdl-29771274

ABSTRACT

Chronic obstructive pulmonary disease (COPD) is a syndrome caused by damage to the lungs that results in decreased pulmonary function and reduced structural integrity. Pulmonary function testing (PFT) is used to diagnose and stratify COPD into severity groups, and computed tomography (CT) imaging of the chest is often used to assess structural changes in the lungs. We hypothesized that the combination of PFT and CT phenotypes would provide a more powerful tool for assessing underlying morphologic differences associated with pulmonary function in COPD than does PFT alone. We used factor analysis of 26 variables to classify 8,157 participants recruited into the COPDGene cohort between January 2008 and June 2011 from 21 clinical centers across the United States. These factors were used as predictors of all-cause mortality using Cox proportional hazards modeling. Five factors explained 80% of the covariance and represented the following domains: factor 1, increased emphysema and decreased pulmonary function; factor 2, airway disease and decreased pulmonary function; factor 3, gas trapping; factor 4, CT variability; and factor 5, hyperinflation. After more than 46,079 person-years of follow-up, factors 1 through 4 were associated with mortality and there was a significant synergistic interaction between factors 1 and 2 on death. Considering CT measures along with PFT in the assessment of COPD can identify patients at particularly high risk for death.


Subjects
Pulmonary Disease, Chronic Obstructive/genetics, Pulmonary Disease, Chronic Obstructive/mortality, Respiratory Function Tests/statistics & numerical data, Risk Assessment/methods, Tomography, X-Ray Computed/statistics & numerical data, Adult, Aged, Aged, 80 and over, Cause of Death, Factor Analysis, Statistical, Female, Humans, Lung/diagnostic imaging, Lung/physiopathology, Male, Middle Aged, Phenotype, Predictive Value of Tests, Proportional Hazards Models, Risk Factors
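Neither the factor-analysis settings nor the survival model code are given; a minimal sketch of the two-stage idea, reducing correlated PFT/CT variables with factor analysis and relating the factor scores to all-cause mortality with a Cox proportional hazards model, using scikit-learn and lifelines on synthetic data (the five-factor choice and variable count simply echo the abstract):

```python
# Sketch: factor analysis of 26 correlated PFT/CT variables, then a Cox
# proportional hazards model on the factor scores. Entirely synthetic data.
import numpy as np
import pandas as pd
from sklearn.decomposition import FactorAnalysis
from lifelines import CoxPHFitter

rng = np.random.default_rng(2)
n, p = 1000, 26                                  # 26 PFT/CT variables
latent = rng.normal(size=(n, 5))
loadings = rng.normal(size=(5, p))
X = latent @ loadings + rng.normal(scale=0.5, size=(n, p))

scores = FactorAnalysis(n_components=5, random_state=0).fit_transform(X)

# Simulated follow-up: higher factor-1 score -> shorter survival.
time = rng.exponential(scale=np.exp(-0.4 * scores[:, 0]) * 10)
event = (rng.random(n) < 0.7).astype(int)

df = pd.DataFrame(scores, columns=[f"factor{i+1}" for i in range(5)])
df["time"], df["event"] = time, event

cph = CoxPHFitter()
cph.fit(df, duration_col="time", event_col="event")
cph.print_summary()                              # hazard ratio per factor
```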
4.
medRxiv ; 2024 May 20.
Article in English | MEDLINE | ID: mdl-38826461

ABSTRACT

Rationale: Genetic variants and gene expression predict risk of chronic obstructive pulmonary disease (COPD), but their effect on COPD heterogeneity is unclear. Objectives: Define high-risk COPD subtypes using both genetics (polygenic risk score, PRS) and blood gene expression (transcriptional risk score, TRS) and assess differences in clinical and molecular characteristics. Methods: We defined high-risk groups based on PRS and TRS quantiles by maximizing differences in protein biomarkers in a COPDGene training set and identified these groups in COPDGene and ECLIPSE test sets. We tested multivariable associations of subgroups with clinical outcomes and compared protein-protein interaction networks and drug repurposing analyses between high-risk groups. Measurements and Main Results: We examined two high-risk omics-defined groups in non-overlapping test sets (n=1,133 non-Hispanic white (NHW) COPDGene, n=299 African American (AA) COPDGene, n=468 ECLIPSE). We defined "high activity" (low PRS/high TRS) and "severe risk" (high PRS/high TRS) subgroups. Participants in both subgroups had lower body mass index (BMI), lower lung function, and alterations in metabolic, growth, and immune signaling processes compared to a low-risk (low PRS, low TRS) reference subgroup. "High activity" but not "severe risk" participants had greater prospective FEV1 decline (COPDGene: -51 mL/year; ECLIPSE: -40 mL/year), and their proteomic profiles were enriched in gene sets perturbed by treatment with 5-lipoxygenase inhibitors and angiotensin-converting enzyme (ACE) inhibitors. Conclusions: Concomitant use of polygenic and transcriptional risk scores identified clinical and molecular heterogeneity among high-risk individuals. Proteomic and drug repurposing analyses identified subtype-specific enrichment for therapies and suggest that prior drug repurposing failures may be explained by patient selection.
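The exact quantile cut-points are tuned in the paper by maximizing biomarker differences and are not stated here; a minimal sketch of how the high-risk groups could be formed from a polygenic and a transcriptional risk score, using illustrative top/bottom-quartile thresholds on synthetic scores:

```python
# Sketch: label "high activity" (low PRS / high TRS), "severe risk"
# (high PRS / high TRS), and a low-risk reference group from two scores.
# Quartile cut-points and data are illustrative assumptions.
import numpy as np
import pandas as pd

rng = np.random.default_rng(3)
df = pd.DataFrame({"prs": rng.normal(size=1900), "trs": rng.normal(size=1900)})

prs_lo, prs_hi = df["prs"].quantile([0.25, 0.75])
trs_lo, trs_hi = df["trs"].quantile([0.25, 0.75])

def subgroup(row):
    if row.trs >= trs_hi and row.prs <= prs_lo:
        return "high activity"      # low genetic risk, high transcriptional risk
    if row.trs >= trs_hi and row.prs >= prs_hi:
        return "severe risk"        # high genetic and transcriptional risk
    if row.trs <= trs_lo and row.prs <= prs_lo:
        return "low risk"           # reference group
    return "intermediate"

df["subgroup"] = df.apply(subgroup, axis=1)
print(df["subgroup"].value_counts())
```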

5.
Chin J Cancer ; 32(4): 170-85, 2013 Apr.
Article in English | MEDLINE | ID: mdl-23327800

ABSTRACT

Myelodysplastic syndromes have increased in frequency and incidence in the American population, but patient prognosis has not significantly improved over the last decade. Such improvements could be realized if biomarkers for accurate diagnosis and prognostic stratification were successfully identified. In this study, we propose a method that associates two state-of-the-art array technologies--single nucleotide polymorphism (SNP) array and gene expression array--with gene motifs considered to be transcription factor-binding sites (TFBS). We are particularly interested in SNP-containing motifs introduced by genetic variation and mutation as TFBS. The potential regulatory effect of SNP-containing motifs is exerted only when certain mutations occur. These motifs can be identified from a group of co-expressed genes with copy number variation. We then used a sliding window to identify motif candidates near SNPs on gene sequences. The candidates were filtered by coarse thresholding and fine statistical testing. Using the regression-based LARS-EN algorithm and a level-wise sequence combination procedure, we identified 28 SNP-containing motifs as candidate TFBS. We confirmed 21 of the 28 motifs with ChIP-chip fragments in the TRANSFAC database. Another six motifs were validated by TRANSFAC via searching binding fragments on co-regulated genes. The identified motifs and the genes in which they are located can be considered potential biomarkers for myelodysplastic syndromes. Thus, our proposed method, a novel strategy for associating two data categories, is capable of integrating information from different sources to identify reliable candidate regulatory SNP-containing motifs introduced by genetic variation and mutation.


Subjects
Gene Expression Profiling, Genes, Regulator, Myelodysplastic Syndromes/genetics, Polymorphism, Single Nucleotide/genetics, Transcription Factors/genetics, Algorithms, Binding Sites, DNA Copy Number Variations, Databases, Genetic, Genotype, Humans, Oligonucleotide Array Sequence Analysis/methods
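The motif-search parameters are not given in the abstract; a minimal sketch of the sliding-window step only, enumerating fixed-length motif candidates that overlap a SNP position in a gene sequence (the window length, sequence, and SNP position are illustrative; the paper further filters candidates statistically and with LARS-EN regression):

```python
# Sketch: enumerate k-mer motif candidates overlapping a SNP position using a
# sliding window. k=8 and the example sequence are illustrative assumptions.
def snp_window_motifs(sequence: str, snp_pos: int, k: int = 8):
    """All k-mers of `sequence` that contain the base at index `snp_pos`."""
    motifs = []
    for start in range(max(0, snp_pos - k + 1), min(snp_pos, len(sequence) - k) + 1):
        motifs.append((start, sequence[start:start + k]))
    return motifs

gene = "ATGCGTACCTGAAGTCCGATACGGTTAGC"
snp_position = 12                     # 0-based index of the SNP in `gene`
for start, motif in snp_window_motifs(gene, snp_position):
    print(f"offset {start:2d}: {motif}")
```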
6.
Nat Commun ; 14(1): 339, 2023 Jan 20.
Article in English | MEDLINE | ID: mdl-36670105

ABSTRACT

The El Niño Southern Oscillation (ENSO) is a semi-periodic fluctuation in sea surface temperature (SST) over the tropical central and eastern Pacific Ocean that influences interannual variability in regional hydrology across the world through long-range dependence or teleconnections. Recent research has demonstrated the value of Deep Learning (DL) methods for improving ENSO prediction, as well as of Complex Networks (CN) for understanding teleconnections. However, gaps in predictive understanding of ENSO-driven river flows include the black-box nature of DL, the use of simple ENSO indices to describe a complex phenomenon, and the difficulty of translating DL-based ENSO predictions into river flow predictions. Here we show that eXplainable DL (XDL) methods, based on saliency maps, can extract interpretable predictive information contained in global SST and discover SST information regions and dependence structures relevant for river flows which, in tandem with climate network constructions, enable improved predictive understanding. Our results reveal additional information content in global SST beyond ENSO indices, develop understanding of how SSTs influence river flows, and generate improved river flow predictions, including uncertainty estimation. Observations, reanalysis data, and earth system model simulations are used to demonstrate the value of the XDL-CN-based methods for future interannual and decadal scale climate projections.


Subjects
Deep Learning, El Nino-Southern Oscillation, Rivers, Temperature, Pacific Ocean
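No model code accompanies this abstract; a minimal sketch of the saliency idea, training a regressor from flattened SST grid cells to river flow and estimating input sensitivities by finite differences as a simple stand-in for gradient-based saliency maps (synthetic data, stand-in model, illustrative parameters throughout):

```python
# Sketch: finite-difference sensitivity ("saliency") of a river-flow regressor
# with respect to each SST grid cell. Synthetic data; an MLP stands in for the
# paper's deep model, and finite differences stand in for backprop saliency.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(4)
n_months, n_cells = 600, 40                     # flattened SST grid cells
sst = rng.normal(size=(n_months, n_cells))
true_weights = np.zeros(n_cells)
true_weights[[3, 17, 28]] = [1.5, -1.0, 0.8]    # only a few cells matter
flow = sst @ true_weights + 0.1 * rng.normal(size=n_months)

model = MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000,
                     random_state=0).fit(sst, flow)

def saliency(model, x, eps=1e-3):
    """Finite-difference sensitivity of the prediction to each input cell."""
    base = model.predict(x[None, :])[0]
    grads = np.empty_like(x)
    for i in range(x.size):
        xp = x.copy()
        xp[i] += eps
        grads[i] = (model.predict(xp[None, :])[0] - base) / eps
    return grads

s = np.abs(saliency(model, sst[0]))
print("most salient SST cells:", np.argsort(s)[-3:][::-1])
```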
7.
BMC Bioinformatics ; 13: 337, 2012 Dec 27.
Article in English | MEDLINE | ID: mdl-23270311

ABSTRACT

BACKGROUND: RNA interference (RNAi) has become an increasingly important and effective genetic tool for studying the function of target genes by suppressing specific genes of interest. This systems approach helps identify signaling pathways and cellular phase types by tracking intensity and/or morphological changes in cells. The traditional RNAi screening scheme, in which one siRNA is designed to knock down one specific mRNA target, requires a large library of siRNAs and turns out to be time-consuming and expensive. RESULTS: In this paper, we propose a conceptual model, called compressed sensing RNAi (csRNAi), which employs unique combinations of small interfering RNAs (siRNAs) to knock down a much larger set of genes. This strategy is based on the fact that one gene can be partially bound by several siRNAs and, conversely, one siRNA can bind to a few genes with distinct binding affinities. This model constructs a many-to-many correspondence between siRNAs and their targets, with far fewer siRNAs than mRNA targets compared with the conventional scheme. Mathematically, this problem involves an underdetermined system of equations (linear or nonlinear), which is ill-posed in general. However, the recently developed compressed sensing (CS) theory can solve this problem. We present a mathematical model to describe the csRNAi system based on both CS theory and biological concerns. To build this model, we first search nucleotide motifs in a target gene set. We then propose a machine learning-based method to find effective siRNAs using novel features, such as image features and speech features, to describe an siRNA sequence. Numerical simulations show that we can reduce the siRNA library to one third of that in the conventional scheme. In addition, the proposed siRNA features substantially outperform existing ones. CONCLUSIONS: The csRNAi system is very promising for saving both time and cost in large-scale RNAi screening experiments, which may benefit biological research on cellular processes and pathways.


Subjects
Computer Simulation, RNA Interference, Algorithms, Artificial Intelligence, Gene Library, Humans, Nucleotide Motifs, RNA, Messenger/metabolism, RNA, Small Interfering/genetics, RNA, Small Interfering/metabolism
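The abstract frames pooled siRNA screening as an underdetermined linear system; a minimal sketch of the compressed-sensing core, recovering a sparse gene-effect vector from y = Ax with an L1-regularized solver on synthetic data (the binding-affinity matrix, sparsity level, and regularization strength are illustrative assumptions):

```python
# Sketch: compressed-sensing recovery of a sparse gene-effect vector from
# fewer pooled siRNA measurements than genes (y = A x, with A the siRNA-gene
# binding-affinity matrix). Synthetic A and x; Lasso performs the L1 recovery.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(5)
n_sirnas, n_genes, n_active = 60, 180, 6        # measurements << unknowns
A = rng.normal(size=(n_sirnas, n_genes)) / np.sqrt(n_sirnas)

x_true = np.zeros(n_genes)
x_true[rng.choice(n_genes, n_active, replace=False)] = rng.normal(3, 1, n_active)
y = A @ x_true + 0.01 * rng.normal(size=n_sirnas)

lasso = Lasso(alpha=0.01, max_iter=50000).fit(A, y)
x_hat = lasso.coef_

support_true = set(np.flatnonzero(x_true))
support_hat = set(np.flatnonzero(np.abs(x_hat) > 0.1))
print("true active genes:     ", sorted(support_true))
print("recovered active genes:", sorted(support_hat))
```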
8.
Neuroimage ; 62(3): 2040-54, 2012 Sep.
Article in English | MEDLINE | ID: mdl-22732566

ABSTRACT

Synaptic vesicle dynamics play an important role in the study of neuronal and synaptic activities in neurodegenerative diseases ranging from the epidemic Alzheimer's disease to the rare Rett syndrome. A high-throughput assay with a large population of neurons would be useful and efficient for characterizing neuronal activity based on the dynamics of synaptic vesicles, whether to study mechanisms or to discover drug candidates for neurodegenerative and neurodevelopmental disorders. However, the massive amounts of image data generated via high-throughput screening require enormous manual processing time and effort, restricting the practical use of such an assay. This paper presents an automated analytic system to process and interpret the huge data sets generated by such assays. Our system enables the automated detection, segmentation, quantification, and measurement of neuron activities based on the synaptic vesicle assay. To overcome challenges such as noisy background, inhomogeneity, and tiny object size, we first employ the Multi-Scale Variance Stabilizing Transform (MSVST) to obtain a denoised and enhanced map of the original image data. We then propose an adaptive thresholding strategy, based on local information, to address the inhomogeneity issue and to accurately segment synaptic vesicles. We design algorithms to handle overlap among the tiny objects of interest. Several post-processing criteria are defined to filter false positives. A total of 152 features are extracted for each detected vesicle. A score is defined for each synaptic vesicle image to quantify the neuronal activity. We also compare the unsupervised strategy with the supervised method. Our experiments on hippocampal neuron assays showed that the proposed system can automatically detect vesicles and quantify their dynamics for evaluating neuron activities. The availability of such an automated system will open opportunities for investigation of synaptic neuropathology and identification of candidate therapeutics for neurodegeneration.


Subjects
Diagnostic Imaging/methods, High-Throughput Screening Assays/methods, Image Processing, Computer-Assisted/methods, Neurons/physiology, Algorithms, Animals, Brain/physiology, Cells, Cultured, Mice, Mice, Inbred C57BL, Synaptic Vesicles
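The full pipeline (MSVST denoising, 152 features, activity scoring) is beyond a short sketch; a minimal illustration of the adaptive-thresholding and labeling step for small bright spots on an inhomogeneous background, using scipy.ndimage on a synthetic image (filter size, threshold offset, and size limits are illustrative assumptions):

```python
# Sketch: detect small bright spots (stand-ins for synaptic vesicles) on an
# inhomogeneous background with a local adaptive threshold, then label and
# count them. Synthetic image; parameters are illustrative.
import numpy as np
from scipy import ndimage

rng = np.random.default_rng(6)
h, w = 256, 256
yy, xx = np.mgrid[0:h, 0:w]
background = 0.4 * np.sin(xx / 40.0) + 0.4 * np.cos(yy / 55.0)   # inhomogeneity
image = background + 0.1 * rng.normal(size=(h, w))

# Drop in 40 tiny Gaussian spots.
centers = rng.integers(5, 250, size=(40, 2))
for cy, cx in centers:
    image += 1.2 * np.exp(-((yy - cy) ** 2 + (xx - cx) ** 2) / (2 * 1.5 ** 2))

# Local adaptive threshold: a pixel must exceed its neighborhood mean by a margin.
local_mean = ndimage.uniform_filter(image, size=15)
mask = image > local_mean + 0.5

labels, n_detected = ndimage.label(mask)
sizes = ndimage.sum(mask, labels, index=range(1, n_detected + 1))
kept = np.count_nonzero((sizes >= 3) & (sizes <= 60))   # filter false positives
print(f"detected {n_detected} candidate spots, kept {kept} after size filtering")
```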
9.
Chronic Obstr Pulm Dis ; 9(3): 349-365, 2022 Jul 29.
Article in English | MEDLINE | ID: mdl-35649102

ABSTRACT

Background: The heterogeneous nature of chronic obstructive pulmonary disease (COPD) complicates the identification of the predictors of disease progression. We aimed to improve the prediction of disease progression in COPD by using machine learning and incorporating a rich dataset of phenotypic features. Methods: We included 4496 smokers with available data from their enrollment and 5-year follow-up visits in the COPD Genetic Epidemiology (COPDGene®) study. We constructed linear regression (LR) and supervised random forest models to predict 5-year progression in forced expiratory volume in 1 second (FEV1) from 46 baseline features. Using cross-validation, we randomly partitioned participants into training and testing samples. We also validated the results in the COPDGene 10-year follow-up visit. Results: Predicting the change in FEV1 over time is more challenging than simply predicting the future absolute FEV1 level. For random forest, R-squared was 0.15 and the area under the receiver operating characteristic (ROC) curve for predicting participants in the top quartile of observed progression was 0.71 in the testing sample; the corresponding values were 0.10 and 0.70 in the validation sample. Random forest provided slightly better performance than LR. Accuracy was best for Global initiative for chronic Obstructive Lung Disease (GOLD) grade 1-2 participants, and it was harder to achieve accurate prediction in advanced stages of the disease. The relative importance of the predictive variables, as well as the predictions themselves, differed by GOLD grade. Conclusion: Random forest, along with deep phenotyping, predicts FEV1 progression with reasonable accuracy. There is significant room for improvement in future models. This prediction model facilitates the identification of smokers at increased risk for rapid disease progression. Such findings may be useful in the selection of patient populations for targeted clinical trials.
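A minimal sketch, on synthetic data, of the modeling idea: predict 5-year FEV1 change from baseline features with a random forest, report R-squared, and report the ROC AUC for flagging the top quartile of observed progression (the 46-feature count only echoes the abstract; the data and hyperparameters are illustrative assumptions, not the COPDGene variables):

```python
# Sketch: predict 5-year change in FEV1 from baseline features with a random
# forest, then score R^2 and the AUC for identifying the fastest-progressing
# quartile. Synthetic data throughout.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score, roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(7)
n, p = 4000, 46
X = rng.normal(size=(n, p))
# Change in FEV1 depends weakly on a few features plus a lot of noise,
# mimicking how hard progression is to predict.
delta_fev1 = -50 + 15 * X[:, 0] - 10 * X[:, 1] + 60 * rng.normal(size=n)

X_tr, X_te, y_tr, y_te = train_test_split(X, delta_fev1, random_state=0)
rf = RandomForestRegressor(n_estimators=300, random_state=0).fit(X_tr, y_tr)
pred = rf.predict(X_te)

top_quartile = y_te <= np.quantile(y_te, 0.25)       # fastest decline
print(f"R^2 = {r2_score(y_te, pred):.2f}")
print(f"AUC = {roc_auc_score(top_quartile, -pred):.2f}")
```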

10.
Sci Rep ; 11(1): 12576, 2021 06 15.
Article in English | MEDLINE | ID: mdl-34131165

ABSTRACT

Reflectance confocal microscopy (RCM) is an effective non-invasive tool for cancer diagnosis. However, acquiring and reading RCM images requires extensive training and experience, and novice clinicians exhibit high discordance in diagnostic accuracy. Quantitative tools to standardize image acquisition could reduce both the required training and the diagnostic variability. To perform diagnostic analysis, clinicians collect a set of RCM mosaics (RCM images concatenated in a raster fashion to extend the field of view) at 4-5 specific layers in skin, all localized in the junction between the epidermal and dermal layers (dermal-epidermal junction, DEJ), which necessitates locating that junction before mosaic acquisition. In this study, we automate DEJ localization using deep recurrent convolutional neural networks to delineate skin strata in stacks of RCM images collected at consecutive depths. Success will pave the way to automated and quantitative mosaic acquisition, thus reducing inter-operator variability and bringing standardization to imaging. Testing our model against an expert-labeled dataset of 504 RCM stacks, we achieved [Formula: see text] classification accuracy and a nine-fold reduction in the number of anatomically impossible errors compared to the previous state-of-the-art.


Subjects
Early Detection of Cancer, Microscopy, Confocal/methods, Skin Neoplasms/diagnosis, Epidermis/diagnostic imaging, Epidermis/pathology, Female, Humans, Image Processing, Computer-Assisted/methods, Male, Neural Networks, Computer, Skin Neoplasms/diagnostic imaging, Skin Neoplasms/pathology
11.
Sci Rep ; 11(1): 3679, 2021 02 11.
Article in English | MEDLINE | ID: mdl-33574486

ABSTRACT

Reflectance confocal microscopy (RCM) is a non-invasive imaging tool that reduces the need for invasive histopathology in skin cancer diagnosis by providing high-resolution mosaics showing the architectural patterns of skin, which are used to identify malignancies in vivo. RCM mosaics are similar to dermatopathology sections, and both require extensive training to interpret. However, the modalities differ in orientation (RCM mosaics are horizontal, parallel to the skin surface, while histopathology sections are vertical) and in contrast mechanism (RCM relies on a single reflectance mechanism, producing grayscale images, whereas histopathology uses multi-factor color-stained contrast). Image analysis and machine learning methods can potentially provide a diagnostic aid for clinicians interpreting RCM mosaics, eventually easing adoption and enabling more efficient use of RCM in routine clinical practice. However, standard supervised machine learning may require a prohibitive volume of hand-labeled training data. In this paper, we present a weakly supervised machine learning model to perform semantic segmentation of architectural patterns encountered in RCM mosaics. Unlike the more widely used fully supervised segmentation models that require pixel-level annotations, which are labor-intensive and error-prone to obtain, we focus on training models using only patch-level labels (e.g., a single field of view within an entire mosaic). We segment RCM mosaics into "benign" and "aspecific (nonspecific)" regions, where aspecific regions represent the loss of regular architecture due to injury and/or inflammation, pre-malignancy, or malignancy. We adopt EfficientNet, a deep neural network (DNN) proven to accomplish classification tasks accurately, to generate class activation maps, and use a Gaussian weighting kernel to stitch smaller images back into larger fields of view. The trained DNN achieved an average area under the curve of 0.969 and a Dice coefficient of 0.778, showing the feasibility of spatial localization of aspecific regions in RCM images and making the diagnostic decision model more interpretable to clinicians.


Subjects
Image Processing, Computer-Assisted, Microscopy, Confocal, Skin Neoplasms/diagnosis, Skin/ultrastructure, Humans, Machine Learning, Neural Networks, Computer, Semantics, Skin/diagnostic imaging, Skin/pathology, Skin Neoplasms/diagnostic imaging, Skin Neoplasms/pathology
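The stitching detail lends itself to a short illustration; a minimal sketch of blending overlapping patch-level prediction maps back into a mosaic with a Gaussian weighting kernel, so patch centers count more than patch edges (patch size, stride, and the Gaussian width are illustrative assumptions, and random maps stand in for the class activation maps):

```python
# Sketch: stitch overlapping patch-level probability maps into a mosaic using
# a Gaussian weighting kernel. Patch size, stride, and sigma are illustrative.
import numpy as np

def gaussian_kernel(size, sigma):
    ax = np.arange(size) - (size - 1) / 2.0
    g = np.exp(-(ax ** 2) / (2 * sigma ** 2))
    return np.outer(g, g)

def stitch(patches, positions, mosaic_shape, patch_size=64, sigma=16.0):
    """Weighted average of overlapping patches placed at top-left positions."""
    weight = gaussian_kernel(patch_size, sigma)
    acc = np.zeros(mosaic_shape)
    norm = np.zeros(mosaic_shape)
    for patch, (r, c) in zip(patches, positions):
        acc[r:r + patch_size, c:c + patch_size] += patch * weight
        norm[r:r + patch_size, c:c + patch_size] += weight
    return acc / np.maximum(norm, 1e-8)

rng = np.random.default_rng(8)
mosaic_shape, patch, stride = (256, 256), 64, 32
positions = [(r, c) for r in range(0, 256 - patch + 1, stride)
                    for c in range(0, 256 - patch + 1, stride)]
patches = [rng.random((patch, patch)) for _ in positions]   # fake "aspecific" maps

mosaic = stitch(patches, positions, mosaic_shape, patch, sigma=16.0)
print("stitched mosaic:", mosaic.shape,
      "value range:", round(mosaic.min(), 2), "-", round(mosaic.max(), 2))
```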
12.
Med Image Anal ; 67: 101841, 2021 01.
Article in English | MEDLINE | ID: mdl-33142135

ABSTRACT

In-vivo optical microscopy is advancing into routine clinical practice for non-invasively guiding diagnosis and treatment of cancer and other diseases, and is thus beginning to reduce the need for traditional biopsy. However, reading and analysis of the optical microscopic images are generally still qualitative, relying mainly on visual examination. Here we present an automated semantic segmentation method called "Multiscale Encoder-Decoder Network (MED-Net)" that provides pixel-wise labeling into classes of patterns in a quantitative manner. The novelty in our approach is the modeling of textural patterns at multiple scales (magnifications, resolutions). This mimics the traditional procedure for examining pathology images, which routinely starts with low magnification (low resolution, large field of view) followed by closer inspection of suspicious areas with higher magnification (higher resolution, smaller fields of view). We trained and tested our model on non-overlapping partitions of 117 reflectance confocal microscopy (RCM) mosaics of melanocytic lesions, an extensive dataset for this application, collected at four clinics in the US and two in Italy. With patient-wise cross-validation, we achieved pixel-wise mean sensitivity and specificity of 74% and 92%, respectively, with a 0.74 Dice coefficient over six classes. In a second scenario, we partitioned the data clinic-wise and tested the generalizability of the model across clinics. In this setting, we achieved pixel-wise mean sensitivity and specificity of 77% and 94%, respectively, with a 0.77 Dice coefficient. We compared MED-Net against state-of-the-art semantic segmentation models and achieved better quantitative segmentation performance. Our results also suggest that, due to its nested multiscale architecture, the MED-Net model annotated RCM mosaics more coherently, avoiding unrealistically fragmented annotations.


Subjects
Image Processing, Computer-Assisted, Neural Networks, Computer, Humans, Microscopy, Confocal
13.
IEEE Trans Biomed Eng ; 68(6): 1871-1881, 2021 06.
Article in English | MEDLINE | ID: mdl-32997621

ABSTRACT

OBJECTIVE: Rehabilitation specialists have shown considerable interest in the development of models, based on clinical data, to predict the response to rehabilitation interventions in stroke and traumatic brain injury survivors. However, accurate predictions are difficult to obtain due to the variability in patients' response to rehabilitation interventions. This study aimed to investigate the use of wearable technology in combination with clinical data to predict and monitor the recovery process and assess responsiveness to treatment on an individual basis. METHODS: Gaussian Process Regression-based algorithms were developed to estimate rehabilitation outcomes (i.e., Functional Ability Scale scores) using either clinical or wearable sensor data or a combination of the two. RESULTS: The algorithm based on clinical data predicted rehabilitation outcomes with a Pearson's correlation of 0.79 with the actual scores provided by clinicians but failed to model the variability in responsiveness to the intervention observed across individuals. In contrast, the algorithm based on wearable sensor data generated rehabilitation outcome estimates with a Pearson's correlation of 0.91 and modeled the individual responses to rehabilitation more accurately. Furthermore, we developed a novel approach to combine estimates derived from the clinical data and the sensor data using a constrained linear model. This approach resulted in a Pearson's correlation of 0.94 between estimated and clinician-provided scores. CONCLUSION: This algorithm could enable the design of patient-specific interventions based on predictions of rehabilitation outcomes relying on clinical and wearable sensor data. SIGNIFICANCE: This is important in the context of developing precision rehabilitation interventions.


Subjects
Brain Injuries, Stroke Rehabilitation, Wearable Electronic Devices, Humans, Survivors, Treatment Outcome, Upper Extremity
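A minimal sketch, on synthetic data, of the two modeling pieces described above: Gaussian process regression from stand-in clinical and sensor features to a functional-ability score, followed by a constrained linear blend of the two estimates with non-negative weights normalized to sum to one (the data, kernels, and feature counts are illustrative assumptions, not the study's variables):

```python
# Sketch: Gaussian process regression of a functional-ability score from
# clinical-only and sensor-only features, then a constrained (non-negative,
# sum-to-one) linear blend of the two estimates. Entirely synthetic data.
import numpy as np
from scipy.optimize import nnls
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(9)
n = 200
clinical = rng.normal(size=(n, 4))              # stand-in clinical variables
sensor = rng.normal(size=(n, 10))               # stand-in wearable features
score = 3 + clinical[:, 0] + 0.8 * sensor[:, :3].sum(axis=1) \
        + 0.3 * rng.normal(size=n)              # "Functional Ability Scale"

idx_tr, idx_te = train_test_split(np.arange(n), random_state=0)
kernel = 1.0 * RBF(length_scale=1.0) + WhiteKernel(noise_level=0.1)

gp_clin = GaussianProcessRegressor(kernel=kernel, normalize_y=True)
gp_sens = GaussianProcessRegressor(kernel=kernel, normalize_y=True)
gp_clin.fit(clinical[idx_tr], score[idx_tr])
gp_sens.fit(sensor[idx_tr], score[idx_tr])

# Fit blend weights on the training predictions, then apply them to the test set.
est_tr = np.column_stack([gp_clin.predict(clinical[idx_tr]),
                          gp_sens.predict(sensor[idx_tr])])
w, _ = nnls(est_tr, score[idx_tr])
w = w / w.sum()

est_te = np.column_stack([gp_clin.predict(clinical[idx_te]),
                          gp_sens.predict(sensor[idx_te])])
blend = est_te @ w
print("weights (clinical, sensor):", np.round(w, 2))
print("correlation of blended estimate with true score:",
      np.round(np.corrcoef(blend, score[idx_te])[0, 1], 2))
```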
14.
J Invest Dermatol ; 140(6): 1214-1222, 2020 06.
Article in English | MEDLINE | ID: mdl-31838127

ABSTRACT

In vivo reflectance confocal microscopy (RCM) enables clinicians to examine lesions' morphological and cytological information in epidermal and dermal layers while reducing the need for biopsies. As RCM is being adopted more widely, the workflow is expanding from real-time diagnosis at the bedside to include a capture, store, and forward model with image interpretation and diagnosis occurring offsite, similar to radiology. As the patient may no longer be present at the time of image interpretation, quality assurance is key during image acquisition. Herein, we introduce a quality assurance process by means of automatically quantifying diagnostically uninformative areas within the lesional area by using RCM and coregistered dermoscopy images together. We trained and validated a pixel-level segmentation model on 117 RCM mosaics collected by international collaborators. The model delineates diagnostically uninformative areas with 82% sensitivity and 93% specificity. We further tested the model on a separate set of 372 coregistered RCM-dermoscopic image pairs and illustrate how the results of the RCM-only model can be improved via a multimodal (RCM + dermoscopy) approach, which can help quantify the uninformative regions within the lesional area. Our data suggest that machine learning-based automatic quantification offers a feasible objective quality control measure for RCM imaging.


Subjects
Dermoscopy/methods, Image Processing, Computer-Assisted/methods, Machine Learning, Skin Diseases/diagnosis, Skin/diagnostic imaging, Dermoscopy/standards, Diagnosis, Differential, Feasibility Studies, Humans, Microscopy, Confocal/methods, Microscopy, Confocal/standards, Quality Control
15.
Chest ; 157(5): 1147-1157, 2020 05.
Article in English | MEDLINE | ID: mdl-31887283

ABSTRACT

COPD is a heterogeneous syndrome. Many COPD subtypes have been proposed, but there is not yet consensus on how many COPD subtypes there are and how they should be defined. The COPD Genetic Epidemiology Study (COPDGene), which has generated 10-year longitudinal chest imaging, spirometry, and molecular data, is a rich resource for relating COPD phenotypes to underlying genetic and molecular mechanisms. In this article, we place COPDGene clustering studies in context with other highly cited COPD clustering studies, and summarize the main COPD subtype findings from COPDGene. First, most manifestations of COPD occur along a continuum, which explains why continuous aspects of COPD or disease axes may be more accurate and reproducible than subtypes identified through clustering methods. Second, continuous COPD-related measures can be used to create subgroups through the use of predictive models to define cut-points, and we review COPDGene research on blood eosinophil count thresholds as a specific example. Third, COPD phenotypes identified or prioritized through machine learning methods have led to novel biological discoveries, including novel emphysema genetic risk variants and systemic inflammatory subtypes of COPD. Fourth, trajectory-based COPD subtyping captures differences in the longitudinal evolution of COPD, addressing a major limitation of clustering analyses that are confounded by disease severity. Ongoing longitudinal characterization of subjects in COPDGene will provide useful insights about the relationship between lung imaging parameters, molecular markers, and COPD progression that will enable the identification of subtypes based on underlying disease processes and distinct patterns of disease progression, with the potential to improve the clinical relevance and reproducibility of COPD subtypes.


Subjects
Machine Learning, Molecular Epidemiology, Pulmonary Disease, Chronic Obstructive/classification, Pulmonary Disease, Chronic Obstructive/epidemiology, Pulmonary Disease, Chronic Obstructive/genetics, Cluster Analysis, Diagnostic Imaging, Disease Progression, Genetic Predisposition to Disease, Genome-Wide Association Study, Humans, Phenotype, Respiratory Function Tests
16.
Phys Med Biol ; 54(6): 1555-63, 2009 Mar 21.
Article in English | MEDLINE | ID: mdl-19229098

ABSTRACT

In lung cancer radiotherapy, radiation to a mobile target can be delivered by respiratory gating, for which we need to know whether the target is inside or outside a predefined gating window at any time point during the treatment. This can be achieved by tracking one or more fiducial markers implanted inside or near the target, either fluoroscopically or electromagnetically. However, the clinical implementation of marker tracking is limited for lung cancer radiotherapy, mainly due to the risk of pneumothorax. Therefore, gating without implanted fiducial markers is a promising clinical direction. We have developed several template-matching methods for fluoroscopic marker-less gating. Recently, we have modeled the gating problem as a binary pattern classification problem, in which principal component analysis (PCA) and a support vector machine (SVM) are combined to perform the classification task. Following the same framework, we investigated different combinations of dimensionality reduction techniques (PCA and four nonlinear manifold learning methods) and two machine learning classification methods (artificial neural networks [ANN] and SVM). Performance was evaluated on ten fluoroscopic image sequences of nine lung cancer patients. We found that, among all combinations of dimensionality reduction techniques and classification methods, PCA combined with either ANN or SVM achieved better performance than the nonlinear manifold learning methods. ANN combined with PCA achieved better performance than SVM in terms of classification accuracy and recall rate, although the target coverage was similar for the two classification methods. Furthermore, the running time for both ANN and SVM with PCA is within tolerance for real-time applications. Overall, ANN combined with PCA is a better candidate than the other combinations investigated in this work for real-time gated radiotherapy.


Subjects
Artificial Intelligence, Lung Neoplasms/radiotherapy, Radiotherapy/methods, Algorithms, Neural Networks, Computer, Reference Standards, Respiratory-Gated Imaging Techniques, Sensitivity and Specificity
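A minimal sketch of the compared pipelines on synthetic data: reduce fluoroscopic frames with PCA, then classify each frame as beam-ON (target inside the gating window) or beam-OFF with either an SVM or a small neural network (the frame data, component count, and hyperparameters are illustrative assumptions):

```python
# Sketch: PCA dimensionality reduction of fluoroscopic frames followed by SVM
# and ANN classifiers predicting beam-ON vs beam-OFF gating states.
# Synthetic "frames"; parameters are illustrative.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

rng = np.random.default_rng(10)
n_frames, frame_pixels = 800, 32 * 32
phase = rng.uniform(0, 2 * np.pi, n_frames)          # breathing phase
frames = 0.3 * rng.normal(size=(n_frames, frame_pixels))
frames[:, :50] += np.sin(phase)[:, None]             # phase-dependent structure
beam_on = (np.sin(phase) > 0.5).astype(int)          # target inside gating window

X_tr, X_te, y_tr, y_te = train_test_split(frames, beam_on, random_state=0)

svm = make_pipeline(PCA(n_components=10), SVC(kernel="rbf", C=1.0))
ann = make_pipeline(PCA(n_components=10),
                    MLPClassifier(hidden_layer_sizes=(20,), max_iter=2000,
                                  random_state=0))

for name, clf in [("PCA+SVM", svm), ("PCA+ANN", ann)]:
    clf.fit(X_tr, y_tr)
    print(f"{name}: accuracy = {clf.score(X_te, y_te):.3f}")
```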
17.
IEEE Trans Med Imaging ; 38(11): 2642-2653, 2019 11.
Article in English | MEDLINE | ID: mdl-30932833

ABSTRACT

Deep convolutional neural networks (CNN) have recently achieved superior performance at the task of medical image segmentation compared to classic models. However, training a generalizable CNN requires a large amount of training data, which is difficult, expensive, and time-consuming to obtain in medical settings. Active Learning (AL) algorithms can facilitate training CNN models by proposing a small number of the most informative data samples to be annotated to achieve a rapid increase in performance. We propose, for the first time, an active learning method based on Fisher information (FI) for CNNs. Using efficient backpropagation methods for computing gradients, together with a novel low-dimensional approximation of FI, enabled us to compute FI for CNNs with a large number of parameters. We evaluated the proposed method for brain extraction with a patch-wise segmentation CNN model in two different learning scenarios: universal active learning and active semi-automatic segmentation. In both scenarios, an initial model was obtained using labeled training subjects of a source data set, and the goal was to annotate a small subset of new samples to build a model that performs well on the target subject(s). The target data sets included images that differed from the source data by either age group (e.g., newborns with different image contrast) or underlying pathology that was not available in the source data. In comparison to several recently proposed AL methods and brain extraction baselines, the results showed that FI-based AL outperformed the competing methods in improving the performance of the model after labeling a very small portion of the target data set (<0.25%).


Subjects
Deep Learning, Image Processing, Computer-Assisted/methods, Algorithms, Brain/diagnostic imaging, Child, Preschool, Humans, Infant, Infant, Newborn, Magnetic Resonance Imaging
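The paper's CNN-specific low-dimensional FI approximation cannot be reproduced from the abstract alone; a minimal sketch of the underlying idea for a logistic-regression stand-in model, where each unlabeled sample is scored by the trace of its Fisher information contribution, p(1-p)·x xᵀ, and the highest-scoring samples are queried (model and scoring rule are simplified assumptions, not the paper's method):

```python
# Sketch: Fisher-information-based active learning for a logistic-regression
# stand-in model. Each unlabeled sample's FI contribution is p(1-p) * x x^T;
# samples are scored by its trace and the top ones are queried for labels.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(11)
n_labeled, n_pool, d = 40, 500, 10
X_lab = rng.normal(size=(n_labeled, d))
y_lab = (X_lab[:, 0] + 0.5 * X_lab[:, 1]
         + 0.3 * rng.normal(size=n_labeled) > 0).astype(int)
X_pool = rng.normal(size=(n_pool, d))            # unlabeled candidate pool

model = LogisticRegression(max_iter=1000).fit(X_lab, y_lab)
p = model.predict_proba(X_pool)[:, 1]

# trace(p(1-p) x x^T) = p(1-p) * ||x||^2 : an FI-based informativeness score.
fi_score = p * (1 - p) * np.einsum("ij,ij->i", X_pool, X_pool)

query_idx = np.argsort(fi_score)[-10:][::-1]     # 10 most informative samples
print("indices to annotate next:", query_idx)
```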
18.
Chronic Obstr Pulm Dis ; 6(5): 384-399, 2019 Nov.
Article in English | MEDLINE | ID: mdl-31710793

ABSTRACT

BACKGROUND: Chronic obstructive pulmonary disease (COPD) remains a major cause of morbidity and mortality. Present-day diagnostic criteria are based almost exclusively on spirometric criteria. Accumulating evidence has identified a substantial number of individuals without spirometric evidence of COPD who suffer from respiratory symptoms and/or increased morbidity and mortality. There is a clear need for an expanded definition of COPD that is linked to physiologic, structural (computed tomography [CT]) and clinical evidence of disease. Using data from the COPD Genetic Epidemiology study (COPDGene®), we hypothesized that an integrated approach that includes environmental exposure, clinical symptoms, chest CT imaging and spirometry better defines disease and captures the likelihood of progression of respiratory obstruction and mortality. METHODS: Four key disease characteristics - environmental exposure (cigarette smoking), clinical symptoms (dyspnea and/or chronic bronchitis), chest CT imaging abnormalities (emphysema, gas trapping and/or airway wall thickening), and abnormal spirometry - were evaluated in a group of 8784 current and former smokers who were participants in COPDGene® Phase 1. Using these 4 disease characteristics, 8 categories of participants were identified and evaluated for odds of spirometric disease progression (FEV1 loss > 350 ml over 5 years), and the hazard ratio for all-cause mortality was examined. RESULTS: Using smokers without symptoms, CT imaging abnormalities or airflow obstruction as the reference population, individuals were classified as Possible COPD, Probable COPD and Definite COPD. Current Global initiative for chronic Obstructive Lung Disease (GOLD) criteria would diagnose 4062 (46%) of the 8784 study participants with COPD. The proposed COPDGene® 2019 diagnostic criteria would add an additional 3144 participants. Under the new criteria, 82% of the 8784 study participants would be diagnosed with Possible, Probable or Definite COPD. These COPD groups showed increased risk of disease progression and mortality. Mortality increased in patients as the number of their COPD characteristics increased, with a maximum hazard ratio for all-cause mortality of 5.18 (95% confidence interval [CI]: 4.15-6.48) in those with all 4 disease characteristics. CONCLUSIONS: A substantial portion of smokers with respiratory symptoms and imaging abnormalities do not manifest spirometric obstruction as defined by population normals. These individuals are at significant risk of death and spirometric disease progression. We propose to redefine the diagnosis of COPD through an integrated approach using environmental exposure, clinical symptoms, CT imaging and spirometric criteria. These expanded criteria offer the potential to stimulate both current and future interventions that could slow or halt disease progression in patients before disability or irreversible lung structural changes develop.

19.
Phys Med Biol ; 53(16): N315-27, 2008 Aug 21.
Article in English | MEDLINE | ID: mdl-18660557

ABSTRACT

Various problems with the current state-of-the-art techniques for gated radiotherapy have prevented this new treatment modality from being widely implemented in clinical routine. These problems arise mainly from the use of various external respiratory surrogates: there may be large uncertainties in deriving the tumor position from such surrogates. While tracking implanted fiducial markers has sufficient accuracy, this procedure may not be widely accepted due to the risk of pneumothorax. Previously, we developed a technique to generate gating signals from fluoroscopic images without implanted fiducial markers using template matching methods (Berbeco et al 2005 Phys. Med. Biol. 50 4481-90, Cui et al 2007b Phys. Med. Biol. 52 741-55). In this note, our main contribution is to provide a fundamentally different view of the gating problem by recasting it as a classification problem. We then solve this classification problem with a well-studied, powerful classification method: the support vector machine (SVM). Note that the goal of an automated gating tool is to decide when to turn the beam ON or OFF; we treat ON and OFF as the two classes in our classification problem. We create our labeled training data during the patient setup session by utilizing the reference gating signal, manually determined by a radiation oncologist. We then pre-process these labeled training images and build our SVM prediction model. During treatment delivery, fluoroscopic images are continuously acquired, pre-processed and sent as input to the SVM. Finally, our SVM model outputs the predicted labels as gating signals. We test the proposed technique on five sequences of fluoroscopic images from five lung cancer patients against the reference gating signal as ground truth. We compare the performance of the SVM to our previous template matching method (Cui et al 2007b Phys. Med. Biol. 52 741-55). We find that the SVM is slightly more accurate on average (1-3%) than the template matching method when delivering the target dose, and the average duty cycle is 4-6% longer. Given the very limited patient dataset, we cannot conclude that the SVM is more accurate and efficient than the template matching method. However, our preliminary results show that the SVM is a potentially precise and efficient algorithm for generating gating signals for radiotherapy. This work demonstrates that the gating problem can be treated as a classification problem and solved accordingly.


Subjects
Algorithms, Artificial Intelligence, Fluoroscopy/methods, Lung Neoplasms/diagnostic imaging, Lung Neoplasms/radiotherapy, Radiographic Image Interpretation, Computer-Assisted/methods, Radiotherapy, Computer-Assisted/methods, Fluoroscopy/instrumentation, Humans, Pattern Recognition, Automated
20.
IEEE Trans Pattern Anal Mach Intell ; 40(8): 2023-2029, 2018 08.
Article in English | MEDLINE | ID: mdl-28858784

ABSTRACT

The task of labeling samples is demanding and expensive. Active learning aims to generate the smallest possible training data set that results in a classifier with high performance in the test phase. It usually consists of two steps: selecting a set of queries and requesting their labels. Among the objectives suggested for scoring query sets, information-theoretic measures have become very popular. Among these, measures based on Fisher information (FI) have the advantages of accounting for diversity among the queries and of tractable computation. In this work, we provide a practical algorithm based on the Fisher information ratio to obtain a query distribution in a general framework where, in contrast to previous FI-based querying methods, we make no assumptions about the test distribution. The empirical results on synthetic and real-world data sets indicate that this algorithm gives competitive results.


Subjects
Algorithms, Computer Simulation, Databases, Factual/statistics & numerical data, Humans, Models, Statistical, Monte Carlo Method