ABSTRACT
Respiratory system cancer, encompassing lung, trachea and bronchus cancer, constitutes a substantial and evolving public health challenge. Since pollution is a prominent cause of this disease, identifying which substances are most harmful is fundamental for implementing policies aimed at reducing exposure to them. We propose an approach based on explainable artificial intelligence (XAI) and remote sensing data to identify the factors that most influence the prediction of the standardized mortality ratio (SMR) for respiratory system cancer in the Italian provinces, using environmental and socio-economic data. First, we identified 10 clusters of provinces through the study of the SMR variogram. Then, a Random Forest regressor was used to learn a compact representation of the data. Finally, we used XAI to identify which features were most important in predicting SMR values. Our machine learning analysis shows that NO, income and O3 are the three most relevant features for mortality from this type of cancer, and provides a guideline on intervention priorities in reducing risk factors.
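The variogram step mentioned above can be illustrated with a minimal sketch of an empirical semivariogram, which averages half the squared difference in SMR between pairs of locations grouped by distance. The coordinates, SMR values, bin width and function name are hypothetical, and the abstract does not say how the authors computed or clustered on the variogram, so this only illustrates the general technique.

```python
from math import dist


def empirical_semivariogram(coords, values, bin_width, n_bins):
    """Return gamma(h) per distance bin: sum of squared differences / (2 * pair count)."""
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            # Assign the pair to a distance bin; pairs beyond the last bin are ignored.
            k = int(dist(coords[i], coords[j]) // bin_width)
            if k < n_bins:
                sums[k] += (values[i] - values[j]) ** 2
                counts[k] += 1
    # Bins without any pair are reported as None rather than zero.
    return [s / (2 * c) if c else None for s, c in zip(sums, counts)]


# Hypothetical province centroids (arbitrary units) and SMR values.
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
smr_values = [1.0, 1.1, 1.3, 1.6]
gamma = empirical_semivariogram(coords, smr_values, bin_width=1.0, n_bins=4)
```

A semivariance that grows with distance, as in this toy data, indicates that nearby locations have more similar SMR values than distant ones.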
Subjects
Air Pollution, Artificial Intelligence, Respiratory Tract Neoplasms, Humans, Italy/epidemiology, Air Pollution/adverse effects, Respiratory Tract Neoplasms/mortality, Risk Factors, Machine Learning, Environmental Exposure/adverse effects
ABSTRACT
Respiratory malignancies, encompassing cancers of the lungs, trachea, and bronchi, pose a significant and dynamic public health challenge. Given that air pollution is a major contributor to the onset of these diseases, identifying the most harmful agents is essential for crafting policies aimed at reducing exposure. This study proposes the use of explainable artificial intelligence (XAI) methods, leveraging remote sensing data, to identify the main factors influencing the prediction of standardized mortality ratios (SMRs) for respiratory cancer across Italian provinces, using both environmental and socioeconomic data. By comparing thirteen machine learning algorithms, we seek the most accurate model for classifying Italian provinces as above or below the national average SMR for respiratory cancer. Furthermore, using XAI techniques, we identify the factors most important in predicting the two SMR classes. Our machine learning analysis highlights the environmental and socioeconomic factors relevant to mortality in this disease category, offering a roadmap for prioritizing interventions aimed at mitigating risk factors.
ABSTRACT
Background: Colorectal cancer (CRC) is a type of tumor caused by the uncontrolled growth of cells in the mucosa lining the last part of the intestine. Emerging evidence underscores an association between CRC and gut microbiome dysbiosis. The high mortality rate of this cancer has made it necessary to develop new early diagnostic methods. Machine learning (ML) techniques can offer a way to evaluate the interaction between intestinal microbiota and host physiology. Through explainable artificial intelligence (XAI) it is possible to evaluate the individual contributions of microbial taxonomic markers for each subject. Our work also implements the SHapley Additive exPlanations (SHAP) algorithm to identify, for each subject, which parameters are important in the context of CRC. Results: The proposed study aimed to implement an explainable artificial intelligence framework using both gut microbiota data and demographic information to distinguish control subjects from those with CRC. Our analysis revealed an association between gut microbiota and this disease. We compared three machine learning algorithms, and the Random Forest (RF) algorithm emerged as the best classifier, with a precision of 0.729 ± 0.038 and an area under the Precision-Recall curve of 0.668 ± 0.016. Additionally, SHAP analysis highlighted the most crucial variables in the model's decision-making, facilitating the identification of specific bacteria linked to CRC. Our results confirmed the role of certain bacteria, such as Fusobacterium, Peptostreptococcus, and Parvimonas, whose abundance appears notably associated with the disease, as well as bacteria whose presence is linked to a non-diseased state. Discussion: These findings emphasize the potential of leveraging gut microbiota data within an explainable AI framework for CRC classification. The significant association observed aligns with existing knowledge. The precision exhibited by the RF algorithm reinforces its suitability for such classification tasks. The SHAP analysis not only enhanced interpretability but also identified specific bacteria crucial in CRC determination. This approach opens avenues for targeted interventions based on microbial signatures. Further exploration is warranted to deepen our understanding of the intricate interplay between microbiota and health, providing insights for refined diagnostic and therapeutic strategies.
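To make the idea of per-subject feature contributions concrete, the sketch below computes exact Shapley values for one sample by enumerating all feature coalitions. It is not the SHAP library and not the authors' pipeline: the toy linear model, the three feature values and the all-zero baseline are hypothetical, and replacing absent features with a fixed baseline is a simplifying assumption.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample; absent features take their baseline value."""
    n = len(x)

    def value(coalition):
        # Evaluate the model with only the coalition's features set to the sample's values.
        mixed = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


# Hypothetical toy model: a linear score over three made-up taxa abundances.
def toy_model(features):
    return 2.0 * features[0] - 1.0 * features[1] + 0.5 * features[2]


sample = [1.0, 2.0, 4.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, sample, baseline)
```

The contributions sum to the difference between the prediction for the sample and the prediction for the baseline, which is the additivity property that makes Shapley-based explanations easy to read. Exact enumeration scales exponentially with the number of features, so it is only practical for small examples like this one.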
ABSTRACT
Raman spectroscopy shows great potential as a diagnostic tool for thyroid cancer due to its ability to detect biochemical changes during cancer development. This technique is particularly valuable because it is non-invasive and label/dye-free. Compared to molecular tests, Raman spectroscopy analyses can more effectively discriminate malignant features, thus reducing unnecessary surgeries. However, one major hurdle to using Raman spectroscopy as a diagnostic tool is the identification of significant patterns and peaks. In this study, we propose a Machine Learning procedure that discriminates healthy/benign from malignant nodules and produces interpretable results. We collect Raman spectra from histological samples, select a set of peaks with a data-driven and label-independent approach, and train the algorithms on the relative prominence of the peaks in the selected set. The performance of the considered models, quantified by the area under the Receiver Operating Characteristic curve, exceeds 0.9. To enhance the interpretability of the results, we employ eXplainable Artificial Intelligence and compute the contribution of each feature to the prediction of each sample.
Subjects
Artificial Intelligence, Thyroid Neoplasms, Humans, Differential Diagnosis, Thyroid Neoplasms/diagnosis, Thyroid Neoplasms/pathology, Algorithms, Raman Spectrum Analysis/methods
ABSTRACT
Hepatocellular carcinoma (HCC) is one of the most common cancers worldwide, and the number of cases is constantly increasing. Early and accurate HCC diagnosis is crucial to improving the effectiveness of treatment. The aim of the study is to develop a supervised learning framework based on hierarchical community detection and artificial intelligence in order to classify patients and controls using publicly available microarray data. With our methodology, we identified 20 gene communities that discriminated between healthy and cancerous samples, with an accuracy exceeding 90%. We validated the performance of these communities on an independent dataset, and with two of them, we reached an accuracy exceeding 80%. Then, we focused on two communities, selected because they were enriched with relevant biological functions, and on these we applied an explainable artificial intelligence (XAI) approach to analyze the contribution of each gene to the classification task. In conclusion, the proposed framework provides an effective methodological and quantitative tool helping to find gene communities, which may uncover pivotal mechanisms responsible for HCC and thus discover new biomarkers.
Subjects
Hepatocellular Carcinoma, Liver Neoplasms, Humans, Hepatocellular Carcinoma/diagnosis, Hepatocellular Carcinoma/genetics, Artificial Intelligence, Liver Neoplasms/diagnosis, Liver Neoplasms/genetics, Genetic Markers, Health Status
ABSTRACT
Introduction: Recently, accurate machine learning and deep learning approaches have been dedicated to the investigation of breast cancer invasive disease events (IDEs), such as recurrence, contralateral and second cancers. However, such approaches are poorly interpretable. Methods: Thus, we designed an Explainable Artificial Intelligence (XAI) framework to investigate IDEs within a cohort of 486 breast cancer patients enrolled at IRCCS Istituto Tumori "Giovanni Paolo II" in Bari, Italy. Using Shapley values, we determined the IDE driving features according to two periods, often adopted in clinical practice, of 5 and 10 years from the first tumor diagnosis. Results: Age, tumor diameter, surgery type, and multiplicity are predominant within the 5-year frame, while therapy-related features, including hormone, chemotherapy schemes and lymphovascular invasion, dominate the 10-year IDE prediction. Estrogen Receptor (ER), proliferation marker Ki67 and metastatic lymph nodes affect both frames. Discussion: Thus, our framework aims at shortening the distance between AI and clinical practice.
ABSTRACT
Tumours are nowadays the second leading cause of death worldwide after cardiovascular diseases. During the last decades of cancer research, lifestyle and random/genetic factors have been blamed for cancer mortality, with obesity, sedentary habits, alcoholism, and smoking regarded as the major causes. However, there is an emerging consensus that environmental pollution should be considered one of the main triggers. Unfortunately, this preliminary scientific evidence has not always been acted upon by governments and institutions, which still fail to pursue research on cancer's environmental connections. In this unprecedented, detailed national-scale study, we analyzed the links between cancer mortality, socio-economic factors, and sources of environmental pollution in Italy, at both the wider regional and the finer provincial scale, with an artificial intelligence approach. Overall, we found that cancer mortality does not have a random or spatial distribution and exceeds the national average mainly where environmental pollution is also higher, despite healthier lifestyle habits. Our machine learning analysis of 35 environmental sources of pollution showed that air quality ranks first in importance for the average cancer mortality rate, followed by sites to be reclaimed, urban areas, and motor vehicle density. Moreover, other environmental sources of pollution proved to be relevant for the mortality of some specific cancer types. Given these alarming results, we call for a rearrangement of the priorities of cancer research and care, one that treats the reduction and prevention of environmental contamination as a priority action in the tough struggle against cancer.
Subjects
Air Pollutants, Air Pollution, Neoplasms, Humans, Artificial Intelligence, Environmental Pollution/adverse effects, Motor Vehicles, Italy/epidemiology, Environmental Exposure, Mortality
ABSTRACT
In Italy, approximately 400,000 new cases of malignant tumors are recorded every year. According to the Italian Cancer Registers, the average annual number of deaths caused by tumors is about 3.5 per 1,000 men and about 2.5 per 1,000 women, for a total of about 3 deaths per 1,000 people. Long-term (at least a decade) and spatially detailed data (down to the municipality scale) are neither easily accessible nor fully available for public consultation by citizens, scientists, research groups, and associations. Therefore, we present here a ten-year (2009-2018) database of cancer mortality rates (in the form of Standardized Mortality Ratios, SMR) for 23 cancer macro-types in Italy at municipal, provincial, and regional scales. We aim to provide local and national stakeholders, researchers, and policymakers with a comprehensive, openly accessible, and ready-to-use source of data on the most up-to-date status of cancer mortality in Italy, suitable for performing specific studies.
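For readers unfamiliar with the measure, a Standardized Mortality Ratio is the number of observed deaths divided by the number of deaths expected if the area had the reference population's stratum-specific rates. The sketch below shows that arithmetic. The age strata, population counts, reference rates and observed count are hypothetical, and the abstract does not state which strata or reference rates the database uses.

```python
def expected_deaths(population_by_stratum, reference_rates):
    """Deaths expected if each stratum experienced the reference mortality rate."""
    return sum(
        population_by_stratum[stratum] * reference_rates[stratum]
        for stratum in population_by_stratum
    )


def standardized_mortality_ratio(observed, population_by_stratum, reference_rates):
    """SMR = observed deaths / expected deaths (indirect standardization)."""
    return observed / expected_deaths(population_by_stratum, reference_rates)


# Hypothetical municipality: residents per age stratum.
population = {"0-49": 60000, "50-69": 30000, "70+": 10000}
# Hypothetical national death rates per person per year for one cancer type.
national_rates = {"0-49": 0.0002, "50-69": 0.003, "70+": 0.012}
observed = 250

expected = expected_deaths(population, national_rates)
smr = standardized_mortality_ratio(observed, population, national_rates)
```

An SMR above 1 means more deaths were observed than the reference rates predict for a population with that age structure, and an SMR below 1 means fewer.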
Subjects
Neoplasms, Female, Humans, Male, Factual Databases, Italy/epidemiology, Neoplasms/mortality
ABSTRACT
A quartz-enhanced photoacoustic spectroscopy (QEPAS) sensor capable of detecting high concentrations of methane (C1) and ethane (C2) is reported here. The hydrocarbon fingerprint region around 3 µm was exploited using an interband cascade laser (ICL). A standard quartz tuning fork (QTF) coupled with two resonator tubes was used to detect the photoacoustic signal generated by the target molecules. By employing dedicated electronic boards to both control the laser source and collect the QTF signal, a shoe-box-sized QEPAS sensor was realized. All the generated mixtures were humidified downstream to remove the influence of water vapor on the target gases. Several natural gas-like samples were generated and subsequently diluted 1:10 in N2. In the concentration ranges under investigation (1%-10% for C1 and 0.1%-1% for C2), both linear and nonlinear responses of the sensor were measured, and signal variations due to matrix effects were observed. Partial least squares regression (PLSR) was employed as a multivariate statistical tool to accurately determine the concentrations of C1 and C2 in the mixtures, compensating for the matrix relaxation effects. The achieved results extend the range of C1 and C2 concentrations detectable by the QEPAS technique up to the percent scale.
ABSTRACT
Colorectal cancer (CRC) carcinogenesis is generally the result of the sequential mutation and deletion of various genes; this is known as the normal mucosa-adenoma-carcinoma sequence. The aim of this study was to develop a predictor-classifier during the "adenoma-carcinoma" sequence using microarray gene expression profiles of primary CRC, adenoma, and normal colon epithelial tissues. Four gene expression profiles from the Gene Expression Omnibus database, containing 465 samples (105 normal, 155 adenoma, and 205 CRC), were preprocessed to identify differentially expressed genes (DEGs) between adenoma tissue and primary CRC. The feature selection procedure, using the sequential Boruta algorithm and Stepwise Regression, determined 56 highly important genes. K-Means methods showed that, using the selected 56 DEGs, the three groups were clearly separate. The classification was performed with machine learning algorithms such as Linear Model (LM), Random Forest (RF), k-Nearest Neighbors (k-NN), and Artificial Neural Network (ANN). The best classification method in terms of accuracy (88.06 ± 0.70) and AUC (92.04 ± 0.47) was k-NN. To confirm the relevance of the predictive models, we applied the four models on a validation cohort: the k-NN model remained the best model in terms of performance, with 91.11% accuracy. Among the 56 DEGs, we identified 17 genes with an ascending or descending trend through the normal mucosa-adenoma-carcinoma sequence. Moreover, using the survival information of the TCGA database, we selected six DEGs related to patient prognosis (SCARA5, PKIB, CWH43, TEX11, METTL7A, and VEGFA). The six-gene-based classifier described in the current study could be used as a potential biomarker for the early diagnosis of CRC.
ABSTRACT
The mortality associated with breast cancer (BC) is in many cases related to metastasis and recurrence. Personalized treatment strategies are critical for improving the outcomes of BC patients, and Clinical Decision Support Systems can play an important role in medical practice. In this paper, we present the preliminary results of a model predicting Breast Cancer Recurrence (BCR) within five and ten years after diagnosis. The main breast cancer-related and treatment-related features of 256 patients referred to Istituto Tumori "Giovanni Paolo II" of Bari (Italy) were used to train state-of-the-art machine learning algorithms. First, we implemented several feature importance techniques; then we evaluated the performance of different classifiers in predicting BCR within 5 and 10 years after the first diagnosis. Using a small number of features, the models performed well for BCR both within 5 years and within 10 years, with accuracies of 77.50% and 80.39% and sensitivities of 92.31% and 95.83%, respectively, on the hold-out test sample. Although validation studies on larger samples are needed, our results are promising for the development of a reliable prognostic tool to support clinicians in defining personalized treatment plans.
ABSTRACT
Contrast-Enhanced Spectral Mammography (CESM) is a recently introduced mammographic method with characteristics particularly suitable for breast cancer radiomic analysis. This work aims to evaluate radiomic features for predicting histological outcome and two cancer molecular subtypes, namely Human Epidermal growth factor Receptor 2 (HER2)-positive and triple-negative. From 52 patients, 68 lesions were identified and confirmed on histological examination. Radiomic analysis was performed on regions of interest (ROIs) selected from both low-energy (LE) and ReCombined (RC) CESM images. Fourteen statistical features were extracted from each ROI. Expression of estrogen receptor (ER) was significantly correlated with variation coefficient and variation range calculated on both LE and RC images; progesterone receptor (PR) with skewness index calculated on LE images; and Ki67 with variation coefficient, variation range, entropy and relative smoothness indices calculated on RC images. HER2 was significantly associated with relative smoothness calculated on LE images, and tumor grade with variation coefficient, entropy and relative smoothness calculated on RC images. Encouraging results were obtained for differentiating ER+/ER-, PR+/PR-, HER2+/HER2-, Ki67+/Ki67-, High-Grade/Low-Grade and TN/NTN. Specifically, the highest performances were obtained for discriminating HER2+/HER2- (90.87%), ER+/ER- (83.79%) and Ki67+/Ki67- (84.80%). Our results suggest a promising role for radiomics in CESM for predicting histological outcomes and specific tumor molecular subtypes.
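As an illustration of the kind of first-order statistical features named above, the sketch below computes the variation coefficient, variation range, skewness and entropy of the pixel intensities in one ROI. The abstract does not define its fourteen features, so the conventional textbook definitions are assumed here, and the four-pixel ROI and the function name are hypothetical.

```python
from collections import Counter
from math import log2
from statistics import mean, pstdev


def first_order_features(pixels):
    """Conventional first-order statistics of a flat list of ROI pixel intensities."""
    n = len(pixels)
    mu = mean(pixels)
    sigma = pstdev(pixels)  # population standard deviation
    # Skewness: third central moment divided by sigma cubed (0 for a flat ROI).
    skewness = sum((p - mu) ** 3 for p in pixels) / (n * sigma ** 3) if sigma else 0.0
    # Shannon entropy of the intensity histogram, in bits.
    counts = Counter(pixels)
    entropy = -sum((c / n) * log2(c / n) for c in counts.values())
    return {
        "variation_coefficient": sigma / mu if mu else float("nan"),
        "variation_range": max(pixels) - min(pixels),
        "skewness": skewness,
        "entropy": entropy,
    }


# Hypothetical ROI flattened to a list of grey levels.
roi = [10, 10, 20, 20]
features = first_order_features(roi)
```

In a radiomic workflow, a vector like this would be computed per ROI on each image type and then tested for association with receptor status or grade.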
ABSTRACT
Malignant pleural mesothelioma (MPM) is a rare neoplasm, mainly caused by asbestos exposure, with a high mortality rate. The management of patients with MPM is controversial because of the long latency period between exposure and diagnosis and because non-specific symptoms generally appear at an advanced stage of the disease. Breath analysis, aimed at identifying a diagnostic pattern of Volatile Organic Compounds (VOCs) in exhaled breath, is believed to improve early detection of MPM. Therefore, in this study, breath samples from 14 MPM patients and 20 healthy controls (HC) were collected and analyzed by Thermal Desorption-Gas Chromatography-Mass Spectrometry (TD-GC/MS). A nonparametric test identified the variables with the greatest weight in discriminating between MPM and HC breath samples, and multivariate statistics were applied. Since MPM is an aggressive neoplasm that is diagnosed late, which makes patient recruitment very difficult, a promising data mining approach was developed and validated to discriminate between MPM patients and healthy controls even when no large population data are available. Three different machine learning algorithms were applied to the classification task with a leave-one-out cross-validation approach, leading to remarkable results (Area Under Curve, AUC = 93%). Ten VOCs, such as ketones, alkanes and methylated derivatives, as well as hydrocarbons, were able to discriminate between MPM patients and healthy controls; for each compound found to be diagnostic for MPM, the metabolic pathway was studied in order to identify the link between the VOC and the neoplasm. Moreover, five breath samples from asymptomatic asbestos-exposed persons (AEx) were analyzed on an exploratory basis, then processed and tested as blinded samples with the validated statistical method, in order to evaluate its performance in the early recognition of MPM among asbestos-exposed persons. Good agreement was found between the model output and the information obtained by gold-standard diagnostic methods such as computed tomography (CT).
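Leave-one-out cross-validation, the validation scheme named above, holds out each sample in turn, trains on the rest, and scores the held-out prediction. The sketch below shows the loop with a nearest-centroid classifier as a stand-in. The abstract does not name its three algorithms, so the classifier, the two made-up VOC features and the six samples are all hypothetical.

```python
def nearest_centroid_predict(train_X, train_y, x):
    """Predict the label whose class centroid is closest to x (squared Euclidean distance)."""
    centroids = {}
    for label in set(train_y):
        rows = [row for row, y in zip(train_X, train_y) if y == label]
        centroids[label] = [sum(column) / len(rows) for column in zip(*rows)]

    def squared_distance(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    return min(centroids, key=lambda label: squared_distance(centroids[label], x))


def leave_one_out_accuracy(X, y):
    """Hold out each sample once, train on the remainder, and average the hits."""
    correct = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if nearest_centroid_predict(train_X, train_y, X[i]) == y[i]:
            correct += 1
    return correct / len(X)


# Hypothetical normalized abundances of two VOCs per breath sample.
X = [[0.1, 0.2], [0.2, 0.1], [0.0, 0.3], [0.9, 1.0], [1.1, 0.8], [1.0, 1.2]]
y = ["HC", "HC", "HC", "MPM", "MPM", "MPM"]
accuracy = leave_one_out_accuracy(X, y)
```

This scheme uses every sample for testing exactly once, which is why it is a common choice when, as in this study, only a few dozen samples are available.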
ABSTRACT
BACKGROUND: Screening programs use mammography as the primary diagnostic tool for detecting breast cancer at an early stage. The diagnosis of some lesions, such as microcalcifications, is still difficult for radiologists. In this paper, we proposed an automatic binary model for discriminating tissue in digital mammograms, as a support tool for radiologists. In particular, we compared the contribution of different feature selection methods in terms of learning performance and selected features. RESULTS: For each ROI, we extracted textural features on Haar wavelet decompositions, as well as interest points and corners detected using Speeded Up Robust Feature (SURF) and the Minimum Eigenvalue Algorithm (MinEigenAlg). Then a Random Forest binary classifier was trained on a subset of features selected by two different kinds of feature selection techniques, namely filter and embedded methods. We tested the proposed model on 260 ROIs extracted from digital mammograms of the BCDR public database. The best prediction performance for the normal/abnormal and benign/malignant problems reached median AUC values of 98.16% and 92.08%, and accuracies of 97.31% and 88.46%, respectively. The experimental results were comparable with the performance of related work. CONCLUSIONS: The best-performing result obtained with the embedded method is more parsimonious than that obtained with the filter method. The SURF and MinEigen algorithms provide highly informative content useful for the characterization of microcalcification clusters.
Subjects
Breast, Calcinosis/diagnosis, Machine Learning, Algorithms, Area Under Curve, Breast/diagnostic imaging, Breast Neoplasms/diagnosis, Factual Databases, Female, Humans, Mammography, ROC Curve
ABSTRACT
Contrast-Enhanced Spectral Mammography (CESM) is a novel technique for diagnosing breast cancer, but it can still be considered operator dependent. In this paper, we proposed a fully automatic system as a diagnostic support tool for clinicians. For each Region Of Interest (ROI), a feature set was extracted from low-energy and recombined images using different techniques. A Random Forest classifier was trained on a subset of significant features selected by a sequential feature selection algorithm. The proposed Computer-Automated Diagnosis system was tested on 48 ROIs extracted from 53 patients referred to Istituto Tumori "Giovanni Paolo II" of Bari (Italy) from the breast cancer screening phase between March 2017 and June 2018. The method performed well in predicting benign/malignant ROIs, with median sensitivity and specificity of 87.5% and 91.7%, respectively. The performance was high compared to the state of the art, even with a moderate/marked level of parenchymal background. Our classification model outperformed the human reader, increasing specificity by over 8%. Therefore, our system could be a valid support tool for radiologists interpreting CESM images, both reducing the false positive rate and limiting biopsies and surgeries.
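The two reported metrics follow directly from the confusion matrix: sensitivity is the fraction of malignant ROIs correctly flagged, and specificity is the fraction of benign ROIs correctly cleared. A minimal sketch follows. The eight labels and predictions are hypothetical and are not taken from the study.

```python
def sensitivity_specificity(y_true, y_pred, positive="malignant"):
    """Return (sensitivity, specificity) for a binary classification."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical ground truth and classifier output for eight ROIs.
y_true = ["malignant"] * 4 + ["benign"] * 4
y_pred = ["malignant", "malignant", "malignant", "benign",
          "benign", "benign", "benign", "malignant"]
sensitivity, specificity = sensitivity_specificity(y_true, y_pred)
```

Higher specificity means fewer benign ROIs flagged as malignant, which is the link the abstract draws between the classifier and fewer unnecessary biopsies.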
ABSTRACT
Meckel's diverticulum (MD) is the most common congenital anomaly of the gastrointestinal tract. We reviewed the clinical records of patients discharged from the Unit of Urgent and General Surgery of the Highly Specialized Hospital "A.O.R.N. Antonio Cardarelli" of Naples with a diagnosis of acute pathology associated with complicated MD from 1st January 2011 to 30th November 2012. Seven consecutive cases were selected: five males (71.4%) and two females (28.6%). Age ranged from 13 to 50 years, with a mean of 28 years. Four patients underwent emergency surgery for gastrointestinal hemorrhage (57%), two for bowel obstruction (29%) and one for acute appendicitis (14%). In all cases the specimen was sent for histological examination. Two samples showed normal epithelial mucosa. Four showed ectopic mucosa inside the diverticulum: three with focal areas of gastric mucosa and one with focal areas of pancreatic mucosa. The last case showed normal epithelial cells but with ulcerated and hemorrhagic areas. Histological examination of the four samples from patients with gastrointestinal hemorrhage showed one case of normal mucosa, one of gastric mucosa areas, one of pancreatic ectopic tissue and one of normal mucosa with ulcerated and bleeding areas. In our experience, complicated MD was never suspected preoperatively as the cause of the acute symptoms, and the diagnosis was always made during laparotomy. We think that MD removal is always the correct choice, so that future complications such as neoplasm can be avoided. Simple resection of the MD with a stapler at the base of the diverticulum is the correct choice.
Subjects
Meckel Diverticulum/surgery, Acute Abdomen/etiology, Acute Disease, Adolescent, Adult, Appendicitis/complications, Appendicitis/surgery, Choristoma, Diagnostic Imaging/methods, Digestive System Surgical Procedures/methods, Disease Management, Diverticulitis/pathology, Diverticulitis/surgery, Emergencies, Female, Gastric Mucosa, Gastrointestinal Hemorrhage/etiology, Humans, Intestinal Obstruction/etiology, Laparotomy, Male, Meckel Diverticulum/complications, Meckel Diverticulum/diagnostic imaging, Meckel Diverticulum/pathology, Middle Aged, Pancreas, Radiography, Retrospective Studies, Ulcer/etiology, Young Adult
ABSTRACT
PURPOSE: The aim of this work is to evaluate the potential of combining different computer-aided detection (CADe) methods to increase the actual support that automated systems offer radiologists in the identification of pulmonary nodules in CT scans. METHODS: The outputs of three different CADe systems developed by researchers of the Italian MAGIC-5 collaboration were combined. The systems are: the CAMCADe (based on a Channeler-Ant-Model, which segments the vessel tree and nodule candidates, and a neural classifier); the RGVPCADe (a Region-Growing-Volume-Plateau algorithm detects nodule candidates and a neural network reduces false positives); and the VBNACADe (two dedicated procedures, based respectively on a 3D dot-enhancement algorithm and on intersections of pleura surface normals, identify internal and juxtapleural nodules, and a Voxel-Based-Neural-Approach reduces false positives). A dedicated OsiriX plugin implemented with the Cocoa environment of Mac OS X allows annotating nodules and visualizing single and combined CADe findings. RESULTS: The combined CADe has been tested on thin-slice (less than 2 mm) CTs from the LIDC public research database, and the results have been compared with those obtained by the single systems. The FROC (Free-response Receiver Operating Characteristic) curves show better results than the best of the single approaches. CONCLUSIONS: It has been demonstrated that the combination of different approaches offers better results than each single CADe system. A clinical validation of the combined CADe as a second reader is being carried out by means of the dedicated OsiriX plugin.
Subjects
Algorithms, Lung Neoplasms/diagnosis, Computer Neural Networks, Automated Pattern Recognition/methods, Computer-Assisted Radiographic Image Interpretation/methods, X-Ray Computed Tomography/methods, Differential Diagnosis, False Positive Reactions, Humans, ROC Curve, Solitary Pulmonary Nodule
ABSTRACT
A fully automated, three-dimensional (3D) segmentation method for the identification of the pulmonary parenchyma in thorax X-ray computed tomography (CT) datasets is proposed. It is meant to be used as a pre-processing step in the computer-assisted detection (CAD) system for malignant lung nodule detection that is being developed by the Medical Applications in a Grid Infrastructure Connection (MAGIC-5) Project. In this new approach, the segmentation of the external airways (trachea and bronchi) is obtained by 3D region growing with wavefront simulation and suitable stop conditions, thus allowing accurate handling of the hilar region, which is notoriously difficult to segment. Particular attention was also devoted to checking and solving the problem of the apparent 'fusion' between the lungs, caused by partial-volume effects, while 3D morphology operations ensure the accurate inclusion of all the nodules (internal, pleural, and vascular) in the segmented volume. The new algorithm was initially developed and tested on a dataset of 130 CT scans from the Italung-CT trial, and was then applied to the ANODE09-competition images (55 scans) and to the LIDC database (84 scans), giving very satisfactory results. In particular, the lung contour was adequately located in 96% of the CT scans, with incorrect segmentation of the external airways in the remaining cases. Segmentation metrics were calculated that quantitatively express the consistency between automatic and manual segmentations: the mean overlap degree of the segmentation masks is 0.96 ± 0.02, and the mean and the maximum distance between the mask borders (averaged over the whole dataset) are 0.74 ± 0.05 and 4.5 ± 1.5, respectively, which confirms that the automatic segmentations closely reproduce the borders traced by the radiologist. Moreover, no tissue containing internal and pleural nodules was removed in the segmentation process, so the method proved fit for use in the framework of a CAD system. Finally, in a comparison with a two-dimensional segmentation procedure, inter-slice smoothness was calculated, showing that the masks created by the 3D algorithm are significantly smoother than those calculated by the 2D-only procedure.
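An overlap degree between an automatic and a manual mask can be computed in several ways, and the abstract does not say which definition was used. The sketch below shows two common candidates, the Jaccard index and the Dice coefficient, on masks stored as sets of voxel coordinates. The two small masks are hypothetical.

```python
def jaccard_index(mask_a, mask_b):
    """Intersection over union of two voxel sets."""
    return len(mask_a & mask_b) / len(mask_a | mask_b)


def dice_coefficient(mask_a, mask_b):
    """Twice the intersection divided by the total size of both voxel sets."""
    return 2 * len(mask_a & mask_b) / (len(mask_a) + len(mask_b))


# Hypothetical masks: two 4x4 single-slice blocks, offset by one voxel along x.
automatic = {(x, y, 0) for x in range(4) for y in range(4)}
manual = {(x, y, 0) for x in range(1, 5) for y in range(4)}

jaccard = jaccard_index(automatic, manual)
dice = dice_coefficient(automatic, manual)
```

Both measures equal 1 for identical masks and 0 for disjoint ones, and the Dice coefficient is never lower than the Jaccard index for the same pair of masks.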
Subjects
Algorithms, Lung Neoplasms/diagnosis, Lung/diagnostic imaging, Computer-Assisted Radiographic Image Interpretation/methods, Humans, Lung Neoplasms/diagnostic imaging, Retrospective Studies, X-Ray Computed Tomography
ABSTRACT
BACKGROUND: Matrix metalloproteinases (MMPs) are well-known biological targets implicated in tumour progression, homeostatic regulation, innate immunity, impaired delivery of pro-apoptotic ligands, and the release and cleavage of cell-surface receptors. With this in mind, understanding the close relationships among diverse MMPs could be a solid basis for accelerating the design of new selective MMP inhibitors. In this regard, uncovering the underlying molecular reasons for similarity among MMPs is a key challenge. RESULTS: We describe a pairwise variant of the non-parametric chaotic map clustering (CMC) algorithm and its application to 104 X-ray MMP structures. In this analysis, electrostatic potentials are computed and used as input for the CMC algorithm. It was shown that differences between proteins reflect genuine variation in their electrostatic potentials. In addition, the analysis was extended to the protein primary structures and the molecular shapes of the MMP co-crystallised ligands. CONCLUSIONS: The CMC algorithm was shown to be a valuable tool for knowledge acquisition and transfer from MMP structures. Based on the variation of electrostatic potentials, CMC was successful in analysing the MMP target family landscape and different subsites. The first investigation resulted in a rational interpretation of both domain organization and substrate specificity classifications. The second made it possible to distinguish the MMP classes, demonstrating the high specificity of the S1' pocket, and to detect both the occurrence of point mutations of ionisable residues and different side-chain conformations that likely account for induced-fit phenomena. In addition, CMC demonstrated a potential comparable to that of the most popular method, UPGMA (Unweighted Pair Group Method with Arithmetic mean), which at present is a standard clustering approach in bioinformatics. Interestingly, CMC and UPGMA produced closely comparable outcomes, but CMC often produced more informative and more easily interpretable dendrograms. Finally, CMC was successful for standard pairwise analysis (i.e., Smith-Waterman algorithm) of protein sequences and was used to convincingly explain the complementarity existing between the molecular shapes of the co-crystallised ligand molecules and the accessible MMP void volumes.
Subjects
X-Ray Crystallography, Matrix Metalloproteinases/chemistry, Algorithms, Cluster Analysis, Ligands, Matrix Metalloproteinases/metabolism, Molecular Models, Protein Conformation, Substrate Specificity
ABSTRACT
Numerous publications and commercial systems are available that deal with automatic detection of pulmonary nodules in thoracic computed tomography scans, but a comparative study in which many systems are applied to the same data set has not yet been performed. This paper introduces ANODE09 (http://anode09.isi.uu.nl), a database of 55 scans from a lung cancer screening program and a web-based framework for objective evaluation of nodule detection algorithms. Any team can upload results to facilitate benchmarking. The performance of six algorithms for which results are available is compared: five from academic groups and one commercially available system. A method to combine the output of multiple systems is proposed. Results show a substantial performance difference between algorithms and demonstrate that combining the output of algorithms leads to marked performance improvements.