ABSTRACT
BACKGROUND: Knowledge of the spatial organisation of the chromatin fibre in cell nuclei helps researchers understand the nuclear machinery that regulates DNA activity. Recent experimental techniques based on Chromosome Conformation Capture (3C or similar) provide high-resolution, high-throughput data consisting of the number of times each possible pair of DNA fragments is found in contact within a given population of cells. As these data carry information on the structure of the chromatin fibre, several attempts have been made to use them to obtain high-resolution 3D reconstructions of entire chromosomes, or even an entire genome. The proposed techniques treat the data in different ways, possibly exploiting physical-geometric chromatin models. One popular strategy is to transform contact data into Euclidean distances between pairs of fragments, and then solve a classical distance-to-geometry problem.
RESULTS: We developed and tested a reconstruction technique that does not require translating contacts into distances, thus avoiding a number of related drawbacks. We also introduce a geometrical chromatin chain model that allows us to include sound biochemical and biological constraints in the problem. This model can be scaled to different genomic resolutions, where the structures of the coarser models are influenced by the reconstructions at finer resolutions. The solution space is then searched by classical simulated annealing, in which the model is evolved efficiently through quaternion operators. Appropriate constraints allow the less reliable data to be overlooked, so the result is a set of plausible chromatin configurations compatible with both the data and the prior knowledge.
CONCLUSIONS: To test our method, we obtained a number of 3D chromatin configurations from Hi-C data available in the literature for the long arm of human chromosome 1, and validated their features against known properties of gene density and transcriptional activity. Our results are compatible with biological features not introduced a priori in the problem: structurally different regions in our reconstructions highly correlate with functionally different regions as known from literature and genomic repositories.
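The simulated annealing search mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the scalar toy score, the geometric cooling schedule, and all names (`metropolis_accept`, `anneal`, `cooling`, `step_size`) are assumptions chosen only to show the Metropolis acceptance rule on which classical simulated annealing relies.

```python
import math
import random


def metropolis_accept(delta, temperature, rng):
    """Accept improvements always; accept a worse move with probability exp(-delta / T)."""
    if delta <= 0:
        return True
    return rng.random() < math.exp(-delta / temperature)


def anneal(score, x0, t0=1.0, cooling=0.99, steps=2000, step_size=0.5, seed=0):
    """Minimise a scalar score by simulated annealing (hypothetical toy setting)."""
    rng = random.Random(seed)
    x, s = x0, score(x0)
    best_x, best_s = x, s
    t = t0
    for _ in range(steps):
        # Propose a random perturbation of the current state.
        cand = x + rng.uniform(-step_size, step_size)
        cand_s = score(cand)
        if metropolis_accept(cand_s - s, t, rng):
            x, s = cand, cand_s
            if s < best_s:
                best_x, best_s = x, s
        # Geometric cooling: an assumed schedule, not taken from the paper.
        t *= cooling
    return best_x, best_s


if __name__ == "__main__":
    # Toy score with its minimum at x = 3.
    print(anneal(lambda x: (x - 3.0) ** 2, x0=0.0))
```

In a chromatin reconstruction the state would be a whole chain configuration and the score would combine data fit and constraints; the scalar state here only keeps the acceptance logic visible.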
Subjects
Chromatin/chemistry, Genomics/methods, Algorithms, Chromatin/metabolism, DNA/chemistry, DNA/metabolism, Humans, Internet, Molecular Conformation, Monte Carlo Method, User-Computer Interface
ABSTRACT
Breast cancer has the highest diagnosis rate among female tumors and is the leading cause of cancer death among women. Quantitative analysis of radiological images shows potential to address several medical challenges, including the early detection and classification of breast tumors. In the P.I.N.K study, 66 women were enrolled. Their paired Automated Breast Volume Scanner (ABVS) and Digital Breast Tomosynthesis (DBT) images, annotated with cancerous lesions, populated the first ABVS+DBT dataset. This enabled not only a radiomic analysis for classifying breast lesions as malignant or benign, but also a comparison of the two modalities. For this purpose, the models were trained using a leave-one-out nested cross-validation strategy combined with a proper threshold selection approach. This approach provides statistically significant results even with medium-sized data sets. Additionally, it provides distributions of variable importance, thus identifying the most informative radiomic features. The analysis proved the predictive capacity of radiomic models even when using a reduced number of features. Indeed, from DBT we achieved an AUC-ROC of 89.9% using 19 features and 92.1% using 7 of them, while from ABVS we attained an AUC-ROC of 72.3% using 22 features and 85.8% using only 3 features. Although the predictive power of DBT exceeds that of ABVS, when the predictions are compared at the patient level only 8.7% of lesions are misclassified by both methods, suggesting a partial complementarity. Notably, promising results (AUC-ROC for ABVS and DBT: 71.8% and 74.1%, respectively) were achieved using non-geometric features, thus opening the way to the integration of virtual biopsy into medical routine.
Subjects
Breast Neoplasms, Machine Learning, Mammography, Humans, Female, Breast Neoplasms/diagnostic imaging, Breast Neoplasms/pathology, Mammography/methods, Middle Aged, Aged, Adult, Breast/diagnostic imaging, Breast/pathology, Computer-Assisted Radiographic Image Interpretation/methods, Radiomics
ABSTRACT
A multiscale method proposed elsewhere for reconstructing plausible 3D configurations of the chromatin in cell nuclei is recalled. It is based on the integration of contact data from Hi-C experiments with additional information from ChIP-seq, RNA-seq and ChIA-PET experiments. Provided that the additional data come from independent experiments, this kind of approach is expected to leverage them to complement possibly noisy, biased or missing Hi-C records. When the different data sources agree, the resulting solutions are corroborated; otherwise, their validity is weakened. This raises a problem of reliability, which requires an appropriate choice of the relative weights assigned to the different informational contributions. A series of experiments is presented that helps to quantify the advantages and limitations of this strategy. Whereas the gains in accuracy are not always significant, the case of missing Hi-C data demonstrates the effectiveness of additional information in reconstructing the highly packed segments of the structure.
ABSTRACT
Over the last decade, Raman spectroscopy has been establishing itself as a highly promising technique for the classification of tumour tissues, as it yields biochemical maps of the tissues under investigation, making it possible to observe changes among different tissues in terms of biochemical constituents (proteins, lipid structures, DNA, vitamins, and so on). In this paper, we aim to show that techniques emerging from the cross-fertilization of persistent homology and machine learning can support the classification of Raman spectra extracted from cancerous tissues for tumour grading. In more detail, topological features of Raman spectra and machine learning classifiers are combined in an automatic classification pipeline in order to select the best-performing pair. The case study is the grading of chondrosarcoma into four classes; cross-validation and leave-one-patient-out validation were used to assess the classification accuracy of the method. The binary classification achieves a validation accuracy of 81% and a test accuracy of 90%. Moreover, the test dataset was collected at a different time and with different equipment. These results are achieved by a support vector classifier trained on the Betti curve representation of the topological features extracted from the Raman spectra, and compare very favourably with the existing literature. The added value of these results is that the model for predicting chondrosarcoma grade could easily be implemented in clinical practice, possibly integrated into the acquisition system.
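To make the Betti curve representation more concrete, the following sketch computes a 0-dimensional Betti curve for a one-dimensional signal. The abstract does not state which filtration was used, so the sublevel-set filtration assumed here, the restriction to connected components, and the function name `betti0_curve` are all illustrative assumptions rather than the authors' pipeline.

```python
def betti0_curve(signal, thresholds):
    """Count connected components of the sublevel set {i : signal[i] <= t} for each t.

    For a 1D signal sampled on a line, a connected component is a maximal run
    of consecutive samples whose values do not exceed the threshold.
    """
    curve = []
    for t in thresholds:
        components = 0
        inside = False  # True while scanning through a run of samples <= t
        for value in signal:
            if value <= t:
                if not inside:
                    components += 1  # a new run starts here
                inside = True
            else:
                inside = False
        curve.append(components)
    return curve


if __name__ == "__main__":
    # Hypothetical toy "spectrum" with two peaks.
    toy_spectrum = [0.0, 2.0, 1.0, 3.0, 0.5]
    print(betti0_curve(toy_spectrum, [-1.0, 0.0, 0.5, 1.0, 2.0, 3.0]))
```

The resulting vector of counts, sampled on a fixed threshold grid, has a fixed length and can therefore be passed to a standard classifier such as a support vector machine.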
Subjects
Bone Neoplasms, Chondrosarcoma, Humans, Raman Spectrum Analysis/methods, Machine Learning, Neoplasm Grading, Support Vector Machine
ABSTRACT
NAVIGATOR is an Italian regional project boosting precision medicine in oncology with the aim of making it more predictive, preventive, and personalised by advancing translational research based on quantitative imaging and integrative omics analyses. The project's goal is to develop an open imaging biobank for the collection and preservation of a large amount of standardised imaging multimodal datasets, including computed tomography, magnetic resonance imaging, and positron emission tomography data, together with the corresponding patient-related and omics-related relevant information extracted from regional healthcare services using an adapted privacy-preserving model. The project is based on an open-source imaging biobank and an open-science oriented virtual research environment (VRE). Available integrative omics and multi-imaging data of three use cases (prostate cancer, rectal cancer, and gastric cancer) will be collected. All data confined in NAVIGATOR (i.e., standard and novel imaging biomarkers, non-imaging data, health agency data) will be used to create a digital patient model, to support the reliable prediction of the disease phenotype and risk stratification. The VRE that relies on a well-established infrastructure, called D4Science.org, will further provide a multiset infrastructure for processing the integrative omics data, extracting specific radiomic signatures, and for identification and testing of novel imaging biomarkers through big data analytics and artificial intelligence.
Subjects
Artificial Intelligence, Precision Medicine, Precision Medicine/methods, Biological Specimen Banks, Positron-Emission Tomography, Biomarkers
ABSTRACT
The three-dimensional structure of chromatin in the cellular nucleus carries important information that is connected to physiological and pathological correlates and dysfunctional cell behaviour. As direct observation is not feasible at present, several experimental techniques have been developed to provide information on the spatial organization of the DNA in the cell and, in parallel, several computational methods have been developed to process experimental data and infer 3D chromatin conformations. The most relevant experimental methods are Chromosome Conformation Capture and its derivatives, chromatin immunoprecipitation and sequencing techniques (ChIP-seq), RNA-seq, fluorescence in situ hybridization (FISH) and other genetic and biochemical techniques. All of them provide important and complementary information that relates to the three-dimensional organization of chromatin. However, these techniques employ very different experimental protocols and provide information that is not easily integrated, due to different contexts and different resolutions. Here, we present an open-source tool, an expansion of the previously reported code ChromStruct, for inferring the 3D structure of chromatin. By exploiting a multilevel approach, it allows an easy integration of information derived from different experimental protocols and referring to different resolution levels of the structure, from a few kilobases up to megabases. Our results show that introducing chromatin modelling features related to CTCF ChIA-PET data, histone modification ChIP-seq, and RNA-seq data produces appreciable improvements in ChromStruct's 3D reconstructions, compared to the use of Hi-C data alone, at a local level and at very high resolution.
ABSTRACT
We review the current applications of artificial intelligence (AI) in functional genomics. The recent explosion of AI follows the remarkable achievements made possible by "deep learning", along with a burst of "big data" that can meet its hunger. Biology is about to overtake astronomy as the paradigmatic big data producer. This has been made possible by huge advancements in the field of high-throughput technologies, applied to determine how the individual components of a biological system work together to accomplish different processes. The disciplines contributing to this bulk of data are collectively known as functional genomics. They comprise the study of: i) the information contained in the DNA (genomics); ii) the modifications that DNA can reversibly undergo (epigenomics); iii) the RNA transcripts originated by a genome (transcriptomics); iv) the ensemble of chemical modifications decorating different types of RNA transcripts (epitranscriptomics); v) the products of protein-coding transcripts (proteomics); and vi) the small molecules produced from cell metabolism (metabolomics) present in an organism or system at a given time, in physiological or pathological conditions. After reviewing the main applications of AI in functional genomics, we discuss important accompanying issues, including ethical, legal and economic issues and the importance of explainability.
ABSTRACT
Our purpose is to evaluate the performance of magnetic resonance (MR) radiomics analysis for differentiating between malignant and benign parotid neoplasms and, among the latter, between pleomorphic adenomas and Warthin tumors. We retrospectively evaluated 75 T2-weighted images of parotid gland lesions, of which 61 were benign tumors (32 pleomorphic adenomas, 23 Warthin tumors and 6 oncocytomas) and 14 were malignant tumors. A receiver operating characteristic (ROC) curve analysis was performed to find the threshold values for the most discriminative features and determine their sensitivity, specificity and area under the ROC curve (AUROC). The most discriminative features were used to train a support vector machine classifier. The best classification performance was obtained when distinguishing pleomorphic adenomas from Warthin tumors (sensitivity, specificity and diagnostic accuracy as high as 0.8695, 0.9062 and 0.8909, respectively) and pleomorphic adenomas from malignant tumors (sensitivity, specificity and diagnostic accuracy of 0.6666, 0.8709 and 0.8043, respectively). Radiomics analysis of parotid tumors on conventional T2-weighted MR images allows the discrimination of pleomorphic adenomas from Warthin tumors and malignant tumors with high sensitivity, specificity and diagnostic accuracy.
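As a worked illustration of how sensitivity, specificity and diagnostic accuracy are related, the sketch below computes the three metrics from a confusion matrix. The counts are not reported in the abstract: they are an assumed set of values, with Warthin tumor taken as the positive class, that is consistent with the stated cohort sizes (23 Warthin tumors, 32 pleomorphic adenomas) and with the reported figures for that comparison.

```python
def binary_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)            # true positive rate
    specificity = tn / (tn + fp)            # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Assumed counts (not given in the abstract): 20 of 23 Warthin tumors and
# 29 of 32 pleomorphic adenomas classified correctly.
sens, spec, acc = binary_metrics(tp=20, fn=3, tn=29, fp=3)

if __name__ == "__main__":
    print(f"sensitivity={sens:.4f} specificity={spec:.4f} accuracy={acc:.4f}")
```

Under these assumed counts the three values agree with the reported 0.8695, 0.9062 and 0.8909 up to the rounding or truncation of the last digit.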
ABSTRACT
We present a method to infer 3D chromatin configurations from Chromosome Conformation Capture data. Quite a few methods have been proposed to estimate the structure of the nuclear DNA in homogeneous populations of cells from this kind of data. Many of them transform contact frequencies into Euclidean distances between pairs of chromatin fragments, and then reconstruct the structure by solving a distance-to-geometry problem. To avoid inconsistencies, our method is based on a score function that does not require any frequency-to-distance translation. We propose a multiscale chromatin model where the chromatin fiber is suitably partitioned at each scale. The partial structures are estimated independently, and connected to rebuild the whole fiber. Our score function consists of a data-fit part and a penalty part, balanced automatically at each scale and for each subchain. The penalty part enforces soft geometric constraints. As many different structures can fit the data, our sampling strategy produces a set of solutions with similar scores. The procedure contains a few parameters, independent of both the scale and the genomic segment treated. The partition of the fiber, along with intrinsically parallel parts, makes this method computationally efficient. Results from human genome data support the biological plausibility of our solutions.
Subjects
Chromatin/ultrastructure, Molecular Models, Algorithms, Bayes Theorem, Cell Line, Chromatin/chemistry, Chromatin/metabolism, Computational Biology, Humans, Reproducibility of Results
ABSTRACT
A method and a stand-alone Python(TM) code to estimate the 3D chromatin structure from chromosome conformation capture data are presented. The method is based on a multiresolution, modified-bead-chain chromatin model, evolved through quaternion operators in a Monte Carlo sampling. The solution space to be sampled is generated by a score function with a data-fit part and a constraint part where the available prior knowledge is implicitly coded. The final solution is a set of 3D configurations that are compatible with both the data and the prior knowledge. The iterative code, provided here as additional material, is equipped with a graphical user interface and stores its results in standard-format files for 3D visualization. We describe the mathematical-computational aspects of the method and explain the details of the code. Some experimental results are reported, with a demonstration of their fit to the data.
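The idea of evolving a bead chain through quaternion operators can be sketched as follows. This is an illustrative example only, not the code distributed with the paper: the helper names (`quat_from_axis_angle`, `quat_mul`, `rotate`, `rotate_tail`) and the choice of rotating the part of the chain after a pivot bead are assumptions. It uses the standard rotation formula v' = q v q*, where q is a unit quaternion.

```python
import math


def quat_from_axis_angle(axis, angle):
    """Unit quaternion (w, x, y, z) for a rotation of `angle` radians about `axis`."""
    ax, ay, az = axis
    norm = math.sqrt(ax * ax + ay * ay + az * az)
    s = math.sin(angle / 2.0) / norm
    return (math.cos(angle / 2.0), ax * s, ay * s, az * s)


def quat_mul(p, q):
    """Hamilton product of two quaternions."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return (
        pw * qw - px * qx - py * qy - pz * qz,
        pw * qx + px * qw + py * qz - pz * qy,
        pw * qy - px * qz + py * qw + pz * qx,
        pw * qz + px * qy - py * qx + pz * qw,
    )


def rotate(v, q):
    """Rotate the 3D vector v by the unit quaternion q: v' = q v q*."""
    w, x, y, z = q
    conj = (w, -x, -y, -z)
    _, rx, ry, rz = quat_mul(quat_mul(q, (0.0, v[0], v[1], v[2])), conj)
    return (rx, ry, rz)


def rotate_tail(chain, pivot, axis, angle):
    """Rotate all beads after index `pivot` about an axis through the pivot bead.

    Bead-to-bead distances are preserved, so the chain stays connected.
    """
    q = quat_from_axis_angle(axis, angle)
    px, py, pz = chain[pivot]
    moved = list(chain[: pivot + 1])
    for (x, y, z) in chain[pivot + 1:]:
        rx, ry, rz = rotate((x - px, y - py, z - pz), q)
        moved.append((rx + px, ry + py, rz + pz))
    return moved


if __name__ == "__main__":
    straight = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
    print(rotate_tail(straight, pivot=1, axis=(0.0, 0.0, 1.0), angle=math.pi / 2))
```

A move of this kind could serve as the proposal step of a Monte Carlo sampler, since it changes the configuration while keeping consecutive beads at fixed distances.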
ABSTRACT
Urban cultivation for food production is of growing importance. The quality of urban soil can be improved by tillage and the incorporation of organic matter, or can be degraded by chemical treatments. Urban gardeners have a role in this process, through the selection of various cultivation techniques. Our study focuses on an allotment area in the town of Pisa (Italy), which since 1995 has been run as a municipal vegetable garden by the residents. We analysed the soil and compared the data with those collected five years previously, to verify the possible changes in soil properties and fertility. We also interviewed the gardeners regarding their backgrounds, motivations and cultivation practices. We looked for possible changes in the soil quality attributable to the cultivation techniques. We found that the allotment holders influenced the soil quality through the cultivation techniques. Organic carbon, electrical conductivity and the content of copper increased unevenly in relation to the gardeners' cultivation practices. At the same time the study highlights that the urban gardeners were not completely aware of how to protect and enhance the fertility and the quality of urban soil. We believe that town councils should be responsible for providing correct information to the allotment holders and thus prevent the possible misuse of urban soil to grow food, as this can affect everyone's health.