1.
Nucleic Acids Res ; 42(15): e122, 2014 Sep.
Article in English | MEDLINE | ID: mdl-25030906

ABSTRACT

Inundation of evolutionary markers expedited in Human Genome Project and 1000 Genome Consortium has necessitated pruning of redundant and dependent variables. Various computational tools based on machine-learning and data-mining methods like feature selection/extraction have been proposed to escape the curse of dimensionality in large datasets. Incidentally, evolutionary studies, primarily based on sequentially evolved variations, have remained un-facilitated by such advances to date. Here, we present a novel approach of recursive feature selection for hierarchical clustering of Y-chromosomal SNPs/haplogroups to select a minimal set of independent markers, sufficient to infer population structure as precisely as deduced by a larger number of evolutionary markers. To validate the applicability of our approach, we optimally designed a MALDI-TOF mass spectrometry-based multiplex to accommodate independent Y-chromosomal markers in a single multiplex and genotyped two geographically distinct Indian populations. An analysis of 105 world-wide populations reflected that 15 independent variations/markers were optimal in defining population structure parameters, such as FST, molecular variance and correlation-based relationship. A subsequent addition of randomly selected markers had a negligible effect (close to zero, i.e. 1 × 10⁻³) on these parameters. The study proves efficient in tracing complex population structures and deriving relationships among world-wide populations in a cost-effective and expedient manner.
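
The idea of pruning redundant and dependent markers can be illustrated with a minimal sketch. This is not the authors' recursive feature selection: it is a plain correlation filter, swapped in for illustration, that keeps a marker only if it varies across populations and is not strongly correlated with a marker already kept. The marker names (M1 to M4), the frequency values, and the 0.9 threshold are all hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def select_independent(freqs, threshold=0.9):
    """Keep markers that vary across populations and are not near-duplicates
    (|r| >= threshold) of any marker kept so far. Order follows the input dict."""
    kept = []
    for name, values in freqs.items():
        if max(values) == min(values):
            continue  # a constant marker carries no population signal
        if all(abs(pearson(values, freqs[k])) < threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical marker frequencies in four populations (one list entry each).
freqs = {
    "M1": [0.90, 0.10, 0.80, 0.20],
    "M2": [0.10, 0.90, 0.80, 0.20],
    "M3": [0.50, 0.50, 0.50, 0.50],  # constant across populations
    "M4": [0.88, 0.12, 0.79, 0.21],  # near-duplicate of M1
}
kept = select_independent(freqs)
print(kept)
```

On this toy table the filter drops the constant marker and the near-duplicate, leaving two markers. A filter of this kind is order-dependent, which is one reason a recursive or clustering-based selection may be preferred in practice.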


Subject(s)
Human Y Chromosomes/chemistry, Molecular Evolution, Cluster Analysis, Genetic Markers, Population Genetics/methods, Genotyping Techniques, Haplotypes, Humans, India, Male, Phylogeny, Principal Component Analysis, White People/genetics
2.
Article in English | MEDLINE | ID: mdl-38083343

ABSTRACT

Whole Slide Images (WSIs) or histopathology images are used in digital pathology. WSIs pose great challenges to deep learning models for clinical diagnosis, owing to their size and lack of pixel-level annotations. With the recent advancements in computational pathology, newer multiple-instance learning-based models have been proposed. Multiple-instance learning for WSIs necessitates creating patches and uses the encoding of these patches for diagnosis. These models use generic pre-trained models (ResNet-50 pre-trained on ImageNet) for patch encoding. The recently proposed KimiaNet, a DenseNet121 model pre-trained on TCGA slides, is a domain-specific pre-trained model. This paper shows the effect of domain-specific pre-training on WSI classification. To investigate the effect of domain-specific pre-training, we considered the current state-of-the-art multiple-instance learning models, 1) CLAM, an attention-based model, and 2) TransMIL, a self-attention-based model, and evaluated the models' confidence and predictive performance in detecting gliomas, which are primary brain tumors. Domain-specific pre-training improves the confidence of the models and also achieves a new state-of-the-art performance of WSI-based glioma subtype classification, showing a high clinical applicability in assisting glioma diagnosis. We will publicly share our code and experimental results at https://github.com/soham-chitnis10/WSI-domain-specific.
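
How patch encodings are combined into a slide-level representation can be sketched with plain attention pooling, the basic mechanism that attention-based multiple-instance learning builds on. This is a simplified illustration and not the CLAM or TransMIL implementation. The projection `V`, the vector `w`, and the two-dimensional patch embeddings are toy values assumed here; in a real model they would be learned and the embeddings would come from the patch encoder.

```python
from math import exp, tanh

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(patch_embeddings, V, w):
    """Score each patch h as w . tanh(V h), normalise the scores with softmax,
    and return the attention-weighted mean embedding plus the weights."""
    scores = []
    for h in patch_embeddings:
        hidden = [tanh(sum(v * x for v, x in zip(row, h))) for row in V]
        scores.append(sum(wi * ui for wi, ui in zip(w, hidden)))
    weights = softmax(scores)
    dim = len(patch_embeddings[0])
    slide = [sum(a * h[d] for a, h in zip(weights, patch_embeddings))
             for d in range(dim)]
    return slide, weights

# Toy data: three patches from one slide, each encoded as a 2-d vector.
patches = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
V = [[1.0, 0.0], [0.0, 1.0]]  # hypothetical attention projection
w = [1.0, 0.5]                # hypothetical attention vector
slide, weights = attention_pool(patches, V, w)
print(slide, weights)
```

Swapping the patch encoder (generic versus domain-specific pre-training) changes only the vectors passed in as `patches`; the pooling step stays the same.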


Subject(s)
Glioma, Humans, Mental Processes, Records
3.
Comput Biol Med ; 109: 14-21, 2019 Jun.
Article in English | MEDLINE | ID: mdl-31030180

ABSTRACT

Automatic diagnosis of cardiac events is a current problem of interest in which deep learning has shown promising success. We have earlier reported the use of Long Short-Term Memory (LSTM) networks, trained on normal ECG patterns, to detect anomalies from their prediction errors in real-time diagnostic applications. In this work, we extend our anomaly detection algorithm by introducing a second-stage predictor that can identify the actual anomaly class from the error outputs of the first-stage model. Results for seven types of anomalies are presented: Atrial Premature Contraction (APC), Paced Beat (PB), Premature Ventricular Contraction (PVC), Right Bundle Branch Block (RBBB), Ventricular Bigeminy (VB), Ventricular Couplets (VCs) and Ventricular Tachycardia (VT). To optimize anomaly class prediction performance, multiple choices of second-stage model, such as multilayer perceptron (MLP), support vector machine (SVM) and logistic regression, were employed. A featurization scheme that represents LSTM prediction errors as overall summaries is proposed, and a predictor built on these features performed well. Our results indicate that the error vectors, represented by their summary features, carry useful predictive information about the actual ECG anomaly type. We discuss how accuracy scores reported without attention to inherent class imbalances and the paucity of data instances may produce misleading performance estimates; accurate background models are therefore needed to estimate the true predictive performance of multi-class predictors such as those presented in this work. The training data sets and related resources for this study are provided at http://ecg.sciwhylab.org.
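
Two points in this abstract lend themselves to a short sketch: turning a vector of prediction errors into summary features, and checking accuracy against a background rate under class imbalance. The abstract does not list which summaries were used, so the four statistics below are assumptions chosen for illustration, and the majority-class baseline is only the simplest possible background model. The error values and label counts are made up.

```python
from collections import Counter
from statistics import mean, pstdev

def summarize_errors(errors):
    """Collapse a sequence of first-stage prediction errors into a few
    summary features (an assumed feature set, not the paper's)."""
    abs_err = [abs(e) for e in errors]
    return {
        "mean": mean(errors),
        "std": pstdev(errors),
        "max_abs": max(abs_err),
        "mean_abs": mean(abs_err),
    }

def majority_baseline_accuracy(labels):
    """Accuracy obtained by always predicting the most frequent class."""
    counts = Counter(labels)
    return counts.most_common(1)[0][1] / len(labels)

# Hypothetical prediction errors for one ECG segment.
features = summarize_errors([0.1, -0.2, 0.05, 1.5, -1.2])

# Hypothetical, heavily imbalanced anomaly labels.
labels = ["PVC"] * 8 + ["APC"] + ["VT"]
baseline = majority_baseline_accuracy(labels)
print(features, baseline)
```

With these toy labels a classifier that never looks at the signal already scores 0.8, so a reported accuracy is informative only to the extent that it exceeds such a baseline.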


Subject(s)
Cardiac Arrhythmias/diagnosis, Cardiac Arrhythmias/physiopathology, Electrocardiography, Cardiovascular Models, Computer-Assisted Signal Processing, Humans
4.
Front Neuroinform ; 13: 53, 2019.
Article in English | MEDLINE | ID: mdl-31417388

ABSTRACT

Stroke causes behavioral deficits in multiple cognitive domains and there is a growing interest in predicting patient performance from neuroimaging data using machine learning techniques. Here, we investigated a deep learning approach based on convolutional neural networks (CNNs) for predicting the severity of language disorder from 3D lesion images from magnetic resonance imaging (MRI) in a heterogeneous sample of stroke patients. CNN performance was compared to that of conventional (shallow) machine learning methods, including ridge regression (RR) on the images' principal components and support vector regression. We also devised a hybrid method based on re-using CNN's high-level features as additional input to the RR model. Predictive accuracy of the four different methods was further investigated in relation to the size of the training set and the level of redundancy across lesion images in the dataset, which was evaluated in terms of location and topological properties of the lesions. The Hybrid model achieved the best performance in most cases, thereby suggesting that the high-level features extracted by CNNs are complementary to principal component analysis features and improve the model's predictive accuracy. Moreover, our analyses indicate that both the size of training data and image redundancy are critical factors in determining the accuracy of a computational model in predicting behavioral outcome from the structural brain imaging data of stroke patients.
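
The hybrid method can be pictured as ridge regression on a feature vector that concatenates principal-component features with high-level CNN features. The sketch below is a toy under stated assumptions and not the authors' pipeline: the feature values are invented, there is no intercept (features and scores are assumed to be centered), and the regularization strength `lam` is arbitrary.

```python
def hybrid_features(pca_feats, cnn_feats):
    """Concatenate per-patient PCA features with per-patient CNN features."""
    return [p + c for p, c in zip(pca_feats, cnn_feats)]

def ridge_fit(X, y, lam):
    """Solve (X^T X + lam * I) beta = X^T y by Gauss-Jordan elimination."""
    p = len(X[0])
    # Augmented matrix [X^T X + lam * I | X^T y].
    M = [
        [sum(r[i] * r[j] for r in X) + (lam if i == j else 0.0) for j in range(p)]
        + [sum(r[i] * t for r, t in zip(X, y))]
        for i in range(p)
    ]
    for col in range(p):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, p), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(p):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][p] / M[i][i] for i in range(p)]

# Toy data: four patients, one PCA feature and one CNN feature each.
pca = [[1.0], [2.0], [3.0], [4.0]]
cnn = [[0.5], [-1.0], [0.0], [2.0]]
X = hybrid_features(pca, cnn)
y = [2 * a[0] + 3 * b[0] for a, b in zip(pca, cnn)]  # synthetic severity score
beta = ridge_fit(X, y, lam=1e-9)
print(beta)
```

Because the synthetic score depends on both feature groups, a model restricted to the PCA column alone could not fit it, which mirrors the abstract's point that the two feature sets are complementary.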

5.
Appl Netw Sci ; 2(1): 2, 2017.
Article in English | MEDLINE | ID: mdl-30533510

ABSTRACT

Interactomes such as protein interaction networks have many undiscovered links between entities. Experimental verification of every link in these networks is prohibitively expensive, and therefore computational methods to direct the search for possible links are of great value. The problem of finding undiscovered links in a network is also referred to as the link prediction problem. A popular approach for link prediction has been to formulate it as a binary classification problem in which class labels indicate the existence or absence of a link (we refer to these as positive links or negative links respectively) between a pair of nodes in the network. Researchers have successfully applied such supervised classification techniques to determine the presence of links in protein interaction networks. However, it is quite common for protein-protein interaction (PPI) networks to have a large proportion of undiscovered links. Thus, a link prediction approach could incorrectly treat undiscovered positive links as negative links, thereby introducing a bias in the learning. In this paper, we propose to denoise the class of negative links in the training data via a Gaussian process anomaly detector. We show that this significantly reduces the noise due to mislabelled negative links and improves the resulting link prediction accuracy. We evaluate the approach by introducing synthetic noise into the PPI networks and measuring how accurately we can reconstruct the original PPI networks using classifiers trained on both noisy and denoised data. Experiments were performed with five different PPI network datasets and the results indicate a significant reduction in bias due to label noise, and more importantly, a significant improvement in the accuracy of detecting missing links via classification.
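
The evaluation protocol described here (inject synthetic label noise, then denoise the negative class) can be sketched as follows. The anomaly score is a deliberate stand-in: instead of a Gaussian process detector, it uses the mean Gaussian-kernel similarity of a candidate pair to the known positive links, and removes negatives that look too much like positives. The two-dimensional pair features, the bandwidth, and the threshold are all hypothetical.

```python
import random
from math import exp

def inject_label_noise(positives, negatives, fraction, seed=0):
    """Relabel a random fraction of positive links as negatives."""
    rng = random.Random(seed)
    k = int(len(positives) * fraction)
    flipped = rng.sample(positives, k)
    kept_pos = [p for p in positives if p not in flipped]
    return kept_pos, negatives + flipped, flipped

def positive_likeness(x, positives, bandwidth=1.0):
    """Mean Gaussian-kernel similarity between x and the positive links."""
    total = 0.0
    for p in positives:
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, p))
        total += exp(-sq_dist / (2 * bandwidth ** 2))
    return total / len(positives)

def denoise_negatives(positives, negatives, threshold=0.5, bandwidth=1.0):
    """Drop negatives whose similarity to the positive class is suspicious."""
    return [n for n in negatives
            if positive_likeness(n, positives, bandwidth) < threshold]

# Hypothetical feature vectors for node pairs.
positives = [(1.0, 1.0), (1.1, 0.9), (0.9, 1.1), (1.0, 0.9)]
negatives = [(-1.0, -1.0), (-0.9, -1.1), (-1.1, -0.9)]

noisy_pos, noisy_neg, flipped = inject_label_noise(positives, negatives, 0.25)
cleaned = denoise_negatives(noisy_pos, noisy_neg)
print(len(noisy_neg), len(cleaned))
```

In this well-separated toy example the filter recovers the original negative set exactly; on real PPI features the classes overlap, so the threshold trades off removing mislabelled links against discarding genuine negatives.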
