Results 1 - 12 of 12
1.
Bioinformatics ; 40(Supplement_1): i428-i436, 2024 Jun 28.
Article in English | MEDLINE | ID: mdl-38940171

ABSTRACT

MOTIVATION: Cross-linking tandem mass spectrometry (XL-MS/MS) is an established analytical platform used to determine distance constraints between residues within a protein or from physically interacting proteins, thus improving our understanding of protein structure and function. To aid biological discovery with XL-MS/MS, it is essential that pairs of chemically linked peptides be accurately identified, a process that requires: (i) database search, that creates a ranked list of candidate peptide pairs for each experimental spectrum and (ii) false discovery rate (FDR) estimation, that determines the probability of a false match in a group of top-ranked peptide pairs with scores above a given threshold. Currently, the only available FDR estimation mechanism in XL-MS/MS is the target-decoy approach (TDA). However, despite its simplicity, TDA has both theoretical and practical limitations that impact the estimation accuracy and increase run time over potential decoy-free approaches (DFAs). RESULTS: We introduce a novel decoy-free framework for FDR estimation in XL-MS/MS. Our approach relies on multi-sample mixtures of skew normal distributions, where the latent components correspond to the scores of correct peptide pairs (both peptides identified correctly), partially incorrect peptide pairs (one peptide identified correctly, the other incorrectly), and incorrect peptide pairs (both peptides identified incorrectly). To learn these components, we exploit the score distributions of first- and second-ranked peptide-spectrum matches for each experimental spectrum and subsequently estimate FDR using a novel expectation-maximization algorithm with constraints. We evaluate the method on ten datasets and provide evidence that the proposed DFA is theoretically sound and a viable alternative to TDA owing to its good performance in terms of accuracy, variance of estimation, and run time. AVAILABILITY AND IMPLEMENTATION: https://github.com/shawn-peng/xlms.


Subjects
Algorithms, Protein Databases, Proteomics, Tandem Mass Spectrometry, Tandem Mass Spectrometry/methods, Proteomics/methods, Peptides/chemistry, Proteins/chemistry
2.
Bioinformatics ; 36(Suppl_2): i745-i753, 2020 Dec 30.
Article in English | MEDLINE | ID: mdl-33381824

ABSTRACT

MOTIVATION: Accurate estimation of false discovery rate (FDR) of spectral identification is a central problem in mass spectrometry-based proteomics. Over the past two decades, target-decoy approaches (TDAs) and decoy-free approaches (DFAs) have been widely used to estimate FDR. TDAs use a database of decoy species to faithfully model score distributions of incorrect peptide-spectrum matches (PSMs). DFAs, on the other hand, fit two-component mixture models to learn the parameters of correct and incorrect PSM score distributions. While conceptually straightforward, both approaches lead to problems in practice, particularly in experiments that push instrumentation to the limit and generate low fragmentation-efficiency and low signal-to-noise-ratio spectra. RESULTS: We introduce a new decoy-free framework for FDR estimation that generalizes present DFAs while exploiting more search data in a manner similar to TDAs. Our approach relies on multi-component mixtures, in which score distributions corresponding to the correct PSMs, best incorrect PSMs and second-best incorrect PSMs are modeled by the skew normal family. We derive EM algorithms to estimate parameters of these distributions from the scores of best and second-best PSMs associated with each experimental spectrum. We evaluate our models on multiple proteomics datasets and a HeLa cell digest case study consisting of more than a million spectra in total. We provide evidence of improved performance over existing DFAs and improved stability and speed over TDAs without any performance degradation. We propose that the new strategy has the potential to extend beyond peptide identification and reduce the need for TDA on all analytical platforms. AVAILABILITY AND IMPLEMENTATION: https://github.com/shawn-peng/FDR-estimation. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
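
As a rough illustration of the decoy-free idea in the abstract above, the sketch below estimates FDR at a score threshold from an already-fitted two-component mixture. It is not the authors' implementation: it substitutes plain normal components for the skew normal family, assumes the parameters were estimated beforehand (for example by EM), and the names `mixture_fdr`, `alpha`, `correct`, and `incorrect` are hypothetical.

```python
from statistics import NormalDist


def mixture_fdr(threshold, alpha, correct, incorrect):
    """Estimate FDR among matches scoring at or above `threshold`.

    alpha     -- assumed mixing weight of correct matches (0..1)
    correct   -- NormalDist fitted to scores of correct matches
    incorrect -- NormalDist fitted to scores of incorrect matches
    """
    # Weighted tail mass of each component above the threshold.
    tail_correct = alpha * (1.0 - correct.cdf(threshold))
    tail_incorrect = (1.0 - alpha) * (1.0 - incorrect.cdf(threshold))
    total = tail_correct + tail_incorrect
    if total == 0.0:
        return 0.0  # no matches are expected above the threshold
    return tail_incorrect / total


# Hypothetical fitted parameters, for illustration only.
correct = NormalDist(mu=3.0, sigma=1.0)
incorrect = NormalDist(mu=0.0, sigma=1.0)
fdr_at_1_5 = mixture_fdr(1.5, alpha=0.5, correct=correct, incorrect=incorrect)
```

Raising the threshold lowers the estimated FDR because the incorrect component loses tail mass faster than the correct one; no decoy database is needed once the mixture has been fitted.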


Subjects
Proteomics, Tandem Mass Spectrometry, Algorithms, Protein Databases, HeLa Cells, Humans, Peptides
3.
PLoS Comput Biol ; 12(8): e1005091, 2016 Aug.
Article in English | MEDLINE | ID: mdl-27564311

ABSTRACT

Elucidating the precise molecular events altered by disease-causing genetic variants represents a major challenge in translational bioinformatics. To this end, many studies have investigated the structural and functional impact of amino acid substitutions. Most of these studies were however limited in scope to either individual molecular functions or were concerned with functional effects (e.g. deleterious vs. neutral) without specifically considering possible molecular alterations. The recent growth of structural, molecular and genetic data presents an opportunity for more comprehensive studies to consider the structural environment of a residue of interest, to hypothesize specific molecular effects of sequence variants and to statistically associate these effects with genetic disease. In this study, we analyzed data sets of disease-causing and putatively neutral human variants mapped to protein 3D structures as part of a systematic study of the loss and gain of various types of functional attribute potentially underlying pathogenic molecular alterations. We first propose a formal model to assess probabilistically function-impacting variants. We then develop an array of structure-based functional residue predictors, evaluate their performance, and use them to quantify the impact of disease-causing amino acid substitutions on catalytic activity, metal binding, macromolecular binding, ligand binding, allosteric regulation and post-translational modifications. We show that our methodology generates actionable biological hypotheses for up to 41% of disease-causing genetic variants mapped to protein structures suggesting that it can be reliably used to guide experimental validation. Our results suggest that a significant fraction of disease-causing human variants mapping to protein structures are function-altering both in the presence and absence of stability disruption.


Subjects
Amino Acid Sequence/genetics, Disease/genetics, Statistical Models, Mutation/genetics, Amino Acid Substitution/genetics, Computational Biology, Computer Simulation, Humans, Molecular Models, Protein Binding
4.
bioRxiv ; 2024 Jun 17.
Article in English | MEDLINE | ID: mdl-38798479

ABSTRACT

Continued advances in variant effect prediction are necessary to demonstrate the ability of machine learning methods to accurately determine the clinical impact of variants of unknown significance (VUS). Towards this goal, the ARSA Critical Assessment of Genome Interpretation (CAGI) challenge was designed to characterize progress by utilizing 219 experimentally assayed missense VUS in the Arylsulfatase A (ARSA) gene to assess the performance of community-submitted predictions of variant functional effects. The challenge involved 15 teams, and evaluated additional predictions from established and recently released models. Notably, a model developed by participants of a genetics and coding bootcamp, trained with standard machine-learning tools in Python, demonstrated superior performance among submissions. Furthermore, the study observed that state-of-the-art deep learning methods provided small but statistically significant improvement in predictive performance compared to less elaborate techniques. These findings underscore the utility of variant effect prediction, and the potential for models trained with modest resources to accurately classify VUS in genetic and clinical research.

5.
Pac Symp Biocomput ; 28: 311-322, 2023.
Article in English | MEDLINE | ID: mdl-36540987

ABSTRACT

Data biases are a known impediment to the development of trustworthy machine learning models and their application to many biomedical problems. When biased data is suspected, the assumption that the labeled data is representative of the population must be relaxed, and methods that exploit typically representative unlabeled data must be developed. To mitigate the adverse effects of unrepresentative data, we consider a binary semi-supervised setting and focus on identifying whether the labeled data is biased and to what extent. We assume that the class-conditional distributions were generated by a family of component distributions represented at different proportions in labeled and unlabeled data. We also assume that the training data can be transformed to and subsequently modeled by a nested mixture of multivariate Gaussian distributions. We then develop a multi-sample expectation-maximization algorithm that learns all individual and shared parameters of the model from the combined data. Using these parameters, we develop a statistical test for the presence of the general form of bias in labeled data and estimate the level of this bias by computing the distance between corresponding class-conditional distributions in labeled and unlabeled data. We first study the new methods on synthetic data to understand their behavior and then apply them to real-world biomedical data to provide evidence that the bias estimation procedure is both possible and effective.


Subjects
Algorithms, Computational Biology, Humans, Computational Biology/methods, Machine Learning, Supervised Machine Learning
6.
Pac Symp Biocomput ; 28: 209-220, 2023.
Article in English | MEDLINE | ID: mdl-36540978

ABSTRACT

Racial and ethnic disparities in adverse pregnancy outcomes (APOs) have been well-documented in the United States, but the extent to which the disparities are present in high-risk subgroups has not been studied. To address this problem, we first applied association rule mining to the clinical data derived from the prospective nuMoM2b study cohort to identify subgroups at increased risk of developing four APOs (gestational diabetes, hypertension acquired during pregnancy, preeclampsia, and preterm birth). We then quantified racial/ethnic disparities within the cohort as well as within high-risk subgroups to assess potential effects of risk-reduction strategies. We identify significant differences in distributions of major risk factors across racial/ethnic groups and find surprising heterogeneity in APO prevalence across these populations, both in the cohort and in its high-risk subgroups. Our results suggest that risk-reducing strategies that simultaneously reduce disparities may require targeting of high-risk subgroups with considerations for the population context.
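
For readers unfamiliar with association rule mining, the sketch below computes the three standard rule statistics (support, confidence, lift) for a rule of the form "risk factors -> outcome". It is a generic illustration, not the study's pipeline: the function name `rule_metrics`, the attribute labels, and the toy records are all made up.

```python
def rule_metrics(records, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent.

    records    -- list of sets of attribute labels, one set per participant
    antecedent -- set of risk-factor labels defining the subgroup
    consequent -- label of the outcome of interest
    """
    n = len(records)
    if n == 0:
        return 0.0, 0.0, 0.0
    in_subgroup = [r for r in records if antecedent <= r]
    with_outcome = [r for r in records if consequent in r]
    both = [r for r in in_subgroup if consequent in r]

    support = len(both) / n                      # how common the rule is overall
    confidence = len(both) / len(in_subgroup) if in_subgroup else 0.0
    baseline = len(with_outcome) / n             # outcome rate in the whole cohort
    lift = confidence / baseline if baseline else 0.0
    return support, confidence, lift


# Made-up toy records, not data from the study.
records = [
    {"obesity", "age>=35", "GDM"},
    {"obesity", "age>=35"},
    {"obesity", "GDM"},
    {"age>=35"},
    set(),
    {"GDM"},
    {"obesity", "age>=35", "GDM"},
    set(),
]
support, confidence, lift = rule_metrics(records, {"obesity", "age>=35"}, "GDM")
```

A lift above 1 marks a subgroup whose outcome rate exceeds the cohort-wide rate, which is the sense in which a subgroup would be called high-risk here.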


Subjects
Pregnancy Outcome, Premature Birth, Pregnancy, Female, Newborn Infant, Humans, United States, Premature Birth/epidemiology, Premature Birth/etiology, Prospective Studies, Computational Biology, Risk Factors
7.
Pac Symp Biocomput ; 28: 323-334, 2023.
Article in English | MEDLINE | ID: mdl-36540988

ABSTRACT

The accurate interpretation of genetic variants is essential for clinical actionability. However, a majority of variants remain of uncertain significance. Multiplexed assays of variant effects (MAVEs) can help provide functional evidence for variants of uncertain significance (VUS) at the scale of entire genes. Although the systematic prioritization of genes for such assays has been of great interest from the clinical perspective, existing strategies have rarely emphasized this motivation. Here, we propose three objectives for quantifying the importance of genes, each satisfying a specific clinical goal: (1) Movability scores to prioritize genes with the most VUS moving to non-VUS categories, (2) Correction scores to prioritize genes with the most pathogenic and/or benign variants that could be reclassified, and (3) Uncertainty scores to prioritize genes with VUS for which variant pathogenicity predictors used in clinical classification exhibit the greatest uncertainty. We demonstrate that existing approaches are sub-optimal when considering these explicit clinical objectives. We also propose a combined weighted score that optimizes the three objectives simultaneously and finds optimal weights to improve over existing approaches. Our strategy generally results in better performance than existing knowledge-driven and data-driven strategies and yields gene sets that are clinically relevant. Our work has implications for systematic efforts that aim to iterate between predictor development, experimentation and translation to the clinic.


Subjects
Genetic Predisposition to Disease, Genetic Testing, Humans, Genetic Testing/methods, Genetic Variation, Computational Biology/methods
8.
Stud Health Technol Inform ; 182: 104-13, 2012.
Article in English | MEDLINE | ID: mdl-23138085

ABSTRACT

Body sensor networks can be used for health monitoring of patients by expert medical doctors, in remote locations like rural areas in developing countries, and can also be used to provide medical aid to areas affected by natural disasters in any part of the world. An important issue to be addressed, when the number of patients is large, is to reliably maintain the patient records and have simple automated mobile applications for healthcare helpers to use. We propose an automated healthcare architecture using NFC-enabled mobile phones and patients having their patient ID on RFID tags. It utilizes NFC-enabled mobile phones to read the patient ID, followed by automated gathering of healthcare vital parameters from body sensors using Bluetooth, analyses the information and transmits it to a medical server for expert feedback. With limited hospital resources and less training requirement for healthcare helpers through simpler applications, this automation of healthcare processing can provide time effective and reliable mass health consultation from medical experts in remote locations.


Subjects
Cell Phone, Electronic Health Records/instrumentation, Electronic Health Records/organization & administration, Ambulatory Monitoring/instrumentation, Rural Health Services/organization & administration, Health Personnel/organization & administration, Humans, Radio Frequency Identification Device/methods
9.
Pac Symp Biocomput ; 24: 124-135, 2019.
Article in English | MEDLINE | ID: mdl-30864316

ABSTRACT

Accurately estimating performance accuracy of machine learning classifiers is of fundamental importance in biomedical research with potentially societal consequences upon the deployment of best-performing tools in everyday life. Although classification has been extensively studied over the past decades, there remain understudied problems when the training data violate the main statistical assumptions relied upon for accurate learning and model characterization. This particularly holds true in the open world setting where observations of a phenomenon generally guarantee its presence but the absence of such evidence cannot be interpreted as the evidence of its absence. Learning from such data is often referred to as positive-unlabeled learning, a form of semi-supervised learning where all labeled data belong to one (say, positive) class. To improve the best practices in the field, we here study the quality of estimated performance in positive-unlabeled learning in the biomedical domain. We provide evidence that such estimates can be wildly inaccurate, depending on the fraction of positive examples in the unlabeled data and the fraction of negative examples mislabeled as positives in the labeled data. We then present correction methods for four such measures and demonstrate that the knowledge or accurate estimates of class priors in the unlabeled data and noise in the labeled data are sufficient for the recovery of true classification performance. We provide theoretical support as well as empirical evidence for the efficacy of the new performance estimation methods.
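
The dependence on the class prior can be seen in a small sketch. It assumes the simplest case: noise-free labeled positives and an unlabeled set that mixes positives (fraction `alpha`) with negatives. Under that assumption, the false positive rate measured by treating unlabeled examples as negatives equals `alpha * TPR + (1 - alpha) * FPR`, which can be inverted. The function name and numbers are hypothetical, and this is not claimed to be the paper's exact correction, which also accounts for label noise.

```python
def corrected_fpr(fpr_pu, tpr, alpha):
    """Recover the true false positive rate from a positive-unlabeled estimate.

    fpr_pu -- fraction of unlabeled examples predicted positive
    tpr    -- true positive rate, estimated on (assumed clean) labeled positives
    alpha  -- assumed fraction of positives in the unlabeled data, 0 <= alpha < 1
    """
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must be in [0, 1)")
    # fpr_pu = alpha * tpr + (1 - alpha) * fpr, solved for fpr.
    fpr = (fpr_pu - alpha * tpr) / (1.0 - alpha)
    # Sampling noise can push the estimate slightly outside [0, 1].
    return min(1.0, max(0.0, fpr))


# Hypothetical numbers: true FPR 0.10, TPR 0.80, 20% positives in unlabeled data.
observed = 0.2 * 0.80 + 0.8 * 0.10  # what a naive PU evaluation would report
recovered = corrected_fpr(observed, tpr=0.80, alpha=0.2)
```

In this toy case the naive estimate (0.24) is more than double the true false positive rate (0.10), and the gap grows with `alpha`.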


Subjects
Classification/methods, Machine Learning, Algorithms, Computational Biology/methods, Computer Simulation, Humans, Machine Learning/statistics & numerical data, Statistical Models
10.
J Clin Orthop Trauma ; 10(3): 497-502, 2019.
Article in English | MEDLINE | ID: mdl-31061576

ABSTRACT

BACKGROUND: The Nottingham Clavicle Score (NCS) has been recently described for functional outcome assessment after injuries to the clavicle and the acromioclavicular joint. However, validity and responsiveness are context-specific psychometric terms, and the NCS has not been previously described in surgically treated clavicle shaft fractures. The aim of the present study was to investigate the validity and responsiveness of the NCS in clavicle fractures treated with titanium flexible nailing. METHODS: This prospective study was undertaken on consecutively operated clavicle shaft fractures treated with titanium elastic nail from November 2013 to August 2016. Functional assessment using NCS was done at two and six months postoperatively. Construct validity was also evaluated by formulating the null hypothesis that there would be no difference in NCS at six months after open and closed reduction and in 15B1 and 15B2 fracture sub-types. The above two hypotheses were formulated based on previous studies that used Constant score and DASH score. Pre-specified hypotheses and results in accordance with the hypotheses suggest satisfactory construct validity. Responsiveness was evaluated using standardized response mean (SRM) and effect size (ES). ES and SRM values ≥0.80 suggest satisfactory responsiveness. The proportion of patients having the least possible score of 0 points (floor effect) and the highest possible score of 100 points (ceiling effect) was evaluated at two and six months postoperatively. Floor and ceiling effect of <15% suggests satisfactory internal validity. RESULTS: Thirty-six consecutively operated patients were included in the study. The NCS at two months and six months was 69.6 ± 9.6 and 87.2 ± 7.1 respectively. The NCS at six months after fixation was 88.7 ± 4.8 in the closed reduction cohort and 84.7 ± 9.4 in the open reduction cohort, and this difference was not significant (p = 0.1). The NCS at six months after fixation was 85.3 ± 8.3 in 15B1 clavicular fractures and 89.7 ± 4.0 in 15B2 clavicular shaft fractures, and this difference was also not significant (p = 0.07). All results pertaining to construct validity were in accordance with our hypotheses, thereby suggesting that NCS demonstrates satisfactory construct validity. The ES and SRM were 1.8 and 2.6 respectively. NCS showed no ceiling (0%) or floor effect (0%) at two and six months postoperatively, thereby suggesting adequate internal validity of the NCS. CONCLUSION: NCS has satisfactory construct validity, internal validity and responsiveness in surgically treated clavicle shaft fractures with titanium elastic nailing.
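
The two responsiveness statistics can be sketched as follows, assuming the common definitions: effect size (ES) is the mean change divided by the standard deviation of the baseline scores, and standardized response mean (SRM) is the mean change divided by the standard deviation of the change scores. These definitions are an assumption, since the abstract does not state them, although the reported ES agrees with the summary values: (87.2 - 69.6) / 9.6 is approximately 1.8. The paired scores in the code are made up.

```python
from statistics import mean, stdev


def responsiveness(baseline, follow_up):
    """Return (effect_size, standardized_response_mean) for paired scores."""
    if len(baseline) != len(follow_up):
        raise ValueError("scores must be paired")
    change = [after - before for before, after in zip(baseline, follow_up)]
    mean_change = mean(change)
    effect_size = mean_change / stdev(baseline)  # scaled by baseline spread
    srm = mean_change / stdev(change)            # scaled by spread of the change
    return effect_size, srm


# Hypothetical paired scores at two and six months, not study data.
two_months = [60, 65, 70, 75, 80]
six_months = [80, 82, 88, 90, 95]
es, srm = responsiveness(two_months, six_months)
```

SRM exceeds ES whenever patients improve by similar amounts, because the change scores then vary less than the baseline scores do; the abstract's values (ES 1.8, SRM 2.6) show the same pattern.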

11.
Int J Clin Pediatr Dent ; 11(6): 505-509, 2018.
Article in English | MEDLINE | ID: mdl-31303738

ABSTRACT

BACKGROUND: Pulpotomy is the treatment for cariously exposed vital primary molars. The use of formocresol as a pulpotomy agent has been controversial, which has triggered the search for better alternatives. Products such as Myristica fragrans (MF)-Nutmeg gel and Terminalia chebula (TC)-Myrobolan gel are gaining popularity as herbal pulpotomy agents. AIM: To evaluate and compare the clinical and radiographic success of the herbal gels Myristica fragrans (MF)-Nutmeg and Terminalia chebula (TC)-Myrobolan as pulpotomy medicaments in primary teeth. MATERIALS AND METHODS: Twenty participants (n = 20), each with at least two primary molars requiring pulpotomy, were selected and divided into two test groups. In 10 children, Terminalia chebula gel was placed on one side and formocresol on the other side. The remaining 10 children were treated with Myristica fragrans gel on one side and formocresol on the other side. The treated teeth selected for clinical and radiographic evaluation were monitored at 3, 6 and 12 months. RESULTS: At the 12-month follow-up, no significant difference in efficacy was found among the three pulpotomy medicaments, i.e. Nutmeg, Myrobolan, and Formocresol. CONCLUSION: Herbal gels have a promising role in dentistry; with proper knowledge of their effects on teeth, they could prove to be successful dental therapeutic agents. HOW TO CITE THIS ARTICLE: Mali S, Singla S, Sharma A, Gautam A, Niranjan B, Jain S. Efficacy of Myristica fragrans and Terminalia chebula as Pulpotomy Agents in Primary Teeth: A Clinical Study. Int J Clin Pediatr Dent, 2018;11(6):505-509.

12.
J Hand Surg Asian Pac Vol ; 21(1): 109-12, 2016 Feb.
Article in English | MEDLINE | ID: mdl-27454514

ABSTRACT

Intraosseous ganglion cyst is a rare bone tumor, and the lesion can often be missed. Because the diagnosis can be delayed, proper radiologic investigation and a high index of suspicion are necessary. Differential diagnoses of a painful cystic radiolucent carpal lesion are osteoid osteoma, osteoblastoma and intraosseous ganglion. Curettage of the scaphoid lesion and filling of the void with bone graft provide good functional outcomes. The cyst contains mucoid viscous material without epithelial or synovial lining. We present the case of a 30-year-old male with an intraosseous ganglion cyst of the scaphoid, which was treated with curettage and bone grafting. Ganglion cysts are rarely found in the small bones of the hand, but they should be considered in the differential diagnosis of chronic radial wrist pain.


Subjects
Arthralgia/etiology, Bone Cysts/surgery, Scaphoid Bone/surgery, Adult, Bone Cysts/diagnostic imaging, Humans, Male, Scaphoid Bone/diagnostic imaging, Wrist Joint/diagnostic imaging, Wrist Joint/surgery