1.
Radiol Imaging Cancer ; 6(1): e230033, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38180338

ABSTRACT

Purpose To describe the design, conduct, and results of the Breast Multiparametric MRI for prediction of neoadjuvant chemotherapy Response (BMMR2) challenge. Materials and Methods The BMMR2 computational challenge opened on May 28, 2021, and closed on December 21, 2021. The goal of the challenge was to identify image-based markers derived from multiparametric breast MRI, including diffusion-weighted imaging (DWI) and dynamic contrast-enhanced (DCE) MRI, along with clinical data for predicting pathologic complete response (pCR) following neoadjuvant treatment. Data included 573 breast MRI studies from 191 women (mean age [±SD], 48.9 years ± 10.56) in the I-SPY 2/American College of Radiology Imaging Network (ACRIN) 6698 trial (ClinicalTrials.gov: NCT01042379). The challenge cohort was split into training (60%) and test (40%) sets, with teams blinded to test set pCR outcomes. Prediction performance was evaluated by area under the receiver operating characteristic curve (AUC) and compared with the benchmark established from the ACRIN 6698 primary analysis. Results Eight teams submitted final predictions. Entries from three teams had point estimators of AUC that were higher than the benchmark performance (AUC, 0.782 [95% CI: 0.670, 0.893]), with AUCs of 0.803 [95% CI: 0.702, 0.904], 0.838 [95% CI: 0.748, 0.928], and 0.840 [95% CI: 0.748, 0.932]. A variety of approaches were used, ranging from extraction of individual features to deep learning and artificial intelligence methods, incorporating DCE and DWI alone or in combination. Conclusion The BMMR2 challenge identified several models with high predictive performance, which may further expand the value of multiparametric breast MRI as an early marker of treatment response. Clinical trial registration no. NCT01042379. Keywords: MRI, Breast, Tumor Response. Supplemental material is available for this article. © RSNA, 2024.


Subject(s)
Breast Neoplasms , Multiparametric Magnetic Resonance Imaging , Female , Humans , Middle Aged , Artificial Intelligence , Breast Neoplasms/diagnostic imaging , Breast Neoplasms/drug therapy , Magnetic Resonance Imaging , Neoadjuvant Therapy , Pathologic Complete Response , Adult
2.
JMIR Form Res ; 7: e42930, 2023 May 02.
Article in English | MEDLINE | ID: mdl-36989460

ABSTRACT

BACKGROUND: The outbreak of the COVID-19 pandemic had a major effect on the consumption of health care services. Changes in the use of routine diagnostic exams, increased incidences of postacute COVID-19 syndrome (PCS), and other pandemic-related factors may have influenced detected clinical conditions. OBJECTIVE: This study aimed to analyze the impact of COVID-19 on the use of outpatient medical imaging services and clinical findings therein, specifically focusing on the time period after the launch of the Israeli COVID-19 vaccination campaign. In addition, the study tested whether the observed gains in abnormal findings may be linked to PCS or COVID-19 vaccination. METHODS: Our data set included 572,480 ambulatory medical imaging patients in a national health organization from January 1, 2019, to August 31, 2021. We compared different measures of medical imaging utilization and clinical findings therein before and after the surge of the pandemic to identify significant changes. We also inspected the changes in the rate of abnormal findings during the pandemic after adjusting for changes in medical imaging utilization. Finally, for imaging classes that showed increased rates of abnormal findings, we measured the causal associations between SARS-CoV-2 infection, COVID-19-related hospitalization (indicative of COVID-19 complications), and COVID-19 vaccination and future risk for abnormal findings. To adjust for a multitude of confounding factors, we used causal inference methodologies. RESULTS: After the initial drop in the utilization of routine medical imaging due to the first COVID-19 wave, the number of these exams has increased but with lower proportions of older patients, patients with comorbidities, women, and vaccine-hesitant patients. Furthermore, we observed significant gains in the rate of abnormal findings, specifically in musculoskeletal magnetic resonance (MR-MSK) and brain computed tomography (CT-brain) exams. 
These results also persisted after adjusting for the changes in medical imaging utilization. Demonstrated causal associations included the following: SARS-CoV-2 infection increasing the risk for an abnormal finding in a CT-brain exam (odds ratio [OR] 1.4, 95% CI 1.1-1.7) and COVID-19-related hospitalization increasing the risk for abnormal findings in an MR-MSK exam (OR 3.1, 95% CI 1.9-5.3). CONCLUSIONS: COVID-19 impacted the use of ambulatory imaging exams, with greater avoidance among patients at higher risk for COVID-19 complications: older patients, patients with comorbidities, and nonvaccinated patients. Causal analysis results imply that PCS may have contributed to the observed gains in abnormal findings in MR-MSK and CT-brain exams.

3.
JAMA Netw Open ; 6(2): e230524, 2023 02 01.
Article in English | MEDLINE | ID: mdl-36821110

ABSTRACT

Importance: An accurate and robust artificial intelligence (AI) algorithm for detecting cancer in digital breast tomosynthesis (DBT) could significantly improve detection accuracy and reduce health care costs worldwide. Objectives: To make training and evaluation data for the development of AI algorithms for DBT analysis available, to develop well-defined benchmarks, and to create publicly available code for existing methods. Design, Setting, and Participants: This diagnostic study is based on a multi-institutional international grand challenge in which research teams developed algorithms to detect lesions in DBT. A data set of 22 032 reconstructed DBT volumes was made available to research teams. Phase 1, in which teams were provided 700 scans from the training set, 120 from the validation set, and 180 from the test set, took place from December 2020 to January 2021, and phase 2, in which teams were given the full data set, took place from May to July 2021. Main Outcomes and Measures: The overall performance was evaluated by mean sensitivity for biopsied lesions using only DBT volumes with biopsied lesions; ties were broken by including all DBT volumes. Results: A total of 8 teams participated in the challenge. The team with the highest mean sensitivity for biopsied lesions was the NYU B-Team, with 0.957 (95% CI, 0.924-0.984), and the second-place team, ZeDuS, had a mean sensitivity of 0.926 (95% CI, 0.881-0.964). When the results were aggregated, the mean sensitivity for all submitted algorithms was 0.879; for only those who participated in phase 2, it was 0.926. Conclusions and Relevance: In this diagnostic study, an international competition produced algorithms with high sensitivity for using AI to detect lesions on DBT images. 
A standardized performance benchmark for the detection task using publicly available clinical imaging data was released, with detailed descriptions and analyses of submitted algorithms accompanied by a public release of their predictions and code for selected methods. These resources will serve as a foundation for future research on computer-assisted diagnosis methods for DBT, significantly lowering the barrier of entry for new researchers.


Subject(s)
Artificial Intelligence , Breast Neoplasms , Humans , Female , Benchmarking , Mammography/methods , Algorithms , Radiographic Image Interpretation, Computer-Assisted/methods , Breast Neoplasms/diagnostic imaging
4.
PLoS One ; 17(9): e0265289, 2022.
Article in English | MEDLINE | ID: mdl-36170272

ABSTRACT

In response to the outbreak of the coronavirus disease 2019 (Covid-19), governments worldwide have introduced multiple restriction policies, known as non-pharmaceutical interventions (NPIs). However, the relative impact of control measures and the long-term causal contribution of each NPI are still a topic of debate. We present a method to rigorously study the effectiveness of interventions on the rate of the time-varying reproduction number Rt and on human mobility, considered here as a proxy measure of policy adherence and social distancing. We frame our model using a causal inference approach to quantify the impact of five governmental interventions introduced until June 2020 to control the outbreak in 113 countries: confinement, school closure, mask wearing, cultural closure, and work restrictions. Our results indicate that mobility changes are more accurately predicted than the reproduction number. All NPIs, except for mask wearing, significantly affected human mobility trends. Of these, school and cultural closure mandates showed the largest effect on social distancing. We also found that school closures, face mask mandates, and work-from-home mandates caused a persistent reduction in Rt after their initiation, which was not observed with the other social distancing measures. Our results are robust and consistent across different model specifications and can shed more light on the impact of individual NPIs.


Subject(s)
COVID-19 , Pandemics , COVID-19/epidemiology , COVID-19/prevention & control , Humans , Masks , Pandemics/prevention & control , Physical Distancing , SARS-CoV-2
5.
Radiology ; 303(1): 69-77, 2022 04.
Article in English | MEDLINE | ID: mdl-35040677

ABSTRACT

Background Digital breast tomosynthesis (DBT) has higher diagnostic accuracy than digital mammography, but interpretation time is substantially longer. Artificial intelligence (AI) could improve reading efficiency. Purpose To evaluate the use of AI to reduce workload by filtering out normal DBT screens. Materials and Methods The retrospective study included 13 306 DBT examinations from 9919 women performed between June 2013 and November 2018 from two health care networks. The cohort was split into training, validation, and test sets (3948, 1661, and 4310 women, respectively). A workflow was simulated in which the AI model classified cancer-free examinations that could be dismissed from the screening worklist and used the original radiologists' interpretations on the rest of the worklist examinations. The AI system was also evaluated with a reader study of five breast radiologists reading the DBT mammograms of 205 women. The area under the receiver operating characteristic curve (AUC), sensitivity, specificity, and recall rate were evaluated in both studies. Statistics were computed across 10 000 bootstrap samples to assess 95% CIs, noninferiority, and superiority tests. Results The model was tested on 4310 screened women (mean age, 60 years ± 11 [standard deviation]; 5182 DBT examinations). Compared with the radiologists' performance (417 of 459 detected cancers [90.8%], 477 recalls in 5182 examinations [9.2%]), the use of AI to automatically filter out cases would result in 39.6% less workload, noninferior sensitivity (413 of 459 detected cancers; 90.0%; P = .002), and 25% lower recall rate (358 recalls in 5182 examinations; 6.9%; P = .002). In the reader study, AUC was higher in the standalone AI compared with the mean reader (0.84 vs 0.81; P = .002). 
Conclusion The artificial intelligence model was able to identify normal digital breast tomosynthesis screening examinations, which decreased the number of examinations that required radiologist interpretation in a simulated clinical workflow. Published under a CC BY 4.0 license. Online supplemental material is available for this article. See also the editorial by Philpotts in this issue.
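Editorial illustration (not taken from the article): the abstract reports AUC with 95% CIs computed across bootstrap samples. The sketch below shows one common way to do this, a percentile bootstrap over cases, assuming binary labels and continuous scores. The function names `auc` and `bootstrap_ci`, the toy data, and the choice of the percentile method are assumptions made for illustration; the study's actual procedure may differ.

```python
import random

def auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI: resample cases with replacement and recompute the AUC."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:  # resample lacks one class, AUC undefined: draw again
            continue
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[round((alpha / 2) * n_boot)]
    upper = stats[round((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Toy data, made up purely for illustration.
labels = [0, 0, 0, 0, 1, 1, 1, 0, 1, 1]
scores = [0.1, 0.2, 0.3, 0.35, 0.4, 0.6, 0.7, 0.65, 0.8, 0.9]
point = auc(labels, scores)
low, high = bootstrap_ci(labels, scores)
```

With only ten toy cases the interval is wide; the abstract's 10 000 resamples over thousands of examinations would give a far tighter one.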


Subject(s)
Breast Neoplasms , Artificial Intelligence , Breast Neoplasms/diagnostic imaging , Early Detection of Cancer/methods , Female , Humans , Male , Mammography/methods , Middle Aged , Retrospective Studies , Workload
6.
AMIA Annu Symp Proc ; 2022: 385-394, 2022.
Article in English | MEDLINE | ID: mdl-37128397

ABSTRACT

Breast cancer (BC) risk models based on electronic health records (EHR) can assist physicians in estimating the probability of an individual with certain risk factors to develop BC in the future. In this retrospective study, we used clinical data combined with machine learning tools to assess the utility of a personalized BC risk model on 13,786 Israeli and 1,695 American women who underwent screening mammography in the years 2012-2018 and 2008-2018, respectively. Clinical features were extracted from EHR, personal questionnaires, and past radiologists' reports. Using a set of 1,547 features, the predictive ability for BC within 12 months was measured in both datasets and in sub-cohorts of interest. Our results highlight the improved performance of our model over previous established BC risk models, their ultimate potential for risk-based screening policies on first time patients and novel clinically relevant risk factors that can compensate for the absence of imaging history information.


Subject(s)
Breast Neoplasms , Humans , Female , Mammography , Retrospective Studies , Early Detection of Cancer , Breast , Risk Assessment
7.
Front Pharmacol ; 12: 631584, 2021.
Article in English | MEDLINE | ID: mdl-33967767

ABSTRACT

Real-world healthcare data hold the potential to identify therapeutic solutions for progressive diseases by efficiently pinpointing safe and efficacious repurposing drug candidates. This approach circumvents key early clinical development challenges, particularly relevant for neurological diseases, concordant with the vision of the 21st Century Cures Act. However, to-date, these data have been utilized mainly for confirmatory purposes rather than as drug discovery engines. Here, we demonstrate the usefulness of real-world data in identifying drug repurposing candidates for disease-modifying effects, specifically candidate marketed drugs that exhibit beneficial effects on Parkinson's disease (PD) progression. We performed an observational study in cohorts of ascertained PD patients extracted from two large medical databases, Explorys SuperMart (N = 88,867) and IBM MarketScan Research Databases (N = 106,395); and applied two conceptually different, well-established causal inference methods to estimate the effect of hundreds of drugs on delaying dementia onset as a proxy for slowing PD progression. Using this approach, we identified two drugs that manifested significant beneficial effects on PD progression in both datasets: rasagiline, narrowly indicated for PD motor symptoms; and zolpidem, a psycholeptic. Each confers its effects through distinct mechanisms, which we explored via a comparison of estimated effects within the drug classification ontology. We conclude that analysis of observational healthcare data, emulating otherwise costly, large, and lengthy clinical trials, can highlight promising repurposing candidates, to be validated in prospective registration trials, beneficial against common, late-onset progressive diseases for which disease-modifying therapeutic solutions are scarce.

8.
AMIA Annu Symp Proc ; 2021: 930-939, 2021.
Article in English | MEDLINE | ID: mdl-35308922

ABSTRACT

"No-shows", defined as missed appointments or late cancellations, are a central problem in healthcare systems. The problem has appeared to intensify during the COVID-19 pandemic and the nonpharmaceutical interventions, such as closures, taken to slow its spread. No-shows interfere with patients' continuous care, lead to inefficient utilization of medical resources, and increase healthcare costs. We present a comprehensive analysis of no-shows for breast imaging appointments made during 2020 in a large medical network in Israel. We applied advanced machine learning methods to provide insights into novel and known predictors. Additionally, we employed causal inference methodology to infer the effect of closures on no-shows, after accounting for confounding biases, and demonstrate the superiority of adversarial balancing over inverse probability weighting in correcting these biases. Our results imply that a patient's perceived risk of cancer and the COVID-19 time-based factors are major predictors. Further, we reveal that closures impact patients over 60, but not patients undergoing advanced diagnostic examinations.
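Editorial illustration (not taken from the article): the abstract compares adversarial balancing with inverse probability weighting (IPW). The sketch below covers only IPW, the simpler baseline, and assumes a single discrete confounder so that the propensity score can be estimated as the treated share within each stratum. The function `ipw_effect`, the stratum names, and the toy records are hypothetical.

```python
from collections import defaultdict

def ipw_effect(records):
    """records: list of (stratum, treated, outcome), with treated and outcome in {0, 1}.
    Returns the IPW estimate of the average treatment effect."""
    # Propensity = share of treated units within each confounder stratum.
    counts = defaultdict(lambda: [0, 0])  # stratum -> [n_units, n_treated]
    for stratum, treated, _ in records:
        counts[stratum][0] += 1
        counts[stratum][1] += treated
    num = {0: 0.0, 1: 0.0}
    den = {0: 0.0, 1: 0.0}
    for stratum, treated, outcome in records:
        n_units, n_treated = counts[stratum]
        p = n_treated / n_units
        if p in (0.0, 1.0):
            raise ValueError(f"no overlap in stratum {stratum!r}")
        weight = 1 / p if treated else 1 / (1 - p)
        num[treated] += weight * outcome
        den[treated] += weight
    # Difference of weighted (normalised) outcome means: treated minus control.
    return num[1] / den[1] - num[0] / den[0]

# Toy data: "treated" = appointment during a closure, "outcome" = no-show.
# Older patients are more often treated AND more often no-shows (confounding).
records = (
    [("older", 1, 1)] * 3 + [("older", 0, 1)] * 1
    + [("younger", 1, 0)] * 1 + [("younger", 0, 0)] * 3
)
treated_rate = sum(o for _, t, o in records if t) / sum(t for _, t, _ in records)
control_rate = sum(o for _, t, o in records if not t) / sum(1 - t for _, t, _ in records)
naive = treated_rate - control_rate
adjusted = ipw_effect(records)
```

In this toy set the unadjusted difference is 0.5, while the weighted estimate is zero, because within each age stratum closures make no difference to the no-show rate.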


Subject(s)
COVID-19 , Appointments and Schedules , COVID-19/epidemiology , Causality , Humans , Israel/epidemiology , Pandemics
9.
JAMIA Open ; 3(4): 536-544, 2020 Dec.
Article in English | MEDLINE | ID: mdl-33623890

ABSTRACT

OBJECTIVE: Observational medical databases, such as electronic health records and insurance claims, track the healthcare trajectory of millions of individuals. These databases provide real-world longitudinal information on large cohorts of patients and their medication prescription history. We present an easy-to-customize framework that systematically analyzes such databases to identify new indications for on-market prescription drugs. MATERIALS AND METHODS: Our framework provides an interface for defining study design parameters and extracting patient cohorts, disease-related outcomes, and potential confounders in observational databases. It then applies causal inference methodology to emulate hundreds of randomized controlled trials (RCTs) for prescribed drugs, while adjusting for confounding and selection biases. After correcting for multiple testing, it outputs the estimated effects and their statistical significance in each database. RESULTS: We demonstrate the utility of the framework in a case study of Parkinson's disease (PD) and evaluate the effect of 259 drugs on various PD progression measures in two observational medical databases, covering more than 150 million patients. The results of these emulated trials reveal remarkable agreement between the two databases for the most promising candidates. DISCUSSION: Estimating drug effects from observational data is challenging due to data biases and noise. To tackle this challenge, we integrate causal inference methodology with domain knowledge and compare the estimated effects in two separate databases. CONCLUSION: Our framework enables systematic search for drug repurposing candidates by emulating RCTs using observational data. The high level of agreement between separate databases strongly supports the identified effects.

10.
Stud Health Technol Inform ; 235: 181-185, 2017.
Article in English | MEDLINE | ID: mdl-28423779

ABSTRACT

We present a framework for feature engineering, tailored for longitudinal structured data, such as electronic health records (EHRs). To fast-track feature engineering and extraction, the framework combines general-use plug-in extractors, a multi-cohort management mechanism, and modular memoization. Using this framework, we rapidly extracted thousands of features from diverse and large healthcare data sources in multiple projects.


Subject(s)
Electronic Health Records/organization & administration , Informatics/methods , Cohort Studies , Delivery of Health Care/statistics & numerical data , Humans , Machine Learning , Risk Factors
11.
Big Data ; 4(3): 148-59, 2016 09.
Article in English | MEDLINE | ID: mdl-27541627

ABSTRACT

The availability of electronic health records creates fertile ground for developing computational models of various medical conditions. We present a new approach for detecting and analyzing patients with unexpected responses to treatment, building on machine learning and statistical methodology. Given a specific patient, we compute a statistical score for the deviation of the patient's response from responses observed in other patients having similar characteristics and medication regimens. These scores are used to define cohorts of patients showing deviant responses. Statistical tests are then applied to identify clinical features that correlate with these cohorts. We implement this methodology in a tool that is designed to assist researchers in the pharmaceutical field to uncover new features associated with reduced response to a treatment. It can also aid physicians by flagging patients who are not responding to treatment as expected and hence deserve more attention. The tool provides comprehensive visualizations of the analysis results and the supporting data, both at the cohort level and at the level of individual patients. We demonstrate the utility of our methodology and tool in a population of type II diabetic patients, treated with antidiabetic drugs, and monitored by the HbA1C test.
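Editorial illustration (not taken from the article): the abstract describes a statistical score for how far a patient's response deviates from that of similar patients, but does not define it. The sketch below assumes the simplest candidate, a z-score against an already matched peer group. The function `deviation_score` and the toy HbA1c changes are hypothetical.

```python
from statistics import mean, stdev

def deviation_score(patient_response, peer_responses):
    """Standardised deviation of one patient's response from that of similar patients."""
    if len(peer_responses) < 2:
        raise ValueError("need at least two peers")
    spread = stdev(peer_responses)  # sample standard deviation
    if spread == 0:
        raise ValueError("peer responses have zero spread")
    return (patient_response - mean(peer_responses)) / spread

# Toy data: change in HbA1c (percentage points) after starting a drug.
peers = [-1.0, -1.2, -0.8, -1.1, -0.9]   # similar patients, assumed already matched
score = deviation_score(0.2, peers)      # this patient's HbA1c rose instead of falling
flagged = abs(score) > 3                 # assumed cut-off for a "deviant" response
```

Patients whose score exceeds the chosen cut-off would form the deviant-response cohort on which the follow-up statistical tests are run.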


Subject(s)
Diabetes Mellitus, Type 2/drug therapy , Hypoglycemic Agents/therapeutic use , Electronic Health Records , Humans , Machine Learning
12.
Epilepsy Behav ; 56: 32-7, 2016 Mar.
Article in English | MEDLINE | ID: mdl-26827299

ABSTRACT

PURPOSE: A UCB-IBM collaboration explored the application of machine learning to large claims databases to construct an algorithm for antiepileptic drug (AED) choice for individual patients. METHODS: Claims data were collected between January 2006 and September 2011 for patients with epilepsy > 16 years of age. A subset of patient claims with a valid index date of AED treatment change (new, add, or switch) were used to train the AED prediction model by retrospectively evaluating an index date treatment for subsequent treatment change. Based on the trained model, a model-predicted AED regimen with the lowest likelihood of treatment change was assigned to each patient in the group of test claims, and outcomes were evaluated to test model validity. RESULTS: The model had 72% area under receiver operator characteristic curve, indicating good predictive power. Patients who were given the model-predicted AED regimen had significantly longer survival rates (time until a treatment change event) and lower expected health resource utilization on average than those who received another treatment. The actual prescribed AED regimen at the index date matched the model-predicted AED regimen in only 13% of cases; there were large discrepancies in the frequency of use of certain AEDs/combinations between model-predicted AED regimens and those actually prescribed. CONCLUSIONS: Chances of treatment success were improved if patients received the model-predicted treatment. Using the model's prediction system may enable personalized, evidence-based epilepsy care, accelerating the match between patients and their ideal therapy, thereby delivering significantly better health outcomes for patients and providing health-care savings by applying resources more efficiently. Our goal will be to strengthen the predictive power of the model by integrating diverse data sets and potentially moving to prospective data collection.


Subject(s)
Anticonvulsants/therapeutic use , Epilepsy/drug therapy , Adolescent , Adult , Aged , Aged, 80 and over , Costs and Cost Analysis , Data Interpretation, Statistical , Databases, Factual , Epilepsy/epidemiology , Female , Humans , Insurance Claim Review , Likelihood Functions , Male , Middle Aged , Models, Statistical , Retrospective Studies , Treatment Outcome , United States/epidemiology , Young Adult
13.
AMIA Jt Summits Transl Sci Proc ; 2015: 137-41, 2015.
Article in English | MEDLINE | ID: mdl-26306256

ABSTRACT

The availability of electronic health records creates fertile ground for developing computational models for various medical conditions. Using machine learning, we can detect patients with unexpected responses to treatment and provide statistical testing and visualization tools to help further analysis. The new system was developed to help researchers uncover new features associated with reduced response to treatment, and to aid physicians in identifying patients that are not responding to treatment as expected and hence deserve more attention. The solution computes a statistical score for the deviation of a given patient's response from responses observed in individuals with similar characteristics and medication regimens. Statistical tests are then applied to identify clinical features that correlate with cohorts of patients showing deviant responses. The results provide comprehensive visualizations, both at the cohort and the individual patient levels. We demonstrate the utility of this system in a population of diabetic patients.

14.
Nucleic Acids Res ; 42(15): 9854-61, 2014 Sep.
Article in English | MEDLINE | ID: mdl-25056310

ABSTRACT

Genomes undergo changes in organization as a result of gene duplications, chromosomal rearrangements and local mutations, among other mechanisms. In contrast to prokaryotes, in which genes of a common function are often organized in operons and reside contiguously along the genome, most eukaryotes show much weaker clustering of genes by function, except for few concrete functional groups. We set out to check systematically if there is a relation between gene function and gene organization in the human genome. We test this question for three types of functional groups: pairs of interacting proteins, complexes and pathways. We find a significant concentration of functional groups both in terms of their distance within the same chromosome and in terms of their dispersal over several chromosomes. Moreover, using Hi-C contact map of the tendency of chromosomal segments to appear close in the 3D space of the nucleus, we show that members of the same functional group that reside on distinct chromosomes tend to co-localize in space. The result holds for all three types of functional groups that we tested. Hence, the human genome shows substantial concentration of functional groups within chromosomes and across chromosomes in space.


Subject(s)
Cell Nucleus/genetics , Chromosomes, Human , Genes , Genome, Human , Humans , Intranuclear Space , Protein Interaction Mapping
15.
Diabetol Metab Syndr ; 5(1): 36, 2013 Jul 15.
Article in English | MEDLINE | ID: mdl-23856414

ABSTRACT

OBJECTIVE: To investigate the predictive value of different biomarkers for the incidence of type 2 diabetes mellitus (T2DM) in subjects with metabolic syndrome. METHODS: A prospective study of 525 non-diabetic, middle-aged Lithuanian men and women with metabolic syndrome but without overt atherosclerotic diseases during a follow-up period of two to four years. We used logistic regression to develop predictive models for incident cases and to investigate the association between various markers and the onset of T2DM. RESULTS: Fasting plasma glucose (FPG), body mass index (BMI), and glycosylated haemoglobin can be used to predict diabetes onset with a high level of accuracy and each was shown to have a cumulative predictive value. The estimated area under the receiver-operating characteristic curve (AUC) for this combination was 0.92. The oral glucose tolerance test (OGTT) did not show cumulative predictive value. Additionally, progression to diabetes was associated with high values of aortic pulse-wave velocity (aPWV). CONCLUSION: T2DM onset in middle-aged metabolic syndrome subjects can be predicted with remarkable accuracy using the combination of FPG, BMI, and HbA1c, and is related to elevated aPWV measurements.

16.
Stud Health Technol Inform ; 180: 781-5, 2012.
Article in English | MEDLINE | ID: mdl-22874298

ABSTRACT

We present a new framework for supporting decisions in sequential clinical risk assessment examinations. In this framework, the decision whether to perform a test depends on its expected contribution to risk assessment, given results of previous tests, and the contribution is quantified using information theory. In many cases adding an additional examination clearly improves the predictive model. However, there are cases in which the improvement is not constant for all values of previous tests, and quantification of possible improvement can support decision on further examinations. Using this approach can prevent many expensive, unpleasant or risky examinations. We demonstrate the use of this method on an example of type 2 diabetes onset study. The results show that reducing a considerable percent of the blood tests does not decrease the model's prediction power.
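Editorial illustration (not taken from the article): the abstract says the contribution of a further test, given earlier results, is quantified with information theory. One natural reading, assumed here, is the expected reduction in outcome entropy among patients who share the same earlier results. The function names and the toy rows are hypothetical.

```python
from collections import Counter
from math import log2

def entropy(counts):
    """Shannon entropy in bits of a distribution given as raw counts."""
    total = sum(counts)
    return -sum(c / total * log2(c / total) for c in counts if c)

def expected_information_gain(rows):
    """rows: list of (test_result, outcome) for patients sharing the same earlier results.
    Returns H(outcome) - H(outcome | test_result) in bits."""
    prior = entropy(Counter(y for _, y in rows).values())
    by_result = {}
    for result, outcome in rows:
        by_result.setdefault(result, Counter())[outcome] += 1
    posterior = sum(
        sum(c.values()) / len(rows) * entropy(c.values())
        for c in by_result.values()
    )
    return prior - posterior

# Toy strata defined by an earlier test; outcome 1 = developed the condition.
clear_cut = [("high", 0), ("high", 1), ("low", 0), ("low", 1)]   # new test tells nothing
borderline = [("high", 1), ("high", 1), ("low", 0), ("low", 0)]  # new test decides it
gain_clear_cut = expected_information_gain(clear_cut)
gain_borderline = expected_information_gain(borderline)
```

Under this reading, the further test would be ordered only for strata where the gain exceeds a chosen threshold, which is how some examinations could be skipped without losing predictive power.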


Subject(s)
Decision Support Systems, Clinical , Decision Support Techniques , Diagnostic Tests, Routine/statistics & numerical data , Proportional Hazards Models , Risk Assessment/methods , Lithuania
17.
Genome Biol ; 12(6): R61, 2011 Jun 29.
Article in English | MEDLINE | ID: mdl-21714908

ABSTRACT

BACKGROUND: Chromosomal aneuploidy, that is to say the gain or loss of chromosomes, is the most common abnormality in cancer. While certain aberrations, most commonly translocations, are known to be strongly associated with specific cancers and contribute to their formation, most aberrations appear to be non-specific and arbitrary, and do not have a clear effect. The understanding of chromosomal aneuploidy and its role in tumorigenesis is a fundamental open problem in cancer biology. RESULTS: We report on a systematic study of the characteristics of chromosomal aberrations in cancers, using over 15,000 karyotypes and 62 cancer classes in the Mitelman Database. Remarkably, we discovered a very high co-occurrence rate of chromosome gains with other chromosome gains, and of losses with losses. Gains and losses rarely show significant co-occurrence. This finding was consistent across cancer classes and was confirmed on an independent comparative genomic hybridization dataset of cancer samples. The results of our analysis are available for further investigation via an accompanying website. CONCLUSIONS: The broad generality and the intricate characteristics of the dichotomy of aneuploidy, ranging across numerous tumor classes, are revealed here rigorously for the first time using statistical analyses of large-scale datasets. Our finding suggests that aneuploid cancer cells may use extra chromosome gain or loss events to restore a balance in their altered protein ratios, needed for maintaining their cellular fitness.


Subject(s)
Aneuploidy , Chromosome Aberrations , Karyotype , Neoplasms/genetics , Cluster Analysis , Data Mining , Humans , Internet , Neoplasms/classification , User-Computer Interface
18.
J Comput Biol ; 16(10): 1445-60, 2009 Oct.
Article in English | MEDLINE | ID: mdl-19754273

ABSTRACT

Since the discovery of the "Philadelphia chromosome" in chronic myelogenous leukemia in 1960, there has been ongoing intensive research of chromosomal aberrations in cancer. These aberrations, which result in abnormally structured genomes, became a hallmark of cancer. Many studies provide evidence for the connection between chromosomal alterations and aberrant genes involved in the carcinogenesis process. An important problem in the analysis of cancer genomes is inferring the history of events leading to the observed aberrations. Cancer genomes are usually described in the form of karyotypes, which present the global changes in the genomes' structure. In this study, we propose a mathematical framework for analyzing chromosomal aberrations in cancer karyotypes. We introduce the problem of sorting karyotypes by elementary operations, which seeks a shortest sequence of elementary chromosomal events transforming a normal karyotype into a given (abnormal) cancerous karyotype. Under certain assumptions, we prove a lower bound for the elementary distance, and present a polynomial-time 3-approximation algorithm for the problem. We applied our algorithm to karyotypes from the Mitelman database, which records cancer karyotypes reported in the scientific literature. Approximately 94% of the karyotypes in the database, totaling 58,464 karyotypes, supported our assumptions, and each of them was subjected to our algorithm. Remarkably, even though the algorithm is only guaranteed to generate a 3-approximation, it produced a sequence whose length matched the lower bound (and hence optimal) in 99.9% of the tested karyotypes.


Subject(s)
Chromosome Aberrations , Karyotyping , Models, Genetic , Neoplasms/genetics , Algorithms , Cell Line, Tumor , Female , Genome, Human , Humans
19.
J Comput Biol ; 15(7): 793-812, 2008 Sep.
Article in English | MEDLINE | ID: mdl-18652529

ABSTRACT

A centromere is a special region in the chromosome that plays a vital role during cell division. Every new chromosome created by a genome rearrangement event must have a centromere in order to survive. This constraint has been ignored in the computational modeling and analysis of genome rearrangements to date. Unlike genes, the different centromeres are indistinguishable, they have no orientation, and only their location is known. A prevalent rearrangement event in the evolution of multi-chromosomal species is translocation (i.e., the exchange of tails between two chromosomes). A translocation may create a chromosome with no centromere in it. In this paper, we study for the first time centromeres-aware genome rearrangements. We present a polynomial time algorithm for computing a shortest sequence of translocations transforming one genome into the other, where all of the intermediate chromosomes must contain centromeres. We view this as a first step towards analysis of more general genome rearrangement models that take centromeres into consideration.
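Editorial illustration (not taken from the article): the abstract requires every chromosome produced by a translocation to contain a centromere. The sketch below models a chromosome as a list of genes with one centromere marker and checks that constraint for a single tail exchange. The representation, the marker `CEN`, and both function names are assumptions; the paper's model is richer (for example, it also has to handle the other way of rejoining the cut pieces).

```python
CEN = "CEN"  # hypothetical marker for the centromere position

def translocate(a, b, cut_a, cut_b):
    """Exchange tails: returns a[:cut_a] + b[cut_b:] and b[:cut_b] + a[cut_a:]."""
    return a[:cut_a] + b[cut_b:], b[:cut_b] + a[cut_a:]

def is_legal(a, b, cut_a, cut_b):
    """Legal here means both resulting chromosomes still contain a centromere."""
    return all(CEN in chrom for chrom in translocate(a, b, cut_a, cut_b))

# Toy genomes: integers stand for genes.
chrom_a = [1, CEN, 2, 3]
chrom_b = [4, 5, CEN, 6]

legal_products = translocate(chrom_a, chrom_b, 3, 3)  # cuts lie after both centromeres
legal = is_legal(chrom_a, chrom_b, 3, 3)
illegal = is_legal(chrom_a, chrom_b, 1, 3)            # leaves [1, 6] with no centromere
```

A sorting algorithm in this setting would only be allowed to pick moves for which `is_legal` holds at every intermediate step.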


Subject(s)
Algorithms , Centromere/metabolism , Genome , Translocation, Genetic , Chromosomes/genetics , Computational Biology , Models, Genetic , Models, Statistical
20.
J Comput Biol ; 14(4): 408-22, 2007 May.
Article in English | MEDLINE | ID: mdl-17572020

ABSTRACT

The understanding of genome rearrangements is an important endeavor in comparative genomics. A major computational problem in this field is finding a shortest sequence of genome rearrangements that transforms, or sorts, one genome into another. In this paper we focus on sorting a multi-chromosomal genome by translocations. We reveal new relationships between this problem and the well studied problem of sorting by reversals. Based on these relationships, we develop two new algorithms for sorting by reciprocal translocations, which mimic known algorithms for sorting by reversals: a score-based method building on Bergeron's algorithm, and a recursive procedure similar to the Berman-Hannenhalli method. Though their proofs are more involved, our procedures for reciprocal translocations match the complexities of the original ones for reversals.


Subject(s)
Algorithms , Chromosomes/genetics , Genome , Translocation, Genetic/genetics , Computational Biology , Sequence Analysis, DNA