1.
Stat Med ; 40(23): 5046-5064, 2021 10 15.
Article in English | MEDLINE | ID: mdl-34155660

ABSTRACT

Dealing with high-dimensional censored data is challenging because of the complexity of the data structure. This article develops a variable selection procedure for censored high-dimensional data under accelerated failure time (AFT) models using the Modified Correlation Adjusted coRrelation (MCAR) scores method. MCAR builds on the CAR scores method, which provides a canonical ordering that encourages grouping of correlated predictors and down-weights antagonistic variables. It extends the CAR scores method with the NOVEL integration of the sample and threshold estimators of the correlation matrix, as suggested by Huang and Fryzlewicz. The proposed MCAR method yields computationally more efficient estimates under model sparsity and can provide a canonical ordering among the predictors. It is a greedy method that is easy to understand and performs estimation and variable selection simultaneously. Its variable selection performance is compared with existing regularized techniques from the literature (the lasso and the elastic net), with a machine learning technique called boosting, and with the censored CAR method, using a number of simulation studies and a real microarray data set on diffuse large B-cell lymphoma. Results indicate that when the covariates are correlated, the MCAR method outperforms the competing techniques, while for uncorrelated data it performs quite similarly to the CAR method but clearly outperforms the other three methods. The empirical study further reveals that the MCAR method exhibits the best predictive performance among the methods.


Subject(s)
Computer Simulation, Microarray Analysis
2.
Front Artif Intell ; 3: 561801, 2020.
Article in English | MEDLINE | ID: mdl-33748745

ABSTRACT

Coronavirus disease 2019 (COVID-19) has developed into a global pandemic, affecting every nation and territory in the world. Machine learning-based approaches are useful for understanding the complexity behind the spread of the disease and how to contain it effectively. Unsupervised learning methods could be useful to evaluate the shortcomings of health facilities in areas of increased infection, as well as the strategies necessary to prevent disease spread within or outside a country. To contribute toward the well-being of society, this paper focusses on the implementation of machine learning techniques for identifying common prevailing public health care facilities and concerns related to COVID-19, as well as attitudes to infection prevention strategies held by people from different countries during the current pandemic. Regression tree, random forest, cluster analysis and principal component techniques are used to analyze the global COVID-19 data of 133 countries obtained from the Worldometer website as of April 17, 2020. The analysis revealed that there are four major clusters among the countries. Eight countries with the highest cumulative infected cases and deaths form the first cluster. Seven countries (United States, Spain, Italy, France, Germany, United Kingdom, and Iran) play a vital role in explaining 60% of the total variation through the first component, which is characterized by all variables except the rate variables. The remaining countries explain only 20% of the total variation through the second component, which is characterized only by the rate variables. Most strikingly, the analysis found that the number of tests performed by a country did not play a vital role in predicting the cumulative number of confirmed cases.

3.
Stat Appl Genet Mol Biol ; 18(5)2019 10 07.
Article in English | MEDLINE | ID: mdl-31586968

ABSTRACT

Instability in model selection is a major concern with data sets containing a large number of covariates. We focus on stability selection, a technique that improves variable selection performance for a range of selection methods by aggregating the results of applying a selection procedure to sub-samples of the data, here in the setting where the observations are subject to right censoring. Accelerated failure time (AFT) models have proved useful in many contexts, including heavy censoring (for example in cancer survival) and high dimensionality (for example in microarray data). We implement the stability selection approach using three variable selection techniques (Lasso, ridge regression, and elastic net) applied to censored data using AFT models. We compare the performance of these regularized techniques with and without stability selection using simulation studies and two real data examples: a breast cancer data set and a diffuse large B-cell lymphoma data set. The results suggest that stability selection consistently yields a stable selection of variables and that, as the dimension of the data increases, the performance of methods with stability selection also improves relative to methods without it, irrespective of the collinearity between the covariates.
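
The sub-sampling and aggregation idea described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it ignores censoring and AFT models entirely, and it swaps in a simple "top-k absolute marginal correlation" rule as the base selector instead of the Lasso, ridge regression, or elastic net. The function names, the half-sample size, the 100 sub-samples, and the 0.8 frequency threshold are all hypothetical choices made for the example.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def top_k_selector(rows, y, k):
    """Stand-in base selector (assumption): the k covariates with largest |corr| with y."""
    p = len(rows[0])
    scores = [abs(pearson([r[j] for r in rows], y)) for j in range(p)]
    return set(sorted(range(p), key=lambda j: scores[j], reverse=True)[:k])


def stability_selection(rows, y, k=2, n_subsamples=100, threshold=0.8, seed=0):
    """Run the base selector on random half-samples and keep frequently chosen covariates."""
    rng = random.Random(seed)
    n, p = len(rows), len(rows[0])
    counts = [0] * p
    for _ in range(n_subsamples):
        idx = rng.sample(range(n), n // 2)  # half-sample drawn without replacement
        chosen = top_k_selector([rows[i] for i in idx], [y[i] for i in idx], k)
        for j in chosen:
            counts[j] += 1
    freqs = [c / n_subsamples for c in counts]  # per-covariate selection frequency
    stable = {j for j, f in enumerate(freqs) if f >= threshold}
    return stable, freqs


# Toy, uncensored data: y depends on covariates 0 and 1 only; the other 8 are noise.
data_rng = random.Random(1)
rows = [[data_rng.gauss(0, 1) for _ in range(10)] for _ in range(200)]
y = [3 * r[0] - 3 * r[1] + data_rng.gauss(0, 0.5) for r in rows]

stable, freqs = stability_selection(rows, y)
print("stable covariates:", sorted(stable))
print("selection frequencies:", [round(f, 2) for f in freqs])
```

Any selection procedure could replace `top_k_selector`; the aggregation step only needs the set of indices chosen on each sub-sample.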


Subject(s)
Computer Simulation, Probability, Algorithms, Breast Neoplasms/genetics, Breast Neoplasms/metabolism, Breast Neoplasms/mortality, Breast Neoplasms/pathology, Female, Humans, Linear Models, Lymphoma, B-Cell/genetics, Lymphoma, B-Cell/metabolism, Lymphoma, B-Cell/mortality, Neoplasm Metastasis
4.
Biom J ; 60(4): 687-702, 2018 07.
Article in English | MEDLINE | ID: mdl-29603360

ABSTRACT

Preprocessing of high-dimensional censored datasets, such as microarray data, is generally considered an important technique for gaining further stability by reducing potential noise in the data. When variable selection including inference is carried out with high-dimensional censored data, the objective is to obtain a smaller subset of variables and then perform the inferential analysis using model estimates based on the selected subset. This two-stage inferential analysis is prone to circularity bias because of the noise that might still remain in the dataset. In this work, I propose an adaptive preprocessing technique that uses the sure independence screening (SIS) idea to accomplish variable selection and reduces the circularity bias by means of several well-known refined high-dimensional methods, such as the elastic net, adaptive elastic net, weighted elastic net, and elastic net-AFT, and two greedy variable selection methods known as TCS and PC-simple, all implemented with accelerated lifetime models. The proposed technique addresses several features, including collinearity between important and some unimportant covariates, which is often the case in high-dimensional settings under a variable selection framework, and different levels of censoring. Simulation studies, along with an empirical analysis of a real microarray data set on mantle cell lymphoma, are carried out to demonstrate the performance of the adaptive preprocessing technique.
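
The screening step named in this abstract can be illustrated with a minimal sketch of plain sure independence screening: rank covariates by absolute marginal correlation with the response and keep the top d. This is an assumption-laden illustration, not the adaptive preprocessing technique the abstract proposes. It ignores censoring and the accelerated lifetime model, the function names are hypothetical, and the default d = floor(n / log n) is a commonly used convention rather than a value taken from this article.

```python
import math
import random


def abs_correlation(x, y):
    """Absolute Pearson correlation (0.0 if either sequence is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return abs(sxy) / math.sqrt(sxx * syy)


def sis_screen(rows, y, d=None):
    """Return indices of the d covariates most correlated (marginally) with y."""
    n, p = len(rows), len(rows[0])
    if d is None:
        # Conventional default (assumption): keep floor(n / log n) covariates.
        d = min(p, max(1, int(n / math.log(n))))
    scores = [abs_correlation([r[j] for r in rows], y) for j in range(p)]
    ranked = sorted(range(p), key=lambda j: scores[j], reverse=True)
    return ranked[:d]


# Toy, uncensored data with p > n: only covariates 0, 1 and 2 affect y.
data_rng = random.Random(7)
n, p = 100, 200
rows = [[data_rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
y = [3 * r[0] + 3 * r[1] + 3 * r[2] + data_rng.gauss(0, 0.5) for r in rows]

kept = sis_screen(rows, y)
print("covariates kept:", len(kept))
print("true signals retained:", {0, 1, 2} <= set(kept))
```

In a two-stage analysis of the kind the abstract describes, a refined method such as the elastic net would then be fitted only to the covariates in `kept`.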


Subject(s)
Data Analysis, Algorithms, Biometry, False Positive Reactions
5.
Public Health Nutr ; 21(10): 1845-1854, 2018 07.
Article in English | MEDLINE | ID: mdl-29455704

ABSTRACT

OBJECTIVE: Despite progress, levels of malnutrition among children in Bangladesh are among the highest in the world, and malnutrition is one of the major causes of death in children. The pace of reduction in the prevalence of undernutrition among children is still relatively low. The present study aimed to examine the association between parental education and childhood undernutrition among Bangladeshi children under 5 years of age after adjusting for potential risk factors. DESIGN: The data set was extracted from a nationally representative cross-sectional survey, the Bangladesh Demographic and Health Survey (BDHS) 2014. SETTING: The base survey was conducted using a two-stage stratified sample of households. In the first stage, 600 enumeration areas (EA) were selected with probability proportional to EA size (207 EA from urban areas, 393 EA from rural areas). SUBJECTS: A total of 7173 children under 5 years from 17 863 households were considered for the analysis. A modified Poisson regression model was fitted to the data to assess the relationship between parental education and childhood undernutrition, adjusting for clustering and for demographic and socio-economic covariates of the child, parents and households. RESULTS: Higher parental education level was associated with lower levels of stunting and underweight, but not with wasting. Maternal and paternal education were both significantly associated with the reduction in prevalence of childhood undernutrition in Bangladesh. CONCLUSIONS: Paternal education appears to be as important as maternal education in reducing undernutrition prevalence among children under 5 years in Bangladesh.


Subject(s)
Child Nutrition Disorders/epidemiology, Educational Status, Malnutrition/epidemiology, Parents, Thinness/epidemiology, Adolescent, Adult, Bangladesh/epidemiology, Child, Child, Preschool, Cross-Sectional Studies, Female, Growth Disorders/epidemiology, Humans, Infant, Infant, Newborn, Male, Middle Aged, Prevalence, Wasting Syndrome/epidemiology, Young Adult
6.
J Biosoc Sci ; 50(4): 573-578, 2018 07.
Article in English | MEDLINE | ID: mdl-28793942

ABSTRACT

This study examined the recent level, trends and determinants of consanguineous marriage in Jordan using time-series data from the Jordan Population and Family Health Surveys (JPFHSs). According to the 2012 JPFHS, 35% of all marriages in Jordan were consanguineous in 2012. There has been a declining trend in consanguinity in the country, with the rate decreasing from a level of 57% in 1990. Most consanguineous marriages in 2012 were first cousin marriages, constituting 23% of all marriages and 66% of all consanguineous marriages. The data show that women with a lower age at marriage, older marriage cohort, larger family size, less than secondary level of education, rural place of residence, no employment, no exposure to mass media, a monogamous marriage, a husband with less than a higher level of education and lower economic status, and those from the Badia region, were more likely to have a consanguineous marriage. Increasing age at marriage, level of education, urbanization and knowledge about the health consequences of consanguinity, and the ongoing socioeconomic and demographic transition in the country, will be the driving forces for further decline in consanguinity in Jordan.


Subject(s)
Consanguinity, Developing Countries/statistics & numerical data, Marriage/trends, Adult, Data Collection, Economic Status/statistics & numerical data, Economic Status/trends, Employment/statistics & numerical data, Employment/trends, Family Health/statistics & numerical data, Family Health/trends, Female, Humans, Interrupted Time Series Analysis, Jordan, Male, Marriage/statistics & numerical data, Middle Aged, Population Dynamics/statistics & numerical data, Population Dynamics/trends, Rural Population/statistics & numerical data, Rural Population/trends, Socioeconomic Factors, Young Adult