Search | VHL Search Portal

1.

Does Instruction Affect the Underlying Dimensionality of a Kinesiology Test?

Bezruczko, Nikolaus; Frank, Eva; Perkins, Kyle.

J Appl Meas ; 17(4): 393-415, 2016.

Article in English | MEDLINE | ID: mdl-28009588

ABSTRACT

Does effective instruction, which changes students' knowledge and possibly alters their cognitive functions, also affect the dimensionality of an achievement test? This question was examined by the parameterization of kinesiology test items (n = 42) with a Rasch dichotomous model, followed by an investigation of dimensionality in a pre- and post-test quasi-experimental study design. College students (n = 108) provided responses to kinesiology achievement test items. Then the stability of item difficulties, gender differences, and the interaction of item content categories with dimensionality were examined. In addition, a PCA/t-test protocol was implemented to examine dimensionality threats from the item residuals. Internal construct validity was investigated by regressing item content components on calibrated item difficulties. Measurement model item residuals were also investigated with statistical decomposition methods. In general, the results showed significant student achievement between pre and post testing, and dimensionality disturbances were relatively minor. The amount of unexpected item "shift" in an un-equated measurement dimension between pre and post testing was less than ten percent of the total items and largely concentrated among several unrelated items. An unexpected finding was a residual cluster consisting of several items testing related technical content. Complicating interpretation, these items tended to appear near the end of the test, which implicates test position as a threat to measurement equivalence. In general, the results across several methods did not tend to identify common threats and instead pointed to multiple sources of threats with varying degree of prominence. These results suggest conventional approaches to measurement equivalence that emphasize expedient overall procedures such as DIF, IRT, and factor analysis are probably capturing isolated sources of variability. Their implementation probably improves measurement equivalence but with substantial residual sources undetected.

Subject(s)

Educational Measurement/methods , Kinesiology, Applied/education , Models, Statistical , Psychometrics/methods , Surveys and Questionnaires , Teaching/statistics & numerical data , Adolescent , Algorithms , Data Interpretation, Statistical , Female , Humans , Male , Reproducibility of Results , Sensitivity and Specificity , United States , Young Adult

2.

Automatic item generation implemented for measuring artistic judgment aptitude.

Bezruczko, Nikolaus.

J Appl Meas ; 15(1): 1-25, 2014.

Article in English | MEDLINE | ID: mdl-24518578

ABSTRACT

Automatic item generation (AIG) is a broad class of methods that are being developed to address psychometric issues arising from internet and computer-based testing. In general, issues emphasize efficiency, validity, and diagnostic usefulness of large scale mental testing. Rapid prominence of AIG methods and their implicit perspective on mental testing is bringing painful scrutiny to many sacred psychometric assumptions. This report reviews basic AIG ideas, then presents conceptual foundations, image model development, and operational application to artistic judgment aptitude testing.

Subject(s)

Aptitude Tests/statistics & numerical data , Art , Computer-Assisted Instruction/statistics & numerical data , Judgment , Psychometrics/statistics & numerical data , Algorithms , Computer Simulation , Humans , Internet , Pattern Recognition, Visual

3.

Multi-factor scale consolidation when theory is weak.

Bezruczko, Nikolaus; Perkins, Kyle.

J Appl Meas ; 13(1): 77-96, 2012.

Article in English | MEDLINE | ID: mdl-22677498

ABSTRACT

As a practical matter, Spirituality and Quality of Life in the health sciences are usually measured separately. Theoretical foundations for this distinction, however, are not strong. In this research, an empirical investigation was conducted into their joint calibration with a Rasch model. Functional Assessment of Cancer Therapy-General (28 items), a cancer health-related quality of life measure (HRQOL), and Functional Assessment of Chronic Illness - Spiritual Well-Being (12 items), a measure of religious and existential well-being (Spirituality), were co-calibrated with a Rasch model implemented with WINSTEPS software for ratings from 545 breast cancer patients. The results show a hierarchical integration of QOL and Spirituality items on a common variable, and both patient separation (2.66) and reliability (.88) improve after co-calibration. Principal Component Analysis of co-calibrated item residuals did not show major threats to dimensionality, and joint calibration explains item variance comparable to separate calibrations (51.9%). Although patient measures (logits) based on separate and co-calibration are within two standard errors, ethnic and racial group values shift after co-calibration.

Subject(s)

Black or African American/psychology , Breast Neoplasms/psychology , Hispanic or Latino/psychology , Psychometrics/statistics & numerical data , Quality of Life/psychology , Spirituality , White People/psychology , Breast Neoplasms/ethnology , Breast Neoplasms/therapy , Female , Health Surveys/statistics & numerical data , Humans , Logistic Models , Mathematical Computing , Models, Statistical , Software

4.

An external validation study of a classification of mixed connective tissue disease and systemic lupus erythematosus patients.

Hoffman, Robert W; Bezruczko, Nikolaus; Perkins, Kyle.

J Appl Meas ; 13(2): 205-16, 2012.

Article in English | MEDLINE | ID: mdl-22805362

ABSTRACT

Mixed Connective Tissue Disease (MCTD) and Systemic Lupus Erythematosus (SLE) are autoimmune rheumatic diseases that are difficult for physicians to diagnose and to distinguish for a variety of reasons. The correct classification of these two diseases is a crucial issue for clinicians who treat autoimmune rheumatic diseases. In prior research, medical risk factors represented by instrument or laboratory measures and physician judgments (12 key features for MCTD and 12 key features for SLE) were parameterized with a one parameter logistic function in a Rasch model. Those results identified separate diagnostic dimensions for MCTD and SLE. This procedure was replicated in the present research with a sample of largely African American and Hispanic patients. Results verified separate dimensions for MCTD and SLE, which suggests MCTD is a separate disease from SLE.

Subject(s)

Data Interpretation, Statistical , Decision Support Techniques , Diagnosis, Computer-Assisted/methods , Lupus Erythematosus, Systemic/classification , Lupus Erythematosus, Systemic/diagnosis , Mixed Connective Tissue Disease/classification , Mixed Connective Tissue Disease/diagnosis , Diagnosis, Differential , Humans , Reproducibility of Results , Sensitivity and Specificity

5.

Ben Wright: A wisp of greatness Brief photographic review of his life and times.

Bezruczko, Nikolaus.

J Appl Meas ; 17(2): 239-261, 2016.

Article in English | MEDLINE | ID: mdl-28009587

Subject(s)

Educational Measurement/history , Epidemiology/history , Psychometrics/history , Statistics as Topic/history , Surveys and Questionnaires , History, 20th Century , History, 21st Century , United States

6.

An ADL measure for spinal cord injury.

Bryden, Anne; Bezruczko, Nikolaus.

J Appl Meas ; 12(3): 279-97, 2011.

Article in English | MEDLINE | ID: mdl-22357128

ABSTRACT

Occupational therapists do not have a comprehensive, objective method for measuring how persons with tetraplegia perform activities of daily living (ADL) in their homes and communities, because SCI ADL performance is usually determined in rehabilitation. The ADL Habits Survey (ADLHS) is designed specifically to address this knowledge gap by surveying performance on relevant and meaningful activities in homes and communities. After a comprehensive task analysis and pilot development, 30 activities were selected that emphasize a broad range of hand and wrist, reaching, and grasping movements in compound activities. A sample of 49 persons with cervical spinal cord injuries responded to items. The sample was predominantly male, median age was 41 years, and ASIA motor classification levels ranged from C2 through C8/T1 with majority concentration in C4, C5, or C6 (68%). Each participant report was rated by an occupational therapist using a seven category rating scale, and the item by participant response matrix (30 X 49) was analyzed with a Rasch model for rating scales. Results showed excellent participant separation (>4) and very high reliability (>.95), and both item and participant fit values were adequate (STANDARDIZED INFIT less than absolute value of 3). With only two exceptions, all participants fit the Rasch rating scale model, and only one item "Light housekeeping" presented significant fit issues. Principal Components Analysis of item residuals did not reveal serious threats to unidimensionality. A between group fit comparison of participants with more versus less movement found invariant item calibrations, and ANOVA of participant measures found statistically significant differences across ASIA motor classification levels. These ADLHS results offer occupational therapists a new method for measuring ADL that is potentially more sensitive to functional changes in tetraplegia than most instruments in common use. Accommodation of step disorder with a three category rating scale did not diminish measurement properties.

Subject(s)

Activities of Daily Living , Disability Evaluation , Spinal Cord Injuries/physiopathology , Surveys and Questionnaires , Adult , Female , Humans , Male , Spinal Cord Injuries/rehabilitation , United States

7.

Measurement of mothers' confidence to care for children assisted with tracheostomy technology in family homes.

Bezruczko, Nikolaus; Chen, Shu-Pi C; Hill, Constance D; Chesniak, Joyce M.

J Appl Meas ; 12(4): 339-57, 2011.

Article in English | MEDLINE | ID: mdl-22357156

ABSTRACT

The purpose of this research was to develop an objective, linear measure of mothers' confidence to care for children assisted with tracheostomy medical technology in their homes. Caregiver confidence is addressed in this research for three technologies, namely, a) tracheostomy, b) tracheostomy and ventilator, and c) BiPAP/CPAP, although detailed measurement results are only reported for tracheostomy, and its co-calibration with tracheostomy and ventilator caregiving items. The sample consisted of 53 mothers responding to several caregiver questionnaires based on a caregiving task matrix after content and clinical validation. A major challenge was integrating this construct with overarching principles already established by Functional Caregiving, a multi-level humanistic caregiving model for children with intellectual disabilities. Empirical analyses included principal components analysis, and then linear transformation of Tracheostomy item ratings to an objective, equal-interval scale with a Rasch model. Results show caregiver separation on the Tracheostomy caregiving scale was 2.66 and reliability, .88. In general, co-calibration improved measurement properties without affecting mothers' caregiving confidence measures. Although sample size was small, measuring mothers' confidence to care for a child supported by complex medical technologies appears very promising.

Subject(s)

Caregivers/psychology , Continuous Positive Airway Pressure/nursing , Continuous Positive Airway Pressure/psychology , Home Nursing/psychology , Mothers/psychology , Self Efficacy , Tracheostomy/nursing , Tracheostomy/psychology , Adolescent , Adult , Child , Child, Preschool , Female , Humans , Infant , Intellectual Disability/nursing , Male , Middle Aged , Models, Statistical , Surveys and Questionnaires , Young Adult

8.

Foreword: emergence of efficiency in health outcome measurement.

Bezruczko, Nikolaus.

J Appl Meas ; 11(3): 197-213, 2010.

Article in English | MEDLINE | ID: mdl-20847470

ABSTRACT

Psychosocial measurement in the 21st Century is a dynamic field that is addressing challenges unthinkable even a generation ago. Sophisticated methods and modern technology have brought psychometrics to the cusp of scientific objectivity. This Foreword provides historical context and intellectual foundations for appreciating contemporary psychometric advancements, as well as a perspective on issues that are determining future advances. Efficiency in outcome measurement is one of these forces driving future advances. Efficiency, however, can easily become conflated with expediency, and neither can substitute for effectiveness. Blind efficiency runs risk of degrading measurement properties. Likewise, measurement advancement without accommodation to ordinary needs leads to practical rejection. Bouchard presents a biographical link between scientific physics and Rasch models that opened the door for fundamental psychosocial measurement. Symposium papers presented in this issue present a broad range of ideas about contemporary psychosocial measurement. Granger summarizes key ideas underlying achievement of objective, fundamental measurement. Massof, then, Stenner and Stone present alternative perspectives on scientific knowledge systems, which are prominent landmarks on the psychometric horizon. Fisher and Burton describe fundamental measurement methodology in diagnosis and implementation of technology, which will consolidate isolated and redundant constructs in PROMIS. Hart presents an overview on computer adaptive testing, which is the vanguard in health outcome measurement. Kisala and Tulsky present a qualitative strategy that is improving sensitivity and validity of new outcome measures. Their diversity reflects an intense competition of ideas about solving measurement problems. Their collection together in this special issue is a milestone and tribute to scientific ingenuity.

Subject(s)

Outcome Assessment, Health Care , Biostatistics/history , Efficiency, Organizational , History, 20th Century , History, 21st Century , Humans , Models, Statistical , Outcome Assessment, Health Care/history , Outcome Assessment, Health Care/statistics & numerical data , Psychometrics/history , Psychometrics/statistics & numerical data

9.

A Rasch analysis for classification of systemic lupus erythematosus and mixed connective tissue disease.

Perkins, Kyle; Hoffman, Robert W; Bezruczko, Nikolaus.

J Appl Meas ; 9(2): 136-50, 2008.

Article in English | MEDLINE | ID: mdl-18480510

ABSTRACT

The classification of rheumatic diseases is challenging because these diseases have protean and frequently overlapping clinical and laboratory manifestations. This problem is typified by the difficulty of classification and differentiation of two prototypic multi-system autoimmune diseases, Systemic Lupus Erythematosus (SLE) and Mixed Connective Tissue Disease (MCTD). The researchers submitted medical risk factor data represented by instrument or laboratory measures and physician judgments (12 key features for SLE) from 43 patients diagnosed with SLE and 12 key features for MCTD from 51 patients diagnosed with MCTD to the WINSTEPS Rasch analysis program. Using Rasch model parameterization, and fit and residuals analyses, the researchers identified separate dimensions for MCTD and SLE, thereby lending support to the position that MCTD is its own separate disease, distinct from SLE.

Subject(s)

Lupus Erythematosus, Systemic/classification , Mixed Connective Tissue Disease/classification , Psychometrics , Adult , Aged , Female , Humans , Male , Middle Aged , Mixed Connective Tissue Disease/physiopathology , Surveys and Questionnaires

10.

Nonequivalent survey consolidation: an example from functional caregiving.

Bezruczko, Nikolaus; Chen, Shu-Pi C.

J Appl Meas ; 8(4): 336-58, 2007.

Article in English | MEDLINE | ID: mdl-18250522

ABSTRACT

Functional Caregiving (FC) is a construct about mothers caring for children (both old and young) with intellectual disabilities, which is operationally defined by two nonequivalent survey forms, urban and suburban, respectively. The purposes of this research are, first, to generalize school-based achievement test principles to survey methods by equating two nonequivalent survey forms. A second purpose is to expand FC foundations by a) establishing linear measurement properties for new caregiving items, b) replicating a hierarchical item structure across an urban, school-based population, c) consolidating survey forms to establish a calibrated item bank, and d) collecting more external construct validity data. Results supported invariant item parameters of a fixed item form (96 items) for two urban samples (N = 186). FC measures also showed expected construct relationships with age, mental depression, and health status. However, only five common items between urban and suburban forms were statistically stable because suburban mothers' age and child's age appear to interact with medical information and social activities.

Subject(s)

Caregivers/psychology , Mother-Child Relations , Persons with Mental Disabilities , Adult , Chicago , Data Collection/methods , Female , Humans , Middle Aged

11.

Relative precision, efficiency and construct validity of different starting and stopping rules for a computerized adaptive test: the GAIN substance problem scale.

Riley, Barth B; Conrad, Kendon J; Bezruczko, Nikolaus; Dennis, Michael L.

J Appl Meas ; 8(1): 48-64, 2007.

Article in English | MEDLINE | ID: mdl-17215565

ABSTRACT

Substance abuse treatment programs are being pressed to measure and make clinical decisions more efficiently about an increasing array of problems. This computerized adaptive testing (CAT) simulation examined the relative efficiency, precision and construct validity of different starting and stopping rules used to shorten the Global Appraisal of Individual Needs' (GAIN) Substance Problem Scale (SPS) and facilitate diagnosis based on it. Data came from 1,048 adolescents and adults referred to substance abuse treatment centers in 5 sites. CAT performance was evaluated using: (1) average standard errors, (2) average number of items, (3) bias in person measures, (4) root mean squared error of person measures, (5) Cohen's kappa to evaluate CAT classification compared to clinical classification, (6) correlation between CAT and full-scale measures, and (7) construct validity of CAT classification vs. clinical classification using correlations with five theoretically associated instruments. Results supported both CAT efficiency and validity.

Subject(s)

Diagnosis, Computer-Assisted , Models, Psychological , Substance-Related Disorders/diagnosis , Surveys and Questionnaires , Adolescent , Female , Humans , Male

12.

Substance use disorder symptoms: evidence of differential item functioning by age.

Conrad, Kendon J; Dennis, Michael L; Bezruczko, Nikolaus; Funk, Rodney R; Riley, Barth B.

J Appl Meas ; 8(4): 373-87, 2007.

Article in English | MEDLINE | ID: mdl-18250524

ABSTRACT

This study examined the applicability of substance abuse diagnostic criteria for adolescents, young adults, and adults using the Global Appraisal of Individual Needs' Substance Problems Scale (SPS) from 7,408 clients. Rasch analysis was used to: 1) evaluate whether the SPS operationalized a single reliable dimension, and 2) examine the extent to which the severity of each symptom and the overall test functioned the same or differently by age. Rasch analysis indicated that the SPS was unidimensional with a person reliability of .84. Eight symptoms were significantly different between adolescents and adults. Young adult calibrations tended to fall between adolescents and adults. Differential test functioning was clinically negligible for adolescents but resulted in about 7% more adults being classified as high need. These findings have theoretical implications for screening and treatment of adolescents vs. adults. SPS can be used across age groups though age-specific calibrations enable greater precision of measurement.

Subject(s)

Severity of Illness Index , Substance-Related Disorders/physiopathology , Adolescent , Adult , Aged , Child , Female , Humans , Male , Middle Aged , United States

13.

Rasch analysis of a new construct: functional caregiving for adult children with intellectual disabilities.

Chen, Shu-Pi C; Bezruczko, Nikolaus; Ryan-Henry, Sheila.

J Appl Meas ; 7(2): 141-59, 2006.

Article in English | MEDLINE | ID: mdl-16632898

ABSTRACT

This research examined empirical evidence for a new construct, Functional Caregiving, which is a theory about mothers' caregiving of their adult children with intellectual disabilities. A sample of 108 biological mothers and primary caregivers rated survey items about their confidence to perform caregiving tasks. Rasch rating scale analysis found 61 items defined an empirical construct with three caregiving levels: Advocacy, Personal Caregiving, and Community. Results show item separation was 3.11 with high reliability, .91, and mother separation was 2.93 and reliability, .90. Both items and mothers showed adequate INFIT and OUTFIT values. Item invariance was confirmed between older and younger mothers, and principal components analysis of item residuals did not reveal any major dimensionality threats. Item decomposition analysis showed FC content theory to account for 58 percent of item calibration variance (R2 = .58, F = 42.3, p < .001). These results have important practical implications for health and social services, as well as family caregiving, interdisciplinary practices, and health policy development.

Subject(s)

Adult Children , Caregivers/psychology , Persons with Mental Disabilities , Adult , Aged , Aged, 80 and over , Chicago , Data Collection , Female , Humans , Middle Aged , Mother-Child Relations , Psychometrics

14.

Breakthrough measuring neighborhoods.

Bezruczko, Nikolaus.

J Appl Meas ; 4(2): 137-52, 2003.

Article in English | MEDLINE | ID: mdl-12748406

ABSTRACT

An empirical strategy is presented for transforming ordinal counts and percentages to interval scale measures by recoding them as ordered categories and estimating Rasch model rating scale parameters. This strategy is demonstrated for a neighborhood construct, socioeconomic disadvantage, operationally defined by eight characteristics of Chicago neighborhoods (N = 77). Results show surprisingly sound model fit and satisfactory scale invariance between the 1980 and 1990 censuses. A striking finding obscured by traditional methods is that many Chicago neighborhoods are four times more disadvantaged than the official U.S. poverty threshold. Intramodel construct validation confirms this scale structure is consistent with sociological expectations about property values, income, and race. A general benefit of this approach over conventional categorical socioeconomic indices is neighborhood measurement on a linear scale.

Subject(s)

Models, Statistical , Poverty/statistics & numerical data , Residence Characteristics/classification , Vulnerable Populations/statistics & numerical data , Chicago , Data Interpretation, Statistical , Humans , Psychology, Social , Residence Characteristics/statistics & numerical data , United States

15.

A multi-factor Rasch scale for artistic judgment.

Bezruczko, Nikolaus.

J Appl Meas ; 3(4): 360-99, 2002.

Article in English | MEDLINE | ID: mdl-12486307

ABSTRACT

Measurement properties are reported for a combined scale of abstract and figurative artistic judgment aptitude items. Abstract items are synthetic, rule-based images from Visual Designs Test which implements a statistical algorithm to control design complexity and redundancy, and figurative items are canvas paintings in five styles, Fauvism, Post-Impressionism, Surrealism, Renaissance, and Baroque especially created for this research. The paintings integrate syntactic structure from VDT Abstract designs with thematic content for each style at four levels of complexity while controlling redundancy. Trained test administrators collected preference for synthetic abstract designs and authentic figurative art from 462 examinees in Johnson O'Connor Research Foundation testing offices in Boston, New York, Chicago, and Dallas. The Rasch model replicated measurement properties for VDT Abstract items and identified an item hierarchy that was statistically invariant between genders and generally stable across age for new, authentic figurative items. Further examination of the figurative item hierarchy revealed that complexity interacts with style and meaning. Sound measurement properties for a combined VDT Abstract and Figurative scale show promise for a comprehensive artistic judgment construct.

Subject(s)

Algorithms , Art , Perception , Decision Making , Humans , Psychometrics , Reference Values

16.

Screening for atypical suicide risk with person fit statistics among people presenting to alcohol and other drug treatment.

Conrad, Kendon J; Bezruczko, Nikolaus; Chan, Ya-Fen; Riley, Barth; Diamond, Guy; Dennis, Michael L.

Drug Alcohol Depend ; 106(2-3): 92-100, 2010 Jan 15.

Article in English | MEDLINE | ID: mdl-19748746

ABSTRACT

Symptoms of internalizing disorders (depression, anxiety, somatic, trauma) are the major risk factors for suicide. Atypical suicide risk is characterized by people with few or no symptoms of internalizing disorders. OBJECTIVE: In persons screened at intake to alcohol or other drug (AOD) treatment, this research examined whether person fit statistics would support an atypical subtype at high risk for suicide that did not present with typical depression and other internalizing disorders. METHODS: Symptom profiles of the prototypical, typical, and atypical persons, as defined using fit statistics, were tested on 7408 persons entering AOD treatment using the Global Appraisal of Individual Needs (GAIN; Dennis et al., 2003a,b). RESULTS: Of those with suicide symptoms, the findings were as expected with the atypical group being higher on suicide and lower on symptoms of internalizing disorders. In addition, the atypical group was similar or lower on substance problems, symptoms of externalizing disorders, and crime and violence. CONCLUSIONS: Person fit statistics were useful in identifying persons with atypical suicide profiles and in enlightening aspects of existing theory concerning atypical suicidal ideation.

Subject(s)

Alcoholism/psychology , Mass Screening/statistics & numerical data , Personality Assessment/statistics & numerical data , Substance-Related Disorders/psychology , Suicide Prevention , Adolescent , Adult , Anxiety/epidemiology , Crime/statistics & numerical data , Depression/epidemiology , Female , Humans , Male , Racial Groups , Risk Factors , Severity of Illness Index , Suicide/statistics & numerical data , Violence/statistics & numerical data
