ABSTRACT
Gaussian processes, such as Brownian motion and the Ornstein-Uhlenbeck process, have been popular models for the evolution of quantitative traits and are widely used in phylogenetic comparative methods. However, they have drawbacks that limit their utility. Here we describe new, non-Gaussian stochastic differential equation (diffusion) models of quantitative trait evolution. We present general methods for deriving new diffusion models and develop new software for fitting non-Gaussian evolutionary models to trait data. The theory of stochastic processes provides a mathematical framework for understanding the properties of current and future phylogenetic comparative methods. Attention to the mathematical details of models of trait evolution and diversification may help avoid some pitfalls when using stochastic processes to model macroevolution.
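As an illustration of what a stochastic differential equation (diffusion) model of a trait looks like in practice, the following is a minimal sketch of Euler-Maruyama simulation along a single lineage. It is not the software described in the abstract; the function name `euler_maruyama` and all parameter names are hypothetical. Brownian motion and the Ornstein-Uhlenbeck process appear as special cases of the drift and diffusion functions, and a non-Gaussian model could be obtained by supplying a state-dependent diffusion term.

```python
import math
import random


def euler_maruyama(drift, diffusion, x0, t_total, n_steps, seed=0):
    """Simulate dX = drift(X) dt + diffusion(X) dW with the Euler-Maruyama scheme.

    All names here are illustrative assumptions, not taken from the paper.
    Returns the trait values at n_steps + 1 equally spaced time points.
    """
    rng = random.Random(seed)
    dt = t_total / n_steps
    sqrt_dt = math.sqrt(dt)
    x = x0
    path = [x]
    for _ in range(n_steps):
        dw = rng.gauss(0.0, sqrt_dt)  # Brownian increment over one step
        x = x + drift(x) * dt + diffusion(x) * dw
        path.append(x)
    return path


# Brownian motion: no drift, constant diffusion coefficient.
bm_path = euler_maruyama(lambda x: 0.0, lambda x: 1.0, 0.0, 1.0, 1000)

# Ornstein-Uhlenbeck: linear pull toward an optimum theta at rate alpha.
alpha, theta, sigma = 1.0, 1.0, 0.5
ou_path = euler_maruyama(
    lambda x: alpha * (theta - x), lambda x: sigma, 0.0, 1.0, 1000
)
```

With the diffusion term set to zero, the simulated Ornstein-Uhlenbeck path should follow the deterministic mean `theta + (x0 - theta) * exp(-alpha * t)` up to discretization error, which gives a simple sanity check on the step size.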
Subject(s)
Biological Evolution , Statistical Models , Phylogeny , Animals , Eutheria/classification , Longevity , Software , Stochastic Processes
ABSTRACT
Population abundance is fundamental in ecology and conservation biology, and provides essential information for predicting population dynamics and implementing conservation actions. While a range of approaches have been proposed to estimate population abundance from existing data, data deficiency is ubiquitous. When information is deficient, population estimation must rely on labor-intensive field surveys. Time is typically one of the critical constraints in conservation, and management decisions must often be made quickly in data-deficient situations. Hence, it is important to have a theoretical justification for survey methods that meet a required estimation precision. No such theory is available in a spatially explicit context, although spatial considerations are critical to any field survey. Here, we develop a spatially explicit theory for population estimation that allows us to examine the estimation precision under different survey designs and individual distribution patterns (e.g. random or clustered sampling, and random or clustered individual distribution). We demonstrate that clustered sampling decreases the estimation precision when individuals form clusters, while sampling designs do not affect the estimation accuracy when individuals are distributed randomly. Regardless of individual distribution, the estimation precision becomes higher with increasing total population abundance and sampled fraction. These insights provide a theoretical basis for efficient field survey designs in data-deficient situations.
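The dependence of precision on abundance and sampled fraction can be illustrated with a minimal sketch for the simplest case only. It assumes a fixed number of individuals placed independently and uniformly at random, so that each individual falls inside the surveyed area with probability equal to the sampled fraction. This is an assumption made here for illustration, not the spatially explicit theory of the paper, and all function names are hypothetical. Under this assumption the count is binomial, the estimator is the count divided by the sampled fraction, and its coefficient of variation is sqrt((1 - f) / (f N)).

```python
import math
import random


def estimate_abundance(count, fraction):
    """Scale the surveyed count up by the sampled area fraction."""
    return count / fraction


def theoretical_cv(n_total, fraction):
    """CV of the estimator when the count is Binomial(n_total, fraction).

    Assumes independent, uniformly random individual locations
    (an illustrative assumption, not the paper's general result).
    """
    return math.sqrt((1.0 - fraction) / (fraction * n_total))


def simulated_cv(n_total, fraction, n_reps=2000, seed=0):
    """Monte Carlo check of theoretical_cv under the same assumption."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_reps):
        # Each individual lands in the surveyed area with probability `fraction`.
        count = sum(1 for _ in range(n_total) if rng.random() < fraction)
        estimates.append(estimate_abundance(count, fraction))
    mean = sum(estimates) / n_reps
    var = sum((e - mean) ** 2 for e in estimates) / (n_reps - 1)
    return math.sqrt(var) / mean
```

In this simplified setting only the total sampled fraction enters the formula, not the placement of the surveyed units, which is consistent with the statement that sampling design does not matter when individuals are distributed randomly. The clustered case requires the spatial theory and is not covered by this sketch.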
Subject(s)
Population Density , Population Forecast/methods , Surveys and Questionnaires , Animals , Conservation of Natural Resources/methods , Conservation of Natural Resources/statistics & numerical data , Demography , Ecosystem , Humans , Statistical Models , Poisson Distribution , Population Dynamics , Research Design
ABSTRACT
We consider the classification of microarray gene-expression data. First, attention is given to the supervised case, where the tissue samples are classified with respect to a number of predefined classes and the intent is to assign a new unclassified tissue to one of these classes. The problems of forming a classifier and estimating its error rate are addressed in the context of there being a relatively small number of observations (tissue samples) compared to the number of variables (that is, the genes, which can number in the tens of thousands). We then proceed to the unsupervised case and consider the clustering of the tissue samples and also the clustering of the gene profiles. Both problems can be viewed as being non-standard ones in statistics and we address some of the key issues involved. The focus is on the use of mixture models to effect the clustering for both problems.
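To make the idea of clustering with a mixture model concrete, here is a minimal sketch of the EM algorithm for a two-component univariate normal mixture. It is a toy illustration only: real microarray data are high-dimensional and need the dimension-reduction and model choices discussed in the paper, and the function names, initialization, and variance floor below are assumptions made for this sketch.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_two_component_mixture(data, n_iter=100, var_floor=1e-6):
    """Fit a two-component normal mixture by EM and return hard cluster labels.

    Illustrative sketch: means start at the data extremes, and a variance
    floor guards against degenerate components. No log-space safeguards.
    """
    n = len(data)
    overall_mean = sum(data) / n
    overall_var = sum((x - overall_mean) ** 2 for x in data) / n
    weights = [0.5, 0.5]
    means = [min(data), max(data)]
    variances = [max(overall_var, var_floor)] * 2
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each observation.
        resp = []
        for x in data:
            dens = [weights[k] * normal_pdf(x, means[k], variances[k]) for k in range(2)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: update mixing proportions, means, and variances.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var_k = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk
            variances[k] = max(var_k, var_floor)
    # Assign each observation to the component with the larger posterior.
    labels = [0 if r[0] >= r[1] else 1 for r in resp]
    return weights, means, variances, labels


samples = [0.0, 0.1, -0.1, 0.2, -0.2, 5.0, 5.1, 4.9, 5.2, 4.8]
weights, means, variances, labels = fit_two_component_mixture(samples)
```

The posterior probabilities computed in the E-step are what make this a model-based clustering: each observation receives a probabilistic assignment, and the hard labels are obtained only at the end.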
Subject(s)
Gene Expression , Genomics/methods , Oligonucleotide Array Sequence Analysis/methods , Child , Cluster Analysis , Genetic Databases , Humans , Organ Specificity , Precursor Cell Lymphoblastic Leukemia-Lymphoma/metabolism , Transcriptome
ABSTRACT
With the use of finite mixture models for the clustering of a data set, the crucial question of how many clusters there are in the data can be addressed by testing for the smallest number of components in the mixture model compatible with the data. We investigate the performance of a resampling approach to this latter problem in the context of high-dimensional data, where the number of variables p is extremely large relative to the number of observations n. In order to be able to fit normal mixture models to such data, some form of dimension reduction has to be performed. This raises the question of whether a practically significant bias results if the bootstrapping is undertaken solely on the basis of the reduced dimensional form of the data, rather than using the full data from which to draw the bootstrap sample replications.
Subject(s)
Cluster Analysis , Factor Analysis , Statistical Models
ABSTRACT
Studies have shown that algorithms based on single-channel airflow records are effective in screening for sleep-disordered breathing (SDB). In this study, we investigate the diagnostic effectiveness of a classifier trained on a set of features derived from single-channel airflow measurements. The features considered are based on recurrence quantification analysis (RQA) of the measurement time series and are optionally augmented with single measurements of neck circumference and body mass index. The airflow measurement used is the nasal pressure (NP). The study used an overnight recording from each of 77 patients undergoing polysomnography (PSG) testing. Mixture discriminant analysis was used to obtain a classifier that predicts whether or not a measurement segment contains an SDB event. Patients were diagnosed as having SDB disease if the recording contained measurement segments predicted to include an SDB event at a rate exceeding a threshold value. A patient can be diagnosed as having SDB disease if the respiratory disturbance index (RDI), the rate of SDB events per hour of sleep, is ≥ 15 or sometimes ≥ 5. Here we trained and evaluated the classifier under each assumption, obtaining areas under the receiver operating characteristic curve of 0.96 and 0.93, respectively, using fivefold cross-validation. We used a two-layer structure to select the optimal operating point and to assess the resulting classifier, in order to avoid biased estimates. The resulting estimates for diagnostic sensitivity/specificity were 71.5%/89.5% for disease classification when RDI ≥ 15 and 63.3%/100% for RDI ≥ 5. These results were obtained assuming that the costs of misclassifying healthy and diseased subjects are equal, but we provide a framework for varying these costs. The results suggest that a classifier based on RQA features derived from NP measurements could be used in an automated SDB screening device.
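The patient-level decision rule described above can be sketched as follows. This is a minimal illustration under stated assumptions: each segment flagged by the classifier is counted as one predicted event, the rate is expressed per hour of recording, and the threshold is a free parameter (the abstract does not give the operating point that was selected). Function and parameter names are hypothetical.

```python
def predicted_event_rate(segment_flags, hours_recorded):
    """Number of segments flagged as containing an SDB event, per hour.

    Counting one event per flagged segment is an assumption of this sketch.
    """
    return sum(1 for flag in segment_flags if flag) / hours_recorded


def diagnose(segment_flags, hours_recorded, rate_threshold):
    """Label a patient as SDB-positive if the predicted rate exceeds the threshold."""
    return predicted_event_rate(segment_flags, hours_recorded) > rate_threshold


def sensitivity_specificity(predicted, actual):
    """Sensitivity and specificity of patient-level predictions against reference labels."""
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    tn = sum(1 for p, a in zip(predicted, actual) if not p and not a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical recording: 40 flagged segments out of 100 over 2 hours.
flags = [True] * 40 + [False] * 60
is_positive = diagnose(flags, hours_recorded=2.0, rate_threshold=15.0)
```

Varying `rate_threshold` traces out different sensitivity/specificity trade-offs, which is how an operating point would be chosen once misclassification costs are specified.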
Subject(s)
Nonlinear Dynamics , Automated Pattern Recognition/methods , Polysomnography/methods , Sleep Apnea Syndromes/diagnosis , Algorithms , Body Mass Index , Female , Humans , Male , Multivariate Analysis , Neck , Nose , Pressure , Pulmonary Ventilation , ROC Curve , Reproducibility of Results , Respiration , Sleep Apnea Syndromes/physiopathology
ABSTRACT
Measurements of multiple physiological signals are required for diagnostic procedures such as those for sleep-disordered breathing. The accuracy of automated diagnostic procedures and home-based screening methods can be affected when physiological measurements contain artifacts or signal losses. We investigate predicting one physiological signal from others, using the dependencies that exist among physiological signals, in order to obtain a measure of the reliability of the measurements. These relationships are modeled with artificial neural networks. We conclude that, via such cross-prediction tasks, it is possible to identify and correct both artifacts and signal losses in these measurements.
Subject(s)
Biological Models , Computer Neural Networks , Polysomnography/methods , Computer-Assisted Signal Processing , Sleep Apnea Syndromes/physiopathology , Humans , Sleep Apnea Syndromes/diagnosis
ABSTRACT
Electroencephalography (EEG) is a core measurement in overnight sleep studies. In this paper we study functional asymmetries of the brain as manifested through the spectral correlation coefficient. Our target group is patients symptomatic of sleep apnea and referred for routine polysomnography (PSG) testing at the hospital. We measured EEG data (using electrodes C4/A1 and C3/A2 of the International 10/20 System) as part of the routine PSG test. Spectral correlation coefficients were computed between EEG data from the two hemispheres for each frequency band of interest: delta, theta, alpha, and beta. Our results indicate that hemispheric correlation changes distinctly with the gross sleep type (REM/NREM) as well as with the different sleep stages (stages 1-4) within NREM. It also varies in the presence of arousal events and apnea. These results may provide a basis for novel insights into the functional asymmetries of the brain in sleep and in sleep-associated events such as arousals and apnea.
Subject(s)
Brain/physiopathology , Electroencephalography , Polysomnography , Sleep Apnea Syndromes/physiopathology , Sleep Stages , Arousal , Female , Humans , Lethargy/physiopathology , Male , Snoring/physiopathology
ABSTRACT
Obstructive Sleep Apnea (OSA) is a serious disease caused by the collapse of the upper airways during sleep. The present method of measuring the severity of OSA is the Apnea Hypopnea Index (AHI). The AHI is defined as the average number of obstructive events (apnea and hypopnea, OAH events) per hour over the total sleep period. The number of OAH events occurring during each hour of sleep is a random variable with an unknown probability density function. Thus the AHI alone is insufficient to describe the true nature of the disease. We propose a new measure, the Dynamic Apnea Hypopnea Index Time Series (DAHI), which captures the temporal density of apnea events over shorter time intervals, and we use its higher moments to obtain a dynamic characterization of OSA.
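The contrast between a single whole-night average and a windowed index can be sketched as follows. The abstract does not specify the window length or the exact definition of the DAHI, so the 30-minute window, the per-hour scaling, and the function names below are assumptions made for illustration only.

```python
import math


def ahi(event_times_min, total_sleep_min):
    """Whole-night index: events per hour of sleep."""
    return len(event_times_min) * 60.0 / total_sleep_min


def windowed_index(event_times_min, total_sleep_min, window_min=30):
    """Events per hour in consecutive non-overlapping windows.

    Window length is an assumed parameter; a trailing partial window is dropped.
    """
    n_windows = int(total_sleep_min // window_min)
    counts = [0] * n_windows
    for t in event_times_min:
        idx = int(t // window_min)
        if 0 <= idx < n_windows:
            counts[idx] += 1
    return [c * 60.0 / window_min for c in counts]


def moments(series):
    """Mean, variance, skewness, and excess kurtosis of a series."""
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series) / n
    if var == 0.0:
        return mean, 0.0, 0.0, 0.0
    sd = math.sqrt(var)
    skew = sum(((x - mean) / sd) ** 3 for x in series) / n
    kurt = sum(((x - mean) / sd) ** 4 for x in series) / n - 3.0
    return mean, var, skew, kurt


# Hypothetical night: four events clustered early, one later, 120 minutes of sleep.
events = [1, 2, 3, 4, 70]
whole_night = ahi(events, 120)
series = windowed_index(events, 120, window_min=30)
mean, var, skew, kurt = moments(series)
```

In this example the mean of the windowed series equals the whole-night AHI, while the variance and higher moments reflect how unevenly the events are distributed across the night, which is the information a single average discards.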