1.
Genet Sel Evol ; 56(1): 31, 2024 Apr 29.
Article in English | MEDLINE | ID: mdl-38684971

ABSTRACT

BACKGROUND: Metabolic disturbances adversely impact productive and reproductive performance of dairy cattle due to changes in endocrine status and immune function, which increase the risk of disease. This may occur in the post-partum phase, but also throughout lactation, with sub-clinical symptoms. Recently, increased attention has been directed towards improved health and resilience in dairy cattle, and genomic selection (GS) could be a helpful tool for selecting animals that are more resilient to metabolic disturbances throughout lactation. Hence, we evaluated the genomic prediction of serum biomarker levels for metabolic distress in 1353 Holsteins genotyped with the 100K single nucleotide polymorphism (SNP) chip assay. The GS was evaluated using three parametric models, genomic best linear unbiased prediction (GBLUP), Bayesian B (BayesB) and elastic net (ENET), and two nonparametric models, gradient boosting machine (GBM) and a stacking ensemble (Stack) that combines the ENET and GBM approaches. RESULTS: The results show that the Stack approach outperformed other methods with a relative difference (RD), calculated as an increment in prediction accuracy, of approximately 18.0% compared to GBLUP, 12.6% compared to BayesB, 8.7% compared to ENET, and 4.4% compared to GBM. The highest RD in prediction accuracy between other models with respect to GBLUP was observed for haptoglobin (hapto) from 17.7% for BayesB to 41.2% for Stack; for Zn from 9.8% (BayesB) to 29.3% (Stack); for ceruloplasmin (CuCp) from 9.3% (BayesB) to 27.9% (Stack); for ferric reducing antioxidant power (FRAP) from 8.0% (BayesB) to 40.0% (Stack); and for total protein (PROTt) from 5.7% (BayesB) to 22.9% (Stack). Using a subset of top SNPs (1.5k) selected from the GBM approach improved the accuracy for GBLUP from 1.8 to 76.5%. However, for the other models reductions in prediction accuracy of 4.8% for ENET (average of 10 traits), 5.9% for GBM (average of 21 traits), and 6.6% for Stack (average of 16 traits) were observed.
CONCLUSIONS: Our results indicate that the Stack approach was more accurate in predicting metabolic disturbances than GBLUP, BayesB, ENET, and GBM and seemed to be competitive for predicting complex phenotypes with various degrees of mode of inheritance, i.e. additive and non-additive effects. Selecting markers based on GBM improved accuracy of GBLUP.
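
The abstract above reports a relative difference (RD) "calculated as an increment in prediction accuracy" but does not give the formula. A minimal sketch, assuming RD is the percentage gain of one model's accuracy over a baseline's; the function name and the accuracy values are hypothetical and are not taken from the study:

```python
def relative_difference(acc_model: float, acc_baseline: float) -> float:
    """Percentage increment of acc_model over acc_baseline.

    Assumed definition: 100 * (model - baseline) / baseline.
    """
    if acc_baseline == 0:
        raise ValueError("baseline accuracy must be non-zero")
    return 100.0 * (acc_model - acc_baseline) / acc_baseline


# Hypothetical accuracies, chosen only for illustration.
gblup_acc = 0.25
stack_acc = 0.30
rd = relative_difference(stack_acc, gblup_acc)  # roughly a 20% increment
```

Under this assumed definition, a negative RD would correspond to the reductions in accuracy that the abstract reports for some models after SNP preselection.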


Subjects
Biomarkers , Genetic Models , Single Nucleotide Polymorphism , Animals , Cattle/genetics , Biomarkers/blood , Cattle Diseases/genetics , Cattle Diseases/blood , Bayes Theorem , Female , Metabolic Diseases/genetics , Metabolic Diseases/veterinary , Metabolic Diseases/blood , Genomics/methods
2.
Genet Sel Evol ; 54(1): 78, 2022 Dec 02.
Article in English | MEDLINE | ID: mdl-36460973

ABSTRACT

BACKGROUND: Selection schemes distort inference when estimating differences between treatments or genetic associations between traits, and may degrade prediction of outcomes, e.g., the expected performance of the progeny of an individual with a certain genotype. If input and output measurements are not collected on random samples, inferences and predictions must be biased to some degree. Our paper revisits inference in quantitative genetics when using samples stemming from some selection process. The approach used integrates the classical notion of fitness with that of missing data. Treatment is fully Bayesian, with inference and prediction dealt with in a unified manner. While focus is on animal and plant breeding, concepts apply to natural selection as well. Examples based on real data and stylized models illustrate how selection can be accounted for in four different situations, and sometimes without success. RESULTS: Our flexible "soft selection" setting helps to diagnose the extent to which selection can be ignored. The clear connection between probability of missingness and the concept of fitness in stylized selection scenarios is highlighted. It is not realistic to assume that a fixed selection threshold t holds in conceptual replication, as the chance of selection depends on observed and unobserved data, and on unequal amounts of information over individuals, aspects that a "soft" selection representation addresses explicitly. There does not seem to be a general prescription to accommodate potential distortions due to selection. In structures that combine cross-sectional, longitudinal and multi-trait data such as in animal breeding, balance is the exception rather than the rule. The Bayesian approach provides an integrated answer to inference, prediction and model choice under selection that goes beyond the likelihood-based approach, where breeding values are inferred indirectly.
CONCLUSIONS: The approach used here for inference and prediction under selection may or may not yield the best possible answers. One may believe that selection has been accounted for diligently, but the central problem of whether statistical inferences are good or bad does not have an unambiguous solution. On the other hand, the quality of predictions can be gauged empirically via appropriate training-testing of competing methods.


Subjects
Genomics , Animals , Bayes Theorem , Cross-Sectional Studies , Likelihood Functions , Phenotype
3.
J Anim Breed Genet ; 139(3): 247-258, 2022 May.
Article in English | MEDLINE | ID: mdl-34931377

ABSTRACT

Single-step GBLUP (ssGBLUP) for genomic prediction was proposed in 2009. Many studies have investigated ssGBLUP in genomic selection in animals and plants using a standard linear kernel (similarity matrix) called the genomic relationship matrix (G). More general kernels should allow capturing non-additive effects as well, whereas GBLUP is based on additive gene action. In this study, we generalized ssGBLUP to accommodate two non-linear kernels, the averaged Gaussian kernel (AK) and the recently developed arc-cosine deep kernel (DK). We evaluated the methodology using body weight (BW) and hen-housing production (HHP) traits, recorded on a sample of phenotyped and genotyped commercial broiler chickens. There were, thus, different ssGBLUP models corresponding to G, AK and DK. We used random replication of training (TRN) and testing (TST) layouts at different genotyping rates (20%, 40%, 60% and 80% of all birds) in three selective genotyping scenarios. The selections were genotyping the youngest individuals in the pedigree (YS), random genotyping (RS) and genotyping based on parent average (PA). Predictive abilities were measured using rank correlations between the observed and the predicted phenotypic values in TST for each random partition. Prediction accuracy was influenced by the type of kernel when a large proportion of birds was genotyped. An advantage of non-linear kernels (AK and DK) was more apparent when 60 and 80% of birds had been genotyped. For BW, the lowest rank correlations were obtained with G (0.093 ± 0.015 using RS with 20% genotyped individuals) and the highest values with DK (0.320 ± 0.016 in the PA setting with 80% genotyped individuals). For HHP, the lowest and highest rank correlations were obtained by AK with 20% and 80% genotyped individuals, 0.071 ± 0.016 (in RS) and 0.23 ± 0.016 (in PA) respectively. Our results indicated that AK and DK are more effective than G when a large proportion of the target population is genotyped.
Our expectation is that ssGBLUP with AK or DK models can perform even better than G when non-additive genetic effects influence the underlying variability of complex traits.
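
The predictive-ability measure described in the abstract above (rank correlation between observed and predicted phenotypes in the testing set of each random partition) can be sketched as follows. This is an illustrative outline only: the function names, the tie handling by average ranks and the partitioning helper are assumptions, and the abstract does not say which rank correlation or software was used.

```python
import random


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if x or y is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(observed, predicted):
    """Spearman rank correlation computed as Pearson correlation of ranks."""
    return pearson(average_ranks(observed), average_ranks(predicted))


def random_partition(n, test_fraction, rng):
    """Split indices 0..n-1 into (training, testing) lists at random."""
    idx = list(range(n))
    rng.shuffle(idx)
    n_test = max(1, int(round(n * test_fraction)))
    return idx[n_test:], idx[:n_test]
```

In a replicated layout, one would call `random_partition` once per replicate, fit the model on the training indices, and summarize `spearman` over replicates as a mean with its standard error.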


Subjects
Chickens , Genetic Models , Animals , Chickens/genetics , Female , Genome , Genotype , Pedigree , Phenotype
4.
Theor Appl Genet ; 134(9): 3069-3081, 2021 Sep.
Article in English | MEDLINE | ID: mdl-34117908

ABSTRACT

KEY MESSAGE: Model training on data from all selection cycles yielded the highest prediction accuracy by attenuating specific effects of individual cycles. Expected reliability was a robust predictor of accuracies obtained with different calibration sets. The transition from phenotypic to genome-based selection requires a profound understanding of factors that determine genomic prediction accuracy. We analysed experimental data from a commercial maize breeding programme to investigate if genomic measures can assist in identifying optimal calibration sets for model training. The data set consisted of six contiguous selection cycles comprising testcrosses of 5968 doubled haploid lines genotyped with a minimum of 12,000 SNP markers. We evaluated genomic prediction accuracies in two independent prediction sets in combination with calibration sets differing in sample size and genomic measures (effective sample size, average maximum kinship, expected reliability, number of common polymorphic SNPs and linkage phase similarity). Our results indicate that across selection cycles prediction accuracies were as high as 0.57 for grain dry matter yield and 0.76 for grain dry matter content. Including data from all selection cycles in model training yielded the best results because interactions between calibration and prediction sets as well as the effects of different testers and specific years were attenuated. Among genomic measures, the expected reliability of genomic breeding values was the best predictor of empirical accuracies obtained with different calibration sets. For grain yield, a large difference between expected and empirical reliability was observed in one prediction set. We propose to use this difference as guidance for determining the weight phenotypic data of a given selection cycle should receive in model retraining and for selection when both genomic breeding values and phenotypes are available.


Subjects
Plant Chromosomes/genetics , Plant Genome , Phenotype , Plant Breeding/methods , Single Nucleotide Polymorphism , Zea mays/growth & development , Zea mays/genetics , Chromosome Mapping/methods , Quantitative Trait Loci
5.
Theor Popul Biol ; 132: 47-59, 2020 04.
Article in English | MEDLINE | ID: mdl-31830483

ABSTRACT

Modeling covariance structure based on genetic similarity between pairs of relatives plays an important role in evolutionary, quantitative and statistical genetics. Historically, genetic similarity between individuals has been quantified from pedigrees via the probability that randomly chosen homologous alleles between individuals are identical by descent (IBD). At present, however, many genetic analyses rely on molecular markers, with realized measures of genomic similarity replacing IBD-based expected similarities. Animal and plant breeders, for example, now employ marker-based genomic relationship matrices between individuals in prediction models and in estimation of genome-based heritability coefficients. Phenotypes convey information about genetic similarity as well. For instance, if phenotypic values are at least partially the result of the action of quantitative trait loci, one would expect the former to inform about the latter, as in genome-wide association studies. Statistically, a non-trivial conditional distribution of unknown genetic similarities, given phenotypes, is to be expected. A Bayesian formalism is presented here that applies to whole-genome regression methods where some genetic similarity matrix, e.g., a genomic relationship matrix, can be defined. Our Bayesian approach, based on phenotypes and markers, converts prior (markers only) expected similarity into trait-specific posterior similarity. A simulation illustrates situations under which effective Bayesian learning from phenotypes occurs. Pinus and wheat data sets were used to demonstrate applicability of the concept in practice. The methodology applies to a wide class of Bayesian linear regression models, extends to the multiple-trait domain, and can also be used to develop phenotype-guided similarity kernels in prediction problems.


Subjects
Genome-Wide Association Study , Genetic Models , Quantitative Trait Loci , Bayes Theorem , Genotype , Phenotype , Pinus/genetics , Single Nucleotide Polymorphism , Triticum/genetics
6.
Heredity (Edinb) ; 124(5): 658-674, 2020 05.
Article in English | MEDLINE | ID: mdl-32127659

ABSTRACT

This study evaluated the use of multiomics data for classification accuracy of rheumatoid arthritis (RA). Three approaches were used and compared in terms of prediction accuracy: (1) whole-genome prediction (WGP) using SNP marker information only, (2) whole-methylome prediction (WMP) using methylation profiles only, and (3) whole-genome/methylome prediction (WGMP) combining both omics layers. The number of SNPs and methylation sites varied in each scenario, with either 1, 10, or 50% of these preselected based on four approaches: randomly, evenly spaced, lowest p value (genome-wide association or epigenome-wide association study), and estimated effect size using a Bayesian ridge regression (BRR) model. To remove effects of high levels of pairwise linkage disequilibrium (LD), SNPs were also preselected with an LD-pruning method. Five Bayesian regression models were studied for classification, including BRR, Bayes-A, Bayes-B, Bayes-C, and the Bayesian LASSO. Adjusting methylation profiles for cellular heterogeneity within whole blood samples had a detrimental effect on the classification ability of the models. Overall, WGMP using the Bayes-B model had the best performance. In particular, selecting SNPs based on LD-pruning with 1% of the methylation sites selected based on BRR included in the model, and fitting the most significant SNP as a fixed effect was the best method for predicting disease risk with a classification accuracy of 0.975. Our results showed that multiomics data can be used to effectively predict the risk of RA and identify cases in early stages to prevent or alter disease progression via appropriate interventions.


Subjects
Rheumatoid Arthritis , DNA Methylation , Genome-Wide Association Study , Single Nucleotide Polymorphism , Rheumatoid Arthritis/genetics , Bayes Theorem , Humans
7.
Genet Sel Evol ; 52(1): 12, 2020 Feb 24.
Article in English | MEDLINE | ID: mdl-32093611

ABSTRACT

BACKGROUND: Transforming large amounts of genomic data into valuable knowledge for predicting complex traits has been an important challenge for animal and plant breeders. Prediction of complex traits has not escaped the current excitement about machine learning, including interest in deep learning algorithms such as multilayer perceptrons (MLP) and convolutional neural networks (CNN). The aim of this study was to compare the predictive performance of two deep learning methods (MLP and CNN), two ensemble learning methods [random forests (RF) and gradient boosting (GB)], and two parametric methods [genomic best linear unbiased prediction (GBLUP) and Bayes B] using real and simulated datasets. METHODS: The real dataset consisted of 11,790 Holstein bulls with sire conception rate (SCR) records and genotyped for 58k single nucleotide polymorphisms (SNPs). To support the evaluation of deep learning methods, various simulation studies were conducted using the observed genotype data as template, assuming a heritability of 0.30 with either additive or non-additive gene effects, and two different numbers of quantitative trait nucleotides (100 and 1000). RESULTS: In the bull dataset, the best predictive correlation was obtained with GB (0.36), followed by Bayes B (0.34), GBLUP (0.33), RF (0.32), CNN (0.29) and MLP (0.26). The same trend was observed when using mean squared error of prediction. The simulation indicated that when gene action was purely additive, parametric methods outperformed other methods. When the gene action was a combination of additive, dominance and two-locus epistasis, the best predictive ability was obtained with gradient boosting, and the superiority of deep learning over the parametric methods depended on the number of loci controlling the trait and on sample size.
In fact, with a large dataset including 80k individuals, the predictive performance of deep learning methods was similar or slightly better than that of parametric methods for traits with non-additive gene action. CONCLUSIONS: For prediction of traits with non-additive gene action, gradient boosting was a robust method. Deep learning approaches were not better for genomic prediction unless non-additive variance was sizable.


Subjects
Cattle/genetics , Deep Learning , Genomics , Animals , Bayes Theorem , Genotype , Genetic Models , Multifactorial Inheritance , Phenotype , Single Nucleotide Polymorphism , Heritable Quantitative Trait
8.
Theor Appl Genet ; 132(5): 1587-1606, 2019 May.
Article in English | MEDLINE | ID: mdl-30747261

ABSTRACT

KEY MESSAGE: Current genome-enabled prediction models assume normally distributed errors, which makes them sensitive to outliers. We propose a model with errors assumed to follow a Laplace distribution to deal better with outliers. Current genome-enabled prediction models use regressions that fit the expected value (mean) of a response variable with errors assumed to be normally distributed, an assumption that is often sensitive to outliers, either genetic or environmental. For this reason, we propose a robust Bayesian genome median regression (BGMR) model that fits regressions to the medians of a distribution, with errors assumed to follow a Laplace distribution to deal better with outliers. The BGMR model was evaluated under a Bayesian framework with Markov Chain Monte Carlo sampling using a location-scale mixture representation of the Laplace distribution. The BGMR was implemented with two simulated and two real genomic data sets, and we compared its prediction performance with that of a conventional genomic best linear unbiased prediction (GBLUP) model and the Laplace maximum a posteriori (LMAP) method. The prediction accuracies of BGMR were higher than those of the GBLUP and LMAP methods when there were outliers. The BGMR model could be useful to breeders who need to predict and select genotypes based on data with unknown outliers.


Subjects
Breeding , Plant Genome , Theoretical Models , Plants/genetics , Bayes Theorem , Computer Simulation , Markov Chains , Monte Carlo Method , Regression Analysis
9.
J Anim Breed Genet ; 136(2): 113-117, 2019 Mar.
Article in English | MEDLINE | ID: mdl-30614572

ABSTRACT

A curious result from mixed linear models applied to genome-wide association studies was expanded. In particular, a model in which one or more markers are considered as fixed but are allowed to contribute to the covariance structure by treating such markers as random as well was examined. The best linear unbiased estimator of marker effects is invariant with respect to whether those markers are employed in constructing a genomic relationship matrix or are ignored, provided marker effects are uncorrelated with those not being tested. Also, the implications of regarding some marker effects as fixed when, in fact, these possess a non-trivial covariance structure with those declared as random were examined.


Subjects
Genome-Wide Association Study/statistics & numerical data , Linear Models , Genetic Models , Statistical Models , Animals , Breeding , Genome/genetics , Genomics , Single Nucleotide Polymorphism
10.
PLoS Genet ; 11(5): e1005048, 2015 May.
Article in English | MEDLINE | ID: mdl-25942577

ABSTRACT

Whole-genome regression methods are being increasingly used for the analysis and prediction of complex traits and diseases. In human genetics, these methods are commonly used for inferences about genetic parameters, such as the amount of genetic variance among individuals or the proportion of phenotypic variance that can be explained by regression on molecular markers. This is so even though some of the assumptions commonly adopted for data analysis are at odds with important quantitative genetic concepts. In this article we develop theory that leads to a precise definition of parameters arising in high dimensional genomic regressions; we focus on the so-called genomic heritability: the proportion of variance of a trait that can be explained (in the population) by a linear regression on a set of markers. We propose a definition of this parameter that is framed within the classical quantitative genetics theory and show that the genomic heritability and the trait heritability parameters are equal only when all causal variants are typed. Further, we discuss how the genomic variance and genomic heritability, defined as quantitative genetic parameters, relate to parameters of statistical models commonly used for inferences, and indicate potential inferential problems that are assessed further using simulations. When a large proportion of the markers used in the analysis are in linkage equilibrium (LE) with quantitative trait loci (QTL), the likelihood function can be misspecified. This can induce a sizable finite-sample bias and, possibly, lack of consistency of likelihood (or Bayesian) estimates. This situation can be encountered if the individuals in the sample are distantly related and linkage disequilibrium spans over short regions. This bias does not negate the use of whole-genome regression models as predictive machines; however, our results indicate that caution is needed when using marker-based regressions for inferences about population parameters such as the genomic heritability.


Subjects
Genomics/methods , Genetic Models , Heritable Quantitative Trait , Bayes Theorem , Genetic Markers , Humans , Likelihood Functions , Linear Models , Linkage Disequilibrium , Statistical Models , Quantitative Trait Loci
12.
Genet Sel Evol ; 49(1): 16, 2017 02 01.
Article in English | MEDLINE | ID: mdl-28148241

ABSTRACT

BACKGROUND: Genomic selection has been successfully implemented in plant and animal breeding programs to shorten generation intervals and accelerate genetic progress per unit of time. In practice, genomic selection can be used to improve several correlated traits simultaneously via multiple-trait prediction, which exploits correlations between traits. However, few studies have explored multiple-trait genomic selection. Our aim was to infer genetic correlations between three traits measured in broiler chickens by exploring kinship matrices based on a linear combination of measures of pedigree and marker-based relatedness. A predictive assessment was used to gauge genetic correlations. METHODS: A multivariate genomic best linear unbiased prediction model was designed to combine information from pedigree and genome-wide markers in order to assess genetic correlations between three complex traits in chickens, i.e. body weight at 35 days of age (BW), ultrasound area of breast meat (BM) and hen-house egg production (HHP). A dataset with 1351 birds that were genotyped with the 600 K Affymetrix platform was used. A kinship kernel (K) was constructed as K = λ G + (1 - λ)A, where A is the numerator relationship matrix, measuring pedigree-based relatedness, and G is a genomic relationship matrix. The weight (λ) assigned to each source of information varied over the grid λ = (0, 0.2, 0.4, 0.6, 0.8, 1). Maximum likelihood estimates of heritability and genetic correlations were obtained at each λ, and the "optimum" λ was determined using cross-validation. RESULTS: Estimates of genetic correlations were affected by the weight placed on the source of information used to build K. For example, the genetic correlation between BW-HHP and BM-HHP changed markedly when λ varied from 0 (only A used for measuring relatedness) to 1 (only genomic information used). 
As λ increased, predictive correlations (correlation between observed phenotypes and predicted breeding values) increased and mean-squared predictive error decreased. However, the improvement in predictive ability was not monotonic, with an optimum found at some 0 < λ < 1, i.e., when both sources of information were used together. CONCLUSIONS: Our findings indicate that multiple-trait prediction may benefit from combining pedigree and marker information. Also, it appeared that expected correlated responses to selection computed from standard theory may differ from realized responses. The predictive assessment provided a metric for performance evaluation as well as a means for expressing uncertainty of outcomes of multiple-trait selection.
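
The kinship kernel in the abstract above, K = λG + (1 - λ)A evaluated over the grid λ = (0, 0.2, 0.4, 0.6, 0.8, 1), can be written down directly. A minimal sketch with matrices held as nested lists; the function name and the 2 x 2 toy matrices are hypothetical, and the model fitting and cross-validation that the study used to pick the "optimum" λ are not reproduced here:

```python
def blend_kinship(G, A, lam):
    """Return K = lam * G + (1 - lam) * A, element by element."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    if len(G) != len(A) or any(len(g) != len(a) for g, a in zip(G, A)):
        raise ValueError("G and A must have the same shape")
    return [
        [lam * g + (1.0 - lam) * a for g, a in zip(g_row, a_row)]
        for g_row, a_row in zip(G, A)
    ]


# Grid of weights quoted in the abstract.
LAMBDA_GRID = (0.0, 0.2, 0.4, 0.6, 0.8, 1.0)

# Toy matrices for two individuals (illustrative values only).
A = [[1.0, 0.5], [0.5, 1.0]]      # pedigree-based relationships
G = [[1.02, 0.46], [0.46, 0.98]]  # marker-based relationships

# One blended kernel per grid point; lam = 0 gives A, lam = 1 gives G.
kernels = {lam: blend_kinship(G, A, lam) for lam in LAMBDA_GRID}
```

Each kernel in `kernels` would then be passed to whatever mixed-model software is in use, and the λ giving the best cross-validated predictive ability retained.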


Subjects
Chickens/genetics , Genetic Association Studies , Genetic Markers , Quantitative Trait Loci , Heritable Quantitative Trait , Animals , Body Weight/genetics , Genome-Wide Association Study , Genotype , Genetic Models , Phenotype
13.
Proc Natl Acad Sci U S A ; 111(51): 18167-72, 2014 Dec 23.
Article in English | MEDLINE | ID: mdl-25489098

ABSTRACT

We study the uniaxial compressive behavior of disordered colloidal free-standing micropillars composed of a bidisperse mixture of 3- and 6-µm polystyrene particles. Mechanical annealing of confined pillars enables variation of the packing fraction across the phase space of colloidal glasses. The measured normalized strengths and elastic moduli of the annealed freestanding micropillars span almost three orders of magnitude despite similar plastic morphology governed by shear banding. We measure a robust correlation between ultimate strengths and elastic constants that is invariant to relative humidity, implying a critical strain of ∼0.01 that is strikingly similar to that observed in metallic glasses (MGs) [Johnson WL, Samwer K (2005) Phys Rev Lett 95:195501] and suggestive of a universal mode of cooperative plastic deformation. We estimate the characteristic strain of the underlying cooperative plastic event by considering the energy necessary to create an Eshelby-like ellipsoidal inclusion in an elastic matrix. We find that the characteristic strain is similar to that found in experiments and simulations of other disordered solids with distinct bonding and particle sizes, suggesting a universal criterion for the elastic to plastic transition in glassy materials with the capacity for finite plastic flow.

14.
J Dairy Sci ; 100(1): 453-464, 2017 Jan.
Article in English | MEDLINE | ID: mdl-27889124

ABSTRACT

Since the introduction of genome-enabled prediction for dairy cattle in 2009, genomic selection has markedly changed many aspects of the dairy genetics industry and enhanced the rate of response to selection for most economically important traits. Young dairy bulls are genotyped to obtain their genomic predicted transmitting ability (GPTA) and reliability (REL) values. These GPTA are a main factor in most purchasing, marketing, and culling decisions until bulls reach 5 yr of age and their milk-recorded offspring become available. At that time, daughter yield deviations (DYD) can be compared with the GPTA computed several years earlier. For most bulls, the DYD align well with the initial predictions. However, for some bulls, the difference between DYD and corresponding GPTA is quite large, and published REL are of limited value in identifying such bulls. A method of bootstrap aggregation sampling (bagging) using genomic BLUP (GBLUP) was applied to predict the GPTA of 2,963, 2,963, and 2,803 young Holstein bulls for protein yield, somatic cell score, and daughter pregnancy rate (DPR), respectively. For each trait, 50 bootstrap samples from a reference population comprising 2011 DYD of 8,610, 8,405, and 7,945 older Holstein bulls were used. Leave-one-out cross validation was also performed to assess prediction accuracy when removing specific bulls from the reference population. The main objectives of this study were (1) to assess the extent to which current REL values and alternative measures of variability, such as the bootstrap standard deviation (SD) of predictions, could detect bulls whose daughter performance deviates significantly from early genomic predictions, and (2) to identify factors associated with the reference population that inform about inaccurate genomic predictions. The SD of bootstrap predictions was a mildly useful metric for identifying bulls whose future daughter performance may deviate significantly from early GPTA for protein and DPR. 
Leave-one-out cross validation allowed us to identify groups of reference population bulls that were influential on other reference population bulls for protein yield and observe their effects on predictions of testing set bulls, as a whole and individually.
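
The bagging idea in the abstract above (refit the predictor on bootstrap samples of the reference population, then use the spread of predictions for each young bull as a measure of variability) can be sketched as below. This is an outline under stated assumptions, not the study's implementation: ridge regression on marker codes stands in for GBLUP, phenotypes are assumed to be centered so no intercept is fitted, and all function names, the ridge value and the simulated data are hypothetical.

```python
import random
import statistics


def solve_linear_system(M, b):
    """Solve M x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x


def fit_ridge(X, y, ridge):
    """Marker effects from (X'X + ridge * I) b = X'y, without an intercept."""
    p = len(X[0])
    xtx = [[sum(row[j] * row[k] for row in X) + (ridge if j == k else 0.0)
            for k in range(p)] for j in range(p)]
    xty = [sum(row[j] * yi for row, yi in zip(X, y)) for j in range(p)]
    return solve_linear_system(xtx, xty)


def predict(X, effects):
    """Predicted value for each row of X given marker effects."""
    return [sum(x * b for x, b in zip(row, effects)) for row in X]


def bagged_predictions(X_ref, y_ref, X_young, n_boot=50, ridge=1.0, seed=0):
    """Mean and SD of predictions over bootstrap samples of the reference set."""
    rng = random.Random(seed)
    n = len(y_ref)
    per_sample = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        effects = fit_ridge([X_ref[i] for i in idx],
                            [y_ref[i] for i in idx], ridge)
        per_sample.append(predict(X_young, effects))
    means = [statistics.mean(col) for col in zip(*per_sample)]
    sds = [statistics.stdev(col) for col in zip(*per_sample)]
    return means, sds


# Simulated toy data: 30 reference animals, 4 young animals, 5 markers.
rng = random.Random(1)
true_effects = [0.5, -0.3, 0.0, 0.2, 0.0]
X_ref = [[rng.choice((0, 1, 2)) for _ in range(5)] for _ in range(30)]
y_ref = [sum(x * b for x, b in zip(row, true_effects)) + rng.gauss(0, 1)
         for row in X_ref]
X_young = [[rng.choice((0, 1, 2)) for _ in range(5)] for _ in range(4)]
means, sds = bagged_predictions(X_ref, y_ref, X_young)
```

Animals with a large bootstrap SD relative to the others are the ones whose prediction depends most on which reference animals happen to be sampled, which is the signal the abstract describes as "mildly useful".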


Subjects
Breeding , Genome , Animals , Cattle , Female , Genomics , Genotype , Male , Genetic Models , Phenotype , Single Nucleotide Polymorphism , Reproducibility of Results
15.
BMC Genomics ; 17: 208, 2016 Mar 09.
Article in English | MEDLINE | ID: mdl-26956885

ABSTRACT

BACKGROUND: Multi-layer perceptron (MLP) and radial basis function neural networks (RBFNN) have been shown to be effective in genome-enabled prediction. Here, we evaluated and compared the classification performance of an MLP classifier versus that of a probabilistic neural network (PNN), to predict the probability of membership of one individual in a phenotypic class of interest, using genomic and phenotypic data as input variables. We used 16 maize and 17 wheat genomic and phenotypic datasets with different trait-environment combinations (sample sizes ranged from 290 to 300 individuals) with 1.4 k and 55 k SNP chips. Classifiers were tested using continuous traits that were categorized into three classes (upper, middle and lower) based on the empirical distribution of each trait, constructed on the basis of two percentiles (15-85 % and 30-70 %). We focused on the 15 and 30 % percentiles for the upper and lower classes for selecting the best individuals, as commonly done in genomic selection. Wheat datasets were also used with two classes. The criteria for assessing the predictive accuracy of the two classifiers were the area under the receiver operating characteristic curve (AUC) and the area under the precision-recall curve (AUCpr). Parameters of both classifiers were estimated by optimizing the AUC for a specific class of interest. RESULTS: The AUC and AUCpr criteria provided enough evidence to conclude that PNN was more accurate than MLP for assigning maize and wheat lines to the correct upper, middle or lower class for the complex traits analyzed. Results for the wheat datasets with continuous traits split into two and three classes showed that the performance of PNN with three classes was higher than with two classes when classifying individuals into the upper and lower (15 or 30 %) categories. 
CONCLUSIONS: The PNN classifier outperformed the MLP classifier in all 33 (maize and wheat) datasets when using AUC and AUCpr for selecting individuals of a specific class. Use of PNN with Gaussian radial basis functions seems promising in genomic selection for identifying the best individuals. Categorizing continuous traits into three classes generally provided better classification than when using two classes, because classification accuracy improved when classes were balanced.
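
The categorization step in the abstract above (a continuous trait split into lower, middle and upper classes at two percentiles of its empirical distribution, 15-85 % or 30-70 %) can be illustrated as follows. A minimal sketch: the function names, the linear-interpolation percentile and the choice to include boundary values in the outer classes are assumptions not stated in the abstract.

```python
def percentile(sorted_values, pct):
    """Linear-interpolation percentile of an already sorted list (pct in 0..100)."""
    if not sorted_values:
        raise ValueError("need at least one value")
    pos = (len(sorted_values) - 1) * pct / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] + frac * (sorted_values[hi] - sorted_values[lo])


def categorize_trait(values, lower_pct=15.0, upper_pct=85.0):
    """Label each value 'lower', 'middle' or 'upper' by empirical percentiles."""
    ordered = sorted(values)
    low_cut = percentile(ordered, lower_pct)
    high_cut = percentile(ordered, upper_pct)
    labels = []
    for v in values:
        if v <= low_cut:        # boundary values go to the lower class (assumed)
            labels.append("lower")
        elif v >= high_cut:     # boundary values go to the upper class (assumed)
            labels.append("upper")
        else:
            labels.append("middle")
    return labels


# Toy trait values 0..100, split at the 15th and 85th percentiles.
trait = list(range(101))
labels_15_85 = categorize_trait(trait, 15.0, 85.0)
labels_30_70 = categorize_trait(trait, 30.0, 70.0)
```

With the 15-85 % split the outer classes are small and the middle class dominates, while the 30-70 % split gives classes of more similar size.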


Subjects
Genomics/methods , Computer Neural Networks , Triticum/genetics , Zea mays/genetics , Area Under Curve , Gene-Environment Interaction , Genetic Models , Phenotype , Single Nucleotide Polymorphism , ROC Curve
16.
Nat Mater ; 14(7): 707-13, 2015 Jul.
Article in English | MEDLINE | ID: mdl-25985457

ABSTRACT

Linear defects in crystalline materials, known as dislocations, are central to the understanding of plastic deformation and mechanical strength, as well as control of performance in a variety of electronic and photonic materials. Despite nearly a century of research on dislocation structure and interactions, measurements of the energetics and kinetics of dislocation nucleation have not been possible, as synthesizing and testing pristine crystals absent of defects has been prohibitively challenging. Here, we report experiments that directly measure the surface dislocation nucleation strengths in high-quality 〈110〉 Pd nanowhiskers subjected to uniaxial tension. We find that, whereas nucleation strengths are weakly size- and strain-rate-dependent, a strong temperature dependence is uncovered, corroborating predictions that nucleation is assisted by thermal fluctuations. We measure atomic-scale activation volumes, which explain both the ultrahigh athermal strength as well as the temperature-dependent scatter, evident in our experiments and well captured by a thermal activation model.

17.
Nat Rev Genet ; 11(12): 880-6, 2010 Dec.
Article in English | MEDLINE | ID: mdl-21045869

ABSTRACT

Although genome-wide association studies have identified markers that are associated with various human traits and diseases, our ability to predict such phenotypes remains limited. A perhaps overlooked explanation lies in the limitations of the genetic models and statistical techniques commonly used in association studies. We propose that alternative approaches, which are largely borrowed from animal breeding, provide potential for advances. We review selected methods and discuss the challenges and opportunities ahead.


Subjects
Genetic Markers , Genetic Predisposition to Disease , Genome-Wide Association Study , Disease/genetics , Humans
18.
Genet Sel Evol ; 48: 34, 2016 Apr 18.
Article in English | MEDLINE | ID: mdl-27091137

ABSTRACT

BACKGROUND: Parent-of-origin effects are due to differential contributions of paternal and maternal lineages to offspring phenotypes. Such effects include, for example, maternal effects in several species. However, epigenetically induced parent-of-origin effects have recently attracted attention due to their potential impact on variation of complex traits. Given that prediction of genetic merit or phenotypic performance is of interest in the study of complex traits, it is relevant to consider parent-of-origin effects in such predictions. We built a whole-genome prediction model that incorporates parent-of-origin effects by considering parental allele substitution effects of single nucleotide polymorphisms and gametic relationships derived from a pedigree (the POE model). We used this model to predict body mass index in a mouse population, a trait that is presumably affected by parent-of-origin effects, and also compared the prediction performance to that of a standard additive model that ignores parent-of-origin effects (the ADD model). We also used simulated data to assess the predictive performance of the POE model under various circumstances, in which parent-of-origin effects were generated by mimicking an imprinting mechanism. RESULTS: The POE model did not predict better than the ADD model in the real data analysis, probably due to overfitting, since the POE model had far more parameters than the ADD model. However, when applied to simulated data, the POE model outperformed the ADD model when the contribution of parent-of-origin effects to phenotypic variation increased. The superiority of the POE model over the ADD model was up to 8 % on predictive correlation and 5 % on predictive mean squared error. 
CONCLUSIONS: The simulation and the negative result obtained in the real data analysis indicated that, in order to gain benefit from the POE model in terms of prediction, a sizable contribution of parent-of-origin effects to variation is needed and such variation must be captured by the genetic markers fitted. Recent studies, however, suggest that most parent-of-origin effects stem from epigenetic regulation but not from a change in DNA sequence. Therefore, integrating epigenetic information with genetic markers may help to account for parent-of-origin effects in whole-genome prediction.


Subjects
Genetic Association Studies , Genomic Imprinting/genetics , Genomics , Mice/genetics , Genetic Models , Phenotype , Algorithms , Alleles , Animals , Body Mass Index , Computer Simulation , Female , Gene Frequency , Genetic Markers , Male , Single Nucleotide Polymorphism , Quantitative Trait Loci
19.
Genet Sel Evol ; 48: 10, 2016 Feb 03.
Article in English | MEDLINE | ID: mdl-26842494

ABSTRACT

BACKGROUND: Genome-wide association studies in humans have found enrichment of trait-associated single nucleotide polymorphisms (SNPs) in coding regions of the genome and depletion of these in intergenic regions. However, a recent release of the ENCyclopedia of DNA elements showed that ~80 % of the human genome has a biochemical function. Similar studies on the chicken genome are lacking, thus assessing the relative contribution of its genic and non-genic regions to variation is relevant for biological studies and genetic improvement of chicken populations. METHODS: A dataset including 1351 birds that were genotyped with the 600K Affymetrix platform was used. We partitioned SNPs according to genome annotation data into six classes to characterize the relative contribution of genic and non-genic regions to genetic variation as well as their predictive power using all available quality-filtered SNPs. Target traits were body weight, ultrasound measurement of breast muscle and hen house egg production in broiler chickens. Six genomic regions were considered: intergenic regions, introns, missense, synonymous, 5' and 3' untranslated regions, and regions that are located 5 kb upstream and downstream of coding genes. Genomic relationship matrices were constructed for each genomic region and fitted in the models, separately or simultaneously. Kernel-based ridge regression was used to estimate variance components and assess predictive ability. Contribution of each class of genomic regions to dominance variance was also considered. RESULTS: Variance component estimates indicated that all genomic regions contributed to marked additive genetic variation and that the class of synonymous regions tended to have the greatest contribution. The marked dominance genetic variation explained by each class of genomic regions was similar and negligible (~0.05). In terms of prediction mean-square error, the whole-genome approach showed the best predictive ability. 
CONCLUSIONS: All genic and non-genic regions contributed to phenotypic variation for the three traits studied. Overall, the contribution of additive genetic variance to the total genetic variance was much greater than that of dominance variance. Our results show that all genomic regions are important for the prediction of the targeted traits, and the whole-genome approach was reaffirmed as the best tool for genome-enabled prediction of quantitative traits.
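
As a rough illustration of the approach in this abstract, the sketch below builds one genomic relationship matrix per SNP class and uses each as the kernel in a ridge regression, alongside a whole-genome kernel. Everything in it is an assumption rather than the authors' implementation: the genotypes and phenotypes are simulated, the three annotation classes are assigned at random, the relationship matrix follows the common VanRaden-style scaling, and the ridge parameter `lam` is fixed instead of being estimated from variance components.

```python
import numpy as np

def grm(M):
    """Genomic relationship matrix from 0/1/2 genotypes (rows = animals, columns = SNPs)."""
    p = M.mean(axis=0) / 2.0                 # observed allele frequencies
    Z = M - 2.0 * p                          # centre each SNP column
    return Z @ Z.T / (2.0 * np.sum(p * (1.0 - p)))

def krr_predict(K, y, train, test, lam):
    """Kernel ridge regression: solve (K_tt + lam*I) alpha = y_t - mean, then predict."""
    mu = y[train].mean()
    A = K[np.ix_(train, train)] + lam * np.eye(len(train))
    alpha = np.linalg.solve(A, y[train] - mu)
    return mu + K[np.ix_(test, train)] @ alpha

# Simulated data (hypothetical): 200 animals, 600 SNPs, 3 annotation classes.
rng = np.random.default_rng(1)
n, m = 200, 600
freq = rng.uniform(0.1, 0.9, size=m)
M = rng.binomial(2, freq, size=(n, m)).astype(float)
region = rng.integers(0, 3, size=m)          # stand-in for genome annotation labels
beta = rng.normal(scale=0.1, size=m)
g = (M - M.mean(axis=0)) @ beta              # simulated additive genetic values
y = g + rng.normal(scale=g.std(), size=n)    # heritability of roughly 0.5

train, test = np.arange(150), np.arange(150, n)
kernels = {f"region_{r}": grm(M[:, region == r]) for r in range(3)}
kernels["whole_genome"] = grm(M)

# Prediction mean-squared error for each kernel fitted separately.
mse = {name: float(np.mean((y[test] - krr_predict(K, y, train, test, lam=1.0)) ** 2))
       for name, K in kernels.items()}
```

Fitting several region kernels simultaneously would require one variance component per kernel, which this sketch does not attempt.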


Subjects
Chickens/genetics , Genome , Single Nucleotide Polymorphism , Quantitative Trait Loci , Animals , Body Weight/genetics , Datasets as Topic , Eggs , Female , Genomics , Genotype , Meat/analysis , Phenotype , Genetic Selection
20.
J Dairy Sci ; 99(5): 3632-3645, 2016 May.
Article in English | MEDLINE | ID: mdl-26971146

ABSTRACT

Genomic selection has revolutionized the dairy genetics industry and enhanced the rate of response to selection for most economically important traits. All young bulls are now genotyped using commercially available single nucleotide polymorphism arrays to compute genomic predicted transmitting ability (GPTA) and reliability (REL) values. Decisions regarding the purchasing, marketing, and culling of dairy bulls are based on GPTA until roughly 5 yr of age, when milk-recorded offspring become available. At that time, daughter yield deviations (DYD) can be used to assess the accuracy of the GPTA computed several years earlier. Although agreement between predictions and DYD is often good, the DYD of some bulls differ widely from corresponding GPTA, and published REL are of limited value in identifying such bulls. A method of bootstrap aggregation sampling (bagging) using genomic BLUP (GBLUP) was implemented to predict the GPTA of 379, 379, and 342 young Jersey bulls for protein yield, somatic cell score, and daughter pregnancy rate, respectively. For each trait, 50 bootstrap samples from a reference population consisting of 2011 DYD of 1,738, 1,616, and 1,551 older Jersey bulls were used, and correlations between bagged GBLUP predictions and 2014 DYD were lower than those for GBLUP predictions derived from the full reference population. Although the bagged GBLUP approach did not improve the predictive correlations, it allowed computation of bootstrap predictive reliabilities across random samples of the reference population. The bootstrap predictive reliabilities could be a useful diagnostic tool for assessing genome-enabled prediction systems or evaluating the composition of a reference population. Our main objective was to determine if bagging GBLUP of young Jersey bulls could lead to measures of reliability that would be a useful alternative to published REL values. 
The standard deviations of bagged GBLUP predictions were found to weakly improve our ability to identify bulls whose future daughter performance may deviate significantly from early GPTA for protein, but not for somatic cell score or daughter pregnancy rate.
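
The bagging procedure described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: genotypes and phenotypes are simulated, GBLUP is written as ridge regression on a VanRaden-style relationship matrix with a fixed ridge parameter `lam`, and the names `grm`, `gblup_predict` and `bagged_gblup` are hypothetical. The reference population is resampled with replacement 50 times; the mean across samples is the bagged prediction and the standard deviation is a per-animal measure of prediction stability.

```python
import numpy as np

def grm(M):
    """Genomic relationship matrix from 0/1/2 genotypes (rows = animals)."""
    p = M.mean(axis=0) / 2.0
    Z = M - 2.0 * p
    return Z @ Z.T / (2.0 * np.sum(p * (1.0 - p)))

def gblup_predict(K, y, ref, cand, lam):
    """Predict candidates from reference animals; y is indexed like K."""
    mu = y[ref].mean()
    A = K[np.ix_(ref, ref)] + lam * np.eye(len(ref))
    alpha = np.linalg.solve(A, y[ref] - mu)
    return mu + K[np.ix_(cand, ref)] @ alpha

def bagged_gblup(K, y, ref, cand, lam, n_boot, rng):
    """Mean and standard deviation of GBLUP predictions over bootstrap samples."""
    preds = np.empty((n_boot, len(cand)))
    for b in range(n_boot):
        # Resample reference animals with replacement (duplicates are allowed).
        sample = rng.choice(ref, size=len(ref), replace=True)
        preds[b] = gblup_predict(K, y, sample, cand, lam)
    return preds.mean(axis=0), preds.std(axis=0, ddof=1)

# Simulated data (hypothetical): 150 reference animals, 50 young candidates.
rng = np.random.default_rng(2)
n, m = 200, 500
freq = rng.uniform(0.1, 0.9, size=m)
M = rng.binomial(2, freq, size=(n, m)).astype(float)
g = (M - M.mean(axis=0)) @ rng.normal(scale=0.1, size=m)
y = g + rng.normal(scale=g.std(), size=n)

K = grm(M)
ref, cand = np.arange(150), np.arange(150, n)
bag_mean, bag_sd = bagged_gblup(K, y, ref, cand, lam=1.0, n_boot=50, rng=rng)
```

Candidates with a large `bag_sd` are those whose predictions change most when the composition of the reference population changes.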


Subjects
Breeding , Single Nucleotide Polymorphism , Animals , Cattle , Genome , Genomics , Genotype , Male , Genetic Models , Phenotype , Reproducibility of Results