ABSTRACT
The advantage of mid-infrared spectrometry for milk analysis in breeding is its ability to generate millions of records quickly. However, these records may be biased if the calibration process does not account for their spectral variability when the predictive model is constructed. This study therefore introduces a novel method for developing a World Representative Spectral Database (WRSD) to reduce the risk of spectral extrapolation when predicting dairy traits in new samples. Using a 2-phase selection procedure that is efficient and minimizes memory usage, we first generate a decomposition matrix via principal component analysis (PCA) on a data set of 2,324,443 records. The second phase iterates spectral selection based on a location index derived from the PCA scores, calculating the occurrence frequency of spectra to refine barycenter estimates. The barycenter of the selected spectra closely aligns with that of the entire data set, demonstrating the efficacy of using just 3 principal components (PCs). Applied to 4 varied data sets totaling over 21 million records, the procedure selects 583,440 spectra to represent spectral diversity, with selection rates between 2.00% and 14.88%. This selection illustrates the spectral variability across different dairy populations and data providers. The WRSD's utility for algorithm developers is demonstrated with a hypothetical calibration set of 71 samples. This calibration set covers 91.42 to 98.50% of the WRSD variability, except for the Irish data set (3.50%), indicating a need for additional samples to accurately represent Irish variability and minimize spectral extrapolation. This study offers valuable insights into the representativeness of training sets for capturing spectral variability within targeted dairy populations.
While the current WRSD does not fully encompass global milk spectral diversity, its development underscores the importance of gathering more data and standardizing spectral information across spectrometer brands. Ultimately, the WRSD proves crucial not just for trait prediction but also for identifying abnormal milk samples, which makes it highly relevant to dairy science and technology.
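The 2-phase selection described in this abstract can be illustrated with a minimal sketch. It assumes that the "location index" is an integer grid cell computed from the first 3 PCA scores and that one spectrum is kept per occupied cell; the function name `select_representative`, the parameter `bins_per_pc`, and the grid rule are illustrative assumptions, not the authors' published procedure.

```python
import numpy as np

def select_representative(spectra, n_pcs=3, bins_per_pc=10):
    """Keep one spectrum per occupied cell of a grid laid over the PCA scores.

    Hypothetical helper: the gridding rule is an assumption made for illustration.
    """
    centered = spectra - spectra.mean(axis=0)
    # Phase 1: PCA via SVD; the rows of vt are the principal axes.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    scores = centered @ vt[:n_pcs].T
    # Phase 2: location index = integer bin of each score along each PC.
    lo, hi = scores.min(axis=0), scores.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)
    cells = np.minimum(((scores - lo) / span * bins_per_pc).astype(int),
                       bins_per_pc - 1)
    # Occurrence frequency of each cell and the first spectrum found in it.
    _, first_idx, counts = np.unique(cells, axis=0,
                                     return_index=True, return_counts=True)
    return np.sort(first_idx), scores, counts

# Usage on synthetic data: compare the barycenter of the selected scores
# with the barycenter of all scores (the latter is zero after centering).
rng = np.random.default_rng(0)
demo = rng.normal(size=(500, 20))
idx, scores, counts = select_representative(demo)
print(len(idx), "of", len(demo), "spectra selected")
print("barycenter shift:", np.abs(scores[idx].mean(axis=0) - scores.mean(axis=0)))
```

Keeping only the cell index and a count per cell is what makes a scheme of this kind light on memory: the full score matrix could be processed in chunks once the decomposition matrix is fixed.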
ABSTRACT
Chromosome X is often excluded from bovine genetic studies due to complications caused by the sex-specific nature of the chromosome. As chromosome X is the second largest cattle chromosome and makes up approximately 6% of the female genome, finding ways to include chromosome X in dairy genetic studies is important. Using female animals and treating chromosome X as an autosome, we performed X chromosome-inclusive genome-wide association studies in the selective breeding environment of the New Zealand dairy industry, aiming to identify chromosome X variants associated with milk production traits. We report on the findings of these genome-wide association studies and their potential effect within the dairy industry. We identify missense mutations in the MOSPD1 and CCDC160 genes that are associated with decreased milk volume and protein production and increased fat production. Both of these mutations are exonic SNP that are more prevalent in the Jersey breed than in Holstein-Friesians. Of the 2 candidates proposed, it is likely that only one is causal, though we have not been able to identify which is more likely.
ABSTRACT
Accurate and timely pregnancy diagnosis is an important component of effective herd management in dairy cattle. Predicting pregnancy from Fourier-transform mid-infrared (FT-MIR) spectroscopy data is of particular interest because the data are often already available from routine milk testing. The purpose of this study was to evaluate how well pregnancy status could be predicted in a large data set of 1,161,436 FT-MIR milk spectra records from 863,982 mixed-breed pasture-based New Zealand dairy cattle managed within seasonal calving systems. Three strategies were assessed for defining the nonpregnant cows when partitioning the records according to pregnancy status in the training population. Two of these used records for cows with a subsequent calving only, whereas the third also included records for cows without a subsequent calving. For each partitioning strategy, partial least squares discriminant analysis models were developed, whereby spectra from all the cows in 80% of herds were used to train the models, and predictions on cows in the remaining herds were used for validation. A separate data set was also used as a secondary validation, whereby pregnancy diagnosis had been assigned according to the presence of pregnancy-associated glycoproteins (PAG) in the milk samples. We examined different ways of accounting for stage of lactation in the prediction models, either by including it as an effect in the prediction model, or by pre-adjusting spectra before fitting the model. For a subset of strategies, we also assessed prediction accuracies from deep learning approaches, utilizing either the raw spectra or images of spectra. Across all strategies, prediction accuracies were highest for models using the unadjusted spectra as model predictors. 
Strategies for cows with a subsequent calving performed well in herd-independent validation with sensitivities above 0.79, specificities above 0.91 and area under the receiver operating characteristic curve (AUC) values over 0.91. However, for these strategies, the specificity to predict nonpregnant cows in the external PAG data set was poor (0.002-0.04). The best performing models were those that included records for cows without a subsequent calving, and used unadjusted spectra and days in milk as predictors, with consistent results observed across the training, herd-independent validation and PAG data sets. For the partial least squares discriminant analysis model, sensitivity was 0.71, specificity was 0.54 and the AUC value was 0.68 in the PAG data set; and for an image-based deep learning model, the sensitivity was 0.74, specificity was 0.52 and the AUC value was 0.69. Our results demonstrate that in pasture-based seasonal calving herds, confounding between pregnancy status and spectral changes associated with stage of lactation can inflate prediction accuracies. When the effect of this confounding was reduced, prediction accuracies were not high enough for the spectra to be used as a sole indicator of pregnancy status.
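The herd-independent validation and the accuracy measures reported in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the helper names `split_by_herd` and `sensitivity_specificity` are hypothetical, labels are coded 1 = pregnant and 0 = nonpregnant, and the classifier itself (partial least squares discriminant analysis in the study) is left out.

```python
import random

def split_by_herd(herd_ids, train_fraction=0.8, seed=0):
    """Assign whole herds to training or validation so no herd appears in both."""
    herds = sorted(set(herd_ids))
    random.Random(seed).shuffle(herds)
    n_train = int(round(train_fraction * len(herds)))
    train_herds = set(herds[:n_train])
    train = [i for i, h in enumerate(herd_ids) if h in train_herds]
    valid = [i for i, h in enumerate(herd_ids) if h not in train_herds]
    return train, valid

def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity); assumes both classes occur in y_true."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

# Usage: 10 herds with 3 records each, 80% of herds used for training.
herd_ids = [herd for herd in range(10) for _ in range(3)]
train_rows, valid_rows = split_by_herd(herd_ids)
print(len(train_rows), "training records,", len(valid_rows), "validation records")
```

Splitting by herd rather than by record keeps herd-level effects out of the validation set, which is the point of a herd-independent validation.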
Subjects
Lactation; Milk; Animals; Cattle; Female; Least-Squares Analysis; Milk/chemistry; New Zealand; Pregnancy; Spectrophotometry, Infrared/veterinary
ABSTRACT
The use of Fourier-transform mid-infrared (FTIR) spectroscopy is of interest to the dairy industry worldwide for predicting milk composition and other novel traits that are difficult or expensive to measure directly. Although there are many valuable applications for FTIR spectra, noise from differences in spectral responses between instruments is problematic because it reduces prediction accuracy if ignored. The purpose of this study was to develop strategies to reduce the impact of noise and to compare methods for standardizing FTIR spectra in order to reduce between-instrument variability in multiple-instrument networks. Noise levels in bands of the infrared spectrum caused by the water content of milk were characterized, and a method for identifying and removing outliers was developed. Two standardization methods were assessed and compared: piecewise direct standardization (PDS), which related spectra on a primary instrument to spectra on 5 other (secondary) instruments using identical milk-based reference samples (n = 918) analyzed across the 6 instruments; and retroactive percentile standardization (RPS), whereby percentiles of observed spectra from routine milk test samples (n = 2,044,094) were used to map and exploit primary- and secondary-instrument relationships. Different applications of each method were studied to determine the optimal way to implement each method across time. Industry-standard predictions of milk components from 2,044,094 spectra records were regressed against predictions from spectra before and after standardization using PDS or RPS. The PDS approach resulted in an overall decrease in root mean square error between industry-standard predictions and predictions from spectra from 0.190 to 0.071 g/100 mL for fat, from 0.129 to 0.055 g/100 mL for protein, and from 0.143 to 0.088 g/100 mL for lactose. 
Reductions in prediction error for RPS were similar but less consistent than those for PDS across time, but similar reductions were achieved when PDS coefficients were updated monthly and separate primary instruments were assigned for the North and South Islands of New Zealand. We demonstrated that the PDS approach is the most consistent method to reduce prediction errors across time. We also showed that the RPS approach is sensitive to shifts in milk composition but can be used to reduce prediction errors, provided that secondary-instrument spectra are standardized to a primary instrument with samples of broadly equivalent milk composition. Appropriate implementation of either of these approaches will improve the quality of predictions based on FTIR spectra for various downstream applications.
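The idea behind percentile-based standardization can be sketched as a per-wavenumber quantile mapping from a secondary instrument onto a primary instrument. This is a generic illustration, not the RPS procedure as published: the function names `fit_percentile_map` and `apply_percentile_map`, the 1st to 99th percentile grid, and the linear interpolation between percentiles are all assumptions.

```python
import numpy as np

def fit_percentile_map(secondary, primary, percentiles=None):
    """Percentiles of each wavenumber (column) on both instruments."""
    if percentiles is None:
        percentiles = np.arange(1, 100)  # assumed grid: 1st to 99th percentile
    sec_q = np.percentile(secondary, percentiles, axis=0)
    pri_q = np.percentile(primary, percentiles, axis=0)
    return sec_q, pri_q

def apply_percentile_map(spectra, sec_q, pri_q):
    """Map secondary-instrument absorbances onto the primary-instrument scale."""
    out = np.empty(spectra.shape, dtype=float)
    for j in range(spectra.shape[1]):
        # np.interp clamps values outside the fitted percentile range.
        out[:, j] = np.interp(spectra[:, j], sec_q[:, j], pri_q[:, j])
    return out

# Usage on synthetic data: the secondary instrument has an offset and a gain error.
rng = np.random.default_rng(1)
primary = rng.normal(size=(2000, 5))
secondary = 1.1 * rng.normal(size=(2000, 5)) + 0.3
sec_q, pri_q = fit_percentile_map(secondary, primary)
corrected = apply_percentile_map(secondary, sec_q, pri_q)
print("mean before:", secondary.mean(), "mean after:", corrected.mean())
```

A mapping of this kind only works if the two sets of samples have broadly equivalent composition, which matches the abstract's caution about shifts in milk composition.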
Subjects
Cattle/metabolism; Milk/chemistry; Spectroscopy, Fourier Transform Infrared/standards; Animals; Dairying; Milk/metabolism; New Zealand; Phenotype; Reference Standards; Spectroscopy, Fourier Transform Infrared/instrumentation; Spectroscopy, Fourier Transform Infrared/methods; Spectroscopy, Fourier Transform Infrared/veterinary
ABSTRACT
The genetic merit of a herd is a key determinant in productivity for dairy farmers. However, making breeding decisions to maximize the rate of genetic gain can be complex because there is no certainty about which cows will become pregnant with a heifer calf. In this study, breeding worth (BrW) was used as a measure of genetic merit, and several mating strategies were evaluated. These strategies included randomly mating whole herds to the entire bull team, excluding low-ranked cows from producing replacement heifers, and nominating high-ranked cows to the most highly ranked bulls. Simulations were undertaken using 4 bull teams generated from bulls currently marketed in New Zealand and a selection of New Zealand dairy herds. Average replacement heifer BrW was calculated for 1,000 iterations of each combination of mating strategy, herd, and bull team (scenario). Variation in resulting average replacement heifer BrW within scenarios was due to random sampling of which cows became pregnant with a heifer calf. Relative to mating the whole herd to an entire bull team, excluding the lowest ranked cows from producing replacements resulted in the greatest increase in average replacement heifer BrW across all herds and bull teams, with a gain of approximately 0.4 BrW point for each 1% of cows excluded. Nominating top-ranking cows to the highest ranking bulls in the team had little effect (0.06-0.13 BrW increase for each 1% of top cows nominated) in improving BrW of replacement heifers. The number of top bulls nominated had a variable effect depending on the BrW spread of the entire bull team. Although excluding cows with the lowest BrW from producing replacement heifers is most effective for improving BrW, it is important to ensure that the number of heifers born is sufficient to replace cows leaving the herd. 
It is likely that optimal strategies for improving BrW will vary from farm to farm depending not only on the BrW structure of the herd, the bull team available, and the reproduction success on farm but also on farm management practices. This simulation study provides expected outcomes from a variety of mating strategies to allow informed decision making on farm.
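A simulation of this kind can be sketched in a few lines. The sketch makes several assumptions that the abstract does not state: a heifer's BrW is taken as the average of its dam's and sire's BrW, each eligible cow independently produces a replacement heifer with probability `p_heifer`, and bulls are drawn at random from the team. The function name `simulate_mean_heifer_brw` and all numeric values are hypothetical.

```python
import random
from statistics import mean

def simulate_mean_heifer_brw(cow_brw, bull_brw, exclude_fraction=0.0,
                             p_heifer=0.4, iterations=1000, seed=1):
    """Average replacement-heifer BrW over repeated random matings."""
    rng = random.Random(seed)
    ranked = sorted(cow_brw)
    # Exclude the lowest-ranked cows from producing replacements.
    eligible = ranked[int(exclude_fraction * len(ranked)):]
    results = []
    for _ in range(iterations):
        # Assumption: heifer BrW is the parent average of dam and a random bull.
        heifers = [(cow + rng.choice(bull_brw)) / 2
                   for cow in eligible if rng.random() < p_heifer]
        if heifers:
            results.append(mean(heifers))
    return mean(results)

# Usage with a made-up herd and bull team.
herd_rng = random.Random(0)
cows = [herd_rng.gauss(100, 40) for _ in range(200)]
bulls = [200, 220, 240, 260]
print("whole herd mated:", simulate_mean_heifer_brw(cows, bulls, iterations=200))
print("lowest 20% excluded:",
      simulate_mean_heifer_brw(cows, bulls, exclude_fraction=0.2, iterations=200))
```

The variation between iterations comes only from which cows produce a heifer and which bull they are mated to, mirroring the source of within-scenario variation described in the abstract.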
Subjects
Breeding/methods; Cattle/physiology; Animals; Cattle/genetics; Dairying; Female; Male; New Zealand; Parturition; Pregnancy; Reproduction
ABSTRACT
X chromosome inactivation (XCI) is a process by which 1 of the 2 copies of the X chromosomes present in female mammals is inactivated. The transcriptional silencing of one X chromosome achieves dosage compensation between XX females and XY males and ensures equal expression of X-linked genes in both sexes. Although all mammals use this form of dosage compensation, the complex mechanisms that regulate XCI vary between species, tissues, and development. These mechanisms include not only varying levels of inactivation, but also the nature of inactivation, which can range from being random in nature to driven by parent of origin. To date, no data describing XCI in calves or adult cattle have been reported and we are reliant on data from mice to infer potential mechanisms and timings for this process. In the context of dairy cattle breeding and genomic prediction, the implications of X chromosome inheritance and XCI in the mammary gland are particularly important where a relatively small number of bulls pass their single X chromosome on to all of their daughters. We describe here the use of RNA-seq, whole genome sequencing and Illumina BovineHD BeadChip (Illumina, San Diego, CA) genotypes to assess XCI in lactating mammary glands of dairy cattle. At a population level, maternally and paternally inherited copies of the X chromosome are expressed equally in the lactating mammary gland consistent with random inactivation of the X chromosome. However, average expression of the paternal chromosome ranged from 10 to 90% depending on the individual animal. These results suggest that either the mammary gland arises from 1 or 2 stem cells, or a nongenetic mechanism that skews XCI exists. Although a considerable amount of future work is required to fully understand XCI in cattle, the data reported here represent an initial step in ensuring that X chromosome variation is captured and used in an appropriate manner for future genomic selection.
Subjects
Gene Expression Regulation; Mammary Glands, Animal; X Chromosome Inactivation; Animals; Cattle; Dosage Compensation, Genetic; Female; Lactation; Male; Sex Factors; X Chromosome/genetics
ABSTRACT
Single nucleotide polymorphisms have been the DNA variant of choice for genomic prediction, largely because of the ease of single nucleotide polymorphism genotype collection. In contrast, structural variants (SV), which include copy number variants (CNV), translocations, insertions, and inversions, have eluded easy detection and characterization, particularly in nonhuman species. However, evidence increasingly shows that SV not only contribute a substantial proportion of genetic variation but also have significant influence on phenotypes. Here we present the discovery of CNV in a prominent New Zealand dairy bull using long-read PacBio (Pacific Biosciences, Menlo Park, CA) sequencing technology and the Sniffles SV discovery tool (version 0.0.1; https://github.com/fritzsedlazeck/Sniffles). The CNV identified from long reads were compared with CNV discovered in the same bull from Illumina sequencing using CNVnator (read depth-based tool; Illumina Inc., San Diego, CA) as a means of validation. Subsequently, further validation was undertaken using whole-genome Illumina sequencing of 556 cattle representing the wider New Zealand dairy cattle population. Very limited overlap was observed in CNV discovered from the 2 sequencing platforms, in part because of the differences in size of CNV detected. Only a few CNV were therefore able to be validated using this approach. However, the ability to use CNVnator to genotype the 557 cattle for copy number across all regions identified as putative CNV allowed a genome-wide assessment of transmission level of copy number based on pedigree. The more highly transmissible a putative CNV region was observed to be, the more likely the distribution of copy number was multimodal across the 557 sequenced animals. Furthermore, visual assessment of highly transmissible CNV regions provided evidence supporting the presence of CNV across the sequenced animals. 
This transmission-based approach was able to confirm a subset of CNV that segregates in the New Zealand dairy cattle population. Genome-wide identification and validation of CNV is an important step toward their inclusion in genomic selection strategies.
Subjects
DNA Copy Number Variations; Polymorphism, Single Nucleotide; Sequence Analysis, DNA/veterinary; Animals; Cattle; Genome; Genomics; Genotype; Male; New Zealand; Reproducibility of Results; Sequence Analysis, DNA/methods
ABSTRACT
The main objectives of this study were to establish the relative value of milk yields under twice-daily milking (TDM) as a predictor of yield and yield loss under once-daily milking (ODM), and to understand the role of residual milk and udder storage capacity-related traits in regulating yield and yield loss during ODM. A Holstein-Friesian × Jersey crossbred herd was established over 2 seasons (years), as 2 individual cohorts on the same farm, managed on a pasture-based system over 4 lactations. Short-term (1-wk) ODM studies, with a starting total of 690 cows, were undertaken in mid- and late-lactation in lactation 2 and in mid-lactation in lactation 3 for each cohort. A 10-wk study of ODM performance began in mid-lactation in lactation 3, whereas lactation 4 was a full-lactation assessment of ODM. In the short-term studies, milk yield under ODM was well predicted (R² = 0.7 to 0.8 in 5 of 6 studies) by the daily yield under TDM in the week before ODM. Yield loss (kg/d) increased with increasing milk yield and with increasing somatic cell count (SCC), although predictions were relatively poor (R² = 0.09 to 0.30). Yield loss (%) decreased with increasing TDM yield in 3 of the 6 studies and was positively correlated with SCC during ODM. Nevertheless, ODM yield loss, in absolute or percentage terms, was a poorly repeatable trait in grazing cows. Part of the variation in yield loss percentage (30%) was positively associated with residual milk (%), measured pretrial, during measurement of functional udder capacity in lactation 3. Total production (kg of milk) over the full-lactation ODM study in lactation 4 was correlated with total production in the 10-wk trial in lactation 3 (r = 0.72 and 0.63 for cohorts 1 and 2, respectively).
Identifying the highest- and lowest-producing 10% of animals during the full lactation of ODM indicated that poor production was associated with high yields of residual milk (measured in lactation 3) and, conversely, high production was associated with low yields of residual milk, relative to the other 80% of animals. These same "high" and "low" production groups from lactation 4 had similar differences in performance in the earlier short-term studies and a larger or smaller percentage yield loss associated with the residual milk measurement. Breeding strategies for ODM may benefit, therefore, from greater emphasis on selecting for a low residual milk fraction to optimize milking performance. Nevertheless, the level of milk production under TDM is a strong phenotypic predictor of milk production under ODM.
Subjects
Cattle/physiology; Lactation; Milk/metabolism; Animals; Breeding; Dairying; Female; Male; Mammary Glands, Animal/physiology; New Zealand; Phenotype
ABSTRACT
Over the last 100 years, significant advances have been made in the characterisation of milk composition for dairy cattle improvement programs. Technological progress has enabled a shift from labour-intensive, on-farm collection and processing of samples that assess yield and fat levels in milk, to large-scale processing of samples through centralised laboratories, with the scope extended to include quantification of other traits. Fourier-transform mid-infrared (FT-MIR) spectroscopy has had a significant role in the transformation of milk composition phenotyping, with spectral-based predictions of major milk components already being widely used in milk payment and animal evaluation systems globally. Increasingly, there is interest in analysing the individual FT-MIR wavenumbers, and in utilising the FT-MIR data to predict other novel traits of importance to breeding programs. This includes traits related to the nutritional value of milk, the processability of milk into products such as cheese, and traits relevant to animal health and the environment. The ability to successfully incorporate these traits into breeding programs is dependent on the heritability of the FT-MIR predicted traits, and the genetic correlations between the FT-MIR predicted and actual trait values. Linking FT-MIR predicted traits to the underlying mutations responsible for their variation can be difficult because the phenotypic expression of these traits is a function of a diverse range of molecular and biological mechanisms that can obscure their genetic basis. The individual FT-MIR wavenumbers give insights into the chemical composition of milk and provide an additional layer of granularity that may assist with establishing causal links between the genome and observed phenotypes.
Additionally, there are other molecular phenotypes such as those related to the metabolome, chromatin accessibility, and RNA editing that could improve our understanding of the underlying biological systems controlling traits of interest. Here we review topics of importance to phenotyping and genetic applications of FT-MIR spectra datasets, and discuss opportunities for consolidating FT-MIR datasets with other genomic and molecular data sources to improve future dairy cattle breeding programs.