Multivariate Statistical Machine Learning Methods for Genomic Prediction.

Bibliographic Details
Main Author: Montesinos López, Osval Antonio.
Other Authors: Montesinos López, Abelardo; Crossa, José.
Format: eBook
Language: English
Published: Cham : Springer International Publishing AG, 2022.
Edition: 1st ed.
Table of Contents:
  • Intro
  • Foreword
  • Preface
  • Acknowledgments
  • Contents
  • Chapter 1: General Elements of Genomic Selection and Statistical Learning
  • 1.1 Data as a Powerful Weapon
  • 1.2 Genomic Selection
  • 1.2.1 Concepts of Genomic Selection
  • 1.2.2 Why Is Statistical Machine Learning a Key Element of Genomic Selection?
  • 1.3 Modeling Basics
  • 1.3.1 What Is a Statistical Machine Learning Model?
  • 1.3.2 The Two Cultures of Model Building: Prediction Versus Inference
  • 1.3.3 Types of Statistical Machine Learning Models and Model Effects
  • 1.3.3.1 Types of Statistical Machine Learning Models
  • 1.3.3.2 Model Effects
  • 1.4 Matrix Algebra Review
  • 1.5 Statistical Data Types
  • 1.5.1 Data Types
  • 1.5.2 Multivariate Data Types
  • 1.6 Types of Learning
  • 1.6.1 Definition and Examples of Supervised Learning
  • 1.6.2 Definition and Examples of Unsupervised Learning
  • 1.6.3 Definition and Examples of Semi-Supervised Learning
  • References
  • Chapter 2: Preprocessing Tools for Data Preparation
  • 2.1 Fixed or Random Effects
  • 2.2 BLUEs and BLUPs
  • 2.3 Marker Depuration
  • 2.4 Methods to Compute the Genomic Relationship Matrix
  • 2.5 Genomic Breeding Values and Their Estimation
  • 2.6 Normalization Methods
  • 2.7 General Suggestions for Removing or Adding Inputs
  • 2.8 Principal Component Analysis as a Compression Method
  • Appendix 1
  • Appendix 2
  • References
  • Chapter 3: Elements for Building Supervised Statistical Machine Learning Models
  • 3.1 Definition of a Linear Multiple Regression Model
  • 3.2 Fitting a Linear Multiple Regression Model via the Ordinary Least Square (OLS) Method
  • 3.3 Fitting the Linear Multiple Regression Model via the Maximum Likelihood (ML) Method
  • 3.4 Fitting the Linear Multiple Regression Model via the Gradient Descent (GD) Method
  • 3.5 Advantages and Disadvantages of Standard Linear Regression Models (OLS and MLR)
  • 3.6 Regularized Linear Multiple Regression Model
  • 3.6.1 Ridge Regression
  • 3.6.2 Lasso Regression
  • 3.7 Logistic Regression
  • 3.7.1 Logistic Ridge Regression
  • 3.7.2 Lasso Logistic Regression
  • Appendix 1: R Code for Ridge Regression Used in Example 2
  • References
  • Chapter 4: Overfitting, Model Tuning, and Evaluation of Prediction Performance
  • 4.1 The Problem of Overfitting and Underfitting
  • 4.2 The Trade-Off Between Prediction Accuracy and Model Interpretability
  • 4.3 Cross-validation
  • 4.3.1 The Single Hold-Out Set Approach
  • 4.3.2 The k-Fold Cross-validation
  • 4.3.3 The Leave-One-Out Cross-validation
  • 4.3.4 The Leave-m-Out Cross-validation
  • 4.3.5 Random Cross-validation
  • 4.3.6 The Leave-One-Group-Out Cross-validation
  • 4.3.7 Bootstrap Cross-validation
  • 4.3.8 Incomplete Block Cross-validation
  • 4.3.9 Random Cross-validation with Blocks
  • 4.3.10 Other Options and General Comments on Cross-validation
  • 4.4 Model Tuning
  • 4.4.1 Why Is Model Tuning Important?
  • 4.4.2 Methods for Hyperparameter Tuning (Grid Search, Random Search, etc.)
  • 4.5 Metrics for the Evaluation of Prediction Performance
  • 4.5.1 Quantitative Measures of Prediction Performance
  • 4.5.2 Binary and Ordinal Measures of Prediction Performance
  • 4.5.3 Count Measures of Prediction Performance
  • References
  • Chapter 5: Linear Mixed Models
  • 5.1 General Aspects of Linear Mixed Models
  • 5.2 Estimation of the Linear Mixed Model
  • 5.2.1 Maximum Likelihood Estimation
  • 5.2.1.1 EM Algorithm
  • E Step
  • M Step
  • 5.2.1.2 REML
  • 5.2.1.3 BLUPs
  • 5.3 Linear Mixed Models in Genomic Prediction
  • 5.4 Illustrative Examples of the Univariate LMM
  • 5.5 Multi-trait Genomic Linear Mixed-Effects Models
  • 5.6 Final Comments
  • Appendix 1
  • Appendix 2
  • Appendix 3
  • Appendix 4
  • Appendix 5
  • Appendix 6
  • Appendix 7
  • References
  • Chapter 6: Bayesian Genomic Linear Regression
  • 6.1 Bayes Theorem and Bayesian Linear Regression
  • 6.2 Bayesian Genome-Based Ridge Regression
  • 6.3 Bayesian GBLUP Genomic Model
  • 6.4 Genomic-Enabled Prediction BayesA Model
  • 6.5 Genomic-Enabled Prediction BayesB and BayesC Models
  • 6.6 Genomic-Enabled Prediction Bayesian Lasso Model
  • 6.7 Extended Predictor in Bayesian Genomic Regression Models
  • 6.8 Bayesian Genomic Multi-trait Linear Regression Model
  • 6.8.1 Genomic Multi-trait Linear Model
  • 6.9 Bayesian Genomic Multi-trait and Multi-environment Model (BMTME)
  • Appendix 1
  • Appendix 2: Setting Hyperparameters for the Prior Distributions of the BRR Model
  • Appendix 3: R Code Example 1
  • Appendix 4: R Code Example 2
  • Appendix 5
  • R Code for Example 3
  • R Code for Example 4
  • References
  • Chapter 7: Bayesian and Classical Prediction Models for Categorical and Count Data
  • 7.1 Introduction
  • 7.2 Bayesian Ordinal Regression Model
  • 7.2.1 Illustrative Examples
  • 7.3 Ordinal Logistic Regression
  • 7.4 Penalized Multinomial Logistic Regression
  • 7.4.1 Illustrative Examples for Multinomial Penalized Logistic Regression
  • 7.5 Penalized Poisson Regression
  • 7.6 Final Comments
  • Appendix 1
  • Appendix 2
  • Appendix 3
  • Appendix 4 (Example 4)
  • Appendix 5
  • Appendix 6
  • References
  • Chapter 8: Reproducing Kernel Hilbert Spaces Regression and Classification Methods
  • 8.1 The Reproducing Kernel Hilbert Spaces (RKHS)
  • 8.2 Generalized Kernel Model
  • 8.2.1 Parameter Estimation Under the Frequentist Paradigm
  • 8.2.2 Kernels
  • 8.2.3 Kernel Trick
  • 8.2.4 Popular Kernel Functions
  • 8.2.5 A Separate Two-Step Process for Building Kernel Machines
  • 8.3 Kernel Methods for Gaussian Response Variables
  • 8.4 Kernel Methods for Binary Response Variables
  • 8.5 Kernel Methods for Categorical Response Variables
  • 8.6 The Linear Mixed Model with Kernels
  • 8.7 Hyperparameter Tuning for Building the Kernels
  • 8.8 Bayesian Kernel Methods
  • 8.8.1 Extended Predictor Under the Bayesian Kernel BLUP
  • 8.8.2 Extended Predictor Under the Bayesian Kernel BLUP with a Binary Response Variable
  • 8.8.3 Extended Predictor Under the Bayesian Kernel BLUP with a Categorical Response Variable
  • 8.9 Multi-trait Bayesian Kernel
  • 8.10 Kernel Compression Methods
  • 8.10.1 Extended Predictor Under the Approximate Kernel Method
  • 8.11 Final Comments
  • Appendix 1
  • Appendix 2
  • Appendix 3
  • Appendix 4
  • Appendix 5
  • Appendix 6
  • Appendix 7
  • Appendix 8
  • Appendix 9
  • Appendix 10
  • Appendix 11
  • References
  • Chapter 9: Support Vector Machines and Support Vector Regression
  • 9.1 Introduction to Support Vector Machines
  • 9.2 Hyperplane
  • 9.3 Maximum Margin Classifier
  • 9.3.1 Derivation of the Maximum Margin Classifier
  • 9.3.2 Wolfe Dual
  • 9.4 Derivation of the Support Vector Classifier
  • 9.5 Support Vector Machine
  • 9.5.1 One-Versus-One Classification
  • 9.5.2 One-Versus-All Classification
  • 9.6 Support Vector Regression
  • Appendix 1
  • Appendix 2
  • Appendix 3
  • References
  • Chapter 10: Fundamentals of Artificial Neural Networks and Deep Learning
  • 10.1 The Inspiration for the Neural Network Model
  • 10.2 The Building Blocks of Artificial Neural Networks
  • 10.3 Activation Functions
  • 10.3.1 Linear
  • 10.3.2 Rectified Linear Unit (ReLU)
  • 10.3.3 Leaky ReLU
  • 10.3.4 Sigmoid
  • 10.3.5 Softmax
  • 10.3.6 Tanh
  • 10.4 The Universal Approximation Theorem
  • 10.5 Artificial Neural Network Topologies
  • 10.6 Successful Applications of ANN and DL
  • 10.7 Loss Functions
  • 10.7.1 Loss Functions for Continuous Outcomes
  • 10.7.2 Loss Functions for Binary and Ordinal Outcomes
  • 10.7.3 Regularized Loss Functions
  • 10.7.4 Early Stopping Method of Training
  • 10.8 The King Algorithm for Training Artificial Neural Networks: Backpropagation
  • 10.8.1 Backpropagation Algorithm: Online Version
  • 10.8.1.1 Feedforward Part
  • 10.8.1.2 Backpropagation Part
  • 10.8.2 Illustrative Example 10.1: A By-Hand Computation
  • 10.8.3 Illustrative Example 10.2: A By-Hand Computation
  • References
  • Chapter 11: Artificial Neural Networks and Deep Learning for Genomic Prediction of Continuous Outcomes
  • 11.1 Hyperparameters to Be Tuned in ANN and DL
  • 11.1.1 Network Topology
  • 11.1.2 Activation Functions
  • 11.1.3 Loss Function
  • 11.1.4 Number of Hidden Layers
  • 11.1.5 Number of Neurons in Each Layer
  • 11.1.6 Regularization Type
  • 11.1.7 Learning Rate
  • 11.1.8 Number of Epochs and Number of Batches
  • 11.1.9 Normalization Scheme for Input Data
  • 11.2 Popular DL Frameworks
  • 11.3 Optimizers
  • 11.4 Illustrative Examples
  • Appendix 1
  • Appendix 2
  • Appendix 3
  • Appendix 4
  • Appendix 5
  • References
  • Chapter 12: Artificial Neural Networks and Deep Learning for Genomic Prediction of Binary, Ordinal, and Mixed Outcomes
  • 12.1 Training DNN with Binary Outcomes
  • 12.2 Training DNN with Categorical (Ordinal) Outcomes
  • 12.3 Training DNN with Count Outcomes
  • 12.4 Training DNN with Multivariate Outcomes
  • 12.4.1 DNN with Multivariate Continuous Outcomes
  • 12.4.2 DNN with Multivariate Binary Outcomes
  • 12.4.3 DNN with Multivariate Ordinal Outcomes
  • 12.4.4 DNN with Multivariate Count Outcomes
  • 12.4.5 DNN with Multivariate Mixed Outcomes
  • Appendix 1
  • Appendix 2
  • Appendix 3
  • Appendix 4
  • Appendix 5
  • References
  • Chapter 13: Convolutional Neural Networks
  • 13.1 The Importance of Convolutional Neural Networks
  • 13.2 Tensors
  • 13.3 Convolution
  • 13.4 Pooling
  • 13.5 Convolutional Operation for 1D Tensor for Sequence Data
  • 13.6 Motivation for CNNs
  • 13.7 Why Are CNNs Preferred over Feedforward Deep Neural Networks for Processing Images?