Multivariate Statistical Machine Learning Methods for Genomic Prediction.
Format: eBook
Language: English
Published: Cham : Springer International Publishing AG, 2022.
Edition: 1st ed.
Table of Contents:
- Intro
- Foreword
- Preface
- Acknowledgments
- Contents
- Chapter 1: General Elements of Genomic Selection and Statistical Learning
- 1.1 Data as a Powerful Weapon
- 1.2 Genomic Selection
- 1.2.1 Concepts of Genomic Selection
- 1.2.2 Why Is Statistical Machine Learning a Key Element of Genomic Selection?
- 1.3 Modeling Basics
- 1.3.1 What Is a Statistical Machine Learning Model?
- 1.3.2 The Two Cultures of Model Building: Prediction Versus Inference
- 1.3.3 Types of Statistical Machine Learning Models and Model Effects
- 1.3.3.1 Types of Statistical Machine Learning Models
- 1.3.3.2 Model Effects
- 1.4 Matrix Algebra Review
- 1.5 Statistical Data Types
- 1.5.1 Data Types
- 1.5.2 Multivariate Data Types
- 1.6 Types of Learning
- 1.6.1 Definition and Examples of Supervised Learning
- 1.6.2 Definition and Examples of Unsupervised Learning
- 1.6.3 Definition and Examples of Semi-Supervised Learning
- References
- Chapter 2: Preprocessing Tools for Data Preparation
- 2.1 Fixed or Random Effects
- 2.2 BLUEs and BLUPs
- 2.3 Marker Depuration
- 2.4 Methods to Compute the Genomic Relationship Matrix
- 2.5 Genomic Breeding Values and Their Estimation
- 2.6 Normalization Methods
- 2.7 General Suggestions for Removing or Adding Inputs
- 2.8 Principal Component Analysis as a Compression Method
- Appendix 1
- Appendix 2
- References
- Chapter 3: Elements for Building Supervised Statistical Machine Learning Models
- 3.1 Definition of a Linear Multiple Regression Model
- 3.2 Fitting a Linear Multiple Regression Model via the Ordinary Least Square (OLS) Method
- 3.3 Fitting the Linear Multiple Regression Model via the Maximum Likelihood (ML) Method
- 3.4 Fitting the Linear Multiple Regression Model via the Gradient Descent (GD) Method
- 3.5 Advantages and Disadvantages of Standard Linear Regression Models (OLS and MLR)
- 3.6 Regularized Linear Multiple Regression Model
- 3.6.1 Ridge Regression
- 3.6.2 Lasso Regression
- 3.7 Logistic Regression
- 3.7.1 Logistic Ridge Regression
- 3.7.2 Lasso Logistic Regression
- Appendix 1: R Code for Ridge Regression Used in Example 2
- References
- Chapter 4: Overfitting, Model Tuning, and Evaluation of Prediction Performance
- 4.1 The Problem of Overfitting and Underfitting
- 4.2 The Trade-Off Between Prediction Accuracy and Model Interpretability
- 4.3 Cross-validation
- 4.3.1 The Single Hold-Out Set Approach
- 4.3.2 The k-Fold Cross-validation
- 4.3.3 The Leave-One-Out Cross-validation
- 4.3.4 The Leave-m-Out Cross-validation
- 4.3.5 Random Cross-validation
- 4.3.6 The Leave-One-Group-Out Cross-validation
- 4.3.7 Bootstrap Cross-validation
- 4.3.8 Incomplete Block Cross-validation
- 4.3.9 Random Cross-validation with Blocks
- 4.3.10 Other Options and General Comments on Cross-validation
- 4.4 Model Tuning
- 4.4.1 Why Is Model Tuning Important?
- 4.4.2 Methods for Hyperparameter Tuning (Grid Search, Random Search, etc.)
- 4.5 Metrics for the Evaluation of Prediction Performance
- 4.5.1 Quantitative Measures of Prediction Performance
- 4.5.2 Binary and Ordinal Measures of Prediction Performance
- 4.5.3 Count Measures of Prediction Performance
- References
- Chapter 5: Linear Mixed Models
- 5.1 General Elements of Linear Mixed Models
- 5.2 Estimation of the Linear Mixed Model
- 5.2.1 Maximum Likelihood Estimation
- 5.2.1.1 EM Algorithm
- E Step
- M Step
- 5.2.1.2 REML
- 5.2.1.3 BLUPs
- 5.3 Linear Mixed Models in Genomic Prediction
- 5.4 Illustrative Examples of the Univariate LMM
- 5.5 Multi-trait Genomic Linear Mixed-Effects Models
- 5.6 Final Comments
- Appendix 1
- Appendix 2
- Appendix 3
- Appendix 4
- Appendix 5
- Appendix 6
- Appendix 7
- References
- Chapter 6: Bayesian Genomic Linear Regression
- 6.1 Bayes Theorem and Bayesian Linear Regression
- 6.2 Bayesian Genome-Based Ridge Regression
- 6.3 Bayesian GBLUP Genomic Model
- 6.4 Genomic-Enabled Prediction BayesA Model
- 6.5 Genomic-Enabled Prediction BayesB and BayesC Models
- 6.6 Genomic-Enabled Prediction Bayesian Lasso Model
- 6.7 Extended Predictor in Bayesian Genomic Regression Models
- 6.8 Bayesian Genomic Multi-trait Linear Regression Model
- 6.8.1 Genomic Multi-trait Linear Model
- 6.9 Bayesian Genomic Multi-trait and Multi-environment Model (BMTME)
- Appendix 1
- Appendix 2: Setting Hyperparameters for the Prior Distributions of the BRR Model
- Appendix 3: R Code Example 1
- Appendix 4: R Code Example 2
- Appendix 5
- R Code Example 3
- R Code for Example 4
- References
- Chapter 7: Bayesian and Classical Prediction Models for Categorical and Count Data
- 7.1 Introduction
- 7.2 Bayesian Ordinal Regression Model
- 7.2.1 Illustrative Examples
- 7.3 Ordinal Logistic Regression
- 7.4 Penalized Multinomial Logistic Regression
- 7.4.1 Illustrative Examples for Multinomial Penalized Logistic Regression
- 7.5 Penalized Poisson Regression
- 7.6 Final Comments
- Appendix 1
- Appendix 2
- Appendix 3
- Appendix 4 (Example 4)
- Appendix 5
- Appendix 6
- References
- Chapter 8: Reproducing Kernel Hilbert Spaces Regression and Classification Methods
- 8.1 The Reproducing Kernel Hilbert Spaces (RKHS)
- 8.2 Generalized Kernel Model
- 8.2.1 Parameter Estimation Under the Frequentist Paradigm
- 8.2.2 Kernels
- 8.2.3 Kernel Trick
- 8.2.4 Popular Kernel Functions
- 8.2.5 A Two-Step Process for Building Kernel Machines
- 8.3 Kernel Methods for Gaussian Response Variables
- 8.4 Kernel Methods for Binary Response Variables
- 8.5 Kernel Methods for Categorical Response Variables
- 8.6 The Linear Mixed Model with Kernels
- 8.7 Hyperparameter Tuning for Building the Kernels
- 8.8 Bayesian Kernel Methods
- 8.8.1 Extended Predictor Under the Bayesian Kernel BLUP
- 8.8.2 Extended Predictor Under the Bayesian Kernel BLUP with a Binary Response Variable
- 8.8.3 Extended Predictor Under the Bayesian Kernel BLUP with a Categorical Response Variable
- 8.9 Multi-trait Bayesian Kernel
- 8.10 Kernel Compression Methods
- 8.10.1 Extended Predictor Under the Approximate Kernel Method
- 8.11 Final Comments
- Appendix 1
- Appendix 2
- Appendix 3
- Appendix 4
- Appendix 5
- Appendix 6
- Appendix 7
- Appendix 8
- Appendix 9
- Appendix 10
- Appendix 11
- References
- Chapter 9: Support Vector Machines and Support Vector Regression
- 9.1 Introduction to Support Vector Machine
- 9.2 Hyperplane
- 9.3 Maximum Margin Classifier
- 9.3.1 Derivation of the Maximum Margin Classifier
- 9.3.2 Wolfe Dual
- 9.4 Derivation of the Support Vector Classifier
- 9.5 Support Vector Machine
- 9.5.1 One-Versus-One Classification
- 9.5.2 One-Versus-All Classification
- 9.6 Support Vector Regression
- Appendix 1
- Appendix 2
- Appendix 3
- References
- Chapter 10: Fundamentals of Artificial Neural Networks and Deep Learning
- 10.1 The Inspiration for the Neural Network Model
- 10.2 The Building Blocks of Artificial Neural Networks
- 10.3 Activation Functions
- 10.3.1 Linear
- 10.3.2 Rectified Linear Unit (ReLU)
- 10.3.3 Leaky ReLU
- 10.3.4 Sigmoid
- 10.3.5 Softmax
- 10.3.6 Tanh
- 10.4 The Universal Approximation Theorem
- 10.5 Artificial Neural Network Topologies
- 10.6 Successful Applications of ANN and DL
- 10.7 Loss Functions
- 10.7.1 Loss Functions for Continuous Outcomes
- 10.7.2 Loss Functions for Binary and Ordinal Outcomes
- 10.7.3 Regularized Loss Functions
- 10.7.4 Early Stopping Method of Training
- 10.8 The King Algorithm for Training Artificial Neural Networks: Backpropagation
- 10.8.1 Backpropagation Algorithm: Online Version
- 10.8.1.1 Feedforward Part
- 10.8.1.2 Backpropagation Part
- 10.8.2 Illustrative Example 10.1: A Hand Computation
- 10.8.3 Illustrative Example 10.2: By-Hand Computation
- References
- Chapter 11: Artificial Neural Networks and Deep Learning for Genomic Prediction of Continuous Outcomes
- 11.1 Hyperparameters to Be Tuned in ANN and DL
- 11.1.1 Network Topology
- 11.1.2 Activation Functions
- 11.1.3 Loss Function
- 11.1.4 Number of Hidden Layers
- 11.1.5 Number of Neurons in Each Layer
- 11.1.6 Regularization Type
- 11.1.7 Learning Rate
- 11.1.8 Number of Epochs and Number of Batches
- 11.1.9 Normalization Scheme for Input Data
- 11.2 Popular DL Frameworks
- 11.3 Optimizers
- 11.4 Illustrative Examples
- Appendix 1
- Appendix 2
- Appendix 3
- Appendix 4
- Appendix 5
- References
- Chapter 12: Artificial Neural Networks and Deep Learning for Genomic Prediction of Binary, Ordinal, and Mixed Outcomes
- 12.1 Training DNN with Binary Outcomes
- 12.2 Training DNN with Categorical (Ordinal) Outcomes
- 12.3 Training DNN with Count Outcomes
- 12.4 Training DNN with Multivariate Outcomes
- 12.4.1 DNN with Multivariate Continuous Outcomes
- 12.4.2 DNN with Multivariate Binary Outcomes
- 12.4.3 DNN with Multivariate Ordinal Outcomes
- 12.4.4 DNN with Multivariate Count Outcomes
- 12.4.5 DNN with Multivariate Mixed Outcomes
- Appendix 1
- Appendix 2
- Appendix 3
- Appendix 4
- Appendix 5
- References
- Chapter 13: Convolutional Neural Networks
- 13.1 The Importance of Convolutional Neural Networks
- 13.2 Tensors
- 13.3 Convolution
- 13.4 Pooling
- 13.5 Convolutional Operation for 1D Tensor for Sequence Data
- 13.6 Motivation of CNN
- 13.7 Why Are CNNs Preferred over Feedforward Deep Neural Networks for Processing Images?