Biocomputing 2022 - Proceedings Of The Pacific Symposium.
Format: | eBook |
---|---|
Language: | English |
Published: | Singapore : World Scientific Publishing Company, 2021. |
Edition: | 1st ed. |
Table of Contents:
- Intro
- Content
- Preface
- AI-DRIVEN ADVANCES IN MODELING OF PROTEIN STRUCTURE
- Session Introduction: AI-Driven Advances in Modeling of Protein Structure
- 1. A short retrospect
- 2. A brief outline of current research
- 3. Future developments (complexes, ligand interactions, other molecules, dynamics, language models, geometry models, sequence design)
- 4. What is needed for further progress?
- 5. Overview of papers in this session
- 5.1. Evaluating significance of training data selection in machine learning
- 5.2. Geometric pattern transferability
- 5.3. Supervised versus unsupervised sequence to contact learning
- 5.4. Side chain packing using SE(3) transformers
- 5.5. Feature selection in electrostatic representations of ligand binding sites
- References
- Training Data Composition Affects Performance of Protein Structure Analysis Algorithms
- 1. Introduction
- 2. Methods
- 2.1. Experimental Design
- 2.2. Task-specific Methods
- 3. Results
- 3.1. Performance on NMR and cryo-EM structures is consistently lower than performance on X-ray structures, independent of training set
- 3.2. Inclusion of NMR data in the training set improves performance on held-out NMR data and does not degrade performance on X-ray data
- 3.3. Known biochemical and biophysical effects are replicated in trained models
- 3.4. Downsampling X-ray structures during training negatively affects performance on all types of data
- 4. Conclusion
- 5. Acknowledgments
- References
- Transferability of Geometric Patterns from Protein Self-Interactions to Protein-Ligand Interactions
- 1. Introduction
- 2. Related Work
- 3. Methods
- 3.1. Datasets
- 3.2. Contact extraction
- 3.3. Representing contact geometry
- 4. Results
- 4.1. Protein self-contacts exhibit clear geometric clustering.
- 4.2. Many geometric patterns transfer to protein-ligand contacts
- 4.3. Application to protein-ligand docking
- 5. Conclusion and Future Work
- Supplemental Material, Code, and Data Availability
- Acknowledgments
- References
- Interpreting Potts and Transformer Protein Models Through the Lens of Simplified Attention
- 1. Introduction
- 2. Background
- 3. Methods
- 3.1. Potts Models
- 3.2. Factored Attention
- 3.3. Single-layer attention
- 3.4. Pretraining on Sequence Databases
- 3.5. Extracting Contacts
- 4. Results
- 5. Discussion
- Acknowledgements
- References
- Side-Chain Packing Using SE(3)-Transformer
- 1. Introduction
- 2. Methods
- 2.1. Neighborhood Graph Representation
- 2.2. The SE(3)-Transformer Architecture
- 2.3. Node Features
- 2.4. Final Layer
- 2.5. Rotamer Selection
- 2.6. Experiments
- 3. Results
- 4. Conclusion
- 5. Acknowledgements
- 6. References
- DeepVASP-E: A Flexible Analysis of Electrostatic Isopotentials for Finding and Explaining Mechanisms that Control Binding Specificity
- 1. Introduction
- 2. Methods
- 2.1. Convolutional Neural Network
- 2.2. Experimental Design
- 2.3. Comparison with Existing Methods
- 3. Results
- 4. Conclusions
- Acknowledgements
- References
- BIG DATA IMAGING GENOMICS
- Session Introduction: Big Data Imaging Genomics
- 1. Introduction
- 2. Overview of Contributions
- References
- A New Mendelian Randomization Method to Estimate Causal Effects of Multivariable Brain Imaging Exposures
- 1. Introduction
- 2. Methods
- 2.1. Step 1: Mendelian randomization analysis on a single imaging exposure
- 2.2. Step 2: Joint instrumental variables and imaging exposures selection
- 2.3. Step 3: Causal effect identification for multiple imaging exposures
- 3. Application to evaluate the causal effect of white matter microstructure integrity on cognitive function.
- 3.1. Data and study cohort
- 3.2. Results
- 4. Simulation
- 5. Discussion
- Funding
- Availability of data and materials
- Authors' contributions
- References
- Efficient Differentially Private Methods for a Transmission Disequilibrium Test in Genome Wide Association Studies
- 1. Introduction
- 2. Preliminaries
- 2.1. TDT
- 2.2. Differential Privacy
- 3. Methods
- 3.1. Exact Algorithm
- 3.2. Approximation Algorithm
- 4. Experiments
- 4.1. Simulation Data
- 4.2. Results
- 4.2.1. Run Time
- 4.2.2. Accuracy
- 4.3. Real Data
- 5. Conclusion
- Acknowledgement
- References
- Identifying Imaging Genetic Associations via Regional Morphometricity Estimation
- 1. Introduction
- 2. Methods
- 3. Materials
- 4. Experimental Design
- 5. Results and Discussion
- 6. Conclusion
- Acknowledgements
- References
- Identifying Highly Heritable Brain Amyloid Phenotypes Through Mining Alzheimer's Imaging and Sequencing Biobank Data
- 1. Introduction
- 2. Method
- 3. Materials
- 4. Experimental Workflow
- 5. Results and Discussion
- 6. Conclusion
- Acknowledgements
- References
- Effects of ApoE4 and ApoE2 Genotypes on Subcortical Magnetic Susceptibility and Microstructure in 27,535 Participants from the UK Biobank
- 1. Introduction
- 2. Methods
- 2.1. UK Biobank Participants
- 2.2. T1-Weighted MRI
- 2.3. Quantitative Magnetic Susceptibility
- 2.4. Diffusion-Weighted MRI
- 2.5. Statistical Analyses
- 3. Results
- 3.1. ApoE4 Microstructural Associations
- 3.2. ApoE2 Microstructural Associations
- 3.3. ApoE-by-Age Interactions
- 3.3.1. ApoE Associations Stratified by Age
- 4. Discussion
- References
- Separating Clinical and Subclinical Depression by Big Data Informed Structural Vulnerability Index and Its impact on Cognition: ENIGMA Dot Product
- 1. Introduction
- 2. Methods.
- 2.1 Participants.
- 2.2 Major Depressive Disorder Classification
- 2.3 Imaging Protocol and Processing
- 2.4 Calculation of linear indices of similarity
- 2.5 Calculation of QRI
- 2.7 Cognitive assessment
- 2.8 Statistics
- 3. Results
- 3.1 Group differences in symptoms and biomarkers
- 3.2 Effects of MDD on cognition.
- 3.3. Cognitive association
- 4. Discussion.
- 5. Conclusion
- 6. Acknowledgement
- References
- Generalizing Few-Shot Classification of Whole-Genome Doubling Across Cancer Types
- 1. Introduction
- 2. Related Work
- 3. Cohort
- 3.1. Cohort Selection
- 3.2. Feature Extraction
- 4. Methods
- 4.1. Model
- 4.2. Training
- 4.2.1. Pre-Training
- 4.2.2. Meta-Training
- 4.3. Meta-Validation and Meta-Test
- 4.4. Experiments
- 4.4.1. Cancer Types
- 4.4.2. Batch Effects
- 5. Results
- 5.1. Cancer Types
- 5.2. Batch Effects
- 5.2.1. Image Resolution
- 5.2.2. Image Brightness
- 6. Discussion
- Software and Data
- References
- HUMAN INTRIGUE: META-ANALYSIS APPROACHES FOR BIG QUESTIONS WITH BIG DATA WHILE SHAKING UP THE PEER REVIEW PROCESS
- Session Introduction: Human Intrigue: Meta-Analysis Approaches for Big Questions with Big Data While Shaking Up the Peer Review Process
- 1. Introduction
- 2. The Crowd Peer Review Process
- 2.1 Reviewer's Feedback
- 2.2 Conclusions
- 3. Meta-Analysis in Biocomputing
- 3.1 Novel Methods for Meta-Analysis of 'Omics Data
- 3.2 Using Publicly Available Data in Methods Development
- 3.3 Studying the Structure of Publicly Available Data
- 3.4 Conclusions
- Acknowledgements
- References
- Multitask Group Lasso for Genome Wide Association Studies in Diverse Populations
- 1. Introduction
- 2. Methods
- 2.1. Population stratification
- 2.2.1. Adjacency-constrained hierarchical clustering
- 2.2.2. LD-groups across populations
- 2.3. Multitask group Lasso.
- 2.3.1. General framework and problem formulation
- 2.3.2. Related work
- 2.3.3. Gap safe screening rules
- 2.4. Stability selection
- 3. Experiments
- 3.1. Data
- 3.2. Preprocessing
- 3.3. Comparison partners
- 4. Results
- 4.1. MuGLasso draws on both LD-groups and the multitask approach to recover disease SNPs
- 4.2. MuGLasso provides the most stable selection
- 4.3. MuGLasso selects both task-specific and global LD-groups
- 5. Discussion and Conclusions
- Acknowledgments
- Supplementary Materials and code
- References
- Mixed Effects Machine Learning Models for Colon Cancer Metastasis Prediction Using Spatially Localized Immuno-Oncology Markers
- 1. Introduction
- 2. Motivation for Comparison Study
- 2.1. Review of Prior Spatial Omics Analysis Methods
- 2.2. Motivation for Mixed Effects Machine Learning Approaches
- 3. Materials and Methods
- 3.1. Data Acquisition and Preprocessing
- 3.2. Experimental Design: Prediction Tasks and Modeling Approaches
- 4. Results
- 4.1. Macro: Inter-Tumoral Prediction
- 4.2. METS: Nodal and Distant Metastasis Prediction
- 5. Discussion
- 6. Conclusion
- 7. Acknowledgements
- 8. References
- Improving QSAR Modeling for Predictive Toxicology Using Publicly Aggregated Semantic Graph Data and Graph Neural Networks
- 1. Introduction
- 2. Methods
- 2.1. Obtaining toxicology assay data
- 2.2. Aggregating publicly available multimodal graph data
- 2.3. Heterogeneous graph neural network
- 2.3.1. Node classification
- 2.4. Baseline QSAR classifiers
- 3. Results
- 3.1. GNN node classification performance vs. baseline QSAR models
- 3.2. Ablation analysis of graph components' influence on the trained model
- 4. Discussion
- 4.1. GNNs versus traditional ML for QSAR modeling
- 4.2. Interpretability of GNNs in QSAR
- 4.3. Sources of bias and their effects on QSAR for toxicity prediction.
- 5. Conclusions.