Representation Learning for Natural Language Processing
Format: eBook
Language: English
Published: Singapore: Springer, 2023
Edition: 2nd ed.
Table of Contents:
- Intro
- Preface
- Book Organization
- Book Cover
- Note for the Second Edition
- Prerequisites
- Contact Information
- Acknowledgments
- Acknowledgments for the Second Edition
- Acknowledgments for the First Edition
- Contents
- Contributors
- Acronyms
- Symbols and Notations
- 1 Representation Learning and NLP
- 1.1 Motivation
- 1.2 Why Representation Learning Is Important for NLP
- 1.2.1 Multiple Granularities
- 1.2.2 Multiple Knowledge
- 1.2.3 Multiple Tasks
- 1.2.4 Multiple Domains
- 1.3 Development of Representation Learning for NLP
- 1.3.1 Symbolic Representation and Statistical Learning
- 1.3.2 Distributed Representation and Deep Learning
- 1.3.3 Going Deeper and Larger with Pre-training on Big Data
- 1.4 Intellectual Origins of Distributed Representation
- 1.4.1 Representation Debates in Cognitive Neuroscience
- 1.4.2 Knowledge Representation in AI
- 1.4.3 Feature Engineering in Machine Learning
- 1.4.4 Linguistics
- 1.5 Representation Learning Approaches in NLP
- 1.5.1 Feature Engineering
- 1.5.2 Supervised Representation Learning
- 1.5.3 Self-supervised Representation Learning
- 1.6 How to Apply Representation Learning to NLP
- 1.6.1 Input Augmentation
- 1.6.2 Architecture Reformulation
- 1.6.3 Objective Regularization
- 1.6.4 Parameter Transfer
- 1.7 Advantages of Distributed Representation Learning
- 1.8 The Organization of This Book
- References
- 2 Word Representation Learning
- 2.1 Introduction
- 2.2 Symbolic Word Representation
- 2.2.1 One-Hot Word Representation
- 2.2.2 Linguistic KB-based Word Representation
- 2.2.3 Corpus-based Word Representation
- 2.3 Distributed Word Representation
- 2.3.1 Preliminary: Interpreting the Representation
- 2.3.2 Matrix Factorization-based Word Representation
- 2.3.3 Word2vec and GloVe
- 2.3.4 Contextualized Word Representation
- 2.4 Advanced Topics
- 2.4.1 Informative Word Representation
- 2.4.2 Interpretable Word Representation
- 2.5 Applications
- 2.5.1 NLP
- 2.5.2 Cognitive Psychology
- 2.5.3 History and Social Science
- 2.6 Summary and Further Readings
- References
- 3 Representation Learning for Compositional Semantics
- 3.1 Introduction
- 3.2 Binary Composition
- 3.2.1 Additive Model
- 3.2.2 Multiplicative Model
- 3.3 N-ary Composition
- 3.4 Summary and Further Readings
- References
- 4 Sentence and Document Representation Learning
- 4.1 Introduction
- 4.2 Symbolic Sentence Representation
- 4.2.1 Bag-of-Words Model
- 4.2.2 Probabilistic Language Model
- 4.3 Neural Language Models
- 4.3.1 Feed-Forward Neural Network
- 4.3.2 Convolutional Neural Network
- 4.3.3 Recurrent Neural Network
- 4.3.4 Transformer
- 4.3.5 Enhancing Neural Language Models
- 4.4 From Sentence to Document Representation
- 4.4.1 Memory-Based Document Representation
- 4.4.2 Hierarchical Document Representation
- 4.5 Applications
- 4.5.1 Text Classification
- 4.5.2 Information Retrieval
- 4.5.3 Reading Comprehension
- 4.5.4 Open-Domain Question Answering
- 4.5.5 Sequence Labeling
- 4.5.6 Sequence-to-Sequence Generation
- 4.6 Summary and Further Readings
- References
- 5 Pre-trained Models for Representation Learning
- 5.1 Introduction
- 5.2 Pre-training Tasks
- 5.2.1 Word-Level Pre-training
- 5.2.2 Sentence-Level Pre-training
- 5.3 Model Adaptation
- 5.3.1 Full-Parameter Fine-Tuning
- 5.3.2 Delta Tuning
- 5.3.3 Prompt Learning
- 5.4 Advanced Topics
- 5.4.1 Better Model Architecture
- 5.4.2 Multilingual Representation
- 5.4.3 Multi-Task Representation
- 5.4.4 Efficient Representation
- 5.4.5 Chain-of-Thought Reasoning
- 5.5 Summary and Further Readings
- References
- 6 Graph Representation Learning
- 6.1 Introduction
- 6.2 Symbolic Graph Representation
- 6.3 Shallow Node Representation Learning
- 6.3.1 Spectral Clustering
- 6.3.2 Shallow Neural Networks
- 6.3.3 Matrix Factorization
- 6.4 Deep Node Representation Learning
- 6.4.1 Autoencoder-Based Methods
- 6.4.2 Graph Convolutional Networks
- 6.4.3 Graph Attention Networks
- 6.4.4 Graph Recurrent Networks
- 6.4.5 Graph Transformers
- 6.4.6 Extensions
- 6.5 From Node Representation to Graph Representation
- 6.5.1 Flat Pooling
- 6.5.2 Hierarchical Pooling
- 6.6 Self-Supervised Graph Representation Learning
- 6.7 Applications
- 6.8 Summary and Further Readings
- References
- 7 Cross-Modal Representation Learning
- 7.1 Introduction
- 7.2 Cross-Modal Capabilities
- 7.3 Shallow Cross-Modal Representation Learning
- 7.4 Deep Cross-Modal Representation Learning
- 7.4.1 Cross-Modal Understanding
- 7.4.2 Cross-Modal Retrieval
- 7.4.3 Cross-Modal Generation
- 7.5 Deep Cross-Modal Pre-training
- 7.5.1 Input Representations
- 7.5.2 Model Architectures
- 7.5.3 Pre-training Tasks
- 7.5.4 Adaptation Approaches
- 7.6 Applications
- 7.7 Summary and Further Readings
- References
- 8 Robust Representation Learning
- 8.1 Introduction
- 8.2 Backdoor Robustness
- 8.2.1 Backdoor Attack on Supervised Representation Learning
- 8.2.2 Backdoor Attack on Self-Supervised Representation Learning
- 8.2.3 Backdoor Defense
- 8.2.4 Toolkits
- 8.3 Adversarial Robustness
- 8.3.1 Adversarial Attack
- 8.3.2 Adversarial Defense
- 8.3.3 Toolkits
- 8.4 Out-of-Distribution Robustness
- 8.4.1 Spurious Correlation
- 8.4.2 Domain Shift
- 8.4.3 Subpopulation Shift
- 8.5 Interpretability
- 8.5.1 Understanding Model Functionality
- 8.5.2 Explaining Model Mechanism
- 8.6 Summary and Further Readings
- References
- 9 Knowledge Representation Learning and Knowledge-Guided NLP
- 9.1 Introduction
- 9.2 Symbolic Knowledge and Model Knowledge
- 9.2.1 Symbolic Knowledge
- 9.2.2 Model Knowledge
- 9.2.3 Integrating Symbolic Knowledge and Model Knowledge
- 9.3 Knowledge Representation Learning
- 9.3.1 Linear Representation
- 9.3.2 Translation Representation
- 9.3.3 Neural Representation
- 9.3.4 Manifold Representation
- 9.3.5 Contextualized Representation
- 9.3.6 Summary
- 9.4 Knowledge-Guided NLP
- 9.4.1 Knowledge Augmentation
- 9.4.2 Knowledge Reformulation
- 9.4.3 Knowledge Regularization
- 9.4.4 Knowledge Transfer
- 9.4.5 Summary
- 9.5 Knowledge Acquisition
- 9.5.1 Sentence-Level Relation Extraction
- 9.5.2 Bag-Level Relation Extraction
- 9.5.3 Document-Level Relation Extraction
- 9.5.4 Few-Shot Relation Extraction
- 9.5.5 Open-Domain Relation Extraction
- 9.5.6 Contextualized Relation Extraction
- 9.5.7 Summary
- 9.6 Summary and Further Readings
- References
- 10 Sememe-Based Lexical Knowledge Representation Learning
- 10.1 Introduction
- 10.2 Linguistic and Commonsense Knowledge Bases
- 10.2.1 WordNet and ConceptNet
- 10.2.2 HowNet
- 10.2.3 HowNet and Deep Learning
- 10.3 Sememe Knowledge Representation
- 10.3.1 Sememe-Encoded Word Representation
- 10.3.2 Sememe-Regularized Word Representation
- 10.4 Sememe-Guided Natural Language Processing
- 10.4.1 Sememe-Guided Semantic Compositionality Modeling
- 10.4.2 Sememe-Guided Language Modeling
- 10.4.3 Sememe-Guided Recurrent Neural Networks
- 10.5 Automatic Sememe Knowledge Acquisition
- 10.5.1 Embedding-Based Sememe Prediction
- 10.5.2 Sememe Prediction with Internal Information
- 10.5.3 Cross-lingual Sememe Prediction
- 10.5.4 Connecting HowNet with BabelNet
- 10.5.5 Summary and Discussion
- 10.6 Applications
- 10.6.1 Chinese LIWC Lexicon Expansion
- 10.6.2 Reverse Dictionary
- 10.7 Summary and Further Readings
- References
- 11 Legal Knowledge Representation Learning
- 11.1 Introduction
- 11.2 Typical Tasks and Real-World Applications
- 11.3 Legal Knowledge Representation and Acquisition
- 11.3.1 Legal Textual Knowledge
- 11.3.2 Legal Structured Knowledge
- 11.3.3 Discussion
- 11.4 Knowledge-Guided Legal NLP
- 11.4.1 Input Augmentation
- 11.4.2 Architecture Reformulation
- 11.4.3 Objective Regularization
- 11.4.4 Parameter Transfer
- 11.5 Outlook
- 11.6 Ethical Consideration
- 11.7 Open Competitions and Benchmarks
- 11.8 Summary and Further Readings
- References
- 12 Biomedical Knowledge Representation Learning
- 12.1 Introduction
- 12.1.1 Perspectives for Biomedical NLP
- 12.1.2 Role of Knowledge in Biomedical NLP
- 12.2 Biomedical Knowledge Representation and Acquisition
- 12.2.1 Biomedical Knowledge from Natural Language
- 12.2.2 Biomedical Knowledge from Biomedical Language Materials
- 12.3 Knowledge-Guided Biomedical NLP
- 12.3.1 Input Augmentation
- 12.3.2 Architecture Reformulation
- 12.3.3 Objective Regularization
- 12.3.4 Parameter Transfer
- 12.4 Typical Applications
- 12.4.1 Literature Processing
- 12.4.2 Retrosynthetic Prediction
- 12.4.3 Diagnosis Assistance
- 12.5 Advanced Topics
- 12.6 Summary and Further Readings
- References
- 13 OpenBMB: Big Model Systems for Large-Scale Representation Learning
- 13.1 Introduction
- 13.2 BMTrain: Efficient Training Toolkit for Big Models
- 13.2.1 Data Parallelism
- 13.2.2 ZeRO Optimization
- 13.2.3 Quickstart of BMTrain
- 13.3 OpenPrompt and OpenDelta: Efficient Tuning Toolkit for Big Models
- 13.3.1 Serving Multiple Tasks with a Unified Big Model
- 13.3.2 Quickstart of OpenPrompt
- 13.3.3 Quickstart of OpenDelta
- 13.4 BMCook: Efficient Compression Toolkit for Big Models
- 13.4.1 Model Quantization
- 13.4.2 Model Distillation
- 13.4.3 Model Pruning
- 13.4.4 Model MoEfication