Deep Learning in Quantitative Finance
John Wiley & Sons Inc (publisher)
978-1-119-68524-1 (ISBN)
- Not yet published (expected March 2026)
This book is a complete resource on how deep learning is used in quantitative finance applications. It introduces the basics of neural networks, including feedforward networks, optimization, and training, before moving on to more advanced topics and the most important software frameworks. It then covers the latest deep learning research in quantitative finance, including approximating derivative values, volatility models, credit curve mapping, generating realistic market data, and hedging. The book concludes with a look at the potential of quantum deep learning and the broader implications of deep learning for quantitative finance and quantitative analysts.
- Covers the basics of deep learning and neural networks, including feedforward networks, optimization and training, and regularization techniques
- Offers an understanding of more advanced topics like CNNs, RNNs, autoencoders, generative models including GANs and VAEs, and deep reinforcement learning
- Demonstrates deep learning applications in quantitative finance through case studies and hands-on examples via the companion website
- Introduces the most important software frameworks for applying deep learning within finance
This book is ideal for anyone working in quantitative finance who wants to engage with a subject that is set to be hugely influential for the future of the field.
Contents
Acknowledgments
1 Introduction
  1.1 What this book is about
  1.2 The Rise of AI
    1.2.1 LLMs
  1.3 The Promise of AI in Quantitative Finance
  1.4 Practicalities
    1.4.1 The Examples
    1.4.2 Python and PyTorch
    1.4.3 Docker
  1.5 Reading this book
2 Feed Forward Neural Networks
  2.1 Introducing Neural Networks
    2.1.1 Why activation must be non-linear
    2.1.2 Learning Representations
  2.2 Regression and Classification
  2.3 Activation Functions
    2.3.1 Linear
    2.3.2 Sigmoid (Logistic)
    2.3.3 Heaviside (Binary)
    2.3.4 Hyperbolic Tangent (tanh)
    2.3.5 Rectified Linear Unit (ReLU)
    2.3.6 Leaky ReLU
    2.3.7 Parametric Rectified Linear Unit (PReLU)
    2.3.8 Gaussian Error Linear Unit (GELU)
    2.3.9 Exponential Linear Unit (ELU)
    2.3.10 Scaled Exponential Linear Unit (SELU)
    2.3.11 Swish
    2.3.12 Scaled Exponentially-Regularised Linear Units (SERLU)
    2.3.13 Softmax
  2.4 The Universal Function Approximation Theorem
  2.5 Conclusions
3 Training Neural Networks
  3.1 Backpropagation and Adjoint Algorithmic Differentiation
    3.1.1 Adjoint Algorithmic Differentiation
  3.2 Data Preparation and Scaling
    3.2.1 Vectorization
    3.2.2 Input Normalization
    3.2.3 Handling Test and Validation Data
    3.2.4 Feature Engineering?
  3.3 Weight Initialization
    3.3.1 Initializing Weights
    3.3.2 Initializing Biases
  3.4 The Choice of Loss Function
    3.4.1 Regression
    3.4.2 Binary Classification
    3.4.3 Multi-class Classification
    3.4.4 Multi-label Classification
  3.5 Optimization Algorithms
    3.5.1 Basic Techniques
    3.5.2 Optimizers with Adaptive Learning Rates
  3.6 Common Training Problems
    3.6.1 Overfitting/Underfitting
    3.6.2 Defining Bias and Variance Mathematically
    3.6.3 Local Minima
    3.6.4 Saddle Points and Second Order Methods
    3.6.5 Vanishing and Exploding Gradients
  3.7 Batch Normalization
  3.8 Evaluation and Validation
    3.8.1 The Train / Test / Validation Split
    3.8.2 Evaluation Metrics
  3.9 Sobolev Training Using Function Derivatives
    3.9.1 Incorporating Derivatives
    3.9.2 Key Theorems
    3.9.3 Empirical Results
  3.10 Conclusions
4 Regularisation
  4.1 Introduction: Regularisation and Generalisation
  4.2 Weight Decay
    4.2.1 L2 Regularisation
    4.2.2 L1 Regularisation
  4.3 Early Stopping
  4.4 Ensemble Methods and Dropout
    4.4.1 Bootstrap Aggregating (Bagging)
    4.4.2 Dropout
  4.5 Data Augmentation
  4.6 Other Regularisation Methods
    4.6.1 Batch Normalisation as Regularisation
    4.6.2 Multitask Learning
  4.7 Conclusions: Regularisation Strategy
5 Hyperparameter Optimization
  5.1 Introduction
    5.1.1 Types of Hyperparameter
    5.1.2 Types of HPO
  5.2 Manual
  5.3 Grid Search
  5.4 Random Search
  5.5 Bayesian Optimization
    5.5.1 The Gaussian Process Surrogate Model
    5.5.2 The Acquisition Function
    5.5.3 Enhancements for Bayesian Hyperparameter Optimization
  5.6 Bandit-based
    5.6.1 Successive Halving (SHA)
    5.6.2 Hyperband
    5.6.3 BOHB
    5.6.4 Asynchronous Successive Halving (ASHA)
  5.7 Population Based Training (PBT)
  5.8 Conclusions
6 Convolutional Neural Networks
  6.1 Introduction
  6.2 Convolutions
    6.2.1 Mathematics of Convolutions
    6.2.2 Convolutional Layers
    6.2.3 Edge Effects and Padding
    6.2.4 Multi-channel Convolutions
    6.2.5 Selecting Filter Sizes
    6.2.6 Choosing the Number of Filters
  6.3 Downsampling
    6.3.1 Strided Convolutions
    6.3.2 Pooling
  6.4 Data Augmentation
  6.5 Transfer Learning Using Pre-trained Networks
  6.6 Visualising Features
    6.6.1 Visualizing Filters and Feature Activations
    6.6.2 Gradient-based Visualization
  6.7 Famous CNNs
    6.7.1 LeNet
    6.7.2 AlexNet
    6.7.3 VGG
    6.7.4 Inception
    6.7.5 ResNet
  6.8 Conclusions on CNNs
7 Sequence Models
  7.1 Introducing Sequence Models
  7.2 Recurrent Neural Networks
    7.2.1 Shallow RNNs
    7.2.2 Bidirectional RNNs
    7.2.3 Deep RNNs
    7.2.4 Vanishing and Exploding Gradients
    7.2.5 Long Short-Term Memory (LSTM)
    7.2.6 Gated Recurrent Unit (GRU)
  7.3 Neural Natural Language Processing
    7.3.1 Introducing NLP
    7.3.2 NLP Preprocessing
    7.3.3 N-grams
    7.3.4 Evaluation Metrics for NLP
    7.3.5 A Neural Probabilistic Language Model
    7.3.6 Word Embeddings
    7.3.7 RNNs and NLP
    7.3.8 Sequence to Sequence Models
    7.3.9 Attention Mechanisms
    7.3.10 Transformers and Large Language Models
  7.4 Conclusions on Sequence Models
8 Autoencoders
  8.1 Introduction
    8.1.1 Encoders and Decoders
  8.2 Autoencoders and Singular Value Decomposition
    8.2.1 PCA and SVD
    8.2.2 Linear Autoencoders Replicate SVD
    8.2.3 Autoencoders as Non-linear PCA
  8.3 Shallow and Deep Autoencoders
  8.4 Regularized and Sparse Autoencoders
  8.5 Denoising Autoencoders
  8.6 Autoencoders and Generative Models
  8.7 Conclusion
9 Generative Models
  9.1 Introduction
  9.2 Evaluating Generative Model Performance
    9.2.1 Inception Score
    9.2.2 Fréchet Inception Distance
  9.3 Energy-based Models (EBMs)
    9.3.1 Boltzmann Machines
    9.3.2 Restricted Boltzmann Machines (RBMs)
    9.3.3 Deep Belief Networks
    9.3.4 Deep Boltzmann Machines
    9.3.5 Deep Energy-Based Models
    9.3.6 Joint Energy-Based Model (JEM)
    9.3.7 Score-based Models
  9.4 Variational Autoencoders (VAEs)
    9.4.1 Why Variational?
    9.4.2 Empirical View of VAEs
    9.4.3 Probabilistic View of VAEs
    9.4.4 Evidence Lower Bound (ELBO)
    9.4.5 Stochastic Gradient Descent and ELBO
    9.4.6 The Reparameterization Trick
    9.4.7 Marginal Likelihood
    9.4.8 Challenges with VAEs
  9.5 Generative Adversarial Networks (GANs)
    9.5.1 Early GANs
    9.5.2 Stabilizing GANs
    9.5.3 Controlling Generation
    9.5.4 High Resolution GANs
    9.5.5 Image Translation
    9.5.6 GAN Inversion
    9.5.7 Conclusions on GANs
  9.6 Latent Diffusion Models (LDMs)
  9.7 Conclusions on Generative Models
10 Deep Reinforcement Learning
  10.1 Introduction
  10.2 Key Concepts in Reinforcement Learning
    10.2.1 Defining RL
    10.2.2 Rewards
    10.2.3 Agent and Environment
    10.2.4 History and State
    10.2.5 Policy, Value and State-action Functions
    10.2.6 Model
  10.3 Markov Decision Processes (MDPs) and the Bellman Equations
    10.3.1 Optimal Policy
  10.4 Dynamic Programming and Policy Search
    10.4.1 Policy Evaluation or Prediction
    10.4.2 Policy Improvement
    10.4.3 Policy Iteration
    10.4.4 Value Iteration
  10.5 Monte Carlo Methods for RL
    10.5.1 Monte Carlo Prediction
    10.5.2 Monte Carlo Control
  10.6 TD Learning
    10.6.1 TD Prediction
    10.6.2 On-policy TD Control: SARSA
    10.6.3 Off-policy TD Control: Q-learning
    10.6.4 TD and Bias-Variance Trade-off
    10.6.5 n-step TD
  10.7 Deep Q Networks (DQNs)
    10.7.1 Introducing DQNs
    10.7.2 Architecture of DQNs
    10.7.3 Experience Replay
    10.7.4 Training a DQN
    10.7.5 DQN Variants
  10.8 Policy Gradient
    10.8.1 Parametrised Policies
    10.8.2 Policy Gradient Theorem
    10.8.3 REINFORCE
  10.9 Actor-Critic Methods
  10.10 Conclusions
11 Derivative Valuation using Neural Networks
  11.1 Introduction
  11.2 Derivative Valuation using Neural Networks Trained as Non-parametric Models
  11.3 Derivative Valuation Function Approximation
    11.3.1 Deeply Learning Derivatives
    11.3.2 Controlling Asymptotic Behaviour
    11.3.3 Fine Tuning
    11.3.4 Conclusions on Derivative Valuation Function Approximation
12 High Dimensional PDE and BSDE Solvers
  12.1 Introduction
  12.2 Deep Galerkin Method (DGM)
    12.2.1 Introduction
    12.2.2 Algorithm
    12.2.3 Theorems
    12.2.4 Numerical Examples
  12.3 Deep BSDE Solvers
    12.3.1 Introducing Backward Stochastic Differential Equations
    12.3.2 Deep BSDE Algorithm
    12.3.3 Deep BSDEs in Quant Finance
    12.3.4 Conclusions on Deep BSDE Solvers
  12.4 Projection and Martingale Solvers
  12.5 Deep Path Dependent PDEs (DPPDE)
  12.6 Physics Informed Neural Networks (PINNs)
  12.7 Deep Backward Dynamic Programming (DBDP)
  12.8 Deep Splitting (DS)
  12.9 Conclusions
13 Deep Monte Carlo and Optimal Stopping
  13.1 Introduction
  13.2 Deep Monte Carlo
    13.2.1 Deep Importance Sampling
    13.2.2 Learning Control Variates
    13.2.3 Deep Weighted Monte Carlo
    13.2.4 Conclusions on Deep Monte Carlo
  13.3 Deep Optimal Stopping and Applications
    13.3.1 American and Bermudan Options
    13.3.2 American Monte Carlo
    13.3.3 Deep Optimal Stopping for Valuation and XVA
  13.4 Conclusions on Deep Monte Carlo
14 Static Replication using Neural Networks
  14.1 (Semi) Static Replication
  14.2 Neural Static Replication
    14.2.1 Regress Now or Later?
    14.2.2 Neural Regress Later
  14.3 Conclusions on Neural Static Replication
15 Volatility Surfaces
  15.1 Introduction
  15.2 Volatility Surface Models
    15.2.1 Definitions
    15.2.2 Heston
    15.2.3 SABR
    15.2.4 Rough Bergomi
    15.2.5 Calibrating Volatility Models
  15.3 Deep Learning Volatility Surfaces
  15.4 Deep Local Volatility
    15.4.1 The Dupire Local Volatility
    15.4.2 Fitting a Local Volatility Consistent Pricing Function
    15.4.3 Fitting Self-consistent Implied and Local Volatilities
    15.4.4 Conclusions on Deep Local Volatility
  15.5 Conclusions
16 Model Calibration
  16.1 Introduction
  16.2 Model Calibration
    16.2.1 One or Two Step
    16.2.2 Heston
    16.2.3 Short Rate / HJM Models
  16.3 Conclusion on Deep Calibration
17 XVA
  17.1 Introduction
  17.2 Credit Curve Mapping
    17.2.1 Classification Approach
    17.2.2 Regression Approach
  17.3 Exposure Calculation using Neural Networks
  17.4 Conclusions on Deep XVA
18 Generating Realistic Market Data
  18.1 Introduction and Classical Methods
  18.2 Motivation and Applications of Synthetic Financial Market Data
    18.2.1 Motivation for Synthetic Financial Market Data
    18.2.2 Applications of Synthetic Financial Market Data
  18.3 Time Series Generation
    18.3.1 Empirical Properties of Financial Time Series
    18.3.2 Empirical Tests
    18.3.3 Time Series Generation
    18.3.4 Conclusions on Time-Series Generation
  18.4 Generating Higher Dimensional Market Data Structures
    18.4.1 Generating Yield Curves
    18.4.2 Generating Correlation Matrices
    18.4.3 Generating Volatility Surfaces
  18.5 Completing Market Data: Imputing Missing Values
  18.6 Conclusions on Synthetic Market Data
19 Deep Hedging
  19.1 Introduction
  19.2 Approaches to Deep Hedging
    19.2.1 Introduction
    19.2.2 Target Applications
    19.2.3 Datasets
    19.2.4 Duration and Frequency
    19.2.5 State and Action
    19.2.6 Reward / Objective
    19.2.7 RL Methodology
    19.2.8 Network Architecture
    19.2.9 Results
    19.2.10 Conclusions on the Deep Hedging Literature
  19.3 Deep Hedging Examples
    19.3.1 Target Application
    19.3.2 Datasets
    19.3.3 Duration and Frequency
    19.3.4 State and Action
    19.3.5 Reward / Objective
    19.3.6 Methodology
    19.3.7 Implementation
    19.3.8 Results
  19.4 Conclusion
20 The Future Quant
  20.1 Conclusion on Deep Learning
  20.2 The Future of Quantitative Analytics
  20.3 The Future Quant
  20.4 A Final Word
| Publication date | 06.12.2023 |
|---|---|
| Series | Wiley Finance |
| Place of publication | New York |
| Language | English |
| Dimensions | 170 x 244 mm |
| Subject area | Economics ► Business Administration / Management |
| ISBN-10 | 1-119-68524-9 / 1119685249 |
| ISBN-13 | 978-1-119-68524-1 / 9781119685241 |
| Condition | New |