Recent Advances in Ensembles for Feature Selection (eBook)
205 pages
Springer International Publishing (publisher)
978-3-319-90080-3 (ISBN)
This book offers a comprehensive overview of ensemble learning applied to feature selection (FS). Ensemble learning combines the outputs of multiple methods to obtain better results than any single method. The book reviews various techniques for combining partial results, measuring diversity and evaluating ensemble performance.
With the advent of Big Data, feature selection (FS) has become more necessary than ever to achieve dimensionality reduction. With so many methods available, it is difficult to choose the most appropriate one for a given setting, thus making the ensemble paradigm an interesting alternative.
The authors first focus on the foundations of ensemble learning and classical approaches, before diving into the specific aspects of ensembles for FS, such as combining partial results, measuring diversity and evaluating ensemble performance. Lastly, the book shows examples of successful applications of ensembles for FS and introduces the new challenges that researchers now face. As such, the book offers a valuable guide for all practitioners, researchers and graduate students in the areas of machine learning and data mining.
Foreword
Preface
Contents
1 Basic Concepts
1.1 What Is a Dataset, Feature and Class?
1.2 Classification Error/Accuracy
1.3 Training and Testing
1.4 Comparison of Models: Statistical Tests
1.4.1 Two Models and a Single Dataset
1.4.2 Two Models and Multiple Dataset
1.4.3 Multiple Models and Multiple Dataset
1.5 Data Repositories
1.6 Summary
References
2 Feature Selection
2.1 Foundations of Feature Selection
2.2 State-of-the-Art Feature Selection Methods
2.2.1 Filter Methods
2.2.2 Embedded Methods
2.2.3 Wrapper Methods
2.3 Which Is the Best Feature Selection Method?
2.3.1 Datasets
2.3.2 Experimental Study
2.4 On the Scalability of Feature Selection Methods
2.4.1 Experimental Study
2.5 Summary
References
3 Foundations of Ensemble Learning
3.1 The Rationale of the Approach
3.2 Most Popular Methods
3.2.1 Boosting
3.2.2 Bagging
3.3 Summary
References
4 Ensembles for Feature Selection
4.1 Introduction
4.2 Homogeneous Ensembles for Feature Selection
4.2.1 A Use Case: Homogeneous Ensembles for Feature Selection Using Ranker Methods
4.3 Heterogeneous Ensembles for Feature Selection
4.3.1 A Use Case: Heterogeneous Ensemble for Feature Selection Using Ranker Methods
4.4 A Comparison on the Result of Both Use Cases: Homogeneous Versus Heterogeneous Ensemble for Feature Selection Using Ranker Methods
4.5 Summary
References
5 Combination of Outputs
5.1 Combination of Label Predictions
5.1.1 Majority Vote
5.1.2 Decision Rules
5.2 Combination of Subsets of Features
5.2.1 Intersection and Union
5.2.2 Using Classification Accuracy
5.2.3 Using Complexity Measures
5.3 Combination of Rankings of Features
5.3.1 Simple Operations Between Ranks
5.3.2 Stuart Aggregation Method
5.3.3 Robust Rank Aggregation
5.3.4 SVM-Rank
5.4 Summary
References
6 Evaluation of Ensembles for Feature Selection
6.1 Introduction
6.2 Diversity
6.3 Stability
6.3.1 Stability of Subsets of Features
6.3.2 Stability of Rankings of Features
6.4 Performance of Ensembles
6.4.1 Are the Selected Features the Relevant Ones?
6.4.2 The Ultimate Evaluation: Classification Performance
6.5 Summary
References
7 Other Ensemble Approaches
7.1 Introduction
7.2 Ensembles for Classification
7.2.1 One-Class Classification
7.2.2 Imbalanced Data
7.2.3 Data Streaming
7.2.4 Missing Data
7.3 Ensembles for Quantification
7.4 Ensembles for Clustering
7.4.1 Types of Clustering Ensembles
7.5 Ensembles for Other Preprocessing Steps: Discretization
7.6 Summary
References
8 Applications of Ensembles Versus Traditional Approaches: Experimental Results
8.1 The Rationale of the Approach
8.2 The Process of Selecting the Methods for the Ensemble
8.3 Two Filter Ensemble Approaches
8.3.1 Ensemble 1
8.3.2 Ensemble 2
8.4 Experimental Setup
8.5 Experimental Results
8.5.1 Results on Synthetic Data
8.5.2 Results on Classical Datasets
8.5.3 Results on Microarray Data
8.5.4 The Imbalance Problem
8.6 Summary
References
9 Software Tools
9.1 Popular Software Tools
9.1.1 Matlab
9.1.2 Weka
9.1.3 R
9.1.4 KEEL
9.1.5 RapidMiner
9.1.6 Scikit-Learn
9.1.7 Parallel Learning
9.2 Code Examples
9.2.1 Example: Building an Ensemble of Trees
9.2.2 Example: Adding Feature Selection to Our Ensemble of Trees
9.2.3 Example: Exploring Different Ensemble Sizes for Our Ensemble
References
10 Emerging Challenges
10.1 Introduction
10.2 Recent Contributions in Feature Selection
10.2.1 Applications
10.3 The Future: Challenges Ahead for Feature Selection
10.3.1 Millions of Dimensions
10.3.2 Scalability
10.3.3 Distributed Feature Selection
10.3.4 Real-Time Processing
10.3.5 Feature Cost
10.3.6 Missing Data
10.3.7 Visualization and Interpretability
10.4 Summary
References
| Publication date (per publisher) | 30 April 2018 |
|---|---|
| Series | Intelligent Systems Reference Library |
| Additional info | XIV, 205 p. 39 illus., 36 illus. in color. |
| Place of publication | Cham |
| Language | English |
| Subject areas | Computer Science ► Theory / Studies ► Artificial Intelligence / Robotics |
| | Engineering ► Civil Engineering |
| Keywords | Data reduction • dimensionality reduction • ensemble learning • Information Fusion • machine learning • pattern recognition |
| ISBN-10 | 3-319-90080-3 / 3319900803 |
| ISBN-13 | 978-3-319-90080-3 / 9783319900803 |
DRM: Digital watermark
This eBook contains a digital watermark and is therefore personalized for you. If the eBook is improperly passed on to third parties, it can be traced back to the source.
File format: PDF (Portable Document Format)
With its fixed page layout, PDF is particularly well suited to technical books with columns, tables and figures. A PDF can be displayed on almost any device, but is of limited use on small displays (smartphone, eReader).
System requirements:
PC/Mac: You can read this eBook on a PC or Mac. You need a PDF viewer, e.g. Adobe Reader or Adobe Digital Editions.
eReader: This eBook can be read on (almost) all eBook readers. It is not compatible with the Amazon Kindle, however.
Smartphone/Tablet: You can read this eBook on both Apple and Android devices. You need a PDF viewer, e.g. the free Adobe Digital Editions app.
Buying eBooks from abroad
For tax reasons, we can sell eBooks only within Germany and Switzerland. Regrettably, we cannot fulfill eBook orders from other countries.