Theoretical Foundations of Functional Data Analysis, with an Introduction to Linear Operators (eBook)
John Wiley & Sons (Verlag)
978-1-118-76257-8 (ISBN)
Theoretical Foundations of Functional Data Analysis, with an Introduction to Linear Operators provides a uniquely broad compendium of the key mathematical concepts and results that are relevant for the theoretical development of functional data analysis (FDA).
The self-contained treatment of selected topics of functional analysis and operator theory includes reproducing kernel Hilbert spaces, singular value decomposition of compact operators on Hilbert spaces, and perturbation theory for both self-adjoint and non-self-adjoint operators. The probabilistic foundation for FDA is described from the perspective of random elements in Hilbert spaces as well as from the viewpoint of continuous time stochastic processes. Nonparametric estimation approaches including kernel and regularized smoothing are also introduced. These tools are then used to investigate the properties of estimators for the mean element, covariance operators, principal components, regression function and canonical correlations. A general treatment of canonical correlations in Hilbert spaces naturally leads to FDA formulations of factor analysis, regression, MANOVA and discriminant analysis.
This book will provide a valuable reference for statisticians and other researchers interested in developing or understanding the mathematical aspects of FDA. It is also suitable for a graduate level special topics course.
Tailen Hsing, Professor, Department of Statistics, University of Michigan, USA. Professor Hsing is a fellow of the International Statistical Institute and of the Institute of Mathematical Statistics. He has published numerous papers on subjects ranging from bioinformatics to extreme value theory, functional data analysis, large sample theory, and processes with long memory.
Randall Eubank, Professor Emeritus, School of Mathematical and Statistical Sciences, Arizona State University, USA. Professor Eubank is well known and respected in the functional data analysis (FDA) field. He has published numerous papers on the subject and is a regular invited speaker at key meetings.
Preface xi
1 Introduction 1
1.1 Multivariate analysis in a nutshell 2
1.2 The path that lies ahead 13
2 Vector and function spaces 15
2.1 Metric spaces 16
2.2 Vector and normed spaces 20
2.3 Banach and $L^p$ spaces 26
2.4 Inner product and Hilbert spaces 31
2.5 The projection theorem and orthogonal decomposition 38
2.6 Vector integrals 40
2.7 Reproducing kernel Hilbert spaces 46
2.8 Sobolev spaces 55
3 Linear operators and functionals 61
3.1 Operators 62
3.2 Linear functionals 66
3.3 Adjoint operator 71
3.4 Nonnegative, square-root, and projection operators 74
3.5 Operator inverses 77
3.6 Fréchet and Gâteaux derivatives 83
3.7 Generalized Gram-Schmidt decompositions 87
4 Compact operators and singular value decomposition 91
4.1 Compact operators 92
4.2 Eigenvalues of compact operators 96
4.3 The singular value decomposition 103
4.4 Hilbert-Schmidt operators 107
4.5 Trace class operators 113
4.6 Integral operators and Mercer's Theorem 116
4.7 Operators on an RKHS 123
4.8 Simultaneous diagonalization of two nonnegative definite operators 126
5 Perturbation theory 129
5.1 Perturbation of self-adjoint compact operators 129
5.2 Perturbation of general compact operators 140
6 Smoothing and regularization 147
6.1 Functional linear model 147
6.2 Penalized least squares estimators 150
6.3 Bias and variance 157
6.4 A computational formula 158
6.5 Regularization parameter selection 161
6.6 Splines 165
7 Random elements in a Hilbert space 175
7.1 Probability measures on a Hilbert space 176
7.2 Mean and covariance of a random element of a Hilbert space 178
7.3 Mean-square continuous processes and the Karhunen-Loève Theorem 184
7.4 Mean-square continuous processes in $L^2(E, \mathcal{B}(E), \mu)$ 190
7.5 RKHS valued processes 195
7.6 The closed span of a process 198
7.7 Large sample theory 203
8 Mean and covariance estimation 211
8.1 Sample mean and covariance operator 212
8.2 Local linear estimation 214
8.3 Penalized least-squares estimation 231
9 Principal components analysis 251
9.1 Estimation via the sample covariance operator 253
9.2 Estimation via local linear smoothing 255
9.3 Estimation via penalized least squares 261
10 Canonical correlation analysis 265
10.1 CCA for random elements of a Hilbert space 267
10.2 Estimation 274
10.3 Prediction and regression 281
10.4 Factor analysis 284
10.5 MANOVA and discriminant analysis 288
10.6 Orthogonal subspaces and partial CCA 294
11 Regression 305
11.1 A functional regression model 305
11.2 Asymptotic theory 308
11.3 Minimax optimality 318
11.4 Discretely sampled data 321
References 327
Index 331
Notation Index 334
Chapter 1
Introduction
Briefly stated, a stochastic process is an indexed collection of random variables all of which are defined on a common probability space $(\Omega, \mathcal{F}, \mathbb{P})$. If we denote the index set by $E$, then this can be described mathematically as
$$\{X(t, \omega) : t \in E,\ \omega \in \Omega\},$$
where, for each $t \in E$, $X(t, \cdot)$ is an $\mathcal{F}$-measurable function on the sample space $\Omega$. The argument $\omega$ will generally be suppressed and $X(t, \omega)$ will typically be shortened to just $X(t)$.
Once the $X(t)$ have been observed for every $t \in E$, the process has been realized and the resulting collection of real numbers $\{X(t) : t \in E\}$ is called a sample path for the process. Functional data analysis (fda), in the sense of this text, is concerned with the development of methodology for statistical analysis of data that represent sample paths of processes for which the index set $E$ is some (closed) interval of the real line; without loss, the interval can be taken as $[0, 1]$. This translates into observations that are functions on $[0, 1]$ and data sets that consist of a collection of such random curves.
From a practical perspective, one cannot actually observe a functional data set in its entirety; at some point, digitization must occur. Thus, analysis might be predicated on data of the form
$$\{X_i(t_j) : i = 1, \ldots, n,\ j = 1, \ldots, p\},$$
involving $n$ sample paths for some stochastic process with each sample path only being evaluated at $p$ points in $[0, 1]$. When viewed from this perspective, the data is inherently finite dimensional and the temptation is to treat it as one would data in a multivariate analysis (mva) context. However, for truly functional data, there will be many more “variables” than observations; that is, $p \gg n$. This leads to drastic ill conditioning of the linear systems that are commonplace in mva, which has consequences that can be quite profound. For example, Bickel and Levina (2004) showed that a naive application of multivariate discriminant analysis to functional data can result in a rule that always classifies by essentially flipping a fair coin, regardless of the underlying population structure.
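To make the ill-conditioning issue concrete, here is a minimal numpy sketch (not from the book; the dimensions and data are made up) showing that when $p > n$ the $p \times p$ sample covariance matrix has rank at most $n - 1$ and is therefore singular, so the linear systems that mva methods routinely solve cannot be handled directly.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 20, 200                        # many more "variables" (grid points) than curves

# Rows play the role of digitized sample paths X_i(t_1), ..., X_i(t_p).
X = rng.standard_normal((n, p))

S = np.cov(X, rowvar=False)           # p x p sample covariance matrix
print(np.linalg.matrix_rank(S))       # at most n - 1 = 19, far below p = 200
print(np.linalg.cond(S))              # enormous condition number: S is numerically singular
```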
Rote application of mva methodology is simply not the avenue one should follow for fda. On the other hand, the basic mva techniques are still meaningful in a certain sense. Data analysis tools such as canonical correlation analysis, discriminant analysis, factor analysis, multivariate analysis of variance (MANOVA), and principal components analysis exist because they provide useful ways to summarize complex data sets as well as carry out inference about the underlying parent population. In that sense, they remain conceptually valid in the fda setting even if the specific details for extracting the relevant information from data require a bit of adjustment. With that in mind, it is useful to begin by cataloging some of the multivariate methods and their associated mathematical foundations, thereby providing a roadmap of interesting avenues for study. This is the subject of the following section.
1.1 Multivariate analysis in a nutshell
mva is a mature area of statistics with a rich history. As a result, we cannot (and will not attempt to) give an in-depth overview of mva in this text. Instead, this section contains a terse, mathematical sketch of a few of the methods that are commonly employed in mva. This will, hopefully, provide the reader with some intuition concerning the form and structure of analogs of mva techniques that are used in fda as well as an appreciation for both the similarities and the differences between the two fields of study. Introductions to the theory and practice of mva can be found in a myriad of texts including Anderson (2003), Gittins (1985), Izenman (2008), Jolliffe (2004), and Johnson and Wichern (2007).
Let us begin with the basic setup where we have a $p$-dimensional random vector $X = (X_1, \ldots, X_p)^{\mathsf T}$ having (variance-)covariance matrix
$$\Sigma = \mathbb{E}\left[(X - \mu)(X - \mu)^{\mathsf T}\right] \tag{1.1}$$
with
$$\mu = \mathbb{E}[X] \tag{1.2}$$
the mean vector for $X$. Here, $\mathbb{E}$ corresponds to mathematical expectation and $a^{\mathsf T}$ indicates the transpose of a vector $a$. The matrix $\Sigma$ admits an eigenvalue–eigenvector decomposition of the form
$$\Sigma = \sum_{j=1}^{p} \lambda_j e_j e_j^{\mathsf T} \tag{1.3}$$
for eigenvalues $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_p \ge 0$ and associated orthonormal eigenvectors $e_1, \ldots, e_p$ that satisfy
$$e_i^{\mathsf T} e_j = \delta_{ij},$$
where $\delta_{ij}$ is 1 or 0 depending on whether or not $i$ and $j$ coincide. This provides a basis for principal components analysis (pca).
We can use the eigenvectors in (1.3) to define new variables $Z_j = e_j^{\mathsf T}(X - \mu)$, $j = 1, \ldots, p$, which are referred to as principal components. These are linear combinations of the original variables with the weight or loading $e_{ij}$ that is applied to $X_i - \mu_i$ in the $j$th component indicating its importance to $Z_j$; more precisely,
$$Z_j = \sum_{i=1}^{p} e_{ij}(X_i - \mu_i).$$
In fact,
$$X = \mu + \sum_{j=1}^{p} Z_j e_j \tag{1.4}$$
as, if $\Sigma$ is full rank, $e_1, \ldots, e_p$ provide an orthonormal basis for $\mathbb{R}^p$; this is even true when $\Sigma$ has less than full rank as $Z_j$ is zero with probability one when $\lambda_j = 0$. The implication of (1.4) is that $X - \mu$ can be represented as a weighted sum of the eigenvectors of $\Sigma$ with the weights/coefficients being uncorrelated random variables having variances that are the eigenvalues of $\Sigma$.
In practice, one typically retains only some number $q < p$ of the components and views them as providing a summary of the (covariance) relationship between the variables in $X$. As with any type of summarization, this results in a loss of information. The extent of this loss can be gauged by the proportion of the total variance that is recovered by the principal components that are retained. In this regard, we know that
$$\sum_{j=1}^{p} \operatorname{Var}(X_j) = \operatorname{trace}(\Sigma) = \sum_{j=1}^{p} \lambda_j,$$
while the variance of the $j$th component is
$$\operatorname{Var}(Z_j) = \lambda_j.$$
Thus, the $j$th component accounts for $100\,\lambda_j / \sum_{i=1}^{p} \lambda_i$ percent of the total variance and $100 \sum_{j=q+1}^{p} \lambda_j / \sum_{i=1}^{p} \lambda_i$ is the percentage of variability that is not accounted for by $Z_1, \ldots, Z_q$.
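As an illustration of these population-level calculations, the following numpy sketch (using a small, made-up covariance matrix) computes the eigenvalue–eigenvector decomposition (1.3) and the percentage of total variance attributable to each principal component.

```python
import numpy as np

# A made-up 3 x 3 "population" covariance matrix Sigma.
Sigma = np.array([[4.0, 2.0, 0.5],
                  [2.0, 3.0, 1.0],
                  [0.5, 1.0, 2.0]])

# Eigenvalue-eigenvector decomposition Sigma = sum_j lambda_j e_j e_j^T.
lam, E = np.linalg.eigh(Sigma)            # eigenvalues in ascending order
lam, E = lam[::-1], E[:, ::-1]            # reorder so lambda_1 >= ... >= lambda_p

# Each component accounts for 100 * lambda_j / trace(Sigma) percent of the total variance.
percent = 100 * lam / np.trace(Sigma)
print(lam)        # component variances lambda_1, ..., lambda_p
print(percent)    # percentage of total variance explained by each component
```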
Principal components possess various optimality features such as the one catalogued in Theorem 1.1.1.
Theorem 1.1.1
For $j = 1, \ldots, p$,
$$\lambda_j = \max\left\{ \operatorname{Var}(a^{\mathsf T} X) : \|a\| = 1,\ a^{\mathsf T} e_i = 0 \text{ for } i = 1, \ldots, j - 1 \right\},$$
with the maximum attained at $a = e_j$.
The proof of this result is, e.g., a consequence of developments in Section 4.2. It can be interpreted as saying that the $j$th principal component is the linear combination of $X_1, \ldots, X_p$ that accounts for the maximum amount of the remaining total variance after removing the portion that was explained by $Z_1, \ldots, Z_{j-1}$.
The discussion to this point has been concerned with only the population aspects of pca. Given a random sample $X_1, \ldots, X_n$ of observations on $X$, we estimate $\Sigma$ by the sample covariance matrix
$$S = n^{-1} \sum_{i=1}^{n} (X_i - \bar{X})(X_i - \bar{X})^{\mathsf T}$$
with
$$\bar{X} = n^{-1} \sum_{i=1}^{n} X_i$$
the sample mean vector. As $S$ is positive semidefinite, it has the eigenvalue–eigenvector representation
$$S = \sum_{j=1}^{p} \hat{\lambda}_j \hat{e}_j \hat{e}_j^{\mathsf T},$$
where the $\hat{e}_j$ are orthonormal and satisfy
$$S \hat{e}_j = \hat{\lambda}_j \hat{e}_j, \qquad \hat{\lambda}_1 \ge \cdots \ge \hat{\lambda}_p \ge 0.$$
This produces the sample principal components $\hat{Z}_j = \hat{e}_j^{\mathsf T}(X - \bar{X})$ for $X$ with sample variances $\hat{\lambda}_j$ and the associated scores $\hat{e}_j^{\mathsf T}(X_i - \bar{X})$, $i = 1, \ldots, n$, that provide sample information concerning the $Z_j$.
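A corresponding sample version can be sketched as follows (illustrative only; the function name and the choice of the $n^{-1}$ divisor are assumptions made here): the sample mean, the sample covariance matrix, its eigenvalue–eigenvector pairs, and the principal component scores are computed from an $n \times p$ data matrix.

```python
import numpy as np

def sample_pca(X):
    """Sample PCA for an n x p data matrix X whose rows are observations."""
    n = X.shape[0]
    xbar = X.mean(axis=0)                      # sample mean vector
    Xc = X - xbar
    S = Xc.T @ Xc / n                          # sample covariance matrix (divisor n)
    lam_hat, E_hat = np.linalg.eigh(S)         # eigenvalues in ascending order
    order = np.argsort(lam_hat)[::-1]          # sort so lambda_hat_1 >= ... >= lambda_hat_p
    lam_hat, E_hat = lam_hat[order], E_hat[:, order]
    scores = Xc @ E_hat                        # scores e_hat_j^T (X_i - xbar)
    return lam_hat, E_hat, scores

rng = np.random.default_rng(1)
Sigma = np.array([[4.0, 2.0, 0.5], [2.0, 3.0, 1.0], [0.5, 1.0, 2.0]])
X = rng.multivariate_normal(mean=np.zeros(3), cov=Sigma, size=500)
lam_hat, E_hat, scores = sample_pca(X)
print(lam_hat)   # close to the population eigenvalues for large n
```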
Theorems 9.1.1 and 9.1.2 of Chapter 9 can be used to deduce the large sample behavior of the sample eigenvalue–eigenvector pairs $(\hat{\lambda}_j, \hat{e}_j)$. The limiting distributions of $\sqrt{n}(\hat{\lambda}_j - \lambda_j)$ and $\sqrt{n}(\hat{e}_j - e_j)$ are found to be normal, which provides a foundation for hypothesis testing and interval estimation.
The next step is to assume that $X$ consists of two subsets of variables that we indicate by writing $X = (X_1^{\mathsf T}, X_2^{\mathsf T})^{\mathsf T}$, where $X_1 \in \mathbb{R}^{p_1}$ and $X_2 \in \mathbb{R}^{p_2}$ (so $p_1 + p_2 = p$). Questions of interest now concern the relationships that may exist between $X_1$ and $X_2$. Our focus will be on those that are manifested in their covariance structure. For this purpose, we partition the covariance matrix for $X$ from (1.1) as
$$\Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix}.$$
Here, $\Sigma_{11}$ and $\Sigma_{22}$ are the covariance matrices for $X_1$ and $X_2$, respectively, and $\Sigma_{12} = \Sigma_{21}^{\mathsf T}$ is sometimes called the cross-covariance matrix.
The goal is now to summarize the (cross-)covariance properties of $X_1$ and $X_2$. Analogous to the pca approach, this will be accomplished using linear combinations of the two random vectors. Specifically, we seek vectors $a \in \mathbb{R}^{p_1}$ and $b \in \mathbb{R}^{p_2}$ that maximize
$$\operatorname{Corr}(a^{\mathsf T} X_1, b^{\mathsf T} X_2) = \frac{a^{\mathsf T} \Sigma_{12} b}{\sqrt{a^{\mathsf T} \Sigma_{11} a}\,\sqrt{b^{\mathsf T} \Sigma_{22} b}}. \tag{1.10}$$
This optimization problem can be readily solved with the help of the singular value decomposition: e.g., Corollary 4.3.2. Assuming that $X_1$ and $X_2$ contain no redundant variables, both $\Sigma_{11}$ and $\Sigma_{22}$ will be positive definite with nonsingular square roots $\Sigma_{11}^{1/2}$ and $\Sigma_{22}^{1/2}$. This allows us to write
$$\operatorname{Corr}(a^{\mathsf T} X_1, b^{\mathsf T} X_2) = \frac{u^{\mathsf T} R v}{\|u\|\,\|v\|},$$
where
$$R = \Sigma_{11}^{-1/2} \Sigma_{12} \Sigma_{22}^{-1/2}$$
and $u = \Sigma_{11}^{1/2} a$, $v = \Sigma_{22}^{1/2} b$. The matrix $R$ can be viewed as a multivariate analog of the linear correlation coefficient between two variables. Using the singular value decomposition in Corollary 4.3.2, we can see that (1.10) is maximized by choosing $(u, v)$ to be the pair of singular vectors $(u_1, v_1)$ that correspond to its largest singular value $\rho_1$. The optimal linear combinations of $X_1$ and $X_2$ are therefore provided by the vectors $a_1 = \Sigma_{11}^{-1/2} u_1$ and $b_1 = \Sigma_{22}^{-1/2} v_1$. The corresponding random variables $U_1 = a_1^{\mathsf T} X_1$ and $V_1 = b_1^{\mathsf T} X_2$ are called the first canonical variables of the $X_1$ and $X_2$ spaces, respectively. They each have unit variance and correlation $\rho_1$ that is referred to as the first canonical correlation.
The summarization process need not stop after the first canonical variables. If $R$ has rank $r$, then there are actually $r - 1$ additional canonical variables that can be found: namely, for $j = 2, \ldots, r$, we have
$$U_j = a_j^{\mathsf T} X_1$$
and
$$V_j = b_j^{\mathsf T} X_2,$$
where $a_j = \Sigma_{11}^{-1/2} u_j$ and $b_j = \Sigma_{22}^{-1/2} v_j$, with $(u_j, v_j)$, $j = 2, \ldots, r$, the other singular vector pairs from $R$ that correspond to its remaining nonzero singular values $\rho_2 \ge \cdots \ge \rho_r$. For each choice of the index $j$, the random variable pair $(U_j, V_j)$ is uncorrelated with all the other canonical variable pairs and has corresponding canonical correlation $\rho_j = \operatorname{Corr}(U_j, V_j)$. When all this is put together, it gives us
$$\operatorname{Cov}(A^{\mathsf T} X_1, B^{\mathsf T} X_2) = \Delta$$
with
$$A = [a_1, \ldots, a_r], \qquad B = [b_1, \ldots, b_r]$$
the matrices of canonical weight vectors for $X_1$ and $X_2$ and
$$\Delta = \operatorname{diag}(\rho_1, \ldots, \rho_r)$$
a diagonal matrix containing the corresponding canonical correlations.
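The SVD-based construction can be sketched numerically as follows (illustrative only; the function name, helper, and example covariance blocks are made up). Forming $R = \Sigma_{11}^{-1/2}\Sigma_{12}\Sigma_{22}^{-1/2}$ and taking its singular value decomposition yields the canonical correlations as singular values, with the weight vectors recovered by premultiplying the singular vectors by $\Sigma_{11}^{-1/2}$ and $\Sigma_{22}^{-1/2}$.

```python
import numpy as np

def cca_from_covariance(Sigma11, Sigma12, Sigma22):
    """Canonical correlations and weight vectors from a partitioned covariance matrix."""
    def inv_sqrt(M):
        # Symmetric inverse square root via the eigendecomposition of M.
        w, V = np.linalg.eigh(M)
        return V @ np.diag(1.0 / np.sqrt(w)) @ V.T

    S11_is, S22_is = inv_sqrt(Sigma11), inv_sqrt(Sigma22)
    R = S11_is @ Sigma12 @ S22_is              # multivariate "correlation" matrix
    U, rho, Vt = np.linalg.svd(R)              # singular values = canonical correlations
    A = S11_is @ U                             # columns a_j: canonical weights for X1
    B = S22_is @ Vt.T                          # columns b_j: canonical weights for X2
    return rho, A, B

# Made-up covariance blocks for a (2 + 2)-dimensional X = (X1, X2).
Sigma11 = np.array([[1.0, 0.5], [0.5, 1.0]])
Sigma22 = np.array([[1.0, 0.3], [0.3, 1.0]])
Sigma12 = np.array([[0.6, 0.2], [0.1, 0.4]])

rho, A, B = cca_from_covariance(Sigma11, Sigma12, Sigma22)
print(rho)                                     # rho_1 >= rho_2: the canonical correlations
print(A.T @ Sigma11 @ A)                       # approximately the identity: unit-variance canonical variables
```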
There are various other ways of characterizing the canonical correlations and vectors. As they stem from the singular values and vectors of $R$, they are the eigenvalues and eigenvectors obtained from $R R^{\mathsf T}$ and $R^{\mathsf T} R$. For...
| Publication date (per publisher) | 7 April 2015 |
|---|---|
| Series | Wiley Series in Probability and Statistics |
| Language | English |
| Subject area | Mathematics / Computer Science ► Mathematics ► Analysis |
| | Mathematics / Computer Science ► Mathematics ► Statistics |
| | Mathematics / Computer Science ► Mathematics ► Probability / Combinatorics |
| | Technology |
| Keywords | biometrics • data analysis • regression analysis • statistical analysis of functional data • functional data analysis • spatial data analysis • longitudinal data analysis • handwriting analysis • human growth patterns • criminal behavior • financial analysis • statistics • Tailen Hsing • Randall Eubank |
| ISBN-10 | 1-118-76257-6 / 1118762576 |
| ISBN-13 | 978-1-118-76257-8 / 9781118762578 |