An Introduction to Econometric Theory (eBook)
John Wiley & Sons (publisher)
978-1-119-48492-9 (ISBN)
A guide to economics, statistics and finance that explores the mathematical foundations underlying econometric methods
An Introduction to Econometric Theory offers a text to help in the mastery of the mathematics that underlie econometric methods and includes a detailed study of matrix algebra and distribution theory. Designed to be an accessible resource, the text explains in clear language why things are being done, and how previous material informs a current argument. The style is deliberately informal with numbered theorems and lemmas avoided. However, very few technical results are quoted without some form of explanation, demonstration or proof.
The author - a noted expert in the field - covers a wealth of topics including: simple regression, basic matrix algebra, the general linear model, distribution theory, the normal distribution, properties of least squares, unbiasedness and efficiency, eigenvalues, statistical inference in regression, t and F tests, the partitioned regression, specification analysis, random regressor theory, introduction to asymptotics and maximum likelihood. Each of the chapters is supplied with a collection of exercises, some of which are straightforward and others more challenging. This important text:
- Presents a guide for teaching econometric methods to undergraduate and graduate students of economics, statistics or finance
- Offers proven classroom-tested material
- Contains sets of exercises that accompany each chapter
- Includes a companion website that hosts additional materials, solution manual and lecture slides
Written for undergraduates and graduate students of economics, statistics or finance, An Introduction to Econometric Theory is an essential beginner's guide to the underpinnings of econometrics.
JAMES DAVIDSON is Professor of Econometrics at the University of Exeter. He has also held teaching posts at the University of Warwick, the London School of Economics, the University of Wales Aberystwyth and Cardiff University, as well as visiting positions at the University of California, Berkeley, the University of California, San Diego, and Central European University, Budapest.
1 Elementary Data Analysis
1.1 Variables and Observations
Where to begin? Data analysis is the business of summarizing a large volume of information into a smaller compass, in a form that a human investigator can appreciate, assess, and draw conclusions from. The idea is to smooth out incidental variations so as to bring the ‘big picture’ into focus, and the fundamental concept is averaging, extracting a representative value or central tendency from a collection of cases. The correct interpretation of these averages, and functions of them, on the basis of a model of the environment in which the observed data are generated, is the main concern of statistical theory. However, before tackling these often difficult questions, gaining familiarity with the methods of summarizing sample information and doing the associated calculations is an essential preliminary.
Information must be recorded in some numerical form. Data may consist of measured magnitudes, which in econometrics are typically monetary values, prices, indices, or rates of exchange. However, another important data type is the binary indicator of membership of some class or category, expressed numerically by ones and zeros. A thing or entity of which different instances are observed at different times or places is commonly called a variable. The instances themselves, of which collections are to be made and then analyzed, are the observations. The basic activity to be studied in this first part of the book is the application of mathematical formulae to the observations on one or more variables.
These formulae are, to a large extent, human‐friendly versions of coded computer routines. In practice, econometric calculations are always done on computers, sometimes with spreadsheet programs such as Microsoft Excel but more often using specialized econometric software packages. Simple cases are traditionally given to students to carry out by hand, not because they ever need to be done this way but hopefully to cultivate insight into what it is that computers do. Making the connection between formulae on the page and the results of running estimation programs on a laptop is a fundamental step on the path to econometric expertise.
The most basic manipulation is to add up a column of numbers, where the word “column” is chosen deliberately to evoke the layout of a spreadsheet but could equally refer to the page of an accounting ledger in the ink‐and‐paper technology of a now‐vanished age. Nearly all of the important concepts can be explained in the context of a pair of variables. To give them names, call them $x$ and $y$. Going from two variables up to three and more introduces no fundamental new ideas. In linear regression analysis, variables are always treated in pairs, no matter how many are involved in the calculation as a whole.
Thus, let $(x, y)$ denote the pair of variables chosen for analysis. The enclosure of the symbols in parentheses, separated by a comma, is a simple way of indicating that these items are to be taken together, but note that $(x, y)$ is not to be regarded as just another way of writing $(y, x)$. The order in which the variables appear is often significant.
Let $T$, a positive whole number, denote the number of observations or in other words the number of rows in the spreadsheet. Such a collection of observations, whose order may or may not be significant, is often called a series. The convention for denoting which row the observation belongs to is to append a subscript. Sometimes the letters $i$, $j$, or $k$ are used as row labels but there are typically other uses for these, and in this book we generally adopt the symbol $t$ for this purpose. Thus, the contents of a pair of spreadsheet columns may be denoted symbolically as
$$x_1, x_2, \ldots, x_T \quad \text{and} \quad y_1, y_2, \ldots, y_T.$$
We variously refer to the $x_t$ and $y_t$ as the elements or the coordinates of their respective series.
This brings us inevitably to the question of the context in which observations are made. Very frequently, macroeconomic or financial variables (prices, interest rates, demand flows, asset stocks) are recorded at successive dates, at intervals of days, months, quarters, or years, and then $t$ is simply a date, standardized with respect to the time interval and the first observation. Such data sets are called time series. Economic data may also be observations of individual economic units. These can be workers or consumers, households, firms, industries, and sometimes regions, states, and countries. The observations can represent quantities such as incomes, rates of expenditure on consumption or investment, and also individual characteristics, such as family size, numbers of employees, population, and so forth. If these observations relate to a common date, the data set is called a cross‐section. The ordering of the rows typically has no special significance in this case.
Data sets with both a time and a cross‐sectional dimension are increasingly studied in economics. Known as panel data, they represent a succession of observations on the same cross‐section of entities. In this case two subscripts are called for, say $i$ and $t$. However, the analysis of panel data is an advanced topic not covered in this book, and for observations we can stick to single subscripts henceforth.
1.2 Summary Statistics
As remarked at the beginning, the basic statistical operation of averaging is a way of measuring the central tendency of a set of data. Take a column of numbers, add them up, and divide by $T$. This operation defines the sample mean of the series, usually written as the symbol for the designated variable with a bar over the top. Thus,
$$\bar{x} = \frac{x_1 + x_2 + \cdots + x_T}{T} = \frac{1}{T}\sum_{t=1}^{T} x_t \tag{1.1}$$
where the second equality defines the ‘sigma’ representation of the sum. The Greek letter $\Sigma$, decorated with upper and lower limits, is a neat way to express the adding‐up operation, noting the vital role of the subscript in showing which items are to be added together. The formula for $\bar{y}$ is constructed in just the same way.
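The adding‐up operation in the mean formula maps directly onto a loop. The following is a minimal Python sketch, not the book's own code: the function name `sample_mean` and the five data values are made up for illustration.

```python
def sample_mean(series):
    """Sample mean: add up the coordinates and divide by their number T."""
    T = len(series)
    if T == 0:
        raise ValueError("the mean needs at least one observation")
    total = 0.0
    for x_t in series:  # the loop plays the role of the sigma sum over t = 1, ..., T
        total += x_t
    return total / T


# Hypothetical illustrative series with T = 5 observations.
x = [2.0, 4.0, 4.0, 5.0, 10.0]
print(sample_mean(x))  # 25 / 5 = 5.0
```

The loop variable stands in for the subscript: it determines which items enter the sum, just as the limits on the sigma sign do.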
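The idea of the series mean extends from raw observations to various constructed series. The mean deviations are the series
$$x_t - \bar{x}, \quad t = 1, \ldots, T.$$
Naturally enough this ‘centred’ series has zero mean, identically:
$$\frac{1}{T}\sum_{t=1}^{T} (x_t - \bar{x}) = \frac{1}{T}\sum_{t=1}^{T} x_t - \bar{x} = \bar{x} - \bar{x} = 0.$$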
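The zero‐mean property of the centred series is easy to confirm numerically. This is a small Python sketch under the same assumptions as before: the helper name `mean_deviations` and the data are hypothetical. The comparison uses a tolerance because floating‐point sums are not guaranteed to cancel exactly in general.

```python
import math


def mean_deviations(series):
    """Return the centred series: each coordinate minus the sample mean."""
    T = len(series)
    x_bar = sum(series) / T
    return [x_t - x_bar for x_t in series]


# Hypothetical illustrative series; its mean is 5.0.
x = [2.0, 4.0, 4.0, 5.0, 10.0]
d = mean_deviations(x)
print(d)  # [-3.0, -1.0, -1.0, 0.0, 5.0]

# The mean of the deviations is zero, up to rounding error.
print(math.isclose(sum(d) / len(d), 0.0, abs_tol=1e-12))  # True
```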
Not such an interesting fact, perhaps, but the statistic obtained as the mean of the squared mean deviations is very interesting indeed. This is the sample variance,
$$s_x^2 = \frac{1}{T-1}\sum_{t=1}^{T} (x_t - \bar{x})^2 \tag{1.3}$$
which contains information about how the series varies about its central tendency. The same information, but with units of measurement matching the original data, is conveyed by the square root $s_x$, called the standard deviation of the series. If $\bar{x}$ is a measure of location, then $s_x$ is a measure of dispersion.
One of the mysteries of the variance formula is the division by $T-1$, not $T$ as for the mean itself. There are important technical reasons for this, but to convey the intuition involved here, it may be helpful to think about the case where $T = 1$, a single observation. Clearly, the mean formula still makes sense, because it gives $\bar{x} = x_1$. This is the best that can be done to measure location. There is clearly no possibility of computing a measure of dispersion, and the fact that the formula would involve dividing by zero gives warning that it is not meaningful to try. In other words, to measure the dispersion as $0$, which is what (1.3) would produce with division by $T$ instead of $T-1$, would be misleading. Rather, it is correct to say that no measure of dispersion exists.
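The variance and standard deviation formulas, together with the single‐observation case, can be sketched in Python as follows. The function names and data are illustrative assumptions. Raising an error when there are fewer than two observations is one possible way to express "no measure of dispersion exists"; it is a design choice of this sketch, not something the text prescribes.

```python
import math


def sample_variance(series):
    """Sum of squared mean deviations divided by T - 1, as in (1.3)."""
    T = len(series)
    if T < 2:
        # With a single observation the divisor T - 1 is zero:
        # report that no dispersion measure exists rather than return 0.
        raise ValueError("dispersion is undefined for fewer than two observations")
    x_bar = sum(series) / T
    return sum((x_t - x_bar) ** 2 for x_t in series) / (T - 1)


def sample_std(series):
    """Standard deviation: square root of the variance, in the units of the data."""
    return math.sqrt(sample_variance(series))


# Hypothetical illustrative series: squared deviations are 9, 1, 1, 0, 25.
x = [2.0, 4.0, 4.0, 5.0, 10.0]
print(sample_variance(x))  # 36 / 4 = 9.0
print(sample_std(x))       # 3.0
```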
Another property of the variance formula worth remarking is found by multiplying out the squared terms and summing them separately, thus:
$$s_x^2 = \frac{1}{T-1}\left(\sum_{t=1}^{T} x_t^2 - 2\bar{x}\sum_{t=1}^{T} x_t + T\bar{x}^2\right) = \frac{1}{T-1}\left(\sum_{t=1}^{T} x_t^2 - T\bar{x}^2\right). \tag{1.4}$$
In the first equality, note that “adding up” $T$ instances of $\bar{x}^2$ (which does not depend on $t$) is the same thing as just multiplying by $T$. The second equality then follows by cancellation, given the definition (1.1). This result shows that to compute the variance, there is no need to subtract the mean from each coordinate. Simply add up the squares of the coordinates, subtract $T$ times the squared mean, and divide by $T-1$. Clearly, this second formula is more convenient for hand calculations than the first one.
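The agreement between the two forms of the variance formula can be checked with a short Python sketch. The function name `variance_shortcut` and the data are assumptions for illustration; `statistics.variance` from the standard library, which uses the same $T-1$ divisor, serves as the reference value.

```python
import math
import statistics


def variance_shortcut(series):
    """Variance from the sum of squares minus T times the squared mean."""
    T = len(series)
    x_bar = sum(series) / T
    sum_of_squares = sum(x_t ** 2 for x_t in series)
    return (sum_of_squares - T * x_bar ** 2) / (T - 1)


# Hypothetical illustrative series: sum of squares 161, T * mean^2 = 125.
x = [2.0, 4.0, 4.0, 5.0, 10.0]
print(variance_shortcut(x))  # (161 - 125) / 4 = 9.0
print(math.isclose(variance_shortcut(x), statistics.variance(x)))  # True
```

The shortcut suits hand calculation, but in floating‐point arithmetic it subtracts two potentially large numbers, so the deviation‐based form is usually the safer choice when the mean is large relative to the spread.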
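The information contained in the standard deviation is nicely captured by a famous result in statistics called Chebyshev's rule, after the noted Russian mathematician who discovered it. Consider, for some chosen positive number $\varepsilon$, whether a series coordinate $x_t$ falls ‘far from’ the central tendency of the data set in the sense that either $x_t > \bar{x} + \varepsilon$ or $x_t < \bar{x} - \varepsilon$. In other words, does $x_t$ lie beyond a distance $\varepsilon$ from the mean, either above or below? This condition can be expressed as
$$(x_t - \bar{x})^2 > \varepsilon^2. \tag{1.5}$$
Letting $n_\varepsilon$ denote the number of cases that satisfy inequality (1.5), the inequality
$$n_\varepsilon \varepsilon^2 \le \sum_{(x_t - \bar{x})^2 > \varepsilon^2} (x_t - \bar{x})^2 \tag{1.6}$$
is true by definition, where the ‘sigma’ notation variant expresses compactly the sum of the terms satisfying the stated condition. However, it is also the case that
$$\sum_{(x_t - \bar{x})^2 > \varepsilon^2} (x_t - \bar{x})^2 \le (T-1)s_x^2 \tag{1.7}$$
since, remembering the definition of $s_x^2$ from (1.3), the sum cannot exceed $(T-1)s_x^2$, even with $\varepsilon = 0$. Putting together the inequalities in (1.6) and (1.7) and also dividing through by $T$ and by $\varepsilon^2$ yields the result
$$\frac{n_\varepsilon}{T} \le \frac{T-1}{T}\,\frac{s_x^2}{\varepsilon^2}. \tag{1.8}$$
In words, the proportion of series coordinates falling beyond a distance $\varepsilon$ from the mean is at...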
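Chebyshev's rule can be checked numerically on any series. The Python sketch below counts the coordinates whose squared deviation exceeds the squared distance and compares their proportion with the bound obtained by combining (1.6) and (1.7), namely $(T-1)/T$ times the variance divided by the squared distance. The function name `chebyshev_check`, the argument name `eps`, and the data are all assumptions made for illustration.

```python
def chebyshev_check(series, eps):
    """Return (proportion beyond eps from the mean, Chebyshev-type bound)."""
    T = len(series)
    x_bar = sum(series) / T
    s2 = sum((x_t - x_bar) ** 2 for x_t in series) / (T - 1)
    # Count the cases satisfying the squared-deviation condition (1.5).
    n_far = sum(1 for x_t in series if (x_t - x_bar) ** 2 > eps ** 2)
    proportion = n_far / T
    bound = (T - 1) / T * s2 / eps ** 2
    return proportion, bound


# Hypothetical illustrative series with mean 5.0 and variance 9.0.
x = [2.0, 4.0, 4.0, 5.0, 10.0]
for eps in (2.0, 4.0, 6.0):
    proportion, bound = chebyshev_check(x, eps)
    print(eps, proportion, bound, proportion <= bound)
```

For small distances the bound can exceed one and is then uninformative; it only bites once the distance is large relative to the standard deviation.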
| Publication date (per publisher) | 18 July 2018 |
|---|---|
| Language | English |
| Subject area | Mathematics / Computer Science ► Mathematics ► Applied Mathematics |
| | Natural Sciences ► Chemistry |
| | Business and Economics ► Economics ► Econometrics |
| Keywords | A Note on Computation • Applied mathematics • A Random Experiment • Computing the Regression Line • Conditional distributions • Correlation • Cramer's Rule • Definite Matrices • Determinant and Adjoint • Discrete Random Variables • Econometrics • Economics • Elementary Data Analysis • Expected values • Financial and economic statistics • Linear Dependence and Rank • Mathematics • Matrix Algebra Basics • Matrix Calculus • Matrix Inversion • matrix representation • multiple regression • Other Continuous Distributions • Partitioned • Partitioning and Inversion • probability distributions • Properties of the Normal Distribution • random vectors • Regression • Rules of Matrix Algebra • Solving the Matrix Equation • Statistics • Statistics for Finance, Business & Economics • summary statistics • Systems of Equations • The Classical Assumptions • The Classical Regression Model • The General Linear Regression • The Least Squares Solution • The Multivariate Normal Distribution • Transposes and Products • Variables and Observations |
| ISBN-10 | 1-119-48492-8 / 1119484928 |
| ISBN-13 | 978-1-119-48492-9 / 9781119484929 |
Copy protection: Adobe DRM
File format: EPUB (Electronic Publication)