Categorical Data Analysis by Example (eBook)

eBook Download: PDF
2016 | 1st edition
216 pages
Wiley (publisher)
978-1-119-30791-4 (ISBN)

Categorical Data Analysis by Example - Graham J. G. Upton
EUR 90.99 incl. VAT
(CHF 88.90)
eBook sales are handled by Lehmanns Media GmbH (Berlin) at the price in euros incl. VAT.
  • Download available immediately

Introduces the key concepts in the analysis of categorical data with illustrative examples and accompanying R code

This book is aimed at all those who wish to discover how to analyze categorical data without getting immersed in complicated mathematics and without needing to wade through a large amount of prose. It is aimed at researchers with their own data ready to be analyzed and at students who would like an approachable alternative view of the subject.

Each new topic in categorical data analysis is illustrated with an example that readers can apply to their own sets of data. In many cases, R code is given and excerpts from the resulting output are presented. In the context of log-linear models for cross-tabulations, two specialties of the house have been included: the use of cobweb diagrams to get visual information concerning significant interactions, and a procedure for detecting outlier category combinations. The R code used for these is available and may be freely adapted. In addition, this book:

  • Uses an example to illustrate each new topic in categorical data analysis
  • Provides a clear explanation of an important subject
  • Is understandable to most readers with minimal statistical and mathematical backgrounds
  • Contains examples that are accompanied by R code and resulting output
  • Includes starred sections that provide more background details for interested readers

Categorical Data Analysis by Example is a reference for students in statistics and researchers in other disciplines, especially the social sciences, who use categorical data. This book is also a reference for practitioners in market research, medicine, and other fields.



GRAHAM J. G. UPTON is formerly Professor of Applied Statistics, Department of Mathematical Sciences, University of Essex. Dr. Upton is author of The Analysis of Cross-tabulated Data (1978) and joint author of Spatial Data Analysis by Example (2 volumes, 1995), both published by Wiley. He is the lead author of The Oxford Dictionary of Statistics (OUP, 2014). His books have been translated into Japanese, Russian, and Welsh.

CATEGORICAL DATA ANALYSIS BY EXAMPLE
Contents
Preface
Acknowledgments
1 Introduction
1.1 What are Categorical Data?
1.2 A Typical Data Set
1.3 Visualization and Cross-Tabulation
1.4 Samples, Populations, and Random Variation
1.5 Proportion, Probability, and Conditional Probability
1.6 Probability Distributions
1.6.1 The Binomial Distribution
1.6.2 The Multinomial Distribution
1.6.3 The Poisson Distribution
1.6.4 The Normal Distribution
1.6.5 The Chi-Squared (χ²) Distribution
1.7 *The Likelihood
2 Estimation and Inference for Categorical Data
2.1 Goodness of Fit
2.1.1 Pearson’s X² Goodness-of-Fit Statistic
2.1.2 *The Link between X² and the Poisson and χ²-Distributions
2.1.3 The Likelihood-Ratio Goodness-of-Fit Statistic, G²
2.1.4 *Why the G² and X² Statistics Usually have Similar Values
2.2 Hypothesis Tests for a Binomial Proportion (Large Sample)
2.2.1 The Normal Score Test
2.2.2 *Link to Pearson’s X² Goodness-of-Fit Test
2.2.3 G² for a Binomial Proportion
2.3 Hypothesis Tests for a Binomial Proportion (Small Sample)
2.3.1 One-Tailed Hypothesis Test
2.3.2 Two-Tailed Hypothesis Tests
2.4 Interval Estimates for a Binomial Proportion
2.4.1 Laplace’s Method
2.4.2 Wilson’s Method
2.4.3 The Agresti–Coull Method
2.4.4 Small Samples and Exact Calculations
References
3 The 2 × 2 Contingency Table
3.1 Introduction
3.2 Fisher’s Exact Test (for Independence)
3.2.1 *Derivation of the Exact Test Formula
3.3 Testing Independence with Large Cell Frequencies
3.3.1 Using Pearson’s Goodness-of-Fit Test
3.3.2 The Yates Correction
3.4 The 2 × 2 Table in a Medical Context
3.5 Measuring Lack of Independence (Comparing Proportions)
3.5.1 Difference of Proportions
3.5.2 Relative Risk
3.5.3 Odds-Ratio
References
4 The I × J Contingency Table
4.1 Notation
4.2 Independence in the I × J Contingency Table
4.2.1 Estimation and Degrees of Freedom
4.2.2 Odds-Ratios and Independence
4.2.3 Goodness of Fit and Lack of Fit of the Independence Model
4.3 Partitioning
4.3.1 *Additivity of G²
4.3.2 Rules for Partitioning
4.4 Graphical Displays
4.4.1 Mosaic Plots
4.4.2 Cobweb Diagrams
4.5 Testing Independence with Ordinal Variables
References
5 The Exponential Family
5.1 Introduction
5.2 The Exponential Family
5.2.1 The Exponential Dispersion Family
5.3 Components of a General Linear Model
5.4 Estimation
References
6 A Model Taxonomy
6.1 Underlying Questions
6.1.1 Which Variables are of Interest?
6.1.2 What Categories should be Used?
6.1.3 What is the Type of Each Variable?
6.1.4 What is the Nature of Each Variable?
6.2 Identifying the Type of Model
7 The 2 × J Contingency Table
7.1 A Problem with X² (and G²)
7.2 Using the Logit
7.2.1 Estimation of the Logit
7.2.2 The Null Model
7.3 Individual Data and Grouped Data
7.4 Precision, Confidence Intervals, and Prediction Intervals
7.4.1 Prediction Intervals
7.5 Logistic Regression with a Categorical Explanatory Variable
7.5.1 Parameter Estimates with Categorical Variables (J > 2)
7.5.2 The Dummy Variable Representation of a Categorical Variable
References
8 Logistic Regression with Several Explanatory Variables
8.1 Degrees of Freedom when there are no Interactions
8.2 Getting a Feel for the Data
8.3 Models with two-Variable Interactions
8.3.1 Link to the Testing of Independence between Two Variables
9 Model Selection and Diagnostics
9.1 Introduction
9.1.1 Ockham’s Razor
9.2 Notation for Interactions and for Models
9.3 Stepwise Methods for Model Selection Using G²
9.3.1 Forward Selection
9.3.2 Backward Elimination
9.3.3 Complete Stepwise
9.4 AIC and Related Measures
9.5 The Problem Caused by Rare Combinations of Events
9.5.1 Tackling the Problem
9.6 Simplicity Versus Accuracy
9.7 DFBETAS
References
10 Multinomial Logistic Regression
10.1 A Single Continuous Explanatory Variable
10.2 Nominal Categorical Explanatory Variables
10.3 Models for an Ordinal Response Variable
10.3.1 Cumulative Logits
10.3.2 Proportional Odds Models
10.3.3 Adjacent-Category Logit Models
10.3.4 Continuation-Ratio Logit Models
References
11 Log-Linear Models for I × J Tables
11.1 The Saturated Model
11.1.1 Cornered Constraints
11.1.2 Centered Constraints
11.2 The Independence Model for an I × J Table
12 Log-Linear Models for I × J × K Tables
12.1 Mutual Independence: A/B/C
12.2 The Model AB/C
12.3 Conditional Independence and Independence
12.4 The Model AB/AC
12.5 The Models AB/AC/BC and ABC
12.6 Simpson’s Paradox
12.7 Connection between Log-Linear Models and Logistic Regression
Reference
13 Implications and Uses of Birch’s Result
13.1 Birch’s Result
13.2 Iterative Scaling
13.3 The Hierarchy Constraint
13.4 Inclusion of the All-Factor Interaction
13.5 Mostellerizing
References
14 Model Selection for Log-Linear Models
14.1 Three Variables
14.2 More than Three Variables
Reference
15 Incomplete Tables, Dummy Variables, and Outliers
15.1 Incomplete Tables
15.1.1 Degrees of Freedom
15.2 Quasi-independence
15.3 Dummy Variables
15.4 Detection of Outliers
16 Panel Data and Repeated Measures
16.1 The Mover-Stayer Model
16.2 The Loyalty Model
16.3 Symmetry
16.4 Quasi-Symmetry
16.5 The Loyalty-Distance Model
References
Appendix: R Code for Cobweb Function
Index
Author Index
Index of Examples
EULA

"Concise introduction to dealing with categorical data (with supporting R code) which will help the general data scientist." (Raspberry Pi, March 2017)

Publication date (per publisher): 20 October 2016
Language: English
Subject areas: Mathematics / Computer Science > Mathematics > Statistics
Mathematics / Computer Science > Mathematics > Probability / Combinatorics
Technology
Keywords: Analysis • Book • categorical • categorical data analysis • Code • Concepts • Context • Data • data analysis • Example • Examples • excerpts • Given • Illustrated • illustrative • Key • Loglinear • many • Models • New • Output • presented • Readers • Ready • Researchers • Statistical Software / R • Statistics • Statistics for Social Sciences • students • topic
ISBN-10 1-119-30791-0 / 1119307910
ISBN-13 978-1-119-30791-4 / 9781119307914
PDF (Adobe DRM)

Copy protection: Adobe DRM
Adobe DRM is a copy-protection scheme intended to protect the eBook against misuse. The eBook is authorized to your personal Adobe ID at download time. You can then read it only on devices that are also registered to your Adobe ID.

File format: PDF (Portable Document Format)
With its fixed page layout, PDF is particularly well suited to specialist books with columns, tables, and figures. A PDF can be displayed on almost any device but is only of limited use on small displays (smartphone, eReader).

System requirements:
PC/Mac: You can read this eBook on a PC or Mac. You need an Adobe ID and the free Adobe Digital Editions software. We advise against using the OverDrive Media Console; in our experience it frequently causes problems with Adobe DRM.
eReader: This eBook can be read on (almost) all eBook readers. It is not, however, compatible with the Amazon Kindle.
Smartphone/Tablet: You can read this eBook on both Apple and Android devices. You need an Adobe ID and a free app.

Buying eBooks from abroad
For tax law reasons we can sell eBooks only within Germany and Switzerland. Regrettably, we cannot fulfill eBook orders from other countries.
