Analysis of Biomarker Data (eBook)
John Wiley & Sons (publisher)
978-1-118-55245-2 (ISBN)
A 'how to' guide for applying statistical methods to biomarker data analysis
Presenting a solid foundation for the statistical methods that are used to analyze biomarker data, Analysis of Biomarker Data: A Practical Guide features preferred techniques for biomarker validation. The authors provide descriptions of select elementary statistical methods that are traditionally used to analyze biomarker data with a focus on the proper application of each method, including necessary assumptions, software recommendations, and proper interpretation of computer output. In addition, the book discusses frequently encountered challenges in analyzing biomarker data and how to deal with them, methods for the quality assessment of biomarkers, and biomarker study designs.
Covering a broad range of statistical methods that have been used to analyze biomarker data in published research studies, Analysis of Biomarker Data: A Practical Guide also features:
- A greater emphasis on the application of methods as opposed to the underlying statistical and mathematical theory
- The use of SAS®, R, and other software throughout to illustrate the presented calculations for each example
- Numerous exercises based on real-world data as well as solutions to the problems to aid in reader comprehension
- The principles of good research study design and the methods for assessing the quality of a newly proposed biomarker
- A companion website that includes a software appendix with multiple types of software and complete data sets from the book's examples
STEPHEN W. LOONEY, PHD, is Professor in the Department of Biostatistics and Epidemiology at Georgia Regents University, USA. He is a Fellow of the American Statistical Association and the Royal Statistical Society, an elected member of the International Statistical Institute, and a member of the International Biometric Society.
JOSEPH L. HAGAN, SCD, is Research Statistician at Texas Children's Hospital and Assistant Professor at the Baylor College of Medicine, USA. He is a member of the American Statistical Association.
Preface
Acknowledgements
1. Introduction
1.1 What Is a Biomarker?
1.2 Biomarkers vs. Surrogate Markers
1.3 Organization of This Book
2. Designing Biomarker Studies
2.1 Introduction
2.2 Designing the Study
2.3 Designing the Analysis
2.4 Presenting Statistical Results
Problems
3. Elementary Statistical Methods for Analyzing Biomarker Data
3.1 Introduction
3.2 Graphical and Tabular Summaries
3.3 Descriptive Statistics
3.4 Describing the Shape of Distributions
3.5 Sampling Distributions
3.6 Introduction to Statistical Inference
3.7 Comparing Means across Groups
3.8 Correlation Analysis
3.9 Regression Analysis
3.10 Analyzing Cross-Classified Data
Problems
4. Frequently Encountered Challenges in Analyzing Biomarker Data and How to Deal With Them
4.1 Introduction
4.2 Non-Normally Distributed Data
4.3 Heterogeneity of Variance
4.4 Dependent Groups
4.5 Correlated Outcomes
4.6 Clustered Data
4.7 Outliers
4.8 Limits of Detection and Non-Detected Observations
4.9 The Analysis of Cross-Classified Categorical Data
Problems
5. Validation of Biomarkers
5.1 Overview of Methods for Assessing Validity and Reliability of Biomarkers
5.2 General Discussion of Measures of Agreement
5.3 Assessing Reliability of a Biomarker
5.4 Assessing Validity
Problems
References
Subject Index
Solutions to Problems
1
INTRODUCTION
1.1 WHAT IS A BIOMARKER?
According to the Dictionary of Epidemiology, a biomarker (or biological marker) is “a cellular, biochemical, or molecular indicator of exposure; of biological, subclinical, or clinical effects; or of possible susceptibility” (Porta 2008, p. 21). As Porta points out, the term “biomarker” is often ambiguous; this is perhaps an indication that there is insufficient understanding of the pathophysiological or mechanistic role of the “marker.”
The ambiguity may also be due to the fact that biomarkers are involved in one way or another with so many different disciplines (clinical trialists, statisticians, regulators, etc.) and clinical research applications. In fact, there is so much potential ambiguity associated with the term “biomarker” that several efforts have been made to provide a formal definition of exactly what a biomarker is.
For example, in a 1987 US National Research Council report, biomarkers were defined to be “indicators signaling events in biological systems or samples.” In 2001, a Biomarkers Definitions Working Group (BDWG), convened by the US National Institutes of Health, proposed the following definition of biological marker (biomarker): “A characteristic that is objectively measured and evaluated as an indicator of normal biological processes, pathogenic processes, or pharmacologic responses to a therapeutic intervention.”
As interest in the development, validation, and application of new biomarkers has increased, numerous classification systems for biomarkers have been proposed. These include Type 0–Type 6 biomarkers; Type I and II biomarkers (Mildvan et al. 1997); prognostic and predictive biomarkers; genomic, proteomic, and combinatorial biomarkers; screening and stratification biomarkers, and so on. (See Table 1 of DeCaprio (2006) for details.) Most of these classification systems reflect the intended use of the biomarker data in a particular discipline; however, all biomarkers are related in the sense that each of them is designed to be an “indicator” of something, as noted in the Dictionary of Epidemiology definition cited above. Our primary focus in this book is on markers of exposure, although the statistical techniques we describe can be applied to almost any type of biomarker. By using “real” data taken from published biomarker studies to exemplify the proper application of these techniques, we have tried to illustrate the broad applicability of statistical methods in the analysis of biomarker data, regardless of the particular type of biomarker that is being considered.
1.2 BIOMARKERS VERSUS SURROGATE ENDPOINTS
In their report describing preferred definitions for biomarkers and surrogate endpoints, the BDWG defined a clinical endpoint as: “A characteristic or variable that reflects how a patient feels, functions, or survives.” They then defined a surrogate endpoint as: “A biomarker that is intended to substitute for a clinical endpoint.” A surrogate endpoint is thus expected to “predict clinical benefit (or harm or lack of benefit or harm) based on epidemiologic, therapeutic, pathophysiologic, or other scientific evidence.” As they pointed out, all surrogate endpoints are biomarkers, but not all biomarkers are surrogate endpoints. In fact, “it is likely that only a few biomarkers will achieve surrogate endpoint status.” (Note that they discouraged the use of the term surrogate marker, and advocated the exclusive use of surrogate endpoint instead (BDWG 2001, p. 91).)
Because of the requirement that one must be able to substitute a surrogate endpoint in place of the corresponding clinical endpoint, the process of validating a surrogate endpoint goes far beyond what is usually required when validating a biomarker (see Chapter 5). In fact, the BDWG claimed that the term validation is unsuitable for describing the process of linking biomarkers to clinical endpoints; they proposed that the process of determining surrogate endpoint status be referred to as evaluation. They reserved use of validation to describe the process of addressing what they referred to as the “performance characteristics” (e.g., sensitivity, specificity, and reproducibility) of a measurement process or assay technique. This is consistent with our use of the term biomarker validation in Chapter 5.
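As a minimal sketch of two of the performance characteristics mentioned above, sensitivity and specificity can be computed from the counts in a 2×2 table that cross-classifies the assay result against true status. The function name and the counts below are hypothetical and are used for illustration only; they are not taken from the book.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from 2x2 table counts.

    tp: truly positive subjects that the assay classifies as positive
    fn: truly positive subjects that the assay classifies as negative
    tn: truly negative subjects that the assay classifies as negative
    fp: truly negative subjects that the assay classifies as positive
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one truly positive and one truly negative subject")
    sensitivity = tp / (tp + fn)  # proportion of true positives detected
    specificity = tn / (tn + fp)  # proportion of true negatives correctly ruled out
    return sensitivity, specificity


# Hypothetical counts, for illustration only (not data from the book).
sens, spec = sensitivity_specificity(tp=45, fn=5, tn=80, fp=20)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

Reproducibility, the third characteristic listed, is not a single proportion of this kind and is not covered by this sketch.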
Because of the complexity involved in evaluating a surrogate endpoint, various approaches have been proposed, almost all of which involve examining the effect of a treatment for the clinical endpoint (typically referred to as the “disease”) on the surrogate for the endpoint. In a landmark paper, Prentice (1989) formulated a definition of surrogate endpoints and defined a set of operational criteria for their evaluation. Subsequently, Freedman et al. (1992) proposed that one should focus attention on the proportion of the treatment effect explained by the surrogate for the disease endpoint, whereas Buyse and Molenberghs (1998) proposed that the primary focus should be on the relative effect of the treatment on the surrogate. Various authors also advocated the use of meta-analytic data in the evaluation of a surrogate endpoint (Freedman et al. 1992; Lin et al. 1997; Daniels and Hughes 1997). The application of meta-analytic techniques to surrogate endpoint evaluation was further developed by Buyse et al. (2000); Gail et al. (2000); Molenberghs et al. (2002); and others. The very comprehensive textbook edited by Burzykowski et al. (2005) thoroughly discusses all of these statistical approaches and subsequent developments. The Institute of Medicine report (Micheel and Ball 2010) approaches the evaluation of surrogate endpoints from a more clinical perspective.
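As a rough sketch of the two quantities just mentioned, in notation that is ours rather than the book's: let β denote the estimated effect of treatment on the clinical endpoint, β_S the same effect after adjusting for the surrogate, and α the effect of treatment on the surrogate. Under these assumed symbols, the proportion of the treatment effect explained (PTE) and the relative effect (RE) are commonly written as follows.

```latex
\[
  \mathrm{PTE} = 1 - \frac{\beta_S}{\beta},
  \qquad
  \mathrm{RE} = \frac{\beta}{\alpha}.
\]
```

A PTE near 1 suggests that adjusting for the surrogate removes most of the treatment effect on the clinical endpoint, while RE expresses the effect on the clinical endpoint per unit of effect on the surrogate.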
Although surrogate endpoints are certainly a very important special case of biomarkers, we feel that the specialized techniques developed for evaluating them, especially as these techniques relate to treatment of the clinical endpoint, are beyond the scope of this text. Hence, we do not discuss surrogate endpoints as a separate topic elsewhere in this book. However, the methods that we describe for analyzing biomarker data and validating a biomarker (as defined by BDWG) certainly apply to surrogate endpoints as well.
1.3 ORGANIZATION OF THIS BOOK
In Chapter 1, we define what we mean by a biomarker and then describe our understanding of the differences and similarities between biomarkers and surrogate endpoints.
In Chapter 2, we cover basic principles of effective design of a study that will make use of biomarker data, including selecting the most appropriate type of study design (cross-sectional, case–control, etc.), choosing the appropriate measure of association once the type of design has been selected, designing the statistical analysis that will be applied to the study data once they have been obtained, and choosing the appropriate sample size for the study that is being planned. We also describe several features of what we consider to be the effective presentation of statistical results once the study data have been analyzed.
In Chapter 3, we provide a survey of elementary statistical methods that are widely used when analyzing biomarker data. To be specific, the methods that we cover include: graphical and tabular summaries; descriptive statistics; basic concepts of statistical inference, including point estimation, confidence interval estimation, and hypothesis testing; comparisons of means between two groups and among more than two groups; statistical inference for correlation coefficients; simple and multiple linear regression; and analysis of cross-classified data, including the chi-square test of independence and methods for comparing proportions across two or more groups. Our intention in this chapter is not to provide comprehensive coverage of all elementary statistical methods, but rather to describe selected methods in sufficient detail so that someone who is relatively inexperienced in the application of statistics will be able to carry out these analyses appropriately and with a minimum of effort.
In Chapter 4, we describe various “challenges” that one is likely to encounter in the analysis of biomarker data and offer our recommendations on preferred methods for dealing with them. These challenges include: (1) violations of underlying assumptions (normality, homogeneity of variance), (2) lack of independence between the groups being compared, (3) proper analysis of correlated data, (4) clustered data, (5) contaminated data, (6) non-detectable observations, (7) choosing the appropriate measure of association between predictor and outcome, and (8) choosing the appropriate method of analysis for cross-classified data (i.e., contingency tables). Each of these challenges is illustrated using data from a “real” biomarker study, most of which were taken from the scientific literature.
In Chapter 5, we provide a detailed discussion of the methods we recommend for evaluating the quality of a newly proposed (or existing) biomarker (also called biomarker validation). Our focus is on establishing that the biomarker has adequate reliability and validity.
Throughout Chapters 3–5, we provide what we hope is sufficient mathematical detail for those who are interested, but...
| Publication date (per publisher) | 28 January 2015 |
|---|---|
| Language | English |
| Subject area | Mathematics / Computer Science ► Mathematics ► Computer Programs / Computer Algebra |
| | Mathematics / Computer Science ► Mathematics ► Statistics |
| | Mathematics / Computer Science ► Mathematics ► Probability / Combinatorics |
| | Natural Sciences ► Biology |
| | Social Sciences ► Sociology ► Empirical Social Research |
| | Engineering |
| Keywords | Applied Statistics • biological science • biomarker study design • Biomarker validation • biometrics • Biostatistics • clinical laboratory science • Environmental Research • Environmental Science • Environmental Statistics & Environmetrics • Environmental Studies • epidemiology • Health Science • biomarker data analysis • pharmacology • Quality Assessment • Statistical Methods • Statistics • study designs • Toxicology |
| ISBN-10 | 1-118-55245-8 / 1118552458 |
| ISBN-13 | 978-1-118-55245-2 / 9781118552452 |
Copy protection: Adobe DRM
Adobe DRM is a copy-protection scheme intended to protect the eBook from misuse. The eBook is authorized to your personal Adobe ID at the time of download. You can then read the eBook only on devices that are also registered to your Adobe ID.
File format: EPUB (Electronic Publication)
EPUB is an open standard for eBooks and is particularly well suited to fiction and non-fiction. The body text adapts dynamically to the display and font size, so EPUB also works well on mobile reading devices.
System requirements:
PC/Mac: You can read this eBook on a PC or Mac. You need a
eReader: This eBook can be read on (almost) all eBook readers. However, it is not compatible with the Amazon Kindle.
Smartphone/tablet: Whether Apple or Android, you can read this eBook. You need a
Buying eBooks from abroad
For tax reasons, we can sell eBooks only within Germany and Switzerland. Unfortunately, we cannot fulfill eBook orders from other countries.