Healthcare Analytics (eBook)
John Wiley & Sons (Publisher)
978-1-119-37464-0 (ISBN)
With a focus on cutting-edge approaches to the quickly growing field of healthcare, Healthcare Analytics: From Data to Knowledge to Healthcare Improvement provides an integrated and comprehensive treatment of recent research advances in data-driven healthcare analytics, with the aim of providing more personalized and smarter healthcare services. Emphasizing data and healthcare analytics from an operational management and statistical perspective, the book details how analytical methods and tools can be used to enhance healthcare quality and operational efficiency.
Organized into two main sections, Part I features biomedical and health informatics and specifically addresses the analytics of genomic and proteomic data; physiological signals from patient-monitoring systems; data uncertainty in clinical laboratory tests; predictive modeling; disease modeling for sepsis; and the design of cyber infrastructures for early prediction of epidemic events. Part II focuses on healthcare delivery systems, including system advances for transforming clinic workflow and patient care; macro analysis of patient flow distribution; intensive care units; primary care; demand and resource allocation; mathematical models for predicting patient readmission and postoperative outcome; physician-patient interactions; insurance claims; and the role of social media in healthcare. Healthcare Analytics: From Data to Knowledge to Healthcare Improvement also features:
* Contributions from well-known international experts who shed light on new approaches in this growing area
* Discussions on contemporary methods and techniques to address the handling of rich and large-scale healthcare data as well as the overall optimization of healthcare system operations
* Numerous real-world examples and case studies that emphasize the vast potential of statistical and operational research tools and techniques to address the big data environment within the healthcare industry
* Plentiful applications that showcase analytical methods and tools tailored for successful healthcare systems modeling and improvement
The book is an ideal reference for academics and practitioners in operations research, management science, applied mathematics, statistics, business, industrial and systems engineering, healthcare systems, and economics. Healthcare Analytics: From Data to Knowledge to Healthcare Improvement is also appropriate for graduate-level courses typically offered within operations research, industrial engineering, business, and public health departments.
HUI YANG, PhD, is Associate Professor in the Harold and Inge Marcus Department of Industrial and Manufacturing Engineering at The Pennsylvania State University. His research interests include sensor-based modeling and analysis of complex systems for process monitoring/control; system diagnostics/prognostics; quality improvement; and performance optimization with special focus on nonlinear stochastic dynamics and the resulting chaotic, recurrence, self-organizing behaviors.
EVA K. LEE, PhD, is Professor in the H. Milton Stewart School of Industrial and Systems Engineering at the Georgia Institute of Technology, Director of the Center for Operations Research in Medicine and HealthCare, and Distinguished Scholar in Health System, Health Systems Institute at both Emory University School of Medicine and Georgia Institute of Technology. Her research interests include health-risk prediction; early disease prediction and diagnosis; optimal treatment strategies and drug delivery; healthcare outcome analysis and treatment prediction; public health and medical preparedness; large-scale healthcare/medical decision analysis and quality improvement; clinical translational science; and business intelligence and organization transformation.
LIST OF CONTRIBUTORS
PREFACE
PART I ADVANCES IN BIOMEDICAL AND HEALTH INFORMATICS
1 Recent Development in Methodology for Gene Network Problems and Inferences
Sung W. Han and Hua Zhong
1.1 Introduction
1.2 Background
1.3 Genetic Data Available
1.4 Methodology
1.5 Search Algorithm
1.6 PC Algorithm
1.7 Application/Case Studies
1.8 Discussion
1.9 Other Useful Softwares
Acknowledgments
References
2 Biomedical Analytics and Morphoproteomics: An Integrative Approach for Medical Decision Making for Recurrent or Refractory Cancers
Mary F. McGuire and Robert E. Brown
2.1 Introduction
2.2 Background
2.3 Methodology
2.4 Case Studies
2.5 Discussion
2.6 Conclusions
Acknowledgments
References
3 Characterization and Monitoring of Nonlinear Dynamics and Chaos in Complex Physiological Systems
Hui Yang, Yun Chen, and Fabio Leonelli
3.1 Introduction
3.2 Background
3.3 Sensor-Based Characterization and Modeling of Nonlinear Dynamics
3.4 Healthcare Applications
3.5 Summary
Acknowledgments
References
4 Statistical Modeling of Electrocardiography Signal for Subject Monitoring and Diagnosis
Lili Chen, Changyue Song, and Xi Zhang
4.1 Introduction
4.2 Basic Elements of ECG
4.3 Statistical Modeling of ECG for Disease Diagnosis
4.4 An Example: Detection of Obstructive Sleep Apnea from a Single ECG Lead
4.5 Materials and Methods
4.6 Results
4.7 Conclusions and Discussions
4.8 Conclusion
References
5 Modeling and Simulation of Measurement Uncertainty in Clinical Laboratories
Varun Ramamohan, James T. Abbott, and Yuehwern Yih
5.1 Introduction
5.2 Background and Literature Review
5.3 Model Development Guidelines
5.4 Implementation of Guidelines: Enzyme Assay Uncertainty Model
5.5 Discussion and Conclusions
References
6 Predictive Analytics: Classification in Medicine and Biology
Eva K. Lee
6.1 Introduction
6.2 Background
6.3 Machine Learning with Discrete Support Vector Machine Predictive Models
6.4 Applying DAMIP to Real-World Applications
6.5 Summary and Conclusion
Acknowledgments
References
7 Predictive Modeling in Radiation Oncology
Hao Zhang, Robert Meyer, Leyuan Shi, Wei Lu, and Warren D'Souza
7.1 Introduction
7.2 Tutorials of Predictive Modeling Techniques
7.3 Review of Recent Predictive Modeling Applications in Radiation Oncology
7.4 Modeling Pathologic Response of Esophageal Cancer to Chemoradiotherapy
7.5 Modeling Clinical Complications after Radiation Therapy
7.6 Modeling Tumor Motion with Respiratory Surrogates
7.7 Conclusion
References
8 Mathematical Modeling of Innate Immunity Responses of Sepsis: Modeling and Computational Studies
Chih-Hang J. Wu, Zhenshen Shi, David Ben-Arieh, and Steven Q. Simpson
8.1 Background
8.2 System Dynamic Mathematical Model (SDMM)
8.3 Pathogen Strain Selection
8.4 Mathematical Models of Innate Immunity of AIR
8.5 Discussion
8.6 Conclusion
References
PART II ANALYTICS FOR HEALTHCARE DELIVERY
9 Systems Analytics: Modeling and Optimizing Clinic Workflow and Patient Care
Eva K. Lee, Hany Y. Atallah, Michael D. Wright, Calvin Thomas IV, Eleanor T. Post, Daniel T. Wu, and Leon L. Haley Jr
9.1 Introduction
9.2 Background
9.3 Challenges and Objectives
9.4 Methods and Design of Study
9.5 Computational Results, Implementation, and ED Performance Comparison
9.6 Benefits and Impacts
9.7 Scientific Advances
Acknowledgments
References
10 A Multiobjective Simulation Optimization of the Macrolevel Patient Flow Distribution
Yunzhe Qiu and Jie Song
10.1 Introduction
10.2 Literature Review
10.3 Problem Description and Modeling
10.4 Methodology
10.5 Case Study: Adjusting Patient Flow for a Two-Level Healthcare System Centered on the PUTH
10.6 Conclusions and the Future Work
Acknowledgments
References
11 Analysis of Resource Intensive Activity Volumes in US Hospitals
Shivon Boodhoo and Sanchoy Das
11.1 Introduction
11.2 Structural Classification of Hospitals
11.3 Productivity Analysis of Hospitals
11.4 Resource and Activity Database for US Hospitals
11.5 Activity-Based Modeling of Hospital Operations
11.6 Resource Use Profile of Hospitals from HUC Activity Data
11.7 Summ
Chapter 1
Recent Development in Methodology for Gene Network Problems and Inferences
Sung W. Han and Hua Zhong
Division of Biostatistics, School of Medicine, Department of Population Health, New York University, New York, NY, USA
1.1 Introduction
A cell in the human body is similar to a manufacturing system: it produces proteins that function according to the specific organ or part of the body to which the cell belongs. The nucleus at the center of the cell contains the DNA sequence, which is the design map for the human body. Each time the cell produces a protein, it copies a certain part of the DNA sequence into an mRNA sequence. This is called the transcription process. After leaving the nucleus, the mRNA attaches to a ribosome, and the ribosome interprets the code in the mRNA. This is called the translation process. After interpretation, the ribosome generates a sequence of amino acids, which then folds into a certain type of protein.
This manufacturing system from DNA to proteins sometimes malfunctions due to DNA damage, which is known to be a main cause of cancers, also called malignant neoplasms [1, 2]. DNA damage can occur naturally, but it can also be caused by two groups of agents: (i) exogenous agents such as radiation, smoke [3], ultraviolet light [4], and viruses [5]; and (ii) endogenous agents such as diet [6] and macrophages/neutrophils [5]. Such DNA damage leads to epigenetic alteration of DNA repair genes, which play key roles in preventing cancer cell growth. Reducing the expression of a DNA repair gene (DNA repair deficiency [7]) or switching off the function of the gene, called silencing, finally leads to the development of cancers. For example, MGMT is a DNA repair gene, and most types of colorectal cancers have reduced MGMT expression [8–12]. The following are other examples of proteins corresponding to DNA repair genes [1].
- BRCA1 and BRCA2 (breast cancer genes 1 and 2) for breast and ovarian cancers.
- ATM (ataxia telangiectasia mutated) for leukemia and breast cancers.
- XPC (xeroderma pigmentosum) for skin cancers.
- p53 (Li–Fraumeni syndrome) for sarcoma, leukemia, breast, lung, skin, pancreas, and brain cancers.
In addition, miRNA (microRNA) outside of the nucleus is known to affect DNA repair genes because it can reduce the expression of DNA damage response genes or repair genes [1]. For example, miRNA-155 is overexpressed in colon cancers, and it is known to reduce the expression of MLH1, a DNA repair protein [13].
To find the mechanism of cancer development, it is important to understand the causal relationships in transcriptional regulatory networks, and the related inference is often based on the gene network problem. Applications of the network problem include gene expression analysis or gene–gene expression networks [14–19], protein–protein interaction analysis [20, 21], phenotype networks utilizing gene expression information [22–24], and causal networks linking gene expression and metabolic change [24].
Probabilistic graphical modeling is a popular approach to finding causal relationships between variables in cell signal pathways or gene networks [25]. In this chapter, the graphical models are assumed to be directed acyclic graphs (DAGs), in which all the edges are directed and there are no cycles [26]. The estimation of DAGs is computationally very challenging, so we cannot simply apply the approaches that are used to estimate undirected graphs [27–29]. There are three reasons for this. First, DAGs with the same set of conditional independencies are not identifiable from observational data alone [26]; this is called observational equivalence. Second, the number of possible DAGs increases super-exponentially as the number of nodes increases [27]. Third, in gene network problems, the number of genes is much larger than the sample size, a setting called high-dimensional data.
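To give a sense of how quickly the DAG search space grows, the following minimal sketch counts the labeled DAGs on n nodes. It uses Robinson's recurrence, which is a standard combinatorial result and is not taken from this chapter; the function name `count_dags` is hypothetical.

```python
from math import comb


def count_dags(n: int) -> int:
    """Count labeled DAGs on n nodes with Robinson's recurrence.

    a(0) = 1
    a(m) = sum_{k=1..m} (-1)^(k+1) * C(m, k) * 2^(k*(m-k)) * a(m-k)
    """
    counts = [1]  # a(0): the empty graph
    for m in range(1, n + 1):
        total = 0
        for k in range(1, m + 1):
            sign = 1 if k % 2 == 1 else -1
            total += sign * comb(m, k) * 2 ** (k * (m - k)) * counts[m - k]
        counts.append(total)
    return counts[n]


if __name__ == "__main__":
    for n in range(1, 8):
        print(n, count_dags(n))
```

Already with five nodes there are 29,281 possible DAGs, which is why exhaustive enumeration is not an option for networks with thousands of genes.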
DAGs with a conditional probability distribution for each child node given its parents are called Bayesian networks. Comprehensive reviews of learning Bayesian networks are given by Buntine [30, 31], Heckerman [32], Neapolitan [33], and Daly et al. [34]. Apart from cancer gene problems, Bayesian networks are used in a broad range of applications such as ecology [35, 36], neuroscience [37, 38], distributed sensor networks for change detection, and diagnosis [39–41].
The main approaches to estimating Bayesian networks are as follows: (i) a score-and-search approach through the space of Bayesian network structures, (ii) a constraint-based approach that uses conditional independencies identified in the data, and (iii) a hybrid approach. A score-and-search approach seeks a structure with a good score function value [42] and uses a heuristic algorithm to find the solution. Examples of this approach are given in Daly et al. [34]. A constraint-based approach uses statistical tests of conditional independence on the data. One efficient method is the PC algorithm [43]. In high-dimensional contexts, Kalisch and Bühlmann [44] presented a version of the PC algorithm [43] with reasonable computational time and proved its consistency for sparse DAGs. Hybrid search strategies that combine the two criteria above have also been proposed, such as the Max–Min Hill-Climbing (MMHC) algorithm of Tsamardinos et al. [45]. The methods mentioned have been used successfully to estimate DAGs with a small to moderate number of nodes.
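The statistical test at the heart of a constraint-based approach can be illustrated with a partial-correlation test using Fisher's z-transform, a common choice for Gaussian data. The sketch below is a simplified illustration with a single conditioning variable and simulated data, not the PC algorithm itself; all function names and the simulated chain X → Z → Y are assumptions made for the example.

```python
import math
import random


def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, z):
    """Partial correlation of x and y given a single variable z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


def fisher_z(r, n, cond_size):
    """Fisher z statistic; approximately N(0, 1) under (conditional) independence."""
    return math.sqrt(n - cond_size - 3) * math.atanh(r)


def simulate_chain(n, seed=0):
    """Simulate the hypothetical chain X -> Z -> Y with Gaussian noise."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]
    z = [xi + rng.gauss(0, 1) for xi in x]
    y = [zi + rng.gauss(0, 1) for zi in z]
    return x, z, y


if __name__ == "__main__":
    x, z, y = simulate_chain(2000)
    print("marginal z statistic:", fisher_z(pearson(x, y), len(x), 0))
    print("z statistic given Z: ", fisher_z(partial_corr(x, y, z), len(x), 1))
```

In the simulated chain, X and Y are strongly dependent marginally, so the first statistic is large, while the statistic conditional on Z should be small. A constraint-based method would use such a result to remove the edge between X and Y.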
For the score-and-search approach, a network is identified by maximizing a certain score function [31, 33, 42, 46], and several heuristic search algorithms have been developed to find a high score [27, 34]. To overcome the high dimensionality of gene expression data, the L1-penalized method, or lasso approach, has recently been developed. Meinshausen and Bühlmann [28] show theoretically that the neighborhood of a node, corresponding to its conditional dependence set, can be obtained by solving a lasso problem, and that this is efficient for high-dimensional graphs. For DAGs, Shojaie and Michailidis [29] used the L1-penalized likelihood with a structural equation model to estimate directed graphs with a known variable order and found that the problem separates into subproblems with a lasso penalty. Huang et al. [47] used a penalized linear regression that imposes penalties on the coefficient values as well as on acyclic constraints. Fu and Zhou [48] used an adaptive lasso-based score function when the variable order is unknown. However, their objective function without the acyclic constraint is nonconvex, which makes finding the optimal solution infeasible. Han et al. [49] proposed an adaptive lasso-based score function that demonstrated superior performance to other methods when the network has a hub structure. In this chapter, we give an overview of the approach based on the lasso-type score function for gene network problems in high-dimensional data.
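As a rough illustration of how an L1 penalty selects the neighborhood of a node, the sketch below regresses one simulated "gene" on three candidate predictors with a plain lasso solved by cyclic coordinate descent. It is not the method of any of the cited papers: the objective, the penalty value `lam = 0.5`, the simulated data, and all names are assumptions chosen for the example, and the data are assumed to be roughly centered so that no intercept is fitted.

```python
import random


def soft_threshold(value, lam):
    """Soft-thresholding operator used in lasso coordinate updates."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def lasso_cd(columns, y, lam, n_iter=100):
    """Minimize (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 by coordinate descent.

    columns is a list of predictor columns (each a list of n values).
    """
    n = len(y)
    beta = [0.0] * len(columns)
    residual = list(y)  # equals y - X b while b is all zeros
    col_sq = [sum(v * v for v in col) / n for col in columns]
    for _ in range(n_iter):
        for j, col in enumerate(columns):
            # Correlation of column j with the residual that excludes column j.
            rho = sum(c * (r + beta[j] * c) for c, r in zip(col, residual)) / n
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                residual = [r - delta * c for r, c in zip(residual, col)]
                beta[j] = new_bj
    return beta


def simulate_genes(n, seed=1):
    """Hypothetical data: the target depends on g1 only; g2 and g3 are noise."""
    rng = random.Random(seed)
    g1 = [rng.gauss(0, 1) for _ in range(n)]
    g2 = [rng.gauss(0, 1) for _ in range(n)]
    g3 = [rng.gauss(0, 1) for _ in range(n)]
    target = [2.0 * a + rng.gauss(0, 1) for a in g1]
    return [g1, g2, g3], target


if __name__ == "__main__":
    columns, target = simulate_genes(200)
    print(lasso_cd(columns, target, lam=0.5))
```

The penalty shrinks the coefficient of the true predictor and sets the coefficients of the irrelevant predictors exactly to zero, so the nonzero entries define the estimated neighborhood. With a known variable order, repeating such a regression for each node on its predecessors gives one separable subproblem per node.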
1.2 Background
We explain the basic theoretical background of probabilistic graphical modeling, or Bayesian networks. Suppose we have $p$ random variables, $X_1, \ldots, X_p$, that have causal relationships with each other. The variables and their relationships in the probability distribution need to be mapped to a node set, $V$, and an edge set, $E$, of a graph $G = (V, E)$. In other words, separation in the graph needs to be mapped to independence in probability [50].
In probabilistic graphical modeling, d-separation (directed separation) is an important concept described by Pearl [26]. The definition of d-separation is complicated, but it implies the following argument. Suppose we have three node sets $A$, $B$, and $C$. We say that $C$ d-separates $A$ and $B$ if one of the following conditions is satisfied:
- All edges between $A$ and $C$ flow from $A$ to $C$, and all edges between $C$ and $B$ flow from $C$ to $B$.
- All edges between $B$ and $C$ flow from $B$ to $C$, and all edges between $C$ and $A$ flow from $C$ to $A$.
- All edges between $C$ and $A$ flow from $C$ to $A$, and all edges between $C$ and $B$ flow from $C$ to $B$.
For all disjoint subsets $A$, $B$, and $C$ of the node set $V$, we say that the probability distribution $P$ is faithful to the graph $G$ if the following condition is satisfied:
$A$ and $B$ are independent given $C$ if and only if $A$ and $B$ are d-separated given $C$.
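Based on d-separation, we can express the probability distribution by using the Markov property. The probability distribution is represented by
$$P(X_1, \ldots, X_p) = \prod_{i=1}^{p} P\bigl(X_i \mid \mathrm{pa}(X_i)\bigr),$$
where $\mathrm{pa}(X_i)$ is the set of parents of $X_i$.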
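The Markov factorization, in which the joint distribution is the product of each node's conditional distribution given its parents, can be made concrete with a small example. The sketch below uses a hypothetical three-node binary network X → Z → Y; the node names and probability values are invented for illustration only.

```python
from itertools import product

# Hypothetical network X -> Z -> Y with binary variables.
parents = {"X": (), "Z": ("X",), "Y": ("Z",)}

# cpt[node][parent_values] = P(node = 1 | parents = parent_values)
cpt = {
    "X": {(): 0.3},
    "Z": {(0,): 0.2, (1,): 0.9},
    "Y": {(0,): 0.1, (1,): 0.7},
}


def joint_probability(assignment):
    """Multiply P(node | parents) over all nodes for one full assignment."""
    prob = 1.0
    for node, node_parents in parents.items():
        parent_values = tuple(assignment[q] for q in node_parents)
        p_one = cpt[node][parent_values]
        prob *= p_one if assignment[node] == 1 else 1.0 - p_one
    return prob


if __name__ == "__main__":
    total = 0.0
    for values in product((0, 1), repeat=3):
        assignment = dict(zip(("X", "Z", "Y"), values))
        p = joint_probability(assignment)
        total += p
        print(assignment, round(p, 4))
    print("sum over all assignments:", round(total, 10))
```

Because each factor is a proper conditional distribution, the eight joint probabilities sum to one, and only five free parameters are needed instead of the seven required by an unstructured joint table over three binary variables.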
Another important issue in probabilistic graphical modeling is observational equivalence. An example of observational equivalence is shown in Figure 1.1. The three cases in Figure 1.1a–c are not distinguishable based on observational data; they are said to be in one equivalence class. However, the case in Figure 1.1d can be distinguished from the other three cases based on the data. We say that this case has a v-structure. Such an equivalence class causes multiple solutions with the same score function value if we apply the score-and-search approach to estimate a DAG. To represent an equivalence class, the completed partially directed acyclic graph (cpDAG) can be used, which can be computed by the “essentialGraph()” function in the R package [51].
Figure 1.1 Examples of observational equivalence.
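Observational equivalence can be checked mechanically with the standard criterion that two DAGs are equivalent exactly when they share the same skeleton and the same v-structures. The sketch below applies that criterion to four three-node graphs. It assumes that Figure 1.1 shows the usual chain, reversed chain, fork, and collider, which cannot be confirmed from the text alone; the function names are hypothetical.

```python
from itertools import combinations


def skeleton(edges):
    """Undirected version of a DAG given as (source, target) pairs."""
    return {frozenset(edge) for edge in edges}


def v_structures(edges):
    """Find unshielded colliders a -> c <- b, where a and b are not adjacent."""
    skel = skeleton(edges)
    parents = {}
    for source, target in edges:
        parents.setdefault(target, set()).add(source)
    found = set()
    for child, node_parents in parents.items():
        for a, b in combinations(sorted(node_parents), 2):
            if frozenset((a, b)) not in skel:
                found.add((a, child, b))
    return found


def markov_equivalent(edges_1, edges_2):
    """Same skeleton and same v-structures imply observational equivalence."""
    return (skeleton(edges_1) == skeleton(edges_2)
            and v_structures(edges_1) == v_structures(edges_2))


if __name__ == "__main__":
    chain = [("X", "Z"), ("Z", "Y")]
    reverse_chain = [("Y", "Z"), ("Z", "X")]
    fork = [("Z", "X"), ("Z", "Y")]
    collider = [("X", "Z"), ("Y", "Z")]
    print(markov_equivalent(chain, reverse_chain))
    print(markov_equivalent(chain, fork))
    print(markov_equivalent(chain, collider))
```

The chain, reversed chain, and fork fall into one equivalence class, while the collider is separated from them by its v-structure.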
1.3 Genetic Data Available
Technology developed in recent decades has allowed genome-wide monitoring of DNA and RNA levels on thousands of samples [52]. For example, The Cancer Genome Atlas (TCGA) project seeks to provide a comprehensive landscape of genetic and genomic alterations...
| Publication date (per publisher) | 13 October 2016 |
|---|---|
| Series | Wiley Series in Operations Research and Management Science |
| Language | English |
| Subject area | Medicine / Pharmacy ► Healthcare |
| | Business ► Business Administration / Management ► Corporate Management |
| Keywords | and prognostics • Business administration • Business administration and operations research • Big Data Analytics • biomedical simulation and modeling • Biostatistics • Business & Management • chronic disease screening policy • Data Mining • Data Mining Statistics • Diagnostics • Health & Social Care • Healthcare • Health Care • Healthcare Analytics • Healthcare improvement • Healthcare Logistics • healthcare organizational management and market analysis • healthcare quality assessment and improvement • health care research • healthcare systems simulation • Information Technology • Management Science/Operational Research • Nonlinear Dynamics • patient readmission • Patient scheduling • Personalized Healthcare • Predictive Modeling • real-time patient monitoring • Simulation Optimization • statistical learning methods • Statistics • Treatment optimization |
| ISBN-10 | 1-119-37464-2 / 1119374642 |
| ISBN-13 | 978-1-119-37464-0 / 9781119374640 |
Copy protection: Adobe DRM
Adobe DRM is a copy-protection scheme intended to protect the eBook from misuse. The eBook is authorized to your personal Adobe ID at download time. You can then read the eBook only on devices that are also registered to your Adobe ID.
File format: EPUB (Electronic Publication)
EPUB is an open standard for eBooks and is particularly well suited to fiction and nonfiction. The body text adapts dynamically to the display and font size, so EPUB also works well on mobile reading devices.
System requirements:
PC/Mac: You can read this eBook on a PC or Mac.
eReader: This eBook can be read on (almost) all eBook readers. It is not compatible with the Amazon Kindle, however.
Smartphone/tablet: You can read this eBook on both Apple and Android devices.