Strategies in Biomedical Data Science (eBook)

Driving Force for Innovation

Jay A. Etchings (Author)

eBook Download: EPUB
2016
John Wiley & Sons (Publisher)
978-1-119-25597-0 (ISBN)

50.99 incl. VAT
(CHF 49.80)
The eBook is sold by Lehmanns Media GmbH (Berlin) at the price in euros incl. VAT.
  • Available for immediate download
An essential guide to healthcare data problems, sources, and solutions

Strategies in Biomedical Data Science provides medical professionals with much-needed guidance toward managing the increasing deluge of healthcare data. Beginning with a look at our current top-down methodologies, this book demonstrates the ways in which both technological development and more effective use of current resources can better serve both patient and payer. The discussion explores the aggregation of disparate data sources, current analytics and toolsets, the growing necessity of smart bioinformatics, and more as data science and biomedical science grow increasingly intertwined. You'll dig into the unknown challenges that come along with every advance, and explore the ways in which healthcare data management and technology will inform medicine, politics, and research in the not-so-distant future. Real-world use cases and clear examples are featured throughout, and coverage of data sources, problems, and potential mitigations provides necessary insight for forward-looking healthcare professionals.

Big Data has been a topic of discussion for some time, with much attention focused on problems and management issues surrounding truly staggering amounts of data. This book offers a lifeline through the tsunami of healthcare data, to help the medical community turn their data management problem into a solution.

  • Consider the data challenges personalized medicine entails
  • Explore the available advanced analytic resources and tools
  • Learn how bioinformatics as a service is quickly becoming reality
  • Examine the future of IoT and the deluge of personal device data

The sheer amount of healthcare data being generated will only increase as both biomedical research and clinical practice trend toward individualized, patient-specific care. Strategies in Biomedical Data Science provides expert insight into the kind of robust data management that is becoming increasingly critical as healthcare evolves.

JAY A. ETCHINGS is the director of operations at Arizona State University's Research Computing program, where he is responsible for developing innovative architectures to progress fluid technical environments supporting highly computational workloads, peta-scale data analysis, next-generation cyber capabilities, and emerging network innovations.


Foreword

Acknowledgments

Introduction

Who Should Read This Book?

What's in This Book?

How to Contact Us

Chapter 1 Healthcare, History, and Heartbreak

Top Issues in Healthcare

Data Management

Biosimilars, Drug Pricing, and Pharmaceutical Compounding

Promising Areas of Innovation

Conclusion

Notes

Chapter 2 Genome Sequencing: Know Thyself, One Base Pair at a Time

Content contributed by Sheetal Shetty and Jacob Brill

Challenges of Genomic Analysis

The Language of Life

A Brief History of DNA Sequencing

DNA Sequencing and the Human Genome Project

Select Tools for Genomic Analysis

Conclusion

Notes

Chapter 3 Data Management

Content contributed by Joe Arnold

Bits about Data

Data Types

Data Security and Compliance

Data Storage

SwiftStack

OpenStack Swift Architecture

Conclusion

Notes

Chapter 4 Designing a Data-Ready Network Infrastructure

Research Networks: A Primer

ESnet at 30: Evolving toward Exascale and Raising Expectations

Internet2 Innovation Platform

Advances in Networking

InfiniBand and Microsecond Latency

The Future of High-Performance Fabrics

Network Function Virtualization

Software-Defined Networking

OpenDaylight

Conclusion

Notes

Chapter 5 Data-Intensive Compute Infrastructures

Content contributed by Dijiang Huang, Yuli Deng, Jay Etchings, Zhiyuan Ma, and Guangchun Luo

Big Data Applications in Health Informatics

Sources of Big Data in Health Informatics

Infrastructure for Big Data Analytics

Fundamental System Properties

GPU-Accelerated Computing and Biomedical Informatics

Conclusion

Notes

Chapter 6 Cloud Computing and Emerging Architectures

Cloud Basics

Challenges Facing Cloud Computing Applications in Biomedicine

Hybrid Campus Clouds

Research as a Service

Federated Access Web Portals

Cluster Homogeneity

Emerging Architectures (Zeta Architecture)

Conclusion

Notes

Chapter 7 Data Science

NoSQL Approaches to Biomedical Data Science

Using Splunk for Data Analytics

Statistical Analysis of Genomic Data with Hadoop

Extracting and Transforming Genomic Data

Processing eQTL Data

Generating Master SNP Files for Cases and Controls

Generating Gene Expression Files for Cases and Controls

Cleaning Raw Data Using MapReduce

Transpose Data Using Python

Statistical Analysis Using Spark

Hive Tables with Partitions

Conclusion

Notes

Appendix: A Brief Statistics Primer

Content contributed by Daniel Peñaherrera

Chapter 8 Next-Generation Cyberinfrastructures

Next-Generation Cyber Capability

NGCC Design and Infrastructure

Conclusion

Note

Conclusion

Appendix A The Research Data Management Survey: From Concepts to Practice

Brandon Mikkelsen and Jay Etchings

Appendix B Central IT and Research Support

Gregory D. Palmer

Appendix C HPC Working Example: Using Parallelization Programs Such as GNU Parallel and OpenMP with Serial Tools

Appendix D HPC and Hadoop: Bridging HPC to Hadoop

Appendix E Bioinformatics + Docker: Simplifying Bioinformatics Tools Delivery with Docker Containers

Glossary

About the Author

About the Contributors

Index

Introduction


Never let the future disturb you.

You will meet it, if you have to, with the same weapons of reason which today arm you against the present.

—Marcus Aurelius

Some time ago, while I was engaged as a consultant, it became painfully obvious that the approaches to healthcare data management and overall infrastructure architecture were stuck in the Stone Age. While data and information technology (IT) professionals sprinted to remain on the cutting edge of top tech trends, much of the healthcare system remained a technical backwater. The many explanations for this include compliance controls, challenges associated with the rapid proliferation of data, and reliance on old systems with proprietary code where porting was more painful than the day-to-day operations. This state of affairs has been frustrating for all involved. But beyond the very real frustrations, there are far more important negative impacts. Technical inefficiencies increase costs, lead to a loss of research productivity, and hurt clinical outcomes. In other words, everyone suffers. When I talk to people about data management and IT support within the healthcare field, a recurring theme is that much is “lost in translation” between the various stakeholders: IT professionals, researchers, doctors, clinicians, and administrators.

Over the past 20 years, much of my time has been spent in medical and technical fields. I have held positions with two large insurance payer providers and have worked with the Centers for Medicare & Medicaid Services (CMS) as a recovery audit contractor. I have even worked clinically as an emergency medical technician with a strong background in exercise physiology. Seeking greater challenges led me to Las Vegas, Nevada, where I was fortunate to work on the first cloud-enabled centrally deterministic (Class 2) gaming systems for the state lottery. This was well before the term “cloud” had even arrived. At the close of the project, I returned to the medical field, joining a Fortune 50 payer provider ingesting targeted acquisitions.

My wide-ranging work experiences have shown me that medical and research professionals are usually not technology experts, and most do not desire to be. At the same time, computer scientists and infrastructure experts are not biologists, doctors, or researchers. This longtime disconnect paves the way for high-paid consultants to act as intermediaries brought in to work between IT and biomedical staff.

Not surprisingly, this does not work terribly well, nor does it best serve the medical and research communities. Consultants typically demand high compensation and often are not able to perform the sort of knowledge transfer necessary to make a meaningful and sustainable impact. There are many different permutations and possible explanations for this. But, in the end, I think it is at heart a failure to adequately translate or bridge biomedicine and IT.

The primary motivation for this book is to begin to create a sustainable and readily accessible bridge between IT and data technologists, on one hand, and the community of clinicians, researchers, and academics who deliver and advance healthcare, on the other hand. This book is thus a translational text that will hopefully work both ways. It can help IT staff learn more about clinical and research needs within biomedicine. It also can help doctors and researchers learn more about data and other technical tools that are potentially at their disposal.

My experience in healthcare has shown me that both IT professionals and biologists tend to become isolated or siloed in their professional worlds. This isolation hurts us all: IT staff, biologists, doctors, and patients alike. This is not to suggest that IT staff and data managers should get master’s degrees in biology or epidemiology. Rather, I am suggesting that as IT staff and data managers learn more about the biomedical context of their work, they will be able to work better and more efficiently. Furthermore, as biomedicine becomes ever more dependent on computing and big data, there is more and more domain-specific technical knowledge to assimilate.

As IT and biomedicine innovate with increasing rapidity, I predict that we will see more and more hybrid job titles, such as health technologist and bioinformatician. In order to stay current, both IT professionals and biomedical professionals will need to become less isolated. This book begins to bring together these two fields that are so dependent on each other and have so much to offer each other. It is my sincere hope that this work will narrow the gap between those engaged in use-inspired research and those supporting that research from an infrastructure delivery perspective.

In the interest of creating as accessible a bridge text as possible between IT staff and biomedical personnel, this book is relatively nontechnical. For the most part, the aim is to offer a conceptual introduction to key topics in data management for the biomedical sciences. While a certain familiarity with IT, networking, and applications is assumed, you will find very little in the way of code examples. The goal is to equip you with some foundational concepts that will leave you prepared to seek out whatever additional information you and your institution might need.

I have worked in IT for over 20 years, but I am most inspired by how computing technologies can be used to solve human problems. I certainly appreciate elegant code and innovative technical solutions. But at the end of the day, it is the prospect of improving patient outcomes that keeps me engaged and driven to learn and continually extend the boundaries of the possible. One area of biomedical research that I find particularly inspiring is the potential to use targeted therapies to more effectively treat pediatric low-grade astrocytomas (PLGAs). PLGAs are by far the most common cancer of the brain among children. They are often fatal, and current chemotherapies frequently have lifelong side effects, including neurocognitive impairment. Dr. Joshua LaBaer, interim director of the Biodesign Institute at Arizona State University, is working to develop effective targeted therapies that reduce harmful effects on normal cells. Proceeds from this book support the ASU Research Foundation and the work of Dr. Joshua LaBaer, Director, The Biodesign Institute, Personalized Diagnostics and Virginia G. Piper Chair in Personalized Medicine.

In reflecting on the important roles to be played by humans and by computing, I am reminded of a frequently cited quote by Leo M. Cherne, an American economist and public servant, that is often inaccurately attributed to Albert Einstein: “The computer is incredibly fast, accurate, and stupid. Man is unbelievably slow, inaccurate, and brilliant. The marriage of the two is a force beyond calculation.” As our capabilities to gather, analyze, and archive data dramatically improve, computing is likely to be increasingly valuable to biomedical research and clinical medicine. Yet let us always remember the need for humans, slow and inaccurate as we usually are.

WHO SHOULD READ THIS BOOK?


Strategies in Biomedical Data Science is designed to help anyone who works with biomedical data. This certainly includes IT staff and systems administrators. These readers will hopefully gain a deeper understanding of particular challenges and solutions for biomedical data management. The target audience also includes bioscience researchers and clinical staff. While persons in these roles are not typically directly responsible for data management, they are most certainly concerned with and affected by how data is created, used, and archived. I hope these readers will gain a deeper understanding of how IT staff tend to approach systems architecture and data management. Quite frequently we focus on academic and other public research institutions. Such institutions are tremendously important for cutting-edge research and collaboration. Most of the best practices and scenarios presented in the book are, however, equally applicable to private-sector use cases.

All readers are welcome to work through this book in whatever order best suits their particular interests and needs.

WHAT’S IN THIS BOOK?


Strategies in Biomedical Data Science offers a relatively high-level introduction to the cutting-edge and rapidly changing field of biomedical data. It provides biomedical IT professionals with much-needed guidance toward managing the increasing deluge of healthcare data. This book demonstrates ways in which both technological development and more effective use of current resources can better serve both patient and payer. The discussion explores the aggregation of disparate data sources, current analytics and tool sets, the growing necessity of smart bioinformatics, and more as data science and biomedical science grow increasingly intertwined. Real-world use cases and clear examples are featured throughout, and coverage of data sources, problems, and potential mitigation provides necessary insight for forward-looking healthcare professionals.

The book begins with an overview of current technical challenges in healthcare and then moves into topics in biomedical data management, including network infrastructure, compute infrastructure, cloud architecture, and finally next-generation cyberinfrastructures.

Many of the chapters include use cases and/or case studies. Use cases examine a general use case and typically focus on one application or technology....

Publication date (per publisher): 27 December 2016
Series: SAS Institute Inc; Wiley and SAS Business Series
Foreword: Ken Buetow
Language: English
Subject areas: Mathematics / Computer Science › Computer Science › Databases
Medicine / Pharmacy › General / Reference
Medicine / Pharmacy › Healthcare
Medicine / Pharmacy › Medical Specialties
Natural Sciences › Biology
Keywords: Big Data Analytics • Big data in healthcare • Bioinformatics • biomedical data careers • biomedical data generation • biomedical engineering • biomedical research analytics • Healthcare Analytics • healthcare data challenges • healthcare data management • healthcare data solutions • healthcare data sources • healthcare data volume • health device data • Jay Etchings • Medical Data • medical data and politics • medical data management • Medical Informatics & Biomedical Information Technology • personalized medicine data • Strategies in Biomedical Data Science: Driving Force for Innovation
ISBN-10: 1-119-25597-X / 111925597X
ISBN-13: 978-1-119-25597-0 / 9781119255970
EPUB (Adobe DRM)

Copy protection: Adobe DRM
Adobe DRM is a copy-protection scheme intended to protect the eBook from misuse. The eBook is authorized to your personal Adobe ID at download time. You can then read it only on devices that are also registered to your Adobe ID.

File format: EPUB (Electronic Publication)
EPUB is an open standard for eBooks and is particularly well suited to fiction and non-fiction titles. The body text adapts dynamically to the display and font size, which also makes EPUB a good fit for mobile reading devices.

System requirements:
PC/Mac: You can read this eBook on a PC or Mac. You need an Adobe ID and the free Adobe Digital Editions software. We advise against using the OverDrive Media Console; in our experience it frequently causes problems with Adobe DRM.
eReader: This eBook can be read on (almost) all eBook readers. It is not compatible with the Amazon Kindle, however.
Smartphone/Tablet: Whether Apple or Android, you can read this eBook. You need an Adobe ID and a free app.
