Programming Multicore and Many-core Computing Systems (eBook)

Sabri Pllana, Fatos Xhafa (Editors)

eBook Download: PDF

2017
John Wiley & Sons (Publisher)
978-1-119-33199-5 (ISBN)

Programming multi-core and many-core computing systems

Sabri Pllana, Linnaeus University, Sweden

Fatos Xhafa, Technical University of Catalonia, Spain

Provides state-of-the-art methods for programming multi-core and many-core systems

The book comprises a selection of twenty-two chapters covering: fundamental techniques and algorithms; programming approaches; methodologies and frameworks; scheduling and management; testing and evaluation methodologies; and case studies for programming multi-core and many-core systems.

Program development for multi-core processors, especially for heterogeneous multi-core processors, is significantly more complex than for single-core processors. However, programmers have traditionally been trained to develop sequential programs, and only a small percentage of them have experience with parallel programming. In the past, only a relatively small group of programmers interested in High Performance Computing (HPC) was concerned with parallel programming issues, but the situation has changed dramatically with the appearance of multi-core processors in commonly used computing systems. As multi-core processors become pervasive, parallel programming is expected to become mainstream.

The pervasiveness of multi-core processors affects a large spectrum of systems, from embedded and general-purpose systems to high-end computing systems. This book helps programmers master the efficient programming of multi-core systems, which is of paramount importance for a software-intensive industry seeking a more effective product-development cycle.

Key features:

Lessons, challenges, and roadmaps ahead.
Contains real-world examples and case studies.
Helps programmers master the efficient programming of multi-core and many-core systems.

The book serves as a reference for a large audience of practitioners, young researchers, and graduate-level students. A basic level of programming knowledge is required to use this book.

Sabri Pllana is an Associate Professor in the Department of Computer Science at Linnaeus University, Sweden. Before joining Linnaeus University, he worked for 12 years at the Research Group Scientific Computing, University of Vienna in Austria. His current research interests include performance-oriented software engineering and self-adaptive techniques for performance portability across various heterogeneous computing systems. He contributed to several EU-funded projects and coordinated the FP7 project PEPPHER. He has contributed as member/chair to more than 60 program committees. He holds a PhD degree (with distinction) in computer science from the Vienna University of Technology. He is a Senior Member of the IEEE, a member of the European Network of Excellence on High Performance and Embedded Architecture and Compilation (HiPEAC) and of the European ICT COST Action (IC1406) on High-Performance Modelling and Simulation for Big Data Applications, an associate member of ETP4HPC, and a member of the Euro-Par Advisory Board.
Fatos Xhafa received his PhD in Computer Science in 1998 from the Technical University of Catalonia (UPC), Barcelona, Spain. He currently holds a permanent position as Professor Titular d'Universitat at UPC. He was a Visiting Professor at the University of London (UK) in 2009-2010 and a Research Associate at Drexel University (USA) in 2004-2005. He has published widely in international journals, conferences/workshops, book chapters, edited books and proceedings in the field. He is Editor-in-Chief of the International Journal of Grid and Utility Computing and the International Journal of Space-Based and Situated Computing (Inderscience). He is also Editor-in-Chief of the Elsevier Book Series 'Intelligent Data-Centric Systems' and of Springer Lecture Notes in Data Engineering and Communication Technologies. He is a member of the IEEE Communications Society, the IEEE Systems, Man & Cybernetics Society, and the Emerging Technical Subcommittee of IoT. His research interests include parallel and distributed computing, massive data processing, collective intelligence, optimization, trustworthy computing, and machine learning.

Cover
Title Page
Copyright
Contents
List of Contributors
Preface
Acknowledgements
Acronyms
Part I Foundations
Chapter 1 Multi- and Many-Cores, Architectural Overview for Programmers
1.1 Introduction
1.2 Why Multicores?
1.3 Homogeneous Multicores
1.4 Heterogeneous Multicores
1.5 Concluding Remarks
References
Chapter 2 Programming Models for MultiCore and Many-Core Computing Systems
2.1 Introduction
2.2 A Comparative Analysis of Many-Cores
2.3 Programming Models Features
2.4 Programming Models for Many-Cores
2.5 An Overview of Many-Core Programming Models
2.6 Concluding Remarks
References
Chapter 3 Lock-free Concurrent Data Structures
3.1 Introduction
3.2 Synchronization Primitives
3.3 Lock-Free Data Structures
3.4 Memory Management for Concurrent Data Structures
3.5 Graphics Processors
References
Chapter 4 Software Transactional Memory
4.1 Introduction
4.2 STM: A Programmer’s Perspective
4.3 Transactional Semantics
4.4 STM Design Space
4.5 STM: A Historical Perspective
4.6 Application Performance on STM
4.7 Concluding Remarks
References
Part II Programming Approaches
Chapter 5 Hybrid/Heterogeneous Programming with OmpSs and its Software/Hardware Implications
5.1 Introduction
5.2 The OmpSs Proposal
5.3 Implementation
5.4 Task Granularity
5.5 Related Work
5.6 Future Work
5.7 Concluding Remarks
Acknowledgments
References
Chapter 6 Skeleton Programming for Portable Many-Core Computing
6.1 Introduction
6.2 Background: Skeleton Programming
6.3 SkePU: A Tunable Skeleton Programming Library
6.4 SkelCL: A Library for High-Level Multi-GPU Programming
6.5 Related Work
6.6 Concluding Remarks and Future Work
References
Chapter 7 DSL Stream Programming on Multicore Architectures
7.1 Introduction
7.2 A High-Level DSL: SLICES
7.3 Intermediary Representation SJD
7.4 Optimizing the Intermediate Representation
7.5 Reducing Intercore Communication Cost
7.6 Evaluation
7.7 Conclusion
References
Chapter 8 Programming with Transactional Memory
8.1 Introduction
8.2 Concurrency Made Simple
8.3 TM Language Constructs
8.4 Implementing a TM
8.5 Performance Limitations
8.6 Recent Solutions
8.7 Concluding Remarks
Acknowledgments
References
Chapter 9 Object-Oriented Stream Programming
9.1 Stream Programming
9.2 Object-Oriented Stream Programming
9.3 XJava
9.4 Performance
9.5 Experiences
9.6 Related Work
9.7 Future Work
9.8 Summary
References
Chapter 10 Software-Based Speculative Parallelization
10.1 Introduction
10.2 Speculative Execution in CorD
10.3 Advanced Features of CorD
10.4 Related Work
10.5 Future Work
10.6 Concluding Remarks
References
Chapter 11 Autonomic Distribution and Adaptation
11.1 Introduction
11.2 Parallel Programming Models
11.3 Concurrent Code
11.4 Conclusions
References
Part III Programming Frameworks
Chapter 12 PEPPHER: Performance Portability and Programmability for Heterogeneous Many-Core Architectures
12.1 Introduction and Background
12.2 The PEPPHER Framework
12.3 The PEPPHER Methodology
12.4 Performance Guidelines and Portability
12.5 Further Technical Aspects
12.6 Guiding Applications and Benchmarks
12.7 Conclusion
References
Chapter 13 FastFlow: High-Level and Efficient Streaming on Multicore
13.1 FastFlow Principles
13.2 FastFlow μ-Tutorial
13.3 Performance
13.4 Related Work
13.5 Future Work and Conclusions
References
Chapter 14 Parallel Programming Framework for H.264/AVC Video Encoding in Multicore Systems
14.1 Introduction
14.2 Parallel Programming Framework for H.264/AVC Video Encoding
14.3 Programming Parallel H.264 Video Encoders
14.4 Evaluation of Parallel H.264 Video Encoders
14.5 Concluding Remarks
Acknowledgments
References
Chapter 15 Parallelizing Evolutionary Algorithms on GPGPU Cards with the EASEA Platform
15.1 Introduction
15.2 EASEA Parallelization of EA on GPGPU
15.3 Experiments and Applications
15.4 Conclusion
References
Part IV Testing, Evaluation and Optimization
Chapter 16 Smart Interleavings for Testing Parallel Programs
16.1 Introduction
16.2 Reviews of Parallel Programs
16.3 Testing of Parallel Programs
16.4 Related Work
16.5 Future Work
16.6 Concluding Remarks
References
Problems
Chapter 17 Parallel Performance Evaluation and Optimization
17.1 Sequential Versus Parallel Performance
17.2 Thread Overheads
17.3 Cache Coherence Overheads
17.4 Synchronization Overheads
17.5 Nonuniform Memory Access
17.6 Overlapping Latency
17.7 Diagnostic Tools and Techniques
17.8 Summary
References
Chapter 18 A Methodology for Optimizing Multithreaded System Scalability on Multicores
18.1 Introduction
18.2 Multithreading and Scalability
18.3 Controlled Performance Measurements
18.4 Workload Design and Implementation
18.5 Quantifying Scalability
18.6 Case Study: Memcached Scalability
18.7 Other Multithreaded Applications
18.8 Case Study: Data Validation
18.9 Scalability on Many-Core Architectures
18.10 Future Work
18.11 Concluding Remarks
References
Chapter 19 Improving Multicore System Performance through Data Compression
19.1 Introduction
19.2 Our Approach
19.3 Experimental Evaluation
19.4 Related Work
19.5 Future Work
19.6 Concluding Remarks
References
Part V Scheduling and Management
Chapter 20 Programming and Managing Resources on Accelerator-Enabled Clusters
20.1 Introduction
20.2 Programming Accelerators on Large-Scale Clusters
20.3 MapReduce for Heterogeneous Clusters
20.4 Resource Configuration
20.5 Resource Management on Clusters with Accelerators
20.6 Evaluation
20.7 Conclusion
References
Chapter 21 An Approach for Efficient Execution of SPMD Applications on Multicore Clusters
21.1 Introduction
21.2 SPMD Applications on Multicore Clusters
21.3 Methodology for Efficient Execution
21.4 Scalability and Efficiency of SPMD Applications
21.5 SPMD Applications and Performance Evaluation
21.6 Related Works
21.7 Conclusion and Future Works
References
Chapter 22 Operating System and Scheduling for Future Multicore and Many-Core Platforms
22.1 Introduction
22.2 Operating System Kernel Models
22.3 Scheduling
References
Glossary
Index
EULA

Publication date (per publisher)	23 January 2017
Series	Wiley Series on Parallel and Distributed Computing
Language	English
Subject area	Mathematics / Computer Science ► Computer Science ► Networks
Subject area	Mathematics / Computer Science ► Computer Science ► Theory / Studies
Keywords	computer programmers • Computer Science • Grid & Cloud Computing • Heterogeneous Many-core Architectures • Multi-core Clusters • Object-Oriented Stream Programming • Optimization for Multi-core and Many-core Computing Systems • Parallel and Distributed Computing • Parallel Performance • product-development cycle • Programming Multi-core and Many-core Computing Systems • programming multi-core systems • Programming with Transactional Memory • Scheduling • Skeleton Programming • spectrum of computing systems • state-of-the-art methods for programming • Testing Parallel Programs • Distributed Programming
ISBN-10	1-119-33199-4 / 1119331994
ISBN-13	978-1-119-33199-5 / 9781119331995

PDF (Adobe DRM)

Copy protection: Adobe DRM
Adobe DRM is a copy-protection scheme intended to protect the eBook against misuse. The eBook is authorized for your personal Adobe ID at download time. You can then read it only on devices that are also registered to your Adobe ID.

File format: PDF (Portable Document Format)
With its fixed page layout, PDF is particularly well suited to technical books with columns, tables, and figures. A PDF can be displayed on almost any device, but is of limited use on small displays (smartphones, eReaders).

System requirements:
PC/Mac: You can read this eBook on a PC or Mac. You need an Adobe ID and the free Adobe Digital Editions software. We advise against using the OverDrive Media Console, as experience shows that it frequently causes problems with Adobe DRM.
eReader: This eBook can be read on (almost) all eBook readers. It is not compatible with the Amazon Kindle, however.
Smartphone/Tablet: You can read this eBook on both Apple and Android devices. You need an Adobe ID and a free app.

Buying eBooks from abroad
For tax-law reasons, we can sell eBooks only within Germany and Switzerland. Regrettably, we cannot fulfill eBook orders from other countries.