
GPU Programming with C++ and CUDA (eBook)

Uncover effective techniques for writing efficient GPU-parallel C++ applications

Paulo Motta (Author)

eBook Download: EPUB
2025
270 pages
Packt Publishing (publisher)
978-1-80512-882-3 (ISBN)

Reading and media samples

GPU Programming with C++ and CUDA - Paulo Motta
System requirements
€29.99 incl. VAT
(CHF 29.30)
eBook sales are handled by Lehmanns Media GmbH (Berlin) at the euro price incl. VAT.
  • Download available immediately

Written by Paulo Motta, a senior researcher with decades of experience, this comprehensive GPU programming book is an essential guide for leveraging the power of parallelism to accelerate your computations. The first section introduces the concept of parallelism and provides practical advice on how to think about and utilize it effectively. Starting with a basic GPU program, you then gain hands-on experience in managing the device. This foundational knowledge is then expanded by parallelizing the program to illustrate how GPUs enhance performance.
The second section explores GPU architecture and implementation strategies for parallel algorithms, and offers practical insights into optimizing resource usage for efficient execution.
In the final section, you will explore advanced topics such as utilizing CUDA streams. You will also learn how to package and distribute GPU-accelerated libraries for the Python ecosystem, extending the reach and impact of your work.
Combining expert insight with real-world problem solving, this book is a valuable resource for developers and researchers aiming to harness the full potential of GPU computing. The blend of theoretical foundations, practical programming techniques, and advanced optimization strategies it offers is sure to help you succeed in the fast-evolving field of GPU programming.


Learn to solve parallel problems with GPU-accelerated C++ code and create reusable libraries that can be accessed from other programming languages.

Key Features
  • Harness the power of GPU parallelism to accelerate real-world tasks
  • Utilize CUDA streams and scale performance with custom C++ solutions
  • Create reusable GPU libraries and expose them to Python seamlessly

What you will learn
  • Manage GPU devices and accelerate your applications
  • Apply parallelism effectively using CUDA and C++
  • Choose between existing libraries and custom GPU solutions
  • Package GPU code into libraries for use with Python
  • Explore advanced topics such as CUDA streams
  • Implement optimization strategies for resource-efficient execution

Who this book is for
C++ developers and programmers interested in accelerating applications using GPU programming will benefit from this book. It is suitable for those with solid C++ experience who want to explore high-performance computing techniques. Familiarity with operating system fundamentals will help when dealing with device memory and communication in advanced chapters.

1


Introduction to Parallel Programming


Welcome to the world of graphics processing unit (GPU) programming!

Before we talk about programming GPUs, we must understand what parallel programming is and how it can benefit our applications. As with everything in life, it has its challenges. In this chapter, we’ll explore both the benefits and drawbacks of parallel programming, laying the groundwork for our deep dive into GPU programming. So in this first chapter, we’ll be discussing a variety of topics without developing any code. In doing so, we’ll establish the foundations on which to build throughout our journey.

Beyond being immediately useful, the information in this chapter is fundamental to understanding what happens inside a GPU, as we’ll discuss shortly. By the end of the chapter, you’ll understand why parallelism is important and when it makes sense to use it in your applications.

In this chapter, we’re going to cover the following main topics:

  • What parallelism is in software, and why it’s important
  • Different types of parallelism
  • An overview of GPU architecture
  • Comparing central processing units (CPUs) and GPUs
  • Advantages and challenges of GPU programming

Getting the most out of this book – get to know your free benefits


Unlock exclusive free benefits that come with your purchase, thoughtfully crafted to supercharge your learning journey and help you learn without limits.

Here’s a quick overview of what you get with this book:

Next-gen reader


Figure 1.1: Illustration of the next-gen Packt Reader’s features

Our web-based reader, designed to help you learn effectively, comes with the following features:

Multi-device progress sync: Learn from any device with seamless progress sync.

Highlighting and notetaking: Turn your reading into lasting knowledge.

Bookmarking: Revisit your most important learnings anytime.

Dark mode: Focus with minimal eye strain by switching to dark or sepia mode.

Interactive AI assistant (beta)


Figure 1.2: Illustration of Packt’s AI assistant

Our interactive AI assistant has been trained on the content of this book, to maximize your learning experience. It comes with the following features:

Summarize it: Summarize key sections or an entire chapter.

AI code explainers: In the next-gen Packt Reader, click the Explain button above each code block for AI-powered code explanations.

Note: The AI assistant is part of next-gen Packt Reader and is still in beta.

DRM-free PDF or ePub version


Figure 1.3: Free PDF and ePub

Learn without limits with the following perks included with your purchase:

Learn from anywhere with a DRM-free PDF copy of this book.

Use your favorite e-reader to learn using a DRM-free ePub version of this book.

Unlock this book’s exclusive benefits now

Scan this QR code or go to packtpub.com/unlock, then search for this book by name. Ensure it’s the correct edition.

Note: Keep your purchase invoice ready before you start.

Technical requirements


For this chapter, the only technical requirement that we have is the goodwill to keep reading!

What is parallelism in software?


Parallel programming is a way of making a computer do many things at once. But wait – isn’t this what already happens daily? Yes and no. Most common processors today are capable of executing more than one task at the same time – and we mean at the same time. However, this is only the first requirement for parallel software. The second is to make at least some of the processor cores work on the same problem in a coordinated way. Let’s consider an example.

Imagine that you’re taking on a big task, such as sorting a huge pile of books. Instead of doing it alone, you ask a group of friends to help. Each friend takes a small part of the pile and sorts it. You all work at the same time, and the job gets done much faster. This is similar to how parallel programming works: it breaks a big problem into smaller pieces and solves them at the same time using multiple cores.

Of course, this example was chosen because it has a special characteristic: it’s easily parallelizable in the sense that we can perceive how to break the big tasks into smaller ones. Not all problems can be easily broken down for parallel processing. One of our first challenges is finding ways to decompose problems into smaller tasks that can be executed simultaneously. Sometimes, there are parts of our algorithm that need to be executed on a single core while all others sit idle before we can separate the parallel tasks. This is usually called a sequential part. It’s time for a different example.

Let’s suppose you’re having a movie and games night with your friends. You all decide to prepare some food and for that, you go to the supermarket. To make things faster, your friends come along so that, once there, everyone can select multiple ingredients at the same time – this is the parallel part. However, since you’re all going in a single car, only one person can drive at a given time, no matter how many licensed drivers there are in the vehicle. You can always argue that they could take turns driving a part of the way, but in this scenario, it would only take longer to get to the supermarket.

Upon arriving, each person heads to a different aisle to gather the pre-defined ingredients. Once everything is collected, another crucial decision arises: should each person go to a separate checkout line to pay with their credit card, or should they all queue together if they only have one card? Opting for the parallel payment method reveals another interesting aspect of parallel processing.

Even when tasks are processed in parallel (each person is on a different checkout line), the execution times can vary unpredictably. This means that at any given moment, different lines move at different speeds, and those who have already paid for their ingredients may end up waiting for their friends (processors) to finish their payments.

Once all the payments are complete, a new sequential part is followed: driving back home. This time, a different driver might be executing this task while the other people – I mean, processors – sit idle waiting for the next task to execute. Some algorithms have sequential parts to synchronize data or to share intermediate results, and that’s why only one processor is working. Here, we’re collecting the data that each processor – I mean, friend – got from the supermarket and we have to move this from one location to another. There’s no use for parallelism in this small part.

Why is parallelism important?


There are many situations in which the size of the problems we want to solve increases dramatically. And this is the moment when we have to start talking about more ‘serious’ real-world applications, such as weather forecasting, scientific research, and artificial intelligence.

Remember when we were driving to the supermarket and we mentioned that we could switch drivers for each part of the way? Wouldn’t this only end up taking us more time? This was due to context switching – we would have to find a place to park, then switch drivers, then drive the car until the next stop. But why are we talking about this again? Because most of the time, we need a ‘serious’ real-world application to make it worthwhile working through all the details of parallel programming.

One exception could be using parallel programming to accelerate graphics and physics processing in video games; although these applications may not be critical for human life, they’re pretty serious. We could always classify video games within the ‘serious’ simulation category. Let’s understand some of the benefits we get by using parallelism in our software.

Speeding up tasks


Splitting tasks into smaller parts that can be done simultaneously dramatically speeds up the overall process. We now have multiple processors...

Publication date (per publisher)  29.8.2025
Language  English
Subject area  Computer science > Programming languages/tools > C/C++
  Computer science > Other topics > Hardware
ISBN-10  1-80512-882-5 / 1805128825
ISBN-13  978-1-80512-882-3 / 9781805128823
EPUB (without DRM)

Digital Rights Management: none
This eBook contains no DRM or copy protection. However, passing it on to third parties is not legally permitted, because the purchase grants you rights to personal use only.

File format: EPUB (Electronic Publication)
EPUB is an open standard for eBooks and is particularly well suited to fiction and non-fiction. The text reflows dynamically to match the display and font size, which also makes EPUB a good fit for mobile reading devices.

System requirements:
PC/Mac: You can read this eBook on a PC or Mac. You need the free Adobe Digital Editions software.
eReader: This eBook can be read on (almost) all eBook readers. However, it is not compatible with the Amazon Kindle.
Smartphone/Tablet: Whether Apple or Android, you can read this eBook with a free app.
Device list and additional notes

Buying eBooks from abroad
For tax law reasons, we can sell eBooks only within Germany and Switzerland. Regrettably, we cannot fulfil eBook orders from other countries.
