
DeepSeek in Practice (eBook)

From basics to fine-tuning, distillation, agent design, and prompt engineering of open-source LLMs
eBook download: EPUB
2025
478 pages
Packt Publishing (publisher)
9781806020843 (ISBN)


DeepSeek in Practice - Andy Peng, Alex Strick van Linschoten, Duarte O. Carmo

Learn how to build, fine-tune, and deploy AI systems using DeepSeek, one of the most influential open-source large language models available today. This book guides you through real-world DeepSeek applications, from understanding its core architecture and training foundations to developing reasoning agents and deploying production-ready systems.
Starting with a concise synthesis of DeepSeek's research, breakthroughs, and open-source philosophy, you'll progress to hands-on projects including prompt engineering, workflow design, and rationale distillation. Through detailed case studies, ranging from document understanding to legal clause analysis, you'll see how to use DeepSeek in high-value GenAI scenarios.
You'll also learn to build sophisticated agent workflows and prepare data for fine-tuning. By the end of the book, you'll have the skills to integrate DeepSeek into local deployments, cloud CI/CD pipelines, and custom LLMOps environments.
Written by experts with deep knowledge of open-source LLMs and deployment ecosystems, this book is your comprehensive guide to DeepSeek's capabilities and implementation.


Gain hands-on experience building, fine-tuning, and deploying GenAI applications using DeepSeek.

Key Features

  • Explore DeepSeek's architecture, training data, and reasoning capabilities
  • Build agents, fine-tune with distillation, and deploy with CI/CD pipelines
  • Apply DeepSeek to real-world use cases like coding, ideation, and legal analysis
  • Purchase of the print or Kindle book includes a free PDF eBook

What you will learn

  • Discover DeepSeek's unique traits in the LLM landscape
  • Compare DeepSeek's multimodal features with leading models
  • Consume DeepSeek via the official API, Ollama, and llama.cpp
  • Use DeepSeek for coding, document understanding, and creative ideation
  • Integrate DeepSeek with third-party platforms like OpenRouter and Cloudflare
  • Distill and deploy DeepSeek models into production environments
  • Identify when and where to use DeepSeek
  • Understand DeepSeek's open philosophy

Who this book is for

AI engineers, developers, and builders working with open-source LLMs who want to integrate DeepSeek into GenAI applications, agent workflows, or deployment pipelines. Readers should have hands-on experience with Python, APIs, and tools like Ollama or llama.cpp, and a solid understanding of machine learning concepts.

1


What Is DeepSeek?


Artificial intelligence (AI) is rapidly evolving, and with it comes a suite of tools that allow developers, researchers, and innovators to build smarter, more adaptive systems. One such emerging tool is DeepSeek: a powerful, open-source large language model (LLM) designed to rival the capabilities of major LLMs such as GPT-4 and LLaMA. But what exactly is DeepSeek, and why should you care?

In this chapter, we’re going to dive into what DeepSeek is, how it fits into the broader AI landscape, and why it’s generating interest across the tech industry. You’ll gain an understanding of DeepSeek’s unique features and how it compares to other models in terms of training data, efficiency, and performance benchmarks.

By the end of this chapter, you’ll be equipped to understand the development of DeepSeek and key contributors to its success.

In this chapter, we’re going to cover the following main topics:

  • Introducing DeepSeek
  • Understanding the technical breakthroughs of DeepSeek
  • Impact on the global AI ecosystem
  • Exploring the versions and evolution of DeepSeek

    Free Benefits with Your Book

    Your purchase includes a free PDF copy of this book along with other exclusive benefits. Check the Free Benefits with Your Book section in the Preface to unlock them instantly and maximize your learning experience.

Introducing DeepSeek


DeepSeek is an open-source language model that aims to make advanced AI accessible to all. Its reasoning model, DeepSeek-R1, appeared on 20 January 2025, just before the Chinese New Year. Instead of shipping a closed, API-only product, the team published the model weights, inference code, and a detailed technical report, so anyone can examine or rebuild the system.

Released under the MIT license, the model carries no usage fees or strict terms. Anyone may run it locally or adapt it for new tasks. This freedom drew developers, researchers, teachers, and small firms worldwide. They apply DeepSeek-R1 in support bots, classroom aids, lab studies, and writing tools.

Benchmarks show DeepSeek-R1 competing with OpenAI o3 and Gemini 2.5 Pro. It handles math, code, many languages, and complex prompts. The results suggest that strong models need not be closed and underscore China’s growing role in frontier AI. The release also revived debates on open access, safety, and global research cooperation. On September 17, 2025, another milestone followed: the DeepSeek-AI team published their research on DeepSeek-R1 in Nature, making the cover of that issue (https://www.nature.com/articles/s41586-025-09422-z).

From an architectural standpoint, DeepSeek leaned heavily on innovations in transformer-based models, while adding its own spin in later versions (explored in depth in the Exploring the versions and evolution of DeepSeek section). But what made it truly stand out was its usability. DeepSeek could be deployed in a wide range of environments, from cloud servers to edge devices, and even laptops running lightweight distilled versions.

Figure 1.1: Benchmark performance of DeepSeek-R1 (0528) (source: https://api-docs.deepseek.com/news/news250528)

Let’s talk about the various factors that contributed to the sudden rise and popularity of DeepSeek:

  • Open source architecture and training details: DeepSeek-R1 was released with a detailed research paper (https://arxiv.org/abs/2501.12948) outlining its architecture and training approach, along with benchmark scores across reasoning, math, and programming tasks (https://artificialanalysis.ai/providers/deepseek). The release was supported by full model weights and configuration files, six smaller distilled variants suited for local or low-resource environments (https://api-docs.deepseek.com/news/news250120), and immediate API availability (https://api-docs.deepseek.com/guides/reasoning_model) for developers wanting hosted access.
  • Timing: Part of its popularity was due to timing, as global organizations, scientists, and developers began exploring the new release. Additionally, the MIT license provided complete freedom for commercial use – an increasingly rare trait among performant models. The release also sparked excitement because it was not a research-only artifact; it was practical. Developers were able to fine-tune it, deploy it in production environments, and integrate it into existing AI workflows. The combination of power and usability became an instant draw.
  • Initial technical highlights: The most notable aspects of DeepSeek-R1 at launch included the following:
    • Reasoning: It outperformed or matched leading models on key benchmarks involving mathematics, code, and logical reasoning.
    • Efficiency: It provided performance close to GPT-4-level systems at significantly lower inference costs.
    • Reinforcement-first training: Unlike conventional pipelines that depend on large volumes of supervised, human-annotated data, DeepSeek applied large-scale reinforcement learning almost directly, using largely automated, rule-based rewards and only minimal human labeling. This approach sped up gains in reasoning benchmarks, lowered human-labeling costs, and allowed the model to tackle diverse tasks, such as math problems and zero-shot coding, without needing narrow task-specific instructions.
    • Custom architecture: While based on the transformer framework, DeepSeek-R1 incorporated innovations optimized for training stability and long-context understanding.

These elements together enabled the model to punch well above its weight, especially in multi-step reasoning and problem solving.
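To make the reinforcement-first idea concrete, here is a minimal Python sketch of the group-relative reward normalization used by GRPO-style training, as described in the DeepSeek-R1 report. The reward function and the sampled answers are illustrative assumptions, not DeepSeek's actual code:

```python
# Minimal sketch (not DeepSeek's implementation) of group-relative reward
# normalization: sample several answers per prompt, score each with an
# automatic rule-based reward, and normalize rewards within the group.
from statistics import mean, stdev

def rule_based_reward(answer: str, reference: str) -> float:
    """Verifiable reward: 1.0 for an exact-match final answer, else 0.0."""
    return 1.0 if answer.strip() == reference.strip() else 0.0

def group_relative_advantages(rewards: list[float]) -> list[float]:
    """Zero-mean, unit-variance advantages within one sampled group."""
    mu = mean(rewards)
    sigma = stdev(rewards) if len(rewards) > 1 else 0.0
    if sigma == 0.0:
        # All samples scored the same: no learning signal for this group.
        return [0.0 for _ in rewards]
    return [(r - mu) / sigma for r in rewards]

# Four hypothetical sampled answers to "2 + 2 = ?", scored against "4":
rewards = [rule_based_reward(a, "4") for a in ["4", "5", "4", "22"]]
advantages = group_relative_advantages(rewards)
# Correct answers receive positive advantages, incorrect ones negative,
# so the policy update shifts probability toward the correct samples.
```

Because the reward here is a mechanical check rather than a human preference label, this kind of pipeline scales without large annotation budgets, which is the cost advantage the bullet above describes.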

  • Real-world readiness: DeepSeek demonstrated real-world readiness from the outset, standing apart from many state-of-the-art models that excel in benchmarks but struggle in deployment. Unlike others that require extensive setup or closed infrastructure, DeepSeek was immediately usable in practical settings. It offered production-ready access via its API, local deployment with open weights and inference code, and customization through LoRA fine-tuning or prompt engineering. Integration with enterprise platforms such as Trae and Windsurf further streamlined orchestration. These capabilities, rarely combined so seamlessly in other models at launch, underscored DeepSeek’s commitment to practical utility beyond academic performance.
  • Community interest: DeepSeek-R1’s release sparked intense community activity. GitHub quickly overflowed with plug-ins, adapters, and fine-tuned spin-offs, while thousands of Hugging Face forks powered tools for contract review, tutoring, research aid, summarization, and coding support. Online forums shared benchmarks and hardware guides, and universities adopted the model for courses and lab projects. Forums such as Reddit, Zhihu, and Stack Overflow buzzed with shared experiments, performance tests, and guides for fine-tuning DeepSeek on local hardware. The accessibility of the model turned casual enthusiasts into researchers and developers into entrepreneurs. Today, DeepSeek also fuels educational initiatives. Several MOOCs and university labs have begun teaching LLM theory and experimentation using DeepSeek as the base model due to its openness and clarity.
  • Philosophical vision: DeepSeek’s vision aligns with a broader movement to build AI not as a gatekept asset, but as a shared global resource. Much like how Linux reshaped the software industry, DeepSeek aims to reshape AI development by putting tools in the hands of anyone curious or capable enough to use them. Its strategy is not just to compete with OpenAI or Google but to focus on accessibility and collaborative innovation.
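As a concrete illustration of the hosted-versus-local access mentioned above: the DeepSeek API follows the OpenAI chat-completions convention, so the same request body can target either the hosted endpoint or a local OpenAI-compatible server. The endpoint and model names below reflect public documentation at the time of writing and should be verified before use; this is a sketch, not DeepSeek's canonical client code:

```python
# Sketch of an OpenAI-style chat-completions request body for DeepSeek.
# URLs and model names are assumptions taken from public docs; check the
# current documentation before relying on them.

HOSTED_BASE_URL = "https://api.deepseek.com"   # official hosted API
LOCAL_BASE_URL = "http://localhost:11434/v1"   # e.g. Ollama's OpenAI-compatible route

def build_chat_request(prompt: str, model: str = "deepseek-reasoner",
                       temperature: float = 0.6) -> dict:
    """Build a chat-completions body that works against either endpoint."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
    }

body = build_chat_request("Summarize this contract clause in one sentence.")
# POST `body` to {base_url}/chat/completions, with an API key for the
# hosted service; a local server typically needs no key.
```

Because only the base URL and model tag change between the hosted API and a local deployment, the same application code can be pointed at either, which is a large part of the "real-world readiness" described above.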

DeepSeek-R1’s success is commonly attributed to three factors: its reinforcement-first training, its open-source commitment, and its competitive performance across benchmarks.

Together, these elements laid the groundwork for DeepSeek not just as a model, but as an ecosystem. The remainder of this chapter will explore the reception of DeepSeek-R1 and the motivations behind its open philosophy, what technical breakthroughs powered it, and how it is evolving into a full-scale ecosystem.

Understanding the technical breakthroughs of DeepSeek


What truly sets DeepSeek-R1 apart are the technical innovations embedded in its architecture and training process. These innovations enabled it to outperform many contemporary models and helped redefine how future LLMs might be built.

The training process


Before we begin with DeepSeek’s training process, let’s first take a look at the development process of leading LLMs, which usually consists of the following:

  1. Pretraining on massive corpora using self-supervised learning: In this stage, models are exposed to large-scale, diverse datasets such as books, websites, and code. The goal is to learn general...

Publication date (per publisher): 21.11.2025
Language: English
Subject area: Computer Science / Theory / Artificial Intelligence & Robotics
