Fundamentals of Cost-Efficient AI
Academic Press Inc (publisher)
978-0-443-33362-0 (ISBN)
The book covers fine-tuning and compression techniques such as low-rank adaptation (LoRA), parameter-efficient fine-tuning (PEFT), adapter-based tuning, pruning, and quantization. It also explores inference acceleration through Flash Attention, prefill optimization, and speculative decoding, and explains how mixture-of-experts (MoE) architectures can scale models efficiently across GPUs and edge devices.
To build a strong conceptual understanding, the text introduces fundamentals of GPU architecture, matrix multiplication, memory hierarchies, and parallelization strategies, helping readers develop an intuition for optimizing training and inference pipelines.
While applicable across domains, the book places special emphasis on healthcare and biomedicine, where efficient AI can reduce costs and improve diagnostics, precision medicine, and clinical decision support. Real-world case studies and interviews with experts from organizations such as Google and Microsoft provide practical insights into building scalable healthcare AI systems. Aimed at graduate students, researchers, clinicians, biomedical engineers, data scientists, and AI practitioners, this book bridges algorithmic principles with applied implementation.
Rohit Kumar studied at Stanford, IIT Delhi, and RPI, specializing in machine learning. He is the Global Head of AI & Analytics at HCLTech (Digital Business), a visiting faculty member at Shiv Nadar University, and a PhD scholar at IIT researching AI hallucinations. With over 20 years of product development experience in Silicon Valley, he has served as the Head of R&D at the Ministry of IT (Government of India), Senior Director at WalmartLabs, and CEO of a blockchain startup. He holds multiple patents and publications on generative AI, data mining, and large-scale distributed systems.
Introduction
Efficient transformer architectures
Efficient model fine-tuning
Model compression techniques
Efficient reinforcement learning
Efficient graph algorithms
Training data augmentation
Training data generation
Cost-efficient mixture of experts
GPU fundamentals and model inference
Fast matrix multiplication algorithms
Running models locally
Expert interviews and use cases
| Publication date | 13 December 2025 |
|---|---|
| Place of publication | San Diego |
| Language | English |
| Dimensions | 191 x 235 mm |
| Weight | 450 g |
| Subject area | Medicine / Pharmacy ► Healthcare |
| | Medicine / Pharmacy ► Physiotherapy / Occupational Therapy ► Orthopedics |
| | Natural Sciences ► Biology |
| | Technology ► Medical Engineering |
| ISBN-10 | 0-443-33362-9 / 0443333629 |
| ISBN-13 | 978-0-443-33362-0 / 9780443333620 |
| Condition | New |