AI and ML Unlocked -  M.S. Ali

AI and ML Unlocked (eBook)

A Course Book Bridging Fundamentals and Industry Challenges

(Author)

eBook Download: EPUB
2025 | 1st edition
150 pages
Publishdrive (publisher)
978-0-00-104716-7 (ISBN)

AI and ML Unlocked: A Course Book Bridging Fundamentals and Industry Challenges


From Foundational Concepts to Real-World Deployment and Ethical Considerations


Transform Your Understanding of Artificial Intelligence from Theory to Practice


In a world where artificial intelligence shapes everything from the photos on your phone to life-saving medical diagnoses, understanding how these systems work isn't just advantageous; it's essential. AI and ML Unlocked, written with the help of AI, bridges the critical gap between abstract mathematical concepts and the practical skills needed to build, deploy, and responsibly manage AI systems that create real value.


Why This Book Stands Apart


Most AI education falls into two camps: dense academic texts that bury practical insights under layers of theory, or superficial tutorials that show you how to use tools without understanding why they work. This book takes a third path: learning through building. Every mathematical concept connects directly to code you'll write. Every algorithm comes alive through projects you'll complete. Every ethical consideration emerges from real systems you'll design.


The 'Spiral of Understanding' Approach


Our unique pedagogical framework ensures deep, lasting comprehension:


Intuitive Foundation: Start with analogies and real-world examples that make complex ideas feel natural


Mathematical Clarity: Build rigorous understanding without drowning in notation


Hands-On Implementation: Strengthen knowledge through immediate practical application


Critical Analysis: Develop judgment about when, how, and whether to deploy different techniques


What You'll Master


Part I: The Foundation That Actually Matters


Move beyond memorizing definitions to understanding what makes machine learning fundamentally different from traditional programming. Grasp the mathematical concepts that power every AI system (linear algebra, calculus, and probability) through intuitive explanations and Python implementations that illuminate rather than intimidate.


Part II: Supervised and Unsupervised Learning in Action


Build classification and regression systems that solve real problems. Master decision trees, support vector machines, and clustering algorithms through projects with actual datasets. Learn not just how these algorithms work, but when to use each one and how to evaluate their performance honestly.


Part III: Deep Learning and Generative AI


Construct neural networks from scratch, then scale up to convolutional networks that can see and transformers that can understand language. Explore the cutting-edge world of generative AI and large language models, understanding both their remarkable capabilities and their significant limitations.


Part IV: The Production Reality


Bridge the notorious gap between promising prototypes and production systems. Master MLOps practices, learn to deploy models that can handle real-world scale and complexity, and understand how to monitor and maintain AI systems over time. Work through detailed case studies from healthcare, finance, and manufacturing.


Part V: Responsible AI Leadership


Develop the critical thinking skills to navigate bias, fairness, and explainability challenges. Understand the societal implications of AI systems and learn frameworks for making ethical decisions in high-stakes applications. Prepare for the evolving landscape of AI governance and regulation.

Chapter 3: Data and Preprocessing - The Unsung Heroes


Here's a truth that might surprise you: in most machine learning projects, you'll spend far more time working with data than building models. Data preprocessing isn't glamorous, but it's absolutely critical. A brilliant algorithm trained on poor data will fail, while a simple algorithm trained on high-quality, well-prepared data can achieve remarkable results.

The Importance of Data: Garbage In, Garbage Out


The phrase "garbage in, garbage out" is fundamental in data science. Your model can only learn patterns that exist in your training data. If that data is incomplete, biased, or irrelevant, your model will learn the wrong lessons.

Consider a resume screening system trained only on resumes from successful hires over the past 10 years. If historical hiring was biased toward certain demographics, the model will learn and perpetuate those biases. The algorithm isn't inherently biased—it's learning from biased historical data.

Data Quality Dimensions


High-quality data has several characteristics:

  • Accuracy: The data correctly represents reality
  • Completeness: No important information is missing
  • Consistency: The same information is represented the same way everywhere
  • Relevance: The data is actually useful for your problem
  • Timeliness: The data is current and reflects the present situation
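Several of these dimensions can be checked mechanically before any modeling starts. The following is a minimal sketch, assuming a small hypothetical table (`records`) with deliberately planted problems; the column names and the 0 to 120 age bound are illustrative assumptions, not part of the book's examples.

```python
import pandas as pd

# Hypothetical records with deliberate quality problems
records = pd.DataFrame({
    "customer_id": [1, 2, 2, 3, 4],
    "age": [34, 29, 29, None, 210],  # one missing value, one implausible value
    "city": ["Chicago", "chicago ", "chicago ", "New York", "Chicago"],
})

# Completeness: share of missing values per column
missing_share = records.isna().mean()

# Consistency: the same city spelled differently counts as separate categories
raw_city_count = records["city"].nunique()
clean_city_count = records["city"].str.strip().str.title().nunique()

# Duplicates: rows that repeat an earlier row exactly
duplicate_rows = int(records.duplicated().sum())

# Accuracy: present values outside a plausible range (assumed bound: 0 to 120)
age_present = records["age"].notna()
implausible_ages = int((age_present & ~records["age"].between(0, 120)).sum())

print(missing_share)
print(f"Distinct cities before/after normalising: {raw_city_count} / {clean_city_count}")
print(f"Duplicate rows: {duplicate_rows}, implausible ages: {implausible_ages}")
```

Relevance and timeliness usually need domain knowledge rather than a generic check, so they are left out of this sketch.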

Data Structures and Types: Understanding Your Raw Materials


Data comes in many forms, and understanding these different types helps you choose appropriate preprocessing techniques and algorithms.

Tabular Data: The Familiar Spreadsheet


Tabular data is what most people think of when they hear "data"—rows and columns like a spreadsheet. Each row represents one observation (a customer, a transaction, a patient), and each column represents one feature or attribute.

```python
import pandas as pd
import numpy as np

# Creating sample customer data
customer_data = pd.DataFrame({
    'customer_id': [1001, 1002, 1003, 1004, 1005],
    'age': [25, 34, 28, 42, 31],
    'income': [45000, 78000, 52000, 95000, 63000],
    'city': ['New York', 'Chicago', 'New York', 'Los Angeles', 'Chicago'],
    'purchases_last_year': [12, 8, 15, 22, 9],
    'customer_since': ['2020-03-15', '2019-07-22', '2021-01-08', '2018-11-30', '2020-09-12']
})

print(customer_data.head())
print(f"\nData shape: {customer_data.shape}")
print(f"Data types:\n{customer_data.dtypes}")
```

Time-Series Data: When Order Matters


Time-series data is collected over time, and the order of observations matters. Stock prices, sensor readings, website traffic, and sales data are common examples.

```python
# Creating sample time-series data
dates = pd.date_range('2023-01-01', '2023-12-31', freq='D')
np.random.seed(42)

# Simulate daily sales with trend and seasonality
trend = np.linspace(100, 150, len(dates))
seasonality = 20 * np.sin(2 * np.pi * np.arange(len(dates)) / 365.25)
noise = np.random.normal(0, 5, len(dates))
sales = trend + seasonality + noise

sales_data = pd.DataFrame({
    'date': dates,
    'daily_sales': sales
})

print(sales_data.head())
print(f"Sales range: ${sales_data['daily_sales'].min():.2f} to ${sales_data['daily_sales'].max():.2f}")
```

Text Data: The Challenge of Human Language


Text data presents unique challenges because computers don't naturally understand human language. Text needs to be converted into numerical representations before machine learning algorithms can work with it.

```python
# Sample text data - customer reviews
reviews_data = pd.DataFrame({
    'review_id': [1, 2, 3, 4, 5],
    'rating': [5, 2, 4, 1, 5],
    'review_text': [
        'Absolutely love this product! Fast delivery and great quality.',
        'Disappointed with the purchase. Poor quality and overpriced.',
        'Good value for money. Works as expected.',
        'Terrible experience. Product broke after one day.',
        'Excellent service and amazing product quality!'
    ]
})

print(reviews_data)
print(f"\nAverage rating: {reviews_data['rating'].mean():.1f}")
```
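One common way to turn text into numbers is a bag-of-words count matrix, where each column is a word and each cell counts how often that word appears in a document. The sketch below uses scikit-learn's `CountVectorizer` on two short, made-up reviews; it illustrates the general idea and is not necessarily the representation the book goes on to use.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Two short hypothetical reviews
reviews = [
    "Great quality and fast delivery",
    "Poor quality and overpriced",
]

# By default the vectorizer lowercases text and keeps tokens of 2+ characters
vectorizer = CountVectorizer()
counts = vectorizer.fit_transform(reviews)  # sparse matrix: rows = reviews, columns = words

# Column indices follow the alphabetically sorted vocabulary
vocabulary = sorted(vectorizer.vocabulary_)
dense_counts = counts.toarray()

print(vocabulary)
print(dense_counts)
```

Word order is discarded in this representation, which is one reason later techniques such as embeddings exist.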

Image Data: Pixels as Features


Image data consists of pixels, where each pixel has color values. A grayscale image has one value per pixel (0-255), while color images typically have three values (RGB) per pixel.
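In code, this means an image is simply a numeric array. A minimal sketch, assuming randomly generated 28x28 images rather than real photos, shows the shapes involved and one common way to flatten and rescale pixels into a feature vector:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic images: values 0-255 stored as unsigned 8-bit integers
gray = rng.integers(0, 256, size=(28, 28), dtype=np.uint8)       # height x width
color = rng.integers(0, 256, size=(28, 28, 3), dtype=np.uint8)   # height x width x RGB

# Flatten to one feature per pixel value and rescale to the range [0, 1]
gray_features = gray.reshape(-1).astype(np.float32) / 255.0
color_features = color.reshape(-1).astype(np.float32) / 255.0

print(f"Grayscale: {gray.shape} -> {gray_features.shape[0]} features")
print(f"Color:     {color.shape} -> {color_features.shape[0]} features")
```

The color image carries three times as many features as the grayscale one, which is why image models grow large quickly.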

Data Cleaning and Wrangling: Turning Mess into Gold


Real-world data is messy. It has missing values, inconsistent formats, duplicates, and errors. Data cleaning is the process of detecting and correcting these issues.

Handling Missing Values


Missing data is one of the most common issues you'll encounter. There are several strategies for dealing with it:

```python
# Creating data with missing values to demonstrate handling techniques
messy_data = pd.DataFrame({
    'name': ['Alice', 'Bob',
```
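Three widely used strategies for missing values are dropping incomplete rows, imputing a replacement value, and recording a flag that marks where a value was missing. The following is a minimal sketch on a small hypothetical table (`people`); the data and column names are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

# Hypothetical table with one missing age and one missing city
people = pd.DataFrame({
    "age": [25.0, np.nan, 31.0, 40.0],
    "city": ["Chicago", "New York", None, "Chicago"],
})

# Strategy 1: drop every row that has at least one missing value
dropped = people.dropna()

# Strategy 2: impute (median for numeric columns, most frequent value for categorical)
imputed = people.copy()
imputed["age"] = imputed["age"].fillna(imputed["age"].median())
imputed["city"] = imputed["city"].fillna(imputed["city"].mode()[0])

# Strategy 3: keep a flag so a model can still see that the value was missing
imputed["age_was_missing"] = people["age"].isna()

print(f"Rows before: {len(people)}, after dropping: {len(dropped)}")
print(imputed)
```

Dropping is simplest but discards data; imputation keeps every row at the cost of inventing values, so the flag column helps preserve that information.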

The Kernel Trick


The real power of SVMs comes from the "kernel trick". Sometimes data isn't separable with a straight line, but it becomes separable if we transform it to a higher dimension. Kernels allow SVMs to implicitly work in these higher dimensions without explicitly computing the transformation.

Linear Kernel: Finds straight-line boundaries. Good for linearly separable data.

RBF (Radial Basis Function) Kernel: Creates circular/curved boundaries. Good for complex, non-linear patterns.

Polynomial Kernel: Creates polynomial-curved boundaries. Good for data with polynomial relationships.
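The "implicit" part of the kernel trick can be checked numerically. For a degree-2 polynomial kernel on 2-D points, the kernel value (a · b)² equals an ordinary dot product in a 3-D feature space. The sketch below is an assumed, simplified case (homogeneous kernel, no constant term) with hypothetical helper names `phi` and `poly_kernel`:

```python
import numpy as np

def phi(v):
    """Explicit degree-2 feature map for a 2-D point (maps to 3-D)."""
    x1, x2 = v
    return np.array([x1**2, np.sqrt(2) * x1 * x2, x2**2])

def poly_kernel(a, b):
    """Homogeneous polynomial kernel of degree 2: (a . b)^2."""
    return np.dot(a, b) ** 2

x = np.array([1.0, 2.0])
z = np.array([3.0, -1.0])

explicit = np.dot(phi(x), phi(z))  # transform first, then take the dot product
implicit = poly_kernel(x, z)       # never leaves the original 2-D space

print(f"Explicit feature-space dot product: {explicit:.6f}")
print(f"Kernel computed in input space:     {implicit:.6f}")
```

Both routes give the same number, but the kernel needs only a 2-D dot product. For the RBF kernel the corresponding feature space is infinite-dimensional, so the implicit route is the only practical one.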

```python
# Demonstrating kernels with non-linear data
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Create circular data that isn't linearly separable
np.random.seed(42)
n_samples = 300

# Inner circle (class 0)
angles_inner = np.random.uniform(0, 2*np.pi, n_samples//2)
radii_inner = np.random.uniform(0, 1, n_samples//2)
inner_x = radii_inner * np.cos(angles_inner) + np.random.normal(0, 0.1, n_samples//2)
inner_y = radii_inner * np.sin(angles_inner) + np.random.normal(0, 0.1, n_samples//2)

# Outer ring (class 1)
angles_outer = np.random.uniform(0, 2*np.pi, n_samples//2)
radii_outer = np.random.uniform(2, 3, n_samples//2)
outer_x = radii_outer * np.cos(angles_outer) + np.random.normal(0, 0.1, n_samples//2)
outer_y = radii_outer * np.sin(angles_outer) + np.random.normal(0, 0.1, n_samples//2)

# Combine the data
X_circular = np.column_stack([
    np.concatenate([inner_x, outer_x]),
    np.concatenate([inner_y, outer_y])
])
y_circular = np.concatenate([np.zeros(n_samples//2), np.ones(n_samples//2)])

# Split and scale (fit the scaler on the training set only)
X_train_circ, X_test_circ, y_train_circ, y_test_circ = train_test_split(
    X_circular, y_circular, test_size=0.3, random_state=42
)
scaler_circ = StandardScaler()
X_train_circ_scaled = scaler_circ.fit_transform(X_train_circ)
X_test_circ_scaled = scaler_circ.transform(X_test_circ)

# Compare linear vs RBF kernel on circular data
linear_svm = SVC(kernel='linear', random_state=42)
rbf_svm = SVC(kernel='rbf', random_state=42)

linear_svm.fit(X_train_circ_scaled, y_train_circ)
rbf_svm.fit(X_train_circ_scaled, y_train_circ)

linear_score = linear_svm.score(X_test_circ_scaled, y_test_circ)
rbf_score = rbf_svm.score(X_test_circ_scaled, y_test_circ)

print("\nCircular Data Classification:")
print(f"Linear SVM accuracy: {linear_score:.3f}")
print(f"RBF SVM accuracy: {rbf_score:.3f}")
print(f"RBF improvement: {rbf_score - linear_score:.3f}")

print("\nWhy RBF works better:")
print("- Linear SVM tries to draw straight lines through circular patterns")
print("- RBF SVM can create curved boundaries that follow the circular structure")
```

SVM Hyperparameters


SVMs have important hyperparameters that control their behavior:

C (Regularization parameter): Controls the trade-off between a smooth decision boundary and classifying every training point correctly. Higher C = less regularization = more complex boundaries.

gamma (for RBF kernel): Controls how far the influence of a single training example reaches. Higher gamma = shorter reach = more complex boundaries.
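The effect of gamma can be read directly from the RBF kernel formula, exp(-gamma · distance²). The sketch below, using hypothetical points at distances 0.5, 1 and 2 from a reference point, prints how similar each neighbour looks for three assumed gamma values; it uses scikit-learn's `rbf_kernel` helper.

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

# One reference point and three neighbours at distances 0.5, 1.0 and 2.0
anchor = np.array([[0.0, 0.0]])
neighbours = np.array([[0.5, 0.0], [1.0, 0.0], [2.0, 0.0]])

similarities = {}
for gamma in (0.1, 1.0, 10.0):
    # rbf_kernel computes exp(-gamma * squared Euclidean distance)
    similarities[gamma] = rbf_kernel(anchor, neighbours, gamma=gamma)[0]
    print(f"gamma={gamma:>4}: {np.round(similarities[gamma], 4)}")
```

With a small gamma even the distant neighbour still counts as similar, giving smooth boundaries; with a large gamma only very close points matter, so the boundary can bend around individual training examples.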

```python
# Hyperparameter tuning for SVM
# (reuses SVC, the scaled circular data, and rbf_score from the kernel comparison)
from sklearn.model_selection import GridSearchCV

# Define parameter grid
param_grid = {
    'C': [0.1, 1, 10, 100],
    'gamma': ['scale', 'auto', 0.001, 0.01, 0.1, 1]
}

# Grid search with cross-validation
svm_grid = SVC(kernel='rbf', random_state=42)
grid_search = GridSearchCV(svm_grid, param_grid, cv=5, scoring='accuracy', n_jobs=-1)
grid_search.fit(X_train_circ_scaled, y_train_circ)

print("SVM Hyperparameter Tuning Results:")
print(f"Best parameters: {grid_search.best_params_}")
print(f"Best CV score: {grid_search.best_score_:.3f}")

# Test the best model
best_svm = grid_search.best_estimator_
test_score_tuned = best_svm.score(X_test_circ_scaled, y_test_circ)
print(f"Test accuracy with tuned parameters: {test_score_tuned:.3f}")

# Compare with default parameters
print(f"Improvement from tuning: {test_score_tuned - rbf_score:.3f}")
```

Performance Metrics: Beyond Accuracy


While accuracy is a good starting point, real-world problems often require more nuanced evaluation metrics. Let's explore advanced metrics that give deeper insights into model performance.

ROC Curves and AUC


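The ROC (Receiver Operating Characteristic) curve plots the True Positive Rate against the False Positive Rate as the classification threshold varies. The AUC (Area Under the Curve) summarizes the curve in a single number: 1.0 means the model ranks every positive above every negative, while 0.5 is no better than random guessing.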
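A tiny worked case makes the definition concrete. The sketch below uses four hypothetical predictions and computes the AUC two ways: with scikit-learn's `roc_auc_score`, and as the fraction of positive/negative pairs in which the positive example receives the higher score.

```python
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

# Four hypothetical examples: true labels and predicted scores
y_true = np.array([0, 0, 1, 1])
y_score = np.array([0.1, 0.4, 0.35, 0.8])

# Points on the ROC curve, one per candidate threshold
fpr, tpr, thresholds = roc_curve(y_true, y_score)
auc_value = roc_auc_score(y_true, y_score)

# Equivalent view: how often does a positive outrank a negative?
pos = y_score[y_true == 1]
neg = y_score[y_true == 0]
pairwise = np.mean([p > n for p in pos for n in neg])

print(f"FPR: {fpr}")
print(f"TPR: {tpr}")
print(f"AUC from roc_auc_score: {auc_value:.2f}")
print(f"AUC from pairwise ranking: {pairwise:.2f}")
```

Three of the four positive/negative pairs are ranked correctly (the positive scored 0.35 loses to the negative scored 0.4), so both calculations give 0.75.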

```python
# ROC Curves and AUC analysis
from sklearn.metrics import roc_curve, auc, roc_auc_score

# Get probability predictions from different models
models_for_roc = {
    'Logistic Regression': LogisticRegression(random_state=42),
    'Random Forest': RandomForestClassifier(n_estimators=100, random_state=42),
    'SVM': SVC(kernel='rbf', probability=True,...
```

Publication date (per publisher) 3 September 2025
Language English
Subject area Technology
ISBN-10 0-00-104716-7 / 0001047167
ISBN-13 978-0-00-104716-7 / 9780001047167
