Speech and Audio Signal Processing - Dan Ellis, Ben Gold, Nelson Morgan

Speech and Audio Signal Processing (eBook)

Processing and Perception of Speech and Music

Dan Ellis, Ben Gold, Nelson Morgan (authors)

eBook Download: PDF

2011 | 2nd edition
688 pages
Wiley (publisher)
978-1-118-14291-2 (ISBN)

When Speech and Audio Signal Processing was published in 1999, it stood out from its competition in its breadth of coverage and its accessible, intuition-based style. The book was aimed at individual students and engineers excited about the broad span of audio processing and curious to understand the available techniques. Since then, with the advent of the iPod in 2001, the field of digital audio and music has exploded, leading to a much greater interest in the technical aspects of audio processing.

This Second Edition updates and revises the original book, adding new material that describes both the enabling technologies of digital music distribution (most significantly the MP3) and a range of exciting new research areas in automatic music content processing (such as automatic transcription and music similarity) that have emerged in the past five years, driven by the digital music revolution.

New chapter topics include:
Psychoacoustic Audio Coding: MP3 and related audio coding schemes based on psychoacoustic masking of quantization noise.
Music Transcription: automatically deriving notes, beats, and chords from music signals.
Music Information Retrieval: primarily audio-based genre classification, artist/style identification, and similarity estimation.
Audio Source Separation: multi-microphone beamforming, blind source separation, and the perception-inspired techniques usually referred to as Computational Auditory Scene Analysis (CASA).

The late Ben Gold consulted at the Massachusetts Institute of Technology and Lincoln Laboratory and taught at the University of California at Berkeley. He was the author of Digital Processing of Signals and the coauthor of Theory and Applications of Digital Signal Processing. Dr. Gold was an IEEE Fellow, a member of the National Academy of Engineering, and the recipient of several IEEE awards.

Nelson Morgan is the Director of the International Computer Science Institute, an independent, not-for-profit research laboratory affiliated with the University of California at Berkeley. Dr. Morgan is also Professor-in-Residence in the Electrical Engineering and Computer Sciences Department at UC Berkeley and an IEEE Fellow.

Dan Ellis is Associate Professor in the Electrical Engineering Department of Columbia University. Dr. Ellis's Laboratory for Recognition and Organization of Speech and Audio (LabROSA) investigates how to extract high-level information from audio, including speech recognition, music description, and environmental sound processing.

PREFACE TO THE 2011 EDITION

CHAPTER 1 INTRODUCTION

PART I HISTORICAL BACKGROUND

CHAPTER 2 SYNTHETIC AUDIO: A BRIEF HISTORY

CHAPTER 3 SPEECH ANALYSIS AND SYNTHESIS OVERVIEW

CHAPTER 4 BRIEF HISTORY OF AUTOMATIC SPEECH RECOGNITION

CHAPTER 5 SPEECH-RECOGNITION OVERVIEW

PART II MATHEMATICAL BACKGROUND

CHAPTER 6 DIGITAL SIGNAL PROCESSING

CHAPTER 7 DIGITAL FILTERS AND DISCRETE FOURIER TRANSFORM

CHAPTER 8 PATTERN CLASSIFICATION

CHAPTER 9 STATISTICAL PATTERN CLASSIFICATION

PART III ACOUSTICS

CHAPTER 10 WAVE BASICS

CHAPTER 11 ACOUSTIC TUBE MODELING OF SPEECH PRODUCTION

CHAPTER 12 MUSICAL INSTRUMENT ACOUSTICS

CHAPTER 13 ROOM ACOUSTICS

PART IV AUDITORY PERCEPTION

CHAPTER 14 EAR PHYSIOLOGY

CHAPTER 15 PSYCHOACOUSTICS

CHAPTER 16 MODELS OF PITCH PERCEPTION

CHAPTER 17 SPEECH PERCEPTION

CHAPTER 18 HUMAN SPEECH RECOGNITION

PART V SPEECH FEATURES

CHAPTER 19 THE AUDITORY SYSTEM AS A FILTER BANK

CHAPTER 20 THE CEPSTRUM AS A SPECTRAL ANALYZER

CHAPTER 21 LINEAR PREDICTION

PART VI AUTOMATIC SPEECH RECOGNITION

CHAPTER 22 FEATURE EXTRACTION FOR ASR

CHAPTER 23 LINGUISTIC CATEGORIES FOR SPEECH RECOGNITION

CHAPTER 24 DETERMINISTIC SEQUENCE RECOGNITION FOR ASR

CHAPTER 25 STATISTICAL SEQUENCE RECOGNITION

CHAPTER 26 STATISTICAL MODEL TRAINING

CHAPTER 27 DISCRIMINANT ACOUSTIC PROBABILITY ESTIMATION

CHAPTER 28 ACOUSTIC MODEL TRAINING: FURTHER TOPICS

CHAPTER 29 SPEECH RECOGNITION AND UNDERSTANDING

PART VII SYNTHESIS AND CODING

CHAPTER 30 SPEECH SYNTHESIS

CHAPTER 31 PITCH DETECTION

CHAPTER 32 VOCODERS

CHAPTER 33 LOW-RATE VOCODERS

CHAPTER 34 MEDIUM-RATE AND HIGH-RATE VOCODERS

CHAPTER 35 PERCEPTUAL AUDIO CODING

PART VIII OTHER APPLICATIONS

CHAPTER 36 SOME ASPECTS OF COMPUTER MUSIC SYNTHESIS

CHAPTER 37 MUSIC SIGNAL ANALYSIS

CHAPTER 38 MUSIC RETRIEVAL

CHAPTER 39 SOURCE SEPARATION

CHAPTER 40 SPEECH TRANSFORMATIONS

CHAPTER 41 SPEAKER VERIFICATION

CHAPTER 42 SPEAKER DIARIZATION

Publication date (per publisher)	30 September 2011
Language	English
Subject area	Mathematics / Computer Science ► Computer Science
	Engineering ► Electrical Engineering / Energy Technology
	Engineering ► Communications Engineering
Keywords	Audio & Speech Processing & Broadcasting • Computer Science • Electrical & Electronics Engineering • Information Technologies • Speech Processing
ISBN-10	1-118-14291-8 / 1118142918
ISBN-13	978-1-118-14291-2 / 9781118142912

PDF (Adobe DRM)
Size: 40.5 MB

Copy protection: Adobe DRM
Adobe DRM is a copy-protection scheme intended to protect the eBook from misuse. The eBook is authorized to your personal Adobe ID at download time. You can then read it only on devices that are also registered to your Adobe ID.

File format: PDF (Portable Document Format)
With its fixed page layout, PDF is particularly well suited to technical books with columns, tables, and figures. A PDF can be displayed on almost any device, but is of limited use on small displays (smartphones, eReaders).

System requirements:
PC/Mac: You can read this eBook on a PC or Mac. You need an Adobe ID and the free Adobe Digital Editions software. We advise against using the OverDrive Media Console; in our experience, it frequently causes problems with Adobe DRM.
eReader: This eBook can be read on (almost) all eBook readers. It is not, however, compatible with the Amazon Kindle.
Smartphone/Tablet: You can read this eBook on both Apple and Android devices. You need an Adobe ID and a free app.

Buying eBooks from abroad
For tax law reasons, we can sell eBooks only within Germany and Switzerland. Regrettably, we cannot fulfill eBook orders from other countries.

Print edition

Book | Hardcover

CHF 189,95