Man-Machine Speech Communication
Springer Verlag, Singapore
978-981-95-5381-5 (ISBN)
The 40 papers included in these proceedings were carefully reviewed and selected from 157 submissions. The conference will feature special events such as a Young Scholars Forum, Student Forum, Industry Forum, and Product and Technology Exhibition. Beyond the main program, the conference will also include public outreach activities, grant-writing workshops, and several special sessions.
.- Zero- and One-Shot Data Augmentation for Sentence-Level Dysarthric Speech
Recognition in Constrained Scenarios.
.- Multilevel and Granular L2 Pronunciation Assessment Using Stress-Based
Suprasegmental Features and Proficiency Adaptation.
.- CDMGTU-Net: A Causal Dual-Branch Multi-Channel Speech Enhancement Network
with Multi-Scale Gated Feature Fusion.
.- A Two-Stage Band-Split Mamba-2 Network For Music Source Separation.
.- Ideal-LLM: Integrating Dual Encoders and Language-Adapted LLM for Multilingual Speech-to-Text.
.- MambaVoc: State Space Models for High-Fidelity Audio Synthesis.
.- StreamFlow: Streaming Flow Matching with Block-wise Guided Attention Mask for Speech Token Decoding.
.- Automatic Speech Evaluation Method Leveraging Deep Feature Fusion.
.- Curriculum Reinforcement Learning for Robust Low-Resource Chinese Dialect Speech Recognition.
.- An Acoustic Study on Intonation Production of English Learners from Guanzhong Region in Shaanxi Province.
.- Improving Anomalous Sound Detection with Top-M Pseudo-Labeling.
.- Dementia Detection via Speech Temporal Sequences with Shifted Windows.
.- CL-EDiff: Cross-lingual emotional TTS system based on diffusion model.
.- When AI Speaks, Do We Follow? Phonetic Entrainment in Human-AI Dialogues.
.- Aishell1Mix: Towards Robust Mandarin Speech Separation with Scalable Audio Language Models.
.- Study of the Low-Rank Minimum Variance Distortionless Response Beamformer for Speech Enhancement.
.- Exploring Gender Bias in Alzheimer’s Disease Detection: Insights from Mandarin and Greek Speech Perception.
.- UniDaugMamba: A Unimodal Data-augmented Mamba for Speech-Based Depression Detection.
.- Serial-Parallel Dual-Path Architecture for Speaking Style Recognition.
.- Knowledge Augmented Finetuning Matters in Both RAG and Agent Based Dialog Systems.
.- NC-KWS: Few-Shot Class-Incremental Keyword Spotting Based on Neural Collapse.
.- ZSEmo-MTVITS: A Zero-Shot Cross-Lingual Emotional Speech Synthesis Model for Mandarin and Tibetan Based on VITS.
.- CUHK-EE Systems for the vTAD Challenge at NCMMSC 2025.
.- Accent Familiarity and Phonological Weighting in Spoken-Word Recognition.
.- Audio Deepfake Detection via Dual Branch Classifier with Self-Supervised Pre-Trained Model.
.- A Multi-Subspace Attention Approach for Robust Speech Spoofing Detection in Silence-Trimming Conditions.
.- Temporally Consistent Teeth Restoration for Talking Heads.
.- EEG as a Biometric Identifier: The Impact of Electrode Arrangement, Brain Areas, and Frequency Bands.
.- The Phonetic Modification and Facial Movements Made During Mandarin Vowel and Tone Production in Noise.
.- Exploring Audio-Visual Fusion for Sound Event Localization and Detection with BEATs.
.- On Multi-Input Multi-Frame MVDR Filter for Speech Enhancement with Heterophasic Presentation.
.- Adaptive Multi-source Fusion for Uyghur ASR Error Correction.
.- The determinants of Chinese lexical stress.
.- Introducing Discriminative Speaker Embeddings for Voice Timbre Attribute Detection.
.- TSELM: Target Speaker Extraction using Discrete Tokens and Language Models.
.- A Timbre Attribute Discrimination System Fusing Pre-trained Speaker Feature Extractors with Gender Prior Features.
.- Improving the Robustness of Audio-Visual Target Speaker Extraction With AV-HuBERT Based Lip Features.
.- A Hierarchical Fusion Modeling from Perception to Prediction with Personalized Features for Multimodal Depression Detection.
.- Revisiting Target Signal Definitions in Distortionless Superdirective Beamforming for Reverberant Speech Enhancement.
.- HiStyle: Hierarchical Style Embedding Prediction for Text-Prompt-Guided Controllable Speech Synthesis.
| Publication date (per publisher) | 27 January 2026 |
|---|---|
| Series | Communications in Computer and Information Science |
| Additional info | 144 Illustrations, black and white |
| Place of publication | Singapore |
| Language | English |
| Dimensions | 155 x 235 mm |
| Subject areas | Computer Science ► Theory / Studies ► Artificial Intelligence / Robotics |
| | Technology ► Electrical Engineering / Energy Technology |
| Keywords | Audio signal analysis • Phonetics, phonology and prosody • Sound event detection • Speaker Recognition • Speech coding • speech emotion recognition • Speech Enhancement • Speech large language model • Speech Perception • Speech processing • Speech Recognition • Speech Science • speech security • speech synthesis and conversion • spoken dialog system |
| ISBN-10 | 981-95-5381-4 / 9819553814 |
| ISBN-13 | 978-981-95-5381-5 / 9789819553815 |
| Condition | New |